Statistics of collision parameters computed from 2D simulations

There are two popular ways to speed up simulations of planet formation via increasing the collision probability: ({\it i}) confine motion to 2D, ({\it ii}) artificially enhance the physical radii of the bodies by an expansion factor. In this paper I have performed 100 simulations each containing $10^4$ interacting bodies and computed the collision parameters from the results of the runs. Each run was executed for a lower and a higher accuracy parameter. The main goal is to determine the probability distribution functions of the collision parameters and their dependence on the expansion factor. A simple method is devised to improve the determination of the collision parameters from the simulation data. It was shown that the distribution of the impact parameter is uniform and independent of the expansion factor. For real collisions the impact velocity is greater than 1 mutual escape velocity, a finding that can be explained using the two-body problem. The results casts some doubts on simulations of the terrestrial planets' final accretion that have assumed merge. Collision outcome maps were created adopting the fragmentation model of \cite{Leinhardt2012} to estimate the number of different types of collisions. A detailed comparison with earlier works indicates that there are similarities as well as significant differences between the different works. The results indicate that as the planetary disc matures and the masses of the bodies differs progressively than the majority of collisions lead to mass growth either via partial accretion or via graze-and-merge collision.


INTRODUCTION
According to the most widely accepted model, the nebular hypothesis states that the formation process of the terrestrial planets is the result of a series of three consecutive but partly overlapping stages (Lissauer 1993;Chambers 2004;Morbidelli et al. 2012). In the first stage the micrometer-sized dust grains coagulates into small aggregates through pairwise collisions (Dullemond & Dominik 2005). The collisions may proceed further to form 1 -100 km sized bodies, which are called planetesimals. It must be noted that this process making planetesimals can only be effective if the conditions are suitable (Weidenschilling 1977(Weidenschilling , 1997. Weidenschilling (1977) has described the dust motion in protoplanetary discs and showed that dust grains from micron to a few meters in size experience a radial motion towards the star. This radial drift strongly depends on the size and bodies with a critical size spiral into the star in a fraction of the disc lifetime. This loss of material quickly clears the disc preventing planetesimal formation. This depletion process is called the "radial-drift barrier" of planet formation (Laibe, Gonzalez & Maddison 2012) and was first studied in a minimum mass solar nebula in which the critical size corresponds to meter-sized bodies and ⋆ E-mail: a.suli@astro.elte.hu thus later was inaccurately referred to as the "meter-size barrier".
There exist an alternative hypothesis, where planetesimals or even Mars-size embryos may form directly from small dust grains concentrated by turbulence followed by gravitational collapse (Johansen, Youdin & Mac Low 2009;Cuzzi, Hogan & Bottke 2010), or in pressure maxima of protoplanetary disks (Lyra et al. 2008). Moreover, if planetesimals are concentrated in a pressure maximum, they can be quickly accreted by a growing embryo leading to the rapid formation of a solid core and a giant planet (Lyra et al. 2008;Sándor, Lyra & Dullemond 2011;Guilera & Sándor 2017). In this study I focus on the collision theory.
Despite of the radial-drift barrier, the Solar System and the observed exoplanets prove the existence of an efficient and robust process that creates planetesimals in large numbers. Once planetesimals have formed their gravitational interactions controls further growth, which determines their velocities and the parameters of occasional collisions. If turbulence is not too vehement, than dynamical friction ensures that the largest planetesimals have low relative velocities. This leads to runaway growth in which the largest objects grow more rapidly than smaller ones (Wetherill & Stewart 1989;Kokubo & Ida 1996).
The next stage is the oligarchic phase when the largest objects contain enough mass to dominate the velocity dis-persion of smaller planetesimals (Ida & Makino 1993). In the oligarchic growth mode the disc may partitioned into rings where each ring is dominated by a single planetary embryo that sweeps up planetesimals in its vicinity (Kokubo & Ida 1998). Nearby protoplanets grow at similar rates and their orbits are separated by more than 5 Hill radius.
Oligarchic growth ends when the number of planetesimal drops below such a threshold where their damping effect on protoplanet orbits become insufficient to prevent orbit crossing. This initiates the last phase of terrestrial planet formation, involving the collision and mutual accretion of protoplanets and embryos the so called giant impacts phase. During this phase the number of embryos declines and the final planets forge, in stable non-crossing orbits. Most of the evolution took place after the gas disc had dissipated, so the processes mainly consisted of gravitational interactions and occasional collisions between embryos.
In the early stages of planet formation the vast number of objects present in the dynamical system render the problem unfeasible for direct N -body methods. But at the beginning of the last stage the number of significant bodies drops to a few 10 2 to 10 3 hence N -body integrations become applicable for calculating the evolution of the system. These methods are accurate and were used extensively to study the final assembly of the terrestrial planets.
According to previous studies (Chambers 2001(Chambers , 2013Liu, Zhou & Wang 2011) the timescale of planetary systems formation is ∼100 Myr. Integrating the motion of several 10 2 to 10 3 gravitationally interacting bodies for ∼100 Myr still requires a huge computational effort consequently two-dimensional (2D) model was frequently used. Another method to speed up N -body simulations of planet formation is to scale up the physical radii of the bodies by a factor. Both of these techniques increase the collision probability between objects and thus reducing the computational time.
The 2D integrations of planet formation (Cox & Lewis 1980;Lecar & Aarseth 1986;Beaugé & Aarseth 1990;Alexander & Agnor 1998) have notably contributed to our understanding of planetary accretion, and motivated scholars to investigate the problem in more detail or extend the model into 3D. On the other hand, there are findings indicating that 2D simulations may not provide a feasible model of a planet formation. According to Kokubo & Ida (1996) runaway growth takes place in 3D simulations but not in 2D, although this growth mode might have played a key role in planet formation. Another result of Chambers & Wetherill (1998) indicated that the time scales of collision in 2D and 3D simulation differ significantly (a factor of ∼10) thus the evolution of large bodies -the major concern in planet formation -follows different paths.
The numerical N -body simulations that assumed perfect accretion have been successful at reproducing the broad characteristics of the terrestrial planets in our Solar System (Wetherill 1994;Chambers 2001;Quintana et al. 2002;Raymond et al. 2004Raymond et al. , 2006Raymond et al. , 2009O'Brien et al. 2006;Quintana & Lissauer 2006Quintana et al. 2016;Liu, Zhou & Wang 2011). These results were all based on a small (up to a dozen) number of realizations performed for each set of initial conditions. A larger set of 50 simulations of planet formation around the Sun was recently performed by Fischer & Ciesla (2014), who using the perfect-accretion model, demonstrated the need for a larger suite of simulations in order to infer results from a distribution of final planet configurations.
To date the late stage has been mainly investigated by accurate N -body simulations while collisions were modeled generally by perfect accretion. However, this oversimplified assumption breaks for collisions with higher velocity and/or larger impact parameter (Kokubo & Genda 2010). This simplification was necessary in order to keep the number of bodies below the initial number and the lack of detailed models on collisions between planetary mass bodies. Having a better concept of collisions and how it influences the evolution of the planets is a primary key in understanding how planets emerge from a swarm of planetesimals. For this very reason , hereafter LS12 have conducted high-resolution simulations of collisions between planetesimals and the results were used to isolate the effects of different impact parameters on collision outcome. Their model predicts the size and velocity distribution of fragments as a function of the impact velocity, impact angle and projectile to target mass ratio. The authors identified the boundaries between different types of collision: (i) simple merger, (ii) merger with some mass escaping as fragments, and (iii) hitand-run collisions. The analytic model developed by LS12 is a powerful tool that has two major advantages: it improves the physics of collisions in numerical simulations of planet formation and collision evolution, and is easy to adapt to an N -body code.
The collision model of LS12 made possible to implement more sophisticated, inter-particle gravity enabled N -body simulations of late-stage planet formation. Lines at al. (2014) used this model to simulate planet formation in the Kepler-34(AB) system's circumbinary protoplanetary disk to examine whether planets can form in a hostile environment. The same collision model for gravity-dominated bodies has been implemented into an N -body tree code (Bonsor et al. 2015) and used to examine planet formation. Chambers (2013) implemented this comprehensive collision model into the widely used M ercury integration package (Chambers 2001). Eight simulations of planet formation using the new collision model were presented and compared to eight simulations that were previously performed using the perfectaccretion collision model (Chambers 2001). The new simulations form 3 to 5 terrestrial planets moving on widely spaced orbits with growth complete by 400 My. The final planets that formed in each of these sets were shown to be comparable despite the significant difference in the number and frequency of collisions. In addition, the accretion timescales were about twice as long when fragmentation was included.
Terrestrial planet formation are characterized by countless collisions among bodies highlighting that collisions are the core agent of planet formation. Impacts outcome span multiple regimes depending on the collision parameters. The collision history of the final body can largely influence the planet's growth, stability, bulk composition, and habitability (Chambers 2001(Chambers , 2013Bonsor et al. 2015).
In this paper I present the results of 100 simulations of planet formation in 2D started from the beginning of the stochastic stage. The main focus of the work is to present a detailed and comprehensive picture of the statistical distribution of the collision parameters. The dependence of these statistics on the expansion factor is also studied and a comparison of the results with others is presented. This is a first step towards examining the statistical distribution of the collision parameters. With a reliable statistics one can assess the frequency of different collision outcome and posterior estimate the number of collisions lead to perfect merging. This is essential since in the majority of the N -body simulations colliding bodies were assumed to merge into a single new object, however this basic premise has not been thoroughly and deliberately tested. This is a very strong assumption and can largely bias the result of planet formation, thus there is an impetus to assess the occurrence of perfect merging. The presented statistics can be also useful for N -body modelers to preform Monte Carlo simulations of planet formation including the model of LS12 .
Using the scaling laws derived by LS12 Stewart & Leinhardt (2012) presented a retrospective analysis of previous simulations to estimate the range of true collision outcomes. In this work a similar analysis is performed on the collision data adopting the LS12 model to estimate the occurance of collision outcomes and the results are compared to previous works.
The rest of this paper is organized as follows. Section 2 contains a description of the simulations and initial conditions. Section 3 describes the results of these simulations, including the speed of the runs, the description of the collision geometry, the collision data and the distribution of the collision parameter. Section 4 contains a simple analytic model which is used to discuss the observed impact velocity distribution. Section 5 shows the collision outcome maps derived by the model of LS12 and the collisions of the simulations are plotted on the maps. A comparison with other works are also presented and the main results are summarized in Section 6.

THE SIMULATIONS
The equations of motion of 10 4 protoplanets around a star in the barycentric coordinate system Oxy was integrated. The protoplanets are confined to 2D and placed initially in the terrestrial region extending from 0.5 to 1.5 au. In this model each body was interacting with all the other bodies and all bodies were modeled as a sphere.
In order to determine the total mass of solids in the ring the parameters of the minimum mass solar nebula (Hayashi 1981) was used in which the surface density of solids is: where Σ1 = 7 gcm −2 is the surface density of solids at 1 au and r is the distance from the star. Integrating Eq. (1) from 0.5 to 1.5 au results in mtot = 5.123 × 10 −6 M⊙ = 1.71M⊕ where M⊙ and M⊕ denote the mass of the Sun and the Earth, respectively. The simulations were started with Np = 10 4 bodies around a star with 1 M⊙. Each protoplanet has the same physical properties: mp = mtot/Np = 5.123 × 10 −10 M⊙ ≈ MCeres and the density is ρp = 2.0 gcm −3 which corresponds to the mean density of Ceres and consistent with silicon-rich rocky bodies. Consequently the radius of the bodies Rp is approximately 500 km (see Table 1).
The semi-major axes a for this number of protoplanets are generated randomly with probabilities weighted in order to reproduce the disc surface density profile as defined by Eq. (1). Eccentricities e are randomized from a Rayleigh distribution with rms values of 0.02 with an upper limit of 0.2. The mean anomaly M and argument of pericenter ω are randomized uniformly from [0, 2π].
In the simulations the gas disk had dissipated by this stage and all collisions were assumed to be perfectly inelastic, forming a new body by conserving mass and linear momentum. Most of the previous work on planet formation used this oversimplified treatment of collisions.
The N -body systems are stochastic and so a large number of simulations of a given system, with changes in the initial conditions, are required in order to reach relevant conclusions. Most numerical simulations designed to explore the stochastic stage lack the large number of realizations needed to account for the stochastic nature of N -body systems. I improve on this limitation by performing 100 simulations of planet formation around a Sun-like star. To study the dependence on the expansion factor f used to artificially enhance the physical radii of the bodies I have preformed 5 different set of simulation for f = 1, 2, 3, 5, 10. In order to have statistically meaningful results 10 sets of initial conditions were generated, denoted by run id = 1, 2, . . . , 10. For each value of f all these 10 initial condition were integrated for T = 10 6 years. The integrator used an adaptive step size with a tolerance of ǫ = 10 −10 and ǫ = 10 −13 for all 50 cases, resulting in total 100 simulations.
The applied integrator is the Runge-Kutta-Fehlberg 7(8) algorithm with adaptive step size, which has an acceptable speed and accurate in most situations (Fehlberg 1968). All simulations were performed with an open source GPU code red.cuda 1 designed to integrate planet and planetesimal dynamics in the framework of the core accretion planet formation. The red.cuda is written in CUDA C and runs on all NVIDIA GPUs with compute capability of at least 2.0. All simulation were performed on Tesla K20Xm device containing 2688 CUDA Cores and uses CUDA driver version and run-time version 7.0 and 6.5, respectively.

Speed of simulation run
I show how the typical CPU run-time of a simulation denoted by T depends on f . Fig. 1 displays T for each f value and for each of the 10 different runs. In the left panel the data correspond for the ǫ = 10 −10 accuracy level, while the right one for ǫ = 10 −13 . The statistics of the data are given in Table 2 where the minimum, maximum, mean and standard deviation of T is calculated from the sample of the 10 runs belonging to a given f . The standard deviations, denoted by Figure 1. The run-time of all the 2×50 simulations for the two accuracy parameters. The data are plotted for f = 1, 2, 3, 5, 10 as plus sign, asterisk, diamond, triangle, and square symbols, respectively.
sd are usually more useful to describe the variability of the data. From Fig. 1 and Table 2 it is clear that the variation of the run-time is the most significant for f = 1 and it gradually diminishes as f gets larger.
The other observable fact is the decrease of T as f increases. Each mean run-time T (4 th and 8 th columns in Table  2) were divided by the maximum of the means: and the ratios Q were plotted in Fig. 2 as a function of f for both accuracy values. From the figure it is clear that Q is approximately inversely proportional to f , and the 1/f function (dotted curve) is apparently a very good approximation of the measured data. In 2D one can state that increasing the collision cross section of the bodies by f will reduce the simulation time by f . The q value in Table 2 is calculated for each f as which has a mean of 1.527, indicating a 53% increase in the mean run-time for a 3 orders of magnitude higher accuracy. The run time can be written as where n is the number of collisions, N (ti, f ) is the number of bodies at time ti, δti is the simulation time between the ith and (i + 1)th collision and η is a parameter characterizing the hardware. The number of bodies versus time is depicted in Fig. 3 for different f on logarithmic scale. We have an almost identical figure for ǫ = 10 −13 . These curves are piecewise constant functions since each collision reduce the number by 1. It is visible that the curves in the t ≤ 1 and t ≥ 10 5 yr intervals are similar and in between their behavior are alike, but the bigger f is the smaller the time when the fast decrease begins. These curves match very well with previous 2D simulations e.g. Lecar & Aarseth (1986); Alexander & Agnor (1998). According to Eq. (4) the t ≤ 1 part hardly contributes to the total run time since it is only 1 yr (δti is small), while in the t ≥ 10 5 interval N is a few dozens for all f values therefore it takes the same amount of time to complete. In order to demonstrate the dependence of the run time on f shown in Fig. 2 let's write the ratio of the instantaneous run time belonging to f = 1 and f for specific simulation time epochs as follows where τ k = 10 k . The results are depicted in Fig. 4 for the five different time epochs. We have an almost identical figure for ǫ = 10 −13 . It is clearly visible that with increasing t the Q k ratios fit better and better to the 1/f function. The best fit is for k = 2, 3 and 4 (the Q3 and Q4 curves overlap). According to Eq. (4) and Fig. 3 it is obvious that in these cases both N and δt are large, therefore principally this interval determines the run-time. This reasoning is not an exact proof why the run time depends on the reciprocal of f but clearly demonstrates the underlying cause, i.e. the specific decrease of N as shown in Fig. 3.

Collision geometry and parameters
In the model initially all bodies have the same mass, but as collisions take place the masses will differ. Fig. 5 shows the geometry of a collision. In what follows the target is the body with the larger mass mt and the other body is called the projectile with mass mp. In the figure the target is stationary and the projectile is moving from right to left with speed Vi, Table 2. The run-time in seconds for ǫ = 10 −10 and ǫ = 10 −13 . In the 2 nd column the minimum, in the 3 rd the maximum, in the 4 th the mean and in the 5 th the standard deviation of the T is given for ǫ = 10 −10 . The same values are given in 6 th -9 th for ǫ = 10 −13 . In the last column q is the ratio between the means, see Eq.
(3). the units on the x and y axes are in km. The impact angle θ, is defined at the time of first contact as the angle between the line connecting the centers of the two bodies and the impact velocity vector (when θ = 0 • the collision is refereed as headon, while the θ = 90 • case corresponds to a grazing collision).
where B is the y coordinate of the projectile's centre and d is distance between the centers of the bodies. The impact parameter has a notable influence on the collision outcome, because the kinetic energy of the projectile may only partially intersect the target for oblique impact, e.g. in the collision geometry shown in Fig. 5, the top of the projectile does not directly hit the target. Consequently a portion of the projectile may shear off and only the kinetic energy of the lower part of the projectile will be involved in disrupting the target. According to LS12 the outcome of a collision is dependent on the kinetic energy of the interacting mass which strongly depends on b.
In order to describe the dependence of catastrophic disruption on impact angle, two geometrical collision groups were introduced by LS12: non-grazing, where most of the projec-   tile interacts with the target and grazing where less than half the projectile interacts with the target. Following Asphaug (2010), the critical impact parameter is reached when the centre of the projectile is tangent to the surface of the target. Grazing impacts are defined to occur when b > bcrit. In Fig. 6 a non-grazing collision from the simulation with f = 1, run id = 1 and ǫ = 10 −10 is plotted. The reference frame is fixed at the target whose centre is marked by Ot, the Otx and Oty axes are parallel with the Ox and Oy axes, respectively. In the figure Op marks the centre of the projectile. The solid circles denote the colliding bodies. The origin of the impact velocity vector Vi is at Op. To better visualize the impact geometry the system is rotated until the impact velocity vector became parallel with the Otx axis and points towards left (as in Fig. 5). The rotated projectile is plotted by a dashed blue circle and its centre is denoted by where rij = |rj − ri| is the mutual distance of the point masses, and ri is the particle's barycentric position vector. The projectile and the target are overlapping which is a consequence of the collision detection method applied in the numerical code. After each integration step the mutual distances of the bodies are calculated and compared against the sum of the bodies' radii multiplied by f . A collision happens whenever the criterion fulfilled. When solving the equations of motion the particles are treated as point masses therefore two bodies can get arbitrary close. Since the collision detection is done after the integration step the collision can be detected only after the real physical contact hence overlapping is inevitable. The extent of the overlap can be defined as when o = 0 the two spheres just touches each other. A consequence of the overlap is that the parameters defining the collisions cannot exactly determined from the data produced by the numerical simulation. An improvement to this problem is described in Section 3.3. The impact velocity is given in mutual escape velocity unit which is defined as follows (Genda et al. 2012): where kG is the Gaussian constant of gravity. In the literature on planetary collisions QR denotes the specific energy of impact and it is given by where vt and vp are the speed of the target and projectile with respect to the centre of mass, respectively. The specific impact energy is shown in the upper left corner of Figs. 6 and 7. The projectile-to-target mass ratio, γ = mp/mt and the overlap o are displayed in the upper right corner.

Improve the parameters
The difficulty caused by the overlap of the bodies can be overcome by the following method. Let us assume that the direction of the relative velocity vector vij does not change during the last integration step (after which the collision was detected). This assumption was checked with additional simulation where the step size was 10 seconds and vij was monitored. It was found that vij hardly changed after the first contact. In this case it is possible to shift the rotated projectile (dotted blue circle on Fig. 6) along the x-axis (i.e. parallel with Vi) until the first contact of projectile and target. The x ′′ p -coordinate of the shifted projectile is (see Figs. 6 and 7) and the improved collision parameter b ′ and the associated collision angle θ ′ can be computed as In Fig. 6 one can see that b = 0.465 and θ = 27.74 • , while the numeric values of the improved parameter is b ′ = 0.361 and θ ′ = 21.19 • . Since rij ≤ f (Rt + Rp) therefore it follows that b ′ ≤ b ≤ 1 and θ ′ ≤ θ which means that the real impacts tend to be more head-on collisions than the ones detected from the raw numerical data.
To check this method the results from the ǫ = 10 −13 simulation is used and the same collision is displayed in Fig. 7. As expected by using higher accuracy the collision detection happens with a significantly smaller overlap of 0.118. The computed b = 0.416 is smaller than in the ǫ = 10 −10 case. From Fig. 7 the improved parameter b ′ = 0.367, which agrees very well with the value computed from the ǫ = 10 −10 case using Eq. (14) proving the efficiency of the method.
Using the above procedure and applying the law of energy conservation, the impact velocity and specific impact energy can also improve. The total mechanical energy E1 when the collision is detected (blue circles on Figs. 6 and 7) and the energy of the shifted projectile (red circles) E2 are equal and it allows one to compute the improved impact speed of the shifted projectile: Table 3 presents the parameters discussed in detail above for the two accuracy parameters. The value of o decreased by 47%, b by 10% and b ′ increased by 2% for the higher precision. The Vi , V ′ i , QR and Q ′ R are almost the same for the two precisions, the relative changes are less than 3%. From Eq. (15) it follows that the real impact velocity is slightly less than the measured one.

Description of the collision data
During the simulations the ri and vi vectors and the physical properties of the colliding bodies were recorded in a database.
Hereafter I refer to a dataset as single run data when I select from the database records with a specific f and a single run id, and to a dataset as multiple run data when I select from the database records with a specific f and multiple run ids. For the multiple run data I always selected the run ids from 1 to 10.
In general at the end of each simulation ∼10 bodies remained so for each single run the number of collisions n, i.e. the sample size is n ≈ 9990 therefore a multiple run data set contains approximately 99 900 collisions. In total for the 50 runs there were 499507 and 499455 collisions for ǫ = 10 −10 and ǫ = 10 −13 , respectively. In this paper I used this database to study the distribution of the collision parameters. The details of the samples are given in Table A1 for ǫ = 10 −10 and ǫ = 10 −13 , where the rows with run = 1 -10 correspond to multiple run data, while the others correspond to single run data. In the table the minimum and maximum values of Vi and QR are also listed. For ǫ = 10 −10 the values are on the left side, while for ǫ = 10 −13 on the right side.
According to the results the minimum value of b ′ is 9.22 × 10 −7 and 2.99 × 10 −8 for ǫ = 10 −10 and ǫ = 10 −13 , respectively, while the maximum is 1 for both ǫ. This parameter has a theoretical range of [0, 1]. From the data it is clear that it does not depend on f . To save space b ′ is not listed.
In Table A1 with f = 1 the smallest value of V ′ i is 0.978 while the biggest is 19.517. The variation of min(V ′ i ) is minor for a fixed f and its value decreases with increasing f from 0.99 to 0.2. The values of max(V ′ i ) do not decrease with increasing f , they are 8.356 +2.001 −1.146 except for one case for f = 1 and run id = 6 when max(V ′ i ) = 19.517. In Table A1 the data for this specific run without the 19.517 value is also presented (empty cells have the same value as above). It turns out, that the maximum of V ′ i drops down to 8.299, which is well in the other runs' range. This is an extreme value and in this single run data all other impacts are within the above range.
The minimum of Q ′ R is 1.23×10 4 J kg −1 while the maximum is 1.31×10 7 J kg −1 for f = 1. The minimum values fall in a rather narrow range for a specific f which decreases with increasing f . The maximum values fall in a limited range for a specific f which shows a decreasing trend with increasing f .
For each f the table shows the range for the multiple run data, designated by 1 -10 in the run column. For f = 1 the data are listed containing the extreme impact speed and without it.
As reported by Table A1 in the case of ǫ = 10 −13 the sample sizes are in almost perfect agreement with the lower accuracy runs, the ranges of the parameters are very similar and the data show analogous behaviour. In this case there is no extreme value. Comparing the results for the two accuracy parameters one can conclude that the size of the populations, the ranges of the impact speed and specific impact energy are very similar.
In Table 4 the mean of the minimum and maximum values of b ′ are calculated and listed along with their standard deviations. It is clear that these values practically do not depend on f : the minimum is very close to 0 while the maximum to 1, the standard deviation is very limited for both quantities, the deviations around the mean is small. The figures are alike for ǫ = 10 −13 , therefore they are not shown.
In Table 5 the mean of the minimum and maximum values of V ′ i are calculated from the values given in Table A1 and Table 4. The mean of the minimum (2 nd column) and maximum (4 th column) of b ′ for multiple run data and their standard deviations in the 3 rd and 5 th columns, respectively. Table 5. The same for V ′ i as in Table 4. listed along with their standard deviations for ǫ = 10 −10 and ǫ = 10 −13 . The mean of the minimum value depends strongly on f , it decreases from 0.99 to 0.28 in both cases, while the maximum essentially does not depend on it, but stays around 8.3. The standard deviation is confined to a narrow interval for both the minimum and maximum, indicating modest variability of the values. In Table 6 the extreme values and the mean of Q ′ R derived from the data given in Table A1 are shown along with their standard deviations for the two accuracy parameters. As expected, the mean of the minimum value depends strongly on f , it decreases from ∼13 × 10 3 to ∼1.8 × 10 3 in both cases, while the maximum drops from ∼4.4 × 10 6 to ∼2.7 × 10 6 and than stays constant around this value as f increases.
As a final note the mean, median and sd for all multiple run data were calculated for the two accuracy parameters. These results are presented in Table 7. The median is the value separating the higher half from the lower half of a data sample. The impact parameters' mean and median value are close to 0.5, which is the middle of its range. In the case of V ′ i the mean is definitely larger than the median for all data sets. This implies that there are some large values which affect the mean and most of the values are closer to the median than to the mean. This fact also reflects in the relative large sd values. Since the specific impact energy is a strong function of the impact speed, the median of Q ′ R is significantly smaller than the mean. Again this is visible from the larger spread of the data, given the larger sd values.

The bin size
To construct the histograms, the first step is to bin the range of values and then count how many values fall into each interval. In this paper the bins are consecutive, adjacent and non-overlapping intervals of a variable. The bins have equal size. Since there is no ultimate method to determine the num-  Table 4. ber of bins, after some experimentation I found that the Rice rule gives the best number of bins: where bs is the bin size, min, max denotes the minimum and maximum of the underlying variable and the braces indicate the ceiling function. Throughout this paper I use normalized histograms to display relative frequencies which is the proportion of cases that fall into each of several categories, with the sum of the heights equalling 1. Histograms give an approximation of the probability density function, pdf of the underlying variable. A cumulative histogram refers to the running total of the values. That is, the cumulative histogram Hi of a histogram hj is defined as: hj.
Since the collision parameters have continuous probability distribution hereafter I will use the terms pdf and cdf.

Distribution of collision parameters
3.6.1 Analysis of the correction for the overlap In section 3.3 a method was presented to improve the impact parameter, speed and specific impact energy (see Eqs. (14) and (15)). Here I show how it influences their distribution. In Fig. 8 I have summarized the result for f = 1 and run id = 1 with ǫ = 10 −10 . Panel a) shows the pdf of θ (solid black curve) and θ ′ (red dotted curve), panel b) displays the pdf of b (solid black curve) and b ′ (red dotted curve). The number of collision is n1 = 9991 and the bin size is bs(θ ′ ) = 2.045 • and bs(b) = bs(b ′ ) = 0.023. The bin size for θ is ≈ 2.4 • which is larger than for θ ′ . The reason for this difference originate from that max(θ) = 105 • which is larger than the theoretically allowed 90 • ; another reason to improve the parameters. In Fig. 8 from panel a) it is clearly visible that the distribution of θ ′ is different from that of θ. For θ < 40 • the relative frequency of θ is similar to that of θ ′ , while for θ > 40 • the pdf curves diverge more and more, the relative frequency of θ is higher than θ ′ . Our 2D results do not support the observation of Shoemaker (1962), who concludes that a 45 • impact angle (b = 0.707) is most probable. Preliminary results of 3D runs shows evidence of a peak around 45 • therefore it seems that there is a significant difference between 2D and 3D model Table 7. The mean, median and standard deviation of b ′ , V ′ i and Q ′ R . The results are given for all multiple run data. when collision parameter is an important aspect. The causes of this discrepancy should be be discussed in a separate paper. From Table 8 cdf(θ ′ ) reaches 90% at θ ′ ≈ 61 • and 99% at θ ′ ≈ 77 • . These values are shown with dotted and solid black vertical lines denoted by θ ′ 90 and θ ′ 99 , respectively. In Fig. 8 Table 8 the cdf(b ′ ) reaches 0.9 at b ′ ≈ 0.864 and 0.99 at b ′ ≈ 0.977. These values are shown with dotted and solid black vertical lines denoted by b ′ 90 and b ′ 99 , respectively. By eyeballing the data it is reasonable to state that the distribution of b ′ is uniform in the range [0, 0.95], and has a sharp decrease in the last bin.
In Fig. 8  and V ′ i,99 , respectively. There are no data with V ′ i < 0.993, the maximum of the pdf is around 1 than it decreases fast as V ′ i gets larger. For Vi ≥ 5 the probability of a collision is less than 1%.
Panel d) shows the distribution of QR and Q ′ R . The pdf and cdf functions are almost identical. From Table 8 cdf reaches 90% at Q ′ R ≈ 3.07 × 10 5 and 99% at Q ′ R ≈ 10 6 J kg −1 . These values are shown with dotted and solid black vertical lines denoted by Q ′ R,90 and Q ′ R,99 , respectively. As it is apparent from Fig. 8 the difference between the raw and improved quantities are not negligible, primarily in the case of the impact angle, parameter (panel a, b) and speed (panel c). This finding provides a further argument to use the correction for the overlap method described in section 3.3.
In Fig. 9 the same quantities for ǫ = 10 −13 are shown. The overall behavior of the curves are very similar to the ǫ = 10 −10 case but, as expected, the difference between the raw and improved quantities are smaller.
To compare directly the results produced by the runs with ǫ = 10 −10 and ǫ = 10 −13 in Fig. 10 the distribution of b ′ are presented on the left panel for both accuracy values. Apart from a random variation the pdf curves are akin and fluctuates around the same mean while the cdf curves practically cannot be distinguished from each other. From the right panel of Fig. 10 there is apparently a notable difference between Table 8. Summary of the values for f = 1 and run id = 1 at which the cdf reaches 90% and 99% for ǫ = 10 −10 and ǫ = 10 −13 . the pdf curves of V ′ i around 1, where the red dotted curve is about 0.05 higher than the solid black one indicating more frequent collision with lower velocity when the accuracy is higher. Beyond 1 the shape of the two curves is very similar, although the dotted curve is slightly above the black curve, consequently there are more impact with lower velocity for the higher accuracy. The cdf curves are almost identical.
From these comparisons and analysis it is evident that one should always correct the impact parameters as it was described in section 3.3 and from a statistical point of view it is enough to follow the evolution of the system with a lower accuracy parameter if one is interested only in the collision statistical properties of the system. The only exception is the frequency of collisions around 1, where the lower accuracy simulations provide lower frequency.

Comparison of statistics derived from single runs
It is instructive to compare the results of the 10 different runs for a given f . In Fig. 11 the left panel shows the pdfs and cdfs of b ′ for f = 1 for all the 10 runs. The pdf curves are akin and all have a slight decrease at b ≥ 0.95. The mean of the pdfs is 0.0222 which is shown by the thick horizontal line. Lets assume a uniform distribution between [0, 1] with k = 44 bins then the mean of it is 1/k = 0.0227 which is very close the observed value. Except from a random variation around the mean all the pdfs are similar to each other. Furthermore the cdf curves are essentially identical and has a slope of 1. These properties strongly indicates that the impact parameter has a uniform distribution within [0, 0.95], for b > 0.95 the pdf drops off slightly.
As noted earlier the impact parameter has a notable influence on the collision outcome. For equal size bodies γ = 1 and bcrit = 0.5 which is shown by a solid thick black vertical line on the left panel of Fig. 11, denoted by γ1:1. According to the figure and make use of the result that b has a uniform distribution the probability of a grazing impact is P (grazing) = 1 − bcrit.
Assuming equal density the critical parameter can be express with γ as bcrit = 1 1 + γ 1/3 , so the probability of a grazing impact as a function of γ is which yields 50% for equal size bodies. For e.g. γ = 1 : 40 the bcrit = 0.77 which is denoted by γ1:40 on Fig. 11. This implies that less than 23% of the collisions lead to grazing impact for bodies with a γ ≤ 0.025. As the protoplanetary disc matures and larger and larger bodies emerge from the swarm γ may reach smaller values and the chance of a grazing impact reduces.
The right panel of Fig. 11 shows the pdfs and cdfs of V ′ i for the same 10 runs. There are no collisions with impact speed less than 0.993, the maxima of the pdfs are at ≈ 1.07, beyond it the pdfs drop off quite steeply at first, but then more slowly and finally the trend levels off around 5. Evidently there is a strong negative correlation, as V ′ i gets larger than ≈ 1.07 the relative frequency of collisions drop off quite steeply. From the figure it is obvious that the pdfs are very similar, while the cdfs look identical.

Dependence on f
In this section the distribution of the collision parameters are compared for all the f values. The distribution of the impact parameters is displayed in Fig. 12 for f = 1, 2, 3, 5, 10 with ǫ = 10 −13 . Again, the pdf curves show a very similar behavior while the cdf curves are practically identical. It is evident that the impact parameter does not depend on f , it has a uniform distribution. The distribution of the impact velocity for f = 1, . . . , 10 is shown in Fig. 13 for ǫ = 10 −13 . Apparently the pdf of the impact velocity strongly depends on f . The domain of the pdf is equal to [min(Vi), max(Vi)] (see Table A1) and min(Vi) shift towards lower values with increasing f , but the shape remains very similar as it is apparent from the inset plot, which was created by shifting all the pdfs right to the location of minimum belonging to f = 1 which is approximately 1. The lower boundary of the domain, min(Vi) and location of the pdfs' maxima which is denoted by Vi,m are listed in Table 9 and plotted in Fig. 15 with asterisk and triangle, respectively. The smooth solid curve is derived from theoretical considerations detailed in the next section.   Table 9. For each f the minimum value of the impact speed (2 nd column), the location of the pdfs' maxima (3 rd column), the calculated impact speed V ′ rel from Eq. (28) (4 th column) and the minimum distance f 0 and d 0 from Eq. (30) are listed for ǫ = 10 −13 .

TWO-BODY APPROXIMATION TO MODEL THE IMPACT VELOCITY
The observed behavior of the lower boundary of the domain and the location of pdfs' peak of the impact velocity can be explained by a simple model described below.
I have applied a straightforward physical model based on the two-body problem and used conservation laws to estimate the impact velocity as a function of the mutual distance d. In this model only the two colliding bodies P1 and P2 are considered and the effects of all others are neglected. In order to simplify the calculations I have assumed that the colliding bodies have the same mass m and radius R. The model is  At time t = t0 bodies P1 and P2 are d0 apart from each other, and both bodies are in rest, u0 = v0 = 0. Let us calculate the velocity u and v of P1 and P2 at time t when their distance is d. I note that the relative velocity plays the role of Vi. If P1 is the target than the impact speed is u − v.
From the conservation of the linear momentum it follows that u = −v and from the conservation of energy: where I have parameterized the distance with f ≥ 1 such that d = 2f R. Substituting v = −u into Eq. (21) one gets Figure 14. The two-body problem in the center of mass reference frame. BC denotes the barycenter.
The escape velocity of P2 with respect to P1 is therefore u can be written as The relative velocity V rel of P1 with respect to P2 is and introducing V ′ rel = V rel Ve than Eq. (25) can be written as I remark that V ′ rel is measured in escape velocity unit and plays the role of the impact speed in the preceding text. If one knows the initial distance d0 and thus f0 than the velocity V ′ rel at a distance of d can be computed from Eq. (26). If and then Eq. (26) becomes This result gives an approximation of the minimum of the impact speed. Real physical impacts happen when f = 1, substituting this into Eq. (28) results V ′ rel ≈ 1, i.e. the impact speed is approximately 1 escape velocity unit. This is exactly in line with the distribution of the impact velocities (see eg. Figs. 8,9, and the right panel of Fig. 11 and explains the lack of impact velocity less than about 1 for f = 1. In Fig. 15 the blue curve is the graph of Eq. (28). The min(Vi) data points match very well to this curve so this model explains nicely the shift observed on Fig. 13. From Fig. 15 one can see that the minimum values are just below the blue curve, which is a natural consequence of that some of the bodies are close to each other, i.e. the assumption f0 ≫ f breaks down. Substituting f0 > f = 1 into Eq. (26) one gets Using the formula of Eq. (28) for the different f values the results are listed in the 4 th column of Table 9. The initial distance characterized by f0 can be expressed from Eq. (26) Substituting the minimum of the impact speed into V ′ rel from the 2 nd column of Table 9 into Eq. (30) then the initial distance of the two bodies can be estimated and the resulting f0 is displayed in the 5 th column of Table 9.

COLLISION OUTCOME MAPS
Recent works (LS12, Genda et al. (2012)) based on the combination of hydrocode and N -body gravity code computations studied the impacts between planetary mass bodies and the outcome of planetary collisions have been parameterized in terms of the masses and velocities of the colliding bodies. According to these calculations, the authors devised formulae for the mass of the largest remnant denoted by M lr produced in a collision as a function of the masses of the bodies involved, the impact velocity and impact angle. The authors also identified the boundaries between different types of collision: (i) perfect merging (M lr = Mtot) (ii) partial accretion with some mass escaping as fragments (M lr < Mtot) (iii) partial erosion of the target (M lr < mt) (iv) pure hit-and-run (M lr = mt) and erosive hit-and-run (lead to some erosion of the target and more significant damage of the projectile.) where Mtot = mt + mp.
Using the analytic model of LS12 with adopted values of µ = 0.36 and c * = 1.9, I derived example collision outcome maps. I present an analysis of the simulations to assess the range of true collision outcomes. For this purpose two color-coded collision outcome maps were calculated which are shown in Fig. 16 for mass ratios of γ = 1 (left panel) and γ = 1 : 2 (right panel). In the figure dark blue denotes perfect merging, light blue partial accretion, white color net erosion to the target and finally green hit-and-run region. In every case, the simulation data were used to calculate the collision parameters and to determine the outcome based on the LS12 model. The vertical red line denotes the onset of hit-and-run events at bcrit. The details of the calculation of the collision regimes are given in the Appendix of LS12. The model assumes a sudden transition between grazing and non-grazing impacts, which is of course, artificial. In Fig. 16, the thick red curve corresponds to the critical velocity for catastrophic disruption, where the largest remnant contains half the total mass, M lr = 0.5Mtot. Note that this curve corresponds to the target erosion boundary for γ = 1 cases (the transition from partial accretion or hit-and-run to the erosion region). The lower red dot-dashed curves correspond to the impact velocity needed to disperse 10% and the upper red dashed curve to 90% of mt. Fig. 16 also shows details of all the collisions that occurred in the simulations for f = 1 and ǫ = 10 −13 for a single run data set, run id = 1 with each symbol representing a single collision with γ = 1 (left) and γ = 1 2 (right). The symbols indicate the type of collision involved: diamonds for mergers, squares for hit-and-run collisions where the target survives intact, circles for collisions that increased the mass of the target body (possibly with some mass from the projectile escaping as fragments), and triangles for collisions that eroded mass from the target. From the figure it is apparent that there is no correlation between Vi and b. The number of the different type collisions are summarized Table 10.
In Table 10 the total number of collisions n and the ratios of the different types of outcomes are given for five γ values. The number of perfect merging, partial accretion, erosion and hit-and-run events are denoted by nm, npa, ne and n hr , respectively. In the table the subscript e1 denotes collisions where M lr ≤ 0.5Mtot, i.e. the catastrophic disruption, e2 denotes collisions where M lr ≤ 0.9mt and e3 denotes collisions where M lr ≤ 0.1mt, i.e. the super-catastrophic collision. The number of merging collisions is very low, less than 1% for all listed γ values.
According to the table there is a clear trend which is visualized in Fig. 17, where I have plotted the number of the different type collisions against γ.
The initial single mass distribution relaxes into a contin- Figure 16. Predicted collision outcome maps using the analytic model of LS12 for strengthless planets (μ = 0.36 and c * = 1.9). Impact velocity is normalized by the mutual surface escape velocity assuming a bulk density of 2000 kg m −3 . Colored regions denote perfect merging (dark blue), partial accretion (light blue), net erosion to the target (white), and hit-and-run (green). The vertical red line denotes the onset of hit-and-run events at b crit . Thick black red denotes the critical disruption velocity for half the total mass (0.5Mtot) remaining; red dashed curve denotes 90% (0.9mt) and 10% (0.1mt) of target mass in largest remnant. The different symbols indicate the outcome of each collision: diamonds for mergers, squares for hit-and-run collisions, filled circles for collisions in which the target gained mass, and triangles for collisions in which the target lost mass.
uous power-law mass distribution in ∼100 years which is in line with the results of Kokubo & Ida (1996). As the simulation time proceeds the largest body of the continuous mass distribution separates from it. In the beginning of the simulation γ = 1 and as the continuous power-law mass distribution develops the decreasing minimum of γ reflects the time evolution of the system, at least early in the simulation. Later on this relationship breaks and all we can say is that the minimum value of γ is indicative of the minimum age of the system. In order to reflect this initial relationship between time and min(γ) the horizontal axis γ is reversed and the time is shown by the black arrow in the top of the figure.
It is clearly shown in Fig. 17 that the number of merging collisions remains very low, below 1% and shows a very modest increase as γ decreases. As time proceeds and the bodies grow via collisions γ may reach smaller values therefore in later times the chance of a merging collision slightly increases. The number of partial accretion depends strongly on γ. As min(γ) decreases the chance of partial accretion events increases and for γ ≤ 0.5 the increase speeds up such that the proportion of partial accretion reaches ≈ 70% for γ = 0.1. On the other hand the number of erosion declines with decreasing γ just as n hr which follows a similar pattern but it decreases with a smaller pace. If the collision rate is maintained by gravitational focusing and/or additional material is coming from outer space via migration than γ may reach smaller values and the chance of a grazing impact reduces, see Eq. (20). For γ < 0.2 partial accretion dominates and consequently as the disc matures and time increases the larger bodies grow faster, i.e. the rich get richer.
The importance of hit-and-run collisions was recognized by Asphaug, Agnor & Williams (2006), namely in these events, a pair of objects collides at an oblique angle. Momentum causes the bodies to graze each other while exchanging some mantel material and afterwards they separate again. Generally, the final bodies have similar masses to the original pair but smaller relative velocities. Hit-and-run collisions can be further divided into two subclasses. When the first collision speed is only slightly larger than the mutual escape velocity, the bodies collide, separate but in a second collision they merge. These are the so called graze-and-merge collisions. On the other hand when the initial impact velocity is higher, the two bodies separate with a relative speed larger than the escape velocity, which is not followed by a second impact. These are true hit-and-run collisions. The boundary between these two regimes has been clearly identified in a second series of impact simulations carried out by Genda et al. (2012).
According to the definition of Kokubo & Ida (1996) runaway growth means that the largest body in its feeding zone grows more rapidly than the second largest one and the ratio of the mass of the largest body Mmax and the mean mass m grows monotonically with time. In their 3D calculations they presented evidence of runaway growth and the results in this work summarized in Fig. 17 also supports this observation. As the single mass distribution relaxes into the continuous power-law mass distribution more and more collisions may involve a larger and smaller body, i.e. γ ≤ 0.2 hence partial accretion dominates therefore the larger the body is the faster it grows. The numerical simulations of Aarseth, Lin & Palmer (1993) showed that runaway coagulation occurs and can lead to the formation of protoplanetary cores. In their 2D simulations in the case of larger eccentricities (hot accretion) the degree of runaway growth can be the same as in 3D. In the 3D simulations of Kokubo & Ida (1996) the ratio Mmax/ m ≈ 140 at t = 20000 yr, in their 2D one the ratio ≈ 15, while in the work of Aarseth, Lin & Palmer (1993) it was ≈ 13. Using the data of the present work preliminary calculations show that Mmax/ m ≈ 10 at t = 20000 yr. It must be noted that the above mentioned works can not be directly compared because on one hand of the significant differences between the treatment of the gravitational forces and the collisions and on the other hand the different initial Figure 17. The frequency of different type of collision as a function of γ. Note that the horizontal axis is reversed in order to reflect the progress of time, as it is shown by the upper arrow. The symbols indicate the type of collision: diamonds for mergers, blue circles for partial accretion, triangles for collisions that eroded mass from the target, and squares for hit-and-run. The collisions were collected from a multiple run data with f = 1 and ǫ = 10 −13 .
conditions. However, the general conclusions and trends are similar. In the present work the main focus is on the statistics of the collision parameters and hence growth is studied from the collision point of view, therefore these results are valid for both 2D and 3D cases. The issue about the orderly and runaway growth as well as the mass distribution of the bodies and its time evolution will be investigated in a next paper.

Comparison with previous works
A direct comparison to previous works is hard to present because of the different simulation setups. The most important is that the present simulations were performed in 2D while the others in 3D. The initial conditions, the f expansion factor of the radii differs from work to work and the values in the tables or figures are given for different γ or γ is not exactly specified in the papers.
To estimate the effects of including the collision model of LS12 on planet formation  analyzed the impact parameters of previous N -body simulations that used only perfect accretion (O'Brien et al. 2006;Raymond et al. 2009). The authors divided the collision outcomes from the simulations into 3 groups: group 1 consists of all collisions from eight different simulations in O'Brien et al. (2006), group 2 includes all collisions from 40 simulations by Raymond et al. (2009) while group 3 contains only giant impacts from the latter simulations. Group 1 and 2 were subdivided into two subclass containing collisions between embryos and planetesimals (γ ≤ 1 40 ) and between only embryos (γ ≥ 1 10 ). The predicted collision outcome statistics from Table 1 of  is summarized in the columns from 5 to 9 in Table 11.
The simulations of Chambers (2013) also include the LS12 model to take care of fragmentation and hit-and-run collisions. The collision statistics of these calculations using Table  1 and Figs. 5 and 6 of Chambers (2013) are reported in the 10 th and 11 th columns of Table 11.
In a recent work Bonsor et al. (2015) performed N -body simulations to track the change in the bulk Mg/Fe and Si/Fe ratios of the protoplanets. The N -body code used the LS12 model to realistically model collisions. However, the paper did not present the collision statistics in a tabular form, this kind of data could be extracted from Figure 4 of the paper.
The approximate values are presented in the last 2 columns of Table 11. Similar runs were performed by Carter et al. (2015) and the collision statistics derived from Figure 5 of the paper is qualitatively the same as in Bonsor et al. (2015).
The impacts for multiple run data with f = 1 and ǫ = 10 −13 of this work is presented in the first three columns of Table 11, where the 2 nd column list data for all γ, 3 rd column for giant impacts with γ ≥ 1 10 and the 3 rd column shows for planetesimal impacts with γ ≤ 1 40 . These specific values were selected since the initial γ between the planetesimals and embryos were similar in .
Comparing the giant impacts of this work (3 rd column of Table 11) with that of  (5 th -7 th columns) one see similarities and significant differences. The perfect merge ratios roughly agree in all cases except in group 1 where 0 merge event was detected, but the sample size in that group is at least an order of magnitude smaller than in the other ones. The hit-and-run case represents 29.9% of all events in the present work and this value agrees well with that of group 1 (31.3%), 2 (27.7%) and 3 (28.2%). The partial accretion measured in this work is typically larger by a factor of ≈ 1.5, the best agreement is with group 2: 44.7% vs 39.2 %. Significant differences were found in the case of erosion: the present study found 24.9% while this quantity is less than 3% in all the groups. The reason for this large difference is unclear since the impact velocity distribution measured in this study (see Figs. 10,11,12 and 15) and those presented in   (Figs. 2 and 10) show good agreement. This large discrepancy deserves careful further studies since erosion plays a crucial role in the formation and the final bulk composition of the terrestrial planets. Notwithstanding, the special case of erosion the super-catastrophic events ratio of the present work (0.2%) matches quite well with the value in group 3 (0.3%).
Analyzing the planetesimal collisions detected in the present work (see the 4 th column) and that of  (8 th , 9 th columns) a very good agreement can be observed: partial accretion 76.8% vs. 72.6% and 69.4%, hit-and-run events 22.2% vs 23.6% and 25.4% and there was no super-catastrophic event in all three simulations.
Comparing all impacts of this work (2 nd column) with that of Chambers (2013) (10 th column) one sees that in overall the probabilities of different collision outcomes are similar. The results for partial accretion and erosion should be highlighted, 29.8% vs 32.4% and 24.1% vs 19.7%, respectively. These present a far better agreement then in the previous comparison. Also the hit-and-run events match closely: 45.7% vs 44.3%. On the other hand the comparison of giant impacts (3 rd column vs 11 th ) shows significant differences: the perfect accretion calculated in the present work is 0.3% much less than the 25% reported by Chambers (2013), the erosion in this work is 24.9% vs 0%. The values for partial accretion and hit-and-run broadly agrees. It must be noted that in the work of Chambers (2013) γ was not specified, these values were derived from the embryo-embryo collisions taken from the last row of Table 1 in Chambers (2013). Since the collision outcome depends strongly on γ these values are only approximate and therefore they are only indicative.
The last 2 columns contains the collision statistics compiled from Figs. 4 and 5 of Bonsor et al. (2015) both containing collision events with γ ∈ (0, 1]. The t = 0 means the beginning of the simulation while t = 6 × 10 5 is the end of it. These values should be compared with the 2 nd column and one see that in the present work perfect merging is about 20 times less than in Bonsor et al. (2015). The partial accretion is almost equal, while the hit-and-run ratio of this work is 45.7% which is in between the values of 50% and 36% at t = 0 and t = 6 × 10 5 , respectively. Again the erosion is largely different: at t = 0 it is 24% vs 3%, but this difference decreases and the erosion reaches 16% at the end of the simulation.
Comparing the works enumerated in Table 11 with each other the differences are large and the results vary in large intervals. On one hand this is a natural consequence of the different simulation setup, applied numerical integrator method and expansion factor but on the other hand this issue has to study further in detail; especially the cause behind the large difference in the measured erosion and perfect accretion.
I note that the model of LS12 applies a different definition for the escape velocity: where M ′ = mt + minteract. The minteract denotes the interacting mass of the of the projectile estimated to be involved in the collision. Since minteract ≤ mp therefore it may be a contributing factor in the difference of perfect mergers in Bonsor et al. (2015) and Chambers (2013).

CONCLUSIONS
I have performed 2D N -body simulations of planetary accretion. The statistics of the collision parameters is investigated with 10000 equal-mass protoplanets under perfect and gasfree accretion. In this paper the detailed statistics of the collision parameters are presented along with a simple method to improve the collision parameters. Using the two-body problem and the conservation laws I explained the main features of the observed impact velocity distributions. The collision outcome maps are shown for specific projectile-to-target mass ratios. Combining the results of the simulations with the model developed by LS12 estimates for the different type of collisions are given as well as detailed comparisons with previous works on collision outcomes and frequencies.
The main conclusions of this study are: • Using the two body approximation for the colliding bodies a simple method was presented to improve the impact parameter b to b ′ by Eq. (14) and the impact velocity Vi to V ′ i by Eq. (15). Using V ′ i the improved specific energy Q ′ R can be computed.
• It is apparent from Figs. 8 and 9 panel b) that the pdf curves of b and b ′ are similar and from a statistical point of view they do not deviate, i.e. the improvement of the impact parameter is not too remarkable in their pdfs. Similarly the pdfs of the improved impact velocity from panel c) of the same figures show little difference, although the largest discrepancy is for the most important interval. These findings are also supported by Fig. 10, where the results are directly compared for ǫ = 10 −10 and ǫ = 10 −13 . I emphasize that the correction is important for the individual cases and it is obvious that the improvement of b and Vi is essential and all future simulations that use the model of LS12 (or similar) should incorporate it to provide more reliable results.
• It was shown that in 2D the impact parameter has a uniform distribution within [0, 0.95], and for b > 0.95 the pdf drops off slightly.
• According to Fig. 11 the different runs produce the same distribution of b ′ , V ′ i and Q ′ R therefore it is enough to do only one run to obtain creditable statistical data.
• Making use of the two body approximation it was possible to explain the hiatus of impact speed less than 1 Vesc for f = 1 which is a consequence of the conservation laws. The derived formula gives an excellent approximation of the minimum impact speed for different f .
• It was shown that the impact parameter b does not depend on f as it is presented by Fig. 12. On the other hand the distribution of Vi is a strong function of f , the pdf of the impact speed shifts toward zero as f increases, see Fig. 13. This behaviour was explained by the two body approximation shown in Fig. 15.
• The model of LS12 was used to determine the number of different types of collisions, see Fig. 16 and Table 10. For equal-mass bodies the majority of collisions (43%) are hitand-run events and erosion (40%). Partial accretion constitutes 17% of all cases and 0.05% mergers were measured. The frequencies of different events are summarized in Fig 17. • A thorough comparison with previous works was presented and promising similarities along with significant differences were found. The most significant difference is in the case of erosion, where this work reports 25%, while all the others are much smaller, less than 3% expect for one case of Chambers (2013) where it reaches approximately 20%. The ratio of perfect merge roughly agree with those of  but differs largly from the results of Chambers (2013); Bonsor et al. (2015). The hit-andrun values match reasonably, while the partial accretion ratios broadly agree with other values. It was also shown that serious differencies are present among the previous works.
• The proportion of partial accretion increasing more and more steeply as min(γ) decreases, i.e. as the time increases, see Fig. 17. For γ ≤ 0.1 the majority of collisions (≈ 70%) is partial accretion and 25% is hit-and-run events (comparable values were found by ). A significant fraction of the latter events are graze-and-merge collisions which also leads to accretion. These results provide a independent evidence for the runaway growth mode described in detail by Kokubo & Ida (1996). The implications for planet formation is that the larger the difference between the masses of the objects (γ is small) the larger the probability for collisions in which M lr grows. This can further decrease min(γ) and thus induce a positive feedback favoring more intensive mass growth.
Below I summarize some technically important findings: • The run-times for the two accuracy parameters were Table 11. Predicted collision outcome ratios calculated from the work of ; Chambers (2013); Bonsor et al. (2015) and from the present simulations. The number of graze-and-merge events is denoted by ngm. In the work of Chambers (2013) the nm/n contains also the graze-and-merge events.
compared and it was shown that the simulations for ǫ = 10 −13 takes about 40% more time to complete than those with ǫ = 10 −10 .
• It was shown that the scaled mean run-time is inversely proportional to f for both accuracy parameters. The basic cause of this is the specific decrease of the number of bodies.
• Using the model described in section 3.3 the parameters of the collision computed from the simulations can be improved. Comparing the improved values it turned out that there are not too significant difference between the distributions, i.e. from a statistical point of view the lower accuracy simulations already provide useful statistical data. Table A1. Summary of simulation outcomes and results. Each row corresponds to one single run data, while the run with 1 -10 to multiple run data and gives the range within which the sample values fall for ǫ = 10 −10 and ǫ = 10 −13 , n is the number of collision. ǫ = 10 −10 ǫ = 10 −13 f run n min(V