Identikit 2: An Algorithm for Reconstructing Galactic Collisions

Using a combination of self-consistent and test-particle techniques, Identikit 1 provided a way to vary the initial geometry of a galactic collision and instantly visualize the outcome. Identikit 2 uses the same techniques to define a mapping from the current morphology and kinematics of a tidal encounter back to the initial conditions. By requiring that various regions along a tidal feature all originate from a single disc with a unique orientation, this mapping can be used to derive the initial collision geometry. In addition, Identikit 2 offers a robust way to measure how well a particular model reproduces the morphology and kinematics of a pair of interacting galaxies. A set of eight self-consistent simulations is used to demonstrate the algorithm's ability to search a ten-dimensional parameter space and find near-optimal matches; all eight systems are successfully reconstructed.


INTRODUCTION
Dynamical modeling of specific pairs of interacting galaxies is a subject with considerable history. Toomre & Toomre (1972, hereafter TT72) bolstered their interpretation of bridges and tails as tidal features by presenting test-particle models of Arp 295,M 51,NGC 4676,and NGC 4038/9; the power of such models was demonstrated when Stockton (1974) confirmed TT72's prediction for the relative velocities of the two galaxies making up NGC 4676. As observational and numerical techniques have improved, modeling of interacting systems has generated an extensive literature (see Barnes & Hibbard 2009, hereafter BH09, for a partial list). The motivation for dynamical modeling has evolved over time. While early studies focused on testing the tidal theory of galactic encounters, more recent work has used dynamical modeling to help interpret observations, probe the structure and dynamics of unseen matter, and reconstruct the dynamical histories of merging galaxies.
Despite the advances of the past few decades, it remains a challenge to create models matching the detailed morphology and kinematics of a pair of colliding galaxies. One fundamental complication is the inherent uncertainty in inferring the distribution and dynamics of dark matter strictly by its effects on luminous material. Another is the violent reprocessing of interstellar material -including rapid star formation -in galaxy collisions. However, there are three rather technical issues which also limit progress: (1) A galactic collision is described by a large number of parameters which interact in highly non-linear ways.
(3) The criteria for a successful match are not easily translated into quantitative terms.
The parameters necessary to simulate an encounter of two disc galaxies fall into three groups, as illustrated in Fig. 1. The first group specifies the initial orbits of the galaxies; assuming these orbits are asymptotically Keplerian at early times, the required parameters are the periapsis separation p, the orbital eccentricity e, and the mass ratio µ. The second group describes the spin vector of disc d (where d = 1, 2) with respect to the angular momentum of the relative orbit and the separation vector between the galaxies at periapsis; this vector is parametrized by the inclination i d and argument to periapsis ω d of each disc. Together with any parameters needed to describe the internal structures of the two galaxies, the first and second groups specify the initial conditions for a galactic encounter. The third group consists of the time t since first periapsis, and parameters which map the simulation onto the observational plane: three Euler angles θα specifying the viewing direction, Figure 1. An abstract representation of the sixteen-dimensional parameter space of galaxy interactions. The radial coordinate represents the initial orbit, the azimuthal coordinate represents the disc orientations, and the vertical coordinate represents the parameters chosen after a simulation is run. A conventional N-body simulation explores the parameter subspace represented by the dotted line, while a single Identikit simulation explores the entire cylindrical surface.
scaling factors L and V which transform dimensionless simulation positions and velocities, respectively, into real physical quantities, and the centre-of-mass position on the plane of the sky Rc and radial velocity Vc. These parameters may be chosen after a simulation has been run.
Of these sixteen parameters, only a few have a priori constraints. TT72 argued that the orbital eccentricity should be e ≃ 1; this is generally supported by cosmological simulations (Khochfar & Burkert 2006), although the e distribution extracted from these simulations includes a tail to e < 1. The mass ratio µ may be estimated from the relative luminosities of the two galaxies -provided that the galaxies are still distinct and that interaction-induced star formation has not significantly altered their luminosities. Finally, the scale factors L and V are are not completely arbitrary since the pre-encounter galaxies should have radii and circular velocities comparable to those of other disc galaxies.

Identikit 1
Identikit simulations combine test-particle and self-consistent techniques (BH09). Each galaxy is modeled by an initially spherical configuration of massive particles with cumulative mass profile m(r), in which is embedded a spherical swarm of massless test particles on initially circular orbits with angular momenta uniformly distributed over all directions. Two such models are launched towards each other with orbital parameters (p, µ, e). During the ensuing encounter, the massive components interact self-consistently, approximating the time-dependent potential and orbit decay of a fully self-consistent galactic collision. The test particles mimic the tidal response of embedded discs with all possible spin vectors; once such a simulation has been run, selecting the appropriate subset of test particles yields a good approximation to the tidal response of any particular disc.
In the simplest Identikit implementation, the test particles initially populating each galaxy model have the same radial distribution as the discs they are intended to mimic. Each test particle i is associated with a normalized vector si ∈ S 2 which records the direction of the particle's initial angular momentum with respect to the centre of its parent galaxy. An Identikit simulation yields a Monte Carlo representation of an 'extended' distribution function, This function gives the phase-space number density at time t of test particles from disc d with position r, velocity v which initially had angular momentum direction s. Here and throughout, ' . =' is used throughout to indicate an explicit Monte Carlo expression. On the right-hand side, ri(t) and vi(t) are the position and velocity of test particle i at time t; these are understood to also depend on the orbital parameters (p, µ, e). Once this extended function has been constructed, the distribution function f d for a disc with a specific initial spin s0 is estimated by wheres0 = {s ∈ S 2 | s · s0 ≥ 1 − σ} and σ ≪ 1 is a tolerance parameter which determines the solid angle contributing to the estimate of f d .
The reason why f d is estimated by integrating over a finite solid angles0 is not obvious; it may seem enough to simply evaluate g d (r, v, s0; t). However, g d and f d are both represented in a Monte Carlo fashion. To sample f d well enough for a visual comparison with observational data requires a few thousand particles; if g d is represented by Ntest ∼ 10 5 to 10 6 test particles per galaxy, this requires σ ≃ 10 −2 . This simple version of Identikit has some drawbacks. Only a small percentage of the test particles are initially placed at large radii where they are responsive to tidal forces; this wastes computer time. Moreover, simulated discs defined by si · s0 ≥ 1 − σ have scale heights which increase linearly with radius and look unrealistic when viewed edge-on. Both of these flaws can be addressed by radially biasing the distribution of test particles. Following BH09, the test particle density is multiplied by a factor of r 2 , and each test particle i is given a weight ξi = max(r init i /rmin, 1) −2 , where r init i is the initial orbital radius of particle i, and rmin is a small cut-off radius. The extended distribution function is then and the expression for f d becomes wheres0(ξ) = {s ∈ S 2 | s · s0 ≥ 1 − σξ}.

DERIVING DISC SPINS
Identikit 1 uses the extended distribution function g d in a 'forward' mode which allows a rapid exploration of the outcome for any choice of initial disc spin. However, the same function can also be used in an 'inverse' mode, in which the fact that a tidal feature populates certain regions of phase space can be used to solve for the initial spin vector of the disc it came from. Let q = (r, v) ∈ R 6 be a point in phase space, andq ⊂ R 6 be a region of phase space which is populated by tidal material from disc d. Define a density on the sphere of spin directions S 2 : where qi(t) = (ri(t), vi(t)). This function describes the inverse image of the material populating the phase space regionq at time t mapped back onto the sphere of initial spin directions. As will be seen shortly, the inverse image of a regionq may span a wide range of initial disc spins s, and the function Ω d can have a rather complicated behavior. Nonetheless, Ω d usually has compact support, meaning that only a subset of all possible disc spins can produce tidal features populatingq. Now consider a set of nreg phase-space regionsqj (where j = 1, . . . , nreg) which sample a given tidal structure. For example, imagine that disc d has produced a tidal tail, and that the regions are distributed along this tail. Each of these regionsqj can be populated by some range of initial spins, defined by the set of s for which Ω d (s;qj, t) > 0. However, the tail as a whole is presumably populated by one disc with a unique spin s0, and this spin must lie within the intersection of the images of the individual regionsqj .
To find this intersection numerically, it's convenient to work with a smoothed and normalized version of Ω d . Define where w(∆s) is a smoothing function with compact support, and the factor K(q) normalizes the maximum value of Ω d (s;q) to unity. Smoothing replaces the pointillistic Monte Carlo representation of Ω d with a continuous version in which the value of Ω d (s) reflects the density of points near s; accordingly, the radius of the smoothing function should be somewhat larger than the mean separation on S 2 between the points. The advantages of normalization will be illustrated later. Given smoothed functions for a set of regionsqj tracing the tidal features of a single disc, define the product function This function may vanish everywhere on S 2 , it may have a complicated structure, perhaps with multiple peaks, or it may have a simple structure with just one peak. If it vanishes everywhere, then no single disc spin can populate all of the target regionsqi; this may indicate that model parameters such as the time t since periapsis are incorrect. If it has a complicated structure, then there are many initial configurations which can populate all of the target regions. But if Ω * d has a single peak at s0 and vanishes everywhere else except for a small surrounding neighborhood, then the true spin of the disc must be close to s0 -provided that t and other model parameters are accurately known.

Observational constraints
Observations of distant galaxies can constrain only some components of phase-space. Let the observations of a pair of interacting galaxies be described using coordinates R = (X, Y, Z), with the Z axis coinciding with our line of sight to the system, so that (X, Y ) is parallel to the plane of the sky. For simplicity, assume that the origin of these coordinates lies near the system's centre of mass. Given a distance to the system, the (X, Y ) coordinates may be expressed in physical units. Likewise, describe velocities using coordinates V = (U, W, V ) = (Ẋ,Ẏ ,Ż), where the origin lies near the systemic velocity. As a rule, we can measure the first two components of (X, Y, Z) and the third component of (U, W, V ), while the other components are effectively unconstrained.
Consequently, if tidal material is detected with a specific position and line-of-sight velocity, we can only say that the phase-space region it occupies has finite extent in some directions and infinite extent in others. For example, consider a phase-space regioñ Here (XQ, YQ) specifies a point on the plane of the sky, RQ is a radius, VQ specifies a line-of-sight velocity, and SQ is a velocity range. Such a regionQ might, for example, correspond to a voxel in a HI data-cube. Simulation coordinates r and v may be projected onto the observational plane via Here the rotation matrix M = M(θα), which depends on the Euler angles, rotates simulation coordinates to observation coordinates. Assuming the simulation coordinates are dimensionless, the scale factors L and V will have units of length and velocity, respectively. Finally, Rc = (Xc, Yc, 0) and Vc = (0, 0, Vc) specify offsets in the plane of the sky and the line-of-sight velocity, respectively. These offsets vanish if the origins of R and V coincide exactly with the system's centre of mass position and velocity, but this is hard to insure in practice. It is convenient to define an operator P : q → Q which implements the mapping from simulation to observation: The observational version of (5) is then where P depends on the parameters θα, L, V, Rc, and Vc; these parameters are involved because the regionQ is specified observationally. Observational versions of Ω d and Ω * d can be defined analogously and likewise depend on these parameters.

Proof of concept tests
Since observations can constrain only three out of six phase-space dimensions, it's not clear if the algorithm described above can yield well-determined disc spins. One way to find out is to apply the algorithm to data generated using simulated galaxy encounters; if the algorithm reconstructs initial disc spins from simulated observational data then it passes the test. To make the test as clean as possible, the simulated encounters are run using test-particle discs embedded in a spherical self-consistent models, and the same mass model is used in the Identikit simulation to compute g d (r, v, s, ξ; t). This side-steps, for now, two potential complications: (1) the difference between test-particle and self-consistent tidal responses, and (2) any mismatch between the mass model used for the simulated data and the one used for the Identikit computation. This test uses the same composite mass model adopted in BH09, with a bulge/disc/halo mass ratio of 1:3:16. The bulge has a Hernquist (1990) profile, the disc has an exponential surface-density profile, and the halo has a Navarro, Frenk, & White (1996) profile and is tapered smoothly at ∼ 12 disc scale lengths. As in BH09, the calculations are carried out in units with G = 1; in these units the rotation period at 3 disc scale lengths is trot ≃ 1.23. Two discs with spins (i1, ω1) = (30, 0) and (i2, ω2) = (135, 0) are placed on approaching parabolic orbits, reaching periapsis at time t = 0. By time t = 1, disc 1 has produced a well-developed tidal tail. For simplicity, the observation operator P is replaced with the identity operator, so that q = Q. The simulated orbit lies in the (X, Y ) plane, while the system is viewed along the orbital axis Z.
As the left side of Fig. 2 shows, three phase-space regionsQj are defined along the tidal tail of disc 1. These regions trace the run of tail velocity V as well as tail position (X, Y ), but -as (8) implies -have infinite extent in Z and in (U, W ). The right side of Fig. 2 shows contours of the three functions Ω1(s;Qj , t), evaluated at time t = 1. All three regionsQj map to roughly linear features on the spin sphere; each region can be populated by a range of disc spins. However, the three sets of contours overlap in one place on the spin sphere, and the product function Ω * 1 (s, t) has a single peak very near the true disc spin (i1, ω1) = (30, 0). The white contour shows where Ω * 1 (s, t) equals 95% of its peak value. This contour, which neatly encloses the true (i1, ω1), illustrates the uncertainty in the derived spin.
A simple variation of this experiment is shown in Fig. 3, where the argument to periapsis of disc 1 has been changed to Figure 2. A proof-of-concept test of the Identikit 2 algorithm. On the left are (X, Y ) and (X, V ) projections of an encounter between two disc galaxies with spins (i 1 , ω 1 ) = (30, 0) and (i 2 , ω 2 ) = (135, 0); the system is viewed along the orbital axis. The small colored boxes show three phase-space regionsQ j distributed along the tidal tail produced by galaxy 1. On the right is an equal-area projection of one hemisphere of the spin sphere; lines of constant inclination i and argument ω are labeled. The contour plots show the functions Ω 1 (s;Q j , t); their colors correspond to the regions defined on the left. The grey-scale image shows the product function Ω * 1 (s; t); the single white contour is set at 95% of the peak value, and neatly encloses the actual disc spin (i 1 , ω 1 ) = (30, 0). ω1 = 90. This has little effect on the resulting tail's morphology, but yields a different run of line-of-sight velocities as shown on the bottom left of the figure. The three regions defined here have the same (X, Y ) coordinates as in the previous case; only their V coordinates have been modified. In response, the peak of the product function Ω * 1 (s, t) shifts to very near the actual spin (i1, ω1) = (30, 90). Again, the white contour at 95% of the peak shows how well the spin is constrained.
These simple tests demonstrate several key points. First, a single phase-space regionQ defined following (8) doesn't provide enough information to uniquely determine initial disc spin; the various Ω1(s;Qj , t) functions contoured in Figs. 2 and 3 each admit a range of possible solutions. Second, two or more regionsQj tracing a given tidal structure can, at least in theory, accurately determine initial disc spin. For best results the regions used should sample different parts of the tidal feature originating from a given disc, so the resulting Ω d (s;Qj, t) functions are not closely aligned with one another. Third, line-ofsight velocity information is necessary; morphological constraints alone can't distinguish between the two cases presented in this section.

SEARCHING PARAMETER SPACE
The algorithm described above yields disc spins assuming that all other model parameters are already known. This capability may be useful in examining variations on a known solution, but a more comprehensive methodology is required to model real systems from scratch. Some technique for searching parameter space or solving for the other parameters is needed; this section develops a possible approach.

Measuring quality of fit
A numerical measure of the quality of a solution is needed to implement an automated search of parameter space. The function Ω * d (s; t, . . .), where (t, . . .) represents all orbit and viewing parameters, may be useful in this role. If Ω * d vanishes for all disc spins s, then either: (1) The regionsQj do not all originate from a single disc with a unique spin vector.
(2) The adopted orbit parameters (p, µ, e) or viewing parameters (t, θα, L, V, Rc, Vc) are too inaccurate to yield a solution. . Another test of the algorithm. Here, disc 1 has spin (i 1 , ω 1 ) = (30, 90). The three colored boxes have the same locations on the (X, Y ) plane as in Fig. 2, but have been shifted in V to track the run of velocities along the tail. As a result, the function Ω * 1 now peaks very close to (i, ω) = (30, 90). This illustrates the key role velocity information plays in constraining encounter geometry.
The first interpretation may be correct if the disc was initially warped or if some of the regions are attributed to the wrong galaxy. In most cases, however, the second interpretation may be taken as a working hypothesis.
Extending this line of thought, it seems plausible to prefer model solutions which yield higher peak values of Ω * d . Let be the maximum value of Ω * d obtained for any s ∈ S 2 . The peak value basically measures the degree to which the functions Ω d for different regions overlap with one another; a solution in which the ridge-lines of these functions intersect is probably better than one in which their outskirts barely touch.
Just two regions, suitably chosen, suffice to constrain a disc's spin if all other parameters are known. However, for nreg = 2 regions the peak function Λ d (t, . . .) will be fairly insensitive to the viewing and orbit parameters. An inaccurate choice for these parameters will probably shift the locus of the overlap between the images ofQ1 andQ2, but won't appreciably reduce Λ d unless the error is quite large. To make Λ d a useful metric, at least nreg = 3 regions should be used to trace each disc's tidal features, and it seems likely that more regions will yield stronger constraints on the collision parameters.
Finally, a plausible model of a galactic collision must be able to account for the tidal features of both galaxies using the same set of orbit and viewing parameters. Suppose functions Ω * 1 (s; t, . . .) and Ω * 2 (s; t, . . .) have been defined using two different sets of regions, each tracing the tidal features of one of the galaxies. By analogy with (7), let this product, which depends on all orbit and viewing parameters, provides an estimate of the overall quality of a solution.

Centre 'locking'
In many cases, a subset of the viewing parameters may be determined straightforwardly. Even violently disturbed galaxies often have well-defined nuclei which can be accurately located, and a plausible simulation of an interacting pair should reproduce the positions of these nuclei. As BH09 noted, when modeling a system in which two galaxies appear well-separated on the plane of the sky, it's helpful to 'lock' the centres of the models. Locking derives the rotation about the viewing axis θZ, length scale L, and offset Rc by requiring the centres of the models to coincide with the nuclei of the real galaxies. These four viewing parameters, in effect, become functions of the viewing direction (θX, θY ), time since periapsis t, and orbital parameters (p, µ, e). Locking is completely independent of disc orientation, so it can be applied before attempting to solve for disc spins.

Scanning over viewing direction
If the pair of galaxies to be modeled exhibit distinct and well-separated nuclei, it's straightforward to perform a systematic search of all viewing directions. Suppose for the moment that trial values have been chosen for the orbit parameters (p, µ, e), time since periapsis t, and velocity scale V and offset Vc. The basic idea is to iterate over a lattice of possible viewing directions parametrized by the Euler angles (θX , θY ). For each viewing direction, centre locking determines the remaining viewing parameters, and a trial solution for disc spins is obtained using the algorithm in § 2; the best viewing direction is the one which yields the largest value of Λ. In effect, this prescription surveys a ten-parameter subset of the full parameter space (Fig. 1) at the cost of a blind search of only two parameters.
In the present implementation, the lattice of viewing directions is generated by starting with an icosahedron inscribed within a unit sphere, and replacing every one of its 20 triangular faces with four equilateral triangles, each with one quarter of the original triangle's area. The new vertexes introduced are projected out to unit radius, producing a solid with n face = 80 triangular faces approximating a sphere. This process can be iterated, producing a sequence of ever-better approximations with n face = 320, 1280, 5120, . . . faces.
The midpoints of these faces, which cover the sphere in a nearly uniform fashion, define a lattice of viewing directions. Each face, indexed by k (where k = 1, . . . , n face ), is associated with a viewing direction (θ X,k , θ Y,k ). Locking determines corresponding values for the angle θ Z,k , scale L k , and offset R c,k = (X c,k , Y c,k , 0). Solving for disc spins yields the peak values In addition to finding the viewing direction k which maximizes the product Λ k = Λ 1,k Λ 2,k , the algorithm tabulates values of Λ d,k for all faces.

TESTING PARAMETER SEARCH
The approach just outlined only works if the product Λ(t, . . .) can reliably identify good solutions. To establish this, the next test treats the viewing angles θα, length scale L and position offset Rc as unknowns, in addition to the disc spins (i d , ω d ).
Conversely, the orbit parameters (p, µ, e), time since periapsis t, and velocity scale V and offset Vc have known values. While this test therefore doesn't search the full parameter space, it does survey a non-trivial subset; in particular, this test aims to determine if Identikit 2 can reconstruct the encounter and viewing geometry for well-separated pairs of interacting galaxies. Fig. 4 presents a sample of equal-mass (µ = 1) initially-parabolic (e = 1) galaxy encounters with random disc spins and viewing angles. Out of the 36 self-consistent encounters BH09 used to test Identikit 1, these are the eight with the largest periapsis separations; they include an equal mix of direct and retrograde discs. All eight are 'observed' at ttrue = 1, one model time unit after first periapsis. At this time, these pairs all have well-separated centres, insuring that centre locking will work effectively.
Following BH09, the Identikit calculations use a spherically-averaged version of the same mass model employed in the self-consistent simulations. Since the range of actual periapsis separations ptrue represented in this sample is fairly small, all eight were matched using a single Identikit simulation with p comparable to the mean separation. This simulation was evolved to t = 1, exactly matching the actual time.
Five regions, shown in Fig. 4, were used to trace the the tidal features of each disc. For the most part, these regions simply follow the morphology and kinematics of the tidal structures seen in (X, Y ), (X, V ), and (V, Y ) projections of the self-consistent simulations. In some cases, however, these projections don't fully delineate the information contained in a (X, Y, V ) data-cube. An interactive routine was therefore used to evaluate the amount of tidal material in each region as its parameters were adjusted; this insured that the regions actually contain significant amounts of material even when the projected views were ambiguous. Of course, the same approach could be used when working with observational data.
Once these regions were defined, a preliminary model for each system was found by scanning over a lattice of n face = 320 to 1280 viewing directions. In two cases (encounters 2 and 5), the algorithm failed these preliminary tests. The nature of these failures will be discussed shortly, but fairly minor adjustments to the regions sampling just one of the two discs were enough to resolve both cases. With these adjustments made, final models for all eight systems were obtained by scanning n face = 5120 viewing directions. Each of these final models took about one hour of computing time on a 3 Ghz Intel processor; this time could be substantially reduced by optimizing and parallelizing the code. Fig. 5 shows these eight models (points), overplotted on the self-consistent simulations (grey-scale images). The algorithm successfully reproduced the morphologies and kinematics of all eight systems; by the standards of BH09, these are all 'good' matches.
The algorithm selects viewing direction by maximizing the product Λ 1,k Λ 2,k , but how do the constraints provided by the two discs work together? Fig. 6 shows how Λ d varies as a function of viewing direction. Each panel shows a perspective view of the viewing direction lattice, with the viewing direction selected by the algorithm situated dead centre. The color of each triangular face shows Λ 1,k and Λ 2,k , using red for disc 1 and green for disc 2. The eight models yield a wide range of 'landscapes' on this spherical lattice. Somewhat predictably, models 6 and 8, in which both discs display only subtle tidal disturbances, yield diffuse Λ d functions with multiple local maxima. More pronounced tidal disturbances, such as the classic 'bridge and tail' structures seen in both discs of models 1 and 4, and in one disc each in models 2, 3, 5, and 7, generally yield more localized Λ d functions; while multiple maxima may still appear, they are closer together. In models 2, 3 and 5, the individual Λ d functions appear nearly disjoint, only overlapping for a small range of viewing directions. Model 4, in contrast, produces Λ d functions which are roughly aligned with each other, although one function is considerably more diffuse than the other. Clearly, the intrinsic tidal response, viewing direction, and strategy used to pick regions are all factors in determining the landscape of the resulting Λ d functions; further experimentation with a larger set of encounters may help to tease these factors apart. In practice, the method used to select viewing direction works quite well for this set of galaxy encounters. Fig. 7 shows the cumulative distribution of angular errors in viewing direction ∆view. Numbered points represent the eight Identikit 2 models from Fig. 5, while the light grey dots show the set of 36 Identikit 1 models from BH09. The median error in viewing direction for the Identikit 2 models is ∆view ≃ 5.6 • , which is less than half the median error for the Identikit 1 models. Moreover, all of the new models are good fits; the tail of poor solutions represented by the grey dots extending to the upper right is absent. Of course, this comparison is not entirely fair; BH09 also fit for the time t, periapsis separation p, and velocity scale V, while here these parameter values are pre-determined. It's nonetheless remarkable that models 6 and 8, which would probably be quite tricky to model by hand, are among the most accurate of the eight models plotted. Inspection of Fig. 6 suggests that the best solutions arise when the peaks of the two Λ d functions fall fairly near each other, while somewhat less accurate results are obtained when the selected viewing direction is a compromise between separate peaks, as in models 2, 3 and 5. Fig. 8 plots the cumulative distribution of angular errors in disc spin direction ∆spin. Here the numbered points represent the sixteen discs from Fig. 5, color coded as in that figure. Given that the viewing directions for the eight Identikit 2 models are all determined with an error of ∼ 10 • or less, it's not too surprising that the spins of the individual discs are also well determined. The median error for this sample is ∆spin ≃ 8.0 • , again less than half of the analogous error for the BH09 sample (grey dots). In conjunction with Fig. 5, this plot reveals some interesting systematics. As a group, the discs with accurate spin directions tend to be face-on, while those with large errors tend to be edge-on. This is neatly illustrated by model 2, where galaxy 1 (red) is fairly face-on and has a spin error ∆spin = 2.3 • , while galaxy 2 (green) is nearly edge-on and has ∆spin = 17.7 • . The same pattern is also seen in model 8 (2.3 • vs. 10.8 • ) and to a smaller degree in model 6 (4.8 • vs. 7.9 • ). This trend may arise because face-on discs present truly two-dimensional velocity fields, while edge-on discs collapse the two spatial dimensions into one; in effect, the former provide more information. On the other hand, the error in spin direction ∆spin does not appear to correlate with disc inclination i. The four discs at the bottom of Fig. 8, which have the most accurately determined spins, are an equal mix of direct (i < 90 • ) and retrograde (i > 90 • ) passages; the four at the top include three retrograde passages, but the three discs just below this group are all direct. This is somewhat unexpected since direct passages produce pronounced tidal bridges and tails, which would seem to offer better constraints on model solutions; it is, however, consistent with the very accurate results already noted for the purely retrograde encounters 6 and 8.
The colored lines in Figs. 7 and 8 show results for different versions of the disc spin algorithm ( § 2). The red curves in these figures were obtained by calculating Ω d (5) with all particles weighted equally (in effect, setting ξi = 1). The blue curves result from computing Ω d (6) without normalizing the peak value to unity (setting K(q) = 1). Finally, the green curves combine both options (setting ξi = K(q) = 1). For the most part, these different variants produce similar results. Equal weighting, which effectively overrepresents the outer regions of the initial discs, may yield a modest increase in spin direction accuracy. Conversely, doing without normalization produces less accurate models, although in most cases the differences are slight -the exceptions will be discussed next. Fig. 7 shows that two of the models constructed without normalizing Ω d (blue curve) are clearly very poor solutions. These models have errors in viewing direction of ∆view = 63 • and 79 • , which induce comparable errors in spin direction, as shown in Fig. 8. A similar failure, with ∆view = 72 • , was seen in a preliminary model for encounter 5. In all of these cases, the algorithm selected a viewing direction roughly parallel to the line between the two galaxies, so that the centres of the models appear close in projection. Because the model centres are forced to coincide with the nuclei of the self-consistent galaxies, these viewing directions yield very large values for the scale factor L. As a result, the transformation from simulation to observation coordinates (9) splatters particles into all of the regionsQj used to constrain model solutions. Since both model galaxies are scaled by the same L, both Λ d functions have local peaks for the same viewing direction, and their product Λ rises higher than the peak representing the true viewing direction.

Failed models
This mode of failure is more likely when Ω d is not normalized because the particles splattered into theQj regions often Figure 7. Cumulative distribution functions for the error in viewing direction ∆ view . The numbered points show results for the eight final models. Colored lines show variants of the algorithm: red shows results for equal particle weights (ξ i = 1), blue for unnormalized peaks (K(q) = 1), and green for both together (ξ i = K(q) = 1). The light grey dots show results from BH09. Figure 8. Cumulative distribution functions for the error in spin direction ∆ spin . The numbered points show results for the eight final models, with red for disc 1 and green for disc 2. Colored lines show results for variants of the algorithm as in Fig. 7. The light grey dots show results from BH09; one dot, off-scale to the left, is not plotted come from the inner parts of the galaxies and therefore have relatively large weights ξi. These high-weight particles help to push the false peak of Λ above the one corresponding to the true solution. If, in addition to not normalizing Ω d , all particles are given equal weight (green curve in Fig. 7), then viewing directions which scatter particles from small radii far and wide are less favored and failure becomes less likely. However, using equal weights when the test-particle distribution is actually biased by a factor of r 2 seems rather arbitrary. In practice, failed solutions such as these are easy to recognize visually, and it would also be straightforward to detect them automatically. One way to do so is to place astrophysically motivated limits on L; for example, a solution which requires discs an order of magnitude larger than the discs of real galaxies could presumably be rejected out of hand. Another way is to detect and reject solutions which splatter particles far beyond the actual extent of the tidal features. For completeness, the other failure of the algorithm, which occurred in the first attempt to model encounter 2, should also be mentioned. In this encounter, disc 2 is nearly edge-on, and presents a wide range of line-of-sight velocities. The two small regions on either side of the centre of this disc in Fig. 4 were initially set to opposite extremes of this velocity range. But as BH09 showed, Identikit models with test particles on circular orbits generally don't reproduce the full velocity width of self-consistent discs. Consequently, Λ 2,k vanished for almost all viewing directions, and did not overlap at all with Λ 1,k . Reducing the range of velocities sampled by these two regions produced the satisfactory solution shown in Fig. 5.
Finally, it's worth noting that these failures are only partial. In every case, an acceptable solution could be recovered by ignoring one of the two Λ d functions and selecting a viewing direction corresponding to a peak of the other. With a good match to the viewing direction selected, the algorithm could then compute good solutions for the spins of both discs.

DISCUSSION
The problem of finding a model matching the morphology and kinematics of a pair of colliding galaxies has generally been solved by a process of trial-and-error, informed by physical insight into the dynamics of tidal interactions. In this approach, observational data can't be used to derive initial conditions directly; instead, the outcome of a model calculation is compared to the observations, and the initial conditions are modified on the basis of this comparison. Identikit 2 offers a shortcut: an important component of the initial conditions -the initial spin vectors of the interacting discs -can be derived directly from the observed morphology and kinematics of the tidal features. Moreover, by providing a way to assess the quality of a solution, the algorithm can be used to search parameter space automatically.
The Identikit 2 algorithm derives the initial spin vector of a tidally interacting disc by simultaneously populating a set of phase-space regions which trace that disc's tidal features. It may seem odd not to make any use of the amount of material found in each region, but this 'omission' is deliberate. The tracers commonly used to study the kinematics of interacting systems don't necessarily obey a continuity equation; for example, neutral hydrogen may be ionized or converted to molecular form, so the amount of HI found in a given region can't be predicted by purely dynamical models. On the other hand, even if much of the HI in a tidal feature has been converted to another phase, the remaining HI may still provide a useful constraint on disc spin as long as it has followed a free-fall trajectory.
Since the algorithm does not use a χ 2 statistic to quantify goodness-of-fit, it's not straightforward to obtain precise confidence limits on solutions. Nonetheless, inspection of the function Λ and its constituent factors (Fig. 6) offers some insight into a solution's uniqueness and accuracy. It may be worth exploring the landscape of this function in more detail. For example, instead of simply maximizing Λ, the algorithm could examine solutions around the peak, and look for secondary peaks which may represent alternate solutions. A similar examination of the functions Ω * d could likewise reveal uncertainties and alternate solutions for spin direction.

Other parameters
The version of the algorithm tested here performs a blind search of viewing direction, parameterized by (θX , θY ). It constrains four other viewing parameters -the line-of-sight rotation θZ, the length scale L, and two components of the offset Rc -from the positions of the galaxy centres, and computes initial disc spins (i d , ω d ) using the regions tracking each disc. This accounts for ten parameters out of the sixteen described in Fig. 1. Preliminary experiments indicate that some of the other parameters can also be determined by maximizing Λ(t, . . .). For example, the test in § 4 was repeated varying the time since periapsis t between 0.5 and 1.5; the times maximizing Λ clustered around the actual value ttrue = 1, with an r.m.s. of 0.18. Further tests with multiple parameters in play are necessary; it will be interesting to see if the algorithm can simultaneously fit for the time t, periapsis separation p, and velocity scale V as well as BH09 did.
The number of unknowns remaining depends on the type of system to be modeled as well as the nature of the data available. Suppose that accurate systemic velocities for both members of a pair of well-separated galaxies are available. By forcing the model nuclei to coincide with their real counterparts in velocity as well as position, locking can determine the velocity scale V and offset Vc in addition to θZ, L, and Rc. Adopting e = 1 and using photometry to estimate the mass ratio µ leaves just four parameters -the viewing direction (θX , θY ), the periapsis separation p, and time since periapsis t -as unknowns. In this case a blind search of these parameters seems reasonable. However, sufficiently accurate systemic velocities may be hard to determine; galactic nuclei have large velocity dispersions, and different tracers (stars, Hα, HI, CO) often give results differing by several tens of km/s. Only in cases where the systemic velocities of the nuclei differ by more than the uncertainties is velocity locking likely to yield useful constraints.
At the other extreme, fully merged systems such as NGC 7252 (Schweizer 1977;Hibbard et al. 1994;Hibbard & Mihos 1995) represent the most difficult class of objects to model. The position and systemic velocity of a merger remnant constrain the offsets Rc and Vc, but even assuming e = 1, a total of eight orbit and viewing parameters (Fig. 1) remain indeterminate.
Since the real galaxies have already merged, only models run past merger need be considered. Nonetheless, the available parameter space is still very large, and blindly searching for possible solutions may not be very rewarding. Several groups have used genetic algorithms to automate the search for models of interacting galaxies (Wahde 1998;Theis & Kohle 2001). In these algorithms, a population of candidate solutions compete to match the observational data; the less successful candidates are eliminated, and the most successful reproduce to replenish the population. After enough generations have passed, the population converges toward a solution matching the observations. Unlike simple 'hill-climbing' strategies, genetic algorithms are unlikely to be trapped by local maxima (a familiar problem with naive trial-and-error modeling). To date, most genetic algorithms have determined the reproductive fitness of candidate solutions by comparing low-resolution images of the actual system and candidate pixel by pixel, with only limited use of velocity data (Wahde & Donner 2001). For such a comparison to be truly meaningful, the observed material -typically stars or HI -must obey a continuity equation; as noted above, this may be violated in real systems. Moreover, as Figs. 2 and 3 illustrate, morphology alone is not enough to strongly constrain disc spins; low-resolution versions of the (X, Y ) images in the upper left of these figures would be almost indistinguishable.
If further tests confirm that maximizing Λ(t, . . .) is an effective way of constraining viewing and orbit parameters, it may be possible to combine Identikit 2 with a genetic algorithm. In this hybrid approach, each member of the candidate population would define a specific choice of viewing direction, periapsis time and separation, velocity scale, and possibly orbital eccentricity. The remaining viewing parameters would be determined by locking the centres, and initial spin directions could then be derived directly. The resulting Λ value could be used to determine the fitness of the candidate solution. This approach combines the strengths of both algorithms. A genetic algorithm should be able to out-perform a blind search without getting stuck on local maxima. Meanwhile, Identikit 2 could efficiently determine disc spins and provide a robust way to define reproductive fitness which fully includes velocity information and does not assume continuity.

Mass models
The mass models adopted in Identikit simulations will almost certainly influence the algorithm's accuracy. In the tests presented here, the same mass model has been used to construct self-consistent simulations and their Identikit reproductions. However, rotation curves of real disc galaxies exhibit a variety of shapes (Casertano & van Gorkom 1991;Catinella, Giovanelli, & Haynes 2006); small galaxies often have rising curves, while massive galaxies may have flat or even falling curves. This diversity presumably arises because real galaxies have a range of initial angular momenta, bulge/disc/halo mass ratios, and assembly histories (Mo, Mao, & White 1998). Several studies have shown that rotation curve shape (equivalently, potential well or halo structure) strongly influences the development of tidal features; in particular, long tidal tails can be inhibited by sufficiently deep galactic potential wells (Dubinski, Mihos, & Hernquist 1996Springel & White 1999). What happens if the mass model used in an Identikit simulation does not match the structure of the galaxies being modeled?
Suppose for a moment that the Identikit mass model and the real galaxy have the similar rotation curves but apportion mass differently between various components. In this situation the Identikit algorithm will probably yield good encounter parameters despite the mismatch. One test is shown in Figs. 7 and 8, where the results obtained by giving all particles equal weights (red lines) accurately reproduce the viewing and spin directions of all eight test systems. This can be interpreted as an experiment in which real galaxies with exponential discs (surface density Σ ∝ e −αR ) are matched to Identikit models with radially biased discs (Σ ∝ R 2 e −αR ) but identical rotation curves.
At the opposite extreme, suppose an attempt is made to match a pair of galaxies with long, well-developed tidal tails -which imply relatively shallow galactic potential wells -using an Identikit model with a very deep potential well. Since such a model would be unable to produce long tidal tails, it's unlikely that any set of encounter parameters could populate phase-space regions near the ends of the tails. As a result, the algorithm would fail to find a solution. This 'failure' points to a flaw in the adopted mass model; the obvious recourse is to try a model with a shallower potential well.
Between these extremes is a grey area in which Identikit 2 may yield a good match to the morphology and kinematics of an interacting system without providing accurate values for all encounter parameters. Consider the problem of modeling a system observed just after first encounter, in which tidal features have not yet had time to develop and probe the full extent of the galactic potential wells. Some constraint on potential well depth may still be afforded by the relative velocities of the two galaxies, but depth is likely to be degenerate with orbital eccentricity; for example, a large line-of-sight velocity difference may arise because the galaxies have deep potential wells or because their initial orbit was hyperbolic (e > 1).
What can be learned about galactic structure by matching the morphology and kinematics of a pair of interacting galaxies with a dynamical model? This larger question is independent of the specific technique -Identikit 2, genetic algorithm, or trial-and-error -used to produce the model. Clearly there's no simple answer; the outcome will vary from system to system. Identikit 2 offers a practical way to explore this question without laborious trial-and-error modeling. It's straightforward to construct self-consistent simulations with various mass models; these could then be tested against Identikit models with different rotation curves. Of particular interest will be tests varying the ratio of circular to escape velocity, which appears to predict the extent of the tidal tails produced in an encounter (Springel & White 1999;Dubinski et al. 1999).

Coda
Determining additional encounter parameters ( § 5.1) and exploring the results of different mass models ( § 5.2) both present intriguing theoretical problems. Beyond these, the effects of 'pre-existing conditions' such as bars and warps suggest additional lines of investigation. However, the algorithm already appears quite effective at reconstructing galactic collisions. It will be very interesting to see if it works with real galaxy data.
ACKNOWLEDGMENTS I am grateful to John Hibbard for his collaboration in developing Identikit 1, which led me towards the algorithm presented here. The Observatories of the Carnegie Institute of Washington, the California Institute of Technology, and Kyoto University provided support and hospitality during the initial stages of this work. I thank colleagues at these institutions, and at Academia Sinica, Columbia University, the Institute for Astronomy, and the National Astronomical Observatory of Japan for listening to my talks on this project as it developed and offering useful feedback. Finally, I thank the referee for a very positive and constructive report.