Substructure revealed by RR Lyraes in SDSS Stripe 82

We present an analysis of the substructure revealed by 407 RR Lyraes in Sloan Digital Sky Survey (SDSS) Stripe 82. Period estimates are determined to high accuracy using a string-length method. A subset of 178 RR Lyraes with spectrally derived metallicities are employed to derive metallicity-period-amplitude relations, which are then used to find metallicities and distances for the entire sample. The RR Lyraes lie between 5 and 115 kpc from the Galactic center. They are divided into subsets of 316 RRab types and 91 RRc types based on their period, colour and metallicity. The density distribution is not smooth, but dominated by clumps and substructure. Samples of 55 and 237 RR Lyraes associated with the Sagittarius Stream and the Hercules-Aquila Cloud respectively are identified. Hence, ~ 70 % of the RR Lyraes in Stripe 82 belong to known substructure. There is a sharp break in the density distribution at Galactocentric radii of 40 kpc, reflecting the fact that the dominant substructure in Stripe 82 - the Hercules-Aquila Cloud and the Sagittarius Stream - lies within 40 kpc. In fact, almost 60 % of all the RR Lyraes in Stripe 82 are associated with the Hercules-Aquila Cloud alone, which emphasises its pre-eminence. Additionally, evidence of a new and distant substructure - the Pisces Overdensity - is found, consisting of 28 faint RR Lyraes centered on Galactic coordinates (80 deg, -55 deg) and with distances of ~ 80 kpc. The total stellar mass in the Pisces Overdensity is ~10000 solar masses and its metallicity is [Fe/H] ~ -1.5.


INTRODUCTION
In Aristotle's model of the Universe, the stars were fixed on a rotating sphere and eternally invariable. Stellar variability has been known since Fabricius' discovery of Miras in 1596, although the ancient Chinese and Korean astronomers were already familiar with supernovae or "guest stars" (Clark & Stephenson 1979). Proper motions were discovered in 1718 by Edmund Halley, who noticed that Sirius, Arcturus and Aldebaran had moved from their fixed positions recorded in Aristotelian cosmology.
Despite this long history, our knowledge of both variable stars and high proper motion sources remains very incomplete. As Paczyński (2000) has emphasised, over 90 per cent of variable stars brighter than 12 mag have not been discovered. There are still comparatively few large archives of variable sources and, as a consequence, our knowledge of many classes of object, including novae, supernovae, RR Lyraes (the focus of this paper) and high proper motion objects remains limited. Indeed, the variable sky remains one of the most unexplored areas in astronomy, with the exciting possibility that even bright variable objects may correspond to completely unknown astronomical phenomena (see e.g., Paczyński 2001).
Figure 1. Left: Cumulative distribution of mean object type in the HLC. Galaxy-like objects are assigned an object type of 3 and star-like objects are an object type of 6. For multiple observations, the mean indicates whether the object is mostly classified as a star or a galaxy. The cut used to define the stellar sample is shown as a dashed vertical line. Right: Cumulative distribution of reduced χ² for g (green, solid) and r (red, dashed) bands for all stellar objects. The slope of the distribution turns over at a reduced χ² value of ∼3, which is taken as the boundary separating constant and variable stars; this value is marked on the graph as a dash-dot line.
The modern era of massive variability searches began with microlensing surveys such as MACHO (Alcock et al. 1993), EROS (Aubourg et al. 1993) and OGLE (Udalski et al. 1992). Typically, these surveys monitored millions of stars down to V∼21 a few times every night over several years in the directions of the Galactic Bulge and the Magellanic Clouds. The resulting huge databases of lightcurves yielded information on many rare types of astrophysical variability. They were the first projects that harnessed the power of large format CCD cameras and modern computers to show that the acquisition, processing and archiving of millions of photometric measurements was feasible. The surveys were soon followed by high-redshift supernova surveys (such as the High-Z SN Search (Schmidt et al. 1998) and the Supernova Cosmology Project (Perlmutter et al. 1999)), which typically had a lower time resolution and smaller area coverage than the microlensing surveys, although much deeper limiting magnitudes.
The Sloan Digital Sky Survey (SDSS; York et al. 2000) provides deep and homogeneous photometry in five bands in a large area around the North Galactic Cap, but with almost no variability information. The main exception is the compilation of repeat scans of the ∼290 square degree area known as Stripe 82 (see e.g., Adelman-McCarthy et al. 2008). The dataset has allowed the discovery of many new supernovae, which are publicised and followed up spectroscopically with other telescopes (see e.g., Frieman et al. 2008; Dilday et al. 2008). By averaging repeat observations of unresolved sources in Stripe 82, Ivezić et al. (2007) built a catalogue of 1 million standard stars with r magnitudes between 14 and 22. Sesar et al. (2007) then carried out the first analysis of ∼1.4 million variable stars and quasars using a colour-colour plot to assign variable types.
Recently, Bramich et al. (2008) presented a catalogue of almost four million "light-motion curves" using the data available in Stripe 82. Objects are matched between the ∼30 epochs, taking into account the effects of any proper motion over the eight-year baseline stretching back to the earliest runs in 1998. Thanks to the high quality of the SDSS imaging, excellent astrometric and photometric precision is attainable. Bramich et al. (2008) also provide a Higher-Level Catalogue which is a set of 229 derived-quantities for each light-motion-curve. These quantities describe the mean magnitudes, photometric variability and astrometric motion of a subset of objects whose light-motion-curve entries pass certain quality constraints.
Variable stars of particular interest are RR Lyraes, which have often been used to identify Galactic substructure, for example in studies of the Sagittarius Stream (Ivezić et al. 2000; Vivas et al. 2004; Keller et al. 2008), and the Monoceros and Virgo Overdensities (Vivas & Zinn 2006; Keller, Da Costa & Prior 2009). RR Lyraes are particularly useful for three reasons. First, they are constituents of the old, metal-poor halo, in which substructure is abundant. Second, they are standard candles, enabling an estimate of their distances to be made. Third, they are bright enough to be detected out to distances of ∼130 kpc in the SDSS data, giving us an insight into the structure of the remote Milky Way halo.
In this paper, we select first the variable objects in Stripe 82 and then the subset of RR Lyraes, using the Bramich et al. (2008) Light-Motion Curve Catalogue (LMCC) and Higher-Level Catalogue (HLC). §2 discusses the selection of the variable objects and their properties, whilst §3 describes the identification of the RR Lyrae stars and properties of the population in some detail. §4 discusses substructure seen in the distribution of the RR Lyraes and we conclude in §5.
Figure 3. A sample of folded lightcurves. The period in days is recorded at the top of each plot. The solid line is the mean magnitude, whilst the dashed lines represent 1σ deviations. In the bottom right corner, the number refers to the region in the colour-colour plot in which the lightcurve lies (see Figure A1 in Appendix A). The two rightmost lightcurves in the middle row are probable RR Lyraes.

Variable Selection
The HLC contains 3,700,548 objects. Every object in the survey is assigned an object type each time it is observed: 3 if it is galaxy-like and 6 if it is star-like (see § 3.1 of Bramich et al. 2008). For multi-epoch data, the object type averaged over all the observations provides a relatively reliable indicator of whether the source is star-like or galaxy-like. The cumulative distribution of mean object type in the left panel of Figure 1 shows that ∼55 per cent of objects in the catalogue are purely star-like or purely galaxy-like. To extract a sample of stars with essentially zero contamination from galaxies, we require that the mean object type is 5.5 or greater. This results in a "stellar" sample of 1,671,582 objects.
For the stellar sample, a cumulative distribution of reduced χ² (that is, χ² per degree of freedom) for the g and r bands is shown in the right panel of Figure 1. The value of reduced χ² at which the distributions turn over is ∼3, which is taken as the value below which stars are assumed to be well-modelled by a constant baseline. The number of objects that simultaneously satisfy reduced χ² > 3 in both g and r is 41,729. These stars are mainly variables, but the set still contains some artifacts, typically due to one or two outlying photometric measurements. One way to test for true variability is to look for correlations between different bands; a true variable star will usually have changes in brightness that are correlated in all bands, whereas a discordant measurement may exist in one band only. Throughout this analysis, the Stetson index (L_g) is used as a measure of correlated variability between the g and r band data (see Stetson 1996).
For the stellar sample, a density plot of reduced χ² in the g band against the Stetson index L_g is shown in Figure 2, in which two distinct populations can be discerned. The first shows an almost linear correlation between reduced χ² and L_g, and consists almost entirely of true variable stars. The second has a high reduced χ² but a low L_g, indicating that the brightness changes which give rise to the high reduced χ² values are not correlated between bands. To extract a sample of high-quality variable stars, we impose the simultaneous restrictions L_g > 1, reduced χ²_r > 3 and reduced χ²_g > 3, together with requiring at least 10 good epochs (see Bramich et al. 2008), leaving 21,939 objects. We present a comparison of the content of our variable catalogue with the earlier catalogue of Sesar et al. (2007) in Appendix A.
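The variability cuts described above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the reduced χ² is computed here about an inverse-variance weighted mean (an assumption about how the constant baseline is fitted), and the Stetson index is taken as a precomputed input rather than derived.

```python
def reduced_chi2(mags, errs):
    """Reduced chi-squared of a lightcurve about a constant baseline.

    Assumption: the baseline is the inverse-variance weighted mean, so
    there is one fitted parameter and N - 1 degrees of freedom.
    """
    weights = [1.0 / e ** 2 for e in errs]
    mean = sum(w * m for w, m in zip(weights, mags)) / sum(weights)
    chi2 = sum(((m - mean) / e) ** 2 for m, e in zip(mags, errs))
    return chi2 / (len(mags) - 1)


def is_variable(chi2_g, chi2_r, stetson_l, n_good_epochs):
    """Apply the cuts quoted in the text: reduced chi2 > 3 in both g and r,
    Stetson index L_g > 1, and at least 10 good epochs."""
    return (chi2_g > 3.0 and chi2_r > 3.0
            and stetson_l > 1.0 and n_good_epochs >= 10)
```

A source with large but uncorrelated scatter in g and r would pass the χ² tests yet fail on the Stetson index, which is the role that cut plays in the selection.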
To show the quality of the data, a selection of folded light-motion curves is shown in Figure 3, from which a variety of periodic phenomena such as stellar variability and eclipses are evident. In particular, the two right-most lightcurves in the middle row are very likely RR Lyraes. From the periods and the lightcurve shapes, we might surmise that the first lightcurve is an ab-type RR Lyrae and the second is a c-type RR Lyrae.

Variable Properties
Colour-magnitude diagrams (g versus u − g) are plotted for both the stellar sample with u band data and the subset of variable stars, in Figure 4. For the stellar sample, we see three prominent clumps associated with the thin disk, thick disk and halo on moving redwards in colour. For the variables, there is one prominent clump centered on u − g ≈ 0.2, which is primarily associated with variable quasars. There are less prominent, but still discernible, peaks associated with variable stars in the thin disk, thick disk and halo.
A crude discrimination between different classes of variable objects is possible in g − r versus u − g space. We find that the variable sample is composed largely of stellar locus stars and low-redshift quasars; stellar locus stars are predominant at bright (g < 19) magnitudes while low-redshift quasars dominate at faint (g < 22) magnitudes. RR Lyraes also make a significant contribution (see Appendix A for more detail on the statistical properties of the variable sample).

The Proper Motions
The HLC offers an improvement over previous variability work in Stripe 82 through the availability of proper motions. The combination of stellar photometry and proper motions has proved to be a powerful way of classifying stars, in particular members of elusive populations such as white dwarfs, brown dwarfs and wide binaries. Such combined catalogues, drawn from the intersections of SDSS data with USNO-B data (Monet et al. 2003), have been constructed before (Munn et al. 2004; Gould & Kollmeier 2004; Kilić et al. 2006). Compared to such catalogues, the HLC is restricted to Stripe 82 and the proper motion sensitivity is poorer, due to the much shorter time baseline. On the other hand, the Stripe 82 photometric catalogue extends approximately 1.5 magnitudes deeper than the limiting magnitude of USNO-B proper motions (V∼21).
The reduced proper motion is defined as H = r + 5 log µ + 5, where r is the apparent magnitude and µ the proper motion in arcsec yr⁻¹. The criteria for inclusion in the reduced proper motion diagram in the top panel of Figure 5 are that the object lies in our stellar sample, that the proper motion is measured with a signal-to-noise ratio (S/N) > 10 and that the absolute value of the proper motion |µ| exceeds 2 mas yr⁻¹. The S/N cut has been chosen primarily to ensure easy visibility of structure on the figure, whilst the proper motion cut enables us to excise quasars. We discern three distinct sequences of stars, namely Population I disk dwarfs, Population II main sequence subdwarfs and disk white dwarfs. Vidrih et al. (2007) have used this reduced proper motion diagram to identify new ultra-cool and halo white dwarfs in Stripe 82. Similarly, Smith et al. (2009a,b) have used the same procedure to extract a sample of halo subdwarfs in studies of the velocity ellipsoid and halo substructure. The lower panel of Figure 5 shows the reduced proper motion diagram just for the subset of variables with proper motion S/N > 5 and |µ| > 2 mas yr⁻¹. The variables are disproportionately drawn from the Population I disk dwarfs, although the other two sequences can still be seen.
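As a small worked illustration of the definition and cuts above, the following sketch evaluates H and applies the S/N and |µ| thresholds. The function and argument names are hypothetical; only the formula and the threshold values come from the text.

```python
import math


def reduced_proper_motion(r_mag, mu_arcsec_per_yr):
    """H = r + 5 log10(mu) + 5, with mu in arcsec per year."""
    return r_mag + 5.0 * math.log10(mu_arcsec_per_yr) + 5.0


def passes_rpm_cuts(mu_mas_per_yr, mu_err_mas_per_yr, min_snr=10.0, min_mu=2.0):
    """Cuts for the top panel of Figure 5: S/N > 10 and |mu| > 2 mas/yr.
    Pass min_snr=5.0 for the variable-star panel."""
    mu = abs(mu_mas_per_yr)
    return mu / mu_err_mas_per_yr > min_snr and mu > min_mu
```

For example, a star with r = 18 and µ = 0.1 arcsec yr⁻¹ has H = 18, since the logarithmic term contributes −5 and cancels the constant.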

Identification of RR Lyraes
Here, we develop the tools to extract a high quality sample of RR Lyraes from the variable catalogue. We begin by selecting the 873 candidates that simultaneously satisfy all the following criteria, which are adapted from Ivezić et al. (2005), namely
$$13.5 < r < 20.7, \quad 0.98 < u-g < 1.35, \quad D^{\rm min}_{ug} < D_{ug} < 0.35, \quad D^{\rm min}_{gr} < D_{gr} < 0.55, \qquad (1)$$
where
$$D_{ug} = (u-g) + 0.67\,(g-r) - 1.07, \qquad D_{gr} = 0.45\,(u-g) - (g-r) - 0.12. \qquad (2)$$
The r band magnitudes correspond to distances ∼5-130 kpc. $D_{ug}$ and $D_{gr}$ represent slopes in the u − g and g − r colour-colour plane. Combined with the cut on u − g, the $D_{ug}$ and $D_{gr}$ criteria constrain the RR Lyraes to a hexagonal box in colour-colour space, optimizing the selection of RR Lyraes. The values $D^{\rm min}_{ug}$ and $D^{\rm min}_{gr}$ can be altered to adjust the completeness and efficiency of the RR Lyrae selection. We chose to use values $D^{\rm min}_{ug} = -0.05$ and $D^{\rm min}_{gr} = 0.06$, which would give a completeness of 100 per cent for the QUEST survey RR Lyraes (Vivas et al. 2004; Ivezić et al. 2005).
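The colour and magnitude selection can be written as a single predicate. The sketch below simply transcribes the inequalities and the adopted lower bounds quoted above; the function names are hypothetical, and the colours are assumed to be supplied as already-computed u − g and g − r values.

```python
def d_ug(u_g, g_r):
    """D_ug = (u-g) + 0.67 (g-r) - 1.07"""
    return u_g + 0.67 * g_r - 1.07


def d_gr(u_g, g_r):
    """D_gr = 0.45 (u-g) - (g-r) - 0.12"""
    return 0.45 * u_g - g_r - 0.12


def is_rr_lyrae_colour_candidate(r_mag, u_g, g_r, d_ug_min=-0.05, d_gr_min=0.06):
    """True if the object lies inside the hexagonal colour box and the
    r-band magnitude range used for the candidate selection."""
    return (13.5 < r_mag < 20.7
            and 0.98 < u_g < 1.35
            and d_ug_min < d_ug(u_g, g_r) < 0.35
            and d_gr_min < d_gr(u_g, g_r) < 0.55)
```

Raising `d_ug_min` or `d_gr_min` shrinks the box, trading completeness for a cleaner sample, which is the adjustment the text refers to.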

RR Lyrae Periods
Determining periods for our RR Lyrae candidates is non-trivial. In general, there are 30 to 40 datapoints in a lightcurve, unevenly sampled over an eight-year baseline. From this sparsely-sampled data, we seek a period that is a fraction of a day. The multi-band nature of the SDSS survey is an advantage here, as we are able to verify that any period estimate we obtain in one band is consistent with the data in additional bands.
The LMCC contains all the data for a given lightcurve. Each datapoint has a flag that is set (or unset) if the datapoint passes (or fails) certain quality requirements (for more details, see Bramich et al. 2008). For period estimation, it is important that we use only the reliable data to minimise errors. In general, the g and r bands have smaller errors and fewer outliers and their clean lightcurves are the best sampled, so these bands are used together to estimate periods for our lightcurves.
Figure 6. Lomb-Scargle period estimates in the g and r bands for the candidate RR Lyraes. The coloured lines represent resonance lines along which one period is a harmonic of the other: P_r = P_g (red), P_r = P_g/(1 ± P_g) (orange), P_r = P_g/(1 ± 2P_g) (green), P_r = P_g/(P_g − 1) (cyan), P_r = P_g/(2P_g − 1) (blue).
Figure 7. A sample periodogram spectrum. The highest peak in each spectrum is indicated by the blue line; the additional peaks selected by the method described in the text are indicated by the orange lines, with the eventual best-fit peak marked with a red line. At least one peak was found in each region delineated by the cyan lines.
As a first pass, we run a Lomb-Scargle periodogram (see e.g., Press et al. 1989) on each of the g and r band lightcurves, taking care to ensure that the frequency range extends to the high frequencies (or low periods) expected for RR Lyrae stars and that the sampling rate is detailed enough to discern individual peaks. The resulting period estimates are plotted in Figure 6. First, we note that there are a number of candidates which have matching period estimates. A number of resonance lines, the locations at which one of the period estimates is a harmonic of the other, are plotted as solid lines in the graph.
Figure 8. The distribution of periods for all RR Lyrae candidates with periods in the expected range for RR Lyraes (black, solid), for the ab-type RR Lyraes (red, dashed) and the c-type RR Lyraes (blue, dot-dashed). The spike at P < 0.1 is due in part to δ Scuti and SX Phe stars, and the population with the shortest periods is due mostly to eclipsing variables.
The resonance lines are well-populated, indicating that the Lomb-Scargle periodogram can return harmonics of the true period as well as the period itself. Hence, we must consider whether the exact period matches are indeed cases where the true period has been recovered, or whether both period estimates are harmonics, and so on. Lightcurves with periods that do not match or do not lie on resonance lines could be objects that are not periodically variable (quasars, for example) or objects for which the periodogram has failed to recover the true period or a harmonic of the period in one or both cases. Not all of the period estimates lie within the range expected for RR Lyraes (0.2-0.8 days), which is probably a consequence of contamination in our sample. However, it is unclear whether we can simply reject any objects with period estimates larger than those expected for RR Lyraes. Certainly, some of the larger period estimates could be attributed to the sparse sampling of the lightcurves, which may generate a signal that overwhelms the periodic nature of the RR Lyraes.
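The resonance lines listed in the Figure 6 caption are easiest to see in frequency: each of them corresponds to a frequency offset of one or two cycles per day, f_r = |f_g ± k| with k = 1, 2. The sketch below generates those related periods and tests whether a pair of period estimates falls on one of the lines. The function names and the 1 per cent matching tolerance are assumptions made for illustration.

```python
def related_periods(period_days, max_order=2):
    """Periods P' with 1/P' = |1/P +/- k| for k = 1..max_order cycles per day.
    For k = 1 these are P/(1 + P) and P/|1 - P|; for k = 2, P/(1 + 2P) and P/|2P - 1|."""
    freq = 1.0 / period_days
    periods = set()
    for k in range(1, max_order + 1):
        for f_related in (freq + k, abs(freq - k)):
            if f_related > 0.0:
                periods.add(1.0 / f_related)
    return sorted(periods)


def on_resonance_line(p_g, p_r, tol=0.01, max_order=2):
    """True if p_r matches p_g or one of its related periods to within
    a fractional tolerance (hypothetical default of 1 per cent)."""
    candidates = [p_g] + related_periods(p_g, max_order)
    return any(abs(p_r - p) <= tol * p for p in candidates)
```

For a 0.6 day period this gives related periods of roughly 0.27, 0.375, 1.5 and 3.0 days, which shows how a genuine RR Lyrae can be assigned a period well outside the expected 0.2-0.8 day range.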
Accordingly, we experimented with a number of alternative methods, including binning, smoothing and phase dispersion minimization, before concluding that a string-length technique is the most effective for our dataset. In outline, a string-length method works by phasing a lightcurve with a trial period and calculating the sum of the straight-line distances between consecutive points. The sum of these distances, or the string-length, will be minimised if the trial period is the true period.
We use a variation of the Lafler & Kinman (1965) string-length technique described in Stetson (1996). For each trial period, the string-length is computed for the g and r bands. Their sum is taken as the overall string-length, which is minimised to obtain a period estimate. Running a string-length period finder over a wide and finely-sampled range of trial periods is computationally expensive. However, we can short-cut the process by restricting the string-length period search to a narrow range of periods, centred on a set of the most likely periods identified via the Lomb-Scargle periodogram.
To ensure that the correct RR Lyrae periods are identified, the g and r band Lomb-Scargle periodogram spectra were combined and the highest peaks in each of the ranges P<0.2, 0.2<P<0.6, 0.6<P<1.0 and P>1.0 days were selected. Then, four further peaks were selected from each of the g and r band spectra independently. This was done according to highest amplitude, until four distinct new peaks had been selected in each band. Each peak was then considered in turn: the string-length was calculated for a narrow range of periods spanning the peak, and the period for which the string-length was minimised was taken to be the best period in the vicinity of that peak. Finally, the period that returned the shortest string-length overall was adopted as the period estimate for the lightcurve.
Figure 9. Period versus the combined g and r band amplitudes. Three selection boxes are shown: red for candidate RRabs, blue for candidate RRcs, and green for candidates warranting further study.
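The refinement step can be sketched as follows. This is a minimal illustration rather than the exact Lafler & Kinman or Stetson statistic: it uses the plain sum of straight-line distances between consecutive points in the (phase, magnitude) plane, without rescaling the magnitudes, and the search half-width and step are hypothetical values. The candidate periods are assumed to come from a separate periodogram step.

```python
import math
import random


def string_length(times, mags, period):
    """Sum of straight-line distances between phase-ordered points."""
    points = sorted(((t / period) % 1.0, m) for t, m in zip(times, mags))
    return sum(math.hypot(p2 - p1, m2 - m1)
               for (p1, m1), (p2, m2) in zip(points, points[1:]))


def combined_string_length(period, bands):
    """Overall string-length: the sum over bands, each a (times, mags) pair."""
    return sum(string_length(t, m, period) for t, m in bands)


def refine_period(bands, candidate_periods, half_width=5e-4, step=1e-5):
    """Search a narrow window around each candidate period and return the
    (period, string_length) pair with the shortest combined string-length."""
    best_period, best_length = None, math.inf
    n_steps = int(round(2.0 * half_width / step))
    for centre in candidate_periods:
        for i in range(n_steps + 1):
            trial = centre - half_width + i * step
            length = combined_string_length(trial, bands)
            if length < best_length:
                best_period, best_length = trial, length
    return best_period, best_length
```

Restricting the fine grid to a few narrow windows keeps the number of string-length evaluations small, which is the motivation given in the text for seeding the search with periodogram peaks.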
In Figure 8, the solid black line is the period distribution for all of the RR Lyrae candidates and shows four clear peaks. Moving from high to low periods, these populations are predominantly: ab-type RR Lyraes (peak at ∼0.6 d), c-type RR Lyraes (peak at ∼0.35 d), eclipses (peak at ∼0.18 d) and δ Scuti and SX Phe stars (peak at ∼0.05 d). Also present in this candidate sample are a number of non-periodic variables. The red dashed line is the period distribution for only those stars later determined to be ab-type RR Lyraes and the blue dot-dashed line is for those stars we later determine to be c-type RR Lyraes.
Before proceeding any further, we clean the sample of some eclipsing variable stars, δ Scuti stars, SX Phe stars and non-periodic variable contaminants by adopting a stringent cut on Stetson index L g >2.5. To perform subsequent analysis, we require that objects have a sufficient number of clean data points; that is, we impose further cuts on the number of clean epochs in the g and r bands: N g >5 and N r >5, leaving 604 candidates.
Uncertainties in the period estimates may be attributed to two sources. If we assume that the estimate is indeed close to the true period and not a harmonic, then any error is due to the string-length fitting technique, estimated to be ∼10⁻⁵ day, from analysis of the phased RR Lyrae lightcurves. However, for a small fraction of the sample, where the period estimate is a harmonic of the true period, the error will be ∼0.1-0.5 day.
Figure 10. The performance of the photometric metallicity relationships in eqns (7) and (9) is shown. The left (right) panel plots the RRab (RRc) candidates with spectroscopic metallicities against the corresponding photometric estimate.

RR Lyrae Classification
Bailey & Pickering (1902) first divided RR Lyraes into three classes (a, b and c) based on the appearance of their lightcurves. Further study of RR Lyraes has since reduced the classification to just ab and c. It is believed that RRab stars are pulsating in the fundamental mode and RRc stars are pulsating in their first overtone (e.g., Smith 1995).
The two classes have somewhat different properties: RRab stars generally have relatively large amplitudes (of order a magnitude) and asymmetric lightcurves with a steep rising branch and a slow, steady decline. Their periods lie mostly in the range 0.4-1.0 day and studies of RR Lyraes in globular clusters have shown that the amplitude of RRab stars decreases with increasing period. RRc stars have smaller visual amplitudes (around half a magnitude) and their lightcurves are more symmetrical and nearly sinusoidal. They generally have periods of 0.24-0.5 day.
To separate the two classes, we proceed by plotting the candidates in the plane of (P, A), where P is the period and A is the sum of the amplitudes in the g and r band lightcurves, as shown in Figure 9. The red box is defined as the region 0.43 < P < 0.85, 0.23 < A < 1.34 and includes 296 RRab candidates. The blue box is defined as the region and includes 122 RRc candidates. Finally, the green boxes (not fully shown in Figure 9 for clarity, but which extend to a period of two days) include a number of further candidates, which we do not want to discard without further investigation. Objects with a period of one day are almost certainly spurious (the lightcurves have a sampling period of one day) and are removed.
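A box classification in the (P, A) plane amounts to a pair of range tests. In the sketch below the RRab bounds are those quoted for the red box; the bounds of the blue RRc box are not given in the text, so they are left as a caller-supplied argument, and the function name is hypothetical.

```python
def classify_by_period_amplitude(period, amplitude, rrc_box=None):
    """Return 'RRab', 'RRc' or 'other' for a candidate.

    period    : period in days
    amplitude : sum of the g and r band amplitudes in magnitudes
    rrc_box   : optional (p_min, p_max, a_min, a_max) for the RRc box;
                no default is provided because the text does not state it
    """
    if 0.43 < period < 0.85 and 0.23 < amplitude < 1.34:
        return "RRab"
    if rrc_box is not None:
        p_min, p_max, a_min, a_max = rrc_box
        if p_min < period < p_max and a_min < amplitude < a_max:
            return "RRc"
    return "other"
```

Candidates returned as "other" are the ones for which the period analysis would be revisited before they are discarded.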
For the 43 objects that remain, there is the possibility that they may lie away from the concentration of RR Lyraes because of an incorrect period estimate. As discussed in Section 3.2, errors in the period estimates are small, but sometimes a harmonic of the true period is obtained. Hence, we revisit the period analysis to determine whether any of the likely period peaks lie within the red RRab box or the blue RRc box. If more than one period peak lies within a box, the period with the minimum string-length is used. Any objects for which a fitting period can be found are added to the RRab or the RRc candidate set as appropriate. This reclassification results in 330 RRab and 137 RRc candidates.

The RRab types
126 of our RRab and 52 RRc candidates possess SDSS spectra and have spectroscopic metallicity estimates. We use these objects to calibrate empirical relationships and thence derive photometric metallicities for the entire sample. Jurcsik & Kovacs (1996) found that the metallicity of an RR Lyrae depends on the period P and the shape of the lightcurve, which may be parameterised via a Fourier decomposition: The amplitudes A_i and the phases φ_i can then be combined as follows: Inspired by the analogous relation of Jurcsik & Kovacs (1996), we use the spectroscopic metallicities and the lightcurve properties to derive the metallicity-period-amplitude-phase relation which has a typical scatter σ = 0.25. Its performance is shown in the left panel of Fig. 10. To refine our RRab sample, we insist that and apply a further restriction to trim the sample of a remaining few eclipsing variables by imposing the selection box in orange (dot-dashed lines) in colour-colour space shown in the lower panel of Figure 11. This leaves us with 325 RRab candidates. Also shown in this figure are the confirmed RRab types as red squares and likely eclipsing variables as green triangles.
Figure 11. The RR Lyrae candidate selection boxes in the u−g versus g−r plane; the orange (dot-dashed) box is used to select RRab types, the cyan (dashed) box for RRc types. The RRab (RRc) candidates that pass the period, amplitude and metallicity cuts are shown as red squares (blue circles). Also shown are suspected eclipsing variables as green triangles.
Table caption (partial): Listed are the amplitudes in the g and r bands, the period, the reduced χ² in g and r, the Stetson index, the metallicity, the heliocentric distance D and Galactocentric distance r, the absolute magnitude M_z and the number of good epochs in g and r.

The RRc types
Morgan et al. (2007) found that the metallicities of c-type RR Lyraes also vary with period and lightcurve shape. Owing to the difference in pulsation mode, and hence lightcurve shape, between the RRab stars and the RRc stars, a new metallicity relation must be derived for the RRc candidates. The lightcurves were Fourier decomposed and the 52 RRcs with spectroscopic metallicities were used to derive the following metallicity-period-amplitude relation, the form of which is suggested by Morgan et al. (2007): As the right panel of Fig. 10 shows, there is a somewhat greater scatter of σ = 0.38 in this relationship than the corresponding one for RRabs. A metallicity cut −3.75 < [Fe/H] < 0 removes a few obvious outliers. As for the RRabs, we discard eclipsing variables by imposing the selection box in cyan (dashed lines) in colour-colour space shown in the lower panel of Figure 11. This leaves us with 97 RRc type candidates. Also shown in this figure are the confirmed RRc types as blue dots and likely eclipsing variables as green triangles.
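For orientation, a Fourier decomposition of a periodic lightcurve is conventionally written as a sine series, with the shape information carried by epoch-independent phase differences and amplitude ratios. The form below is that common convention only; the truncation order N, the reference epoch t_0, the choice of sine rather than cosine terms and the symbol R_{i1} are assumptions, and the fitted metallicity relations themselves are not reproduced here.

```latex
m(t) = A_0 + \sum_{i=1}^{N} A_i \sin\!\left[\frac{2\pi i}{P}\,(t - t_0) + \phi_i\right],
\qquad
\phi_{i1} = \phi_i - i\,\phi_1,
\qquad
R_{i1} = \frac{A_i}{A_1}.
```

Combinations such as φ_31 are independent of the arbitrary choice of t_0, which is why they, rather than the individual φ_i, are normally used as lightcurve shape parameters.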
In the same way that the outlying candidates were reanalysed to determine whether a period could be found that placed the star into the RRab or RRc dataclouds, so the RRab and RRc rejects are also reanalysed, just in case they are misclassified RRc and RRab respectively. The metallicity and colour cuts described above then determine inclusion in the candidate sets. Finally, a visual inspection of the lightcurves confirms that these are very high-quality samples. Judging by lightcurve shape, there are at most 21 possible contaminants in the RRab set and only 6 possible contaminants in the RRc set. The lightcurves typically possess low S/N and/or are poorly sampled. While many of these objects will be RR Lyrae, to be conservative, they are removed, giving us a final sample of 316 RRab stars and 91 RRc stars, 407 RR Lyraes in all.

RR Lyrae Distances
RR Lyraes are "standard candles" because they have a well-defined absolute magnitude, which, nonetheless, depends on metallicity. We calculate distances via the distance modulus: The apparent magnitudes m_z come directly from the HLC. The absolute magnitudes M_z are obtained via the following relation from where P is the fundamental period, log Z = [Fe/H] − 1.5515 and C_0 = (u−g)_0 − (g−r)_0, with the 0 subscript denoting that the colours are unreddened. The intrinsic scatter in this relation is small compared to the errors. Uncertainties on the metallicities [Fe/H] and distances D are computed using standard methods. From this, we find that the distance errors are typically around 8 per cent (this includes the scatter due to the metallicity relationships). The right ascension, declination, classification, mean magnitudes, amplitude, period and distance of all our 407 RR Lyrae candidates are given in an accompanying electronic table. The means and dispersions for some useful quantities for the RRab and RRc subsamples are given in Table 1.
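A minimal sketch of the distance step, assuming the standard distance modulus m − M = 5 log10(D / 10 pc) and that the apparent magnitude has already been corrected for extinction. The function names are hypothetical, and the error propagation is the usual first-order formula rather than the specific method used for the catalogue.

```python
import math


def distance_kpc(apparent_mag, absolute_mag):
    """Distance in kpc from the distance modulus m - M = 5 log10(D / 10 pc)."""
    return 10.0 ** ((apparent_mag - absolute_mag + 5.0) / 5.0) / 1000.0


def fractional_distance_error(sigma_apparent, sigma_absolute):
    """First-order fractional error on D: (ln 10 / 5) times the uncertainty
    on the distance modulus, with the two magnitude errors added in quadrature."""
    sigma_modulus = math.hypot(sigma_apparent, sigma_absolute)
    return math.log(10.0) / 5.0 * sigma_modulus
```

Under this propagation, a combined magnitude uncertainty of about 0.17 mag corresponds to a distance error of roughly 8 per cent, and an object with M = 0.1 observed at magnitude 20.7 lies at about 130 kpc.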

The Sagittarius Stream, the Hercules-Aquila Cloud and the Pisces Overdensity
The distribution of RR Lyraes in right ascension and distance is shown in Figure 12, with the ab-types plotted as red circles and the c-types plotted as blue triangles. There are a number of things to notice. First, there are 296 RR Lyraes at right ascensions 20.7h < α < 24h, but only 111 at 0h < α < 3.3h. The greatest concentration of RR Lyraes is in the fields coincident with the Hercules-Aquila Cloud (Belokurov et al. 2007). Of course, not all these RR Lyraes are necessarily associated with the Cloud, as there may be contamination from the underlying smooth population associated with the Galactic Spheroid. It is known that the Bulge and Spheroid harbour a population of RR Lyraes, distributed in a roughly spherical manner around the Galactic Centre, with a metallicity distribution peaked at [Fe/H] ∼ −1 (see e.g. Walker & Terndrup 1991; Alcock et al. 1998; Collinge et al. 2006). The plane of the orbit of the Sagittarius dwarf galaxy crosses Stripe 82, and there is a visible overdensity of RR Lyraes at this location (α ≈ 2h). Finally, we note that there are few (specifically 47) RR Lyraes at large distances (D > 50 kpc), of which 28 lie in a clump at α ≈ 23.5h. We term the structure the Pisces Overdensity. Distance uncertainties are shown as vertical bars for each RR Lyrae; although the error bars for the distant RR Lyraes are large enough to be visible, they cannot be responsible for the overdensity. The upper panel of Figure 13 shows the fraction of accessible volume as a function of Galactocentric radius r probed by our survey. The volumes are calculated via Monte Carlo integrations in which the RR Lyrae luminosity function is modelled as a Gaussian with mean and dispersion from Table 1; the magnitude limits used were those defining our RR Lyrae sample. The survey reaches at least r∼100 kpc before the accessible volume begins to decline. The brightest RR Lyraes in our sample have M_z = 0.1 and so are still detectable within r∼130 kpc.
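The magnitude-limit part of the accessible-volume calculation can be illustrated in closed form. The sketch below is a simplification, not the Monte Carlo integration described above: for a Gaussian luminosity function it returns the fraction of RR Lyraes at a given heliocentric distance that fall inside the magnitude limits, and it ignores the Stripe geometry and the conversion to Galactocentric radius. The function names are hypothetical, applying the limits 13.5 and 20.7 from the candidate selection to the band of the luminosity function is an assumption, and the mean and dispersion must be supplied by the caller because Table 1 is not reproduced in this text.

```python
import math


def _normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def detectable_fraction(distance_kpc, mean_abs_mag, sigma_abs_mag,
                        m_bright=13.5, m_faint=20.7):
    """Fraction of stars with Gaussian-distributed absolute magnitude whose
    apparent magnitude at this distance lies between m_bright and m_faint."""
    modulus = 5.0 * math.log10(distance_kpc * 1000.0) - 5.0
    upper = (m_faint - modulus - mean_abs_mag) / sigma_abs_mag
    lower = (m_bright - modulus - mean_abs_mag) / sigma_abs_mag
    return _normal_cdf(upper) - _normal_cdf(lower)
```

The fraction stays close to one until the mean absolute magnitude approaches the faint limit and then falls to zero over a range of distance set by the dispersion, which gives the qualitative shape of an accessible-volume curve.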
In classical models of the smooth halo, the RR Lyraes are distributed as a power-law like ρ ∼ r⁻ⁿ with n ∼ 3.1 (Wetterer & McGraw 1996). With no substructure present in Stripe 82, the right ascension-distance graph of Figure 12 would look rather different. The fall-off in numbers would be steady, and not as sharp as the drop observed beyond D∼40 kpc, which is real and cannot be attributed to properties of the survey.
Figure caption (continued): The insets show each distribution plotted separately, together with a white circle centered at the origin. There is no obvious distinction according to metallicity, and the low, medium and high metallicity RR Lyraes are clearly distended and seemingly belong to the Hercules-Aquila Cloud. In particular, none of the populations is distributed in a spherically symmetric manner around the Galactic Centre.
The presence of an edge to the RR Lyrae distribution in the stellar halo at $r \sim 50$ kpc has been proposed before by Ivezić et al. (2000), using a sample of 148 RR Lyraes in SDSS commissioning data. However, the same authors later applied their method to a larger area of the sky and found no break until at least 70 kpc (Ivezić et al. 2004). Vivas et al. (2006) also found no break before the limit of their survey at ∼60 kpc. So, "edge" may be too strong a term, but the number density profile of the RR Lyraes does seem to be best matched by a broken power-law, as shown in the lower panel of Figure 13. Such a parameterisation was first advocated by Saha (1985), who noticed that the RR Lyrae density fell off much more rapidly beyond Galactocentric radii of 25 kpc. Adjusting by the fraction of Galactic volume sampled by our survey, and assuming that our efficiency is $\epsilon \approx 1$, we find that the spherically-averaged number density of RR Lyrae follows
$$ n(r) \propto \begin{cases} r^{-2.4} & \text{if } 5 < r < 23 \text{ kpc} \\ r^{-4.5} & \text{if } 23 < r < 100 \text{ kpc} \end{cases} \tag{12} $$
out to ∼100 kpc, beyond which our data is highly incomplete, with only bright RR Lyraes detectable (see Figure 16). Our break radius of 23 kpc is very close to that proposed by Saha (1985). The inner power-law slope is almost the same as that found by Miceli et al. (2008), namely $n = -2.43$, in a very large sample of RR Lyrae stars closer than 30 kpc in the LONEOS survey. Of course, formulae such as eqn (12) are just a parameterisation of the data, as the RR Lyrae distribution is neither spherically symmetric nor smooth, but dominated by the three structures in the Stripe.
The break at $r \sim 25$ kpc is really a consequence of the fact that most of the RR Lyraes are in the Hercules-Aquila Cloud and the Sagittarius Stream substructures, which lie within 40 kpc of the Galactic centre. A similar conclusion regarding the importance of substructure is reached by Sesar et al. (2007), who divide their RR Lyrae distribution into 13 clumps, of which they suggest at least seven correspond to real substructures.
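To make the shape of the broken power-law concrete, the short sketch below evaluates a profile with the break radius of 23 kpc and the slopes of 2.4 and 4.5 quoted in the Conclusions. The normalisation is an assumption made here for illustration (the profile is set to unity at the break), and the function name `broken_power_law` is hypothetical.

```python
def broken_power_law(r_kpc, r_break=23.0, inner_slope=2.4, outer_slope=4.5):
    """Relative RR Lyrae number density at Galactocentric radius r_kpc.

    Normalised (by assumption) to 1 at the break radius; the parameterisation
    is only meant to apply for 5 < r < 100 kpc.
    """
    if not 5.0 < r_kpc < 100.0:
        raise ValueError("profile is only defined for 5 < r < 100 kpc")
    x = r_kpc / r_break
    # Shallow decline inside the break, much steeper decline outside it.
    slope = inner_slope if r_kpc <= r_break else outer_slope
    return x ** (-slope)


for r in (10.0, 23.0, 46.0, 92.0):
    print(f"r = {r:5.1f} kpc  n(r)/n(23 kpc) = {broken_power_law(r):.4g}")
```

Doubling the radius beyond the break lowers the density by a factor of $2^{4.5} \approx 23$, compared with $2^{2.4} \approx 5$ inside it, which is why the drop in Figure 12 appears so abrupt.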
In Figure 14, the number density of RR Lyraes is plotted in the plane of Galactic longitude versus i band magnitude. The three substructures show up very clearly, together with some isolated hot pixels that may be indicators of real objects. We define the Sagittarius Stream RR Lyraes via
$$ 180^\circ > \ell > 135^\circ, \qquad 16.5 < r < 18.5. \tag{13} $$
RR Lyraes associated with the Hercules-Aquila Cloud are extracted via [...]. For the Pisces Overdensity, we chose the stars satisfying [...]. These cuts give 55 stars in the Sagittarius Stream, 28 in the Pisces Overdensity, and 237 in the Hercules-Aquila Cloud. Ideally, we would like to separate any contaminating Bulge and Spheroid RR Lyrae from those of the Hercules-Aquila Cloud, but this is not easy. In particular, Figure 15 shows the density distribution of the RR Lyrae populations colour-coded according to metallicity. The comparatively metal-rich RR Lyraes (red in the Figure) do not seem to be distributed any differently from the comparatively metal-poor (green and blue). In fact, all the distributions are distended and distributed asymmetrically relative to the Galactic Centre, consistent with the bulk of the stars belonging to the Hercules-Aquila Cloud.
The properties of the RR Lyrae in the different substructures are listed in Table 2. Note that very nearly 60 per cent of all the RR Lyraes in Stripe 82 are associated with the Hercules-Aquila Cloud, emphasising the arguments made by Belokurov et al. (2007) as to the importance of this structure. The mean heliocentric distances of the Hercules-Aquila Cloud and the Sagittarius Stream are comparable in Stripe 82, but the Pisces Overdensity is much further away at $D \sim 80$ kpc. The Pisces Overdensity lies within a few degrees of the Magellanic Plane. Although the distance of the Overdensity is greater than that of the Large and Small Magellanic Clouds ($D \sim 55$ kpc), it is possible that the Magellanic Stream may be more distant in this area of the sky. Thus, at present, it is unclear whether the Pisces Overdensity is related to Magellanic Cloud debris.
One way to estimate the mass of the Pisces Overdensity is to compare with the Carina dwarf spheroidal, which is at a similar distance (∼100 kpc; Mateo 1998). Carina has a total mass of $\sim 2 \times 10^7\,M_\odot$ and a mass-to-light ratio of ∼70. At least 75 RR Lyrae stars have been detected by Dall'Ora et al. (2003) using well-sampled multi-epoch data in the B and V bands, although over a small baseline of a few days. Assuming similar stellar populations and similar efficiencies of detection of bright RR Lyrae in the surveys, we can use a simple scaling argument to estimate the total stellar mass associated with the Pisces Overdensity as $\sim 10^5\,M_\odot$. We can corroborate this mass estimate by comparison with our data on the Hercules-Aquila Cloud. The calculation using the Hercules-Aquila Cloud has the advantage that the RR Lyrae populations in the two structures have been discovered by the same algorithm, but the disadvantage that the properties of the Cloud are also rather uncertain. The absolute magnitude of the Cloud is given by Belokurov et al. (2007) as $M_r = -13$, suggesting that its total stellar mass is $\sim 10^7\,M_\odot$.
Of course, the Cloud is an enormous structure, covering $\sim 80^\circ$ in longitude and probably extending above and below the Galactic plane by $50^\circ$. Only a small fraction (∼1 per cent) of the Cloud is probed by the Stripe 82 data, suggesting that there must be $\sim 2 \times 10^4$ RR Lyraes associated with the Cloud in total. The mass of the Cloud covered by the Stripe is $\sim 10^5\,M_\odot$. Again, assuming similar stellar populations, the mass associated with the Pisces Overdensity is at least $\sim 10^4\,M_\odot$, a value encouragingly similar to our first estimate.
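The two scaling arguments can be reproduced with a few lines of arithmetic. The sketch below is an order-of-magnitude illustration only: it assumes that stellar mass scales linearly with the number of RR Lyraes, and it converts Carina's total mass to a stellar mass by assuming a stellar mass-to-light ratio of about unity, which is an assumption introduced here rather than a value given in the text. The function and variable names are hypothetical.

```python
def scaled_stellar_mass(ref_stellar_mass, ref_n_rrl, target_n_rrl):
    """Scale a reference stellar mass by the ratio of RR Lyrae counts."""
    return ref_stellar_mass * target_n_rrl / ref_n_rrl


N_PISCES = 28  # RR Lyraes assigned to the Pisces Overdensity

# Estimate 1: scale from the Carina dwarf spheroidal.
carina_total_mass = 2e7             # solar masses, dominated by dark matter
carina_mass_to_light = 70.0         # total mass-to-light ratio
assumed_stellar_mass_to_light = 1.0 # ASSUMPTION: roughly 1 for the stars alone
carina_stellar_mass = (carina_total_mass / carina_mass_to_light
                       * assumed_stellar_mass_to_light)
pisces_from_carina = scaled_stellar_mass(carina_stellar_mass, 75, N_PISCES)

# Estimate 2: scale from the part of the Hercules-Aquila Cloud in Stripe 82.
cloud_total_stellar_mass = 1e7      # solar masses
fraction_in_stripe = 0.01           # about 1 per cent of the Cloud is probed
cloud_mass_in_stripe = cloud_total_stellar_mass * fraction_in_stripe
pisces_from_cloud = scaled_stellar_mass(cloud_mass_in_stripe, 237, N_PISCES)

print(f"From Carina:                {pisces_from_carina:.1e} Msun")
print(f"From Hercules-Aquila Cloud: {pisces_from_cloud:.1e} Msun")
print(f"RR Lyraes in whole Cloud:   {237 / fraction_in_stripe:.1e}")
```

Because at least 75 RR Lyraes are known in Carina, the first figure is best read as an upper envelope under these assumptions; the two estimates bracket the range $10^4$ to $10^5\,M_\odot$.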
Of the 28 stars identified as members of the Pisces Overdensity, the intrinsically faintest has $M_z = 0.76$. The stars extend over an area of 55 deg$^2$ of Stripe 82. Thus, the surface number density of the RR Lyrae is 0.51 deg$^{-2}$. By comparison, the Hercules-Aquila Cloud has 209 RR Lyraes with an absolute magnitude brighter than $M_z = 0.76$. They cover 95 deg$^2$ of Stripe 82, and hence the surface number density is 2.20 deg$^{-2}$. This suggests that the Pisces Overdensity is ∼4.3 times more diffuse than the Hercules-Aquila Cloud.
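The comparison of surface densities is simple enough to check directly. The snippet below just repeats the arithmetic with the counts and areas quoted above; the variable names are illustrative.

```python
# Counts of RR Lyraes brighter than M_z = 0.76 and the areas they cover.
pisces_count, pisces_area_deg2 = 28, 55.0
cloud_count, cloud_area_deg2 = 209, 95.0

pisces_density = pisces_count / pisces_area_deg2  # stars per square degree
cloud_density = cloud_count / cloud_area_deg2     # stars per square degree
density_ratio = cloud_density / pisces_density    # how much denser the Cloud is

print(f"Pisces Overdensity:    {pisces_density:.2f} per sq. deg")
print(f"Hercules-Aquila Cloud: {cloud_density:.2f} per sq. deg")
print(f"Ratio (Cloud/Pisces):  {density_ratio:.1f}")
```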
The distance and metallicity distributions of our RR Lyraes can be used to study the properties of the Hercules-Aquila Cloud, the Sagittarius Stream and the Pisces Overdensity. Plots are shown in Figure 16. The distribution of heliocentric distances for the Cloud has a mean of 22.0 kpc and a standard deviation of 12.1 kpc. One possible interpretation of the Cloud is that it is analogous to caustic features like the shells seen around elliptical galaxies. However, the considerable depth of the Cloud seen in the RR Lyrae distribution argues against such an interpretation.
In directions towards Stripe 82, the distances of the arms of the Sagittarius Stream are not well-known. Simulations offer a rough guide, but no more than that. The upper panel of Figure 3 of Fellhauer et al. (2006) shows the young leading arm (A), together with parts of the old trailing arm (B), at distances of 15-20 kpc; further, material belonging to parts of B, the old leading arm (C) and the young trailing arm (D) spread out over a swathe of distances 30-60 kpc. The distribution of distances of our Sagittarius RR Lyraes in Figure 16 does indeed show some evidence of bimodality. It is possible that the peak at distances D∼20 kpc corresponds to the A and B streams, whilst the peak at D∼35 kpc corresponds to the other wraps. However, the simulations also suggest that the second peak should be much broader than appears to be the case in the data. Our identifications are tentative and radial velocity data is required to enable a cleaner separation of the wraps, as is evident from the lower panel of Figure 3 of Fellhauer et al. (2006).
In fact, Vivas et al. (2005) have already carried out VLT spectroscopy of 14 RR Lyrae variables that lie in the leading arm of the Sagittarius Stream, finding a metallicity of [Fe/H] = −1.76 ± 0.22. The stars lie well away from Stripe 82 at right ascensions $13^{\rm h} < \alpha < 16^{\rm h}$ and at heliocentric distances of ∼50 kpc. Many of the RR Lyraes in our sample will belong to the trailing arm, which may account for some of the difference in the metallicity estimate.
We show a view of Stripe 82 as derived from SDSS main-sequence turn-off (MSTO) stars in Figure 17. The upper panel gives the number of MSTO stars as a function of right ascension and distance. The one-dimensional histogram plotted in black shows the dependence of number on right ascension alone. Note that the Sagittarius Stream is immediately visible at $\alpha \approx 40^\circ$. There are two density maxima in the black histogram, perhaps hinting that more than one wrap of the Stream is detectable in MSTO stars. The Hercules-Aquila Cloud substructure also shows up very clearly, although the fainter Pisces Overdensity is understandably absent. The distance estimates to the substructures derived from MSTO stars agree well with those from RR Lyraes.
The SDSS DR6 includes a large number of stellar spectra which have been analysed to provide radial velocities and fundamental stellar atmospheric parameters (Lee et al. 2008). The velocities of all stars with spectra and satisfying $g - i < 1$, to remove most of the thin disk contaminants, are plotted against right ascension along Stripe 82 in the middle panel. Of course, most of the stellar targets are disk stars, and so the curve $v_{\rm GSR} = 190 \cos b \sin \ell$ km s$^{-1}$ is plotted to show the locus of the thick disk in this dataset. The Sagittarius Stream stars are clearly offset in velocity from the thin disk at $v_{\rm GSR} \approx -130$ km s$^{-1}$. The bottom panel shows the same data, but now colour-coded according to metallicity so as to highlight different structures. We can detect the kinematically bifurcated Sagittarius Stream and clearly see the separation of the more metal-rich Galactic disk and bulge stars from the Hercules-Aquila Cloud. The eye can also discern some fainter substructure, the reality of which remains to be established.
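The disk locus quoted above is straightforward to evaluate for any line of sight. The following sketch implements the expression $v_{\rm GSR} = 190 \cos b \sin \ell$ km s$^{-1}$ as a function of Galactic coordinates; the function name `disk_locus_vgsr` is hypothetical, and the conversion from right ascension along Stripe 82 to $(\ell, b)$ is not included and would have to be supplied separately.

```python
import math


def disk_locus_vgsr(l_deg, b_deg, v_rot_kms=190.0):
    """Line-of-sight velocity (km/s, Galactic standard of rest) expected for
    a population rotating at v_rot_kms, seen at Galactic coordinates (l, b)."""
    l_rad = math.radians(l_deg)
    b_rad = math.radians(b_deg)
    return v_rot_kms * math.cos(b_rad) * math.sin(l_rad)


# Illustrative sight lines only; these are not specific Stripe 82 fields.
for l_deg, b_deg in [(50.0, -30.0), (90.0, -45.0), (160.0, -60.0)]:
    v = disk_locus_vgsr(l_deg, b_deg)
    print(f"l = {l_deg:5.1f}, b = {b_deg:5.1f}  ->  v_GSR = {v:6.1f} km/s")
```

Stars whose measured velocities lie far from this curve, such as the Sagittarius Stream members at $v_{\rm GSR} \approx -130$ km s$^{-1}$, stand out against the disk population.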
A final view of Stripe 82 is provided in Figure 18, which shows the density distribution of blue horizontal branch stars (BHBs) and blue stragglers (BSs), selected using the colour cuts of Yanny et al. (2000). The BHB population of course abuts the RR Lyrae population in the Hertzsprung-Russell diagram. We might expect to see all three substructures in the BHB density, and so it is reassuring that the Sagittarius Stream, the Hercules-Aquila Cloud and the Pisces Overdensity are all visible. The same substructures are also identifiable in the BS populations, with the exception of the Pisces Overdensity, which is of course too distant. The Sagittarius Stream is clearly bifurcated in BHBs, although not in BSs, suggesting that the ratio of BHBs to BSs varies along the Stream. There is some evidence for bimodality in the BHB distance distribution of the Hercules-Aquila Cloud.

CONCLUSIONS
We have constructed a catalogue of 21 939 variable objects in Stripe 82. The catalogue of variables is published in full as an electronic supplement to this article. We have extracted a sample of RR Lyrae stars, 316 RRab types and 91 RRc types, from the variable catalogue, using a combination of cuts based on colour, period, amplitude and metallicity. The RR Lyraes lie at distances 5-115 kpc from the Galactic centre and individual distance estimates, accurate to typically 8 per cent, are calculated using the colour, period and metallicity to estimate absolute magnitude.
If the RR Lyrae data are modelled by a smooth density distribution, then a good fit is provided by a broken power-law. The number density of RR Lyrae falls with Galactocentric radius $r$ like $n(r) \sim r^{-2.4}$ for $5 < r < 23$ kpc, switching to a much steeper decline, $n(r) \sim r^{-4.5}$, for $23 < r < 100$ kpc. However, smooth, spherically-averaged density laws do not tell the whole story, as in reality the RR Lyrae distribution is strongly clumped. In Stripe 82, the distribution of RR Lyraes is dominated by three enormous substructures, namely the Hercules-Aquila Cloud, the Sagittarius Stream and the Pisces Overdensity. We identified samples of 237 RR Lyraes in the Hercules-Aquila Cloud, 55 stars in the Sagittarius Stream and 28 in the Pisces Overdensity.
RR Lyraes belonging to the Hercules-Aquila Cloud are very numerous, and comprise almost 60 per cent of our entire Stripe 82 sample. Although there may be some contamination from a smooth component of RR Lyraes associated with the Galactic Bulge and Spheroid, there is no doubt concerning the existence of the structure, supporting the initial identification of Belokurov et al. (2007). We estimate that the total number of RR Lyraes associated with the Cloud is $2 \times 10^4$. The Hercules-Aquila RR Lyraes lie at distances from the Galactic Centre of 20.2 ± 11.3 kpc, and are metal-poor with [Fe/H] = −1.42 ± 0.24.
Both leading and trailing arms of the Sagittarius Stream also intersect Stripe 82. Simulations predict that the leading wrap is closer in heliocentric distance than the trailing, but the locations of the arms are not accurately known in this region of the sky. The heliocentric distances of our Sagittarius RR Lyraes, which predominantly are associated with the trailing arm, have a mean of 26.2 kpc and a dispersion of 5.5 kpc, whilst their metallicity is [Fe/H] = −1.41 ± 0.19.
We have also identified a new concentration, the Pisces Overdensity, consisting of 28 RR Lyraes centered on Galactic coordinates of ($\ell \approx 80^\circ$, $b \approx -55^\circ$). This is one of the most distant clumps so far found in the halo, as the RR Lyrae lie at distances of ∼80 kpc. Although the location is close to the Magellanic Plane, the Pisces Overdensity is much more distant than the Magellanic Clouds and may well be unrelated to any known component of the Galaxy. We have made an order-of-magnitude estimate of the total mass associated with the Overdensity as at least $\sim 10^4\,M_\odot$. The associated RR Lyrae have a metallicity [Fe/H] = −1.47 ± 0.34, comparable to the Hercules-Aquila Cloud, but richer than the typical populations in the outer halo.
Our investigation has exploited the advantages of RR Lyrae stars for identifying remnants and substructure present in the halo of the Galaxy. Together with earlier SDSS discoveries (Belokurov et al. 2007; Jurić et al. 2008), the patchy and clumpy nature of the RR Lyrae distribution adds support to the picture of an outer halo composed of overdensities and voids, perhaps entirely devoid of any smooth component (e.g. Bell et al. 2008). Further study of the kinematics and metallicities of RR Lyraes in Stripe 82 should lead to a major advance in our understanding of the Galactic halo, albeit that significant observational resources will be required to acquire the necessary follow-up spectroscopy.
Figure A1. Colour-colour plots for 4 648 variables brighter than g = 19.0 (left) and 21 789 variables brighter than g = 22.0 (right). The upper panels are g − r versus u − g, the lower panels are r − i versus g − r. Sesar et al. (2007) label these regions as white dwarfs (Region 1, red), low-redshift quasars (Region 2, orange), M dwarf/white dwarf binaries (Region 3, green), RR Lyraes (Region 4, cyan), main stellar locus (Region 5, blue) and high-redshift quasars (Region 6, purple).
Table A1. The distribution of candidate variable sources in the g − r versus u − g diagram. The columns list the fraction of the whole sample and the variable subsample lying in the six regions of the colour-colour plot.
[...] a width 2.52° in declination from δ = −1.26° to 1.26°. We extracted a sample of high-quality variable stars from the LMCC by imposing the restrictions that: i) $\chi^2_r > 3$ and $\chi^2_g > 3$, ii) a cut on the Stetson index $L_g > 1$, iii) at least 10 good epochs are retained, giving a catalogue of 21 939 variable objects. Applying Sesar et al.'s cuts to our catalogue gives 22 483 objects, with ≈ 80% in common with our subsample based on Stetson index cuts.
Even though Sesar et al.'s cuts give more candidates, the additional objects possess variability in different passbands that is not well-correlated.
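The three variability restrictions described above amount to a simple boolean filter on per-object light-curve statistics. The sketch below shows one way this could be written; the field names and the example records are invented for illustration and do not correspond to the actual LMCC column names or to real objects.

```python
def is_variable(chi2_r, chi2_g, stetson_l_g, n_good_epochs):
    """Return True if an object passes all three variability cuts."""
    return (chi2_r > 3.0 and chi2_g > 3.0  # (i) chi-squared in r and g
            and stetson_l_g > 1.0          # (ii) Stetson index in g
            and n_good_epochs >= 10)       # (iii) at least 10 good epochs


# Hypothetical light-curve summaries, invented purely for illustration.
candidates = [
    {"name": "A", "chi2_r": 12.0, "chi2_g": 15.0, "stetson_l_g": 2.3, "n_good_epochs": 28},
    {"name": "B", "chi2_r": 1.1, "chi2_g": 4.0, "stetson_l_g": 1.5, "n_good_epochs": 30},
    {"name": "C", "chi2_r": 8.0, "chi2_g": 9.0, "stetson_l_g": 0.4, "n_good_epochs": 25},
    {"name": "D", "chi2_r": 6.0, "chi2_g": 7.0, "stetson_l_g": 1.8, "n_good_epochs": 7},
]

selected = [
    c["name"] for c in candidates
    if is_variable(c["chi2_r"], c["chi2_g"], c["stetson_l_g"], c["n_good_epochs"])
]
print(selected)
```

Requiring the Stetson index to exceed a threshold favours objects whose brightness changes are correlated between bands, which is the property that the additional candidates from the alternative cuts appear to lack.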
Sesar and co-workers used a colour-colour plot to discriminate between different classes of variable objects. In Figure A1, our stellar subsample is plotted in g − r versus u − g and r − i versus g − r. Here, following Sesar et al. (2007), the g − r versus u − g plot is divided into six regions, and labelled according to possible occupants: white dwarfs (the red-coloured Region 1), low-redshift quasars (the orange-coloured Region 2), M dwarf/white dwarf binaries (the green-coloured Region 3), RR Lyraes (the cyan-coloured Region 4), stellar locus stars (the blue-coloured Region 5) and high-redshift quasars (the purple-coloured Region 6). The colour-space divisions provide only very rough classifications. In some cases (such as Region 1), the label does not even describe the typical population, and we merely use the labels as a point of comparison to Sesar's work.
The percentages of the variable subsample and the whole sample lying in the regions of the colour-colour plots are given in Table A1, whilst sample lightcurves have already been shown in Figure 3. Almost all (>93 per cent) of the variable objects lie in three regions, namely low-redshift quasars (53 per cent of the catalogue), stellar locus stars (31.4 per cent), and RR Lyrae stars (9.3 per cent). When split according to magnitude, the bright (g < 19.0) variable-sky is dominated by stellar locus stars, but the faint (g < 22.0) variable-sky is dominated by quasars. We can compare our results to Table 1 of Sesar et al. (2007), which shows the same quantities for their variable subsample. Our variability criteria pick out more variable objects, and in particular more denizens of the main stellar locus.
Figure A2. The spatial distribution of the variable subsample in Stripe 82. Objects are colour-coded according to the Regions of the colour-colour plot in which they lie (see Figure A1). The upper panel shows the number of all the variable objects versus right ascension.
The spatial distribution of variable objects in Stripe 82 is shown in Figure A2. The equatorial stripe reaches down to low Galactic latitudes beyond $\alpha \approx 18^{\rm h}$ (see e.g. Figure 1 of Belokurov et al. 2007). Variables belonging to Region 4 (RR Lyraes) and Region 5 (the main stellar locus) dominate here, whereas variables belonging to the other Regions are more uniformly dispersed in right ascension.