Discovery adductomics provides a comprehensive portrait of tissue-, age- and sex-specific DNA modifications in rodents and humans

Abstract DNA damage causes genomic instability underlying many diseases, with traditional analytical approaches providing minimal insight into the spectrum of DNA lesions in vivo. Here we used untargeted chromatography-coupled tandem mass spectrometry-based adductomics (LC–MS/MS) to begin to define the landscape of DNA modifications in rat and human tissues. A basis set of 114 putative DNA adducts was identified in heart, liver, brain, and kidney in 1–26-month-old rats and 111 in human heart and brain by ‘stepped MRM’ LC–MS/MS. Subsequent targeted analysis of these species revealed species-, tissue-, age- and sex-biases. Structural characterization of 10 selected adductomic signals as known DNA modifications validated the method and established confidence in the DNA origins of the signals. Along with strong tissue biases, we observed significant age-dependence for 36 adducts, including N2-CMdG, 5-HMdC and 8-Oxo-dG in rats and 1,N6-ϵdA in human heart, as well as sex biases for 67 adducts in rat tissues. These results demonstrate the potential of adductomics for discovering the true spectrum of disease-driving DNA adducts. Our dataset of 114 putative adducts serves as a resource for characterizing dozens of new forms of DNA damage, defining mechanisms of their formation and repair, and developing them as biomarkers of aging and disease.


Introduction
Even in the best case, DNA is subject to endogenous and environmental stressors that cause thousands of damage events in every cell each day (1)(2)(3) .In the worst case, DNA adducts resulting from exogenous and endogenous exposures drive a variety of cancers and other diseases ( 4-7 ) , with increased risk from excessive exposures or interindividual variations in metabolic activities and DNA repair capacity.There are numerous exogenous sources of DNA damage ( 8 ,9 ) and many endogenous sources, such as adventitious methylation by S-adenosylmethionine ( SAM ) ( 10 ) , alkylation by glyoxal, methylglyoxal, and a host of other reactive metabolites ( 11 ) , deamination by APOBEC and other editing enzymes ( 12 ) , oxidation of DNA and other molecules by reactive oxygen species ( ROS ) from leaky mitochondria, and the large variety of DNA-damaging chemistries generated during inflammation ( 13 ,14 ) .The availability of SAM as a cofactor for enzymatic DNA methylation may also regulate 5methyl-dC-based epigenetic marks in DNA, which can further drive expression of DNA repair or damage susceptibility genes ( 15 ) .While the literature is replete with > 100 known forms of DNA damage from exogenous and endogenous exposures, there are certainly many more forms of DNA damage products and structural variants that we have not yet discovered.
Confounding our relatively poor knowledge of the actual types and levels of DNA damage that arise in humans are the complexities of interindividual variations in environmental exposures and the genetics of DNA repair and metabolism, as well as the roles of sex differences and aging in the balance of DNA damage formation and repair.The mechanistic basis for sex-based differences in DNA damage includes estrogenrelated generation of ROS and DNA adducts, interference with some forms of DNA repair, and DNA-damaging inflammation during the estrous cycle, with exogenous endocrine disrupting chemicals complicating these DNA-damaging environments ( 16 ) .Emerging evidence indeed points to strong sex differences in DNA damage processing and repair pathways ( 17 ,18 ) .With regard to aging, one long-standing theory posits that mutagenic DNA damage caused by inflammation and metabolism can accumulate over time and drive age-related pathologies ( 4 ) .For example, the well-established shift in metabolism as a function of age ( 19 ) may be accompanied by age-related shifts in the spectrum of DNA-reactive metabolites.This accumulation can result from increased formation and decreased repair, with evidence of age dependence for both ( 20 ,21 ) .Indeed, slowly repaired or unrepaired DNA damage products are well-established sources of base substitutions, deletions, insertions, and genomic instability that drive many human diseases (22)(23)(24) .While there are general associations between some types of DNA damage and age-related pathologies, the tissue-, sex-and age-related mechanisms of damage formation and repair as well as the downstream consequences of age-associated damage events remain unknown ( 25 ) .
Traditional targeted approaches to analyzing DNA modifications have revealed species-, tissue-, sex, and age-related determinants of formation and repair ( 26 ,27 ) .There are numerous methods for targeted analysis of DNA lesions, including 32 P-postlabeling, immunoblot and comet assays, and chromatography-coupled tandem mass spectrometry ( LC-MS ) approaches (26)(27)(28) .For example, LC-MS with isotope-labeled internal standards provides the most precise, accurate, and chemically-specific approach to multiplexed quantification of dozens of DNA damage products ( 28 ) , with evidence for inflammation-induced increases in several classes of damage related to the chemical mediators of inflammation ( 29 ) .However, such targeted analyses are limited to known damage products and have yielded little information about the true spectrum of DNA adducts and lesions that arise in tissues and drive aging and disease in humans.Here we took an untargeted mass spectrometry-based 'adductomic' approach (30)(31)(32)(33)(34)(35) to define a comprehensive basis set of known and new DNA lesions and modifications in rat and human tissues.Subsequent targeted analysis of these putative DNA adducts revealed significant age-, tissue-, species-and sexdependent variations.This dataset not only serves as a starting point for defining biomarkers for aging, pathology, and disease, but it also begins the process of defining the landscape of biologically relevant DNA damage products mammalian tissues.

Materials and methods
All experiments reported in this study were performed in a single laboratory with the same reagents and sample processing methods to ensure consistency and control over experimental conditions.However, to assess the reproducibility of the results, two otherwise independent studies were performed by two different operators on two different Agilent tandem quadrupole LC-MS / MS systems with two different rat cohorts.These two nearly independent studies showed 70% identify among the adducts ( Supplementary Table S2 ) and strong similarity in tissue distribution of identified adducts from the two rat cohorts ( Figures 1 and 3 a ) , which rigorously validates the adductomics method.The resulting dataset serves as a resource to define the structures of novel DNA adducts, to define the mechanisms of their formation and repair, and to characterize their involvement in pathology and disease.

Reagents
T ris-hydrochloride ( T ris-HCl ) , magnesium chloride ( MgCl 2 ) , acetic acid, sodium acetate, formic acid, butylated hydroxytoluene ( BHT ) , tetrahydrouridine ( THU ) , coformycin, DNase I, benzonase, phosphatase alkaline, desferrioxamine mesylate salt, glyoxal, 4-thio-2 -deoxyuridine, 6-thio-2deoxyguanosine, 5-methyl-2 -deoxycytidine, 2-chloro-2deoxyadenosine, 8-hydroxy-2 -deoxyguanosine ( 8-oxodG ) , 5-hydroxymethyl-2 -deoxyuridine, 5-fluoro-2 -deoxycytidine and 6-thio-2 -deoxyguanosine were obtained from Sigma-Aldrich ( Saint-Louis, MO, USA ) .Phosphodiesterase I was purchased from Affymetrix ( Cleveland, Ohio, USA ) .[ 15 N ( U ) ]labeled 2 -deoxyguanosine was purchased from Cambridge Isotopes ( Andover, MA, USA ) .Isopropanol and acetonitrile were purchased from VWR Scientific ( Franklin, MA, USA ) .Deionized water was further filtered through MilliQ systems ( Millipore corporation, Bedford, MA, USA ) and used in the whole experiment.Anti-sarcomeric protein α-Actinin 2 and ProLong Gold Antifade were purchased from Thermo Fisher Scientific ( Waltham, MA, USA ) while 1,N  Here w e de v eloped a decision tree using different analytical filters to remo v e common salts, isotopomers, known artifacts, and potential noise ( signal intensity < 2 ) to obtain a final data set of DNA adducts for subsequent targeted analysis.The Venn diagram shows the final sets of 92 and 94 putative adducts identified in disco v ery studies f or tissues from two different rat cohorts, yielding a total set of 114 putative DNA adducts and modifications for subsequent quantitative analysis in each sample.( B ) Using the curated set of adducts, DNA adductome profiles were generated for individual DNA samples from rat liver ( B ) , kidney ( C ) , brain ( D ) and heart ( E ) in rats ranging in age from 1 to 26 months for rat group #1 ( blue ) and 4 to 26 months for rat group #2 ( yellow ) months.DNA adducts that were further identified using chemical standards are labelled with names and CID transitions.Each DNA adduct, defined or putative, is represented by its retention time ( min, y-axis ) , mass-to-charge ratio ( m / z , x-axis ) and relative abundance ( siz e of the bubble ) .Signals from the f our canonical DNA nucleobases are not represented in the plot.Blue bubbles and arrows are DNA adduct signals observed in group #1 while yellow are those observed in group #2.
C A, US A ) respectively.5-Hydroxymethyl-2 -deoxycytidine ( 5-HMdC ) , 1,N For the transgenic mice, two models were used.Reduced GLO1 activity was modeled using Glyoxalase I Knock Down ( Glo1-KD ) mice on a C57 / BL6 background and their wildtype littermates ( 36 ) , with the number of KD and WT mice for each tissue type as follows: liver, 11, 18; brain, 10, 15; and heart, 8, 12. Increased GLO1 activity was modeled using Glyoxalase I over-expression mice on a C57 / BL6 background and their wild-type littermates ( 37 ) , with the number of KD and WT mice for each tissue type as follows: brain, 5, 5; and heart, 7, 7. Frozen tissues were directly obtained from University of California San Diego.The ages for the knockdown and the overexpressing mice were 217-273 and 180-292 days, respectively.

Human samples
Procurement of 19 frozen samples of non-failing left ventricular myocardial tissue from human donors, deemed unsuitable for transplant, were obtained from University of Pennsylvania ( Philadelphia, PA USA ) ( 38 ) .Samples were immediately stored at -80 • C until DNA extraction.The ages of the 9 female and 10 male subjects ranged from 22 to 81 years.Donor demographics are detailed in Supplementary Table S1.Samples of human brain tissue from participants in the Religious Orders Study or Rush Memory and Aging Project ( ROSMAP ) without cognitive impairment ( control subjects, CT ) ranging in age from 71 to 89 years were obtained from the Rush Alzheimer's Disease Center ( Chicago, IL, USA ) ( 39 ) and were stored at -80 • C. Both studies were approved by an Institutional Review Board of Rush University Medical Center.Each participant signed an informed consent.Anatomic Gift Act, and Repository Consent to allow their data to be repurposed.NCI was defined as previously reported ( 40 ) .Donor demographics and definitions of clinical and neuropathological criteria are detailed in Supplementary Table S1.
For DNA adductome discovery analysis, mixtures of genomic DNA for each tissue type ( 5 μg from each mouse, 8 mice per age group, 4 age groups; 160 μg of DNA for each of 4 tissue types ) were dried under vacuum and reconstituted in 90 μl of 10 mM Tris-HCl buffer pH 8, 1 mM MgCl 2 , 10 μg / ml coformycin, 50 μg / ml THU, 1 mM desferrioxamine, 1 mM BHT and digested using 10 U benzonase, 5 U DNAse I, 17U alkaline phosphatase, and 0.1 U phosphodiesterase I. Blank samples were also prepared using the same procedure without the enzymes or without the DNA.Digestion was allowed to occur overnight at 37 • C, followed by protein removal using 10 kDa exclusion filters centrifuged at 12 800 × g for 10 min and analysis by LC-MS / MS.

DNA adductome analysis: adduct discovery in pooled DNA samples
Adductome analysis was performed on an HPLC Agilent 1290 Infinity II with an added Diode Array Detector ( DAD ) set at 260 nm absorbance coupled with the Agilent 6490 ( Group #1 rats, mice, humans ) or 6495 ( Group #2 rats ) triple quadrupole mass spectrometer with an electrospray ion source operated in positive ion mode with the following source parameters: drying gas temperature 200 • C with a flow of 12 l / min, nebulizer gas pressure 30 psi, sheath gas temperature 300 • C with a flow of 12 l / min, ESI capillary voltage 4000 V and nozzle voltage 0 V.A Kinetex EVO C18 column ( 2.1 × 150 mm, 2.6 μm, Phenomenex ) was used for chromatographic separation ( column temperature 30 • C ) .Binary mobile phase flow rate was 0.400 ml / min ( A -0.1% formic acid in water, B -0.1% formic acid in acetonitrile; delivered at 0% of B for 2 min; increased to 16% of B for 15 min; increased to 80% of B in 1 min and hold at 80% of B for 5 min; decreased to 0% of B in 1 min and re-equilibrate for 5 min ) .For 5-HMdC quantification, the binary mobile phase consisted of the following: A -water, B -5 mM of Ammonium Acetate in water, pH 5.3; delivered at 0% of B for 2 min; increased to 16% of B for 15 min; increased to 80% of B in 1 min and hold at 80% of B for 5 min; decreased to 0% of B in 1 min and re-equilibrate for 5 min ) .The effluent from the first min of LC elution was diverted to waste to minimize the contamination of the ESI source.The strategy was designed to detect the neutral loss of 2 -deoxyribose from positively ionized 2deoxynucleoside adducts by monitoring the samples transmitting their For each pooled DNA mixture for a tissue, analyses were performed in a series of 6 injections of 10 μl each of ( i ) enzymatically hydrolyzed DNA mixture ( 20 μg DNA ) , ( ii ) intact DNA mixture ( no enzymes; 20 μg DNA ) and ( iii ) a no-DNA blank, with an MRM table for each injection covering a 50 Da range in 1 Da increments from m / z 225 → 109 to m / z 525 → 409 for a total of 300 MRM transitions for the 6 injections ( collision energy 10 eV ) .Each of 8 methods covering the 300 MRM transitions had a final cycle length / scan time of 790.S2a for both groups of rats ( average of two rounds of stepped MRM for rat group #1 and one round for rat group #2 ) , were curated to remove artifacts, noise, and other spurious signals using the decision tree detailed in Figure 1 a to remove common salts, isotopomers, known artifacts and potential noise ( signal intensity < 2 ) .The resulting relative peak intensity of these signals are detailed in Supplementary Table S2e and plotted as a bubble chart in which the x-axis was the m / z and the y-axis was retention time ( Figure 1 ) .

DNA adductome analysis: adduct quantification in individual DNA samples
The final set of 114 putative DNA adducts and modifications detailed in Supplementary Table S2e was used for targeted analysis of individual tissue samples from rats.Here we injected 10 μg of DNA from each tissue sample on the LC-MS / MS system described earlier and analyzed the eluent with an MRM table for the 114 putative DNA adducts and modifications.The resulting signals were normalized by dividing by the sum of the UV signals for the four canonical 2 -deoxyribonucleosides, as described earlier.The normalized signal intensities for both groups of rats are detailed in Supplementary Table S3.

Identification of DNA damage products with standards
To tentatively identify the structures of adductomic signals, a cocktail of 2 -deoxyribonucleoside standards ( 10 nM each in 30 μl of water ) was analyzed by LC-MS / MS using the same conditions noted earlier for adductomic analysis.The goal here was to correlate retention times and MS / MS transitions for the standards with the adductomic data.The following transitions were monitored:

Identification of the dC dimer
The identity of the putative DNA adduct with m / z 455 → 339 was achieved by multi-stage mass spectrometry ( MS n ) using an HPLC-coupled Thermo Orbitrap ID-X Tribrid mass spectrometer coupled with a Vanquish LC ( Thermo Fisher ) .Five microliters of sample were injected on a Kinetex C18 column ( 150 × 2.1 mm, Phenomenex ) eluted with mobile phase A ( 0.1% formic acid ) and mobile phase B ( 0.1% formic acid in acetonitrile ) using the following gradient: 2 min at 0% then to 18% B in 13 min and to 60% B in 2 min.The original conditions were re-established in 1 min and the column reequilibrated for 5 min.MS 1 data were acquired at 120000 resolution in the orbitrap with a 70% RF lens.MS 2 data were acquired on the targeted m / z identified previously, using assisted HCD spanning 20, 35, and 60 normalized collision energy, and assisted CID at 15, 30, 45%.MS 2 fragments were measured in the Orbitrap at 60 000 resolution.MS 2 fragments with > 5% relative intensities were selected for MS 3 , with a fragmentation by CID at 30%, and analysis in the orbitrap at 30 000 resolution.All data were analyzed in xcalibur qual browser, with some of the fragmentation additionally analyzed in Massfrontier ( V8.0, Thermo fisher ) .

DNA adduct quantification by isotope dilution LC-MS / MS
For absolute quantification of specific DNA modifications and damage products, isotope dilution LC-MS / MS was performed on isolated DNA used for adductomic analyses.Ten microliters of each sample were injected into the LC-MS / MS instrument using the same LC-MS / MS parameters noted in 'DNA adductome analysis'.Characteristic reactions and collision energies for DNA adducts of interest are as follows ( m / z precursor ion → m / z product ion, collision energy ( eV ) , retention time ) :  15 N 2 ]-8-Oxo-dG ( m / z 286 → m / z 170, 10, 6.65 min ) .For canonical 2deoxyribonucleosides, calibration curves were obtained by injecting standard solutions on the LC-MS / MS system and measuring the UV signal with an in-line detector.The response factor was then calculated as the slope of the UV signal versus concentration.For DNA adducts, calibration samples containing fixed amounts of isotope-labeled standard ( IS ) and variable amounts of unlabeled standard ( std ) were injected on the LC-MS / MS and the response factor calculated as the slope of the std / IS area ratios versus the standard concentration.See Supplementary Figure S2 for both canonicals and adducts.Ultimately, the adduct quantities are normalized by dividing them by the quantities of canonical 2 -deoxyribonucleosides to correct for sample and injection variability.Raw mass spectrometric data were analyzed by MassHunter Qualitative Analysis software ( Agilent Technologies, USA ) .

Immunostaining and imaging
To prepare samples for immunostaining, samples were dissected from patients and stored at -80 • C freezer prior to transferring to Tissue-Tek optimal cutting temperature ( OCT ) compound for cryosection.Slides with sectioned samples were then fixed with ice-cold 100% methanol for 15 min at 4 • C, washed 3 times with PBS, and blocked for 1 h with a 5% goat serum, 0.3% Triton X-100 and PBS solution.Primary antibodies were then applied to samples in antibody buffer ( 1% BSA and 0.3% Triton X-100 in PBS ) , overnight at 4 • C. For cardiomyocyte staining, anti-sarcomeric protein α-actinin 2 was applied at 1:250 or 2 μg / ml, and 1,N 6ε dA antibody was applied at 1:100, in antibody buffer.Following five washes in PBS 0.3% Triton X-100, secondary antibodies were then applied for 1hr in antibody buffer, at 1:1000.α-Actinin 2 was detected with Goat anti-Rabbit IgG conjugated to Alexa 488, and 1,N 6 ε dA was detected with Anti-Mouse IgG conjugated to Alexa 555.To detect nuclei, Hoechst 33342 was applied for 10 min following secondary antibody removal, at 1:10 000 in a 0.3% Triton-PBS.Slides were then washed 5 times in 0.3% Triton-PBS, washed two times in PBS, and True-Black Plus Lipofuscin Autofluorescence Quencher was applied for 10 min to reduce autofluorescence.Finally, TrueBlack was washed 3 times with PBS, and slides were mounted in Pro-Long Gold Antifade.Images were captured with an LSM 980 Confocal Microscope ( Zeiss ) , using the Airyscan mode and a 63 × oil objective.Z-stacks were captured across nine tiles, to sample larger areas for image analysis.

Image analysis
To detect 1,N 6ε dA in cardiac cell nuclei within sectioned samples, we performed image analysis on captured confocal images using arivas Vision 4D software ( arivas ) .We used the machine learning module to detect cardiac cell networks identified by α-actinin 2 staining, the thresholding segmentation module to detect nuclei by Hoechst 33342 staining, and the blob finder module to detect 1,N 6 ε dA spots.Then we used the compartments module to detect nuclei within cardiac cells networks and 1,N 6 ε dA within these nuclei.Selected nuclei were encompassed by greater than 70% of the cardiac signal, and 1,N 6 ε dA puncta were selected when they were encompassed by greater than 50% nuclei signal in selected nuclei.

Statistical analyses
All statistical analyses were performed using either Graph-Pad Prism ( version 8.0 ) or R ( http://www.r-project.org ).Data are presented as means ± SEM (standard error of the mean).Each biological sample was prepared at least two times to confirm observation.Statistical significance was determined using one-way ANOVA with Bonferroni's multiple comparisons test for rat study and a two-sided unpaired Student's t -test to compare wild-type (WT) to KD / OE mice samples.Finally, we used Pearson's correlation in human non-failing heart samples.* P < 0.05, ** P < 0.01, *** P < 0.001 and **** P < 0.0001 was considered to indicate statistically significant differences.We traced differences of DNA adducts in rat sex and age using two-sided Mann-Whitney U test (wilcox.testfunction of stats package in R).The false discovery rate (FDR) was estimated using qvalue function in the qvalue package ( 41 ) in R. To identify the most strongly covarying DNA adducts, we performed principal component analysis (PCA) on shared adducts in rat samples (Figure 3 A, Supplementary Figure S3a, d) in both groups using the prcomp function (stats package in R).To find underlying patterns, we used mixture model clustering that is a probability-based approach in which we assume the data set is best described as a mixture of probability models.We employed the mclust package ( 42 ) for Gaussian mixture modeling (GMM) to determine the underlying Gaussian mixture distributions (Supplementary Figure S3c, f).For group #1, the first four PC spaces (cumulative contribution ration = 83.4%;Supplementary Figure S3a) and, for group #2, the first six PC spaces (cumulative contribution ration = 82.3%;Supplementary Figure S3d) were fed into the Mclust function with neither prior probability nor model assumptions.Gaussian distributions were randomly initialized and their parameters optimized iteratively to achieve best fit.Best model was chosen based on Bayesian Information Criterion (BIC) of various models with different covariances parameterisations as previously explained ( 42 ) as shown in Supplementary Figure S3b (group #1) and Supplementary Figure S3e (group #2).

Results
The adductomics approach involves an untargeted discovery phase in which the 2 -deoxynucleoside components of hydrolyzed DNA in pooled tissue samples are discovered by multiple reaction monitoring (MRM) of collision-induced dissociation (CID) products losing 2 -deoxyribose ( m / z 116) in 1 Dalton increments across the range of m / z 225-525 (i.e.'stepped MRM').This approach provides greater sensitivity ( ≤10 fmol; vide infra ) than simple neutral loss scanning.After curating data to remove known artefacts and other noise (Figure 1 A), the remaining putative DNA adduct signals serve as a basis set for subsequent MRM-based LC-MS / MS analysis of individual DNA samples to quantify the putative adducts.For all studies in rat, mouse, and human tissues, care was taken to avoid adventitious damage during DNA isolation and processing, with addition of antioxidants and deaminase inhibitors ( 28 ).Supplementary Table S2 details the adductomics discovery phase using DNA isolated from four tissues from groups of four male and four female Brown-Norway rats ranging age from 1-to 26-months-old, in two completely independent studies performed on different mass spectrometers with different rat cohorts.To provide the broadest coverage of adduct variation, DNA samples from each tissue were pooled across all ages to create four samples subjected to adductomics discovery analysis.The 300 mass spectrometer signals ( m / z 225-525) from adduct discovery for each organ were reduced to 92 and 94 mass spectrometer signals in the two independent rat studies and assigned as putative DNA adducts and modifications on the basis of several filtering criteria (Figure 1 A): (i) MS fragmentation released a mass consistent with 2deoxyribose, (ii) signals were detected in at least one tissue at a level that was at least three-times the average background signal intensity and were detected in all individual samples of each rat or human tissue, (iii) signals were not detected in no-DNA and no-nuclease controls (i.e.all reagents except DNA or nuclease / phosphatase), (iv) signals representing salts (e.g.Na + , K + ) were removed, (v) signals representing putative isotopomers of stronger signals ( m / z + 1 and + 2) were removed, (vi) known artifacts were removed, (vii) signals with normalized signal intensities ≤2.0 and not shared between the two independent analyses were removed.It is important to note that the mass spectrometer signals could represent co-eluting mixtures of chemicals with the same m / z value and cannot be assumed to represent individual 2 -deoxyribonucleosides until the signals are structurally validated.Further, the curation process may have inadvertently removed weak signals representing real DNA adducts, so the basis set of putative adducts must be viewed as a conservative estimate and a starting point for defining new adducts.For example, signals for putative M + 1 and M + 2 isotopomers were removed if they shared LC retention time and peak shape with a putative parent molecular ion.However, this amounted to ≤ 10% of all adduct signals for both groups of rats.The power of the adductomics method lies in the ability to discover a wide range of detectable damage products and modifications in cells or tissues ( > 100 here) and then quantify their variance among tissues and conditions.The resulting dataset serves as a foundation for defining the structures of novel DNA adducts, defining the mechanisms of their formation and repair, and characterizing their involvement in pathology and disease.
As evidence of the rigor of the discovery method, 72 of putative adducts (63%) were shared between the two independent cohorts of rats analyzed with two different mass spectrometer systems (Agilent 6490 and 6495) (Figure 1 A, Supplementary Table S2).We then attempted to assign chemical structures to each of the 114 putative adducts according to literature data ( 14 , 32-34 , 43-48 ), subsequent high-resolution mass spectrometric analysis of HPLC fractions, or by comparison to synthetic standards.The adductomic analysis revealed 66 previously undescribed putative 2 -deoxyribonucleoside species as well as 48 species with fragmentation and m / z values consistent with known damage products (average normalized signal intensities and tentative chemical identities in Supplementary Table S2 and Supplementary Figure S2).These results are summarized in the bubble plots in Figure 1 , depicting the identity and quantity of putative DNA damage products in liver (Figure 1 A), kidney (Figure 1 B), brain (Figure 1 C), and heart (Figure 1 D) tissues from the two rat cohorts (blue and yellow bubbles).
That the untargeted adductomic method revealed true DNA damage products and modifications is supported by confirmation of the structures of 10 signals as known DNA lesions based on comparison to chemical standards (Supplementary Figure S2c).For example, we detected a putative adduct with m / z of 242 → 126 in all tissues, which was val-idated as the epigenetic marker 5-methyl-2 -deoxycytidine (5-MdC; Figure 2 ).The fact that the signal intensity for the m / z of 242 → 126 5-MdC modification is several orders-ofmagnitude higher than other putative adducts in all tissues (Figure 1 , Supplementary Table S2) is consistent with its presence as ∼1% of all dC in the human genome ( 49 ,50 ).Also consistent with the quantitative rigor of the method, we detected an age-dependent signal with m / z 326 → 210 at much lower levels than 5-MdC and identified it as N 2 -carboxymethyl-2deoxyguanosine ( N 2 -CMdG) (Figure 2 , Supplementary Table S2)-an advanced glycation end-product resulting from the reaction between glyoxal and 2 -deoxyguanosine -by chemical synthesis of a standard for comparison by highresolution LC-MS analysis (Supplementary Figure S2c).Similar structural validation was achieved for DNA damage arising from oxidation (8-Oxo-dG) and alkylation (1,N 6ε dA, 3,N 4ε dC, M 1 dG, O 6 -or N 2 -MdG), as well as physiological epigenetic marks (5-HMdC, 5-FdC).The rigor of the method was further validated with a putative adduct having an m / z 455 → 339 (Supplementary Table S2).High-resolution MS 3  Orbitrap mass spectrometric analysis established the structure of this adduct as a protonated dimer of 2 -deoxycytosine, such as a cyclobutane dimer (Supplementary Figure S4).That this adduct co-eluted with dC is consistent with its formation as an artifact of ionization and not as a pre-existing damage product in DNA.

Tissue-and sex-specific DNA damage and modification in rats
The observation of 114 putative DNA adducts identified in two independent studies of four tissues from male and female rats that were 1-to 26-months-old raised the question of tissue-, sex-and age-specificity of the individual adduct species.At the simplest level, a comparison of adductomic signal intensities (Supplementary Figure S5) shows substantial variation among brain, heart, kidney, and liver.The Venn diagram in Figure 3 A quantifies the distributions of these adducts among the tissues, revealing that the epigenetic marks (5-MdC, 5-HMdC, 5-FdC) were detected in every tissue, as  S2 for adductomic signal data.
expected.While there were a few putative adducts uniquely detected in each tissue, 60-70% of the signals were detected in all tissues (52 of 92 in group #1 and 68 of 94 in group #2) (Supplementary Table S2).Among the adducts shared in all tissues, a principal component analysis (PCA) revealed strong tissue-specific variations in the levels of the shared adducts (Supplementary Figure S3a, c), with a Bayesian statistical approach (Supplementary Figure S3b,e) revealing 5 clusters common to both groups of rats (Supplementary Figure S3c, f).This similar clustering for the two independent adductomics experiments underscores the potential biological meaning of the datasets.A further Gaussian mixture model clustering of the 52 shared putative DNA adducts from group #1 revealed that signals from heart tissue were uniquely segregated into two clusters (II and III in Figure 3 b), with cluster III being heavily biased to sex (5 female) and age ( ≥16 months) (Figure 3 B), suggesting a strong sex and age bias for adducts in heart tissue.A significant number of sex-biased DNA adducts and modifications were observed in the four tissues from group #1, most strikingly for heart ( n = 30) and kidney ( n = 34) and less so for brain ( n = 1) and liver ( n = 2) (FDR = 1% in Figure 3C-F; Supplementary Table S4).Among the epigenetic marks, significant sex differences were observed for 5-HMdC in heart and kidney tissues, and for 5-FdC in kidney, with no difference observed for 5-MdC in any tissue (Supplementary Table S4).While clusters were strongly homogenous in group #1 rats, it was not possible to analyze group #2 rats for sex differences since there were no females in 2 of 4 age groups (Supplementary Figure S3d-f).

Adductome analysis in rat tissues reveals age-dependent DNA damage and modification
While most DNA adducts in rat tissues were unchanged with age, several adducts showed strong age dependence in the different tissues: brain, 8; heart, 3; kidney, 6; and liver, 19 ( P < 0.1).Among these are known adducts 5-HMdC, N 2 -CMdG, N 2 -methyl-dG, 8-Oxo-dG and 1,N 6ε dA (Supplementary Figure S6, Supplementary Table S5).To validate the adductomic results, we performed sensitive and specific absolute quantification of the known adducts, using isotope dilution triple-quadrupole mass spectrometry, as shown in Figure 4 , Supplementary Figure S7, and Supplementary Table S5.The epigenetic mark 5-HMdC was previously observed to increase with age in mouse liver, kidney and brain ( 49 ,50 ) and in human tissues ( 52 ).Our study confirmed this increase in rats aged 1 to 26 months in liver (2.8-fold, Figure 4  Blue and y ello w numbers (X-Y) correspond to the 92 putative adducts identified in group #1 rats and 94 in group #2 rats.Data are from Supplementary Table S2.( B ) Gaussian mixture model clustering of the 52 shared adducts in group #1 rats (Supplementary Figure S3a-c) re v ealed 5 underlying Gaussian distributions (Clusters I-V).The number of members in each cluster is shown in parentheses.Members of Cluster III comprise a striking group of 5 heart samples from mainly older female rats (labeled with sample # and age).( C-F ) Bar plots of q -values for comparisons of putative DNA adduct levels in tissues from female versus male group #1 rats: ( c ) brain (81 adducts), ( D ) heart (71 adducts), ( E ) kidney ( 71) and ( F ) liver ( 77 ).q -Values were calculated from P -values obtained with the Mann-Whitney U test on data from Supplementary Tables S2 and S4.Dotted lines represent the false discovery rates (FDR) at 10, 5 and 1%, with values in parentheses representing the number of significant adducts at each threshold.
in 8-Oxo-dG in rat liver ( 53 ), with a 1.8-fold increase over 26 months of age ( P = 0.0001; Supplementary Figure S7e).These results validate the quantitative robustness of the adductomic analyses.Similar analyses were not performed on rat group #2.We also found age-dependent increases in N 2 -CMdG, an advanced glycation end-product (AGE) that results from a reaction between glyoxal and 2 -deoxyguanosine (dG) ( 54 ) and that has been in mouse hepatoma cells, 293T human kidney cells, and calf thymus DNA ( 45 ,55 ).In rat, N 2 -CMdG increased in 3-fold in liver (Figure 4 B) and 1.5-fold in heart (Figure 4 H) over 1-26 months but remained unchanged in kidney (Figure 4 D) and brain (Figure 4 F).While mutations caused by glyoxal-induced DNA adducts can be prevented by nucleotide excision DNA repair (NER), mismatch DNA repair (MMR), and the protein DJ-1 in vitro ( 56 ,57 ), metabolicallyderived glyoxal, methylglyoxal, and other reactive aldehydes are removed by the glyoxalase system of GLO1 and GLO2 in humans to prevent DNA adduct formation.In this system, GLO1 converts glyoxal, methylglyoxal, and glutathione (GSH) to S -glycolylglutathione that is cleaved by GLO2 to glycolate, D-lactate, and GSH ( 58 ).Here, we quantified the ef-fect of GLO1 expression on N 2 -CMdG levels in brain, heart, and liver tissues from mice with Glo1 knock-down ( 36 ) or over-expression ( 37 ) and wild-type littermates.Surprisingly, the only change observed in N 2 -CMdG levels involved a significant increase in liver in GLO1 over-expressing mice compared to wild-type (Supplementary Figure S7f-j).These data indicate either that N 2 -CMdG is formed from sources other than glyoxal or that GLO1 does not significantly affect glyoxal levels in mice, either of which has implications for the observed age-dependent accumulation of N 2 -CMdG.

5-HMdC and 1,N 6ε dA are age-dependent DNA adducts in human heart
The observed results in rats raised the question of agedependent DNA damage and modification in humans.To this end, we performed adductomic analyses with DNA from nonfailing human myocardial tissues comprised of left ventricular myocardial samples from 17 people ranging in age from 22to 81-years-old ( 37 ,59 ) and from 10 human brain tissue samples from persons without cognitive impairment ranging in age from 71-to 89-years-old (Supplementary Table S1).The stepped MRM analysis of pooled tissues revealed 111 total putative DNA adducts and modifications (Supplementary Table S6a), with subsequent MRM-based LC-MS / MS analysis revealing 91 in brain and 87 in heart (Figure 5 a,b, Supplementary Table S6b).The Venn diagrams in Figure 5 B reveal that rats and humans share signals for 62 putative adducts in brain and 53 in heart.Among these, adductomics revealed that 5-HMdC and 1,N 6 ε dA all increased with age in human hearts (Figure 5 A, Supplementary Table S6b), with this age dependence corroborated by isotope dilution LC-MS / MS that shows a significant and positive correlation for 5-HMdC ( P = 0.021) and 1,N 6ε dA ( P = 0.0048) in non-failing human hearts (Figure 5 C, D; Supplementary Table S5).As a product of reaction of dA with electrophilic products of lipid peroxidation ( 60 ), 1,N 6ε dA levels are elevated in human diseases such as Wilson's disease, hemochromatosis and cirrhosis ( 61 ), and show age-dependent increases in liver and brain in the O XY S rat model of aging due to oxidative stress ( 62 ).Supplementary Figure S8 shows 1, N 6 ε dA concentrated mainly in select cardiomyocyte nuclei, raising the question of the age of these cells in the local tissue environment.We do note that, despite the statistically significant age dependence of several adducts, the small sample size limits interpretation of the data.

Discussion
In spite of the overwhelming evidence that imbalances in DNA damage and DNA repair cause and drive many human diseases, such as in cancer ( 5), the actual landscape of DNA dam-age chemistry that arises in human cells and tissues has not been systematically defined.Here, we took a first step to unravel this problem with systems-level discovery of the spectrum of DNA damage products and modification in rat and human tissues.Such an analysis is possible only with the emergence of convergent adductomic technology consisting of untargeted mass spectrometric detection and quantification of virtually any type of nucleobase damage in DNA ( 33 ) coupled with statistical modeling to visualize patterns and covariations among the putative DNA damage products.This systems-level untargeted DNA adductomic approach serves as a discovery tool by revealing all detectable forms of DNA damage, including new chemical species, followed by providing relative quantification of adduct levels across different conditions.The sets of 114 putative DNA adducts in rats and 111 in humans identified here provide an opportunity to identify their structures, define their mechanisms of formation and repair, and ultimately link them to the genetic toxicology of aging and disease.

Validation of the adductomic method
As an untargeted method, there is always concern about detecting artifacts caused by non-DNA contaminants that produce a low-resolution CID fragmentation mimicking loss of 2 -deoxyribose.Such artifacts are mitigated by starting with highly purified DNA and accounting for contaminants in non-DNA components of the analyzed samples.Each analytical run must include controls lacking buffer, enzymes, and DNA to identify contaminants.Our adductomic method helped us to achieve these requirements.Selected signals were reproducibly present at 3-times the background level in all 32 samples of each analyzed rat tissue ( i.e. 8 samples of each tissue at 4 different ages; 22% average error), with more than half of the detected adducts being shared between rats and humans in brain and heart tissues (Figure 5 b) in two different sets of rats.The adductomic method was validated (i) by post hoc identification of 10 detected adducts using chemical standards and (ii) by high-resolution structural analysis of products, such as the dC dimer.Of the remaining 114 unidentified species, about half had low-resolution m / z values consistent with known DNA damage structures (Supplementary Table S2).More interesting after filtering our data, we were able to find 63% of shared adducts between the two studies showing the robustness of our approach.Thus, there is a high probability that most signals detected in our adductomic analyses are indeed DNA adducts that merit identification and definition of formation and repair mechanisms to understand their roles in pathobiology and the basis for our observed age, tissue, and sex biases.Of course, the adductomic method is not as sensitive as isotope-dilution LC-MS / MS and will not detect labile nucleobase species, such as rapid depurination of 8-nitro-2 -deoxyguanosine arising during inflammatory stresses ( 14 ), or species arising from oxidation of the 2 -deoxyribose moiety ( 13 ).There are also complexities in the adductomic method, such as MS ionization of adducts as salts (e.g.possible ammonium salt of 5-Cl-dC for m / z 279-163; Supplementary Table S2) and rearrangements of free 2 -deoxyribonuclesides (e.g.sugar tautomers unique to FAPy-dG; vide infra ).Despite these constraints, the strength of the adductomic approach lies in the insights gained from mining the systems-level datasets for correlative patterns that reveal behaviors of classes and groups of DNA adducts across different conditions of health and disease.By first identifying adductomic patterns, then identifying the chemical species driving the patterns, and finally doing the biochemical detective work, we can define the DNA damage products that truly drive human disease.

Mining adductomic data for chemical classes of DNA adducts
There are many ways to mine adductomic datasets.One initial approach is to identify a putative DNA damage species that correlates with a condition of interest, such as the age-dependent increases in 8-Oxo-dG (Supplementary Figure S7e), and then explore other mechanistically related species, such as the large and well-studied class of purine damage products arising during inflammation and other oxidative and nitrosative stresses (Figure 2 ) ( 14 ).Taking this path, we identified 8-Oxo-dG among the 114 adducts using a standard but failed to detect signals accordant with 8-Oxo-dA ( m / z 268 → 152) or FAPy-dA ( m / z 270 → 154).This is consistent with the observation of 10-fold lower levels of 8-Oxo-dA than 8-Oxo-dG and the relative amounts of 8-Oxo-dA, 8-Oxo-dG, F APy-dA, and F APy-dG in cells and tissues subjected to irradiation and other oxidative stresses (Figure 2 ) (63)(64)(65).While we did detect two signals suggestive of the known anomeric or sugar tautomer forms of FAPy-dG ( m / z 286 → 170) (66)(67)(68), the observed HPLC retention times of 2.9 and 4.9 min for m / z 286 → 170 (Supplementary Table S2) are inconsistent with expectations-based retention times of ∼0.7 and ∼0.9 min for 5-methyl-FAPy-dG (Figure 2 ).We also observed signals suggestive of the other oxidized purine nucleobase products noted in Figure 2 : oxazolone (Oz; m / z 247 → 131), nitroimidazole (NitroIm; m / z 287 → 171), spiroiminodihydantoin (Sp; m / z 300 → 184; same as FAPy-dG), and guanidinohydantoin (Gh; m / z 274 → 158).While 8-Oxo-dG and putative Oz were variably present in rat and human tissues, the putative NitroIm signal was only present in rat liver tissue and a signal consistent with Gh was present in all rat and human tissues at high levels.The observation with Gh runs contrary to published in vitro studies of DNA damage caused by a variety of oxidants in which Gh rose at levels nearly two orders-of-magnitude lower than 8-Oxo-dG in all cases ( 69 ).However, the biochemical environment in cells and tissues has the potential to strongly influence the chemical reactions involved in DNA damage formation and partitioning, which leaves the door open to the identity of m / z 274 → 158 as Gh.Based on the HPLC retention times of chemical standards, we ruled out Sp and 5-methyl-FAPy-dG as the adduct present at m / z 300 → 184: ∼0.95 min for Sp (consistent with its hydrophilicity) and ∼0.7 and ∼0.9 min for the anomeric or tautomeric pair of FAPy-dG species (Figure 2 ).The adductomic data thus provides the foundation for a host of testable hypotheses about the in vivo behavior of entire classes of adducts arising by similar chemical mechanisms: Are in vitro observations recapitulated in vivo ?How does the cellular environment alter chemical partitioning?What are the true oxidants causing damage to dG and dA in vivo ?Biochemical detective work, structural validation, and subsequent targeted analyses can then be used to test these models.Structural validation using high-resolution mass spectrometry and chemical synthesis is well-practiced and entirely feasible, as we demonstrated with the dimer of dC and N 2 -CMdG.One can also define the role of specific DNA repair mechanisms by comparing adductomic profiles between wild-type and DNA repair mutant animal models or across human populations with defined interindividual repair capacities.

DNA adduct biases among tissues
The power of comparing adductomic profiles across different conditions is illustrated by our observations of tissue-, age-and sex-specific biases in adduct spectra.Strong tissuespecific biases were observed in rat for 114 putative and validated DNA adducts (Figure 3; Supplementary Figure 3d-f, S5), with a variety of tissue-specific or at least tissue-enriched DNA damage products (Supplementary Table S2).The normalized signal intensities for adducts noted in Supplementary Table S2 represent quantitative metrics that allow relative comparisons of adduct levels across tissues given the precision of the LC-MS signals, with appropriate caution that the signal intensities do not reflect absolute amounts of the putative adducts due to differences in ionization and fragmentation efficiencies.After removing the large signal for epigenetic mark 5-MdC, summing the signal values in a tissue represents a crude measure of the DNA adduct load in the tissue (number of adduct types and percentage of total adduct load in parentheses): rat brain, 3517 (81; 28%); rat heart, 1458 (71; 12%); rat kidney, 3574 (71; 29%); rat liver, 3790 (77; 31%); human brain, 6142 (91; 63%); human heart, 3584 (87; 37%).The DNA adduct loads were higher in the tissues analyzed in group #2: rat brain, 9483 (85; 24%); rat heart, 7085 (84; 18%); rat kidney, 10552 (90; 26%); rat liver, 13080 (90; 32%).The higher normalized signal intensities observed for the group #2 rat tissues compared to group #1 reflect higher sensitivity of the Agilent 6495C MS used to analyze group #2 compared to the Agilent 6490 MS used for group #1.While there is little meaningful information that can be derived from the adduct load data, it is notable that both rat and human heart have the lowest level of adducts relative to other tissues and that the two rat groups shared a similar order of adduct loads (heart < brain < kidney < liver).The complexity of the adductomic datasets, as mixtures of potentially all classes of lesion types and repair pathways, precludes comparisons to published studies showing tissue-specific variation in adduct repair for one or a few representative adducts or a single repair class (70)(71)(72).For example, Gosponodov et al. estimated NER rates among rat kidney (0.1 lesions / kb / hr), liver (0.1 lesions / kb / hr), brain (0.08 lesions / kb / hr), and heart (0.04 lesions / kb / hr) ( 73 ), while Langie et al. found lower BER activity in mouse brain compared to liver ( 74 ).Cell replication rates could also account for tissue differences.For example, repair of 8-Oxo-dG and FAPy-dG occurs by OGG1 baseexcision repair throughout the cell cycle and by MUTYH removal of A at 8-Oxo-dG:A mispairs during DNA replication ( 75 ).MUTYH activity is highest in rapidly dividing cells ( e.g .gastrointestinal tract) due to replication-coupled repair of 8-Oxo-dG:A mismatches.However, in addition to other repair metrics, there is no correlation between tissue-specific levels or loads of adducts and cell turnover rates in humans: hepatocyte cells (200-400 days) ( 76 ), endothelial kidney cells ( > 1 year) ( 77 ), cardiomyocytes ( ∼50 years) ( 78 ) and neurons (lifespan) ( 79 ).Even less is known about tissue-specific metabolism that causes DNA adducts.For example, mitochondrial density and energy activity correlate with production of reactive oxygen species and associated oxidative damage to lipids and carbohydrates, to form DNA-reactive alkylating agents, as well as to DNA and RNA directly ( 70 ,71 ).Brain and liver are lower in both metrics compared to heart and kidney ( 70 ,71 ), but, again, differences among the tissues in adduct load and the number of types of adducts in these tissues do not correlate with these metrics.These attempts to explain the observed biased tissue distribution of known and putative DNA adducts point to potential value in performing adductomic analyses in genetically engineered mouse models for the various DNA repair pathways, by identifying true substrates of DNA repair enzymes, quantifying the contribution of repair to adduct levels in each tissue, and to the age-and sex-dependence of adduct levels.

DNA adduct biases among different cell types in tissues
One of the limitations of performing adductomic analyses on whole tissues is the potential for biases among specific cell types comprising the tissue.Enzymatic disaggregation of tissues may be the optimal approach for isolating cell populations, with the potential for adequate DNA for adductomic analyses, though DNA damage artifacts during processing are likely.Heat-induced artifacts are highly likely to occur with laser-capture microdissection, with DNA yield (30 ng / mm 2 ) and low resolution also limiting its utility ( 80 ).While the 5-10 μm resolution of imaging mass spectrometry may provide adequate resolution in tissues, the method is likely to be limiting in terms of sensitivity (1 fmol) ( 81 ).Immunohistochemistry with highly specific antibodies ultimately optimizes chemical specificity with resolution.Here we used immunostaining to detect 1,N 6ε A in the nuclei of cardiomyocytes in human ventricular myocardial samples, which indicated that accumulation may be limited to a small number of cells.While this may result from a sensitivity threshold and all cells show some level of the adduct, substantial accumulation of DNA adducts in a small number of cells might indicate damage specific to cell subtypes or to the oldest cells in the tissue, with the latter indicating compromised genomic integrity during senescence.

DNA adduct age-dependence
Comparative analysis of DNA adducts in rats across the age span of 1 to 26 months revealed that a subset of DNA adducts did not change, several adducts showed significant agedependent accumulation and decreases with age in a tissueand sex-specific manner (Figure 4 , Supplementary Figure S7).This suggests shifts in the balance between adduct formation and repair mechanisms over the life of the animal.We observed significant age-dependent increases of N 2 -CMdG in rat liver and heart (Figure 4 ).N 2 -CMdG and the related N 2 -(1carboxyethyl)-2 -deoxyguanosine ( N 2 -CEdG) are advanced glycation end-products resulting from the reaction of glyoxal and methylglyoxal, respectively, with 2 -deoxyguanosine in DNA.This is not surprising since the methylglyoxal and glyoxal are reactive α-oxoaldehydes formed in abundance during the glucose metabolism ( 82 ) in liver ( 83 ) and there is evidence for age-dependent decreases in base excision repair (BER) mechanism in mouse liver ( 84 ), with N 2 -CMdG likely repaired by the BER enzyme alkyladenine glycosylase (Aag) ( 85 ).In studies examining age-dependent mechanisms of formation, we observed no effect of glyoxalase I activity on N 2 -CMdG levels.It is known that glyoxalase I prefers methylglyoxal as a substrate over glyoxal and protects against N 2 -CEdG ( 86 ,87 ).Interestingly, we detected a signal consistent with N 2 -CEdG at m / z 340 → 224, which has been shown to be a substrate for nucleotide excision repair (NER; ( 88 ), and found strong covariance of this signal with that for N 2 -CMdG (Supplementary Figure S7k, Supplementary Table S2).
Adductomic analysis of human heart tissue also revealed age-dependent DNA damage and epigenetic marks, with the caveat that the number of tissue samples available is relatively small and data should be interpreted with caution.We observed a significant age-dependent accumulation of 5-HMdC, which results from oxidation of 5-MdC ( 52 ), in non-failing heart tissue, which parallels a published observation of increases in 5-HMdC with age in grey and white matter of the human cerebrum ( 50 ).We also found an age-dependent increase in the 1,N 6ε dA damage product.While this contrasts the inconsistent changes in 1,N 6 ε dA in rat heart (Supplementary Figure S7d), it is consistent with the idea of cardiomyocytes becoming more susceptible to oxidative stress with age ( 89 ), with cardiac DNA repair mechanisms tending to be less efficient with age ( 90 ), and with the observation of agedependent increases in the 4-hydroxy-2-nonenal (HNE) lipid peroxidation product that reacts with DNA to form 1,N 6ε dA ( 91 ,92 ).The immunohistochemical evidence for concentration of 1,N 6ε dA in subpopulations of cardiomyocyte nuclei suggests a possible correlation between adduct formation and cell age or senescence.Wang and coworkers have used targeted LC-MS / MS to reveal age-and tissue-dependent accumulation of oxidative DNA damage ( e.g.8,5 -cyclopurine lesions, DNA intrastrand cross-links) in tissues from rodent models (93)(94)(95).Importantly, the accumulation of these DNA adducts correlated with loss of DNA repair activity (93)(94)(95).

DNA adduct sex-dependence
The complexity of the network of determinants of DNA adduct spectra is increased by our observation of sex-specific differences in adduct levels in all four of the tissues studied (Figure 3 C-F, Supplementary Table S4).However, the plot in Figure 3 B shows that tissue is a more distinguishing feature for adduct levels than age or sex, with the exception for heart showing significant biases for older female rats (Figure 3 B).Interestingly, while the epigenetic mark 5-MdC did not show a sex bias in any of the rat tissues, its oxidation products 5-HMdC and 5-FdC were significantly higher or trending higher in female kidney and heart (Supplementary Table S4).We recognize that the choice of the two independent rat cohorts was suboptimal due to availability limitations.As a result, we were unable to compare the sex dependency between rat groups #1 and #2.While sex-based biases in metabolism may account for the observation that females show higher levels of smoking-related DNA adducts in human lung tissue ( 96 ) and in mice treated with 2-amino-3-methylimidazo [4,5-f]quinoline, a potent food mutagen ( 97 ), our observations with 5-HMdC and 5-FdC suggest sex differences in epigenetic regulation of gene expression.Unfortunately, we were unable to study the sex-dependence of adducts levels in human hearts and brains due to the restricted availability of human samples and their disparate age distribution.The variety of unknown putative DNA adducts showing sex biases merits further investigation of their structures and definition of their potential roles in cell biology, aging, and pathology.

Conclusions
While much remains to be done for structural characterization, mechanisms of formation and repair, and involvement in diseases, the adductomics method proved to be an effective approach for discovering individual DNA damage products as well as for systems-level analysis of the behaviors of dozens of putative DNA adducts as a function of tissue type, age and sex.Such covariate behavior can be highly informative about local environment of DNA repair and metabolism, especially when combined with large metabolomics ( 98 ), proteomics ( 99 ,100 ) and other datasets.By opening the door to defining mechanisms underlying the age-sex-and tissue-specific variations in the levels of > 100 putative and method-validated DNA adducts, the adductomic technology illustrated here is a powerful approach to (i) assigning substrates to DNA repair pathways using genetically engineered animal models, (ii) understanding mechanisms of DNA damage formation by monitoring classes of DNA lesions and manipulating metabolic pathways and (iii) quantifying inter-individual variation in DNA damage and repair across populations and pathologies.These efforts will be enhanced by advancing the mass spectrometry and data analysis tools for adductomic profiling ( 100 ).The adductomic dataset developed here not only serves as a starting point for defining biomarkers for aging, pathology, and disease, but it also begins the process of defining the landscape of biologically relevant DNA damage products in mammalian tissues.

Figure 1 .
Figure 1.Adductomic analysis in two different sets of rat tissues.Adductomics involves two stages of data acquisition: ( i ) disco v ery of putative adducts by stepped MRM analysis of pooled DNA samples and ( ii ) subsequent targeted analysis of a high-confidence set of adductomic signals representing putative DNA adducts in each tissue sample.( A ) For the first step of identifying putative DNA adducts, the stepped MRM signals must be curated to remo v e artif acts, noise, and other spurious signals.Here w e de v eloped a decision tree using different analytical filters to remo v e common salts, isotopomers, known artifacts, and potential noise ( signal intensity < 2 ) to obtain a final data set of DNA adducts for subsequent targeted analysis.The Venn diagram shows the final sets of 92 and 94 putative adducts identified in disco v ery studies f or tissues from two different rat cohorts, yielding a total set of 114 putative DNA adducts and modifications for subsequent quantitative analysis in each sample.( B ) Using the curated set of adducts, DNA adductome profiles were generated for individual DNA samples from rat liver ( B ) , kidney ( C ) , brain ( D ) and heart ( E ) in rats ranging in age from 1 to 26 months for rat group #1 ( blue ) and 4 to 26 months for rat group #2 ( yellow ) months.DNA adducts that were further identified using chemical standards are labelled with names and CID transitions.Each DNA adduct, defined or putative, is represented by its retention time ( min, y-axis ) , mass-to-charge ratio ( m / z , x-axis ) and relative abundance ( siz e of the bubble ) .Signals from the f our canonical DNA nucleobases are not represented in the plot.Blue bubbles and arrows are DNA adduct signals observed in group #1 while yellow are those observed in group #2.

Figure 3 .
Figure 3.DNA adducts are strongly tissue-and gender-specific in rats.( A ) Venn diagram of the putative DNA adducts shared among the four rat tissues.Blue and y ello w numbers (X-Y) correspond to the 92 putative adducts identified in group #1 rats and 94 in group #2 rats.Data are from Supplementary TableS2.( B ) Gaussian mixture model clustering of the 52 shared adducts in group #1 rats (Supplementary FigureS3a-c) re v ealed 5 underlying Gaussian distributions (Clusters I-V).The number of members in each cluster is shown in parentheses.Members of Cluster III comprise a striking group of 5 heart samples from mainly older female rats (labeled with sample # and age).( C-F ) Bar plots of q -values for comparisons of putative DNA adduct levels in tissues from female versus male group #1 rats: ( c ) brain (81 adducts), ( D ) heart (71 adducts), ( E ) kidney (71) and ( F ) liver( 77 ).q -Values were calculated from P -values obtained with the Mann-Whitney U test on data from Supplementary TablesS2 and S4.Dotted lines represent the false discovery rates (FDR) at 10, 5 and 1%, with values in parentheses representing the number of significant adducts at each threshold.

Figure 5 .
Figure 5. Disco v ery of age-tracking DNA adducts in the human heart.( A ) T he bubble plot sho ws the DNA adductome map in non-f ailing human heart.Mass spectrometry transitions are noted abo v e the names of adducts verified by chemical standards.Adducts that increase with age are highlighted in red .( B ) Venn diagrams showing overlap among the adductomic signals detected in human and rat (group #1) heart and brain.( C, D ) Levels of 5-HMdC and 1, N 6 -ε dA, 5-HMdC were quantified in 19 human non-failing heart samples from donors ranging in age from 22-to 81-years-old by isotope dilution chromatograph y -coupled triple quadrupole mass spectrometric analysis.P earson 's correlation test was used to establish correlation between 5-HMdC and 1, N 6 -ε dA le v els and age; * P < 0.05 and ** P < 0.01.