DNA molecular combing-based replication fork directionality profiling

Abstract The replication strategy of metazoan genomes is still unclear, mainly because definitive maps of replication origins are missing. High-throughput methods are based on population average and thus may exclusively identify efficient initiation sites, whereas inefficient origins go undetected. Single-molecule analyses of specific loci can detect both common and rare initiation events along the targeted regions. However, these usually concentrate on positioning individual events, which only gives an overview of the replication dynamics. Here, we computed the replication fork directionality (RFD) profiles of two large genes in different transcriptional states in chicken DT40 cells, namely untranscribed and transcribed DMD and CCSER1 expressed at WT levels or overexpressed, by aggregating hundreds of oriented replication tracks detected on individual DNA fibres stretched by molecular combing. These profiles reconstituted RFD domains composed of zones of initiation flanking a zone of termination originally observed in mammalian genomes and were highly consistent with independent population-averaging profiles generated by Okazaki fragment sequencing. Importantly, we demonstrate that inefficient origins do not appear as detectable RFD shifts, explaining why dispersed initiation has remained invisible to population-based assays. Our method can both generate quantitative profiles and identify discrete events, thereby constituting a comprehensive approach to study metazoan genome replication.

Single-molecule (SM) methods are capable of revealing rare initiation events. DNA fibre stretching by DNA spreading or molecular combing, combined with the immunodetection of distinct thymidine analogs sequentially incorporated in cells grown in vitro, allows direct visualisation of newly synthesized DNA along individual molecules and discrimination between elongating forks, initiation and termination events (31)(32)(33). Moreover, together with the use of fluorescence in situ hybridization (FISH) probes, DNA fibre analysis can map replication signals within a specific locus (34). However, the throughput of classical SM techniques is drastically low and incompatible with genomewide studies. Very recently, novel nanopore sequencingbased methods to study DNA replication at the SM level, namely DNAscent (35), FORK-seq (36) and Replipore sequencing (37), tipped SM analyses into the high-throughput era. Noticeably, both DNAscent and FORK-seq techniques successfully reproduced population-based replication profiles of the yeast Saccharomyces cerevisiae genome from the assembly of thousands of individual replication signals, while uncovering that, in addition to efficient initiation at known origins, a significant proportion (9-20%) of initiation takes place at dispersed, inefficient sites previously missed by population-based assays (35,36). This novel generation of SM techniques, together with the development of high-throughput optical mapping of DNA replication (38)(39)(40)(41), will undoubtedly allow routine, genome-wide replication profiling in higher eukaryotes in the near future. Still, current nanopore-based methods are limited by the sequencing costs to achieve high coverage studies of metazoan genome replication, while optical replication mapping approaches are presently unable to measure fork velocity, a critical replication parameter. At the same time, standard SM analyses of specific loci in metazoans, which can theoretically provide a comprehensive analysis of the replication dynamics of the targeted locus, have been underexploited. These usually concentrate on positioning individual replication signals, particularly initiation and termination events (34,(42)(43)(44)(45)(46), leaving aside the opportunity to assemble a reliable replication fork directionality (RFD) profile of the locus from the collection of a large number of oriented replication tracks, similar to what was done genomewide in yeast by FORK-seq (36). RFD reveals the proportion of rightward-and leftward-replicated DNA and allows a quantitative analysis of replication initiation, progression and termination (11)(12)(13)(14)(15)(16)(17)(18)(19)(47)(48)(49)(50)(51).
We have recently used DNA combing to demonstrate that transcriptional activity modulates origin distribution in the 616-and 996-kb-long CCSER1 and DMD genes, respectively, in chicken DT40 cells, and proposed that transcription favours efficient initiation at the 5 and 3 ends of large genes at the expense of the gene body (46). Here, we reanalyzed these datasets and extracted the orientation of each replication track to compute the RFD profiles of inactive and active DMD, as well as those of CCSER1 when it is transcribed at wild-type (WT) levels or overexpressed. In order to compare SM-and population-based approaches, we also generated the RFD profiles of DMD and CCSER1 in WT cells using Okazaki fragment sequencing (OK-seq) (11,14). We show that DNA combing-and OK-seq-based RFD profiles are highly concordant, with DNA combing-based RFD successfully reproducing domains of replication composed of two zones of initiation flanking a zone of termination detected by OK-seq. Importantly, we demonstrate that a significant portion of initiation events mapped on single molecules are too diffuse to generate signals indicative of initiation on the RFD profiles, explaining why dispersed origin firing has largely been missed by population-based techniques. By combining the ability to build accurate replication profiles with the capacity to monitor cell-to-cell variability and visualize rare events, DNA combing-based RFD profiling is therefore a powerful new tool to investigate the replication of metazoan genomes.

Biological resources
Chicken DT40 cells were purchased from the American Type Culture Collection (ATCC, #CRL-2111). DMD Tet/Tet and CCSER1 ␤a/␤a cells come from (46).

RFD profiling by OK-seq
RFD profiling of DT40 cells by OK-seq was performed as described in (11) and (14). Briefly, asynchronously growing cells were labelled with 20 M 5-ethynyl-2 -deoxyuridine (EdU, Jenabioscience) for 2 min. After DNA purification and heat-denaturation, Okazaki fragments were sizepurified on sucrose gradients, labelled with biotin at EdU sites using click-chemistry, captured on magnetic beads coated with streptavidin and amplified by PCR. Sequencing was performed on HiSeq (Illumina) at the IB2C highthroughput sequencing platform (Gif-sur-Yvette, France). Aligned OK-seq data (BAM files) were imported in R (52) as GenomicAlignments, and reads were converted into Ge-nomicRanges. RFD was computed using R as the difference between rightward-and leftward-fork coverage normalized by total coverage. Data were binned into nonoverlapping 10 kb windows. RFD profiles were exported into bigwig files using the export function from the rtracklayer package. The OK-seq experiment was performed three times. RFD profiles of the three biological replicates were highly similar, with Spearman's pairwise correlation coefficients computed in 10 kb windows ranging from 0.972 to 0.986. DMD and CCSER1 OK-seq-based RFD profiles for the three replicates are presented in Supplementary Figure S2.

Single molecule-based RFD and coverage profiles of DMD and CCSER1 loci
The length of rightward-and leftward-replicated DNA was determined as described in Figure 1A for each DNA fibre with replication signals spanning DMD, DMD Tet , CCSER1 or CCSER1 β a obtained by combing and schematized in Supplementary Figure 4 of (46); examples of raw DNA combing data with replication and FISH signals on a DNA molecule are shown in Supplementary Figure 3 of (46). Coordinates, length and directionality of replicated DNA were compiled using Microsoft Excel. Coverages of rightward-(R) and leftward-replicated DNA (L) along DMD and CCSER1 loci were determined and RFD was computed for each position as follows: RFD = (R − L)/(R + L) using custom R scripts. Initiation and termination zones were annotated by manually scanning the profiles.

Fork speed distribution in DMD and CCSER1 loci
The position and velocity of rightward and leftward forks progressing along DMD and CCSER1 in WT and mutant cells were estimated from Supplementary Figure 4 of (46). Speed was determined for individual forks given a 20 minutes labelling time for both IdU and CldU as described in (46). Fork position corresponded either to: (i) the centre of the IdU and CldU tracks (d IdU + d CldU )/2, with d IdU and d CldU being the lengths of the IdU and CldU tracks of the same fork, respectively, for forks displaying an intact IdU track flanked on one side by an intact CldU track; (ii) the centre of the IdU track for forks with an intact IdU signal flanked on one side by an interrupted CldU signal and (iii) the centre of the CldU track for forks with an intact CldU signal flanked on one side by an interrupted IdU signal. The interrupted signals reflect a physical break in the DNA since no underlying DNA was observed after immunodetection, in which case only the unbroken half of the signal track was used for the RFD computation, as illustrated in Figure 1A.

(I − T) and OEM profiles
OEM (Origin Efficiency Metric) was computed as described in (48) from the DNA combing oriented tracks on 10 kb sliding (with 10 kb step) windows and the (I − T) density profile was calculated by binning initiation minus termination events in non-overlapping 10 kb windows using custom R scripts.

Subsampling of OK-seq data
For subsampling, 2500 reads mapping on the locus of interest (either DMD or CCSER1) were selected randomly from the OK-seq dataset with the highest number of reads and used to compute RFD and coverage profiles using bins of 10 kb. This was performed 100 times and the 100 subsampled profiles are presented in Figure 4C, D along with the RFD and coverage from the corresponding single molecule data with the same binning interval.
discriminate between elongating replication forks, initiation (i) and termination (t) events, and enable determination for each DNA fibre mapped on a reference genome of the length of DNA replicated by rightward-and leftward-moving forks (purple and orange arrows, respectively), as illus-  Ini.
Ter.  (46)). Each vertical line corresponds to the position of one initiation or termination event; numbers indicate colocalized events; the total number of events is indicated on the right. Because RFD profiles were strictly computed within the limits of the probes used to identify combed DNA molecules spanning DMD, the very limited number of events situated outside of those boundaries are not represented, which explains why the total number of initiation and termination events is sometimes slightly lower than in (46). (G, H) Density profile of initiation minus termination events (I − T) in 10 kb windows of DMD in WT cells (G) and DMD Tet/Tet cells with tetracycline (H).

Visualisation tools
RFD profiles and graphical representations of coverages were made using Integrative Genomics Viewer (IGV) v2.8.10 or custom R scripts.

Statistical analyses
Spearman's pairwise correlations between the RFD profiles of OK-seq biological replicates, between the DNA combing-and OK-seq-based RFD profiles binned into 10 kb windows and between the (I − T) and OEM profiles were computed with binned correlation function based on the R cor function.

Genomic coordinates
Coordinates are given according to the ICGSC/galGal4 chicken genome assembly.

Principles of DNA combing-based RFD profiling
The principles of directionality measurement for all the replication signals visualized on combed DNA molecules following a consecutive pulse-labelling of cells with iododeoxyuridine (IdU) then chlorodeoxyuridine (CldU) are presented in Figure 1A. In DNA combing experiments, the Table 1. Calculation of the average distance travelled by rightward replication forks emanating from the IZ located in 5 of DMD Tet . The 5 IZ was considered to be fully efficient. Replication fork velocity was assumed to be constant throughout S-phase. fS1, fS2, fS3 and fS4 replication timing values correspond to the percentage of BrdU-labelled DNA relative to total S-phase in S1, S2, S3 and S4 fractions, respectively, at the position of the 5 IZ (data from Figure 1F in (46)

kb
= dS1*fS1+dS2*fS2+dS3*fS3+dS4*fS4 positions of initiation and termination events are defined as the midpoints between diverging and converging forks, respectively. This is based on observations that, in most cases, diverging forks emanating from one origin, as well as converging forks coming from neighbouring origins, travel at a similar speed in mammalian cells (32,53). For an initiation event, the directionality of replicated DNA can therefore be determined starting from the midpoint between two diverging forks (i.e. the estimated position of the initiation event); conversely, for a termination event, directionality can be determined up to the midpoint between two converging forks (i.e. the estimated position of the termination event). It follows that, for an elongating replication fork moving to the right, rightward-replicated DNA extends from the midpoint between the start of the DNA fibre and the start of the IdU track up to the midpoint between the end of the DNA molecule and the end of the CldU track. The first midpoint corresponds to the estimated position of a hypothetical initiation event in case the DNA fibre was broken during the combing procedure precisely at the site of IdU incorporation for the diverging fork; replication directionality can no longer be unambiguously established beyond this point as the rightward fork might have emanated from a more distant origin. In the same line, replication directionality can only be assigned up to the second midpoint since the rightward fork may either continue its progression or merge with a converging fork beyond this position. The reasoning is similar for a leftward elongating fork: leftward-replicated DNA extends from the midpoint between the start of the DNA fibre and the start of the CldU track up to the midpoint between the end of the DNA molecule and the end of the IdU track. RFD is computed as the difference between the proportions of rightward-and leftward-replicated DNA ( Figure  1B). RFD ranges from −1 (100% of leftward-replicated DNA) to +1 (100% of rightward-replicated DNA). A null RFD means that DNA is replicated equally often by forks travelling leftward or rightward. As reported earlier, zones of predominant initiation (initiation zones, IZs) are de-tected as upward slopes on RFD profiles; conversely, downward slopes correspond to zones of predominant termination (termination zones, TZs) (11).

DNA combing-based RFD profiles of transcriptionally inactive and active DMD
We first computed the RFD profile of DMD whether it is transcribed or not (Figure 2). A globally decreasing RFD was computed along the inactive DMD gene in WT cells (Figure 2A), with the downward slope reflecting a prevalence of termination over initiation in agreement with the total number of mapped individual termination (n = 71) and initiation (n = 56) events ( Figure 2E). As previously indicated by the redistribution of initiation and termination events (46), activation of DMD transcription through the insertion of an inducible Tet-promoter dramatically reshuffled its replication dynamics, creating an extremely efficient IZ in 5 of the gene followed by a large region where forks travelled almost exclusively rightward ( Figure 2B). This region ended at a sharp TZ bordering a second, less efficient and broader IZ. The 3 part of DMD was replicated predominantly leftward regardless of DMD transcriptional status (Figure 2A, B). Importantly, for active DMD, individual initiation and termination events clustered inside the IZs and TZs, respectively, as anticipated ( Figure 2F). The unidirectional replication of the 5 third of active DMD was also in good agreement with the paucity of initiation and termination events in that region. Moreover, given the duration of S-phase in DMD Tet/Tet cells (46), the firing time of the IZ located at the 5 end of active DMD (46) and the velocity of rightward forks emanating from that IZ (Supplementary Figure S1), we calculated that the first ≈300 kb of DMD should be unidirectionally replicated (Table 1), as observed on the RFD profile ( Figure 2B).
RFD shifts over a defined region are mathematically predicted to be proportional to the difference between the number of initiation and termination events in this region (54). Accordingly, we found that the segments of the DMD Tet al-lele exhibiting steep RFD shifts, namely the very efficient 5 IZ and the adjacent intragenic TZ, coincided with localized maxima and minima of initiation minus termination (I − T) values, respectively ( Figure 2H). In parallel, we measured RFD slopes from the whole set of oriented tracks by computing the origin efficiency metric (OEM) profiles, corresponding to the difference of the normalized leftward coverage between adjacent 10 kb windows (48), of the WT and DMD Tet alleles ( Figure 2C, D). The (I − T) and OEM profiles were comparable (Spearman's pairwise correlation coefficient = 0.434 and 0.596 for DMD and DMD Tet , respectively; Figure 2C, G and D, H), highlighting the consistency between the analysis of individual events and the ensembleaveraging analysis of oriented replication tracks. The difference profile was less discriminating for segments of moderate RFD shifts, which presumably require a higher number of observations to accurately reflect population tendencies (compare for instance the OEM and (I − T) profiles of the 3 TZ inside DMD Tet in Figure 2D, H). Despite the mapping of multiple initiation events inside the inactive DMD gene ( Figure 2E), computation of initiation efficiency either from the oriented tracks or from the individual events did not identify regions of strong origin firing within the locus ( Figure 2C, G), in line with the RFD pattern ( Figure 2A).
Altogether, our results show that DNA combing-based RFD profiling goes beyond recapitulating the location of individual replication events. In addition to being a potent visual tool, it allows an in-depth analysis of the replication dynamics of a locus of interest, that is, the quantification of initiation, termination and fork progression; it also enables the cross-validation of orthogonally defined replication parameters. We found that initiation signals either occur in clusters associated with a positive RFD shift indicative of efficient usage in the cell population or as randomly distributed isolated events not accompanied by an upward slope on the RFD profile, confirming the existence of origins that are too rarely used to be detected by cellpopulation methods.

DNA combing-based RFD profiling of CCSER1 transcribed at different levels
We next examined CCSER1 RFD profile in WT DT40 cells, where CCSER1 is transcribed, and in CCSER1 ␤a/␤a cells where CCSER1 transcription was enhanced thanks to the insertion of the chicken ␤-actin promoter next to the endogenous one (46). The RFD profile of WT CCSER1 showed a strong IZ upstream of the promoter and predominant rightward progression over the 5 third of the gene, followed by a ≈150-kb-long TZ ( Figure 3A). A broader, less efficient IZ was located at the 3 end of CCSER1. Increasing CCSER1 transcription enhanced the efficiency of both IZs, which caused an extension of the rightward replicated zone and the replication of the 3 third of CCSER1 by forks travelling almost exclusively leftward ( Figure 3B). These results confirm our previous conclusion, based on the location of individual initiation events, that transcription promotes initiation both upstream and downstream of CCSER1 (46).
There was a high concordance between IZ and TZ position and efficiency on CCSER1 RFD profiles in WT and CCSER1 ␤a/␤a cells and the location of previously mapped clusters of initiation and termination events, respectively ( Figure 3E, F), as shown for active DMD. Consistently, for both the WT and CCSER1 β a alleles, the (I − T) and OEM profiles were very much alike (Spearman's pairwise correlation coefficient = 0.549 and 0.587, respectively; Figure  3C, G and D, H). However, some features revealed by RFD profiling were hardly predictable from the sole positioning of individual events, such as the broad, poorly efficient IZ at its 3 end (Figure 3A, E). This further exemplifies that DNA combing-based RFD profiling can surpass classical analyses focusing on initiation and termination events to comprehend the replication program, particularly in regions of mixed directionality. The RFD profile of the CCSER1 β a allele also demonstrated that it was primarily replicated by converging forks emanating from the IZs located at both ends of the gene and meeting two-thirds of the way into the CCSER1 gene ( Figure 3B), which was less clearly shown by the sole location of initiation and termination events (Figure 3F).
Interestingly, CCSER1 β a RFD profile closely resembled the crenelated N-shaped RFD profile observed by Petryk and colleagues in the human genome (11) consisting of two ascending segments (the IZs in 5 and 3 ' of the CCSER1 β a allele) followed by flat sections of high absolute value of RFD (the regions of CCSER1 β a with predominant unidirectional fork progression) meeting within a descending segment (the TZ located at two thirds of CCSER1 β a ) ( Figure  3B). Crenelated N-shaped RFD profiles are presumed to be created when two IZs are far and efficient enough to prevent the reaching of one IZ by forks emanating from the other one (11). Accordingly, taking into account the distance between the IZs in 5 and 3 of CCSER1 β a (i.e. the length of the allele), their firing time (46) and the velocity of forks emanating from each IZ (Supplementary Figure S1), we computed that less than 10% of rightward forks emanating from the 5 IZ could reach the 3 IZ, whereas reaching of the 5 IZ by forks coming from the 3 IZ was virtually impossible (Table 2). Moreover, we calculated that rightward and leftward forks should merge on average ≈380 kb from the start of the CCSER1 β a allele (Table 2), in good agreement with what was observed on the RFD profile ( Figure 3B). In conclusion, our DNA combing-based RFD profiling method substantiated the replication mode of crenelated N-shaped RFD domains, herein exemplified by the CCSER1 β a allele.

Comparison of DNA combing-based RFD profiles with population-based RFD analyses
To further validate our SM profiling method, we compared the DNA combing-based RFD profiles of WT DMD and CCSER1 to independent RFD profiles obtained by the sequencing of ethynyldeoxyuridine-labelled Okazaki fragments purified from DT40 cells ( Figure 4A, B and Supplementary Figure S2). DMD profiles from DNA combing and from each of the three biological replicates generated for OK-seq were highly similar (Spearman's pairwise correlation coefficient ranging from 0.812 to 0.831), demonstrating the accuracy of our approach. Of note, the RFD profile of the 5 -most part of DMD computed from OK-seq data was flatter than the profile assembled from single molecules ( Figure 4A), which was likely explained by a sampling effect for the latter ( Figure 4C). The number of oriented tracks (n = 279) aggregated for WT DMD DNA combing-based RFD computation indeed remained drastically low compared to the throng of Okazaki fragments (n ≈ 60 000) used to build the OK-seq profile, which certainly provided a better estimate of the average RFD. We also observed that WT CCSER1 DNA combing-and OK-seq-based RFD profiles were largely comparable (Spearman's pairwise correlation coefficient ranging from 0.645 to 0.657 depending on the OK-seq sample; Figure 4B), although not completely superimposable. In contrast to DMD, the observed discrepancies seemed not to be solely due to a sampling effect for the DNA combing-based RFD profile ( Figure 4D), and their origin remains unclear. The visibly increased efficiency of the IZ upstream of CCSER1 in the OK-seq conditions would fully explain the extension of the region of predominant rightward fork progression, the shift in the TZ location and the partial suppression of the 3 IZ compared to the DNA combing profile.

DISCUSSION
A seminal study by Czajkowsky and colleagues first demonstrated that the stacking of a large number of individual replication signals reproduces population-averaging profiles (55). Very recently, the FORK-seq method yielded an RFD profile of the yeast genome assembled from thousands of oriented replication tracks that was indistinguishable from that produced by the sequencing of Okazaki fragments (36). In higher eukaryotes, only a few SM-based replica- Table 2. Calculation of the average meeting point of rightward and leftward replication forks progressing along CCSER1 β a and of the proportion of rightward (leftward) forks reaching the IZ located in 3 (5 ) of CCSER1 β a . To reach the 3 (5 ) IZ, rightward (leftward) forks coming from the 5 (3 ) IZ must have travelled a minimum of 621 kb (i.e., CCSER1 β a length) before the 3 (5 ) IZ fired, that is, at the latest, before S4. Only rightward (leftward) forks emanating from the 5 (3 ) IZ at the beginning of S-phase (S1 fraction), which could theoretically travel 663 (628) kb by the end of the third quarter of S-phase, fulfil these criteria. These forks accounted for 26.3 (5.7) % of total rightward (leftward) forks, while the 3 (5 ) IZ fired after S3 33.2 (2.9) % of the time. The 5 and 3 IZs were considered to be fully efficient. Replication fork velocity was assumed to be constant throughout S-phase. fS1 (fS1 ), fS2 (fS2 ), fS3 (fS3 ) and fS4 (fS4 ) replication timing values correspond to the percentage of BrdU-labelled DNA relative to total S-phase in S1, S2, S3 and S4 fractions, respectively, at the position of the 5 (3 ) IZ (data from Figure 2F in (46) tion analyses, for instance those using the SMARD (single molecule analysis of replicated DNA) technique (56), went beyond the usual mapping of individual replication signals. And yet, whereas the successive incorporation of IdU then CldU only gives a relative direction of fork progression on anonymous DNA fibres the orientation of which remains elusive, FISH-based positioning of stretched DNA molecules on a reference genome allows correct orientation of each replication track and permits reliable quantification of RFD provided that a sufficient number of oriented tracks are aggregated. As a proof of principle, using the DNA fibres collected in our previous study (46), we computed the RFD profiles of two large genes, namely DMD and CCSER1, each in different transcriptional states, from the hundreds of oriented replication tracks spanning each locus. Importantly, these profiles are consistent with the location of individual initiation and termination events, with measured replication fork velocities as well as with replication timing analyses. They are also concordant with RFD profiles produced by OK-seq, reciprocally validat-ing each other. Altogether, these results demonstrate the accuracy of our method, although discrepancies between DNA combing and OK-seq RFD profiles are observed for both WT DMD and CCSER1. In the case of DMD, differences seem largely attributable to a sampling effect for the DNA combing-based profile due to the relatively low number of aggregated oriented tracks. However, this cannot explain the disparities for CCSER1. Intriguingly, CCSER1 OK-seq profile resembles the DNA combing profile of the CCSER1 β a allele (i.e. of overexpressed CCSER1), suggesting that CCSER1 transcription level might have been higher when DT40 cells were grown for OK-seq than when they were cultured for DNA combing analyses. In this regard, although cells used for either method were grown in distinct laboratories and over a time window that precluded the use of the exact same culture medium, the obtained RFD profiles are still globally similar (average Spearman's pairwise correlation coefficient of 0.822 and 0.651 for DMD and CCSER1, respectively), confirming the robustness of our approach.

WT CCSER1
DNA combing Subsampled OK-seq By granting access to fork velocity, cell-to-cell heterogeneity and replication events invisible to population assays, DNA combing-based RFD profiles allow the interrogation of models built on ensemble-averaging techniques. We could substantiate the mode of replication of crenelated N-shaped RFD domains initially detected on RFD profiles of the human genome (11). Moreover, the detection of randomly distributed initiation events in WT DMD near-horizontal and descending RFD sections unambiguously establishes that initiation is not solely limited to the ascending segments bordering RFD domains, that is, to IZs. We therefore propose that the replication of metazoan genomes associates strong initiation in delimited zones with widely dispersed, low efficiency origins elsewhere, as previously suggested (11). The mapping of initiation events unable to appear as detectable RFD upward shifts likely explains long-standing discrepancies between cell-population and SM methods regarding the number of active origins per genome, for example why only 5684 IZs were counted in human lymphoblasts (11) whereas over three times more could have been expected according to the ≈150 kb inter-origin distance measured by DNA fibre analysis in those cells (32). In line with this, a recent high-resolution replication timing profiling technique found that >70% of the genome of mammalian cells host initiation sites although <10% are constituted by IZs (57). Finally, our approach makes it possible to explain if an observed variation in RFD is due either to a change in initiation, a change in termination, or both, and to decide between several scenarios that could all explain a given RFD pattern. Notably, DNA combing-based RFD profiling demonstrates that descending segments correspond either to regions of nearly exclusive fork fusion, as inside CCSER1 β a , in agreement with the 'two-origin' model suggesting that replication initiation is restricted to the borders of RFD domains (58), or to regions displaying a combination of initiation and termination events, as in the case of WT DMD, reminiscent of the 'cascade' model of sequential origin activation with increasing synchrony from RFD domain borders to center (11,45,59). It is anticipated that initiation within RFD domains only becomes mandatory when these domains are too large to be replicated by forks emanating exclusively from the bordering IZs, which depends on fork lifespan and velocity. Consistently, we computed that the ≈600-kb-long crenelated N-shaped RFD domain made of CCSER1 β a can be entirely replicated by converging forks (Table 2), and therefore does not require additional initiation events.
In conclusion, besides providing a good proxy for population-averaged RFD from a limited number of replication tracks, our DNA combing-based replication profiling method has the unique ability to identify discrete events on individual DNA molecules, helping to reconcile SM-and population-based assays. There is little doubt that similar analyses will shortly be performed on complete metazoan genomes using either nanopore sequencingbased techniques or optical replication mapping. While this manuscript was in preparation, a preprint (41) actually reported the optical mapping of millions of single DNA molecules from synchronized human cells electroporated with a fluorescent dUTP at the beginning of S-phase, thereby identifying early-firing initiation sites genome-wide.
Replication tracks from fluorescent dUTP-labelled asynchronously growing cells were also mapped, and track polarity was inferred thanks to the decreasing label density as electroporated nucleotides were consumed, allowing to approximate an RFD profile of the human genome. Although still preliminary, this work illustrates the recent impressive progress in DNA fibre analysis, with the promise of elucidating the DNA replication strategy of higher eukaryotes.

DATA AVAILABILITY
OK-Seq data for DMD and CCSER1 loci in DT40 cells have been deposited at NCBI's Gene Expression Omnibus (GEO) repository and are accessible under accession number GSE164252. The datasets generated during the current study, including bed files with DT40 OK-seq reads for WT CCSER1 and DMD loci, oriented tracks, initiation and termination events from DNA combing analyses, as well as custom R scripts are available from the github repository (https://github.com/LacroixLaurent/DT40 RFD).