Circle-map profiling of extrachromosomal circular DNA as diagnostic biomarkers for lung cancer



Differentially expressed eccDNAs in plasma of healthy controls and lung cancer patients
The characteristics of plasma eccDNAs from 6 healthy controls, 17 early-stage lung cancer patients (16 of Stage I and 1 of Stage II), and 8 advanced-stage lung cancer patients (2 of Stage III and 6 of Stage IV) were analyzed for the diagnosis of lung cancer by the Circle-Seq method [8][9][10] (Fig. 1A; Tables S1–S3, see online supplementary material). The results showed that tumor plasma samples had more eccDNAs than those from healthy controls (Fig. S1A and B, see online supplementary material); in particular, samples from advanced-stage patients had the largest eccDNA count among the three groups (Fig. 1B). There was no difference in the number of eccDNA-carrying gene fragments. The genomic distribution of the eccDNAs in early lung cancer plasma was similar to that in total plasma samples and was different yet comparable among the three groups (Fig. 1C; Fig. S1C, see online supplementary material). However, no significant differences existed in the genomic elements of the annotated eccDNAs among the three groups of plasma samples, which were also enriched in introns and intronic regions (Fig. 1D; Fig. S1D, see online supplementary material). In terms of the length of plasma eccDNAs, the order was healthy controls < early-stage tumors < advanced-stage tumors (Fig. 1E). There was a positive correlation between the coding genes/Mb and eccDNA/Mb ratios on all chromosomes (early plasma samples: r = 0.8533, P < 0.001; Fig. 1F; Fig. S1F, see online supplementary material).
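As an illustration of the per-chromosome correlation described above, the following minimal sketch computes densities per Mb and their Pearson correlation coefficient. The chromosome names, lengths and counts are hypothetical placeholders, not data from this study, and the helper names `density_per_mb` and `pearson_r` are introduced here only for illustration.

```python
import math


def density_per_mb(counts, lengths_bp):
    """Convert per-chromosome counts into counts per megabase."""
    return {chrom: counts[chrom] / (lengths_bp[chrom] / 1e6) for chrom in counts}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only (not taken from the study).
chrom_lengths_bp = {"chrA": 100_000_000, "chrB": 50_000_000, "chrC": 200_000_000}
gene_counts = {"chrA": 1000, "chrB": 250, "chrC": 3000}
eccdna_counts = {"chrA": 500, "chrB": 100, "chrC": 1800}

gene_density = density_per_mb(gene_counts, chrom_lengths_bp)
eccdna_density = density_per_mb(eccdna_counts, chrom_lengths_bp)

chroms = sorted(chrom_lengths_bp)
r = pearson_r([gene_density[c] for c in chroms],
              [eccdna_density[c] for c in chroms])
print(f"Pearson r = {r:.4f}")
```

Normalizing by chromosome length before correlating keeps long chromosomes from dominating simply because they are long; a significance test for r would additionally need a statistics library or a permutation test, which is left out of this sketch.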

Differentially expressed eccDNAs in plasma of healthy controls and patients with early lung cancer
To verify whether the unique eccDNAs in early-stage tumor plasma could be used as early diagnostic biomarkers, we selected consistent eccDNAs and eccDNA-carrying gene fragments. As shown in Fig. 1G and H, five consistent eccDNAs were present only in the early tumor samples (EccDNA00051561, EccDNA00045920, EccDNA00019042, EccDNA00051542, and EccDNA00051549) (Table S4, see online supplementary material). Four consistent eccDNAs were present only in the healthy samples (EccDNA00006696, EccDNA00026092, EccDNA00026093 and EccDNA00026094) (Table S5, see online supplementary material). Three consistent eccDNA genes were present only in the healthy samples (ENSG00000274167, ENSG00000261978 and ENSG00000167291). Four consistent eccDNA genes were present only in the early plasma samples (ENSG000253978, ENSG000198099, ENSG00089916 and ENSG000145934), and they were significantly different from those in the healthy plasma samples (Table S6, see online supplementary material).
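The selection of group-specific eccDNAs can be sketched with simple set operations. The text does not define "consistent", so the sketch below assumes it means "detected in at least a chosen number of samples of one group and in no sample of the other group"; that threshold, the function names and the sample and eccDNA identifiers are all hypothetical.

```python
def consistent_ids(samples, min_samples):
    """Return IDs detected in at least `min_samples` of the given samples.

    `samples` maps a sample name to the set of eccDNA IDs detected in it.
    """
    detection_counts = {}
    for ids in samples.values():
        for ecc_id in set(ids):
            detection_counts[ecc_id] = detection_counts.get(ecc_id, 0) + 1
    return {ecc_id for ecc_id, n in detection_counts.items() if n >= min_samples}


def group_specific(group, other, min_samples):
    """IDs that are consistent in `group` and never detected in `other`."""
    seen_in_other = set().union(*other.values())
    return consistent_ids(group, min_samples) - seen_in_other


# Hypothetical detections for illustration only (not taken from the study).
early = {
    "E1": {"ecc_a", "ecc_b"},
    "E2": {"ecc_a", "ecc_c"},
    "E3": {"ecc_a", "ecc_b", "ecc_d"},
}
healthy = {
    "H1": {"ecc_b", "ecc_x"},
    "H2": {"ecc_x"},
}

early_only = group_specific(early, healthy, min_samples=2)
healthy_only = group_specific(healthy, early, min_samples=2)
print(sorted(early_only), sorted(healthy_only))
```

In this toy input, `ecc_b` recurs in the early group but is excluded because it also appears in a healthy sample, which is the behavior the "present only in" wording implies under the stated assumption.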

eccDNA as potential biomarkers for NSCLC diagnosis
We used machine learning to further assess the diagnostic potential of the eccDNAs selected above. As shown in Fig. 1I, strong correlations existed among EccDNA00019042, EccDNA00045920 and EccDNA00051561. EccDNA00019042 showed the largest adjusted R² value and thus was selected as an important factor in LASSO regression (Fig. 1J). In addition, EccDNA00019042, EccDNA00045920 and EccDNA00051561 were found in the split site of the chromosome (Fig. 1K). Furthermore, receiver operating characteristic (ROC) curve analysis was performed for the three eccDNAs. The area under the curve (AUC) for discriminating early-stage tumors from normal samples was 0.75 (95% CI: 0.531-0.969) for EccDNA00019042 (Fig. 1L), 0.583 (95% CI: 0.420-0.747) for EccDNA00045920, and 0.583 (95% CI: 0.420-0.747) for EccDNA00051561 (Fig. S1G and H, see online supplementary material). Moreover, based on the analysis of eccDNA sequencing results of 3 NSCLC tissues and normal lung tissues in our previous research [10], we found a trend toward higher expression of EccDNA00019042 in NSCLC tissues compared with normal lung tissues (Table S7, see online supplementary material). These results suggested the high diagnostic potential of EccDNA00019042 for early-stage lung cancer.
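To make the AUC values above easier to interpret, the sketch below computes the area under the ROC curve with the rank-based (Mann-Whitney) formulation: the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. This is a generic illustration, not the authors' pipeline; the function name `roc_auc` and the presence/absence scores are hypothetical.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly.

    A pair counts 1 if the case scores higher than the control,
    0.5 if the two scores are tied, and 0 otherwise.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    total = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                total += 1.0
            elif case == control:
                total += 0.5
    return total / (len(case_scores) * len(control_scores))


# Hypothetical presence (1) / absence (0) calls for a single eccDNA marker.
cases = [1, 1, 0, 0, 0]      # marker detected in 2 of 5 cases
controls = [0, 0, 0, 0]      # marker detected in 0 of 4 controls

auc = roc_auc(cases, controls)
sensitivity = sum(cases) / len(cases)
specificity = controls.count(0) / len(controls)
print(auc, (sensitivity + specificity) / 2)
```

For a binary marker the AUC reduces to (sensitivity + specificity) / 2, so a marker that is absent from all controls but detected in only some cases yields an AUC between 0.5 and 1 that tracks its sensitivity.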
In summary, we demonstrated the presence of eccDNAs in plasma and described their characteristics and genomic landscape. Although their expression signatures are distinct, eccDNAs are present in both lung cancer and healthy human plasma. However, the expression level of eccDNAs showed no significant differences, possibly due to the limited sample size and immature analysis methods. Based on the consistent presence of eccDNAs in early lung cancer plasma samples, we established a model to demonstrate that EccDNA00019042 could serve as a diagnostic biomarker for early-stage lung cancer. However, due to the relatively small clinical sample size, individual variations in eccDNAs among the samples included in this study may have affected the robustness of our findings to a certain extent, emphasizing the need for additional studies with larger sample sizes.

Figure 1. Identifying potential eccDNAs in early NSCLC plasma samples for diagnosis. (A) The workflow of the processes for detecting eccDNAs in plasma. (B) The counts of eccDNAs and eccDNA genes in samples from healthy controls, early-stage and late-stage lung cancer patients. (C) The ratios of coding genes/Mb and eccDNAs/Mb of chromosomes in samples from healthy controls, early-stage and late-stage lung cancer patients. (D) Genomic distributions of eccDNAs in samples from healthy controls, early-stage and late-stage lung cancer patients. (E) Size distribution of eccDNAs in samples from healthy controls, early-stage and late-stage lung cancer patients. (F) Pearson correlation analysis between the coding gene/Mb and eccDNA/Mb ratios on chromosomes from healthy controls, early-stage and late-stage lung cancer patients. (G, H) Consistent eccDNAs and consistent eccDNA gene sequences in tissue samples from healthy controls and early-stage lung cancer patients. (I) Heatmap showing the distribution of adjusted R² values in each model, which was calculated by the leaps R package. (J) The selection workflow for important diagnostic eccDNAs to distinguish early NSCLC from healthy plasma samples. (K) The split sites of ECCDNA00045920 and ECCDNA00051561. (L) ROC curves for EccDNA00019042 in the test dataset (ROC analyses of EccDNA00019042 in early NSCLC and normal plasma samples; early AUC = 0.750, 95% CI = 0.531-0.969).