Competitive inhibition and mutualistic growth in co-infections: deciphering Staphylococcus aureus–Acinetobacter baumannii interaction dynamics

Abstract Staphylococcus aureus (Sa) and Acinetobacter baumannii (Ab) are frequently co-isolated from polymicrobial infections that are severe and refractory to therapy. Here, we combine wet-lab experiments and in silico modeling to unveil the intricate nature of the Ab/Sa interaction, using both representative laboratory strains and strains co-isolated from clinical samples. This approach uncovered Sa's capability to exert partial interference on Ab through the expression of phenol-soluble modulins. In addition, we observed a cross-feeding mechanism by which Sa supports the growth of Ab by providing acetoin as an alternative carbon source. This study is the first to dissect the Ab/Sa interaction dynamics, in which competitive and cooperative strategies can intertwine. Our findings illuminate the ecological mechanisms supporting the coexistence of these species in polymicrobial infections and open potential therapeutic avenues for managing these challenging infections.


Quality control
The quality of the DIA data was evaluated based on the intra-group coefficient of variation (CV), principal component analysis (PCA), and quantitative correlation of samples. When the sample size is large, quality control (QC) samples, which are generally a mixture of all samples, are inserted intermittently between the original samples in the run sequence. Experimental conditions can thus be evaluated by the following QC analysis to ensure the stability and repeatability of the experiment. This analysis was used to calculate the intra-group CV of the different sample groups. The X-axis denotes the sample group and the Y-axis denotes the corresponding CV.
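As an illustration of the intra-group CV described above, the following minimal Python sketch computes one CV per protein within each sample group. It assumes that the CV is the sample standard deviation divided by the mean of non-log-transformed intensities; the function names and the input layout (`intensities`, `groups`) are hypothetical and are not taken from the report's pipeline.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return CV = sample standard deviation / mean for raw (non-log) values."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m


def intra_group_cv(intensities, groups):
    """Compute per-protein CVs inside each sample group.

    intensities: {protein_id: {sample_name: intensity or None}}  (hypothetical layout)
    groups:      {group_name: [sample_name, ...]}
    Returns {group_name: [cv, ...]}; proteins with fewer than two
    quantified samples in a group are skipped.
    """
    result = {}
    for group, samples in groups.items():
        cvs = []
        for by_sample in intensities.values():
            values = [by_sample[s] for s in samples if by_sample.get(s) is not None]
            if len(values) >= 2:
                cvs.append(coefficient_of_variation(values))
        result[group] = cvs
    return result
```

The list of CVs returned for each group is what a box or violin plot with the sample group on the X-axis and the CV on the Y-axis would display.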
In this project, the Pearson correlation coefficient of protein expression between every pair of samples was calculated to demonstrate the correlation of protein quantification between samples, and was represented as a heat map. Both the X and Y axes represent samples. The color represents the correlation coefficient (a deeper color represents a higher correlation; a lighter color represents a lower correlation).
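The pairwise sample correlation behind such a heat map can be sketched as follows. This is a minimal illustration, assuming that every sample is given as a list of quantitative values in the same protein order and with no missing values; the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def correlation_matrix(samples):
    """samples: {sample_name: [value per protein]} -> nested dict of coefficients."""
    names = list(samples)
    return {a: {b: pearson(samples[a], samples[b]) for b in names} for a in names}
```

Each cell of the returned matrix corresponds to one colored cell of the heat map.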

Identification and quantification detail table
The following are the lists of peptides and proteins for each sample. For the meaning of each column in the list, please refer to the attached Help documentation.

DDA Spectral Library
The samples of interest underwent mass spectrometry data collection in data dependent acquisition (DDA) mode. MaxQuant was then used to carry out the database search and obtain all detectable non-redundant, high-quality MS/MS spectral information as the DIA spectral library, which contains the fragment ion intensities and retention times describing the peak characteristics of each peptide, for quantification. The statistics of peptides and proteins in the spectral library are listed as follows:

Item       Number
Peptide    18970
Protein    2831

The following figures show the basic statistics of the DDA identification results: the unique peptide distribution, the protein mass distribution, and the protein coverage distribution, respectively. In the unique peptide distribution, the X-axis is the number of unique peptides for each protein and the Y-axis is the number of proteins. In the protein coverage distribution, the X-axis is the coverage percentage interval and the Y-axis is the number of proteins.
The following are the peptide list and protein list obtained after the DDA data were analysed by MaxQuant. For the meaning of each column in the list, please refer to the attached Help documentation.

Quantitative differential analysis
In this project, the MSstats software package was applied for intra-system error correction and normalization of each sample. Then, based on the predefined comparison groups and the linear mixed effect model, the significance of differentially expressed proteins (DEPs) was evaluated. Two filtration criteria (fold change > 1.5 and P-value < 0.05) were used to obtain significantly differential proteins. The statistics of the differences between the comparison groups are as follows:
The following is a volcano plot that helps to filter differentially expressed proteins visually:
Figure 7 Volcano plot.
The X-axis of the graph is the protein fold change (log2), and the Y-axis is the corresponding -log10 (P-value). In the figure, red dots indicate significantly up-regulated proteins, green dots indicate significantly down-regulated proteins, and grey dots indicate proteins without significant change.
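The two filtration criteria and the volcano plot coordinates can be sketched as below. This is a minimal illustration, not the MSstats implementation: it assumes that "fold change > 1.5" is applied symmetrically, so that a protein counts as down-regulated when its fold change is below 1/1.5, and it treats a missing P-value as not significant. The names `classify` and `volcano_point` are hypothetical.

```python
from math import log2, log10

FC_CUTOFF = 1.5   # fold change threshold stated in the report
P_CUTOFF = 0.05   # P-value threshold stated in the report


def classify(fold_change, p_value):
    """Return 'up', 'down' or 'unchanged' for one protein.

    Assumption: down-regulation means fold_change < 1 / FC_CUTOFF.
    A missing P-value (None) is treated as not significant.
    """
    if p_value is None or p_value >= P_CUTOFF:
        return "unchanged"
    if fold_change > FC_CUTOFF:
        return "up"
    if fold_change < 1 / FC_CUTOFF:
        return "down"
    return "unchanged"


def volcano_point(fold_change, p_value):
    """Return the (x, y) coordinates used in the volcano plot."""
    return log2(fold_change), -log10(p_value)
```

In the plot, proteins classified as 'up' would be drawn in red, 'down' in green and 'unchanged' in grey.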
The following is a table of quantitative differential results for each comparison group.
As an effective data analysis tool, cluster analysis has been widely used in fields such as image processing, information retrieval, and data mining. It is also widely used in the analysis of gene and protein expression data, including discovering unknown functions of genes or proteins by clustering genes or proteins, automatically classifying pathological features or experimental conditions by clustering samples, and finding regulatory genes or protein clusters under certain conditions through two-way clustering. We used the Euclidean distance and hierarchical clustering to cluster the differential proteins.
Gene Ontology (GO) is an international standard gene function classification system that provides a regularly updated controlled vocabulary to comprehensively describe the properties of genes and gene products in organisms. GO includes three ontologies in total, describing the Molecular Function, Cellular Component, and Biological Process of genes.
We carried out a GO function annotation analysis of all identified proteins, and the results include two parts: protein2go and go2protein.
protein2go: For each protein, a list of IDs and all corresponding GO functions is given.
go2protein: For the GO entries involved in the three ontologies (cellular component, biological process, molecular function), the IDs and the number of all the corresponding proteins are listed and a statistical chart is made; GO entries without corresponding proteins are excluded.
In vivo, DEPs coordinate with one another to carry out their biological functions, and pathway-based analysis helps to further understand these functions. KEGG, the Kyoto Encyclopedia of Genes and Genomes, was founded in 1995 by the Kanehisa Laboratory of the Bioinformatics Center of Kyoto University, Japan. It is one of the most commonly used bioinformatics databases in the world and is a resource for understanding the high-level functions and utilities of biological systems. The KEGG PATHWAY database is the core of the KEGG database. Its distinctive feature is its powerful graphics functionality: it uses graphics instead of verbose text to present metabolic pathways and their relationships, so that researchers can gain an intuitive and comprehensive understanding of different pathways.
Reference: http://www.genome.jp/kegg/pathway.html
BGI Co., Ltd.

GO enrichment analysis
For proteins with significant up- or down-regulation, GO enrichment analysis provides the GO entries that are significantly enriched for differential proteins, revealing the biological functions of most interest to researchers. The result is usually a list of significantly enriched GO entries with P-value < 0.05. The X-axis represents the number of differential proteins and the Y-axis represents the GO annotation entry.

Pathway enrichment analysis
The following figure shows a screenshot of the pathway enrichment analysis result web page. In the list, P-value < 0.05 is the threshold for a metabolic pathway with significant enrichment of differential proteins.
This figure shows the metabolic pathways in which the differential proteins are significantly enriched. The X-axis is the enrichment factor (RichFactor), which is the number of differential proteins annotated to the pathway divided by the number of all identified proteins annotated to the pathway. The larger the value, the greater the proportion of differential proteins annotated to the pathway. The size of the circle represents the number of differential proteins annotated to the pathway.
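The enrichment factor defined above is a simple ratio and can be sketched as follows. The function name and argument names are hypothetical; only the definition (differential proteins in the pathway divided by all identified proteins in the pathway) comes from the text.

```python
def rich_factor(dep_in_pathway, identified_in_pathway):
    """RichFactor = differential proteins in pathway / identified proteins in pathway."""
    if identified_in_pathway <= 0:
        raise ValueError("the pathway must contain at least one identified protein")
    if not 0 <= dep_in_pathway <= identified_in_pathway:
        raise ValueError("differential proteins must be a subset of identified proteins")
    return dep_in_pathway / identified_in_pathway
```

For example, a pathway with 20 identified proteins of which 5 are differential has a RichFactor of 0.25.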

Protein-Protein interaction analysis of DEPs
Proteins often carry out a specific function after combining into a complex through protein-protein interaction. STRING [1] is a database of known and predicted protein-protein interactions (PPI). PPI analysis of DEPs was done by searching the STRING PPI database, and the top 100 interactions ranked by confidence were used to construct the interaction map, as below:

Methods 1 Pipeline Introduction
This project was analyzed using next generation label-free quantitative proteomics technology. In data independent acquisition (DIA) mode, it can deliver unprecedented proteomic coverage while enabling accurate and highly repeatable quantification of large numbers of proteins per sample. The DIA analysis pipeline provides an ideal platform for differential proteomic analysis or proteomic quantification of large numbers of samples.
The DIA analysis pipeline is based on three essential steps:
1) Spectral library construction: A spectral library collects all detectable non-redundant, high-quality peptide information (MS/MS spectra) of the sample, which can be used as a peptide identification template for subsequent data analysis. It contains the fragment ion intensities and retention times that characterize each peptide spectrum. The spectral library is constructed from the samples of interest using the data dependent acquisition (DDA) technique.
2) Large sample data acquisition in DIA mode: Data independent acquisition (DIA, also called SWATH) mode utilizes the latest high-resolution mass spectrometers to simultaneously acquire peptide ion characteristics in mass and retention time space. Compared to the traditional technique of extracting a single ion for fragmentation analysis, in DIA mode the mass spectrometer is set to a wide precursor ion
3) Data analysis: Identification and quantification of peptides and proteins were obtained from the DDA spectral library by deconvolution of the DIA data. The MSstats software package was used to perform differential analysis, followed by functional analysis of the differential proteins.

Experimental Pipeline
The main experimental steps are shown below:
(2) Add a 5mm steel bead and an appropriate amount of Lysis Buffer 3, add PMSF to a final concentration of 1mM and EDTA to a final concentration of 2mM, vortex and let stand for 5 minutes, then add DTT to a final concentration of 10mM;
(3) Oscillate with a tissue grinder for 2 minutes (frequency 50Hz);
(4) Centrifuge at 25,000g and 4°C for 20 minutes and take the supernatant;
(5) Add DTT to a final concentration of 10mM and incubate in a water bath at 56°C for 1 hour;
(6) After returning to room temperature, add IAM to a final concentration of 55mM and incubate in a dark room for 45 minutes;
(7) Add cold acetone at 4 times the sample volume and let stand at -20°C for 2 hours;
(8) Repeat step (7) until the supernatant is colorless, if necessary;
(9) Centrifuge at 25,000g and 4°C for 20 minutes and discard the supernatant;
(10) Add an appropriate amount of Lysis Buffer 3 to the precipitate, followed by ultrasonication to dissolve the precipitated proteins;
(11) Centrifuge at 25,000g and 4°C for 20 minutes and take the supernatant for quantification.

Protein extraction quality control
(1) Bradford quantification
Standard protein (0.2μg/μL BSA) was added sequentially to wells A1 to A10 of a 96-well microtiter plate in volumes of 0, 2, 4, 6, 8, 10, 12, 14, 16, 18μL, followed by pure water in volumes of 20, 18, 16, 14, 12, 10, 8, 6, 4, 2μL, and then 180μL of Coomassie Brilliant Blue G-250 quantitative working solution was added to each well. The OD595 was measured with a microplate reader, and a linear standard curve was drawn based on the OD595 and protein concentration. The protein solution to be tested was diluted several-fold, 180μL of the quantitative working solution was added to 20μL of the diluted protein solution, and the OD595 was read. The sample protein concentration was calculated from the standard curve and the sample OD595.
(2) SDS-PAGE
For each sample, 10μg of protein solution was mixed with an appropriate amount of loading buffer, heated at 95°C for 5 minutes, and centrifuged at 25,000g for 5 minutes, and the supernatant was loaded into a well of a 12% SDS polyacrylamide gel. Electrophoresis was run at a constant voltage of 120V for 120 minutes. After electrophoresis, the gel was stained with Coomassie blue for 2 hours and then destained on a shaker with an appropriate amount of decolorizing solution (40% ethanol, 10% acetic acid) 3 to 5 times for 30 minutes each time.
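The Bradford standard-curve calculation described in (1) can be sketched as an ordinary least-squares line fit followed by back-calculation of the sample concentration. This is a minimal illustration under stated assumptions: the curve is fitted against the BSA mass per well (0.2μg/μL times the pipetted volume), the sample volume is 20μL as in the text, and the function names and the `dilution_factor` parameter are hypothetical. No measured OD595 values are implied.

```python
BSA_CONC_UG_PER_UL = 0.2            # standard concentration from the protocol
STANDARD_VOLUMES_UL = range(0, 20, 2)  # 0, 2, ..., 18 uL of BSA standard


def standard_masses_ug():
    """BSA mass (ug) in each standard well A1..A10."""
    return [BSA_CONC_UG_PER_UL * v for v in STANDARD_VOLUMES_UL]


def linear_fit(xs, ys):
    """Least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def sample_concentration(od595, slope, intercept,
                         dilution_factor=1.0, sample_volume_ul=20.0):
    """Back-calculate the undiluted sample concentration in ug/uL."""
    mass_ug = (od595 - intercept) / slope
    return mass_ug / sample_volume_ul * dilution_factor
```

In use, `linear_fit(standard_masses_ug(), measured_od595)` would give the curve parameters, where `measured_od595` stands for the ten plate-reader values of wells A1 to A10.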

Protein enzymatic hydrolysis
(1) Take 100μg of protein solution per sample and dilute with 4 volumes of 50mM NH4HCO3;
(2) Add 2.5μg of trypsin at a protein:enzyme ratio of 40:1 and digest for 4 hours at 37°C;
(3) Desalt the digested peptides using a Strata X column and vacuum-dry.

High pH RP separation
Equal amounts of peptides were taken from all samples and mixed, and the mixture was diluted with mobile phase A (5% ACN, pH 9.8) and injected. A Shimadzu LC-20AB HPLC system coupled with a Gemini high pH C18 column (5μm, 4.6 x 250mm) was used. The sample was loaded onto the column and then eluted at a flow rate of 1mL/min with the following gradient: 5% mobile phase B (95% ACN, pH 9.8) for 10 minutes, 5% to 35% mobile phase B over 40 minutes, 35% to 95% mobile phase B over 1 minute, mobile phase B held for 3 minutes, and 5% mobile phase B equilibrated for 10 minutes. The elution peak was monitored at a wavelength of 214nm and one component was collected every minute. The components were combined into a total of 10 fractions, which were then freeze-dried.

DDA and DIA analysis by nano-LC-MS/MS
The dried peptide samples were reconstituted with mobile phase A (2% ACN, 0.1% FA) and centrifuged at 20,000g for 10 minutes, and the supernatant was taken for injection. Separation was carried out on a Thermo UltiMate 3000 UHPLC liquid chromatograph. The sample was first enriched and desalted on the trap column, and then entered a tandem self-packed C18 column (150μm internal diameter, 1.8μm column particle size, 35cm column length), where it was separated at a flow rate of 500nL/min with the following effective gradient: 0~5 minutes, 5% mobile phase B (98% ACN, 0.1% FA); 5~130 minutes, mobile phase B linearly increased from 5% to 25%; 130~150 minutes, mobile phase B rose from 25% to 35%; 150~160 minutes, mobile phase B rose from 35% to 80%; 160~175 minutes, 80% mobile phase B; 175~175.5 minutes, mobile phase B decreased from 80% to 5%; 175.5~180 minutes, 5% mobile phase B. The outlet of the nanoliter liquid phase separation was directly connected to the mass spectrometer, which was operated with the following settings.
For DDA analysis, LC-separated peptides were ionized by nanoESI and injected into a Fusion Lumos tandem mass spectrometer (Thermo Fisher Scientific, San Jose, CA) operated in DDA (data-dependent acquisition) detection mode. The main settings were: ion source voltage 2kV; MS scan range 350~1,500m/z; MS resolution 60,000, maximal injection time (MIT) 50ms; MS/MS collision type HCD, collision energy NCE 30; MS/MS resolution 15,000, MIT 50ms, dynamic exclusion duration 30 seconds. The start m/z for MS/MS was fixed to 100. Precursors for MS/MS scans satisfied: charge range 2+ to 6+, top 30 precursors with intensity over 2E4. AGC was: MS 3E6, MS/MS 1E5.
For DIA analysis, LC-separated peptides were ionized by nanoESI and injected into a Fusion Lumos tandem mass spectrometer (Thermo Fisher Scientific, San Jose, CA) operated in DIA (data-independent acquisition) detection mode. The main settings were: ion source voltage 2kV; MS scan range 400~1,500m/z; MS resolution 60,000, MIT 50ms; the 400~1,500m/z range was equally divided into 44 continuous windows for MS/MS scans. MS/MS collision type HCD, MIT 54ms. Fragment ions were scanned in the Orbitrap, MS/MS resolution 30,000, collision energy 30; AGC was 5E4.
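The DIA window scheme stated above (400~1,500m/z equally divided into 44 continuous windows) implies a window width of 25 m/z. The short sketch below derives the window boundaries; it assumes contiguous windows without overlap margins, which the text does not specify, and the function name is hypothetical.

```python
def dia_windows(start_mz=400.0, end_mz=1500.0, n_windows=44):
    """Return a list of (lower, upper) precursor isolation windows of equal width."""
    if n_windows <= 0 or end_mz <= start_mz:
        raise ValueError("invalid window specification")
    width = (end_mz - start_mz) / n_windows
    return [(start_mz + i * width, start_mz + (i + 1) * width)
            for i in range(n_windows)]


if __name__ == "__main__":
    windows = dia_windows()
    print(len(windows), windows[0], windows[-1])
```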

Bioinformatic Analysis Pipeline
This process is based on the sample data generated by a high-resolution mass spectrometer. DDA data were identified by the Andromeda search engine within MaxQuant, and the identification results were used for spectral library construction. For large-scale DIA data, the mProphet algorithm was used to complete analytical quality control, thus obtaining a large number of reliable quantitative results. This pipeline also performed GO, COG, and Pathway functional annotation analysis and time series analysis. Based on the quantitative results, the differential proteins between comparison groups were found, and finally function enrichment analysis, protein-protein interaction (PPI) analysis and subcellular localization analysis of the differential proteins were performed.

Database selection
The selection of database is an important step in MS based protein identification, and the final identified protein sequences are from the selected database.
Currently databases in use can be divided into three main categories:

1) UniProt protein database
UniProt is the most informative and resourceful protein database. It consists of data from three major databases, i.e. Swiss-Prot, TrEMBL and PIR-PSD. It is a data set verified by experts and consists of two parts: UniProtKB/Swiss-Prot (with reviewed, manually annotated entries) and UniProtKB/TrEMBL (with unreviewed, automatically annotated entries). In general, it is recommended to give priority to the UniProtKB/Swiss-Prot subset for protein identification. When the aim is to find novel sequences (such as alternative splicing products or new transcripts) or to identify proteins from related species, the UniProtKB/TrEMBL database can be considered.

2) The protein databases based on genome annotation
These databases mainly include a series of databases derived from the NCBI and Ensembl gene annotation databases.
Among them, we choose the protein database from the NCBI reference sequence (RefSeq) collection, which is a non-redundant proteome database. It is widely used in multi-omics studies due to the importance of the NCBI annotation system. NCBI's RefSeq provides reference sequences for the molecules of the central dogma, from chromosomes to mRNAs and proteins. The RefSeq standard provides a basis for functional annotation of the human genome. It provides a stable reference for mutation analysis, gene expression studies, and polymorphism discovery. In addition, NCBI provides a comprehensive non-redundant protein sequence database (NCBI_nr), covering animal, plant, microbial, bacterial and other taxa. Since this database is derived from various sources (including GenBank, RefSeq, SwissProt, PDB, etc.), it is not recommended for protein identification unless the species lacks a complete genome annotation or a homology search is necessary.
Ensembl aims to develop a software package with automatic annotation and maintenance for eukaryotic genomes. Ensembl has relatively complete and consistent genomic, transcriptomic, and proteomic annotation information, which is ideal for multi-omics analysis.

3) Databases from other sources
They usually refer to the protein target databases provided by the client, or to new gene sequences generated from de novo assembly of genome or transcriptome sequencing data. They may also contain sequences with new features such as alternative splicing, mutation sites, fusion genes, etc.

DDA data analysis
MaxQuant [2] (http://www.maxquant.org) is free software for protein identification and quantification from high-resolution MS data, developed at the Max Planck Institute of Biochemistry. In this project, it was used to identify the DDA data, and the results served as the spectral library for the subsequent DIA analysis. The analysis used raw data as input files, with the corresponding parameters and databases set, and then performed identification and quantitative analysis. Identified peptides satisfying FDR <= 1% were used to construct the final spectral library.
During the identification in this project, the parameters were configured as follows:

DIA data analysis
The DIA data were analyzed using the iRT peptides for retention time calibration. Then, based on the target-decoy model applicable to SWATH-MS, false positive control was performed at an FDR of 1%, thereby obtaining significant quantitative results.
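The idea behind target-decoy false positive control can be sketched in a few lines. The following is a deliberately simplified illustration and not the mProphet algorithm used in the pipeline: it assumes that every identification has a single score, that higher scores are better, and that the FDR at a cutoff is estimated as the number of decoys divided by the number of targets scoring at or above that cutoff. The function name and input layout are hypothetical.

```python
def score_cutoff_at_fdr(scored, max_fdr=0.01):
    """Find the lowest score cutoff whose estimated FDR stays <= max_fdr.

    scored: iterable of (score, is_decoy) pairs, higher score = better.
    Estimated FDR at a cutoff = decoys / targets with score >= cutoff.
    Returns the cutoff score, or None if no cutoff satisfies the limit.
    """
    ordered = sorted(scored, key=lambda item: item[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ordered:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest position at which the estimate is still acceptable.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff
```

Target identifications scoring at or above the returned cutoff would be retained at the chosen FDR level (1% by default, matching the text).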

MSstats differential analysis
MSstats [3] is an R package from the Bioconductor repository. It can be used for statistical evaluation of significant differences in proteins or peptides between samples, and is widely used in targeted proteomics (MRM), label-free quantitation, and SWATH quantitative experiments. The core algorithm is a linear mixed effect model. The process preprocessed the data according to the predefined comparison groups and then performed the significance test based on the model. Thereafter, differential proteins were screened using fold change > 1.5 and P-value < 0.05 as the criteria for significant difference. Enrichment analysis was then performed on the differential proteins.

Help 1 Protein Sequence: FASTA Format
Text-based FASTA format files are used to store DNA or protein sequences. The first line of each sequence record begins with the symbol ">", followed by the sequence ID, which can then be followed by sequence annotation information. The second line is the DNA base sequence or protein amino acid sequence corresponding to the first line. FASTA files can be opened directly with WordPad.
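A minimal reader for the FASTA layout described above can be sketched as follows. It treats the first whitespace-delimited token after ">" as the sequence ID and the remainder of the header as annotation, and it also accepts sequences wrapped over several lines, which is common in practice although the text above describes a single sequence line. The function name is hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (sequence_id, annotation, sequence) tuples."""
    records = []
    seq_id, annotation, chunks = None, "", []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if seq_id is not None:
                records.append((seq_id, annotation, "".join(chunks)))
            header = line[1:].split(None, 1)
            seq_id = header[0] if header else ""
            annotation = header[1] if len(header) > 1 else ""
            chunks = []
        elif seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        records.append((seq_id, annotation, "".join(chunks)))
    return records
```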

File Format Description of all_peptideSummary.xls
This file contains peptide information identified from the DDA data using MaxQuant. Each column is tab-delimited.

File Format Description of proteinGroups_recalibration.txt
This file contains protein information identified from the DDA data using MaxQuant. Each column is tab-delimited.

File Format Description of annotation_allprotein.xls
This file contains protein information identified in this DIA experiment. Each column is tab-delimited.

File Format Description of XX2-VS-XX1.All.xls
This file contains information on all differential proteins after statistical analysis of sample XX2 versus sample XX1. Each column is tab-delimited.

File Format Description of dia-proteinSummary.xls
This file contains relative quantitation values for proteins of all samples. Each column is tab-delimited.
Table 5 Format description of relative quantitative values for proteins of all samples

Field          Description
Protein        Protein ID
Case_1         Relative quantitative value of the protein in sample Case_1, obtained by sample normalization and log2 transformation
Case_2         Relative quantitative value of the protein in sample Case_2, obtained by sample normalization and log2 transformation
Case_3         Relative quantitative value of the protein in sample Case_3, obtained by sample normalization and log2 transformation
Control_1      Relative quantitative value of the protein in sample Control_1, obtained by sample normalization and log2 transformation
Control_2      Relative quantitative value of the protein in sample Control_2, obtained by sample normalization and log2 transformation
Control_3      Relative quantitative value of the protein in sample Control_3, obtained by sample normalization and log2 transformation
ProteinGroup   Protein group ID

File Format Description of dia-peptideSummary.xls
This file contains relative quantitation values for peptides of all samples. Each column is tab-delimited.

Case_1         Relative quantitative value of the peptide in sample Case_1, obtained by sample normalization and log2 transformation
Case_2         Relative quantitative value of the peptide in sample Case_2, obtained by sample normalization and log2 transformation
Case_3         Relative quantitative value of the peptide in sample Case_3, obtained by sample normalization and log2 transformation
Control_1      Relative quantitative value of the peptide in sample Control_1, obtained by sample normalization and log2 transformation
Control_2      Relative quantitative value of the peptide in sample Control_2, obtained by sample normalization and log2 transformation
Control_3      Relative quantitative value of the peptide in sample Control_3, obtained by sample normalization and log2 transformation

How to Read Report of Clustering Analysis
Each cluster plan that consists of more than two pairwise comparisons has two types of clustering results: intersection and union. The value range is 0~1; the value is used to evaluate the similarity between the protein and a cluster. The closer the value is to 1, the more consistent the trend between the protein and the cluster.

How to Read Report of GO Functional Annotation and Enrichment Results
The first column is the GO entry, the second column is the number and proportion of DEPs annotated to the GO entry, the third column is the number and proportion of all identified proteins annotated to the GO entry, and the fourth column is the P-value of the hypergeometric test; the lower the value, the more significant the GO entry enrichment. When P-value < 0.05, the GO entry is significantly enriched.
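The hypergeometric test behind the fourth column can be sketched as an upper-tail probability. This is a minimal illustration, assuming a one-sided over-representation test with all annotated identified proteins as the background; the exact background and any multiple-testing correction used by the pipeline are not stated in the text. The function and parameter names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    k: DEPs annotated to the GO entry
    n: all annotated DEPs
    K: identified proteins annotated to the GO entry
    N: all annotated identified proteins (background)
    """
    if not (0 <= k <= min(n, K) and 0 <= n <= N and 0 <= K <= N):
        raise ValueError("inconsistent counts")
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible terms drop out of the sum automatically.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total
```

A GO entry would be reported as significantly enriched when the returned value is below 0.05.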

File Format Description of Time Series Analysis Results
Click the GO entry 'BLOC complex' in Figure 5 to jump automatically to http://amigo.geneontology.org/amigo for more detailed annotation information (an internet connection is required). Click 'view genes' to query the IDs of the proteins annotated to the GO entry, as shown in Figure 5. Two differential proteins are annotated to the GO entry 'BLOC complex'.

How to Read Report of COG/KOG Functional Annotation Results
The COG/KOG annotation results are packaged in sample_COG/KOG.zip under the Function_analysis directory. sample.cog/kog2protein.xls is the COG or KOG classification file (relations between COG/KOG entries and proteins). This file can be opened with Excel.

How to Read Report of Pathway Functional Annotation and Enrichment Results
The pathway functional annotation results are packaged in sample_Pathway.zip under the Function_Analyse directory. sample.ko is a list of protein IDs and related KO numbers (this file can be opened with Excel, or viewed with the less command in a terminal).

Field                 Description
First to third row    Description of analysis process
First column          Protein
Second column         KEGG Orthology

sample.path is a pathway list file for proteins (this file can be opened with Excel, or viewed with the less command in a terminal). After detecting the most significantly enriched pathway of DEPs, the detailed pathway map can be viewed via the pathway ID. For example, clicking the hyperlink on "Vascular smooth muscle contraction" in Figure 11 jumps automatically to the pathway map shown in Figure 12. Significantly up-regulated proteins are marked with red rectangles and significantly down-regulated proteins are marked with green rectangles. When the mouse hovers over a red or green rectangle, the related DEPs and their log2 fold change appear at the top left. Clicking a protein name in the figure redirects the page to the KEGG website when online.

How to Read Report of Protein-protein Interaction Results
The protein-protein interaction results are packaged in PPI.zip under the Differential_enrichment directory. network.relation.xls is the result file of the protein interactions, which can be opened with Excel.

NR Database
The NR database is the non-redundant database maintained by the National Center for Biotechnology Information (NCBI). It integrates multiple protein databases such as the GenBank CDS translations, the RefSeq protein library, and the SwissProt protein database. The NR database is very comprehensive, but most of its proteins have not been validated.

Swiss-Prot Database
SWISS-PROT is a protein sequence database with detailed annotations. It is maintained by the European Bioinformatics Institute (EBI) and has been incorporated into the UniProt database. It aims to provide researchers in genomics, proteomics and related areas of molecular biology with the latest information on protein sequences. The SWISS-PROT database contains carefully checked and accurately annotated protein sequences derived from the EMBL nucleic acid sequence database and therefore has a high degree of confidence.

COG/KOG Database
COG stands for Clusters of Orthologous Groups of proteins. Each COG entry contains a series of orthologs or paralogs. Orthologous proteins are proteins in different species that have evolved from a common ancestor by vertical descent and typically retain the same function as the original protein. Paralogous proteins are proteins derived from gene duplication within a species that may evolve new functions related to the original. COG is used for prokaryotic organisms.
KOG stands for Eukaryotic Orthologous Groups. Each KOG entry contains a series of orthologs or paralogs, defined in the same way as for COG. KOG is used for eukaryotes.

GO Database
The full name of GO is Gene Ontology. The original intention of creating a gene ontology was to provide a platform of representative terminology and semantic descriptions for the characteristics of genes and gene products, enabling bioinformatics researchers to generalize, process, explain, and share gene and gene product data.
The gene and gene product vocabularies involved in Gene Ontology are divided into three categories, covering three aspects of biology: 1) Cellular component: each part of the cell and the extracellular environment. 2) Molecular function: an activity at the molecular level, such as catalytic or binding activity. 3) Biological process: a series of events that result from the functional combination of one or more molecules.

KEGG Database
The full name of KEGG is Kyoto Encyclopedia of Genes and Genomes. KEGG version 0.1 was released by Kanehisa Laboratories in 1995, and KEGG has since developed into an integrated database. Its core database is the KEGG PATHWAY database. A strength of the KEGG database is its graphical representation of metabolic pathways. For example, if we want to know the genes involved in the alanine metabolic pathway, we could search "Alanine" in the annotation result (x.kegg.list.anno).

FAQs
In the comparison group difference file, why do some proteins have a difference ratio but the P-value is NA?
Calculating a P-value generally requires a certain number of data points, sufficient to construct the corresponding distribution for hypothesis testing. The described condition is likely to occur when a protein has quantitative values in only a few samples. Such a protein is usually considered non-differential.

Figure 2 Heat map of sample correlation analysis.

Figure 3 Unique peptide distribution.

Figure 4 Protein mass distribution.

Figure 5 Protein coverage distribution.

Figure 6 Bar chart of differentially expressed proteins.

Figure 8 Principal component analysis.

Figure 10 GO function annotation.
original protein. Paralogs are proteins that are derived from gene duplication in certain species and may evolve new functions related to the original ones. The analysis compares the identified proteins with the COG database, predicts the possible functions of these proteins and performs functional classification statistics.

Figure 11 Bar plot of the COG analysis.

Figure 12 Pathway annotation result.

Figure 14 Demonstration of GO enrichment analysis results.

Figure 15 Differential protein GO function classification.

Figure 16 Up or down regulation of differential proteins in GO function classification.

Figure 17 GO term relationship network.

Figure 18 Pathway enrichment analysis results demo.

Figure 19 Differential protein pathway classification.

Figure 20 Up and down regulation differential protein pathway classification.

Figure 22 Pathway relationship networks.

Figure 23 COG annotation of DEPs.

Figure 24 PPI network of DEPs.

Figure 1 Heat map of DEPs expression clustering.
of proteins in each GO entry
Proteins_of_*    Protein IDs in each GO entry
Sample.fa.protein2GO.xls is the GO annotation file (relations between proteins and GO entries), which can be opened with Excel.
The results of GO enrichment of DEPs are packaged in GO_Enrichment.zip under the Differential_enrichment directory. The results can be viewed in the IE browser by opening the web page GOView.html. The left navigation bar includes the three aspects of GO entries (C: cellular component, P: biological process, F: molecular function). Click one of them and the results of GO enrichment are shown as in Figure 5.
©2021 BGI All Rights Reserved.

Figure 4 GO entries of DEPs' enrichment.

Figure 5 Protein IDs annotated to a GO entry.

Figure 10 Pathway enrichment analysis of DEPs.

the STRING database aligned for differentially expressed protein 1
protein_cluster2    Protein cluster 2 in the STRING database aligned for differentially expressed protein 2
score    Interaction scores of differentially expressed protein 1 and

How to Read Report of Subcellular Localization
The subcellular localization results are packaged in Subcellular.zip under the Differential_enrichment directory. subcellular2protein.xls is a differential protein subcellular localization classification file (relations between subcellular localization and proteins), which can be opened with Excel.
is a differential protein subcellular localization file (the relations between proteins and subcellular localizations), which can be opened with Excel.

Table 1 Format description of peptide list in library

Table 2 Format description of protein list in library

Table 3 Format description of DIA identified protein annotation list

Table 4 Format description of differential proteins from different comparison groups

Table 6 Format description of relative quantitative values for peptides of all samples

Table 9 Format description of GO2protein

Table 10 Format description of protein2GO

Table 11 Format description of COG/KOG2protein
The sample.protein2COG/KOG.xls file is a COG or KOG annotation file (relations between proteins and COG or KOG entries) that can be opened with Excel.

Table 15 Format description of PPI

Table 16 Format description of subcellular2protein

Table 17 Format description of protein2subcellular
WoLF PSORT subcellular localization results are represented by abbreviations. Detailed explanations of the abbreviations are given in the following table:

Table 18
Localization classes including underscores indicate the possibility of dual localization; for example, "E.R._golg" indicates that proteins are located in both the endoplasmic reticulum and the Golgi apparatus.