RiboToolkit: an integrated platform for analysis and annotation of ribosome profiling data to decode mRNA translation at codon resolution

Abstract Ribosome profiling (Ribo-seq) is a powerful technology for globally monitoring RNA translation; ranging from codon occupancy profiling, identification of actively translated open reading frames (ORFs), to the quantification of translational efficiency under various physiological or experimental conditions. However, analyzing and decoding translation information from Ribo-seq data is not trivial. Although there are many existing tools to analyze Ribo-seq data, most of these tools are designed for specific or limited functionalities and an easy-to-use integrated tool to analyze Ribo-seq data is lacking. Fortunately, the small size (26–34 nt) of ribosome protected fragments (RPFs) in Ribo-seq and the relatively small amount of sequencing data greatly facilitates the development of such a web platform, which is easy to manipulate for users with or without bioinformatic expertise. Thus, we developed RiboToolkit (http://rnabioinfor.tch.harvard.edu/RiboToolkit), a convenient, freely available, web-based service to centralize Ribo-seq data analyses, including data cleaning and quality evaluation, expression analysis based on RPFs, codon occupancy, translation efficiency analysis, differential translation analysis, functional annotation, translation metagene analysis, and identification of actively translated ORFs. Besides, easy-to-use web interfaces were developed to facilitate data analysis and intuitively visualize results. Thus, RiboToolkit will greatly facilitate the study of mRNA translation based on ribosome profiling.


INTRODUCTION
Ribosome profiling (Ribo-seq), also known as ribosome footprinting, has revolutionized the 'translatomics' field by mapping the position of ribosome-protected fragments (RPFs), which typically range in length from 26 to 34 nucleotides (nt), over the entire transcriptome (1,2). The scientific community has employed Ribo-seq to answer a wide range of questions, ranging from the identification of translated open reading frames (ORFs) to the quantification of relative translational efficiencies, while gaining precious mechanistic insight into the mRNA translation process (3). Translation is the bridge between RNA and protein, which is highly interconnected and subject to extensive, multistep, post-transcriptional regulation, including pre-mRNA splicing, small RNA-mediated regulation, mRNA turnover, mRNA modifications, as well as many other mechanisms of translational control (4,5). More and more investigators are beginning to use Ribo-seq in their research to study various processes of post-transcriptional gene regulation.
Here we present, RiboToolkit (http://rnabioinfor.tch. harvard.edu/RiboToolkit and https://bioinformatics.sc.cn/ RiboToolkit), the first integrated web server for Ribo-seq data analysis, that we developed with these main functionalities: (i) data quality control by filtering low quality sequence reads and distinguishing RPFs from tRNA, snRNA, and rRNA tags; (ii) RPFs length distribution, coding frame distribution, and 3-nt periodicity analyses for Ribo-seq quality evaluation; (iii) codon usage and ribosome stalling analyses were designed to identify highly active codons and codon stalling events; (iv) actively translated ORFs can be efficiently identified with higher speed; (v) unbiased mRNA translation efficiency and differential translation analysis; (vi) functional annotation of differentially translated mRNAs can be performed using various gene functional datasets; (vii) metagene analysis designed to show the RPFs distribution for entire translatome; (viii) reproducibility analyses between replicates can be performed based on RPF expression, gene expression, and codon occupancy; (ix) RPF mapping can be interactively visualized on the webpage based on IGV.js; (x) CodonFreq tool was developed to study the codon constitution among different gene groups; (xi) supports different ways of data uploading, including collapsed FASTA and data web links; (xii) very user-friendly web interfaces and a convenient data analysis queuing system was developed; (xiii) the results can be flexibly exported in different formats; (xiv) mRNA translation can be studied for as many as 16 model species (Supplementary Table S1). Therefore, RiboToolkit is a very comprehensive and convenient tool for Ribosome profiling and will greatly benefit the study of mRNA translation.

RiboToolkit WORKFLOW
RiboToolkit was constructed based on diverse data sources (Supplementary Table S1) and algorithms. tRNA sequences were downloaded from the GtRNAdb database (30). rRNA and snRNA sequences were retrieved from noncoding RNA annotations in Ensembl Genomes database (31). Protein coding gene sequences and gene annotations were downloaded from GENCODE database (32) for human (V19 and V32 for hg19 and hg38, respectively) and mouse (M23), and Ensembl Genomes database (31) for other species (Supplementary Table S1). The overall workflow contains three major parts: (i) Ribo-seq data preprocessing; (ii) RPF mapping and sequences analyses; and (iii) differential translation and functional analyses ( Figure  1).
The uploaded sequences were first aligned to rRNAs, tRNA, and snRNA to exclude the RPFs coming from rRNA, tRNA, and snRNA using Bowtie v1.2.2 (33) with a maximum of two mismatches (-v 2) by default. Cleaned RPF sequences were then mapped to the ref-erence genome using STAR v2.7.3a (34) with parameters (-outFilterMismatchNmax 2 -quantMode Tran-scriptomeSAM GeneCounts -outSAMattributes MD NH -outFilterMultimapNmax 1) by default. The unique genome-mapped RPFs are then mapped against protein coding transcripts using bowtie v1.2.2 with parameters 'a -v 2' by default (33). Coding frame distribution and 3nt periodicity analyses for Ribo-seq quality evaluation are performed based on riboWaltz v1.1.0 (11). The feature-Counts program in the Subread package v1.6.3 (35) is used to count the number of RPFs uniquely mapped to CDS regions based on genome mapping file (-t CDS -g gene id), which were then normalized as RPF Per Kilobase per Million mapped RPFs (RPKM). For codon-based analyses, 5 mapped sites of RPFs (26-32 nt by default) translated in 0frame were used to infer the P-sites with the offsets, which can be set by users or calculated based on the RPF mapping distribution around translation start sites using psite function in plastid v0.4.8 (7). The codon occupancy was further normalized by the basal occupancy which was calculated as the average occupancy of +1, +2, and +3 position downstream of A-sites (36). Pause score is further used to evaluate codon pause events using PausePred local version with default parameters (37). The upstream and downstream sequences (±50 nt) around pause sites were extracted from transcript sequences and different sequence features were calculated, including RNA secondary structure, minimum free energy (MFE), and GC content. RNA secondary structure and minimum free energy were calculated using RNAfold program in ViennaRNA Package v2.0 (38) with default parameters. For actively translated ORF identification, RPF reads mapped to the genome in end-toend mode were extracted by removing the soft clipped reads from the BAM file generated by STAR, then RiboCode v1.2.11 (23), which shows high speed and sensitivity for annotating ORF (23), was used to identify all actively translated ORFs. In this process, RiboCode first constructs the candidate ORF library based on the constitution of start codons and stop codons on different transcripts (including both protein coding and non-coding RNAs). The actively translated ORFs were then identified by evaluating the statistically significant 3-nt periodicity (P-value < 0.05 by default) in each candidate ORF based on the distribution of RPFs in each frame.
The translation efficiency was calculated as the ratio between CDS RPF abundance and mRNA abundance for each gene, for which gene expression matrix (raw read counts) needs to be uploaded by the users in the group case web page ( Figure 2). The gene expression count matrix is generated by merging raw read counts from accompanying RNA-seq data for different samples. The users can use many tools to count the reads from mapping BAM files of RNAseq data, such as featureCounts (35) and HTseq (39). Ri-boToolkit provides the information and download links of gtf files used for each species. The difference in translation efficiency between two groups with more than two replicates is analyzed using Riborex v2.4.0 (24) based on DE-Seq2 engine, which models a natural dependence of translation on mRNA levels as a generalized linear model (40). For two groups without replicates, only fold change is calculated. To explore the biological implication of differen-W220 Nucleic Acids Research, 2020, Vol. 48, Web Server issue tially translated genes (Fold change > 1.5 and adjust Pvalue < 0.05 by default), various functional gene enrichments are performed, including: (i) Gene Ontology (GO) and KEGG pathway from clusterProfiler package v3.14.3 (41) for all supported species; (ii) Reactome pathway from ReactomePA packages v1.30.0 (42) for human, mouse, rat, zebrafish, fly, and Caenorhabditis elegans; (iii) Disease Ontology, Network of Cancer Gene, DisGeNET disease genes from DOSE packages v3.12.0 (43) for human, mouse, rat, zebrafish, fly, and C. elegans. Meanwhile, Gene Set Enrichment Analysis (GSEA) for GO, KEGG and MSigDB functional gene sets (44) are supported for human, mouse, rat, zebrafish, fly and C. elegans. In the functional enrichment process, Fisher's exact test is used to perform enrichment analysis, while for the GSEA analysis, clusterProfiler package (41) is utilized. FDR < 0.05 was set as the statistically significant level by default.

WEB SERVER INPUTS
The single case module allows users to submit Ribo-seq data or accessible web links to the data. The Ribo-seq data are required in collapsed FASTA format or as collapsed FASTA file further compressed in zip or gz format to accelerate the uploading. The header in the collapsed FASTA format likes '>seq1 x160', where 'seq1' is a user-definable unique ID, while '160' represents the frequency of RPFs of 'seq1'. Meanwhile, batch submission of multiple samples is supported by clicking 'add' icon in the web page. The collapsed FASTA format can sharply reduce the size of FASTQ format files. For instance, a gzip compressed FASTQ file with 1.6 Gb size can be converted to a compressed collapsed FASTA file of 38 Mb size. Meanwhile, RiboToolkit (server #1) also provides users the options to upload FASTQ file, although the uploading speed is much slower than uploading collapsed FASTA. There is up to 5 Gb maximum upload size restriction. For the FASTQ file uploading, the adapter information is required, including 3 adapter, 5 adapter (optional), maximum allowed mismatches or match error rate, minimum overlap length between read and adapter, number of nucleotides clipped from both ends, and number of rounds for adapter trimming. After submission of data, the analysis queue system will provide the users with job IDs (a string with 16 characters) that can be used to retrieve the results once the job is finished.
In the group case module, the job IDs of single case module are required as inputs and each group should contain at least one sample. During the data analysis process, the web server will retrieve corresponding results for the jobs automatically. When the gene expression matrix file (raw Nucleic Acids Research, 2020, Vol. 48, Web Server issue W221 Figure 2. Inputs of RiboToolkit. In single case, RiboToolkit utilizes collapse FASTA file of Ribo-seq data as input. In group case, the job IDs of single case module are required as inputs and each group should contain at least one sample. The gene expression count matrix file of according input samples (RNA-Seq) is required to perform differential translation analysis. read counts) of according input samples (RNA-Seq) is provided in group case, RiboToolkit will perform the translation efficiency calculation, differential translation analysis, and functional annotation of differentially translated genes. The gene expression count matrix can be generated by merging the raw count outputs from many tools, such as fea-tureCount (35) and HTseq (39). The codonFreq tool can be used to perform codon enrichment analysis and compare the codon frequencies in the user's submitted genes compared with other background genes. Users can define a codon subset by inputting codon list in the web page.
RiboToolkit provides several flexible parameters for the users. A length interval can be set in advance and only the RPF sequences within this interval (26-32 nt by default) will be considered for downstream analyses. Meanwhile, RiboToolkit provides many other useful parameters (Figure 2), such as the number of allowed mismatches (with the default a maximum of two mismatches), maximum of multiple-mapping (unique mapping by default) in RPF sequence mapping, offsets to infer P-sites (calculated by psite function or inputted by users), minimum of RPF coverage, fold change compared with background for codon pause site identification, and P-value for actively translated ORF calling. For Ribo-seq data which use unique molecular identifier (UMI) for PCR duplication elimination (45), the algorithm implemented in RiboFlow-RiboR-RiboPy (46) and UMI-Reducer (https: //github.com/smangul1/UMI-Reducer) are used to remove the PCR duplication. To detect differential translation between samples, the desired statistical significance of interest with P-value threshold and fold change in normalized sequence counts can be defined by users. The statistically significant level for functional enrichments of differentially translated genes can be set by users. In the codonFreq tool, the users can set the P-value threshold to define the codon enrichment between the codon frequency of the gene and genome background. All input webpages are organized with examples to help users achieve correct inputs.

WEB SERVER OUTPUTS
All RiboToolkit outputs are presented in intuitive web interfaces, which typically contain the following information: (i) basic statistics of RPF tags, including RPF cleaning statistics by mapping to different potential contamination RNA types (rRNA, tRNA and snRNA) (46), RPF length distribution, RPF distribution on different gene biotypes (protein coding, lincRNA, antisense RNA, etc.) and RPF distribution on different gene features (5 UTR, CDS, 3 UTR, etc.); (ii) Ribo-seq quality statistics, including RPF coding frame distribution (frame 0, 1 and 2) on 5 UTR, CDS and 3 UTR, respectively, RPF coding frame distribution with different RPF length, RPF mapping around start codon for P-site inferring, RPF metagene distribution around translation start/end sites for 3-nt periodicity checking, and metagene coverage plots for whole CDS, CDS start region (300 bp) and CDS end region (300 bp); (iii) codon occupancy statistics, including codon occupancies of E, P, A, A +1, A + 2 and A + 3 sites; (iv) metaplot for individual codon; (v) Gene expression table from RPF counts, including RPF counts and RPKM values for 5 UTR, CDS, 3 UTR and whole mRNA; (vi) codon pause score and sequence context information (RNA secondary structure, minimum free energy, and GC content for both upstream and downstream sequences) for codon pausing sites; (vii) Actively translated ORF statistics, including actively translated ORF distribution plot and table of detailed ORF list. All the ORFs with statistically significant 3-nt periodicity distribution (P-value < 0.05 by default) are reported in the table. The users can further filter the ORF list by using RPF raw number or normalized RPF number to identify high confidence ORFs from the full list ( Figure 3).
In group case study, the outputs contain: (i) a heatmap of RPF length distribution for all the samples in two groups; (ii) reproducibility analyses using RPF and gene expression, including correlation scatter plots between different replicates, PCA analysis and correlograms for different samples; (iii) codon occupancy scatter plot for A site between two groups and correlation plots among replicates in each group; (iv) codon occupancy changes for E, P, A, A +1, A + 2 and A + 3 sites between two groups; (v) cumulative codon occupancy plot for individual codons; (vi) cumulative codon occupancy plot for individual amino acids; (vii) the statistic plots of expression and translation changes, including the volcano plot of differential translation, the scatter plot of translation efficiency changes versus expression changes, gene expression scatter plot between two groups, translation efficiency scatter plot between two groups, the correlation plot between normalized RPF count and normalized gene counts; (viii) RPF metagene distribution between two groups; (ix) RPF metagene distribution for translation up-regulated and down-regulated, respectively; (x) differentially translated gene list; (xi) functional enrichment barplots and detailed lists of enriched terms for both translational up-regulated genes and down-regulated genes and (xii) GSEA result list. In codonFreq tool, the output includes the difference of total codon frequency between input genes and background genes, boxplots of codon frequency of each codon and the table of codon frequencies and enrichments for uploaded genes (Figure 3).
For each table in the results web pages, more detailed gene information (including sequence lengths for 5 UTR, CDS, 3 UTR, and whole transcript) are provided for each gene, which can be downloaded by clicking the 'down-Nucleic Acids Research, 2020, Vol. 48, Web Server issue W223 load CSV' button located above the table. For each interactive plot generated using Highcharts JavaScript library (https://www.highcharts.com), RiboToolkit provides links (above the plot) to download the data for the plots in both txt and csv formats. Meanwhile, the users can download all the results in the tables and figures using 'Download the results' button in the front of result pages.

COMPARISON WITH OTHER INTEGRATED TOOLS
There is a wide range of publicly available tools for Riboseq data analysis. However, to the best of our knowledge, the focus of many available tools is directed towards actively translated ORF identification, such as Ri-bORF (15), RiboTaper (16) and ORF-RATER (17). Some tools are designed specifically for visualizing RPF distribution and codon statistics, such as riboWaltz (11), GWIPSviz (12) and Trips-Viz (14). Other tools, such as Riborex (24) and Xtail (29), focus on differential translation efficiency analysis. Although integrated tools are designed for Ribo-seq analysis, including RiboTools (47), riboSeqR (6), Plastid (7), RiboProfiling (10), PROTEOFORMER (48,49), systemPipeR (50), RiboVIEW (13), riboStreamR (51), RiboFlow-RiboR-RiboPy (46) and RiboGalaxy (52), these tools provided just a limited number of functionalities and/or required many bioinformatics expertise to install, configure and manipulate them (Supplementary Table  S2). Although RiboGalaxy provides many functions on the Galaxy web, but they are not integrated with each other. RiboToolkit is the first integrated one-stop web server for Ribo-seq data analysis (Supplementary Table S2): which provides many useful functionalities: (i) data quality control; (ii) Ribo-seq quality evaluation; (iii) Codon usage and ribosome stalling analyses, (iv) Actively translated ORFs identification; (v) gene translation efficiency and differential translation analysis; (vi) differential translation gene functional annotation based on various functional sets; (vii) RPF metagene analysis for CDS region and translation start/end sites; (viii) interactive visualization of RPF mapping on the web page; (ix) CodonFreq tool was developed to study the codon constitution of different gene groups; (x) different ways of data uploading; (xi) very user-friendly web interfaces and a convenient data analysis queuing system; (xii) RNA translation can be studied for as many as 16 species.

CASE STUDIES
Transfer RNAs (tRNAs) are subjected to numerous RNA modifications, which can directly control their folding and stability. N 7 -Methylguanosine (m 7 G) at nucleotide 46 (m 7 G46) is one of the most prevalent modifications and has important physiological functions in mammals. A total of 22 m 7 G modifications were identified in mouse embryonic stem cells (mESCs) and knockout of METTL1 was shown to greatly decrease the stability of 22 m 7 G tRNAs and further impact mRNA translation of cell cycle and neurodevelopmental genes (53). RiboToolkit was used to study the translation changes based on Ribo-seq data of Mettl1 knockout and control in mouse embryonic stem cells (mESCs) (GSE112670, Supplementary Table S3) (53). The mapping statistics, RPF periodicity, RPF length distribution, and metagene plot by RiboToolkit confirmed the good quality of the Ribo-seq data (Figure 4). Codon occupancy analysis confirmed that the majority of m 7 G-modified tR-NAs decoded codons showed significantly higher occupancy than codons that are decoded by tRNAs that are not m 7 G-modified ( Figure 5A and B). Translation efficiency analysis by RiboToolkit showed that the translation is obviously impacted upon knocking out Mettl1 compared with the mRNA expression level changes ( Figure 5C). Codon frequency distribution from RiboToolkit indicated that the frequency of m 7 G tRNAs decoded codons are significantly enriched in translation down-regulated genes ( Figure 5E). The functional annotation of Gene Ontology and various pathways by RiboToolkit showed that cell cycle and neural genes are significantly enriched among the translationally down-regulated genes ( Figure 5D), which are consistent with the original findings. Further analyses also confirmed the significant higher m 7 G codon frequencies of cell cycle genes and neural genes compared with random background genes ( Figure 5F and G).
Certain yeast strains show a large proportion of sites with high codon occupancy due to a high abundance of paused ribosomes (54). Yeast treated with 3-amino-1,2,4triazole (3-AT), an inhibitor of histidine biosynthesis, can induce ribosome pausing. Based on the Ribo-seq data of 3-AT treatment in Yeast (GSE52968, Supplementary Table S3) (54). RiboToolkit analyses showed a significant shift in a cumulative distribution of pause scores and a peak in metagene distribution plot, indicating the significant pauses at histidine codons (Supplementary Figure S1). In yeast, the wobble uridine (U34) in tRNA wobble nucleoside is almost always modified and can enhance codon recognition and binding (36). RiboToolkit analyses based on ribosome profiling data of ncs2 elp6 yeast mutant (lacking all U34 modifications) and wide type (GSE67387, Supplementary Table S3) (36) revealed strikingly distinct effects of U34 modification loss on ribosome occupancy. CAA and AAA codons, decoded by the mcm 5 s 2 U34-containing tRNA-UUG and tRNA-UUU, were enriched within the Asite in the mutant (Supplementary Figure S2A), suggesting they are translated more slowly when U34 modification is depleted or attenuated. For other codons, including GAA, which is also decoded by a mcm 5 s 2 U34-containing tRNA, the effect on ribosome occupancy is modest. The comparison of the A-site ribosome occupancy at individual codons in wild type and ncs2 elp6 mutant showed that a significantly larger proportion of CAA and AAA codons had high occupancy in mutant ( Supplementary Figure S2B), indicative of widespread translational slowdown. By contrast, single-codon A-site occupancy change at GAA was not significant in the two strains, consistent with the global codon occupancy measurements (Supplementary Figure S2B).
Endoplasmic reticulum (ER) stress impacts translation (55). We used RiboToolkit to systematical profile translation in ER-stress conditions of NIH3T3 cells (GSE103667, Supplementary Table S3) (56). Translation efficiency indi- cates that a total of 120 genes are significantly differentially translated (fold change > 1.5 and adjust P-value < 0.05). There are many more down-regulated genes compared with up-regulated genes (91 versus 29) ( Figure 6A). The up-regulated gene includes Atf4, a transcriptional factor, which is well known from other studies to be translationally up-regulated upon ER stress (57,58) ( Figure 6F). Codon occupancy analysis indicated that the global translation on specific codons is not affected under the ERstress condition ( Figure 6B). Functional annotation of differentially translated genes revealed significant enrichment in oxidative phosphorylation, electron transport chain, endoplasmic reticulum unfolded protein response, response to endoplasmic reticulum stress, cell adhesion, and extracellular matrix, etc. GSEA results also indicate the significant association between ER-stress and extracellular matrix function ( Figure 6C and D). Active ORF analysis showed in NIH3T3 cells that most ORFs come from known CDS region (annotated ORF). There are however many other ORFs identified, including uORF (upstream ORF), overlapping uORF, dORF (downstream ORF), overlapping dORF (translation read through), internal ORF (ORF on CDS with different coding frame or frame shift) and novel ORF ( Figure 6E).
Global repression of protein synthesis occurs during heat shock response and has been attributed primarily to inhibition of translation initiation (59). RiboToolkit was used to study global translational regulation during chronic heat stress (42 • C for 8 h, HS8M) and acute heat stress (44 • C for 2 h, HS2S) in mouse 3T3 fibroblast cell (GSE32060, Supplementary Table S3) (59). Metagene RPF distribution analysis showed that change is generally modest in response to chronic heat stress ( Figure 7A), while a dramatic change in relative ribosome occupancy occurs in response to acute heat stress, especially at the translation initiation region (∼200 nt after the initiation site) ( Figure 7B), which may indicate translation regulation after initiation. Numerous individual genes with sufficient RPF coverage showed a similar distribution to the RPF metagene plot, such as Vim and Serpine1 ( Figure 7C). There are exceptions with some genes escaping from the global elongation and initiation blocks, such as Atf4 and Atf5 ( Figure 7D), two important transcription factor genes that regulate responses to a variety of stress conditions (55,58). Both of these factors have been re- vealed to be translationally up-regulated under stress conditions via a mechanism involving translation of uORFs (Figure 7D) (55,57,60).

DATA UPLOAD SPEED AND ANALYSIS SPEED EVAL-UATION
To test the data upload and analysis efficiency, we uploaded the dataset used as case studies (Supplementary   Table S3) to the two high-performance computer servers from both the USA and China. The compressed file sizes of samples GSE103667 (four samples), GSE67387 (two samples), GSE52968 (two samples), GSE112670 (two samples), and GSE32060 (3 samples) were 97, 289, 67, 44 and 78MB, respectively. The upload speed of server #1 (rnabioinfor.tch.harvard.edu) is faster than server #2 (bioinformatics.sc.cn), while the data analysis speed is slower than server #2 (Supplementary Table S3). The upload speed also depends upon the web condition when uploading the data. It is expected that the analysis speed is higher in server #2 due to its better hardware.

IMPLEMENTATION
The web servers are hosted within a Linux system containing PHP/Apache environment. server #1 is equipped with 16 cores Intel Xeon E7440 (2.4 GHz) and 128GB of RAM while server #2 is equipped with four hexadeca-core (64 cores) Intel Xeon processors (2.1 GHz each) and 512GB of RAM. The back-end pipeline is implemented in the Perl/R language. The bitmap plots in PNG and PDF formats are drawn by R (http://www.r-project.org) packages, including ggplot2, cowplot, and ggpubr. The visualization web interfaces are created using several JavaScript libraries, including JQuery, Bootstrap.js, DataTable.js, Highchart.js and igv.js libraries, etc. which provide users with a highly dynamic, interactive, and intuitive interfaces for manipulating the software and viewing the analysis results.

CONCLUSIONS
Ribosome profiling (Ribo-seq) is proven as a very powerful technology for globally monitoring mRNA translation and more and more laboratories have started using this powerful approach in their studies. However, a convenient and integrated tool for Ribo-seq data is still lacking. In this study, RiboToolkit, the first integrated one-stop web-based toolkit, was developed to analyze ribosome profiling data for users with or without bioinformatics expertise. Various case studies validated the useful functionalities and reproducibility of RiboToolkit. Currently, it supports 16 model species, and additional reference genomes will be integrated in future updates. Moreover, additional functionality (such as ribosome drop-off detection (61,62)) and more input formats (such as BAM file) will be supported in RiboToolkit in the future. Due to the large size of RNA-seq data, Ri-boToolkit web server only supports the uploading of gene expression counts. The virtureBox or Docker versions of Ri-boToolkit will be developed in the future, which will support the inputs of accompanying RNA-seq data with the Ribo-