The Chromosomal Distribution of Sex-Biased MicroRNAs in Drosophila is Nonadaptive

Abstract Genes are often differentially expressed between males and females. In Drosophila melanogaster, the analysis of sex-biased microRNAs (short noncoding regulatory molecules) has revealed striking differences with protein-coding genes. Mainly, the X chromosome is enriched in male-biased microRNA genes, although it is depleted of male-biased protein-coding genes. The paucity of male-biased genes in the X chromosome is generally explained by an evolutionary process called demasculinization. I suggest that the excess of male-biased microRNAs in the X chromosome is due to high rates of de novo emergence of microRNAs (mostly in other neighboring microRNAs), a tendency of novel microRNAs in the X chromosome to be expressed in testis, and to a lack of a demasculinization process. To test this hypothesis, I analyzed the expression profile of microRNAs in males, females, and gonads in D. pseudoobscura, in which an autosome translocated into the X chromosome effectively becoming part of a sex chromosome (neo-X). I found that the pattern of sex-biased expression is generally conserved between D. melanogaster and D. pseudoobscura. Also, orthologous microRNAs in both species conserve their chromosomal location, indicating that there is no evidence of demasculinization or other interchromosomal movement of microRNAs. Drosophila pseudoobscura-specific microRNAs in the neo-X chromosome tend to be male-biased and particularly expressed in testis. In summary, the apparent paradox resulting from male-biased protein-coding genes depleted in the X chromosome and an enrichment in male-biased microRNAs is consistent with different evolutionary dynamics between coding genes and short RNAs.


Introduction
Gene expression is tightly regulated by mechanisms that make gene products to be expressed in specific organs at specific times. This spatiotemporal control of gene expression determines the development of fertilized eggs into adult organisms (Davidson 2006;Wolpert et al. 2006). In organisms with two sexes, a large fraction of genes are also differentially expressed between males and females (Parisi et al. 2003;Ranz et al. 2003). Also, most male-or female-specific expression occurs in the gonads, although other tissues also show sex-expression bias (Parisi et al. 2004;Chang et al. 2011). Female-biased genes tend to be located in the X chromosome (in XY systems) and malebiased genes tend to be depleted in the X chromosome (Parisi et al. 2003;Ranz et al. 2003;Khil et al. 2004).
In the model species Drosophila melanogaster, the study of sex-biased expression indicates that male-biased genes often appear in the X chromosome (Arguello et al. 2006;Levine et al. 2006;Begun et al. 2007;Chen et al. 2007), but they are later retroposed (copied) to an autosome and the original copy is eventually lost (Betran et al. 2002;Zhang, Vibranovski, Krinsky, et al. 2010). This process, called "demasculinization," was predicted already by early theoretical models (Rice 1984), and recent works seem to support it (Jiang and Machado 2009;Bachtrog et al. 2010;Zhang, Vibranovski, Krinsky, et al. 2010;Zhang, Vibranovski, Landback, et al. 2010), although there is still some controversy around it, and alternative models should be taken into consideration (Meiklejohn and Presgraves 2012). However, these conclusions cannot be extrapolated to all gene products, as most work on sexbiased expression has been focused on a specific type of gene: protein-coding genes. Characterizing the distinct evolutionary dynamics of different types of genes is paramount in evolutionary biology, as it can reveal how genetic and genomic features influence, and ultimately determine, the fate of genes (Lynch 2007).
MicroRNAs, short RNA posttranscriptional regulators (reviewed in Bartel 2004;Axtell et al. 2011;, are also expressed differently in males and in females (Marco, Kozomara, et al. 2013;Marco 2014Marco , 2015Warnefors et al. 2017;Fowler et al. 2019). Although an early investigation suggested that they were also subject to demasculinization (Zhang, Vibranovski, Krinsky, et al. 2010), a more recent work indicates that this is probably not the case (Marco 2014). Indeed, male microRNAs tend to be enriched in the X chromosome (Mishima et al. 2008;Li et al. 2010), contrary to what is observed in protein-coding genes. In D. melanogaster, novel microRNAs in the X chromosome are often expressed in testis and most are evolutionarily young (Marco 2014). To investigate the evolutionary dynamics of microRNAs in the X chromosome, I characterized the sexbiased microRNA complement of the species D. pseudoobcura, which diverged from D. melanogaster ∼37 Ma (Hedges et al. 2015), and had an autosome translocated into an X-chromosome, becoming a newly evolved X-chromosome (or neo-X). This neo-X emerged ∼13 Ma (Kaiser and Bachtrog 2010) which, despite being a short evolutionary period, has been enough to identify demasculinization affecting protein-coding genes in several studies (Zhang, Vibranovski, Krinsky, et al. 2010;Nozawa et al. 2014). Indeed, it has been estimated that, in Drosophila, one retrogene emerges every 2 Myr at an approximate constant rate (Bai et al. 2007), and we would expect (at least in theory) a similar rate for microRNA genes. Hence, this system will shed light on the evolution of chromosomal composition of sex-biased microRNAs.

Results
In order to characterize differentially expressed microRNAs between males and females in D. pseudoobscura, I analyzed three available small RNA datasets for D. pseudoobscura, two male samples and one female sample (Mohammed et al. 2018). However, these datasets have either no biological replicates, or they have been sequencing using different library construction methods. Hence, I also sequenced two male and two unfertilized female samples in two paired sequencing reactions (one male sample and one female sample in each sequencing array) using the same library preparation protocols. The differential gene expression (DGE) analysis for this set was performed taking into account batch (paired) effects. For simplicity, the first set is called in this paper the unpaired dataset, and the newly reported here the paired dataset.
The differential expression analysis of the unpaired dataset, for a false discovery rate (FDR) or 10% and a expression difference of 25% or over, identified 38 microRNAs with sex-biased expression: 19 overexpressed in females and 19 overexpressed in males ( To investigate whether the chromosomal location between D. melanogaster and D. pseudoobscura orthologs is conserved, the one-to-one orthologous microRNAs as annotated in Mohammed et al. (2018) were first considered. There were 128 annotated orthologs, of which 122 were successfully mapped to the latest genome versions. The chromosome in which each microRNA is located is conserved for all studied sequences, although some evidence of translocations and inversion is observed within chromosomes ( fig. 2). Importantly, among this set of orthologs, there is not a single case of a microRNA in one of the Drosophila species that moved to a different chromosome in the other species. In other words, this specific analysis did not identify a microRNA that moved out of a sex chromosome to an autosome (or vice versa), a phenomena that is well described for protein-coding genes, and it is generally due to retroposition (Sturgill et al. 2007;see Discussion).
A potential bias in the chromosomal location analysis is that orthology may have been defined, partly, by using chromosomal context. To control for that I run reciprocal BLAST among all D. melanogaster and D. pseudoobscura annotated microRNAs (see Materials and Methods). Clusters of similar sequences were aligned and similarity trees were built (supplementary file S1, Supplementary Material online). Only one microRNA in D. pseudoobscura remained unpaired to an ortholog: dps-mir-92c. Mir-92 is one of the most conserved microRNA families in animals, and it has two copies in D. melanogaster (mir-92a and mir-92b) and three (mir-92a, mir-92b, and mir-92c) in D. pseudoobscura. Recent work has shown that mir-92c is deeply conserved in insects and it is not associated with a recent duplication of any of the other mir-92 in the pseudoobscura lineage (Ninova et al. 2014), so it seems that it has been lost in D. melanogaster, although its homology relationships with other microRNA family members are not fully understood. As a matter of fact, Mohammed et al. (2018) suggest that D. melanogaster mir-311 is mir-92c's ortholog, a relationship maintained in the analyses for consistency (see below; fig. 3), but that did not alter the results.
In addition, the D. pseudoobscura genome was scanned for potential homologous microRNAs annotated so far only in D. melanogaster (see Materials and Methods), to identify copies that may have not been detected in earlier works (either because of limited genome sequence information or by the use of strict criteria to filter putative microRNAs). The BLAST searches predicted 26 putative one-to-one previously not identified microRNA orthologs between both species (table 1). Only one newly identified microRNAs was in different chromosomes between the two species: dme-mir-2281. However, independently of whether this is a bona fide microRNA (it is lowly expressed and poorly conserved), it would be an autosome to autosome translocation. There were two microRNAs in D. melanogaster that have two copies in D. pseudoobscura with at least one of those not previously annotated (table 2). In all these cases, all copies were in the analogous chromosomes in both species.
Finally, I considered those D. melanogaster microRNAs with more than two copies in the D. pseudoobscura genome. There were six microRNAs in this category (dme_208, dme_417, dme_455, dme_5, dme_7, and dme-mir-10404), all hitting 55 positions in the genome. These sequences are actually derived from rRNAs, and similar sequences are found associated to rRNA genes in other insects (Chak et al. 2015). In summary, these comparative genomics analyses did not reveal any microRNA that may have moved out, or copied from, the X to any other autosomal chromosome.
To investigate whether the expression profile of microRNAs change in a different chromosomal context, I compared the sex-bias in the expression of orthologogous microRNAs between both Drosophila species. A scatterplot of the log2 fold-change values for pairs of autosomal To characterize the expression of novel and conserved microRNAs depending on the chromosomal context, I compared the expression bias (log2 fold-change) of novel and conserved microRNAs in both autosomes and the X chromosome in D. pseudoobscura. By building a linear model on ranked log2 fold-changes considering chromosomal context and evolutionary age (Scheirer-Ray-Hare test), there was strong statistical support for novel microRNAs to be overexpressed in males, in both the unpaired (P = 0.0045; fig. 4A) and the paired (P = 0.0187; fig. 4B) datasets. In the controlled paired dataset, within the novel microRNAs, those in the X chromosome tend to have a higher expression bias toward males with respect to those in the autosomes ( fig. 4B). More specifically, when I compare the testis to ovary expression ratio for microRNAs expressed in the gonads, a clear pattern of novel microRNAs in the X tending to be overexpressed in testis becomes evident ( fig. 4C; conservation P < 0.0001, chromosome P = 0.0115).
A caveat in this analysis is the nonindependence of expression profiles among clustered microRNAs, as they tend to be produced from the same transcript. To evaluate the impact of microRNA clusters, only the highest expressed microRNA in each cluster (microRNAs within 10 kb to each other) was considered. Overall the log2 foldchange in expression is similar in both species ( fig. 5A) with the main exception of the microRNA cluster already mentioned above. For novel X-linked microRNA clusters, there were four loci, all highly expressed in testis ( fig. 5B). This also suggests that novel X-linked microRNAs tend to emerge within other microRNA clusters/loci (see Discussion).

Discussion
In this work, I characterized the sex-expression pattern of microRNAs in D. pseudoobscura to investigate the evolutionary dynamics of male-expressed microRNAs, first by analyzing already available datasets (two female and one male RNAseq datasets with different library preparations), and after by sequencing paired samples (two males and two females) to control for batch effects. By comparing the expression profile of D. pseudoobscura and D. melanogaster, it was evident that the sex-bias of microRNAs is largely conserved between both species. This is also the case for microRNAs that were located in an autosome which became the neo-X arm after a translocation, supporting that the relative expression between males and females does not depend on the chromosomal context. In other words, male-biased microRNAs may acquire their expression profile early during evolution, but then their expression is maintained even if the chromosome acquires a sex chromosome status. The conservation of the expression bias as well as the conservation of the chromosome in which microRNAs are located provides strong evidence against demasculinization having any impact on microRNAs. There were two exceptional cases, when analyzing the paired dataset, of microRNAs strongly male-biased in D. pseudoobscura but strongly female-biased in D. melanogaster. These microRNAs belong to the mir-310-313, whose products are maternally deposited in the egg (Ninova et al. 2014;Marco 2015). In this study, to avoid the excessive contamination from maternal/embryonic microRNAs in females (Marco 2014(Marco , 2015Ninova et al. 2014), the RNA of young virgin females was sequenced. That may explain the differences observed in these two particular clusters. In any case, both microRNAs clusters are located in the homologous autosome in both species, so the observed difference is surely not a consequence of a different chromosomal context.
In the lineage of the species D. willistoni, there was an independent fusion of the Muller element D (chr3L in D. melanogaster) with the X chromosome, generating a neo-X similar to what happened in the pseudoobscura lineage. Of the male-biased D. melanogaster microRNAs, only two were located in the 3L chromosome: mir-274 and mir-276a. BLAST searches revealed that both microRNAs are conserved in D. willistoni as well as the neighboring region, suggesting that the synteny is conserved. However, as far as I know, these scaffolds are not mapped into the Muller elements and a detailed analysis of any potential movement out of the D. willistoni neo-X is not possible at the moment.
The demasculinization model described in the Introduction assumes that male-expressed (malebeneficial) genes are more likely to be evolutionarily lost if they are located in the X chromosome, as damaging mutations are often lethal in hemizygosity ( fig. 6A). Hence, gene copies in the autosome are favored, whereas the ancestral X located copy is eventually lost (Sturgill et al. 2007). Some groups have suggested that novel microRNAs are under positive selection upon emergence (Lyu et al. 2014), and therefore, they are more likely to be fixed in the X chromosome if they are male-beneficial. However, once fixed, recessive deleterious mutations will be more damaging for males if the microRNA is in the X chromosome. As a matter of fact, a majority of novel microRNAs is lost within a relatively short evolutionary period (Lyu et al. 2014). The genes are copied by retroposition ( fig. 6B), in which a reverse transcriptase makes a DNA copy of the processed transcript which is inserted (by a retroviral integrase) in the genome. The original copies of genes retroposed into the autosomes  are eventually lost ( fig. 6C). This mechanism is highly unlikely, if not impossible, in microRNAs because processed microRNA precursors leaving the nucleus (and therefore exposed to endogenous reverse transcriptases) are short (<100 nucleotides) RNA molecules with an internal hairpin structure (Bartel 2004), lacking a polyadenylation sequence that could be used as a primer by the endogenous reverse transcriptases (Wei et al. 2001). DNA-based gene relocation (i.e., not involving retroposition) has also been described in X to autosome gene movement (Vibranovski et al. 2009). However, it was not detected in this analysis any microRNA X to autosome translocation, so DNA-based demasculinization has not been observed either, yet it will be interesting to explore microRNA interchromosomal movement at longer evolutionary distances (although the current assemblies do not allow to perform this analysis yet). Nevertheless, in future works DNA-based transposition should be taken into account as the most likely mechanism of microRNA interchromosomal movement.
This work supports an alternative model of evolution. MicroRNAs have a high rate of turn-over. That is, novel microRNAs appear at high frequencies ( fig. 6D), which is less common in protein-coding genes. Novel genes are often male-biased (Metta and Schlötterer 2008), and if they are in the X they are frequently highly expressed in testis (this work, see also Mohammed et al. 2014). Once a new microRNA transcript with male-biased expression emerges in the X chromosome, it can potentially serve as a source of novel linked microRNAs within the transcript, which may explain why evolutionarily young X-linked male-biased microRNAs tend to appear in clusters, not only in Drosophila, but also in mammals ( fig. 5B; Li et al. 2010;Devor et al. 2011;Marco, Ninova, Ronshaugen, et al. 2013). This bias in gene expression of newly emerged microRNAs, together with the high rate of evolutionarily young microRNA loss ( fig. 6E) eventually leads to an enrichment of male (testis) expressed microRNAs in the X chromosome ( fig. 6F). Alternatively, if X-located male-biased microRNAs had an important function and, therefore, were under purifying selection, this selection is expected be significantly weaker than in protein-coding genes, as only a very small fraction of nucleotides of a whole microRNA transcript is functional (Rodriguez et al. 2004), whereas for protein-coding genes a large fraction of exonic nucleotides code for functional amino acids. In other words, whereas a microRNA gene contains dozens of nucleotides necessary for the production of a mature microRNA, a protein-coding gene of the same size contains thousands of these nucleotides, and are therefore more likely to be hit by a random mutation. Either way, natural selection seems to have a limited role in the birth and maintenance of male-biased X-linked microRNAs. Future work on older neo-X chromosomes in Drosophila and other insects will shed further light on the validity of this model in other species and over significantly longer evolutionary periods.
It is also important to keep into consideration that male-biased expression does not necessarily mean beneficial for males, or that losing the gene males fertility will be compromised. Indeed, genes whose lack of function lead to infertility are male expressed, but the converse does not hold true, male-biased genes are not generally fertility genes (Lindsley et al. 2013). Putting all this together, this work strongly suggests that the X-chromosome is enriched in male-biased microRNAs as a consequence of mutation bias (tendency of novel microRNAs in the X to be male biased), lack of retroposition, and a conservation of the expression bias independently of the chromosomal context. In summary, the evidence suggests that the chromosomal location of sex-biased microRNAs is nonadaptive.

Sample Collection
The work was done with a D. pseudoobscura stock kindly donated by Tom Price (University of Liverpool) and Nina Wedell (University of Exeter), collected in Show Low (Arizona, USA). All flies were kept at 18°C on cornmealbased media, with 12 h light/dark cycles. Adult males and females were collected at age 7 days. Unfertilized eggs were collected from virgin females in population cages (∼30-50 females) in cycles of 12 h. Virgin females were allowed to lay eggs in control vials to ensure they were virgins. We discarded population cages with females coming from control vials having larvae within 2 weeks (in these experiments, it only happened once). Eggs were collected with a sieve and washed with wash solution (NaCl 100 mM, Triton X-100 0.7 mM) and water.

RNA Extraction and Sequencing
Total RNA from four samples (two males and two females) samples was extracted using TRIzol (Life Technologies) as recommended by the manufacturer. RNA was dissolved in RNase-free water. For small RNA sequencing, I used the TruSeq Small RNA Sample Preparation Kit (Illumina) to generate the cDNA library using selected constructs of sizes 145-160 bp in a 6% PAGE gel, and precipitated in ethanol. DNA integrity was checked with TapeStation (Agilent). Samples were sequenced in-house with an Illumina MiSeq sequencing machine.

MicroRNA Expression Analysis
Reads were mapped to the D. pseudoobscura genome sequence assembly 104 (Liao et al. 2021) with HISAT2 version 2.2.1 (Kim et al. 2019) with default parameters. Raw reads were deposited in the Gene Expression Omnibus (GEO) at NCBI with accession number GSE179989. Additionally, reads from other D. pseudoobscura samples were retrieved from GEO: GSE98013 (Mohammed et al. 2018) (two males and one female samples) and GSE48017 (Lyu et al. 2014) (one ovary and one testis samples) and also mapped to the same reference genome. Reads from D. melanogaster SRR016854, SRR018039, SRR069836, and SRR069837 (Chung et al. 2008;Roy et al. 2010) were mapped with the same method to the reference genome dm6 (Hoskins et al. 2015). Adapters were trimmed with Cutadapt 1.18 (Martin 2011) before mapping. Read counts were obtained with featureCounts 2.0.2 (Liao et al. 2014) using the annotation coordinates from miRBase release 21 (Kozomara et al. 2018) and the additional microRNAs originally annotated by Mohammed et al. (2018), filtering out those mapping with 100% identity to multiple regions in the current genome assembly. MicroRNAs labeled as erroneously annotated by this study were also removed from the analysis. DGE analyses of our D. pseudoobscura samples and the D. melanogaster male and female available datasets, were conducted with DESeq2 version 1.24.0 (Love et al. 2014: 2) with local fitting. Alternatively, DGE was also done with limma version 4.2.0 (Ritchie et al. 2015) using TMM normalization and voom transformation (Law et al. 2014). For both DGE methods, the model used was expression ∼ batch + sex, where batch is the sequencing run. In the fold-change comparison between chromosomes and novel and conserved microRNAs, only those with at least one read count in one of the samples in each sex and a base mean (from DESeq2) of at least one were considered.

Homology Analysis
Annotated microRNAs from both species were compared via BLAST 2.9.0 (Altschul et al. 1997) using the R wrapper rBLAST 0.992 (https://github.com/mhahsler/rBLAST/): blastn with word size 10 and E-value threshold of 0.1. Pairs of similar sequences between species were built into a graph with igraph 1.2.11 (https://igraph.org) and connected graphs were extracted as similarity groups (potentially homology groups). For each similarity group, a sequence alignment with MAFFT 2.9.0 (Katoh and Toh 2008) and a neighbor-joining tree (Saitou and Nei 1987) using uncorrected distances with ips 0.0.11 (https://www. rdocumentation.org/packages/ips) were built for visual inspection. To identify potential, and previously missed, microRNAs, D. melanogaster microRNA precursors were compared against the D. pseudoobscura genome with blastn (word size 10, E-value threshold of 0.01). Alignments of precursor microRNAs of length 60 or over were further considered. Lineage-specific (novel) microRNAs were those described by Mohammed et al. (2018).
All data generated and scripts to reproduce the full analysis is available from GitHub: https://github.com/ antoniomarco/miRpseudoobscura.