The genome of the zebra mussel, Dreissena polymorpha: a resource for comparative genomics, invasion genetics, and biocontrol

Zebra mussels are one of the world’s most damaging invasive species. Native to Eurasia, they have continued to spread rapidly through Europe and, in recent decades, through North America, causing billions of dollars in economic damage and dramatically altering the ecosystems of infested lakes and rivers. Here we report the sequencing of the zebra mussel genome, which will be an important tool for invasive species research and biocontrol efforts.

Multiple sequence alignment of the six shematrin-like proteins identified in the D. polymorpha genome using CLUSTAL Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/).

Supplemental Figure 11. Structures of D. polymorpha shematrin-like proteins. a-f are the six highly expressed shematrin-like proteins from mantle transcriptomes, labeled with their DPMN number. Amino acid sequences from these highly expressed genes are numbered starting from residue 1 following the signal peptide (which is not shown). Regions of low complexity (i.e. single-residue bias) detected by fLPS are labeled above the colored bars. The grey triangles below the sequences mark tandem repeat regions detected using XSTREAM. The most highly expressed gene, DPMN 014835, is shown in two segments due to its length.
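To make the idea of single-residue bias concrete, the following is a minimal sketch of a sliding-window scan that flags stretches dominated by one amino acid. It is a deliberately simplified illustration and not the fLPS algorithm: the function name `biased_windows`, the window length, the threshold, and the example sequence are all assumptions made for this sketch.

```python
from collections import Counter


def biased_windows(seq, window=20, min_fraction=0.5):
    """Return (start, end, residue) regions dominated by a single residue.

    Coordinates are 0-based and end-exclusive. Overlapping or adjacent
    windows that share the same dominant residue are merged into one region.
    The window size and threshold are illustrative assumptions.
    """
    regions = []
    for start in range(0, len(seq) - window + 1):
        chunk = seq[start:start + window]
        # Most frequent residue in this window and its count.
        residue, count = Counter(chunk).most_common(1)[0]
        if count / window >= min_fraction:
            end = start + window
            if regions and regions[-1][2] == residue and start <= regions[-1][1]:
                # Extend the previous region instead of starting a new one.
                regions[-1] = (regions[-1][0], end, residue)
            else:
                regions.append((start, end, residue))
    return regions


# Hypothetical sequence: a glycine-rich core flanked by mixed-composition ends.
flank = "ACDEFHIKLMNPQRSTVWYA"
example = flank + "G" * 30 + flank
print(biased_windows(example))
```

A real analysis would use a statistical criterion for bias rather than a fixed fraction, which is why this sketch should be read only as an intuition aid for what a "low complexity" label in the figure denotes.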

Supplemental Figure 12. Structures of pearl oyster shematrin proteins. For comparison, we also examined the repetitive low complexity domains in the most well characterized shematrins, Shematrins 1-7 from Pinctada fucata (panels a-g). Labeling as in figure 10. Each protein is divided into two segments for clarity.

Supplemental Figure 13. Temptins and similar proteins. Yellow-highlighted sequence labels mark the two temptin proteins from the sea hare Aplysia and the two highly expressed temptin-like proteins from zebra mussel mantle tissue, shown in a multiple alignment with proteins containing similar domains. Sequence labels show GenBank accessions/sequence IDs followed by short names for annotated proteins (DBH = dopamine β-hydroxylase, MoxD1 = monooxygenase DBH-like 1, PHM = peptidylglycine α-hydroxylating monooxygenase). Dreissena polymorpha genes that show high tissue specificity are labeled with the tissue of highest expression and the t value.
The alignment starts at position 1 of the calcium-binding EGF (cbEGF)-like domain. The region near the N-terminus, including the signal peptide, is not shown. Locations of the three major domains, determined from NCBI CD searches, are labeled above the sequences. In the cbEGF-like domain, the calcium-binding loop region is shaded in grey. Conserved residues flanking and stabilizing this region are marked (the two W residues with green arrows, and the disulfide bond between C residues with black arrows). All the DBH-like proteins contain both of the copper-binding Mox type II domains and extend about 400 residues beyond their cbEGF-like domains.
The temptins and temptin-like proteins lack the Mox domains and are much shorter, extending only about 30 residues beyond the cbEGF-like domain. The PHM-like proteins have both Mox domains but lack the cbEGF-like domain. Finally, all of the mollusk DBH-like proteins, as well as the temptins and temptin-like proteins, have cbEGF-like domains containing multiple residues that are conserved among the bivalves and the gastropod Aplysia.
Supplemental Figure 14. Temptin and DBH-like proteins in Aplysia. Aplysia californica temptin was used in tBLASTn searches restricted to A. californica, and six of the top hits were aligned to temptin. The alignment produced results similar to those described above.

Figure 15. Foot gene expression during byssal thread formation. a) Schematic depicting experimental design; byssal threads were severed at day zero, and dissected foot tissue was collected on days zero, four, and eight (n = 4 animals per condition). b) Gene expression changes (log2 fold-change) relative to the day zero time point. c) List of up- and down-regulated genes in the foot at the day-four time point.
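As a reading aid for panel b, the sketch below shows how a log2 fold-change relative to the day zero time point can be computed. The gene names, expression values, and the pseudocount are hypothetical and chosen only for illustration; the caption does not state which software or normalization produced the plotted values, so this is not presented as the method used here.

```python
import math


def log2_fold_change(value, baseline, pseudocount=1.0):
    """log2 ratio of an expression value to its baseline.

    The pseudocount (an assumption of this sketch) avoids division by zero
    and damps ratios for very low expression values.
    """
    return math.log2((value + pseudocount) / (baseline + pseudocount))


# Hypothetical mean expression per gene, keyed by sampling day (0, 4, 8).
expression = {
    "gene_A": {0: 10.0, 4: 43.0, 8: 21.0},
    "gene_B": {0: 31.0, 4: 7.0, 8: 15.0},
}

for gene, by_day in expression.items():
    baseline = by_day[0]
    for day in (4, 8):
        lfc = log2_fold_change(by_day[day], baseline)
        print(f"{gene} day {day}: log2FC = {lfc:+.2f}")
```

On this scale a value of +1 means expression doubled relative to day zero and a value of -1 means it halved, so up- and down-regulated genes appear symmetrically around zero.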