Phylogenetics and environmental distribution of nitric oxide-forming nitrite reductases reveal their distinct functional and ecological roles

Abstract
The two evolutionarily unrelated nitric oxide-producing nitrite reductases, NirK and NirS, are best known for their redundant role in denitrification. They are also often found in organisms that do not perform denitrification. To assess the functional roles of the two enzymes and to address the sequence and structural variation within each, we reconstructed robust phylogenies of both proteins with sequences recovered from 6973 isolate and metagenome-assembled genomes and identified 32 well-supported clades of structurally distinct protein lineages. We then inferred the potential niche of each clade by considering other functional genes of the organisms carrying them as well as the relative abundances of each nir gene in 4082 environmental metagenomes across diverse aquatic, terrestrial, host-associated, and engineered biomes. We demonstrate that Nir phylogenies recapitulate ecology distinctly from the corresponding organismal phylogeny. While some clades of the nitrite reductase were equally prevalent across biomes, others had more restricted ranges. Nitrifiers make up a sizeable proportion of the nitrite-reducing community, especially for NirK in marine waters and dry soils. Furthermore, the two reductases showed distinct associations with genes involved in oxidizing and reducing other compounds, indicating that the NirS and NirK activities may be linked to different elemental cycles. Accordingly, the relative abundance and diversity of NirS versus NirK vary between biomes. Our results show the divergent ecological roles that NirK- and NirS-encoding organisms may play in the environment and provide a phylogenetic framework to distinguish the traits associated with organisms encoding the different lineages of nitrite reductases.

generating new clades from previously proposed clades that were not well-supported in the present phylogeny. Thus, clades at a similar level of the number-letter naming hierarchy do not necessarily originate at similar depths in the phylogeny, and we refer to these well-supported groups of sequences as clades regardless of their number-letter hierarchy level.
Although these Pyrobaculum sequences carry all the conserved motifs, we excluded them from our phylogeny because the haem d1 and cyt c domains are encoded in opposite directions in the genome, which led to an unresolvable long branch upon re-orienting and concatenating them.
Halobacteriota NirS-like proteins were included as an outgroup and excluded from nirS counts in the metagenome survey because they lack the first two characteristic motifs corresponding to the cyt c domain. Using the previously described search for genes encoding enzymes involved in NirS assembly, combined with the NCBI genome viewer to look for potential alternative haem assembly proteins [3] and cyt c domain-containing motifs, we found no evidence that these proteins are cyt cd1 NirS. An additional reason for excluding this clade from nirS gene fragment counts is that the absence of the cyt c domain led to anomalous behaviour of the search-and-place algorithm. An excess of nirS reads was placed in this gapped region of the alignment and subsequently annotated as haloarchaeal, despite being derived from habitats such as forest soils, where Halobacteriota are rare and where Halobacteriota nirK reads were below detection. Furthermore, we took a subset (n = 20) of the reads that were annotated as haloarchaeal nirS but mapped in their entirety to this 75-aa gap at the beginning of the alignment, and searched them with BLAST against the UniProt database. None of them most closely matched archaea; instead, most (18; 90%) matched bacterial cytochrome c or cytochrome c oxidase rather than nitrite reductase, often with at least 60% identity (13; 65%). By contrast, short reads from this N-terminal region that GraftM placed in the canonical nirS portion of the tree correctly mapped to proteobacterial nitrite reductase or cytochrome c.
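The criterion "mapped in their entirety to this 75-aa gap" can be illustrated with a minimal Python sketch. It assumes each read is available as a gapped string in alignment coordinates and that the gap spans the first 75 columns. The function names, gap characters, and toy reads are hypothetical and are not taken from the GraftM output format.

```python
GAP_CHARS = set("-.")  # assumed gap symbols in the aligned reads


def occupied_columns(aligned_read: str) -> list[int]:
    """Return the 0-based alignment columns holding a residue (not a gap)."""
    return [i for i, ch in enumerate(aligned_read) if ch not in GAP_CHARS]


def maps_entirely_to_region(aligned_read: str, start: int = 0, end: int = 75) -> bool:
    """True if every residue of the read lies in columns [start, end)."""
    cols = occupied_columns(aligned_read)
    return bool(cols) and cols[0] >= start and cols[-1] < end


# Toy aligned reads (hypothetical sequences, for illustration only).
reads = {
    "read_a": "-" * 10 + "MKCAGCHG" + "-" * 100,   # columns 10-17: inside the region
    "read_b": "-" * 70 + "ACDEFGHIKL" + "-" * 38,  # columns 70-79: straddles the boundary
    "read_c": "-" * 90 + "ACDEFG" + "-" * 22,      # columns 90-95: outside the region
}

flagged = [name for name, aln in reads.items() if maps_entirely_to_region(aln)]
print(flagged)
```

Reads flagged this way would be the candidates for a follow-up BLAST check of the kind described above. A read that only partly overlaps the region is deliberately not flagged.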

Structural features and nitrite reductase helper genes
We searched for nirF, nirN, nirJ, nirE, nirB and nirT in assemblies carrying nirS, and nirV in assemblies carrying nirK. The seed alignments for NirJ (TIGR04051, TIGR04055, TIGR04054) and NirE (cd11642) were derived from NCBI's Conserved Domain Database. Seed alignments for NirF and NirN were derived from the original NirS search, and were readily differentiated from NirS using a phylogenetic approach. The HMMs for NirB (UniProt P24037), NirT (UniProt P24038) and NirV (NCBI AAK08123.1) were generated using protein BLAST searches with the aforementioned reference sequences against NCBI's ClusteredNR database [5]. We randomly selected a subset of the 1000 top hits to be aligned and exported using the multiple alignment function accessible from the BLAST outputs, and then checked in ARB [6] whether the sequences were aligned at the important ligand and catalytic residues. We searched for the structural features TAT, lipobox, and Sec-type signal peptides using the online version of SignalP 6.0 [7] and for transmembrane domains using DeepTMHMM [8], both with default settings. Clade-specific insertions and deletions [9], and cytochrome c (-CX₂CHX₅₀M-) and cupredoxin (CX₂₋₄HX₂₋₄M) motifs in the C and N termini of the proteins were identified in ARB [6].
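The two motif patterns can also be written as regular expressions, shown here as a minimal Python sketch. The motifs in this study were identified in ARB, so this is an illustration and not the procedure used. Treating the 50-residue spacer in the cytochrome c motif as exactly 50 residues is an assumption (the spacing may be approximate in practice), and the helper name and toy sequences are hypothetical.

```python
import re

# Cytochrome c motif: C, any 2 residues, C, H, a 50-residue spacer, then M.
# Assumption: the spacer is taken as exactly 50 residues.
CYT_C = re.compile(r"C.{2}CH.{50}M")

# Cupredoxin motif: C, 2-4 residues, H, 2-4 residues, M.
CUPREDOXIN = re.compile(r"C.{2,4}H.{2,4}M")


def find_motif(pattern: re.Pattern, protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(protein)]


# Toy sequences (hypothetical, built only to contain one motif each).
cyt_c_like = "AAA" + "CAACH" + "G" * 50 + "M" + "AAA"
cupredoxin_like = "GG" + "CAAAHGGM" + "GG"

cyt_hits = find_motif(CYT_C, cyt_c_like)
cup_hits = find_motif(CUPREDOXIN, cupredoxin_like)
print(cyt_hits[0][0], len(cyt_hits[0][1]))
print(cup_hits)
```

To restrict the search to one terminus, the same function could be applied to a slice of the sequence, for example the first or last 100 residues. That window size is likewise an assumption.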

Methods Figure 1: Complete denitrification refers to the potential to transform nitrite into dinitrogen via NO (produced by Nir), N₂O (Nor) and finally N₂ (Nos). Incomplete denitrification in organisms encoding NirK and/or NirS refers to the presence of Nir+Nor (green), Nir+Nos (orange), or the presence of just Nir (yellow).
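The categories in this caption amount to a simple decision rule on the presence of the three enzymes. The Python sketch below is illustrative only: the boolean inputs and the label strings are hypothetical, and how gene presence is called in a genome is outside its scope.

```python
def classify_denitrification(has_nir: bool, has_nor: bool, has_nos: bool) -> str:
    """Assign a genome to a denitrification category from enzyme presence.

    has_nir: NirK and/or NirS present; has_nor: Nor present; has_nos: Nos present.
    Label strings are illustrative; colours follow the figure caption.
    """
    if not has_nir:
        return "no Nir"  # outside the categories defined in the caption
    if has_nor and has_nos:
        return "complete (Nir+Nor+Nos)"
    if has_nor:
        return "incomplete: Nir+Nor (green)"
    if has_nos:
        return "incomplete: Nir+Nos (orange)"
    return "incomplete: Nir only (yellow)"


for flags in [(True, True, True), (True, True, False),
              (True, False, True), (True, False, False)]:
    print(flags, "->", classify_denitrification(*flags))
```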

Fig. S3 Relative abundance of nirS clades in metagenomes from various biomes. Basal

Fig. S4 Relative abundance of nirK clades in metagenomes from various biomes. Basal