Abstract

Spinosyns, a novel class of insect active macrolides produced by Saccharopolyspora spinosa, are used for insect control in a number of commercial crops. Recently, a new class of spinosyns was discovered from S. pogona NRRL 30141. The butenyl-spinosyns, also called pogonins, are very similar to spinosyns, differing in the length of the side chain at C-21 and in the variety of novel minor factors. The butenyl-spinosyn biosynthetic genes (bus) were cloned on four cosmids covering a contiguous 110-kb region of the NRRL 30141 chromosome. Their function in butenyl-spinosyn biosynthesis was confirmed by a loss-of-function deletion, and subsequent complementation by cloned genes. The coding sequences of the butenyl-spinosyn biosynthetic genes and the spinosyn biosynthetic genes from S. spinosa were highly conserved. In particular, the PKS-coding genes from S. spinosa and S. pogona have 91–94% nucleic acid identity, with one notable exception. The butenyl-spinosyn gene sequence codes for one additional PKS module, which is responsible for the additional two carbons in the C-21 tail. The DNA sequence of spinosyn genes in this region suggested that the S. spinosa spnA gene could have been the result of an in-frame deletion of the S. pogona busA gene. Therefore, the butenyl-spinosyn genes represent the putative parental gene structure that was naturally engineered by deletion to create the spinosyn genes.

Introduction

Recently, a new naturally occurring series of insect active compounds was discovered from a novel soil isolate, Saccharopolyspora pogona NRRL30141. The culture produced a unique family of over 30 new spinosyns [21]. These butenyl-spinosyns, also called pogonins, have a butenyl substitution at the 21 position on the spinosyn lactone (Fig. 1). The previously known spinosyns produced by S. spinosa [20] were substituted with ethyl or methyl at C-21 (Fig. 1).
Fig. 1

Structures of spinosyn A and butenyl-spinosyn

The spinosyn molecules have a unique tetracyclic macrolide base with two reduced sugars, forosamine and tri-O-methylrhamnose, which are required for bioactivity [20]. Spinosyns are highly potent natural insect control agents that have been commercialized under the trademark Naturalyte®. Naturalyte® insect control has been used since 1997 for the control of chewing insects on a variety of crops [32]. Spinosyn formulations were recently approved for use on organic crops (Entrust®) and for animal health applications (Elector®).

The biosynthetic genes for spinosyn production include 19 genes encoded on 80 kb of S. spinosa genomic DNA [35]. The spinosyn gene cluster included five genes encoding a large PKS, four unique genes involved in cross-bridging of the polyketide lactone and ten genes involved in sugar biosynthesis. Because of the unique tetracyclic structure of spinosyns, the spinosyn genes have recently been the subject of a number of investigations into the mechanisms of polyketide biosynthesis [9, 15, 16, 22, 23, 29].

In addition to their butenyl tail at C-21, the butenyl-spinosyns have a number of distinct variations from the published spinosyn factors [21]. The unique spinosyns in the butenyl-spinosyn series include nonforosamine sugars at C-17, hydroxylation at C-8 and C-24, and a tridecenolactone spinosyn (14-membered lactone). Therefore, we expected that the biosynthetic genes could reveal some interesting variations from spinosyn biosynthesis. We report here the cloning and sequencing of the genes for biosynthesis of butenyl-spinosyns. The biosynthetic origin of the butenyl-spinosyn butenyl tail suggests an example of natural genetic engineering by homologous recombination.

Materials and methods

Microbial strains and growth conditions

Escherichia coli DH5α-MRC+ (Gibco BRL, Gaithersburg, MD, USA), used for DNA cloning, was grown on LB agar (BD, Franklin Lakes, NJ, USA) and in Terrific Broth [2] + 0.4% v/v glycerol (TB). When used, apramycin (am, Sigma Chemical Co., St. Louis, MO, USA) was added to LB and TB at 100 mg/L. E. coli ATCC 47055 was obtained from ATCC (Manassas, VA, USA). S. pogona NRRL 30141 is a novel soil isolate [21]; S. pogona NRRL 30421 was derived from S. pogona NRRL 30141 through mutagenesis [17]. For genomic DNA isolation, S. pogona NRRL 30141 was grown in INV-2 medium (9.0 g/L dextrose, 30 g/L trypticase soy broth, 3.0 g/L yeast extract, 2.0 g/L magnesium sulfate·7H2O). For fermentation, S. pogona or derivative cultures were grown, extracted and analyzed by LC/MS according to Hahn et al. [17].

Molecular methods

Unless specifically listed, standard protocols for DNA manipulations were used [3]. Chromosomal DNA was isolated using a Genomic DNA purification kit (Qiagen Inc., Valencia, CA, USA) and cosmid DNA was isolated using the NucleoSpin Nucleic Acid Purification Kit (CLONTECH Laboratories, Inc., Palo Alto, CA, USA). S. spinosa or S. pogona DNA probes were PCR amplified using the AmpliTaq DNA Polymerase Kit (Perkin Elmer/Roche, Branchburg, NJ, USA) in a 48-sample DNA Thermal Cycler (Perkin Elmer Cetus) under the following cycle conditions: (1) 94 °C, 1 min; 55 °C, 2 min; 72 °C, 3 min; 25 cycles and (2) 72 °C, 10 min; 1 cycle. PCR products were gel-extracted utilizing Qiagen II Gel Extraction Kit (Qiagen Inc.). DNA probes were random-prime labeled with 50 μCi [α-32P]dCTP (3,000 Ci/mmol) using 4 μl High Prime reaction mixture (Boehringer Mannheim, Mannheim, Germany). Separation of unincorporated nucleotides from radiolabeled DNA probes was performed using NucTrap Push Columns (Stratagene, La Jolla, CA, USA). Approximately 2.0 × 10^7 cpm were added to membranes for all DNA hybridizations. All probes were hybridized for 16 h in a 65 °C shaking water bath. Membranes hybridized with radiolabeled probes spnF, spnS, and spnE (TE) were washed under medium-stringency conditions: (1) 15 min, room temperature in 300 ml 3× SSC/0.5% SDS, (2) 30 min, 65 °C shaking in 300 ml fresh 3× SSC/0.5% SDS, (3) 30 min, room temperature in 300 ml 1× SSC/0.5% SDS. Membranes screened with the radiolabeled probe derived from S. pogona cosmid 9D3 sequence were washed under stringent conditions: (1) 30 min, 65 °C shaking in 300 ml fresh 1× SSC/0.5% SDS, (2) 30 min, 65 °C shaking in 300 ml fresh 0.33× SSC/0.5% SDS, (3) 30 min, 65 °C shaking in 300 ml fresh 0.1× SSC/0.5% SDS.
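The difference between the medium-stringency and stringent wash series can be made concrete by converting each SSC dilution into an approximate sodium ion concentration. The sketch below assumes the conventional composition of 1× SSC (0.15 M NaCl plus 0.015 M trisodium citrate, about 0.195 M Na+), which is not stated in the text; the variable and function names are illustrative only.

```python
# Assumption (not stated in the text): 1x SSC = 0.15 M NaCl + 0.015 M trisodium citrate.
NA_PER_1X_SSC = 0.15 + 3 * 0.015  # mol/L Na+ contributed by 1x SSC

# (SSC strength, temperature) for each wash step, as listed in the methods.
# "RT" marks a room-temperature wash; numbers are degrees Celsius.
MEDIUM_STRINGENCY = [(3.0, "RT"), (3.0, 65), (1.0, "RT")]
HIGH_STRINGENCY = [(1.0, 65), (0.33, 65), (0.1, 65)]


def sodium_molarity(ssc_strength):
    """Approximate Na+ molarity of an SSC dilution (ignores Na+ from the 0.5% SDS)."""
    return round(ssc_strength * NA_PER_1X_SSC, 4)


def final_wash_sodium(series):
    """Na+ molarity of the last wash in a series."""
    return sodium_molarity(series[-1][0])


if __name__ == "__main__":
    for name, series in [("medium", MEDIUM_STRINGENCY), ("stringent", HIGH_STRINGENCY)]:
        for strength, temp in series:
            print(f"{name}: {strength}x SSC at {temp} -> {sodium_molarity(strength)} M Na+")
```

Under this assumption the stringent series ends at roughly one tenth of the salt concentration of the medium-stringency series and stays at 65 °C throughout, which is why it tolerates fewer mismatches between probe and target.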

Construction of S. pogona cosmid libraries

Total cellular DNA isolated from S. pogona was partially digested with Sau3AI and cloned into the BamHI site of cosmid pOJ436 [5]. Insert sizes of the constructed cosmid clones ranged from 20 to 40 kb. Cosmid clones were packaged in vitro using Gigapack III Gold Packaging Extract (Stratagene) and E. coli transductants were spotted in duplicate onto Hybond N+ (Amersham Pharmacia Biotech, Piscataway, NJ, USA) nucleic acid binding membranes. Membranes were supported on LB agar plates + am and incubated overnight at 37 °C. Membranes were processed according to the manufacturer’s protocols and DNA was cross-linked to the membrane with 1,200 μJ using a UV Stratalinker 1800 (Stratagene).

Screening of S. pogona library and identification of cosmids containing butenyl-spinosyn biosynthetic genes

The 16S rRNA genes of S. spinosa and S. pogona have 98% DNA sequence identity (D.R. Hahn et al., manuscript in preparation). Because of this phylogenetic similarity between the producing strains and the high structural similarity between the two families of spinosyns, it was expected that the butenyl-spinosyn gene cluster would be highly similar to the spinosyn gene cluster. Therefore, three DNA probes based upon unique spinosyn biosynthetic genes (spn) or domains from S. spinosa [35] were synthesized. The three probes were spaced to maximize the chance of cloning the entire butenyl-spinosyn biosynthetic gene cluster (Fig. 2): (1) the spnS gene probe (forward primer 5′-GTGCCGAATACGCGAAGGTC-3′; reverse primer 5′-TCCAGGAAGGTATTCCGCGC-3′) was at the left end of the spinosyn biosynthetic cluster, (2) the spnE thioesterase domain (TE) probe (forward primer 5′-TCCCGATGCCTGGATTCATTG-3′; reverse primer 5′-CGTCCATCATCGAGAAGTGGTC-3′) was at the right end and (3) the spnF gene probe (forward primer 5′-GCGACAACGCGATCCAGATC-3′; reverse primer 5′-CCATGTCGTGGGCATATTTCTC-3′) was in the central portion of the spn genes. Eight cosmid clones were identified as positively hybridizing to S. spinosa probes spnS, spnF or spnE (TE) (Fig. 2).
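As a quick sanity check on the probe primers listed above, their lengths, GC content and a rule-of-thumb melting temperature can be tabulated. This is only a sketch: the primer sequences are copied from the text, but the dictionary labels are illustrative, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is a rough approximation that the authors do not say they used.

```python
# Primer sequences copied from the text; the dictionary keys are illustrative labels.
PRIMERS = {
    "spnS_fwd": "GTGCCGAATACGCGAAGGTC",
    "spnS_rev": "TCCAGGAAGGTATTCCGCGC",
    "spnE_TE_fwd": "TCCCGATGCCTGGATTCATTG",
    "spnE_TE_rev": "CGTCCATCATCGAGAAGTGGTC",
    "spnF_fwd": "GCGACAACGCGATCCAGATC",
    "spnF_rev": "CCATGTCGTGGGCATATTTCTC",
}


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb Tm in deg C: 2*(A+T) + 4*(G+C). A rough guide for short oligos only."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, Wallace Tm ~{wallace_tm(seq)} C")
```

The GC-rich primers are consistent with the high GC content expected of Saccharopolyspora DNA; the estimated values are not a substitute for the 55 °C annealing step actually used.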
Fig. 2

A graphical comparison of the spinosyn biosynthetic genes from S. spinosa (top) and the butenyl-spinosyn biosynthetic genes from S. pogona (bottom). Colors of the genes involved in spinosyn or butenyl-spinosyn biosynthesis correspond to their function (polyketide synthase and aglycone bridging genes are shown in black; rhamnose biosynthesis genes are shown in red; forosamine biosynthesis genes are shown in green). Genes from S. spinosa which are not involved in spinosyn biosynthesis are shown in orange. Non-bus genes and sequences unique to S. pogona are shown in blue. Approximate extents of cosmid clones are indicated at the bottom of the figure. Vertical bars indicate the location of primers used for cloning: the three primers based on spn genes are designated in gray; the one primer based on bus genes is designated in yellow. Cosmids which were sequenced are shown in bold and cosmids which extended beyond the sequenced region are indicated as arrows

The eight cosmid clones were further characterized by restriction digestion, and the ends of the insert of each cosmid were sequenced so that putative genes could be surmised from comparison to the spn genes. Cosmids 8H3, 9D3, and 10C1 covered the maximal amount of the butenyl-spinosyn gene cluster (designated bus for butenyl-spinosyn) and the NRRL 30141 chromosome (Fig. 2). The insert in cosmid 8H3 was 40.3 kb and hybridization indicated homology to both the spnS and spnF genes. The right end sequence was in a ketosynthase (KS) domain which was similar to several spn KS domains. The left end sequence had no similarity to any known gene of S. spinosa, indicating that this cosmid extended beyond the homology to the spnS gene. Cosmid 9D3, which hybridized to the spnF probe, had a 31.7-kb insert. The left end sequence was homologous to the spnG gene and the right end was highly similar to the KS domain of module 5 in the spnD PKS gene. The insert in cosmid 10C1 was 40.6 kb and hybridized only to the spnE (TE) probe. The left end sequence of cosmid 10C1 was homologous to the KS6 domain of the spnD PKS gene and the right end had no S. spinosa homology. From this gene homology/domain order (spnS, spnG, spnF, KS?, KS5, KS6, TE) it appeared that the butenyl-spinosyn genes were collinear with the spinosyn genes (Fig. 2), although the 32-kb distance between the spnG gene and the KS5 domain (cosmid 9D3) was approximately 5 kb longer than in the spn PKS genes. It was also apparent that approximately 5 kb of the butenyl-spinosyn biosynthetic genes had not been cloned (between cosmids 9D3 and 10C1).

In order to identify a clone spanning the region between cosmids 9D3 and 10C1, one additional probe based on the end sequence of cosmid 9D3 was synthesized (forward primer 5′-CGTACGTGGCGATCAG-3′; reverse primer 5′-GTCCAAGTTTCGGTTGCGTTC-3′). Using high-stringency hybridization conditions, three additional cosmids were identified from the genomic library (Fig. 2). Cosmid 9F4 had a 36.9-kb insert and the right end sequence was homologous to the KS-AT domains of module 9 in the spnE gene. The left end had homology to the ER domain of module 2 of the spnB gene. The left end sequence also had some DNA bounded by a Sau3AI site which was not similar to the spn genes. It is assumed that the 9F4 cosmid had a small second insert of noncontiguous S. pogona DNA.

DNA sequencing

Nucleotide sequence from the cosmid/vector junctions was obtained by fluorescent cycle sequencing according to the methods of Burgett and Rosteck [8] under thermal cycler conditions: 96 °C, 30 s; 50 °C, 15 s; 60 °C, 4 min; 25 cycles with a 377 ABI Prism Sequencer (Applied Biosystems, Inc., Foster City, CA, USA). The complete sequences of cosmids 8H3, 9D3, 9F4, and 10C1 were determined at SeqWright, Inc. (Houston, TX, USA). The cosmid clones represented over 110 kb of S. pogona genomic DNA. For ease of analysis, the sequence was divided into two segments [18]: Seq. ID no. 1 (GenBank accession number: AX600586) included the start codon of busA (+1 in Fig. 2) and all DNA to the 3′ of that, which included the five PKS genes. Seq. ID no. 2 (GenBank accession number: DQ087286) began at the base before the busA start codon and included all DNA to the 5′ side of that base.
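The division of the sequence into two records at the busA start codon can be illustrated with a small helper. This is a sketch on a toy sequence, not the deposited data; the function name and the 0-based coordinate convention are assumptions made here for illustration.

```python
def split_at_start_codon(sequence, start_index):
    """Split a sequence at a start codon.

    start_index is the 0-based position of the first base of the start codon.
    Returns (upstream, downstream): downstream begins with the start codon
    (analogous to Seq. ID no. 1), and upstream ends at the base immediately
    before it (analogous to Seq. ID no. 2).
    """
    if not 0 <= start_index <= len(sequence) - 3:
        raise ValueError("start_index must leave room for a full codon")
    return sequence[:start_index], sequence[start_index:]


if __name__ == "__main__":
    toy = "GGCCTTAAGG" + "ATGACCGAAT"  # toy sequence, not real busA data
    upstream, downstream = split_at_start_codon(toy, 10)
    print(upstream, downstream)
```

The two pieces concatenate back to the original sequence with no shared or missing base, which is the property the +1 coordinate in Fig. 2 relies on.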

Transformation of S. pogona

Cosmid 8H3 (Fig. 2) and plasmids derived from pOJ260 [5] were transferred from E. coli ATCC 47055 into S. pogona NRRL 30141 or NRRL 30421 by conjugal transfer [25].

Gene disruption of busO in S. pogona

A pair of oligonucleotides (busOa, 5′-TAGAAGGCCTGCAGGTCGAGAC-3′; and busOb, 5′-TAGTTGGCCACACTGCACTGGACC-3′) was used to amplify a 912-bp region internal to the 1,457-bp busO gene using FailSafe PCR (Epicentre, Madison, WI, USA), and the product was cloned into pCRII (Invitrogen, Carlsbad, CA, USA). The resulting plasmid was digested with EcoRI and the busO fragment was cloned into the EcoRI site of pOJ260 [25]. The resultant plasmid was transferred from E. coli ATCC 47055 into a derivative of S. pogona by conjugal transfer [25]. Six independent amR exconjugants were fermented and analyzed for production of butenyl-spinosyn and derivatives.

Results and discussion

The DNA sequence of the butenyl-spinosyn biosynthetic genes

The polyketide synthase (PKS) genes

The S. pogona DNA sequence AX600586 included a region of about 60 kb with striking homology to the DNA encoding the polyketide synthases of known macrolide producers [11, 14, 26]. The butenyl-spinosyn PKS DNA region consisted of five large open reading frames (ORFs) with in-frame stop codons at the end of ACP domains, similar to the PKS ORFs in the other macrolide-producing bacteria. The five butenyl-spinosyn PKS genes were arranged head-to-tail (see Fig. 2), without any intervening non-PKS functions such as the insertion element found between the erythromycin PKS genes AI and AII [12]. The PKS genes are designated busA, busB, busC, busD, and busE.

The boundaries and functions of the 12 modules and 55 domains identified in the bus PKS genes were predicted based on similarities to the conserved amino acid sequences of the domains in other polyketide synthases, particularly the erythromycin polyketide synthase [13]. Each of the five bus PKS genes encodes one or more PKS modules (Fig. 3a). Each module has all the functional domains required for addition of a single two carbon ketide unit to the polyketide chain. It would require 12 ketide units to construct the putative butenyl-spinosyn polyketide precursor (Fig. 3b, c). The modules and domains of the bus PKS are encoded in the same order in which they would be used in the biosynthesis of the polyketide (Fig. 3).
Fig. 3

Model of the butenyl-spinosyn polyketide synthase and its polyketide product. a Bus polyketide synthase. The extent of the five PKS genes encoding the five proteins of the butenyl-spinosyn PKS is represented by arrows at the top and the extent of each of the 12 PKS modules is indicated by bars below. Functional domains are represented by circles which are color-coded and labeled by function. Abbreviations of domains: KS ketosynthase, AT acyltransferase, ACP acyl carrier protein, DH dehydratase, KR ketoreductase, ER enoylreductase, TE thioesterase. The first KS is distinctive in that it is the KSQ loading domain for the PKS. b The putative polyketide predicted from the bus PKS is shown. The carbons added and the reduction due to each module are indicated (M2 module 2, etc.). c The butenyl-spinosyn aglycone. Carbons resulting from each module are indicated with arrows. Important carbons where modification or variations occur are numbered

Like the spn PKS, the bus PKS has a KSQ domain at the amino terminus of the loading module. It is expected that this KSQ domain cannot function as a β-ketosynthase because it contains a glutamine residue at amino acid 172 (Table 1), in place of the cysteine required for β-ketosynthase activity [31]. It has been reported that KSQ domains function to decarboxylate malonyl-ACP and are chain initiation factors [7]. None of the other butenyl-spinosyn PKS domains contains the sequence characteristics of the inactive domains found in the erythromycin and rapamycin PKS genes [1, 14].

Sequence of KS & KSQ Domains in the BusA & SpnA polyketide synthases


Although busB-E are comparable in size and highly homologous to spnB-E (Table 2), busA is significantly larger (by 5,244 bp) than spnA. The first 4,245 bp (module L) and the last 3,486 bp (module 1a) of busA have many similarities to spnA. These similarities are readily picked up in a BLAST search using the busA gene, in which spnA is detected as the sequence in GenBank most similar to busA. However, bases 4,246–9,548 (module 1b) do not have direct counterparts in the spnA gene. The similarity between the module 1b domains and the spn genes is comparable to the similarity of module 1b to the domains of other PKSs such as the erythromycin PKS. This 5-kb region codes for an additional module with five functional domains: KS1b, AT1b, DH1b, KR1b, and ACP1b. These functions, together with the preceding initiation domain, appeared to be responsible for the biosynthesis of the butenyl side chain, which distinguishes butenyl-spinosyns from spinosyns.
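The size relationships quoted for busA and spnA can be checked with simple arithmetic. The sketch below uses only the ORF and segment lengths listed in Table 2; the variable names are illustrative, and the in-frame test is just a divisibility check, not evidence of how the difference arose.

```python
BUSA_BP = 13_032  # busA ORF length (Table 2)
SPNA_BP = 7_788   # spnA ORF length (Table 2)

# Lengths of the three busA segments listed in Table 2 (bp), keyed by domain span.
SEGMENTS = {
    "KSQ-KS1b": 4_245,
    "AT1b-KS1a": 5_301,   # region with no direct counterpart in spnA
    "AT1a-ACP1a": 3_486,
}

size_difference = BUSA_BP - SPNA_BP        # extra DNA in busA relative to spnA
segments_total = sum(SEGMENTS.values())    # should account for the whole busA ORF
in_frame = size_difference % 3 == 0        # a multiple of 3 preserves the reading frame

if __name__ == "__main__":
    print(f"busA - spnA = {size_difference} bp ({size_difference // 3} codons)")
    print(f"segments sum to {segments_total} bp; in-frame difference: {in_frame}")
```

The 5,244-bp difference is a whole number of codons (1,748), which is consistent with the in-frame relationship between the two genes proposed in this paper, and the three segment lengths add up exactly to the 13,032-bp busA ORF.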

Table 2

Similarity of bus PKS and spn PKS genes

| Butenyl-spinosyn gene | bus ORF length, bp (aa) | Bus functional domain (a) | Best match in A83543 spinosyn PKS | spn ORF length, bp (aa) | Spn functional domain (a) | ORF identity, DNA (%) | ORF identity, aa (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BusA | 13,032 (4,344) | | spnA | 7,788 (2,595) | | | |
| 1–4,245 | 4,245 (1,415) | KSQ-KS1b | 21,111–25,214 | 4,245 (1,415) | KSQ-KS1 | 92 | 91.2 |
| 4,246–9,548 | 5,301 (1,767) | AT1b-KS1a | None (b) | | | NA | |
| 9,549–13,032 | 3,486 (1,162) | AT1a-ACP1a | 26,407–28,896 | 3,486 (1,162) | AT1-ACP1 | 91 | 87.6 |
| BusB | 6,450 (2,149) | KS2-ACP2 | spnB | 6,459 (2,152) | KS2-ACP2 | 93 | 93.1 |
| BusC | 9,546 (3,167) | KS3-ACP4 | spnC | 9,513 (3,170) | KS3-ACP4 | 94 | 93.5 |
| BusD | 14,805 (4,935) | KS5-ACP7 | spnD | 14,787 (4,928) | KS5-ACP7 | 94 | 93.6 |
| BusE | 16,692 (5,564) | KS8-ACP10 | spnE | 16,767 (5,588) | KS8-ACP10 | 94 | 90.6 |

bp base pairs, aa amino acids, NA not applicable

(a) Functional domain names correspond to Fig. 3

(b) Similarity to S. spinosa PKS genes was in the same range as similarity to other like domains of the bus & spn PKS genes

The PKS proteins, which perform similar reactions in the biosynthesis of spinosyns and butenyl-spinosyns, share 87–93% amino acid identity, and the genes share 91–94% DNA sequence identity (Table 2). It should be noted that the spn PKS enzymes SpnB-E and the similar bus PKS enzymes BusB-E must maintain distinct substrate specificities because, although the reactions performed by the enzymes are identical, each of the substrate polyketides is slightly different. Formation of the lactone intermediate from the butenyl-spinosyn polyketide is catalyzed by the thioesterase in module 10 of BusE. This cyclization takes place between the hydroxyl at C-21 and the terminal carbonyl attached to the ACP of module 10 (Fig. 4a). Action of the thioesterase (TE) on the projected butenyl-spinosyn polyketide would yield the tetracyclic butenyl-spinosyn aglycone, which has a 12-membered lactone (Fig. 4a). One minor factor in butenyl-spinosyn fermentation, the tridecenolactone spinosyn A (TDL), has the 5-6-5 ring system characteristic of the spinosyns, but a 14-membered lactone [21]. The butenyl-spinosyn polyketide needed to form the lactone precursor to TDL spinosyn would require a hydroxyl at C-23. This polyketide precursor could be formed by an allylic rearrangement of the C-21 hydroxyl with the double bond at C-22/C-23; the hydroxyl would be displaced to C-23 and the double bond to C-21/C-22 (Fig. 4b). It is likely that this allylic rearrangement would be favored in vitro under acidic conditions (P. Graupner, personal communication); however, it is unclear how this is accomplished in vivo.
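The identity percentages quoted at the start of this discussion (Table 2) are of the kind obtained by counting matching positions in a pairwise alignment. The sketch below shows one common convention, in which columns containing a gap are skipped; the text does not say which convention or software was used, so the function and its denominator are assumptions, and the sequences are toy examples.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: one mismatch in ten ungapped columns.
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

Other conventions divide by the full alignment length or by the shorter sequence, so identity values from different tools are only comparable when the denominator is the same.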
Fig. 4

Cyclization of the butenyl-spinosyn polyketide. (a) The putative polyketide product of module 10 is shown covalently attached to the ACP10 cysteine. The residues involved in cyclization to form the 12-membered lactone are indicated by the red bracket. (b) The polyketide required for TDL spinosyn formation is shown at right with the residues required for 14-membered lactone formation indicated by the red bracket. The postulated allylic rearrangement around C-22 is indicated in blue

Genes adjacent to the PKS responsible for additional modifications

In the DNA upstream of the PKS genes (GenBank accession number DQ087286) there were 20 ORFs with the features of genes: each consists of at least 100 codons, begins with ATG or GTG, ends with TAA, TAG or TGA, and has the codon bias expected of protein-coding regions in an organism whose DNA contains a high percentage of guanine and cytosine residues [4]. These 20 ORFs are represented graphically in Fig. 2. The ORFs were compared directly to the sequence of the spinosyn biosynthetic genes from S. spinosa (GenBank accession number AY007564). The high degree of similarity in both the DNA and protein sequences was a strong indication that the genes perform similar functions in the biosynthesis of the two families of spinosyns. Therefore, 14 of the ORFs have been designated as butenyl-spinosyn biosynthetic genes, namely: busF, busG, busH, busI, busJ, busK, busL, busM, busN, busO, busP, busQ, busR, and busS (labeled F through S in Fig. 2). The letter designation of these bus genes was made to correspond to their spn gene counterparts (Table 3). Genes busG, busH, busI, and busK were highly similar to spn genes involved in tri-O-methylrhamnose biosynthesis [16, 35]. Likewise, genes busN, busO, busP, busQ, busR, and busS are putatively involved in forosamine biosynthesis like their spn gene counterparts [35, 37]. The remaining four genes, busF, busJ, busL, and busM, are projected as carbon-bridging genes [23, 29, 35]. The spn counterparts of these genes have been examined in depth elsewhere [16, 23, 29, 35, 37].
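The ORF criteria given above (a minimum length of 100 codons, an ATG or GTG start, a TAA, TAG or TGA stop, and GC-rich codon usage) can be expressed as a short scan. This is a minimal sketch, not the authors' analysis: it scans the forward strand only, the function names are illustrative, and third-position GC content is used here as a simple stand-in for the codon-bias test cited in the text.

```python
START_CODONS = {"ATG", "GTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=100):
    """Return (start, end, frame) for forward-strand ORFs.

    start is 0-based; end is exclusive and includes the stop codon. Each ORF
    runs from the first in-frame start codon after the previous stop to the
    next in-frame stop. Reverse-strand ORFs would need a second pass on the
    reverse complement.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # codons before the stop
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


def gc3_fraction(orf_seq):
    """Fraction of G or C at third codon positions (a crude codon-bias proxy)."""
    thirds = orf_seq.upper()[2::3]
    return sum(base in "GC" for base in thirds) / len(thirds)


if __name__ == "__main__":
    toy = "ATG" + "GCC" * 4 + "TAA"  # toy ORF of 5 codons plus a stop
    print(find_orfs(toy, min_codons=5), gc3_fraction(toy[:-3]))
```

In high-GC actinomycete DNA, genuine coding regions typically show a very high GC fraction at the third codon position, which is why a GC3-style measure is a reasonable first filter alongside the length and start/stop criteria.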

Table 3

DNA similarity of bus and spn biosynthetic genes

| Pogonin gene | Bus ORF length, bp (aa) | Spinosyn gene | Spn ORF length, bp (aa) | BLAST score | ORF identity, DNA (%) | ORF identity, aa (%) | Function reported in GenBank |
| --- | --- | --- | --- | --- | --- | --- | --- |
| busF | 828 (275) | spnF | 828 (275) | 1,247 | 94 | 91 | C-methylation |
| busG | 1,173 (390) | spnG | 1,173 (390) | 1,844 | 95 | 90 | Rhamnose glycosyltransferase |
| busH | 753 (250) | spnH | 753 (250) | 1,328 | 97 | 97 | Rhamnose methylation |
| busI | 1,188 (395) | spnI | 1,188 (395) | 1,966 | 96 | 92 | Rhamnose methylation |
| busJ | 1,620 (539) | spnJ | 1,620 (539) | 2,587 | 95 | 83 | Oxido-reduction |
| busK | 1,194 (397) | spnK | 1,194 (397) | 2,163 | 96 | 88 | Rhamnose methylation |
| busL | 852 (283) | spnL | 852 (283) | 2,274 | 94 | 94 | C-methylation |
| busM | 933 (310) | spnM | 963 (320) | 1,909 | 95 | 96 | C-bridging |
| busN | 999 (332) | spnN | 999 (332) | 1,772 | 96 | 91 | Forosamine synthesis |
| busO | 1,461 (486) | spnO | 1,461 (486) | 2,319 | 95 | 92 | Forosamine synthesis |
| busP | 1,314 (437) | spnP | 1,368 (455) | 2,004 | 94 | 89 | Forosamine glycosyltransferase |
| busQ | 1,344 (447) | spnQ | 1,389 (462) | 2,355 | 94 | 81 | Forosamine synthesis |
| busR | 1,137 (378) | spnR | 1,158 (385) | 1,852 | 95 | 89 | Sugar transamination |
| busS | 750 (249) | spnS | 750 (249) | 1,255 | 96 | 93 | Aminosugar methylation |

In addition, there were a number of ORFs found immediately downstream of busS (in cosmid 8H3) and three ORFs downstream of the PKS genes (in cosmid 10C1). To assign functions to the polypeptides identified, the amino acid sequences of the predicted polypeptides were compared to sequences deposited in the databases at the National Center for Biotechnology Information (NCBI, Bethesda, MD, USA), using the BLASTX algorithm to determine how closely they are related to known proteins. After BLAST analysis, the significant protein match presented in Table 4 for each ORF was selected as the sequence with the highest BLAST score for which there was direct experimental evidence supporting the stated function. In a few cases, no such confirmed sequences were available; those scores are presented in parentheses (Table 4).
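The selection rule described above (take the highest-scoring hit with direct experimental support, and otherwise report the best unconfirmed hit with its score in parentheses) can be sketched as follows. The `Hit` class, the function names and the example hits are hypothetical; they only illustrate the rule and are not taken from the authors' BLAST output.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    description: str
    score: float
    experimentally_confirmed: bool  # hypothetical flag set by manual curation


def select_reported_match(hits):
    """Return (hit, confirmed).

    Prefers the highest-scoring hit with experimental support; if none exists,
    falls back to the highest-scoring hit overall and reports confirmed=False.
    """
    if not hits:
        return None, False
    confirmed = [h for h in hits if h.experimentally_confirmed]
    pool = confirmed or hits
    best = max(pool, key=lambda h: h.score)
    return best, bool(confirmed)


def format_score(hit, confirmed):
    """Format the score as in Table 4: parentheses mark an unconfirmed match."""
    score = f"{hit.score:g}"
    return score if confirmed else f"({score})"


if __name__ == "__main__":
    hits = [
        Hit("hypothetical protein", 300, False),
        Hit("characterized methyltransferase", 124, True),
    ]
    best, ok = select_reported_match(hits)
    print(best.description, format_score(best, ok))
```

Note that this rule can report a lower-scoring hit than the top BLAST match, because a functionally characterized protein is preferred over a better-scoring but unverified annotation.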

Table 4

Putative functions of open reading frames linked to the bus genes

| Gene | Significant protein match | GenBank accession | BLAST score (a) | Reported function |
| --- | --- | --- | --- | --- |
| ORF LI | ngt N-glycosyltransferase (Saccharothrix aerocolonigenes) | AB023593 | 221 | Glycosyltransfer |
| ORF LIV | urdR hexose-4-ketoreductase (Streptomyces fradiae) | AF080235 | 243 | Hexose ketoreduction |
| ORF LVI | fkbM, FK506 O-methyltransferase | U65940 | 100 | Methyltransfer |
| ORF LVII | oleP, P450 monooxygenase (Streptomyces antibioticus) | L37200 | 387 | Monooxygenase |
| ORF LVIII | Transposase (Mycobacterium avium) | AF107207 | (180) | Transposition |
| ORF LIX | mmcR (Streptomyces lavendulae) | AF127374 | 124 | Methyl transfer |
| ORF RI | resolvase-like protein (Acidithiobacillus ferrooxidans) | U73041 | (97) | Transposition |
| ORF RII | hypothetical protein yvmC (Bacillus subtilis) | AF017113 | (120) | Unknown |
| ORF RIII | alcohol dehydrogenase [Streptomyces coelicolor A3(2)] | AL133236 | (155) | Alcohol dehydrogenase |

(a) Greater similarity is associated with higher BLAST scores (Altschul et al. 1990)

Complementation of a butenyl-spinosyn O-methylation mutation by cosmid 8H3

To test whether the cloned genes were indeed responsible for butenyl-spinosyn biosynthesis, cosmid 8H3 was introduced into a previously isolated butenyl-spinosyn biosynthetic mutant (NRRL 30421) with altered rhamnose methylation. If the cloned genes complemented the mutation, they would very likely be involved in butenyl-spinosyn production. S. pogona strain NRRL 30421 is a mutant of S. pogona NRRL 30141 that is unable to fully methylate the rhamnose on butenyl-spinosyns [17]. Strain NRRL 30421 accumulated 3′-O-desmethylrhamnosyl butenyl-spinosyn (3′-ODM; Fig. 5) and several additional butenyl-spinosyn factors that lack O-methylation at the 3′ position of the rhamnose [17]. This methylation defect is presumed to result from a mutation in one of the O-methyltransferases encoded by the busH, busI or busK genes. All three of these genes were cloned on cosmid 8H3 (Fig. 2).
Fig. 5

Structure of butenyl-spinosyns produced by NRRL 30141 and NRRL 30421. 3′-O-desmethylrhamnosyl butenyl-spinosyn (3′-ODM) is the primary metabolite of NRRL 30421. The butenyl-spinosyn pseudoaglycone (PSA) and 17-(3′′-O-methylglucosyl)-butenyl-spinosyn (MGB) are both minor metabolites of NRRL 30141

Cosmid 8H3 was transferred from E. coli ATCC 47055 into strain NRRL 30421 by conjugal transfer. Although this cosmid has a ϕC31 att site, cosmids transferred into S. spinosa by this method are preferentially integrated into the chromosome by homologous recombination [25]. Therefore, the S. pogona transformants are likely to have a duplication of the cloned segment in cosmid 8H3, separated by the plasmid. Two independent isolates transformed with cosmid 8H3 were fermented and analyzed for production of butenyl-spinosyn and 3′-ODM.

While NRRL 30421 produced predominantly 3′-ODM, strains of NRRL 30421 containing cosmid 8H3 produced mostly butenyl-spinosyn (Table 5). The production of butenyl-spinosyn and 3′-ODM in NRRL 30421 containing cosmid 8H3 was similar to the production in the nonmutant culture NRRL 30141 (Table 5). These results demonstrate that the genes present on cosmid 8H3 complemented the methylation defect in strain NRRL 30421 and restored production of fully methylated butenyl-spinosyn.

Table 5

Butenyl-spinosyns produced by S. pogona transformants

| Strain (genotype) | Pogonin (μg/ml) | 3′-ODM (μg/ml) | Ratio of compounds^b |
| --- | --- | --- | --- |
| NRRL 30421 (3′-ODM^a) | 0.7 | 1.0 | 0.7 |
| NRRL 30421 (3′-ODM)/8H3-42 | 8.9 | 0.5 | 17.8 |
| NRRL 30421 (3′-ODM)/8H3-45 | 3.0 | 0.1 | 30.0 |
| NRRL 30141 | 9.7 | 0.4 | 24.3 |

a 3′-ODM = mutation preventing methylation of rhamnose at 3′ position. The numbers 42 and 45 represent different isolates transformed with cosmid 8H3

b The ratio of compounds was determined by dividing the concentration of butenyl-spinosyn in each fermentation by the concentration of 3′-ODM
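
The ratio defined in footnote b can be recomputed from the two concentration columns. The short Python sketch below does this for the values reported for the transformants; the dictionary layout, the shortened strain labels and the function name are illustrative assumptions and were not part of the original analysis.

```python
# Concentrations (ug/ml) transcribed from the table of S. pogona transformants:
# (butenyl-spinosyn, 3'-ODM). Strain labels are shortened for readability.
concentrations = {
    "NRRL 30421": (0.7, 1.0),
    "NRRL 30421/8H3-42": (8.9, 0.5),
    "NRRL 30421/8H3-45": (3.0, 0.1),
    "NRRL 30141": (9.7, 0.4),
}


def compound_ratio(butenyl_spinosyn, odm):
    """Ratio of butenyl-spinosyn to 3'-ODM, as defined in footnote b."""
    return butenyl_spinosyn / odm


ratios = {
    strain: compound_ratio(*values) for strain, values in concentrations.items()
}

for strain, ratio in ratios.items():
    print(f"{strain}: {ratio:.2f}")
```

The recomputed ratios agree with the published column to within rounding. A ratio below 1 marks the methylation-deficient phenotype, whereas both complemented isolates fall in the same range as the nonmutant culture.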

Accumulation of butenyl-spinosyn precursor and shunt product caused by disruption of busO

In a second experiment to test whether the cloned genes were indeed responsible for butenyl-spinosyn biosynthesis, the cloned genes were used to construct a knock-out mutation in S. pogona NRRL 30141. As in S. spinosa, S. pogona is predicted to require six genes, busN, busO, busP, busQ, busR, and busS, for biosynthesis of forosamine and its addition to the butenyl-spinosyn pseudoaglycone (PSA; Fig. 5). Inactivation of any of these genes would be expected to disrupt formation of forosamine and prevent production of butenyl-spinosyn. The busO gene was inactivated by integration of a cloned internal fragment of busO, which resulted in partial duplication of the gene and yielded two truncated copies flanking the plasmid and antibiotic resistance gene.

The parental strain, S. pogona NRRL 30141, produced high levels of butenyl-spinosyn and low levels of 17-hydroxy butenyl-spinosyn (PSA; Table 6) and 17-(3′′-O-methylglucosyl)-butenyl-spinosyn (MGB; Fig. 5). Although butenyl-spinosyn was produced at high levels in NRRL 30141, butenyl-spinosyn could not be detected in any of the six busO mutants by LC/MS. This demonstrated that busO was required for biosynthesis of butenyl-spinosyn. The isolation of PSA from all busO mutants indicated that all butenyl-spinosyn biosynthetic genes not required for forosamine biosynthesis were functional. Levels of PSA were increased in all six mutants (Table 6), as would be predicted from a deficiency in forosamine supply. The levels of MGB, which has a sugar (3′′-O-methyl glucose) other than forosamine at C-17, also increased in the busO mutants. This suggested that the forosamyltransferase (BusP) encoded by the busP gene was functional in these busO mutants and could transfer other sugars to the butenyl-spinosyn PSA.

Table 6

Butenyl-spinosyns produced by S. pogona mutants

| Strain (genotype) | Butenyl-spinosyn^a | PSA^a | MGB^a |
| --- | --- | --- | --- |
| NRRL 30141 | 366.3 | 1.0 | 0.4 |
| NRRL 30141 busO65 | ND | 13.8 | 1.7 |
| NRRL 30141 busO67 | ND | 12.3 | 3.7 |
| NRRL 30141 busO68 | ND | 6.7 | 3.8 |
| NRRL 30141 busO70 | ND | 9.3 | 1.3 |
| NRRL 30141 busO71 | ND | 12.3 | 2.4 |
| NRRL 30141 busO72 | ND | 5.4 | 1.6 |

ND, not detected

a Amounts reported are relative to the concentration of PSA in NRRL 30141
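
Because all amounts are normalized to PSA in the parental strain, the change of each metabolite in a busO mutant relative to NRRL 30141 follows from dividing by the parental value in the same column. The Python sketch below illustrates this calculation with the reported values; the data structure and function name are hypothetical and serve only as a worked example.

```python
# Relative amounts transcribed from the table of S. pogona mutants.
# Butenyl-spinosyn was not detected (ND) in any busO mutant and is omitted.
parent = {"PSA": 1.0, "MGB": 0.4}  # NRRL 30141
mutants = {
    "busO65": {"PSA": 13.8, "MGB": 1.7},
    "busO67": {"PSA": 12.3, "MGB": 3.7},
    "busO68": {"PSA": 6.7, "MGB": 3.8},
    "busO70": {"PSA": 9.3, "MGB": 1.3},
    "busO71": {"PSA": 12.3, "MGB": 2.4},
    "busO72": {"PSA": 5.4, "MGB": 1.6},
}


def fold_change(mutant, reference):
    """Amount of each metabolite in the mutant divided by the parental amount."""
    return {metabolite: mutant[metabolite] / reference[metabolite]
            for metabolite in reference}


fold = {name: fold_change(values, parent) for name, values in mutants.items()}

for name, changes in fold.items():
    print(name, {metabolite: round(value, 1) for metabolite, value in changes.items()})
```

In this calculation every mutant shows a fold change greater than 1 for both PSA and MGB, which is the pattern described in the text for a block in forosamine supply.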

Genes responsible for minor butenyl-spinosyn metabolites

In spite of the high degree of DNA and amino acid similarity between some bus and spn genes, it should be noted that some of the bus gene products catalyze different reactions in the biosynthesis of butenyl-spinosyns relative to spinosyns. These differences are manifested in the distinct butenyl-spinosyn compounds that have been isolated from S. pogona.

All natural spinosyns are substituted at C-17 with forosamine or a specific forosamine isomer [20]. Butenyl-spinosyns, on the other hand, are substituted at C-17 with a wider range of forosamine isomers, as well as neutral sugars like amicetose, O-methyl-glucose, and O-methyloleandrose [21]. This C-17 glycosylation diversity relative to spinosyns requires biosynthetic enzymes to make the sugars and a glycosyltransferase capable of catalyzing these glycosylations. These sugars might be synthesized by specific synthase genes located near the bus genes or elsewhere in the chromosome, or they may be synthesized through alternate substrate specificity of the enzymes encoded by the listed butenyl-spinosyn biosynthetic genes. Amicetose could be produced by genes outside of the bus gene cluster or it may be a shunt product of an intermediate in the biosynthesis of forosamine as outlined in Fig. 6. Methyloleandrose (mole) could be synthesized from NDP-4-keto-2,6-dideoxy-D-glucose, an intermediate in the biosynthesis of forosamine (Fig. 6). Ketoreduction (putatively ORF LIV) and O-methylation of this precursor by the bus genes (busH, busI or busK) and/or other S. pogona genes (putatively ORF LVI) could lead to the biosynthesis of butenyl-spinosyn derivatives containing methyloleandrose (Fig. 6).
Fig. 6

Putative biosynthesis of alternate sugars using bus and linked genes. NDP-4-keto-2,6-dideoxy-D-glucose (the product of BusN or SpnN), shown in the box, is an intermediate in the biosynthesis of forosamine [35]. The product of the BusQ or SpnQ proteins is a putative unstable intermediate (brackets) based on deoxyhexose biosynthesis [33, 35]

The spinosyn forosamyl transferase, spnP, cloned into S. erythraea SGT2 (ery PKS-deleted strain) was shown to add the alternate sugars mycarose and D-glucose to the C-17 position of spinosyn [15]. Other glycosyl transferases exogenous to S. erythraea were unable to glycosylate spinosyn, indicating that the inherent spinosyn specificity of SpnP was required [15]. Likewise, we expect that the forosamyl glycosyltransferase, BusP, was responsible for attaching multiple sugars to the butenyl-spinosyn pseudoaglycone. This was supported by enhanced addition of methyl-glucose to butenyl-spinosyn in six busO mutants of S. pogona, all of which have an unaltered busP gene. This natural ability of the BusP glycosyl transferase to transfer both amino and neutral sugars is unique in secondary metabolite biosynthesis [15]. However, this evidence does not firmly rule out the involvement of a glycosyl transferase other than BusP, such as ORF LI, in the attachment of alternate sugars at C-17.

Several butenyl-spinosyn analogs produced by S. pogona are hydroxylated at C-8 or C-24 (Fig. 1) [21]. Macrolides can be hydroxylated after synthesis by P-450 monooxygenases, as in the hydroxylation at C-6 in erythromycin biosynthesis [36]. ORF LVII was highly similar to oleP, a P-450 monooxygenase involved in polyketide hydroxylation in oleandomycin production in Streptomyces antibioticus (Table 4). Therefore, it may be responsible for the hydroxylations at C-8 or C-24 of butenyl-spinosyns. Alternatively, hydroxylated precursors such as glycolate or glycerol can be incorporated during polyketide synthesis, as in leucomycin [30]. It has been reported that the AT domain specific for addition of glycolate in the niddamycin producer (nid AT6) is similar to methylmalonyl-CoA-specific AT domains of the erythromycin and rapamycin PKS genes [19]. PKS module 7 is responsible for the addition of carbons 8 and 9 of the butenyl-spinosyn polyketide; however, the sequence of the busD AT7 domain is not similar to the putative glycolate-specific sequences of nid AT6. Although this seems to indicate that bus AT7 is not specific for glycolate, there are other unique sequences in bus AT7 relative to other AT domains and nid AT6 which could denote alternate specificity. It seems likely that a monooxygenase such as ORF LVII would be responsible for the C-8 or C-24 hydroxylations. No C-8 hydroxylated spinosyns are produced by S. spinosa; therefore, the butenyl-spinosyn biosynthetic genes responsible for these modifications are unique to S. pogona.

In addition, rhamnose methylation is altered in S. pogona relative to S. spinosa. Mutants of S. spinosa that exhibited altered methylation of the rhamnose on spinosyn [27, 28, 34] typically produced mono-desmethylated rhamnose derivatives of spinosyns. Di-desmethyl rhamnose derivatives of spinosyns were detected only in the presence of methyltransferase inhibitors such as sinefungin. No tri-desmethyl rhamnose derivatives of spinosyns were ever isolated. In contrast, mutants of S. pogona with altered methylation of rhamnose [17] produced di- and tri-desmethyl rhamnose derivatives of butenyl-spinosyns in high amounts in the absence of methyltransferase inhibitors.

Putative origins of the spinosyn and butenyl-spinosyn genes

Eleven of the twelve PKS modules of the spinosyn and butenyl-spinosyn biosynthetic genes were highly similar (Table 2). However, module 1b of the busA gene was markedly different from any spn PKS module. A comparison of the protein (or DNA) sequence of the busA and spnA genes showed strong similarity between the loading modules of both genes. This similarity was over 90% at both the DNA and amino acid sequence level and continued through the KS domains of spnA M1 and busA M1b (Fig. 7). However, the similarity between busA M1b and spnA M1 dropped significantly in the AT domains and low similarity continued through the end of busA M1b. The strong similarity between busA and spnA M1 resumed 150 amino acids into the M1a KS domain and continued through the end of each gene (Fig. 7).
Fig. 7

Illustration of putative natural genetic engineering of the spnA gene from the busA gene. Green boxes and lines indicate regions of busA and spnA genes with >90% DNA identity; blue boxes and lines indicate unique regions of busA with <90% DNA identity to spnA. Yellow indicates the region of homology between all three domains where the postulated recombination crossover (represented by the red "X") would occur. Numbers on the flags correspond to the nucleotide in the busA gene (AX600586); numbers in parentheses indicate the amino acid number in the KS domain of busA M1b

Therefore, all three modules (busA M1b and M1a and spnA M1) showed strong similarity over the last 350 amino acids of the KS domains. If the KS domain of busA M1b and the KS domain of busA M1a are aligned as shown in Fig. 7, there appears to be sufficient similarity to support homologous recombination. Such a recombination would produce a crossover in the KS domain and an in-frame deletion of one entire PKS module. The resulting module would have the arrangement found in the spnA gene. It could, therefore, be postulated that the spnA gene was derived from the busA gene by homologous recombination across the highly similar KS domains of M1b and M1a.
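
The postulated deletion can be illustrated with a toy model. The Python sketch below is not derived from the busA sequence: the segment names, the short nucleotide strings and the function name are all hypothetical placeholders, chosen only to show how a crossover between two homologous KS segments removes the intervening sequence and leaves a single hybrid KS region. All toy segments have lengths that are multiples of three, so the resulting deletion is in frame by construction.

```python
def delete_between_repeats(gene: str, repeat: str) -> str:
    """Model a crossover between two copies of `repeat` within one gene.

    Everything from the start of the first copy up to the start of the
    second copy is lost, so exactly one copy of the repeat remains.
    """
    first = gene.find(repeat)
    second = gene.find(repeat, first + len(repeat)) if first != -1 else -1
    if second == -1:
        raise ValueError("two copies of the repeat are required")
    return gene[:first] + gene[second:]


# Hypothetical placeholder segments (not real busA sequence).
LOAD = "ATGGCT"            # loading module
KS_1B_START = "GGTACC"     # start of the M1b KS domain
KS_SHARED = "GATCTGGAA"    # KS region shared by M1b and M1a
M1B_REST = "CCCAAATTT"     # remainder of module 1b (AT domain onward)
KS_1A_START = "GGAACG"     # start of the M1a KS domain
M1A_REST = "TTTGGGTAA"     # remainder of module 1a (AT domain onward)

toy_busA = (LOAD + KS_1B_START + KS_SHARED + M1B_REST
            + KS_1A_START + KS_SHARED + M1A_REST)
toy_spnA = delete_between_repeats(toy_busA, KS_SHARED)
deleted = len(toy_busA) - len(toy_spnA)

print(toy_spnA)
print("deleted nucleotides:", deleted, "in frame:", deleted % 3 == 0)
```

In this toy model the product keeps the loading module and the start of the M1b KS domain, followed by the shared KS region and the remainder of module 1a. This mirrors the pattern described above, in which similarity between spnA M1 and busA M1b extends through the KS domain and similarity to busA M1a resumes within the KS domain.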

Conclusions

Analysis of the bus gene cluster revealed a high degree of conservation with the spn cluster from S. spinosa. The gene order and gene orientation were completely conserved between S. spinosa and S. pogona. DNA flanking the bus gene cluster, on the other hand, was completely diverged from the spn cluster. As in S. spinosa, neither regulatory genes nor genes for biosynthesis of rhamnose were directly linked to the bus biosynthetic cluster. Several of the unique genes flanking the bus cluster may be involved in formation of some of the unique butenyl-spinosyn factors, but these genes need further investigation.

In this analysis, we found that the butenyl tail in butenyl-spinosyns originates from an additional PKS module in the bus PKS relative to the spn PKS. Module 1b of the busA gene contains the functional domains necessary to synthesize this unique addition. We found a high degree of similarity between the KS domains of the busA gene, which contains this additional module, and the spnA gene. An in-frame deletion between the homologous module 1b KS and module 1a KS within the busA gene would result in a gene with very similar structure to the spnA gene. Thus, the spinosyn biosynthetic cluster may have been derived by an in-frame deletion from an ancestral cluster which produced butenyl-spinosyns, analogous to the bus cluster.

The butenyl-spinosyn-producing strain S. pogona NRRL 30141 has a number of significant differences from S. spinosa: a hairy rather than spiny spore coat, bacteriophage sensitivity and different 16S rRNA secondary structure. However, S. pogona was very similar to S. spinosa in its growth characteristics and biochemical tests. The 16S rRNA sequences of the two strains were 98% identical (D. Hahn, manuscript in preparation) and BLAST analysis indicated that the two 16S rRNA gene sequences were nearest neighbors within the Saccharopolyspora. Therefore, the strains, although different, are so closely related that the proposed common origin of spinosyn genes is feasible.

Acknowledgements

We would like to acknowledge the assistance of Dennis Duebelbeis and Paul Lewer who provided LC and LC/MS analysis of fermentations. We also acknowledge Dow AgroSciences Discovery management for enthusiastic support of this work.

References

1.

Aparicio JF, Molnar I, Schwecke T, Konig A, Haydock SF, Khaw LE, Staunton J, Leadlay PF (1996) Organization of the biosynthetic gene cluster for rapamycin in Streptomyces hygroscopicus: analysis of the enzymatic domains in the modular polyketide synthase. Gene 169:9–16

2.

Atlas RM (1997) Handbook of microbiological media, 2nd edn. CRC, Boca Raton

3.

Ausubel F, Brent R, Kingston R, Moore D, Smith J, Seidman J, Struhl K (1987) Current protocols in molecular biology. Wiley, New York

4.

Bibb MJ, Findlay PR, Johnson MW (1984) The relationship between base composition and codon usage in bacterial genes and its use for the simple and reliable identification of protein-coding sequences. Gene 30:157–166

5.

Bierman M, Logan R, O’Brien K, Seno ET, Rao RN, Schoner BE (1992) Plasmid cloning vectors for the conjugal transfer of DNA from Escherichia coli to Streptomyces spp. Gene 116:43–49

6.

Brautaset SE, Borgos F, Sletta H, Ellingsen TE, Zotchev SB (2003) Site-specific mutagenesis and domain substitutions in the loading module of the nystatin polyketide synthase and their effects on nystatin biosynthesis in Streptomyces noursei. J Biol Chem 278:14913–14919

7.

Bisang C, Long PF, Cortes J, Westcott J, Crosby J, Matharu A-L, Cox RJ, Simpson TJ, Staunton J, Leadlay P (1999) A chain initiation factor common to both modular and aromatic polyketide synthases. Nature 401:502–505

8.

Burgett SG, Rosteck PRJ (1994) Use of dimethyl sulfoxide to improve fluorescent, Taq cycle sequencing. In: Adams M, Fields C, Venter JC (eds) Automated DNA sequencing and analysis. Academic, New York, pp 211–215

9.

Burns LS, Graupner PR, Lewer P, Martin CJ, Vousden WA, Waldron C, Wilkinson B (2003) Spinosyn polyketide synthase fusion products synthesizing novel spinosyns and their preparation and use. WO 2003/070908 A2

10.

Crouse GD, Hahn DR, Graupner PR, Gilbert JR, Lewer P, Balcer JL, Anzeveno PB, Daeuble JF, Oliver PM, Sparks TC (2002) Synthetic derivatives of 21-butenyl and related spinosyns. WO 02/077004 A1

11.

Dehoff BS, Kuhstoss SA, Rosteck PR, Sutton KL (1997) Polyketide synthase genes. EPA 0791655

12.

Donadio S, McAlpine JB, Sheldon PS, Jackson M, Katz L (1993) An erythromycin analog produced by reprogramming of polyketide synthesis. Proc Natl Acad Sci USA 90:7119–7123

13.

Donadio S, Katz L (1992) Organization of the enzymatic domains in the multifunctional polyketide synthase involved in erythromycin formation in Saccharopolyspora erythraea. Gene 111:51–60

14.

Donadio S, Staver MJ, McAlpine JB, Swanson SJ, Katz L (1991) Modular organization of genes required for complex polyketide biosynthesis. Science 252:675–679

15.

Gaisser S, Martin CJ, Wilkinson B, Sheridan RM, Lill RE, Weston AJ, Ready SJ, Waldron C, Crouse CD, Leadlay PF, Staunton J (2002) Engineered biosynthesis of novel spinosyns bearing altered deoxyhexose substituents. Chem Comm (Camb, UK) 6:618–619

16.

Gaisser S, Lill R, Wirtz G, Grolle F, Staunton J, Leadlay PF (2001) New erythromycin derivatives from Saccharopolyspora erythraea using sugar O-methyltransferases from the spinosyn biosynthetic gene cluster. Mol Microbiol 41:1223–1231

17.

Hahn DR, Balcer JL, Lewer P, Gilbert JR, Graupner P (2002a) Pesticidal spinosyn derivatives. WO 02/077005 A1

18.

Hahn DR, Jackson JD, Bullard BS, Gustafson GD, Waldron C, Mitchell JC (2002b) Biosynthetic genes for butenyl-spinosyn insecticide production. WO 02/079477 A1

19.

Katz L, Stassi DL, Summers RG, Ruan X, Pereda-Lopez A, Kakavs SJ (2000) Polyketide derivatives and recombinant methods for making same. US Patent 6,060,234

20.

Kirst HA, Michel KH, Martin JW, Creemer LC, Chino EH, Yao RC, Nakatsukasa WM, Boeck LD, Occolowitz JL, Paschal JW, Deeter JB, Jones ND, Thompson GD (1991) A83543A-D, unique fermentation-derived tetracyclic macrolides. Tetrahedron Lett 32:4839–4842

21.

Lewer P, Hahn DR, Karr LL, Graupner PR, Gilbert JR, Worden T, Yao R, Norton DW (2002) Pesticidal macrolides. US Patent 6,455,504

22.

Madduri K, Waldron C, Matsushima P, Broughton MC, Crawford K, Merlo DJ, Baltz RH (2001) Genes for the biosynthesis of spinosyns: applications for yield improvement in Saccharopolyspora spinosa. J Ind Microbiol Biotechnol 27:399–402

23.

Martin CJ, Timoney MC, Sheridan RM, Kendrew SG, Wilkinson B, Staunton J, Leadlay PF (2003) Heterologous expression in Saccharopolyspora erythraea of a pentaketide synthase derived from the spinosyn polyketide synthase. Org Biomol Chem 1:4144–4147

24.

Matsushima PM, Baltz RH (1994) Transformation of Saccharopolyspora spinosa protoplasts with plasmid DNA modified in vitro to avoid host restriction. Microbiology 140:139–143

25.

Matsushima P, Broughton MC, Turner JR, Baltz RH (1994) Conjugal transfer of cosmid DNA from Escherichia coli to Saccharopolyspora spinosa: effects of chromosomal insertion on macrolide A83543 production. Gene 146:39–45

26.

McDaniel R, Katz L (2001) Genetic engineering of novel macrolide antibiotics. In: Lohner K (ed) Development of novel antimicrobial agents: emerging strategies. Horizon Scientific, Wymondham, pp 45–60

27.

Mynderse JS, Martin JW, Turner JR, Creemer LC, Kirst HA, Broughton MC, Huber MLB (1993) A83543 compounds and process for production thereof. US Patent 5,202,242

28.

Mynderse JS, Broughton MC, Nakatsukasa WM, Mabe JA, Turner JR, Creemer L, Huber MLB, Kirst HA, Martin JW (1998) A83543 compounds and process for production thereof. US Patent 5,840,861

29.

Oikawa H (2003) Biosynthesis of structurally unique fungal metabolite GKK1032A2: indication of novel carbocyclic formation mechanism in polyketide biosynthesis. J Org Chem 68:3552–3557

30.

Omura S, Tsuzuki K, Nakagawa A, Lukacs G (1983) Biosynthetic origin of carbons 3 and 4 of leucomycin aglycone. J Antibiot 36:611–613

31.

Siggard-Andersen M (1993) Conserved residues in condensing enzyme domains of fatty acid synthases and related sequences. Protein Seq Data Anal 5:325–335

32.

Sparks TC, Crouse GD, Durst G (2001) Natural products as insecticides: the biology, biochemistry and quantitative structure-activity relationships of spinosyns and spinosoids. Pest Manage Sci 57:896–905

33.

Thorson JS, Lo SF, Liu H (1993) Biosynthesis of 3,6-dideoxyhexoses: new mechanistic reflections upon 2,6-dideoxy, 4,6-dideoxy, and amino sugar construction. J Am Chem Soc 115:6993–6994

34.

Turner JR, Huber MLB, Broughton MC, Mynderse JS, Martin JW (1998) A83543 compounds: factors Q, R, S and T. US Patent 5,767,253

35.

Waldron C, Matsushima P, Rosteck PR, Broughton MC, Turner J, Madduri K, Crawford KP, Merlo DJ, Baltz RH (2001) Cloning and analysis of the spinosad biosynthetic gene cluster of Saccharopolyspora spinosa. Chem Biol 8:487–499

36.

Weber JM, McAlpine JB (1992) Erythromycin derivatives. US Patent 5,141,926

37.

Zhao Z, Hong L, Liu H (2005) Characterization of protein encoded by spnR from the spinosyn gene cluster of Saccharopolyspora spinosa: mechanistic implications for forosamine biosynthesis. J Am Chem Soc 127:7692–7693

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)