Complete genome sequence of butenyl-spinosyn-producing Saccharopolyspora strain ASAGF58

This study aimed to analyze the complete genome sequence of the butenyl-spinosyn-producing strain Saccharopolyspora sp. ASAGF58, isolated from Zhejiang province. The PacBio RS II sequencing platform with single-molecule real-time technology was used to obtain the complete genome sequence of Saccharopolyspora sp. ASAGF58. Gene prediction and annotation were carried out with several software tools and databases. The antiSMASH online server was used to evaluate the secondary metabolite potential of strain ASAGF58. The whole genome of Saccharopolyspora sp. ASAGF58 is 8,190,340 bp, comprising one chromosome of 8,044,361 bp with a GC content of 68.1% and a plasmid of 145,979 bp with a GC content of 64.6%. A total of 7486 coding sequences, 15 rRNA genes, 61 tRNA genes, 41 miscRNA genes, and 1 tmRNA gene were predicted. The domains encoded by one of the type I polyketide synthase (T1PKS) gene clusters have 91% similarity with those encoded by a spinosad biosynthetic gene cluster from Saccharopolyspora spinosa. In addition, antiSMASH predicted that the strain also contains biosynthetic gene clusters for ectoine, geosmin, and erythreapeptin. Our data revealed the complete genome sequence of a newly isolated butenyl-spinosyn-producing strain. This work will support approaches, from genetics to biotechnology and biochemistry, aimed at improving the production of butenyl-spinosyns.

A large proportion of natural compounds are produced by microorganisms, especially actinobacteria (Katz and Baltz 2016; Genilloud 2017). Interest in isolating actinobacteria has grown because of their potential to yield new compounds with novel chemical structures (Tiwari and Gupta 2012). In this study, soil samples from different ecological environments were collected throughout China. Rare actinobacteria were screened according to morphology. Briefly, 1 g of air-dried soil was macerated in phosphate-buffered saline (1.5 mM NaH2PO4·2H2O, 8.3 mM Na2HPO4·12H2O, 154 mM NaCl, and 1.73 mM sodium dodecyl sulfate, pH 7.4), shaken ultrasonically at 50-60 Hz, and heated at 60°C for 10 min. Dilutions of the resulting suspension were then plated onto 1/10 ATCC 172 agar medium (50 μg/mL nystatin, 50 μg/mL cycloheximide, and 1.25 μg/mL rifampicin) and incubated at 30°C for 14 days (Hong et al. 2009). Screened rare actinobacteria were further purified under the same conditions. Strains were incubated in 96 deep well plates containing a rich medium (glucose 50.0 g/L, cottonseed protein 20.0 g/L, NaCl 3.0 g/L, K2HPO4 0.2 g/L, FeSO4·7H2O 0.05 g/L, CaCO3 5.0 g/L, pH 7.2) at 30°C for 7 days. An insecticidal activity assay, based on the lethal effect of active substances on mosquito larvae, was used to screen the fermentation broths (Chen et al. 2013). A strain isolated from Zhejiang province, China, that showed the shortest time to lethality against mosquito larvae was selected and named ASAGF58 (Guo et al. 2019).
The structure of the active substances was confirmed by liquid chromatography-tandem mass spectrometry (LC-MS/MS). The LC-MS/MS system consisted of an Agilent 1290 Infinity II and an Agilent 6545 Q-TOF LC/MS (Agilent Technologies, Santa Clara, CA, USA). Fermentation broth was mixed with two volumes of methanol, vortexed, and left overnight at 4°C. The samples were centrifuged at 4°C and 12,000 rpm for 10 min, and the supernatant was used in HPLC analysis as described by Zhao et al. (2017). The qualitative analysis of target compounds was carried out by electrospray ionization mass spectrometry (ESI-MS) in positive mode with multiple reaction monitoring (MRM). The ionspray voltage was 4.5 kV, and the gas temperature was 350°C with a drying gas flow rate of 10 L/min. The results revealed two active compounds in the fermentation broth with the same mass-to-charge ratios as spinosyn α1 and δ1, which can also be produced by Saccharopolyspora pogona (Lewer et al. 2009). The product ion obtained in the secondary mass spectrum (MS/MS) at a collision energy of 60 eV and a fragmentor voltage of 120 V was the forosamine sugar fragment ion, providing further evidence for the structures of the two active compounds (Lewer et al. 2009) (Fig. S1).
To assign taxonomy, the 16S rRNA gene sequence of strain ASAGF58 was analyzed via EzBioCloud (https://www.ezbiocloud.net), and the result showed that it shared 100% similarity with Saccharopolyspora hattusasensis CR 3506T and 99.1% similarity with Saccharopolyspora spinosa NRRL 18395T. A phylogenetic tree was constructed with the neighbor-joining method using the MEGA software version 5.0 (Fig. 1). The tree indicates that strain ASAGF58 forms a distinct cluster with members of the genus Saccharopolyspora and is most likely a strain of S. hattusasensis, which is closely related to the spinosyn- and butenyl-spinosyn-producing strains S. spinosa NRRL 18395T and S. pogona NRRL 3014. S. hattusasensis, a recently described species of Saccharopolyspora, was isolated from Turkey and was found to exhibit antimicrobial activity against Bacillus subtilis NRRL B-209, Citrobacter freundii NRRL B-2643, and Staphylococcus aureus ATCC 29213 (Veyisoglu et al. 2017). However, no insecticidal activity has been reported for this species.
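As a rough illustration of what a pairwise similarity value such as the 100% and 99.1% figures above expresses, the sketch below computes percent identity over two sequences that have already been aligned. It is not the EzBioCloud algorithm; the function name `percent_identity`, the choice to skip gapped columns, and the toy sequences are assumptions made only for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns in which either sequence has a gap ('-') are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped column: not counted in this sketch
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy sequences (hypothetical, not real 16S rRNA data): 7 of 8 columns match.
print(percent_identity("ACGTACGT", "ACGTACGA"))
```

Other tools may count gaps as mismatches or correct for multiple substitutions, so values from this sketch are not expected to reproduce the reported similarities exactly.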
Whole-genome sequencing was carried out for strain ASAGF58. The strain was cultivated in 20 mL of tryptic soy broth in 300 mL flasks at 30°C and 240 rpm for 48 h. DNA was extracted using the Wizard® Genomic DNA Purification Kit (Promega Corporation). The extracted DNA was sequenced by Annoroad, Inc. (Beijing, China), using the PacBio RS II sequencing platform and single-molecule real-time (SMRT) technology. The raw PacBio reads were quality filtered by SMRT Pipe version 2.3, and 75,095 subreads were obtained with an N50 value of 12,242 and a mean value of 9453. Because raw PacBio sequencing data are of low quality and carry a high rate of random errors, de novo assembly was carried out with HGAP version 3.0 to obtain high-accuracy data that could meet the demands of the analysis (Chin et al. 2013). The assembly contained two circular contigs: a chromosome with a base coverage of 76.15 and a plasmid with a base coverage of 52.1. After data filtering, 71,077 subreads could be mapped back to the contigs. The data utilization rate was 94.69%.
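For readers unfamiliar with the N50 statistic reported for the subreads, the following minimal sketch shows one common way to compute it: sort the lengths from longest to shortest and report the length at which the running sum first reaches half of the total. The function name `n50` and the toy lengths are hypothetical; this is not the SMRT Pipe implementation.

```python
def n50(lengths):
    """Return the N50 of a collection of read or contig lengths.

    N50 here is the length L such that sequences of length >= L
    together account for at least half of the total bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")

    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy subread lengths (hypothetical values, not the ASAGF58 data).
toy_lengths = [2, 3, 4, 5, 6]
print("N50:", n50(toy_lengths))
print("mean:", sum(toy_lengths) / len(toy_lengths))
```

Because N50 is weighted by bases rather than by read count, it is usually larger than the mean length when the length distribution has a long tail, which is consistent with the reported N50 exceeding the reported mean.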
We used Prodigal software version 2.6 to predict the coding sequences (Hyatt et al. 2010). Infernal version 1.1.1 (Nawrocki et al. 2009) and RNAmmer version 1.2 (Lagesen et al. 2007) were applied for the prediction of tRNA, rRNA, and ncRNA. Signal peptides, insertion sequences, prophages, clustered regularly interspaced short palindromic repeats (CRISPR) loci, and genomic islands were predicted using SignalP version 4.1 (Petersen et al. 2011), ISFinder (https://www-is.biotoul.fr/) (Siguier et al. 2006), Phage Finder version 2.0 (Fouts 2006), PILER-CR version 1.0 (Edgar 2007), and GIHunter version 1.0 (http://www5.esu.edu/cpsc/bioinfo/software/GIHunter/), respectively. The whole genome of strain ASAGF58 is 8,190,340 bp, comprising one chromosome of 8,044,361 bp with a high GC content of 68.1% and a plasmid of 145,979 bp with a high GC content of 64.6% (Fig. 2). Table 1 shows that a total of 7486 coding sequences, 15 rRNA genes, 61 tRNA genes, 41 miscRNA genes, 1 tmRNA gene, and a 239-bp-long CRISPR unit between 1,510,973 bp and 1,511,212 bp were predicted. The genome sequence was deposited under GenBank accession number CP040605.
Fig. 2 Circular representation of the strain ASAGF58 chromosome (from outside to inside): tRNA genes, Clusters of Orthologous Groups of proteins (COG) annotation on the forward strand, location of genes on the forward strand, rRNA genes, COG annotation on the reverse strand, location of genes on the reverse strand, GC content (with the mean value as the baseline, outward protrusions are higher than the mean and inward protrusions are lower than the mean), and GC skew (purple indicates values less than 0, and green indicates values more than 0)
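The replicon sizes and GC contents reported above can be cross-checked with a short calculation, and GC skew can be illustrated in the same way. The sketch below sums the replicon lengths, derives a length-weighted genome-wide GC content from the rounded per-replicon percentages (so the result is only approximate), and computes GC skew as (G − C)/(G + C) per window, which is the usual definition and is assumed here rather than taken from the paper. The names `Replicon`, `genome_summary`, `gc_skew`, and `windowed_gc_skew` and the toy sequence are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replicon:
    name: str
    length_bp: int
    gc_percent: float


# Values as reported in the text for strain ASAGF58.
REPLICONS = [
    Replicon("chromosome", 8_044_361, 68.1),
    Replicon("plasmid", 145_979, 64.6),
]


def genome_summary(replicons):
    """Return (total length in bp, length-weighted GC percent)."""
    total = sum(r.length_bp for r in replicons)
    weighted_gc = sum(r.length_bp * r.gc_percent for r in replicons) / total
    return total, weighted_gc


def gc_skew(window: str) -> float:
    """GC skew of one window, assuming the common (G - C) / (G + C) form."""
    seq = window.upper()
    g = seq.count("G")
    c = seq.count("C")
    if g + c == 0:
        return 0.0  # no G or C in the window: skew is reported as 0 here
    return (g - c) / (g + c)


def windowed_gc_skew(sequence: str, window_size: int):
    """GC skew for consecutive, non-overlapping windows."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return [
        gc_skew(sequence[i:i + window_size])
        for i in range(0, len(sequence), window_size)
    ]


total_bp, genome_gc = genome_summary(REPLICONS)
print(f"total genome size: {total_bp:,} bp")  # 8,190,340 bp, as reported
print(f"approximate genome-wide GC: {genome_gc:.2f}%")

# Toy sequence (hypothetical): first window balanced, second window all G.
print(windowed_gc_skew("GGCCGGGG", 4))
```

The plasmid contributes under 2% of the total length, so the genome-wide GC content stays very close to the chromosomal value.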
Spinosyns are a family of biological pesticides with high efficiency, a broad spectrum, and low toxicity to birds and mammals. They include the spinosyns produced by S. spinosa and the butenyl-spinosyns produced by S. pogona. The two strains are very closely related, and their spinosyn genes may share a common origin (Hahn et al. 2006). The biosynthetic pathway for the butenyl-spinosyns was proposed by Hahn et al. (2006). Because butenyl-spinosyns are novel antibiotics, many efforts have been made to improve their production. The mutant S. pogona-Δfcl was constructed, and its yield of butenyl-spinosyns was found to be 130% of that in S. pogona. The reason is that the GDP-fucose synthetase encoded by the fcl gene is involved in the synthesis of GDP-fucose from GDP-mannose, whereas the GDP-rhamnose derived from GDP-mannose is the precursor of butenyl-spinosyn synthesis (Peng et al. 2019). A polynucleotide phosphorylase overexpression mutant of S. pogona also showed high production because of improved biomass (Li et al. 2018). A strain with a yield 1.79-fold higher than that of the parent strain was obtained by ribosome engineering (Luo et al. 2016). The complete genome of Saccharopolyspora sp. ASAGF58 will promote research on the synthesis mechanism of butenyl-spinosyns and stimulate a wide range of approaches to improve butenyl-spinosyn synthesis.