Bacterial and archaeal communities within the alkaline soda Langaco Lake in the Qinghai-Tibet Plateau

Langaco Lake (LGL) is a soda lake located at an altitude of 4548 m in the Qinghai-Tibet Plateau in China. LGL exhibits unique hydrochemical characteristics among soda lakes, but little is known about the microbial diversity of LGL and the microbial interactions with environmental factors. The water samples were filtered using chemical-grade cellulose acetate membrane (pore size of 0.45 μm), and the hydrochemical characteristics were analyzed. Community DNA was extracted, and then high-throughput sequencing of 16S rRNA genes was conducted to evaluate the composition of the microbial community. The high-throughput sequencing of 16S rRNA genes revealed that the bacterial diversity in LGL consisted of 327 genera in 24 phyla (4871 operational taxonomic units (OTUs); Shannon index values of 5.20–6.07), with a significantly higher diversity than that of the Archaea (eight phyla and 29 genera comprising 1008 OTUs; Shannon index values of 2.98–3.30). The bacterial communities were dominated by Proteobacteria (relative abundances of 42.79–53.70%), followed by Bacteroidetes (11.13–15.18%), Planctomycetes (4.20–12.82%), Acidobacteria (5.91–9.50%), Actinobacteria (2.60–5.80%), and Verrucomicrobia (2.11–4.08%). Furthermore, the archaeal communities were dominated by Crenarchaeota (35.97–58.29%), Euryarchaeota (33.02–39.89%), and Woesearchaeota (6.50–21.57%). The dominant bacterial genus was Thiobacillus (8.92–16.78%), and its abundances were most strongly correlated with the total phosphorus (TP) content, pH value, CO₃²⁻ concentration, and temperature. The most abundant archaeal genus was Methanoregula (21.40–28.29%), and its abundances were the most highly correlated with the total organic carbon (TOC) content, total salinity (TS), and K⁺ and Na⁺ concentrations.
The results of this study provide valuable insights for developing a more comprehensive understanding of microbial diversity in these unique carbonate alkaline environments, as well as a better understanding of the microbial resources on the Qinghai-Tibet Plateau.


Introduction
Soda lakes are exceptional among aquatic ecosystems because they simultaneously exhibit high productivity rates (i.e., carbon and the large amount of dissolved organic matter produced by photosynthesis) and high pH values (9.5-11.0) (Banda et al. 2019; Paul et al. 2015). The lakes are naturally occurring alkaline environments that contain high concentrations of sodium carbonate owing to evaporation. In addition, high concentrations of other salts can also accumulate, especially sodium chloride, leading to the formation of alkaline saline lakes (Namsaraev et al. 2015). Soda lakes have likely contributed massively to global primary productivity in Earth's geological past, and these soda lake environments represent examples of contemporary extreme environments. Generally, they are inland lakes and have a propensity to become meromictic due to regional and local hydrologic events. In addition, they are highly productive due to elevated temperatures, high sunlight incidence, and large supplies of CO₂. Furthermore, diverse microbial populations are abundant in soda lakes (Lanzen et al. 2013). Many regional examples of soda lakes have been reported, including in the East African Rift Zone, the rain-shadowed regions in California and Nevada, the Kulunda steppe in Russia, and on the Cariboo Plateau in Canada. Similarly, many microorganisms have been isolated from such lakes, including Cyanobacteria, chemolithoautotrophic sulfide-oxidizing bacteria, sulfate-reducing/nitrifying/denitrifying bacteria, aerobic heterotrophic bacteria, fermentative bacteria, methanotrophs, and methanogens (Zorz et al. 2019; Tiodjio et al. 2014).
The underlying basaltic rocks in some areas of these plateaus originate from Miocene and Pliocene volcanic activity and have led to ideal conditions for the formation of soda lakes owing to the low solubility of calcium and magnesium in the basaltic formations. Thus, soda lakes are important components of terrestrial ecosystems, contain abundant microbial resources, and play critical roles in geochemical cycles by promoting material exchange, such as those in the Qinghai-Tibet Plateau in China (Xing et al. 2019). Soda lakes are widely distributed in terrestrial plateau ecosystems with extreme environmental conditions, such as the persistence of extreme droughts, intense solar ultraviolet radiation, extreme daily temperature changes, and low partial pressures of dissolved and atmospheric oxygen (Namsaraev et al. 2015;Lanzen et al. 2013). Consequently, changes in these environmental conditions in association with elevation can lead to changes in bacterial or archaeal lake community diversity in high-altitude areas (Liu et al. 2010).
Langaco Lake (LGL) is one of the highest typical soda lakes in the Qinghai-Tibet Plateau. However, the microbial structure and diversity of LGL have not been previously investigated.
LGL is also minimally directly affected by human activities and thus remains ecologically intact, thereby providing a natural laboratory for scientific studies. These rarely explored lakes may harbor new microbial species, and thus, an understanding of the bacterial diversity in these high-altitude lakes is critical for species protection and ecosystem conservation (Mesbah et al. 2007). High-throughput sequencing has been extensively used to investigate microbial communities in recent years via 16S rRNA gene compositional analysis of samples from natural environments. These methodologies have become increasingly used to determine differences in microbial community diversity and structure among environments, thereby helping to reveal the interactions among microorganisms in such environments, as well as their adaptions to specific environments (Paul et al. 2015). In this study, Illumina high-throughput sequencing analysis of community 16S rRNA genes was used to comprehensively investigate the bacterial and archaeal communities in LGL, and to explore the dominant genera in LGL and their associations with environmental factors. Therefore, the results of this study provide a theoretical framework for understanding the relationships among microorganisms and environments under alkaline conditions on plateaus.

Sample sites and sample collection
Langaco Lake is located in the northwestern margin of the Tibetan Plateau (30°40′30.6″N, 81°18′32.3″E) in Pulan County in the Ngari Region at an altitude of 4548 m.
LGL has an area of 256.2 km² and experiences a frigid semi-arid plateau climate (Mianping 1997). The lake is irregular and slightly spoon-shaped. There are several islands exposed within the lake, and the terrain around the islands is very steep. The northern portion of the lake is a smaller open lake area that is connected to the open southern part by a narrow channel, while the center is flat (Wang et al. 2013). The lake has a sodium concentration of 106.24 mg/L, a total salinity of 1.00 mg/L, and a pH of 8.62 (Mianping 1997). The total area of the lake has decreased since the 1970s, especially in the northwestern region, while the temperatures have generally increased and precipitation has significantly decreased in this region (Dai 2020). Consequently, the lake water is primarily replenished by meltwater from the glaciers to the north (Wang et al. 2013).
Four samples were collected from LGL in mid-July 2018 from a sediment depth of 30-40 cm. These samples were mixtures of water and sediment (about 4 L total). The distance between any two samples was greater than 4 km (Fig. 1), and they were all collected about 5 m from the coastline. In addition, approximately 2 L of water was immediately filtered through a 0.22-μm filter (Millipore, USA) on site for subsequent DNA extraction. A portable pH meter (LEICI/PHBJ-261L, Shanghai) was used to measure the pH in situ. The filters were taken back to the laboratory on ice, while the water samples collected for physicochemical analysis were stored at 4°C.

Hydrochemical analyses
The water samples were filtered through a chemical-grade cellulose acetate membrane (pore size of 0.45 μm, Millipore, USA), and the physical and chemical properties of water samples were determined according to the general rules of analytical methods (JY/T020-1996). The major cation concentrations (Na⁺, K⁺, Ca²⁺, and Mg²⁺) were measured using atomic absorption spectrometry (CE 3000 series spectrometer, Thermo Scientific, USA). The anion (Cl⁻ and SO₄²⁻) concentrations were measured using an ion chromatograph (Dionex/ICS-6000, Thermo Scientific, USA). The concentrations of CO₃²⁻ and HCO₃⁻ were detected by titration. The total salinity (TS) was determined by the drying gravimetric method (HJ/T51-1999), while the total organic carbon (TOC) and total nitrogen (TN) were detected by the total organic carbon/total nitrogen analyzer (Multi N/C2100, Jena, Germany). Finally, ammonium molybdate spectrophotometry (GB11893-891) was used to determine the concentration of total phosphate (TP).

Microbial community DNA extraction and PCR amplification
The 0.22-μm filter membranes (Millipore, USA) used to filter the water samples via vacuum filtration were sectioned according to the manufacturer's instructions. Then, the community DNA was extracted using an E.Z.N.A Mag-Bind Soil DNA Kit (Omega Bio-Tek, USA). The integrity of the extracted DNA was evaluated using 1% agarose gel electrophoresis, and a Qubit® 2.0 Fluorometer (type Q32866; Invitrogen, USA) was used to determine the DNA concentrations, after which the DNA was stored at −80°C.
The community alpha-diversity was calculated using the Mothur software package (v.1.30.1) (Schloss et al. 2009), including the abundance-based coverage estimation (ACE), terminal richness estimation (Chao1), Simpson index, Shannon-Wiener index, rarefaction analysis, and Good's coverage estimation. Venn diagrams were used to assess the numbers of shared and unique OTUs among the samples. The beta-diversity was measured based on the Bray-Curtis distances between the samples, and the overall community differences were evaluated through a full linkage cluster analysis of the Bray-Curtis distances. Canonical correspondence analysis (CCA) was performed using the cca function in the vegan R package, the community distance matrix, and the hydrochemical factors (Han et al. 2017). The variables that significantly explained the differences in the community composition were evaluated using permutation tests under a simplified model.
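As an illustration of the beta-diversity step described above, the following minimal Python sketch computes Bray-Curtis dissimilarities and clusters the samples, assuming that the full linkage analysis corresponds to complete linkage. The function names and the OTU counts are hypothetical; they are neither the data of this study nor the Mothur and R tools that were actually used.

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two OTU count vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("count vectors must have the same length")
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def complete_linkage(names, counts):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    # Pairwise distances between the individual samples.
    dist = {
        frozenset((i, j)): bray_curtis(counts[i], counts[j])
        for i, j in combinations(range(len(names)), 2)
    }

    def linkage(a, b):
        # Complete linkage: the largest member-to-member distance.
        return max(dist[frozenset((i, j))] for i in a for j in b)

    clusters = [(i,) for i in range(len(names))]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters that are closest under complete linkage.
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append(
            (tuple(names[i] for i in a), tuple(names[i] for i in b), linkage(a, b))
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return merges


# Hypothetical OTU counts for four samples (illustrative only).
names = ["L1", "L2", "L3", "L4"]
counts = [
    [10, 0, 5, 5],
    [8, 2, 5, 5],
    [0, 10, 5, 5],
    [1, 9, 6, 4],
]
merges = complete_linkage(names, counts)
for left, right, distance in merges:
    print(left, right, round(distance, 3))
```

In this toy example, the two pairs of similar samples merge first, and the final merge joins the two pairs at the largest pairwise distance between their members.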

Taxonomic classification analysis
The relative abundances of the bacterial and archaeal communities were summarized at the phylum, class, and genus levels. The R software suite was used to construct boxplots of the relative taxonomic abundances among the samples, while the GraPhlAn software package (Asnicar et al. 2015) and the online Interactive Tree Of Life tool (iTOL, v.3.2.1) (Letunic and Bork 2015) were used for phylogenetic tree visualization of the 100 most abundant OTUs.
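The summarization of relative abundances at a given taxonomic rank can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name, the dictionary layout, and the OTU counts are hypothetical, and OTUs lacking an assignment at the requested rank are simply pooled as unclassified.

```python
from collections import defaultdict


def relative_abundance(otu_counts, otu_taxonomy, rank):
    """Sum OTU counts per taxon at the given rank and convert them to percentages."""
    totals = defaultdict(int)
    for otu, count in otu_counts.items():
        # OTUs without an assignment at this rank are pooled as "unclassified".
        taxon = otu_taxonomy[otu].get(rank, "unclassified")
        totals[taxon] += count
    grand_total = sum(totals.values())
    return {taxon: 100.0 * n / grand_total for taxon, n in totals.items()}


# Hypothetical counts and assignments for one sample (illustrative only).
otu_counts = {"OTU1": 60, "OTU2": 25, "OTU3": 15}
otu_taxonomy = {
    "OTU1": {"phylum": "Proteobacteria", "genus": "Thiobacillus"},
    "OTU2": {"phylum": "Proteobacteria", "genus": "Hydrogenophaga"},
    "OTU3": {"phylum": "Bacteroidetes"},
}

by_phylum = relative_abundance(otu_counts, otu_taxonomy, "phylum")
by_genus = relative_abundance(otu_counts, otu_taxonomy, "genus")
print(by_phylum)
print(by_genus)
```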

Sequence accession numbers
The raw 16S rRNA gene sequences were deposited in the National Center for Biotechnology Information (NCBI) database under the BioSample accession Nos. SAMN20703607 to SAMN20703610 for Bacteria, and SAMN20703611 to SAMN20703614 for Archaea.

Bacterial and archaeal community diversity
The bacterial and archaeal community compositions of the four LGL samples were investigated through high-throughput Illumina sequencing of the community 16S rRNA genes (Table 2). A total of 5879 OTUs were recovered, including 4871 bacterial OTUs and 1008 archaeal OTUs. Among the bacterial samples, the richness and diversity of samples L2 and L4 were slightly higher than those of samples L1 and L3. The observed bacterial community OTU richness, Shannon index, and ACE values were 1025-1293, 5.20-6.07, and 2148.23-2376.17, respectively. In contrast, the archaeal diversity (221-269 observed OTUs, Shannon index values of 2.98-3.30, and ACE index values of 417.63-536.42) was significantly lower than that of the bacterial communities.
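For readers unfamiliar with the indices reported above, the following minimal Python sketch shows how several of them can be computed from a vector of OTU counts. It assumes the natural-logarithm form of the Shannon index, the sum-of-squared-proportions form of the Simpson index, and the bias-corrected form of Chao1; the Mothur calculators used in this study may differ in detail, and ACE is omitted for brevity. The counts are hypothetical.

```python
import math
from collections import Counter


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def simpson(counts):
    """Simpson index D = sum(p_i ** 2); lower values indicate higher diversity."""
    n = sum(counts)
    return sum((c / n) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1 richness from singletons (F1) and doubletons (F2)."""
    s_obs = sum(1 for c in counts if c > 0)
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def goods_coverage(counts):
    """Good's coverage = 1 - F1 / N, where N is the total number of reads."""
    n = sum(counts)
    f1 = sum(1 for c in counts if c == 1)
    return 1 - f1 / n


# Hypothetical OTU counts for one sample (illustrative only).
example = [10, 5, 1, 1, 1, 2]
print("Shannon:", round(shannon(example), 3))
print("Simpson:", round(simpson(example), 3))
print("Chao1:", chao1(example))
print("Good's coverage:", round(goods_coverage(example), 3))
```

A perfectly even community of S OTUs gives a Shannon index of ln(S), which provides a quick sanity check on the implementation.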

Associations between environmental factors and dominant genera
The differences in the abundances among the samples were investigated (Fig. 3) while considering the twelve and nine most abundant bacterial and archaeal phyla. The most abundant (relative abundances of > 1%) bacterial genera among the four samples were the betaproteobacterial genera Thiobacillus and Hydrogenophaga, the alphaproteobacterial genus Gemmobacter, and the gammaproteobacterial genus Thermomonas. In addition, Methanoregula was the most abundant archaeal genus, and other archaeal genera exhibited higher abundances in individual samples, including Thermocladium and Methanomassiliicoccus.
CCA was conducted to evaluate the relationships among community structures and environmental parameters, yielding numerous associations between the overall community composition and several environmental parameters. Consequently, the environmental parameters were analyzed in the context of the representative genera (Fig. 4). The abundances of the dominant bacterial genus Thiobacillus were most strongly correlated with the TP content, followed by the pH, CO₃²⁻ concentration, and temperature. The abundances of the next most dominant bacterial genera (Hydrogenophaga and Gemmobacter) were associated with the TP and HCO₃⁻ content. The moderately abundant (1.50-4.00%) bacterial genera Thermomonas, Algoriphagus, and Sphingorhabdus in sample L1 were correlated with the TP content, pH, and HCO₃⁻ concentration. The abundances of the Parcubacteria genera incertae sedis and group GP16 were strongly correlated with the pH and Cl⁻ concentration, respectively. The less abundant (1.5-2.5%) genera Rheinheimera, Nitrospira, and Gp6 in sample L2 were significantly correlated with the TN content, as well as the Ca²⁺, SO₄²⁻, and Cl⁻ concentrations. Furthermore, the abundances of the genera Pirellula, Gimesia, and Spartobacteria genera incertae sedis were related to the K⁺, Na⁺, and TOC concentrations and the TS in samples L3 and L4. The abundances of the dominant (>20%) archaeal genera Methanoregula, Methanothrix, Methanomassiliicoccus, Pacearchaeota incertae sedis-AR13, and Woesearchaeota incertae sedis-AR16 were highly correlated with the TOC, K⁺, and Na⁺ contents and TS. In addition, the Thermocladium abundances were particularly highly associated with the HCO₃⁻ concentration.
… LGL. Samples are indicated by red circles, while genera are indicated by blue triangles, and environmental variables are indicated by arrows. TS, TOC, TN, and TP correspond to total salinity, total organic carbon, total nitrogen, and total phosphate, respectively.
(Boros and Kolpakova 2018). Soda lakes generally form in hydrologically closed lake basins, and their water has high Na⁺ concentrations, in addition to high CO₃²⁻ (0.023-63.20 g/L) or HCO₃⁻ (0.11-20.40 g/L) concentrations (Table 3). LGL is an example of an extreme soda lake with a lower ionic content, with CO₃²⁻ and HCO₃⁻ concentrations of 76.30 and 475.56 mg/L, respectively. Soda lake water also usually contains high concentrations of Cl⁻ (73,000 mg/L) (Schagerl and Renaut 2016). In contrast, LGL has a Cl⁻ concentration of 55 mg/L. Thus, LGL is considered a weakly ionic soda lake. The formation of alkalinity is closely related to the hydrology, climate, and regional geology, which ultimately affect the diversity of microorganisms within these systems. The pH of the water in LGL was 8.62, whereas it is greater than 9.00 in most other soda lakes (Table 3). Due to the extreme conditions within these systems, the characteristics of soda lakes (including high pH values) provide unique environments for microbial communities (Table 3). Thus, a predominance of carbonate and bicarbonate (in addition to the associated alkalinity) is a hallmark of soda lake ecosystems, and diversity has been observed in their hydrochemical characteristics.

Bacterial and archaeal diversity within LGL
Despite the extreme environments within soda lakes, they have high levels of microbial diversity. This is evinced by the 327 and 29 bacterial and archaeal genera, respectively, within LGL, which corresponded to 4871 and 1008 16S rRNA gene OTUs, respectively. Among studies that used the same sequencing method, LGL exhibited a higher diversity than other soda lakes that have been previously investigated, including four alkaline soda lakes in the Cariboo Plateau (bacterial OTUs: 1662), Doroninskoe Lake (bacterial OTUs: 2254), and Lonar Lake (bacterial OTUs: 1568) (Table 3). Furthermore, the Shannon diversity index values calculated in this study (Bacteria: 5.20-6.07; Archaea: 2.98-3.30) were higher than those previously calculated for soda lakes, including Doroninskoe Lake (Bacteria: 1.49-3.46) and five soda lakes in the Badain Jaran Desert (Bacteria: 1.15-3.24) (Table 3). Thus, the microbial diversity of LGL appears to be considerably higher than those of other previously studied soda lakes.

Microbial community structure of LGL
High-throughput sequencing of community 16S rRNA genes was used to conduct a comprehensive investigation of the microbial community diversity in LGL. Twenty-four bacterial phyla and 50 classes were detected in the LGL microbial communities, representing a significantly higher level of taxonomic diversity than in other soda lakes, including 17 bacterial phyla in Mono Lake (CA, USA) and 11 bacterial phyla in Doroninskoe Lake (Transbaikalia, Russia) (Rojas et al. 2018; Matyugina et al. 2018). The most dominant bacterial phylum was Proteobacteria (42.79-53.70%), followed by Bacteroidetes (11.13-15.18%), Planctomycetes (4.20-12.82%), and Acidobacteria (5.91-9.50%). Proteobacteria, Bacteroidetes, and Firmicutes are typically the dominant bacterial taxa in soda lakes (Paul et al. 2015; Mesbah et al. 2007). The unique presence of sodium carbonate may be one of the factors affecting the differences in the microbial diversity in LGL compared to other soda lakes. As was previously mentioned, Proteobacteria and Bacteroidetes have been detected in many soda lakes (Paul et al. 2015; Mesbah et al. 2007), but their relative abundances vary considerably in these systems. The relative abundance of Proteobacteria was reported to be 29.5% in Lonar Lake (India) (Paul et al. 2015), which is significantly lower than that observed in LGL and also in the Soda Lake of Inner Mongolia (Namsaraev et al. 2015). In addition, the abundances of some bacterial groups were lower than those observed in LGL, including Bacteroidetes (8.25%) and Planctomycetes (6.8%) (Banda et al. 2019). Planctomycetes and Acidobacteria have rarely been observed in other soda lakes (Namsaraev et al. 2015; Aguirre-Garrido et al. 2016). Our results indicate that the compositions of the LGL bacterial communities were primarily affected by the environmental factors, which is most likely due to the long residence time of this lake.
These effects are likely reflected in the differences in the dominant taxonomic classes within LGL relative to those in other soda lakes. Cytophagia and Flavobacteria (phylum: Bacteroidetes) are the dominant classes in other soda lakes (Szabó et al. 2017), while the most abundant class of Bacteroidetes in LGL was Sphingobacteria, which may be related to the unique hydrochemical characteristics of LGL. Similarly, Alphaproteobacteria are typically the dominant proteobacterial class among soda lakes (Szabó et al. 2017); however, the Gammaproteobacteria class was dominant in LGL. Thus, the LGL bacterial communities exhibit unique compositional differences relative to other soda lakes.
The archaeal diversity in LGL was much lower than that observed for the bacterial communities, but it was unique among soda lakes. A total of eight archaeal phyla were detected in LGL comprising nine classes and 29 genera. The LGL archaeal communities exhibited a higher level of diversity than those of other soda lakes. For example, only three archaeal phyla were detected in Doroninskoe Lake (Matyugina et al. 2018). Interestingly, the dominant archaeal phyla also vary with the sediment salinity in some salt lakes. For example, Crenarchaeota are generally dominant in hyposaline sediments, while Halobacteriales (phylum Euryarchaeota) are dominant in hypersaline sediments (Jiang et al. 2007). Methanogens and other Euryarchaeota are important contributors to global organic carbon cycling (Vavourakis et al. 2016). The relative abundances of the Archaea in LGL also varied compared to those in other soda lakes. Crenarchaeota were dominant (43.96%), followed by Euryarchaeota (36.56%). In contrast, Crenarchaeota have been either undetected (Matyugina et al. 2018) or had very low levels (Rojas et al. 2018) in other soda lakes. Similar to the levels observed in LGL, Euryarchaeota (35.15%) has been reported to be the most abundant phylum in the soda lakes in the Badain Jaran Desert (Banda et al. 2019). Nevertheless, the overall microbial diversity in LGL was higher than those reported in other studies, and the unique hydrochemical characteristics of LGL may be an important factor contributing to this high diversity and unique microbial community structure.
In addition, the unclassified Bacteria and Archaea accounted for a considerable proportion of the LGL communities, contributing 12.67% and 18.31% to the overall communities, respectively. The unclassified Bacteria included groups GP6, GP16, GP7, GP3, and GP4 of the Acidobacteria and the unclassified Parcubacteria. Acidobacteria are one of the most abundant phyla in soils, and their OTU richness, phylogenetic diversity, and community composition are significantly related to the pH of the soil (Wei et al. 2018). Previous studies have reported that human disturbances and activities can reduce the abundances of soil Acidobacteria (Qin et al. 2019). Moreover, sediment bacterial communities, including Acidobacteria, have been shown to be sensitive to fluctuations in environmental factors, especially the external water supply.

Dominant bacterial genera unique to LGL
The most dominant bacterial genus in the LGL communities was Thiobacillus (8.92-16.78%), which featured uniquely higher abundances than in other soda lakes (e.g., those in eastern China: 1.39-2.47%) (Duan et al. 2020). The abundances of Thiobacillus have also been reported to be negatively correlated with the sedimentary sulfate and total sulfur contents (Duan et al. 2020). Thiobacillus can be one of the most dominant groups in freshwater sediments, and it can be used as a biomarker to predict the intensity of subsequent blooms in such environments (Chen et al. 2015). Furthermore, Thiobacillus has also been detected in some typical habitat types (e.g., soil, water, and duck and fish farm) (Yi et al. 2021) and has been observed to be a unique member of coastal bacterial benthic communities (Sherysheva et al. 2020). Intriguingly, Thiobacillus has rarely been observed in other soda lakes (Table 3).
Thiobacillus is an autotrophic bacterium. It is one of the primary iron-reducing bacterial taxa in lake sediments, and its abundances and diversity are closely related to the degree of water eutrophication (Fan et al. 2018). Through these processes, Thiobacillus participates in the redox cycling of heavy metals by producing ferrous iron and accelerating the oxidation of ferric ion in localized areas, such as anaerobic sedimentary environments with high concentrations of heavy metals, in which they contribute to a large proportion of the communities (Ding et al. 2017). Additionally, Thiobacillus can be a key mediator of S²⁻ oxidation coupled to denitrification, thereby playing an important role in NO₃⁻ reduction under S²⁻ enrichment conditions when organic carbon is scarce (Pang et al. 2021). Our CCA of the LGL communities also revealed that the Thiobacillus abundances were most highly correlated with the variations in the temperature, pH, and CO₃²⁻ and TP concentrations. Thus, the association of Thiobacillus with these factors warrants further investigation.

Dominant archaeal genera unique to LGL
Members of the archaeal family Methanoregulaceae have been isolated in samples from various habitats, including acidic peat bogs, anaerobic organic waste treatment reactors, submerged sinkhole ecosystems (e.g., oil fields, paddy soils, and mud volcanoes), and freshwater lakes (Savvichev et al. 2021). Methanoregula was the most abundant taxon in LGL (21.40-28.29%), even though it has low sodium requirements (Rosenberg et al. 2014). Furthermore, the CCA revealed that the Methanoregula abundances were significantly correlated with the TOC content and TS, in addition to the K⁺ and Na⁺ concentrations. Methanoregula is a nitrogen-fixing archaeal taxon that dominates freshwater lakes (Stoeva et al. 2014) and is a dominant methane producer within communities. Other studies have shown that it has a significant genetic potential for nitrogen metabolism (e.g., nitrate transport, denitrification, nitrite assimilation, and nitrogen fixation) in methyl-methanogenesis bacterial genomes (Biderre-Petit et al. 2019). In addition, Methanoregula has been detected at different temperatures in wetland soils near alkali lakes (Deng et al. 2019), but it has not been detected in most soda lakes, because it has an optimal pH growth range of 4.50 to 5.55 (Rosenberg et al. 2014). Finally, Methanoregula has also been reported to be the dominant member of a methanogenic community and is well adapted to hypoxic conditions (Savvichev et al. 2021).

Conclusions
The LGL soda lake is a unique sodium carbonate ecosystem that may serve as an excellent model for understanding microbial diversity and microbial adaptation to carbonate habitats. Here, high-throughput 16S rRNA gene sequencing was conducted to comprehensively characterize the bacterial and archaeal communities of LGL. The bacterial diversity in LGL was significantly higher than the archaeal diversity and was mostly dominated by Proteobacteria, Bacteroidetes, and Planctomycetes, while the dominant archaeal groups were Crenarchaeota and Euryarchaeota. The presence and high abundance of Crenarchaeota distinguished LGL from other soda lakes. Moreover, the characteristics and metabolism of Thiobacillus provide more possibilities for bacterial diversity, and its abundances were most strongly correlated with the pH, temperature, and CO₃²⁻ and TP concentrations. The high abundance of Methanoregula in LGL was also a unique observation for the LGL archaeal communities, and its abundances were correlated with the TOC content, TS, and K⁺ and Na⁺ concentrations. Finally, the minimal anthropogenic effects on LGL and its extreme environmental conditions provide a unique context for understanding the interactions between microorganisms and extreme soda environments, while also furthering our understanding of microbial resources on the Tibetan Plateau.