Rarefaction (dilution) curve
For the rarefaction (dilution) curve, a given number of sequences was randomly drawn from each sample, and the alpha diversity index corresponding to that subsample was calculated. The horizontal axis represents the number of sequences randomly drawn from the sample, and the vertical axis represents the number of operational taxonomic units (OTUs) that can be constructed from that number of sequences. The adequacy of the current sequencing depth is judged by whether the curve reaches a plateau (Ma et al. 2020). According to Fig. 2, the number of microbial OTUs in petroleum-contaminated soils was significantly lower than in the uncontaminated control, i.e., the overall number of microbial species decreased. The number of soil microbial OTUs at the different oil contamination concentrations followed the order C15 > C45 > C30, and the difference between C30 and C45 was small, indicating that the number of soil microbial species first decreased and then leveled off as the contamination level increased (Chen et al. 2020).
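The subsampling procedure behind such a curve can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name `rarefaction_curve`, the number of repetitions, and the toy reads are all assumptions introduced here.

```python
import random

def rarefaction_curve(otu_labels, depths, n_reps=10, seed=0):
    """Mean number of distinct OTUs observed at each subsampling depth.

    otu_labels: list with one OTU label per sequence in the sample.
    depths: subsample sizes, each no larger than len(otu_labels).
    """
    rng = random.Random(seed)
    curve = []
    for depth in depths:
        observed = []
        for _ in range(n_reps):
            # Draw sequences without replacement and count distinct OTUs.
            subsample = rng.sample(otu_labels, depth)
            observed.append(len(set(subsample)))
        curve.append((depth, sum(observed) / n_reps))
    return curve

# Hypothetical toy sample (not study data): 3 OTUs with uneven abundances.
toy_reads = ["otu_a"] * 50 + ["otu_b"] * 30 + ["otu_c"] * 20
curve = rarefaction_curve(toy_reads, depths=[10, 50, 100])
```

Plotting the mean OTU count against depth gives the curve; a plateau at the largest depths suggests that additional sequencing would recover few new OTUs.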
Microbial alpha (α)-diversity analysis for variability between groups
Alpha diversity (α-diversity) refers to the species diversity within a single community or habitat. A common approach is to calculate the Chao, Shannon, Simpson, and coverage indices from the OTU results (Liu et al. 2021). In this study, the Shannon and Simpson indices were selected to characterize the α-diversity of the soils under the different treatments, and an inter-group difference test of the indices was used to examine the differences between treatments. The results showed that the microbial α-diversity of the petroleum-contaminated soils (C15, C30, C45) was significantly lower than that of the control group without petroleum contamination, but the differences in soil microbial α-diversity among the contamination concentrations did not reach statistical significance (Fig. 3).
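For reference, the two indices can be computed from a vector of OTU counts as sketched below. The text does not state which variant of the Simpson index was used, so the form D = Σ p_i² (lower D means higher diversity) is an assumption here, as are the natural logarithm in the Shannon index and the function names.

```python
import math

def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)

def simpson_index(counts):
    """Simpson index D = sum(p_i^2); assumed variant, lower D = more diverse."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)

# Hypothetical counts (not study data): four equally abundant OTUs.
even_counts = [10, 10, 10, 10]
h = shannon_index(even_counts)   # equals ln(4) for four equally abundant OTUs
d = simpson_index(even_counts)   # equals 1/4 for four equally abundant OTUs
```

Under these definitions, a drop in diversity after contamination would appear as a lower Shannon value and a higher Simpson value.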
Microbial species composition analysis through Venn diagrams
Venn diagrams were obtained by comparing OTUs between samples or between groups, and they visualize the similarity and overlap in OTU composition among environmental samples (Wang et al. 2019). The results showed that the four treatments together produced a total of 14,090 OTUs, of which 1791 were shared by all treatments, accounting for 12.71% of the total OTUs (Fig. 4). The numbers of OTUs in the C15, C30, and C45 treatments were 3719, 3194, and 3345, respectively, all lower than that of CK (3832). The number of OTUs shared between contaminated and clean soils decreased as the oil contamination gradient increased. Among the contamination gradients, the highest overlap was found between the C15 and C45 treatments, indicating a high similarity of bacterial community structure. In addition, the different oil contamination treatments did not significantly affect the number of OTUs specific to each soil.
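The counts behind a Venn diagram reduce to set operations on the OTU identifiers of each group. The sketch below uses hypothetical OTU identifiers and an assumed helper name, `shared_and_unique`; only the per-treatment totals and the shared count in the final lines are taken from the text above.

```python
def shared_and_unique(otu_sets):
    """Return OTUs shared by all groups and OTUs unique to each group.

    otu_sets: dict mapping group name -> set of OTU identifiers.
    """
    shared = set.intersection(*otu_sets.values())
    unique = {}
    for name, otus in otu_sets.items():
        # Union of every other group's OTUs.
        others = set().union(*(s for n, s in otu_sets.items() if n != name))
        unique[name] = otus - others
    return shared, unique

# Hypothetical identifiers for illustration only.
toy = {
    "CK": {"otu1", "otu2", "otu3", "otu4"},
    "C15": {"otu1", "otu2", "otu5"},
    "C30": {"otu1", "otu6"},
}
shared, unique = shared_and_unique(toy)

# Arithmetic check on the figures reported in the text.
reported = {"CK": 3832, "C15": 3719, "C30": 3194, "C45": 3345}
total_otus = sum(reported.values())
shared_pct = round(1791 / total_otus * 100, 2)
```

The reported total of 14,090 is the sum of the four per-treatment OTU counts, and 1791 shared OTUs correspond to 12.71% of that sum.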
Microbial β-diversity analysis for variability between groups
To study the similarities and differences among the community structures of the samples, cluster analysis was performed on the sample community distance matrix to build a hierarchical clustering tree of the samples. This method effectively identifies the “major” elements and structures in the data, removes noise and redundancy, reduces the dimensionality of the original complex data, and reveals the simple structure hidden behind the complex data (Zhang et al. 2017).
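A sample distance matrix of the kind described can be built from per-sample OTU abundance vectors. The text does not name the distance measure, so Bray-Curtis dissimilarity is used here purely as an assumption; the function names and the toy abundances are likewise illustrative.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors (0 = identical)."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0

def distance_matrix(samples):
    """Pairwise dissimilarities, keyed by (sample_a, sample_b).

    samples: dict mapping sample name -> list of OTU abundances,
    with OTUs in the same order for every sample.
    """
    names = list(samples)
    return {(a, b): bray_curtis(samples[a], samples[b])
            for a in names for b in names}

# Hypothetical abundances (not study data).
toy_samples = {"s1": [10, 0, 5], "s2": [10, 0, 5], "s3": [0, 8, 0]}
dm = distance_matrix(toy_samples)
```

A matrix like `dm` is the input that hierarchical clustering or an ordination method would then operate on.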
In Fig. 5, the different marker shapes represent the control and the three soil samples with different concentrations of oil contamination. Principal component analysis revealed differences in the microbial community composition of CK, C15, C30, and C45. The microbial community composition differed among the contamination treatments, although not significantly, and differed significantly from CK. The PC1 and PC2 axes explained 20.03% and 9.23% of the variation, respectively.
Differential analysis of microbial community composition
The histogram presents the community composition and species abundance at different taxonomic levels (Zhu et al. 2020). In this study, community composition and species abundance were analyzed at the genus level. In the absence of oil contamination (CK), the dominant genera included Pantoea, Streptomyces, Alkanindiges, and Massilia. After oil contamination, the dominant genera of the microbial community changed significantly, with Pantoea being dominant. After D2 treatment, the abundance of Streptomyces increased, and the difference between C30 and C45 decreased (Fig. 6). In addition, the abundance of Nocardioides and Acinetobacter increased with increasing oil contamination concentration, while that of Thiothrix, Sphingomonas, and Gemmatimonas decreased significantly.