
Co-fitness analysis identifies a diversity of signal proteins involved in the utilization of specific c-type cytochromes

Abstract

Purpose

c-Type cytochromes are essential for extracellular electron transfer (EET) in electroactive microorganisms. The expression of appropriate c-type cytochromes is an important feature of these microorganisms in response to different extracellular electron acceptors. However, how these diverse c-type cytochromes are tightly regulated is still poorly understood.

Methods

In this study, we used genome-wide co-fitness analysis to identify the high co-fitness genes that potentially work with different c-type cytochromes. We also constructed and studied the co-fitness networks composed of the c-type cytochromes and their top 20 high co-fitness genes.

Results

We found that high co-fitness genes of c-type cytochromes were enriched in signal transduction processes in Shewanella oneidensis MR-1 cells. We then checked the top 20 co-fitness proteins for each of the 41 c-type cytochromes and identified the corresponding signal proteins for different c-type cytochromes. In particular, through the analysis of the high co-fitness signal protein for CymA, we further confirmed the cooperation between signal proteins and c-type cytochromes and identified a novel signal protein that is putatively involved in the regulation of CymA. In addition, we showed that these signal proteins form two signal transduction modules.

Conclusion

Taken together, these findings provide novel insights into the coordinated utilization of different c-type cytochromes under diverse conditions.

Introduction

The respiratory diversity of electroactive microorganisms such as Shewanella or Geobacter has been widely studied, and this diverse respiratory capability is mainly due to the abundant c-type cytochromes of these species (Lovley 2012; Logan et al. 2019). Generally, these microorganisms respond to different (extracellular) electron acceptors by expressing different c-type cytochromes (Shi et al. 2016; Ishii et al. 2018). For example, Shewanella oneidensis MR-1 can use MtrCAB-OmcA to reduce extracellular iron/manganese oxides and DmsEFAB to reduce dimethyl sulfoxide (DMSO) (Gralnick et al. 2006; Coursolle and Gralnick 2010); it can also reduce nitrite, nitrate, and fumarate by using NrfA (Gao et al. 2009), NapAB (Simpson et al. 2010), and FccA/IfcA (Maier et al. 2003), respectively. Furthermore, the inner membrane c-type cytochrome CymA is critical for many of these (extracellular) respiratory processes (Myers and Myers 1997). Nonetheless, how these microorganisms coordinate the differential expression of the various c-type cytochromes, as well as their cooperation (e.g., with CymA), is not well understood.

With the development of transposon sequencing (TnSeq) technology, researchers can quantitatively analyze fitness profiles for thousands of mutants in bacteria, establish a direct relationship between genes and cell phenotypes, and obtain new clues for gene function inference and regulatory relationship verification (Wetmore et al. 2015; Cain et al. 2020). For example, genome-scale co-fitness analysis has been used to reveal a functional connection between HsbR and RpoE in Pseudomonas stutzeri RCH2; that is, HsbR acts as an anti-sigma factor for the sigma factor RpoE (Vaccaro et al. 2015). Co-fitness analysis also helped to identify BT3761 as an N-acetylglutamate synthase that is required for arginine biosynthesis in the gut microorganism Bacteroides thetaiotaomicron (Liu et al. 2019). Recently, Price et al. provided genome-scale fitness data for Shewanella under 176 experimental conditions (Price et al. 2018), which not only allow us to examine the functional importance of c-type cytochromes in different environments (Ding et al. 2021) but also make it possible to identify functionally related genes and, in particular, to explore the coordinated expression of c-type cytochromes and related genes.

Therefore, in this paper we explored the potential triggering conditions of c-type cytochromes by genome-scale co-fitness analysis. First, we found that high co-fitness genes of c-type cytochromes were enriched in signal transduction processes in Shewanella cells. Then, with an emphasis on the inner membrane c-type cytochrome CymA, we identified a diversity of signal proteins that are involved in the utilization of different c-type cytochromes. Finally, co-fitness protein network analysis showed that these signal proteins form two signal transduction modules. In summary, these findings provide novel insights into the coordinated expression of different c-type cytochromes under diverse conditions.

Materials and methods

c-Type cytochrome

c-Type cytochromes are the main electron transfer proteins in Shewanella. In general, they can covalently bind heme through two cysteine (c) residues, and the sequence feature of the heme binding site is the CXXCH motif. Meyer et al. identified 42 candidate c-type cytochrome genes in Shewanella oneidensis MR-1 through pattern matching (Meyer et al. 2004), and the follow-up reports (Jin et al. 2013; Ding et al. 2016) confirmed 41 c-type cytochrome genes in this species (Supplementary Table 1).

Co-fitness data

Genome-scale transposon sequencing for Shewanella oneidensis MR-1 under 176 different conditions has been performed recently, and the resulting genome-wide fitness data can be obtained from Fitness Browser (https://fit.genomics.lbl.gov/) (Price et al. 2018).

Here, a fitness value of a gene in a given experiment is defined as the log2 change in abundance of the corresponding gene mutant, and the co-fitness value of two genes is the Pearson correlation of all fitness values for the two genes across all experimental conditions (Wetmore et al. 2015; Cain et al. 2020).
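The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names and fitness values are hypothetical toy data, the helper names `pearson` and `top_cofit` are ours, and the actual analysis used the values provided by the Fitness Browser.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long lists of fitness values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def top_cofit(fitness, query, k=20):
    """Return the k genes with the highest co-fitness to `query` (query excluded)."""
    scores = [(gene, pearson(fitness[query], values))
              for gene, values in fitness.items() if gene != query]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:k]

# Hypothetical toy data: one log2 fitness value per experiment for each gene.
fitness = {
    "gene_A": [0.1, -1.2, -0.3, 0.8, -2.0],
    "gene_B": [0.2, -1.0, -0.4, 0.9, -1.8],   # tracks gene_A closely
    "gene_C": [-0.5, 0.7, 0.1, -0.9, 1.5],    # anti-correlated with gene_A
    "gene_D": [0.0, 0.3, -0.2, 0.1, 0.0],     # nearly uncorrelated
}

top = top_cofit(fitness, "gene_A", k=2)
for gene, score in top:
    print(gene, round(score, 3))
```

In the study itself, the same ranking would be taken over all 176 conditions, with k = 20 (and 15, 10, or 5 for the robustness checks).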

Enrichment analysis

The functional annotation tool DAVID (https://david.ncifcrf.gov/) was employed to perform GO molecular function enrichment and KEGG pathway enrichment (Huang et al. 2009). The p-value is adjusted for multiple testing using the false discovery rate controlling procedure from Benjamini and Hochberg (1995), and the cutoff for the p-value is routinely set to 0.05.
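For readers unfamiliar with the correction, the Benjamini-Hochberg adjustment can be sketched as follows. This is a minimal illustration rather than DAVID's implementation; the function name and the raw p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for five enrichment terms.
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

Note that two of the toy terms have raw p-values below 0.05 but no longer pass the cutoff after adjustment, which is the purpose of the correction.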

Signal proteins

The microbial signal transduction (MiST; https://mistdb.com/) database was used to obtain all signal proteins in Shewanella oneidensis MR-1. This database was established as a comprehensive signal transduction classification system, which used more than 300 signaling domains from Pfam, Agfam, and ECF to identify and classify signal proteins (Gumerov et al. 2020).

Protein structure

Since protein function is mainly determined by its structure, we employed the SWISS-MODEL (https://swissmodel.expasy.org/) server to predict protein structure (Bienert et al. 2017). The resulting models were evaluated by using the qualitative model energy analysis (QMEAN) z score, which indicates whether a model is comparable to what one would expect from experimental structures of similar size and is usually used as a global evaluation measurement (Waterhouse et al. 2018).

Protein interaction network

The protein interaction information was obtained from the STRING database (http://string-db.org/) (Szklarczyk et al. 2015; Szklarczyk et al. 2019) and analyzed by using the igraph package (Csardi and Nepusz 2006). Note that each interaction in STRING is annotated with multiple confidence scores according to different types of evidence, and a combined score (0 to 1000) is computed by combining all of this evidence. To evaluate the effects of different confidence scores, we used 400, 500, 600, 700, 800, and 900 as the filtering thresholds for the combined scores.
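The thresholding step can be illustrated with a short sketch. The edge list below is hypothetical, and keeping edges whose combined score is greater than or equal to the threshold is our assumption; the text does not state whether the comparison is inclusive.

```python
def filter_edges(edges, threshold):
    """Keep interactions whose combined score reaches the threshold (assumed inclusive)."""
    return [(a, b) for a, b, score in edges if score >= threshold]

# Hypothetical interactions: (protein, protein, STRING combined score).
edges = [
    ("P1", "P2", 950),
    ("P2", "P3", 720),
    ("P3", "P4", 450),
    ("P1", "P4", 380),
]

thresholds = [400, 500, 600, 700, 800, 900]
counts = {t: len(filter_edges(edges, t)) for t in thresholds}
print(counts)
```

Each threshold yields its own network, so downstream statistics such as modularity can be compared across the six filtered networks.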

Community structure

Analysis of communities (or modules) is widely used to uncover the biological function units that underlie biological processes of interest (Saelens et al. 2018). Generally, such communities can be identified by maximizing the modularity function introduced by Newman (Newman 2006). For a presumptive partition of a network into several communities, the modularity M of this partition is defined as follows:

$$M\equiv \sum \limits_{s=1}^r\left[\frac{l_s}{L}-{\left(\frac{d_s}{2L}\right)}^2\right]$$
(1)

where r is the number of communities, l_s is the number of edges between nodes within community s, d_s is the sum of the degrees of the nodes in community s, and L is the total number of edges in the network.
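As a worked illustration of Eq. (1), the sketch below computes M for a given partition of a small undirected network. The toy graph (two triangles joined by one edge) and the function name are hypothetical; the study itself relied on igraph.

```python
from collections import defaultdict

def modularity(edges, membership):
    """Modularity M of a partition, following Eq. (1).

    edges: list of undirected (node, node) pairs
    membership: dict mapping each node to its community label
    """
    L = len(edges)                      # total number of edges
    within = defaultdict(int)           # l_s: edges inside community s
    degree_sum = defaultdict(int)       # d_s: summed node degrees in community s
    for a, b in edges:
        degree_sum[membership[a]] += 1
        degree_sum[membership[b]] += 1
        if membership[a] == membership[b]:
            within[membership[a]] += 1
    return sum(within[s] / L - (degree_sum[s] / (2 * L)) ** 2 for s in degree_sum)

# Toy network: two triangles connected by a single bridge edge.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_communities = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
one_community = {node: "a" for node in range(1, 7)}

m_two = modularity(edges, two_communities)
m_one = modularity(edges, one_community)
print(round(m_two, 4), round(m_one, 4))
```

Here each triangle has l_s = 3 and d_s = 7 with L = 7, so M = 2 × (3/7 − (7/14)^2) ≈ 0.357, whereas placing all nodes in a single community gives M = 0.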

We employed four methods (edge betweenness, fast greedy, infomap, and propagating labels) to detect the communities and then compared the corresponding modularity values. The igraph package was also used in this process (Csardi and Nepusz 2006).

Results and discussion

High co-fitness genes of c-type cytochromes are enriched in signal transduction

Since mutant fitness data measure the importance of each gene across many conditions, gene pairs with high co-fitness can reflect the functional relevance between the two genes (Wetmore et al. 2015; Cain et al. 2020). To explore the cooperation between c-type cytochromes and other genes, we first obtained the top 20 genes with the highest co-fitness values to each c-type cytochrome. For robustness purposes, we also considered the top 15, top 10, and top 5 genes with the highest co-fitness values to each c-type cytochrome.

Then, we performed molecular function (Fig. 1A) and KEGG pathway (Fig. 1B) enrichment analyses for these genes. As shown in Fig. 1, the most commonly enriched molecular functions included heme binding (GO:0020037), electron carrier activity (GO:0009055), cytochrome-c oxidase activity (GO:0004129), and iron ion binding (GO:0005506), which were present in all four cases. These electron transfer-related enrichments are mainly due to the cooperation between c-type cytochromes and other cytochromes; as a result, the co-fitness gene list contains a large number of c-type cytochromes. There were also several enriched molecular functions present in three cases, including receptor activity (GO:0004872), signal transducer activity (GO:0004871), sulfate transmembrane-transporting ATPase activity (GO:0015419), succinate dehydrogenase activity (GO:0000104), and disulfide oxidoreductase activity (GO:0015036). The enrichment of genes in receptor activity and signal transducer activity clearly shows that there are many genes related to signal transduction in the co-fitness gene list.

Fig. 1 Enrichments for the co-fitness genes for all 41 c-type cytochromes. A Molecular function enrichment and B KEGG pathway enrichment. Note: the corrected p-value < 0.05 (Benjamini correction)

The enrichment of signal transduction proteins was further confirmed by the enrichment of two-component system (son02020) and bacterial chemotaxis (son02030) in KEGG pathway enrichment analysis. The other KEGG pathway enrichments mainly included oxidative phosphorylation (son00190) and ABC transporters (son02010), as well as butanoate metabolism (son00650) and sulfur metabolism (son00920). This is consistent with the molecular function enrichment mentioned above, which mainly includes biological pathways related to energy or oxidoreductase activity.

Therefore, both molecular function and KEGG pathway analyses suggest the enrichment of signal transduction proteins in the co-fitness genes of c-type cytochromes. Although the complex regulatory mechanisms and the many signal proteins involved in the EET process of Shewanella have been widely discussed (Fredrickson et al. 2008; Rodionov et al. 2011; Sundararajan et al. 2011; Ding et al. 2020), a systematic study of the signal proteins related to all c-type cytochromes in Shewanella has not been reported. These results can further help to explain why a diverse set of c-type cytochromes is responsible for the diversity of Shewanella respiration, as well as how these c-type cytochromes are appropriately triggered under different conditions. The results also show that the number of selected genes (i.e., 5, 10, 15, or 20) has no significant effect on the enrichment results. To obtain more comprehensive information, we chose the top 20 genes with high co-fitness values for each c-type cytochrome and further studied the cooperative relationship of these high co-fitness genes in the following sections.

Most c-type cytochromes are linked to diversified signal proteins

To examine how Shewanella uses these signal proteins to deal with various electron transfer processes, we checked the top 20 co-fitness proteins for each of the 41 c-type cytochromes and identified the corresponding signal proteins among them. As a result, we found that most c-type cytochromes (32 of 41) were related to at least one signal protein (Supplementary Table 1), which raises the hypothesis that specific signal proteins are involved in the utilization of c-type cytochromes.
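The screening step described above amounts to intersecting each cytochrome's top co-fitness list with the set of annotated signal proteins. A minimal sketch follows; all identifiers are hypothetical placeholders, whereas in the study the lists came from the Fitness Browser and the MiST database.

```python
def linked_signal_proteins(top_partners, signal_proteins):
    """Map each cytochrome to the signal proteins among its top co-fitness partners."""
    signal = set(signal_proteins)
    return {cyt: [gene for gene in partners if gene in signal]
            for cyt, partners in top_partners.items()}

# Hypothetical top co-fitness partners, listed in rank order.
top_partners = {
    "cyt_1": ["sig_A", "gene_x", "sig_B"],
    "cyt_2": ["gene_y", "gene_z"],
    "cyt_3": ["sig_B", "gene_x"],
}
# Hypothetical set of annotated signal proteins.
annotated_signal_proteins = ["sig_A", "sig_B", "sig_C"]

links = linked_signal_proteins(top_partners, annotated_signal_proteins)
n_linked = sum(1 for hits in links.values() if hits)
print(f"{n_linked} of {len(links)} cytochromes have at least one signal protein partner")
```

Because the partner lists are kept in rank order, the signal proteins returned for each cytochrome stay sorted by co-fitness rank, as in Supplementary Table 1.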

To examine this hypothesis, we investigated the function of signal proteins that are associated with CymA (SO_4591), which is the most thoroughly studied c-type cytochrome involved in the EET process of Shewanella (Myers and Myers 1997). This tetraheme c-type cytochrome serves as an entry point for electrons and is commonly used in several electron transfer systems in Shewanella, e.g., the MtrCAB pathway for iron and manganese oxides reduction and the DMSO pathway for dimethyl sulfoxide reduction (Gralnick et al. 2006; Coursolle and Gralnick 2010).

The signal proteins associated with CymA are as follows: SO_0141, SO_0437, SO_1385, SO_2240, SO_4454, and SO_4557. At first glance, most of them are chemotaxis proteins, which is consistent with several recent reports. For example, Tai and collaborators found links between chemotaxis proteins and the classical MtrCAB electron transfer pathway that starts from the inner membrane CymA (Tai et al. 2010). Harris et al. also showed that cell congregation in response to minerals requires both chemotaxis proteins and extracellular electron transfer cytochromes in S. oneidensis MR-1 (Harris et al. 2018).

Therefore, the high co-fitness between CymA and these six signal proteins is consistent with the fact that CymA is necessary for the reduction of many anaerobic electron acceptors and that specific signal proteins (chemotaxis proteins here) are needed to participate in these processes. First, SO_2240 is a cache domain-containing methyl-accepting chemotaxis protein (MCP), and SO_1385 is a PAS (Per/Arnt/Sim) domain-containing MCP. It has been shown that deleting these MCPs or the critical EET cytochrome CymA strongly impairs the ability of Shewanella to congregate in the vicinity of insoluble electron acceptors (Harris et al. 2010; Harris et al. 2012). More specifically, the SO_2240 or cymA mutant was nonmotile around MnO2, Fe(OH)3, or poised electrodes, whereas the SO_1385 mutant exhibited wild-type levels of motility and reversals around MnO2 but irregular behavior around Fe(OH)3 or poised electrodes (Harris et al. 2012). Second, SO_2240 has also been shown to be the major MCP, and SO_4454 a minor MCP, involved in energy taxis in Shewanella. Meanwhile, the major MCP SO_2240 is necessary for the responses to a number of anaerobic electron acceptors (Baraquet et al. 2009). Third, SO_0141 is a nitrate/nitrite-responsive bifunctional diguanylate cyclase/phosphodiesterase with a PAS sensory domain, which might be used when nitrate or nitrite serves as the electron acceptor, while SO_0437 has been identified as the c-di-GMP-hydrolyzing enzyme PdeB, which was linked to the regulation of sulfate uptake and assimilation in S. oneidensis MR-1 (Chao et al. 2013).

Overall, five of the six signal proteins have known links to (or are closely related to) the electron transfer of Shewanella. Among them, the major MCP SO_2240 is related to the sensing of many kinds of extracellular electron acceptors, and the others seem to have strong specificity.

On the other hand, the signal protein SO_4557 has not been reported previously, so we predicted a 3D structure for this protein. A total of 1357 templates were found to match the SO_4557 sequence by using the SWISS-MODEL server. We retained the top 5 models and ranked them by the qualitative model energy analysis (QMEAN) z score (Table 1). Here, QMEAN is a composite estimator that uses several different geometrical properties and has been shown to provide both global and local quality estimates for the predicted model (Bienert et al. 2017; Waterhouse et al. 2018). The QMEAN z score is usually used as a global estimation measurement, and a z score of approximately zero indicates good agreement between the model structure and experimental structures of similar size. Therefore, among the final predicted models, we chose the model with the highest QMEAN z score (−0.12). The local quality estimate of this model (Fig. 2A) also suggested the high reliability of the predicted structure, as most residues showed a per-residue score > 0.6, which is the threshold used to distinguish high from low quality for local model evaluation in SWISS-MODEL. Figure 2B further shows the comparison of the model quality scores of individual models to the scores obtained from experimental structures of similar size.

Table 1 The top 5 predicted models by using the SWISS-MODEL server

Fig. 2 Further evaluation of the selected model. A Local quality estimate of the predicted model; the figure shows each residue of the model (x-axis) and the expected similarity to the native structure (y-axis). B Comparison of model quality scores of individual models to the scores obtained from experimental structures of similar size; the x-axis is protein length, the y-axis is the normalized QMEAN score, and every dot represents one experimental structure

In summary, the best predicted model is based on the CheA kinase of Escherichia coli (PDB code: 3ja6); in fact, most (4 of 5) of the filtered models match this CheA kinase (Table 1). This Escherichia coli CheA kinase has been shown to handle multifunctional chemotaxis signaling through conformational changes (Cassidy et al. 2015). Therefore, given the high confidence of the QMEAN estimates (both global and local, as mentioned above), it is reasonable to speculate that the signal protein SO_4557 also plays a multifunctional role in sensing many kinds of extracellular electron acceptors in S. oneidensis MR-1, just as the major MCP SO_2240 does.

Co-fitness protein network analysis reveals two signal transduction modules

Since proteins need to interact with each other to perform their functions, the protein-protein interaction (PPI) network can provide insights into the organization and function of biological systems (Typas and Sourjik 2015). We first obtained the background PPI networks of Shewanella from the STRING database. Note: for robustness purposes, we considered multiple STRING confidence scores, which range from 400 (medium confidence) to 900 (very high confidence). Then, we extracted the protein interaction information for all 41 c-type cytochromes and their top 20 high co-fitness genes to construct the co-fitness protein networks.

Community structure analysis is widely applied to PPI networks, and the resulting communities (or network modules) usually contain groups of proteins that are functionally coordinated or perform special biological processes, such as protein degradation and signal transduction (Lin et al. 2015; Saelens et al. 2018). Therefore, the identified communities can facilitate understanding of the proteins within the communities from a phenotype perspective. The basic principle of identifying communities is that connections within communities are relatively dense, while connections between them are relatively sparse. The classical modularity function M is an important metric for community detection. We therefore employed four frequently used methods (edge betweenness, fast greedy, infomap, and propagating labels) to study the communities in the co-fitness protein networks and compared their modularity. The results show that the modularity scores for the network with a STRING confidence score of 700 are markedly higher (Fig. 3). As high modularity reflects dense connections within communities and sparse connections across them, we chose the result with the largest modularity value and the corresponding co-fitness protein network. As a result, we obtained 11 communities from this network (Supplementary file 1). We then performed GO molecular function enrichment analysis for the proteins in these communities (Table 2).

Fig. 3 Comparison of the modularity of the four community structure detection algorithms: edge betweenness, fast greedy, infomap, and propagating labels

Table 2 The communities identified from the co-fitness protein network in this study and the corresponding GO molecular functions

As shown in Table 2, only two communities presented no statistically significant results (communities 6 and 11). The electron transfer-related enrichment terms (e.g., heme binding, iron ion binding, and electron carrier activity) in communities 1, 8, and 9 are mainly due to the co-fitness gene list containing many c-type cytochromes. The enrichment of flavin adenine dinucleotide binding in community 2 and FMN binding in community 7 is consistent with experimental reports: (1) self-secreted flavins such as flavin mononucleotide (FMN) can serve as redox mediators to facilitate indirect electron transfer between c-type cytochromes and extracellular electron acceptors (Marsili et al. 2008; Glasser et al. 2017), (2) the flavins can also act as cofactors that bind to outer-membrane c-type cytochromes and then help to transfer electrons through direct contact of these flavin-cytochrome complexes with extracellular electron acceptors (Okamoto et al. 2013), and (3) Shewanella uses a distinct flavin transporter that can provide the noncovalently bound flavin adenine dinucleotide cofactor to mediate electron transfer (Kees et al. 2019; Light et al. 2019). The remaining four communities (especially communities 4 and 5) show that the co-fitness gene list not only contains a large number of signal proteins (see the section “High co-fitness genes of c-type cytochromes are enriched in signal transduction”) but also forms functional modules that are used for signal transduction.

As shown in Fig. 4, there are 17 signal proteins in module 4 (SO_1144, SO_1385, SO_1434, SO_1989, SO_2119, SO_2120, SO_2123, SO_2125, SO_2240, SO_2323, SO_2327, SO_3203, SO_3209, SO_3252, SO_4454, SO_4466, and SO_4557), which are mainly chemotaxis signal transduction system proteins; the remaining 12 proteins in this module are mostly flagellum-associated proteins (10 proteins, involved in flagellar biosynthesis, filament assembly, the flagellar motor, and the flagellar hook). This is consistent with the fact that chemotaxis is closely related to the EET process (Tai et al. 2010; Harris et al. 2018), as well as the fact that flagella can function as an environmental sensor (Kuhn et al. 2018). All 16 proteins in module 5 are signal proteins (SO_0141, SO_0437, SO_1500, SO_1558, SO_1946, SO_2366, SO_2538, SO_2543, SO_2544, SO_3305, SO_3306, SO_3337, SO_3556, SO_3700, SO_3988, and SO_4445), which contain five one-component proteins and ten two-component proteins (response regulator, histidine kinase, hybrid histidine kinase, etc.). Further examination showed that there were 39 signal proteins in this 203-node co-fitness network, and the proportion of signal proteins (~19.2%; 39/203) was about twice that in the Shewanella genome (~9.8%). These results are consistent with our previous studies on electron transfer pathways using transcriptional regulation modules (TRMs); that is, Shewanella needs a large number of signal transduction proteins to deal with its most important electron transfer process (Ding et al. 2020).
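The size of this difference can be checked with simple arithmetic. The sketch below uses the reported counts (39 signal proteins among 203 nodes) and the rounded genome-wide fraction of 9.8%; the one-sided binomial test is our own illustrative choice and was not part of the reported analysis. A hypergeometric test with exact genome counts would be more appropriate if those counts were available.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_nodes = 203            # nodes in the co-fitness network (from the text)
n_signal = 39            # signal proteins among those nodes (from the text)
genome_fraction = 0.098  # rounded genome-wide fraction (from the text)

observed = n_signal / n_nodes
fold = observed / genome_fraction
p_value = binomial_upper_tail(n_signal, n_nodes, genome_fraction)

print(f"observed fraction: {observed:.3f}")
print(f"fold over genome:  {fold:.2f}")
print(f"binomial P(X >= {n_signal}): {p_value:.2e}")
```

Under these assumptions, the network contains roughly twice the expected share of signal proteins, and such an excess would be unlikely to arise from random sampling at the genome-wide rate.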

Fig. 4 Classification of signal transduction proteins in modules 4 and 5. OCP, one-component proteins; TCP, two-component proteins; RR, response regulator; HRR, hybrid response regulator; HK, histidine kinase; HHK, hybrid histidine kinase

Conclusion

In this paper, we used genome-wide co-fitness analysis to identify the high co-fitness signal proteins that potentially work with different c-type cytochromes in Shewanella. Further co-fitness protein network analysis showed that these signal proteins form two signal transduction modules. Taken together, the present results not only help us to understand how these c-type cytochromes are properly triggered but also can be used to explore the coordinated utilization of different c-type cytochromes under diverse conditions in Shewanella cells. For example, our results suggest that the signal protein SO_4557 could work with the critical c-type cytochrome CymA and may be able to sense a variety of extracellular electron acceptors. Further experimental investigation is needed to elucidate this possible role of SO_4557.

Availability of data and materials

All data generated and analyzed during this study are included in this article.

Acknowledgements

We would like to thank the anonymous reviewers for their valuable comments on this study. The work was supported by the Natural Science Foundation of China (62161050) and the Science and Technology Research Project of Jiangxi Education Department (GJJ201605).

Funding

The work was supported by the Natural Science Foundation of China (62161050) and the Science and Technology Research Project of Jiangxi Education Department (GJJ201605).

Author information

Contributions

DDW, conceptualization, methodology, writing—original draft, and funding acquisition; HWF, LLL: methodology, data analysis, and writing—review and editing; WP, funding acquisition and writing—review and editing. The authors read and approved the final manuscript.

Corresponding author

Correspondence to De-wu Ding.

Ethics declarations

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Supplementary Table 1.

Co-fitness signal proteins for c-type cytochromes. Note: these co-fitness signal proteins are listed according to their co-fitness rank.

13213_2022_1694_MOESM2_ESM.zip

Additional file 2.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Ding, Dw., Huang, Wf., Lei, Ll. et al. Co-fitness analysis identifies a diversity of signal proteins involved in the utilization of specific c-type cytochromes. Ann Microbiol 72, 38 (2022). https://doi.org/10.1186/s13213-022-01694-4

  • DOI: https://doi.org/10.1186/s13213-022-01694-4

Keywords

  • Co-fitness analysis
  • c-Type cytochrome
  • Extracellular electron transfer
  • Signal protein