PSI Structural Biology Knowledgebase

Families in Gene Neighborhoods

SBKB [doi:10.1038/sbkb.2015.20]
Technical Highlight - June 2015
Short description: A bioinformatics strategy takes advantage of the proximal organization of genes encoding proteins involved in metabolic pathways to predict protein function.


Sequence similarity networks for the proline racemase superfamily, displayed for genes with 35% sequence identity. The identified clusters are color coded. Figure from reference 1.

As sequencing data accumulate, effective approaches are needed to decipher functions of the enzymes encoded within those genomes. For organisms such as eubacteria and archaea, genes encoding enzymes and other proteins involved in the same metabolic pathway often cluster together in operons. Taking advantage of the localization in such clusters or gene neighborhoods, the groups of Jacobson, Gerlt and Almo (PSI NYSGRC) developed a new bioinformatics approach to predict in vitro activities of the encoded proteins as well as their metabolic functions in cells.

Using this strategy, termed genome neighborhood networks (GNNs), they analyzed 2,333 unique sequences encoding proteins in the proline racemase superfamily. The authors constructed a sequence similarity network in which the threshold can be varied to correspond to distinct sequence identity levels; in this study, 35% and 60% cutoffs were used. Querying all sequences simultaneously amplifies the signal from genes for functionally related proteins; importantly, if genes for unrelated proteins occur within these neighborhoods in some species, those signals are eliminated as noise. For this reason, the authors suggest that this large-scale, aggregate approach identifies proteins involved in metabolic pathways more efficiently than single-genome analyses. The GNN approach predicted function for >85% of the proteins, which the authors verified by measuring in vitro enzyme activity, assaying phenotypes, and using transcriptomics and X-ray crystallography.
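The two ideas in this paragraph, thresholding a sequence similarity network and aggregating neighborhoods across many sequences, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy identity values, and the neighbor family labels are all hypothetical, and pairwise identities and neighborhood annotations are assumed to have been computed elsewhere.

```python
from collections import Counter, defaultdict


def ssn_clusters(identities, threshold):
    """Group sequences into connected components of a similarity network,
    keeping only edges whose percent identity is >= threshold."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), pct in identities.items():
        ra, rb = find(a), find(b)  # registers both nodes even if no edge is kept
        if pct >= threshold:
            parent[ra] = rb

    groups = defaultdict(set)
    for node in parent:
        groups[find(node)].add(node)
    return sorted(sorted(g) for g in groups.values())


def neighbor_family_frequency(neighborhoods, cluster):
    """Fraction of cluster members whose genome neighborhood contains each
    protein family. Families seen near only a few members score low."""
    counts = Counter()
    for seq in cluster:
        counts.update(set(neighborhoods.get(seq, ())))
    return {family: n / len(cluster) for family, n in counts.items()}


# Hypothetical toy data: pairwise percent identities between five sequences.
identities = {
    ("A", "B"): 72.0,
    ("B", "C"): 41.0,
    ("C", "D"): 38.0,
    ("A", "D"): 20.0,
    ("D", "E"): 30.0,
}

# Hypothetical family labels for genes found near each query sequence.
neighborhoods = {
    "A": ["dehydrogenase", "transporter"],
    "B": ["dehydrogenase"],
    "C": ["dehydrogenase", "kinase"],
    "D": ["dehydrogenase"],
}

clusters_35 = ssn_clusters(identities, 35.0)
clusters_60 = ssn_clusters(identities, 60.0)

frequencies = neighbor_family_frequency(neighborhoods, clusters_35[0])
# Keep only families shared by at least half of the cluster (arbitrary cutoff).
consistent = {f: v for f, v in frequencies.items() if v >= 0.5}

print(clusters_35)
print(clusters_60)
print(consistent)
```

In this toy example the looser 35% cutoff merges four sequences into one cluster, while the 60% cutoff splits them apart. A family found near every member of the cluster survives the frequency filter, whereas families seen near a single member are dropped, which mirrors how aggregate analysis suppresses neighbors that appear in only some species.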

For more complex superfamilies, information from multiple sources will need to be integrated. For example, when bacterial genes are located in polycistronic transcriptional units, that information can be combined to identify pathways and predict enzyme function.

Irene Kaganman

References

  1. S. Zhao et al. Prediction and characterization of enzymatic activities guided by sequence similarity and genome neighborhood networks.
    eLife. 3 (2014). doi:10.7554/eLife.03275

Structural Biology Knowledgebase ISSN: 1758-1338
Funded by a grant from the National Institute of General Medical Sciences of the National Institutes of Health