PSI Structural Biology Knowledgebase


Related Articles
Drug Discovery: Solving the Structure of an Anti-hypertension Drug Target
July 2015
Retrospective: 7,000 Structures Closer to Understanding Biology
July 2015
Design and Evolution: Bespoke Design of Repeat Proteins
June 2015
Design and Evolution: Molecular Sleuthing Reveals Drug Selectivity
June 2015
Design and Evolution: Tunable Antibody Binders
June 2015
Design and Evolution: Unveiling Translocator Proteins
June 2015
Evolution of Photoconversion
June 2015
Families in Gene Neighborhoods
June 2015
Protein Folding and Misfolding: A TRiC-ster that Follows the Rules
March 2015
Protein Folding and Misfolding: Beneficial Aggregation
March 2015
Peptidyl-carrier Proteins
October 2014
Predicting Protein Crystal Candidates
October 2014
Protein and Peptide Synthesis: Coming Full Circle
October 2014
Protein and Peptide Synthesis: Sensing Energy Balance
October 2014
Mining Protein Dynamics
May 2014
Novel Proteins and Networks: Assigning Function
May 2014
Novel Proteins and Networks: Polysaccharide Metabolism in the Human Gut
May 2014
Design and Discovery: Evolutionary Dynamics
January 2014
Design and Discovery: Identifying New Enzymes and Metabolic Pathways
January 2014
Design and Discovery: Virtual Drug Screening
January 2014
Caught in the Act
December 2013
Microbiome: Insights into Secondary Bile Acid Synthesis
September 2013
Microbiome: Structures from Lactic Acid Bacteria
September 2013
The Immune System: A Brotherhood of Immunoglobulins
June 2013
The Immune System: Super Cytokines
June 2013
Design and Discovery: A Cocktail for Proteins Without ID
February 2013
Design and Discovery: Enzyme Reprogramming
February 2013
Design and Discovery: Extreme Red Shift
February 2013
Design and Discovery: Flexible Backbone Protein Redesign
February 2013
Designer Proteins
February 2013
Membrane Proteome: Sphingolipid Synthesis Selectivity
December 2012
Symmetry from Asymmetry
October 2012
Serum albumin diversity
August 2012
Pocket changes
July 2012
Predictive protein origami
July 2012
Targeting Enzyme Function with Structural Genomics
July 2012
Finding function for enolases
June 2012
Substrate specificity sleuths
April 2012
Disordered Proteins
February 2012
Metal mates
February 2012
Making invisible proteins visible
October 2011
Alpha/Beta Barrels
October 2010
Deducing function from small structural clues
February 2010
Extremely salty
February 2010
Membrane proteins spotted in their native habitat
January 2010
How does Dali work?
December 2009
Secretagogin
December 2009
Designing activity
September 2008

Research Themes: Protein design

Families in Gene Neighborhoods

SBKB [doi:10.1038/sbkb.2015.20]
Technical Highlight - June 2015
Short description: A bioinformatics strategy takes advantage of the proximal organization of genes encoding proteins involved in metabolic pathways to predict protein function.


Figure: Sequence similarity networks for the proline racemase superfamily, displayed for genes with 35% sequence identity. The identified clusters are color coded. Figure from reference 1.

As sequencing data accumulate, effective approaches are needed to decipher functions of the enzymes encoded within those genomes. For organisms such as eubacteria and archaea, genes encoding enzymes and other proteins involved in the same metabolic pathway often cluster together in operons. Taking advantage of the localization in such clusters or gene neighborhoods, the groups of Jacobson, Gerlt and Almo (PSI NYSGRC) developed a new bioinformatics approach to predict in vitro activities of the encoded proteins as well as their metabolic functions in cells.

Using this strategy, termed genome neighborhood networks (GNNs), they analyzed 2,333 unique sequences encoding proteins in the proline racemase superfamily. The authors constructed a sequence similarity network in which thresholds can be set that correspond to distinct sequence identity levels; in this study, 35% and 60% cutoffs were used. Querying all sequences simultaneously amplifies the signal from genes for functionally related proteins; importantly, if genes for unrelated proteins occur within these neighborhoods in some species, those signals are eliminated as noise. For this reason, the authors suggest that this large-scale, aggregate approach identifies proteins involved in metabolic pathways more efficiently than single-genome analyses. The GNN approach predicted function for >85% of the proteins, which the authors verified by measuring in vitro enzyme activity, assaying phenotypes, and using transcriptomics and X-ray crystallography.
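The two ideas in this paragraph, clustering sequences at an identity threshold and then pooling genome neighbors across each cluster, can be illustrated with a toy Python sketch. This is not the authors' pipeline. The sequence labels, the neighbor family names, the pairwise identities and the 50% co-occurrence cutoff are all hypothetical values chosen for illustration. Only the 35% and 60% identity thresholds come from the text.

```python
from collections import Counter, defaultdict


def ssn_clusters(identities, threshold):
    """Return clusters (connected components) of a sequence similarity network.

    identities: dict mapping (seq_a, seq_b) -> fractional identity (0..1).
    An edge is kept only when identity >= threshold.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving; registers unseen nodes.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), ident in identities.items():
        root_a, root_b = find(a), find(b)  # registers both sequences
        if ident >= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for seq in list(parent):
        groups[find(seq)].add(seq)
    return sorted(sorted(members) for members in groups.values())


def neighbor_frequencies(cluster, neighbors):
    """Fraction of cluster members whose genome neighborhood contains each family."""
    counts = Counter()
    for seq in cluster:
        counts.update(set(neighbors.get(seq, ())))
    return {family: n / len(cluster) for family, n in counts.items()}


# Hypothetical pairwise identities between five sequences.
identities = {
    ("A", "B"): 0.70,
    ("B", "C"): 0.40,
    ("A", "C"): 0.38,
    ("C", "D"): 0.65,
    ("D", "E"): 0.20,
}

# Hypothetical protein families encoded near each query gene.
neighbors = {
    "A": ["dehydrogenase", "transporter"],
    "B": ["dehydrogenase"],
    "C": ["dehydrogenase", "kinase"],
    "D": ["dehydrogenase"],
}

clusters_35 = ssn_clusters(identities, 0.35)
clusters_60 = ssn_clusters(identities, 0.60)

# Pool neighbors over the largest 35% cluster; drop rare co-occurrences as noise.
largest = max(clusters_35, key=len)
freqs = neighbor_frequencies(largest, neighbors)
MIN_FRACTION = 0.5  # assumed cutoff, not taken from the study
consistent = {fam: f for fam, f in freqs.items() if f >= MIN_FRACTION}

print(clusters_35)
print(clusters_60)
print(consistent)
```

In this toy data, raising the threshold from 35% to 60% splits one large cluster into smaller ones. Pooling neighbors across the cluster keeps the family seen next to every member and discards families that appear beside only a single member, which mirrors the noise-filtering argument above.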

For more complex superfamilies, information from multiple sources will need to be integrated. For example, when bacterial genes are located in polycistronic transcriptional units, that information can be combined to identify pathways and predict enzyme function.

Irene Kaganman

References

  1. S. Zhao et al. Prediction and characterization of enzymatic activities guided by sequence similarity and genome neighborhood networks.
    eLife 3 (2014). doi:10.7554/eLife.03275

Structural Biology Knowledgebase ISSN: 1758-1338
Funded by a grant from the National Institute of General Medical Sciences of the National Institutes of Health