PSI Structural Biology Knowledgebase

Families in Gene Neighborhoods

SBKB [doi:10.1038/sbkb.2015.20]
Technical Highlight - June 2015
Short description: A bioinformatics strategy takes advantage of the proximal organization of genes encoding proteins involved in metabolic pathways to predict protein function.

Sequence similarity networks for the proline racemase superfamily, displayed at a 35% sequence identity cutoff. The identified clusters are color coded. Figure from reference 1.

As sequencing data accumulate, effective approaches are needed to decipher functions of the enzymes encoded within those genomes. For organisms such as eubacteria and archaea, genes encoding enzymes and other proteins involved in the same metabolic pathway often cluster together in operons. Taking advantage of the localization in such clusters or gene neighborhoods, the groups of Jacobson, Gerlt and Almo (PSI NYSGRC) developed a new bioinformatics approach to predict in vitro activities of the encoded proteins as well as their metabolic functions in cells.

Using this strategy, termed genome neighborhood networks (GNNs), they analyzed 2,333 unique sequences encoding proteins in the proline racemase superfamily. The authors first constructed a sequence similarity network in which the edge threshold can be varied to correspond to distinct sequence identity levels; in this study, 35% and 60% cutoffs were used. Querying all sequences simultaneously amplifies the signal from genes for functionally related proteins, because those genes recur in the neighborhoods of many superfamily members. Conversely, genes for unrelated proteins that occur within these neighborhoods in only some species are filtered out as noise. For this reason, the authors suggest that this large-scale, aggregate approach identifies proteins involved in metabolic pathways more efficiently than single-genome analyses. The GNN approach predicted function for >85% of the proteins, which the authors verified by measuring in vitro enzyme activity, assaying phenotypes, and using transcriptomics and X-ray crystallography.
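The two ideas in this paragraph, thresholding a sequence similarity network and then aggregating gene neighbors across each cluster, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the gene and family labels, the identity values, and the 0.5 co-occurrence fraction are all hypothetical, and in practice the pairwise identities would come from an alignment tool rather than a hand-written dictionary.

```python
from collections import Counter, defaultdict


def cluster_by_identity(pairwise_identity, sequences, threshold):
    """Group sequences into connected components of a similarity network.

    An edge joins two sequences when their percent identity is at least
    `threshold`. Components are found with a simple union-find.
    """
    parent = {s: s for s in sequences}

    def find(x):
        # Follow parent links to the component root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), identity in pairwise_identity.items():
        if identity >= threshold:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for s in sequences:
        clusters[find(s)].add(s)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


def neighbor_cooccurrence(cluster, neighbors, min_fraction):
    """Return neighbor families seen near at least `min_fraction` of a cluster.

    Families that appear next to only a few members are treated as noise
    and dropped.
    """
    counts = Counter()
    for gene in cluster:
        # Count each family once per gene, even if it appears twice nearby.
        counts.update(set(neighbors.get(gene, ())))
    size = len(cluster)
    return {fam: n / size for fam, n in counts.items() if n / size >= min_fraction}


# Hypothetical toy data: percent identities and annotated neighbor families.
sequences = ["g1", "g2", "g3", "g4"]
pairwise_identity = {
    ("g1", "g2"): 72.0,
    ("g2", "g3"): 41.0,
    ("g3", "g4"): 65.0,
    ("g1", "g3"): 38.0,
}
neighbors = {
    "g1": ["dehydrogenase", "transporter"],
    "g2": ["dehydrogenase"],
    "g3": ["dehydrogenase", "phage_integrase"],
    "g4": ["aldolase"],
}

if __name__ == "__main__":
    for cutoff in (35.0, 60.0):
        for cluster in cluster_by_identity(pairwise_identity, sequences, cutoff):
            kept = neighbor_cooccurrence(cluster, neighbors, 0.5)
            print(cutoff, cluster, kept)
```

In this toy case the looser cutoff merges all four genes into one cluster, while the stricter cutoff splits them into two, which mirrors how raising the identity threshold separates a superfamily into finer groups. The co-occurrence filter is where the aggregate signal comes from: a family that neighbors most cluster members is retained, and one that appears in a single genome is discarded.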

For more complex superfamilies, information from multiple sources will need to be integrated. For example, when bacterial genes are located in polycistronic transcriptional units, that information can be combined to identify pathways and predict enzyme function.

Irene Kaganman


  1. S. Zhao et al. Prediction and characterization of enzymatic activities guided by sequence similarity and genome neighborhood networks.
    eLife. 3 (2014). doi:10.7554/eLife.03275

Structural Biology Knowledgebase ISSN: 1758-1338
Funded by a grant from the National Institute of General Medical Sciences of the National Institutes of Health