PSI Structural Biology Knowledgebase

Dealing with difficult families

PSI-SGKB [doi:10.1038/th_psisgkb.2009.2]
Technical Highlight - February 2009
Short description: A 'genome pool' strategy that targets multiple members of a protein family boosts structure determination rates, particularly when tricky proteins are avoided.

Structure 16, 1659-1667 (2008)

Obtaining diffraction-quality crystals can be frustrating. For the PSI, with its focus on solving structural representatives of all major protein families, lack of useable crystals is a significant stumbling block.

Faced with such a huge task, structural genomics centers now routinely use a 'genome pool' strategy, whereby multiple members of a protein family are simultaneously purified and crystallized right from the start. Jaroszewski et al. from the PSI JCSG show that this approach works well and propose that only if it fails should new versions of the protein be designed and produced.

The authors examined crystallization information recorded in the PSI database TargetDB and produced a two-dimensional distribution of crystallization success rates plotted against sequence identity to the closest crystallized homolog and against sequence identity to the closest non-crystallized homolog. They show that features associated with difficulties in crystallization can be predicted because they are conserved between homologs, and that crystallization success is correlated only for the closest homologs.

For instance, only targets with greater than 75% sequence identity to a crystallized target have a very high chance of success, but targets with even 40% sequence identity to ones that failed are unlikely to yield crystals.
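
The two identity thresholds quoted above can be read as a rough triage rule. The sketch below is only an illustration of that reading, not the authors' method: the function name, the label strings, and the handling of targets that meet both conditions are assumptions made here, since the text does not say how such conflicts are resolved.

```python
def triage_by_identity(id_to_crystallized: float, id_to_failed: float) -> str:
    """Coarse crystallization triage from percent sequence identities (0-100).

    Thresholds (75% and 40%) come from the text; the labels and the
    'conflicting' case are assumptions added for illustration.
    """
    # Greater than 75% identity to a crystallized target: high chance of success.
    likely = id_to_crystallized > 75.0
    # 40% or more identity to a failed target: unlikely to yield crystals.
    unlikely = id_to_failed >= 40.0

    if likely and unlikely:
        return "conflicting"  # assumption: the text does not rank the two signals
    if likely:
        return "likely"
    if unlikely:
        return "unlikely"
    return "uncertain"


if __name__ == "__main__":
    print(triage_by_identity(82.0, 15.0))
    print(triage_by_identity(30.0, 45.0))
```

A real target-selection pipeline would use the full two-dimensional success distribution rather than two hard cutoffs.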

The chance of solving at least one structure from a family increases with the number of sequences available per family.

Using data mining of TargetDB, coupled with experimental validation, the authors grouped target protein sequences into five crystallization classes from 1 (optimal) to 5 (very difficult) and analyzed known microbial genomes using this system. Their results showed, surprisingly, that there is little difference in the overall distribution of these classes among genomes, and most individual families with sufficient diversity exhibit a range of crystallization classes. But there are some families that have more 'optimal' targets and others that have very few. These 'difficult' families can still be dealt with by increasing the size of the 'genome pool' (see Figure), an approach that is feasible now that more than 1,000 genomes have been sequenced.
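
The link between pool size and the chance of solving at least one structure per family can be sketched numerically. The example below assumes, purely for illustration, that each family member crystallizes independently with a known probability. The text notes that success is correlated between close homologs, so this is a simplification. The function names and the example probabilities are hypothetical and are not taken from the paper.

```python
import math


def p_at_least_one(success_probs):
    """Probability that at least one target succeeds, assuming independence."""
    p_all_fail = 1.0
    for p in success_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie in [0, 1]")
        p_all_fail *= 1.0 - p
    return 1.0 - p_all_fail


def pool_size_needed(p_each, target):
    """Smallest number of equally likely targets needed to reach `target`.

    Solves 1 - (1 - p_each)**n >= target for the integer n.
    """
    if not (0.0 < p_each < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p_each and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_each))


if __name__ == "__main__":
    # Hypothetical 'difficult' family: each member has a 10% chance.
    print(p_at_least_one([0.1] * 3))
    print(pool_size_needed(0.1, 0.9))
```

Under these assumptions, a family whose members each have a low individual chance still becomes tractable once enough homologs are pursued in parallel, which is the intuition behind enlarging the genome pool.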

The genome pool approach of targeting multiple members of a given protein family is a simple way to improve the success rate of structure determination for that family. The likelihood of crystallization can be further increased by selecting target sequences with favorable characteristics, for example with guidance from XtalPred, a tool developed by the same group.

Maria Hodges


  1. L. Jaroszewski et al. Genome pool strategy for structural coverage of protein families.
    Structure 16, 1659-1667 (2008).

Structural Biology Knowledgebase ISSN: 1758-1338
Funded by a grant from the National Institute of General Medical Sciences of the National Institutes of Health