PSI Structural Biology Knowledgebase

Error Prevention

SBKB [doi:10.1038/sbkb.2014.210]
Technical Highlight - July 2014
Short description: Protein crystallographers must be careful to prevent avoidable errors in the PDB.

An example of a non-parsimonious modeling error. The hexamer of protein molecules in PDB 3M9B, originally presented in P2₁ symmetry (left), is in fact placed around the space-diagonal threefold axis of the true space group P2₁3 (right). Figure courtesy of Zbigniew Dauter.

The Protein Data Bank (PDB) is an invaluable resource for all biologists, especially those conducting drug design or data-mining studies. The PDB currently contains over 85,000 crystallographic structures. Although the majority of these depositions are of high quality, preventable errors in a number of structures could affect the PDB's credibility as a source of reliable structural information.

Dauter and colleagues (PSI MCSG and NYSGRC) recently analyzed a number of typical errors that are prevalent in the PDB. These errors were divided into four general categories: inconsistent data presentation, non-parsimonious modeling, ignoring evidence, and ignoring prior knowledge.

The authors presented examples of each type of error and sought to remind crystallographers to invoke Bayesian reasoning when assessing the correctness of structural models. That is, experimenters should take into account both the primary evidence available (i.e., crystallographic data) and their own prior knowledge (i.e., the laws of chemistry and physics) when building and refining structures. Furthermore, despite the availability of increasingly automated methods of structure determination, experimenters should rigorously and objectively evaluate the details of every structure before depositing it in the PDB.
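The Bayesian reasoning described above can be written compactly. The notation is an assumption made here for illustration and is not taken from the original paper: M stands for a candidate structural model, D for the crystallographic data, and K for prior chemical and physical knowledge.

```latex
% Posterior plausibility of a model, given the data and prior knowledge,
% is proportional to the likelihood of the data times the prior.
P(M \mid D, K) \;\propto\; P(D \mid M)\, P(M \mid K)
```

Read this way, a model that fits the diffraction data well but implies implausible chemistry is penalized through the prior term, and a chemically sensible model that contradicts the data is penalized through the likelihood term.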

Additionally, Dauter and colleagues advocated that original authors and unaffiliated researchers continue to re-refine and redeposit models when improvement is possible. These corrected structures should take advantage of the PDB's little-used REMARK and CAVEAT codes to indicate the reasons for the model's update and to alert other users about questionable features, respectively. The authors concluded that the next generation of protein crystallographers must be adequately trained in order to prevent the majority of these errors, and ultimately to ensure the highest standards of structure determination and the continued health of the PDB.
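As a rough illustration of how a data-mining script might check an entry for such annotations, the following minimal Python sketch collects CAVEAT comments and REMARK text from a PDB-format file. It assumes the fixed-column layout of the legacy PDB format (record name in columns 1-6, CAVEAT comment from column 20, REMARK number in columns 8-10 and text from column 12). The function name `collect_annotations`, the sample entry ID `XXXX`, the REMARK number 999, and the sample text are all hypothetical and chosen only for this example.

```python
from collections import defaultdict


def collect_annotations(pdb_text):
    """Return (caveats, remarks) found in PDB-format text.

    caveats: list of CAVEAT comment strings.
    remarks: dict mapping REMARK number -> list of text lines.
    """
    caveats = []
    remarks = defaultdict(list)
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # columns 1-6 hold the record name
        if record == "CAVEAT":
            comment = line[19:79].strip()  # columns 20-79: free-text comment
            if comment:
                caveats.append(comment)
        elif record == "REMARK":
            number = line[6:10].strip()  # columns 8-10: remark number
            text = line[11:79].strip()   # columns 12-79: remark text
            if number.isdigit() and text:
                remarks[int(number)].append(text)
    return caveats, dict(remarks)


# Hypothetical sample records, padded so that each field starts in the
# column assumed above. They do not come from any real PDB entry.
SAMPLE = "\n".join([
    "CAVEAT".ljust(11) + "XXXX".ljust(8)
    + "HYPOTHETICAL EXAMPLE: LIGAND GEOMETRY IS QUESTIONABLE",
    f"REMARK {999:>3} HYPOTHETICAL EXAMPLE: RE-REFINED MODEL",
    "END",
])

caveats, remarks = collect_annotations(SAMPLE)
print(caveats)
print(remarks)
```

A script like this could flag entries that carry a CAVEAT before they are included in a data-mining set. Entries distributed in mmCIF format would need a different parser.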

Timothy Silverstein


  1. Z. Dauter et al. Avoidable errors in deposited macromolecular structures: an impediment to efficient data mining.
    IUCrJ. 1, 179-193 (2014). doi:10.1107/S2052252514005442

Structural Biology Knowledgebase ISSN: 1758-1338
Funded by a grant from the National Institute of General Medical Sciences of the National Institutes of Health