PSI Structural Biology Knowledgebase




Microbiome: Expanding the Gut Gene Catalog

SBKB [doi:10.1038/sbkb.2014.230]
Technical Highlight - November 2014
Short description: An international effort has identified nearly ten million genes belonging to microbes hosted by diverse human guts.

Figure: MetaHIT project workflow to create a human gut gene catalog from three continents (ref. 1).

The microbes that live in our intestines are intimately tied to our health and represent a rich source of unexplored metabolic information. A typical gut community may possess an order of magnitude more genes than are encoded in the genome of its human host. New work by Wang, Bork and colleagues at the Metagenomics of the Human Intestinal Tract (MetaHIT) consortium greatly expands the list of known gut microbial genes in a single high-quality gene catalog, making it an excellent resource for gene function and structural studies.

With the goal of increasing the number and diversity of sampled populations, the MetaHIT consortium sequenced new Danish and Spanish samples and analyzed sequence data from their previous European samples, American samples from the Human Microbiome Project and Chinese samples from a diabetes study. They also extracted genes from 500 sequenced prokaryotic genomes to help identify genes from prevalent but low-abundance species in the samples.

By applying a standardized workflow to preprocess and assemble sequence reads, predict genes and then cluster them to remove redundancies, they produced the first integrated catalog from three continents. The catalog has fewer redundancies, less fragmentation and longer genes than previous collections; it represents nearly 1,300 gut metagenomes from over 1,000 individuals, amounting to 9.8 million non-redundant genes, a nearly threefold expansion over the older, non-integrated catalogs.
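The redundancy-removal step of such a workflow can be illustrated with a toy sketch. It is not the consortium's pipeline: it assumes a greedy, longest-first clustering and uses shared k-mer containment as a crude stand-in for alignment-based sequence identity. The function names, the 0.95 threshold and the value of k are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def containment(shorter: str, longer: str, k: int) -> float:
    """Fraction of the shorter sequence's k-mers that also occur in the longer one.

    Assumption: this is a cheap proxy for sequence identity, not an alignment.
    """
    short_kmers = kmers(shorter, k)
    if not short_kmers:
        return 0.0
    return len(short_kmers & kmers(longer, k)) / len(short_kmers)


def cluster_genes(
    genes: Dict[str, str], threshold: float = 0.95, k: int = 4
) -> Dict[str, List[str]]:
    """Greedy clustering: process genes longest first; each gene joins the
    first representative it matches, otherwise it becomes a new representative."""
    clusters: Dict[str, List[str]] = {}
    # Longest first, ties broken by name so the result is deterministic.
    for name in sorted(genes, key=lambda n: (-len(genes[n]), n)):
        seq = genes[name]
        for rep in clusters:
            if containment(seq, genes[rep], k) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No representative matched: this gene starts its own cluster.
            clusters[name] = [name]
    return clusters


if __name__ == "__main__":
    # Made-up toy sequences, not real catalog data.
    toy = {
        "geneA": "ATGGCTAGCTAGGATCCGATTACA",
        "geneA_fragment": "GCTAGCTAGGATCCG",
        "geneB": "ATGTTTCCCGGGAAATTTCCCGGG",
    }
    for rep, members in cluster_genes(toy).items():
        print(rep, members)
```

In this sketch the keys of the returned dictionary play the role of the non-redundant catalog: fragments and near-duplicates are folded into the longest matching representative, which is one way a clustering step can yield fewer redundancies and longer genes.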

The authors showed that this combination of samples improved sequence mapping quality and coverage of some rare species, and allowed detection of country- and individual-specific gut microbial signatures. Functional annotation using the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the evolutionary genealogy of genes: Non-supervised Orthologous Groups (eggNOG) databases identified nearly 7,000 and 36,500 orthologous groups, respectively, and suggested that coverage of prokaryotic functional capacity may be saturated in the catalog. The integrated gene catalog is freely available in the GigaScience Database at

Tal Nawy


  1. J. Li et al. An integrated catalog of reference genes in the human gut microbiome. Nat Biotechnol. 32, 834-841 (2014). doi:10.1038/nbt.2942

Structural Biology Knowledgebase ISSN: 1758-1338
Funded by a grant from the National Institute of General Medical Sciences of the National Institutes of Health