PSI Structural Biology Knowledgebase




Microbiome: Expanding the Gut Gene Catalog

SBKB [doi:10.1038/sbkb.2014.230]
Technical Highlight - November 2014
Short description: An international effort has identified nearly ten million genes belonging to microbes hosted by diverse human guts.

Figure: MetaHIT project workflow to create a human gut gene catalog from three continents [1].

The microbes that live in our intestines are intimately tied to our health and represent a rich source of unexplored metabolic information. A typical gut community may possess an order of magnitude more genes than are encoded in the genome of its human host. New work by Wang, Bork and colleagues at the Metagenomics of the Human Intestinal Tract (MetaHIT) consortium greatly expands the list of known gut microbial genes in a single high-quality gene catalog, making it an excellent resource for gene function and structural studies.

With the goal of increasing the number and diversity of sampled populations, the MetaHIT consortium sequenced new Danish and Spanish samples and analyzed sequence data from their previous European samples, American samples from the Human Microbiome Project and Chinese samples from a diabetes study. They also extracted genes from 500 sequenced prokaryotic genomes to help identify genes from prevalent but low-abundance species in the samples.

By applying a standardized workflow to preprocess and assemble sequence reads, predict genes and then cluster them to remove redundancies, they produced the first integrated catalog from three continents. The catalog has fewer redundancies, less fragmentation and longer genes than previous collections; it represents nearly 1,300 gut metagenomes from over 1,000 individuals, amounting to 9.8 million non-redundant genes, a nearly threefold expansion over the older, non-integrated catalogs.
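The redundancy-removal step of such a workflow can be illustrated with a toy example. The sketch below is not the consortium's pipeline: the gene names, sequences, similarity measure (Python's `difflib`) and the 0.95 threshold are all illustrative assumptions. It only shows the general idea of greedy clustering, in which the longest sequence in each group is kept as the representative and duplicates or fragments from other samples are folded into it.

```python
from difflib import SequenceMatcher


def identity(a: str, b: str) -> float:
    """Crude similarity: matched characters divided by the shorter length."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / min(len(a), len(b))


def cluster_nonredundant(genes: dict[str, str], threshold: float = 0.95) -> dict[str, list[str]]:
    """Greedy clustering: visit genes longest first; each gene either joins
    the first representative it resembles or becomes a new representative."""
    clusters: dict[str, list[str]] = {}
    # Sort by descending length, then by name so the result is deterministic.
    for name in sorted(genes, key=lambda n: (-len(genes[n]), n)):
        for rep in clusters:
            if identity(genes[rep], genes[name]) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No existing representative is similar enough.
            clusters[name] = [name]
    return clusters


# Hypothetical predicted genes pooled from three samples.
toy_genes = {
    "sample_A_gene1": "ATGGCTAGCTAGGATCCGATTACGGCTAA",
    "sample_B_gene7": "ATGGCTAGCTAGGATCCGATTACGGCTAA",  # exact duplicate
    "sample_C_gene3": "ATGGCTAGCTAGGATCCGATTACGG",      # truncated fragment
    "sample_C_gene9": "ATGTTTCCCGGGAAATTTCCCGGGTAG",    # unrelated gene
}

catalog = cluster_nonredundant(toy_genes)
for representative, members in catalog.items():
    print(representative, "represents", len(members), "predicted gene(s)")
```

Dividing by the shorter sequence length is a deliberate choice in this sketch: it lets a fragment be absorbed by a longer gene that contains it, which is one way pooling samples can reduce fragmentation. Real pipelines rely on dedicated sequence-clustering tools that scale to millions of genes.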

The authors showed that this combination of samples improved sequence mapping quality and coverage of some rare species, and allowed detection of country- and individual-specific gut microbial signatures. Functional annotation using the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the evolutionary genealogy of genes: Non-supervised Orthologous Groups (eggNOG) databases identified nearly 7,000 and 36,500 orthologous groups, respectively, and suggested that coverage of prokaryotic functional capacity may be saturated in the catalog. The integrated gene catalog is freely available in the GigaScience Database.

Tal Nawy


  1. J. Li et al. An integrated catalog of reference genes in the human gut microbiome.
    Nat Biotechnol. 32, 834-41 (2014). doi:10.1038/nbt.2942

Structural Biology Knowledgebase ISSN: 1758-1338
Funded by a grant from the National Institute of General Medical Sciences of the National Institutes of Health