PSI Structural Biology Knowledgebase

Immunity: Clustering Immunoglobulins

SBKB [doi:10.1038/sbkb.2014.205]
Technical Highlight - June 2014
Short description: An algorithm identifies functional clusters of cell surface-anchored and secreted immunoglobulin superfamily proteins by comparing conserved regions and taking into account known protein–protein interaction data.

PICTree classification of the 477 human IgSF proteins into functional families (green circles indicate known binding partners). Figure courtesy of Andras Fiser.

As more sequencing data become available, new tools to generate insights about protein function are sorely needed. Clustering algorithms that rely solely on pair-wise sequence similarity can miss functionally related proteins that share little sequence identity.

To predict functionally related families among the pharmacologically important immunoglobulin superfamily (IgSF) proteins, Fiser and colleagues (PSI NYSGRC) trained their algorithm to identify proteins that bind the same ligand in a similar manner. The algorithm, named PICTree, first compares sequence profile-based hidden Markov models, an approach that amplifies signals from conserved regions, which generally correlate with functional importance. In addition, to improve clustering, the algorithm was calibrated on a dataset that included ligand-interaction data from the STRING protein interaction database. Further, the algorithm places more emphasis on sequence similarity within the N-terminal domain, which is frequently involved in ligand binding in cell-surface IgSF proteins.
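The two ideas in this paragraph, weighting the N-terminal domain more heavily and calibrating against known interaction data, can be illustrated with a toy sketch. This is not the PICTree implementation: the linear blend, the 0.7 weight, the F1 criterion, the function names and the scores are all assumptions chosen for illustration.

```python
# Toy sketch only: the weighting scheme, the F1 criterion and all values
# below are illustrative assumptions, not taken from PICTree.

def combined_score(full_score, nterm_score, nterm_weight=0.7):
    """Blend a full-length profile similarity with an N-terminal-domain
    similarity, giving the N-terminal domain the larger weight."""
    return nterm_weight * nterm_score + (1.0 - nterm_weight) * full_score


def calibrate_cutoff(scores, known_pairs, candidates):
    """Return the (cutoff, F1) that best recovers pairs known to share a ligand."""
    best_cutoff, best_f1 = None, -1.0
    for cutoff in candidates:
        predicted = {pair for pair, score in scores.items() if score >= cutoff}
        true_positives = len(predicted & known_pairs)
        if true_positives == 0:
            continue  # F1 is zero or undefined at this cutoff
        precision = true_positives / len(predicted)
        recall = true_positives / len(known_pairs)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_cutoff, best_f1 = cutoff, f1
    return best_cutoff, best_f1


# Hypothetical (full-length, N-terminal) similarities for four protein pairs.
raw = {
    frozenset(("A", "B")): (0.4, 0.9),
    frozenset(("A", "C")): (0.6, 0.2),
    frozenset(("B", "C")): (0.5, 0.3),
    frozenset(("C", "D")): (0.3, 0.8),
}
scores = {pair: combined_score(full, nterm) for pair, (full, nterm) in raw.items()}

# Hypothetical pairs annotated as binding the same ligand.
known_pairs = {frozenset(("A", "B")), frozenset(("C", "D"))}

cutoff, f1 = calibrate_cutoff(scores, known_pairs, candidates=[0.3, 0.5, 0.7])
print(cutoff, f1)
```

In this toy data the pairs with high N-terminal similarity outrank a pair with higher full-length similarity, which is the intended effect of the weighting.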

Analysis of the 477 human cell-surface or secreted IgSF proteins identified 83 clusters of 2–34 members each, plus 87 singletons. As a step toward their goal of defining ligand interactions for all IgSF proteins, the researchers also predicted the function of a previously uncharacterized protein, VSIG8, in this initial analysis.
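How a set of proteins splits into multi-member clusters and singletons can be sketched as follows. The sketch swaps in a simple connected-components grouping (union-find) over pairs judged similar; it is not the tree-based procedure PICTree uses, and the protein labels and links are hypothetical.

```python
# Toy sketch only: connected-components grouping stands in for the real
# clustering procedure; protein names and links are hypothetical.

def group_proteins(proteins, linked_pairs):
    """Split proteins into clusters (2+ members) and singletons,
    treating each linked pair as belonging to the same group."""
    parent = {p: p for p in proteins}

    def find(p):
        # Follow parent pointers to the group root, shortening the path as we go.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in linked_pairs:
        parent[find(a)] = find(b)

    groups = {}
    for p in proteins:
        groups.setdefault(find(p), []).append(p)

    clusters = sorted(sorted(g) for g in groups.values() if len(g) >= 2)
    singletons = sorted(g[0] for g in groups.values() if len(g) == 1)
    return clusters, singletons


proteins = ["P1", "P2", "P3", "P4", "P5", "P6"]
linked_pairs = [("P1", "P2"), ("P2", "P3"), ("P4", "P5")]

clusters, singletons = group_proteins(proteins, linked_pairs)
print(clusters)
print(singletons)
```

Here P6 has no link above the similarity cutoff and so remains a singleton, analogous to the 87 proteins left unclustered in the analysis.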

Of the five IgSF functional pairs in the training dataset that PICTree failed to identify, four required additional experimental information to ascertain whether the pairs indeed share common binding modes. The authors also propose incorporating protein-specific binding site information in future versions of the algorithm: for example, in secreted IgSF proteins, unlike cell-surface ones, the binding site could lie outside the N-terminal domain.

For now, large-scale structural genomics efforts could use information about functional families, and about single interactors that currently lack it, to prioritize targets for experimental analysis.

Irene Kaganman


  1. E.H. Yap et al. Functional clustering of immunoglobulin superfamily proteins with protein–protein interaction information calibrated hidden Markov model sequence profiles.
    J. Mol. Biol. 426, 945–961 (2014). doi:10.1016/j.jmb.2013.11.009

Structural Biology Knowledgebase ISSN: 1758-1338
Funded by a grant from the National Institute of General Medical Sciences of the National Institutes of Health