Technical Highlight - November 2010
Short description: Mutating a protein's sequence is a useful way of uncovering functionally important residues. A large-scale method tracking up to 600,000 variants at once will speed up this analysis.
The amino acid sequence of a protein is enough to determine its structure and function, yet how the sequence alone conveys this information continues to elude scientists. Mutating residues and then looking for a change in function has been the traditional, and often effective, way of addressing this question. Building a detailed functional map has often been laborious, but now Stanley Fields and colleagues, writing in Nature Methods, present a method that speeds up the process and enables large-scale sequence-function analyses.
Purifying proteins with individual mutations to study their effects is, thankfully, a thing of the past for most large-scale analyses. Instead, libraries of thousands to millions of protein variants can be generated and displayed on the surface of phage, yeast or bacteria. These displayed mutants can be assayed simultaneously for a particular activity or function. The bottleneck, however, has been Sanger sequencing, which allowed at most a few thousand variants (generally those with the highest activity) to be analyzed.
Now Fields and his team show how 600,000 protein variants can be followed at once. One important development was the use of high-throughput DNA sequencing with an Illumina paired-end approach. The other innovation was to apply only moderate selection pressure to the pool of variants, which keeps variants with intermediate activity in the pool and so allows a wider range of mutations to be studied. As a test case, the group displayed the WW (two-tryptophan) domain of human YAP65 on the surface of T7 bacteriophage and selected variants that bound to the cognate peptide GTPPPPYTVG.
Looking at position-averaged effects of mutations, they identified a distinct region of the WW domain that could tolerate sequence variation. And, as expected, the two conserved tryptophans (WW) had to be maintained for ligand binding.
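To make the idea of position-averaged effects concrete, the following is a minimal Python sketch of how sequencing counts from before and after selection might be turned into per-variant enrichment scores and then averaged by position. It is an illustration under stated assumptions, not the authors' pipeline: the log2 enrichment ratio, the pseudocount, the function names and the toy counts are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def enrichment_ratios(input_counts, selected_counts, pseudocount=0.5):
    """Return log2(selected frequency / input frequency) for each variant.

    Variants are keyed by (position, amino_acid). The pseudocount is an
    assumption of this sketch; it avoids division by zero for variants
    that are missing from one of the two libraries.
    """
    variants = set(input_counts) | set(selected_counts)
    total_in = sum(input_counts.values()) + pseudocount * len(variants)
    total_sel = sum(selected_counts.values()) + pseudocount * len(variants)
    ratios = {}
    for variant in variants:
        freq_in = (input_counts.get(variant, 0) + pseudocount) / total_in
        freq_sel = (selected_counts.get(variant, 0) + pseudocount) / total_sel
        ratios[variant] = math.log2(freq_sel / freq_in)
    return ratios


def position_averages(ratios):
    """Average the log2 enrichment of all variants observed at each position."""
    by_position = defaultdict(list)
    for (position, _amino_acid), ratio in ratios.items():
        by_position[position].append(ratio)
    return {pos: sum(vals) / len(vals) for pos, vals in by_position.items()}


# Toy read counts (made up for illustration, not data from the study).
input_library = {(1, "A"): 100, (1, "G"): 100, (2, "A"): 100, (2, "G"): 100}
selected_library = {(1, "A"): 200, (1, "G"): 100, (2, "A"): 50, (2, "G"): 50}

scores = enrichment_ratios(input_library, selected_library)
per_position = position_averages(scores)
```

In this toy example, a position whose average score is near or above zero would be read as tolerant of substitution, while a strongly negative average would suggest that the residue is needed for binding.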
This method, combining protein display, low-intensity selection and very accurate high-throughput sequencing, allows simultaneous study of the activity of hundreds of thousands of protein variants. Modified versions of the method could be used to map sequence features that, for example, confer resistance to antibiotics or anticancer drugs.
D. M. Fowler, C. L. Araya, S. J. Fleishman, E. H. Kellogg, J. J. Stephany et al. High-resolution mapping of protein sequence-function relationships.
Nat. Meth. 7, 741-746 (2010). doi:10.1038/nmeth.1492