Copyright 2004 Reed Business Information US, a division of Reed Elsevier Inc.
All Rights Reserved

Genomics and Proteomics
October 1, 2004
Hurdles in the Path of High-Throughput Proteomics;
After the success of genomics, proteomics researchers may have had visions of characterizing the entire proteome as rapidly. If only it were that easy; protein complexity continues to present a formidable challenge.
BYLINE: By Emma Hitt, PhD; Emma Hitt is a freelance writer based in Atlanta.
BODY:
High-throughput proteomics is an appealing idea. But with currently available technology, the concept may be somewhat of an oxymoron, like "working vacation" or "healthy tan." As the level of throughput of a proteome analysis increases, the comprehensiveness decreases, says Ruedi Aebersold, PhD, a researcher with the Institute for Systems Biology, Seattle.
That's not to say that certain subsets of a proteome can't be analyzed in a high-throughput manner. "Ideally, we want to measure every protein in a sample," Aebersold says. "If that is the goal, then we are not at high throughput, and in my opinion, we will not get there soon if no significant strategic changes are implemented. However, if the goal is to measure a selected subset of the proteins present, then high throughput can be achieved today."
The answer depends on the question
The type of challenge involved also depends on the problem an analysis is meant to address. If the goal is to profile proteins, then the issues are dynamic range and the detection and quantification of low-abundance proteins, says Michael Snyder, PhD, professor and chairman of the Department of Molecular, Cellular, and Developmental Biology at Yale University, New Haven, Conn. "If the issue is detecting specific proteins, then having the right antibody probes is the major challenge," Snyder says. "If the issue is analyzing biochemical activities via protein chips, then having protein chips with enough content (i.e., proteins) is the biggest challenge. If protein localization is the goal, then for multicellular organisms the challenge is getting protein tagged at their normal levels."
The major high-throughput methods used in proteome analysis are LC-MS/MS and protein microarrays. Several factors limit throughput for both of these methods. The term proteomics--the large-scale study of proteins--was optimistically coined after the success of the large-scale study of the genome, or genomics. Compared to genomics, however, proteomics is a much more complicated proposition. The human genome contains only about 33,000 genes, but these genes code for more than 200,000 proteins, not to mention post-translational modifications, which expand that number even further. It has been estimated that human serum, for example, contains more than 500,000 different protein species.
From discovery to screening
With LC-MS/MS, the main problem is the complexity of the sample. The mixtures from protease digests of proteomes are exceedingly complex, and every peptide must be identified de novo by database searching. "There are simply too many peptides to be sequenced in such a sample to reach high throughput," Aebersold says. "In addition, the data analysis is complex."
To combat this problem, Aebersold and his colleagues proposed a move from a perpetual discovery mode to a screening mode that takes advantage of the proteomic knowledge accumulated in prior experiments. According to Aebersold, high-throughput analysis is inconceivable if the objects studied have to be discovered de novo each time. However, once all the possible molecules and activities within a species have been discovered and described, biological experimentation can be transformed from a discovery mode to a browsing mode. The solution was to develop sets of synthetic isotopically labeled reference peptides, each uniquely identifying a protein, that are added to the sample mixture to serve as standards for peptide identification and quantification. Such reference peptide mixtures can be isotopically labeled with a class of reagents termed Isotope Coded Affinity Tags, or the stable heavy isotopes can be added during chemical synthesis.
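The quantification step described above can be illustrated with a minimal sketch. It assumes that each endogenous peptide and its isotopically labeled reference peptide produce comparable signal per unit amount, so that the ratio of their intensities, multiplied by the known spiked amount, estimates the endogenous amount. The class name, field names, protein names, and numbers below are hypothetical and chosen only for illustration; they do not come from Aebersold's group or from any particular software package.

```python
from dataclasses import dataclass


@dataclass
class PeptidePair:
    """One endogenous peptide paired with its labeled reference (hypothetical structure)."""
    protein: str            # protein that the reference peptide uniquely identifies
    light_intensity: float  # signal of the endogenous (unlabeled) peptide
    heavy_intensity: float  # signal of the spiked, isotopically labeled standard
    spiked_fmol: float      # known amount of standard added to the sample


def estimate_amount(pair: PeptidePair) -> float:
    """Estimate the endogenous amount from the light/heavy intensity ratio.

    Assumes the labeled and unlabeled forms respond equally in the instrument.
    """
    if pair.heavy_intensity <= 0:
        raise ValueError(f"reference peptide for {pair.protein} was not detected")
    return pair.light_intensity / pair.heavy_intensity * pair.spiked_fmol


# Illustrative values only; not measured data.
pairs = [
    PeptidePair("protein_A", light_intensity=2.0e6, heavy_intensity=1.0e6, spiked_fmol=50.0),
    PeptidePair("protein_B", light_intensity=5.0e5, heavy_intensity=2.0e6, spiked_fmol=50.0),
]

amounts = {p.protein: estimate_amount(p) for p in pairs}
for protein, fmol in amounts.items():
    print(f"{protein}: about {fmol:.1f} fmol")
```

Because the reference peptides are known in advance, an analysis of this kind only has to look up expected signals rather than identify every peptide from scratch, which is the shift from discovery to screening that the approach relies on.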
Problems that plague protein arrays
Like LC-MS/MS, protein microarrays also face limitations mainly due to the inherent structural diversity and complexity of proteins. "Several major technical hurdles plague protein array technology," says Gary Hardiman, PhD, director of the Biomedical Genomics Microarray (BIOGEM) Core Facility at the University of California, San Diego. "For example, the stable attachment of proteins to array surfaces and detection of interacting proteins."
Unlike nucleic acids, proteins do not possess straightforward binding partners, and they exhibit diverse biochemical features. In addition, proteins have no equivalent of the polymerase chain reaction (PCR) used for nucleic acids, which would facilitate their rapid production. Furthermore, thousands of purified proteins are required for the generation of high-density protein microarrays. Cloning systems using site-specific recombination are routinely employed for high-throughput cloning and expression of protein sets. However, cloning along with DNA sequence confirmation and gene identification "remain cumbersome precursors to protein expression," Hardiman says.
Another problem, says Hardiman, is the stable attachment of proteins, which are highly sensitive to the physicochemical properties of the chip support material. "Polar arrays, for example, are chemically treated to bind hydrophilic proteins. But such surfaces are unsuitable for cell membrane proteins which possess hydrophobic moieties."
Surface chemistries may promote retention of some proteins and denaturation of others. Selecting a surface chemistry that permits diverse proteins to retain their native folded conformation and biological activity can be difficult, Hardiman says, although affinity tags may permit a common immobilization strategy that can help counteract these effects.
To maintain their activity, proteins must remain hydrated at ambient temperatures. Glass slides are compatible with standard microarrayer and detection equipment and are relatively inexpensive, but they are prone to high evaporation rates and sample cross-contamination. Matrix slides and nanowell array formats, by comparison, help reduce evaporation and minimize cross-contamination, but they are more expensive than glass slides.
According to Hardiman, the presence of a humidifier during printing and inclusion of glycerol in the samples help prevent sample evaporation. In addition, nanoliter droplets of 40% glycerol remain completely hydrated at ambient temperatures, even when exposed overnight to the laboratory environment.
Typically, antibodies have provided a high-affinity selective method for binding proteins of interest for arraying purposes because they can be produced in both high quantity and purity. The biggest challenge is to obtain specific antibodies against the several hundred thousand proteins that comprise the human proteome. "At this time, antibodies are available for only a fraction of the proteome," says Hardiman. "Furthermore, the specificity of many of these antibodies is poorly documented, and additional antibodies may be required to allow detection of post-translational modifications. Many antibody arrays are limited and contain [only] a few well-defined capture agents directed at a particular class of protein markers."
Many antibodies cross-react with more than one target protein, which can contribute to large numbers of false positives. However, smaller antibody fragments prepared using phage display may minimize interaction of non-target proteins with a particular antibody. Array antibody candidates can be selected in advance from antibody expression libraries, enabling the generation of highly specific antibody microarrays. In addition, the proteomes under comparison can be labeled in a comparable fashion with different fluorophores, although the reproducibility of these chemical reactions is poor, and interference with the protein-antibody interactions presents an additional complexity, Hardiman adds.
Making progress
"For profiling proteins with LC-MS/MS, it will be possible to detect most of a proteome with appropriate fractionation, although we are a long way from profiling a complete proteome, which is hard to define," says Snyder. "For protein chips, we are not far from getting most of a human proteome displayed on a slide, and for yeast it is already possible." Snyder's lab developed the first example of a protein array (in yeast) approaching an entire genome.
Kelvin Lee, PhD, director of the Proteomics Program and associate professor in the School of Chemical and Biomolecular Engineering at Cornell University, Ithaca, N.Y., says that there aren't any specific commercial advances that seem to be revolutionary in terms of measuring the whole proteome. "I think there's a lot of hope, but only incremental progress," he says.
Lee advises colleagues to consider the need for multiple technologies to solve this problem. An important requirement for the field to advance is that students be cross-trained in several fields, such as biochemistry, technology development, and cell biology.
"But it is an extremely interesting time," he says. "If we are talking about the need to detect and quantify every protein in a cell expressed at a given time, including all post-translational modifications, then I think we are still a ways away from that day. We are still very much in evolution with several examples to show that proteomics is useful. But we lack well-defined, easy-to-use technology for many different applications. It will require more people from many different disciplines to become engaged in some of these challenges to move the field ahead."
For more on the organizations mentioned here, refer to this article at www.genpromag.com
Solving Proteomics Dilemmas
Emma Hitt
There may never be an effective way to analyze an entire human proteome in a high-throughput manner, but several newer technologies may help solve certain proteomics dilemmas on a case-by-case basis. These are some suggestions from the experts.
Mass spectrometers adapted for proteomics
These include time-of-flight (TOF), tandem TOF, triple quadrupole, quadrupole ion trap (QIT), Fourier-transform ion trap (FT/IT), and Fourier-transform ion cyclotron resonance (FT/ICR) mass spectrometers. These instruments are not necessarily much faster, but they allow smarter collection strategies for proteome analyses, says Ruedi Aebersold, PhD, of the Institute for Systems Biology.
Automated large-scale screens
Hybrigenics' automated large-scale yeast two-hybrid screen, which elucidates protein interactions to uncover pharmaceutical targets, is an important development, says Gary Hardiman, PhD, director of the BIOGEM Core Facility, University of California. Initially developed at the Pasteur Institute, the technology screens for interactions between fragments of proteins. The size of the screen and the statistical analyses allow identification of real protein-protein interactions, reducing the number of false negatives and positives. Also, defining the interaction domains provides a starting point for functional analyses.
Information databases
Hardiman also mentions the Proteome BioKnowledge Library from Incyte, which draws on more than 60,000 scientific references to cover more than 80,000 proteins, including data on protein classification and function. http://proteome.incyte.com/.
Reagents for depletion and improved stains
Both antibody-based and non-antibody-based reagents and fractionation methods will probably be the most helpful in overcoming limitations in protein profiling, especially for serum/plasma. Newer protein stains will help increase sensitivity, says Alex J. Rai, PhD, director of general chemistry at the Johns Hopkins University School of Medicine in Baltimore, Maryland.
LOAD-DATE: October 01, 2004