William M. Clemons
Division of Chemistry & Chemical Engineering
California Institute of Technology
1200 E. California Blvd.
Pasadena, CA 91125
2007 Searle Scholar
Our lab is interested in how proteins are manufactured and targeted to their specific locations, as well as in the growing field of membrane protein structure and biology. Our strategy is based largely on X-ray crystallography. We currently study evolutionarily conserved pathways in prokaryotes, with the eventual goal of expanding into eukaryotic membrane proteins. There are four major projects in the lab.
Sec-Dependent Protein Translocation
In all kingdoms, many nascent secretory and membrane proteins are co-translationally targeted to membranes by cleavable signal sequences and transmembrane (TM) segments with the aid of the signal recognition particle (SRP). At the membrane, the polypeptide is translocated across or integrated laterally by the protein-conducting channel, the SecY complex. We have recently determined the structure of the SecY channel complex from the archaeon Methanococcus jannaschii. The X-ray structure demonstrated how the channel can translocate many different kinds of protein chains that differ in size and chemical nature, how the channel maintains the membrane barrier, and how membrane proteins can exit an aqueous pore and enter the lipid bilayer. One of the most interesting results is that the structure overturned the dogma that the pore is formed by multiple SecY complexes, showing instead that a single complex forms the translocation pore. This model has subsequently been supported by direct biochemical evidence [1,2]. Electron microscopy (EM) had shown that the SecY complex assembles into multimers when bound to the ribosome [3-5]. In light of the recent data, one can now ask why multimers are required for translocation and how the different SecY complexes interact with the ribosome. Answering these questions requires high-resolution structures of ribosome-channel complexes, and we are collaborating with the lab of Dr. Tom Rapoport at Harvard Medical School to obtain this structural information.
Tat-Dependent Protein Translocation
In many prokaryotes, another protein translocation system is present in the cytoplasmic membrane. It is called the twin-arginine translocation (Tat) complex, and it operates in a functionally distinct pathway from Sec-dependent translocation (reviewed in [6,7]). The Tat complex is homologous to the ΔpH-dependent pathway found in the thylakoid membrane of higher plant chloroplasts. Unlike the SecY complex, which translocates unstructured substrates, the Tat complex specifically transports folded proteins, such as those containing metal cofactors. The N-terminal signal sequences of the Tat pathway are longer than conventional signal sequences and contain the conserved motif SRRxFLK.
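As a minimal sketch of how the SRRxFLK motif could be located in a protein sequence, the Python snippet below searches the N-terminal region for a strict match to the consensus. The function name, the 40-residue window, and the example sequence are hypothetical choices made for illustration; genuine Tat signal peptides often deviate from the strict consensus, so a search like this is only a first-pass filter.

```python
import re

# Strict consensus from the text: S-R-R-x-F-L-K, where x is any residue.
TAT_MOTIF = re.compile(r"SRR.FLK")


def find_tat_motif(sequence, n_terminal_window=40):
    """Return the 0-based start of the first strict SRRxFLK match inside the
    N-terminal window, or None if no match is found.

    The window length is an illustrative assumption, not a value from the text.
    """
    n_terminus = sequence.upper()[:n_terminal_window]
    match = TAT_MOTIF.search(n_terminus)
    return match.start() if match else None


if __name__ == "__main__":
    # Hypothetical sequence fragment used only to exercise the function.
    print(find_tat_motif("MNNSRRDFLKAAGLAAV"))
```

Restricting the search to the N-terminus reflects the statement above that the motif sits in the N-terminal signal sequence; a match elsewhere in the protein is ignored.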
There are three major components in the Tat system: TatA, TatB and TatC. All are integral membrane proteins. TatA and TatB each contain a single membrane-spanning N-terminal helix and an amphipathic cytoplasmic C-terminal helix, while TatC contains six transmembrane helices with both termini in the cytoplasm. TatA and TatB share approximately 20% sequence identity, yet are functionally different. TatA is expressed in more than 20-fold excess over TatB and TatC.
Although Tat substrates are small, transport of the folded substrates requires that the channel form pores up to 70 Å in diameter. A simple analysis of the required dimensions leads to a pore lined by at least 20 transmembrane helices. Two large complexes have been purified from overexpressed Tat components. The first contains TatB and TatC in a 1:1 ratio with a small amount of TatA; its molecular weight by gel filtration is approximately 600 kDa, indicating a multi-copy complex [9,10]. TatB and TatC are required for signal sequence recognition, and this complex has been shown to bind signal peptide. The second complex consists predominantly of TatA with small amounts of TatB and TatC. Visualized by negative-stain EM, the TatA complex appears as large rings with an apparent central pore of ~65 Å, similar to the predicted channel size. Together with much other evidence, this leads to the conclusion that the pore is formed by multiple copies of TatA and that signal peptide recognition is carried out by TatB and TatC.
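The helix estimate can be reproduced with a rough geometric sketch. The Python snippet below is not necessarily the analysis referred to in the text; it assumes that the helices pack side by side in a single ring, that each helix is about 10 Å across (an assumed typical value), and that the helix centers lie on a circle one helix radius outside the pore wall. The function name and parameters are hypothetical.

```python
import math


def helices_lining_pore(pore_diameter_angstrom, helix_diameter_angstrom=10.0):
    """Estimate how many side-by-side helices are needed to line a circular pore.

    Assumptions (not from the text): helices form one closed ring, and their
    centers sit on a circle of diameter (pore + helix). The count is the
    circumference of that circle divided by the helix diameter, rounded up.
    """
    ring_diameter = pore_diameter_angstrom + helix_diameter_angstrom
    circumference = math.pi * ring_diameter
    return math.ceil(circumference / helix_diameter_angstrom)


if __name__ == "__main__":
    for diameter in (65.0, 70.0):
        print(diameter, helices_lining_pore(diameter))
```

Under these assumptions both the ~65 Å ring seen by EM and the 70 Å upper bound give counts in the mid-20s, consistent with the statement that at least 20 transmembrane helices are required.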
The Tat translocation field is relatively young, and many simple questions about this intrinsically interesting system remain unanswered. Our goal is to understand the function of the Tat system through structural biology. This system is a potential antimicrobial target and has biotechnological applications in protein expression.
Membrane Proteins Involved in Glycosylation
In eukaryotes, an essential part of the export and maturation of many proteins that enter the ER lumen is that they contain specific glycosylation sites for the attachment of N-linked oligosaccharide chains. The first enzyme in this process is a multisubunit complex termed the oligosaccharyltransferase (OST). The enzyme is part of the larger translocon complex that contains SecY, and it recognizes the consensus sequence N-X-T/S as nascent proteins are translocated. Recently, a bacterial homologue has been identified in the species Campylobacter jejuni, which has been shown to functionally glycosylate substrates in E. coli. This and related archaeal homologues are perfect candidates for structural studies.
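To make the N-X-T/S consensus concrete, the following Python sketch lists every position in a sequence where the pattern occurs. It treats X as any residue, exactly as the consensus is written above; many descriptions of the sequon additionally exclude proline at X, which could be added by replacing `.` with `[^P]`. The function name and the test sequence are hypothetical.

```python
import re

# N, then any residue, then S or T. The lookahead reports overlapping sites,
# so two asparagines in a row can each start their own match.
SEQUON = re.compile(r"(?=N.[ST])")


def find_sequons(sequence):
    """Return the 0-based positions of asparagines that begin an N-X-T/S match."""
    return [match.start() for match in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequence used only to exercise the function.
    print(find_sequons("MKNGTALNNSS"))
```

A sequence match identifies only candidate sites; whether a given asparagine is actually glycosylated depends on the enzyme and the context of the nascent chain.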
Cellulose Synthase and Eukaryotic Membrane Proteins
A major component of plant tissue is cellulose, a β-1,4-glucan chain. Cellulose makes up a large part of the earth's biomass and is the main component of many products (e.g., paper). It is synthesized by a family of proteins termed cellulose synthases (CesA) (reviewed in [13,14]). These are large integral membrane proteins, over 1000 amino acids long, containing eight TM helices and a large cytoplasmic globular domain between the second and third TM helices. The most interesting fact about these proteins is that the catalytic activity is at the center of the TM domain, with glucose entering from the cytoplasmic side and cellulose exiting the opposite side. Additionally, the active site must somehow alternate relative to the growing chain to add each incoming sugar in a different orientation from the last. Many plants contain multiple genes in the CesA family that vary in size, but all CesA genes share the same general features and have high sequence similarity. Solving the structure of the CesA proteins will provide a means for exploring expression systems for eukaryotic membrane proteins and is a long-term goal of the laboratory.