This web page was produced as an assignment for an undergraduate course at Davidson College.
Proteomic Database Analysis of SEC22/YLR268W and YLR262C-A
SEC22 and YLR262C-A were characterized on two previous webpages using genomic database analysis and expression-pattern database analysis. The current goal is to supplement those analyses by mining proteomic databases, putting what we have learned into practice to gain a more complete understanding of the nature of discovery-driven descriptive analysis.
To refresh the reader's memory, this webpage includes a brief summary of what was found on the previous two webpages. SEC22 and YLR262C-A are both Saccharomyces cerevisiae genes located on chromosome 12. SEC22 encodes a synaptobrevin homolog (v-SNARE protein). The v-SNARE protein is involved in anterograde and retrograde transport between the ER and the Golgi (Dolinski et al., 2004). YLR262C-A remains elusive: Expression Connection returned only one paper with similarly expressed genes (Mizuguchi et al.), and Function Junction showed that YLR262C-A was induced in experiments involving EtOH, DMSO, and peroxisomes. The Function Junction data led me to the preliminary, but flawed, conclusion that YLR262C-A is involved in detoxification of ethanol. The conclusion is flawed because yeast utilize ethanol in fermentation and actually store ethanol during times of oxygen abundance, to be used later in fermentation in the event that oxygen becomes scarce.
It is unlikely that we will find enough data on proteome websites to develop a strong hypothesis for the function, location and structure of our unknown protein, because proteome analysis is the least extensively used method in the burgeoning, multidisciplinary, but equally ill-defined field of genomics. (Proteomics is sometimes grouped under the heading of genomics.) The point is that while genomic methods have been mastered, meaning that we have resolved the troublesome issues of how to obtain well-controlled data, proteomic methods are far behind. Proteomics is a newer discipline, so a gene that is not well researched in genomic microarrays will probably be even less well researched in proteomics.
At the end of this page, we propose an experiment that has the potential to elucidate the function, location and structure of the non-annotated protein.
The following databases will be used in our analysis:
TRIPLES - a database of TRansposon-Insertion Phenotypes, Localization, and Expression in Saccharomyces. The database provides information regarding three separate components of gene function: subcellular protein localization, timing of gene expression, and disruption phenotypes.
PDB - This site gives us information about protein structure that can help us in determining the structure-function relationship of the protein.
YRC Two-hybrid Analysis - This site reports protein interactions found by yeast two-hybrid screens, in which bait proteins are used to discover the prey proteins they interact with.
MIPS - Database of protein function and localization.
DIP - Database of Interacting Proteins. This database is useful for finding protein interactions and hypothesizing about functional relationships because many of the proteins have their functions listed as part of the search results. The table format of presenting data is clear and simple, but that simplicity may leave out some of the known complexities.
ExPASy - Expert Protein Analysis System proteomics server of the Swiss Institute of Bioinformatics (SIB).
EMP Project - The Enzymes and Metabolic Pathways database is a comprehensive and unique source of data covering metabolism and biochemical pathways.
KEGG - KEGG provides molecular and cellular function data.
Fig 1. Image obtained from TRIPLES. The image shows potential ORFs disrupted by transposon insertion, gene expression data, subcellular localization in the ER, and a disruption phenotype that renders the cells inviable. When reading the disruption phenotype data, note the following two abbreviations: HapTra = cell inviability of haploid transformants, and strong = strong difference in growth.
|We interpret these data to mean that Sec22p is an ER membrane protein, as we found earlier, and that it is crucial to the function of the entire yeast cell, because the cell is inviable when the gene is disrupted by insertion of a transposon. This supports the notion that SEC22 is involved in multiple important protein targeting events.|
Fig 2. PDB image of an engineered SEC22 homolog obtained from Mus musculus, 1ifq, can be viewed at the address in the following citation (PDB, 2004; <http://pdbbeta.rcsb.org/pdb/explore.do?structureId=1ifq>). Other views of the protein are offered at the PDB site ([PDB], 2004).
|We interpret the above figure less extensively than we might, simply because extensive interpretation does not elucidate much. We just need to get our minds around what the protein looks like in 3D. Seeing beta sheets as arrows and alpha helices as long ribbons helps us imagine the protein. Even though this is a SEC22 homolog and not an identical protein, it helps. You can go to the site and click on the links from the webpage to find more information.|
Fig 2. This image from Curagen Corporation was obtained by navigating through a link from YRC. The image shows the proteins that interact with SEC22. It is actually two pictures: the one on the left shows SEC22 function, and the one on the right shows that SED5 is regulated by proteins other than SEC22, such as YKT6 via VTI1.
|The interpretation of these data is facilitated by noting that YKT6 is involved in cell signalling pathways (YRC link to Curagen). Note that YKT6 is a synaptobrevin homolog similar to SEC22, that it is essential for ER to Golgi transport, and that defects in YKT6 result in secretory problems. This provides convergent evidence for what we believe about SEC22 as a vesicle transport protein of the ER.|
Fig 3. This image from a MIPS search shows the proteins with which SEC22 interacts physically as well as genetically.
|The interpretation of the above figure should center on the fact that many of these genes are SEC genes, and that we already saw SED5 above. This suggests that these proteins are part of a pathway that regulates protein targeting.|
Fig 4. This image was obtained from <http://www.expasy.org/cgi-bin/niceprot.pl?SC22_YEAST>. It shows once again that the cellular function of SEC22 is ER to Golgi transport. The image also makes reference to other proteins such as SLY2 and YPT1.
|We should interpret this figure to mean that SEC22 is involved in the cellular function we originally thought. Additionally, SEC22 might dock with SLY2, because Fig 2 shows that SLY1 is involved with SEC22. Fig 2 also shows that YPT1 is regulated by SEC22. If YPT1 is not lost when SEC22 is present, this supports the idea that SEC22 can induce YPT1. We could design an experiment to test this.|
Fig 5. This figure from DIP shows some of the same players interacting with SEC22 and it also shows the functions of these proteins.
|We can interpret this graph in a way that elucidates our interaction web. YPT1 binds GTP, so YPT1 might be involved in regulation of signal induction. Figure 3 shows that CDC45, a cell cycle protein, might be involved with SEC22, which gives this hypothesis support. The same holds for SAR1, in that it is a GTP-binding protein and it is shown in Fig 2. An ADP-ribosylation factor 1 shows up in Fig 5, which may suggest regulation of, or by, the cell cycle. SLT2 is a kinase, which again suggests regulation of a signalling pathway. This means that SEC22 probably has an effect on the timing of protein transport. This idea was advanced in one of the earlier pages I wrote on the topic, which mentioned that a cell may store protein in vesicles and then target it to the desired location at the desired time. Many of the other proteins in Fig 5 are membrane-associated or transport vesicle proteins, which supports the notion that SEC22 binds to vesicles and is involved in vesicle transport processes.|
EMP - no data
Fig 6. Image from Benno Figure 1 of SEC22 and its interacting proteins. SEC22 is shown to interact with proteins that we saw in previous figures.
|We interpret these data as corroborating evidence for the interactions of SEC22 that we defined earlier. Seeing the data on a graph also helps us to organize it mentally. We can see that SEC22 and SED5 interact directly, while YKT6 appears to be removed from SEC22 by at least two intermediate proteins, one of which is SED5. This shows us that some of the data we gain from the yeast signalling pathway on Curagen can make proteins seem more closely related than we would perceive them to be based on a graph. Note that the color key in our genomics textbook indicates that the blue boxes around SED5, YKT6, SEC22, and BET1 mark these as membrane fusion proteins localized to the same place, the ER membrane, although the functions of these proteins differ. Also note that the green lines connect proteins whose subcellular localizations are the same but whose functions may differ, according to page 181 of our genomics textbook. This is significant when we note that most of our proteins (SED5, SEC22, GOS1, BOS1, SLY1, SAR1) are located in the same place, the ER. However, we also note that some of these proteins are located in multiple places. For example, SEC22 has both a black and a green line running to SAR1, which means that these two proteins are found both in the same place and in different places.|
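The "number of intermediates" reading above can be checked mechanically with a breadth-first search over an interaction graph. The sketch below is only an illustration: the edge list is a toy one assembled from the interactions described on this page (SEC22 with SED5, and YKT6 reaching SED5 via VTI1), not a download from Curagen, MIPS, or DIP, and the function names are hypothetical.

```python
from collections import deque

# Toy edge list built only from interactions described in the text above.
# It is an assumption for illustration, not real database output.
EDGES = [
    ("SEC22", "SED5"),
    ("SED5", "VTI1"),
    ("VTI1", "YKT6"),
]


def build_graph(edges):
    """Return an undirected adjacency map: protein -> set of partners."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest path as a list, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for partner in sorted(graph[node]):
            if partner not in seen:
                seen.add(partner)
                queue.append(path + [partner])
    return None


graph = build_graph(EDGES)
path = shortest_path(graph, "SEC22", "YKT6")
intermediates = path[1:-1]  # proteins strictly between the two endpoints
print(" -> ".join(path))
print("intermediates:", intermediates)
```

With a fuller edge list exported from one of the interaction databases, the same search would show whether a shorter route between two proteins exists than the one suggested by a single figure.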
TRIPLES - no data/no records found
PDB - no data
YRC Curagen link - no data
MIPS - Check it out! We have some data. Too bad it gives us zero reliable information...
Fig 7. This image from a MIPS search shows that YLR262C-A is localized to the nucleus and the cytoplasm.
|We should interpret this to mean nothing. These are the two most abundant places for a protein to be found. The ER is also an area that is unreliable as a cellular localization, unless extensively substantiated.|
DIP - no data
ExPASy - no data
EMP - no data
Fig 8. The image from the KEGG database shows that YLR262C-A has a homolog in C. elegans and a synthetic homolog in S. cerevisiae.
We should interpret these data to mean very little. A synthetic homolog does not offer new data, nor does a homologous C. elegans protein if it has not been researched and characterized. I followed the links and found that neither protein had been characterized, so it was back to square one.
SEC22 is well annotated, so it would be difficult to suggest further experiments at this time.
YLR262C-A is so poorly annotated that we do not even know its subcellular localization. It could be in the nucleus or in the cytoplasm according to our data, and those two localizations are extremely prone to false positives. We should perform an in situ hybridization or immunofluorescence experiment by isolating the yeast protein YLR262C-A and injecting the protein into a mouse. The mouse would make antibodies to the yeast protein. Then we should inject these mouse antibodies into a rabbit. We should isolate the antibody-producing cells of the rabbit in order to tag these antibodies with fluorescein. We can synthesize the rabbit antibodies in the presence of fluorescently labeled NTPs. Then we can freeze the yeast cells (or, more precisely, fix them with formaldehyde) at a particular time point. Next we treat the yeast cells with detergent so that the cell membrane is porous enough for antibodies to enter. Then we place mouse antibodies on a culture of yeast cells, rinse, treat with the fluorescently tagged rabbit antibodies, and rinse again. We can expose the cells and see where the antibodies bound to find the subcellular location of the protein. Isolating the protein is not that hard considering it is only 64 amino acids long (as described on my earlier pages), so we can synthesize it easily. We could test the protein at different times during the cell cycle and/or the developmental stages of the yeast to see if the gene is turned on at particular times. In situ hybridization is not well suited to high-throughput proteomic analysis, but it is very useful in determining specifics about a particular protein.
We also probably want to get a better idea of the cellular role of YLR262C-A while we are at it. Thus, we could perform a gene disruption experiment and observe the effects. We could flank the gene YLR262C-A with loxP sites and then add Cre enzyme so that the gene would be excised. If the cell line becomes inviable, then we know we are dealing with a crucial protein. We assume that whichever function is lost is the one that YLR262C-A performed. If vesicle targeting is disrupted, for example, then we assume that the YLR262C-A protein is involved in vesicle targeting. At this time, it is impossible to hypothesize what the function of YLR262C-A would be.
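The Cre/loxP step can be pictured in silico as a simple string operation. The sketch below is a toy model under stated assumptions: it uses the canonical 34 bp loxP sequence, it only handles two loxP sites in the same orientation on the strand given (the arrangement that leads to excision), and the flanking and "gene" sequences are made-up placeholders, not the real YLR262C-A locus. The function name `cre_excise` is hypothetical.

```python
# Canonical 34 bp loxP site: two 13 bp inverted repeats around an 8 bp spacer.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(sequence, site=LOXP):
    """Toy model of Cre-mediated excision between two direct-repeat loxP sites.

    Returns the sequence with the floxed segment and one loxP copy removed,
    leaving a single loxP "scar". If fewer than two sites are found, the
    sequence is returned unchanged.
    """
    first = sequence.find(site)
    if first == -1:
        return sequence
    second = sequence.find(site, first + len(site))
    if second == -1:
        return sequence
    # Keep everything through the first site, then resume after the second.
    return sequence[: first + len(site)] + sequence[second + len(site):]


# Placeholder sequences (assumptions for illustration only).
upstream = "AAAACCCC"
gene = "ATGGCTGCTTAA"  # stand-in ORF, NOT the YLR262C-A sequence
downstream = "GGGGTTTT"

floxed = upstream + LOXP + gene + LOXP + downstream
excised = cre_excise(floxed)

print(len(floxed), "->", len(excised))
print("gene still present:", gene in excised)
```

After excision the placeholder ORF is gone and one loxP site remains between the flanking sequences, which is the outcome the proposed disruption experiment relies on.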
Through the semester we have become proficient in the use of genomics and proteomics databases to form hypotheses. We have found the limitations of these methods, in that we often lack sufficient data to arrive at hypotheses. We have also found that these methods can be used to control our experiments, as was the case with discoveries of aneuploidy. We are now moving to the smaller scale. Once we have a specific hypothesis about a gene, or simply want more information about a specific gene, we need to use more traditional methods, such as immunofluorescence from developmental biology. The proteomic tools used on this webpage facilitate hypothesis development because they provide physical interaction data. When the data are presented in a visually effective manner, much can be elucidated with little effort by the reader.
But there are other problems and areas we need to develop in this burgeoning field. For one, we need more experiments performed in proteomics, and we need to overcome the hurdles of hitting a moving target. Protein levels change all the time in the cell, and there may be multiple splicings for a single mRNA. The proteome can segue into the metabolome, which is a moving target unlike the genome. Additionally, as we move towards individualized cancer treatments and even preventative medical treatments, it is important for us to develop sound protein modelling tools so that we can image the 3D interactions of proteins accurately, predict hyperactive or non-functional proteins, and see how to correct malfunctions in proteins that may be minute or environmentally (e.g. pH) determined. There is a great deal of complexity and we have not yet scratched the surface. Still, genomic and proteomic methods that drive hypothesis generation, together with computer programming, visual data representation, and algorithmic database mining, combined with the more traditional methods used to test and engineer specific genes, offer hope for future knowledge and understanding, as well as social benefit.
Cherry, J.M. et al., 1997. Genetic and physical maps of Saccharomyces cerevisiae. Nature 387: 67-73.
[Curagen] Uetz et al. 2000
Dolinski, K. et al., 2004. Saccharomyces Genome Database. <http://www.yeastgenome.org/> Accessed Nov. 18, 2004.
Kumar, A. et al., 2000. TRIPLES: a Database of Gene Function in S. cerevisiae. Nucleic Acids Res. 28: 81-84.
[PDB] Protein Data Bank, 2004. <http://www.rcsb.org/pdb/> Accessed Nov. 18, 2004.
Ross-Macdonald, P. et al., 1999. Large-scale analysis of the yeast genome by transposon tagging and gene disruption. Nature 402: 413-418.
Schwikowski, Benno and Uetz, Peter, 2000. Benno Figure 1 from the Genomics Place website.
Max Citrin's Genomic Homepage
© Copyright 2004 Department of Biology, Davidson College, Davidson, NC 28035