The Yeast Genes MCA1 and YOR205C
My two favorite yeast genes are located on Saccharomyces cerevisiae chromosome XV. As shown on the graph below, the two genes (each in a rectangular box) lie toward the middle of the chromosome. One gene, labeled MCA1, is similar to mammalian caspase genes that play a role in apoptosis. The other gene, YOR205C, lies on a segment of chromosome XV near MCA1. This gene is non-annotated, which means that much about the gene has not been determined. Various gene analysis techniques, such as Gene Ontology, BLAST analysis, NCBI databases, PREDATOR secondary structure prediction, and Kyte-Doolittle hydropathy plots, can be used to characterize the cellular activity of the annotated and non-annotated genes.
Figure 1. Map of a portion of S. cerevisiae chromosome XV, displaying the MCA1 gene and the FMP38 uncharacterized gene (in large black boxes). Permission granted from http://www.yeastgenome.org. (Cherry J.M. et al., 1997)
Characterization of MCA1
Gene Ontology Information
Molecular Function – Caspase Activity
Biological Process – apoptosis
Cellular Component – nucleus
What does MCA1 do?
According to the Gene Ontology terms above, MCA1 acts like a cysteine-aspartic acid protease, or caspase, that is expressed in the nucleus. It is a gene involved in apoptosis, or programmed cell death. Under certain conditions, such as interaction with harmful extracellular compounds, a yeast cell activates a metabolic pathway that eventually leads to organized destruction of cellular components and membrane blebbing. Unlike in necrosis, which results in cell membrane lysis, a cell undergoing apoptosis activates enzymes, such as caspases and DNases, that break down DNA and proteins into small, organized chunks (for example, DNA is cut into ~500 bp pieces). As a final step, the membrane of the cell forms small vesicles around cellular components, keeping those components contained rather than floating around in the interstitial space of a multicellular organism. These small vesicles are engulfed by macrophages.
Why does a cell undergo such a process? In multicellular organisms, apoptotic cell bodies that result from the process are easily phagocytosed by macrophages, thereby providing an efficient way of getting rid of dead cells ("Apoptosis"). Without this process, dead cells would burst and release their contents, creating extracellular debris that could cause the immune system to react, causing inflammation and damage to other cells in the area. However, yeast is not a multicellular organism; it doesn't even have an immune system. So why does it undergo apoptosis? An article written in 2000 suggests that apoptosis developed in yeast cells as an altruistic mechanism. The article shows that, much like in multicellular species, apoptosis can be initiated through exposure to high concentrations of reactive oxygen species (ROS), such as H2O2 (Frohlich and Madeo, 2000). Exposure to ROS can cause a cell to eventually die; however, in the process of dying that cell will use up resources that other yeast cells could use. Instead of wasting resources, yeast cells shut down their metabolism and undergo apoptosis as an altruistic behavior, aiding the resource allocation and survival of a yeast colony (Frohlich and Madeo, 2000). Over evolutionary time, this process developed into an elaborate mechanism whereby a futile cell, such as one with mutations in translation genes that prevent it from surviving, would save colony resources by killing itself efficiently. The process also acquired specific proteases, such as additional caspases, that enable a cell to respond to a greater variety of stimuli and to commit suicide more efficiently (Frohlich and Madeo, 2000). This process also became a key mechanism for the development of full tissues in multicellular organisms.
Coding Sequence vs. Genomic Sequence
A comparison of the genomic sequence of MCA1, shown below left, and the coding sequence, below right, shows that the gene contains no introns. Since the two sequences are the same length, 1299 bp, the genomic sequence does not contain extra segments and therefore no intron segments are spliced out during gene expression. In addition, according to the Entrez Gene information on MCA1 (shown here), there is only one splicing variant of MCA1, so the gene has only one mRNA product.
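The comparison described above can be sketched in a few lines of Python. This is only an illustration: the function name `has_no_introns` and the toy sequences are assumptions made for the example, not the real MCA1 sequence. Note that the check compares the sequences themselves, which is slightly stricter than comparing lengths alone.

```python
def has_no_introns(genomic_seq: str, coding_seq: str) -> bool:
    """Return True when the genomic ORF and the coding sequence are identical.

    Whitespace and letter case are ignored so that sequences pasted from a
    database page can be compared directly.
    """
    genomic = "".join(genomic_seq.split()).upper()
    coding = "".join(coding_seq.split()).upper()
    return genomic == coding


# Toy sequences for illustration only (not the real MCA1 sequence).
toy_genomic = "ATGGCTTAA"
toy_coding = "ATGGCTTAA"
print(len(toy_genomic), len(toy_coding), has_no_introns(toy_genomic, toy_coding))
```

If the genomic sequence were longer than the coding sequence, the extra bases would point to one or more introns.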
Figure 2. Coding sequence of MCA1. The coding sequence is 1299 base pairs long. Permission Granted from http://www.yeastgenome.org.
GC Content Analysis
The yeast genome has a 38.3% GC content (Campbell and Heyer 2006). By comparing this value with the GC content of the MCA1 gene, one can determine whether this gene has unusual characteristics for a typical S. cerevisiae gene. This analysis was performed using the GC calculator, which determined that the GC content of the MCA1 gene was 42.4%. Although this value is slightly higher than the average for the genome, it is in the same general range. Therefore, the gene is not unusual for S. cerevisiae in terms of nucleotide content, which suggests that the gene is probably under the same selection pressures and mutation rates as the rest of the genome. High GC contents correlate with high rates of mutation (Campbell and Heyer 2006); this GC content is slightly higher than average but not substantially higher, supporting the conclusion that the gene has either a slightly higher or an identical mutation rate compared to the whole genome.
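The GC content itself is simple arithmetic: the number of G and C bases divided by the total number of bases. A minimal sketch follows; the function name `gc_content` is an assumption for this example, and it is not the code behind the online GC calculator used above.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases among the A, C, G, T bases."""
    cleaned = "".join(seq.split()).upper()
    # Ignore ambiguity codes such as N so they do not distort the percentage.
    bases = [base for base in cleaned if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)


# Toy sequence for illustration only.
print(gc_content("ATGC"))
```

Running the function on a gene sequence and comparing the result with the genomic average of 38.3% reproduces the kind of comparison made in this section.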
BLAST search for homologues:
To determine whether any regions of the MCA1 gene were conserved in other places in the S. cerevisiae genome or in other species, I performed a BLASTn search on the genomic DNA sequence of the MCA1 gene and retrieved 57 BLAST hits.
Figure 3. Pictorial representation of BLAST results from BLASTing the genomic sequence of MCA1. The picture shows that two sequences out of 57 hits produced bit scores greater than 200, shown by red lines. Other hits returned bit scores between 40 and 80, showing that their sequence homology was not as high. Data from public domain server http://www.ncbi.nlm.nih.gov/blast/.
Figure 4. Data sample from the BLAST search. The 11 hits shown above produced E-values of 0.001 or less. All other hits had E-values greater than 0.001, pointing toward weaker similarity to the query sequence. Data from public domain server http://www.ncbi.nlm.nih.gov/blast/.
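Bit scores and E-values are two views of the same statistic. For a bit score S, a query of length m, and a database of length n, the expected number of chance hits is approximately E = m × n × 2^(−S). The sketch below shows that relationship. The function name `expected_hits` and the database length are assumptions for illustration, and BLAST itself applies effective-length corrections that are omitted here.

```python
def expected_hits(bit_score: float, query_len: float, db_len: float) -> float:
    """Approximate E-value for a bit score: E = m * n * 2**(-S).

    query_len (m) and db_len (n) are raw lengths; BLAST uses corrected
    "effective" lengths, so real E-values will differ somewhat.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


# The 1299 bp query length is from the text; the database size is hypothetical.
HYPOTHETICAL_DB_LEN = 1e9
for score in (40, 80, 200):
    print(score, expected_hits(score, 1299, HYPOTHETICAL_DB_LEN))
```

Because the score sits in the exponent, each additional bit halves the expected number of chance hits, which is why hits with bit scores above 200 are far more convincing than those between 40 and 80.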
Two of these hits had bit scores higher than 100: the first corresponded to the YOR197W gene, which is the alternative name for MCA1, in a clone of S. cerevisiae, while the second corresponded to an ORF on chromosome XV of S. cerevisiae with the same identifying information, presumably the MCA1 gene again. Therefore, the only two hits with very large bit scores were hits for the MCA1 gene itself. However, other hits came up with low E-values, showing that genes similar to MCA1 might exist in other places. The next two hits in the list correspond to a hypothetical mRNA (reference here) and an ORF on chromosome I of the Candida glabrata (another yeast) genome. Since this sequence is only a hypothetical mRNA, it has no annotation information and therefore we cannot determine whether the gene is a caspase or not. However, we can determine the degree of similarity between the sequences:
Figure 5. BLAST2 alignment data for comparison of MCA1 (horizontal axis) and Candida glabrata hypothetical mRNA. Data available from http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi.
Figure 6. BLAST2 alignment for the middle section of both genes, showing 74% identity between the two sequences over a 503 base-pair period. Data available from http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi.
Figure 7. BLAST2 alignment for the end section of both genes, showing 75% identity between the two sequences over a 276 base-pair period. Data available from http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi.
The BLAST2 graph shows that two segments of the sequences are homologous. The top set of data shows that the middle section of each sequence is 74% identical, while the bottom set displays 75% identity between the end sections of each piece of DNA. From this information, it can be inferred that the two sequences share homologous segments and, because they come from different species, may be orthologs, pointing to the possible existence of caspase-like genes in other yeast genomes. If this is true, then it would support the conclusion that caspase activity did not originate in S. cerevisiae and could have evolved in an ancestral strain, possibly the most recent common ancestor (MRCA) of unicellular and multicellular eukaryotes. In this scenario, this apoptosis gene is a precursor to the more advanced system of apoptosis seen in mammals and other multicellular organisms.
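The percent identity reported by BLAST2 is the number of matching columns divided by the length of the alignment. A minimal sketch of that calculation follows. It assumes the two sequences have already been aligned (gaps written as "-"), and the function name `percent_identity` and the toy alignment are made up for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return percent identity over all columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    # Gapped columns count toward the alignment length but never as matches.
    return 100.0 * matches / len(aligned_a)


# Toy alignment for illustration only.
print(percent_identity("ACGTAC-T", "ACGAACGT"))
```

By this definition, 74% identity over a 503 bp alignment corresponds to roughly 372 matching columns.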
The list also contains genes in other yeast species, such as K. lactis and S. pombe, showing that it remains conserved in many related species. Further down the hit list, two hits were surprising:
Figure 8. Data sample from the BLAST search. Data from public domain server http://www.ncbi.nlm.nih.gov/blast/.
Segments of mouse DNA and segments from human chromosome 5 also appeared as similar DNA sequences. However, all of the hits for human and mouse DNA were either predicted proteins or non-annotated genes, so the sequence is not significantly similar to any known human or mouse gene.
Literature Information on MCA1
One paper from PubMed, entitled “A Caspase-Related Protease Regulates Apoptosis in Yeast,” describes and characterizes the function of MCA1 (called YCA1 in the paper) through the use of immunoblotting, gene overexpression, fluorimetric analysis of caspase activity, and flow cytometry (Madeo et al., 2002) (link to the abstract). The paper states that the gene “is activated when yeast is stimulated to apoptosis” and “displays a caspase-like proteolytic activity” (Madeo et al., 2002), implicating the gene in an apoptosis pathway. The article also mentions that the protein produced from MCA1 is most like an “initiator caspase” in vertebrates, or one of the earlier signaling compounds in the pathway (Madeo et al., 2002). This is in contrast with an effector caspase, which is activated further along in the process and which performs more functional tasks in the cell than signaling other molecules. Specifically, the gene acts much like human caspase-8, since introduction of caspase-8 into yeast cells caused cell death while introduction of other caspases did not cause yeast cells to die (Madeo et al., 2002).
Another interesting point that the article showed was that “the catalytical center of YCA1 is essential for the caspase activity” (Madeo et al., 2002). By mutating a portion of the center of the enzyme, the researchers greatly reduced the activity of the protein, showing that the middle section is the most important for proper protein function. This result provides further evidence that the action of this caspase-like gene is conserved among yeast species. From the BLAST2 data above, the conserved section between the Candida glabrata gene and the S. cerevisiae gene lies in the middle of the gene, the region that codes for the protein's catalytic center. Since this functional region is one of the most conserved sections between the two sequences, this finding further suggests that Candida glabrata has an MCA1 homolog and an apoptosis pathway.
Finally, the article hypothesized that “yeast only contain a single, central caspase,” or MCA1, for the regulation of apoptosis. Humans, by contrast, have many initiator caspases that target many other genes, creating a much more complex pathway to cell death. Although the yeast system is much less complex than that in humans, it is nonetheless a precursor to the array of signaling pathways involved in animal apoptosis. Although there are other proteins involved in the total process, yeast only requires one initiator caspase, while other apoptotic pathways require multiple proteins.
Does this gene have other orthologs?
To determine whether orthologs to MCA1 exist in species other than yeast, I performed an NCBI homologue search for MCA1 and retrieved the following results:
Figure 9. Data from NCBI Homologene database on MCA1.
Data available on NCBI Homologene here
The data show that the gene is conserved in only a few species: two other yeast species (K. lactis and E. gossypii) as well as a species of mold (N. crassa) and the Japanese rice species O. sativa. If this list is exhaustive, it is more likely that the gene appeared in O. sativa through horizontal gene transfer than through direct descent from yeast; however, this list probably does not include all the species that contain orthologs to the MCA1 gene, since the BLAST search returned 57 sequence hits.
According to the above Pubmed article, if MCA1 has a human ortholog it should be caspase-8. To test this hypothesis, I performed a BLAST2 search comparing my favorite yeast gene with human caspase 8 (accession number U60520).
Figure 10. Data from BLAST2 search. Data available on public domain server http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi.
Since no significant similarity in the sequences was observed, the genes are probably not orthologous. In addition, a comparison of the protein sequence of MCA1 with the caspase-8 precursor protein in humans (accession number Q14790) showed that the two proteins are not orthologous.
Although these two genes do not share sequence homology, it is possible that they have similar shapes and therefore similar intracellular actions. Since caspase-8 was cytotoxic when introduced into yeast (Madeo et al., 2002), it could have a similar shape but a different primary structure.
PDB Information - Not Available for MCA1.
Mutant Alleles of MCA1
According to SGD, there is only one known mutant of MCA1. This mutant (shown in the table on this SGD page) causes a systematic deletion in a portion of the MCA1 gene. However, the table also states that the phenotype of this gene still produces viable yeast, showing that the mutation is not cytotoxic and perhaps does not affect the action of the MCA1 gene significantly. Mutant phenotype information does not provide any information that supports the conclusions made in the above results.
PREDATOR Secondary Structure Prediction
Since the gene ontology terms for cellular location of MCA1 have already been determined, there is no need to use a Kyte-Doolittle hydropathy plot to determine whether the protein product exists in a plasma membrane (it does not). However, software from the PREDATOR database can help predict the secondary structure of MCA1. Although it is difficult to determine the shape of the protein from the secondary structure without computer modeling, this prediction can help further characterize the protein products from MCA1.
Figure 11. PREDATOR prediction data for MCA1. Color-coded protein segments correspond to the designations in Figure 12. (Combet et al., 2000)
Figure 12. PREDATOR numerical prediction data for the yeast gene MCA1. The data show that the protein is mostly comprised of random coiling. Color-codes for each structure type correspond to the colors in Figure 11. (Combet et al., 2000)
The data show that most of the MCA1 protein is composed of random coiling. Although almost 20% consists of alpha helix, the vast majority of the remaining part of the protein exhibits random coiling patterns. This database prediction tool shows the relative shortcomings of simply using databases to characterize genes. Although some information can be determined, and even sometimes enough can be gleaned to form a hypothesis, the data will always be incomplete without additional experimental evidence.
From these databases, some of the properties of MCA1, a gene in yeast that is an initiator of apoptosis, have been determined. The gene is activated by the presence of high concentrations of ROS, which begins a signaling pathway within the cell that leads to a number of physiological changes, including organized DNA degradation, protein degradation, phosphatidylserine flipping from the inner membrane to the outer membrane, and the breakup of the cell into smaller vesicles. The gene does not contain any introns and, as far as we know, only codes for one mRNA and does not exhibit alternative splicing. The Gene Ontology information localizes the gene to the nucleus and confirms the gene's role as a caspase and a part of the apoptotic pathway. Orthologs to this gene exist in other yeast species, showing that the gene did not evolve in S. cerevisiae and possibly implicating the gene as an evolutionary precursor to animal apoptosis signaling proteins. The gene also has conserved regions in the middle of the sequence with more distant Candida yeast, further tracing the evolutionary history of the gene. Although the literature source for the gene compared the mode of action of the MCA1 protein to that of caspase-8 in humans, the two genes do not possess sequence similarity.
Characterization of YOR205C
Like the annotated gene described above, the non-annotated gene YOR205C lies near the 0.7 Mb marker of chromosome XV. Although this gene's function is unknown, it has been named FMP38 according to the SGD (SGD YOR205C gene).
Gene Ontology Information
Molecular Function and Biological Process GO terms are unknown for this gene.
Cellular Component: Mitochondrion
Nucleotide Sequence of YOR205C:
Figure 11. Nucleotide sequence of YOR205C ORF. This nucleotide sequence is 1671 bp long.
Permission Granted from http://www.yeastgenome.org.
GC Content Analysis
Compared with the S. cerevisiae genomic %GC of 38.3%, YOR205C has a GC content of 38% (GC Calculator). This value is almost exactly the average GC content for S. cerevisiae. The gene is not unusual for the yeast genome in terms of nucleotide prevalence and probably has a mutation rate similar to the genomic average.
BLAST Search for homologs:
One way to determine a gene's possible function is to find similar sequences that have a known function. To this end, the nucleotide sequence of YOR205C was BLASTed; the search for nucleotide-nucleotide sequences returned the following set of hits with E-values below 1. The total number of hits returned was 81; however, many of these hits had Bit Scores of approximately 44 and E-values of greater than one, leading to the conclusion that these sequences share very limited similarity to YOR205C.
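Selecting hits by E-value, as done above, amounts to a simple filter over the hit list. The sketch below shows one way to do it. The `Hit` record, the function name `filter_hits`, and every value in the example list are invented for illustration and are not the actual BLAST output for YOR205C.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    description: str
    bit_score: float
    e_value: float


def filter_hits(hits: List[Hit], max_e_value: float) -> List[Hit]:
    """Keep hits at or below the E-value cutoff, most significant first."""
    kept = [hit for hit in hits if hit.e_value <= max_e_value]
    return sorted(kept, key=lambda hit: hit.e_value)


# Invented hits for illustration only.
example_hits = [
    Hit("weak hit", 44.0, 3.2),
    Hit("query gene itself", 3000.0, 0.0),
    Hit("moderate hit", 60.0, 0.01),
]
for hit in filter_hits(example_hits, max_e_value=1.0):
    print(hit.description, hit.e_value)
```

Tightening `max_e_value` (for example to 0.05) narrows the list to the hits that are least likely to have arisen by chance.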
Figure 11. Data sample from the BLAST search. The 11 hits shown above produced E-values of 1 or less. All other hits had E-values greater than 1.
Data from public domain server http://www.ncbi.nlm.nih.gov/blast/.
The first two hits in the search returned the actual YOR205C gene: the first hit as the gene itself and the second hit as a segment of yeast chromosome XV. Aside from these hits, there were three others that produced relatively low E-values (E<0.05). The first was another S. cerevisiae gene that codes for a "putative ATP-dependent RNA helicase" (NCBI BLAST database) (link to the helicase molecule). This gene is involved in processing and metabolism of RNA molecules in a cell (information about RNA helicases found here). These two genes (YOR205C and the RNA helicase) share a sequence of 75 base pairs that is exactly identical. This finding suggests that YOR205C interacts with RNA, since part of its sequence is identical to that of an RNA-processing gene. YOR205C and this RNA helicase could be paralogs; however, further comparison of the two sequences through experimental characterization of YOR205C would be needed to determine whether the two genes evolved from a common ancestor or not.
The next hit found in the BLAST search is a segment of the complete genome of Prochlorococcus marinus, a photosynthetic unicellular bacterium (link to the gene). The gene's function is listed as a GTPase (NCBI BLAST database). The two sequences share only an identical 26 base-pair segment, however, so it is unlikely that the genes are equivalent and that YOR205C is a GTPase. However, the sequence identity could indicate that YOR205C interacts with GTP.
The final hit in the BLAST search with a small E-value was a gene from Buchnera aphidicola, a bacterium that lives inside aphids and helps them metabolize food and obtain energy (link to the gene). The two sequences share only a 29 base-pair region of homology, and that conserved sequence contains only 28 identities between the two genes. However, the Buchnera gene codes for a GTP-binding protein, giving more evidence that the yeast gene is involved with GTP in some way. It is possible, though, that because the two genes share so little homology, they have similar structures at one segment but completely different functions.
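The short identical stretches discussed above (75, 26, and 28 of 29 base pairs) are exact or near-exact shared segments. As a rough illustration of how an exact shared segment can be found, the sketch below computes the longest common substring of two sequences with dynamic programming. The function name `longest_shared_segment` and the toy sequences are assumptions for the example. This is not the BLAST algorithm, which uses word seeding and extension and tolerates mismatches.

```python
def longest_shared_segment(seq_a: str, seq_b: str) -> str:
    """Return the longest stretch that appears identically in both sequences."""
    a, b = seq_a.upper(), seq_b.upper()
    best_len, best_end = 0, 0
    # previous[j] = length of the shared run ending at a[i-2] and b[j-1].
    previous = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                current[j] = previous[j - 1] + 1
                if current[j] > best_len:
                    best_len, best_end = current[j], i
        previous = current
    return a[best_end - best_len:best_end]


# Toy sequences for illustration only.
print(longest_shared_segment("TTACGTACGG", "CCACGTACAA"))
```

The running time grows with the product of the two sequence lengths, which is fine for a pair of genes but is the reason database searches rely on heuristics instead.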
Known Mutant Phenotypes:
The S. cerevisiae database displays the following known phenotypes that result from mutations in this gene (below):
A systematic deletion in a segment of the YOR205C gene leads to viable cells.
A systematic deletion in another segment of the YOR205C gene causes yeast cells to "exhibit growth defect[s] on a non-fermentable (respiratory) carbon source."
A homozygous systematic deletion causes "reduced fitness in rich medium."
Another homozygous systematic deletion causes "decreased metabolite accumulation."
(SGD YOR205C phenotypes).
This information links mutant phenotypes of the YOR205C gene to reduced metabolic capability. Three of the four deletions listed above produce phenotypes with either reduced metabolic capacity or reduced growth, which could itself reflect impaired metabolism. In particular, mutations in the gene can cause yeast cells to grow differently. These mutations, therefore, implicate the YOR205C gene in metabolism and/or cell growth. This hypothesis is supported by the GO term that locates this gene in the mitochondrion, a major site of metabolism in cells.
Using a Kyte-Doolittle plot can help determine whether the protein coded by YOR205C contains any predicted inter-membrane segments. The figure below shows the Kyte-Doolittle prediction plot for YOR205C:
Figure 12. Kyte-Doolittle transmembrane prediction plot for YOR205C. The window size was set at 19 to predict transmembrane regions of the protein. (Kyte-Doolittle Hydropathy Plot)
Because there are no regions of the protein from YOR205C above the red cutoff line, which signals a hydropathy score of 1.8, the software predicts that there are no regions of the YOR205C protein that are hydrophobic enough to fit across a plasma membrane. This data lends support to the conclusion that the protein exists either inside the mitochondria or in the cytoplasm; combined with the GO location information one could conclude that the protein product of YOR205C exists inside the mitochondria.
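A Kyte-Doolittle plot is a sliding-window average of per-residue hydropathy values. The sketch below uses the standard Kyte-Doolittle scale with the window size of 19 and the cutoff of 1.8 mentioned above. The function names and the toy peptides are assumptions for illustration, and this is not the code behind the online plotting tool.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19) -> list:
    """Return the mean hydropathy of every full window along the protein.

    Raises KeyError if the sequence contains a non-standard residue code.
    """
    residues = protein.upper()
    if window < 1 or len(residues) < window:
        return []
    scores = [KD_SCALE[aa] for aa in residues]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


def has_candidate_tm_segment(protein: str, window: int = 19,
                             cutoff: float = 1.8) -> bool:
    """Return True if any window average rises above the cutoff."""
    return any(score > cutoff for score in hydropathy_profile(protein, window))


# Toy peptides for illustration only (not the YOR205C protein).
print(has_candidate_tm_segment("L" * 25), has_candidate_tm_segment("D" * 25))
```

A protein with no window above the cutoff, as reported for YOR205C, would return False from `has_candidate_tm_segment`.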
PREDATOR Secondary Structure Prediction
The PREDATOR software allows one to predict the secondary folding structure of the amino acids in YOR205C. This prediction tool can show regions of alpha helices or random coils in a protein, thereby helping to determine its shape.
Figure 13. PREDATOR prediction data for YOR205C. Color-coded protein segments correspond to the designations in Figure 14. (Combet et al., 2000)
Figure 14. PREDATOR numerical prediction data for the yeast gene YOR205C. The data show that the protein is mostly comprised of random coiling. Color-codes for each structure type correspond to the colors in Figure 13. (Combet et al., 2000)
The PREDATOR prediction tool determined that much of the YOR205C protein is made up of random coil, although there are a few sections of the protein that are alpha helices. Prediction of protein shape from this data, however, is more difficult and cannot be done using online databases.
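The numerical summary of a secondary structure prediction is just the share of residues assigned to each state. The sketch below shows that tally. It assumes the prediction is available as a string with one letter per residue (H for helix, E for extended strand, C for coil); the function name `structure_composition` and the toy string are made up for illustration.

```python
from collections import Counter


def structure_composition(states: str) -> dict:
    """Return the percentage of residues assigned to each predicted state."""
    cleaned = "".join(states.split()).upper()
    if not cleaned:
        raise ValueError("prediction string is empty")
    counts = Counter(cleaned)
    return {state: 100.0 * n / len(cleaned) for state, n in counts.items()}


# Toy prediction string for illustration only (not real PREDATOR output).
print(structure_composition("HHCCCCCCEE"))
```

A protein described as mostly random coil would show the largest percentage under the coil state.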
Homologs to YOR205C through NCBI Homologene
Figure 15. NCBI Homologene data for S. cerevisiae FMP38, or YOR205C. (NCBI Homologene Database)
The above chart shows that the S. cerevisiae non-annotated gene contains homologs in other species of yeast. Looking at these other genes could help determine the function of YOR205C, since orthologs in other species share similar sequences and often share similar functions. Searching through the Homologene database, I determined that the K. lactis gene (Link to the Gene) shares only weak homology to the yeast gene. It also states that this gene is a "predicted GTPase" (Entrez Gene), further linking the YOR205C gene to interactions with GTP. The Entrez Gene site for the E. gossypii (another yeast) gene shows that this gene is a predicted GTPase as well (Entrez Gene). These two homologs listed in the NCBI database provide evidence that YOR205C interacts with GTP in the mitochondrion.
Two commonalities appear among the results from the above databases about YOR205C. The GO information combined with the listing of mutant phenotypes from the SGD provide evidence that the protein from YOR205C is active in the S. cerevisiae mitochondrion and therefore plays a role in metabolism. A GC content that correlates with the %GC of the whole genome shows that the gene has typical nucleotide compositions and mutation rates. The Kyte-Doolittle plot shows that the protein probably does not span the mitochondrial membrane, making it an intra-mitochondrial protein. Combined with data from the BLAST search, which states that the gene could have RNA helicase-like properties, the evidence points toward the conclusion that the gene probably has a function as an RNA processing and metabolizing molecule within the yeast mitochondrion. The BLAST search also presents data that connects YOR205C to a GTPase or a type of gene that interacts with GTP. This data is supported by NCBI Homologene data that lists GTPases as homologs to YOR205C. These two databases predict that this gene interacts with GTP as well as other compounds. From these two sets of data, I predict that YOR205C is an intra-mitochondrial GTP-dependent RNA helicase that processes RNA molecules and cleaves GTP inside the mitochondrial membranes.
Apoptosis. Updated 20 September 2006. <http://en.wikipedia.org/wiki/Apoptosis>. Accessed 4 October 2006.
Campbell, A. Malcolm, and Laurie J. Heyer. Discovering Genomics, Proteomics, and Bioinformatics. San Francisco and others: CSHL press, 2006.
Cherry JM, Ball C, Weng S, Juvik G, Schmidt R, Adler C, Dunn B, Dwight S, Riles L, Mortimer RK, Botstein D. 1997. Genetic and physical maps of Saccharomyces cerevisiae. Nature 387(6632 Suppl):67-73.
Combet C., Blanchet C., Geourjon C. and Deléage G. NPS@: Network Protein Sequence Analysis. Last Updated 6 October 2006. TIBS 2000 March Vol. 25, No 3 :147-150. Data accessed at <http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_preda.html>. Accessed 6 October 2006.
Entrez Gene. 2006. <http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene>. Accessed 5 October 2006.
Frohlich, Kai-Uwe, and Frank Madeo. 3 April 2000. Apoptosis in yeast – a monocellular organism exhibits altruistic behaviour. FEBS Letters 473:6-9.
GC Calculator. Date updated not available. <http://www.genomicsplace.com/cgi-bin/gc_calculator.pl>. Accessed 6 October 2006.
Kyte-Doolittle Hydropathy Plot. Date updated not available. <http://gcat.davidson.edu/DGPB/kd/kyte-doolittle.htm>. Accessed 6 October 2006.
Madeo, Frank et al. April 2002. A Caspase-Related Protease Regulates Apoptosis in Yeast. Molecular Cell 9:911-917.
NCBI BLAST database. 7 May 2006. NCBI. <http://www.ncbi.nlm.nih.gov/blast/>. Accessed 4 October 2006.
NCBI BLAST2 database. 1999. NCBI. <http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi>. Accessed 5 October 2006.
NCBI Homologene Database. 2006. NCBI Homologene. <http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=homologene>. Accessed 5 October 2006.
RNA Helicases from yeast. 13 October 2003. <http://www.medecine.unige.ch/~linder/HELICASES_TEXT.html>. Accessed 6 October 2006.
S. cerevisiae MCA1 gene. NCBI Nucleotide. April 2006. <http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Nucleotide>. Accessed 4 October 2006.
SGD (Saccharomyces Genome Database) MCA1 gene. 2006. <http://db.yeastgenome.org/cgi-bin/locus.pl?locus=MCA1>. Accessed 4 October 2006.
SGD YOR205C gene. 2006. <http://db.yeastgenome.org/cgi-bin/locus.pl?locus=YOR205C>. Accessed 6 October 2006.
SGD YOR205C phenotypes. 2006. <http://db.yeastgenome.org/cgi-bin/phenotype/phenotype.pl?dbid=S000005731>. Accessed 6 October 2006.
Questions? Comments? Email brhenschen "at" davidson.edu
© 2006 Department of Biology, Davidson College, Davidson, NC 28035