The present web page was produced as an assignment for an undergraduate course at Davidson College.

Journey into the Unknown:

Two Yeast Genes and their Functions



Given the recent completion of the human genome draft sequence, it might seem reasonable to conclude that biologists have uncovered all the secrets of our genomic DNA. Such a view is reasonable only prima facie. On the contrary, one could argue that the sequencing of our genome has demonstrated our ignorance. More to the point, the study of our genome is increasingly revealing inconsistencies in previous models (namely Mendelian inheritance) and genes whose cellular functions remain unknown to the scientific community.

The present discussion will treat two genes in the unicellular eukaryote Saccharomyces cerevisiae (commonly known as baker's yeast). I will attempt to show the disparity that can be found in the accepted bodies of knowledge concerning known genes and putative ones. The term "putative gene" appears, however, to contain an implicit contradiction. Let's consider, for a moment, the implication of the use of "putative." How can a sequence be a gene if its functional roles are unknown? The answer lies in semantics and in the logic of the scientific method.

Most geneticists would agree that a gene is a sequence of DNA from which an RNA transcript is made. From this definition we see the difficulty that can arise from semantics. Consider an open reading frame (ORF). Whenever an ORF is discovered it could potentially be called a gene, yet its function would be unknown. An ORF, then, cannot tell us whether a transcript is actually used physiologically in a cell or what its role might be (it should be noted that only a few regions of the genome are actually coding sequences). To answer these questions, an investigator requires experimental manipulations or computational algorithms. Now, the second part of our above answer emerges. The scientific method is essentially hypothesis testing. An investigator formulates a hypothesis either on the basis of earlier findings or on the basis of a particular proposition a priori. Experimental data then allow the investigator to judge whether his or her hypothesis is supported. In the case of a putative gene, the term is ascribed to a sequence in the form of a hypothesis. Therefore, to say that a particular locus is a putative gene is to propose, hypothetically, that the locus is a gene.
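The notion of an ORF can be made concrete with a short sketch. The code below is a deliberately simplified illustration, not the method used by NCBI ORF Finder or SGD: it assumes that an ORF begins at an ATG codon and ends at the first in-frame stop codon, and it scans only the three reading frames of the strand it is given. The function name find_orfs, the min_codons parameter, and the example sequence are hypothetical.

```python
# A simplified, single-strand ORF scan (illustrative assumptions only).
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=2):
    """Return (start, end) pairs for ATG...stop ORFs on the given strand.

    Coordinates are 0-based and end-exclusive, and each ORF includes its
    stop codon. min_codons counts the stop codon as well.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen in this frame
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ATG after this stop
    return sorted(orfs)

# Hypothetical toy sequence: one ORF (ATG GCT TGA) in the third frame.
print(find_orfs("CCATGGCTTGATTT"))  # [(2, 11)]
```

Note that such a scan only reports where an uninterrupted reading frame exists; it says nothing about whether the sequence is transcribed or what its product does, which is precisely why the resulting loci are called putative genes.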

In what follows, I will discuss, first, an annotated or known gene and, second, an unannotated or putative gene. The first topic, the COX5B gene, will be expository in nature and is intended as a review of the current knowledge on the gene. In the second discussion, of YIL103W, I will attempt to propose a hypothesis for the role of this putative gene with the use of computational methods.



*(not to be confused with the cyclooxygenase proteins COX-1 and COX-2, which are encoded by the PTGS1 and PTGS2 genes, respectively, in higher eukaryotes)

The COX5B gene resides on Chr. IX and encodes the Cox5b (Cytochrome c OXidase, chain Vb) protein in S. cerevisiae. Since the COX5B gene product has a known, i.e., verified, function, biologists consider it an annotated gene. Let's examine more closely the evidence collected to date that suggests a role for the COX5B gene product in yeast.

Any reader unfamiliar with the family of cytochrome c oxidase proteins is likely to be understandably bewildered at this point in the text. It might, then, be of some use to consider the broader view before proceeding into the biology. As mentioned above, yeast is an organism that is very familiar to us, though its Latin name, Saccharomyces cerevisiae, may not be. For centuries bakers have relied on strains of yeast to bake breads and assortments of pastries. Strains of yeast, many of them S. cerevisiae or its close relatives, are also required for the fermentation of barley and grapes, which gives rise to beer and wine, respectively. Yeast's instrumental value is derived from one of its intrinsic biological characteristics. Yeast is a facultative anaerobe; this is jargon for an organism that can extract energy from nutrients in both the presence and absence of oxygen. When a cell metabolizes energy sources (e.g., carbohydrates), the cell is said to oxidize the energy source. If oxygen is available, cellular respiration can harness significantly more free energy than when oxygen is unavailable. Fermentation is the term used to characterize anaerobic metabolism. S. cerevisiae has evolved such that it can survive in the absence of oxygen; thus it can perform both cellular respiration and fermentation. By contrast, anaerobic environments are lethal to humans. In this ability, then, one can easily see one of the many interests that yeasts hold for biologists. At least two questions of scientific interest emerge: (1) What evolutionary advantage or selection pressure, if any, has promoted the conservation of anaerobic metabolism in yeast? (2) How does a yeast cell sense a change in oxygen, thereby switching to anaerobic metabolism?

The present discussion will focus solely on the implications of the second question (although the evolutionary questions concerning yeast are of value, an examination of them is beyond the scope of the present discussion). Now we are prepared to return to the treatment of COX5B. The protein from which the gene derives its name, cytochrome c oxidase, chain Vb, is a constituent of one of the enzymes most essential to homeostasis. Many readers will likely recall from their collegiate biology courses that cytochrome c oxidase (CcO) is the enzyme that catalyzes the final step in the electron transport chain. The word "oxidase" reflects the observation that CcO facilitates the transfer of electrons to dioxygen molecules. Generally, metabolic reactions begin with the breakdown of glucose. Considered together, cellular respiration has four phases: glycolysis, pyruvate oxidation, the citric acid cycle, and oxidative phosphorylation (Purves et al., 2001). The many reactions that occur during each of these phases contribute to the oxidation of glucose and its chemical products. That is, energy is removed from glucose in the form of electrons and ultimately stored in the chemical bonds of the molecule adenosine triphosphate (ATP). Because almost all of the metabolic reactions necessary to sustain life in organisms require energy input (i.e., the reactions are endergonic), the hydrolysis of ATP, or cleaving of its phosphate bonds, releases the energy that fuels these reactions (Purves et al., 2001). The final phase of respiration is no different; oxidative phosphorylation couples the energy from the oxidation of glucose to the transport of hydrogen ions, or protons, across the inner membrane of the mitochondrion. Thus, the translocation of protons into the intermembrane space maintains the proton gradient needed for ATP biosynthesis (Babcock and Wikstrom, 1992).

Cytochrome c oxidase (CcO) is one of several enzymes responsible for electron transport during oxidative phosphorylation. CcO is a dimeric protein that consists of two monomers, each having 13 subunits or chains; three genes in the mitochondrial genome encode chains I, II, and III, and the other ten chains are encoded by genes in the nucleus (Tsukihara et al., 1996). CcO has transmembrane domains that span the inner mitochondrial membrane, and part of the complex resides in the intermembrane space (Purves et al., 2001). Only three chains do not have any transmembrane domains; they are chains Va, Vb, and VIb (Tsukihara et al., 1996). Our gene of interest, COX5B, encodes chain Vb of the CcO molecule.

We now turn to the specific functions of chain Vb in the activity of the holoenzyme (this term simply means that several polypeptides together make up a large enzymatic molecule). Tsukihara and colleagues (1996) reported that chain Vb is only conformationally stable when bound to the subunit assembly. More particularly, chain Vb is an isoform of chain Va (Bunn and Poyton, 1996). These two proteins, however, are encoded by two distinct genes, COX5B and COX5A, respectively, and their primary structures share only 66% sequence identity (Poyton et al., 1988). Considered in conjunction with findings that the chain Va isoform is detectable only in yeast cells grown aerobically, this sequence divergence led to the proposition that the two chain V isoforms have different cellular roles, are consequently expressed differentially in response to environmental stimuli, and may exert separate influences on the reaction rates of respiration via the CcO complex (Poyton et al., 1988). Since the original formulation of this isoform hypothesis, Allen and colleagues (1995) found that the Va and Vb isoforms do in fact have distinct effects on the enzymatic activity and structure of the CcO molecule.

Accordingly, the SGD database lists the three Gene Ontology classifications for COX5B in the following manner:

Molecular Function: cytochrome c oxidase activity

Biological Process: anaerobic metabolism

Cellular Component: mitochondrial membrane, and respiratory chain complex IV

From the above treatment of the activities of cytochrome c oxidase and the chain V isoforms, it is reasonable to conclude that oxygen is the essential substrate for the function of cytochrome c oxidase. Given that yeast can metabolize glucose in the absence of oxygen and that different chain V isoforms are expressed aerobically and anaerobically, two pressing questions arise. The first is our original question (see above): how do yeast cells sense changes in ambient oxygen concentrations? Second, to what extent might COX5A and COX5B be involved in the oxygen-sensing pathways? Before answering these questions, it would be useful to pause for a moment and define a few terms and classifications. In addition to COX5A and COX5B, many other annotated yeast genes known to function in oxidative phosphorylation, oxidative stress response, fatty acid biosynthesis, and heme biosynthesis are regulated by oxygen (Bunn and Poyton, 1996). Yeast genes are divided into two categories based upon their response to oxygen.

Aerobic genes are expressed only in the presence of oxygen, and hypoxic genes are expressed exclusively during anaerobiosis. The subunits of the CcO holoenzyme are encoded by aerobic genes, with one exception: COX5B, which, along with CYC7 (iso-2-cytochrome c), is a hypoxic gene (Bunn and Poyton, 1996).

Hypoxia describes instances when cells fail to receive sufficient oxygen.

Anoxia is a more severe decrease in oxygen saturation with respect to hypoxia, often resulting in permanent damage.

Returning to the question of whether the COX5 genes are functional constituents of the oxygen-sensing pathway in yeast, let's consider in detail a study that investigated this very question. Burke and colleagues (1997) set out to investigate, first, the differential responses of COX5A and COX5B to oxygen concentrations and, second, whether their expression followed a threshold effect. They observed that expression of the COX5 isoform pair behaved in a reciprocal manner. As the concentration of oxygen was decreased from 200 µM to 0.5 µM, the expression of COX5A (and other aerobic yeast genes) decreased concomitantly, dropping dramatically below 0.5 µM. By contrast, COX5B expression was observed only in anaerobic environments, when the oxygen concentration ranged from 0 to 0.5 µM. Thus, Burke and colleagues (1997) concluded that the COX5 gene pair switches on and off reciprocally over a range of aerobic and anaerobic environments. What remained unclear at the end of the Burke et al. (1997) study, however, was the mechanism by which oxygen modulated control of hypoxic gene expression. Was there a signaling molecule?




The hypothetical ORF YIL103W, our unannotated gene, is located approximately 20 kbp downstream of the COX5B locus on the same (Watson) strand of chr. IX in S. cerevisiae. Its map coordinates are 171748 - 173025 bp, and its coding sequence is 1278 bp in length (SGD database, 2003). A translation of the coding sequence for YIL103W is available from the SGD database, and the amino acid sequence can also be readily obtained with the use of NCBI ORF Finder. The encoded primary sequence has 425 residues. For the purposes of the present discussion, I have named YIL103W's hypothetical protein product 103p.
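The figures quoted above are mutually consistent, as a quick arithmetic check shows. The sketch below assumes that the map coordinates are inclusive at both ends and that the final codon of the ORF is a stop codon, which does not contribute a residue; the variable names are merely illustrative.

```python
# Coordinates for YIL103W as quoted in the text (assumed inclusive).
start_bp, end_bp = 171748, 173025

# Inclusive coordinates: both end positions belong to the ORF.
length_bp = end_bp - start_bp + 1

# Three nucleotides per codon; a remainder of 0 means a whole number of codons.
codons, remainder = divmod(length_bp, 3)

# Assumption: the last codon is a stop codon and encodes no residue.
residues = codons - 1

print(length_bp, codons, remainder, residues)  # 1278 426 0 425
```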

In order to gain insight into its possible cellular roles, the amino acid sequence for 103p was queried against the Conserved Domain (CD) database. A conserved domain is a functional unit within the secondary or tertiary structure of a polypeptide that has been shown to remain conserved across taxonomic groups. Thus, based upon conformational properties predicted from its primary sequence, it is possible to make inferences about the role of a putative gene product. Figure 1 (below) illustrates the findings retrieved when a CD-search was performed for the 103p sequence.

Figure 1. Conserved domains found in the hypothetical protein 103p. The thin black line, above which reside blue numbers, represents the query sequence, 103p in this case. The two larger red rectangles below the query sequence are CDs that produced significant alignments with 103p. The E-value given in the far right column is the number of alignments of equal or better score that would be expected by chance alone; the smaller the E-value, the more significant the alignment.

These CD-search data indicate that 103p might be associated with the synthesis of diphthamide. Diphthamide is a post-translationally modified histidine residue found in translation elongation factor 2; it is so named because it is the site at which diphtheria toxin modifies and inactivates that factor. Because the toxin acts on its target in the cytosol of the host cell, any diphthamide-related protein is likely to be cytosolic. This prediction is consistent with the reported cytoplasmic GO Cellular Component for YIL103W (SGD database, 2003).

To examine further the proposition that YIL103W encodes a diphthamide-related protein, the 103p sequence was aligned directly against the known primary sequences of human DPH2 and yeast Dph2, using BLAST2Seq. Figures 2 and 3 illustrate these BLAST2Seq findings.



Figure 2. Amino acid identity between the hypothetical protein 103p and human DPH2. The top strand, sequence 1, represents 103p. The bottom blue strand represents DPH2. A, a graphical depiction of the calculated alignment scores. B, the raw data showing the precise locations of identity and breaks therein. An identity is a match in residues at the same position on each polypeptide.



Figure 3. Amino acid sequence homology between the hypothetical protein 103p and yeast Dph2. The top strand, sequence 1, represents 103p. The bottom strand, sequence 2, represents Dph2. A, a graphical depiction of the calculated alignment scores. B, the raw data showing the precise locations of identity and breaks therein. An identity is a match in residues at the same position on each polypeptide.

The implication of the data presented in this discussion is that YIL103W encodes, at least, a diphthamide-related protein, or a protein that shares some functional characteristics with such proteins but has a distinct cellular localization. Upon consideration of the CD data alone, this implication would certainly appear to be true. Although the BLAST2Seq findings reinforce a potential functional similarity between 103p and the diphthamide synthesis proteins, the low percent identity values, 23% and 21%, argue against the possibility that 103p and Dph2 are the same protein or even isoforms of the same family.
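For readers unfamiliar with how a percent identity value arises, the following minimal sketch computes one from a pair of already aligned sequences. It assumes that identity is counted as the number of alignment columns holding the same residue in both sequences, divided by the total number of alignment columns (gap columns included). The function name percent_identity and the toy sequences are hypothetical and are not taken from the 103p alignments.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences share a residue.

    Both inputs must already be aligned (equal length, gaps marked by `gap`).
    Assumption: the denominator is the full alignment length, gaps included.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)

# Hypothetical toy alignment: 4 identical columns out of 6.
print(round(percent_identity("MKT-LV", "MRTALV"), 1))  # 66.7
```

On this measure, values in the low twenties, such as those reported for 103p, mean that roughly four out of every five aligned positions differ.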



Allen, L.A., Zhao, X.J., Caughey, W., and Poyton, R.O. (1995) Isoforms of yeast cytochrome c oxidase subunit V affect the binuclear reaction center and alter the kinetics of interaction with the isoforms of yeast cytochrome c. J Biol Chem 270: 110-118.

Babcock, G.T., and Wikstrom, M. (1992) Oxygen activation and the conservation of energy in cell respiration. Nature 356: 301-309.

Bunn, H.F., and Poyton, R.O. (1996) Oxygen sensing and molecular adaptation to hypoxia. Physiological Reviews 76: 839-885.

Burke, P.V., Raitt, D.C., Allen, L.A., Kellogg, E.A., and Poyton, R.O. (1997) Effects of oxygen concentration on the expression of cytochrome c and cytochrome c oxidase genes in yeast. J Biol Chem 272: 14705-14712.

Hon, T., Dodd, A., Dirmeier, R., Gorman, N., Sinclair, P.R., Zhang, L., and Poyton, R.O. (2003) A mechanism of oxygen sensing in yeast: multiple oxygen-responsive steps in the heme biosynthetic pathway affect Hap1 activity. J Biol Chem. Accepted Manuscript, M303677200.

Poyton, R.O., Trueblood, C.E., Wright, R.M., and Farrell, L. (1988) Expression and function of cytochrome c oxidase subunit isologues: modulators of cellular energy production? Ann NY Acad Sci 550: 289-307.

Purves, W.K., Sadava, D., Orians, G.H., and Heller, H.C. (2001) Life: the science of biology. Sixth edition. Sinauer Associates, 1044 pp.

[SGD] Saccharomyces Genome Database. Locus Summary: YIL103W. Accessed 2003 Oct 4.

[SGD] Saccharomyces Genome Database. Locus Summary: COX5B/YIL111W. Accessed 2003 Oct 4.

Tsukihara, T., Aoyama, H., Yamashita, E., Tomizaki, T., Yamaguchi, H., Shinzawa-Itoh, K., Nakashima, R., Yaono, R., and Yoshikawa, S. (1996) The whole structure of the 13-subunit oxidized cytochrome c oxidase at 2.8 angstroms. Science 272: 1136-1144.

Return to Arthur Clement's Inquiries Homepage

Any comments, replies, or suggestions would be welcome via email to A. Clement.


© Copyright 2003 Department of Biology, Davidson College, Davidson, NC 28035.

Created on 6 October 2003, by A. Clement.