*This website was produced as an assignment for an undergraduate course at Davidson College.*


What are Orthologs?

Similar genetic sequences found in different species can sometimes be traced back to a speciation event that left each descendant species with its own diverging copy of a single ancestral gene. Such sequences are called orthologs, and they code for proteins that are likewise similar but not identical (Sadava et al., 2008). Orthologs are useful for identifying the relatedness of various species as well as for understanding the origin and function of the proteins they encode.

Development of Hemoglobin

Found in erythrocytes, hemoglobin is well known as the protein primarily responsible for carrying oxygen throughout the human body. However, hemoglobin is also found in a variety of other species, including plants and bacteria, which strongly suggests that globin genes were present in very early common ancestors that then passed on divergent copies of the gene during speciation events. In all vertebrates, the amino acid sequences of the α-subunits of hemoglobin and those of the β-subunits are 50% similar (Hardison, 1996), suggesting a common ancestry between the two. These more derived tetramers in fact descended from the more primitive myoglobin, a monomeric protein responsible for oxygen storage and delivery in early jawless vertebrates such as lampreys. Around 450 million years ago, this myoglobin gene gave rise to the earliest hemoglobin-coding sequence when it duplicated. When cartilaginous fish and bony fish diverged about 50 million years later, the early hemoglobin gene diverged into two new forms coding for the α- and β-subunits, and these forms were later passed on to all vertebrates along that evolutionary line, including mammals.
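As a rough illustration of what a figure such as "50% similar" measures, the sketch below computes percent identity between two sequences that have already been aligned. The function name `percent_identity` and the two toy fragments are hypothetical and made up for illustration; they are not real globin sequences, and this is not necessarily how Hardison (1996) arrived at the 50% value.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns where both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical, already-aligned toy fragments (assumption: not real data).
toy_alpha = "HKLAGWE-TR"
toy_beta = "HKMAGYEQTR"
print(f"{percent_identity(toy_alpha, toy_beta):.1f}% identical")
```

Columns containing a gap are skipped here, which is only one of several conventions; counting gaps as mismatches would give a lower value for the same alignment.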

The usefulness of hemoglobin is not limited to animals. Plants also use hemoglobin to carry oxygen for respiration. Leghemoglobin, a relative of hemoglobin, has been detected in the nodules of plants, where it binds oxygen so that oxygen-sensitive nitrogen-fixing enzymes can function properly. The leghemoglobin protein sequence differs from that of vertebrate hemoglobin by 80%, reflecting the distant ancestry shared by the two genes (Hardison, 1996). Similarly, hemoglobins and other homologous proteins have been found in algae and protozoa, where they are also responsible for trapping oxygen for respiration. In the yeast Saccharomyces, a flavohemoprotein consisting of a heme domain and a FAD domain is responsible for signaling and regulation under favorable oxygen conditions (Hardison, 1996). The flavohemoprotein is also present in Escherichia coli. These examples all highlight hemoglobin as a gene with multiple functions and as a descendant of a very ancient ancestral gene, since similar forms of it are found in so many different species.
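One way to see why a larger sequence difference reflects a more distant ancestry is to convert the observed fraction of differing sites, p, into an estimated number of substitutions per site. The sketch below uses the simple Poisson correction, d = -ln(1 - p). This model is an assumption brought in for illustration and is not taken from Hardison (1996); the function name `poisson_distance` is hypothetical, and the 50% figure is read here as 50% identical residues.

```python
import math


def poisson_distance(fraction_different: float) -> float:
    """Estimated substitutions per site, d = -ln(1 - p), under a Poisson model."""
    if not 0.0 <= fraction_different < 1.0:
        raise ValueError("fraction_different must be in [0, 1)")
    return -math.log(1.0 - fraction_different)


# Percentages quoted in the text, expressed as fractions of differing sites.
comparisons = [
    ("alpha vs. beta subunit", 0.50),
    ("leghemoglobin vs. vertebrate hemoglobin", 0.80),
]
for label, p in comparisons:
    print(f"{label}: p = {p:.2f}, d = {poisson_distance(p):.2f}")
```

Under this assumption the 80% difference corresponds to more than twice as many substitutions per site as the 50% difference, because repeated changes at the same site hide part of the true divergence.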

Figure 1. Significant events in the development of hemoglobin genes. Shaded boxes are exons, open boxes are introns, and lightly shaded boxes show the FAD domain. Diamonds signify gene duplication events, and circles signify speciation events. The location where an intron interrupts the codon of the amino acid chain is shown above the intron. For example, B5 is the amino acid at the fifth position of helix B. Image retrieved from Hardison, 1996. Permission pending.
Link to the article here.


Figure 2. Various functions of hemoglobin proteins and hemoglobin-like proteins in different species of organisms. Image retrieved from Hardison, 1998. Permission pending.

Form Follows Function

In many species, the function of hemoglobin involves binding an oxygen molecule. This shared function reflects the highly conserved structure of the hemoglobin protein, which allows it to do the same job in different organisms. That hemoglobin is found in almost all species, and is so well conserved, suggests that its function is very important to living organisms. This consistency in function also suggests that the genetic sequences encoding hemoglobin proteins in different species must be similar. Comparing the subunits of hemoglobin with the ancestral myoglobin protein mentioned above should give some clue about which sections of hemoglobin are most conserved and thus most important to protein shape and function.

Figure 3. A comparison between the α-subunit of hemoglobin, the β-subunit of hemoglobin, and myoglobin. Sequences highlighted in grey are conserved between the α-subunit and β-subunit of hemoglobin. Sequences highlighted in light brown are conserved in both subunits and the ancestral myoglobin. Image retrieved from http://www.aw-bc.com/mathews/ch07/c07emhp.htm. Permission pending.

Figure 3 shows the amino acids that are most conserved among the α-subunits and β-subunits of hemoglobin and the ancestral myoglobin. These include the histidines on both sides of the heme group, labeled here as E7 and F8, which stabilize the heme group and keep it from becoming oxidized, which would prevent it from binding oxygen (Mathews et al., 2010). Since the success of the heme group in carrying oxygen largely depends on these histidine residues, it is fitting that they are highly conserved in hemoglobin along with the heme group. Because it is so useful to living organisms, the heme group was conserved along with the amino acids absolutely necessary for it to function (Mathews et al., 2010). Amino acids responsible for allosteric conformational changes were also highly conserved (Mathews et al., 2010). This whole package was thus passed on from myoglobin to hemoglobin genes with little change. The similarity in conserved amino acids seen here is also partly due to the relatedness of the species in which these genes were found, since both are vertebrates. Species sharing a more recent common ancestor have more similar genetic sequences, since less divergence has taken place. Species that share more distant common ancestors, however, show greater differences in their orthologs.
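The two levels of highlighting in Figure 3 (conserved between the two subunits only, versus conserved in both subunits and myoglobin) can be sketched as a simple column-by-column comparison of three aligned sequences. The function name `classify_columns`, the marker characters, and the toy sequences are hypothetical and chosen for illustration; they are not the actual sequences shown in the figure.

```python
def classify_columns(alpha: str, beta: str, myoglobin: str) -> str:
    """Return one mark per alignment column.

    '*' = same residue in alpha, beta, and myoglobin
    ':' = same residue in alpha and beta only
    ' ' = alpha and beta differ, or the column holds a gap in alpha
    """
    if not (len(alpha) == len(beta) == len(myoglobin)):
        raise ValueError("aligned sequences must have equal length")
    marks = []
    for a, b, m in zip(alpha, beta, myoglobin):
        if a == "-" or a != b:
            marks.append(" ")
        elif a == m:
            marks.append("*")
        else:
            marks.append(":")
    return "".join(marks)


# Hypothetical aligned toy fragments (assumption: not real globin data).
toy_alpha = "LSHGKV"
toy_beta = "LTHGKA"
toy_myoglobin = "MSHAKV"
print(toy_alpha)
print(toy_beta)
print(toy_myoglobin)
print(classify_columns(toy_alpha, toy_beta, toy_myoglobin))
```

In this toy case the histidine column is marked with `*`, mirroring how residues such as E7 and F8 would stand out as conserved across all three proteins.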

Figure 4. Percentage similarity in α-subunits of hemoglobins found in various species. Species which diverged further back in time had less similarity in the amino acid sequences of their α-subunits. Image retrieved from . Permission pending.

Figure 5. Selected hemoglobin amino acid sequences from invertebrates, bacteria, protists, and mammals. The alignment was generated using ClustalW. Not all of the "Big Seven" group is represented, but members from the same kingdoms and phyla are present. The column with the distal histidine of vertebrate and plant hemoglobins is marked with an 'H'. A further description of the sequences tested can be found here. Image from Hardison, 1998. Permission pending.

The alignment in Figure 5 highlights several similarities among the hemoglobin orthologs. Species that were more divergent from each other had less similarity in their amino acid sequences. Although many positions vary from one sequence to another, the amino acids that are important to hemoglobin function are highly conserved; these appear in the columns marked with the letter of the conserved amino acid above the column. Overall, while divergence has indeed changed parts of the amino acid sequences, the specific amino acids that are critical to hemoglobin function remain conserved.
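The idea of marking conserved columns in a multiple alignment can be sketched as follows: for each column, find the most common residue and the fraction of sequences that share it, then report the columns where every sequence agrees. The function names and the four toy sequences are hypothetical and for illustration only; they do not reproduce the ClustalW alignment in Figure 5.

```python
from collections import Counter


def column_conservation(alignment: list[str]) -> list[tuple[str, float]]:
    """For each column, return (most common symbol, fraction of sequences with it).

    Gaps ('-') are counted as a symbol of their own.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    result = []
    for column in zip(*alignment):
        symbol, count = Counter(column).most_common(1)[0]
        result.append((symbol, count / len(alignment)))
    return result


def invariant_positions(alignment: list[str]) -> list[int]:
    """Zero-based column indices where all sequences share one residue."""
    return [
        index
        for index, (symbol, fraction) in enumerate(column_conservation(alignment))
        if fraction == 1.0 and symbol != "-"
    ]


# Hypothetical aligned toy sequences (assumption: not real hemoglobin data).
toy_alignment = ["AKHLG", "SKHVG", "TRHLG", "A-HIG"]
print(invariant_positions(toy_alignment))
```

A threshold lower than 1.0 could be used instead to flag columns that are mostly, rather than completely, conserved.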


Bruce Alberts, Dennis Bray, Julian Lewis, Martin Raff, Keith Roberts, and James D. Watson. Molecular Biology of the Cell Fourth Edition. http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=mboc4&part=A2. March 2010.

Hardison, Ross C. A brief history of hemoglobins: Plant, animal, protist, and bacteria. Proc. Natl. Acad. Sci., 1996; 93: 5675-5679. Link to article here

Hardison, Ross. Hemoglobins from Bacteria to Man: Evolution of Different Patterns of Gene Expression. The Journal of Experimental Biology, 1998; 1099-1117. Link to article here

Mathews, Van Holde and Ahern. Evolution of Myoglobin/Hemoglobin Proteins. http://www.aw-bc.com/mathews/ch07/c07emhp.htm . March 2010.

Sadava D, Heller CH, Orians GH, Purves WK and Hillis DM. Life: The Science of Biology 8th Edition. Massachusetts and Virginia: Sinauer Associates Inc. and W.H. Freeman and Company, 2008. Print.

Tam Hua's Home Page

Please send suggestions, questions, or comments to tahua@davidson.edu