This web page was produced as an assignment for an undergraduate course at Davidson College
ORTHOLOGS for Human Insulin
Sequences found at: http://www.ncbi.nlm.nih.gov/BLAST/Blast.cgi
Human amino acid sequence used to find conservation among other species:
Shown below (example):
Species (scientific name) - [type of insulin protein contained in species]
Human - amino acid sequence
Similarities - matching amino acids
Species given - amino acid sequence
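As a rough illustration of how the "Similarities" row relates to the two sequences around it, the sketch below builds a simplified match row for two sequences that are already aligned. It is not BLAST's own code. The function name `midline` is made up for this example, and the sketch leaves a blank where BLAST would print `+` for a conservative substitution, because that requires a scoring matrix that is not reproduced on this page. The test strings are the start of the middle region of the human and pig sequences shown below.

```python
def midline(query: str, subject: str) -> str:
    """Build a simplified match row for two pre-aligned sequences."""
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have equal length")
    row = []
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            row.append("-")   # gap in either sequence
        elif q == s:
            row.append(q)     # identical residue: repeat the letter
        else:
            row.append(" ")   # mismatch (BLAST may print '+' here)
    return "".join(row)


# Start of the connecting region, human vs. pig, as aligned on this page.
human = "RREAEDLQVGQVELGGGPGAG"
pig = "RREAENPQAGAVELGG--GLG"

print(human)
print(midline(human, pig))
print(pig)
```

Apart from the missing `+`, the printed middle row should line up with the corresponding stretch of the pig "Similarities" row.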
Pig (Sus scrofa) - [Preproinsulin]
Human - FVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN
Similarities - FVNQHLCGSHLVEALYLVCGERGFFYTPK RREAE+ Q G VELGG--G G LQ LALEG--QKRGIVEQCCTSICSLYQLENYCN
Pig - FVNQHLCGSHLVEALYLVCGERGFFYTPKARREAENPQAGAVELGG--GLGGLQALALEGPPQKRGIVEQCCTSICSLYQLENYCN
House mouse (Mus musculus) - [Insulin I - insulin I and II are contained on two different chromosomes]
Similarities - FV QHLCG HLV+ALYLVCGERGFFYTPK+RRE ED QV Q+ELGG P- G LQ LALE +LQKRGIV+QCCTSICSLYQLENYCN
Chicken (Gallus gallus) - [Insulin I precursor]
Human - NQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN
Similarities - NQHLCGSHLVEALYLVCG+RGFFY PK +R+ E--QV ------GP L -----+-E----KRGIVEQCC SCSL+ QLENYCN
African clawed frog (Xenopus laevis) - [Preproinsulin]
Similarities - NQHLCGSHLVEALYLVCGERGFFY+PK RR+ E V G AG L -E
Zebrafish (Danio rerio) - [Insulin]
Similarities - QHLCGSHLV+ ALYLVCG GFFY PK R+ E L ------LG -P + + + + ----++KRGIVEQCCCS+++L+ --NYCN
Marbled electric ray (Torpedo marmorata) - [Insulin precursor]
Similarities - +QHLCGSHLVEALY VCG +GF+ Y PK ---+ L -------GG -----------------GIVE CC + CSL+ LE YCN
Western diamondback rattlesnake (Crotalus atrox) - [Insulin]
Similarities - NQ LCGSHLVEAL+L+CGERGF+Y+P++ ----------------------------------GIVEQCC + CSLYQLENYCN
Duckbill platypus (Ornithorhynchus anatinus) - [Insulin]
Similarities - F NQHLCGSHLVEALYLVCGE+GF+Y P ----------------------------------+ GIVE+CC -+CS+YQLENYCN
Jack Bean (Canavalia ensiformis) - [Insulin precursor fragments]
Human - FVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN
Similarities - FVNQHLCGSHLVEALYLVCGERGFFYTPK -----------------------------------GIVEQCC S+CSLYQLENYCN
In each of the species above, the A-chain (GIVEQCCTSICSLYQLENYCN) is largely conserved. Most of the sequences differ from the human sequence by a few amino acid substitutions, but overall the A-chain is conserved. The B-chain (FVNQHLCGSHLVEALYLVCGERGFFYTPKT), at the amino end of the sequence shown, is also well conserved: the first few amino acids tend to vary, but the overall sequence is similar. Most of the divergence occurs in the connecting C-peptide between the two chains, and in several entries (for example rattlesnake, platypus, and jack bean) this middle section is absent from the alignment altogether. Therefore, conservation is kept on the amino end and on the carboxyl end of the protein, and the sequence diverges in the middle. This pattern makes sense because the signal peptide of preproinsulin and the C-peptide of proinsulin are cleaved off as the hormone matures, so they are under less pressure to keep a particular sequence, whereas the A- and B-chains make up the mature hormone that must bind the insulin receptor and are less free to change. These differences are a product of evolution, and the conserved regions are what allow insulin to keep working properly in each species.
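The comparison above can be made concrete by counting identical positions region by region. The following is a minimal sketch that uses only the human and pig rows from this page, which are already aligned (`-` marks a gap). The region boundaries (a 30-residue B-chain, a 31-residue C-peptide flanked by the RR and KR cleavage sites, and a 21-residue A-chain) are annotations added for this example. The function names are illustrative, not part of any library.

```python
# Human and pig proinsulin rows as aligned on this page ('-' marks a gap).
HUMAN = ("FVNQHLCGSHLVEALYLVCGERGFFYTPKT" "RR"
         "EAEDLQVGQVELGGGPGAGSLQPLALEGSLQ" "KR"
         "GIVEQCCTSICSLYQLENYCN")
PIG = ("FVNQHLCGSHLVEALYLVCGERGFFYTPKA" "RR"
       "EAENPQAGAVELGG--GLGGLQALALEGPPQ" "KR"
       "GIVEQCCTSICSLYQLENYCN")

# Column ranges in the alignment (0-based, end-exclusive); assumed annotation.
REGIONS = {"B-chain": (0, 30), "C-peptide": (32, 63), "A-chain": (65, 86)}


def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned columns holding the same residue (gaps never match)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    same = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * same / len(a)


def region_identity(seq_a: str, seq_b: str, regions: dict) -> dict:
    """Percent identity of each named column range."""
    return {name: percent_identity(seq_a[start:end], seq_b[start:end])
            for name, (start, end) in regions.items()}


for name, pct in region_identity(HUMAN, PIG, REGIONS).items():
    print(f"{name}: {pct:.1f}% identical")
```

For this pair the A-chain should come out fully identical, the B-chain nearly so, and the C-peptide clearly lower, which is the same pattern described in the paragraph above.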
Amino acid sequences found for these species at:
Nematode worm (Caenorhabditis elegans) - [insulin-like peptide alpha-type 1 precursor]
Fruit fly (Drosophila melanogaster) - [protein HDC09365, which shares part of the insulin sequence]
Summary of related species
These species are among the few that contain genes of the insulin superfamily but do not carry the insulin gene itself. Insulin belongs to a superfamily that includes insulins, relaxins, insulin-like growth factors, and bombyxin. All are secreted regulatory hormones with disulphide bonds and a shared fold in which a B-chain is aligned with and linked to an A-chain. C. elegans has different types of precursors (alpha and beta, with various numbers), which all have slightly different sequences. It also has insulin-related proteins labeled ins-1 to ins-33, meaning there are 33 different insulin-like proteins in this nematode. For the C. elegans sequence, it is hard to find where it matches human insulin and shows conservation. D. melanogaster has few genes that code for an insulin-like protein. The fruit fly protein can be compared to insulin because it is a secreted heterodimer of a B-chain and an A-chain linked by two disulphide bonds. A bacterial entry also carries an insulin sequence: an epidermal growth factor and single-chain insulin fusion protein. It is only a fragment, but it still conserves part of the A-chain. Its amino acid sequence is much more similar to human insulin than those of nematodes and fruit flies, which fits the use of bacteria to make insulin for medical purposes.
Medical uses for insulin orthologs
Bacteria have played a special role in making insulin. Because insulin orthologs exist, we have been able to use bovine and porcine insulin for people who cannot produce their own insulin (Bio Bus). The human gene for insulin has also been engineered into plasmids and transformed into bacterial cells. Since bacteria divide so quickly, we can have bacteria with the needed insulin protein in hours (Bio Bus). The necessary insulin supplements can be made in quantity with these bacterial cells carrying the insulin gene, which supports the growing field of diabetes research.
Figure 1. The taxonomy of the insulin gene within different species. In this diagram the numbers represent the number of protein matches made with insulin and other insulin-related family proteins. http://www.ebi.ac.uk/interpro/IEntry?ac=IPR004825
The amino acid sequence of insulin is highly conserved across a wide variety of vertebrate species. The primary structures of insulins from over 25 different vertebrate species are known (Steiner et al., 1985). Insulin also has related precursors, insulin-like proteins, and insulin-like growth factors, all of which share small similarities in amino acid sequence with proteins found in invertebrates and prokaryotes.
The human insulin gene is now known to be located on the short arm of chromosome 11. Steiner and his colleagues report that the insulin genes are located on chromosomes 1 and 7 in mice and rats. These insulin genes appear to be homologous to the human insulin gene. In most species insulin appears as a single copy, but mice and rats carry two. The rat's insulin gene is about 90% homologous, with an estimated divergence time of 25-35 million years (Steiner et al., 1985). Accumulated mutations in the insulin gene cause this divergence. The exact time at which insulin-related molecules diverged from insulin cannot be determined, but the separation seems to have occurred around 0.5 billion years ago. The appearance of insulin in invertebrates and prokaryotes supports this estimate (Steiner et al., 1985).
The presence of insulin-like precursors in C. elegans and D. melanogaster shows insulin's ancient origin. Over millions of years insulin has diverged into the human insulin gene that regulates glucose levels in the body today, into families of insulin-related proteins, and into insulin-like growth factors. All of these proteins have different functions, but each retains a part, differing in size, of the ancestral insulin sequence.
Another factor in insulin's divergence, and in the number of species that carry it, is evolutionary change in the structure of the preproinsulin, proinsulin, and insulin gene. The gene encodes a signal peptide, the B-chain, the C-peptide, and the A-chain, all of which are subject to change over time. In a study using human, dog, mouse, rat, and guinea pig sequences, Steiner found that across vertebrates the length and overall net charge of the C-peptide region were better conserved than its actual amino acid sequence (Steiner et al., 1985). He suspected that "proinsulins of higher vertebrates may fold more rapidly as a result of the C-peptide region's ability to fold back on itself, thus juxtaposing the A and B chain segments more favorably than would the random movements of a less flexible connecting segment" (Steiner et al., 1985). Most changes have been found to occur along the backbone peptide chain folding region, within the hydrophobic core. These differences have produced insulin-like proteins that function in different species, and some of these forms differ significantly in receptor binding affinity. For example, three mutations in the human insulin gene result in abnormal insulin with reduced receptor binding affinity (Steiner et al., 1985). Changes in insulin structure and changes in receptor binding therefore go together. In birds such as chickens and turkeys, insulin has approximately twice the receptor binding potency and biological activity (Steiner et al., 1985). Among the vertebrates, most changes in the insulin gene involve a few nucleotides or amino acids rather than large differences. Insulin has been conserved very well over millions of years.
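Steiner's point about C-peptide length and net charge can be checked roughly with the sequences on this page. The sketch below compares the human and pig C-peptides taken from the alignment above. Pig is not one of the species in Steiner's comparison; it is used here only because it is the one complete C-peptide shown on this page. The charge rule is a deliberate simplification (K and R count +1, D and E count -1, histidine and the chain ends are ignored), and `net_charge` is a hypothetical helper written for this example.

```python
def net_charge(peptide: str) -> int:
    """Approximate net charge: +1 per K/R, -1 per D/E (simplified rule)."""
    positive = sum(peptide.count(residue) for residue in "KR")
    negative = sum(peptide.count(residue) for residue in "DE")
    return positive - negative


# C-peptides from the human and pig rows on this page; alignment gaps removed.
human_c = "EAEDLQVGQVELGGGPGAGSLQPLALEGSLQ"
pig_c = "EAENPQAGAVELGG--GLGGLQALALEGPPQ".replace("-", "")

for name, peptide in (("human", human_c), ("pig", pig_c)):
    print(f"{name}: length {len(peptide)}, net charge {net_charge(peptide)}")
```

Under this simplified rule the two peptides differ by only two residues in length and one unit of charge, even though about a third of their aligned positions differ.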
NCBI. National Center for Biotechnology Information. <http://www.ncbi.nlm.nih.gov/BLAST/Blast.cgi>. Accessed 2005 Feb 24.
UniProt. The Universal Protein Resource. 2002-2004. <http://www.ebi.uniprot.org/uniprot-srv/results/gridView.do?setPosition=10&pager.offset=0>. Accessed 2005 Mar 6.
Rhoads-Frost, Donna. 2004. Connecticut's BioBus. Lighting the Magic Lantern Curriculum. <http://www.ctbiobus.org/curriculum/pdfs/lightingthemagiclantern_04.pdf>. Accessed 2005 Feb 22.
Steiner, D.F., S.J. Chan, J.M. Welsh, S.C.M. Kwok. 1985. Structure and Evolution of the Insulin Gene. Annual Review of Genetics 19, 463-484. Department of Biochemistry and Molecular Biology, University of Chicago.
© Copyright 2005 Department of Biology, Davidson College, Davidson, NC 28036
If you have any questions, comments, or suggestions concerning this page, please contact email@example.com