Part 1: Open Reading Frame and Molecular Weight

The DNA ligase cDNA of the African clawed frog, Xenopus laevis, was analyzed with MacDNAsis.  MacDNAsis read the cDNA sequence in each of the possible open reading frames (ORFs).  From this analysis, I determined that the largest ORF begins at nucleotide 92 and terminates at nucleotide 3304.  To view the cDNA sequence that was used for the analysis, click here.  I then used the largest ORF for the remaining analyses.


Figure 1: Largest Open Reading Frame (ORF) of DNA ligase.  This image diagrams the three different reading frames of DNA ligase's cDNA.  Start codons (ATG) are represented by red triangles, stop codons by green vertical lines, and the largest ORF by a black rectangle.  The nucleotide numbers (1-3770) are listed above the reading frames.
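
The reading-frame scan described above can be illustrated with a minimal Python sketch.  This is not MacDNAsis's actual algorithm.  The function name longest_orf, the restriction to the three forward frames, and the choice to count the stop codon in the reported end position are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end, frame) for the longest ATG-to-stop ORF, or None.

    start and end are 1-based, inclusive nucleotide positions.  end is the
    last base of the stop codon (an assumption; some tools exclude the stop).
    frame is 1, 2, or 3.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None  # 0-based index of the first ATG since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                length = i + 3 - start
                if best is None or length > best[1] - best[0] + 1:
                    best = (start + 1, i + 3, frame + 1)
                start = None  # resume the search after this stop codon
    return best


# Toy sequence, not the Xenopus cDNA: ATG AAA TTT TAG sits in the third frame.
print(longest_orf("CCATGAAATTTTAGGG"))  # (3, 14, 3)
```

Run on the full cDNA, a scan of this kind should report a start at position 92 if the program uses the same conventions.  Whether the reported end matches 3304 depends on whether the stop codon is counted.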

Next, the MacDNAsis program was used to translate the largest ORF from DNA to protein.  The protein's molecular weight was determined from this amino acid sequence to be 120,227.65 Da (about 120 kDa).
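
As a rough check on these numbers: if positions 92 and 3304 are inclusive, the ORF spans 3304 - 92 + 1 = 3213 nucleotides, or 1071 codons.  Reading the reported weight as 120,227.65 daltons, that works out to about 112 Da per residue, which is in the usual range for proteins.  The sketch below shows one way to translate an ORF and estimate a molecular weight.  It assumes the standard genetic code and average residue masses; MacDNAsis may use slightly different mass values, and the names translate and molecular_weight are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water is added back for the free N- and C-termini


def translate(orf):
    """Translate an ORF codon by codon, stopping at the first stop codon."""
    orf = orf.upper()
    protein = []
    for i in range(0, len(orf) - 2, 3):
        aa = CODON_TABLE[orf[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def molecular_weight(protein):
    """Estimate the average molecular weight of a protein in daltons."""
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER


print(translate("ATGAAATTTTAG"))  # MKF
print((3304 - 92 + 1) // 3)       # 1071
```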

Part 2: Structure Determination

The amino acid sequence of DNA ligase was used to predict whether the protein spans a membrane and to predict its secondary structure.

Kyte and Doolittle Hydropathy Plot:
This plot is used to predict whether a protein is an integral membrane protein.  DNA ligase is not an integral membrane protein.  However, the plot shows a few hydrophobic peaks that resemble membrane-spanning segments.  Overall, DNA ligase is hydrophilic.
Figure 2: Hydropathy Plot of the DNA ligase amino acid sequence.  Positive values on the y-axis correspond to hydrophobic regions of the protein.  Areas of the protein with values that are greater than or equal to +1.8 are possibly in a transmembrane domain.
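
A hydropathy plot of this kind is a sliding-window average of per-residue hydropathy values.  The sketch below uses the published Kyte and Doolittle scale and the +1.8 cutoff from the caption above.  The window size of 19 residues is an assumption (a common choice when looking for transmembrane segments); I do not know which window MacDNAsis used.  The function names are hypothetical.

```python
# Kyte and Doolittle hydropathy values (positive = hydrophobic).
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Return the mean hydropathy of every full window along the protein."""
    scores = [KD_SCALE[aa] for aa in protein.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_tm_windows(profile, threshold=1.8):
    """Return 0-based start indices of windows at or above the threshold."""
    return [i for i, value in enumerate(profile) if value >= threshold]


# Toy peptide, not DNA ligase: a hydrophobic stretch followed by acidic residues.
profile = hydropathy_profile("LLLLDDDD", window=4)
print(candidate_tm_windows(profile))  # [0, 1]
```

A window that clears the threshold is only a candidate.  As noted above, a soluble protein such as DNA ligase can still show isolated hydrophobic peaks.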
Hopp and Woods Antigenicity Plot:
This antigenicity plot identifies strongly hydrophilic regions, which may serve as epitope sites for antibodies.  No extremely hydrophilic sites are obvious in DNA ligase.

Figure 3: Hopp and Woods antigenicity plot of DNA ligase amino acid sequence.  Regions that are located above the x-axis are hydrophobic and regions that are below the x-axis are hydrophilic.

Secondary Structure:
The Chou, Fasman, and Rose analysis creates an image of the secondary structure of DNA ligase.  This structure is based on the order of amino acids and their various characteristics.  To view the tertiary structure of DNA ligase with a Rasmol image, click here.

Figure 4: Computer prediction of the secondary structure of DNA ligase.  α-helices are represented by blue lines, β-pleated sheets by red lines, turns in the chain by green lines, and coils by black lines.

Part 3: Comparison of African Clawed Frog DNA Ligase with Other Species
Multiple Sequence Alignment of the Amino Acid
Sequences of the Five Genbank Organisms
The multiple sequence alignment provides a comparison of DNA ligase sequences from five organisms.  The sequence provided below is a sample of only 50 amino acids.  The cDNA and amino acid sequences of the five organisms can be viewed by clicking on the appropriate species:
Xenopus laevis, Homo sapiens, Mus musculus, Arabidopsis thaliana, Schizosaccharomyces pombe

Figure 6: Sample segments of the amino acid sequence of DNA ligase from the five organisms examined.  Yellow letters surrounded in black indicate identical amino acids between organisms.  This sample indicates a high degree of sequence conservation among the species analyzed.


Phylogenetic Tree of the Amino Acid
Sequences of the Five Genbank Organisms
The phylogenetic tree describes the degree of homology among the DNA ligase amino acid sequences of the species analyzed.  For example, there is 81.4% homology between the mouse and human amino acid sequences.
Figure 7: Phylogenetic Tree of the Five Genbank Organisms.  Percentages describe the degree of homology between the amino acid sequences of the analyzed organisms.
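
Percentages like these are typically derived from pairwise comparisons of aligned sequences.  The sketch below computes a simple percent identity between two rows of an alignment.  This is an assumption about the kind of calculation involved, not a description of how MacDNAsis scores homology; in particular, skipping every column that contains a gap is only one of several conventions, and the function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # columns with a gap in either row are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy alignment rows, not real DNA ligase sequences.
print(percent_identity("MKT-A", "MRTLA"))  # 75.0
```

Counting gap columns in the denominator would lower the value, so the exact percentage depends on the convention the program uses.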

