BLAST Search

Discovery Questions


The investigators who eventually cloned the HD gene in 1993 used a novel approach that led them to the correct gene. However, since the human genome sequence is now freely available online, we will use that database to home in on the HD gene. Rather than using the Genome Browser as we did for CFTR, we will use a sequence search tool called BLAST. We will use two types of BLAST searches, one for protein sequences (BLASTp) and one for nucleotide sequences (BLASTn), to find the HD gene.

Go to the National Center for Biotechnology Information (NCBI) BLAST web site. From the BLAST web page, follow these directions:
1) Click on “Standard protein-protein BLAST [blastp]”, which will allow you to submit protein sequences for comparison to all protein sequences in the public databases. The goal is to submit a sequence (called the query sequence) and find those database sequences that match it. The longer your query sequence is, the longer the search will take, but the more likely you are to find meaningful results.

2) We are going to perform BLAST searches with seven sequences, each of which is listed below. Perform each search separately, and record the name of the gene, its abbreviated name, and a short description of the gene or protein’s role in cells.

3) Brain tissue was removed from a person who had died from HD, and cDNAs were produced from it. Many cDNAs were cloned and sequenced, but these seven are of particular interest to us. Six of the seven cDNAs allowed the investigators to deduce amino acid sequences using the genetic code. The deduced amino acid sequences are listed below. See if you can figure out which one might be the one that causes HD. Copy and paste each sequence into the large blank space and then click on the “BLAST!” button. Notice that “Do CD-Search” is selected by default. This is a conserved domain search that lets you find functional units within any proteins that match your query sequence.

4) On the results page, click on the “Format!” button. You may have to wait a while for the results, depending on when you submit this BLASTp search. You will get a graphical result that shows some of the hits (database matches).

5) Click on the first hit that is human (Homo sapiens). For each of these protein fragments, read the short description and see if you think this might be the cause for HD. Remember, we are looking for a dominant disease with a loss of mental function.
Write down the name, the abbreviated name, and the accession number of the match you find for each of the seven sequences. An accession number is a unique identifier given to each entry in the database. Also, jot down a short description of each protein or gene you locate in your BLAST searches.
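The same submit-and-retrieve cycle can also be scripted. The sketch below only prepares the requests and does not send them. It assumes the endpoint and field names (CMD, PROGRAM, DATABASE, QUERY, RID, FORMAT_TYPE) of NCBI's BLAST URL API, and it assumes the database names "nr" for proteins and "nt" for nucleotides; check all of these against the current NCBI documentation before relying on them. The function names are hypothetical helpers written for this illustration.

```python
import re
from urllib.parse import urlencode

# Assumed endpoint of the NCBI BLAST URL API; verify before use.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_submit_body(query, program="blastp", database="nr"):
    """Encode the form fields for submitting one query sequence (CMD=Put)."""
    fields = {"CMD": "Put", "PROGRAM": program, "DATABASE": database, "QUERY": query}
    return urlencode(fields).encode("ascii")


def parse_request_id(response_text):
    """Return the request ID from a line of the form 'RID = ...', or None."""
    match = re.search(r"^\s*RID = (\S+)", response_text, re.MULTILINE)
    return match.group(1) if match else None


def build_results_url(rid):
    """Build the URL that asks for the finished results of one request (CMD=Get)."""
    return BLAST_URL + "?" + urlencode({"CMD": "Get", "RID": rid, "FORMAT_TYPE": "Text"})


# Prepare (but do not send) a protein search for Amino Acid Sequence #1.
body = build_submit_body("MEFVMKQALGGATKDMGKMLGGDEEKDPDAAKKEEERQEALRQA")
print(body.decode("ascii"))
# Sending would look like: urllib.request.urlopen(BLAST_URL, data=body)
```

For the nucleotide sequence, the same helper could be called with program="blastn" and database="nt". Scripted searches share NCBI's servers with everyone else, so submit one query at a time and wait between status checks.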

Amino Acid Sequence #1
MEFVMKQALGGATKDMGKMLGGDEEKDPDAAKKEEERQEALRQA


Amino Acid Sequence #2
MSAVSQPQAAPSPLEKSPSTAILCNTCGNVCKGEVLRVQDKYFH

Amino Acid Sequence #3
MELENIVANSLLLKARQEKDYSSLCDKQPIGRRLFRQFCDTKPT

Amino Acid Sequence #4
MGWGGGGGCTPRPPIHQQPPERRVVIVVFLGLLLDLLAFTLLLP

For sequence #5, investigators were not able to determine the proper reading frame for deducing an amino acid sequence. Therefore, submit a BLASTn search by going back to the BLAST page and clicking on “Standard nucleotide-nucleotide BLAST [blastn]”.
cDNA sequence #5
cttgcctgac atcggtttcc cctcccccac ggtcccaaga tggttgtgga catccaatct cacagcagag tcatctccta tgcaggctgc ctgactcaga tgtctccctt tgccattttt
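To see why the reading frame matters, it can help to translate the cDNA in every possible frame and compare the results. The following is a minimal sketch that uses the standard genetic code and assumes the sequence may come from either strand, so it reports three forward and three reverse-complement frames. The function names are hypothetical, and this is not the method the investigators used.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def clean(dna):
    """Remove whitespace and convert to upper case."""
    return "".join(dna.split()).upper()


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return clean(dna).translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, frame=0):
    """Translate starting at offset 0, 1, or 2; '*' marks a stop codon."""
    dna = clean(dna)
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frames(dna):
    """Map frame labels such as '+1' or '-3' to their translations."""
    frames = {}
    for strand, seq in (("+", clean(dna)), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq, offset)
    return frames


CDNA_5 = ("cttgcctgac atcggtttcc cctcccccac ggtcccaaga tggttgtgga catccaatct "
          "cacagcagag tcatctccta tgcaggctgc ctgactcaga tgtctccctt tgccattttt")

for label, protein in six_frames(CDNA_5).items():
    print(label, protein.count("*"), "stop codon(s):", protein)
```

A frame interrupted by several stop codons is unlikely to be the coding frame. When no frame stands out, searching with the nucleotide sequence itself avoids having to choose one.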

Go back to the BLAST web site and perform a BLASTp search for each of the following:
Amino Acid Sequence #6
MAAAAEPGARAWLGGGSPRPGSPACSPVLGSGGRARPGPGPGPG

Amino Acid Sequence #7
MATLEKLMKAFESLKSFQQQQQQQQQQQQQQQQQQQQQQQPPPP
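Long runs of a single residue can dominate a BLAST alignment, so it can be useful to check each query for them before interpreting the hits. The sketch below is a simple composition check, not part of BLAST itself, and the function name is hypothetical. It reports the longest single-residue run in each of the six amino acid sequences.

```python
from itertools import groupby

SEQUENCES = {
    1: "MEFVMKQALGGATKDMGKMLGGDEEKDPDAAKKEEERQEALRQA",
    2: "MSAVSQPQAAPSPLEKSPSTAILCNTCGNVCKGEVLRVQDKYFH",
    3: "MELENIVANSLLLKARQEKDYSSLCDKQPIGRRLFRQFCDTKPT",
    4: "MGWGGGGGCTPRPPIHQQPPERRVVIVVFLGLLLDLLAFTLLLP",
    6: "MAAAAEPGARAWLGGGSPRPGSPACSPVLGSGGRARPGPGPGPG",
    7: "MATLEKLMKAFESLKSFQQQQQQQQQQQQQQQQQQQQQQQPPPP",
}


def longest_run(seq):
    """Return (residue, length) for the longest run of one repeated residue."""
    best_residue, best_length = "", 0
    for residue, group in groupby(seq):
        length = sum(1 for _ in group)
        if length > best_length:
            best_residue, best_length = residue, length
    return best_residue, best_length


for number, seq in SEQUENCES.items():
    residue, length = longest_run(seq)
    print(f"Sequence #{number}: longest run is {length} x {residue}")
```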

6) By now, you have figured out which gene/protein is the right one because it is well documented in the database. How long is the HD gene? How long is the mRNA? What is the protein called?

7) Let’s determine the genomic location of the seven genes you found in your BLAST searches. Where is each of the seven genes located? You can search quickly at Mapviewer. Enter each gene’s name in the “Search for” box and then hit the “Find” button.
Abbreviated names tend to work better than full names.

8) What is the cause of Huntington’s disease? In other words, what does this gene look like when a person has HD? How does it differ from most people’s alleles? If you did not find the answer in your previous search, you can find a more definitive answer at the repository called OMIM (Online Mendelian Inheritance in Man(kind)).

9) How many hits are there? Scroll down and click on the link above the gene called “HUNTINGTIN-INTERACTING PROTEIN 1; HIP1”. Read about HIP1 and others like it. What is happening to people with HD?

Send comments, questions, and suggestions to: macampbell@davidson.edu or (704) 894 - 2692

© Copyright 2003 Department of Biology, Davidson College, Davidson, NC 28035