Genome Browser Search

Online Study Question

If you wanted to home in on the CF gene today, you could utilize the human genome sequence. We will zoom in on the CF gene using the Genome Browser at the University of California at Santa Cruz. There are similar Genome Browser versions located on other campuses, such as the one located in England at the Sanger Center.

1) Click on “Genome Browser” in the top left menu. On the resulting page, make sure “human” is selected from the pull down menu, along with the most recent version.

2) Enter the RFLP marker “D7S8” in the "position or search term" box and click on “Submit”.

3) You should see the D7S8 marker, as well as a few other more recently discovered markers that were not available when CF was being cloned for the first time.

4) Look below the display, where you will see a large collection of pull down menus. Click on “hide all” to make all features disappear, then modify these three settings using the first three pull down menus under the "Mapping and Sequencing Tracks" heading:

Base Position: Full
Chromosome Band: Dense
STS Markers: Full

5) Click on the “Refresh” button above the pull down menus. This should clean up the view a bit. Make sure you can still see D7S8 (it should be highlighted).

6) At the top of the browser window, copy and paste into the “position” box:
" chr7:115990000-117500000 "

8) Click on “Jump” (next to the position/search box). “Jump” tells the browser to move to the range of DNA bases numbered 115,990,000 through 117,500,000 on chromosome 7 (roughly 116 Mb – 117.5 Mb). This region is the approximately 1.5 Mb range of bases flanked by the oncogene MET (also called SWSS4849) on one end and D7S8 on the other. All we have done so far is electronically zoom out from the single RFLP marker D7S8 to a wider perspective showing both RFLP markers.
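If you want to check the arithmetic in this step, the position string can be parsed and its span computed. The following is a minimal Python sketch and is not part of the Genome Browser; the function names `parse_position` and `span_in_mb` are made up for illustration, and the sketch assumes the position box uses inclusive coordinates. The marker positions are the approximate values quoted in this exercise.

```python
import re

def parse_position(position):
    """Split a 'chrN:start-end' string into (chromosome, start, end)."""
    # Tolerate surrounding spaces and thousands separators.
    cleaned = position.strip().replace(",", "")
    match = re.fullmatch(r"(chr\w+):(\d+)-(\d+)", cleaned)
    if match is None:
        raise ValueError(f"not a position string: {position!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end

def span_in_mb(start, end):
    """Length of the inclusive range start..end, in megabases."""
    return (end - start + 1) / 1_000_000

chrom, start, end = parse_position(" chr7:115990000-117500000 ")
print(chrom, start, end, round(span_in_mb(start, end), 2))  # chr7 115990000 117500000 1.51

# Approximate marker positions quoted in this exercise.
for name, base in [("MET", 115990000), ("D7S8", 117440000)]:
    print(name, "inside range:", start <= base <= end)
```

Both markers fall inside the range, which is what you should also see in the browser display.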

9) Locate MET and D7S8 (MET is near base 115990000; D7S8 is near base 117440000) at this time to verify you are in the right section of the human genome. To see MET in this sea of markers, change the settings as follows:

STS Markers: Hide (under "Mapping and Sequencing Tracks")
Genscan Genes: Full (under "Genes and Gene Prediction Tracks")

10) Click on the “Refresh” button. You should see a series of brown horizontal lines with vertical tick marks (of varying thickness) that indicate all the predicted genes in this 1.5 Mb region. The vertical tick marks show where the Genscan computer program predicted exons to be located. The brown cartoon can be used to estimate what the gene looks like, with introns as horizontal brown lines devoid of the vertical tick marks.

11) Write down how many genes are predicted for this 1.5 Mb region. Can you guess which one is CF?! Imagine being the first person to sequence this region and not know which gene is the cause of cystic fibrosis.

12) Now change the settings as follows:

Genscan Genes: Hide
UCSC Genes: Full
RefSeq Genes: Dense

13) Hit the “Refresh” button. How many different known genes are really in this region (count RefSeq Genes)? Why are some genes listed more than once (UCSC Genes), with different patterns of tick marks for each listing, even though they cover the same portion of the genome? Did the number of predicted “Genscan” genes and “RefSeq Genes” agree? Explain your answer.

14) Skim the list of known genes and find the gene called CFTR. When the CF gene was cloned and sequenced, the investigators wanted to call it something a bit more descriptive than just the CF gene. So they called it the Cystic Fibrosis Transmembrane conductance Regulator. As we continue our exploration of this human locus, try to figure out why they gave it this name.

15) Click on the middle CFTR text listed for the UCSC genes, and read what you have found. How long is this gene? How many amino acids are in the encoded protein?

16) Scroll down a little and find the microarray gene expression data that reveal which tissues transcribe a particular gene the most. Which tissues transcribe CFTR the most (red boxes)? Which ones the least (green boxes)?

17) Scroll back up to where there is a table of "Sequence and Links to Tools and Databases".

18) Click on the link to GeneCards for “CFTR”. Note: GeneCards pops up a new window instead of the right frame where the other databases were displayed.

19) If the CFTR page does not appear, scroll down the alphabetical listing and find CFTR. Click on this link in the first column.

20) Note CFTR’s chromosomal position as a thin red line on the diagram of chromosome 7. Does CF’s position match what you saw in the Genome Browser? Determine the exact start and stop positions for CFTR. The term “pter” means the terminus of the p arm (p stands for petite, which is French for small). How long is the CFTR gene? Which strand is the coding strand? This is indicated as “orientation” on this page; the plus strand is on top.

Use your web browser's "Find" function to locate the link called "NM_000492.3" in the "Transcripts" section, under the heading "REFSEQ mRNAs:" (about one third of the way down this results page). Click on the "1" to the left of "CoreNucleotide records" and then on NM_000492 on the resulting page. How long is the mRNA? Calculate the percentage of the CFTR gene that is exon.
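The percentage calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `exon_percentage` is hypothetical, the start and stop positions are treated as inclusive, and the mRNA length is taken as a stand-in for the total exon length. The numbers in the example describe a made-up gene, not CFTR, so substitute the values you recorded.

```python
def exon_percentage(gene_start, gene_end, mrna_length):
    """Percentage of the gene's genomic span that is represented in the mRNA."""
    gene_length = gene_end - gene_start + 1  # inclusive coordinates (assumption)
    if gene_length <= 0:
        raise ValueError("gene_end must not be smaller than gene_start")
    if not 0 <= mrna_length <= gene_length:
        raise ValueError("mRNA length must lie between 0 and the gene length")
    return 100.0 * mrna_length / gene_length

# Placeholder values for a made-up 50,000 base gene with a 2,500 base mRNA.
print(round(exon_percentage(1_000_001, 1_050_000, 2_500), 1))  # 5.0
```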

21) Go back to your GeneCards browser window and scroll down to the expression in human tissues section, which contains a collection of colored bar graphs displaying relative amounts of CFTR mRNA in normal human tissues. Which 5 tissues (using the UniGene Electronic Northern dataset) express the highest levels of CFTR? Rank them in order from highest to lowest. Any surprises? Which display is easier to read, this GeneCards one or the version you saw in step 16 above?

22) Scroll down further in the GeneCards page to the "Disorders & Mutations" section and click on the link named "CFTR_HUMAN, P13569" to the right of “UniProt/Swiss-Prot:”. This database focuses on the human proteome instead of the human genome. A proteome is the protein equivalent to a genome – the total protein content of an organism.

23) Scroll down to the Features heading and then click on the link called “Feature table viewer” (next to a gray box).

24) On this results page, scroll to the very bottom and view the predicted features of CFTR in a gray box. Before you do anything, you may need to click once on the zoom button to show all the physical features. Click on each of the colored features to determine how many transmembrane (TM) domains there are (each TM is illustrated as a green rectangle) and where the ATP binding sites (blue boxes) are located. You can zoom in on any region by sliding the red vertical bars to flank the area you want magnified and then clicking on the zoom button. What are the small green "v's" scattered along the black bar at the top?

25) If the N terminus of CFTR is in the cytoplasm, draw a picture of CFTR in your Study Guide using the information in the feature table viewer. Use a pencil in case you make mistakes along the way. At this time, just focus on the topology, or the number of times CFTR snakes across the plasma membrane.
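The bookkeeping behind such a topology drawing is simple: every transmembrane domain carries the chain to the opposite side of the membrane. Here is a minimal Python sketch of that rule. The function name `loop_sides` and the side labels are made up for illustration, and the example uses a hypothetical protein with 4 TM domains rather than the count you found for CFTR.

```python
def loop_sides(n_terminus_side, tm_count):
    """List the side of the membrane for each stretch between TM domains, N to C."""
    other = {"cytoplasm": "outside", "outside": "cytoplasm"}
    if n_terminus_side not in other:
        raise ValueError("side must be 'cytoplasm' or 'outside'")
    sides = [n_terminus_side]
    for _ in range(tm_count):
        # Each TM domain crosses the membrane once, so the side flips.
        sides.append(other[sides[-1]])
    return sides

# Hypothetical protein with 4 TM domains and a cytoplasmic N terminus.
print(loop_sides("cytoplasm", 4))
# ['cytoplasm', 'outside', 'cytoplasm', 'outside', 'cytoplasm']
```

The last entry is the side on which the C terminus ends up, so an even number of TM domains returns the chain to the side where it started.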

26) Add to your drawing the features of ATP binding sites, phosphorylation sites and glycosylation sites. To see these features clearly, drag the red lines (located on both sides of this view) by clicking on the rectangles at the bottom and dragging the lines to frame the area with all the little blue circles. Then click on the Zoom button. How many ATP-binding sites, phosphorylation sites and glycosylation sites are there? Add these to the picture you are creating in your Study Guide. Note that the TM domains have gotten thicker since you have zoomed in. You can slide the slider bar at the bottom of the window to the left and right to see other areas of the protein structure at this magnification.

27) Reset the view of the entire CFTR protein. Now use the red bars again to zoom in for a higher resolution. Center the red bars on the first (located closer to the amino terminus) ATP binding site. Continue to zoom in until you can see the single letters (in color) representing the amino acids of CFTR.

28) Move the slider bar near the bottom over to the right until you can see amino acid 508, which is a phenylalanine (represented by the letter F). Click on the green symbol above F508 and read the text in the top left corner. What does it say? Mark this amino acid on your drawing.

29) How many transmembrane domains are in CFTR? How many ATP binding sites? Phosphorylation sites? Glycosylation sites? What feature of CFTR is closest to the amino acid F508?
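If you jot down the start and end residue of each feature from the feature table viewer, finding the feature closest to a given amino acid is a small calculation. The sketch below shows one way to do it in Python. The function names and the three features listed are hypothetical placeholders, not CFTR data, so replace them with the coordinates you recorded.

```python
def distance_to_feature(residue, start, end):
    """Number of residues between a position and a feature (0 if inside it)."""
    if start <= residue <= end:
        return 0
    return min(abs(residue - start), abs(residue - end))

def closest_feature(residue, features):
    """Return the (name, start, end) tuple whose range lies nearest to residue."""
    return min(features, key=lambda f: distance_to_feature(residue, f[1], f[2]))

# Made-up feature table: (name, first residue, last residue).
features = [
    ("feature A", 10, 30),
    ("feature B", 95, 102),
    ("feature C", 150, 150),
]
print(closest_feature(120, features))  # ('feature B', 95, 102)
```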


Send comments, questions, and suggestions to: (704) 894-2692
© Copyright 2006 A. Malcolm Campbell
Department of Biology, Davidson College, Davidson, NC 28035