This web page was produced as an assignment for an undergraduate course at Davidson College.

Project Summary:
    Expanding on its work analyzing the ties between a person's genome and physical features, Craig Venter's Human Longevity has developed a method to generate a three-dimensional image of a human face based solely on the individual's genome. Venter’s group applied machine learning to the human genome as an exercise in discovery science. First, the researchers acquired the genomic data of 1,061 people and imaged their faces to produce high-resolution, three-dimensional maps readable by their algorithm. The researchers also recorded the participants' other physical characteristics, including height, weight, age, eye color, hair color, and voice. These data were used to train the machine-learning algorithm, allowing it to find patterns in the participants' genomes that it could correlate with trends in the facial images. Given the genome sequence of an individual it had not seen before, the algorithm produced facial images that could be matched to photographs for eight of ten people when all races were considered, and for five of ten people when the individuals were all of the same race.
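
    The matching step described above can be pictured as a small identification task: each predicted face is compared against a pool of real faces, and a match counts only when the closest real face belongs to the same person. The sketch below is a minimal illustration under loud assumptions, not the researchers' procedure. The faces are synthetic random vectors, the function name top1_match_rate is hypothetical, and Euclidean distance stands in for whatever similarity measure the study actually used.

```python
import numpy as np


def top1_match_rate(predicted, observed):
    """Fraction of predicted rows whose nearest observed row has the same index.

    Both arrays have shape (n_people, n_features); row i of each array
    is assumed to describe the same person.
    """
    # Pairwise squared Euclidean distances, shape (n_people, n_people).
    diffs = predicted[:, None, :] - observed[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    # For each prediction, the index of the closest observed face.
    best = dists.argmin(axis=1)
    return float((best == np.arange(len(predicted))).mean())


# Synthetic stand-ins (assumption): 10 people, 20 face features each.
rng = np.random.default_rng(1)
pool_size, n_features = 10, 20
observed = rng.normal(size=(pool_size, n_features))
# A "prediction" here is just the true face plus random noise.
predicted = observed + rng.normal(scale=0.5, size=observed.shape)

rate = top1_match_rate(predicted, observed)
print(f"Matched {rate:.0%} of a pool of {pool_size}")
```

    With a pool of ten, a rate of 0.8 would correspond to the "eight of ten" figure quoted above. Raising the noise scale in the synthetic predictions lowers the rate, which is one way to see why a pool of more similar faces is harder to match.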

A genome-informed image of the correspondent's face
Figure 1: A genome-informed image of the face of the writer of the news story (left) and the writer's face at age 20 (right).

Genomics Methods:
    Venter’s group employed whole genome sequencing to acquire the genomic data that would inform their model. The model the researchers built relies on principal component analyses relating the known genomes to the phenotypes of the donating individuals. The data from the initial 1,061 study participants determined the initial principal components from which the model makes predictions.
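
    To make the idea of principal component analysis on genomic data concrete, here is a minimal sketch. It is not the researchers' pipeline: the genotype matrix is synthetic (random minor-allele counts of 0, 1, or 2), the sizes are arbitrary, and the function name principal_components is hypothetical. It shows only the generic technique of centering the data and taking a singular value decomposition to obtain the top components.

```python
import numpy as np


def principal_components(X, k):
    """Return (scores, components, explained_variance_ratio) for the top k PCs."""
    # Center each column (each variant) so the PCs describe variation
    # around the mean genotype.
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :k] * S[:k]      # each person's coordinates on the PCs
    variance = S ** 2
    return scores, Vt[:k], variance[:k] / variance.sum()


# Synthetic genotypes (assumption): 200 people by 500 variants.
rng = np.random.default_rng(0)
n_people, n_snps = 200, 500
allele_freqs = rng.uniform(0.1, 0.5, size=n_snps)
genotypes = rng.binomial(2, allele_freqs, size=(n_people, n_snps)).astype(float)

scores, components, ratio = principal_components(genotypes, k=10)
print(scores.shape)       # one row of 10 PC coordinates per person
print(ratio.round(3))     # share of total variance captured by each PC
```

    Each person is reduced from 500 genotype values to 10 coordinates. In a setup like the one described above, compact coordinates of this kind, rather than the raw genome, would be what a predictive model relates to measured traits.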

Take Home Message:
    Principal component analysis of genomic data can uncover patterns reproducible enough to generate images of the human face. These results may demonstrate a machine understanding of how complex traits are encoded in the genome. In a world in which large databases of genomic data coexist with social media websites storing personal information and photos, these findings have profound implications for privacy. The genome becomes increasingly personal information when technologies can generate an image of one’s face from it, and doing so may only become easier. Soon, only limited genomic information, such as a profile of SNPs, may be enough to create a facial image. These findings emphasize the need for a cohesive, universal way to secure and anonymize genomic data as it increasingly informs contemporary technologies.

Genome-informed faces
Figure 2: High-resolution images of participants' faces (left) and genome-informed images of the faces of these individuals (right) (Lippert et al. 2017).

    Venter’s genome-informed construction of facial images astounds me. First, I am struck by the power of machine learning to gather enough information from principal component analysis of the genome to generate accurate facial images. I wonder whether the patterns the computer recognizes in the genome could be uncovered and used to determine the heritability of complicated traits.
    Second, this research confronted me with the importance of the privacy of one’s genome. As genomic data becomes increasingly relevant to everyday life, particularly in healthcare, one’s genome will become more accessible than ever. These findings present an unsettling reality: your genome may be used to find you! As the news report states, this application has great utility in forensics (body fluid samples might now be used to generate facial images of individuals at the scene of a crime), but it also puts individuals at risk. When the faces of billions of people can be found in photos available on social media sites, the connection between genome sequence, genome-derived facial image, and real person becomes much easier to make. Just as we protect our social security numbers, credit card information, and passwords, we may soon have to protect the sequences of our genomes.



© Copyright 2018 Department of Biology, Davidson College, Davidson, NC 28035