Gattaca as a lens on contemporary genetics: marking 25 years into the film’s “not-too-distant” future

Abstract
The 1997 film Gattaca has emerged as a canonical pop culture reference used to discuss modern controversies in genetics and bioethics. It appeared in theaters a few years prior to the announcement of the “completion” of the human genome (2000), as the science of human genetics was developing a renewed sense of its social implications. The story is set in a near-future world in which parents can, with technological assistance, influence the genetic composition of their offspring on the basis of predicted life outcomes. The current moment, 25 years after the film’s release, offers an opportunity to reflect on where society currently stands with respect to the ideas explored in Gattaca. Here, we review and discuss several active areas of genetic research (genetic prediction, embryo selection, forensic genetics, and others) that interface directly with scenes and concepts in the film. On the film’s silver anniversary, we argue that Gattaca remains an important reflection of society’s expectations and fears with respect to the ways that genetic science has manifested in the real world. In accompanying supplemental material, we offer some thought questions to guide group discussions inside and outside of the classroom.


Thought questions for discussions of Gattaca
Gattaca has been used in educational settings to introduce concepts in genetics and to explore controversies at the intersection between science and society.
In this section, we offer discussion questions that can be used to drive conversations in group settings, based on topics that appear in the film and in modern discourse around genomic technology. This is not an exhaustive list but rather is intended to generate starting points for discussion. There is no sole "correct" answer to any of these questions. We make a rough division into questions about (a) ethical, legal, and social issues; (b) the science of human genetics; and (c) the film and its world. At the same time, many of these questions could be grouped in multiple categories.
A warning for those who haven't seen the film: Major spoilers are embedded in a few of these questions.

Ethical, Legal, and Social Issues
1. Gattaca shows a world in which widespread embryo selection becomes the basis for discrimination, despite laws against it. Do you think it is possible to design a society that used widespread embryo selection, as in Gattaca, and at the same time to avoid genetic discrimination? Why or why not? How would you attempt to do it?
2. When (if ever), in your opinion, should human embryo selection (or direct modification of genetic material; Greely, 2019) be allowed? Would you place restrictions on the traits that are allowed as targets of selection or modification? Which traits would you allow? Would you place restrictions on the basis of scientific understanding of the traits or their genetic basis? What kinds of knowledge would you require?
3. Following on the previous question, many people feel that there are certain traits that should not be allowed to be targets of embryo selection. However, as discussed briefly in the main text, the existence of pleiotropy means that selection on a target trait could have an effect on other traits. Given pleiotropy, what does it mean to specify a set of traits that are not allowed as (direct) targets of embryo selection?
4. The film suggests that a world can exist in which discrimination is no longer based on the "color of your skin" (race and ethnicity) but rather on the composition of one's genome. Do you think such a world, where one form of discrimination (genetic discrimination) fully supplants another (racial discrimination), is possible? Why or why not? Are there historical analogues for this happening?
5. As mentioned in the main text, Gattaca's original ending included a montage of people who might never have been born if embryo selection against genetic disease had been developed sooner. It also suggested that one person who might not have been born is "you." What do you make of these statements as arguments?
6. In the Gattaca detectives' search for the murderer, they collect DNA samples from hundreds of people on the basis of only their location (Gattaca employees at the facility, people congregated in a place where in-valids are thought to assemble, and people within a radius of the crime scene). This kind of procedure is sometimes called a "DNA dragnet" and has been carried out in our world, including around the time Gattaca was made.
Gattaca portrays or suggests genetic discrimination in many contexts: hiring, promotion, education, health insurance coverage, criminal investigation, and even dating. Are there other contexts in which you think genetic discrimination could become a problem? What are they?

Genetics
10. Vincent says that most of the conditions he is predicted to develop at birth are "still untreatable to this day." Is the fact that treatment is impossible for these conditions a warning sign about the accuracy of the predictions themselves? Why or why not? More generally, what are the gaps between the ability to predict an outcome and the ability to intervene in it downstream?
11. A trait's heritability is not a fixed property of the trait; it can vary on the basis of the genetic variation in the population and on the amount of environmental variation that influences the trait. In the world of Gattaca, how would you expect traits' heritabilities to change? Specifically, how do you think that the components of heritability, traits' genetic variance and environmental variance, would change? More generally, what would be the long-term effects on the population of embryo selection as practiced in Gattaca?
12. Turley and colleagues (2021) raise the point that prediction of an embryo's traits is dependent on the environment that the embryo experiences. The embryo's environment might be different from previous environments, which can be the only basis for prediction. For what kinds of traits do you think that this disconnect will be most relevant, and most likely to lead to prediction errors?
13. As mentioned in the main text, the first child born after polygenic embryo selection was born in 2020. The first report of children born from gene-edited embryos came even earlier, in 2018, after a secret experiment led by He Jiankui that was roundly condemned by scientists and biomedical ethicists (Greely, 2019). Discuss the difference between the risks and benefits represented by embryo selection procedures, including polygenic embryo selection, and direct genetic editing of embryos.
14. [For groups familiar with human genetics study designs] As mentioned in the main text, the SNP heritability is likely the near-term upper bound on the percentage of a trait's variance that could be explained with a genetic predictor. Do you think that genetic predictions' performance will exceed the SNP heritability in the long term? Why or why not? If you said, "yes," how do you think that such predictions could be developed? Will they achieve the performance suggested by twin-based heritability estimates, or will they be bounded by some other number?

Gattaca and its world
15. In a deleted scene, the geneticist offers Vincent's parents an opportunity, for a steep fee, to insert genetic material into one of their embryos, sequences associated with advanced musical or mathematical ability.

Details on the liability threshold model and derivation of equation 1
Under the liability threshold model, an individual's liability is represented as a normally distributed random variable, itself a sum of two independent normally distributed variables, one representing a "genetic" component of liability, and the other representing an "environmental" component of liability. The individual develops the disease if their total liability exceeds a threshold. We do not observe an individual's liability, only whether the individual exceeds the threshold and thus develops the disease. The liability threshold model is a coarse statistical description of complex disease risk. That said, many of its basic predictions approximately hold for many complex diseases in humans (Visscher and Wray 2015).
To model a specific disease under the liability threshold model, one must specify a disease prevalence, which determines the threshold, and a heritability, which determines the relative variance of the genetic and environmental components of liability. For example, a rare disease will have a high threshold, such that only a few people have liabilities exceeding the threshold. For a highly heritable disease, the variance of the genetic component of liability will be higher than that of the environmental component of liability, such that the genetic liability is responsible for most of the variance in overall liability. One way to write the model is

$$D = \{L_G + L_E > t\},$$

where $D$ represents the event that the individual develops the disease, $L_G$ is the genetic component of liability, $L_E$ is the environmental component of liability, and $t$ is a threshold. In one of several possible parameterizations, $L_G$ and $L_E$ are both normal random variables with expectation 0 and variances $\sigma_G^2$ and $\sigma_E^2$, with $\sigma_G^2 + \sigma_E^2 = 1$, so that the liability is a standard normal. In this parameterization, $\sigma_G^2$ is the heritability of the liability. The threshold is then chosen as $t = \Phi^{-1}(1 - p)$, where $p$ is the disease prevalence and $\Phi$ is the cumulative distribution function of the standard normal distribution.
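As a rough check on the parameterization above, the following Python sketch simulates liabilities and confirms that the fraction of simulated individuals exceeding the threshold is close to the specified prevalence. The function names, sample size, and seed are illustrative assumptions rather than part of the original analysis, and only the standard library is used.

```python
import random
from statistics import NormalDist


def liability_threshold(prevalence):
    """Threshold t = Phi^{-1}(1 - p) on the standard-normal liability scale.

    Assumes 0 < prevalence < 1.
    """
    return NormalDist().inv_cdf(1.0 - prevalence)


def simulate_prevalence(prevalence, heritability, n=200_000, seed=1):
    """Simulate n individuals and return the fraction who develop the disease.

    The genetic and environmental liabilities are drawn independently with
    variances heritability and 1 - heritability, so their sum is standard
    normal. n and seed are arbitrary choices for this illustration.
    """
    rng = random.Random(seed)
    t = liability_threshold(prevalence)
    sd_g = heritability ** 0.5          # standard deviation of L_G
    sd_e = (1.0 - heritability) ** 0.5  # standard deviation of L_E
    cases = 0
    for _ in range(n):
        l_g = rng.gauss(0.0, sd_g)
        l_e = rng.gauss(0.0, sd_e)
        if l_g + l_e > t:               # disease develops if liability exceeds t
            cases += 1
    return cases / n
```

Calling `simulate_prevalence(0.05, 0.5)` should return a value near 0.05, up to sampling noise. The heritability changes how the liability is split between the two components but not the overall prevalence.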
Under this model, define $r$ as an individual's predicted risk for the disease. We seek the quantile of the genetic liability necessary to obtain disease risk $r$ given prevalence $p$ and liability-scale heritability $\sigma_G^2$. If an individual has genetic risk $r$, then by definition, $P(L_G + L_E > t \mid L_G = l_G) = r$, i.e., the probability that their total liability exceeds the threshold, conditional on their realized genetic liability $l_G$, is $r$. Because $L_G$ and $L_E$ are independent, this statement implies $P(L_E > t - l_G) = r$, which is equivalent to $P(L_E \le t - l_G) = 1 - r$. The left side is the cumulative distribution function of $L_E$ evaluated at $t - l_G$, so by inverting the cumulative distribution at $1 - r$ (that is, by calling the quantile function of $L_E$ at $1 - r$), we obtain $t - l_G$. If $\Phi^{-1}$ is the quantile function of the standard normal, then $\sigma_E \Phi^{-1}$ is the quantile function of a normal distribution with expectation 0 and standard deviation $\sigma_E$, so we have

$$t - l_G = \sigma_E \Phi^{-1}(1 - r).$$

And, remembering that $t = \Phi^{-1}(1 - p)$, we have that the implied genetic liability is

$$l_G = \Phi^{-1}(1 - p) - \sigma_E \Phi^{-1}(1 - r).$$

Ultimately, we seek the quantile associated with risk $r$, and thus genetic liability $l_G$, obtained by evaluating the cumulative distribution function of $L_G$ at $l_G$. $L_G$ is normal with expectation 0 and standard deviation $\sigma_G$, so its cumulative distribution function can be obtained by dividing the argument by $\sigma_G$ and using the standard normal cumulative distribution function. Thus, the percentile of the genetic liability $L_G$ necessary to produce risk $r$ is

$$\Phi\left(\frac{\Phi^{-1}(1 - p) - \sigma_E \Phi^{-1}(1 - r)}{\sigma_G}\right),$$

which is equal to equation 2. The relationship of individual risk percentage and individual risk quantile considered here is also the subject of the "predictiveness curve" considered by So & Sham (2010).
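The steps of the derivation above can be sketched numerically as follows. This is a minimal illustration under the stated parameterization (unit total liability variance, independent components). The function names are hypothetical, and the second function is simply the forward map from genetic liability to risk, included so that the inversion can be checked by a round trip.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf is Phi, inv_cdf is Phi^{-1}


def genetic_liability_percentile(risk, prevalence, heritability):
    """Percentile of genetic liability needed to produce disease risk `risk`.

    Assumes 0 < risk < 1, 0 < prevalence < 1, and 0 < heritability <= 1.
    """
    sd_g = heritability ** 0.5
    sd_e = (1.0 - heritability) ** 0.5
    t = STD_NORMAL.inv_cdf(1.0 - prevalence)        # t = Phi^{-1}(1 - p)
    l_g = t - sd_e * STD_NORMAL.inv_cdf(1.0 - risk)  # implied genetic liability
    return STD_NORMAL.cdf(l_g / sd_g)                # CDF of L_G evaluated at l_g


def risk_at_percentile(percentile, prevalence, heritability):
    """Forward map: disease risk for someone at a given genetic-liability percentile.

    Assumes 0 < percentile < 1, 0 < prevalence < 1, and 0 < heritability < 1.
    """
    sd_g = heritability ** 0.5
    sd_e = (1.0 - heritability) ** 0.5
    t = STD_NORMAL.inv_cdf(1.0 - prevalence)
    l_g = sd_g * STD_NORMAL.inv_cdf(percentile)
    return 1.0 - STD_NORMAL.cdf((t - l_g) / sd_e)    # P(L_E > t - l_g)
```

For example, `genetic_liability_percentile(0.5, 0.01, 0.5)` asks how extreme a genetic liability must be for a person to have a 50% risk of a disease with 1% prevalence and a liability-scale heritability of 0.5. Feeding the result back into `risk_at_percentile` should recover 0.5 up to floating-point error.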

Table S1
Definitions of key concepts and terms

Term
Definition

"Valid" and "in-valid"
Terms from Gattaca that refer to people born either with ("valid") or without ("in-valid") genetic assistance. We use these terms because they are the most frequent in the film, but valids are also referred to as vitros or "made men," and in-valids as God-children, faith births, and uteros.

"Borrowed ladder"
In the world of Gattaca, the illegal act of adopting another individual's genetic identity in order to gain access to exclusive parts of society (reserved for the "valid"). People who attempt such impersonation are also called "de-gene-rates."

Genetic prediction
The attempt to predict an organism's phenotype from its genotype.

Heritability
The proportion of a trait's variance attributable to genetic variation in a given population. A trait's heritability is not fixed and depends on both genetic variation and environmental conditions. The precision of genetic prediction is limited by the heritability. The heritability explained by common variants is called the "SNP heritability."

Genetic interactions
Encompassing gene-by-gene interactions and gene-by-environment interactions. Interaction describes a situation in which the effect of a genotype on an individual outcome depends on either another genotype (gene-by-gene interaction) or the organism's environment (gene-by-environment interaction).

Polygenic score
A prediction of a phenotype formed from an individual's genotype, in practice typically a weighted sum of a person's counts of alleles associated with the phenotype. Also called a polygenic risk score (PRS).

Embryo selection
The selection of a candidate embryo produced by in vitro fertilization for implantation on the basis of its predicted phenotype. Also called embryo screening or preimplantation genetic screening. Though some have taken Gattaca to be a movie about gene editing, the film suggests that the key technology of "genetic assistance" is embryo selection.

DNA "fingerprint"
A subset of a person's genotype, typically from a small number of highly variable markers, chosen so that two unrelated people are extremely unlikely to share genotypes at all markers. Heavily used in forensic genetics.

Investigative genetic genealogy (IGG)
The attempt to identify the source of a crime-scene sample by identifying people who appear to be biological relatives of the source in genetic genealogy databases.

Forensic DNA phenotyping
A special case of genetic prediction, in which DNA is used to predict externally visible phenotypes for forensic purposes.

Polydactyly
A condition in which an organism develops additional digits on its extremities. Gattaca features a pianist with six fingers per hand.