CHICAGO—The outcomes of genome-wide association studies (GWAS) have not been what scientists expected, but researchers are developing new approaches to use revelatory GWAS information to identify genetic causal variants, predictors of treatment response, and future opportunities for genetic insight.
The ability to perform GWAS, developed in the past five years, has allowed the identification of genetic risk factors for systemic diseases. In addition, GWAS have pinpointed biologic constituents and treatment response predictors in the genetic components of both common and rare disorders. However, contrary to what scientists expected, GWAS have identified only a small number of the causal variants for recently identified genetic loci, which, in turn, explain only a small part of the genetic contribution to these diseases.
The emerging limitations of GWAS are not a reason to halt these studies, however. “The good thing about genome-wide association studies is that they do what they are billed to do—that is, represent common genetic variation in the human genome—and they do it well,” said David B. Goldstein, PhD, Richard and Pat Johnson Distinguished University Professor and director of the Center for Human Genome Variation at Duke University in Durham, N.C. Dr. Goldstein addressed GWAS research and its application in the ACR State-of-the-Art Lecture, “Moving Forward in the Genome Wide Association Studies Era,” at the 2011 ACR/ARHP Annual Scientific Meeting here in November. [Editor’s note: This session was recorded and is available via ACR SessionSelect at www.rheumatology.org.] On the other hand, “the variation in response, most of which cannot be explained, is motivation for doing serious genetics,” continued Dr. Goldstein.
To this end, researchers are developing new approaches to isolate and define causal variants and explore genetic features that affect disease. These new methods seek to explain the “missing heritability” of the GWAS and generate DNA sequencing approaches to identify causal genetic variants.
Gap between GWAS Findings and Identification of Causal Variants
GWAS use gene chips to scan the human genome, analyzing large amounts of DNA in many people to search for variants that are more common in cases than in controls. One underlying premise is that GWAS detect common variation in specific diseases that leads to the genes causing the disease and later to individualized therapies. While GWAS have identified a multitude of single-letter DNA changes, so-called single-nucleotide polymorphisms (SNPs), associated with risk of many common diseases, in reality scientists have been unable to discover the specific genetic changes influencing these common conditions. “What we have in the main are signals that we can’t track to causal variants,” Dr. Goldstein said. “When we can’t track signals to causal variants, we have little biology from the signals because we don’t know what is causing the variants.”
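To make the case-control comparison concrete, here is a minimal sketch of how a single variant might be tested for association. It is an illustration only, not the method used in any study discussed here: the function name, the allele counts, and the choice of a simple Pearson chi-square test on a 2x2 table of allele counts are all assumptions. Real GWAS repeat a test like this across hundreds of thousands of variants and apply a much stricter significance threshold to account for that.

```python
import math


def allelic_association(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Rows are cases and controls; columns are the variant (alt) allele and the
    reference allele. Returns (chi2, p_value, odds_ratio).
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if min(margins) == 0:
        raise ValueError("every row and column of the table needs at least one allele")

    # Shortcut formula for the 2x2 Pearson chi-square statistic.
    chi2 = n * (a * d - b * c) ** 2 / math.prod(margins)

    # For 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))

    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return chi2, p_value, odds_ratio


# Hypothetical counts: the variant allele is seen on 300 of 1,000 case
# chromosomes and on 200 of 1,000 control chromosomes.
chi2, p_value, odds_ratio = allelic_association(300, 700, 200, 800)
print(f"chi2={chi2:.2f}  p={p_value:.2e}  odds ratio={odds_ratio:.2f}")
```

A small p-value here only flags a signal at that position. As the quote above notes, it does not by itself identify the causal variant.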
This has resulted in a different interpretation of GWAS findings: the basic genetic components of most common diseases are rare genetic variants rather than more common ones. Furthermore, the rare genetic variants occur in more remote functional regions of the genome that GWAS do not study. Hence, there is a gap between what researchers see in GWAS findings and what they are now discovering may be the causal variants of diseases. This gap is often called the “missing heritability.”
Missing Heritability
The term missing heritability refers to the low percentage of information about the overall genetic component and risk of common diseases gleaned from GWAS. Common variants account for only a small proportion of genetic components, and the missing heritability lies in the huge class of rare genetic variants that GWAS do not see. As Dr. Goldstein explains, “Variants that are primary drivers of disease are relatively rare in the human population.”
He offers a reason for the missing heritability while questioning the notion itself: “Variants confer risk of disease, and natural selection acts against them so they don’t become too common. Therefore, there is no reason to think about the issue of so-called missing heritability. We did not interrogate the whole genome; we interrogated the common variants that pass through the filter of natural selection. So it’s clear that in many diseases, but not all, a lot of action will be rare variants not detected in GWAS.”
Autoimmune diseases may be an exception to this thinking. “Some variants that are major risk factors, for example, variants selected in response to infectious agents that have consequences for autoimmune diseases, appear more commonly in the general population,” noted Dr. Goldstein.
Sequencing
One of the approaches to overcome the shortcomings of GWAS may be sequencing, which has been found to be better than GWAS at discovering causal variants in common diseases. GWAS, however, are a good starting point: GWAS can define interesting regions that can be further explored by sequencing.
Sequencing of the whole genome of a specific patient, or of only the coding regions of the genome (the exome), is an effective and often less costly method for detecting the rare variants thought to be the basic components of most common diseases. Whole genome sequencing of individuals reinforces the idea that rare variants are more likely to be functional or causal than common ones.
Next-generation DNA sequencing is another sequencing approach that can identify causal genetic variation in families. This method involves sequencing the entire genome of certain members of a family with multiple occurrences of a disease. Dr. Goldstein described a study of isolated, unexplained genetic conditions in which he studied 12 families whose children have congenital disorders that do not match any known condition. He collected DNA from both unaffected parents and the affected child. By sequencing these trios, he is looking for the mutations that cause the child’s disorder.
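The logic of a trio analysis can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not Dr. Goldstein's pipeline: genotypes are coded as the number of copies of the variant allele (0, 1, or 2), the variant identifiers are invented, and only two inheritance patterns are checked. A real analysis would also filter on sequencing quality and on how rare the variant is in the general population.

```python
def trio_candidates(trio_genotypes):
    """Flag variants that fit two simple models for an affected child
    with two unaffected parents.

    trio_genotypes maps a variant identifier to a (mother, father, child)
    tuple of variant-allele counts, each 0, 1, or 2.
    """
    de_novo = []    # child carries the variant, neither parent does
    recessive = []  # child has two copies, each parent carries one
    for variant_id, (mother, father, child) in trio_genotypes.items():
        if mother == 0 and father == 0 and child > 0:
            de_novo.append(variant_id)
        elif mother == 1 and father == 1 and child == 2:
            recessive.append(variant_id)
    return {"de_novo": de_novo, "recessive": recessive}


# Hypothetical genotype calls for one family (identifiers are made up).
family = {
    "var_A": (0, 0, 1),  # new in the child
    "var_B": (1, 0, 1),  # inherited from the unaffected mother
    "var_C": (1, 1, 2),  # one copy from each carrier parent
    "var_D": (0, 0, 0),  # absent from the whole trio
}
print(trio_candidates(family))
```

Comparing the child against both parents is what makes the trio design useful: most of the child's variants are also present in an unaffected parent and can be set aside, leaving a short list to evaluate.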
Next-generation sequencing can sometimes find genetic mutations in challenging situations, but, Dr. Goldstein said, “we are getting answers only one-fourth to one-third of the time in cases where we expect a clear answer, presumed monogenic diseases. Why do we miss pathogenic variants?” His answer: “The variant is not called, possibly because the genomic region is difficult to sequence with current platforms; or the type of variant hard to call; or the variant is called but not recognized as pathogenic, possibly because of sample size.”
Dr. Goldstein offers a basic complex disease sequencing approach:
- Identify subjects, extreme phenotype or family based;
- Sequence (100+ individuals);
- Align to reference and call variants;
- Compare to hundreds of sequenced controls;
- Follow up genotyping in larger cohorts; and
- Select experiment using thousands of samples.
In experiments on schizophrenia and epilepsy, however, Dr. Goldstein found that this sequencing approach using thousands of samples did not yield good evidence. He called this problem “locus heterogeneity blight” and asked, “What kind of sample size do you need as a function of the genes that carry mutations that influence the disease you are studying?” He noted that, “if locus heterogeneity is high, the sample size must be very high to get significant evidence.”
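A back-of-the-envelope calculation shows why locus heterogeneity drives up sample size. The sketch below is a toy model, not a power calculation from the lecture, and every number in it is an assumption: it supposes that a fixed fraction of cases carry a causal rare variant, that those cases are spread evenly across the genes involved, and that a gene needs some minimum number of carriers among the cases before it stands out.

```python
import math


def cases_needed(num_genes, fraction_of_cases_explained, carriers_per_gene):
    """Cases required so that each gene is expected to have at least
    `carriers_per_gene` carriers, assuming carriers are split evenly
    across `num_genes` genes (a deliberately crude toy model).
    """
    if num_genes < 1 or carriers_per_gene < 1:
        raise ValueError("num_genes and carriers_per_gene must be at least 1")
    if not 0 < fraction_of_cases_explained <= 1:
        raise ValueError("fraction_of_cases_explained must be in (0, 1]")
    # Expected carriers per gene = cases * fraction / num_genes; solve for cases.
    return math.ceil(carriers_per_gene * num_genes / fraction_of_cases_explained)


# Assumed inputs: half of all cases carry a causal rare variant, and a gene
# needs about 10 carriers among the cases to be noticed.
for genes in (1, 10, 100, 1000):
    print(genes, "genes:", cases_needed(genes, 0.5, 10), "cases")
```

Under these assumptions the required number of cases grows in direct proportion to the number of genes involved, which is the pattern Dr. Goldstein describes.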
Looking to the future, Dr. Goldstein listed these steps for sequencing:
- Large sample sizes;
- Low to medium throughput;
- Functional evaluations; and
- Rephenotyping of patients after discovery of variants.
He also advocates establishing methods to evaluate biological functions of the variants implicated.
Drug Response
Dr. Goldstein offered several reminders that improving patient care remains a goal of genetic study. Clinical trials data represent a critical opportunity to understand variable responses to treatment, he noted, and larger trials with quantitative measures of response, which are often available in rheumatology, provide important opportunities. Looking to the future, as risk factors for common diseases are identified, trial populations can and should be stratified.
Ann Kepler is a medical journalist based in Chicago.