Repetitive DNA provides a hidden layer of functional information

In the first study to run a genome-wide analysis of Short Tandem Repeats (STRs) in gene expression, a large team of computational geneticists led by investigators from Columbia Engineering and the New York Genome Center has shown that STRs, previously thought to be neutral, or ‘junk,’ actually play an important role in regulating gene expression.

“Our work expands the repertoire of functional genetic elements,” says the study’s leader Yaniv Erlich, who is an assistant professor of computer science at Columbia Engineering, a member of Columbia’s Data Science Institute, and a core member of the New York Genome Center. “We expect our findings will lead to a better understanding of disease mechanisms and perhaps eventually help to identify new drug targets.”

Genomic variants are what make one person’s DNA different from another’s, and they come,
Erlich explains, “like spelling errors in different flavours.” The most common
variants are SNPs (single nucleotide polymorphisms). Computational geneticists
have focused mostly on SNPs, which look like a single-letter typo (mother vs.
muther), and on their effect on complex human traits.

Erlich’s study looked at Short Tandem Repeats (STRs), variants that look like a
stuttering typo: stutter vs. stututututututter. Most researchers, assuming that
STRs were neutral, dismissed them as unimportant. In addition, these variants
are extremely hard to study. “They look so different to analysis algorithms,” Erlich notes, “that they just usually classify them as noise and skip these positions.”

Erlich used a multitude of statistical genetic and integrative genomics analyses to
reveal that STRs have a function: they act like springs or knobs that can expand
and contract, and fine-tune the nearby gene expression. Different lengths
correspond to different tensions of the spring and can control gene expression and disease traits. He is calling these variants eSTRs, or expression STRs, to note that they regulate gene expression. He and his team also discovered that these eSTRs can be associated with a range of conditions, including Crohn’s disease and high blood pressure, and with a range of metabolites. These eSTRs explain on average 10 to 15% of the genetic differences in gene expression between individuals.

“We’ve known that STRs are known to play a role in these diseases, but no one has ever conducted a genome-wide scan to find their effect on complex traits,” Erlich adds. “If we want to do personalized medicine, we really need to understand every part of the genome, including repeat elements. There’s a lot of exciting biology ahead.”

New York Genome Center