Information Overload
Modern medicine is generating new information at a dizzying
pace, but Children’s researchers are developing the tools and
techniques to make sense of the flood of data.
By Cyril Manning
Picture this: a vast checkerboard, stretching in every direction
to the horizons of your imagination. There are roughly 30,000
spaces in this grid – one for each gene in the human
genome. Moment by moment,
a scattered, apparently random sample of these spaces begins to
glow, each fluorescent beacon burning at a specific intensity.
It’s a secret code of sorts, genetic marching orders that
tell living cells how to behave.
For years, researchers knew this code existed, but finally it
has been cracked, at least in a sense. Today, a scientist can
buy a small manufactured chip, called a microarray, containing
this grid of genetic pieces. By physically extracting RNA (the
chemical messenger that carries genetic information) from
the nucleus of a cell, dyeing it with a fluorescent compound,
and dropping it onto the chip, the scientist can see exactly what
signals are being sent. That’s because the fluorescent RNA
binds with certain pieces in the grid, lighting them up like microscopic
beacons.
Since the mapping of the human genome, the problem
for scientists studying genetic disease is no longer seeing these
genetic messages, but making sense of them. Recall that immense
checkerboard: 30,000 spaces, some of them glowing, each at its
own intensity. This is only a snapshot of which genes are being
expressed (that is, which ones deliver their encoded messages)
at one moment in time in one individual. Now picture thousands
of these grids stacked one upon another, snapshots of gene expression
over time; then overlay them one on top of the next, one for each
individual in a population. There are patterns here; the challenge
is deciphering them.
“When we look at how these genes might be interacting
with each other to cause disease, there are billions of possible
combinations,” says Isaac Kohane, MD, PhD, director of the
Children’s Hospital Informatics Program (CHIP). “It’s
like the ultimate Rubik’s Cube.” But it’s a
puzzle that Kohane and his CHIP colleagues plan to solve. The program
applies a wide range of tools, computer power and analytic methods
to problems across medicine. The CHIP researchers
are sifting through vast amounts of health-related data, identifying
previously indiscernible patterns in complex systems, and turning
those patterns into valuable insights – not just in genetics,
but also in basic biology, diagnosis and treatment of disease,
and management of public health issues.
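
To give a rough sense of the scale Kohane describes, the short Python sketch below counts how many distinct groups of genes could in principle interact. It assumes the figure of roughly 30,000 genes quoted above and simply counts unordered pairs and triples; the function name is illustrative, and nothing here reflects how CHIP actually models gene interactions.

```python
from math import comb

GENES = 30_000  # approximate gene count quoted in the article


def interaction_counts(n_genes: int, max_group: int = 3) -> dict[int, int]:
    """Count the distinct unordered gene groups of each size from 2 to max_group."""
    return {size: comb(n_genes, size) for size in range(2, max_group + 1)}


counts = interaction_counts(GENES)
for size, total in counts.items():
    print(f"groups of {size}: {total:,}")
```

Pairs alone number about 450 million, and triples about 4.5 trillion, so checking every combination by hand is out of the question.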
This is no small ambition. It requires investigators with expertise
in two or more very different fields. “Math and computer
science do not come naturally to most biologists,” says
Kohane, himself an endocrinologist and computer scientist. “Many
CHIP researchers are dually or triply trained in medicine and
mathematics or computer science, and that puts us at the center
of some very exciting work.”
Mathematics, not microscopes
One example is the research of Alvin Kho, PhD, which focuses on
using computer analysis to understand the flawed genetic instructions
that can lead to pediatric cancer – specifically, the brain
tumor known as medulloblastoma. A mathematician by training, Kho
analyzes these genetic signals in ways that cannot be matched
by a traditional biologist looking through a microscope. As Kohane
puts it, “There’s a whole generation of biologists
who can’t do state-of-the-art work because the computational
aspect of it is out of their reach.”
For over 100 years, biologists have speculated that there is
a close correlation between human development and the process
of cancer growth, known as tumorigenesis. Although it remains unproven,
the idea is that tumorigenesis occurs when normal development
goes awry and cells keep on multiplying. That connection is difficult
to test, however, because gene expression can’t be measured
in the brains of living humans. Kho is working to show that the
two processes are indeed parallel by comparing genes that appear
to be related to human brain cancer with genes in mice
that appear to be related to brain development.
Plugging the human and mouse data into his computer, Kho can
tap a few keys and generate a three-dimensional cube that shows
each point where a human or mouse gene turns on, with each
location corresponding to the signal the gene sends and when it
is sent.
The analysis is complex, but the basic trend is easy to see:
the genetic “beacons” seen at the earliest stage of
brain development among mice cluster in roughly the same space
as the “beacons” seen in tumorigenesis – suggesting
that similar genetic instructions are involved in both processes.
Just as important, the gene expression seen in late stages of
brain development is largely missing in tumor development, giving
more credibility to the idea that normal development includes
an “off” switch that is missing in cancer. There is
much more investigation to be done, but eventually Kho’s
findings could bring other cancer researchers closer to treatment
and prevention strategies.
Definitive diagnoses
Cancer is only one of many diseases with at least some genetic
component; there are countless fields in which identifying genetic
links could help clinicians diagnose disease sooner and more accurately,
and even come up with new therapies targeting the specific genes
responsible.
But it’s not as simple as it sounds. Single genetic anomalies
rarely cause disease; instead, most diseases result from multiple
genetic mutations and interactions. And before CHIP scientists
can succeed, they need something they can’t derive from
an equation: an enormous amount of patient data.
“You can’t study the genetics of a disease without
patient data, and getting that data requires a lot of collaboration,”
says Ingrid Holm, MD, an endocrinologist and geneticist at Children’s
Genomics Center, which works closely with CHIP. “More clinicians
are starting to get interested in the genetic side of disease
now,” she says. “Children’s has done a lot of
research into genetic diseases like muscular dystrophy, and is
now starting to get into the study of the genetic factors responsible
for diseases like congenital heart disease, asthma and allergy,
autistic spectrum disorders, and diabetes.”
Holm helps those clinical researchers set up their studies and
figure out what information they need to collect to facilitate
useful genetic analysis. In addition, CHIP has developed numerous
downloadable and Web-based tools to make it easier for researchers
to integrate and interpret genetic information.
One of several clinicians currently working on a large-scale
genomics study in collaboration with CHIP is Leonard Rappaport,
MD, director of Children’s Developmental Medicine Center,
and an expert in autistic spectrum disorders. Rappaport and Kohane
hope to develop a genetic model for diagnosing autistic children,
because, as Kohane puts it, “Even the best behavioral therapists
are increasingly uncertain about their ability to characterize
all the particular subclasses of the disorder.”
Autism is no longer regarded as a single developmental disorder,
but as a spectrum of disorders involving problems with social interaction
and communication. Some autistic children function almost normally,
while others lack a basic understanding of the world external
to themselves.
“There’s strong empirical evidence that different
types of autism may have specific prognoses and require specific
interventions,” explains Rappaport. By associating specific
genetic patterns (genotype) found in patients’ RNA with
different autistic behavior and characteristics (phenotype), the
collaborators aim to gain a new understanding of the disorder.
“This study will help us identify the underlying biology
of autism,” says Rappaport. “And will give us a more
definitive way to diagnose individual children and tailor treatments
specifically to their needs.”
Mapping public health patterns
It will take one to two years for Rappaport to collect enough
patient information for a genomic analysis of autistic spectrum
disorders. But another CHIP project is showing a real-world payoff
right now. Kenneth Mandl, MD, MPH, an emergency medicine physician,
CHIP investigator and research director of Children’s Biopreparedness
Center, is using computer modeling to instantly detect unusual
public health patterns.
In 2001 Mandl developed software that could detect unusual patterns
in emergency room visits at Children’s. The system instantly
compares the symptoms of new patients to a database of more than
500,000 emergency room visits over the past 11 years, and raises
a red flag if it detects unusual activity. For example, if the
system detects more respiratory symptoms than it predicts for
a particular season and day of the week, physicians are alerted
to a possible virus outbreak.
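
The article does not say how Mandl's software decides that activity is unusual. As a minimal sketch of the general idea, the Python example below compares today's count of respiratory visits with past counts for the same season and day of the week, and flags it when it sits well above the historical average. The function name, the three-standard-deviation threshold and the visit counts are all assumptions made up for illustration.

```python
from statistics import mean, stdev


def is_unusual(history: list[int], todays_count: int, threshold: float = 3.0) -> bool:
    """Flag a count that exceeds the historical mean by more than `threshold` standard deviations."""
    if len(history) < 2:
        return False  # too little history to estimate the normal spread
    expected = mean(history)
    spread = stdev(history)
    if spread == 0:
        return todays_count > expected  # history never varied, so any rise is unusual
    return (todays_count - expected) / spread > threshold


# Made-up counts of respiratory visits on past winter Mondays
winter_mondays = [40, 44, 38, 42, 41, 45, 39, 43]

print(is_unusual(winter_mondays, 43))  # a typical day
print(is_unusual(winter_mondays, 60))  # far above the usual range
```

A real surveillance system would also have to account for long-term trends, holidays and reporting delays, which this sketch ignores.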
Now, with funding from state and federal agencies, Mandl is developing
a large-scale project called AEGIS (Automated Epidemiologic Geotemporal
Integrated Surveillance) that will use data from multiple institutions
across geographic regions. The system could allow public health
officials to anticipate the trajectory of an infectious disease
outbreak, identify environmental health problems such as contaminated
groundwater, or catch the earliest signs of a biological weapons
attack. The system can already predict an emergency room’s
next-day volume within seven percent, but Mandl is continuing
to expand the software’s capabilities by teaching it to
interpret new types of information (such as lab results) to make
even more sophisticated predictions.
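
One plausible reading of "within seven percent" is that the predicted number of visits differs from the actual number by no more than seven percent of the actual figure. The Python sketch below checks a forecast under that assumption; the article does not specify the error measure, and the function names and visit counts are hypothetical.

```python
def percent_error(predicted: float, actual: float) -> float:
    """Absolute forecast error expressed as a percentage of the actual value."""
    if actual == 0:
        raise ValueError("actual volume must be non-zero")
    return abs(predicted - actual) * 100 / actual


def within_tolerance(predicted: float, actual: float, tolerance_pct: float = 7.0) -> bool:
    """True when the forecast misses the actual value by at most tolerance_pct percent."""
    return percent_error(predicted, actual) <= tolerance_pct


# Made-up example: 150 visits predicted, 160 patients actually arrived
print(percent_error(150, 160))
print(within_tolerance(150, 160))
```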
“While the type of data this system sifts through is entirely
different from the human genome, we’re using similar tools
to what the geneticists are using,” says Mandl. “That’s
because we’re both looking at how multiple systems interact
and behave across time and space, and we’re both extracting
patterns and clusters from this overwhelming amount of data,”
he says.
At first glance, Mandl’s system may appear to have little
in common with Rappaport’s autism study or Kho’s tumorigenesis
investigations. But ultimately, they all wrestle with the same
problem: modern medicine is generating new information at a pace
far too fast for traditional analysis techniques to keep up with.
The Children’s Hospital Informatics Program is building
the tools to help biologists and physicians clearly see the beacons
that glow and recede in a sea of information, stretching to the
horizons of their imagination. As the computing power of informatics
grows stronger, deciphering the signals those beacons are sending
out should become as simple as identifying blood cells through
a microscope.
To support research in the Children’s Hospital Informatics Program,
contact Donna Richardson in the Children’s Hospital Trust
at (617) 355-2061 or donna.richardson@chtrust.org.