Computing Community Consortium Blog

The goal of the Computing Community Consortium (CCC) is to catalyze the computing research community to debate longer range, more audacious research challenges; to build consensus around research visions; to evolve the most promising visions toward clearly defined initiatives; and to work with the funding organizations to move challenges and visions toward funding initiatives. The purpose of this blog is to provide a more immediate, online mechanism for dissemination of visioning concepts and community discussion/debate about them.

At the Intersection of Big Data and Healthcare:
What 7.2 Million Medical Records Can Tell Us

August 23rd, 2012 / in big science, research horizons, Research News / by Kenneth Hines

Health InfoScape by MIT and GE [image courtesy GE Data Visualization].

We’ve featured lots of stories about Big Data over the last several months, but here’s a fascinating new one that illustrates the value of Big Data analytics in addressing important national priorities. Researchers at SENSEable City Lab, a new research initiative of the Massachusetts Institute of Technology, together with colleagues at GE Healthymagination have analyzed data from over 7 million electronic medical records, illustrating in a powerful visual the (sometimes surprising) relationships between medical conditions on the basis of the frequency of co-occurrences. They’re calling this extensive disease network the “Health InfoScape.”

When you have heartburn, do you also feel nauseous? Or if you’re experiencing insomnia, do you tend to put on a few pounds, or more? By combing through 7.2 million of our electronic medical records, we have created a disease network to help illustrate relationships between various conditions and how common those connections are…


We often have a tendency to think of illness as an isolated event, but our first analysis details the numerous (sometimes unexpected) associations that exist around any given condition. This gives us new insight as to how closely connected some seemingly un-related health conditions might be. Such results force us to re-examine conventional categories of disease classification, as the boundaries between traditional disease categories are thoroughly blurred.


Our initial results are a mix of the expected and the unexpected — simultaneously challenging and reaffirming our preconceptions of health pattern, within individuals and across the U.S.
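For readers curious how a network "on the basis of the frequency of co-occurrences" might be built, here is a minimal sketch in Python. It is not the MIT/GE method, which the post does not describe in detail; the toy records, the condition names, and the function name `cooccurrence_counts` are all assumptions made for illustration. Each record is treated as the set of conditions listed for one patient, and every pair of conditions that appears together adds one to the weight of the edge between them.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(records):
    """Count how often each unordered pair of conditions shares a record."""
    pair_counts = Counter()
    for conditions in records:
        # Deduplicate and sort so (a, b) and (b, a) map to the same edge.
        for pair in combinations(sorted(set(conditions)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical toy data, not drawn from the Health InfoScape records.
records = [
    {"heartburn", "nausea"},
    {"heartburn", "nausea", "insomnia"},
    {"insomnia", "weight gain"},
    {"heartburn"},
]

edges = cooccurrence_counts(records)

# List edges from most to least frequent co-occurrence.
for (a, b), n in edges.most_common():
    print(f"{a} -- {b}: {n}")
```

In a sketch like this, the pair counts serve as edge weights in the disease network; a real analysis would likely also normalize for how common each condition is on its own, which is omitted here.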

A view of the Health InfoScape website from MIT and GE [image courtesy GE Data Visualization].

Visit the Health InfoScape website to “take a look by condition or condition category and gender to uncover interesting associations.”

And there’s yet more opportunity for researchers here:

Now that we have a succinct picture of the human health network in the country, we will continue our investigation by delving deeper into how the environments around us factor into these results.

The Health InfoScape is a perfect example of the role that Big Data science and engineering can play in the future.

(Contributed by Kenneth Hines, CCC Program Associate)
