Computing Community Consortium Blog

The goal of the Computing Community Consortium (CCC) is to catalyze the computing research community to debate longer range, more audacious research challenges; to build consensus around research visions; to evolve the most promising visions toward clearly defined initiatives; and to work with the funding organizations to move challenges and visions toward funding initiatives. The purpose of this blog is to provide a more immediate, online mechanism for dissemination of visioning concepts and community discussion/debate about them.

Mining Health Data, 140 Characters at a Time

July 17th, 2011 / in Research News / by Erwin Gianchandani

Imagine you’re at the CDC, trying to predict and respond to this year’s flu season in real time. You could either contact millions of Americans or let them contact you via Twitter. In an exciting paper titled “A Model for Mining Public Health Topics from Twitter,” posted this week, Johns Hopkins University Assistant Professor Mark Dredze and second-year graduate student Michael Paul demonstrate an affordable way to gather real-time data about our health issues.

Not only did the pair’s data support similar efforts, such as Google Flu Trends, but it also generated new knowledge, according to the BBC:

It provided an insight into how Twitter users viewed a range of illnesses and how they went about treating themselves.


It also showed that many chose the wrong drugs to tackle common ailments.


“Tweets showed us that some serious medical misperceptions exist out there,” said Paul…


“We found that some people tweeted that they were taking antibiotics for the flu,” he said. “But antibiotics don’t work on the flu, which is a virus, and this practice could contribute to the growing antibiotic resistance problems.”

Indeed, incorrect patient actions are a major source of concern in healthcare, and mining a wealth of data from a variety of sources can provide useful insights and suggest new forms of intervention.

At the same time, the kind of research Dredze and Paul are reporting relies on voluntarily shared data, which is inherently limited to what users themselves post.

“We could only learn what people were willing to share and we think there’s a limit to what people are willing to share on Twitter,” said Paul.

Read the original research article or the follow-up story by the BBC to learn more.

(Contributed by Max Cho, Eben Tisdale Fellow, CRA)
