Twitter is more than a way to tell the world what flavor bagel you just ate or how long your layover is at O’Hare. The microblogging service also has the potential to track an influenza outbreak, or an emerging biological warfare attack, in a faster, less costly way than traditional methods of disease surveillance, according to a computer science expert at Southeastern Louisiana University.
This space has previously reported on Twitter’s use as an aid to emergency communication. More than 40 million Americans use Twitter and other social media web services more than once a day.
Currently, a process known as syndromic surveillance is used to collect health-related data and alert public health officials to a probable outbreak of a contagious disease, typically influenza. The technique involves collecting data from hospitals, clinics and other sources.
It’s a labor-intensive, time-consuming approach, from which the Centers for Disease Control and Prevention produces weekly estimates. Moreover, these estimates typically lag a week or more behind actual events.
“By monitoring a social network such as Twitter, researchers can capture comments from people with the flu who are sending out status messages,” said Southeastern Louisiana computer science and industrial technology professor Aron Culotta. He said that because Twitter monitoring is done in real time, it can detect outbreaks sooner than the traditional approach of polling hospitals, which typically has a lag time of two to four weeks.
To get started, Culotta and his team of student assistants analyzed more than 500 million tweets over the eight-month period of August 2009 to May 2010, collected using Twitter’s application programming interface. By using a few keywords to track rates of influenza-related messages, the team obtained a 95 percent correlation with national health statistics, enabling them to accurately forecast future influenza rates.
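The keyword approach described above can be illustrated with a minimal Python sketch. It is not the team’s software: the keyword list, the function names and the whole-word matching rule are assumptions made for illustration. It computes the share of messages that mention a flu-related keyword, then measures how closely a series of such shares tracks a series of official figures using the Pearson correlation coefficient.

```python
import re
from math import sqrt

# Hypothetical keyword list; the article does not say which keywords were used.
FLU_PATTERN = re.compile(r"\b(flu|influenza|cough|sore throat)\b", re.IGNORECASE)


def flu_message_rate(messages):
    """Return the fraction of messages that mention a flu-related keyword."""
    if not messages:
        return 0.0
    hits = sum(1 for text in messages if FLU_PATTERN.search(text))
    return hits / len(messages)


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

In use, `flu_message_rate` would be applied to each week’s messages, and the resulting series passed to `pearson` alongside the official weekly figures. Matching on whole words keeps “fluent” from counting as “flu”, though a real system would need more careful filtering.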
Culotta said that once the program (as yet unnamed) is running, it’s actually neither time-consuming nor expensive. “It’s entirely automated, because we’re running software that samples each day’s messages, analyzes them and produces an estimate of the current proportion of people with the flu.”
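One simple way to turn a daily keyword rate into an estimate of the proportion of people with the flu is to fit a straight line to past data and apply it to the current day. The sketch below assumes that approach purely for illustration; the article does not describe the model the team uses, and the function names are hypothetical.

```python
def fit_line(keyword_rates, official_rates):
    """Least-squares fit of official_rate = slope * keyword_rate + intercept."""
    n = len(keyword_rates)
    if n != len(official_rates) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(keyword_rates) / n
    mean_y = sum(official_rates) / n
    sxx = sum((x - mean_x) ** 2 for x in keyword_rates)
    if sxx == 0:
        raise ValueError("keyword rates must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(keyword_rates, official_rates))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_flu_proportion(keyword_rate, slope, intercept):
    """Apply the fitted line to today's keyword rate, clamped to [0, 1]."""
    return min(1.0, max(0.0, slope * keyword_rate + intercept))
```

Once the line has been fitted on historical data, producing each day’s estimate is a single multiplication and addition, which is consistent with the low running cost Culotta describes.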
Initially, Culotta collected statistics for the whole country, but future work will extract information from messages that are more location-specific, which will allow regional reporting. There are also plans for a web site to display real-time results.