Google's bad case of the flu
Mixing data from social media with harder numbers for analytics can be useful, but it has its limits. Case in point: Google's flu-tracking system dramatically overestimated the number of people in the United States with influenza at the peak of this year's season, reports Declan Butler at Nature. The most likely reason: news coverage that warped the usual social-media patterns that Google Flu Trends depends on.
Flu cases peaked just after Christmas, according to the Centers for Disease Control and Prevention, which tracks reports of flu-like symptoms from 2,700 healthcare facilities. But Google's estimate, which is based on flu-related search terms, was nearly twice the CDC estimate at the peak. For the rest of the flu season, estimates from Google (NASDAQ: GOOG) closely matched those of the CDC.
The Christmas glitch was probably caused by publicity, say some researchers. News reports about the flu as the outbreak reached its peak may have prompted more people to run flu-related Google searches. The fact that the peak came during the holidays may also mean people had more time to run those searches. But Google's algorithms had no way to adjust.
This isn't the first time social-media data have suddenly gone off the charts, even for Google Flu Trends--it was also flummoxed in 2009 by the sudden surge of publicity from that year's swine flu outbreak.
Still, the episode is a useful reminder as IT departments work to mix social-media data from Google, Twitter and other sources with harder conventional sales and marketing data. The social-media world is still an echo chamber, and tracking it for business use depends on filtering out much of the noise. A sudden burst of publicity can completely distort what appears to be going on in that social world--and filtering out that distortion automatically may not be possible.
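One modest safeguard is to compare a social-media-derived estimate against a harder reference series and flag the periods where the two diverge sharply, so a person can review them. The Python sketch below illustrates that idea only; it is not how Google Flu Trends or the CDC works. The function name `flag_divergent_weeks`, the `max_ratio` threshold of 1.5 and the sample numbers are all assumptions made up for illustration.

```python
from typing import List, Sequence


def flag_divergent_weeks(
    proxy: Sequence[float],
    reference: Sequence[float],
    max_ratio: float = 1.5,  # assumed threshold, chosen only for illustration
) -> List[int]:
    """Return the indices of weeks where the proxy estimate differs from
    the reference by more than a factor of max_ratio in either direction."""
    if len(proxy) != len(reference):
        raise ValueError("proxy and reference must cover the same weeks")

    flagged = []
    for week, (p, r) in enumerate(zip(proxy, reference)):
        if r <= 0:
            # No meaningful ratio without a positive reference value.
            continue
        ratio = p / r
        if ratio > max_ratio or ratio < 1 / max_ratio:
            flagged.append(week)
    return flagged


# Made-up weekly figures, not real Google or CDC data.
search_based = [2.0, 3.1, 10.4, 4.2]
surveillance = [2.1, 3.0, 5.6, 4.0]

print(flag_divergent_weeks(search_based, surveillance))  # prints [2]
```

A check like this can only say that the two series disagree, not why. It also depends on having the harder numbers in hand, so it works as an after-the-fact sanity check and does not filter out the distortion itself.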