Thursday, 13 November 2008

How we help track flu trends

This post is the latest in an ongoing series about how we harness the data we collect to improve our products and services for our users. - Ed.

Google search isn't just about looking up football scores from last weekend or finding a great hotel for your next vacation. It can also be used for the public good. Yesterday, we announced Google Flu Trends, which uses aggregated search data in an effort to confront the challenge of influenza outbreaks.

By taking Google Trends (where you can see snapshots of what's on the public's collective mind) and applying the tool to a public health problem, our engineers found a correlation between how often people search for flu-related terms and actual flu activity. They created a model that produces near real-time estimates of outbreaks, in the hope that both health care professionals and the general public would use this tool to better prepare for flu season.
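As a rough illustration of that idea, and not the actual Flu Trends model, the sketch below measures how closely a weekly series of flu-related query shares tracks a series of reported flu activity, then fits a straight line so that a new week's query share yields an estimate. The function names, the linear form of the fit, and every number are assumptions made up for this example; none of them are real Google or CDC data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def fit_line(xs, ys):
    """Least-squares slope and intercept for ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Made-up weekly values, for illustration only (assumed, not real data).
query_share = [0.8, 1.1, 1.9, 3.2, 2.6, 1.4]   # flu-related queries per 1,000 searches
reported_flu = [1.0, 1.3, 2.2, 3.9, 3.0, 1.6]  # % of doctor visits for flu-like illness

r = pearson(query_share, reported_flu)
slope, intercept = fit_line(query_share, reported_flu)

# Estimate the current week's flu activity from its query share alone.
estimate = slope * 2.0 + intercept
print(f"correlation={r:.3f}, estimate={estimate:.2f}")
```

A correlation close to 1 on historical weeks is what would justify using the query share as a stand-in for flu reports that arrive later.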

Since we launched yesterday, the response from the medical community has been positive. "The earlier the warning, the earlier prevention and control measures can be put in place," Dr. Lyn Finelli of the influenza division at the Centers for Disease Control and Prevention told The New York Times. "[T]his could prevent cases of influenza." You can check out the tool for yourself.

We couldn't have built this flu detection system without analyzing historical patterns. Because flu season is different every year, just a few months of data wouldn't have done the trick. For example, the 2003-2004 flu season was unusually severe in many regions. The data from that season was especially robust and allowed us to discover a more accurate, reliable set of flu-related terms. To learn more about how we built the system, see this page on how Flu Trends works.

Because we're committed to protecting your privacy, we made sure that the searches we analyze for Google Flu Trends are not drawn from personally identifiable search histories but rather from an aggregated set of hundreds of billions of searches.

To provide a rough geographic breakdown of potential flu outbreaks, we use IP address information from our server logs to make a best guess about where queries originate. To protect your privacy, we anonymize those IP addresses after nine months. And we don't provide this aggregated, anonymized data to third parties. For more information about the privacy protections for Flu Trends, check out our FAQs and privacy policy.
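For readers curious what this kind of processing can look like, here is a minimal sketch under stated assumptions. The post does not describe the actual mechanism, so the region lookup table, the function names, and the choice to anonymize by zeroing the last octet of an IPv4 address are all hypothetical. The addresses come from ranges reserved for documentation.

```python
import ipaddress
from collections import Counter

# Hypothetical lookup table mapping network prefixes to regions (assumed).
# The prefixes are documentation-only ranges, not real user networks.
REGION_BY_NETWORK = {
    ipaddress.ip_network("192.0.2.0/24"): "Region A",
    ipaddress.ip_network("198.51.100.0/24"): "Region B",
}

def guess_region(ip: str) -> str:
    """Return a best-guess region for an IP address, or 'unknown'."""
    address = ipaddress.ip_address(ip)
    for network, region in REGION_BY_NETWORK.items():
        if address in network:
            return region
    return "unknown"

def anonymize_ipv4(ip: str) -> str:
    """Zero the last octet so the address no longer points at a single host.

    This is one possible anonymization scheme, assumed for illustration.
    """
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)

def count_by_region(log_entries):
    """Count queries per region, keeping no IP address in the result."""
    return Counter(guess_region(ip) for ip, _query in log_entries)

# Made-up log entries: (IP address, query text).
log_entries = [
    ("192.0.2.17", "flu symptoms"),
    ("192.0.2.200", "fever remedies"),
    ("198.51.100.5", "flu symptoms"),
]

counts = count_by_region(log_entries)
print(counts)
print(anonymize_ipv4("192.0.2.17"))
```

The point of the design is that only per-region totals leave the counting step, so the output says how many queries came from an area without saying who sent them.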

This is just the first launch in what we hope will be several public service applications of Google Trends. And as we continue to look for helpful ways to use aggregated and anonymized search data, we remain committed to safeguarding our users' privacy.