Crowdsourcing data?

Thursday, September 10th, 2009

Sometimes, news just seems to coalesce around one topic.

A few weeks ago, the New York Times had a thoughtful piece on patients sharing their data online to push for more efficient research. Dr. Amy Farber, after being diagnosed with a rare but deadly disease called LAM, founded the LAM Treatment Alliance and LAMsight, “a Web site that allows patients to report information about their health, then turns those reports into databases that can be mined for observations about the disease.”

In a completely different arena, we also had news that Google Maps is using GPS information from mobile phones to improve traffic data. Google had relied on local highway authorities for traffic data on major highways, but now GPS data from Google Maps users with the My Location feature will cover local roads as well.

Pretty exciting stuff. Crowdsourcing isn’t new. But thus far, it’s been used mostly for things that are subjective. Like Hot or Not. Customer reviews. It’s also been primarily voluntary. You choose to write a review and share your data. Or if it’s involuntary, it’s not something that is accessible to the public (e.g., search results, credit card data, mortgage data).

What’s exciting now is that we’re starting to get into discussions about crowdsourcing for stuff like

  1. Medical research – where people are trying to extract objective, conclusive results from data.
  2. Traffic data – where data is automatically collected (opt-in/opt-out, whatever) and made available to the public.

The two most common objections are around the supposed inaccuracy of self-reported data and the privacy risks of providing so much individualized information.

But as Ian Eslick, the MIT doctoral student developing LAMsight, points out,

No one expects that observational research using online patient data will replace experimental controlled trials…“There’s an idea that data collected from a clinic is good and data collected from patients is bad,” he said. “Different data is effective at different purposes, and different data can lead to different kinds of error.”

And as the people behind Google Maps explain, they worked hard to increase accuracy by making participation as easy as possible.

The issue of privacy is a little trickier. Google says you can easily opt out of contributing your data, and it promises that even those who do contribute can trust that their data will remain anonymous: “Even though the vehicle carrying a phone is anonymous, we don’t want anybody to be able to find out where that anonymous vehicle came from or where it went — so we find the start and end points of every trip and permanently delete that data so that even Google ceases to have access to it.”
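To make that idea a bit more concrete, here is a rough Python sketch of what "deleting the start and end of every trip" could look like. This is not Google's actual implementation, which isn't public as far as I know. The `GpsPoint` type, the `trim_trip_endpoints` function, and the two-minute window are all assumptions of mine for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GpsPoint:
    """One hypothetical GPS reading from a phone."""
    timestamp: float  # seconds since the trip began
    lat: float
    lon: float


def trim_trip_endpoints(points: List[GpsPoint],
                        trim_seconds: float = 120.0) -> List[GpsPoint]:
    """Keep only readings at least trim_seconds away from both the
    first and the last reading of the trip.

    trim_seconds is an assumed, illustrative window; a trip shorter
    than twice that window is dropped entirely.
    """
    if not points:
        return []
    ordered = sorted(points, key=lambda p: p.timestamp)
    first = ordered[0].timestamp
    last = ordered[-1].timestamp
    return [
        p for p in ordered
        if first + trim_seconds <= p.timestamp <= last - trim_seconds
    ]


# A made-up ten-minute trip with one reading per minute.
trip = [GpsPoint(t * 60.0, 40.0 + t * 0.001, -74.0) for t in range(11)]
kept = trim_trip_endpoints(trip)
```

The middle of the trip still says something useful about traffic on the roads in between, while the readings closest to where the trip started and ended are gone. Whether a fixed time window is really enough to hide someone's home or office is exactly the kind of question skeptics are right to ask.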

There are certain to be some people who won’t feel comfortable with Google’s promises. Yet I doubt they will have much impact on Google’s ability to deliver this service. The bigger issue for me is how privacy concerns may be holding back smaller, less established players from developing potentially valuable services based on crowdsourced data collection.

In other words, is our current ad hoc and unsatisfactory approach to privacy inadvertently stifling competition by making it nearly impossible for startups to compete with the establishment wherever sensitive personal information is involved?

What data would you like to gain access to that might face similar privacy challenges?

