1) NPR ran an interesting story last week about a new study that used cellphone data to track people’s movements. The researchers were able to predict the cellphone tower nearest to a person 93% of the time and the person’s actual location 80% of the time. The potential value to public policy is significant. It could affect how we put money into public transportation, for example.
Notably, though, no one mentioned any concerns about privacy, apart from a short statement that the researchers don’t have names or numbers. That seems like a perfect example of how stripping names and numbers does not sufficiently de-identify data, especially when the study’s conclusion is that you can predict where people are. Another researcher claims that he has data for half a million people and that “major carriers around the world are now starting to share data with scientists.” What if we end up with another AOL scandal on our hands, and worse, the scandal keeps this kind of research from continuing?
2) The Open Knowledge Foundation has launched a set of principles for open data in science, in support of the idea that scientific data should be “freely available on the public internet permitting any user to download, copy, analyse, re-process, pass them to software or use them for any other purpose without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. To this end data related to published science should be explicitly placed in the public domain.”
We certainly support more data being openly and freely available, but we have questions. How will we deal with the rights of people who participate in scientific studies? I’m not a scientist, so I have to ask: do most agreements to participate in studies anticipate this level of public availability? And how can we standardize data so that it is more easily comparable?
3) It’s not enough to have data. We also need tools to visualize, analyze, and understand it, and more and more tools are available for just that purpose. Here are a few: a long list of mapping tools from the Sunlight Foundation; ClearMaps from Sunlight Labs; and Pivot, a new way to combine large groups of similar items on the internet, from Microsoft Live Labs.