Cuil, the new search engine, launched with much fanfare this past week. It’s been blogged about all over the place already, so I’m not going to analyze how its results compare to Google’s. I’m more curious about its privacy policy, which trumpets that it collects NOTHING, nada, zip, zilch.
I found it sort of funny that the other big news in search engines recently was Google’s announcement that it was launching an updated version of Google Trends called Google Insights for Search. While one search engine bragged about its lack of data collection, the other was showing it off.
The two news items together highlight the problem at the heart of our ongoing search for more privacy online. Despite all the handwringing over online data collection, especially by big search engines, people love seeing the data that gets collected, even when they’re not advertisers. We want to see how often we’re mentioned on Twitter, or what parts of the world are searching for topics we blog about. It’s not hard to imagine more serious research and analysis being applied to this data and real social good coming out of it.
I’ve never found the National Rifle Association’s argument, “Guns don’t kill people; people kill people,” very compelling. But I find myself wanting to say something similar about data collection: “Data collection doesn’t violate privacy; irresponsible people and laws violate privacy.” Shutting down data collection altogether can’t be the answer.
My contracts professor from law school, Ian Ayres, suggests in his book Super Crunchers that the IRS become a source of useful information for ordinary people. The agency could tell taxpayers how much others in their income bracket, on average, are donating to charity or contributing to their IRAs, or tell small businesses whether they might be spending too much money on advertising.
The idea isn’t so far-fetched. About two months ago, the Italian government caused an uproar when it published online the tax details of every single Italian taxpayer. Allegedly meant to fight tax evasion, the move by the outgoing government sounded more like it was motivated by political spite. The most fascinating thing for me, though, was reading various comments in the blogosphere and finding out Norway, Sweden, and Finland do this every year! Apparently, the tax documents are considered official and therefore public records. According to the Swedish government, it’s in keeping with a general principle of government transparency: “To encourage the free exchange of opinion and availability of comprehensive information, every Swedish citizen shall be entitled to have free access to official documents.” And no one really minds.
Of course, this would be inconceivable in the U.S.; there’s a law against it. But as Ian Ayres suggests, the idea that the government should be giving information back to us, instead of just collecting it from us, isn’t totally crazy and Scandinavian. Tax data could be released in anonymized aggregates or in other ways that wouldn’t reveal how much our neighbor makes. The information could be genuinely useful, not just titillating.
There could even be implications for public policy. So much of government policy is expressed in the Internal Revenue Code (such as favoring homeownership over renting), but our debates about tax cuts, mortgage deductions, and credits are based on fairly imprecise numbers. Even as we argue about what a tax cut will do to the “middle class,” we don’t even know what the “middle class” is. Where should government transparency start, if not at the point of revenue collection?
A friend sent me this video recently. Created by the ACLU for its campaign against the National ID program, it’s a mash-up of all our worst surveillance fears. It starts with a guy calling his local pizzeria for a couple of double meat pizzas, while you see the computer screen the girl at the pizza place is looking at as she rings up his order. She surprises him first by knowing his name, his home address, and his place of work from the moment his call comes in, but it gets rapidly worse: first a $20 health surcharge for meat pizza because of his high cholesterol and blood pressure, then her snide comments about his waist size and his ability to pay for the pizzas, based on what she knows of his purchase history, including airplane tickets to Hawaii.
It’s entertaining, but also frustrating for a couple of reasons. First, there are very good reasons for me to be concerned about private companies’ data collection and their potential for collusion in U.S. government surveillance, but this video doesn’t explain how the National ID program would lead to the pizzeria having my health records. By focusing only on the sensational horror of the pizza girl knowing the customer bought a bunch of condoms, it forgets to tell us the pizzeria might literally be giving their customers’ names, phone numbers, and addresses to government officials. (The ACLU does have this report providing a more detailed argument about the dangers of private-public surveillance, but there was no direct link to it from the pizza video.)
Second, in terms of data collection and its dangers in general, the video ends up feeling sort of hysterical. It obscures, rather than clarifies, what’s really at stake.
We do live in a world where data collection is happening on an unprecedented level. But for me, what’s scary is not the mere possibility that all this data could get linked together. It’s about control. Do I get to decide who has my information? Do I get to control how it’s disseminated and analyzed?
Right now, we definitely don’t, and that’s a problem. But the solution may not be to stop data collection altogether and segregate all the information out there so no linkage can happen ever.
I might not want the pizza girl at my local pizzeria to know about my health problems, but I might not mind if, as I ordered food online, the program allowed me to review my choices and build a more nutritious meal specific to my needs, without disclosing my specific preferences to each restaurant. I might not want the government to be able to access my purchase history, but I might want to be able to securely track and access my purchases and my financial accounts at the same time so I can better determine how well I’m meeting my budget. I might even want to share certain information, securely and anonymously, if I thought it would lead to beneficial research by scientists, economists, and policymakers.
Of course, I wouldn’t sign up for anything if I thought my personal information could get leaked to the government or anyone else without my consent. It would make for a somewhat less dramatic video, but this is what the Common Datatrust Foundation is interested in addressing: how can we turn our capacity for data collection and sharing into something that is a public good, rather than something to fear?
Even though I thought this New York Times article on data collection wasn’t very informative, a New York legislator was sufficiently moved to propose this legislation.
Michael Zimmer has some interesting comments about the proposed bill and some of its weaknesses, as well as a strength:
“The bill is strongest, however, in relation to a demand I have long made on Web search providers: let me see the data you have collected about my actions. The bill states:
17. Business entities shall provide consumers with reasonable access to personally identifiable information and other information that is associated with personally identifiable information retained by the third party entity for online preference marketing uses.
The press seems to have missed the importance of this section. If passed, the law would require Google, Facebook, DoubleClick, etc to provide me access to the personally identifiable information ‘and other information that is associated’ with my user account stored in their databases.
This is a vital right for consumers to be able to protect their data privacy: having access to view your data is the first step towards regaining some control over the collection of the data in the first place.”
The challenge, and it’s a worthy one, is how to provide this information to us in a way that makes it useful and relevant. I’d like to see a law providing access to my own data that is more meaningful to me than HIPAA feels when I’m faced with a sheaf of waivers to sign at my doctor’s office.
Recently, I was having dinner with some friends when the topic of Property Shark came up. My friends, being homeowners, were disturbed that someone could simply go online, type in their address, and find out who the owners were and precisely what they had paid for the property. One friend exclaimed, “I don’t want people to know how much money I have!” When I pointed out that the information was public record, and that before Property Shark, anyone could have gone down to City Hall and found the same information, he didn’t care. It still bothered him.
For all our talk of “privacy,” of how it’s being violated all over the place, of how it’s already lost, it’s not even clear what we mean when we say “privacy.” We, as a society, might have agreed that it is good public policy for real estate records to be public so that potential buyers can make sure sellers actually own the property they’re selling. Capitalism can’t thrive if you can’t be sure you own what you own. But when we theoretically made this agreement, we certainly didn’t imagine a world where “public” means available to anyone, anywhere, at any time. Professor Helen Nissenbaum, who recently presented at the DIMACS Data Privacy Workshop, has proposed that we think about “contextual integrity” rather than “privacy.” She argues that it’s more useful to consider what’s appropriate in each context rather than assuming there is a blanket “privacy” standard applicable to all situations.
That makes sense to me. My friend wasn’t arguing that the information shouldn’t be public record. Rather, he wasn’t comfortable with that information being accessed so easily online.
Personally, I don’t find Property Shark so problematic in the universe of privacy breaches, but it’s certainly helpful as the Common Datatrust Foundation works on privacy problems to remember that “privacy” doesn’t have a singular meaning. One of CDTF’s goals for this year is to create some privacy standards for companies and other data collectors that acknowledge that information flow can’t just have an on/off, public/private spigot. It’s obvious that our world and our needs are more complex than that. After all, sometimes it’s hard to know even what we want when we clamor for more privacy. Even my friend, when pressed, admitted that the next time he was looking to buy a house, the first thing he would do is go to Property Shark.