Is anyone surprised that there are more Google searches for “orgy” than “apple pie”? Does this mean “smut peddlars” should be re-characterized as mainstays of mainstream culture? That’s the defense strategy in a trial of a “pornographic Web site operator.”
Now, the parallels the defense is trying to draw are ridiculous. There are 1001 reasons to explain the fact that “orgy” is more popular than “apple pie” and “watermelon” on the internet. For one, “apple pie” is way more specific than “orgy”. “Restaurants” versus “orgy” might be a more interesting comparison.
Still, that would be hard to do right because there are endless variations when it comes to searching for a place to eat. Realistically, how many ways are there to search for “orgy”?
Nevertheless, it is always interesting when raw data about what we do undermines what we think we should do. Does increasing access to such “behavioral” data mean an end to hypocrisy? Or does it mean the erosion of a basic human device that helps us all get out of bed and face the world each day?
My friend sent this to me recently. Created by the ACLU for its campaign against the National ID program, it’s a mash-up of all our worst surveillance fears. It starts with a guy calling his local pizzeria for a couple of double meat pizzas, while you see the computer screen the girl at the pizza place is looking at as she rings up his order. She surprises him first by knowing his name, his home address, and his place of work from the moment his call comes in, but it gets rapidly worse, from a $20 health surcharge for meat pizza because of his high cholesterol and blood pressure to her snide comments about his waist size and his ability to pay for the pizzas, based on what she knows of his purchase history, including airplane tickets to Hawaii.
It’s entertaining, but also frustrating for a couple of reasons. First, there are very good reasons for me to be concerned about private companies’ data collection and their potential for collusion in U.S. government surveillance, but this video doesn’t explain how the National ID program would lead to the pizzeria having my health records. By focusing only on the sensational horror of the pizza girl knowing the customer bought a bunch of condoms, it forgets to tell us the pizzeria might literally be giving their customers’ names, phone numbers, and addresses to government officials. (The ACLU does have this report providing a more detailed argument about the dangers of private-public surveillance, but there was no direct link to it from the pizza video.)
Second, in terms of data collection and its dangers in general, the video ends up feeling sort of hysterical. It obscures, rather than clarifies, what’s really at stake.
We do live in a world where data collection is happening on an unprecedented level. But for me, what’s scary is not the mere possibility that all this data could get linked together. It’s about control. Do I get to decide who has my information? Do I get to control how it’s disseminated and analyzed?
Right now, we definitely don’t, and that’s a problem. But the solution may not be to stop data collection altogether and segregate all the information out there so no linkage can ever happen.
I might not want the pizza girl at my local pizzeria to know about my health problems, but I might not mind if, as I ordered food online, the program allowed me to review my choices and build a more nutritious meal specific to my needs, without disclosing my specific preferences to each restaurant. I might not want the government to be able to access my purchase history, but I might want to be able to securely track and access my purchases and my financial accounts at the same time so I can better determine how well I’m meeting my budget. I might even want to share certain information, securely and anonymously, if I thought it would lead to beneficial research by scientists, economists, and policymakers.
Of course, I wouldn’t sign up for anything if I thought my personal information could get leaked to the government or anyone else without my consent. It would make for a somewhat less dramatic video, but this is what the Common Datatrust Foundation is interested in addressing: how can we turn our capacity for data collection and sharing into a public good, rather than something to fear?
One of the most interesting issues in privacy to me is the gap between those who live and breathe privacy and security day to day and those who don’t. Having gone from the latter group to the former only recently, I know how wide that gap is. Those who care about privacy discuss and analyze various solutions with passion and intensity, while people like my sister dispose of broken laptops by placing them in NYC trash cans. (True story: the laptop was mine, and she was sincerely puzzled when I threw a fit.) All the news coverage of data leaks has led many people to feel a vague sense of dread about their privacy rights, but to understand nothing more than that. So even if interesting solutions are proposed for protecting personal information, the question of who will care enough to adopt them is as important as whether the proposals actually work.
It seems this issue played out in the development of the U-Prove technology, which had been proposed before. It just wasn’t very marketable when it was pitched to individual consumers. One thing Stefan Brands and Credentica did differently was to market it to software developers. That strategy seems to have proven successful, given that Microsoft has now bought the company.
But will Microsoft’s investment in Credentica pay off with users who have only vague concerns about their privacy? (I love the way the Wired article says, “Brands and Thompson tend to refer to the math behind U-Prove as ‘magic’ rather than going too deep into the details.”) Will Microsoft be able to overcome its image as a big bad company and persuade consumers it is really invested in protecting privacy? It’s a difficult problem. Privacy concerns need to be addressed now, before the public cares enough to demand it, but solutions proposed by major companies may not satisfy uneasy consumers.
I’m biased, of course, because we at the Common Datatrust Foundation are working on a different model: that privacy and security should be entrusted to a trusted third party that would administer and monitor exchanges of information between individuals, institutions, agencies, and businesses. But I’d be happy to see progress by Microsoft or any other company or organization in proposing privacy and security systems that truly return control over personal information to individuals without requiring everyone to understand all this privacy stuff.
I’m curious to know what others think. If we believe the privacy of even those who don’t care should be protected, where should the push for change come from?
Unfortunately, the answers aren’t very satisfying.
“It’s what we do. Our corporate mission is to organize the world’s information and make it universally accessible and useful. Health information is very fragmented today, and we think we can help. Google believes the Internet can help users get access to their health information and help people make more empowered and informed health decisions. People already come to Google to search for health information, so we are a natural starting point. In addition, we have a lot of experience storing and managing large amounts of data and developing consumer products that offer a positive and simple user experience.”
I thought their mission, as a corporation, was to maximize profits for their shareholders.
The answer to Question #6 is even worse:
“Much like other Google products we offer, Google Health is free to anyone who uses it. There are no ads in Google Health. Our primary focus is providing a good user experience and meeting our users’ needs.”
But we all know that “other Google products” that are free make money through advertising. And there are “no ads in Google Health”?
In launching Google Health, Google has clearly acknowledged that health information is even more sensitive than the personal information the company has been assiduously collecting up to this point. Although it glosses over the differences between its other applications and Google Health, promising to “conduct our health service with the same privacy, security, and integrity users have come to expect in all our services,” the mere fact that it doesn’t have advertising trumpets that Google is trying to differentiate Google Health from something like Gmail.
But the harder Google tries to assure me that there is no advertising and that the service is free, the harder it is for me to believe there are truly no costs to me. Clearly, there is a real value to providing secure online access to personal health records. Medical records, for the appropriate people, should be accessible, transferable, and plain legible, as anyone who has tried to read a doctor’s handwriting can attest. So why would someone give me something for nothing?
According to the Wall Street Journal, Google is not ruling out advertising in the future, and in the meantime, it hopes Google Health will simply drive more users to Google in general. Perhaps Google itself doesn’t quite know where Google Health will go. But given how easy it is to imagine nightmare scenarios of what can happen with this kind of information, I want the company that’s collecting and storing it to have a better story about why it’s doing this.
“There is no mode of expression, no technique of production that will instantly produce truth or falsehood.”
Data, like film or photography, is a representation of “the real world”. We study it in the hopes of finding enlightenment and understanding. However, there is no “technique of [data representation] that will instantly produce truth or falsehood.” If someone is disillusioned with data, it is often because they expect too much from it. Data can’t “tell us” anything. It can only take something that is hard to grasp and offer up otherwise submerged surface area for examination, inquiry and analysis.
An important part of analyzing data is doubt and questioning, yet most data reported by the mainstream media is doubt-resistant. Some magic number is reported to the public with no real data to pick apart and study. Where do these magic numbers come from?
Responsible reporting of new data findings should probe and challenge the data. Where did the data come from? How reliable are these sources? What data is missing? How might what’s missing change the results? (This is the hardest to pull off because it requires us to imagine what we don’t know.) How is the way the data is presented inadvertently influencing how it will be interpreted? What assumptions will each person bring to their interpretation of this data? Are they valid? Who’s in a position to make that judgment? Without at least asking these questions, eye-catching “magic number” headlines are a disservice to the public, designed to catch eyeballs with false clarity rather than expose the confusing uncertainty of reality.
More often than not, analyzing one data set simply propels you to collect and question more data. Now that I have this data, what other data do I need? Now that I have answers to these questions, what other questions do I now know I need to ask?
This is not to say that data never yields answers. Generally speaking, however, every hard-won answer simply opens the door to 5 more questions you couldn’t have imagined at the outset.
April 22nd, 2008 by The Common Datatrust Foundation
We’re very pleased to announce that Geoffrey Desa has joined the board of the Common Datatrust Foundation. Geoff is currently finishing up his doctoral dissertation on technology social entrepreneurship at the University of Washington, Seattle, and will soon begin teaching and continuing his research at San Francisco State University. His research is on small ventures that develop and deploy technology for a social purpose, recognizing that many forms of technology can be replicated and used by large numbers of people at low cost. Geoff has studied organizations and projects from all over the world, from secure documentation programs for human rights field investigators to improved technology for Kenyan beekeepers. In particular, Geoff is interested in how these innovative organizations are launched, how they access and use resources, and how early decisions impact future work.
We’re thrilled that Geoff will be bringing his expertise on nonprofit organizational structure to the Common Datatrust Foundation as we work on creating a nonprofit that sets new standards for transparency, accountability, and trustworthiness.
Now that ads are ’specially targeted to each one of us individually, how do you feel about the ads that are finding you?
Here’s Microsoft’s take on the situation. Not exactly news, but I just found it here.
Here are some snippets from the dialogue:
Consumer:
You’re not even listening are you?
We don’t talk anymore.
You do all the talking.
It’s not really a dialog.
You say you love me, but you’re not behaving like you love me.
Advertiser:
They said you would love everything I did.
Consumer:
You’re not even listening are you?
If you knew me…
Advertiser:
Know you? Sweetheart, I know everything about you. You’re 28… to 34. Your online interests include music, movies… and laser hair removal. You have a modest, but dependable disposable income.
Consumer:
I’m out of here.
The problem is: How are advertisers going to listen more closely to consumers given the privacy models we have today? How closely do you want them to be listening anyway?
But what if advertising simply became a service that helps you find what you need and discover what you want? Like a personal shopper or interior decorator. Or a financial adviser or psycho-pharmacologist. How far would you be willing to go in this relationship? And would it still be advertising?
I forgot my camera the first time I saw this exhibit on a Friday night, with free admission courtesy of Target, and so the photos below don’t capture the enthusiasm and almost sweaty energy of the intense crowd that filled every corner of the exhibition space that night. These photos are from early Wednesday morning last week, with a considerably thinner crowd, and although they’re not fantastic photos, I hope they show some of the curiosity and engagement I saw on people’s faces.
An example of Mimi’s point: data of flight patterns imposed on a map, immediately conveying information as well as something nice to look at.
A sweet and funny work playing with data from online dating sites, certainly a database of societal concerns, if not as serious as the Architecture and Justice piece on prison populations.
And last, something completely unrelated to data, but probably best at conveying how fun this whole exhibition is.
Ostensibly, the service is aimed at the “general public” uploading videos.
“Insight gives the creators an inside look into the viewing trends of their videos on YouTube, and helps them to increase views and become more popular,” said YouTube Product Manager Tracy Chan.
But of course, such a tool is useful to advertisers as well.
“Partners can evaluate metrics to better serve and understand their audiences, as well as increase ad revenue. And advertisers can study their metrics and successes to tailor their marketing — both on and off the site — and reach the right viewers.”
PLM is a web service that provides treatment data for diseases like Parkinson’s disease, multiple sclerosis, and AIDS, collected from individuals. From the recent NYT Magazine profile:
…PatientsLikeMe seeks to go a mile deeper than health-information sites like WebMD or online support groups like Daily Strength. The members of PatientsLikeMe don’t just share their experiences anecdotally; they quantify them, breaking down their symptoms and treatments into hard data. They note what hurts, where and for how long. They list their drugs and dosages and score how well they alleviate their symptoms. All this gets compiled over time, aggregated and crunched into tidy bar graphs and progress curves by the software behind the site. And it’s all open for comparison and analysis. By telling so much, the members of PatientsLikeMe are creating a rich database of disease treatment and patient experience.
Why is this interesting? Well, instead of establishing a parasitic relationship between the web service and its users, where the service more or less “spies” on its users and then makes money off the data it collects by selling it to advertisers, PatientsLikeMe sets up individuals and web services in a symbiotic relationship where the user has a stake in the data because the user is the one that gets value out of the aggregates. This is more sustainable not only from a PR perspective, but also from a data quality perspective. If you’re trying to understand how your personal treatment profile stacks up against others, the more detailed and accurate your information is, the more you get out of the service, and the more valuable the service is to you and to others.
With YouTube, Google is still playing cat and mouse with its users, hoping they won’t notice or care about the data that’s being collected and sold. PatientsLikeMe, on the other hand, is part of an emerging crop of web services (Freshbooks and Wesabe, to name two in the finance genre) that build a symbiotic relationship with their users. Of this new breed of data-driven services, PatientsLikeMe is perhaps the most ground-breaking because the user’s relationship to the service is (for a change) just so obvious:
The community as a whole succeeds or fails on the individual contributions of its members.
CDTF made a field trip to MoMA to see the Design and the Elastic Mind exhibit. I won’t try to summarize the exhibit here, but I think we each left more hopeful and inspired!
One lightbulb that went off (again) for me at the exhibit was just how hard it is to create truly “meaningful” visualizations of data.
I mean “meaningful” in the literal, not metaphorical sense of the word. Almost all of the exhibits were some combination of beautiful, clever, or heartwarming.
Why? Maps have intrinsic meaning: they are a representation of the physical world we live in. (Calendars too.) As a semantic-rich canvas for data visualizations, maps become the lens through which we extract knowledge from the information presented to us.
Takeaway? When visualizing data, the backdrop is just as important as the actors in the show because, by providing context, it gives us a frame of reference to begin asking questions of the data: What is the significance of how the data points fall? Well, that depends on the semantic significance of the space they inhabit.
Now the question is, can we build a repertoire of semantic-rich canvases for visualizing data beyond maps and calendars?
Here are just a handful of the exhibits. They fall into one of two categories: “No explanation needed” or “Cool, but what does it mean?” (Pictures taken from the MoMA website.)