Archive for March, 2008

Data Visualizations at MoMA’s “Design and the Elastic Mind”: Beautiful, clever and heartwarming; but what is it trying to tell me??

Friday, March 28th, 2008

CDTF made a field trip to MoMA to see the Design and the Elastic Mind exhibit. I won’t try to summarize the exhibit here, but I think we each left more hopeful and inspired!

One lightbulb that went off (again) for me at the exhibit was just how hard it is to create truly “meaningful” visualizations of data.

I mean “meaningful” in the literal, not the metaphorical, sense of the word. Almost all of the exhibits were some combination of beautiful, clever, or heartwarming.

But the ones that most effectively communicated information, as opposed to just data, were visualizations on maps and timelines: San Francisco taxi traffic patterns, the flow of IP data across the globe, a timeline of Wikipedia edits.

Why? Maps have intrinsic meaning: they are a representation of the physical world we live in. (Calendars too.) As a semantic-rich canvas for data visualizations, maps become the lens through which we extract knowledge from the information presented to us.

Takeaway? When visualizing data, the backdrop is just as important as the actors in the show, because by providing context it gives us a frame of reference to begin asking questions of the data: What is the significance of how the data points fall? Well, that depends on the semantic significance of the space they inhabit.

Now the question is, can we build a repertoire of semantic-rich canvases for visualizing data beyond maps and calendars?

Here are just a handful of the exhibits. Each falls into one of two categories: “No explanation needed” or “Cool, but what does it mean?” (Pictures taken from the MoMA website.)

Which ones do you “get” right away?

Cabspotting in San Francisco (Amy Balkin)

Rewiring the Spy: “Mapping” terrorism in the news. Haunting. (Lisa Strausfeld and James Nick Sears)

Google Earth Mashup: Sea level rise flood maps of the New York area. (Alex Tingle)

Emergent Surface: Motorized sculpture responds to its environment. Gorgeous. (Hoberman Associates)

I Want You to Want Me! Snippets from dating sites. Cute! (Jonathan Harris and Sep Kamvar)

Text Arc: “Mapping” Alice in Wonderland. Egg-shaped whimsy. (W. Bradford Paley)

Sonumbra: A tree of light that responds to the people in the room. Eerily soothing. (Rachel Wingfield & Mathias Gmachl)

“Mapping” the Internet: Oddly 80s! (Bill Cheswick)

Proposed legislation that gives people access to their own data

Tuesday, March 25th, 2008

I totally missed this.

Even though I thought this New York Times article on data collection wasn’t very informative, a New York legislator was sufficiently moved to propose this legislation.

Michael Zimmer has some interesting comments about the proposed bill and some of its weaknesses, as well as a strength:

“The bill is strongest, however, in relation to a demand I have long made on Web search providers: let me see the data you have collected about my actions. The bill states:

17. Business entities shall provide consumers with reasonable access to personally identifiable information and other information that is associated with personally identifiable information retained by the third party entity for online preference marketing uses.

The press seems to have missed the importance of this section. If passed, the law would require Google, Facebook, DoubleClick, etc to provide me access to the personally identifiable information ‘and other information that is associated’ with my user account stored in their databases.

This is a vital right for consumers to be able to protect their data privacy: having access to view your data is the first step towards regaining some control over the collection of the data in the first place.”

The challenge, and it’s a worthy one, is how this information could be provided to us in a way that makes it useful and relevant. I’d like to see a law providing access to my own data that is more meaningful to me than HIPAA feels when I’m faced with a sheaf of waivers to sign at my doctor’s office.

Yet another data breach

Thursday, March 20th, 2008

A major grocery chain, Hannaford, recently announced that due to a security breach, up to four million credit cards may be vulnerable to access by criminals. So add another to the list of 2008 security breaches, and it’s only March.

As Flowing Data points out, when you look at a timeline of big data breaches from Attrition.org, data breaches have occurred with more frequency, not less, the closer we get to the present. Yet data breaches seem to be getting less coverage than they used to. When I looked at the full list of breaches catalogued by Attrition.org, I saw some that I’d heard of and many I hadn’t. And with this recent breach, I haven’t seen as much coverage as I would have expected. Plenty of specialized blog reactions and local news coverage, but not much national attention. Are people just getting used to this? Or is it that they think they have no alternatives?

A nonprofit wants to share its mailing list with some economists–would that bother you?

Thursday, March 13th, 2008

There’s a fascinating article in the New York Times Sunday Magazine on a study of what makes people donate, conducted by an interesting liberal-conservative pair of economists, Dean Karlan and John List. They wanted to do an empirical study of fundraising strategies, to find out what kind of solicitations are the most successful. As the article points out, lab experiments of economic choices aren’t particularly realistic: “If you put a college sophomore in a room, gave her $20 to spend and presented her with a series of pitches from hypothetical charities, she might behave very differently than when sitting on her sofa sorting through letters from actual organizations.”

So Karlan and List found an opportunity for a field experiment, a partnership with an actual, unnamed nonprofit that allowed them to try different solicitation strategies and map the outcomes. They wrote solicitation letters that were similar, except some didn’t mention a matching gift, some mentioned a 1-to-1 match, some a 2-to-1, and some a 3-to-1. In the end, mentioning a matching gift increased the likelihood of a donation, but the size of the match made no difference. As the author, David Leonhardt, notes, their findings and the findings of other economists in this area are significant to many people, from the nonprofits trying to be better fundraisers to economists studying human behavior, even to those who want to make tax policy more effective and efficient.

The article, however, didn’t mention whether the donors to the nonprofit had consented to their responses being shared with anyone other than the nonprofit. I’m not that concerned about whether donors’ privacy may have been egregiously violated. (I’m also not sure what’s required of nonprofits in this area.) I’m just curious to know, if they had been given the choice, would they have agreed to their information being shared with the economists? Obviously, the study wouldn’t have worked if potential donors had been told they would be sent different solicitation letters to measure their responses, but I think if most people on a nonprofit’s mailing list were asked explicitly whether they would allow their information to be used in academic studies, they would consent. They might want assurances that their individual identities would be protected, so that no one would know Mr. So-And-So had given zero dollars to a cause he publicly champions. But they might very well be willing to help the nonprofit figure out how to be more effective and be a part of an academic study that could shape public policy. They might even be curious to know how their giving compares to that of other donors in their income brackets or geographic areas.

Most people, myself included, have a knee-jerk antipathy to having their personal information shared with anybody other than the organization or company they give it to. But maybe we would feel differently if we were actually given some choices, if our personal identities could be protected, if sharing information could lead to more than just targeted advertising or more junk mail.

Ohmigod, companies are tracking what we look at online!

Monday, March 10th, 2008

Breaking news from the New York Times.

What’s truly interesting about this article, though, isn’t that the New York Times is announcing as “news” something that’s been going on for a very long time. Rather, the New York Times, even while devoting space on its front page, doesn’t really seem to have a point.

The article tries to distinguish itself from vague alarms raised by privacy advocates with data, the results of a study done with comScore measuring “data collection events,” each time “consumer data was zapped back to the Web companies’ servers.” (Even though the New York Times has produced some of the prettiest data graphics in recent memory, this one looks like something created in Excel and conveys little more than a flurry of numbers.) But the overwhelming impression left by the article is that companies are trying to target advertising, and some might do it better than others, rather than that extensive personal information is being collected. So then it isn’t surprising that several of the comments in response to the Bits blog post are about how the commenters never click on ads, or how stupid these companies are in sending them ads for things they’re not interested in, or how they’ve blocked pop-up ads on their browser.

After all, the article mentions only briefly what kind of information is being collected: “the person’s zip code, a search for anything from vacation information, or a purchase of prescription drugs or other intimate items.” The article cites Jules Polonetsky, chief privacy officer for AOL, “[who] cautions that not all the data at every company is used together. Much of it is stored separate,” yet the author doesn’t explain the significance of that statement. The article doesn’t mention that even if consumer data is stripped of “identifiers” like a user name, individual identification could happen easily through the combination of datasets.
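To make that last point concrete, here is a minimal sketch of how “anonymous” records can be tied back to names by combining datasets. Everything in it is made up for illustration: the records, the `public_directory` list, and the choice of zip code, birth year, and gender as the linking fields are assumptions, not anything described in the article.

```python
# All records below are invented for illustration only.

# A "de-identified" log: no user names, just a few attributes and a search.
browsing_log = [
    {"zip": "10027", "birth_year": 1975, "gender": "F", "search": "allergy medication"},
    {"zip": "10027", "birth_year": 1982, "gender": "M", "search": "vacation rentals"},
    {"zip": "11201", "birth_year": 1975, "gender": "F", "search": "mortgage rates"},
]

# A second, separately held dataset that does carry names (hypothetical).
public_directory = [
    {"name": "Alice Example", "zip": "10027", "birth_year": 1975, "gender": "F"},
    {"name": "Bob Example", "zip": "10027", "birth_year": 1982, "gender": "M"},
    {"name": "Carol Example", "zip": "11201", "birth_year": 1975, "gender": "F"},
    {"name": "Dana Example", "zip": "11201", "birth_year": 1975, "gender": "F"},
]

# Fields that are not names, but that narrow down who a record belongs to.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def key(record):
    """Return the combination of quasi-identifier values for one record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(anonymous_rows, named_rows):
    """Pair each anonymous row with a name when exactly one person matches."""
    names_by_key = {}
    for row in named_rows:
        names_by_key.setdefault(key(row), []).append(row["name"])

    matches = []
    for row in anonymous_rows:
        candidates = names_by_key.get(key(row), [])
        if len(candidates) == 1:  # a unique match means the row is re-identified
            matches.append((candidates[0], row["search"]))
    return matches


matches = reidentify(browsing_log, public_directory)
for name, search in matches:
    print(f"{name} searched for: {search}")
```

In this toy data, two of the three “anonymous” rows match exactly one named person, while the third is shared by two people and stays ambiguous. Stripping the user name alone offers little protection once a second dataset with overlapping fields is available.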

I would love to see an article by a mainstream publication that addresses this issue in a truly comprehensive and thoughtful way. What’s missing in the conversation started by this article is not only a fuller analysis of how personal information is being collected and what dangers there are for individual privacy, but also a nuanced discussion of that information’s value and what it means for “a handful of big players” to hold most of it. The article ends citing a study of California adults, 85% of whom thought sites should not be allowed to track their behavior around the Web to show them ads. But does that statistic really capture what’s at stake?

P.S. Is AOL’s innocent penguin happy or merely surprised that anchovy ads are being sent to him?

Property Shark and “Contextual Integrity”: Where real estate obsession and privacy academia intersect

Tuesday, March 4th, 2008

Recently, I was having dinner with some friends when the topic of Property Shark came up. My friends, being homeowners, were disturbed that someone could simply go online, type in their address, and find out who the owners were and precisely what they had paid for the house. One friend exclaimed, “I don’t want people to know how much money I have!” When I pointed out that the information was public record, and that before Property Shark, anyone could have gone down to City Hall and found the same information, he didn’t care. It still bothered him.

For all our talk of “privacy,” of how it’s being violated all over the place, of how it’s already lost, it’s not even clear what we mean when we say “privacy.” We, as a society, might have agreed that it is good public policy for real estate records to be public so that potential buyers can make sure sellers actually own the property they’re selling. Capitalism can’t thrive if you can’t be sure you own what you own. But when we theoretically made this agreement, we certainly didn’t imagine a world where “public” means available to anyone, anywhere, at any time. Professor Helen Nissenbaum, who recently presented at the DIMACS Data Privacy Workshop, has proposed that we think about “contextual integrity” rather than “privacy.” She argues that it’s more useful to consider what’s appropriate in each context rather than assuming there is a blanket “privacy” standard applicable to all situations.

That makes sense to me. My friend wasn’t arguing that the information shouldn’t be public record. Rather, he wasn’t comfortable with that information being accessed so easily online.

Personally, in the universe of privacy breaches, Property Shark doesn’t seem so problematic, but it’s certainly helpful as the Common Datatrust Foundation works on privacy problems to remember that “privacy” doesn’t have a singular meaning. One of CDTF’s goals for this year is to create some privacy standards for companies and other data collectors that acknowledge that information flow can’t just have an on/off, public/private spigot. It’s obvious that our world and our needs are more complex than that. After all, sometimes it’s hard to know even what we want when we clamor for more privacy. Even my friend, when pressed, admitted that the next time he was looking to buy a house, the first thing he would do is go to Property Shark.