The politics of being counted

Monday, July 13th, 2009

The 2010 U.S. Census has been in the news a lot lately.

A national association of Latino clergy recently announced a campaign to persuade a million of its members to boycott the 2010 U.S. Census.  They hope their boycott puts pressure on the federal government to pass legalization legislation, but they also claim that they don’t want federal money allocated and used to harass illegal immigrants.

Republican Representative Michele Bachmann of Minnesota also announced she and her family will be boycotting the census.  In her case, she’s refusing to answer questions because she thinks it’s outrageous that the government wants to know how long it takes you to get to work, and that ACORN, along with many other organizations and businesses, is involved in helping to carry out the 2010 census.  Also, she’s angry that the census doesn’t ask you if you’re a U.S. citizen.  (Which isn’t quite right.  It does ask you if you’re a U.S. citizen; it doesn’t ask you if you have legal status or not.)

In contrast, one group got their wish to be counted.  The Census Bureau recently announced the 2010 U.S. Census will release data on same-sex marriage.  Data on same-sex marriages has been collected for a long time, but the last administration interpreted the Defense of Marriage Act as prohibiting the release of that data.  The initial plans for the 2010 census were to “edit” the responses to recategorize same-sex marriages as “unmarried partners.”  In 1990, the bureau simply changed the gender of one person.  So the new policy means responses will be accepted as they are.

Some people want to be counted.  Some people don’t.

I’m firmly in the “count me” camp.

As Republican Representatives Patrick McHenry, Lynn Westmoreland and John Mica pointed out to Rep. Bachmann, refusing to respond to the Census is “illogical, illegal, and not in the best interest of our country.” The League of United Latin American Citizens, in contrast to the Latino clergy group, is participating in a coalition of media, community groups, labor unions, and churches to urge participation in the census.  Clearly, being on the same side of the political spectrum or sharing a specific policy agenda doesn’t mean you’ll agree about the census.

It’s not that numbers are apolitical.  The Census determines how to apportion federal funds and representatives for the House.  The LA Times article cites a study that argues the illegal population in California led to California gaining 3 seats in the House of Representatives and to Indiana, Mississippi, and Michigan losing seats.  It’s all about power, which means it’s all about politics.

But political debates shouldn’t be about whether or not to be counted.  Debates should be about whether certain proposals will do what they claim, or even about whether the numbers are accurate.  To refuse to be counted altogether, when the numbers will determine so much?  It’s like refusing to vote.

One last election-related post!

Thursday, November 6th, 2008

It’s hard to believe now, but the red state/blue state maps only became a standard image in American politics in 2000, when they seemed to illustrate very vividly the sharp divides in the country, on politics, culture, even consumer habits.  Many people, however, used the same data in more granular form to show that the story was more nuanced than that, both in 2000 and 2004.  (UPDATED: And a new one for 2008.)

Now, in 2008, we have this great graphic from the New York Times, using data to tell a story, rather than simply provide a snapshot, of how the country has changed since 2004.


Compare this graphic, showing the counties in which Obama won more votes than Kerry (and the counties in which McCain won more votes than Bush), to the simpler red-blue map of the electoral votes won by each candidate.


If we were able to look even closer, we would be able to see how different issues and concerns may have influenced the decision to vote Democratic from county to county.

Who would be interested in that kind of data?  Not just Democrats wanting to gloat, but also Republicans wanting to analyze where their party is and should go, policymakers trying to understand people’s concerns, community organizers trying to galvanize people, even private individuals wanting to understand their community and their country a little bit better.

Now that the election is over, we can really start thinking about what happens next, for our country and our world.  More data, not just for data’s sake, but for more understanding.

Amazon’s red and blue book-buying map

Wednesday, October 15th, 2008

Sorry, it’s another semi-political post!


We at the Common Data Project are definitely interested in more than politics, but this Amazon map of political book-buying state by state was too interesting not to blog about. It illustrates so many things I believe in.

One: Information-sharing can be fun.

People love patterns, and even more, knowing where they fit into them. The Amazon customers who are most likely to be drawn to this map are those who have bought political books, books that fall into the red, blue, or purple categories. No one is likely to be outraged that his purchase of Thomas Friedman’s book in the last 60 days got counted in designing this map. Although there’s a lot of data collection that Amazon prefers to keep on the down-low, this kind of tracking is refreshingly open and explicit. We know it’s being collected, and most of all, we get something in return. We all get to enjoy the data as well.

Two: Data has limited value if there is limited context.

As pretty as this map is, it doesn’t really provide much information. Junk Charts lays out a lot of the deficiencies that limit our ability to draw any meaningful conclusions. Providing the map with just the states colored in, but without real sales numbers, doesn’t give you a real sense of which books are selling better, in the same way that the 2004 election red-blue maps with their wide swaths of red in the middle didn’t provide real information about population density and how close the election had actually been, nor how seemingly blue or red states actually contained significant pockets of people who had voted for the other guy. How many people in South Dakota bought a “red” book? Ten, twenty, or a hundred thousand?

The paucity of information on how books were rated red, blue or purple drove me crazy, too. Every place I clicked to “Learn more,” it took me to the same very short four paragraphs. It says that the categorization was based on the book’s own promotional materials and the tags readers added to them, but I still wonder who categorized these books and precisely how they did so. Would all the authors necessarily have labeled their books as blue or red?

And if they were categorizing books as purple, as neither obviously liberal nor obviously conservative, why didn’t they include them in the percentage calculations by state?

Three: Underlying data should always be available for alternative analyses.

A lot of people are wary of data; they’ve heard too many times how numbers can be twisted to serve any purpose. We at the Common Data Project make no promises that data = truth, only that when data is truly open and available, conclusions based on that data can then be prodded, tested, and possibly refuted.

In this case, I’m not quite sure if Amazon does have a conclusion to assert, but the decisions it made about which data to include and exclude have shaped the map presented. One conclusion you might draw from a cursory glance might be the same one drawn by one of the commenters to the Junk Charts post—that people only read books they’re likely to already agree with. Imagine now if we could test that conclusion, if we could count how many readers in each state bought both “red” and “blue” books, or if there were readers who would consider themselves “conservative” but bought “liberal” books. Maybe there’s a very active and large political book club in Wyoming buying books from across the spectrum!

It may very well be true that people who identify as conservative buy “red” books, while people who identify as liberal buy “blue” books, but the map as provided doesn’t provide enough information to truly test that conclusion or propose interesting hypotheses of why that’s happening.

Still, I had a good enough time playing around with the map that I was reminded of a book I’ve been meaning to read, which is probably Amazon’s ultimate goal anyway!

Politics and Privacy, Part II

Thursday, October 2nd, 2008

Last week, I wrote about how political data collection has shown that data collection doesn’t have to be a completely one-way street, but rather, can involve individuals’ active and sometimes almost enthusiastic participation.  Part of the enthusiasm comes from a belief that this is what democracy is about—we have the right to try to persuade our fellow citizens, whether from a soap box in the town square or by calling a voter list through a phone bank.  But the data collection by political campaigns encompasses a lot more than name, occupation, and email address.  Karl Rove revolutionized it, with his famous use of consumer preferences to identify and target likely Republican voters, but the Democrats have worked hard to catch up, Catalist being one of the big players in this effort. It’s one thing to compile donor lists; another to cross-reference “beer versus wine” preferences to voter lists.  How is democracy affected by intense, data-based voter profiling?

As Solon Barocas pointed out during his talk on voter profiling at the recent DIMACS workshop, researchers have found that micro-targeting voters can increase polarization and divisiveness.  As candidates are able to air one radio ad for the Latino voters in one state and a different one for the white voters in another, they’re able to espouse more extreme positions than they would if forced to appeal to a more general audience.

If true, this is a serious problem.  But I like to believe that in the long run, and done right, political data collection and analysis could actually enable new kinds of consensus and coalition-building.  For one, in an era where blogs monitor political campaigns hour-by-hour, a local radio ad can be made available to a national audience no matter which micro-audience was originally targeted.  (Update: we can even find out about “telephone” calls to the deaf community!)

But more importantly, I can imagine that if voters and not just campaigns were able to see who else felt the way they did on major issues, many might be surprised.  Solon mentioned that despite the headlines, the algorithms by which likely Democratic or Republican voters are identified are not as simple as beer = conservative, wine = liberal.  Yes, campaigns believe they can figure out who in a community might lean in their direction, but it’s a much more complicated calculation.

So if people chose to share and know who else felt similarly, in ways that were more fine-grained than national polls, really interesting things could happen to our political discourse.  The Left Coast environmentalist might learn the hunter in South Dakota shares a commitment to conservation.  The pro-choice atheist and the pro-life Catholic might learn they both oppose the death penalty.  I’m not advocating that we throw open the curtains on the voting booth.  But knowing how our fellow citizens feel about the issues facing all of us—it almost sounds like that old-fashioned American democratic institution, the town hall meeting.

After all, democracy is the ultimate social activity.  We’re supposed to be making decisions together.

Politics and Privacy, Part I

Friday, September 26th, 2008

Rock the Vote Application

Catalist and Rock the Vote recently launched an effort to increase voter registration through a very exact tool, a Facebook application.   Using Catalist’s voter targeting databases, and knowing who has downloaded a voter registration form from Rock the Vote’s widget, they’re asking Facebook users to call the people who never actually sent in their forms and remind them to do so.

I’m curious to know how potential voters are responding to these phone calls.  Given Rock the Vote’s target demographic, and the age of most users on Facebook, they may not be as shocked to get a phone call as an older voter might. And in general, I think people are more aware that their personal information is being collected, analyzed, and shared in the political context than they are in other contexts.  I’ve had friends tell me they don’t make donations, even to candidates they support, for fear of getting on “some list.”  And anyone who has ever lived in a state or district involving a close race knows that it’s not uncommon to have a total stranger call you or even knock on your door and ask for you by name.

These kinds of intrusions can be annoying, and in some communities, being outed as a Democrat or a Republican can have more serious repercussions.  But in general, I don’t think the public is as uncomfortable with this kind of data collection by political campaigns and the Federal Election Commission as they are when it’s being done by search engines or ISPs.  (I’m not talking specifically about detailed voter profiling and data mining, which I think is slightly different and will blog about separately.)

I think there are a couple of reasons for this.  First, people believe there are a number of issues that have to be weighed.  It’s not just their privacy rights versus a company’s profits, but their privacy rights versus democratic principles, like government transparency in the case of FEC disclosure.  Second, the data collection is extremely obvious.  We all know campaigns are tracking who’s donated, so they can ask again and again and again, at least until that maximum contribution limit is reached.

Most importantly, though, people want their candidate to win.  If they are contributing more than $200 in an election cycle to a political candidate, they can live with being in the campaign’s database, as well as the FEC’s.  If they care enough to go to a rally and then are asked for their email address, they don’t mind being sent emails from the campaign day after day.  They know that if they are called during dinner and reminded to vote for their candidate, the other likely voters are being called, too.  Heck, the most enthusiastic supporters are using the data themselves, by volunteering for phonebanks and canvassing, as with the Catalist/Facebook application.

Political data collection has some lessons to teach data collection in other arenas.  Don’t try to hide what you’re doing—be obvious.  Even more importantly, give people an incentive to provide information.  Google and Yahoo can assure us that the log data, the IP addresses, the tracking they do when we’re logged into their email accounts, are all meant to provide us a better service, but we don’t really feel like we’re getting something out of it, especially compared to what they’re getting out of it.  These companies currently seem to be working on the model of “Don’t worry, whatever we’re doing won’t hurt you.”  The model should be, “Participate and get value out of the data yourselves.”

