Posts Tagged ‘Policy’

Fighting fire with fire, data with data

Tuesday, October 28th, 2008

We all know that companies buy and sell databases of information about us.  They use it to make marketing decisions, including how to identify people who are looking to refinance their homes.  There are certainly interesting questions these practices raise about personal information and privacy, but more immediately, as this article makes clear, data mining can have real-life, adverse and sometimes devastating consequences for ordinary people.

For example, when Mercurion Suladdin, a county librarian in Utah, filled out an application to refinance her home at Ameriquest, “she quickly got a call from a salesman at Beneficial, a division of HSBC bank where she had taken out a previous loan.”

The salesman said he desperately wanted to keep her business. To get the deal, he drove to her house from nearby Salt Lake City and offered her a free Ford Taurus at signing.

What she thought was a fixed-interest rate mortgage soon adjusted upward, and Ms. Suladdin fell behind on her payments and came close to foreclosure before Utah’s attorney general and the activist group Acorn interceded on behalf of her and other homeowners in the state.

“I was being bombarded by so many offers that, after a while, it just got more and more confusing,” she says of her ill-fated decision not to carefully read the fine print on her loan documents.

HSBC knew she was a good target because of products offered by data vendors, like “mortgage triggers” and “TargetPoint Predictive Triggers” from Equifax, which “advertises ‘advanced profiling techniques’ to identify people who show a ‘statistical propensity to acquire new credit’ within 90 days.”

Data brokers and lenders defend themselves by saying they are offering a service similar to a second opinion from a doctor.  It’s a good point, especially when you consider that a good doctor would never recommend an alternative treatment without going over all the risks and possible side-effects.  The question isn’t whether it was right or not to give her options; the question is whether she had enough information to weigh those options, especially in comparison to all the information HSBC had about her.

The Internet Age allows all of us to have so much information all the time, but the kind of power that comes with data mining and data analysis tools belongs overwhelmingly to banks, search engine companies, and insurance companies, and not at all to individuals and consumer protection organizations.  Sure, Ms. Suladdin could have tried to Google “HSBC Beneficial risks?”, but that pales in comparison to what HSBC Beneficial was able to do with her data.

Imagine if she had had data analytic tools similar to her bank’s.  I’m not talking about the ability to go to some online forum to ask questions about mortgages.  What if she had been able to access a database that would tell her what kind of mortgages others like her had received?  Or what was statistically likely to happen to interest rates?  Or a combination of these factors?

Or think about all the nonprofit organizations working hard to deal with the subprime mortgage crisis and the rapidly increasing rate of foreclosures.  For many nonprofits that offer a service, the first step is to identify the people who need their help and notify them that these services exist, which is not very different from what a business has to do in its marketing strategy.  Why shouldn’t nonprofits have access to the same data as the companies that offered those mortgages to these people in the first place?

As anyone who has read this blog knows, we don’t believe in fighting bad uses of data by shutting down data collection altogether.  Let’s fight fire with fire, data with data.

The great story of good data

Wednesday, October 22nd, 2008

I love stories.

You might think, then, that I wouldn’t love data.  Stories and data are often seen as two very different ways of presenting information.  Data is considered cold, impersonal, incomplete.

But much of data’s bad reputation comes from limited data, not data in and of itself.  As Hans Rosling, a Swedish professor, demonstrates in this video, data can tell amazing stories.

It’s long but well worth watching in its entirety.  It’s a few years old, from the 2006 TED conference, but I would bet it’s almost as riveting on YouTube as it was live at the conference.  In it, Rosling uses animated graphs of UN statistics from 1962 to 2003 to tell stories about our world and how it’s changed in ways that defy easy generalizations.

For example, his Swedish medical students studying global health assumed that there were two kinds of countries in the world: Western countries where family sizes are small and people live longer, and Third World countries where family sizes are large and people die young.  But as his animated graphs show, many countries that are still poor and developing had moved by 2003 into the upper left-hand quadrant, alongside countries with smaller families and longer life expectancies.  By 2003, Vietnam is in the same place the United States was in 1974.  As he declares, “If we don’t look at the data, we underestimate the tremendous change in Asia.”

In my favorite segment, like a great novelist building a complex character, Rosling breaks down one set of data over and over, showing the much more interesting and complex story behind average income and child survival.

easy-gdp.png

The first graph, comparing GDP per capita among countries in the OECD, East Asia, South Asia, Africa, and Latin America, tells the story we all expect.  The blue dot in the upper right-hand quadrant is OECD countries; the small red dot on the bottom left-hand quadrant is Africa.

within-africa.png

But then he shows how different countries within Africa have tremendous variations in GDP per capita, as well as child mortality, despite Western conceptions of a monolithic “Africa and its problems.”

within-countries.png

And just when you’re patting yourself on the back for understanding that Africa includes a very diverse range of countries, he shows that even within the countries, the distribution of income is very broad.  The highest income quintile in South Africa is quite high, approaching the average per capita GDP in the United States.

As Rosling says, “Improvement of the world must be highly contextualized!”  And the data is what will allow us to do it.  His demonstration itself shows how data can be limiting, how it can be used to “prove” that all of Africa is poor and sick.  But the solution clearly isn’t to ignore the data but to look at more data.  Ultimately, broad, detailed, longitudinal data pushes us to think harder, rather than rest on our assumptions.  Stories still need to be told: how did Mauritius get wealthy and healthy?  Why didn’t Ghana?  But without the data, we wouldn’t even know those stories were there.

