Posts Tagged ‘Data Mining’

In the mix…EU data retention laws, Wikipedia growing

Friday, June 11th, 2010

1) Australia thinking about requiring ISPs to record browsing histories (via Truste).

Electronic Frontiers Australia (EFA) chair Colin Jacobs said the regime was “a step too far”.

“At some point data retention laws can be reasonable, but highly-personal information such as browsing history is a step too far,” Jacobs said. “You can’t treat everybody like a criminal. That would be like tapping people’s phones before they are suspected of doing any crime.”

Sounds shocking, but the EU already requires it.

2) European privacy officials are pointing out that Microsoft, Google and Yahoo’s methods of “anonymization” are not good enough to comply with EU requirements (via EFF).  As we’ve been saying for a while, “anonymization” is not a very precise claim.  (And yet they also want ISPs to retain browsing histories for law enforcement. Confused? I am.)

3) Wikipedia is adding two new executive roles.  In the process of researching our community study, it really struck me how small Wikipedia’s staff was compared to the staff of more centralized, less community-run businesses like Yelp and Facebook.  Having two more staff members is not a huge increase, but it does make me wonder, is a larger staff inevitable when an organization tries to assert more editorial control over what the community produces?

In the mix

Monday, April 5th, 2010

1) Slate had an interesting take on the bullying story in Massachusetts and the prosecutor’s anger at Facebook for not providing information, i.e., evidence of the bullying.  Apparently, Facebook provided basic subscriber information, but resisted providing more without a search warrant.  Emily Bazelon points out how this area of law is murky, and references the coalition forming around reforming the Electronic Communications Privacy Act, but her larger point is an extra-legal one.  The evidence of bullying the DA was looking for was at one point public, even if eventually deleted. She points out that kids or parents could take screenshots and preserve the evidence themselves, though it may be hard for those who are upset to have the presence of mind to do so.

The case raises a lot of interesting questions about anonymity, privacy, and the values we have online.  Anonymity on the Internet has been a rallying cry for so many people, but I wonder, if something is illegal in the offline world, should it suddenly be legal online because you can be anonymous and avoid prosecution?  (Sexual harassment is a crime in the subway, too!)  We now live in a world where many of us occupy space both online and offline.  We used to think of them as completely separate spaces, and it’s true that the Internet gives us opportunities to do things, both good and bad, that we wouldn’t have offline.  But it’s increasingly obvious that we need to transfer some of the rules we have about the offline world into the online one.  For disability rights advocates, that includes pushing the definition of “public accommodation” to include online stores like Target, and suing them if their sites are not accessible to the blind using screen readers.  For privacy advocates, that includes acknowledging that people have an expectation of privacy in their emails as well as their snail mail.  Free speech in the offline world doesn’t mean you can say anything you want anywhere you want.  Maybe it’s time to be more nuanced about how we protect free speech online as well.

2) It turns out Twitter is pretty good at predicting box office returns – what else might it predict?

3) Cases like this amaze me, because the parties are litigating a question that seems like a no-brainer.  A New Jersey court recently held that an employee had an expectation of privacy in her personal Yahoo account, even if she accessed it on a company computer. Would we ever litigate whether an employee had an expectation of privacy in a piece of personal mail she brought to the office and decided to read at her desk?

4) The New York Times is acknowledging its readers’ online comments in separate articles, namely, this one describing readers’ reactions to federal mortgage aid.  It’s a smart way to give online readers a sense that their comments are being read.  I wonder if this is where the “Letters to the Editor” page is going.  I’ve been wondering, who are these readers who are so happy to be the 136th comment on an article?  But the people who write letters to the editor have always been people who have extra time and energy.  In a way, online comments expand the world of people who are willing to write a letter to the editor.

5) Would we feel differently about government data mining if the government were better at it? Mimi and I went to a talk at the NYU Colloquium on Information Technology and Society where Joel Reidenberg, a law professor at Fordham, talked about how transparency of personal information online is eroding the rule of law.  One of the arguments he made against government data mining was that it doesn’t work, with the example of airport security, its inability to stop the underwear bomber, and its terribly inaccurate no-fly lists.  Well, the Obama administration just announced a new system of airport security checks that uses intelligence-based data mining that is meant to be more targeted.  It’s hard to know now whether the new system will be better and smarter, but it raises a point those opposed to data mining don’t seem to consider — what if the government were better at it?  Could data mining be so precise that it avoids racial profiling?  Are there other dangers to consider, and can they be warded off without shutting down data mining altogether?

Fighting fire with fire, data with data

Tuesday, October 28th, 2008

We all know that companies buy and sell databases of information about us.  They use it to make marketing decisions, including how to identify people who are looking to refinance their homes.  There are certainly interesting questions these practices raise about personal information and privacy, but more immediately, as this article makes clear, data mining can have real-life, adverse and sometimes devastating consequences for ordinary people.

For example, when Mercurion Suladdin, a county librarian in Utah, filled out an application to refinance her home at Ameriquest, “she quickly got a call from a salesman at Beneficial, a division of HSBC bank where she had taken out a previous loan.”

The salesman said he desperately wanted to keep her business. To get the deal, he drove to her house from nearby Salt Lake City and offered her a free Ford Taurus at signing.

What she thought was a fixed-interest rate mortgage soon adjusted upward, and Ms. Suladdin fell behind on her payments and came close to foreclosure before Utah’s attorney general and the activist group Acorn interceded on behalf of her and other homeowners in the state.

“I was being bombarded by so many offers that, after a while, it just got more and more confusing,” she says of her ill-fated decision not to carefully read the fine print on her loan documents.

HSBC knew she was a good target because of products offered by data vendors, like “mortgage triggers” and “TargetPoint Predictive Triggers” from Equifax, which “advertises ‘advanced profiling techniques’ to identify people who show a ‘statistical propensity to acquire new credit’ within 90 days.”

Data brokers and lenders defend themselves by saying they are offering a service similar to a second opinion from a doctor.  It’s a good point, especially when you consider that a good doctor would never recommend an alternative treatment without going over all the risks and possible side-effects.  The question isn’t whether it was right or not to give her options; the question is whether she had enough information to weigh those options, especially in comparison to all the information HSBC had about her.

The Internet Age allows all of us to have so much information all the time, but the kind of power that comes with data mining and data analysis tools belongs overwhelmingly to banks, search engine companies, and insurance companies, and not at all to individuals and consumer protection organizations.  Sure, Ms. Suladdin could have tried to Google “HSBC Beneficial risks?”, but that pales in comparison to what HSBC Beneficial was able to do with her data.

Imagine if she had had data analytic tools similar to her bank’s.  I’m not talking about the ability to go to some online forum to ask questions about mortgages.  What if she had been able to access a database that would tell her what kind of mortgages others like her had received?  Or what was statistically likely to happen to interest rates?  Or a combination of these factors?

Or think about all these nonprofit organizations working hard to deal with the subprime mortgage crisis and the rapidly increasing rate of foreclosures.  As with many nonprofits that offer a service, their first step is to identify the people who need their help and notify them that these services exist, which is not very different from what a business has to do in its marketing strategy.  Why shouldn’t nonprofits have access to the same data as the companies that offered those mortgages to these people in the first place?

Anyone who has read this blog knows we don’t believe in fighting bad uses of data by shutting down data collection altogether.  Let’s fight fire with fire, data with data.

Politics and Privacy, Part II

Thursday, October 2nd, 2008

Last week, I wrote about how political data collection has shown that data collection doesn’t have to be a completely one-way street, but rather, can involve individuals’ active and sometimes almost enthusiastic participation.  Part of the enthusiasm comes from a belief that this is what democracy is about—we have the right to try to persuade our fellow citizens, whether from a soap box in the town square or by calling a voter list through a phone bank.  But the data collection by political campaigns encompasses a lot more than name, occupation, and email address.  Karl Rove revolutionized it, with his famous use of consumer preferences to identify and target likely Republican voters, but the Democrats have worked hard to catch up, Catalist being one of the big players in this effort. It’s one thing to compile donor lists; another to cross-reference “beer versus wine” preferences to voter lists.  How is democracy affected by intense, data-based voter profiling?

As Solon Barocas pointed out during his talk on voter profiling at the recent DIMACS workshop, researchers have found that micro-targeting voters can increase polarization and divisiveness.  As candidates are able to air one radio ad for the Latino voters in one state and a different one for the white voters in another, they’re able to espouse more extreme positions than they would if forced to appeal to a more general audience.

If true, this is a serious problem.  But I like to believe that in the long run, and done right, political data collection and analysis could actually enable new kinds of consensus and coalition-building.  For one, in an era where blogs monitor political campaigns hour-by-hour, a local radio ad can be made available to a national audience no matter which micro-audience was originally targeted.  (Update: we can even find out about “telephone” calls to the deaf community!)

But more importantly, I can imagine that if voters and not just campaigns were able to see who else felt the way they did on major issues, many might be surprised.  Solon mentioned that despite the headlines, the algorithms by which likely Democratic or Republican voters are identified are not as simple as beer = conservative, wine = liberal.  Yes, campaigns believe they can figure out who in a community might lean in their direction, but it’s a much more complicated calculation.

So if people chose to share and know who else felt similarly, in ways that were more fine-grained than national polls, really interesting things could happen to our political discourse.  The Left Coast environmentalist might learn the hunter in South Dakota shares a commitment to conservation.  The pro-choice atheist and the pro-life Catholic might learn they both oppose the death penalty.  I’m not advocating that we throw open the curtains on the voting booth.  But knowing how our fellow citizens feel about the issues facing all of us—it almost sounds like that old-fashioned American democratic institution, the town hall meeting.

After all, democracy is the ultimate social activity.  We’re supposed to be making decisions together.
