Posts Tagged ‘in the mix’

In The Mix…predicting the future; releasing healthcare claims; and $1.5 million awarded for data privacy research

Tuesday, November 30th, 2010

Some people out there think they can predict the future by scraping content off the web. Does it work simply because web 2.0 technologies are great at creating echo chambers? Is this just another way of amplifying that echo chamber and generating yet more self-fulfilling trend prophecies? See the Future with a Search (MIT Technology Review)

The U.S. Office of Personnel Management wants to create a huge database that contains the healthcare claims of millions. Many are concerned about how the data will be protected and used. More federal health database details coming following privacy alarm (Computer World)

Researchers at Purdue were awarded $1.5 million to investigate how well current techniques for anonymizing data are working and whether there’s a need for better methods. It would be interesting to know what they think of differential privacy. They appear to be actually doing the dirty work of figuring out whether theoretical re-identification is more than just a theory. National Science Foundation Funds Purdue Data-Anonymization Project (Threat Post)

In the mix…data-sharing’s impact on Alzheimer’s research, the limits of a Do Not Track registry, meaning of data ownership and more

Friday, August 13th, 2010

1)  It’s heartening that an article on how data-sharing led to a breakthrough in Alzheimer’s research is the Most Emailed article on the NYTimes website right now. The reasons for resisting data-sharing are the same in so many contexts:

At first, the collaboration struck many scientists as worrisome — they would be giving up ownership of data, and anyone could use it, publish papers, maybe even misinterpret it and publish information that was wrong.

But Alzheimer’s researchers and drug companies realized they had little choice.

“Companies were caught in a prisoner’s dilemma,” said Dr. Jason Karlawish, an Alzheimer’s researcher at the University of Pennsylvania. “They all wanted to move the field forward, but no one wanted to take the risks of doing it.”

2) Google agonizes on privacy. The Wall Street Journal article discusses a confidential Google document that reveals the disagreements within the company on how it should use its data.  Interestingly, all the scenarios in which Google considers using its data involve targeted advertising; none involve sharing that data with Google users in a broader, more extensive way than they do now.  Google believes it owns the data it’s collected, but it also clearly senses that ownership of such data has implications that are different from ownership of other assets.  There are individuals who are implicated — what claims might they have to how that data is used?

3) Some people have suggested that if people are unhappy with targeted advertising, the government should come up with a Do Not Track registry, similar to the Do Not Call list.  But as Harlan Yu notes, Do Not Track would not be as simple as it sounds. He notes that the challenges involve both technology and policy:

Privacy isn’t a single binary choice but rather a series of individually-considered decisions that each depend on who the tracking party is, how much information can be combined and what the user gets in return for being tracked. This makes the general concept of online Do Not Track—or any blanket opt-out regime—a fairly awkward fit. Users need simplicity, but whether simple controls can adequately capture the nuances of individual privacy preferences is an open question.

4) What happens to a business’s data when it goes bankrupt? The former publisher and partners of a magazine and dating website for gay youth were fighting over ownership of the company’s assets, including its databases. They recently came to an agreement to destroy the data. EFF argues that the Bankruptcy Code should be amended to require such outcomes for data assets. I don’t know enough about bankruptcy law to have an opinion on that, but this conflict illuminates what’s so problematic about the way we treat data and property. No one can own a fact, but everyone acts like they own data. Something fundamental needs to be thrashed out.

5) Geotags are revealing more locational info than the photographers intended.

6) The owner of an ISP that resisted an FBI request for information can finally reveal his identity. Nicholas Merrill can now say publicly that he was the plaintiff behind an ACLU lawsuit that challenged the legality of national security letters (NSLs), by which the FBI can request information without a court order or proving just cause. In fact, the FBI can even impose a gag order prohibiting the recipient of an NSL from telling anyone about it, which is what happened to Merrill.

In the mix…government surveillance, HIPAA updates, and user control over online data

Monday, July 19th, 2010

1) The U.S. government comes up with some, um, interesting names for its surveillance programs.  “Perfect Citizen” sounds like it’s right out of Orwell. As the article points out, there are some major unanswered questions.  How do they collect this data?  Where do they get it?  Do they use it just to look for interesting patterns that then lead them to identify specific individuals, or are all the individuals apparent and visible from the get-go?  And what are the regulations around re-use of this data?

2) Health and Human Services has issued proposed updated regulations to HIPAA, the law regulating how personal health information is shared. CDT has made some comments about how these regulations will affect patient privacy, data security, and enforcement. HIPAA, to some extent, lays out some useful standards on things like how electronic health information should be transmitted. But it has also been controversial for suppressing information-sharing, even when it is legal and warranted.

So what if, instead of talking about what we can’t do, we started talking about what we can do with electronic health data? I’m not imagining a list of uses where anything outside of the list is barred, but rather an outline of the kinds of uses that are useful. The whole point of electronic health records is to make information more easily shareable so care is more continuous and comprehensive and research more efficient and effective.

I love this bit from an interview with a neuroscientist who studies dog brains because, “dogs aren’t covered by Hipaa! Their records aren’t confidential!”

3) A start-up called Bynamite is trying to give users control over the information they share with advertisers online. It’s another take on something we’ve seen from Google and BlueKai, where users get to see what interests have been associated with them. Like those services, Bynamite allows you to remove interests that don’t pertain to you or that you don’t want to share. Bynamite then goes further by opting you out of networks that won’t let you make these choices. That definitely sounds easier than managing P3P, and easier than reading through the policies of all the companies that participate in the Network Advertising Initiative.

I agree with Professor Acquisti that all of us, when we use Google or any other free online service, are paying for our use of the service with our personal information, and that Bynamite is trying to make that transaction more explicit.  But I wonder if the value of the data companies have gained is explicit.  Is the price of the transaction fair?  Does 1 hour of free Google search equal x amounts of personal data bits?  Can you even put a dollar value on that transaction, given that the true value of all this data is in aggregate?

The accompanying blog post to this article cites a study demonstrating how hard it is to assign a dollar value to privacy.  The study subjects clearly did value “privacy,” but the price they put on it depended on how much they felt they had any privacy to begin with!

In the mix

Wednesday, May 27th, 2009

Data.gov: Unlocking the Federal Filing Cabinets. (NYT Bits)

On the Anonymity of Home/Work Location Pairs. (Schneier on Security)

Do People Care About Data Correlation?. (Kim Cameron’s Identity Blog)

In the mix

Wednesday, May 20th, 2009

Site Lets Writers Sell Digital Copies. (NY Times)

Linked Data is Blooming: Why You Should Care (ReadWriteWeb)

Mint Considers Selling Anonymized Data From Its Users (ReadWriteWeb)

The Growing Popularity of Popularity Lists (The Numbers Guy/Wall Street Journal)

Tuesday in the Mix

Tuesday, May 12th, 2009

Just Landed: Processing, Twitter, MetaCarta & Hidden Data (blprnt)

Greece Puts Brakes on Street View (BBC)

Developer of AdBlock Plus Proposes a Fairer Approach to Ad Blocking (ReadWriteWeb)

What Does Access to Real World Data Online Make Possible? (ReadWriteWeb)

Monday in the Mix

Monday, May 11th, 2009

Signs Your Wireless Carrier Loves You (NYT)

Calendar as filter (Dilbert.com)

New Search Service Aims at Answering Tough Queries, but Not Taking on Google (NYT)

