Posts Tagged ‘In The News’

Should Pharma have access to doctors’ prescription records?

Tuesday, April 26th, 2011

Maine, New Hampshire and Vermont want to pass laws to prevent pharmacies from selling prescription data to drug companies, who in turn use it for “targeted marketing to doctors” or “tailoring their products to better meet the needs of health practitioners” (depending on who you talk to).

This gets at the heart of the issue of imbalance between private and public sectors when it comes to access to sensitive information.

From our perspective, it doesn’t seem like a good idea to limit data usage. If the drug companies are smart, they’re also using the same data to figure out things like what drugs are being prescribed in combination and how that affects the effectiveness of their products.

Instead, we should be thinking of ways to expand access so that for every drug company buying data for marketing and product development, there is an active community of researchers, public advocates and policymakers who have low-cost or free access to the same data.

Comments on Richard Thaler “Show Us the Data. (It’s Ours, After All.)” NYT 4/23/11

Tuesday, April 26th, 2011

Richard Thaler, a professor at the University of Chicago, wrote a piece in the New York Times this weekend with an idea that is dear to CDP’s mission: making data available to the individuals it was collected from.

Particularly because the title of the piece suggests that he is saying exactly what we are saying, I wanted to write a few quick comments to clarify how his proposal differs from ours.

1. It’s great that he’s saying loudly and clearly that the payback for data collection should be the data itself – that’s definitely a key point we’re trying to make with CDP, and not enough people realize how valuable that data is to individuals, and more generally, to the public.

2. However, what Professor Thaler is pushing for is more along the lines of “data portability,” an idea we agree with at an ethical and moral level but one that has some real practical limitations once we start talking about implementation. In my experience, data structures change so rapidly that companies are unable to keep up with how their own data is evolving month-to-month. I find it hard to imagine that entire industries could coordinate a standard that could hold together for very long without undermining the very qualities that make data-driven services powerful and innovative.

3. I’m also not sure why Professor Thaler says that the Kerry-McCain Commercial Privacy Bill of Rights Act of 2011 doesn’t cover this issue. My reading of the bill is that it’s covered in the general sense of access to your information – Section 202(4) reads:

to provide any individual to whom the personally identifiable information that is covered information [covered information is essentially anything that is tied to your identity] pertains, and which the covered entity or its service provider stores, appropriate and reasonable-

(A) access to such information; and

(B) mechanisms to correct such information to improve the accuracy of such information;

Perhaps he is simply pointing out the lack of any mention of instituting data standards to enable portability, as opposed to standards around data transparency.

I have a long post about the bill that is not quite ready to put out there, and it does have a lot of issues, but I didn’t think that was one of them.


Response to: “A New Internet Privacy Law?” (New York Times – Opinion, March 18)

Wednesday, April 6th, 2011

There has been scant detailed coverage of the current discussions in Congress around an online privacy bill. The Wall Street Journal has published several pieces on it in their “What They Know” section but I’ve had a hard time finding anything that actually details the substance of the proposed legislation. There are mentions of Internet Explorer 9’s Tracking Protection Lists, and Firefox’s “Do Not Track” functionality, but little else.

Not surprisingly, we’re generally feeling like legislators are barking up the wrong tree by pushing to limit rather than expand legitimate uses of data in hard-to-enforce ways (e.g. “Do Not Track,” data deletion) without actually providing standards and guidance where government regulation could be truly useful and effective (e.g. providing a technical definition of “anonymous” for the industry and standardizing “privacy risk” accounting methods).

Last but not least, we’re dismayed that no one seems to be worried about the lack of public access to all this data.

We sent the following letter to the editor of the New York Times on March 23, 2011, in response to the first appearance of the issue in their pages: an opinion piece titled “A New Internet Privacy Law?” published on March 18, 2011.


While it is heartening to see Washington finally paying attention to online privacy, the new regulations appear to miss the point.

What’s needed is more data, more creative re-uses of data and more public access to data.

Instead, current proposals are headed in the direction of unenforceable regulations that hope to limit data collection and use.

So, what *should* regulators care about?

1. Much valuable data analysis can and should be done without identifying individuals. However, there is as yet, no widely accepted technical definition of “anonymous.” As a result, data is bought, sold and shared with “third-parties” with wildly varying degrees of privacy protection. Regulation can help standardize anonymization techniques which would create a freer, safer market for data-sharing.

2. The data stockpiles being amassed in the private sector have enormous value to the public, yet we have little to no access to it. Lawmakers should explore ways to encourage or require companies to donate data to the public.

The future will be about making better decisions with data, and the public is losing out.

Alex Selkirk
The Common Data Project – Working towards a public trust of sensitive data


Terms of Use

Monday, February 23rd, 2009

A social networking site should know better.

On February 6th, Facebook made a teensy change to the language in its terms of use, reclassifying all user content posted to the site (photos, inane observations about one’s friends and oneself) as the perpetual property of Facebook Inc.

Last weekend, the Consumerist blog (recently purchased from Gawker by Consumers Union) pointed out that if you value your own identity in any way, these user terms are unacceptable, even if you don’t have a sex change operation or join the Amish and cancel your Facebook membership. You just might not want your pretty mug popping up in an ad for, say, stereo headphones, because your profile pic happens to look like a great stock photo.

By Wednesday, the uproar had grown so big that FB was forced to backtrack, and reverted to its earlier terms of use. But the story doesn’t end there. FB is now soliciting input from users on a new set of rules through (what else?) a “bill of rights and responsibilities.” (Drafting a subject-specific “bill of rights” is also the US Congress’ favorite remedy for righteous indignation, dontcha know…)

The flap should give some comfort to privacy advocates, and not only because FB backtracked. We learned that users a) are patient enough to read the fine print, and b) care what’s in it.

But there’s something to fret about too: if FB had, say, 1 million users instead of 175 million, would this issue ever have exploded the way it did? Not likely. And most of us have already signed off on dozens of other terms of use that never received the same kind of scrutiny.

P.S. 8 in Brooklyn Heights gets an F. Really?!

Friday, September 19th, 2008

Is this a case of “question the data when it tells you something you don’t want to hear”?

Or is there something genuinely broken about NYC’s grading system? How could a school in a neighborhood where the median home sale price for June-August of this year was $2.775 million receive a failing grade? For that matter, what is an F going to do to the median home sales price? (But that’s neither here nor there.)

For what it’s worth, I think the grading system sounds reasonably well thought out: 5% is based on attendance and 10% on parent surveys (which actually gives P.S. 8’s parents an opportunity to influence the grade by expressing their satisfaction with the school’s various extra-curricular activities).

The lion’s share of the grade is based on year-over-year improvement (not for a particular grade, as in this year’s 4th graders versus last year’s 4th graders, but for the same class of students: this year’s 4th graders versus last year’s 3rd graders). It’s a subtle but important difference.

In effect, what’s being measured is year-over-year improvement of the students, not the teachers. The remaining 25% is based on median scores (but calculated against other schools and schools with similar demographics).
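To make the weighting concrete, here is a minimal sketch in Python. The 60% weight for progress is my own inference, obtained by subtracting the stated weights (5% + 10% + 25%) from 100%. The component scores, the 0-100 scale and the function name are hypothetical. This third-hand account does not give the city’s actual formula or letter-grade cutoffs, so the sketch stops at a weighted score.

```python
# Weights as described in this post. "progress" is inferred as the remainder
# (100% - 5% - 10% - 25%); it is not a figure quoted in the article.
WEIGHTS = {
    "attendance": 0.05,
    "parent_surveys": 0.10,
    "performance": 0.25,  # median scores relative to other schools
    "progress": 0.60,     # year-over-year improvement of the same students
}


def overall_score(components):
    """Combine component scores (assumed to be on a 0-100 scale) into one weighted score."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Hypothetical school: strong attendance and parent surveys, weak progress.
example = {
    "attendance": 95,
    "parent_surveys": 90,
    "performance": 70,
    "progress": 30,
}

print(overall_score(example))
```

Under these assumed numbers, a school that does well on everything except progress still ends up with less than half the available points, which is one way a well-regarded school could land a very low grade.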

Still, F seems a bit harsh. And the consequences of receiving an F sound harsh too.

Mr. Klein has said that schools that receive a D or an F two years in a row could be closed or the principal could be removed.

So, is NYC really better off without P.S. 8 than with it?

Perhaps there is something not quite right about the algorithm. Unless I’m completely misunderstanding this third-hand account of how the algorithm works, recent demographic changes in the P.S. 8 district may be the reason why the school received an F, as opposed to a D or C. (P.S. 8 received a C in 2007.)

A quarter of the students now qualify for free lunch, compared with 98 percent in 2002, and more than half the students are white or Asian-American, up from 11 percent in 2002. Most of these changes are happening among the youngest children, before tests begin in the third grade.

In short, P.S. 8 is now competing against some of the most privileged NYC schools. However, P.S. 8’s privileged (aka white) kids are, for the most part, still too young to be tested. As a result, the burden of competing with the city’s other well-off schools is carried mostly by P.S. 8’s less privileged middle-schoolers.

However, this rather significant oversight in the city’s algorithm is not what P.S. 8’s parents (at least the ones who were quoted in the NYT article) are putting forth as evidence of the grading system’s inadequacy.

Several P.S. 8 parents suggested that the F said more about the grading system than the school. They cited events like the annual read-a-thon fund-raiser, an art program that culminated in student work’s being showcased at the Guggenheim, and the school’s recent selection as Brooklyn’s Rising Star Public Elementary School for 2008 by Manhattan Media, a publisher of weekly newspapers.

But, what does the Guggenheim have to do with anything? Not much, if you buy into the city’s reasoning for what the grading system is trying to measure.

What we want with progress reports is to measure what schools add to kids, not what kids bring to the schools, Mr. Liebman said.

So what’s the issue here? A fundamental disagreement about what should be graded? Dissatisfaction with how it’s graded? Sour grapes over the results of the grades?

It’s hard to imagine that anyone would have made a peep about the city’s algorithm if P.S. 8 had received an A. At the end of the day, we all lend too much credence to data that tells us what we want to hear and too readily discount data that surprises us with bad news. It’s a modern-day variant of “shooting the messenger.” Although in the case of P.S. 8 it seems like the messenger did kinda munge up the message a bit, the damage certainly wasn’t as bad as some would like to believe.

Stand up and be counted!

Thursday, September 11th, 2008

Just to quote a different news source:

John McCain’s choice of Alaska Governor Sarah Palin as his running mate is sitting very well with a lot of American voters, according to the latest FOX News poll.

The new survey also shows that—among all four candidates running—Palin (at 33 percent) is seen as most likely to understand “the problems of everyday life”—barely outpacing Barack Obama (32 percent), and finishing significantly ahead of both McCain (17 percent) and Joe Biden (10 percent).

Among independent voters, Palin’s lead over Obama on this score widens to 13 points (35 percent to 22 percent).

Fox also provides a link to the “raw data”. (Don’t get too excited.)

Apparently, raw data means a list of the questions asked, broken down by party (Republican, Democrat and Independent) over time.

Raw data doesn’t include: Who are these people being polled? How did they get so lucky as to be selected to represent the viewpoints of their fellow voting Americans?

I’m not going to get into the validity of political polling here – we have explored poll validity at some length before. (e.g. Polls are conducted entirely through land lines, excluding the 15% of Americans that only have a cellphone number.) Frankly I don’t know enough to say anything useful about them, other than to express the usual lay person skepticism about such things.

Instead, I think a far more interesting question highlighted by the recent flurry of political polling is how such surveys, or any data collection effort for that matter, affect the people who choose not to participate or are never selected to participate. Most of us breathe a sigh of relief when we dodge a telemarketer who wants to spend “a quick 10 minutes” with us on the phone to answer a “simple survey”. But what are we passing up by not participating when so much decision-making is data-driven?

A missed questionnaire about cleaning products is nothing to fret about. But what about declining to hand over personal medical data to the doctors and nurses (not to mention hospitals, researchers and insurance companies) we depend on to treat our ailments and prevent future problems?

To cite yet another poll to back up my point about polls 😉

“According to a recent poll, one in six adults (17%) – representing 38 million persons – say they withhold information from their health providers due to worries about how the medical data might be disclosed.

Persons who report that they are in fair or poor health and racial and ethnic minorities report even higher levels of concern about the privacy of their personal medical records and are more likely than average to practice privacy-protective behaviors.”

Harris Interactive Poll #27, March 2007.

More info on the poll here.

Case in point: Back in April, I wrote about PatientsLikeMe, a site that provides a wonderful new service allowing individual users to share the most intimate details of their medical conditions and treatments, which in turn creates a pool of invaluable information that is publicly available. However, I also wonder how their data may be skewed because their users are limited to the pool of people who are comfortable sharing their HIV status and publicly charting their daily bowel movements. The question we have for PatientsLikeMe is: Who isn’t being represented in your data set? And how does that affect the relevance of your data to the average person who comes to your site looking for information? Who won’t find your data helpful because it’s not relevant to their personal situation?

Increasingly, companies, agencies at all levels of government, researchers who advise policy-makers and even individuals are making “data-driven” decisions.

Yet, how often do we dismiss a study by scoffing at the limited range of its participants?

So what do we do? Tell everyone privacy is soo 20th century? That the new millennium is all about self-exposure?

How we resurrect the notion of privacy in a world that can no longer depend on a closed door to protect us against invasions is a question we must find answers to. However, the solution is not to keep people from sharing personal data. Doing so means giving up our place at the discussion table when it comes to influencing decisions ranging from the mundane, such as “How reliable does cellphone reception need to be?”, to life-or-death questions such as “How many ambulances does my local hospital need?” or “What combination of therapies will work best for my condition?”

To state my case more strongly: Participating as a data point in data-driven research is a passive form of voting, the most basic of rights in a functioning democracy.

To be sure, this is an idealistic take on data collection. There remains the much thornier issue of how to ensure that data is used for mutual benefit, not for monitoring or predatory manipulation. (Factory workers being watched by union bosses at the voting booth faced similar challenges.)

However, to repeat our favorite mantra: The enemy isn’t the data!

Proposed legislation that gives people access to their own data

Tuesday, March 25th, 2008

I totally missed this.

Even though I thought this New York Times article on data collection wasn’t very informative, a New York legislator was sufficiently moved to propose this legislation.

Michael Zimmer has some interesting comments about the proposed bill and some of its weaknesses, as well as a strength:

“The bill is strongest, however, in relation to a demand I have long made on Web search providers: let me see the data you have collected about my actions. The bill states:

17. Business entities shall provide consumers with reasonable access to personally identifiable information and other information that is associated with personally identifiable information retained by the third party entity for online preference marketing uses.

The press seems to have missed the importance of this section. If passed, the law would require Google, Facebook, DoubleClick, etc to provide me access to the personally identifiable information ‘and other information that is associated’ with my user account stored in their databases.

This is a vital right for consumers to be able to protect their data privacy: having access to view your data is the first step towards regaining some control over the collection of the data in the first place.”

The challenge–and it’s a worthy one–is how this information could be provided to us in a way that makes it useful and relevant. I’d like to see a law providing access to my own data that is more meaningful to me than HIPAA feels when I’m faced with a sheaf of waivers to sign at my doctor’s office.

A nonprofit wants to share its mailing list with some economists–would that bother you?

Thursday, March 13th, 2008

There’s a fascinating article in the New York Times Sunday Magazine on a study of what makes people donate, conducted by an interesting liberal-conservative pair of economists, Dean Karlan and John List. They wanted to do an empirical study of fundraising strategies, to find out what kind of solicitations are the most successful. As the article points out, lab experiments of economic choices aren’t particularly realistic: “If you put a college sophomore in a room, gave her $20 to spend and presented her with a series of pitches from hypothetical charities, she might behave very differently than when sitting on her sofa sorting through letters from actual organizations.”

So Karlan and List found an opportunity for a field experiment, a partnership with an actual, unnamed nonprofit that allowed them to try different solicitation strategies and map the outcomes. They wrote solicitation letters that were similar, except some didn’t mention a matching gift, some mentioned a 1-to-1 match, some a 2-to-1, and some a 3-to-1. In the end, mentioning a matching gift increased the likelihood of a donation, but the size of the match made no difference. As the author, David Leonhardt, notes, their findings and the findings of other economists in this area are significant to many people, from the nonprofits trying to be better fundraisers to economists studying human behavior, even to those who want to make tax policy more effective and efficient.

The article, however, didn’t mention whether the donors to the nonprofit had consented to their responses being shared with anyone other than the nonprofit. I’m not that concerned about whether donors’ privacy may have been egregiously violated. (I’m also not sure what’s required of nonprofits in this area.) I’m just curious to know, if they had been given the choice, would they have agreed to their information being shared with the economists? Obviously, the study wouldn’t have worked if potential donors had been told they would be sent different solicitation letters to measure their responses, but I think if most people on a nonprofit’s mailing list were asked if they would explicitly allow their information to be used in academic studies, they would consent. They might want assurances that their individual identities would be protected, so that no one would know Mr. So-And-So had given zero dollars to a cause he publicly champions. But they might very well be willing to help the nonprofit figure out how to be more effective and be a part of an academic study that could shape public policy. They might even be curious to know how their giving compares to that of other donors in their income brackets or geographic areas.

Most people, myself included, have a knee-jerk antipathy to having their personal information shared with anybody other than the organization or company they give it to. But maybe we would feel differently if we were actually given some choices, if our personal identities could be protected, if sharing information could lead to more than just targeted advertising or more junk mail.

Ohmigod, companies are tracking what we look at online!

Monday, March 10th, 2008

Breaking news from the New York Times.

What’s truly interesting about this article, though, isn’t that the New York Times is announcing as “news” something that’s been going on for a very long time. Rather, the New York Times, even while devoting space on its front page, doesn’t really seem to have a point.

The article tries to distinguish itself from the vague alarms raised by privacy advocates by offering data: the results of a study done with comScore measuring “data collection events,” each time “consumer data was zapped back to the Web companies’ servers.” (Even though the New York Times has produced some of the prettiest data graphics in recent memory, this one looks like something created in Excel and conveys little more than a flurry of numbers.) But the overwhelming impression left by the article is that companies are trying to target advertising, and some might do it better than others, rather than that extensive personal information is being collected. So it isn’t surprising that several of the comments on the Bits blog post are from readers saying that they never click on ads, or that these companies are stupid to send them ads for things they’re not interested in, or that they’ve blocked pop-up ads in their browser.

After all, the article mentions only briefly what kind of information is being collected: “the person’s zip code, a search for anything from vacation information, or a purchase of prescription drugs or other intimate items.” The article cites Jules Polonetsky, chief privacy officer for AOL, “[who] cautions that not all the data at every company is used together. Much of it is stored separate,” yet the author doesn’t explain the significance of that statement. The article doesn’t mention that even if consumer data is stripped of “identifiers” like a user name, individual identification could happen easily through the combination of datasets.
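To illustrate how combining datasets can undo that kind of “anonymization,” here is a minimal sketch in Python. Everything in it is hypothetical: the records, the names, and the choice of zip code, birth year and gender as the linking fields. (The article mentions zip codes; the other two fields are my own assumption for the sake of the example.) It does not describe how any company named here actually stores or joins its data.

```python
# Hypothetical "anonymized" browsing log: user names stripped,
# but other fields (quasi-identifiers) kept.
browsing_log = [
    {"zip": "11201", "birth_year": 1975, "gender": "F", "search": "allergy medication"},
    {"zip": "11201", "birth_year": 1982, "gender": "M", "search": "vacation deals"},
    {"zip": "10027", "birth_year": 1975, "gender": "F", "search": "mortgage rates"},
]

# Hypothetical second dataset that does carry names (say, a customer list).
customer_list = [
    {"name": "Alice Example", "zip": "11201", "birth_year": 1975, "gender": "F"},
    {"name": "Bob Example", "zip": "11201", "birth_year": 1982, "gender": "M"},
    {"name": "Carol Example", "zip": "10027", "birth_year": 1990, "gender": "F"},
]

# Fields assumed to appear in both datasets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def key(record):
    """Build a lookup key from the fields shared by both datasets."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(anonymous_rows, named_rows):
    """Attach a name to each anonymous row that matches exactly one named row."""
    names_by_key = {}
    for row in named_rows:
        names_by_key.setdefault(key(row), []).append(row["name"])

    linked = []
    for row in anonymous_rows:
        names = names_by_key.get(key(row), [])
        if len(names) == 1:  # a unique match means the row is re-identified
            linked.append((names[0], row["search"]))
    return linked


reidentified = link(browsing_log, customer_list)
for name, search in reidentified:
    print(name, "searched for", search)
```

In this toy data, two of the three “anonymous” rows link back to a name even though neither dataset alone connects a name to a search. That is why storing data separately, or stripping user names, is not by itself a privacy guarantee.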

I would love to see an article by a mainstream publication that addresses this issue in a truly comprehensive and thoughtful way. What’s missing in the conversation started by this article is not only a fuller analysis of how personal information is being collected and what dangers there are for individual privacy, but also a nuanced discussion of that information’s value and what it means for “a handful of big players” to hold most of it. The article ends citing a study of California adults, 85% of whom thought sites should not be allowed to track their behavior around the Web to show them ads. But does that statistic really capture what’s at stake?

P.S. Is AOL’s innocent penguin happy or merely surprised that anchovy ads are being sent to him?

Facebook: The Only Hotel California?

Thursday, February 14th, 2008

As the subject of recent splashy news on privacy and personal data collection, Facebook is starting to seem a little scary. In the words of one former user, Nipon Das, “It’s like the Hotel California. You can check out anytime you like, but you can never leave.” We’ve heard how difficult it is to remove yourself from Facebook.

We’ve seen how Facebook initially chose to launch Beacon, an advertising tool that told your friends about your activities on other websites, such as a purchase on eBay, without an easy opt-out mechanism, until outrage and a petition forced Facebook to change its policy.

Facebook employees are even poking around private user profiles for personal entertainment.

But although Facebook is at the forefront of a new kind of marketing, it’s not the only company with discomforting privacy policies and terms of use. Facebook’s statement that its terms are subject to change at any time is standard boilerplate. Its disclosure that it may share your information with third parties to provide you service is also pretty standard. After all, it’s certified by TRUSTe, the leading privacy certifier for online businesses. In fact, Facebook is arguably more explicit than most companies about what it’s doing because by its very nature, it’s more obvious that users’ personal information is being collected.

You could argue that the users do have a choice. They could choose not to use Facebook. But how did it turn out that in the big world of the internet, we have only two choices: 1) provide your personal information on the company’s terms; or 2) don’t use the service?

So far, it’s not clear that the controversy around Facebook has led to increased public concern about other companies and their personal data collection. It doesn’t even seem to have spilled over to all the programs that run on Facebook’s platform. No one seems perturbed that the creator of some random new application for feeding virtual fish now has access to his or her profile.

But there clearly is growing public unease, an increasing sense that our Google searches or our online purchases may be available to people we don’t know and can’t trust. Perhaps Facebook will end up providing an invaluable public service, albeit inadvertently, in making more people wonder, “What exactly did I agree to?”
