Posts Tagged ‘Privacy Policies’

Comments on Richard Thaler “Show Us the Data. (It’s Ours, After All.)” NYT 4/23/11

Tuesday, April 26th, 2011

Richard Thaler, a professor at the University of Chicago, wrote a piece in the New York Times this weekend with an idea that is dear to CDP’s mission: making data available to the individuals it was collected from.

Particularly because the title of the piece suggests that he is saying exactly what we are saying, I wanted to write a few quick comments to clarify how it is different.

1. It’s great that he’s saying loudly and clearly that the payback for data collection should be the data itself – that’s definitely a key point we’re trying to make with CDP, and not enough people realize how valuable that data is to individuals, and more generally, to the public.

2. However, what Professor Thaler is pushing for is more along the lines of “data portability,” an idea we agree with at an ethical and moral level but one that has some real practical limitations when we start talking about implementation. In my experience, data structures change so rapidly that companies are unable to keep up with how their data is evolving month-to-month. I find it hard to imagine that entire industries could coordinate a standard that could hold together for very long without undermining the very qualities that make data-driven services powerful and innovative.

3. I’m also not sure why Professor Thaler says that the Kerry-McCain Commercial Privacy Bill of Rights Act of 2011 doesn’t cover this issue. My reading of the bill is that it’s covered in the general sense of access to your information – Section 202(4) reads:

to provide any individual to whom the personally identifiable information that is covered information [covered information is essentially anything that is tied to your identity] pertains, and which the covered entity or its service provider stores, appropriate and reasonable-

(A) access to such information; and

(B) mechanisms to correct such information to improve the accuracy of such information;

Perhaps he is simply pointing out the lack of any mention of instituting data standards to enable portability, as opposed to simply instituting standards around data transparency.

I have a long post about the bill that is not quite ready to put out there, and it does have a lot of issues, but I didn’t think that was one of them.


Measuring the privacy cost of “free” services.

Wednesday, June 2nd, 2010

There was an interesting pair of pieces on this Sunday’s “On The Media.”

The first was “The Cost of Privacy,” a discussion of Facebook’s new privacy settings, which presumably make it easier for users to clamp down on what’s shared.

A few points that resonated with us:

  1. Privacy is a commodity we all trade for things we want (e.g. celebrity, discounts, free online services).
  2. Going down the path of having us all set privacy controls everywhere we go on the internet is impractical and unsustainable.
  3. If no one is willing to share their data, most of the services we love to get for free would disappear. (Randall Rothenberg)
  4. The services collecting and using data don’t really care about you the individual; they only care about trends and aggregates. (Dr. Paul H. Rubin)

We wish one of the interviewees had gone even further and made the point that, since we all make decisions every day to trade a little bit of privacy in exchange for services, privacy policies really need to be built around notions of buying and paying: what you “buy” are services, and what you pay with are “units” of privacy risk (as in risk of exposure).

  1. Here’s what you get in exchange for letting us collect data about you.
  2. Here’s the privacy cost of what you’re getting (in meaningful and quantifiable terms).

(And no, we don’t believe that deleting data after 6 months and/or listing out all the ways your data will be used is an acceptable proxy for calculating “privacy cost.” Besides, such policies inevitably severely limit the utility of data and stifle innovation to boot.)

Gaining clarity around privacy cost is exactly where we’re headed with the datatrust. What’s going to make our privacy policy stand out is not that our privacy “guarantee” will be 100% ironclad.

We can’t guarantee total anonymity. No one can. Instead, what we’re offering is an actual way to “quantify” privacy risk so that we can track and measure the cost of each use of your data and we can “guarantee” that we will never use more than the amount you agreed to.

This in turn is what will allow us to make some measurable guarantees around the “maximum amount of privacy risk” you will be exposed to by having your data in the datatrust.
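To make the idea of a capped privacy “budget” a little more concrete, here is a minimal Python sketch. It is not the datatrust’s actual mechanism, which this post doesn’t describe. The class name, the abstract “units,” and the per-use costs are all made up for illustration, and working out what a given use of data should really cost is the hard part that this sketch skips entirely.

```python
class PrivacyBudget:
    """Tracks cumulative privacy cost against a cap the user agreed to.

    Hypothetical illustration only: the "units" are abstract and the
    costs passed to charge() are invented numbers.
    """

    def __init__(self, agreed_max_units):
        self.agreed_max_units = agreed_max_units
        self.spent_units = 0
        self.log = []  # (description, cost) for every approved use

    def remaining(self):
        """Units of privacy risk still available under the agreement."""
        return self.agreed_max_units - self.spent_units

    def charge(self, description, cost_units):
        """Approve a use of the data only if it fits in what is left."""
        if cost_units < 0:
            raise ValueError("cost must be non-negative")
        if cost_units > self.remaining():
            return False  # refuse: this use would exceed the agreed cap
        self.spent_units += cost_units
        self.log.append((description, cost_units))
        return True


# A user agrees to at most 100 units of privacy risk.
budget = PrivacyBudget(agreed_max_units=100)
first = budget.charge("count of users by zip code", 40)   # approved
second = budget.charge("average age by city", 50)         # approved
third = budget.charge("income histogram", 30)             # refused, only 10 left
```

The point of the sketch is the bookkeeping: every use is logged with its cost, and a use that would push the total past what you agreed to is simply refused.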

The second segment was on privacy rights and issues of due process vis-a-vis the government and data-mining.

Kevin Bankston from EFF gave a good run-down of how ECPA is laughably ill-equipped to protect individuals using modern-day online services from unprincipled government intrusions.

One point that wasn’t made was that unlike search and seizure of physical property, the privacy impact of data-mining is easily several orders of magnitude greater. Like most things in the digital realm, it’s incredibly easy to sift through hundreds of thousands of user accounts whereas it would be impossibly onerous to search 100,000 homes or read 100,000 paper files.

This is why we disagree with the idea that we should apply old standards created for a physical world to the new realities of the digital one.

Instead, we need to look at actual harm and define new standards around limiting the privacy impact of investigative data-mining.

Again, this would require a quantitative approach to measuring privacy risk.

(Just to be clear, I’m not suggesting that we limit the size of the datasets being mined; that would defeat the purpose of data-mining. Rather, I’m talking about process guidelines for how to go about doing low-(privacy) impact data-mining. More to come on this topic.)

In the mix…Everyone’s obsessed with Facebook

Friday, May 7th, 2010

UPDATE: One more Facebook-related bit, a great graphic by Matt McKeon illustrating how Facebook’s default sharing settings have changed over the past five years. Highly recommend that you click through and watch how the wheel changes.

1) I love when other people agree with me, especially on subjects like Facebook’s continuing clashes with privacy advocates. Says danah boyd,

Facebook started out with a strong promise of privacy…You had to be at a university or some network to sign up. That’s part of how it competed with other social networks, by being the anti-MySpace.

2) EFF has a striking post on the changes made to Facebook’s privacy policy over the last five years.

3) There’s a new app for people who are worried about Facebook having their data, but it means you have to hand it over to this company which also states, it “may use your info to serve up ads that target your interests.” Hmm.

4) Consumer Reports is worried that we’re oversharing, but if we followed all its tips on how to be safe, what would be the point of being on a social network? On its list of things we shouldn’t do:

  • Posting a child’s name in a caption
  • Mentioning being away from home
  • Letting yourself be found by a search engine

What’s the fun of Facebook if you can’t brag about the pina colada you’re drinking on the beach right at that moment? I’m joking, but this list just underscores that we can’t expect to control safety issues solely through consumer choices. Another thing we shouldn’t do is put our full birthdate on display, though given how many people put details about their education, it wouldn’t necessarily be hard to guess which year someone was born. Consumer Reports is clearly focusing on its job, warning consumers, but it’s increasingly obvious privacy is not just a matter of personal responsibility.

5) In a related vein, there’s an interesting Wall St. Journal article on whether the Internet is increasing public humiliation. One WSJ reader, Paul Cooper, had this to say:

The simple rule here is that one should always assume that everything one does will someday be made public. Behave accordingly. Don’t do or say things you don’t want reported or repeated. At least not where anyone can see or hear you doing it. Ask yourself whether you trust the person who wants to take nude pictures of you before you let them take the pictures. It is not society’s job to protect your reputation; it’s your job. If you choose to act like a buffoon, chances are someone is going to notice.

Like I said above, in a world where the word “public” means really really public forever and ever, and “private” means whatever you manage to keep hidden from everyone you know, protecting “privacy” isn’t only a matter of personal responsibility. The Internet easily takes actions that are appropriate in certain contexts and republishes them in other contexts. People change, which is part of the fun of being human. Even if you’re not ashamed of your past, you may not want it following you around in persistent web form.

Perhaps on the bright side, we’ll get to a point where we can all agree everyone has done things that are embarrassing at some point and no one can walk around in self-righteous indignation. We’ve seen norms change elsewhere. When Bill Clinton was running for president, he felt compelled to say that he had smoked marijuana but had never inhaled. When Barack Obama ran for president 16 years later, he could say, “I inhaled–that was the point,” and no one blinked.

6) The draft of a federal online privacy bill has been released. In its comments, Truste notes, “The current draft language positions the traditional privacy policy as the go to standard for ‘notice’ — this is both a good and bad thing.” If nothing else, the “How to Read a Privacy Policy” report we published last year had a similar conclusion, that privacy policies are not going to save us.

Wow, new privacy features!

Friday, December 11th, 2009

Wow, so many companies rolling out new privacy features lately!

Facebook rolled out its new “simplified” privacy settings.  Google introduced Google Dashboard, a central location from which to manage your profile data, which supplements Google Ads Preferences.  And Yahoo released a beta version of the Ad Interest Manager.

Many, many people have reviewed Facebook’s new changes, and pointed out the “bait-and-switch” Facebook has pulled in exchange for some new, and I think better, controls.  I don’t have much more to say about that.

But it’s interesting to me that Google and Yahoo have chosen similar strategies around privacy issues, though with some differences in execution.  Neither company has actually changed its data collection practices, and cynics have argued that they’re both just trying to stave off government regulation.  Still, I think that it makes a difference when companies actually make clear and visible what they are doing with user data.

“Is this everything?”

Both Google and Yahoo indicate in different ways that the user who is looking at Dashboard or Ad Interest Manager is not getting the full data story.

Google’s Dashboard is supposed to be a central place where a user can manage his or her own data.  In and of itself, it’s not that exciting.  As ReadWriteWeb put it, it doesn’t tell you anything you didn’t know before.  It provides links in one place to the privacy settings for various applications, but it focuses on profile information the user provides, which represents only a tiny bit of the personal information Google is tracking.

Google does, however, provide a link to the question, “Is this everything?” that describes some of their browser-based data collection and a link to the Ads Preferences Manager page.  To me, it feels a little shifty, that the Dashboard promises to be a place for you to control “data that is personally associated with you,” but it doesn’t reveal until you scroll to the bottom that this might not be everything.  Others may feel differently, but this to me goes right at the heart of the problem of how “personal information” is defined.  When I go to the Ads Preferences Manager, I see clearly that Google has associated all kinds of interests with me–how is this not “personally associated” with me?  Google states it’s not linking this data to my personal account data which is why they haven’t put it all in one place, which is good, but it seems too convenient a reason to silo that off.

Yahoo’s strategy is a little different.  It may not be fair to compare Yahoo’s Ad Interest Manager to Google’s Dashboard at this point, given that it’s in such a rudimentary phase.  It’s in beta and doesn’t work yet with all browsers.  (As David Courtney points out in PCWorld, being in beta is a pretty sorry excuse for the fact that it doesn’t work with IE8 and Firefox.)  Depending on how much you use Yahoo, you may not see anything about yourself.

Still, I thought it was interesting that Yahoo highlighted some of the hairy parts of its privacy policy in separate boxes high up on the page.  Starting from the top, Yahoo states clearly in separate boxes with bold headings that there are ways in which your data is collected and analyzed that are not addressed in this Ad Interest Manager.  The box for the Network Advertising Initiative is a little weak; it doesn’t really explain what it means that Yahoo is connected to the NAI.  But the box on “other inputs,” shows prominently that even as you manage your settings on this page, there may be other sources of data Yahoo is using to find out more about you.


Yahoo also reveals that the information they’re tracking from you is collected from a wide range of sources, including both Yahoo account information like Mail and non-account websites like its Front Page.  Unlike Google, Yahoo doesn’t ask you to click around to find out that some of “everything” is elsewhere.


Turning “interests” on and off

Google and Yahoo are very similar here.  Google’s Ads Preferences Manager shows which interests have been associated with you, with a clear link for removing them and a button for opting out of tracking altogether.


Yahoo’s Ad Interest Manager has a different design, but the button for opting out altogether is similarly visible.


We’re using cookies!

Compared to the other issues, this is the most obvious difference between Google and Yahoo.

Google has this on its Ads Preferences Manager:


So you can see that some string of numbers and letters has somehow been attached to your computer, but you’re not told what this means in terms of what Google knows about you.

In contrast, Yahoo shows this at the bottom of the Ad Interest Manager:


Yahoo knows I’m a woman!  Between 26 and 35!  The location is actually wrong, as I am in Brooklyn, NY, but I did live in San Francisco 5 years ago when I first signed up for a Yahoo account.  Still, Yahoo is very explicitly showing, and not just telling, that it knows geographical information, age, gender, and the make and operating system of your computer.  I’m impressed—they must know this is going to scare some people.

Does any of this even matter?

I prefer the Yahoo design in many ways — the boxes and verticality of the manager to me are easier to read and understand than the horizontal spareness of the Google design.  But in the end, the design differences between Google and Yahoo’s new privacy tools may not even matter.  I don’t know how many people will actually see either Manager.  You still have to be curious enough about privacy to click on “Privacy Policy,” which takes you to Yahoo! Privacy, at which point, in the top right-hand corner, you see a link to “Opt-out” of interest-based advertising.  The same is true with Google. And neither company has actually changed much about their data collection practices.  They’re just being more open about them.

But I am impressed and heartened that both companies have started to reveal more about what they’re tracking and in ways that are more visually understandable than a long, boring, legalistic privacy policy.  I hope Yahoo is feeling competitive with Google on privacy issues and vice-versa.  I’d love to see a race to the top.

In the mix

Friday, November 6th, 2009

Cuil’s Famous Privacy Policy No Longer Protects Privacy

Google’s Privacy Dashboard Doesn’t Tell Us Anything We Didn’t Know Before (ReadWriteWeb)

In the mix

Tuesday, June 16th, 2009

EFF Launches TOSBack–A “Terms of Service” Tracker for Facebook, Google, eBay, and More.  (EFF)

The “Hidden Cost” of Privacy.  (Schneier on Security)

Google Fusion Tables.  (Official Google Research Blog)

It’s our data. When do we get to use it, too?

Thursday, June 4th, 2009

When we started this survey of privacy policies, our goal was simple: find out what these policies actually say.  But our larger goal was to place the promises companies made about users’ privacy in a larger context—how do these companies view data?  Do they see it as something that wholly belongs to them?  Because ultimately, their attitude towards this data very much shapes their attitude towards user privacy.

In the last couple of years, we’ve seen an unprecedented amount of online data collection that’s happened largely surreptitiously.  We can’t say that we, as users, haven’t gotten something in return.  The “free” services on the internet have been paid for with our personal information.  But the way the information has been collected has prevented us from negotiating with the benefit of full information.  In other words, we haven’t gotten a good deal.  The data we’ve provided is so valuable, we should have struck a harder bargain.

And I think more and more people are starting to feel that way.  Even though most only feel a vague discomfort at this point, it’s unlikely that companies like RealAge will be able to continue what they’ve been doing.

For us at CDP, the fear is that we’ll throw the baby out with the bathwater.  We don’t want to shut down data collection altogether—we just want companies to stop thinking of our data as their data and their data alone.  We want to be able to share in the incredible value that this data has, so that we as a society can all benefit from the data collection and analysis capabilities we’ve developed.  Of course, that’s only possible with stronger privacy protections than are available now, which is why privacy is such an important issue for us to understand.

So what would it look like for us to “share” in the value of data?  It might sound crazy that companies collecting all this data would ever share data with their users, but it’s already happening.

Google, as a company that believes it’s in the business of information rather than advertising, does make some sincere efforts to provide data to the public.  Google Trends may be intended for advertisers, but it also provides the whole world with information on what people are searching for.  Google Flu Trends is a natural outgrowth of that, and some researchers believe this data can be helpful in determining where flu outbreaks are going to occur faster than reporting by clinics.

Some companies, like eBay and Amazon, have built their data collection into the service they provide to their customers.  Some of the information they collect on transactions and ratings can be viewed by all users.  Anyone looking to bid on an item on eBay can see how other buyers have rated that seller.  A user of Amazon looking to buy a new digital camera can view what other buyers considered.

Although Wikipedia is a bit different as a nonprofit, the service it provides also actively incorporates public disclosure of the data collected.  The contributions of any one editor can be seen in aggregate and aggregate stats on website activity are also available to the general public.  This information is important in the self-policing that is essential for Wikipedia to maintain any credibility.

Although the amount of data these companies are sharing with their users and the public is minuscule compared to the amount of data they’ve actually collected from us, it raises the possibility that data collection could happen in a completely different way than it does now.  Companies could make it more obvious that data collection is happening, and instead of scaring users away, give users some reason to participate in the collection of data.  The whole process could be one in which users are openly engaged, rather than one in which users feel hoodwinked.

So this is our goal at CDP: what do we need to do in terms of privacy protection, both in terms of technologies and social norms, to make this model of data collection possible?

And if the terms of the policy change?

Tuesday, June 2nd, 2009

It’s bad enough that most of the “choices” we have in privacy today are either, “Accept our terms or don’t use the service.”  But then the terms can change at any time?

Nearly every privacy policy I looked at had some variation on these words: “Please note that this Privacy Policy may change from time to time.”  If the changes are “material,” which is a legal phrase meaning “actually affects your rights,” then all data that’s collected under the prior terms will remain subject to those terms.  Data that’s collected after the change, though, will be subject to the new terms, and the onus is put on the user to check back and see if the terms have changed.

Most companies, like Google, Yahoo, and Microsoft, promise to make an effort to let you know that material changes have been made, by contacting you or posting the changes prominently.  Some, like New York Times Digital and Facebook, promise that material changes won’t go into effect for six months, giving their users some time to find out.

Recently, Facebook decided to test out the right it had reserved to change the terms of use.  Facebook wanted to amend the terms of its license to the content provided by Facebook members.  Although it wasn’t actually a term in the privacy policy, it implicated users’ privacy rights as it involved personal content they had uploaded to Facebook.  Facebook claimed that its new terms of use didn’t materially change users’ rights but merely clarified what was already happening with data.  For example, if user A decides to send a message to user B, and then A deletes her account, the message A sent to B will not be deleted from B’s account.  The information no longer belongs only to user A.

However, Facebook’s unilateral attempt to change the terms of use provoked such uproar that the changes were withdrawn.  Instead, two new documents were created, Facebook Principles and Statement of Rights and Responsibilities, and users were given the option to discuss and vote on these documents before they went into effect.  Ultimately, the new versions were approved by vote of Facebook members.

Facebook is certainly not a model of privacy protection, but this incident is illuminating.  Legally, Facebook could change its terms without its members’ approval.  But practically, it couldn’t.  There’s been some debate over whether angry users understood the changes and what they meant, but that’s almost irrelevant.  Facebook couldn’t simply dictate the terms of its relationship with its users any more, given that its greatest asset is the content created by its users.

It may seem counterintuitive, but it’s not surprising that some of the most visible and effective consumer efforts to change how a company uses personal information have stemmed from an online service based on voluntary sharing. The more people are given opportunities to participate in how information is shared, the better people can understand what it means for a company to share their information and the more likely they are to feel empowered to shape what happens to their information.  Facebook can’t offer the service that it does without the content generated by its users.  But as it’s begun to realize, its users then have to be a part of decisions about the way that content is used.

We all know privacy policies are frustrating, inadequate, and difficult to understand.  So it’s good to remember that all our privacy battles don’t have to be fought on their terms.

Multiple-choice privacy

Thursday, May 21st, 2009

Everyone agrees that “choice” is crucial for protecting privacy. But what should the choices be?

a) Do not call me, email me, or contact me in any way.
b) Do not let any of your partners/affiliates/anyone else call me, email me, or contact me in any way.
c) Let me access, edit, and delete my account information.
d) Let me access, edit, and delete all information you’ve collected from me, including log data.
e) Track me not.
f) All of the above.
g) None of the above.

Until recently, most tools that internet companies have offered for controlling user information have focused on helping people avoid being contacted, i.e., “marketing preferences.” That’s presumably what we cared about when privacy was all about the telemarketer not calling you at home. Companies have also given users access to their account information, which is in the companies’ own interest, since they would prefer to have updated information on you as well.

But few companies acknowledge that other kinds of information they’ve collected from you, like log data, search history, and what you’ve clicked on, might affect your sense of privacy as well. Since they conveniently choose not to call this kind of information “personal,” they have no privacy-based obligation to give you access to this information or allow you to opt out of it.

Still, in the last year or two, there have been some interesting changes in the way some companies view privacy choices. They’re starting to understand that people not only care about whether the telemarketer calls them during dinner, but also whether that telemarketer already knows what they’re eating for dinner.

Most privacy policies will at least state that the user can choose to turn off cookies, though with the caveat that the action might affect the functionality of the site. AskNetwork developed AskEraser to be a more visible way for users to use its services without being tracked, but as privacy advocates noted, AskEraser requires that a cookie be downloaded, when many people who care about privacy periodically clear their cookies. AskEraser also doesn’t affect data collection by third parties.
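The catch with a cookie-based opt-out is easy to see in a toy model. The Python sketch below is not how AskEraser is actually implemented. The cookie name and the `Browser` class are hypothetical, and the only assumption is the one the privacy advocates pointed to, namely that the site learns about your opt-out solely from a cookie stored in your browser.

```python
class Browser:
    """A toy browser that does nothing but store cookies."""

    def __init__(self):
        self.cookies = {}

    def clear_cookies(self):
        self.cookies.clear()


OPT_OUT_COOKIE = "no_tracking"  # hypothetical cookie name


def opt_out(browser):
    """Record the user's opt-out the only place the site can see it."""
    browser.cookies[OPT_OUT_COOKIE] = "1"


def site_may_track(browser):
    """The site treats the user as opted out only if the cookie is present."""
    return browser.cookies.get(OPT_OUT_COOKIE) != "1"


browser = Browser()
before = site_may_track(browser)   # no cookie yet, so tracking is on
opt_out(browser)
during = site_may_track(browser)   # opt-out cookie present, tracking is off
browser.clear_cookies()            # the privacy-minded habit
after = site_may_track(browser)    # opt-out is gone, tracking is back on
```

In other words, the very habit that privacy-conscious users are most likely to have is the one that silently undoes their choice.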

More interestingly, Google recently announced some new tools for their targeted advertising program for people concerned about being tracked. These tools include a plug-in for people who don’t want to be tracked that will persist even when cookies are cleared and a way for users to know what interests have been associated with them. Google’s new Ad Preferences page also allows people to control what interests are associated with them and not just turn off tracking altogether.

Neither tool is perfect but they’re still fascinating. The more users are able to see what companies know about them, the better they can understand what kind of information is being collected as they use the internet. And Google seems to recognize that people’s concerns about privacy can’t be assuaged just through an on-off switch, that we want more fine-tuned controls instead.

The big concern for me, though, is whether Google or any other company that wants to be innovative about privacy is actually interested in fundamentally changing the way data is collected. Google’s targeted advertising program can afford to lose the data they would have tracked from privacy geeks, and still rely on getting as much information as possible from others, most of whom have no idea what is happening.

Data retention: are we missing the point?

Tuesday, May 12th, 2009

Data retention has been a controversial issue for many years, with American companies not measuring up to the European Union’s more stringent requirements.  But for us at CDP, it obscures what’s really at stake and often confuses consumers.

For many privacy advocates, limiting the amount of time data is stored reduces the risk of it being exposed.  The theory, presumably, is that sensitive data is like toxic waste, and the less we have of it lying around, the better off we are.  But that theory, as appealing as it is, doesn’t address the fact that our new abilities to collect and store data are incredibly valuable, not just to major corporations, but to policymakers, researchers, and even the average citizen.  It doesn’t seem like focusing on this issue of data retention has necessarily led to better privacy protections.  In fact, it may be distracting us from developing better solutions.

For example, Google and Yahoo in the past year announced major changes to their policies about data retention, promising to retain data for 9 months and 6 months, respectively.  These promises, however, were not promises to delete data, but to “anonymize” it. As discussed previously, neither company defines precisely what that verb means.  According to the Electronic Frontier Foundation, Yahoo is still retaining 24 of the 32 bits of users’ IP addresses.   As the Executive Director of Electronic Privacy Information Center (EPIC) stated, “That is not provably anonymous.” Yet most mainstream media headlines focused only on Yahoo’s claim of shorter data retention.  The article in which the above quote appeared sported the headline: “Yahoo Limits Retention of Personal Data.”
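To see why keeping most of an IP address falls short of anonymity, here is a short Python sketch. An IPv4 address is 32 bits long, so keeping 24 of them amounts to dropping the last of its four numbers. That reading is an assumption, since the post doesn’t describe the exact method Yahoo uses, and the function names and the sample address (taken from a range reserved for documentation) are made up for illustration.

```python
import ipaddress


def truncate_ipv4(address, keep_bits=24):
    """Zero out the low-order bits of an IPv4 address, keeping `keep_bits`."""
    network = ipaddress.ip_network(f"{address}/{keep_bits}", strict=False)
    return str(network.network_address)


def candidates_remaining(keep_bits=24):
    """How many distinct addresses share the same truncated prefix."""
    return 2 ** (32 - keep_bits)


truncated = truncate_ipv4("203.0.113.77")   # "203.0.113.0"
crowd_size = candidates_remaining()         # 256
```

Out of roughly four billion possible addresses, the truncated record still narrows a user down to one of 256, which is a small crowd to hide in once it’s combined with anything else in the logs.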

At the same time, despite all the controversy around data retention, this issue isn’t even addressed in the privacy policies of Google, Yahoo, or Microsoft.  Google addressed this issue in a separate FAQs section, while Yahoo addressed it in a press release and its blog.  Microsoft in December 2008 said that they would cut their data retention time from 18 months to six if their major competitors did the same.  But this information was not in the privacy policy itself.  Among the other companies I looked at, Wikipedia, Ask Network, Craigslist, and WebMD did at least provide some information, if not comprehensive answers, on how long certain types of data are retained in their policies.  No information could be found readily on the sites of eBay, AOL, Amazon, New York Times Digital, Facebook, and Apple.

This might be due to the fact that data retention remains a somewhat obscure issue to most internet users.  But it’s also true that for many of these sites, much of the data that’s collected is part of the service.  As an eBay buyer or seller, it’s useful to see how others have been rated.  On Amazon, it’s helpful to know what others have considered as they shop for a particular product.  At the same time, my buying and viewing history on Amazon could easily reveal as much, if not more, that I want to keep private as my surfing history on Google.  So why does most of the focus on data retention seem to be on ISPs and search engines?

When I look at a search engine like Ixquick, which is trying to build a reputation for privacy by not storing any information, I’m even less convinced that deleting all the data is a sustainable solution.  Ixquick is a metasearch engine, meaning that it’s pulling results from other search engines.  It’s not a solution to replace Google or Yahoo for everyone.  It feels more like a handy tool for someone who is concerned about his or her privacy, than a model that other search engines could end up following.  If data deletion by all search engines is the goal, the example to hold up can’t be a search engine that relies on other non-deleting search engines.

What exactly do we want to keep private?  At the same time, what information do we want to have?  What is the best way to balance these interests?  These are the questions we should be asking, not “How long is Yahoo going to keep my data?”
