Posts Tagged ‘IAPP’

@IAPP Privacy Foo Camp 2010: What Is Anonymous Enough?

Tuesday, October 26th, 2010

Editor’s Note: Becky Pezely is an independent contractor for Shan Gao Ma, a consulting company started by Alex Selkirk, President of the Board of the Common Data Project.  Becky’s work, like Tony’s, touches on many of the privacy challenges that CDP hopes to address with the datatrust.  We’re happy to have her guest blogging about IAPP Academy 2010 here.

Several weeks ago we attended the IAPP Privacy Academy 2010 in Baltimore, Maryland.

In addition to some engaging high-profile keynotes – including FTC Bureau of Consumer Protection Director David Vladeck – we got to participate in the first-ever IAPP Foo Camp.

The Foo Camp comprised four discussion topics aimed at covering the top technology concerns facing a wide range of privacy professionals.

The session we ran was titled “Low Impact Data Mining.”  The intention was to discuss, and better understand, the current challenges of managing data within an organization – all with a lens on managing data in a way that is “low impact” on resources while returning “high (positive) impact” for the business.

The individuals in our group represented a vast array of industries, including financial services, insurance, pharmaceutical, law enforcement, online marketing, health care, retail and telecommunications.  It was fascinating that, even across such a wide range of industries, there could be such a pervasive set of privacy challenges common to them all.

Starting with:

What is “anonymous enough”?

If all you need is gender, zip code and birthdate to re-identify someone, then what data, when released, is truly “anonymous enough”?  Can a baseline be defined, and enforced, within an organization that ensures customer protection?
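The point behind that triple is that a surprisingly large share of a population is uniquely identifiable by gender, zip code and birthdate alone. As a rough illustration of how a gatekeeper might quantify that risk before a release, here is a minimal Python sketch (the records and field names are hypothetical) that counts how many rows are unique on those quasi-identifiers:

```python
from collections import Counter

# Hypothetical "sanitized" records: direct identifiers removed,
# but gender, zip code, and birthdate left in.
records = [
    {"gender": "F", "zip": "21201", "birthdate": "1975-03-14"},
    {"gender": "M", "zip": "21201", "birthdate": "1980-07-02"},
    {"gender": "F", "zip": "21201", "birthdate": "1975-03-14"},
    {"gender": "F", "zip": "21230", "birthdate": "1962-11-30"},
]

def reidentification_risk(rows, quasi_ids=("gender", "zip", "birthdate")):
    """Fraction of rows whose quasi-identifier combination is unique,
    i.e. rows an attacker holding an outside dataset could single out."""
    combos = Counter(tuple(r[q] for q in quasi_ids) for r in rows)
    unique = sum(1 for r in rows
                 if combos[tuple(r[q] for q in quasi_ids)] == 1)
    return unique / len(rows)

print(reidentification_risk(records))  # 0.5: two of the four rows are unique
```

On real data the same computation is the starting point for k-anonymity: any quasi-identifier combination that appears fewer than k times marks records at risk.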

It feels safe to say that this was the root challenge from which all the others stemmed.  Today the release of data is mostly controlled, and subsequently managed, by a trusted person or persons.  These individuals are responsible for “sanitizing” the data that gets released, internally or externally, and they manage releases for everything from understanding business performance to fulfilling business obligations with partners.  Their primary concern is knowing how well they are protecting their customers’ information, not only from the perspective of company policy, but also from the perspective of personal morals.  They are the gatekeepers who assess the level of protection provided based on which data they release to whom, and they want some guarantee that what they release is “anonymous enough”: a definition, or measurement, of anonymity that delivers the level of protection they want to achieve for their customers.

This challenge compounds for these individuals, and their organizations, when adding in various other truths of the nature of data today:

The silos are getting joined.

The convention used to be that data within an organization lived in a silo – all on its own and protected – such that anyone looking at the data would only see that set of data.  Now it is becoming the reality that these data sets are getting joined, and it’s not always known where, when, how, or with whom the join originated.  Nor is it known where the joined data set is currently stored, since it was modified from its original silo.  Soon that joined data set takes on a life of its own and makes its way around the institution.  Given the likelihood of this occurring, how can the gatekeepers of the data assess the level of protection provided to customers with any kind of reliable measurement that guarantees the right level of anonymity?

And now there’s data in the public market.

Not only is the data joined with data (from other silos) within the organization, but also with data outside the organization sold on the public market.  This prospect has increased organizations’ ability to produce data that is “high impact” for the business – because they now know WAY MORE about their customers.  But does the benefit outweigh the liability?  As the ability to know more about individual customers increases, so does the level of sensitivity and the concern for privacy.  How do organizations successfully navigate mounting privacy concerns as they move from silos, to joined silos, to joined silos combined with public data?

The line between “data analytics” and looking at “raw data” is blurring.

Because the data is richer, and more plentiful, the act of data analysis isn’t as benign as it might once have been.  The definition of “data analytics” has evolved from something high-level (knowing, for example, how many new customers are using the service this quarter) to something that looks a lot more like examining raw data to target specific parts of the business to specific customers (to, for example, sell <these products> to customers who make <this much money>, are females ages 30 – 35, live in <this neighborhood> and typically spend <this much> on <these types of products>, etc.).
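That shift is easy to see in code. Below is a hedged sketch, with a hypothetical customer table and illustrative field names, contrasting a classic aggregate query with a “targeting” query that is analytics in name but raw-data access in practice, because it returns identifiable individual records rather than a statistic:

```python
# Hypothetical customer table; field names and values are illustrative only.
customers = [
    {"id": 1, "gender": "F", "age": 32, "income": 85000,
     "neighborhood": "Fells Point", "signup_q": "2010Q3"},
    {"id": 2, "gender": "M", "age": 41, "income": 60000,
     "neighborhood": "Canton", "signup_q": "2010Q3"},
    {"id": 3, "gender": "F", "age": 34, "income": 90000,
     "neighborhood": "Fells Point", "signup_q": "2010Q2"},
]

# Old-style "analytics": a coarse aggregate that reveals little about anyone.
new_this_quarter = sum(1 for c in customers if c["signup_q"] == "2010Q3")

# New-style "analytics": a filter so specific it is effectively raw data --
# the result is a list of individual people, not a statistic.
target_segment = [
    c for c in customers
    if c["gender"] == "F" and 30 <= c["age"] <= 35
    and c["income"] > 80000 and c["neighborhood"] == "Fells Point"
]

print(new_this_quarter)                   # 2
print([c["id"] for c in target_segment])  # [1, 3]
```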

And the data has different ways of exiting the system.

The truth is, as scary as this data can be, everyone wants to get their hands on it, because the data leads to awareness that is meaningful and valuable for the business.  Thus, the data is shared everywhere – inside and outside the organization.  With that comes a whole set of challenges when considering all the ways data might be exiting any given “silo”: Where is all the data going?  How is it getting modified (joined, sanitized, rejoined), and at which point is it no longer data the organization needs to protect?  How much data needs to be released externally to fulfill partner/customer business obligations?  Once the data has exited, can the organization’s privacy practices still be enforced?

Brand affects privacy policy.  Privacy policy affects brand.

Privacy is a concern of the whole business, not just the resources that manage the data, nor solely the resources that manage liability.  In the event of a “big oopsie” where there is a data/privacy breach, it is the communication with customers before, during and after the incident that determines the internal and external impact on the brand and the perception of the organization.  And that communication is dictated both by what the privacy policy enforces and by what the brand “allows”.  In today’s age of data, how can an organization have an open dialog with customers about their data if the brand does not support having that kind of conversation?  No surprise that Facebook is the exemplary case here: Facebook continues to pave a new path, and draw customers, to share and disclose more information about themselves.  As a result, it has experienced backlash from customers when it takes things too far.  The line of communication is very open – customers have a clear way to lash back when Facebook has gone too far, and Facebook has a way of visibly standing behind its decision or admitting its mistake.  Either way, it is now commonplace for Facebook’s customers to expect that there will be more incidents like this and that Facebook has a way (apparently suitable enough to keep most customers) of dealing with them.  Facebook’s “policy” allowed it to respond this way, and that response has become a part of who Facebook is; the policy, in turn, evolves to support this behavior moving forward.

In the discussion of data and privacy, it seems inherently obvious that the mountain of challenges we face is large, complicated and impacts the core of all our businesses.  Nonetheless, it is still fascinating to have been able to witness first-hand – and to now be able to specifically articulate – how similar the challenges are across a diverse group of businesses and how similar the concerns are across job functions.

We want to thank again everyone from IAPP who joined in on the discussions we had at Foo Camp and throughout the conference.  We look forward to an opportunity to dive deep into these types of problems.

Post Script: Meanwhile, the challenges, and related questions, around the anonymization of data with some kind of measurable privacy guarantee that came up at Foo Camp are ones we have been discussing on our blog for quite some time.  These are precisely the sorts of challenges that motivated us to create a datatrust.  While we typically envision the datatrust being used in scenarios where there isn’t direct access to data, we walked away from our discussions at IAPP Foo Camp with specific examples where direct access to the data is required – particularly to fulfill business obligations – as a type of collateral (or currency).

The concept of data as the new currency of today’s economy has emerged.  Not only did it come up at the IAPP Foo Camp, it also came up back in August, when we heard Marc Davis talk about it at IPP 2010.  With all of this in mind, it is interesting to evaluate the possibility of the datatrust acting as a special type of data broker in these exchanges.  The idea is that the datatrust is a sanctioned data broker (sanctioned by the industry, or possibly by the government) that inherently meets federal, local and municipal regulations and protects the consumers of business partners who want to exchange data as “currency,” while alleviating businesses and their partners from the headaches of managing data use/reuse.  The “tax” on using the service is that these aggregates are stored and made available for the public to query in the way we imagine (no direct access to the data) for policy-making and research.  This is something that feels compelling to us and will influence our thinking as we continue to move forward with our work.

Common Data Project at IAPP Privacy Academy 2010

Monday, September 13th, 2010

We will be giving a Lightning Talk on “Low-Impact Data-Mining” and running two breakout sessions at the IT Privacy Foo Camp – Preconference Session, Wednesday Sept 29.

Below is a preview of our slides and handout for the conference. Unlike our previous presentations, we won’t be talking about CDP and the Datatrust at all. Instead, we’ll focus on how SGM helps companies minimize the privacy impact of their data-mining.

More specifically, we’ll be stepping through the symbiotic documentation system we’ve created between the product development/data science folks collecting and making use of the data and the privacy/legal folks trying to regulate and monitor compliance with privacy policies. We will be using the SGM Data Dictionary as a case study in the breakout sessions.

Still, we expect that many of the issues we’ve been grappling with from the datatrust perspective (e.g. public perception, trust, ownership of data, meaningful privacy guarantees) will come up, as they are universal issues central to any meaningful discussion about privacy today.


What is data science?

An introduction to data-mining from O’Reilly Radar that provides a good explanation of how data-mining is distinct from previous uses of data and provides plenty of examples of how data-mining is changing products and services today.

The “Anonymous” Promise and De-identification

  1. How you can be re-identified: Zip code + Birth date + Gender = Identity
  2. Promising new technologies for anonymization: Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization by Paul Ohm.

Differential Privacy: A Programmatic Way to Enforce Your Privacy Guarantee?

  1. A Microsoft Research Implementation: PINQ
  2. CDP’s write-up about PINQ.
  3. A deeper look at how differential privacy’s mathematical guarantee might translate into layman’s terms.
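PINQ itself is a C# LINQ wrapper, but the core mechanism behind differential privacy’s guarantee fits in a few lines. The following Python sketch (not PINQ’s actual API, and the dataset is hypothetical) applies the standard Laplace mechanism to a counting query, whose sensitivity is 1 because adding or removing one person changes the count by at most 1:

```python
import math
import random

def laplace_noise(scale):
    """Sample Laplace(0, scale) noise via inverse-CDF sampling."""
    u = random.random() - 0.5          # uniform on [-0.5, 0.5)
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1.0 - 2.0 * abs(u))

def noisy_count(rows, predicate, epsilon):
    """Differentially private count: the true count plus Laplace noise of
    scale 1/epsilon, since a counting query has sensitivity 1 (one person
    added or removed changes the answer by at most 1)."""
    true_count = sum(1 for r in rows if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Example: count customers under 40 in a toy dataset, with epsilon = 1.0.
ages = [23, 35, 41, 52, 38, 29, 67]
print(noisy_count(ages, lambda a: a < 40, epsilon=1.0))
```

Smaller values of epsilon add more noise and give a stronger guarantee; the true count is never revealed exactly, which is what lets the guarantee hold even when the answers are combined with outside data.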

Paradigms of Data Ownership: Individuals vs Companies

  1. Markets and Privacy by Kenneth C. Laudon
  2. Privacy as Property by Lawrence Lessig
  3. CDP explores the advantages and challenges of a “Creative Commons-style” model for licensing personal information.
  4. CDP’s Guide to How to Read a Privacy Policy

“How to Read a Privacy Policy” published in IAPP newsletter

Tuesday, October 20th, 2009


We’re pleased to announce that our report, “How to Read a Privacy Policy,” has been published in the October newsletter of Inside 1to1: Privacy, a publication produced by the International Association of Privacy Professionals (IAPP) and the Peppers & Rogers Group.

Our report, first published on our website in July, provides a “how to read” guide for the user who is curious about what is happening to his or her data online, but has little understanding of the technological and legal mechanisms at work.  The report walks through seven questions meant to pinpoint the issues CDP believes are most crucial for a user’s privacy, from questions on how “personal information” is defined to the kind of choices offered to users regarding how their information is shared.

We’d love to hear what you think!

What does it take to be an IAPP-certified privacy professional? What should it take?

Wednesday, September 9th, 2009


UPDATE: I recently was referred to this thoughtful blog post on a similar topic, “Nurturing an Accountable Privacy Profession.” Well-worth a read.

A few weeks ago, I was very relieved to find out I had passed the IAPP exam to be a “Certified Information Privacy Professional” or CIPP.  I got this certificate and even a pin, which is more than I ever got for passing the bar exams of New York and California.

So what exactly did I need to know to become a CIPP?

To be certified in corporate privacy law, you’re expected to know what’s covered in the CIPP Body of Knowledge, primarily major U.S. privacy laws and regulations and “the legal requirements for the responsible transfer of sensitive personal data to/from the United States, the European Union and other jurisdictions.”

You’re also expected to pass the Certification Foundation, required for all three certifications offered by IAPP.  That covers basic privacy law, both in the U.S. and abroad, information security principles and practices, and “online privacy,” which includes an overview of the technologies used by online companies to collect information and the particular issues to be considered in this context.

So what do you think?  Should you be able to pass an all-objective, 180-question, three-hour exam (counting the CIPP and Certification Foundation exams together) on the above topics and then call yourself a “privacy professional”?

There are no sample questions available online, and I was too cheap to take a prep course, but if I remember correctly, a typical question on the exam went something like this:

The Gramm-Leach-Bliley Act authorizes financial institutions to share consumer information with third parties if:

a. The information is not personally identifiable.

b. The consumer is informed and given the opportunity to opt-out.

c.  Any information without notice if it is shared with affiliated companies.

d.  All of the above.

The answer would be “C,” since the consumer is only required to be given notice if the third party is “non-affiliated.”  My sample is poorly constructed, and there are also questions that require you to analyze a fact pattern, but essentially, the exam covers existing laws, practices, and technologies.

It doesn’t ever ask you, “What would you do if you were advising RealAge and they told you they wanted to sell answers from a health questionnaire to pharmaceutical companies?”  Or, “Is Facebook doing enough to prevent third parties from misusing images of Facebook members in their ads?”

IAPP presumably doesn’t ask you these questions because there’s no “objectively” right answer.  There may, one day, be an objectively legal answer, depending on if and when legislation gets passed.  Still, it’s obvious that in the field of privacy, the most interesting aspects are not what laws do exist, but what laws should exist, what practices should be used, what innovations, both technological and social, should be promoted to protect privacy in meaningful ways.  But the exam only covers what is, not what could be or what should be.

Privacy may be an ancient concept, but it’s a very modern, very new, very undefined profession, which perhaps is even more reason for the IAPP to exist.  We as a society, particularly in the U.S., are struggling to figure out what privacy means and what we need to do to protect it.  While the medical profession has the Hippocratic Oath dating back to the 4th century B.C., and the legal profession’s adherence to the concept of attorney-client privilege goes back at least as far as the 16th century, the privacy profession has no clear guiding principle.  We don’t know yet what it should be.

I’m not really criticizing the IAPP for having a test that doesn’t quite encompass the dynamic, constantly changing field of privacy.  It’s not like other professions do better.  The bar exam certainly doesn’t screen out incompetent, unethical people from practicing law, even if you are actually required to pass an ethics exam.  And the IAPP does provide resources to its members for tracking changes in privacy law and policy.  But I’m curious to see where the IAPP goes as it tries to “professionalize” the profession, whether the certification exam will change and what expectations will be set for IAPP-certified privacy professionals.  Perhaps in another 100 years, or hopefully sooner, we’ll have a code of conduct for privacy professionals.
