Posts Tagged ‘Information’

What does it take to be an IAPP-certified privacy professional? What should it take?

Wednesday, September 9th, 2009


UPDATE: I was recently referred to this thoughtful blog post on a similar topic, “Nurturing an Accountable Privacy Profession.” Well worth a read.

A few weeks ago, I was very relieved to find out I had passed the IAPP exam to be a “Certified Information Privacy Professional” or CIPP.  I got this certificate and even a pin, which is more than I ever got for passing the bar exams of New York and California.

So what exactly did I need to know to become a CIPP?

To be certified in corporate privacy law, you’re expected to know what’s covered in the CIPP Body of Knowledge, primarily major U.S. privacy laws and regulations and “the legal requirements for the responsible transfer of sensitive personal data to/from the United States, the European Union and other jurisdictions.”

You’re also expected to pass the Certification Foundation, required for all three certifications offered by IAPP.  That covers basic privacy law, both in the U.S. and abroad, information security principles and practices, and “online privacy,” which includes an overview of the technologies used by online companies to collect information and the particular issues to be considered in this context.

So what do you think?  Should you be able to pass an all-objective, 180 question, three-hour exam (counting the CIPP and Certification Foundation exams together) on the above topics and be able to call yourself a “privacy professional”?

There are no sample questions available online, and I was too cheap to take a prep course, but if I remember correctly, a typical question on the exam went something like this:

The Gramm-Leach-Bliley Act authorizes financial institutions to share consumer information with third parties if:

a. The information is not personally identifiable.

b. The consumer is informed and given the opportunity to opt-out.

c.  Any information without notice if it is shared with affiliated companies.

d.  All of the above.

The answer would be “C,” since the consumer is only required to be given notice if the third party is “non-affiliated.”  My sample is poorly constructed, and there are also questions that require you to analyze a fact pattern, but essentially, the exam covers existing laws, practices, and technologies.

It doesn’t ever ask you, “What would you do if you were advising RealAge and they told you they wanted to sell answers from a health questionnaire to pharmaceutical companies?”  Or, “Is Facebook doing enough to prevent third parties from misusing images of Facebook members in their ads?”

IAPP presumably doesn’t ask you these questions because there’s no “objectively” right answer.  There may, one day, be an objectively legal answer, depending on if and when legislation gets passed.  Still, it’s obvious that in the field of privacy, the most interesting aspects are not what laws do exist, but what laws should exist, what practices should be used, what innovations, both technological and social, should be promoted to protect privacy in meaningful ways.  But the exam only covers what is, not what could be or what should be.

Privacy may be an ancient concept, but it’s a very modern, very new, very undefined profession, which perhaps is even more reason for the IAPP to exist.  We as a society, particularly in the U.S., are struggling to figure out what privacy means and what we need to do to protect it.  While the medical profession has the Hippocratic Oath dating back to the 4th century B.C., and the legal profession’s adherence to the concept of attorney-client privilege goes back at least as far as the 16th century, the privacy profession has no clear guiding principle.  We don’t know yet what it should be.

I’m not really criticizing the IAPP for having a test that doesn’t quite encompass the dynamic, constantly changing field of privacy.  It’s not like other professions do better.  The bar exam certainly doesn’t screen out incompetent, unethical people from practicing law, even if you are actually required to pass an ethics exam.  And the IAPP does provide resources to its members for tracking changes in privacy law and policy.  But I’m curious to see where the IAPP goes as it tries to “professionalize” the profession, whether the certification exam will change and what expectations will be set for IAPP-certified privacy professionals.  Perhaps in another 100 years, or hopefully sooner, we’ll have a code of conduct for privacy professionals.

Data-Driven Healthcare

Monday, May 11th, 2009

The super-nerds in team Obama are turning their attention to the health care system. The leader in this drive, somewhat counterintuitively, is thought to be budget director Peter Orszag.

What makes Orszag’s vision so interesting is that it is utterly data-driven. In a time of negative growth and yawning deficits, Orszag believes it is possible to extend coverage to everyone and lower costs over all. The New Yorker’s Ryan Lizza has a great profile of the man & his ambitions:

Orszag is convinced that rising federal health-care costs are the most important cause of long-term deficits. As a fellow at the Brookings Institution, he became obsessed with the findings of a research team at Dartmouth showing that some regions of the country spend far more money on health care than others but that patients in those high-spending areas don’t have better outcomes than those in regions that spend less money. If spending more on health care has no correlation with making people healthier, then there must be enormous savings that a smart government, by determining precisely which medical procedures are worth financing and which are not, could wring out of the system. “I spent several months in very intense study,” Orszag told me. “The reason that I wanted to go to C.B.O. was I thought that was one of the key bodies that could really delve into what we could do about it.”

Interestingly, when the Times’ David Leonhardt sat down with President Obama recently, Obama struck a similar note:

“if it turns out that doctors in Florida are spending 25 percent more on treating their patients as doctors in Minnesota, and the doctors in Minnesota are getting outcomes that are just as good — then us going down to Florida and pointing out that this is how folks in Minnesota are doing it and they seem to be getting pretty good outcomes, and are there particular reasons why you’re doing what you’re doing? — I think that conversation will ultimately yield some significant savings and some significant benefits.”

But the President goes a step further, bringing up the example of his grandmother, who had hip replacement surgery last summer, and died a few months later. Expensive procedures like these, which frequently occur at the end of life, are some of the biggest drivers of healthcare costs.

NYT: So how do you — how do we deal with it?

THE PRESIDENT: Well, I think that there is going to have to be a conversation that is guided by doctors, scientists, ethicists. And then there is going to have to be a very difficult democratic conversation that takes place. It is very difficult to imagine the country making those decisions just through the normal political channels. And that’s part of why you have to have some independent group that can give you guidance. It’s not determinative, but I think has to be able to give you some guidance. And that’s part of what I suspect you’ll see emerging out of the various health care conversations that are taking place on the Hill right now.

Scary stuff, medical bills. The idea that better information can make health care cheaper is enticing but untested. We do know, however, that much of the cutting edge of medicine is super-expensive, precisely because it’s so advanced.

UPDATE: It seems like the healthcare industry is sufficiently frightened of having cost-cutting imposed on them to make some preemptive announcements that they’ll cut costs themselves.  Hmm…

Google Maps: good or evil?

Tuesday, April 7th, 2009

I love that these two news items posted on the same day last week:

The Natural Resources Defense Council and the National Audubon Society launch a new tool for environmentalists and green energy developers based on Google Earth, marking clearly which lands are available for solar and wind farm development.

Angry mob in England attacks Google Street View car.

What Would Diderot Do?

Monday, March 9th, 2009

Book historian Robert Darnton has read and summarized the lengthy draft settlement between Google and book publishers for the New York Review of Books.

Please allow me to now further simplify by summarizing Darnton’s analysis:

> The Enlightenment represented the dawn of a new age of learning, built on the free-ish exchange of ideas in letters and books.

> The enlightened founders of the United States limited copyright to 28 years, recognizing the necessity of both protecting authors’ rights and advancing public knowledge. Life expectancy was much shorter then, but a young author could have a reasonable expectation of his or her book losing copyright within their lifetime.

> The 1998 Sonny Bono Copyright Term Extension Act extends copyright to the life of the author plus seventy years. That means the books now entering the public domain date roughly to the 1920s, and all the authors are dead.

> Google has been digitizing millions of books. Some of them are in the public domain, some are still copyrighted, and the largest portion are copyrighted but out of print, and therefore largely out of reach.

> A draft settlement between Google and publishers promises to bring the texts of these books to the people, at low cost (at your home computer) or no cost (at public and university libraries which purchase a license). This archive could quickly become the world’s largest library, bar none.

> This exciting archive could represent a Digital Republic of Learning that would have made Diderot (the author of the first encyclopedia) salivate.

> While there have been some similar efforts by not-for-profit groups like the Open Content Alliance, Google Books will eat their lunch.

> The draft agreement between Google and publishers has problems: libraries would be limited to a single computer terminal with access to the archive, and users would have to pay to print copyrighted material.

> The biggest problem, however, is this:

“What will happen if Google favors profitability over access? Nothing, if I read the terms of the settlement correctly. Only the registry, acting for the copyright holders, has the power to force a change in the subscription prices charged by Google, and there is no reason to expect the registry to object if the prices are too high.”

It’s interesting to consider this scenario. In the short life of Google, most criticism has come from a smallish cadre of geeks. Under different management, could the company ever do anything to make your mom mad?

Useful information on healthcare–where is it?

Wednesday, February 11th, 2009

I really couldn’t say it any better.

Why is it so hard for us to get concrete information on healthcare and our options?  As David Wessel describes so well in his (2004!) Wall Street Journal piece, he and his wife faced a bewildering array of options and terminology as they tried to choose between the two plans offered by their employers.  And as he acknowledges, they were actually lucky in that at least they knew they were going to have health insurance.  For those who aren’t being offered any plan by their employer or are unemployed, the lack of information is even starker.

Over four years later, little has changed.  There are worksheets like this one for people looking to buy an individual plan, where you can plug in your zip code and some basic information and compare different health plans, though you still need to figure out what the difference is between “co-insurance” and “deductible.”  If you click on “co-insurance,” the site helpfully tries to give you a definition, but ultimately needs to provide this disclaimer: “Please note, however, that definitions of certain terms may vary across insurance companies.”

I also found a forum or two where people write in questions, but the answer very often seems to be, “It depends.”  Or if you’re not feeling so polite, “Ask your health insurance carrier. We haven’t read your policy.”  And this in a forum called “Health Insurance and HMO Plans”!  (Being a lawyer, I know this is often the correct answer, but it only goes to show how complicated it’s gotten.)

So much depends on which state you live in, as each state has its own rules and regulations regarding healthcare, so it was particularly depressing to find this.  Although many other consumer and advocacy sites had linked to this series of state-by-state guides, run by the Georgetown University Health Policy Institute, the site had lost its funding in September 2008.

Of course, this doesn’t mean there’s no information on healthcare out there.  There are a gazillion blogs, focusing on everything from the pharmaceutical industry to nurses.  There are numerous research institutes.  But the ones below are the sites I found interesting and useful as we began to think through what we wanted for our CDP healthcare pilot project.

Consumer Information

Online Data on Healthcare Coverage and Issues

Organizing Sites, with Places to Share Personal Stories

Please let me know if there are other sites you have found useful or interesting, as I’d love to add to this list.

Numbers are only as useful as the questions you ask of them

Friday, May 9th, 2008

Errol Morris recently made a point about filmmaking that expresses precisely how I feel about interpreting data:

“There is no mode of expression, no technique of production that will instantly produce truth or falsehood.”

Data, like film or photography, is a representation of “the real world.” We study it in the hopes of finding enlightenment and understanding. However, there is no “technique of [data representation] that will instantly produce truth or falsehood.” If someone is disillusioned with data, it is often because they expect too much from it. Data can’t “tell us” anything. It can only take something that is hard to grasp and offer up otherwise submerged surface area for examination, inquiry and analysis.

An important part of analyzing data is doubt and questioning, yet most data reported by the mainstream media is doubt-resistant. Some magic number is reported to the public with no real data to pick apart and study. Where do these magic numbers come from?

I wrote a while back about Yale Professor Don Green saying casually that he never believes tidy data. Beware of dumbed down data! When presented with data that conveniently boils down to “one number” that can explain it all, raise an eyebrow and dig deeper.

Responsible reporting of new data findings should probe and challenge the data. Where did the data come from? How reliable are these sources? What data is missing? How might what’s missing change the results? (This is the hardest to pull off because it requires us to imagine what we don’t know.) How is the way the data is presented inadvertently influencing how it will be interpreted? What assumptions will each person bring to their interpretation of this data? Are they valid? Who’s in a position to make that judgment? Without at least asking these questions, eye-catching “magic number” headlines are a disservice to the public, designed to catch eyeballs with false clarity rather than expose the confusing uncertainty of reality.

More often than not, analyzing one data set simply propels you to collect and question more data. Now that I have this data, what other data do I need? Now that I have answers to these questions, what other questions do I now know I need to ask?

This is not to say that data never yields answers. Generally speaking, however, every hard-won answer simply opens the door to 5 more questions you couldn’t have imagined at the outset.

Yet another data breach

Thursday, March 20th, 2008

A major grocery chain, Hannaford, recently announced that due to a security breach, up to four million credit cards may be vulnerable to access by criminals. So add another to the list of 2008 security breaches, and it’s only March.

As Flowing Data points out, when you look at a timeline of big data breaches from Attrition.org, data breaches have occurred with more frequency, not less, the closer we get to the present. Yet data breaches seem to be getting less coverage than they used to. When I looked at the full list of breaches catalogued by Attrition.org, I saw some that I’d heard of and many I hadn’t. And with this recent breach, I haven’t seen as much coverage as I would have expected. Plenty of specialized blog reactions and local news coverage, but not much national attention. Are people just getting used to this? Or is it that they think they have no alternatives?

CDTF’s Presentation at the Workshop on Data Privacy

Friday, February 22nd, 2008

The Common Datatrust Foundation recently attended and made a short presentation at the Workshop on Data Privacy, hosted by Rutgers University’s Center for Discrete Mathematics & Theoretical Computer Science (DIMACS).

There were spirited conversations across disciplines as statisticians, mathematicians, computer scientists, and media experts discussed how to balance the public’s interest in both privacy and information sharing. The presentations ranged from tutorials on new security and privacy technology to the management of existing databases of personal information, such as the U.S. Census, as well as thought-provoking presentations on more abstract but highly relevant questions, such as what we mean when we say we want to protect “privacy.” As Professor Helen Nissenbaum from NYU Law School pointed out, certain kinds of information flow are appropriate for certain situations; there is no uniform way to understand privacy protection.

We were excited to see how our presentation provoked questions and conversations as well. Alex Selkirk introduced the concept of a “datatrust,” a secure, structured data storage system where each record in each dataset has a set of rules defining who may use it, what it may be used for, and with what level of anonymity it may be disclosed. The presentation focused primarily on one example of the current limits of data disclosure: the subprime mortgage crisis. Although there is a great deal of data held by banks and mortgage companies on subprime loans, investigators and researchers are unable to analyze the data because the data holders are bound by confidentiality agreements to individual borrowers. CDTF proposed that a datatrust, as a third party, could use new technology to anonymize and aggregate the data in a way that would allow researchers to query the loan data without forcing the disclosure of identifying details about the borrowers. Such data-sharing would further CDTF’s mission to both protect individual privacy and encourage the sharing of information for the public good.
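To make the idea of per-record rules a little more concrete, here is a minimal sketch in Python. It is only an illustration, not CDTF's design: the names `Rule`, `Record`, and `average_rate`, the sample loans, and the use of a minimum group size as a stand-in for an "anonymity level" are all assumptions made for this example. Real anonymization technology would be considerably more involved.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Rule:
    """Hypothetical per-record rule set."""
    allowed_users: frozenset      # who may use the record
    allowed_purposes: frozenset   # what it may be used for
    min_group_size: int           # smallest aggregate the record may appear in


@dataclass(frozen=True)
class Record:
    borrower: str         # identifying detail, never returned by a query
    region: str
    interest_rate: float
    rule: Rule


def average_rate(records, user, purpose, region):
    """Return the mean interest rate for a region, or None if the rules forbid it."""
    permitted = [
        r for r in records
        if r.region == region
        and user in r.rule.allowed_users
        and purpose in r.rule.allowed_purposes
    ]
    if not permitted:
        return None
    # Every contributing record must accept the size of the aggregate.
    if len(permitted) < max(r.rule.min_group_size for r in permitted):
        return None
    return mean([r.interest_rate for r in permitted])


# Made-up sample data for illustration only.
research_only = Rule(frozenset({"researcher"}), frozenset({"research"}), 3)
loans = [
    Record("Borrower A", "north", 7.0, research_only),
    Record("Borrower B", "north", 8.0, research_only),
    Record("Borrower C", "north", 9.0, research_only),
    Record("Borrower D", "south", 10.0, research_only),
]

print(average_rate(loans, "researcher", "research", "north"))   # 8.0
print(average_rate(loans, "researcher", "research", "south"))   # None: group too small
print(average_rate(loans, "marketer", "marketing", "north"))    # None: not permitted
```

The point of the sketch is that the query only ever returns an aggregate, and each record carries its own conditions for taking part in that aggregate.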

We hope that the conversation we began at DIMACS will continue to engage conference participants and others in the coming months.

What exactly is Google up to?

Wednesday, February 6th, 2008

Even as Google has become the most coveted place to work, to the extent that even their cafeteria gets media coverage, it’s also getting increasingly negative attention as a potentially sinister force. The New Yorker recently published an article with rather vague speculation about the way Google might take over the world. Now, we hear that Microsoft is trying to buy Yahoo so they can together fight Google. (Isn’t it funny that Microsoft is seeing another company as the big, bad world-dominator?) More and more, people are starting to wonder, “What exactly is Google up to?”

But given that we can’t read the minds of Sergey Brin and Larry Page, perhaps what we should be looking at is the conflict-of-interest inherent in Google’s business model. Google’s stated mission as a company is to organize the world’s information and make it universally accessible and useful. But are Google’s customers really the individuals searching for information, or are they the advertisers who actually increase Google’s revenues and stock value? To be fair, Google makes a respectable effort to separate advertising from “legitimate,” as in “non-jerry-rigged” search results. But after ten years, the Google search experience is pretty much the same as it’s always been. Has Google been working really hard on tools to help people find better information faster, or has it been working really hard on tools to help advertisers better target potential customers?

Google doesn’t have to be evil to be troubling. It may have started out with the purest of intentions, but it’s hampered itself with the conflict-of-interest at the heart of its operations. Law professor Tim Wu, as quoted in the New Yorker, said it straight, “I predict that Google will end up at war with itself.”

Where that “study” you quoted came from: Remember that call you got during dinner?

Tuesday, May 29th, 2007

Over the last few months I’ve been to a number of interesting talks at the Stanford Methods of Analysis Program in the Social Sciences (MAPSS) colloquium. Two types of speakers have caught my attention: those who work closely with the logistics and mechanics of data collection, and those who try to use survey data to test their hypotheses.

Most recently I got to hear Linda Piekarski of Survey Sampling International on SSI’s efforts to address changes in the telephone system, as well as their recent forays into internet surveys. (I didn’t realize how perfect the original design for the U.S. phone system was for tele-survey companies.)

Also memorable was Yale Professor Don Green’s talk about measuring the effectiveness of political campaign advertising. One of my favorite lines (though I’m paraphrasing) was that “Any time you see a clean, clear graph of data, there’s something wrong. Data ‘noise’ is what reality looks like.”

What follows is a summary of the challenges facing the collection of data about individuals, derived in part from these talks.

Today, there are three main ways of collecting data from individuals, each of which contains flaws that seriously undermine the quality of the data collected.

  1. Pay them a tiny reward, lure them with a sweepstakes or nag them at dinner with a phone call from a stranger. For example, online stores may offer a coupon or rebate for your feedback on your buying experience.
  2. Make it easy for individuals to inadvertently or unthinkingly consent to data being collected about them, and/or subsequently change the substance of what is collected, or the uses for that data. One prominent example is Amazon.com’s site registration process, which makes no attempt to highlight their third-party data-sharing practices.
  3. Leverage data collected for some other purpose – so-called “Secondary Use”. For example, addresses collected for fulfillment (shipping) may be used for geographically targeted marketing messages.

These mechanisms have a set of critical flaws:

  1. Tiny rewards and nagging phone calls are an insufficient value proposition for many individuals, thus the pool of participants is unlikely to be well distributed across the target population. Instead it will favor those individuals for whom the reward remains attractive, however small; or those individuals for whom the cost of participation (time) is small enough to make the reward adequate. (Mechanism 1)
  2. Rewards or compensation that are distributed without regard to accuracy provide no incentive for careful or genuinely accurate self-reporting. (Mechanism 1)
  3. These practices cultivate a public perception of a mesh of “big brother” networks collecting an ever-expanding set of data, beyond the control of any one individual. Privacy outrage still surfaces in mainstream media occasionally, but the general public is increasingly numb to incremental discoveries of the erosion of personal privacy. While anesthesia may appear temporarily attractive to data collectors, it also disengages individuals from the data collection goals, which decreases participation and discourages accurate self-reporting. For example, when you are pressured to answer a survey at a department store or after check-out at a web retailer, do you react with an earnest attempt to supply them with the information they need? (All mechanisms)
  4. In an effort to push back against ever more invasive data collection, privacy legislation and legal liability have forced data to be “silo-ed” and “anonymized” as much as possible. That means that unless you are a part of a larger survey panel, each subsequent survey you complete or data you consent to have collected will be stored separately from your other data. This eliminates the possibility of data-accuracy maintenance by individuals, and makes longitudinal analysis increasingly difficult. (All mechanisms)
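The first flaw in the list above is easy to see with a little arithmetic. The sketch below is a rough illustration with invented numbers (the groups, response rates, and satisfaction scores are all hypothetical, as are the function names): when the people who find a tiny reward worthwhile differ from everyone else, the survey's expected result drifts away from the true population average.

```python
def population_mean(groups):
    """True average across the whole population.

    groups: list of (population_share, response_rate, true_value) tuples,
    where the population shares are assumed to sum to 1.
    """
    return sum(share * value for share, _, value in groups)


def expected_survey_mean(groups):
    """Expected average among respondents when each group responds at its own rate."""
    responders = sum(share * rate for share, rate, _ in groups)
    weighted = sum(share * rate * value for share, rate, value in groups)
    return weighted / responders


# Hypothetical satisfaction scores on a 1-10 scale.
groups = [
    (0.8, 0.05, 7.0),  # busy majority: a small coupon rarely tempts them
    (0.2, 0.50, 4.0),  # minority with time to spare: far more likely to respond
]

print(round(population_mean(groups), 2))       # 6.4
print(round(expected_survey_mean(groups), 2))  # 4.86
```

With these made-up figures the survey would report roughly 4.9 when the population as a whole sits at 6.4, purely because of who chose to answer.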
