What does a privacy guarantee mean to you? Harm v. Obscurity

December 18th, 2009 by Mimi Yin

Left: Senator Joseph McCarthy. Right: the band Kajagoogoo.

At the Symposium in November we spent quite a bit of time trying to wrap our collective brain around the PINQ privacy technology, what it actually guarantees and how it does so.

I’ve attempted to condense our several hours of discussion into a series of blog posts.

We began our discussion with the question: What does a privacy guarantee mean to you?

There was a range of answers to this question. They all boiled down to one of the following two:

  1. Nothing bad will come of this information I give you: It won’t be used against me (discrimination, fraud investigations, psychological warfare). It won’t be used to harass me (spam).
  2. The absence or presence of a single record cannot be discerned.

Let’s just say definition 1 is the “layperson’s” definition, which is more focused on the consequences of giving up personal data.

And definition 2 is the “technologist’s” definition, which is more focused on the mechanism for actually fulfilling the layperson’s guarantee in a meaningful, calculable way.
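For the technically inclined: definition 2 is, in essence, differential privacy, the formalism PINQ is built on. A randomized query mechanism M satisfies it if, for any two data sets D and D′ that differ in a single record, and for any set of possible answers S:

    Pr[M(D) ∈ S] ≤ e^ε × Pr[M(D′) ∈ S]

The smaller the privacy parameter ε, the less any one record can shift the odds of any answer you might get back.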

Q. What does PINQ guarantee?

Some context: PINQ is a layer of code that sits between a data set and anyone trying to ask questions of that data, and it guarantees privacy, in a measurable way, to the individuals represented in the data.
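As a rough sketch of the idea (PINQ itself is a C#/LINQ library; the Python below is a hypothetical toy of my own, not PINQ’s actual API), a gatekeeper of this kind never hands out raw records, only noisy aggregates, and it tracks a “privacy budget” so that analysts can’t wash the noise out by asking the same question over and over:

    import random

    class PrivateDataset:
        """A toy PINQ-style gatekeeper: analysts see noisy aggregates,
        never raw records, and every query spends privacy budget."""

        def __init__(self, records, budget=1.0):
            self._records = records   # the raw data stays hidden in here
            self._budget = budget     # total privacy budget (epsilon)

        def noisy_count(self, predicate, epsilon=0.1):
            """Count the matching records, plus Laplace(1/epsilon) noise."""
            if epsilon > self._budget:
                raise RuntimeError("privacy budget exhausted")
            self._budget -= epsilon
            true_count = sum(1 for r in self._records if predicate(r))
            # A single record moves the true count by at most 1, so
            # Laplace noise of scale 1/epsilon drowns out any one person.
            noise = random.expovariate(epsilon) - random.expovariate(epsilon)
            return true_count + noise

An analyst can ask, say, how many people in the data set have arthritis, but can never inspect the records themselves, and each question draws down the budget.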

The privacy PINQ guarantees is broader than the layperson’s understanding of privacy. Not only does PINQ guard against re-identification, targeting and, in short, any kind of harm resulting from exposing your data, it prevents any and all things in the universe from changing as a direct result of your individual data contribution.

Sounds like cosmic wizardry. Not really; it’s simply a clever bit of armchair thinking.

If you want to guarantee that nothing in the world will change as a result of someone contributing their data to a data set, then you simply need to make sure that no one asking questions of that data set will get answers that are discernibly affected by the presence or absence of any one person.

Therefore, if you define a privacy guarantee as “the absence or presence of a single record cannot be discerned,” meaning the inclusion of your data in a data set will have no discernible impact on the answers people get out of that data set, you also end up guaranteeing that nothing bad can ever happen to you if you contribute your data. In fact, absolutely nothing (good or bad) will happen to anyone as a direct result of you contributing your data, because with PINQ as the gatekeeper, your particular data record might as well not be there!
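To make that concrete, here is a toy demonstration (again hypothetical Python, not PINQ itself): release a noisy count from two data sets that differ in exactly one record, and the answers are statistically indistinguishable:

    import random

    def laplace_count(records, epsilon=0.1):
        # The true count plus Laplace(1/epsilon) noise, the standard
        # way to release a count under differential privacy.
        noise = random.expovariate(epsilon) - random.expovariate(epsilon)
        return len(records) + noise

    with_me = list(range(1000))   # a data set that includes "my" record
    without_me = with_me[:-1]     # the same data set, minus my record

    # Both lines print numbers scattered around 1000; no answer reveals
    # whether my record was present.
    print([round(laplace_count(with_me)) for _ in range(5)])
    print([round(laplace_count(without_me)) for _ in range(5)])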

What is the practical fallout of such a guarantee?

Not only will you not be targeted to receive spam as a result of contributing your data to a data set, no one else will be targeted to receive spam as a result of you contributing your data to a data set.

Not only will you not be discriminated against by future employers or insurance companies as a result of contributing your data to a data set, no one else will be discriminated against as a result of you contributing your data to a data set.

Does this mean that my data doesn’t matter? Why then would I bother to contribute?

Now is a good time to point out that PINQ’s privacy guarantee is expansive, but in a very specific way. Nothing in the universe will change as a result of any one person’s data. However, the aggregate effect of everyone’s data will absolutely make a difference. It’s the same logic behind avoiding life’s little vices like telling white lies, littering or chewing gum in class. One person littering isn’t such a big deal. But what if everyone littered?

Still, is such an expansive privacy guarantee necessary?

It turns out, it’s incredibly hard to narrow a privacy guarantee to prevent just “harm,” because harm is a subjective concept with cultural and social overtones. One person’s spam is another’s helpful notification.

All privacy guarantees today, which largely focus on preventing harm, are by and large based on “good intentions” and “theoretical best practices,” not on technical mechanisms that are measurable and “provable.”

However, change, defined as the extent to which any one person’s data discernibly affects the answers to questions asked of a data set, is readily measurable.
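To give just a taste of how that measurement works: for any given question, you can compute the most a single record could possibly move the true answer (for a simple count, that’s at most 1). Differential privacy then calibrates the random noise to that worst case: noise of scale Δf/ε, where Δf is the worst-case, one-record change (the “sensitivity” of the question) and ε is the privacy parameter. That single number ε is what turns “your record won’t be discernible” from an aspiration into something you can actually calculate.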

Up to this point, we’ve been engaged in a simple thought experiment that requires nothing more than a few turns of logic. How exactly PINQ keeps track of whether the absence or presence of a single record is discernible in the answers it gives out and the extent to which it’s discernible is a different matter and requires actual “innovation” and “technology.” Stay tuned for more on that.


3 Responses to “What does a privacy guarantee mean to you? Harm v. Obscurity”

  1. Grace Meng says:

    One concern I have about the “no harm” guarantee is this: just because no harm will come to someone directly because he or she has contributed a record doesn’t mean no harm can come to them at all. For example, let’s say the datatrust’s data reveals that people with long black hair tend to have arthritis. I have long black hair, but never contributed a record. Yet that knowledge causes someone to discriminate against me. I don’t think this fear is a reason NOT to build a datatrust. It’s the same reasoning of people, on both the left and the right, who are afraid of counting altogether. But it’s a distinction that should be made.

  2. Mimi Yin says:

    Yes, I think for some people, it won’t be enough to reassure them that their particular data record won’t cause any harm. They simply won’t be comfortable with the idea that anybody’s data is being collected because of the “harm” and “misuse” that could come about as a result of amassing such data.

    This is the rationale behind boycotting the census (it unfairly apportions tax dollars to illegal immigrants) and locking down data that could help schools evaluate teachers (it could be used as a way to censor “un-American” teachers) and fighting access to patient data to evaluate doctors and hospitals (it could violate patient privacy).

    These are all valid concerns that any data collection effort will need to address independent of the issue of guaranteeing privacy for those who choose to contribute their information.

  3. Grant Baillie says:

    Hm, Kajagoogoo is an apt choice for a post on privacy, since their biggest hit was “Too Shy”.

    I’ve been thinking about what it really means to guarantee something. With a guarantee, you are assuring someone that some condition will hold in the future, but in truth you can never be completely certain of the future!

    So, looking at your wording:

    Now is a good time to point out that PINQ’s privacy guarantee is expansive, but in a very specific way. Nothing in the universe will change as a result of any one person’s data.

    I’d say that PINQ is really making sure that only a small change will happen in the universe if you submit your data. (After all, at the very least, some bits are changing in a computer file somewhere when you do that.)

    Continuing this mini-ramble, it seems to me that guarantees usually have some cost to the guarantor if the future doesn’t turn out as expected. For example, if I buy a new toaster at a guaranteed lowest price, the seller is willing to pay me some money if I find a lower price for that toaster elsewhere. Presumably, that seller is pretty confident in that condition (i.e., of having the lowest price), but he or she is willing to pay a financial cost for being wrong.

    In the case of a failed privacy guarantee, one cost would be lost reputation, and financial penalties could potentially be exacted via legal action (or the threat thereof). I’m not sure if this is a useful way of thinking about the question, though.

