In The Mix…predicting the future; releasing healthcare claims; and $1.5 million awarded to data privacy

November 30th, 2010 by Becky Pezely

Some people out there think they can predict the future by scraping content off the web. Does it work simply because web 2.0 technologies are great at creating echo chambers? Is this just another way of amplifying that echo chamber and generating yet more self-fulfilling trend prophecies? See the Future with a Search (MIT Technology Review)

The U.S. Office of Personnel Management wants to create a huge database containing the healthcare claims of millions of people. Many are concerned about how the data will be protected and used. More federal health database details coming following privacy alarm (Computer World)

Researchers at Purdue were awarded $1.5 million to investigate how well current techniques for anonymizing data are working and whether there’s a need for better methods. It would be interesting to know what they think of differential privacy. They appear to be actually doing the dirty work of figuring out whether theoretical re-identification is more than just a theory. National Science Foundation Funds Purdue Data-Anonymization Project (Threat Post)

@IAPP Privacy Foo Camp 2010: What Is Anonymous Enough?

October 26th, 2010 by Becky Pezely

Editor’s Note: Becky Pezely is an independent contractor for Shan Gao Ma, a consulting company started by Alex Selkirk, President of the Board of the Common Data Project.  Becky’s work, like Tony’s, touches on many of the privacy challenges that CDP hopes to address with the datatrust.  We’re happy to have her guest blogging about IAPP Academy 2010 here.

Several weeks ago we attended the 2010 Global Privacy Summit (IAPP 2010) in Baltimore, Maryland.   

In addition to some engaging high-profile keynotes – including FTC Bureau of Consumer Protection Director David Vladeck – we got to participate in the first-ever IAPP Foo Camp.

The Foo Camp consisted of four discussion topics aimed at covering the top technology concerns facing a wide range of privacy professionals.

The session we ran was titled “Low Impact Data Mining”.  The intention was to discuss, and better understand, the current challenges in managing data within an organization – all with a lens on managing data in a way that is “low impact” on resources while returning “high (positive) impact” on the business.

The individuals in our group represented a vast array of industries including: financial services, insurance, pharmaceutical, law enforcement, online marketing, health care, retail and telecommunications.  It was fascinating that, even across such a wide range of industries, there could be such a pervasive set of privacy challenges common to them all.

Starting with:

What is “anonymous enough”?

If all you need is gender, zip code and birthdate to re-identify someone then what data, when released, is truly “anonymous enough”?  Can a baseline be defined, and enforced, within our organization that ensures customer protection?
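
As a toy illustration of why those three fields matter – Latanya Sweeney famously estimated that ZIP code, birth date, and gender uniquely identify roughly 87% of Americans – here is a minimal sketch of a linkage attack in Python. The datasets, names, and field values are all invented for the example:

```python
import pandas as pd

# A "de-identified" release: names stripped, quasi-identifiers kept.
medical = pd.DataFrame([
    {"zip": "53715", "birthdate": "1965-07-22", "gender": "F", "diagnosis": "hypertension"},
    {"zip": "53715", "birthdate": "1971-01-03", "gender": "M", "diagnosis": "asthma"},
])

# A public dataset with names attached (e.g., a voter roll).
voters = pd.DataFrame([
    {"name": "Jane Doe", "zip": "53715", "birthdate": "1965-07-22", "gender": "F"},
    {"name": "John Roe", "zip": "53715", "birthdate": "1971-01-03", "gender": "M"},
])

# Joining on the three quasi-identifiers re-attaches names to diagnoses.
reidentified = medical.merge(voters, on=["zip", "birthdate", "gender"])
print(reidentified[["name", "diagnosis"]])
```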

It feels safe to say that this was the root challenge from which all the others stemmed.  Today the release of data is mostly controlled, and subsequently managed, by one or more trusted people.  These individuals are responsible for “sanitizing” the data that gets released inside or outside the organization, and they manage releases that fulfill everything from understanding business performance to meeting business obligations with partners.  Their primary concern is knowing how well they are protecting their customers’ information, not only from the perspective of company policy, but also from the perspective of personal morals.  They are the gatekeepers who assess the level of protection provided based on which data they release to whom, and they want a definition, or measurement, that guarantees what they release is “anonymous enough” to achieve the right level of anonymity for their customers.

This challenge compounds for these individuals, and their organizations, when adding in various other truths about the nature of data today:

The silos are getting joined.

The convention used to be that data within an organization sat in a silo – all on its own and protected – such that anyone looking at the data would see only that set of data.  Now it’s becoming the reality that these data sets are getting joined, and it’s not always known where, when, how, or with whom a join originated.  Nor is it known where the joined data set is currently stored, since it was modified from its original silo.  Soon that joined data set takes on a life of its own and makes its way around the institution.  Given the likelihood of this occurring, how can the gatekeepers responsible for assessing the level of protection provided to customers do so with any kind of reliable measurement that guarantees the right level of anonymity?

And now there’s data in the public market.

Not only is the data joined with data (from other silos) within the organization, but also with data from outside the organization, sold in the public market.  This prospect has increased organizations’ ability to produce data that is “high impact” for the business – because they now know WAY MORE about their customers.  But does the benefit outweigh the liability?  As the ability to know more about individual customers increases, so do the level of sensitivity and the concern for privacy.  How do organizations successfully navigate mounting privacy concerns as they move from silos, to joined silos, to joined silos combined with public data?

The line between “data analytics” and looking at “raw data” is blurring.

Because the data is richer, and more plentiful, the act of data analysis isn’t as benign as it might once have been.  The definition of “data analytics” has evolved from something high-level (knowing, for example, how many new customers used the service this quarter) to something that looks a lot more like reading raw data in order to target specific parts of the business to specific customers (to, for example, sell <these products> to customers who make <this much money>, are females ages 30 – 35, live in <this neighborhood> and typically spend <this much> on <these types of products>, etc.).
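
A hypothetical sketch of how an “aggregate” question can collapse into a raw-data lookup; the customer records and filter thresholds below are invented for illustration:

```python
import pandas as pd

customers = pd.DataFrame([
    {"id": 1, "gender": "F", "age": 32, "neighborhood": "Elm Park", "income": 85000, "spend": 420},
    {"id": 2, "gender": "F", "age": 34, "neighborhood": "Elm Park", "income": 62000, "spend": 180},
    {"id": 3, "gender": "M", "age": 41, "neighborhood": "Riverside", "income": 90000, "spend": 300},
])

# Phrased as analytics ("average spend for a segment")...
segment = customers[
    (customers.gender == "F")
    & customers.age.between(30, 35)
    & (customers.neighborhood == "Elm Park")
    & (customers.income > 80000)
]

# ...but the segment contains one person, so the "average" is her raw record.
print(len(segment), segment.spend.mean())
```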

And the data has different ways of exiting the system.

The truth is, as scary as this data can be, everyone wants to get their hands on it, because the data leads to awareness that is meaningful and valuable for the business.  Thus, the data is shared everywhere – inside and outside the organization.  With that fact, a whole new set of challenges emerges when considering all the ways data might be exiting any given “silo”: Where is all the data going?  How is it getting modified (joined, sanitized, rejoined), and at which point is it no longer data that needs to be protected by the organization?  How much data needs to be released externally to fulfill partner/customer business obligations?  Once the data has exited, can the organization’s privacy practices still be enforced?

Brand affects privacy policy.  Privacy policy affects brand.

Privacy is a concern for the whole business, not just the people who manage the data, nor solely those who manage liability.  In the event of a “big oopsie” where there is a data/privacy breach, it is the communication with customers before, during and after the incident that determines the internal and external impact on the brand and the perception of the organization.  And that communication is dictated both by what the privacy policy enforces and by what the brand “allows”.  In today’s age of data, how can an organization have an open dialogue with customers about their data if the brand does not support having that kind of conversation?  No surprise that Facebook is the exemplary case: Facebook continues to pave a new path, and draw customers, by inviting them to share and disclose more information about themselves.  As a result, it has experienced backlash from customers when it takes things too far.  The line of communication is very open – customers have a clear way to lash back when Facebook has gone too far, and Facebook has a way of visibly standing behind its decision or admitting its mistake.  Either way, it is now commonplace for Facebook’s customers to expect that there will be more incidents like this and that Facebook has a way (apparently suitable enough to keep most customers) of dealing with them.  Facebook’s “policy” allowed it to respond this way, and that response has become part of who Facebook is – and the policy, in turn, evolves to support this behavior going forward.

In the discussion of data and privacy, it seems inherently obvious that the mountain of challenges we face is large, complicated and impacts the core of all our businesses.  Nonetheless, it is still fascinating to have been able to witness first-hand – and to now be able to specifically articulate – how similar the challenges are across a diverse group of businesses and how similar the concerns are across job functions.

We want to thank, once again, everyone from IAPP who joined in on the discussions we had at Foo Camp and throughout the conference.  We look forward to an opportunity to dive deeper into these types of problems.

Post Script: Meanwhile, the challenges, and related questions, around the anonymization of data with some kind of measurable privacy guarantee that came up at Foo Camp are ones we have been discussing on our blog for quite some time.  These are precisely the sorts of challenges that motivated us to create a datatrust.  While we typically envision the datatrust being used in scenarios where there isn’t direct access to data, we walked away from our discussions at IAPP Foo Camp with specific examples where direct access to the data is required – particularly to fulfill business obligations – as a type of collateral (or currency).

The concept of data as the new currency of today’s economy has emerged.  Not only did it come up at the IAPP Foo Camp, it also came up back in August, when we heard Marc Davis talk about it at IPP 2010.  With all of this in mind, it is interesting to evaluate the possibility of the datatrust acting as a special type of data broker in these exchanges.  The idea is that the datatrust would be a sanctioned data broker (sanctioned by the industry, or possibly by the government) that inherently meets federal, local and municipal regulations and protects the consumers of business partners who want to exchange data as “currency,” while alleviating businesses and their partners of the headaches of managing data use/reuse.  The “tax” on using the service is that these aggregates are stored and made available for the public to query in the way we imagine (no direct access to the data) for policy-making and research.  This is something that feels compelling to us and will influence our thinking as we continue to move forward with our work.

Common Data Project at IAPP Privacy Academy 2010

September 13th, 2010 by Alex Selkirk

We will be giving a Lightning Talk on “Low-Impact Data-Mining” and running two breakout sessions at the IT Privacy Foo Camp – Preconference Session, Wednesday Sept 29.

Below is a preview of our slides and handout for the conference. Unlike our previous presentations, we won’t be talking about CDP and the Datatrust at all. Instead, we’ll focus on presenting how SGM helps companies minimize the privacy impact of their data-mining.

More specifically, we’ll be stepping through the symbiotic documentation system we’ve created between the product development/data science folks collecting and making use of the data and the privacy/legal folks trying to regulate and monitor compliance with privacy policies. We will be using the SGM Data Dictionary as a case study in the breakout sessions.

Still, we expect that many of the issues we’ve been grappling with from the datatrust perspective (e.g. public perception, trust, ownership of data, meaningful privacy guarantees) will come up, as they are universal issues central to any meaningful discussion about privacy today.


Handout

What is data science?

An introduction to data-mining from O’Reilly Radar that provides a good explanation of how data-mining is distinct from previous uses of data, with plenty of examples of how data-mining is changing products and services today.

The “Anonymous” Promise and De-identification

  1. How you can be re-identified: Zip code + Birth date + Gender = Identity
  2. Promising new technologies for anonymization: Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization by Paul Ohm.

Differential Privacy: A Programmatic Way to Enforce Your Privacy Guarantee?

  1. A Microsoft Research Implementation: PINQ
  2. CDP’s write-up about PINQ.
  3. A deeper look at how differential privacy’s mathematical guarantee might translate into layman’s terms (see the sketch below).
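
To make that mathematical guarantee slightly more concrete, here is a minimal sketch of the core mechanism PINQ and similar systems build on: answering a count query with Laplace noise scaled to the query’s sensitivity and a privacy budget epsilon. This is Python rather than PINQ’s actual C#/LINQ API, and the data and epsilon value are invented for illustration:

```python
import numpy as np

def dp_count(records, predicate, epsilon):
    """Differentially private count of records matching predicate.

    A count query has sensitivity 1 (adding or removing one person
    changes the true answer by at most 1), so Laplace noise with scale
    1/epsilon provides epsilon-differential privacy.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

# Illustrative data: (zip code, gender, birth year).
people = [("53715", "F", 1965), ("53715", "M", 1971), ("02139", "F", 1980)]

# Smaller epsilon means more noise and a stronger guarantee.
print(dp_count(people, lambda p: p[0] == "53715", epsilon=0.1))
```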

Paradigms of Data Ownership: Individuals vs Companies

  1. Markets and Privacy by Kenneth C. Laudon
  2. Privacy as Property by Lawrence Lessig
  3. CDP explores the advantages and challenges of a “Creative Commons-style” model for licensing personal information.
  4. CDP’s Guide to How to Read a Privacy Policy

In the mix…nonprofit technology failures; not counting religion; medical privacy after death; and the business of open data

August 20th, 2010 by Grace Meng

1) Impressive nonprofit transparency around technology failures. It might seem odd for us to highlight technology failures when we’re hoping to make CDP and its technology useful to nonprofits, but the transparency demonstrated by these nonprofits talking openly about their mistakes is precisely the kind of transparency we hope to support.  If nonprofits, or any other organizations, are going to share more of their data with the public, they have to be willing to share the bad with the good, all in the hope of actually doing better.

2) I was really surprised to find out the U.S. Census doesn’t ask about religion.  It’s a sensitive subject, but is it really more sensitive than race and ethnicity, which the U.S. Census asks about quite openly?  The article goes through why having a better count of different religions could be useful to a lot of people. What are other things we’re afraid to count, and how might that be holding us back from important knowledge?

3) How long should we protect people’s privacy around their medical history? HHS proposes to remove protections that prevent researchers and archivists from accessing medical records for people who have been dead for 50 years; CDT thinks this is a bad idea.  Is there a way that this information can be made available without revealing individual identity?  That’s the essential problem the datatrust is trying to solve.

4) It may be counterintuitive, but open data can foster industry and business. Clay Johnson, formerly at the Sunlight Foundation, writes about how weather data, collected by the U.S. government, became open data, thereby creating a whole new industry around weather prediction.  As he points out, though, that $1.5 billion industry is now not that excited by the National Weather Service expanding into providing data directly to citizens.

We at CDP have been talking about how the datatrust might change the business of data.  We think that it could enable all kinds of new business and new services, but it will likely change how data is bought and sold.  Already, the business of buying and selling data has changed so much in the past 10 years.  Exciting years ahead.

In the mix…data-sharing’s impact on Alzheimer’s research, the limits of a Do Not Track registry, meaning of data ownership and more

August 13th, 2010 by Grace Meng

1)  It’s heartening that an article on how data-sharing led to a breakthrough in Alzheimer’s research is the Most Emailed article on the NYTimes website right now. The reasons for resisting data-sharing are the same in so many contexts:

At first, the collaboration struck many scientists as worrisome — they would be giving up ownership of data, and anyone could use it, publish papers, maybe even misinterpret it and publish information that was wrong.

But Alzheimer’s researchers and drug companies realized they had little choice.

“Companies were caught in a prisoner’s dilemma,” said Dr. Jason Karlawish, an Alzheimer’s researcher at the University of Pennsylvania. “They all wanted to move the field forward, but no one wanted to take the risks of doing it.”

2) Google agonizes on privacy. The Wall Street Journal article discusses a confidential Google document that reveals the disagreements within the company on how it should use its data.  Interestingly, all the scenarios in which Google considers using its data involve targeted advertising; none involve sharing that data with Google users in a broader, more extensive way than they do now.  Google believes it owns the data it’s collected, but it also clearly senses that ownership of such data has implications that are different from ownership of other assets.  There are individuals who are implicated — what claims might they have to how that data is used?

3) Some people have suggested that if people are unhappy with targeted advertising, the government should come up with a Do Not Track registry, similar to the Do Not Call list.  But as Harlan Yu notes, Do Not Track would not be as simple as it sounds. He notes that the challenges involve both technology and policy:

Privacy isn’t a single binary choice but rather a series of individually-considered decisions that each depend on who the tracking party is, how much information can be combined and what the user gets in return for being tracked. This makes the general concept of online Do Not Track—or any blanket opt-out regime—a fairly awkward fit. Users need simplicity, but whether simple controls can adequately capture the nuances of individual privacy preferences is an open question.

4) What happens to a business’s data when it goes bankrupt? The former publisher and partners of a magazine and dating website for gay youth were fighting over ownership of the company’s assets, including its databases.  They recently came to an agreement to destroy the data.  EFF argues that the Bankruptcy Code should be amended to require such outcomes for data assets.  I don’t know enough about bankruptcy law to have an opinion on that, but this conflict illuminates what’s so problematic about the way we treat data as property.  No one can own a fact, but everyone acts like they own data.  Something fundamental needs to be thrashed out.

5) Geotags are revealing more locational info than the photographers intended.

6) The owner of an ISP that resisted an FBI request for information can finally reveal his identity. Nicholas Merrill can now reveal that he was the plaintiff behind an ACLU lawsuit challenging the legality of national security letters, by which the FBI can request information without a court order or proof of just cause.  In fact, the FBI can even impose a gag order prohibiting the recipient of an NSL from telling anyone about it, which is what happened to Merrill.

In the mix…WSJ’s “What They Know”; data potential in healthcare; and comparing the privacy bills in Congress

August 9th, 2010 by Grace Meng

1.  The Wall Street Journal Online has a new feature section, ominously named, “What They Know.” The section highlights articles that focus on technology and tracking.  The tone feels a little overwrought, with language that evokes spies, like “Stalking by Cellphone” and “The Web’s New Gold Mine: Your Secrets.”  Some of their methodology is a little simplistic.  Their study on how much people are “exposed” online was based on simply counting tracking tools, such as cookies and beacons, installed by certain websites.

It is interesting, though, to see that the big, bad wolves of privacy, like Facebook and Google, are pretty low on the WSJ’s exposure scale, while sites people don’t really think about, like dictionary.com, are very high.  The debate around online data collection does need to shift to include companies that aren’t so name-brand.

2.  In response to WSJ’s feature, AdAge published this article by Erin Jo Richey, a digital marketing analyst, addressing whether “online marketers are actually spies.” She argues that she doesn’t know that much about the people she’s tracking, but she does admit she could know more (a toy sketch of how follows the quote):

I spend most of my time looking at trends and segments of visitors with shared characteristics rather than focusing on profiles of individual browsers. However, if I already know that Mary Smith bought a black toaster with product number 08971 on Monday morning, I can probably isolate the anonymous profile that represents Mary’s visit to my website Monday morning.
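
To make the isolation step she describes concrete, here is an invented sketch: one offline fact – a known purchase in a known time window – is enough to pick a single “anonymous” profile out of a visit log. The profiles, product numbers, and timestamps are all hypothetical.

```python
# "Anonymous" web analytics records, keyed by a profile ID, not a name.
visits = [
    {"profile": "a91x", "time": "2010-08-09 09:14", "purchased": "08971"},
    {"profile": "b22k", "time": "2010-08-09 11:02", "purchased": "04412"},
    {"profile": "c07m", "time": "2010-08-10 09:30", "purchased": "08971"},
]

# One external fact: Mary bought product 08971 on Monday (Aug 9) morning.
matches = [v for v in visits
           if v["purchased"] == "08971" and v["time"].startswith("2010-08-09 09")]

# A single match means the "anonymous" profile is now Mary's.
print(matches)
```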

3.  A nice graphic illustrates how data could transform healthcare. Are there people making these kinds of detailed arguments for other industries and areas of research and policy?

4.  There are now two proposed privacy bills in Congress, the BEST PRACTICES bill proposed by Representative Bobby Rush and the draft proposed by Representatives Rick Boucher and Cliff Stearns.  CDT has released a clear and concise table breaking down the differences between these two proposed bills and what CDT recommends.  Some things that jumped out at us:

  • Both bills make exceptions for aggregated or de-identified data.  The BEST PRACTICES bill has a more descriptive definition of what that means, stating that it excepts aggregated information and information from which identifying information has been obscured or removed, such that there is no reasonable basis to believe that the information could be used to identify an individual or a computer used by the individual.  CDT supports the BEST PRACTICES exception.  (A sketch of one way to make “no reasonable basis” measurable follows this list.)
  • Both bills make some, though not sweeping, provisions for consumer access to the information collected about them.  CDT endorses neither, and would support a bill that would generally require covered entities to make available to consumers the covered information possessed about them, along with a reasonable method of correction.  Some companies, including a start-up called Bynamite, have already begun to show consumers what’s being collected, albeit in rather limited ways.  We at the Common Data Project hope this push for access also includes access to the richness of the information collected from all of us, and not just the interests associated with me.  It’ll be interesting to see where this legislation goes, and how it might affect the development of our datatrust.
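
Neither bill specifies a test for when there is “no reasonable basis” to re-identify someone. One common proxy, offered here purely as an illustrative sketch (the data is invented), is a k-anonymity check: measure the smallest group of records that share the same quasi-identifier values.

```python
import pandas as pd

def smallest_group(df, quasi_identifiers):
    """Size of the smallest group sharing the same quasi-identifier values.

    A dataset is k-anonymous when every such group has at least k members;
    a group of size 1 is a strong hint that a record is identifiable.
    """
    return df.groupby(quasi_identifiers).size().min()

released = pd.DataFrame([
    {"zip": "53715", "gender": "F", "birth_year": 1965, "claim": "A"},
    {"zip": "53715", "gender": "F", "birth_year": 1965, "claim": "B"},
    {"zip": "02139", "gender": "M", "birth_year": 1980, "claim": "C"},
])

print(smallest_group(released, ["zip", "gender", "birth_year"]))  # 1 -> risky
```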

In the mix…Facebook “breach” of public data, data-mining for everyone, thinking through the Panton Principles, and BEST PRACTICES Act in Congress

July 30th, 2010 by Grace Meng

1) Facebook’s in privacy trouble again. Ron Bowes created a downloadable file containing information on 100 million searchable Facebook profiles, including the URL, name, and unique ID.  What’s interesting is that it’s not exactly a breach.  As Facebook pointed out, the information was already public.  What Facebook will likely never admit, though, is that there is a qualitative difference between information that is publicly available, and information that is organized into an easily searchable database.  This is what we as a society are struggling to define — if “public” means more public than ever before, how do we balance our societal interests in both privacy and disclosure?

2) Can data mining go mainstream? The article doesn’t actually say much, but it does at least raise an important question.  The value of data and data-mining is immense, as corporations and large government agencies know well.  Will those tools ever be available to individuals?  Smaller businesses and organizations?  And what would that mean for them?  It’s a big motivator for us at the Common Data Project — if data doesn’t belong to anyone, and it’s been collected from us, shouldn’t we all be benefiting from data?

3) In the same vein is a new blog by Peter Murray-Rust discussing open knowledge/open data issues, focusing on the Panton Principles for open science data.

4) A new data privacy bill has been introduced in Congress called “Building Effective Strategies to Promote Responsibility Accountability Choice Transparency Innovation Consumer Expectations and Safeguards” Act, aka “BEST PRACTICES Act.”  The Information Law Group has posted Part One of FAQs on this proposed bill.

Although the bill is still being debated and rewritten, some of its provisions indicate that the author of the bill knows a bit more about data and privacy issues than many other Congressional representatives.

  • The information regulated by the Act goes beyond the traditional, American definition of personally identifiable information.  “The definition of “covered information” in the Act does not require such a combination – each data element stands on its own and may not need to be tied to or identify a specific person. If I, as an individual, had an email address that was wildwolf432@hotmail.com, that would appear to satisfy the definition of covered information even if my name was not associated with it.”
  • Notice is required when information will be merged or combined with other data.
  • There’s some limited push to making more information accessible to users: “covered entities, upon request, must provide individuals with access to their personal files.” However, they only have to do so if “the entity stores such file in a manner that makes it accessible in the normal course of business,” which I’m guessing would apply to much of the data collected by internet companies.

Google buys Metaweb: Can corporations acquire the halo effect of the underdog?

July 26th, 2010 by Grace Meng

Google recently bought Metaweb, a major semantic web company.  The value of Metaweb to Google is obvious — as ReadWriteWeb notes, “For the most part,…Google merely serves up links to Web pages; knowing more about what is behind those links could allow the search giant to provide better, more contextual results.” But what does the purchase mean for Metaweb?

Big companies buy small companies all the time.  Some entrepreneurs create their start-ups with that goal in mind — get something going and then make a killing when Google buys it.  But what do you think of a company when it seems to be doing something different and then is bought by Google?

Metaweb was never a nonprofit, but like Wikipedia, it has had a similar, community-driven vibe.  Freebase, its database of entities, is crowd-sourced, open, and free.  Google promises that Freebase will remain free, but will the community of people who contribute to Freebase feel the same about contributing free labor to a mega-corporation?  Is there anything keeping Google from changing its mind in the future about keeping Freebase free?  How will the culture of Metaweb change as its technologies evolve within Google?

This isn’t to say that Metaweb’s goals have necessarily been compromised by its purchase by Google.   Many people feel like this is the best thing that could have happened to the semantic web.

(Though a few feel, “They didn’t make it big. In fact, this means they failed at their mission of better organizing the world’s information so that rich apps could be built around it. They never got to the APPS part. FAIL!”, and at least one person is concerned Google bought Freebase to kill it.)

But what did you think when Google bought the nonprofit Gapminder, known for Hans Rosling’s famous TED talk?

Or when eBay bought a 25% stake in Craigslist?

Or outside the tech world, when Unilever bought Ben & Jerry’s?

Can a company or organization maintain any high-minded mission to be driven by principles other than profit when they’re bought by a major publicly held corporation?

This isn’t just an abstract question for us.  One of the biggest reasons why we chose to be a 501(c)(3) nonprofit organization is that we wanted to make sure no one running the Common Data Project would be tempted to sell its greatest asset, the data it hopes to bring together, for short-term profit.  As a nonprofit, CDP is not barred from making profits, but no profits can inure to the benefit of any private individual or shareholder.  Also as a nonprofit, should CDP dissolve, it cannot merely sell its assets to the highest bidder but must transfer them to another nonprofit organization with a similar mission.

We’re still working on understanding the legal distinctions between IRS-recognized tax-exempt organizations and for-profit businesses.  We were surprised when we first found out that Gapminder, a Swedish nonprofit, had been bought by Google.  Swedish nonprofit law may differ from U.S. nonprofit law.  But it appears Hans Rosling did not stand to make a dime.  Google only bought the software and the website, and whatever that was worth went to the organization itself.  So in a way, the experience of Gapminder supports the idea that being a nonprofit does make a difference in restricting the profit motives of individuals.  Alex Selkirk, as the founder and President of CDP, will never make a windfall through CDP.

The fact that CDP is not profit-driven, and will never be profit-driven, makes a difference to us.  Does it make a difference to you?

In the mix…government surveillance, HIPAA updates, and user control over online data

July 19th, 2010 by Grace Meng

1) The U.S. government comes up with some, um, interesting names for its surveillance programs.  “Perfect Citizen” sounds like it’s right out of Orwell. As the article points out, there are some major unanswered questions.  How do they collect this data?  Where do they get it?  Do they use it just to look for interesting patterns that then lead them to identify specific individuals, or are all the individuals apparent and visible from the get-go?  And what are the regulations around re-use of this data?

2) Health and Human Services has issued proposed updated regulations to HIPAA, the law regulating how personal health information is shared. CDT has made some comments about how these regulations will affect patient privacy, data security, and enforcement.  HIPAA, to some extent, lays out some useful standards on things like how electronic health information should be transmitted.  But it has also been controversial for suppressing information-sharing, even when it is legal and warranted.

So what if, instead of talking about what we can’t do, we started talking about what we can do with electronic health data?  I’m not imagining a list of uses where anything outside of the list is barred, but rather an outline of the kinds of uses that are useful.  The whole point of electronic health records is to make information more easily shareable, so care is more continuous and comprehensive, and research more efficient and effective.

I love this bit from an interview with a neuroscientist who studies dog brains because, “dogs aren’t covered by Hipaa! Their records aren’t confidential!”

3) A start-up called Bynamite is trying to give users control over the information they share with advertisers online. It’s another take on something we’ve seen from Google and BlueKai, where users get to see what interests have been associated with them.  Like those services, Bynamite allows you to remove interests that don’t pertain to you or that you don’t want to share.  Bynamite then goes further by opting you out of networks that won’t let you make these choices.  That definitely sounds easier than managing P3P, and easier than reading through the policies of all the companies that participate in the National Advertising Initiative.

I agree with Professor Acquisti that all of us, when we use Google or any other free online service, are paying for our use of the service with our personal information, and that Bynamite is trying to make that transaction more explicit.  But I wonder if the value of the data companies have gained is explicit.  Is the price of the transaction fair?  Does 1 hour of free Google search equal x amounts of personal data bits?  Can you even put a dollar value on that transaction, given that the true value of all this data is in aggregate?

The accompanying blog post to this article cites a study demonstrating how hard it is to assign a dollar value to privacy.  The study subjects clearly did value “privacy,” but the price they put on it depended on how much they felt they had any privacy to begin with!

In the mix…new organizational structures, giant list of data brokers, governments sharing citizens’ financial data, and what IT security has to do with Lady Gaga

July 9th, 2010 by Grace Meng

1) More on new kinds of organizational structures for entities that want to form for philanthropic purposes but don’t fit into the IRS definition of a nonprofit.

2) CDT shone a spotlight on Spokeo, a data broker, last week.  Who are the other data brokers? Don’t be shocked: there are A LOT of them.  What they do, they mainly do outside the spotlight shone on companies like Facebook, but with very real effects.  In 2005, ChoicePoint sold data to identity thieves posing as a legitimate business.

3) The U.S. has come to an agreement with Europe on sharing financial data, which the U.S. argues is an essential tool of counterterrorism.  The article doesn’t say exactly how these investigations work – whether specific suspects are targeted or whether large amounts of financial data are combed for suspicious activity.  It does make me wonder: given how data crosses borders more easily than any other resource, how will Fourth Amendment protections in the U.S. (and similar protections in other countries) apply to these international data exchanges?  There is also this pithy quote:

Giving passengers a way to challenge the sharing of their personal data in United States courts is a key demand of privacy advocates in Europe — though it is not clear under what circumstances passengers would learn that their records were being misused or were inaccurate.

4) Don’t mean to focus so much on scary data stuff, but 41% of IT professionals admit to abusing privileges.  In a related vein, it turns out a disgruntled soldier accused of illegally downloading classified data managed to do it by disguising his CDs as Lady Gaga CDs.  Even better,

He was able to avoid detection not because he kept a poker face, they said, but apparently because he hummed and lip-synched to Lady Gaga songs to make it appear that he was using the classified computer’s CD player to listen to music.

The New York Times is definitely getting cheekier.

