Posts Tagged ‘Google’

Google buys Metaweb: Can corporations acquire the halo effect of the underdog?

Monday, July 26th, 2010

Google recently bought Metaweb, a major semantic web company.  The value of Metaweb to Google is obvious — as ReadWriteWeb notes, “For the most part,…Google merely serves up links to Web pages; knowing more about what is behind those links could allow the search giant to provide better, more contextual results.” But what does the purchase mean for Metaweb?

Big companies buy small companies all the time.  Some entrepreneurs create their start-ups with that goal in mind — get something going and then make a killing when Google buys it.  But what do you think of a company when it seems to be doing something different and then is bought by Google?

Metaweb was never a nonprofit, but like Wikipedia, it has had a community-driven vibe.  Freebase, its database of entities, is crowd-sourced, open, and free.  Google promises that Freebase will remain free, but will the community of people who contribute to Freebase feel the same about contributing free labor to a mega-corporation?  Is there anything keeping Google from changing its mind in the future about keeping Freebase free?  How will the culture of Metaweb change as its technologies evolve within Google?

This isn’t to say that Metaweb’s goals have necessarily been compromised by its purchase by Google.   Many people feel like this is the best thing that could have happened to the semantic web.

(Though a few feel, “They didn’t make it big. In fact, this means they failed at their mission of better organizing the world’s information so that rich apps could be built around it. They never got to the APPS part. FAIL!”, and at least one person is concerned Google bought Freebase to kill it.)

But what did you think when Google bought the nonprofit Gapminder, of Hans Rosling’s famous TED talk?

Or when eBay bought a 25% stake in Craigslist?

Or outside the tech world, when Unilever bought Ben & Jerry’s?

Can a company or organization maintain a high-minded mission, driven by principles other than profit, once it’s bought by a major publicly held corporation?

This isn’t just an abstract question for us.  One of the biggest reasons why we chose to be a 501(c)(3) nonprofit organization is that we wanted to make sure no one running the Common Data Project would be tempted to sell its greatest asset, the data it hopes to bring together, for short-term profit.  As a nonprofit, CDP is not barred from making profits, but no profits can inure to the benefit of any private individual or shareholder.  Also as a nonprofit, should CDP dissolve, it cannot merely sell its assets to the highest bidder but must transfer them to another nonprofit organization with a similar mission.

We’re still working on understanding the legal distinctions between IRS-recognized tax-exempt organizations and for-profit businesses.  We were surprised when we first found out that Gapminder, a Swedish nonprofit, had been bought by Google.  Swedish nonprofit law may differ from U.S. nonprofit law.  But it appears Hans Rosling did not stand to make a dime.  Google only bought the software and the website, and whatever that was worth went to the organization itself.  So in a way, the experience of Gapminder supports the idea that being a nonprofit does make a difference in restricting the profit motives of individuals.  Alex Selkirk, as the founder and President of CDP, will never make a windfall through CDP.

The fact that CDP is not profit-driven, and will never be profit-driven, makes a difference to us.  Does it make a difference to you?

In the mix…EU data retention laws, Wikipedia growing

Friday, June 11th, 2010

1) Australia thinking about requiring ISPs to record browsing histories (via Truste).

Electronic Frontiers Australia (EFA) chair Colin Jacobs said the regime was “a step too far”.

“At some point data retention laws can be reasonable, but highly-personal information such as browsing history is a step too far,” Jacobs said. “You can’t treat everybody like a criminal. That would be like tapping people’s phones before they are suspected of doing any crime.”

Sounds shocking, but the EU already requires it.

2) European privacy officials are pointing out that Microsoft, Google and Yahoo’s methods of “anonymization” are not good enough to comply with EU requirements (via EFF).  As we’ve been saying for a while, “anonymization” is not a very precise claim.  (Even though they also want ISPs to retain browsing histories for law enforcement–confused? I am.)

3) Wikipedia is adding two new executive roles.  In the process of researching our community study, it really struck me how small Wikipedia’s staff was compared to the staff of more centralized, less community-run businesses like Yelp and Facebook.  Having two more staff members is not a huge increase, but it does make me wonder, is a larger staff inevitable when an organization tries to assert more editorial control over what the community produces?

In the mix…data for coupons, information literacy, most-visited sites

Friday, June 4th, 2010

1) There’s obviously an increasing move to a model of data collection in which the company says, “give us your data and get something in return,” a quid pro quo.  But as Marc Rotenberg at EPIC points out,

The big problem is that these business models are not very stable. Companies set out privacy policies, consumers disclose data, and then the action begins…The business model changes. The companies simply want the data, and the consumer benefit disappears.

It’s not enough to start with compensating consumers for their data.  The persistent, shareable nature of data makes it very different from a transaction involving money, where someone can buy, walk away, and never interact with the company again.  These data-centered companies are creating a network of users whose data are continually used in the business.  Maybe it’s time for a new model of business, where governance plans incorporate ways for users to be involved in decisions about their data.

2) In a related vein, danah boyd argues that transparency should not be an end in itself, and that information literacy needs to be developed in conjunction with information access.  A similar argument can be made about the concept of privacy.  In “real life” (i.e., offline life), no one aims for total privacy.  Every day, we make decisions about what we want to share with whom.  Online, total privacy and “anonymization” are also impossible, no matter what the company promises in its privacy policy.  For our datatrust, we’re going to use PINQ, a technology using differential privacy, that acknowledges privacy is not binary, but something one spends.  So perhaps we’ll need to work on privacy and data literacy as well?

3) Google recently released a list of the most visited sites on the Internet. Two questions jump out: a) Where is Google on this list? and b) Could the list be a proxy for the biggest data collectors online?
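To make the point in item 2 about privacy being “something one spends” a little more concrete, here is a minimal sketch of the general idea behind differential privacy.  This is not PINQ’s actual API (PINQ is a separate library with its own interface); the names `PrivacyBudget`, `laplace_noise`, and `noisy_count` are hypothetical and exist only for illustration.  It assumes a simple counting query and the standard Laplace mechanism, where each answer costs some amount of a fixed budget (epsilon), and once the budget is gone, no more questions can be asked.

```python
import random


class PrivacyBudget:
    """Hypothetical tracker for a fixed privacy allowance (epsilon)."""

    def __init__(self, total_epsilon):
        self.remaining = total_epsilon

    def spend(self, epsilon):
        # Every query must pay for itself; refuse once the budget runs out.
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            raise RuntimeError("privacy budget exhausted")
        self.remaining -= epsilon


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, predicate, epsilon, budget, rng=random):
    """Answer "how many records match?" with noise, charging the budget."""
    budget.spend(epsilon)
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one person changes a count by at most 1,
    # so the noise scale is 1 / epsilon (assumed sensitivity of 1).
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    ages = [23, 31, 45, 52, 38]  # made-up example data
    budget = PrivacyBudget(total_epsilon=1.0)
    answer = noisy_count(ages, lambda age: age > 30, 0.5, budget)
    print("noisy answer:", answer)
    print("budget left:", budget.remaining)
```

A smaller epsilon per query means more noise and a cheaper question; a larger one means a more accurate answer that uses up the budget faster.  That trade-off is the sense in which privacy is spent rather than simply switched on or off.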

In the mix…DNA testing for college kids, Germany trying to get illegally gathered Google data, and the EFF’s privacy bill of rights for social networks

Friday, May 21st, 2010

1) UC Berkeley’s incoming class will all get DNA tests to identify genes that show how well you metabolize alcohol, lactose, and folates. “After the genetic testing, the university will offer a campuswide lecture by Mr. Rine about the three genetic markers, along with other lectures and panels with philosophers, ethicists, biologists and statisticians exploring the benefits and risks of personal genomics.”

Obviously, genetic testing is not something to take lightly, but the objections quoted sounded a little paternalistic. For example, “They may think these are noncontroversial genes, but there’s nothing noncontroversial about alcohol on campus,” said George Annas, a bioethicist at the Boston University School of Public Health. “What if someone tests negative, and they don’t have the marker, so they think that means they can drink more? Like all genetic information, it’s potentially harmful.”

Isn’t this the reasoning of people who preach abstinence-only sex education?

2) Google recently admitted they were collecting wifi information during their Streetview runs.  Germany’s reaction? To ask for the data so they can see if there’s reason to charge Google criminally.  I don’t understand this.  Private information is collected illegally so it should just be handed over to the government?  Are there useful ways to review this data and identify potential illegalities without handing the raw data over to the government?  Another example of why we can’t rest on our laurels — we need to find new ways to look at private data.

3) EFF issued a privacy bill of rights for social network users.  Short and simple.  It’s gotten me thinking, though, about what it means that we’re demanding rights from a private company. Not to get all Rand Paul on people (I really believe in the Civil Rights Act, all of it), but users’ frustrations with Facebook and their unwillingness to actually leave make clear that what Facebook is offering is not just a service provided to a customer.  danah boyd has a suggestion: let’s think of Facebook as a utility and regulate it the way we regulate electric, water, and other similar utilities.

In the mix…Google reveals how many government requests for data it gets, Amazon tries First Amendment privacy argument, and the World Bank opens its databases

Wednesday, April 21st, 2010

1) Google is providing data on how many government requests they get for data. As various people have pointed out, the site has its limitations, but it’s still fascinating.  We’ve been thinking a lot about how attractive our datatrust would be to governments, and how we can best deal with requests and remain transparent.  This seems like a good option and maybe something all companies should consider doing.

2) In related news, Amazon is refusing the state of North Carolina’s request for its customer data. North Carolina wants the names and addresses of every customer and what they bought since 2003!  They want to audit Amazon’s compliance with North Carolina’s state tax laws.  I think NC’s request is nuts–are they really prepared to go through 50 million purchases?  It may just be legal posturing, given Amazon already gave them anonymized data on the purchases of NC residents, but what’s really interesting to me is Amazon’s argument that its customers have First Amendment rights in their purchases.  I heard a similar argument at a talk at NYU a few months ago, that instead of arguing privacy rights, which are not explicitly defined in the Constitution, we should be arguing for freedom of association rights when we seek to protect ourselves from data requests like this.  Interesting to see where this goes.

3) The World Bank is opening up its development data. This is data people used to pay for and now it’s free, so it’s exciting news.  But as with most public data out there, it’s really just indicators, aggregates, statistics, and such, rather than raw data you can query in an open-ended way.  Wouldn’t that be really exciting?

In the mix

Tuesday, March 2nd, 2010

1) I’m looking forward to reading this series of blog posts from the Freedom to Tinker blog at Princeton’s Center for Information Technology Policy on what government datasets should look like to facilitate innovation, as the first one is incredibly clear and smart.

2) The NYTimes Bits blog recently interviewed Esther Dyson, “Health Tech Investor and Space Tourist” as the Times calls her, where she shares her thoughts on why ordinary people might want to track their own data and why we shouldn’t worry so much about privacy.

3) A commenter on the Bits interview with Esther Dyson referenced this new 501(c)(6) nonprofit, CLOUD: Consortium for Local Ownership and Use of Data.  Their site says, “CLOUD has been formed to create standards to give people property rights in their personal information on the Web and in the cloud, including the right to decide how and when others might use personal information and whether others might be allowed to connect personal information with identifying information.”

We’ve been thinking about whether personal information could or should be viewed as personal property, as understood by the American legal system, for a while now.  I’m not quite sure it’s the best or most practical solution, but I’m curious to see where CLOUD goes.

4) The German Federal Constitutional Court has ruled that the law requiring data retention for 6 months is unconstitutional.  Previously, all phone and email records had to be kept for 6 months for law enforcement purposes.  The court criticized the lack of data security and insufficient restrictions on access to the data.

Although Europe has more comprehensive and arguably “stricter” privacy laws, many countries also require data retention for law enforcement purposes.  We in the U.S. might think the Fourth Amendment is going to protect our phone and email records from being poked into unnecessarily by law enforcement, but existing law is even less clear than in Europe.  So much privacy law around telephone and email records is built around antiquated ideas of our “expectations,” with analogies to what’s “inside the envelope” and what’s “outside the envelope,” as if all our communications can be easily analogized to snail mail.  All these issues are clearly simmering to a boil.

5) Google’s introduced a new version of Chrome with more privacy controls that allow you to determine how browser cookies, plug-ins, pop-ups and more are handled on a site-by-site basis.  Of course, those controls won’t necessarily stop a publisher from selling your IP address to a third-party behavioral targeting company!

Wow, new privacy features!

Friday, December 11th, 2009

Wow, so many companies rolling out new privacy features lately!

Facebook rolled out its new “simplified” privacy settings.  Google introduced Google Dashboard, a central location from which to manage your profile data, which supplements Google Ads Preferences.  And Yahoo released a beta version of the Ad Interest Manager.

Many, many people have reviewed Facebook’s new changes, and pointed out some of the “bait-and-switch” Facebook has done for some new, and I think better, controls.  I don’t have much more to say about that.

But it’s interesting to me that Google and Yahoo have chosen similar strategies around privacy issues, though with some differences in execution.  Neither company has actually changed its data collection practices, and cynics have argued that they’re both just trying to stave off government regulation.  Still, I think that it makes a difference when companies actually make clear and visible what they are doing with user data.

“Is this everything?”

Both Google and Yahoo indicate in different ways that the user who is looking at Dashboard or Ad Interest Manager is not getting the full data story.

Google’s Dashboard is supposed to be a central place where a user can manage his or her own data.  In and of itself, it’s not that exciting.  As ReadWriteWeb put it, it doesn’t tell you anything you didn’t know before.  It provides links in one place to the privacy settings for various applications, but it focuses on profile information the user provides, which represents only a tiny bit of the personal information Google is tracking.

Google does, however, provide a link to the question, “Is this everything?” that describes some of their browser-based data collection and a link to the Ads Preferences Manager page.  To me, it feels a little shifty that the Dashboard promises to be a place for you to control “data that is personally associated with you,” but it doesn’t reveal until you scroll to the bottom that this might not be everything.  Others may feel differently, but this to me goes right at the heart of the problem of how “personal information” is defined.  When I go to the Ads Preferences Manager, I see clearly that Google has associated all kinds of interests with me–how is this not “personally associated” with me?  Google states it’s not linking this data to my personal account data, which is why they haven’t put it all in one place.  That is good, but it seems too convenient a reason to silo that off.

Yahoo’s strategy is a little different.  It may not be fair to compare Yahoo’s Ad Interest Manager to Google’s Dashboard at this point, given that it’s in such a rudimentary phase.  It’s in beta and doesn’t work yet with all browsers.  (As David Courtney points out in PCWorld, being in beta is a pretty sorry excuse for the fact that it doesn’t work with IE8 and Firefox.)  Depending on how much you use Yahoo, you may not see anything about yourself.

Still, I thought it was interesting that Yahoo highlighted some of the hairy parts of its privacy policy in separate boxes high up on the page.  Starting from the top, Yahoo states clearly in separate boxes with bold headings that there are ways in which your data is collected and analyzed that are not addressed in this Ad Interest Manager.  The box for the Network Advertising Initiative is a little weak; it doesn’t really explain what it means that Yahoo is connected to the NAI.  But the box on “other inputs,” shows prominently that even as you manage your settings on this page, there may be other sources of data Yahoo is using to find out more about you.

[Screenshot from Yahoo’s Ad Interest Manager]

Yahoo also reveals that the information they’re tracking from you is collected from a wide range of sources, including both Yahoo account information like Mail and non-account websites like its Front Page.  Unlike Google, Yahoo doesn’t ask you to click around to find out that some of “everything” is elsewhere.

[Screenshot from Yahoo’s Ad Interest Manager]

Turning “interests” on and off

Google and Yahoo are very similar here.  Google’s Ad Preferences Manager indicates which interests have been associated with you with a clear link to how they can be removed, with a button for opting out from tracking altogether.

[Screenshot: Google’s Ads Preferences Manager, interests and opt-out button]

Yahoo’s Ad Interest Manager has a different design, but the button for opting out altogether is similarly visible.

[Screenshot: Yahoo’s Ad Interest Manager, opt-out button]

We’re using cookies!

Compared to the other issues, this is the most obvious difference between Google and Yahoo.

Google has this on its Ads Preferences Manager:

[Screenshot: the cookie ID shown on Google’s Ads Preferences Manager]

So you can see that some string of numbers and letters has somehow been attached to your computer, but you’re not told what this means in terms of what Google knows about you.

In contrast, Yahoo shows this at the bottom of the Ad Interest Manager:

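[Screenshot: the bottom of Yahoo’s Ad Interest Manager, showing location, age, gender, and computer details]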

Yahoo knows I’m a woman!  Between 26 and 35!  The location is actually wrong, as I am in Brooklyn, NY, but I did live in San Francisco 5 years ago when I first signed up for a Yahoo account.  Still, Yahoo is very explicitly showing, and not just telling, that it knows geographical information, age, gender, and the make and operating system of your computer.  I’m impressed—they must know this is going to scare some people.

Does any of this even matter?

I prefer the Yahoo design in many ways — the boxes and verticality of the manager to me are easier to read and understand than the horizontal spareness of the Google design.  But in the end, the design differences between Google and Yahoo’s new privacy tools may not even matter.  I don’t know how many people will actually see either Manager.  You still have to be curious enough about privacy to click on “Privacy Policy,” which takes you to Yahoo! Privacy, at which point, in the top right-hand corner, you see a link to “Opt-out” of interest-based advertising.  The same is true with Google. And neither company has actually changed much about their data collection practices.  They’re just being more open about them.

But I am impressed and heartened that both companies have started to reveal more about what they’re tracking and in ways that are more visually understandable than a long, boring, legalistic privacy policy.  I hope Yahoo is feeling competitive with Google on privacy issues and vice-versa.  I’d love to see a race to the top.

In the mix

Friday, November 6th, 2009

Cuil’s Famous Privacy Policy No Longer Protects Privacy (michaelzimmer.org)

Google’s Privacy Dashboard Doesn’t Tell Us Anything We Didn’t Know Before (ReadWriteWeb)

In the mix

Tuesday, June 16th, 2009

EFF Launches TOSBack–A “Terms of Service” Tracker for Facebook, Google, eBay, and More.  (EFF)

The “Hidden Cost” of Privacy.  (Schneier on Security)

Google Fusion Tables.  (Official Google Research Blog)

In the mix

Wednesday, June 3rd, 2009

Google is Top Tracker of Surfers in Study. (NY Times Bits Blog)

The Obama Administration’s Silence on Privacy. (NY Times Bits Blog)

This UK Sheriff Cites Officials for Serious Statistical Violations.  (WSJ The Numbers Guy)

