The Future

If your heart rate spiked today, your insurance company knows

An interview with Priya Kumar, a data expert who is trying to help companies not abuse their customers’ data rights.

Privacy policies are one of the most horrifically boring things online for a reason. Every time you scroll past one to get to the “I agree” button and move on with your life, you’re signing over the rights to your personal data to some company that will get rich off it. Worse, the policies tell you almost nothing about the worst of what companies will do with your data.

The Outline called up privacy and digital life researcher Priya Kumar, a current affiliate with Ranking Digital Rights, a non-profit research initiative focused on how tech companies could respect consumers’ privacy and freedom of expression.

The Outline: Privacy policies are some of the most boring, unintelligible collections of words online — why should anyone care about them?

Kumar: Companies are sharing this data and using it to make decisions that are a lot more consequential than advertising. ProPublica showed how advertising can be discriminatory, and we've got the 23andMe issue [that the company is selling and sharing users’ genomic data]. People are realizing that this data is being used in ways that we don't understand and that we actually do care about.

While the end result of all of this misuse can be shocking and obviously worth caring about, the actual signing over of your data seems so innocuous.

Fifty-two percent of American internet users believe that when a company posts a privacy policy, it means the company keeps the data confidential. I could totally understand why somebody would think that.

You’ve written before about how disingenuous the name “privacy policy” is; you suggested using the name “data use policy” instead.

Privacy is a really loaded term. It is a fundamental human right and key to human dignity, but some people are more open than others. Privacy is different for different people.

Which is a very different conversation from how we properly protect customers’ data.

Companies will tell you a lot more about the information that you actively give them, [like] your email address. But every time you log in, they may collect your IP address or maybe even other location information about where you’re logging in from, the device and the browser that you’re using to log in, what you’re doing on the site, what you’re clicking on, what you’re hovering over, even your browsing history. That is data we as users generate as we are actually using the service, but companies are also collecting that data, and it’s really hard to understand what they are doing with it.

I’m thinking of the article that came out about the researcher that took public Venmo data and showed how you can just use the [public] financial transactions that people are displaying on Venmo to infer a lot about people’s behavior. If you have somebody’s birthday and zip code and one other piece of information, 80-some percent of Americans are identifiable just based on those three pieces of information.

All of this information we hand over to companies is out there forever. There’s no way to get it back, and it doesn’t decay. There’s a clause in basically every privacy policy that gives companies the right to hold on to your data if they’re bought out by someone else, or even sell off the data they collected on you to someone else if they’re bankrupt. There’s no temporal limit or any limit at all really.

There’s so little information on data retention. And this data can get more and more valuable over time. It also becomes more valuable when you combine it with other pieces of information.

Say I’m fine with you giving my genomic data to somebody else. What if that gets combined with the data that I'm putting on MyFitnessPal or the calorie counting app I’m using? And now maybe GSK [GlaxoSmithKline, the pharmaceutical company that partnered with 23andMe] has partnered with my insurance company, and now my insurance company is going to see my calorie count data and my genome data and think, this person has a predisposition to this particular disease, and they’re not eating very well, their habits could exacerbate the onset of the disease. And I wasn’t okay with that [information being used in that way], but the data got combined and now all of a sudden the company that has it can infer all of this information about me that I wasn't aware of or didn't necessarily consent to.

One of the key issues here is that people don't view their data as valuable, even though companies are using it to get rich. They just see it as something that they can exchange for access to Facebook or what have you. But if you need to use Gmail for work, you're going to agree regardless of what exactly you're agreeing to. Do you think there's a realistic way out of this double-bind?

I’m part of a research team at the University of Maryland where we surveyed people who used fitness trackers. We found that they were not very aware of the data practices of Fitbit and Jawbone. A lot of people were saying, “Oh well it’s just step data. Why would anyone need this?” Sure, step data seems really meaningless. The number of glasses of water you drink a day seems really meaningless. But if you collect all of this data over time it can tell you something about the person, especially if you tie location to it or something like that. That’s where I think that Strava data that came out earlier this year...

You mean when Strava published data from its social fitness app and accidentally revealed the location of secret U.S. military bases in the Middle East, right?

Yes!

Oh god yeah, that was dystopian.

People realize that this [data] can be connected to [their other] data, and then all of a sudden something that seemed really granular actually can be a concern. Yeah sure, one data point on its own might be meaningless or even the whole type of data point — like a step count — might be meaningless, but when you add in either temporal components, or multiple types of information like distance or calories or stuff like that, then you paint another picture.

And then of course combining data sets, like the calorie data plus your genetic data plus your grocery store purchasing data or whatever. It's not that one data point on its own is sensitive, it's that when all of these things come together, that's where the actual problem is. I think helping people to realize that is one way to make them care about [privacy policies].

Yeah, that certainly seems like a more attainable individual goal than, say, the legislative approach.

[For a lot of projects conducted by] civil society research organizations, Ranking Digital Rights included, the goal is to go after investors and say: Companies need to be transparent about what they're doing with data, because privacy and freedom of expression are really big factors that actually could pose a risk to your investment. Facebook is the poster child for privacy scandals, and we are starting to see that investors are paying attention to that. Obviously it’s good to have legislation so we have that floor that companies need to meet, and we’ve seen the Federal Trade Commission go after a lot of companies around their data practices — and other companies, not just tech companies — but I think all of the players in the ecosystem working together can really make change in this area.

I certainly hope so.

We talk about privacy as an individual thing: Have you checked your privacy settings? If you don't want to use the service, delete your account, or things like that. But that’s not going to solve these big issues that we’ve been talking about, like the value of data. The more that we can show that it’s not about, you know, your employer knowing your step count from your Fitbit, it’s about something much larger, then I think we can [fix things] and bring public attention to it.

This interview has been condensed and edited.
