Big Myths About Big Data


by Daniel J. Solove

The FTC held a workshop this Monday about Big Data. The term “Big Data” is used everywhere these days, and depending upon who is talking about it, Big Data is either the hippest thing in the world and the producer of miracles that will save the human race, or it is the scourge of all evil and the doom of freedom and democracy. I think that neither is the truth, and I want to dispel some myths about Big Data:

MYTH: Big Data is a new phenomenon.

Big Data is not new. Just the term is. We’ve long had Big Data. Credit reporting agencies have long amassed enormous profiles on individuals. Companies and the government have long amassed massive databases of personal information. And they have analyzed this data for patterns. Over the past 15 years, we’ve heard of databases, data mining, and fusion centers. All of these are Big Data. The only thing really new about Big Data is the cool new name.

MYTH: Because Big Data will lead to miraculous benefits that will save humankind, we must avoid regulating Big Data.

I repeatedly hear over-the-top claims about how Big Data will produce astounding new benefits that can save lives and revolutionize the world. Big Data certainly has many great benefits. I think that a lot of great things can be learned by combining data and analyzing it. But it is important to distinguish between benefits to consumers and society and benefits that flow primarily to the companies and organizations using Big Data. Oftentimes, organizations cite wonderful benefits to justify Big Data, but then use it primarily for private gain.

Just because some uses of Big Data have big benefits doesn’t mean that all uses of Big Data are equally good or equally promising.

MYTH: Big Data is currently unregulated, and we should be cautious about introducing new regulation before we fully understand the implications of Big Data.

Some argue that Big Data is unprecedented, and we should be very cautious about regulating it. But the credit reporting agencies (CRAs) were “Big Data” and we regulated them back in the 1970s with the Fair Credit Reporting Act (FCRA). There needed to be regulation — the legislative history of FCRA is filled with examples of bad behavior by the CRAs. FCRA was passed to make them more responsive to consumers and to provide people with basic rights regarding their data.

The problem today is that many entities amassing and using personal data are not regulated by FCRA. The answer isn’t necessarily to throw everyone into FCRA, but we should look to FCRA for lessons in how to regulate Big Data.

MYTH: The best way to deal with Big Data is to provide more transparency to people about how their data is being collected and used.

At first blush, this sounds great. More transparency. Who would argue with that? I’m all for transparency. But transparency is not a cure for the problems of Big Data.

The problem is that even with transparency, people can’t understand the consequences of how their data might be used or make reasonable cost-benefit determinations about particular uses of their data. If Big Data were innately evil, then the answer would be easy: Stop it. If Big Data were pure goodness, then the answer would also be easy: Spread the joy!

But the challenge is that Big Data has some great benefits and also some costs. It can reinforce stereotypes. Big Data can potentially be used in ways that harm minorities or low-income communities. It can be used to deny people opportunities or even exposure to certain ideas, products, or other things. It can be used in ways that cause harm, that uncover secrets, and that invade privacy.

Because Big Data has potential benefits and potential costs, we must make a cost-benefit determination about it. And that is hard because the benefits and costs are often very hard to measure for specific transactions, especially in advance. The great difficulty with privacy is that the harms are often cumulative in nature, and these are challenging for people to assess at the time of each individual data transaction. And there are simply too many entities that collect, use, and disclose people’s data for the rational person to be able to manage. For more of my pessimism about people’s ability to weigh the costs and benefits of Big Data, see my article, Privacy Self-Management and the Consent Dilemma, 126 Harvard Law Review 1880 (2013).

So although transparency is important, it is just a small part of the solution.

* * * *

This post was authored by Professor Daniel J. Solove, who through TeachPrivacy develops computer-based privacy training, data security training, HIPAA training, and many other forms of training on privacy and security topics. This post was originally posted on his blog at LinkedIn, where Solove is a “LinkedIn Influencer.” His blog has more than 800,000 followers.

