Locating Personal Data and Tracking Privacy Rights

Locating Personal Data and Tracking Privacy Rights: An Interview with Dimitri Sirota

One of the biggest challenges for organizations is locating all the personal data they have. This task must be done, however, to comply with the General Data Protection Regulation (GDPR) and other privacy laws. Moreover, the GDPR and the new California Consumer Privacy Act give individuals rights regarding their data. These rights often require organizations to keep records of individuals’ privacy preferences.

I had the opportunity to interview Dimitri Sirota about these issues. Dimitri is the CEO and co-founder of one of the first enterprise privacy management platforms, BigID, and a privacy and identity expert.

Dimitri Sirota

Dimitri is an entrepreneur. He previously founded two enterprise software companies: eTunnels, focused on security, and Layer 7 Technologies, focused on API management, which was sold to CA Technologies in 2013. His current company, BigID, provides technology to help organizations track and govern their customer data at scale. By bringing data science to data privacy, BigID aims to give enterprises the software to safeguard and steward the most important asset organizations manage: their customer data.

SOLOVE: Why did you choose to focus on the challenge of organizations being able to better find their data?

SIROTA: The concepts of privacy and data protection are key business challenges of the 21st century. BigID recognized that data privacy would be front and center of digital business even before the landmark EU GDPR was finalized. If data is the new oil, organizations must find a better way to govern how data flows and prevent spills. Core to this mission is delivering a systematic understanding not just of what data a company holds, but of whose data it is and what is being done with that data. Privacy should be integral to how companies treat and manage their data, and BigID recognized that privacy-first, data-driven tools were needed to fill the void.

With digital transformation gaining ground, enterprises have invested in ways to collect, process and analyze personal data at unprecedented scale – but they didn’t have a systematic way of assessing the risk they were accumulating, or the privacy implications. To extract value from data, minimize it, and manage and process it responsibly, the necessary starting point is data knowledge. This is the missing part of the data privacy and data protection puzzle that BigID set out to address.

SOLOVE: One of the greatest challenges of identifying personally identifiable information (or “personal data” in the EU) is that this category of data often includes data that isn’t currently identified to a person but that could be potentially identifiable to a person. How do you best determine whether data is identifiable?

SIROTA: There are two primary elements to this question – the first is the technical element of how you factor in content and context to establish how data relates to an entity – whether a data subject in GDPR parlance or an entity in data management terms. In our case, relating and indexing personal data to an entity is accomplished through correlation and machine learning. We approach this process without having to rely on pattern matching, manual tweaking or a set of pre-defined classification models. Instead, we work from the individual data values up. We look at whether a value is uniquely identifiable, quantify how highly correlated it is to other known data values and then ask whom the data identifies.

The reason why identifying personal data, or personally identifiable information, is so difficult is that the technology currently in use is not designed for the task at hand. For the most part, legacy technology simply tries to find PII using regular expression classifiers designed to solve PCI-era problems. However, these classifiers fail to locate personal information that is personal not because it can be defined by a regular expression but because it is associated with a person. Privacy is about people, and that requires knowledge of what data is associated with each person.
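To make that contrast concrete, here is a minimal Python sketch. It is not BigID's method: the interview describes correlation and machine learning, while this toy version swaps in a plain exact-match lookup against values already known to belong to a person. Every name and record in it (`KNOWN_PEOPLE`, `regex_is_pii`, `build_value_index`, `who_does_row_identify`, the loyalty IDs) is hypothetical and exists only for illustration.

```python
import re

# A shape-based classifier of the kind the interview calls "PCI era":
# it flags a value only if the value itself looks like a US SSN.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")

def regex_is_pii(value):
    """Return True if the value matches the SSN pattern."""
    return bool(SSN_PATTERN.match(value))

# Hypothetical identity records the organization already knows about.
KNOWN_PEOPLE = {
    "p1": {"email": "ana@example.com", "loyalty_id": "LX-20417"},
    "p2": {"email": "raj@example.com", "loyalty_id": "LX-88310"},
}

def build_value_index(people):
    """Map each known data value to the set of people it belongs to."""
    index = {}
    for person_id, attributes in people.items():
        for value in attributes.values():
            index.setdefault(value, set()).add(person_id)
    return index

def who_does_row_identify(row, index):
    """Count, per person, how many values in the row match known values."""
    hits = {}
    for value in row:
        for person_id in index.get(value, ()):
            hits[person_id] = hits.get(person_id, 0) + 1
    return hits

index = build_value_index(KNOWN_PEOPLE)
row = ["LX-20417", "size M", "2018-06-01"]

print([regex_is_pii(value) for value in row])  # [False, False, False]
print(who_does_row_identify(row, index))       # {'p1': 1}
```

No value in the row has a recognizable shape, so the pattern-based classifier reports nothing. The lookup ties the row to person `p1` through the loyalty ID, which makes the clothing size and the date personal data by association.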

SOLOVE: GDPR and other privacy laws are increasingly requiring various rights for various types of data, as well as maintaining information from people about their preferences with regard to how that data is used and maintained. This requires keeping additional data about the personal data. At a general level, can this be done efficiently and without considerable extra cost and burden to organizations?

SIROTA: The answer to this question boils down to consent. The California Consumer Privacy Act (CaCPA) mirrors GDPR in many aspects but is not as explicit about consent, although it does provide opt-outs. The first point to make, regardless of the specific regulation, is that organizations are now accountable for the data they collect and process. One aspect of being accountable is making sure that customers have some understanding of what data is being collected and with whom it is being shared. Even with no formal requirements for consent under CaCPA, we anticipate that responsible covered companies will seek consent from their consumers, regardless of regulatory requirements.

This requires greater corporate visibility into what consent organizations have for data and individuals. This is made difficult by the fragmented nature of consent collection across Web, mobile, IoT and non-digital channels. BigID’s point of view on consent is that organizations need a consent governance framework that provides centralized visibility into consent without centralizing the consent collection itself, which is often impractical in real enterprises with existing applications that are challenging to retrofit.
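One way to picture centralized visibility without centralized collection is a read-only view built over the consent records that each channel already keeps. The Python sketch below is an assumption-laden illustration, not a description of BigID's product: the `ConsentRecord` fields, the `latest_consent` helper and the sample web and mobile records are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class ConsentRecord:
    """One consent decision as captured by a single channel (hypothetical schema)."""
    person_id: str
    purpose: str
    granted: bool
    channel: str
    recorded_on: date

def latest_consent(records):
    """Return the most recent record for each (person, purpose) pair."""
    view = {}
    for record in records:
        key = (record.person_id, record.purpose)
        if key not in view or record.recorded_on > view[key].recorded_on:
            view[key] = record
    return view

# Each channel keeps collecting consent in its own system, unchanged.
web_records = [
    ConsentRecord("p1", "marketing", True, "web", date(2018, 5, 1)),
]
mobile_records = [
    ConsentRecord("p1", "marketing", False, "mobile", date(2018, 7, 15)),
    ConsentRecord("p2", "analytics", True, "mobile", date(2018, 6, 2)),
]

# The central view only reads from the channels; it collects nothing itself.
view = latest_consent(web_records + mobile_records)
current = view[("p1", "marketing")]
print(current.granted, current.channel)  # False mobile
```

The applications are not retrofitted here. The view only reconciles what they report, so the later mobile opt-out overrides the earlier web opt-in for the same person and purpose.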

SOLOVE: Whose job is privacy?

SIROTA: Because of GDPR and CaCPA, the chief privacy officer’s role has become more visible within the enterprise, but the responsibility of ensuring data privacy extends far beyond the CPO – policy, process and technology must become integral to how today’s organizations operate.

A role that is taking on increased responsibility is that of the chief data officer. Operating under modern regulation, organizations must apply privacy by design to how they gather, collect, ingest and analyze data – this begins with the CDO.

In addition, security teams must consider data protection more broadly – this means understanding privacy vs. context and ensuring controls and monitoring incorporate privacy risk.

In essence, the job of privacy within an organization is that of the entire organization, starting from the board level down. Data is an essential part of modern organizational success that is touched by a great number of parties, all of which must do their part to ensure privacy.

SOLOVE: Thanks, Dimitri, for your insights.

* * * *

This post was authored by Professor Daniel J. Solove, who through TeachPrivacy develops computer-based privacy and data security training. He also posts at his blog at LinkedIn, which has more than 1 million followers.

Professor Solove is the organizer, along with Paul Schwartz, of the Privacy + Security Forum (Oct. 3-5, 2018 in Washington, DC), an annual event designed for seasoned professionals.

This post was originally posted on LinkedIn.

