In the good old days – let’s say, 10 years ago – data protection was synonymous with technologies such as encryption and tokenization. By transforming the data or making it less visible, companies were able to protect their data and that of their customers. No more.
Times have changed, and with them, so has the approach to data privacy. New regulations pivot on the notion that data belongs to individuals, not the enterprises that collect it. Instead of just masking or hiding the data, companies need to provide fundamental data accountability to their employees and customers.
But how can enterprises be accountable for data when they don’t know what they have, where it is, who it belongs to, where it’s been, or where it’s going? To meet today’s mandates and user expectations, companies need to completely rethink data protection.
Personal Information, Defined
Data privacy is a game changer. Until recently, though, it was very low on the list of enterprise priorities – even for chief information security officers, CIOs, and board members.
The breach regulations that were introduced over a decade ago were based on the idea of protecting personally identifiable information (PII). PII is highly identifying information: data that can be uniquely correlated with an individual, such as a Social Security number or a credit card number.
Data privacy is different. It requires companies to take responsibility for the collection of data that belongs to an individual – a concept known as personal information (PI).
The trouble is that PI is not necessarily highly identifiable. Here’s an example: a birthdate is one date in the 365-day calendar. That’s not highly identifiable. A GPS location is a point on the globe. Again, not highly identifiable. An eight-digit string of numbers could refer to many things.
But when these bits of data are in a different context, they can be highly personal and thus qualify as PI. If the birthdate is mine, or the GPS location was collected as part of my mobile session, or if the string of digits is my password to an application, it’s all highly personal and highly confidential. When data such as an IP address, cookie, session key, date, gender, birthdate, password, or location is about me, that is PI.
Preliminary Data Identification Processes
New privacy regulations require companies to find and protect PI. The European Union’s General Data Protection Regulation (GDPR), which took effect in May 2018, and the California Consumer Privacy Act (CCPA), which came into effect January 1, 2020, are two high-profile examples of PI legislation. More than 20 other U.S. states have privacy-related draft bills and there is talk of a federal law.
With this legislative push, companies can no longer ignore the need to protect data privacy. Decision-makers need to get acquainted with the implications of these laws and identify compliance gaps.
Most organizations begin by revising their data identification and classification processes. They look for ways to find and identify data manually, because that seems like the simplest approach. Then they implement policies to reduce the scope of things that fall under their responsibility. The majority of companies are still at this point in the data privacy process.
Before long, however, organizations realize that the ROI is so unattractive and the accuracy of these processes is so poor that they need to replace manual efforts with automated approaches.
The only way to identify PI is to use context to determine whether data is personal. That requires a completely new way to examine and assess data. That’s where innovative new technologies come in.
New purpose-built PI technologies address these privacy-centric data discovery and data intelligence use cases. The solutions bring data science, machine learning, and advanced data insight to the challenges of data privacy, helping enterprises safeguard and steward data by finding it and learning its context. The solutions also help companies track and govern their customer data at scale, which is important when dealing with huge and growing volumes of data.
New enterprise data intelligence technologies work with different IT systems, applications, and products – on-premises or in the cloud – to discover PI. Using context, they automatically find hidden information and relationships among data to identify PI and inventory it by data subject and residency. Advanced solutions use dozens of parameters to score the data and then build a map of the data and its flows, which is especially important for tracking ephemeral data assets.
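To make the idea of an inventory "by data subject and residency" concrete, here is a minimal Python sketch. It is an illustration only, not how any particular product works: the `Finding` record, its field names, the `build_inventory` helper, and the sample systems (`crm`, `web_logs`, `data_lake`) are all hypothetical, and the discovery step that would produce the findings is assumed to have already run.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One discovered piece of PI (hypothetical record shape)."""
    subject_id: str  # assumed identifier for the person the data is about
    residency: str   # assumed country code for that person
    source: str      # system where the value was found
    attribute: str   # kind of value, e.g. "birthdate"


def build_inventory(findings):
    """Group findings by (subject, residency), then by source system."""
    inventory = defaultdict(lambda: defaultdict(set))
    for f in findings:
        inventory[(f.subject_id, f.residency)][f.source].add(f.attribute)
    # Convert to plain dicts with sorted attribute lists for stable output.
    return {
        key: {source: sorted(attrs) for source, attrs in sources.items()}
        for key, sources in inventory.items()
    }


# Illustrative sample data, not real findings.
findings = [
    Finding("subject-1", "DE", "crm", "birthdate"),
    Finding("subject-1", "DE", "web_logs", "ip_address"),
    Finding("subject-1", "DE", "crm", "email"),
    Finding("subject-2", "US", "data_lake", "gps_location"),
]
inventory = build_inventory(findings)
```

Keying the inventory on the person rather than on the system is the design choice that matters here: it lets a company answer "what do we hold about this individual, and where?" with a single lookup.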
This data privacy technology is basically the IT version of accounting standards like GAAP. Before GAAP, there was no standardized way of tracking deposits and withdrawals in financial institutions. The introduction of standards helped banks identify funds and report information in a standard way – allowing any analyst or observer to understand the health of the business.
With data, organizations traditionally collected information from individuals, but what happened to it afterwards was unclear. With no GAAP-like standards, it was up to the enterprise to determine how or whether the data was protected, tracked, or reported.
Now people say that “Data is the new oil,” or “Data is the new currency of the digital enterprise.” New data privacy regulations recognize data’s increasing importance. But they also demand that organizations reconsider data and how they protect it.
To be compliant, companies must know where they got their data, who can access that data, and whose data they have. They need insight into where they stored the data and who they shared it with. And if they are sharing data, enterprises need to know why they are sharing it and whether they have the permission of the data owners to do so. The answers to these questions not only help companies meet these compliance requirements but also get a handle on their most important assets.
Opportunity for New Business Value
The way enterprises understand, process, and protect their data influences the type of consent management functionality they offer users. Until recently, gaining user consent was a matter of asking users – repeatedly – to agree to allow their data to be collected and used. These repetitive pop-ups and interruptions can be overwhelming for users.
Fortunately, regulations like CCPA make it easier for users to opt out of data collection by inverting the power dynamic. Instead of a long series of radio buttons requesting unlimited rights to data, new consent management features allow individuals to quickly and easily refuse to allow organizations to resell or reuse their data. The responsibility then falls to each enterprise to ensure there is no violation of the user’s opt-out request.
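As a rough illustration of what honoring an opt-out involves, the following Python sketch filters records before they are shared or resold. The `ConsentRegistry` class, its method names, and the record layout are assumptions made for this example; a real system would also need to persist opt-outs and apply the check in every downstream flow.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRegistry:
    """Tracks which data subjects have opted out (hypothetical design)."""
    opted_out: set = field(default_factory=set)

    def record_opt_out(self, subject_id):
        self.opted_out.add(subject_id)

    def may_share(self, subject_id):
        # Sharing is allowed unless the subject has explicitly opted out.
        return subject_id not in self.opted_out


def filter_shareable(records, registry):
    """Keep only records whose data subject has not opted out."""
    return [r for r in records if registry.may_share(r["subject_id"])]


registry = ConsentRegistry()
registry.record_opt_out("subject-2")

# Illustrative records; the "subject_id" key is an assumed convention.
records = [
    {"subject_id": "subject-1", "attribute": "email"},
    {"subject_id": "subject-2", "attribute": "gps_location"},
]
shareable = filter_shareable(records, registry)
```

The check only works if every record can be tied back to a data subject, which is why the opt-out model depends on the granular insight described next.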
Yet meeting this challenge requires companies to gain more granular insight than has been available previously. Enterprises need to know where all of a person’s data is throughout the data lifecycle – whether it resides in files, data warehouses, data lakes, business solutions, mail applications, or messaging apps, to name just a few possibilities. Then they need to be able to disambiguate the information, knowing when an eight-digit numeric string is just a sequence of numbers and when it is a password. Also important is the ability to find contextual PI and connect it to an individual, which requires understanding of data both at a single point in time and as it evolves over time.
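The eight-digit example can be sketched in a few lines of Python. This is a deliberately simplified illustration under stated assumptions: the `classify` function, the field-name hints, and the labels it returns are hypothetical, and real tools would weigh many more signals than a field name and a link to a person.

```python
import re

EIGHT_DIGITS = re.compile(r"\d{8}")

# Hypothetical hints: substrings of a field name mapped to a PI label.
FIELD_HINTS = {
    "password": "password",
    "passwd": "password",
    "birth": "birthdate",
    "dob": "birthdate",
}


def classify(field_name, value, subject_id=None):
    """Label an eight-digit string using its surrounding context."""
    if not EIGHT_DIGITS.fullmatch(value):
        return "not_eight_digits"
    if subject_id is None:
        # No link to a person: just a sequence of numbers.
        return "unlinked_number"
    lowered = field_name.lower()
    for hint, label in FIELD_HINTS.items():
        if hint in lowered:
            return label
    # Linked to a person, but the field name gives no further clue.
    return "linked_number"
```

For example, `classify("order_number", "20200101")` yields `"unlinked_number"`, while the same string in a field named `user_password` and tied to a known subject is labeled `"password"`. The value is identical in both cases; only the context changes the outcome.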
All of this information is critical to meeting new PI compliance requirements. More importantly, it can help companies get more value from their data assets. With the proper context, organizations can know where customer data is – across multiple countries, languages, and businesses. Essentially, they’ll have a much richer understanding of the crown jewels of the organization.
With that understanding, not only is there an opportunity to do better – in terms of revenue and profitability – but enterprises can more effectively protect their assets. Context-enabled insight allows companies to reduce data duplicates and rationalize the infrastructure needed to support the data. It also helps identify the right time to consolidate servers or migrate data to the cloud.
What’s more, a complete inventory of the data can help companies identify potential vulnerabilities, areas of exposure, and potential for non-compliance. They can also better safeguard data, get more value from it, and reduce overall costs. And that’s value that today’s businesses cannot resist.
About Horizons by SAP
Horizons by SAP is a future-focused journal where forward thinkers in the global tech ecosystem share perspectives on how technologies and business trends will impact SAP customers in the future. The 2020 issue of Horizons by SAP focuses on Context-Aware IT, with contributors from SAP, Microsoft, Verizon, Mozilla, and more. To learn and read more, visit www.sap.com/horizons.
Dimitri Sirota is CEO of BigID.