
Modern Data Sharing—Even Sensitive Data
The world is awash with data, and that data is in demand to drive the healthcare, financial, and personal applications that promise to improve everyday life. These applications, often powered by machine learning and artificial intelligence, depend on some of the most private and sensitive data the world has to offer. To complicate matters, businesses often need data they do not own, which creates a need to share data between people and entities. How can we satisfy this need while still protecting the private and sensitive nature of our data?
Enterprise Data Sharing
Today, organizations may share data with each other through partnerships, or may purchase new information directly from data providers. This new and growing segment is often referred to as alternative data in the financial markets and as health information exchanges in the healthcare space. If the data in question is regulated in some way (for example, healthcare data is subject to HIPAA), the methods for protecting and sharing it are generally specified. These methods may include anonymization, tokenization, or de-identification. However, none of these options offers a provable measure of privacy protection, and they often give data owners a false sense of security. In fact, breaches are frequently reported on data sets that are supposedly “anonymous.” Worse, most of these techniques also destroy data value by removing or significantly altering the underlying data. And in today’s typical case, data sets are physically sent between parties, meaning the data owners lose control of their asset.
The Future of Data Sharing
Data sharing will not stop. In fact, the applications of tomorrow will demand more access to sensitive data sets to deliver next-generation capabilities. We therefore have to plan to support the dual goals of data utility and data privacy. Here are the top five things to consider when sharing data in the future:
- What’s the threat? Teams should consider which threats they are protecting the data from: an internal bad actor, an attack on data at rest, or the exposure of individual records. Each type of threat calls for its own defense (this blog post discusses some common technologies in the privacy and security space).
- Provable privacy. Data owners should not be satisfied with heuristic or “hope and pray” methods to protect their data. They should implement frameworks that protect sensitive and confidential data in a quantifiable way.
- Retaining ownership. Businesses should stop “shipping data” and instead look for ways to share data access, retaining full control of their data while still allowing others to use it for analysis.
- Maximum value. Teams should adopt an “analyst first” view of their data, working to simplify both the access to and the value of their data sets. Importantly, teams should shy away from any protection method that destroys data value by removing or otherwise modifying the underlying data.
- Commercialization. Data monetization will become an important new, high-margin revenue stream for many businesses. Data owners should acknowledge and plan for this by working on private and secure go-to-market strategies that can both support business growth and protect data.
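To make the “provable privacy” and “retaining ownership” points above more concrete, here is a minimal sketch of one widely used quantifiable framework, differential privacy. This post does not name a specific framework, so treat the choice of technique as an assumption made purely for illustration. The function name `private_count`, the record layout, and the `epsilon` values are all hypothetical. The idea is that the data owner keeps the raw records and releases only a noisy aggregate, with the privacy loss bounded by `epsilon`.

```python
import random


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy `predicate`.

    Illustrative assumption: a counting query changes by at most 1 when
    one record is added or removed (sensitivity 1), so Laplace noise with
    scale 1/epsilon yields epsilon-differential privacy for this query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()

    # The exact answer never leaves this function.
    true_count = sum(1 for record in records if predicate(record))

    # The difference of two independent Exponential(rate=epsilon) draws
    # follows a Laplace distribution with location 0 and scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


if __name__ == "__main__":
    # Hypothetical toy records, invented for this example only.
    patients = [{"age": age} for age in (25, 40, 61, 70, 33)]
    noisy = private_count(patients, lambda r: r["age"] > 50, epsilon=0.5)
    print(f"Noisy count of patients over 50: {noisy:.2f}")
```

A smaller `epsilon` adds more noise and gives stronger privacy. A larger one gives more accurate answers. This is a teaching sketch only: a production system would also need to track the total privacy budget across queries and guard against floating-point side channels.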
Tomorrow’s data landscape will look very different from today’s: the world will see more interconnected and shared data supporting ever more complicated applications (e.g., self-driving cars or faster drug discovery). Data owners have commercial, reputational, and regulatory responsibilities to ensure that they can adequately share and protect their most sensitive data assets. This is why we started LeapYear. To learn more about how our platform protects privacy and ensures maximum data value, have a look at our product page.