I recently joined LeapYear, which offers a software solution that provides mathematically rigorous privacy for sensitive data while also giving analysts the ability to generate valuable insights from those same datasets. It’s a fairly significant (and admittedly risky) career change after more than 10 years in the alternative data space, where I worked as the ‘middleman’, bringing unique data signals to equity, credit, and macro investors who leveraged them as inputs in their respective investment strategies.
Why pivot from alternative data to software when every financial and industry publication seems to agree that alternative data is still in its infancy? I’ll tell you why. From where I sit, the industry has largely matured, and it looks like we’re hitting the ‘disillusionment’ phase of the hype cycle in the alternative data market. For context, that’s the phase after early adopters find value, everyone else piles in, and the limitations of the solution(s) start to surface. These limitations are myriad. Many high-profile datasets have been in the market for years, and there seems to be a collective ‘now what’ from both buyers and sellers. Data companies are fighting for dollars while investors are fighting for alpha. At the same time, we’re seeing the beginnings of what will likely be a massive wave of privacy regulations, starting with GDPR, moving to CCPA, and potentially extending to federal policy in the US. Worse, consumers are becoming more and more concerned about the security of their personal data and how it’s being used. And who can blame them? The alternative data ecosystem already deals with a host of concerns around data privacy, resulting in many datasets simply staying off the market or being stripped of value through anonymization and aggregation. This is creating greater headwinds for the industry, decreasing the ability to use data for maximum impact. Seeing the maturation of the data ecosystem and the rising tide of regulations, it felt like the time to consider “what’s next” for me.
That (you guessed it) is where LeapYear comes in. When I first spoke to our CEO, Ishaan, I had a hard time believing that the team had actually created enterprise-ready software that could provide privacy without leaving some of the data’s value on the table. And, as a bonus, the software ensures data never has to leave the owner’s environment and is protected from reverse engineering attacks? Psht. Didn’t believe it. After all, for the last 10 years all of us in the alternative data industry have created or consumed redacted or aggregated datasets in the name of ensuring privacy. It’s “how we’ve always done it”. But it’s time to broaden the scope of possibility. Data quality, the need for differentiation, and privacy standards have changed.
So where is the catch? What price do I pay to achieve simultaneous use of sensitive data and privacy with LeapYear? We all know that most of the data’s value lives in the most sensitive information, and nearly all the traditional forms of protecting privacy remove or aggregate these fields. This blog isn’t the right place for a full explanation, but the short answer is that the catch is differential privacy and its implementation in our software. The system automates privacy, introducing just enough randomization to the computation itself (never the data) to protect privacy. Here’s the thing though: it’s not really a catch. Any honest analyst will tell you that a nominal amount of variance (let’s say +/- 0.5%) for any model or forecast is tolerable and in most cases insignificant in a broader analytical context. And it’s an extremely small price to pay versus today’s status quo: not being able to compute on that extremely valuable sensitive data at all.
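For readers who want a feel for what “randomizing the computation, not the data” can look like, here is a minimal sketch of the textbook Laplace mechanism applied to a simple count. To be clear, this is not LeapYear’s implementation, which I’m not describing here; the function name `noisy_count`, the parameter `epsilon`, and the synthetic `balances` data are all assumptions made up for illustration.

```python
import random

def noisy_count(records, predicate, epsilon, rng=random):
    """Return a count with Laplace noise added (illustrative sketch only)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # The raw records are read but never modified or released.
    true_count = sum(1 for r in records if predicate(r))
    # Adding or removing one record changes a count by at most 1,
    # so the sensitivity is 1 and the Laplace scale is 1 / epsilon.
    # The difference of two independent Exponential(rate=epsilon) draws
    # is Laplace-distributed with mean 0 and scale 1 / epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise

# Hypothetical data: 50,000 synthetic account balances.
rng = random.Random(7)
balances = [rng.uniform(0, 10_000) for _ in range(50_000)]
answer = noisy_count(balances, lambda b: b > 5_000, epsilon=0.5, rng=rng)
print(round(answer))
```

In this sketch the noise has a scale of 1 / epsilon, so with epsilon = 0.5 the reported count is typically off by only a few records out of roughly 25,000, while the analyst never sees an exact answer that could be traced back to one individual’s record.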
As I learned about the technology and the use cases we have in all the major regulated industries, LeapYear’s value to the alternative data ecosystem became clear to me. We now have a way to bring never-before-seen, extremely sensitive data to market, enabling huge value for buyers and new revenue for data owners. We can go back to datasets that have been stripped of value through anonymization and aggregation, and put back those sensitive fields for improved modeling and insights. We can help alternative data providers further differentiate their products and services. And if data no longer has to be pre-aggregated and “shipped off”, we can even rethink how business relationships in the ecosystem are architected. I believe LeapYear’s software will provide a key transformational capability to the alternative data space, and helping my industry move to its next stage of maturity and growth is why I joined. I’d love to talk with you about my vision and our technology. Feel free to reach out directly or contact me here, and we can discuss how LeapYear can enable you to get more value out of the alternative data space.