Fluree Sense
Fluree Blog | Blog Post | Kevin Doubleday | 09.22.22

Why we’ve joined forces with ZettaLabs

Fluree and ZettaLabs officially merge to build best-in-class data-centric infrastructure.

It was fall 2008, and Eliud Polanco had a problem. The financial crisis had just hit. Polanco, a data & analytics expert, was sifting through endless Excel spreadsheets and mainframe printouts to figure out the risk exposure of the large global universal bank where he worked. The spreadsheets contained information about the financial products sold and traded, as well as risk forecasts and models produced by quants (quantitative analysts who wrote code for pricing, high-speed trading, and profit maximization). Not only was the data within the mainframe reports and spreadsheets complex, but it was also spread out around the world.

This was one of the largest banking institutions in the world, with offices in 100+ countries and hundreds of thousands of employees. Branches, departments, and countries had their own systems, each designed to make day-to-day life easier for the function its people worked in. When you zoomed out and tried to look at all those systems as a whole, however, things grew chaotic.

One of Polanco’s first steps was to put everything in a data warehouse, which then evolved into one of the largest data lakes at the time. He then methodically tried to query and sort the data using every analysis tool on the market. All of them came up short. 

It wasn’t terribly surprising. Master data management, the ability to view, manage, and analyze all data from a single pane of glass, has always been an elusive goal. Big, complex organizations have data in an array of siloes, from cloud applications like Salesforce to custom-built, department-specific ERPs. Trying to work with all the data in one place requires automation, and that, in turn, leads to unanticipated consequences. Software updates changing data is a common example. Approximately 62 billion hours of data and analytic work “are lost annually worldwide due to analytic inefficiencies,” according to the Data and Analytics in a Digital-First World report.

Polanco decided to build his own master data management platform. The software would not only help him understand the big picture of the organization’s data but also let him dig into specifics without getting mired in bug fixes. Wall Street has long been a first mover in AI and machine learning applications, and Polanco brought that innovative mindset to his platform, using unsupervised machine learning to crawl data and supervised machine learning to ask humans the right questions about that data.

The result was ZettaLabs, now Fluree Sense. By taking advantage of two kinds of machine learning, Fluree Sense organizes even the most diverse and chaotic data to 90th percentile accuracy. When you send a query, an unsupervised machine learning algorithm crawls the data sets in your data lake and aggregates potential answers. Next, a supervised machine learning algorithm performs entity resolution, grouping together names, addresses, and other attributes that seem to refer to the same person or object. The algorithm then creates questions for humans to answer and fires them off to subject matter experts. The experts respond, the algorithm learns, and Fluree Sense serves up your data on a color-coded plate.

The image shows the Fluree Sense pipeline.
The image shows the resolution results from running Fluree Sense.
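
The entity resolution step described above can be illustrated with a toy sketch. This is not Fluree Sense's actual algorithm: plain string similarity from Python's standard library stands in for the learned model, and the record fields, thresholds, and function names (`similarity`, `resolve`, `auto_merge`, `ask_human`) are all hypothetical. Pairs that score high enough are merged automatically, while borderline pairs are queued as questions for a subject matter expert.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: dict, b: dict) -> float:
    """Average case-insensitive string similarity over name and address.

    A stand-in for a trained model; a real system would learn this score.
    """
    fields = ("name", "address")
    scores = [
        SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio()
        for f in fields
    ]
    return sum(scores) / len(scores)


def resolve(records, auto_merge=0.9, ask_human=0.5):
    """Group records that look like the same entity.

    Returns (groups, questions): groups are lists of record indexes that
    were merged automatically; questions are (i, j, score) pairs that are
    too uncertain to merge and should go to a human reviewer.
    """
    parent = list(range(len(records)))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    questions = []
    for i, j in combinations(range(len(records)), 2):
        score = similarity(records[i], records[j])
        if score >= auto_merge:
            parent[find(j)] = find(i)  # confident match: merge groups
        elif score >= ask_human:
            questions.append((i, j, round(score, 2)))  # ask an expert

    groups = {}
    for i in range(len(records)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values()), questions


# Hypothetical sample data, invented for illustration only.
records = [
    {"name": "Jane Smith", "address": "12 Main Street"},
    {"name": "JANE SMITH", "address": "12 main street"},
    {"name": "Jane Smith", "address": "PO Box 950"},
    {"name": "Robert Young", "address": "77 Elm Court"},
]

groups, questions = resolve(records)
print(groups)     # [[0, 1], [2], [3]]
print(questions)  # pairs (0, 2) and (1, 2), each with its similarity score
```

Here the two differently cased copies of the same record merge on their own, while the record with the same name but a different address is routed to a human. In a setup like the one the article describes, the experts' answers would then be fed back as labeled examples to improve the supervised model.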

In a company as complex as the large global bank, ZettaLabs/Fluree Sense was a boon. Polanco was able to complete the seemingly impossible organization of quant data. He also realized that many other big organizations had chaotic data sets, and built a company around his software. ZettaLabs/Fluree Sense went on to be used for fraud and money laundering risk detection as well as for customer-facing purposes, such as new customer acquisition, upsell/cross-sell opportunities, and customer delinquency/churn.

Fluree Acquires ZettaLabs

Fluree has long known that organizations are going from data as a byproduct, where it is stuck in cloud instances and other siloes, to data as the product. Those organizations that can leverage data effectively will come out ahead in the transition to Web3. Designed for Web3, Fluree lets users wrap policy around data for permissions, lets machines collaborate around data, and lets users time travel to verify and validate data at different moments in time, among other data-centric features.

An ecosystem of startups and enterprise and governmental pilot programs flourished on Fluree. Big, complex organizations with legacy data, however, still had their mess of existing data to figure out. ZettaLabs was designed for exactly that problem, providing a bridge between legacy data infrastructure and data-centric Web3.

“Dealing with legacy infrastructure is one of the biggest challenges for modern businesses, but nearly 74% of organizations are failing to complete legacy data migration projects today due to inefficient tooling and a lack of interoperability,” said Fluree CEO and Co-Founder Brian Platz. “By adding the ZettaLabs team and product suite to our own, Fluree is poised to help organizations on their data infrastructure transformation journeys by uniquely addressing all major aspects of migration and integration: security, governance, and semantic interoperability.”

“We developed our flagship product, ZettaSense, to ingest, classify, resolve and cleanse big data coming from a variety of sources,” said Eliud Polanco, co-founder and CEO of ZettaLabs, who will become Fluree’s president. “The problem is that the underlying data technical architecture—with multiple operational data stores, warehouses and lakes, now spreading out across multiple clouds—is continuing to grow in complexity. Now with Fluree, our shared customer base and any new customers can evolve to a modern and elegant data-centric infrastructure that will allow them to more efficiently and effectively share cleansed data both inside and outside its organizational borders.”

A Golden Record for Data 

Digital transformation is a journey that often takes multiple years. Cleaning data by collecting, integrating, reconciling, and remediating it across the organization is the gargantuan first step, and a massive technical challenge. Most big organizations have multiple databases and operational data stores, many of them containing low-quality data. Even after purchasing data warehouses and governance tools, organizations find their analytics stymied by data quality.

Then there are cultural challenges. Business units often have their own data stores that are custom configured, often in SaaS software such as a Salesforce instance. Merging that instance into an organization-wide process and workflow is an interruption to daily life—and threatens the KPIs of that business unit. Imagine you’re a salesperson with a quota to reach, and IT comes in wanting to reconfigure your Salesforce for a few months. The resistance is understandable but also delays digital transformation.

Fluree Sense solves the problem by getting data activation-ready within weeks, not months. The machine learning algorithm crawls data lakes, automatically integrating and cleansing data. The combination of supervised and unsupervised ML quality-assures 90% of the data, leaving a much smaller workload for humans. Cleaned and quality-assured data becomes available to the entire organization, accelerating the move to modern data-centric architecture.

With Fluree Sense, you can: 

  • Use machine learning to normalize, cleanse, and harmonize disparate data sources, with no additional software needed.
  • Programmatically scan and fix billions of rows, cleaning up to 90% of data, within a few months.
  • Create order and structure out of chaotic data sets, gaining the ability to explore 360-degree relationships across all data assets.
  • Preserve and enforce data governance, data security, and data privacy, while also simplifying data provenance and lineage.

For a large enterprise, data migration costs up to $150 million. Fluree Sense can successfully operate at a small fraction of that cost. Organizations can consolidate business data from various stores, sometimes dozens of warehouses, onto the single cloud that is their data lake. Fluree Sense organizes and cleans it in one place, with a small team using low- and no-code software to manage data. Time-to-value from raw data to business consumption is reduced from months to weeks. Data scientists, analysts, and business users can access data through tools such as Tableau, Synapse, and Databricks or, in the near future, a secure Fluree Core instance for a true authoritative source of truth (ASOT).

There is also an easy path to Fluree’s Web3 features, which include data audit and compliance solutions, verifiable credentials, data-centric applications, decentralized apps, and enterprise knowledge graphs. By making data usable and freeing it to be applied in Web3 architectures, Fluree Sense is shortening the timeline to digital transformation and making even the most chaotic data architectures Web3-ready. 

To learn more about the Fluree Sense product, sign up for our webinar: Introduction to Fluree Sense.