In 2020, the Department of Defense published a data management strategy that recognized data as a critical and central asset to operational success and called for digital transformation in the name of “data-centricity.” The strategy identified a need to leverage data across the DOD “at speed and scale for operational advantage and increased efficiency.”
To accomplish this data-centric vision, the strategy set out seven goals under the acronym VAULTIS, with each letter representing a requirement for building a successful data-centric strategy within the organization.
VAULTIS stands for Visible, Accessible, Understandable, Linked, Trusted, Interoperable, and Secure, and it provides guiding principles for data-centric data management. By building these characteristics into a data strategy, organizations can move toward a data-centric approach to conducting their business, merging operational and analytical data domains and treating data as a valuable and versatile asset that serves many stakeholders. Data-centricity increases an organization’s ability to securely share and collaborate around data, unlocks higher levels of analytical insight, and allows organizations to iterate on business operations with more agility and precision.
VAULTIS may have originated in the defense sector, but private sector organizations would do well to emulate these guiding principles and apply VAULTIS to their own data-centric strategies. The VAULTIS principles ultimately prepare data for better outcomes across the data value chain, bringing together data producers and consumers for increased efficiency while preserving data integrity, trust, and security. Let’s dive into each of the VAULTIS principles:
Visible
Data visibility remains an issue for most organizations. Data consumers often aren’t aware of the types of data assets that are available to them, resulting in duplication of work (and data). Fluree provides data publishers, stewards, custodians, and managers from any given data domain a means to consolidate sources of truth for data stakeholders and drive visibility for their registered data or datasets. The platform allows organizations to use semantic, ontology-driven data modeling to index data assets under connected vocabularies, providing a semantically linked catalog of data assets that are highly discoverable by data consumers.
Our upcoming Fluree Nexus cloud product interface will allow permissioned data consumers to browse and search data sets as well as discover related data assets via linked metadata. URL-style saving and sharing of data sets and queries will make the data sharing process intuitive.
Accessible
Enterprise data platforms should make data readily available to those who need it, enabling better and quicker decisions, but should also recognize the data security and privacy challenges that emerge alongside a higher level of data sharing.
With Fluree’s data-centric approach to data security, data owners can manage data policies, including data access control, at the data layer in order to restrict data access at the dataset, row, column, or cell level. Data owners may define granular, arbitrary rules to determine access privileges. Leveraging Fluree’s data-centric governance, databases are capable of relationship-based access control, in which conditions around the data itself (e.g. mission relationships, metadata characteristics, affiliation to storage environments) provide more powerful, granular, and complex contexts for data-centric data security. As such, credentialing can be flexible and dynamic as environments change. Where multiple stakeholders take part in policy enforcement, data stewards, custodians, and managers can each define their own access policies around the same data, as required by their own data governance contexts or dataset-specific sharing agreements.
This data-centric framework of embedding data security policies within the data itself allows a single data set to service multiple consumers, even with varying degrees of credentialed access.
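To make the idea of one dataset serving several consumers more concrete, here is a minimal Python sketch of column-level and row-conditional read policies. The `Policy` shape, the role names, and the sample rows are all hypothetical illustrations; this is not Fluree’s policy syntax or API.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict[str, object]


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: lets a role read one column, optionally only on matching rows."""
    role: str
    column: str
    row_filter: Callable[[Row], bool] = lambda row: True


def visible_view(rows: list[Row], policies: list[Policy], role: str) -> list[Row]:
    """Return, for each row, only the cells that some policy grants to this role."""
    views = []
    for row in rows:
        allowed = {
            p.column: row[p.column]
            for p in policies
            if p.role == role and p.column in row and p.row_filter(row)
        }
        if allowed:
            views.append(allowed)
    return views


# Illustrative data only.
rows = [
    {"id": 1, "unit": "alpha", "location": "grid-7", "status": "active"},
    {"id": 2, "unit": "bravo", "location": "grid-9", "status": "active"},
]
policies = [
    Policy("analyst", "id"),
    Policy("analyst", "status"),
    Policy("commander", "id"),
    Policy("commander", "status"),
    # Cell-level rule: location is readable only on rows belonging to unit alpha.
    Policy("commander", "location", lambda row: row["unit"] == "alpha"),
]

analyst_view = visible_view(rows, policies, "analyst")
commander_view = visible_view(rows, policies, "commander")
```

Both consumers query the same two rows; the analyst never sees `location`, and the commander sees it only where the row-level condition holds.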
Understandable
Fluree’s RDF core allows organizations to use semantic graph ontologies to formally define common concepts and relationships, placing shared meaning on registered data assets, even across disparate data domains and vocabularies. An ontological schema approach allows data consumers, both human and machine, to derive globally defined meaning from a data asset beyond the scope of its origin domain.
Data aggregation and insight generation can then interoperate across multiple data sources, creating a highly dynamic distributed knowledge graph through which data consumers can understand and explore relationships between data assets both within their respective domain and across the wider data fabric. Users can then take analyses and insights originating from one data domain and apply them directly to a separate domain.
Linked
Fluree is built on W3C standards for linked data and can define relationships between data in a linked graph database. It offers native support for the RDF (Resource Description Framework) interchange format, SPARQL for semantic federated queries, and JSON-LD for mapping payloads to ontologies and defining unique IRIs for data assets. Fluree is built for many-to-many sources and consumers interfacing with the data layer, allowing a collaborative environment of insight generation in which data can be extended and linked to other data in ways that enrich the ecosystem-wide data experience.
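As a rough illustration of how a JSON-LD payload maps to linked data, the Python sketch below turns a small JSON-LD-style document into subject/predicate/object triples by looking up each term in its `@context`. It is a deliberately naive simplification, not the full JSON-LD expansion algorithm: it handles only a flat document and does not distinguish IRIs from literals. The `example.org` identifiers and the `to_triples` helper are assumptions made for the example.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Illustrative payload; the example.org IRIs are placeholders.
doc = {
    "@context": {
        "Person": "http://schema.org/Person",
        "name": "http://schema.org/name",
        "memberOf": "http://schema.org/memberOf",
    },
    "@id": "http://example.org/people/ada",
    "@type": "Person",
    "name": "Ada Lovelace",
    "memberOf": "http://example.org/orgs/analytical-engine",
}


def to_triples(document: dict) -> list[tuple[str, str, str]]:
    """Naively map a flat JSON-LD-style document to triples using its @context."""
    context = document["@context"]
    subject = document["@id"]
    triples = []
    for key, value in document.items():
        if key in ("@context", "@id"):
            continue
        if key == "@type":
            # The type term is resolved through the context when it is defined there.
            triples.append((subject, RDF_TYPE, context.get(value, value)))
        else:
            triples.append((subject, context[key], value))
    return triples


triples = to_triples(doc)
```

Because every term resolves to a globally unique IRI, two systems that have never coordinated can still agree that `name` here means `http://schema.org/name`.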
By leveraging a semantic ontology, Fluree-backed solutions can apply inferencing, which can uncover hidden insights and patterns for data consumers. Importantly, this linked data and its associated inferences can power operational domains, in which a data graph serves simultaneously as an analytical tool for downstream consumers and as the operational data source for deployed applications. This model allows linked data knowledge to grow in analytical and operational value without splitting into duplicative data silos.
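One simple form of ontology-driven inferencing is subclass reasoning: if an asset is typed as a class, it is also an instance of every superclass. The Python sketch below applies that single rule until no new facts appear. The `ex:` terms and the `infer_types` helper are hypothetical, and this is only one rule of the kind a semantic reasoner applies, not a description of Fluree’s inference engine.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

Triple = tuple[str, str, str]


def infer_types(triples: set[Triple]) -> set[Triple]:
    """Add (x rdf:type Parent) whenever (x rdf:type Child) and (Child rdfs:subClassOf Parent)."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS_OF]
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for child, parent in subclass_pairs:
                new_fact = (s, RDF_TYPE, parent)
                if o == child and new_fact not in inferred:
                    inferred.add(new_fact)
                    changed = True
    return inferred


# Illustrative ontology and one asserted fact.
facts = {
    ("ex:Frigate", SUBCLASS_OF, "ex:Ship"),
    ("ex:Ship", SUBCLASS_OF, "ex:Asset"),
    ("ex:vessel42", RDF_TYPE, "ex:Frigate"),
}
closure = infer_types(facts)
```

A consumer asking for every `ex:Asset` now finds `ex:vessel42`, even though no publisher ever stated that fact directly.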
Trusted
With Fluree, data publishers can have high confidence that their data will reach the intended consumer unaltered by process or middleware, and data consumers can have high confidence in the accuracy, reliability, and trustworthiness of data assets on the platform.
Fluree’s data-centric approach to digital trust provides a zero-trust framework for verifying the legitimacy of data, metadata, and data sources by using public/private key cryptography and blockchain-style cryptographic hashing. By leveraging Fluree’s distributed ledger backplane, facts about data and metadata registered within datasets can be proven to be true and not tampered with, including data provenance (who or what originated the data), data lineage (the complete lifecycle of the data’s path across systems, users, and organizations), and the identity associated with actions (machine or human interaction with data).
The underlying Fluree ledger system allows for “time travel” queries, in which users can track and review the history of changes to data assets over time, with the option to reproduce any versioned state of the data. Because Fluree indexes a timestamp as metadata on every delta to the data, queries within Fluree’s system can specify any moment in time and retrieve the result as of that moment. This temporal versioning of immutable data allows for highly explainable data decision-making and provable data audit review.
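The underlying idea can be sketched as an append-only log of timestamped assertions and retractions, with a query that rebuilds the state as of any moment. This Python sketch replays the log on every query for simplicity, whereas the passage above describes an indexed approach; the `Delta` shape and `state_as_of` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delta:
    """One immutable change: an assertion or retraction of a value at time t."""
    t: int
    subject: str
    attribute: str
    value: object
    asserted: bool  # True adds the value, False retracts it


def state_as_of(log: list[Delta], t: int) -> dict[tuple[str, str], object]:
    """Rebuild the data state by replaying every delta with timestamp <= t."""
    state: dict[tuple[str, str], object] = {}
    for delta in sorted(log, key=lambda d: d.t):  # stable sort keeps same-t order
        if delta.t > t:
            break
        key = (delta.subject, delta.attribute)
        if delta.asserted:
            state[key] = delta.value
        else:
            state.pop(key, None)
    return state


# Illustrative history: nothing is ever overwritten, only appended.
log = [
    Delta(1, "doc1", "status", "draft", True),
    Delta(2, "doc1", "status", "draft", False),
    Delta(2, "doc1", "status", "approved", True),
    Delta(3, "doc1", "owner", "ops", True),
]

at_t1 = state_as_of(log, 1)
at_t2 = state_as_of(log, 2)
at_t3 = state_as_of(log, 3)
```

An auditor can ask what `doc1` looked like at `t=1` long after it was approved, since the earlier deltas remain in the log.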
Interoperable
By leveraging universal data standards, Fluree’s graph database allows for the exchange of data across systems using a lightweight RDF interchange format and shared semantics, which minimizes integration overhead between standards-compliant systems. RDF is a non-proprietary W3C format supported by a wide range of databases and applications. Fluree is built entirely upon these W3C standards, meaning our solution avoids proprietary lock-in and natively facilitates integration and interoperation with systems and data designed around the same open standards.
Disparate data assets, including data, metadata, relationships, and even ontologies, will be able to maintain their semantic meaning in and across various domains. Built on W3C RDF semantic standards, Fluree natively supports the SPARQL standard for semantic queries that simultaneously interoperate across systems.
Secure
Fluree’s platform provides encryption at rest and in motion, and employs data-centric access restrictions to secure data in use, as described above under Accessible. Fluree’s data-centric approach to data security delivers a zero-trust security framework for all data assets.
Data-embedded access policies require signatures on all queries and transactions to the system, meaning even breaches of the firewall or network by bad-faith actors do not risk unfettered data access or leakage. Data consumers with direct subscriptions to the data layer are always permissioned using cryptographic private keys, so that there is no risk of data leakage during data-event-driven messaging.
Fluree’s system natively supports attribute-based and role-based access control (ABAC/RBAC), but also provides an option for relationship-based access control (RelBAC), in which the data state itself can provide context for dynamic policy enforcement. RelBAC and cryptographic signatures create precise, relationship-based data policy enforcement that ensures arbitrary changes to user identity attribution won’t risk that user’s private key affording too much or too little data access.
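A relationship-based rule can be pictured as a check against the data graph rather than against a static role list. In the hypothetical Python sketch below, a user may read a document only while both are linked to the same mission; the predicate names (`assignedTo`, `partOf`) and the `can_read` helper are assumptions for illustration, not Fluree’s policy language.

```python
Triple = tuple[str, str, str]


def objects(graph: set[Triple], subject: str, predicate: str) -> set[str]:
    """All objects linked from a subject by a given predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def can_read(graph: set[Triple], user: str, document: str) -> bool:
    """Allow access only if user and document share at least one mission."""
    user_missions = objects(graph, user, "assignedTo")
    document_missions = objects(graph, document, "partOf")
    return bool(user_missions & document_missions)


# Illustrative graph: the access decision is derived from these relationships.
graph = {
    ("alice", "assignedTo", "mission-1"),
    ("bob", "assignedTo", "mission-2"),
    ("report-1", "partOf", "mission-1"),
}

alice_before = can_read(graph, "alice", "report-1")
bob_before = can_read(graph, "bob", "report-1")

# Reassigning alice changes the data state; her key and credentials are untouched.
reassigned = (graph - {("alice", "assignedTo", "mission-1")}) | {("alice", "assignedTo", "mission-2")}
alice_after = can_read(reassigned, "alice", "report-1")
```

Access follows the relationship: once the assignment changes in the data, the same identity stops resolving to the same permissions without any credential being reissued.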
Because Fluree’s system supports decentralized semantics and distributed ledger technology, it is possible for a Fluree-powered solution to make use of decentralized identity attribution, including identities issued across the enterprise or enterprise-external systems, so that organizations can extend data capabilities beyond their borders without introducing a set of security risks.
Ultimately, your data strategy must benefit end users along the data value chain: developers, programmers, data governance stewards, information security teams, data architects, analysts, scientists, and general business users. The VAULTIS principles are excellent guidelines for organizations looking to build a data-centric strategy that addresses the diverse needs of these data stakeholders. Organizations that emulate the VAULTIS principles will move to a more fluid and functional data architecture in which analytical and operational domains can realize data’s full potential.
Interested in learning more? Get started with Fluree here!