When you get down to it, if you are building an app with Fluree as the backend, it is simplest to think of Fluree as a database. That is a useful mental model, but it leaves a lot on the table. The unique combination of technologies that makes Fluree what it is enables some extremely powerful functionality and unlocks ways of working with data that are either uncommon or simply not possible with other data stores. Let’s talk about one of those functionalities: time travel.
In addition to a graph database for querying data, Fluree is built with an immutable ledger as the backbone which holds the dataset. This part of Fluree is what enables some really interesting and particularly unique functionality. ‘Immutable ledger’ is one of those terms which I had to Google in order to understand when I started at Fluree, so let’s break that down a bit.
Fluree groups related data elements into units called subjects. Each subject has an _id, which ties together its attributes (called predicates) and the values of those predicates to form the “facts” about that subject.
Fluree is based on an extended version of the W3C standard for RDF, which is where this notion of SPO (Subject, Predicate, Object) comes from.
You can think of it like a row in a db table with _id being the unique identifier for the row, predicates are the columns, and the values are the fields. Each field makes up a fact about the instance of data stored in that particular row in the db. For example, in a Dog table with a Breed column, each row corresponds to a unique Dog who is described by the attributes and fields. The same idea holds in Fluree. An _id groups the related predicates, which point to values, in order to make up the “facts” of that subject. So, the dog/breed predicate would point at an object, "french bulldog", for example. At the point in time when that fact was written to the ledger, that specific subject’s breed was french bulldog.
Each of these facts is stored in an immutable data structure. Immutable means that those data structures cannot be modified or changed in any way. Instead of simply changing a value or updating a “row” in the data, Fluree makes a new true fact in the ledger and associates it with the appropriate subject. If an existing value is being “modified”, Fluree makes two new facts: one which marks the old fact as false, and a second which is the new, true fact. Both of these facts are then associated with the subject and written to the ledger.
This is part of the “extension” to RDF. Each flake contains a boolean which indicates whether it is true or has been falsified. You can read more about this in the flakes page in the docs.
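To make the assert-and-retract idea concrete, here is a toy model in Python. It is only a sketch: the six-element shape follows the flake arrays shown later in this post, but the Flake class, the helper names, the readable predicate name used in place of an internal id, and the t values are all assumptions made for illustration, not Fluree's implementation.

```python
from typing import Any, List, NamedTuple, Optional


class Flake(NamedTuple):
    subject: int
    predicate: str
    obj: Any
    t: int              # transaction marker; the values used below are made up
    op: bool            # True = asserted, False = retracted
    meta: Optional[Any] = None


def current_value(ledger: List[Flake], subject: int, predicate: str) -> Any:
    """Replay the flakes in order to find the value that is true right now."""
    value = None
    for flake in ledger:
        if flake.subject == subject and flake.predicate == predicate:
            if flake.op:
                value = flake.obj
            elif value == flake.obj:
                value = None
    return value


def update_fact(ledger: List[Flake], subject: int, predicate: str,
                new_value: Any, t: int) -> None:
    """Append a retraction of the old value (if any), then assert the new one.

    Nothing already in the ledger is modified or removed.
    """
    old_value = current_value(ledger, subject, predicate)
    if old_value is not None:
        ledger.append(Flake(subject, predicate, old_value, t, False))
    ledger.append(Flake(subject, predicate, new_value, t, True))


ledger: List[Flake] = []
update_fact(ledger, 1, "dog/breed", "french bulldog", t=-3)
update_fact(ledger, 1, "dog/breed", "boston terrier", t=-5)

for flake in ledger:
    print(list(flake))
```

The second update appends two flakes rather than overwriting the first one, so the earlier breed is still there to be read back later.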
This brings us to what a ledger is. You can think of a ledger as discrete units or “blocks” which contain the history of the data as it is transacted. These blocks are made of groups of immutable facts that are sent to an instance of Fluree. Each block is linked to the one which came before it so there is a chain of blocks from when the ledger was created to the current block. In Fluree, this chain is queryable, which means once some data has been transacted to Fluree you have the history of every data element in that data set!
So, back to those powerful pieces of functionality I mentioned at the beginning.
There are two ways of querying the data which enable what we call time travel in the Fluree world: block queries and history queries. Both unlock elements of Fluree which are only possible because of the immutable data structures and the ledger. Block queries let you query the state of the data at specific moments in time, and history queries give you an overview of all of the modifications to a particular subject.
We’ll get into how each of these types of queries works, but first, why does this even matter?
One of the primary benefits to having this type of view into your data is the ability to correlate events with the data state at the time that event happened. For example, say you are tracking prices for flights and you want to see what effect the weather had on flight prices or which day of the week prices tended to be the cheapest. The sky’s the limit for these types of analytical queries.
You may also want to let your users see the state of some piece of data as it was when it was updated. I once saw a comment on a LinkedIn post and was pretty sure that the commenter worked for the company whose post he was commenting on, but his job title had recently been updated, so I couldn’t tell where he worked when the comment was added, only where he currently worked: the current state of the data.
This type of functionality can be useful in a wide range of situations. Having a way to view not only the current state of the data (table stakes for any database), but also the state of a piece of data at a specified time OR over a specified range of time, can be extremely useful. Fluree goes a step further, though. When you query some data at a point in time, you also see all of the facts which were true at that time, including all of the relationships which existed then. This is something which is not possible in any other database or data store that I am aware of. You are able to query not only the historical values of something in your data but also all of the context associated with that data. That is huge.
Now, think about how you would go about making something like this in your db of choice. Building out a historical view of a table in a traditional database, whether relational or NoSQL, is a large and expensive maintenance burden. Without significant optimization, the size of your db will explode because of data duplication, and querying these tables or documents can become relatively complex. Specifically, what happens to references? Does the reference point to the current table, or is there a way to manage the reference such that it points to the correct row in the historical table? What happens when you want to do a join with another table? There isn’t an expedient or simple way to do either of those things, to my knowledge. One or two other data stores enable historical views, but they are not able to pull in all of the context and maintain relationships as well.
Both of these operations are exposed via an API within a Fluree db instance. Passing a JSON object to the /block or /history endpoint is all that is needed to query this type of data. Let’s get into how to use each of these queries. I will be using the Fluree Query Language (FlureeQL), which is a JSON-based way to query the backend. Fluree also supports querying via GraphQL, SPARQL, SQL or calling these endpoints directly from Clojure, but we’ll use FlureeQL to illustrate this functionality. If you want to read more about our query surfaces, check out the query pages for more details.
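As a rough sketch of what “passing a JSON object to the endpoint” can look like from client code, the Python below prepares POST requests without sending them. The base URL (host, port, and the "example/dogs" ledger name), the build_request helper, and the subject _id are placeholders I made up for illustration; check your own deployment for the real address.

```python
import json
from urllib.request import Request

# Assumption: a locally running Fluree instance and a ledger called
# "example/dogs". Replace BASE_URL with the address of your own ledger.
BASE_URL = "http://localhost:8090/fdb/example/dogs"


def build_request(endpoint: str, body: dict) -> Request:
    """Prepare (but do not send) a POST request carrying a FlureeQL JSON body."""
    return Request(
        url=f"{BASE_URL}/{endpoint}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


block_request = build_request("block", {"block": 5})
history_request = build_request("history", {"history": 369435906932737})

print(block_request.get_method(), block_request.full_url)
print(history_request.get_method(), history_request.full_url)
# To send one for real: urllib.request.urlopen(block_request).read()
```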
There are two ways to query a block in Fluree. You can either issue a query against the /block endpoint, which returns the flakes in that particular block or range of blocks, or you can add a "block" key to a basic query issued to the /query endpoint. The second method can, and probably will, pull in facts which were transacted to the ledger before the specified block, because a regular query with a block key is answered as if the specified block were the current block. Each type of query is useful, depending on how you need to view your data.
Let’s start with a query issued to the /block endpoint. This type of query currently supports two keys: "block" and "prettyPrint".
"prettyPrint" is a boolean which, if true, returns the results in a styled format for easier reading and separates the asserted and retracted flakes into their own arrays to make them easier to parse. The "block" key is much more interesting. It can take a number, a string in the form of an ISO-8601 formatted date-time or duration, or an array which specifies a range of blocks for the query.
For example, to query a specific block:
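A minimal sketch of such a request body, built as a Python dict and printed as JSON. Block 5 is an arbitrary example value.

```python
import json

# Body for a POST to the /block endpoint; block 5 is an arbitrary example.
query = {"block": 5}
print(json.dumps(query))  # {"block": 5}
```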
You can also query via a timestamp. This will return the first block which was transacted before this timestamp. In other words, it will give you the facts which were true at that time.
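As an illustration, the sketch below formats a Python datetime as an ISO-8601 string and uses it as the "block" value. The date itself is made up, and the trailing "Z" form is one common way to write UTC that I am assuming here.

```python
import json
from datetime import datetime, timezone

# An arbitrary moment in UTC, used purely as an example.
as_of = datetime(2021, 3, 1, 12, 30, tzinfo=timezone.utc)

# isoformat() ends in "+00:00" for UTC; swap that for the shorter "Z" suffix.
query = {"block": as_of.isoformat().replace("+00:00", "Z")}
print(json.dumps(query))  # {"block": "2021-03-01T12:30:00Z"}
```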
You can also use an ISO-8601 formatted duration:
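A small sketch of the duration form. "PT5M" is the ISO-8601 way to write five minutes; the minutes_ago helper is a hypothetical convenience, not part of any Fluree client.

```python
import json


def minutes_ago(minutes: int) -> dict:
    """Build a /block body whose "block" value is an ISO-8601 duration."""
    return {"block": f"PT{minutes}M"}


query = minutes_ago(5)
print(json.dumps(query))  # {"block": "PT5M"}
```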
This will return the state of the data as of 5 minutes ago.
If you would like to query a range of blocks, you can pass an array containing the blocks you would like to see. This range is inclusive, meaning the data returned will include both blocks you put in the array.
"block": [5, 18]
You can also pass an array with a single block which will specify a lower, also inclusive, block and return the facts from that block up to the current block.
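Side by side, the two array forms described above might be written like this. The block numbers are just examples, and the variable names are mine.

```python
import json

# Inclusive range: blocks 5 through 18.
closed_range = {"block": [5, 18]}

# Single-element array: block 5 up to whatever the current block is.
open_ended = {"block": [5]}

print(json.dumps(closed_range))  # {"block": [5, 18]}
print(json.dumps(open_ended))    # {"block": [5]}
```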
Using the /block endpoint will return an array of flakes, each of which is a fact stored in Fluree at that block or range of blocks. While this is useful, it is probably more realistic that you would want to see a specific set of data using a normal query, but have the results returned as if the query had been issued at some point in the past. Fluree enables this too: issue a query to the /query endpoint which contains the "block" key-value pair. The value is structured in the same way as in the examples above, and can be a number, a formatted string, or an array of block numbers. The main difference is that this type of query is not limited to the data in a specific block. It returns data as if the query had been issued when that block was the current block. For example, if you had a Dog collection of subjects in your ledger, you could issue this query to see all of the dogs which had been transacted and not deleted as of block 7:
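A sketch of what that query body could look like, assuming the collection is named "dog" (to match the dog/breed predicate used earlier) and that you want every predicate back.

```python
import json

# Body for a POST to the /query endpoint.
# Assumption: the collection is called "dog"; "*" selects all predicates.
query = {
    "select": ["*"],
    "from": "dog",
    "block": 7,
}
print(json.dumps(query, indent=2))
```

Dropping the "block" key turns this back into an ordinary query against the current state of the ledger.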
To read more on querying blocks, check out the docs pages for block queries and querying with the block key.
A /history query is structured and issued in much the same way as a /block query, but the results it returns are quite different. As I mentioned above, a history query returns all of the modifications to a subject. I like to think of a block query as showing the breadth of the data at a specific time and the history query as looking down the timeline of a specific piece of data.
For example, if you had a customer in your dataset who has connections to other customers, you could see the history of that customer’s connections from when they first joined your application up to the current block. If you wanted to see the connections that customer had at a specific block or over a range of blocks, that is possible, as is using the ISO-8601 date-times or durations.
You can build a /history query using FlureeQL in JSON the same way you would with a /block query. For example, if you know the subject’s _id you can simply hit the /history endpoint like this:
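For instance, a body along these lines, where the _id is a placeholder number rather than a real subject:

```python
import json

# Body for a POST to the /history endpoint.
# The _id below is a made-up placeholder; use one from your own ledger.
query = {"history": 369435906932737}
print(json.dumps(query))  # {"history": 369435906932737}
```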
This query will return an array of objects, each object containing the block and t numbers for that block and an array of flakes for that subject. Another option is to issue a history query with a block key in order to constrain the results of the query to a specific timeframe in your data. That looks like this:
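A sketch of the constrained form, reusing a placeholder _id:

```python
import json

# History of one subject, limited by a "block" value.
# The _id is a made-up placeholder.
query = {"history": 369435906932737, "block": 4}
print(json.dumps(query))  # {"history": 369435906932737, "block": 4}
```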
This query will return the flakes for this _id up to block 4. You can also use a block range or use the ISO-8601 formatted string similar to the /block queries.
Using a flake format is another way you can issue a history query. This means that you can use pieces of data to identify the subject you want to query. This works via the subject, predicate, object structure of a flake. You pass the elements you want to use to query in an array as the value of the “history” key in the query JSON. The array needs to be passed as ["subject", "predicate", "object"], but you do not have to use all 3 elements in the array for the query to resolve.
["subject", "predicate", "object"]
Please note that the order of these within the array is important and either a subject or a predicate is required.
For example, if you want to query for the history of all subjects matching the predicate object pair dog/breed "french bulldog" in your collection, you could query the ledger like this:
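One way to write that, sketched in Python. I am assuming the unused subject position is filled with null so that the predicate and object stay in their second and third slots; Python's None serializes to JSON null.

```python
import json

# Flake-format history query: [subject, predicate, object].
# Assumption: the subject slot is left as null to match any subject.
query = {"history": [None, "dog/breed", "french bulldog"]}
print(json.dumps(query))  # {"history": [null, "dog/breed", "french bulldog"]}
```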
Another way this could be done is to use a subject _id with a predicate, or to substitute a two-tuple which uniquely identifies a subject for the _id. That would look like this:
or with a two-tuple
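Both variants, sketched together. The _id is a placeholder, and the two-tuple assumes that dog/name is a unique predicate in your schema.

```python
import json

# Variant 1: subject _id (made-up placeholder) plus a predicate.
by_id = {"history": [369435906932737, "dog/favFoods"]}

# Variant 2: a two-tuple standing in for the _id, plus the same predicate.
# Assumption: "dog/name" uniquely identifies a dog in this ledger.
by_two_tuple = {"history": [["dog/name", "Jacques"], "dog/favFoods"]}

print(json.dumps(by_id))
print(json.dumps(by_two_tuple))
```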
Both of these queries will return the history of the predicate "dog/favFoods" for the dog specified, with either the subject _id or the unique identifier of ["dog/name", "Jacques"] used to identify the subject you want to inspect.
Similar to the "/block" queries, a "/history" query can also accept a "prettyPrint" key-value pair. When true this will return the history of the subject or predicate as indicated, but will separate out the retracted and asserted flakes per block into their own arrays. That looks like this:
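A sketch of a history query with pretty printing switched on. The subject is identified by a two-tuple, which assumes dog/name is unique in your schema.

```python
import json

# Same shape as a plain history query, with "prettyPrint" added.
# Assumption: "dog/name" uniquely identifies the subject.
query = {
    "history": ["dog/name", "Jacques"],
    "prettyPrint": True,
}
print(json.dumps(query))  # {"history": ["dog/name", "Jacques"], "prettyPrint": true}
```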
And will return something in this type of structure:
"dog/breed": "french bulldog"
In the return JSON, each block containing data which matches the query is its own labeled object containing a named array for asserted and retracted.
There is one other extremely powerful way to use "/history" queries: auditing who transacted the data. You can add a "showAuth" boolean key-value pair, or an array of _auth/id values or _auth subject _ids, in order to filter the history query to specific auth records’ transactions. Because each transaction is signed by a private key which is associated cryptographically with the _auth/id, every flake in Fluree contains a record of who issued that transaction. This is the way to view that data. It looks like this:
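A sketch of the boolean form only. The subject and predicate are example values carried over from earlier in this post, and the two-tuple assumes dog/name is unique.

```python
import json

# History query that also asks for the signing auth of each block.
# Assumption: "dog/name" uniquely identifies the subject.
query = {
    "history": [["dog/name", "Jacques"], "dog/favFoods"],
    "showAuth": True,
}
print(json.dumps(query, indent=2))
```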
This will return an array of block objects, each of which will contain a named array, "auth", which consists of the auth’s subject _id and the "_auth/id" of the individual (man or machine) who signed that block. It will look something like this:
[ 17592186044436, 40, "dog", -3, true, null ],
[ 17592186044437, 40, "cat", -3, true, null ],
[ 17592186044438, 40, "ferret", -3, true, null ]
For more information on how Fluree stores and interacts with identity and authorization, please take a look at the identity section in the docs.
So that’s how you can go about time traveling in Fluree. There are powerful tools that come out of the box that enable you to do things like query as of a specific moment in time, see how a subject evolved over time in your dataset, or get all of the data which was transacted by a specific auth record. You can read more about it in our docs site or if you would prefer to engage with our community, come join us on Discord.
For more detail about this subject, you can watch our Time and Immutability Webinar.
This demo has a publicly available repository which you can view here.