Introduction to The Graph — Decentralizing Data
Solving The Problem of Accessing On-Chain Data
One advantage of open and decentralized protocols is that they are public: anyone can see what’s happening on the network. However, accessing and using this public data is more complex than it appears. Today there are two options: reading through the entire history of the blockchain yourself (e.g. by running a full node), or querying a block explorer like Etherscan.
Option one is resource-intensive: syncing takes time, you need to store a copy of the entire blockchain, and the node has to stay connected at all times. Option two relies on a centralized third party, so it isn’t trustless.
Both options also work differently for each blockchain and protocol, which makes it much harder to get complete data in the cross-blockchain environment we see emerging today.
The Graph aims to solve that through a decentralized query protocol. If you want to dig deeper into the why, you can read their article comparing traditional APIs and SQL to The Graph Protocol and GraphQL.
A Gentle Introduction to The Graph
The Graph is a decentralized indexing protocol for Web3. It enables querying blockchain data without connecting to a blockchain yourself or relying on a centralized third party. In simpler terms, it’s a decentralized API protocol for blockchains and their decentralized applications.
A Short Example
Imagine a decentralized application built on top of Aave that needs access to the protocol’s data. Until now, Aave had to build and maintain a centralized API on its own servers so that others could access and use the data. With The Graph, Aave developers write a subgraph manifest (a description of the data to index), and multiple indexers index Aave’s data by fetching it directly from the Ethereum network, effectively creating a decentralized API for Aave.
Some advantages of The Graph are that dapps no longer have to maintain their own APIs, protocol data is served in a decentralized way, and developers get a common query structure and language for all protocols of any open blockchain.
How It Works
Data indices, called subgraphs, are built from a subgraph manifest: a document describing which data from a specific protocol needs to be indexed and how, so that users and applications can easily query it later. Each subgraph can be queried with a standard GraphQL API call. GraphQL is an open-source data query and manipulation language for APIs, initially developed by Facebook.
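To give a rough idea of what such a GraphQL call looks like, the sketch below builds the JSON body of a request in Python. The entity name `reserves` and its fields are assumptions made for this example; the real names depend on the schema of the subgraph being queried.

```python
import json

# Hypothetical query: the `reserves` entity and its fields are assumed for
# illustration and must match the schema of the subgraph you actually query.
QUERY = """
query Reserves($first: Int!) {
  reserves(first: $first) {
    id
    symbol
  }
}
"""


def build_request_body(query: str, variables: dict) -> str:
    """Serialize a GraphQL query and its variables into a JSON request body."""
    return json.dumps({"query": query, "variables": variables})


# Ask for the first five entities of the (assumed) `reserves` collection.
body = build_request_body(QUERY, {"first": 5})
print(body)
```

The same body shape works for any GraphQL endpoint: only the query text and the variables change from one subgraph to another.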
Once these mapping instructions, which describe how blockchain events translate into stored data, are deployed to a Graph Node, the node listens for changes on the chain and updates its subgraph accordingly.
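The idea of mapping events to stored data can be pictured with a toy model. The sketch below is not how a Graph Node is implemented, and real subgraph mappings are not written in Python; it only illustrates the principle, using a hypothetical `Deposit` event and a hypothetical `totalDeposited` field.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Event:
    """A simplified on-chain event (hypothetical shape, for illustration)."""
    name: str
    params: Dict[str, object]


@dataclass
class ToyIndexer:
    """Routes events to handlers that update an in-memory entity store."""
    handlers: Dict[str, Callable[[Dict[str, dict], Event], None]] = field(default_factory=dict)
    store: Dict[str, dict] = field(default_factory=dict)

    def on(self, event_name: str, handler: Callable[[Dict[str, dict], Event], None]) -> None:
        """Register the handler to run for a given event name."""
        self.handlers[event_name] = handler

    def process(self, event: Event) -> None:
        """Apply the matching handler, ignoring events nobody subscribed to."""
        handler = self.handlers.get(event.name)
        if handler is not None:
            handler(self.store, event)


def handle_deposit(store: Dict[str, dict], event: Event) -> None:
    """Mapping: accumulate the deposited amount per user."""
    user = str(event.params["user"])
    amount = int(event.params["amount"])
    entity = store.setdefault(user, {"id": user, "totalDeposited": 0})
    entity["totalDeposited"] += amount


indexer = ToyIndexer()
indexer.on("Deposit", handle_deposit)

# Simulated stream of chain events; the Transfer event has no handler.
events = [
    Event("Deposit", {"user": "0xabc", "amount": 100}),
    Event("Transfer", {"from": "0xabc", "to": "0xdef", "amount": 10}),
    Event("Deposit", {"user": "0xabc", "amount": 50}),
]
for ev in events:
    indexer.process(ev)

print(indexer.store)
```

Only events with a registered handler change the store, which mirrors the way a manifest selects which data gets indexed.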
Each indexed subgraph can then be queried like a traditional API through its GraphQL endpoint, with the data served by a decentralized network of indexers. You can find an example of a script I wrote querying data from an Aave subgraph here.
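For readers who want a feel for it before opening that script, here is a minimal sketch of querying a GraphQL endpoint with Python's standard library. It is not the script mentioned above. The endpoint URL is a placeholder, and the `reserves` entity in the query is an assumption, so both need to be replaced with values from the subgraph you actually target.

```python
import json
import urllib.request

# Placeholder endpoint: replace with the GraphQL URL of a real subgraph.
ENDPOINT = "https://example.com/subgraphs/name/your-subgraph"

# Hypothetical query: `reserves` and its fields are assumed for illustration.
QUERY = "query ($first: Int!) { reserves(first: $first) { id symbol } }"


def build_request(endpoint, query, variables=None):
    """Build an HTTP POST request carrying a GraphQL query as JSON."""
    body = json.dumps({"query": query, "variables": variables or {}}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_subgraph(endpoint, query, variables=None, opener=urllib.request.urlopen):
    """Send the query and return the `data` part of the GraphQL response.

    `opener` is injectable so the function can be exercised without network access.
    """
    request = build_request(endpoint, query, variables)
    with opener(request) as response:
        payload = json.loads(response.read().decode("utf-8"))
    # GraphQL reports query problems in an `errors` list inside the response body.
    if "errors" in payload:
        raise RuntimeError(f"GraphQL errors: {payload['errors']}")
    return payload["data"]
```

With a real endpoint in place, calling `query_subgraph(ENDPOINT, QUERY, {"first": 5})` would return the `data` object of the response as a Python dictionary.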