Only some of us were in our formative years during that magical time when the internet became a household necessity. There’s an extra appreciation that comes from remembering the “before” and then quickly becoming hopelessly hooked on the “after.” A big part of this change, arguably the biggest, was the ability to use a search engine. Believe it or not, there was once a time when you had to know the entire URL in order to visit a website. No autofill either. Imagine having that limitation today: it’s hard to picture, because many of the features we depend on would not be possible.
Indexing (capturing data, cataloging it, and making it searchable) brought us into the Information Age. Just as a reliable supply of oil powered the innovations of the 20th century, data is the lifeblood that makes these modern miracles possible.
That said, the lack of indexing has limited a growing number of use cases in the crypto industry. Blockchain information can’t simply be scooped up by search engines the way the rest of the internet can. And because there are no search engine giants parsing that information and making it instantly available, the data is difficult for DApps within a layer-1 (L1) ecosystem to use.
The Problem
The problem is fairly straightforward, but has several wrinkles that make it difficult to solve.
Let’s first look at what blockchain indexing actually is. An “ancient” example of indexing is, well, an index. Found at the back of a book, it links keywords (subjects, people, or other groupings) to the pages where they appear. This saves the reader the massive amount of time they would otherwise spend going through each page until they found what they were looking for. A more modern example is a search engine, though it’s easier to use a search engine than to fully understand how it works.
A blockchain is a connected, linear history of activity on the chain. Instead of starting at Block 1 and searching everything until you find the piece of data you need, it would be nice if you could query the data and get what you asked for instantly. This hasn’t been the case, however. Some chains have no indexing solution at all, and on others a query may take days to be answered. This chokes out a large number of critical use cases for blockchain, and threatens to outweigh the benefits of having data on-chain in the first place.
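To make the contrast concrete, here is a minimal sketch in Python of the two approaches: walking every block versus looking an address up in a prebuilt index. The toy chain, the addresses, and the function names are all made up for illustration and do not reflect any real chain's data format.

```python
from collections import defaultdict

# Toy chain: each block holds a list of (sender, recipient, amount) transfers.
# All names and values here are hypothetical.
CHAIN = [
    {"height": 1, "transfers": [("alice", "bob", 5)]},
    {"height": 2, "transfers": [("bob", "carol", 2), ("alice", "carol", 1)]},
    {"height": 3, "transfers": [("carol", "alice", 4)]},
]


def scan_for_address(chain, address):
    """Linear scan: walk every block from the start and collect matches."""
    hits = []
    for block in chain:
        for sender, recipient, amount in block["transfers"]:
            if address in (sender, recipient):
                hits.append((block["height"], sender, recipient, amount))
    return hits


def build_index(chain):
    """One pass over the chain that maps each address to its transfers."""
    index = defaultdict(list)
    for block in chain:
        for sender, recipient, amount in block["transfers"]:
            record = (block["height"], sender, recipient, amount)
            index[sender].append(record)
            if recipient != sender:
                index[recipient].append(record)
    return index


def query_index(index, address):
    """Indexed lookup: a single dictionary access, no block traversal."""
    return index.get(address, [])


INDEX = build_index(CHAIN)
```

Both `scan_for_address(CHAIN, "carol")` and `query_index(INDEX, "carol")` return the same records, but the scan touches every block on every query, while the index pays that cost once up front. On a chain with millions of blocks, that difference is roughly what separates an instant answer from a very long wait.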
The options have been limited for developers. There are services that continuously capture all the data on the blockchain, then store it off-chain in a large database. This information is then parsed and indexed so it can be searched, and the service, sometimes called an Ingestion Service, charges users a fee to search it. This technically helps to solve the problem of making blockchain data searchable. However, it forces users to trust the company to provide the correct information, creates a single point of failure, and undoes the benefits of a decentralized platform by pushing it onto a centralized indexing solution.
The other alternative for developers has been to DIY their own platform, indexing their own data and writing software to parse it and make it searchable. This takes development time and effort away from the core value of a platform or DApp, and the result is unlikely to benefit from best-in-class indexing specialists.
The Solution(s)
Among Web3 blockchain indexing solutions, only a few have created successful innovations that provide searchable blockchain data, do so quickly, and do so in a way that will eventually be decentralized.
These platforms are The Graph and SubQuery. While their goals are similar (providing easy access to blockchain information), their approaches are different, as are their primary launch ecosystems.
The Graph launched its mainnet for Ethereum in late 2020, and works by bringing together indexers, consumers, and curators (with delegators, fishermen, and arbitrators playing minor roles). Indexers stake GRT and run the nodes that collect and index information, processing chunks of the ecosystem in the form of “graphs” and smaller “subgraphs.” Consumers pay to query the indexed data quickly and easily. Curators are incentivized to look for new subgraphs that may have demand, then stake tokens as a way to mark opportunities for indexers. Curators are rewarded based on how much their marked subgraphs are queried.
SubQuery is a platform built initially for Polkadot/Substrate. Its philosophy for creating an indexed ecosystem is fundamentally different from The Graph’s, though the end goals are the same. SubQuery has three key components. The first is its software development kit (SDK), an open source toolkit that developers can easily bolt onto their projects after staking SQT tokens in order to add index and query capabilities. The second is SubQuery Projects, an online application where clients can publish and deploy their SubQuery project to the managed service, where SubQuery will run it online for free. The third is the SubQuery Explorer, an online managed service that provides access to published SubQuery Projects made by contributors in the community. Users can test queries directly using the playground or get GraphQL API endpoints for each Project.
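For readers who have not used a GraphQL endpoint before, the sketch below shows roughly what a query against one might look like from Python. It only builds the HTTP request and does not send it. The endpoint URL, the `transfers` entity, and its fields are assumptions made for illustration; a real project defines its own schema and publishes its own URL.

```python
import json
import urllib.request

# Hypothetical endpoint: substitute the URL published for a real project.
ENDPOINT = "https://example.invalid/graphql"

# Hypothetical schema: a "transfers" entity with these fields is assumed.
QUERY = """
query RecentTransfers($first: Int!) {
  transfers(first: $first) {
    nodes {
      id
      amount
    }
  }
}
"""


def build_request(endpoint, query, variables):
    """Package a GraphQL query as a JSON POST request without sending it."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(ENDPOINT, QUERY, {"first": 5})
```

Passing `request` to `urllib.request.urlopen` against a real endpoint would return a JSON document shaped like the query. The point is that the client asks for exactly the fields it needs and never has to walk the chain itself.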
Both solutions focus on a mix of indexers, customers who pay to query information, and an overarching body that settles disputes and disciplines bad actors. Both solutions are mostly centralized in their current structure, with a clear roadmap and vision toward a decentralized future state. Both solutions have scaling potential, though SubQuery will likely scale first: it has announced plans to collaborate with Moonbeam on EVM support, which would make SubQuery the only indexing protocol to work on both Polkadot and Ethereum. The Graph launched first, however, so it may have other advantages on its path to increasing connectedness within and across blockchain ecosystems.
Looking Forward
The Graph and SubQuery may be the catalysts that launch a massive number of use cases for blockchain and further unlock its potential for mainstream use. They have different approaches to the same problem, and time will tell which is more effective. So which platform will become the Google of Web3, and truly decentralize the search functionality of blockchain? It’s hard to say, but it may not be accurate to judge the contest as a zero-sum game. Blockchain and decentralization have shown that the world is a very big place, that collaborations on the blockchain abound, and that there just might be enough room for everyone to play and to thrive. The most certain aspect, however, is that we don’t want to go back to those early, dark days before search engines. Data is the new oil, and we can’t do without it.
Disclaimer: This article is provided for informational purposes only. It is not offered or intended to be used as legal, tax, investment, financial, or other advice.