Key Insights

  • Ocean Protocol is a stack of decentralized data sharing technologies that aim to lower the barriers to high-quality data access. The Ocean ecosystem includes data marketplaces and a collection of data orchestration smart contracts.
  • Ocean V3 and V4 employ an automated market maker (AMM) with data pools that facilitate trade between data sellers and buyers. The protocol is introducing Ocean V4X to address design flaws and product demand challenges on V3 and V4.
  • The protocol offers novel features such as its Compute-to-Data functionality, which enables data users to train data models while keeping the actual dataset private.
  • Current industry norms have hindered Ocean from realizing the full benefit of potential network effects. As such, the protocol is primarily challenged with securing sustained data supplies and retaining reliable channels of demand. The protocol is addressing these challenges with innovative solutions in its recent Ocean V4X update.
  • Backed by an agile team, the Ocean Protocol is primed to gain a larger foothold in the space as industry norms continue to rapidly evolve.

Introduction

Data is an invaluable asset for artificial intelligence (AI), biotechnology, financial technology (fintech), consumer retail, and more. In 2017, The Economist argued that “The world’s most valuable resource is no longer oil, but data” and reported that the top five companies with the most access to data netted a combined $25 billion in profit in Q1 2017. Data is valuable for many reasons, a major one being its importance to advancing emerging technologies. AI companies need large, high-quality datasets to develop supervised learning algorithms such as neural networks. Biotechnology R&D teams consume significant amounts of data when replicating their results to confirm research findings. Consumer retail increasingly relies on substantial consumer data to develop targeted advertising. Web2 behemoths that specialize in big-data capture, like Google or Meta, dominate market share in a variety of industries that require high-quality data. Furthermore, these data oligopolies can act as gatekeepers in data-heavy industries, playing an outsized role in determining the success of smaller players.

Ocean is a stack of decentralized data sharing technologies. The protocol is interoperable and currently deployed on Ethereum, Polygon, Polkadot, Binance Smart Chain (BSC), Moonriver, and the Energy Web Chain. Its open-source code is readily available on GitHub, allowing other programmers to fork Ocean and build their own marketplaces. Aiming to make high-quality data more accessible, Ocean specifically intends to attract AI startups that need high-quality datasets. The protocol’s mission is outlined in its whitepaper. This report will look at Ocean V3 and V4, the protocol’s current challenges, and Ocean’s plan to address these challenges.

Ocean’s Product Evolution

At its core, the Ocean protocol aims to provide the infrastructure and services required for decentralized data sharing technologies. The protocol had its initial release in 2018, allowing data providers to share data with consumers. Since then, it has iterated through multiple versions and added new features.

Ocean V2 introduced the novel Compute-to-Data feature, which allows data users to run models on private or sensitive data that would otherwise be unavailable to them. The architecture keeps private data on the data owner/provider’s servers while it is used to train models for the data user, making it compliant with the EU General Data Protection Regulation (GDPR).

Ocean V3 introduced “datatokens,” ERC-20 tokens (or equivalent tokens on other blockchains) that facilitate transactions between data providers and consumers. The open-source data marketplace, the Ocean Market, was also introduced. The Ocean Market is structured as an automated market maker (AMM), allowing for data price discovery. By pairing the OCEAN token (the Ocean protocol’s native token) with datatokens, users could purchase access to datasets as needed. Ocean V3 also introduced the Data Farming program, allowing OCEAN stakers to signal dataset quality. Ocean’s Data Supply Reward Function (RF) calculates staking rewards from a dataset’s popularity/perceived quality, community feedback, and other variables. Ocean’s program managers collect community feedback and calculate the RF off-chain at a pre-set time each week before manually airdropping OCEAN to contributors.
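The AMM price-discovery mechanic can be sketched with a toy pool of OCEAN and datatokens. Ocean’s pools were Balancer-style weighted pools, so the 50/50 constant-product math and the reserve figures below are simplifying assumptions for illustration only:

```python
# Minimal constant-product sketch of datatoken price discovery.
# Real Ocean V3 pools used Balancer-style weighted-pool math; this
# 50/50 constant-product case is an assumption for illustration.

class DataPool:
    def __init__(self, ocean_reserve: float, datatoken_reserve: float):
        self.ocean = ocean_reserve
        self.datatoken = datatoken_reserve

    def spot_price(self) -> float:
        """OCEAN per datatoken implied by current reserves."""
        return self.ocean / self.datatoken

    def buy_datatokens(self, ocean_in: float) -> float:
        """Swap OCEAN in for datatokens out, preserving x * y = k."""
        k = self.ocean * self.datatoken
        self.ocean += ocean_in
        datatokens_out = self.datatoken - k / self.ocean
        self.datatoken -= datatokens_out
        return datatokens_out

pool = DataPool(ocean_reserve=1000.0, datatoken_reserve=100.0)
print(pool.spot_price())          # 10.0 OCEAN per datatoken
got = pool.buy_datatokens(100.0)  # buying pushes the price up
print(round(got, 4), round(pool.spot_price(), 4))  # 9.0909 12.1
```

The key property is that each purchase shifts the reserve ratio, so the market continuously re-prices access to the dataset as demand changes.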

Ocean V4 added features meant to increase demand for the platform. It launched in Q2 2022 with the creation of Data NFTs: tradable, non-fungible ERC-721 tokens that represent the NFT owner’s data as on-chain intellectual property (not enforced by legal contracts). The protocol also added protections against malicious data publishers who exit their datatoken positions to the detriment of stakers in the data pools. The V4 AMM model for price discovery was adjusted so that data providers do not receive any initial tokens, and an algorithm independently mints/burns datatokens based on supply/demand to adjust prices. Ocean marketplaces also gained new ways to earn fees, including the Consume Market Consumption Fee, where marketplaces can set their own rates in their preferred currencies, and the Publisher Market Consumption Fee from dataset publications. V4 saw impressive initial traction, growing from 20 to 38 data pools in just over a week across late June and early July.

OCEAN Tokenomics

OCEAN is designed to accrue value with network engagement growth. As users employ Ocean’s tools and marketplace, the protocol earns fee revenues that are reinvested into OceanDAO and subsequently used to burn the OCEAN token. Burning OCEAN reduces outstanding token supply, increasing upward pressure on OCEAN’s price as network-driven demand increases. The OCEAN token distribution is presented in the graphic below.
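The fee-and-burn loop described above reduces to simple arithmetic. The sketch below is an illustration only: the burn share, fee revenue, and supply figure are assumptions, not Ocean’s published parameters:

```python
# Hedged sketch of the fee-and-burn value-accrual loop. The burn_share
# split and the fee_revenue figure are illustrative assumptions, not
# Ocean's actual parameters.

def apply_network_fees(total_supply: float, fee_revenue: float,
                       burn_share: float = 0.5) -> float:
    """Burn a share of fee revenue and return the reduced token supply."""
    burned = fee_revenue * burn_share
    return total_supply - burned

supply = 1_410_000_000.0  # illustrative starting supply, for scale
supply = apply_network_fees(supply, fee_revenue=1_000_000.0)
print(supply)  # supply shrinks by the burned amount (500,000 here)
```

The mechanism only exerts meaningful upward price pressure if fee revenue (i.e., network usage) is large relative to circulating supply, which is why the traction challenges discussed below matter for the token design.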

Ocean’s Challenges

Stagnating Traction Among Data Providers

The protocol has struggled to attract sufficient data providers. When Ocean originally launched datatokens, product launch hype and network rewards garnered high initial traction. Using the number of datatoken mint transactions and datatokens created as a proxy for traction, Ocean reached its demand peak among data providers in November 2020 (when 413 datatokens were minted and 340 were created). Since December 2020, Ocean has averaged 10–15 mint transactions and 10 datatokens created each month.

Stagnating Traction Among Data Consumers

Ocean has struggled to refine its product-market fit with data consumers. Using the number of datatoken transactions as a proxy for Ocean’s data consumption demand, Ocean garnered the most traction among consumers in November 2020 when the number of datatoken transfers or transactions (excluding mint transactions) exceeded 3,000. The number of datatoken transfers varied between 20 and 200 each month for the majority of 2021 before dropping to around 10–20 each month in 2022.

Unstable Network Effects

The achievement of network effects is critical to Ocean’s success. Data providers wouldn’t see the value of providing data on a platform with low traction among data consumers. Data consumers wouldn’t bother checking for data on a platform with limited data supply. As a result, Ocean is facing the “Cold Start Problem,” described by a16z partner Andrew Chen. Without sustained engagement, the incentives for participants become less clear, and the protocol’s network effect challenges are likely to continue.

Data Quality & Accountability

Some data consumers, like industrial-grade AI researchers, could consider the decentralized nature of the protocol to be a liability. While the Data Farming program incentivizes high-quality datasets, it does not actively discourage low-quality datasets. Ocean only identifies data providers by their wallet addresses, which limits accountability for data providers that offer low-quality datasets. To meet its stated goal of serving AI companies, Ocean could consider better tailoring its product offerings to meet common criteria for enterprise data.

July 2022 Exploit on Ocean V4

In the first month after its release, Ocean V4 faced an attack orchestrated through a series of transactions that took advantage of the AMM data pools. The exploit involved staking OCEAN in data pools, buying datatokens, removing the staked OCEAN, and then selling the datatokens, slowly draining OCEAN from the data pools. The attack was made possible by the AMM mechanism, which allowed data publishers and stakers to pull their liquidity from the data pools at any time and sell datatokens for OCEAN.
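The exploit sequence can be reproduced against a deliberately naive pool model. The share accounting below is a simplification (the real Ocean pools were Balancer-based with more nuanced math), but it illustrates how the stake/buy/unstake/sell loop extracts OCEAN:

```python
# Naive pool model illustrating the exploit sequence: stake OCEAN,
# buy datatokens, unstake, then dump the datatokens. The share
# accounting is a simplifying assumption, not Ocean's actual contracts.

class NaivePool:
    def __init__(self, ocean: float, datatokens: float, shares: float):
        self.ocean, self.dt, self.shares = ocean, datatokens, shares

    def stake_ocean(self, amount: float) -> float:
        """One-sided OCEAN stake that mints shares pro rata -- the flaw."""
        minted = self.shares * amount / self.ocean
        self.ocean += amount
        self.shares += minted
        return minted

    def swap_ocean_for_dt(self, ocean_in: float) -> float:
        k = self.ocean * self.dt
        self.ocean += ocean_in
        dt_out = self.dt - k / self.ocean
        self.dt -= dt_out
        return dt_out

    def swap_dt_for_ocean(self, dt_in: float) -> float:
        k = self.ocean * self.dt
        self.dt += dt_in
        ocean_out = self.ocean - k / self.dt
        self.ocean -= ocean_out
        return ocean_out

    def unstake(self, shares: float):
        """Pro-rata withdrawal of both reserves."""
        frac = shares / self.shares
        ocean_out, dt_out = frac * self.ocean, frac * self.dt
        self.ocean -= ocean_out
        self.dt -= dt_out
        self.shares -= shares
        return ocean_out, dt_out

pool = NaivePool(ocean=1000.0, datatokens=100.0, shares=100.0)
spent = 1000.0 + 200.0
minted = pool.stake_ocean(1000.0)           # 1. stake OCEAN
dt_bought = pool.swap_ocean_for_dt(200.0)   # 2. buy datatokens
ocean_back, dt_back = pool.unstake(minted)  # 3. remove the stake
ocean_from_sale = pool.swap_dt_for_ocean(dt_bought + dt_back)  # 4. sell
profit = ocean_back + ocean_from_sale - spent
print(round(profit, 2), round(pool.ocean, 2))  # 500.0 500.0
```

In this toy run the attacker nets 500 OCEAN and the pool’s OCEAN reserve falls from 1,000 to 500; repeating the loop keeps draining the pool, matching the slow drain described above.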

Ocean’s team immediately responded to the attacks by raising the swap fee (the Ocean Community Fee) to 15% so that the attackers would incur a loss from executing such transactions. The team also quickly alerted the community, advising members to withdraw their liquidity from Ocean’s pools and recommending that data providers set a fixed price for their datasets. A fixed price counteracts the dynamic pricing mechanism inherent in AMM-style data pools.
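The logic behind the fee response can be checked with back-of-envelope arithmetic: a swap fee charged on each leg caps a buy-then-sell round trip at (1 − f)² of the input, even before accounting for price impact:

```python
# Back-of-envelope check of the fee response. A swap fee f charged on
# each leg caps a buy-then-sell round trip at (1 - f)**2 of the input,
# ignoring price impact. Fee levels other than 15% are illustrative.

def round_trip_return(ocean_in: float, swap_fee: float) -> float:
    after_buy = ocean_in * (1 - swap_fee)   # OCEAN -> datatokens
    return after_buy * (1 - swap_fee)       # datatokens -> OCEAN

print(round_trip_return(100.0, 0.003))  # 99.4009: near break-even
print(round_trip_return(100.0, 0.15))   # 72.25: a 15% fee forces a loss
print(round_trip_return(100.0, 1.0))    # 0.0: a 100% fee halts swaps
```

At 15%, any round trip loses at least ~27.75% of its input, which is why the fee hike made the exploit transactions unprofitable.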

Eventually Ocean’s team set the swap fee to 100% to completely neutralize the attack. Recognizing the need for a long-term solution to protect its community, Ocean Market is migrating from the AMM-powered price discovery model to a fixed price data asset model, Ocean V4X. Marketplaces built on Ocean are still allowed to choose a dynamic pricing model at their own risk.

Ocean’s Roadmap: Ocean V4X and veOCEAN

To address the challenges of the Ocean pools’ AMM architecture, Ocean is introducing a new mechanism for data curation based on the vote-escrowed (ve) token model. The veOCEAN design not only changes the dynamics of data curation but also aims to improve incentive alignment among data publishers, consumers, and OCEAN holders.

Ocean’s veOCEAN design is a fork of the veCRV model. This fixed-price data asset model relies on staking smart contracts to facilitate dataset curation while continuing the Data Farming community rewards. Under the previous Data Farming model, a staker’s rewards were based on their pro rata share of the AMM liquidity pools. The veOCEAN model still lets stakers earn based on their pro rata shares, but their OCEAN is locked in Ocean vaults for a set period during which they cannot retrieve their capital. Stakers who actively participate in evaluating data quality earn Data Farming rewards each week, and the longer a staker locks their capital, the higher their rewards. All veOCEAN holders earn the community fees paid to OceanDAO.
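The lock-weighted mechanics can be sketched following the veCRV design that veOCEAN forks. The four-year maximum lock and the weekly budget figure below are assumptions for illustration, not Ocean’s confirmed parameters:

```python
# Sketch of veCRV-style vote-escrow weighting, which veOCEAN forks.
# The 4-year cap, linear decay, and weekly budget are assumptions
# carried over from the veCRV design, not Ocean's confirmed parameters.

MAX_LOCK_YEARS = 4.0

def ve_balance(ocean_locked: float, years_remaining: float) -> float:
    """veOCEAN balance: scales with lock length, decaying linearly."""
    years = min(years_remaining, MAX_LOCK_YEARS)
    return ocean_locked * years / MAX_LOCK_YEARS

def weekly_reward_share(my_ve: float, total_ve: float,
                        weekly_budget: float) -> float:
    """Pro-rata Data Farming payout for one week."""
    return weekly_budget * my_ve / total_ve if total_ve else 0.0

# Locking 1,000 OCEAN for 4 years yields 4x the veOCEAN of a 1-year lock.
print(ve_balance(1000.0, 4.0), ve_balance(1000.0, 1.0))  # 1000.0 250.0
print(weekly_reward_share(1000.0, 10_000.0, weekly_budget=150_000.0))
```

The linear decay means a staker must periodically re-lock to maintain full weight, which is the mechanism that rewards long-term commitment over mercenary capital.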

Ocean Data Bounty Program

Ocean’s Data Bounty Program encourages Ocean’s data consumers, of which 40% are data scientists, to brainstorm novel dataset use cases, compile datasets, analyze data for insights, train data models, and more with Ocean’s data assets. The Ocean Data Bounty is delivered through three phases: Ideation, Insights, and Problem-Solving:

  • Ideation Phase: 10–20 winning ideas are selected from 100–200 proposals, and winners are awarded $5,000–$10,000 in OCEAN;
  • Insights Phase: 5–20 data models, reports, and other key insights win $10,000–$20,000 in OCEAN;
  • Problem-Solving Phase: proposals that address Ocean’s business and product challenges are rewarded with $10,000–$25,000 in OCEAN.

This program promises to attract data consumers outside of the Web3 community, from academic data scientists to non-crypto industry researchers. This bounty program could create new demand from market segments that Ocean has been trying to penetrate, like industrial AI or biotechnology startups.

Final Remarks

Ocean is on a mission to democratize access to high-quality data, but its product offering has struggled to find product-market fit in its current form. The protocol has developed many novel features (e.g., Compute-to-Data) that could be incredibly useful for various industrial and academic applications in the future. Despite this novelty, many industries are likely to remain hesitant to trust datasets on the platform given its inability to guarantee high data quality. Companies will always choose the most cost-effective, efficient, and credible data solutions, and Ocean has work to do on all three fronts. Ocean is trying to solve a real problem, but the protocol needs continued innovation to generate the necessary network effects and find its product-market fit.
