Overview of Data Curation
A high-level overview of data curation at MetricsDAO.
Data curation is the process of creating the data tables that power MetricsDAO bounties and analyses.
This page gives a high-level overview of data curation and describes how curation projects are organized and coordinated at MetricsDAO.
At a high level, data curators write SQL code to transform blockchain data into easy-to-use formats.
For example, a curator might write SQL to transform JSON bytecode data into a table that is human readable, organized, and easy to query. A curator might also write SQL to combine data from existing tables into a new table for a specific protocol or specialized analysis.
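The first kind of transformation can be sketched as follows. This is a minimal illustration, not MetricsDAO's actual pipeline: it uses Python's built-in `sqlite3` in place of a production warehouse, and the table names (`raw_events`, `transfers`), the payload fields, and the `json_get` helper are all hypothetical.

```python
import json
import sqlite3

# Hypothetical raw events: each row stores one transaction as a JSON string.
RAW_EVENTS = [
    ("0xabc", json.dumps({"block": 100, "sender": "alice", "amount": "2.5", "token": "ETH"})),
    ("0xdef", json.dumps({"block": 101, "sender": "bob", "amount": "10", "token": "USDC"})),
]


def json_get(doc, key):
    """Return one top-level field from a JSON string (None if absent)."""
    return json.loads(doc).get(key)


def build_curated_table(conn):
    """Load the raw rows, then derive a typed, queryable table from them."""
    # Register the helper so SQL can call it (an assumption of this sketch;
    # a real warehouse would use its own JSON functions instead).
    conn.create_function("json_get", 2, json_get)
    conn.execute("CREATE TABLE raw_events (tx_id TEXT, payload TEXT)")
    conn.executemany("INSERT INTO raw_events VALUES (?, ?)", RAW_EVENTS)
    conn.execute(
        """
        CREATE TABLE transfers AS
        SELECT
            tx_id,
            CAST(json_get(payload, 'block') AS INTEGER) AS block_number,
            json_get(payload, 'sender') AS sender,
            CAST(json_get(payload, 'amount') AS REAL) AS amount,
            json_get(payload, 'token') AS token
        FROM raw_events
        """
    )


conn = sqlite3.connect(":memory:")
build_curated_table(conn)
rows = conn.execute(
    "SELECT tx_id, block_number, sender, amount, token "
    "FROM transfers ORDER BY block_number"
).fetchall()
for row in rows:
    print(row)
```

The curated `transfers` table exposes named, typed columns, so an analyst can filter or aggregate without parsing the payload on every query.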
Each step in the data curation process transforms data into a progressively easier-to-use format.
To better understand these steps, it helps to see them in the context of the full blockchain data journey, which starts with a person using a dApp and ends with analysts creating insights from curated data tables:
Step in the Blockchain Data Journey
With each transformation, curators need to deeply understand the data they’re modeling. During the phases of a curation project they’ll make design decisions, write tests, write documentation, incorporate user feedback, and more.
Curation projects that create tables for a protocol like a DEX generally follow these phases:
Deep dive to understand the protocol, both from the perspective of its users (try it out!) and from the perspective of its founders and builders: what are their analytical needs and ecosystem priorities? If the curation team is building a specialty table for a chain, such as NFT sales, then they’ll work to anticipate analytical needs for that table.
With this understanding of the protocol, the curation team will perform test transactions and examine how the protocol’s data show up in the core tables of its base layer chain(s).
The curation team will design and model the tables, working from the data in the protocol’s base layer chain(s).
The team writes tests on the new tables and releases them to users for querying and feedback.
The team writes final tests, resolves any bugs, and releases the tables to the public.
The team works with the community to prioritize improvements and new tables based on feedback and as the protocol’s needs evolve.
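The phases above mention writing tests on new tables. As a rough sketch of what such tests can look like, the example below runs two common data checks (uniqueness and not-null) as SQL queries that count violating rows. The `dex_swaps` table, its columns, and the `failing_rows` helper are hypothetical, and `sqlite3` again stands in for the real warehouse; actual projects may use a dedicated testing framework instead.

```python
import sqlite3


def failing_rows(conn, table, column, check):
    """Count rows that violate a check; zero means the test passes.

    Table and column names are interpolated directly, so this sketch
    assumes they come from trusted project configuration.
    """
    queries = {
        "not_null": f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL",
        "unique": (
            f"SELECT COUNT(*) FROM ("
            f"SELECT {column} FROM {table} "
            f"GROUP BY {column} HAVING COUNT(*) > 1)"
        ),
    }
    return conn.execute(queries[check]).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dex_swaps (tx_id TEXT, trader TEXT, amount_usd REAL)")
conn.executemany(
    "INSERT INTO dex_swaps VALUES (?, ?, ?)",
    [
        ("0x01", "alice", 120.0),
        ("0x02", "bob", 75.5),
        ("0x02", "bob", 75.5),  # duplicate tx_id: should fail the unique test
        ("0x03", None, 40.0),   # missing trader: should fail the not_null test
    ],
)

results = {
    ("tx_id", "unique"): failing_rows(conn, "dex_swaps", "tx_id", "unique"),
    ("tx_id", "not_null"): failing_rows(conn, "dex_swaps", "tx_id", "not_null"),
    ("trader", "not_null"): failing_rows(conn, "dex_swaps", "trader", "not_null"),
}
for (column, check), failures in results.items():
    status = "PASS" if failures == 0 else f"FAIL ({failures})"
    print(f"{check}({column}): {status}")
```

Expressing each test as a query that returns the number of violations keeps the checks easy to rerun whenever the table is rebuilt.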
Curation projects that create tables for an entire chain like Near or Terra are broken down into these phases:
One of MetricsDAO’s data providers, like Flipside, will ingest data from a node provider into MetricsDAO’s Snowflake database.
The curation team will design and model tables like blocks, transactions, and messages.
The curation team writes tests on core tables and releases the core tables to users who use the tables and provide feedback.
The curation team models the data in the core tables into final tables ready for analytics.
The team writes final tests and resolves any bugs.
Work with the community to prioritize improvements and new tables based on feedback and as new protocols/projects launch on the chain.
To see a more detailed example of these phases, check out this Notion page for the Terra 2.0 curation project planning.
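To make the step from core tables to final tables more concrete, here is a small sketch that joins two core tables into an analytics-ready daily summary. The schemas for `blocks` and `transactions`, the `daily_activity` table, and the sample values are assumptions made for illustration; real core tables carry many more columns, and this runs on `sqlite3` rather than Snowflake.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical core tables with only the columns this sketch needs.
    CREATE TABLE blocks (block_id INTEGER PRIMARY KEY, block_timestamp TEXT);
    CREATE TABLE transactions (tx_id TEXT, block_id INTEGER, fee REAL);

    INSERT INTO blocks VALUES
        (1, '2022-06-01 00:00:05'),
        (2, '2022-06-01 12:30:00'),
        (3, '2022-06-02 08:15:00');
    INSERT INTO transactions VALUES
        ('a', 1, 0.5), ('b', 1, 0.25), ('c', 2, 1.0), ('d', 3, 0.75);

    -- Final table: one row per day, ready for analysts to query directly.
    CREATE TABLE daily_activity AS
    SELECT
        DATE(b.block_timestamp) AS day,
        COUNT(t.tx_id) AS tx_count,
        SUM(t.fee) AS total_fees
    FROM transactions AS t
    JOIN blocks AS b ON b.block_id = t.block_id
    GROUP BY DATE(b.block_timestamp);
    """
)

daily = conn.execute(
    "SELECT day, tx_count, total_fees FROM daily_activity ORDER BY day"
).fetchall()
for row in daily:
    print(row)
```

The join and aggregation are done once in the final table, so analysts do not have to repeat them in every query.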
After an entire chain’s data has been curated, new projects may launch that focus on creating tables for a specific protocol or project like Uniswap. New projects may also launch to create specific tables like NFT mints and sales.
In addition to the developer tools that curators use to write code, curators will coordinate project work via:
- Coordinape for group consensus on contributor compensation.
This page describes how curation projects are organized and coordinated, but in practice we learn the most from each other. The most important details will come from participating, shadowing, and learning from the MetricsDAO community. See the Data Curator Onboarding page to get started.