Indexing data on ICE/SNOW with Subsquid

8 min read | Aug 25, 2022

Subsquid is a framework for indexing and querying data on Substrate-based blockchains. It is an ETL (Extract, Transform, Load) project that collects on-chain data and provides a GraphQL API to query that data. Its multi-layer design pre-processes and decodes raw chain data and stores it for faster access by query nodes, which is intended to deliver better performance than direct RPC calls.

This article walks you through creating a Subsquid project (also known as a “Squid”) that indexes ERC-721 token transactions on the Arctic network.

Prerequisites

To run a squid project, you need to install Node.js (with npm) and Docker (with docker-compose), which the commands in this article rely on.

Creating the Project

Subsquid has a template repository, squid-template, on GitHub that can be used to start the project. You can fork this repository to your GitHub account and then start building on it.

Installing Dependencies

npm ci

Installing Additional Dependencies

npm install @ethersproject/abi ethers @subsquid/substrate-evm-processor @subsquid/evm-typegen

The next step is to customize the project for our own purpose: change the schema and define the entities we want to keep track of.

Define Entity Schema

To index ERC-721 token transfers and approvals, we will need to track:

Token transfers
Ownership of tokens
Contracts and their minted tokens
Approval of tokens

So update the schema.graphql file with the following content:

Things to note in the above schema definition:

@entity - indicates that this type will be translated into a database-persisted ORM model
@derivedFrom - indicates the field will not be persisted in the database; instead, it is derived from the named field on the related entity
type references (i.e. from: Owner) - establishes a relation between two entities
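The schema itself is written in GraphQL, but the relations it describes can be pictured on the TypeScript side. The following is a minimal sketch, not the generated model: the interfaces are hypothetical, and only the field names that appear elsewhere in this article (id, balance, name, symbol, owner, contract) are taken from the text. It illustrates how a derived field such as an owner's list of tokens is computed from the reference stored on each token rather than stored on the owner.

```typescript
// Hypothetical plain-TypeScript mirror of three entities. The real classes
// are generated from schema.graphql by the codegen tool.
interface Owner { id: string; balance: bigint; }
interface Contract { id: string; name: string; symbol: string; }
interface Token { id: string; owner: Owner; contract: Contract; }

// A derived field is not a column on the Owner row. It is looked up through
// the `owner` reference that every Token row stores.
function ownedTokens(owner: Owner, allTokens: Token[]): Token[] {
  return allTokens.filter((token) => token.owner.id === owner.id);
}

// Sample data: the owner ids are made up for illustration.
const arctic: Contract = {
  id: "0x822f31039f5809fa9dd9877c4f91a46de71cde63",
  name: "ArcticToken",
  symbol: "ARTK",
};
const alice: Owner = { id: "0xa11ce", balance: BigInt(2) };
const bob: Owner = { id: "0xb0b", balance: BigInt(1) };
const sampleTokens: Token[] = [
  { id: "1", owner: alice, contract: arctic },
  { id: "2", owner: bob, contract: arctic },
  { id: "3", owner: alice, contract: arctic },
];

console.log(ownedTokens(alice, sampleTokens).map((t) => t.id)); // [ '1', '3' ]
```

In the real project this lookup is done by the ORM and the GraphQL server, so no such helper has to be written by hand.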

NOTE: Refer to schema definition for further information on GraphQL schema.

Since we have changed the schema, new TypeScript entity classes have to be generated. To do that, run the codegen tool:

npx sqd codegen

The generated TypeScript entity classes can be found under the src/model/generated path.

ABI Definition and Wrapper

Subsquid supports the automated creation of type-safe TypeScript interfaces for Substrate data sources (events, extrinsics, storage items).
Changes in the runtime are detected automatically.

With the introduction of the evm-typegen tool, which generates TypeScript interfaces and decoding methods for EVM logs, this feature has been extended to EVM indexing.

First, the contract's Application Binary Interface (ABI) specification must be obtained. This is available as a JSON file, which will be loaded into the project.

Create a folder named abi inside the src folder, and there create a JSON file named ERC721.json

mkdir src/abi
touch src/abi/ERC721.json

Copy the ABI for the ERC-721 interface and paste it into the ERC721.json file.

To automatically generate TypeScript interfaces from an ABI definition and decode event data, run this command from the project’s root folder:

npx squid-evm-typegen --abi src/abi/ERC721.json --output src/abi/erc721.ts
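To give a feel for what decoding an EVM log involves, here is a hand-written sketch for the ERC-721 Transfer event. It is not the code that the typegen tool emits, and the names decodeTransfer, TransferEvent, and pad32 are made up for illustration. It relies only on the ERC-721 standard, where all three Transfer parameters (from, to, tokenId) are indexed and therefore arrive as log topics 1 to 3, with topic 0 being the hash of the event signature.

```typescript
// Topic 0 of a Transfer log: keccak256("Transfer(address,address,uint256)").
const TRANSFER_TOPIC =
  "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef";

interface TransferEvent { from: string; to: string; tokenId: bigint; }

// Hypothetical decoder: turns the four topics of an ERC-721 Transfer log
// into a typed object.
function decodeTransfer(topics: string[]): TransferEvent {
  if (topics.length !== 4 || topics[0].toLowerCase() !== TRANSFER_TOPIC) {
    throw new Error("not an ERC-721 Transfer log");
  }
  // Indexed addresses are left-padded to 32 bytes; keep the last 20 bytes.
  const toAddress = (topic: string): string => "0x" + topic.slice(-40).toLowerCase();
  return {
    from: toAddress(topics[1]),
    to: toAddress(topics[2]),
    tokenId: BigInt(topics[3]), // a 0x-prefixed hex string parses directly
  };
}

// Helper for building sample topics: left-pad a hex value to 32 bytes.
const pad32 = (hex: string): string => "0x" + hex.replace(/^0x/, "").padStart(64, "0");

// A made-up mint: token 7 moves from the zero address to 0x...aa.
const mintTopics = [
  TRANSFER_TOPIC,
  pad32("0x0"),
  pad32("0x00000000000000000000000000000000000000aa"),
  pad32("0x7"),
];
const decodedMint = decodeTransfer(mintTopics);
console.log(decodedMint.to, decodedMint.tokenId.toString());
```

The generated src/abi/erc721.ts serves the same purpose but works from the ABI, so it covers every event in the interface without hand-written parsing.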

Define and Bind Event Handler

The Subsquid SDK includes a processor class called SubstrateProcessor or, as in this example, SubstrateBatchProcessor. To obtain chain data, the processor connects to the Subsquid archive. It indexes from the configured starting block up to a specified end block or, if no end block is set, keeps indexing as new data is added to the chain.

Managing the EVM Contract

It is also necessary to define some constants and some helper functions to manage the EVM contract. You can create an additional file for these items:

touch src/contract.ts

In the src/contract.ts file, we'll take the following steps:

Define the chain node endpoint
Create a contract interface to store information such as the address and the ABI
Define functions to fetch a contract entity from the database or create one

In the src/contract.ts file, add the following content:

src/contract.ts file

In the contract.ts file, we have defined the constant CHAIN_NODE to hold the endpoint URL of the Arctic archive node (i.e. wss://arctic-archive.icenetwork.io:9944). We have then defined a map from each contract address to its contract model and to the contract instance created with the ethers library. We are going to index two ERC-721 tokens:

ArcticToken with symbol ARTK deployed at 0x822f31039f5809fa9dd9877c4f91a46de71cde63
MyToken with symbol MTK deployed at 0x581522ca7b73935e4ad8c165d5635f5e15a7658d
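The "fetch a contract entity from the database or create one" step can be sketched as a get-or-create helper. This is a simplified illustration under stated assumptions: EntityStore, InMemoryStore, KNOWN_CONTRACTS, and getOrCreateContract are hypothetical names, and an in-memory map stands in for the real database store. Only the two addresses, names, and symbols come from this article.

```typescript
interface ContractEntity { id: string; name: string; symbol: string; }

// Minimal stand-in for the database store (assumption: the real project
// talks to PostgreSQL through the store provided by the SDK).
interface EntityStore {
  get(id: string): Promise<ContractEntity | undefined>;
  save(entity: ContractEntity): Promise<void>;
}

class InMemoryStore implements EntityStore {
  private rows = new Map<string, ContractEntity>();
  async get(id: string): Promise<ContractEntity | undefined> { return this.rows.get(id); }
  async save(entity: ContractEntity): Promise<void> { this.rows.set(entity.id, entity); }
}

// The two tokens indexed in this article, keyed by lowercase address.
const KNOWN_CONTRACTS = new Map<string, { name: string; symbol: string }>([
  ["0x822f31039f5809fa9dd9877c4f91a46de71cde63", { name: "ArcticToken", symbol: "ARTK" }],
  ["0x581522ca7b73935e4ad8c165d5635f5e15a7658d", { name: "MyToken", symbol: "MTK" }],
]);

// Return the stored entity if it exists; otherwise create and persist it.
async function getOrCreateContract(store: EntityStore, address: string): Promise<ContractEntity> {
  const id = address.toLowerCase(); // normalize so lookups are case-insensitive
  const existing = await store.get(id);
  if (existing) return existing;
  const meta = KNOWN_CONTRACTS.get(id);
  if (!meta) throw new Error(`unknown contract ${address}`);
  const created: ContractEntity = { id, name: meta.name, symbol: meta.symbol };
  await store.save(created);
  return created;
}
```

Normalizing the address to lowercase before the lookup avoids creating two entities for the same contract when logs and configuration use different letter casing.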

Configure Processor and attach handlers

The src/processor.ts file is where the template project instantiates the SubstrateBatchProcessor class, configures it for execution, and attaches the handler functions.
It defines the Squid processor, which retrieves on-chain data from the endpoint exposed by the Squid Archive, applies arbitrary transformations, and stores the result in the target database schema (the one we defined in the schema.graphql file).
It also defines data handlers (handleTransfers & handleApprovals) that subscribe to the log entries of interest. The handlers specify which data is retrieved and how it is processed and saved to the destination database. They process the execution log items strictly in the order in which they appear in the historical chain blocks.

Update the src/processor.ts file with the following snippet:

src/processor.ts file

Here, we have initialized the database with TypeormDatabase(), the database implementation provided by the Subsquid SDK.

Then we have initialized a processor with SubstrateBatchProcessor, which is designed to have a single data handler that processes an ordered array of log items of different kinds in one batch. Let us look at the processor configuration:

setBlockRange(Range): Limits the range of blocks to be processed
setBatchSize(number): Set the maximal number of blocks fetched from the data source in a single request
setDataSource(DataSource): Set the data source to fetch the data from:

archive: the archive endpoint from which the processor queries data

chain: a node RPC endpoint (e.g. wss://arctic-archive.icenetwork.io:9944)

addEvmLog: used to subscribe to the EVM log data (events) emitted by a specific EVM contract

Here we are going to index the transfer and approval events, so the filter property of addEvmLog lists both event topics inside a nested (double) array.

To track a single event, say transfer, a flat (single) array containing only that event's topic can be used.
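The difference between the nested and the flat form follows the usual Ethereum log-filter convention: each position in the filter corresponds to a topic position, and an inner array at a position means "any of these". The sketch below illustrates that convention with a hypothetical matchesFilter function; it is an assumption about the semantics for illustration purposes, not code from the Subsquid SDK.

```typescript
// keccak256 hashes of the ERC-721 event signatures (topic 0 of each log).
const TRANSFER =
  "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef";
const APPROVAL =
  "0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925";

// One entry per topic position: a single topic, or a list of alternatives.
type TopicFilter = Array<string | string[]>;

// Hypothetical matcher: true when every filter position accepts the log's
// topic at the same position.
function matchesFilter(topics: string[], filter: TopicFilter): boolean {
  return filter.every((wanted, position) => {
    const actual = topics[position];
    if (actual === undefined) return false;
    return Array.isArray(wanted) ? wanted.includes(actual) : wanted === actual;
  });
}

// Nested array: topic 0 may be Transfer OR Approval.
const bothEvents: TopicFilter = [[TRANSFER, APPROVAL]];
// Flat array: topic 0 must be Transfer.
const transferOnly: TopicFilter = [TRANSFER];

console.log(matchesFilter([APPROVAL], bothEvents));   // true
console.log(matchesFilter([APPROVAL], transferOnly)); // false
```

With the nested form a single addEvmLog subscription can deliver both kinds of events, and the handler can then tell them apart by comparing topic 0.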

Next, we have defined two types: Item, the type of the items produced by the processor, and Context, the handler context type built on Store. Store is a generic interface exposed to the handlers. In simple terms, Store is used to save data into the database through the TypeormDatabase that we initialized earlier.

When processor.ts is executed, the processor.run function runs first. It fetches all the transfer and approval data by invoking the handleTransfer() and handleApproval() functions, and saves that data to the database by invoking the saveTransfer() and saveApproval() functions respectively. The ctx passed to the handler gives access to the Store: TypeormDatabase provides ctx.store, and data is written to the database via ctx.store.save, as demonstrated in the saveTransfer and saveApproval functions.
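The overall flow of a batch handler can be sketched as follows. This is a simplified illustration, not the project's processor.ts: SketchItem, SketchBlock, SketchStore, SketchContext, and handleBatch are hypothetical stand-ins for the SDK types, and the store here only records what it was asked to save. The point is the shape of the loop: walk blocks and items in chain order, sort items by topic 0, then save each kind once per batch.

```typescript
const TRANSFER_TOPIC =
  "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef";
const APPROVAL_TOPIC =
  "0x8c5be1e5ebec7d5bd14f71427d1e84f3dd0314c0f7b2291e5b200ac8c7c3b925";

// Hypothetical, heavily simplified shapes of what a batch context carries.
interface SketchItem { id: string; topics: string[]; }
interface SketchBlock { height: number; items: SketchItem[]; }
interface SketchStore { save(kind: string, ids: string[]): Promise<void>; }
interface SketchContext { blocks: SketchBlock[]; store: SketchStore; }

async function handleBatch(ctx: SketchContext): Promise<void> {
  const transfers: string[] = [];
  const approvals: string[] = [];
  // Blocks and their items are visited in the order they appear on chain.
  for (const block of ctx.blocks) {
    for (const item of block.items) {
      if (item.topics[0] === TRANSFER_TOPIC) transfers.push(item.id);
      else if (item.topics[0] === APPROVAL_TOPIC) approvals.push(item.id);
    }
  }
  // One write per kind for the whole batch instead of one write per log.
  await ctx.store.save("transfer", transfers);
  await ctx.store.save("approval", approvals);
}
```

Collecting the records first and saving them in bulk is the main reason to use a batch processor: it keeps the number of database round trips small even when a batch spans many blocks.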

Launch Database

When executing the project locally, as in this article, the docker-compose.yml file included in the template's root directory is used to launch a PostgreSQL container.
Run the following command in your terminal:

docker-compose up -d

NOTE: The -d parameter is optional. It launches the container in detached (daemon) mode, so the terminal will not be blocked and no further output will be visible.

Squid projects automatically manage the database connection and schema, via an ORM abstraction.

To set up the database, you can take the following steps:

Build the code

npm run build

Remove previous migrations

rm -rf db/migrations/*.js

Generate and apply the migrations so that the tables are created in the database

npx squid-typeorm-migration generate
npx squid-typeorm-migration apply

In this article, indexing the Arctic data also requires running an archive, a set of Docker containers defined in the archive/docker-compose.yml file. The archive fetches blocks from a Substrate chain and dumps them into a Postgres-compatible database. The file contains the container configuration for both the ingest service and the gateway service: the ingest service is substrate-ingest, and the data is exposed by substrate-gateway, which is the gateway service.

To run an archive locally, inspect archive/docker-compose.yml and provide the WebSocket endpoint of the Arctic archive node, wss://arctic-archive.icenetwork.io:9944, at the argument --e-, then start it with

docker-compose -f archive/docker-compose.yml up

To drop the archive, run

docker-compose -f archive/docker-compose.yml down -v

The archive gateway will be started at port 8888 and can immediately be used with the processor (even if it is not yet in sync). That is why src/processor.ts already points the archive data source at this endpoint.
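For orientation, the data source settings described in this article might be collected like this. It is only a sketch: the /graphql path on the archive URL is an assumption that is not stated in this article, while port 8888 and the node endpoint are taken from the text. The object has the two fields, archive and chain, that setDataSource is described as accepting.

```typescript
// Assumption: the local archive gateway serves its API under "/graphql".
// Port 8888 and the chain endpoint are the values given in this article.
const dataSource = {
  archive: "http://localhost:8888/graphql",
  chain: "wss://arctic-archive.icenetwork.io:9944",
};

console.log(dataSource);
```

If the archive runs on another host or port, only the archive value needs to change; the chain endpoint stays the same.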

Launch Project

To launch the processor (this will block the current terminal), run the following command in a new terminal:

node -r dotenv/config lib/processor.js

Finally, in a separate terminal window, launch the GraphQL server:

npx squid-graphql-server

Now you can visit localhost:4350/graphql to access the GraphQL console, a playground from which you can run queries.

Query Example

Let’s query the token owners with their balance and id (address), the approval events with the owner and approved addresses along with their balances, and the tokens with their contract’s name and symbol:

query MyQuery {
  owners(limit: 10) {
    balance
    id
  }
  approvals(limit: 10) {
    approved {
      balance
      id
    }
    owner {
      balance
      id
    }
  }
  tokens(limit: 10) {
    contract {
      name
      symbol
    }
  }
}
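The same kind of query can also be sent from code instead of the playground. The sketch below assumes a runtime with a global fetch (for example Node.js 18 or later) and that the GraphQL server is reachable at http://localhost:4350/graphql; buildRequest and fetchOwners are hypothetical helper names, and the query is a shortened version of the one above.

```typescript
const GRAPHQL_URL = "http://localhost:4350/graphql";

// A shortened version of the query above: only the owners part.
const OWNERS_QUERY = `query MyQuery {
  owners(limit: 10) {
    balance
    id
  }
}`;

// Build the HTTP request for a GraphQL query (standard JSON-over-POST form).
function buildRequest(query: string): { method: string; headers: Record<string, string>; body: string } {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }),
  };
}

// Send the query and return the "data" part of the response.
// Not called here, because it needs the GraphQL server to be running.
async function fetchOwners(): Promise<unknown> {
  const response = await fetch(GRAPHQL_URL, buildRequest(OWNERS_QUERY));
  if (!response.ok) throw new Error(`GraphQL request failed: ${response.status}`);
  const payload = (await response.json()) as { data?: unknown };
  return payload.data;
}
```

Calling fetchOwners() once the processor has synced should return the same owners list that the playground shows for this query.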

Feel free to experiment with other queries here; after all, it’s a playground.

NOTE: Query results will only be shown after the processor has synced the blocks where the tokens were deployed.

Summary

You can view the finalized and complete project on this GitHub repo.

In this article, we learned how to index ERC-721 token transfers and approvals using Subsquid and how to query the data using GraphQL. Similarly, you can index any other token or data by updating the schema.graphql file and making the corresponding changes described in this article.

Subsquid’s documentation is full of useful information, and it’s the best place to start if you have questions about anything that wasn’t covered in this introduction.