Hicdex: the Hic Et Nunc indexer
The HicEtNunc developer shut down the hicetnunc.xyz site. You can use one of the alternative marketplaces such as https://teia.art/.
Indexers play an essential role in providing quick and scalable access to the data persisted in blockchains. Indexers are the backbone of many apps and services in all blockchain ecosystems.
The hicdex developer, @marchingsquare, created it as an easy-to-use API for third-party developers to create websites with HEN data. However, it has become a critical backend for the HEN marketplace. Many third-party tools and sites use hicdex, including Cyber metaverse, NftBiker’s tools, and hen.radio.
The traffic to HEN is growing exponentially, placing a lot of strain on hicdex and its developer. Recently HEN experienced several outages, some of which were due to the high load on hicdex.
As part of the process, I wanted to understand better how to run an instance for myself and document the steps to help other developers.
At a high level, an indexer typically does the following:
- Listens to updates of the blockchain.
- Extracts data from each blockchain transaction.
- Persists the data it needs into a database.
- Provides an API with a query mechanism.
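To make these four steps concrete, here is a minimal sketch in Python. It is not how hicdex or DipDup is implemented; the block layout, the `mint` entrypoint name, and the `token` table are all hypothetical and exist only for illustration.

```python
import sqlite3

def make_db():
    # An in-memory database stands in for the indexer's real database.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE token (id INTEGER PRIMARY KEY, creator TEXT, level INTEGER)"
    )
    return db

def index_block(db, block):
    """Extract mint transactions from one block and persist them."""
    for tx in block["transactions"]:
        if tx.get("entrypoint") == "mint":  # hypothetical entrypoint name
            db.execute(
                "INSERT OR REPLACE INTO token (id, creator, level) VALUES (?, ?, ?)",
                (tx["token_id"], tx["sender"], block["level"]),
            )
    db.commit()

def tokens_by_creator(db, creator):
    """The 'API' step: answer a query from the indexed data."""
    rows = db.execute(
        "SELECT id FROM token WHERE creator = ? ORDER BY id", (creator,)
    )
    return [row[0] for row in rows]

# Hypothetical blocks, standing in for updates received from the chain.
blocks = [
    {"level": 100, "transactions": [
        {"entrypoint": "mint", "token_id": 1, "sender": "tz1alice"},
        {"entrypoint": "transfer", "token_id": 1, "sender": "tz1alice"},
    ]},
    {"level": 101, "transactions": [
        {"entrypoint": "mint", "token_id": 2, "sender": "tz1bob"},
        {"entrypoint": "mint", "token_id": 3, "sender": "tz1alice"},
    ]},
]

db = make_db()
for block in blocks:
    index_block(db, block)

alice_tokens = tokens_by_creator(db, "tz1alice")
```

The point of the sketch is that the query never touches the blockchain: it is answered entirely from the database that the indexing loop filled in.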
Since many clients could query the indexer’s database, scaling and rate-limiting mechanisms are also needed. Therefore it is typical for indexers to be hosted on cloud platforms that can scale rapidly.
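As an illustration of what rate limiting can look like, here is a small token-bucket sketch in Python. I am not claiming hicdex uses this approach; the class and its parameters are hypothetical, and the clock is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Allow short bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = now

    def allow(self, now):
        # Refill based on the time elapsed since the last call, capped at capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=1, capacity=2)
results = [
    bucket.allow(0.0),  # first request in the burst
    bucket.allow(0.0),  # second request in the burst
    bucket.allow(0.0),  # bucket is empty, request is rejected
    bucket.allow(1.0),  # one second later, one token has been refilled
]
```

In practice you would keep one bucket per client (for example, per IP address or API key) so that a single noisy client cannot use up everyone's capacity.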
The use of indexers is so broad that users might assume that various kinds of data are part of a service or app when, in fact, they come from an indexer. For example, your Tezos account doesn’t keep track of any of the tokens minted or owned. Tezos wallet apps use an indexer to track all the tokens associated with your account.
Hicdex is open-sourced under AGPLv3 and consists of these repositories:
- hicdex — the indexer.
- hicdex-graphiql — a GraphiQL explorer.
- hicdex-metadata — a cache of the HEN NFT (aka OBJKT) metadata.
Hicdex uses DipDup, a full-stack framework for building Tezos indexers. DipDup uses the TzKT API to access the Tezos blockchain data. DipDup, by default, uses a PostgreSQL database to store the indexed data.
Hicdex uses the Better Call Dev (BCD) API to query the HEN minter smart contract (OBJKT Swap v1) for the contract storage data of each token. The contract storage data includes a metadata URI, which should resolve to a JSON object describing the OBJKT.
Hicdex uses the Cloudflare IPFS gateway to retrieve the metadata of each OBJKT. Since this gateway does rate-limiting, hicdex caches the JSON metadata as files to a writable folder on local storage. Hicdex also attempts to fix any invalid metadata.
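The caching idea can be sketched in a few lines of Python. This is a simplified illustration, not hicdex's actual code: the function name, the one-file-per-CID layout, and the injected `fetch` callable are my assumptions. Passing the fetcher in makes it easy to swap the gateway or to try the cache without network access.

```python
import json
import tempfile
from pathlib import Path

def get_metadata(cid, cache_dir, fetch):
    """Return the metadata for an IPFS CID, calling `fetch` only on a cache miss."""
    path = Path(cache_dir) / f"{cid}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    metadata = fetch(cid)
    # Create the cache folder on demand instead of assuming it exists.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(metadata), encoding="utf-8")
    return metadata

# A fake fetcher that records how often the "gateway" is hit.
calls = []

def fake_fetch(cid):
    calls.append(cid)
    return {"name": "example", "cid": cid}

with tempfile.TemporaryDirectory() as tmp:
    first = get_metadata("QmExample", tmp, fake_fetch)
    second = get_metadata("QmExample", tmp, fake_fetch)
```

The second call is served from disk, so the rate-limited gateway is only contacted once per OBJKT.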
Caddy is used as a reverse proxy to expose the public endpoints for the API and the GraphQL explorer.
A note of caution: the hicdex repositories provide you with everything you need to get up and running, but it isn’t a turnkey solution for a production system. You will also need developer experience, preferably previous experience with indexing.
The new HEN hicdex servers currently get 5 million requests and use approximately 120–160 GB of bandwidth per day.
The size of the PostgreSQL database is 35–40 GB.
Your system needs to have enough bandwidth and local storage to download and cache metadata for about 300,000 OBJKTs (currently approximately 3 GB total).
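To put these numbers in perspective, here is some back-of-the-envelope arithmetic in Python. It assumes decimal gigabytes (1 GB = 10^9 bytes) and traffic spread evenly over the day, which real traffic with daily spikes is not, so treat the results as rough averages only.

```python
GB = 10**9  # assumption: decimal gigabytes
SECONDS_PER_DAY = 86_400

requests_per_day = 5_000_000
bandwidth_low, bandwidth_high = 120 * GB, 160 * GB
objkt_count = 300_000
metadata_total = 3 * GB

# Average request rate if the load were perfectly even.
requests_per_second = requests_per_day / SECONDS_PER_DAY

# Average response size implied by the daily bandwidth range.
bytes_per_request_low = bandwidth_low / requests_per_day
bytes_per_request_high = bandwidth_high / requests_per_day

# Average size of one cached metadata file.
bytes_per_objkt = metadata_total / objkt_count

print(f"{requests_per_second:.0f} requests/s on average")
print(f"{bytes_per_request_low / 1000:.0f}-{bytes_per_request_high / 1000:.0f} kB per request")
print(f"{bytes_per_objkt / 1000:.0f} kB of metadata per OBJKT")
```

That works out to roughly 58 requests per second, 24 to 32 kB per response, and about 10 kB of metadata per OBJKT.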
To install hicdex, you need to have the following:
I’ve created a doc with detailed steps for getting the code and configuring the Docker containers.
Some OBJKTs have broken metadata and will error out the indexing. There is a particular broken.json file with the OBJKT IDs to avoid. Get the broken.json file from GitHub.
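Conceptually, a skip list like this works as a simple filter. The sketch below assumes the file is a flat JSON array of OBJKT IDs, which is my assumption and may not match the real broken.json layout; the function names are hypothetical as well.

```python
import json

def load_broken_ids(text):
    """Parse the skip list; IDs are normalized to strings so 152 and "152" match."""
    return {str(item) for item in json.loads(text)}

def tokens_to_index(token_ids, broken_ids):
    """Keep only the tokens that are not on the skip list."""
    return [t for t in token_ids if str(t) not in broken_ids]

# Hypothetical file contents, mixing numeric and string IDs on purpose.
broken = load_broken_ids('[152, "301"]')
todo = tokens_to_index([151, 152, 153, 301], broken)
```

Here `todo` contains only the OBJKTs that are safe to process, so one bad metadata file cannot stop the whole run.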
I recommend that you run your own IPFS node to speed up the process for the initial bootstrapping since the default Cloudflare IPFS gateway is rate-limited.
Depending on your system, it can take 2–3 days to bootstrap the indexing. All the Tezos blocks since the beginning of the blockchain are processed in sequence until it reaches real-time.
During bootstrapping, you can determine the number of OBJKTs indexed by querying the GraphiQL endpoint or the PostgreSQL database.
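Here is a sketch of how such a count query could be sent from Python. It assumes a Hasura-style GraphQL endpoint and a table named `hic_et_nunc_token`; both the table name and the endpoint URL in the usage note are assumptions, so check the schema in your own GraphiQL explorer before relying on them.

```python
import json
import urllib.request

# Assumed table name; verify it against your instance's schema.
COUNT_QUERY = """
query TokenCount {
  hic_et_nunc_token_aggregate {
    aggregate { count }
  }
}
"""

def build_request(endpoint):
    """Build the POST request that carries the GraphQL query."""
    body = json.dumps({"query": COUNT_QUERY}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def parse_count(response_body):
    """Pull the count out of the JSON response."""
    data = json.loads(response_body)
    return data["data"]["hic_et_nunc_token_aggregate"]["aggregate"]["count"]

def indexed_token_count(endpoint):
    """Send the query and return the number of indexed tokens."""
    with urllib.request.urlopen(build_request(endpoint), timeout=30) as response:
        return parse_count(response.read())
```

With a local instance, you could call something like `indexed_token_count("http://localhost:8080/v1/graphql")` every few minutes to watch the bootstrap progress. The URL is a placeholder for wherever your own endpoint is exposed.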
If you plan to make your instance public, you will need a reverse proxy to expose the API and the GraphQL endpoints. Hicdex uses Caddy, but you can also use alternatives such as Nginx.
Once bootstrapping has been completed, I recommend that you make regular backups of the database to recover the indexer faster if a critical error occurs.
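As one possible way to do this, the sketch below builds a timestamped `pg_dump` command in Python. The database name `dipdup` and the backup folder are assumptions, and it presumes `pg_dump` can reach your database directly; with Docker you would likely need to run it inside the database container instead.

```python
from datetime import datetime, timezone

def pg_dump_command(dbname, backup_dir, now=None):
    """Return the argument list for a timestamped, custom-format dump."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    outfile = f"{backup_dir}/{dbname}-{stamp}.dump"
    return ["pg_dump", "--format=custom", f"--file={outfile}", f"--dbname={dbname}"]

# Fixed timestamp so the example is reproducible.
example = pg_dump_command(
    "dipdup",               # assumed database name
    "/var/backups/hicdex",  # assumed backup folder
    datetime(2021, 11, 1, 12, 0, 0, tzinfo=timezone.utc),
)
```

You can run the result with `subprocess.run(example, check=True)` from a scheduled job. A custom-format dump can later be restored with `pg_restore`.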
Installing and running hicdex isn’t a smooth process; you will have to get used to troubleshooting and learn how to recover from crashes.
To troubleshoot, get familiar with DipDup’s code, as there are bugs, and you might have to step in and hotfix the code.
One of the first issues I encountered was that the indexing process didn’t want to start correctly; indexing downloaded none of the metadata files. After looking at the hicdex code, I found a dependency on all the metadata folders needing to exist.
Occasionally hicdex would crash, which was pinned down to hicdex not running the latest version of DipDup. Once hicdex updated to the latest version, those crashes went away.
The unfortunate side effect of the crashes was that the indexing process would start from scratch after each crash. It’s not clear why DipDup would do that, so the re-indexing logic has been removed.
An outstanding issue that still causes crashes is blockchain rollbacks. Due to the proof-of-stake consensus algorithm that Tezos uses, there is the possibility of a chain reorganization. This means there was a brief fork in the blockchain, and one side won out over the other and became the chain that the network adopts. If you were on the fork that lost, you have incorrect data and have to roll back and pick up the new chain’s data. Fortunately, reorganizations are rare and getting even rarer with recent updates to the Tezos protocol.
DipDup doesn’t currently support handling rollbacks, but it is on their roadmap. So, at the moment, rollbacks crash hicdex, which then has to be manually restarted from a backed-up database; otherwise, reindexing will start from scratch.
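To illustrate what handling a rollback involves, here is a minimal Python sketch of reorganization detection. It is not DipDup's design; the class is hypothetical and only tracks block hashes. The idea is that each new block names its predecessor, and if that predecessor is not the block we currently consider the head, we were on the losing fork.

```python
class ChainTracker:
    """Track the chain as a list of (level, hash) and detect reorganizations."""

    def __init__(self):
        self.blocks = []

    def apply(self, level, block_hash, predecessor):
        """Apply a new block and return the levels that had to be rolled back."""
        rolled_back = []
        # Drop our blocks until the head matches the new block's predecessor.
        # Simplification: an unknown predecessor unwinds the whole list.
        while self.blocks and self.blocks[-1][1] != predecessor:
            rolled_back.append(self.blocks.pop()[0])
        self.blocks.append((level, block_hash))
        return rolled_back

tracker = ChainTracker()
tracker.apply(1, "A", None)
tracker.apply(2, "B", "A")
normal = tracker.apply(3, "C", "B")   # extends the chain, nothing to undo
reorg = tracker.apply(3, "C2", "B")   # competing block at level 3 wins
```

After the last call, `reorg` lists level 3 as rolled back. A real indexer would also have to delete or revert every database row that was written while processing the discarded blocks, which is the hard part.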
HEN is experiencing exponential growth, having the most active daily users ever and record-breaking secondary sales (see charts).
HEN recently switched over to the new load-balanced server farm with significantly better computing resources. The farm consists of 3 hicdex systems behind a load balancer. Together these systems have 26 cores, 100 GB memory, and 1.4 TB storage.
Here is a recent chart of how these servers are performing since taking over from hicdex.com:
The new servers are handling the growing traffic and daily spikes efficiently. The only production issue is the crashes related to rollbacks.
For hicdex.com, when a rollback happens, the last database snapshot is used to start a new indexer while the bad instance keeps running. Once the new indexer catches up, the traffic switches over to the new instance.
For the teztools.io server farm, one of the instances is always out of rotation and doing re-indexing. All automatic re-indexing is switched off.
One of the third-party apps recently accidentally overloaded the hicdex server with a large amount of traffic. This caused HEN to go down. The new farm is dedicated to the HEN marketplace to avoid third-party apps taking HEN down again.
Having the public instance of hicdex.com for third-party developers is still a vital resource for the HEN ecosystem, serving 4M API calls per day. The hicdex developer is still maintaining the software and running his public instance. He is adapting to new contracts such as HEN’s swap contract v2, the objkt.bid contracts and the upcoming support for split contracts for collaborative OBJKTs. He also works on new HEN features such as SUBJKTs (HEN profiles) and fraud detection.
Given HEN’s exponential growth, a scalable and reliable indexer has become essential. DipDup is missing critical features for HEN, but they have a roadmap to support some of the missing features.
The current server farm hosted by teztools.io is a temporary solution funded by some community members. This should give the HEN developers time to come up with a solution that is more future-proof. The community lacks expertise in indexers, except for the hicdex developer, who doesn’t want to be on point for HEN production issues anymore. Building more in-house expertise would help to make the site more robust.
HEN used to have its own backend API. It consumed information from some Tezos blockchain information providers, including Conseil and BCD, along with OBJKT metadata sourced from IPFS. It might make sense to take back ownership of such a critical backend, whether based on hicdex or not.
HEN has dependencies on various third-party APIs and services, many of which are free. However, these free tiers come with limitations. Given HEN’s exponential growth, it might make sense to instead pay for services that provide service-level agreements and support.
It is also important to empower third-party developers in the HEN ecosystem by having a public API and even an SDK that they can use to build their sites and tools. But this should not affect the reliability of the HEN marketplace and should probably have its own dedicated endpoint.
For me, it has been interesting to see how an experimental NFT platform like HEN has been coping with its tremendous growth and the kinds of technical issues it has to handle. The importance of an indexer is an eye-opener to me, and I wish the HEN developers the best in solving this critical problem.
If you want to learn how to use Hicdex in your app, try my tutorial: Hicdex image viewer web app. If you want to learn more about how HEN works, start reading my 3-part series on HEN smart contracts. You can follow my 3D art on HEN.