TLDR: we are assembling a bunch of public crypto datasets for researchers and tool builders
short thread
crypto has a unique relationship to data
- unprecedented levels of openness and detail
- trustless consensus on a single global state
- culture of open-source-first
but there's a massive problem
obtaining crypto datasets requires a lot of time, money, and/or expertise
this creates massive friction for any research that requires such datasets
are you trying to scrape large amounts of data from an EVM RPC node?
it's a simple task in theory, but in practice it can require a lot of book-keeping, boilerplate, and fault-tolerant infra
allow me to explain how ctc's new RPC client makes this process easy and efficient
ctc is a python library for collecting and analyzing EVM data
it implements an RPC client that is specifically designed for data scraping:
1. it is async-first, making it easy to orchestrate many concurrent queries
2. it utilizes JSON-RPC batching to maximize performance
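to make the batching idea concrete, here is a rough sketch of what a JSON-RPC batch looks like on the wire. this is not ctc's code or API: `build_batch` and `match_responses` are hypothetical helpers written just for this illustration, and the responses are simulated rather than fetched from a node. the only things assumed from the specs are that a batch is a JSON array of request objects, and that responses can come back in any order and are matched by `id`

```python
import json


def build_batch(block_numbers):
    """Build one JSON-RPC 2.0 batch: a list of request objects (hypothetical helper)."""
    return [
        {
            "jsonrpc": "2.0",
            "id": i,
            "method": "eth_getBlockByNumber",
            # block number as a hex string; False = only tx hashes, not full txs
            "params": [hex(n), False],
        }
        for i, n in enumerate(block_numbers)
    ]


def match_responses(requests, responses):
    """Return results in request order, matching each response by its id."""
    by_id = {response["id"]: response for response in responses}
    return [by_id[request["id"]].get("result") for request in requests]


batch = build_batch([100, 101, 102])
body = json.dumps(batch)  # a single HTTP POST body carrying all three requests

# simulated node reply, deliberately out of order
simulated = [
    {"jsonrpc": "2.0", "id": 2, "result": {"number": "0x66"}},
    {"jsonrpc": "2.0", "id": 0, "result": {"number": "0x64"}},
    {"jsonrpc": "2.0", "id": 1, "result": {"number": "0x65"}},
]
results = match_responses(batch, simulated)
```

the win is that three requests cost one network round trip instead of three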
being async-first gives you a low-effort way to parallelize RPC requests using standard python async idioms
this is particularly helpful when querying a remote RPC node, as network latency is usually the biggest bottleneck
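a minimal sketch of those async idioms, using only the python standard library. this is not ctc's API: `fake_rpc_call` is a hypothetical stand-in that sleeps to simulate network latency, and `max_concurrency` is an assumed knob for not overwhelming the node

```python
import asyncio
import time


async def fake_rpc_call(block_number, latency=0.05):
    """Hypothetical stand-in for one remote RPC request."""
    await asyncio.sleep(latency)  # simulates the network round trip
    return {"number": block_number}


async def fetch_blocks(block_numbers, max_concurrency=10):
    """Run many requests concurrently, capped at max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(block_number):
        async with semaphore:
            return await fake_rpc_call(block_number)

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(bounded(n) for n in block_numbers))


start = time.perf_counter()
blocks = asyncio.run(fetch_blocks(list(range(20))))
elapsed = time.perf_counter() - start
```

with 20 simulated requests at 50 ms each, running them one at a time would take about a second. with up to 10 in flight, the whole thing takes roughly two round trips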
new set of tools for performing eth_call's from the command line
[a thread with many screenshots and examples]
1 / 16
whether you work on smart contracts, frontend products, or data analysis, there **will** come a point where you need to check the outputs of on-chain contract functions
the easiest way to do this is using etherscan's "Read Contracts" tab
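for context, reading a contract function boils down to an `eth_call` JSON-RPC request. a rough sketch of that request, built by hand: `build_eth_call` and `decode_uint256` are hypothetical helpers for illustration, the token address is a zero-filled placeholder rather than a real contract, and the node's reply is simulated. `0x18160ddd` is the 4-byte selector for the ERC-20 `totalSupply()` function

```python
def build_eth_call(to_address, calldata, block="latest", request_id=1):
    """Build an eth_call JSON-RPC request body (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_call",
        "params": [{"to": to_address, "data": calldata}, block],
    }


def decode_uint256(result_hex):
    """Decode a single uint256 return value from a hex-encoded result."""
    return int(result_hex, 16)


TOTAL_SUPPLY_SELECTOR = "0x18160ddd"  # selector for totalSupply()
TOKEN_ADDRESS = "0x" + "00" * 20  # placeholder, not a real token contract

payload = build_eth_call(TOKEN_ADDRESS, TOTAL_SUPPLY_SELECTOR)

# simulated node reply: a 32-byte word encoding the integer 42
simulated_result = "0x" + "00" * 31 + "2a"
total_supply = decode_uint256(simulated_result)
```

functions that take arguments also need those arguments ABI-encoded and appended after the selector, which is the tedious part that tooling handles for you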
2 / 16
but as you do this more and more, you might want something that's a little faster and a little more programmatic
a natural choice is cli tools
there are many great tools for performing eth_call's from the cli, including:
- seth (dapptools)
- cast (foundry)
- ethereal