Benjamin
Jun 6 36 tweets 11 min read
1/ Understanding the EVM – Simplified

You’ve probably heard of the Ethereum Virtual Machine (EVM), but if you’re like me, you find it more than slightly intimidating. Below is my attempt at a basic guide to what it is and how it works:
2/ To start, it's important to understand what a virtual machine is. In the context of Ethereum, a virtual machine is a program built into software (Ethereum clients) that recreates a computer's functionality.
3/ To understand how it works, let’s take a look into its use through the lens of an Ethereum validator by breaking down some key categories: Clients, Storage, Transactions, and Execution.
4/ Clients: To become a validator and produce blocks, you need to run the EVM to compute state transitions. To understand what state and state transitions are, consider that sending Ether requires a transition in the state of the blockchain:
5/ Tokens now belong to a different account than before, and this needs to be reflected in Ethereum's map of who owns what. To do this, validators need to take inputs/transactions, run their respective instructions in the EVM, and record the result in a block header.
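To make the idea of a state transition concrete, here is a minimal sketch in Python. It assumes the state is just a map from account name to balance, which is a big simplification; the names Transfer and apply_transfer are made up for illustration and are not part of any Ethereum client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    sender: str
    recipient: str
    value: int  # amount in wei


def apply_transfer(state: dict[str, int], tx: Transfer) -> dict[str, int]:
    """Return the new balance map; the old state is left untouched."""
    if state.get(tx.sender, 0) < tx.value:
        raise ValueError("insufficient balance")
    new_state = dict(state)
    new_state[tx.sender] = new_state.get(tx.sender, 0) - tx.value
    new_state[tx.recipient] = new_state.get(tx.recipient, 0) + tx.value
    return new_state


old_state = {"alice": 10, "bob": 0}
new_state = apply_transfer(old_state, Transfer("alice", "bob", 4))
print(new_state)  # → {'alice': 6, 'bob': 4}
```

The function takes an old state plus a transaction and returns a new state, which is the shape of every state transition, however complex the transaction.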
6/ This is so other nodes are aware of how the state has changed. A validator runs the EVM by downloading and running an Ethereum client, software that contains the EVM along with other necessary features:
7/ a) A memory pool (mempool), where signed transactions waiting to be included in a block are stored
b) A JSON-RPC API, which provides an interface for requests to read/write data to Ethereum
c) A client process, which sends transactions from the mempool to the EVM
8/ There are many different clients written in different languages, but they can interoperate with each other because they follow the same specifications in the Ethereum Yellow Paper.
9/ Storage: How does the EVM store state prior to a new transaction? Like a computer, a virtual computer is able to store data. In the context of the EVM, one of its critical functions is to store the “state” of all accounts and what information those accounts store.
10/ The EVM stores state according to a data structure called a “Merkle Patricia trie”, which is able to contain all the key:value pairs of all addresses on Ethereum.
11/ The keys correspond to the addresses of both externally owned accounts (EOAs) and smart contracts, and their respective values represent the current state of those addresses.
12/ The value/state for each address is itself an encoding of the hash of the address’s respective code, a hash of the data stored by the account, its balance, and the number of transactions it’s carried out (represented as a nonce).
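A rough sketch of what one of these key:value entries could look like in Python. The four fields are the ones described above; everything else is an assumption for illustration: SHA-256 stands in for Ethereum's Keccak-256, the fixed-width encode method stands in for the RLP encoding real clients use, and the addresses are made-up labels.

```python
import hashlib
from dataclasses import dataclass


def h(data: bytes) -> bytes:
    # Stand-in hash (assumption): Ethereum itself uses Keccak-256.
    return hashlib.sha256(data).digest()


@dataclass(frozen=True)
class Account:
    nonce: int           # number of transactions sent by this account
    balance: int         # balance in wei
    storage_root: bytes  # hash of the data stored by the account
    code_hash: bytes     # hash of the account's code

    def encode(self) -> bytes:
        # Hypothetical fixed-width encoding; real clients use RLP.
        return (
            self.nonce.to_bytes(8, "big")
            + self.balance.to_bytes(32, "big")
            + self.storage_root
            + self.code_hash
        )


EMPTY = h(b"")  # an EOA has no code and no stored data

state = {
    "0xuser": Account(nonce=3, balance=10**18, storage_root=EMPTY, code_hash=EMPTY),
    "0xcontract": Account(
        nonce=1,
        balance=0,
        storage_root=h(b"slot data"),
        code_hash=h(bytes.fromhex("6001600201")),
    ),
}

for address, account in state.items():
    print(address, account.encode().hex())
```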
13/ A Merkle Patricia trie is used to store this data because it makes it easy to perform hashes of all the key-value pairs to eventually get to a singular “Merkle Root Hash” of the state of who owns what – a required field in the block header of a validator’s proposed block.
14/ Because of how hashes work, even a minute change in blockchain state will result in a completely different root hash. The reason for mandating validators to include this root hash in a block header is that it significantly enhances the security of the network.
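Here is a small Python sketch of that "tiny change, totally different root" property. It is not a Merkle Patricia trie: it assumes a plain binary Merkle tree over sorted "address:balance" strings and uses SHA-256 instead of Keccak-256, purely to show the effect.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Stand-in hash (assumption): Ethereum itself uses Keccak-256.
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash pairs of nodes level by level until one root hash remains."""
    if not leaves:
        return h(b"")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def leaves_from_state(state: dict[str, int]) -> list[bytes]:
    # Sort by address so the same state always gives the same root.
    return [f"{addr}:{bal}".encode() for addr, bal in sorted(state.items())]


state = {"alice": 10, "bob": 0, "carol": 7}
root_before = merkle_root(leaves_from_state(state))

state["bob"] = 1  # a minute change in state
root_after = merkle_root(leaves_from_state(state))

print(root_before.hex())
print(root_after.hex())
```

The two printed hashes share nothing recognizable, even though only one balance changed by 1.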
15/ This is because it enables light nodes, which don’t have the space to store this Merkle Patricia trie, to verify the legitimacy of the block that a validator is attempting to proliferate.
16/ A light node can request a “Merkle proof” for an account key and its value (such as a balance) from a full node, then check that proof against the root hash in the block header. If any of the data were incorrect, the recomputed hash would not match the root.
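A sketch of how checking a Merkle proof can work, again in Python and again under simplifying assumptions: a binary Merkle tree rather than a Merkle Patricia trie, SHA-256 rather than Keccak-256, and made-up helper names (build_levels, make_proof, verify). The point is that the verifier only needs the leaf, a handful of sibling hashes, and the root.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Stand-in hash (assumption): Ethereum itself uses Keccak-256.
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Full node side: build every level of the tree, leaves first."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pad odd levels
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect the sibling hash at each level, plus whether it sits on the left."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Light node side: rebuild the path to the root from the leaf and siblings."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


leaves = [b"alice:10", b"bob:0", b"carol:7"]
levels = build_levels(leaves)
root = levels[-1][0]
proof = make_proof(levels, 2)  # proof for carol's entry

print(verify(b"carol:7", proof, root))    # → True
print(verify(b"carol:700", proof, root))  # → False
```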
17/ Transactions: When a user is submitting a signed transaction through a wallet, the transaction data is compiled into Bytecode and sent to a node using the aforementioned JSON-RPC API. Bytecode is the low-level language that the EVM reads to compute state transitions.
18/ Bytecode appears as a hex encoding of a string of binary. Individual bytes represent specific operations, known as opcodes, that the EVM will perform.
19/ Opcodes are instructions that the EVM follows to manipulate the input data/transaction on the stack (more on this later) as a means to change the state. HEX format is just a numeric system (like the binary and decimal system) that is used to convey binary in a readable way.
20/ The reason why opcodes are important is that when computed, they enable the EVM to find the output of the state transition requested by the transaction (more on this later).
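To show how hex bytecode maps to opcodes, here is a tiny disassembler sketch in Python. It only knows five opcodes (the byte values for STOP, ADD, MUL, MSTORE and PUSH1 are the real ones), and the function name disassemble is my own.

```python
# A small subset of the EVM opcode table.
OPCODES = {0x00: "STOP", 0x01: "ADD", 0x02: "MUL", 0x52: "MSTORE", 0x60: "PUSH1"}


def disassemble(bytecode_hex: str) -> list[str]:
    """Turn a hex string into a list of readable opcode names."""
    code = bytes.fromhex(bytecode_hex.removeprefix("0x"))
    out = []
    pc = 0  # position in the bytecode
    while pc < len(code):
        op = code[pc]
        name = OPCODES.get(op, f"UNKNOWN(0x{op:02x})")
        if name == "PUSH1":
            # PUSH1 is followed by one byte of data, not another opcode.
            operand = code[pc + 1 : pc + 2]
            out.append(f"PUSH1 0x{operand.hex()}")
            pc += 2
        else:
            out.append(name)
            pc += 1
    return out


print(disassemble("0x6001600201"))
# → ['PUSH1 0x01', 'PUSH1 0x02', 'ADD']
```

So the bytecode 0x6001600201 reads as "push 1, push 2, add them".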
21/ The bytecode is passed as an argument in a transaction that is broadcast to a node using the JSON-RPC API. After this, the transaction sits in the node's mempool with other unconfirmed transactions, ready to be included in a block by a validator.
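For a feel of what that JSON-RPC request looks like, here is a Python sketch that only builds the request body and does not send anything. eth_sendRawTransaction is the standard method for submitting a signed transaction; the helper name and the hex string are placeholders I made up, and the hex is not a valid signed transaction.

```python
import json


def build_send_raw_transaction(signed_tx_hex: str, request_id: int = 1) -> str:
    """Build the JSON-RPC 2.0 body a wallet would POST to a node."""
    payload = {
        "jsonrpc": "2.0",
        "method": "eth_sendRawTransaction",
        "params": [signed_tx_hex],
        "id": request_id,
    }
    return json.dumps(payload)


# Placeholder bytes, NOT a real signed transaction.
body = build_send_raw_transaction("0xabababab")
print(body)
```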
22/ When a validator picks the transactions it wants to include in a block, it will have to compute state transitions as defined by the operations/opcodes indicated by each transaction's respective bytecode. This is where the EVM's core functionality – execution – comes in.
23/ Execution: The state of all accounts is data that is stored permanently, but the EVM uses temporary memory to execute opcodes. To understand this, it's necessary to understand the two types of temporary memory used during opcode execution: the “stack” and “memory”.
24/ a) The stack is the data area where the computations as defined by opcodes are performed.
b) Memory is a byte array that temporarily holds data needed by the computations performed on the stack.
25/ When the client process passes a transaction from the mempool to the EVM, the EVM separates the bytecode into its opcodes and takes them in the sequence the bytecode specifies.
26/ The opcodes are executed one at a time in the sequence specified by the bytecode. The opcodes themselves aren't placed on the stack; the stack holds the values they operate on, with each opcode taking its inputs from the top of the stack and pushing its result back.
27/ Along the way, the EVM uses data/variables moved to memory to compute the current instruction. It also fetches information about the state from storage when an opcode requires it.
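Here is a toy version of that execution loop in Python, supporting only STOP, ADD, MSTORE and PUSH1. In this sketch a program counter walks the bytecode, the stack holds the values being worked on, and memory is a byte array. It is a simplification: it ignores gas, storage, and almost all of the real instruction set, and the function name run is my own.

```python
def run(bytecode_hex: str) -> tuple[list[int], bytes]:
    """Execute a tiny subset of EVM bytecode; return the final stack and memory."""
    code = bytes.fromhex(bytecode_hex)
    stack: list[int] = []
    memory = bytearray()
    pc = 0  # program counter: index of the next opcode
    while pc < len(code):
        op = code[pc]
        if op == 0x00:  # STOP
            break
        elif op == 0x60:  # PUSH1: push the next byte onto the stack
            stack.append(code[pc + 1])
            pc += 2
            continue
        elif op == 0x01:  # ADD: pop two values, push their sum (mod 2^256)
            a, b = stack.pop(), stack.pop()
            stack.append((a + b) % 2**256)
        elif op == 0x52:  # MSTORE: pop offset, then value; write a 32-byte word
            offset, value = stack.pop(), stack.pop()
            if len(memory) < offset + 32:
                memory.extend(bytes(offset + 32 - len(memory)))
            memory[offset : offset + 32] = value.to_bytes(32, "big")
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
        pc += 1
    return stack, bytes(memory)


# PUSH1 1, PUSH1 2, ADD, PUSH1 0, MSTORE
stack, memory = run("6001600201600052")
print(stack)                               # → []
print(int.from_bytes(memory[:32], "big"))  # → 3
```

After the run, the stack is empty and the sum sits in the first 32-byte word of memory.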
28/ Once all opcodes have been run, any resulting change in state has been written to permanent storage. It’s important to note that each opcode can change what the Ethereum state is, and each opcode has a specific gas cost.
29/ Once each opcode is run, the amount of gas expended in executing it is subtracted from the available gas specified when the user originally submitted the transaction.
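A sketch of that gas bookkeeping in Python. The cost table is a tiny illustrative subset, the names charge and OutOfGas are my own, and the sketch ignores the base cost every transaction pays before any opcode runs.

```python
# Illustrative gas costs for a handful of opcodes.
GAS_COST = {"PUSH1": 3, "ADD": 3, "MUL": 5, "STOP": 0}


class OutOfGas(Exception):
    """Raised when the next opcode costs more than the gas that is left."""


def charge(ops: list[str], gas_limit: int) -> int:
    """Subtract each opcode's cost from the gas limit; return the gas left over."""
    gas_left = gas_limit
    for op in ops:
        cost = GAS_COST[op]
        if cost > gas_left:
            raise OutOfGas(f"{op} needs {cost} gas, only {gas_left} left")
        gas_left -= cost
    return gas_left


program = ["PUSH1", "PUSH1", "ADD"]
print(charge(program, gas_limit=10))  # → 1

try:
    charge(program, gas_limit=8)
except OutOfGas as err:
    print("out of gas:", err)
```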
30/ If insufficient gas was sent to cover the cost of running the opcodes, execution stops and the transaction's state changes are reverted. The transaction is still included in the block as failed, and the user doesn’t get refunded for the gas used because the validator has already incurred the cost to run the computation.
31/ Once this process has been completed for all transactions in the mempool the validator wants to include in the block, the validator can compute the aforementioned root hash of the new state and include it in the block header.
32/ Though there are many other fields required to then proliferate the block to other nodes, this concludes the general explanation of how the EVM works through the lens of a validator.
33/ For deeper information not covered, I recommend reading the exceptional articles by @iam_preethi, @Luit_H, @noxx3xxon, @vasa_develop, @chiqing whose articles are the basis of this thread.
If there are any clarifications/corrections that need to be added, please let me know!
34/ Ethereum Virtual Machine (EVM) ethereum.org/en/developers/…
Nodes and clients | ethereum.org ethereum.org/en/developers/…

Getting Deep Into EVM: How Ethereum Works Backstage | by vasa | The Startup | Medium medium.com/swlh/getting-d….
35/ Ethereum EVM illustrated takenobu-hs.github.io/downloads/ethe…
Ethereum Merkle Patricia Trie Explained | by Leo Zhang | Medium medium.com/@chiqing/merkl…
EVM Deep Dives: The Path to Shadowy Super Coder 🥷 💻 - Part 1 noxx.substack.com/p/evm-deep-div…
Important clarification: the EVM is only run for transactions that involve a smart contract, for example when you’re depositing some tokens into Yearn or providing liquidity on Uniswap. A transaction between two EOAs transfers value with no need to execute any code!
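In code, that distinction comes down to whether the receiving address has any code attached. A minimal Python sketch, assuming a made-up map from address to contract code (the addresses and the function name runs_evm are placeholders):

```python
def runs_evm(code_at: dict[str, bytes], to: str) -> bool:
    """True if the recipient has code, meaning the EVM has something to execute."""
    return len(code_at.get(to, b"")) > 0


# Hypothetical addresses: one contract with code, and one EOA with none.
code_at = {"0xpool": bytes.fromhex("6001600201")}

print(runs_evm(code_at, "0xpool"))    # → True
print(runs_evm(code_at, "0xfriend"))  # → False
```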
