Understanding the Ethereum Virtual Machine (EVM) and How It Works

The Ethereum Virtual Machine, or EVM, is the runtime environment that executes all transactions and smart contracts on the Ethereum blockchain. It is a globally accessible, decentralized computer that ensures consistent execution of code across all network nodes.

In this guide, we will explore the inner workings of the EVM, from processing transactions to executing smart contract code. Whether you are a developer, blockchain enthusiast, or simply curious about how Ethereum operates, this article will provide a clear and structured overview.

How the EVM Processes Transactions

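When a transaction is submitted to the Ethereum network, the client converts it into a Message object and passes it to the EVM for execution. (The object names used in this guide, such as Message and StateDB, follow the go-ethereum client.) The exact process depends on the type of transaction: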

For a simple transfer of ETH, the EVM directly updates the account balances in the state database (StateDB).
For smart contract creation or interaction, the EVM interpreter loads and executes the contract's bytecode. During execution, the contract may read from or modify the StateDB.
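
The two paths above can be sketched roughly as follows. This is a simplified model, not client code: `Account`, `Message`, `apply_message`, and the `run_code` callback are hypothetical names, and the state database is reduced to a plain dictionary.

```python
from dataclasses import dataclass


@dataclass
class Account:
    balance: int = 0
    code: bytes = b""  # empty for externally owned accounts


@dataclass
class Message:
    sender: str
    to: str
    value: int
    data: bytes = b""


def apply_message(state: dict, msg: Message, run_code) -> str:
    """Route a message and report which path was taken."""
    sender = state[msg.sender]
    recipient = state.setdefault(msg.to, Account())
    if sender.balance < msg.value:
        raise ValueError("insufficient balance")
    # ETH moves on both paths.
    sender.balance -= msg.value
    recipient.balance += msg.value
    if not recipient.code:
        return "transfer"  # plain transfer: only balances change
    # Contract call: hand the bytecode and input data to the interpreter.
    run_code(recipient.code, msg.data, state)
    return "contract"
```

Passing the interpreter in as `run_code` keeps the routing decision separate from bytecode execution.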

Intrinsic Gas: The Base Cost of Transactions

Every transaction on Ethereum must pay a minimum amount of gas before any code runs, known as intrinsic gas. (This is unrelated to the per-gas "base fee" used in fee pricing.) It is calculated as follows:

If the transaction carries no additional data (e.g., a standard ETH transfer), the fixed cost is 21,000 gas.
If the transaction includes a data payload, each byte of data incurs an additional cost on top of the 21,000 gas:
- 4 gas per zero byte.
- 16 gas per non-zero byte (68 gas before the Istanbul upgrade, which lowered it through EIP-2028).

This is why many developers optimize smart contracts to reduce non-zero bytes in transaction data, ultimately lowering gas fees.
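
As a worked example, the calculation can be written in a few lines. This sketch covers only the base cost and the per-byte data cost described above. The function name `intrinsic_gas` is illustrative, and the default of 16 gas per non-zero byte assumes the post-Istanbul (EIP-2028) rule; pass 68 to reproduce the earlier rule.

```python
TX_BASE_GAS = 21_000
ZERO_BYTE_GAS = 4


def intrinsic_gas(data: bytes, nonzero_byte_gas: int = 16) -> int:
    """Base cost plus the per-byte cost of the transaction's data payload."""
    zeros = data.count(0)
    nonzeros = len(data) - zeros
    return TX_BASE_GAS + zeros * ZERO_BYTE_GAS + nonzeros * nonzero_byte_gas


# A payload of two zero bytes and one non-zero byte:
print(intrinsic_gas(bytes([0, 0, 1])))  # 21000 + 2*4 + 1*16 = 21024
```

The gap between the two per-byte prices is the reason zero-heavy calldata is cheaper to submit.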


Creating a Contract Object

The Message object derived from a transaction is used to create a Contract object within the EVM. This object contains:

The contract’s address.
The contract’s bytecode (loaded from the StateDB).
The input data provided for execution.

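The gas available for contract execution is the transaction’s gas limit minus the intrinsic gas already charged. A transaction’s gas limit cannot exceed the block gas limit, so every execution is bounded, which prevents infinite loops and excessive resource consumption.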
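
A minimal sketch of such an object is shown below. The field names and the `new_contract` helper are hypothetical, the code store is a plain dictionary standing in for the StateDB, and the sketch assumes the gas handed to execution is the transaction's gas limit minus its intrinsic gas.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    address: str
    code: bytes        # bytecode loaded from the state database
    input_data: bytes  # calldata supplied by the transaction
    gas: int           # gas left for execution


def new_contract(code_store: dict, to: str, data: bytes,
                 tx_gas_limit: int, intrinsic: int) -> Contract:
    """Build the execution object for a call to `to`."""
    if tx_gas_limit < intrinsic:
        raise ValueError("gas limit does not cover intrinsic gas")
    return Contract(
        address=to,
        code=code_store.get(to, b""),
        input_data=data,
        gas=tx_gas_limit - intrinsic,
    )
```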

Inside the EVM Interpreter

The EVM is a stack-based virtual machine. Its interpreter manages four key components during execution:

Program Counter (PC): Points to the current instruction in the bytecode.
Stack: A last-in-first-out (LIFO) structure with a maximum depth of 1,024 entries. Each entry is 256 bits wide.
Memory: A volatile byte array used for temporary data storage during execution.
Gas: A counter that tracks the remaining gas available for the transaction. If gas is exhausted, execution fails and the state changes made during that execution are reverted.
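
These four components map naturally onto a small data structure. The following is an illustrative sketch only (the `Frame` class and its method names are made up for this example); it enforces the 1,024-entry stack limit, wraps values to 256 bits, and fails when gas runs out.

```python
from dataclasses import dataclass, field

STACK_LIMIT = 1024
WORD_MASK = (1 << 256) - 1  # every stack entry is a 256-bit word


class OutOfGas(Exception):
    pass


class StackError(Exception):
    pass


@dataclass
class Frame:
    code: bytes
    gas: int
    pc: int = 0
    stack: list = field(default_factory=list)
    memory: bytearray = field(default_factory=bytearray)

    def push(self, value: int) -> None:
        if len(self.stack) >= STACK_LIMIT:
            raise StackError("stack overflow")
        self.stack.append(value & WORD_MASK)  # wrap to 256 bits

    def pop(self) -> int:
        if not self.stack:
            raise StackError("stack underflow")
        return self.stack.pop()

    def use_gas(self, amount: int) -> None:
        if amount > self.gas:
            raise OutOfGas()
        self.gas -= amount
```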

The Opcode Execution Cycle

Each operation in the EVM is represented by a one-byte opcode, allowing for up to 256 possible instructions. The execution cycle follows these steps:

The interpreter reads the opcode at the position given by the PC in the contract bytecode and advances the PC.
The interpreter looks up the corresponding operation in a jump table.
The gas cost for the operation is calculated and deducted from the gas pool.
If sufficient gas remains, the operation is executed. This may involve reading/writing to the stack, memory, or StateDB.
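
The cycle above can be sketched as a short loop. This toy interpreter supports only four opcodes (STOP, ADD, MUL, PUSH1) with their Yellow Paper gas costs; the dictionary-based machine state and the function names are simplifications invented for this example, not how a production client is structured.

```python
WORD_MASK = (1 << 256) - 1


def op_stop(vm):
    vm["running"] = False


def op_add(vm):
    vm["stack"].append((vm["stack"].pop() + vm["stack"].pop()) & WORD_MASK)


def op_mul(vm):
    vm["stack"].append((vm["stack"].pop() * vm["stack"].pop()) & WORD_MASK)


def op_push1(vm):
    pc, code = vm["pc"], vm["code"]
    vm["stack"].append(code[pc] if pc < len(code) else 0)  # 1-byte immediate
    vm["pc"] += 1


# opcode -> (gas cost, handler)
JUMP_TABLE = {
    0x00: (0, op_stop),
    0x01: (3, op_add),
    0x02: (5, op_mul),
    0x60: (3, op_push1),
}


def run(code: bytes, gas: int) -> dict:
    vm = {"code": code, "pc": 0, "stack": [], "gas": gas, "running": True}
    while vm["running"] and vm["pc"] < len(code):
        opcode = code[vm["pc"]]              # 1. read the opcode at the PC
        vm["pc"] += 1
        if opcode not in JUMP_TABLE:         # 2. look up the operation
            raise ValueError(f"unsupported opcode {opcode:#04x}")
        cost, execute = JUMP_TABLE[opcode]
        if cost > vm["gas"]:                 # 3. charge gas
            raise RuntimeError("out of gas")
        vm["gas"] -= cost
        execute(vm)                          # 4. execute
    return vm


# PUSH1 2, PUSH1 3, ADD, PUSH1 4, MUL, STOP  ->  (2 + 3) * 4
result = run(bytes.fromhex("600260030160040200"), gas=100)
print(result["stack"], result["gas"])  # [20] 83
```

The program spends 3 + 3 + 3 + 3 + 5 = 17 gas, leaving 83 of the 100 supplied.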

Common opcodes include arithmetic operations, memory management, and storage access. For a full list of opcodes and their gas costs, developers typically consult the Ethereum Yellow Paper or an online opcode reference.

How Contract Functions Are Called

When a transaction calls a smart contract function, the input data includes:

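A 4-byte function selector: The first four bytes of the Keccak-256 hash of the canonical function signature, that is, the function name followed by its parameter types in parentheses (for example, transfer(address,uint256)).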
The function arguments: Encoded parameters required by the function.
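
To make the layout concrete, the sketch below assembles the input data for a call to transfer(address,uint256). The selector 0xa9059cbb is the well-known value for that signature and is hard-coded here rather than computed, because Python's standard library has no Keccak-256 (hashlib.sha3_256 is a different function). The helper name `encode_transfer_call` is illustrative, and only fixed-size arguments are handled.

```python
TRANSFER_SELECTOR = bytes.fromhex("a9059cbb")  # transfer(address,uint256)


def encode_transfer_call(recipient: bytes, amount: int) -> bytes:
    """Selector followed by each argument padded to a 32-byte word."""
    if len(recipient) != 20:
        raise ValueError("address must be 20 bytes")
    if not (0 <= amount < (1 << 256)):
        raise ValueError("amount must fit in 256 bits")
    return (
        TRANSFER_SELECTOR
        + recipient.rjust(32, b"\x00")   # address, left-padded with zeros
        + amount.to_bytes(32, "big")     # uint256, big-endian
    )


calldata = encode_transfer_call(bytes.fromhex("11" * 20), 1000)
print(len(calldata))  # 4 + 32 + 32 = 68
```

Most of those 68 bytes are zero padding, which ties back to the per-byte pricing of transaction data.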

During compilation, the compiler places dispatcher logic at the start of the contract's runtime bytecode. This code uses the CALLDATALOAD opcode to read the first four bytes of the input data and compares them against the selectors of the contract's known functions. If a match is found, execution jumps to the corresponding function code.
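
In Python terms, the dispatcher behaves roughly like the lookup below. It is a sketch of the idea rather than of real compiler output: the selector table holds two well-known ERC-20 selectors as sample entries, `calldataload` imitates the opcode by reading a zero-padded 32-byte word, and the "fallback" result is a stand-in for whatever the contract does when nothing matches.

```python
# selector -> function to jump to (sample entries)
SELECTORS = {
    bytes.fromhex("70a08231"): "balanceOf(address)",
    bytes.fromhex("a9059cbb"): "transfer(address,uint256)",
}


def calldataload(calldata: bytes, offset: int) -> int:
    """Read a 32-byte word from calldata, zero-padded past the end."""
    word = calldata[offset:offset + 32].ljust(32, b"\x00")
    return int.from_bytes(word, "big")


def dispatch(calldata: bytes) -> str:
    # Load word 0 and keep only its top 4 bytes (shift right by 224 bits).
    selector = (calldataload(calldata, 0) >> 224).to_bytes(4, "big")
    return SELECTORS.get(selector, "fallback")


print(dispatch(bytes.fromhex("a9059cbb") + bytes(64)))
# transfer(address,uint256)
```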

Data Loading Opcodes

Smart contracts use several opcodes to access data:

CALLDATALOAD: Loads input data onto the stack.
CALLDATACOPY: Copies input data to memory.
CODECOPY: Copies the current contract’s bytecode to memory.
EXTCODECOPY: Copies an external contract’s bytecode to memory (for example, to inspect another contract’s code).

These instructions enable contracts to interact with internal and external data efficiently.
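
The three copy opcodes share one pattern: take a slice of some source (input data or code) and write it into memory. The helper below is a simplified sketch of that pattern under a hypothetical name, `copy_to_memory`; it grows memory in 32-byte words and zero-fills any part of the slice that lies past the end of the source. Gas accounting is left out.

```python
def copy_to_memory(memory: bytearray, dest: int, source: bytes,
                   offset: int, size: int) -> None:
    """Copy source[offset:offset+size] into memory at dest, zero-padded."""
    if size == 0:
        return
    end = dest + size
    if end > len(memory):
        # Memory expands in whole 32-byte words.
        new_len = (end + 31) // 32 * 32
        memory.extend(bytes(new_len - len(memory)))
    chunk = source[offset:offset + size].ljust(size, b"\x00")
    memory[dest:end] = chunk


memory = bytearray()
copy_to_memory(memory, 0, b"\x01\x02\x03", 1, 4)
print(memory[:4].hex(), len(memory))  # 02030000 32
```

Whether `source` is the call's input data, the running contract's code, or another account's code is what distinguishes CALLDATACOPY, CODECOPY, and EXTCODECOPY in this model.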

Smart Contract Interactions

A contract can call another contract using one of four opcodes:

CALL
CALLCODE (deprecated in favor of DELEGATECALL)
DELEGATECALL
STATICCALL

Each opcode offers different semantics regarding context, state modification, and gas allocation. For example, a standard CALL creates a new execution environment for the called contract, with separate stack and memory spaces. After the called contract finishes, its return data is placed in the caller’s memory, and execution resumes.
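
The difference in context can be illustrated with a small model. Everything here is hypothetical scaffolding (`World`, `Context`, and Python callables standing in for bytecode); the only point is which account's storage is written and which address the called code sees as its sender. Gas, value transfer, and return data are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    code: object = None  # a Python callable standing in for bytecode
    storage: dict = field(default_factory=dict)


@dataclass
class Context:
    address: str          # account whose storage is read and written
    sender: str           # what the running code sees as its caller
    static: bool = False  # True inside a STATICCALL


class World:
    def __init__(self):
        self.accounts = {}

    def sstore(self, ctx, key, value):
        if ctx.static:
            raise PermissionError("state change inside STATICCALL")
        self.accounts[ctx.address].storage[key] = value

    def _run(self, target, ctx):
        return self.accounts[target].code(self, ctx)

    def call(self, ctx, target):
        # Callee runs against its own storage; the caller becomes the sender.
        return self._run(target, Context(target, ctx.address, ctx.static))

    def callcode(self, ctx, target):
        # Callee's code, caller's storage; sender is the calling contract.
        return self._run(target, Context(ctx.address, ctx.address, ctx.static))

    def delegatecall(self, ctx, target):
        # Callee's code, caller's storage; the original sender is preserved.
        return self._run(target, Context(ctx.address, ctx.sender, ctx.static))

    def staticcall(self, ctx, target):
        # Like call, but any state modification fails.
        return self._run(target, Context(target, ctx.address, True))
```

For instance, if library code records its sender under an "owner" key, calling it with `call` from a proxy writes the proxy's address into the library's own storage, while `delegatecall` writes the original sender into the proxy's storage.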


Creating a New Smart Contract

A transaction with a nil recipient address is interpreted as a contract creation request. The process involves:

Generating a contract address: The address is derived from the sender’s address and nonce by computing Keccak(RLP(sender_address, nonce))[12:] (last 20 bytes).
Creating a state object: A new state object is created for the contract address in the StateDB.
Storing the code: The transaction's data is executed as initialization code, and the bytecode it returns is stored as the contract's code. Once stored, this code cannot be changed.
Initializing storage: The contract’s storage trie is initialized. Future state changes are made via SSTORE instructions.
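
The address derivation in the first step can be sketched as follows. Python's standard library does not provide Keccak-256 (hashlib.sha3_256 uses different padding and gives different digests), so this sketch takes the hash function as a parameter, `keccak256`, which a third-party implementation would have to supply. The RLP helpers are minimal and only handle the short inputs needed here; all function names are illustrative.

```python
def rlp_encode_bytes(item: bytes) -> bytes:
    """RLP for a byte string of at most 55 bytes."""
    if len(item) == 1 and item[0] < 0x80:
        return item  # a single small byte encodes as itself
    if len(item) > 55:
        raise ValueError("long strings are not needed in this sketch")
    return bytes([0x80 + len(item)]) + item


def rlp_encode_int(value: int) -> bytes:
    """Integers are big-endian with no leading zeros; zero is the empty string."""
    raw = value.to_bytes((value.bit_length() + 7) // 8, "big")
    return rlp_encode_bytes(raw)


def create_address(sender: bytes, nonce: int, keccak256) -> bytes:
    """Keccak(RLP([sender, nonce]))[12:], with the hash function injected."""
    payload = rlp_encode_bytes(sender) + rlp_encode_int(nonce)
    encoded = bytes([0xC0 + len(payload)]) + payload  # short-list prefix
    return keccak256(encoded)[12:]  # last 20 of the 32 hash bytes
```

Because the inputs are only the sender address and its nonce, the resulting address can be computed before the creation transaction is even sent.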

Gas Calculation and Optimization

Gas costs for EVM operations are defined in the Ethereum Yellow Paper and updated by later protocol upgrades. Developers can also inspect the source code of a client implementation (e.g., the gas.go and gas_table.go files in go-ethereum) for the exact values currently in use.

Understanding gas costs is essential for writing efficient smart contracts and minimizing transaction fees.

Frequently Asked Questions

What is the EVM?
The Ethereum Virtual Machine is a decentralized virtual machine that executes smart contracts on the Ethereum blockchain. It is often described as Turing-complete, although in practice every computation is bounded by the gas supplied. It ensures that all nodes process transactions consistently and securely.

How does gas work in the EVM?
Gas is a unit of computational effort. Users pay gas fees to compensate nodes for executing transactions or smart contracts. More complex operations require more gas.

What are opcodes?
Opcodes are low-level instructions that the EVM interpreter executes. Each opcode represents a specific operation, such as arithmetic, memory access, or control flow.

How are contract addresses generated?
Contract addresses are derived from the creator’s address and nonce using a Keccak hash function. This ensures each address is unique and deterministic.

Can smart contracts be updated after deployment?
No. Once deployed, a contract’s bytecode is immutable. However, developers can design upgradeable contracts using proxy patterns or state separation techniques.

What is the difference between CALL and DELEGATECALL?
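CALL executes the target contract's code in that contract's own context, with its own storage. DELEGATECALL runs the target's code in the context of the caller: it reads and writes the caller's storage and keeps the original sender and value of the call. This is useful for library contracts and proxy patterns.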

Conclusion

The Ethereum Virtual Machine is the heart of the Ethereum ecosystem, enabling secure and deterministic execution of smart contracts. By understanding its architecture, gas model, and execution flow, developers can write more efficient contracts and users can better appreciate the technology behind decentralized applications.