Ethereum's scaling strategy has evolved significantly, shifting from early concepts like sharding and layer-two protocols to today's rollup-centric roadmap. This new approach establishes a clear division of labor: Ethereum's Layer 1 (L1) serves as a robust, decentralized base layer, while Layer 2 (L2) solutions handle ecosystem expansion. Recent achievements include EIP-4844 blobs that increased L1 data bandwidth and multiple EVM rollups reaching Stage 1 maturity. Future goals target 100,000+ TPS while preserving L1 decentralization, ensuring some L2s fully inherit Ethereum's core properties, and maximizing interoperability between L2s.
Understanding Ethereum's Scaling Evolution
Initially, Ethereum's roadmap featured two primary scaling strategies. The first approach, often referenced in early papers from 2015, was "sharding" - where each node would only need to validate and store a small portion of transactions rather than the entire chain. This mirrors how peer-to-peer networks like BitTorrent operate. The second strategy involved layer-two protocols: networks built on top of Ethereum that benefit from its security while keeping most data and computation off the main chain.
The concept of "layer-two" has evolved significantly over time. In 2015, it primarily referred to state channels. By 2017, the focus shifted to Plasma, and by 2019, the community converged on rollups as the most promising solution. Rollups proved more powerful than previous approaches but required substantial on-chain data bandwidth. Fortunately, sharding research had by then solved the problem of大规模验证"数据可用性"的问题. These parallel developments led to the current rollup-centric roadmap that remains Ethereum's primary scaling strategy today.
The Surge represents Ethereum's commitment to scaling while maintaining decentralization. This division of labor mirrors patterns found throughout society: court systems (L1) aren't designed for maximum speed but for protecting rights and enforcing contracts, while entrepreneurs (L2) build upon this solid foundation to drive innovation and growth.
Key Objectives of The Surge
- Achieve 100,000+ transactions per second (TPS) across L1 and L2 combined
- Maintain L1 decentralization and robustness
- Ensure at least some L2 solutions fully inherit Ethereum's core properties: trustlessness, openness, and censorship resistance
- Maximize interoperability between L2s, creating a unified ecosystem rather than dozens of separate chains
Data Availability Sampling: The Foundation of Scaling
The Current Challenge
Since the Dencun upgrade went live on March 13, 2024, the Ethereum blockchain includes three approximately 125 kB "blobs" per slot, providing about 375 kB of data availability bandwidth every 12 seconds. Assuming transaction data is published directly on-chain, with an ERC20 transfer requiring about 180 bytes, the maximum TPS for rollups on Ethereum becomes:
375000 / 12 / 180 = 173.6 TPS
Adding Ethereum's calldata (theoretical maximum: 30 million gas per slot / 16 gas per byte = 1,875,000 bytes per slot) brings this number to 607 TPS. With PeerDAS, the plan is to increase the target blob count to 8-16, which would achieve 463-926 TPS using calldata.
While significant compared to Ethereum L1 alone, this remains insufficient for mass adoption. Our mid-term target is 16 MB per slot, which combined with rollup data compression improvements could deliver approximately 58,000 TPS.
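As a quick sanity check, the arithmetic behind these estimates can be reproduced in a few lines. The sketch below assumes the 12-second slot time and ~180-byte ERC20 transfer size stated above, and ignores compression; it recomputes the blob-only and 16 MB figures.

```python
# Back-of-envelope rollup throughput, using only parameters quoted in the text.
SLOT_SECONDS = 12
ERC20_TRANSFER_BYTES = 180  # approximate on-chain footprint of one transfer

def tps(bytes_per_slot: float, bytes_per_tx: float = ERC20_TRANSFER_BYTES) -> float:
    """Transactions per second given a per-slot data budget."""
    return bytes_per_slot / SLOT_SECONDS / bytes_per_tx

blob_bandwidth = 3 * 125_000  # three ~125 kB blobs per slot since Dencun

print(f"blobs only:   {tps(blob_bandwidth):8.1f} TPS")   # ~173.6
print(f"16 MB target: {tps(16_000_000):8.1f} TPS")       # ~7407 before compression
```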
How Data Availability Sampling Works
PeerDAS represents a relatively simple implementation of "one-dimensional sampling." Each blob in Ethereum is a degree-4096 polynomial over a 253-bit prime field. The network broadcasts "shares" of the polynomial, with each share consisting of 16 evaluations at 16 adjacent coordinates, drawn from a total of 8192 coordinates. Any 4096 of the 8192 evaluations (under currently proposed parameters: any 64 of the 128 possible samples) can recover the entire blob.
PeerDAS works by having each client listen to a small number of subnets, where the i-th subnet broadcasts the i-th sample of any blob, and obtains blobs on other subnets it needs by querying peers in the global p2p network who listen to different subnets. A more conservative version, SubnetDAS, uses only the subnet mechanism without the additional layer of querying peers. The current proposal is for proof-of-stake nodes to use SubnetDAS while other nodes (i.e., "clients") use PeerDAS.
Theoretically, we can extend one-dimensional sampling fairly far: if we increase the maximum blob count to 256 (targeting 128), we would reach the 16 MB goal while keeping the data availability sampling cost per node at just 16 samples × 128 blobs × 512 bytes per sample per blob = 1 MB of data bandwidth per slot. This is at the edge of our tolerance: possible, but it means bandwidth-constrained clients cannot sample. We can optimize this by reducing blob count and increasing blob size, but this makes reconstruction more expensive.
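To make the sampling guarantee concrete, here is a toy model (not the actual PeerDAS procedure) of a single blob under the parameters above: 128 samples, any 64 of which suffice to reconstruct. It estimates the probability that a client's random queries all come back even though the blob as a whole is unrecoverable.

```python
# Toy availability-sampling model; treats one blob in isolation and assumes
# the attacker publishes a fixed subset of its 128 samples.
from math import comb

TOTAL_SAMPLES = 128      # columns of the erasure-extended blob
RECOVERY_THRESHOLD = 64  # any 64 samples are enough to reconstruct

def miss_probability(available: int, queries: int) -> float:
    """Chance that `queries` distinct random samples all succeed even though
    fewer than RECOVERY_THRESHOLD samples exist (data is unrecoverable)."""
    if available >= RECOVERY_THRESHOLD or queries > available:
        return 0.0
    return comb(available, queries) / comb(TOTAL_SAMPLES, queries)

# Worst case for detection: the attacker publishes 63 of 128 samples.
for q in (8, 16, 30):
    print(f"{q:2d} queries -> miss probability ~ {miss_probability(63, q):.2e}")
```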
The ultimate solution involves implementing two-dimensional sampling, which performs random sampling not just within blobs but across blobs. The linear properties of KZG commitments are used to "extend" the set of blobs in a block, generating a series of new "virtual blobs" that redundantly encode the same information.
Crucially, computing the extension of commitments doesn't require having the blobs, making this scheme fundamentally suitable for distributed block building. The node actually building the block only needs to have the KZG commitments of the blobs and can itself rely on DAS to verify blob availability. One-dimensional DAS is also naturally suitable for distributed block building.
Data Compression: Doing More With Less
The Data Efficiency Problem
Each transaction in a rollup consumes significant on-chain data space: an ERC20 transfer requires approximately 180 bytes. Even with ideal data availability sampling, this limits layer-two scalability. At 16 MB per slot, we get:
16000000 / 12 / 180 = 7407 TPS
What if we could address not just the numerator (data availability) but also the denominator, making each transaction in a rollup consume fewer bytes on-chain?
Compression Techniques
The simplest gains come from zero-byte compression: replacing long sequences of zero bytes with two bytes indicating how many consecutive zero bytes follow (a minimal sketch appears after the list below). To go further, we leverage specific properties of transactions:
- Signature aggregation - Switching from ECDSA signatures to BLS signatures, which have the property that multiple signatures can be combined into a single signature that proves the validity of all originals. While computationally expensive to verify even when aggregated, this makes sense in data-scarce environments like L2. ERC-4337's aggregation capabilities provide one pathway to implement this.
- Replacing addresses with pointers - If an address has been used before, we can replace the 20-byte address with a 4-byte pointer to its position in history. This is necessary for maximum gains, though implementing it requires effort since it requires (at least partially) making the blockchain's history effectively part of the state.
- Custom serialization for transaction values - Most transaction values involve small numbers, like 0.25 ETH represented as 250,000,000,000,000,000 wei. Gas max base fees and priority fees follow a similar pattern. Thus, we can use custom decimal floating-point formats, or even dictionaries for especially common values, to represent most currency values very compactly.
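As an illustration of the zero-byte compression idea above, here is a minimal sketch. The 0x00 marker and the one-byte run-length cap are illustrative choices, not any rollup's actual wire format.

```python
# Replace each run of zero bytes with a 0x00 marker followed by the run length.
# Decoding is unambiguous because every 0x00 in the output starts such a pair.
def compress_zero_runs(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0:
            run = 1
            while i + run < len(data) and data[i + run] == 0 and run < 255:
                run += 1
            out += bytes([0x00, run])  # two bytes encode the whole run
            i += run
        else:
            out.append(data[i])
            i += 1
    return bytes(out)

# An ABI-encoded ERC20 transfer is mostly zero padding, so it shrinks a lot:
# 4-byte selector + 32-byte padded address + 32-byte amount (0.25 ETH in wei).
calldata = (bytes.fromhex("a9059cbb")
            + b"\x00" * 12 + b"\x11" * 20
            + (250_000_000_000_000_000).to_bytes(32, "big"))
print(len(calldata), "->", len(compress_zero_runs(calldata)))  # 68 -> 36
```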
Generalized Plasma: Extreme Scaling Solutions
Beyond Conventional Rollups
Even with 16 MB blobs and data compression, 58,000 TPS may not be enough to fully take over consumer payments, decentralized social media, or other high-bandwidth domains, especially once privacy requirements are factored in, which could reduce scalability by 3-8x. For high-transaction-volume, low-value applications, one current option is validium, which keeps data off-chain with an interesting security model: operators cannot steal user funds but might disappear and temporarily or permanently freeze all user assets. But we can do better.
How Plasma Works
Plasma is a scaling solution where operators publish blocks off-chain and put Merkle roots of these blocks on-chain (unlike rollups, which put full blocks on-chain). For each block, the operator sends each user a Merkle branch proving what did or did not happen to that user's assets. Users can withdraw their assets by providing Merkle branches. Importantly, this branch does not need to be rooted in the latest state - thus, even if data availability fails, users can still recover their assets by withdrawing from the latest available state they have. If users submit invalid branches (e.g., exiting assets already sent to someone else, or operators creating assets out of thin air), an on-chain challenge mechanism can adjudicate who the assets should belong to.
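The toy sketch below shows the Merkle mechanics this relies on: only a root is committed on-chain, the operator hands each user a branch, and the user can later present that branch to prove what belongs to them. The hash function and leaf encoding are illustrative, not any specific Plasma design.

```python
import hashlib

def h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    layer = [h(leaf) for leaf in leaves]
    while len(layer) > 1:
        if len(layer) % 2:
            layer.append(layer[-1])        # duplicate last node on odd layers
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]

def merkle_branch(leaves: list[bytes], index: int) -> list[bytes]:
    layer = [h(leaf) for leaf in leaves]
    branch = []
    while len(layer) > 1:
        if len(layer) % 2:
            layer.append(layer[-1])
        branch.append(layer[index ^ 1])    # sibling at this level
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
        index //= 2
    return branch

def verify(leaf: bytes, index: int, branch: list[bytes], root: bytes) -> bool:
    node = h(leaf)
    for sibling in branch:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

records = [b"alice:10", b"bob:5", b"carol:7", b"dave:3"]  # toy asset records
root = merkle_root(records)            # this is all the operator posts on-chain
proof = merkle_branch(records, 1)      # the branch the operator sends to bob
assert verify(b"bob:5", 1, proof, root)  # bob can later use it to exit
```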
Early versions of Plasma could only handle payments and could not be efficiently generalized further. However, if we require each root to be verified with a SNARK, Plasma becomes much more powerful. Each challenge game can be greatly simplified since we eliminate most possible paths for operator cheating. New paths also open up, allowing Plasma techniques to scale to a much wider range of asset classes. Finally, if the operator is not cheating, users can withdraw their funds immediately without waiting for a week-long challenge period.
One approach to making an EVM Plasma chain (not the only one) is: use ZK-SNARKs to build a UTXO tree parallel to the EVM, reflecting balance changes in the EVM, and define a unique mapping of "the same coin" at different points in time. A Plasma system can then be built on top of this.
A key insight is that Plasma systems don't need to be perfect. Even if you can only protect some assets (e.g., only coins that haven't moved in the past week), you've still significantly improved upon the status quo of hyper-scalable EVM, which is validium.
Another class of constructions are Plasma/rollup hybrids. These store very small amounts of data on-chain per user (e.g., 5 bytes), achieving properties somewhere between Plasma and rollups: offering very high scalability and privacy, though even in a 16 MB world, the theoretical capacity cap would be about 16,000,000 / 12 / 5 = 266,667 TPS.
Mature L2 Proof Systems
The Trust Problem
Today, most rollups aren't actually trustless; a security council has the power to override the (optimistic or validity) proof system. In some cases, the proof system doesn't exist at all, or exists only in an "advisory" capacity. The most advanced are (i) some application-specific rollups like Fuel, which are trustless, and (ii) as of this writing, Optimism and Arbitrum, two full EVM rollups that have achieved a partial-trustlessness milestone known as "Stage 1." The reason rollups haven't progressed further is concern about bugs in the code. We need trustless rollups, so we need to address this head-on.
The Path to Maturity
First, let's recall the "stages" system as originally proposed. While the full requirements are more detailed, the summary is as follows (a toy classifier sketch appears after the list):
- Stage 0: Users must be able to run a node and sync the chain. Validation can be fully trusted/centralized at this point.
- Stage 1: There must be a (trustless) proof system ensuring only valid transactions can be accepted. A security council may override the proof system, but only with a 75% vote threshold. Additionally, at least 26% of the council must come from outside the main company developing the scaling solution. An upgrade mechanism with weak capabilities (like a DAO) is allowed, but it must have sufficiently long delay that users can withdraw funds before a malicious upgrade takes effect.
- Stage 2: There must be a (trustless) proof system ensuring only valid transactions can be accepted. The security council is only allowed to intervene in cases of provable bugs in the code. Upgrade mechanisms are allowed but must have very long delays.
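To make the summary concrete, here is the toy classifier referenced above. The field names and the "very long delay" threshold are illustrative assumptions; real stage assessments are considerably more detailed.

```python
from dataclasses import dataclass

@dataclass
class RollupGovernance:
    has_proof_system: bool             # trustless validity/fraud proofs exist
    council_override_threshold: float  # vote fraction needed to override proofs
    council_outside_fraction: float    # council members outside the dev company
    council_limited_to_bugs: bool      # council may act only on provable bugs
    upgrade_delay_days: int            # delay before any upgrade takes effect
    exit_window_days: int              # time users need to withdraw their funds

def stage(r: RollupGovernance) -> int:
    if not r.has_proof_system:
        return 0
    meets_stage1 = (r.council_override_threshold >= 0.75
                    and r.council_outside_fraction >= 0.26
                    and r.upgrade_delay_days >= r.exit_window_days)
    if not meets_stage1:
        return 0
    # "Very long delay" is not precisely defined in the summary; 30 days is
    # an arbitrary placeholder for illustration only.
    if r.council_limited_to_bugs and r.upgrade_delay_days >= 30:
        return 2
    return 1
```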
The goal is to reach Stage 2. The main challenge to reaching Stage 2 is gaining enough confidence that the proof system is actually sufficiently trustworthy. Two main approaches exist:
- Formal verification: Using modern mathematical and computational techniques to prove that the (optimistic or validity) proof system only accepts blocks that pass the EVM specification. These techniques have existed for decades, but recent advances make them more practical, while AI-assisted proof progress may accelerate this trend further.
- Multi-provers: Creating multiple proof systems and putting funds into a 2-of-3 (or larger) multisig between these proof systems and the security council. If the proof systems agree, the security council has no power; if they disagree, the security council can only choose between them, not unilaterally impose its own answer.
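Here is a minimal sketch of the multi-prover arrangement in the last bullet, assuming binary accept/reject verdicts; all names are invented for illustration.

```python
from enum import Enum

class Verdict(Enum):
    ACCEPT = "accept"
    REJECT = "reject"

def resolve(prover_a: Verdict, prover_b: Verdict, council: Verdict) -> Verdict:
    if prover_a == prover_b:
        return prover_a   # the proof systems agree: the council has no power
    # The proof systems disagree: the council can only pick between the two
    # existing answers, never impose an outcome of its own.
    return council

assert resolve(Verdict.ACCEPT, Verdict.ACCEPT, Verdict.REJECT) == Verdict.ACCEPT
assert resolve(Verdict.ACCEPT, Verdict.REJECT, Verdict.REJECT) == Verdict.REJECT
```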
Cross-L2 Interoperability and UX Improvements
The Ecosystem Fragmentation Problem
A major challenge in the current L2 ecosystem is that it is difficult for users to navigate. Furthermore, the simplest ways to move between L2s often reintroduce trust assumptions: centralized bridges, RPC clients, and so on. If we're serious about L2s being part of Ethereum, we need to make using the L2 ecosystem feel like using a unified Ethereum ecosystem.
There are pathological and dangerous examples of cross-L2 user experience where users can lose funds simply by selecting the wrong chain. In a well-functioning Ethereum ecosystem, sending tokens from L1 to L2, or from one L2 to another, should feel exactly like sending tokens within the same L1.
Interoperability Solutions
Many categories of cross-L2 interoperability improvements exist. Typically, these improvements are proposed by noting that theoretically, rollup-centric Ethereum is the same thing as L1 execution sharding, then asking how the current Ethereum L2 ecosystem falls short of this ideal in practice. Examples include:
- Chain-specific addresses: The chain (L1, Optimism, Arbitrum...) should be part of the address. Once implemented, cross-L2 sending could work by putting the address in the "send" field, at which point the wallet could determine in the background how to perform the send (including using bridge protocols); a parsing sketch follows this list.
- Chain-specific payment requests: It should be easy and standardized to create messages of the form "send me X token of type Y on chain Z." This has two main use cases: (i) payments, whether person-to-person or person-to-merchant services, and (ii) dapps requesting funds.
- Cross-chain swaps and gas payments: There should be a standardized open protocol for expressing cross-chain operations.
- Light clients: Users should actually be able to verify the chains they're interacting with, not just trust RPC providers.
- Keystore wallets: Today, if you want to update the keys controlling your smart contract wallet, you have to do it on all N chains where the wallet exists. Keystore wallet is a technique that allows keys to live in one place (perhaps on L1, or later perhaps on an L2), and then can be read from any L2 that has a copy of the wallet. This means updates only need to be done once.
- More radical "shared token bridge" ideas: Imagine a world where all L2s are validity proof rollups submitting to Ethereum every slot. Even in this world, moving assets from one L2 to another "natively" requires withdrawal and deposit, incurring significant L1 gas fees. One way to solve this is to create a shared minimal rollup whose only function is to maintain balances of what types and quantities of tokens each L2 has, and allow these balances to be updated in batches through a series of cross-L2 send operations initiated from any L2.
- Synchronous composability: Allowing synchronous calls to happen between specific L2s and L1, or between multiple L2s. This could help improve the financial efficiency of DeFi protocols.
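As referenced in the chain-specific addresses bullet, here is a hedged sketch of what such addresses could look like at the wallet layer. The "chain:0x..." string format and the chain registry are hypothetical illustrations, not a finalized standard.

```python
KNOWN_CHAINS = {"ethereum", "optimism", "arbitrum"}  # illustrative registry

def parse_chain_address(value: str) -> tuple[str, str]:
    """Split e.g. 'optimism:0xabc...' into (chain, address), with basic checks."""
    chain, sep, address = value.partition(":")
    if not sep:
        raise ValueError("missing chain prefix")
    chain = chain.lower()
    if chain not in KNOWN_CHAINS:
        raise ValueError(f"unknown chain: {chain}")
    if not (address.startswith("0x") and len(address) == 42):
        raise ValueError("malformed address")
    return chain, address

# A wallet could use the chain component to pick a bridge route automatically,
# so "send to optimism:0x..." feels the same as a same-chain transfer.
print(parse_chain_address("optimism:0x1111111111111111111111111111111111111111"))
```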
Scaling Execution on L1
Why L1 Scaling Still Matters
If L2s become highly scalable and successful, but L1 can still only handle a very small number of transactions, Ethereum could face several risks:
- The economic situation of the ETH asset becomes more precarious, affecting the network's long-term security.
- Many L2s benefit from tight integration with the highly developed financial ecosystem on L1; if this ecosystem is significantly weakened, the incentive to be an L2 (rather than an independent L1) diminishes
- L2s take a long time to have exactly the same security guarantees as L1.
- If an L2 fails (e.g., due to malicious or disappearing operators), users still need to go through L1 to recover their assets. Thus, L1 needs to be powerful enough to at least occasionally handle the highly complex and messy liquidations of L2s.
For these reasons, it's valuable to continue scaling L1 itself and ensure it can continue to accommodate a growing number of uses.
Approaches to L1 Scaling
The simplest way to scale Ethereum is to directly raise the gas limit. However, this could lead to L1 centralization, undermining one of Ethereum L1's strongest features: its credibility as a robust base layer. How much gas limit increase is sustainable remains debated. This question also changes with the implementation of other technologies that make large blocks easier to verify. Another important aspect needing continuous improvement is the efficiency of Ethereum client software, which is already much more efficient than five years ago. An effective L1 gas limit increase strategy should include accelerating these verification technologies.
Another scaling strategy involves identifying specific functions and computation types whose costs can be reduced without compromising network decentralization or security. Examples include EOF (a new EVM bytecode format more amenable to static analysis), multidimensional gas pricing (establishing separate base fees and limits for computation, data, and storage), reducing gas costs for specific opcodes and precompiles, and EVM-MAX and SIMD extensions for more efficient native large-number modular arithmetic.
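To illustrate the multidimensional gas pricing idea, here is a sketch in which each resource gets its own EIP-1559-style base fee adjusting toward its own target. All targets, limits, and the adjustment quotient are invented for illustration, not proposed parameters.

```python
RESOURCES = {
    # resource: (target usage per block, max-change denominator)
    "compute": (15_000_000, 8),
    "data":    (750_000, 8),
    "storage": (50_000, 8),
}

def update_base_fee(base_fee: int, used: int, resource: str) -> int:
    """Same shape as the EIP-1559 update rule, applied per resource."""
    target, denom = RESOURCES[resource]
    delta = base_fee * (used - target) // target // denom
    return max(base_fee + delta, 1)

fees = {r: 10_000_000_000 for r in RESOURCES}  # start each base fee at 10 gwei
usage = {"compute": 20_000_000, "data": 400_000, "storage": 50_000}
fees = {r: update_base_fee(fees[r], usage[r], r) for r in RESOURCES}
print(fees)  # compute fee rises, data fee falls, storage fee is unchanged
```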
Finally, a third strategy is native rollups (or "enshrined rollups"): essentially creating many copies of the EVM running in parallel, resulting in a model equivalent to what rollups can provide, but more natively integrated into the protocol.
Frequently Asked Questions
What is the difference between Layer 1 and Layer 2 scaling?
Layer 1 scaling refers to improving the base Ethereum blockchain itself through techniques like sharding or gas limit increases. Layer 2 scaling involves building protocols on top of Ethereum that handle transactions off-chain while periodically settling batches on the main chain. L2 solutions include rollups, plasma, and state channels, each with different security and efficiency tradeoffs.
How does data availability sampling improve Ethereum's scalability?
Data availability sampling allows nodes to verify that all data in a block is available without downloading the entire block. This enables the network to safely increase block sizes while maintaining light client verifiability. Techniques like PeerDAS and eventual 2D sampling create redundant encoding of block data, ensuring nodes can reconstruct complete blocks from small samples.
When will Ethereum achieve 100,000 transactions per second?
Ethereum is progressing toward this goal through a combination of L1 improvements like data availability sampling and L2 scaling solutions. Current estimates suggest this capability could emerge within the next few years as technologies like PeerDAS mature, data compression techniques improve, and L2 proof systems become more efficient. The exact timeline depends on successful implementation of these complex technologies.
Are Layer 2 solutions as secure as Ethereum mainnet?
Security varies between L2 solutions. Optimistic rollups rely on fraud proofs and challenge periods, while ZK-rollups use cryptographic validity proofs. Some L2s currently have security councils that can override proofs in case of bugs. The goal is for L2s to reach "Stage 2" maturity where they're fully trustless and inherit Ethereum's security properties.
How will cross-L2 interoperability improve user experience?
Future interoperability standards will make moving between L2s feel seamless through chain-specific addresses, standardized cross-chain communication protocols, light client verification, and shared token bridges. These improvements will create a unified ecosystem where users won't need to worry about which specific L2 they're using for different applications.
Why continue scaling Layer 1 if Layer 2 solutions exist?
L1 scaling remains crucial for maintaining ETH's economic security, supporting L2 emergency withdrawals, preserving Ethereum's rich financial ecosystem, and ensuring the network doesn't become overly dependent on potentially centralized L2 operators. A healthy balance between L1 and L2 scaling ensures Ethereum remains robust and decentralized while achieving high throughput.