Solana's outages don't happen because the team isn't paying attention. They happen because of architectural choices made deliberately in pursuit of performance — and the same choices that give Solana its speed also make it brittle in specific failure modes.
Understanding the pattern requires understanding how Solana actually processes transactions, which is different enough from other blockchains that the normal mental models don't apply.
Most blockchains separate transaction processing from consensus. Ethereum validators collect transactions, build a block, and then vote on that block in a distinct phase.
Solana doesn't work that way. Its consensus mechanism — Tower BFT — is designed to run continuously, with validators casting votes that get embedded directly into the transaction stream. Every fraction of a second, validators are producing and broadcasting vote transactions alongside the user-initiated transactions they're processing.
This is partly why Solana can achieve real-world throughput of thousands of transactions per second. The consensus layer isn't a bottleneck because it operates in parallel with transaction processing.
But here's where things get fragile. When the network gets flooded with transactions — say, from a token launch attracting thousands of bots — those bots compete for block space not just with each other, but with validator vote messages. If vote messages get delayed or dropped, validators lose sight of the current network state. Their local view of which blocks have been confirmed starts to diverge from other validators.
Once that divergence is large enough, reaching the 2/3 supermajority required for consensus becomes impossible. The network can't agree on what happened. Everything stops.
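The supermajority arithmetic can be sketched directly. This is a simplified stake-weighted model with made-up validator names and stake amounts, not Solana's actual vote-tracking logic:

```python
# Simplified stake-weighted supermajority check. The validators and
# stake amounts are hypothetical illustration values, not real data.

def supermajority_reached(stakes, votes_received):
    """True if validators whose votes arrived hold more than 2/3 of stake."""
    total = sum(stakes.values())
    voted = sum(stake for v, stake in stakes.items() if v in votes_received)
    return voted * 3 > total * 2

stakes = {"A": 40, "B": 30, "C": 20, "D": 10}  # total stake = 100

# All votes land: 100% of stake is visible, consensus advances.
assert supermajority_reached(stakes, {"A", "B", "C", "D"})

# Congestion drops the votes from C and D (30% of stake):
# 70% > 2/3 still holds, barely.
assert supermajority_reached(stakes, {"A", "B"})

# Lose B's vote too and only 50% of stake is visible: the chain stalls.
assert not supermajority_reached(stakes, {"A", "D"})
```

The point of the sketch: once more than a third of stake stops being heard from, no remaining coalition can clear the threshold, no matter how healthy each individual validator is.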
The September 2021 outage is the clearest example. A single IDO (Grape Protocol) attracted such heavy bot traffic — over 400,000 transactions per second at peak — that the network's message-passing infrastructure collapsed. Validators went out of sync. The outage lasted around 17 hours.
Before Solana introduced fee markets, transaction fees were effectively zero — fractions of a penny regardless of network demand. That's by design: low fees are a core part of the Solana thesis.
The problem is that without a meaningful price signal, bots have no reason to limit themselves. On Ethereum, gas fees rise sharply during congestion, pricing out lower-value transactions and creating a natural throttle. A bot calculating expected profit has to factor in the cost of submitting.
On early Solana, there was no such calculation to make. Sending 100,000 transactions cost almost nothing. So that's what happened.
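The bot's decision reduces to a one-line expected-value calculation. All the numbers below are illustrative assumptions, not measured values:

```python
# Back-of-the-envelope bot economics: expected profit of a spam burst.
# win_prob, prize, and fee values are made-up illustration numbers.

def expected_profit(win_prob, prize, fee_per_tx, n_tx):
    """Expected value of firing n_tx transactions at a launch."""
    p_any_win = 1 - (1 - win_prob) ** n_tx  # chance at least one tx lands first
    return p_any_win * prize - fee_per_tx * n_tx

# Near-zero fees (early Solana): 100,000 transactions cost half a dollar,
# so even a tiny per-transaction win probability makes flooding rational.
assert expected_profit(1e-5, 1000.0, 0.000005, 100_000) > 0

# A meaningful per-transaction fee flips the sign: the same flood now
# costs more than its expected payoff, which is the throttle Ethereum has.
assert expected_profit(1e-5, 1000.0, 0.05, 100_000) < 0
```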
Fee markets are a crude solution — they protect the network by taxing users, which conflicts with the accessibility thesis. But the alternative, as Solana learned, is a network that becomes unusable during popular events.
There's a separate constraint worth understanding: client diversity.
Ethereum has multiple independent implementations of the same protocol — Geth, Besu, Erigon, Nethermind, and others. If a bug shows up in one client, it affects some fraction of validators. The others keep running. The network degrades but doesn't collapse.
Solana, until recently, has had essentially one validator client: the implementation maintained by Solana Labs. If there's a consensus bug in that client, it affects every validator simultaneously. That's what happened in June 2022, when a bug in handling "durable nonces" caused validators to process certain transactions nondeterministically, triggering a consensus failure across the network and taking it offline for around four and a half hours.
This isn't a knock on Solana's engineering quality. It's a structural fragility that exists whenever a network depends on a single codebase — regardless of how good that codebase is.
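The structural point connects back to the same 2/3 threshold. A rough model, with hypothetical client names and stake shares, shows why one client is a single point of failure while a well-distributed second client is not:

```python
# Why client diversity matters: a consensus bug halts every validator
# running the affected client, and the network survives only if the
# remaining clients hold more than 2/3 of stake. Stake shares below
# are hypothetical illustration values.

def network_survives(client_shares, buggy_client):
    """True if stake on unaffected clients still exceeds 2/3."""
    surviving = sum(s for c, s in client_shares.items() if c != buggy_client)
    return surviving > 2 / 3

# Single-client world: any consensus bug is a full outage.
assert not network_survives({"solana-labs": 1.0}, "solana-labs")

# Two clients at 70/30: a bug in the majority client still halts the
# network, because only 30% of stake survives.
assert not network_survives({"solana-labs": 0.7, "firedancer": 0.3},
                            "solana-labs")

# But a bug in the minority client leaves 70% of stake running, which
# clears the 2/3 threshold: degraded, not dead.
assert network_survives({"solana-labs": 0.7, "firedancer": 0.3},
                        "firedancer")
```

As the sketch suggests, diversity only pays off once no single client holds more than a third of stake on its own; short of that, it shrinks the blast radius of some bugs rather than all of them.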
QUIC replaced raw UDP as the transport protocol for transaction submission. QUIC has built-in flow control and session management, so validators can throttle individual senders instead of being overwhelmed indiscriminately. It doesn't eliminate spam, but it makes the network much harder to saturate by brute force.
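The effect of per-sender flow control can be approximated with a token bucket. This is an analogy for what connection-level budgets buy a validator, not Solana's actual implementation, and the rates are made up:

```python
# Token-bucket sketch of per-connection flow control, as an analogy for
# what QUIC gives a validator: each sender draws from a refilling budget,
# so a burst from one connection is shed instead of saturating the node.
# Capacity and refill rate are illustrative, not Solana's real limits.

class TokenBucket:
    def __init__(self, capacity, refill_per_tick):
        self.capacity = capacity
        self.tokens = capacity
        self.refill = refill_per_tick

    def tick(self):
        """Advance one time unit, restoring part of the budget."""
        self.tokens = min(self.capacity, self.tokens + self.refill)

    def try_send(self):
        """Spend one token if available; otherwise the tx is dropped."""
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=10, refill_per_tick=2)

# A 1000-transaction burst in one time unit: only the budget of 10
# gets through, and the other 990 are shed at the transport layer.
accepted = sum(bucket.try_send() for _ in range(1000))
assert accepted == 10
```

Contrast this with fire-and-forget UDP, where there is no per-sender budget to enforce and every packet competes for the same socket.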
Localized fee markets were introduced, allowing fees for transactions accessing specific "hot" state (like a popular DEX pool) to rise independently without affecting the whole network. A token launch attracting bot traffic doesn't necessarily drag down everything else anymore.
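The "localized" part is the key design choice, and a toy model makes it concrete. The escalation rule and numbers below are invented for illustration; Solana's real mechanism prices compute units via per-transaction priority fees, not a simple per-account counter:

```python
# Toy model of a localized fee market: the fee escalates with demand on
# each specific piece of state, not globally. The linear escalation rule
# and the numbers are made up for illustration only.

BASE_FEE = 5_000  # lamports, illustrative

def local_fee(pending_per_account, account):
    """Fee grows with contention on the touched account alone."""
    contention = pending_per_account.get(account, 0)
    return BASE_FEE * (1 + contention)

pending = {"popular_dex_pool": 900, "quiet_nft_mint": 0}

# The hot pool gets expensive under bot pressure...
assert local_fee(pending, "popular_dex_pool") == 5_000 * 901
# ...while a transaction touching unrelated state still pays the base fee.
assert local_fee(pending, "quiet_nft_mint") == BASE_FEE
```

A global fee market would make both transactions expensive; the localized version confines the price spike to the contested state.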
Most importantly: Firedancer. Jump Crypto built a second Solana validator client from scratch: different language (C), different architecture, different team. When Firedancer validators are running in production at meaningful scale, a bug in the Solana Labs client will no longer automatically mean a network-wide failure.
Firedancer has been in testnet and is progressively rolling out to mainnet. The timeline for meaningful mainnet client diversity is the clearest forward signal to watch.
Now: Solana has been meaningfully more stable through 2024-2025 than it was in 2021-2022. The QUIC and fee market changes have helped. Whether Firedancer's production deployment becomes the stability floor it's designed to be is the live question.
Next: The pace of Firedancer's mainnet rollout and its performance under real stress are the near-term signal that will either validate or challenge the "Solana's outage problem is solved" thesis.
Later: Whether Solana's performance-first architecture can achieve Ethereum-level resilience at scale is a genuine open question. Not a settled one.
This covers the mechanism behind Solana's outages and the structural changes being made to address them. It doesn't assess whether Solana is a better or worse choice for any particular use case, and it isn't a prediction of future outages or their absence.
The outage history is documented. The architectural changes are real. Whether they're sufficient is a question the next high-load events will answer — not this post.