Solana's outages don't happen because the team isn't paying attention. They happen because of architectural choices made deliberately in pursuit of performance — and the same choices that give Solana its speed also make it brittle in specific failure modes.
Understanding the pattern requires understanding how Solana actually processes transactions, which is different enough from other blockchains that the normal mental models don't apply.
Most blockchains separate transaction processing from consensus. Ethereum validators collect transactions, build a block, and then vote on that block in a distinct phase.
Solana doesn't work that way. Its consensus mechanism — Tower BFT — is designed to run continuously, with validators casting votes that get embedded directly into the transaction stream. Every fraction of a second, validators are producing and broadcasting vote transactions alongside the user-initiated transactions they're processing.
This is partly why Solana can achieve real-world throughput of thousands of transactions per second. The consensus layer isn't a bottleneck because it operates in parallel with transaction processing.
But here's where things get fragile. When the network gets flooded with transactions — say, from a token launch attracting thousands of bots — those bots compete for block space not just with each other, but with validator vote messages. If vote messages get delayed or dropped, validators lose sight of the current network state. Their local view of which blocks have been confirmed starts to diverge from other validators.
Once that divergence is large enough, reaching the 2/3 supermajority required for consensus becomes impossible. The network can't agree on what happened. Everything stops.
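The supermajority arithmetic can be sketched directly. This is a simplified stake-weighted model with made-up validator names and stake amounts, not Solana's actual vote-tracking logic:

```python
# Simplified stake-weighted supermajority check. The validators and
# stake amounts are hypothetical illustration values, not real data.

def supermajority_reached(stakes, votes_received):
    """True if validators whose votes arrived hold more than 2/3 of stake."""
    total = sum(stakes.values())
    voted = sum(stake for v, stake in stakes.items() if v in votes_received)
    return voted * 3 > total * 2

stakes = {"A": 40, "B": 30, "C": 20, "D": 10}  # total stake = 100

# All votes land: 100% of stake is visible, consensus advances.
assert supermajority_reached(stakes, {"A", "B", "C", "D"})

# Congestion drops the votes from C and D (30% of stake):
# 70% > 2/3 still holds, barely.
assert supermajority_reached(stakes, {"A", "B"})

# Lose B's vote too and only 50% of stake is visible: the chain stalls.
assert not supermajority_reached(stakes, {"A", "D"})
```

The point of the sketch: once more than a third of stake stops being heard from, no remaining coalition can clear the threshold, no matter how healthy each individual validator is.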
The September 2021 outage is the clearest example. A single IDO (Grape Protocol) attracted such heavy bot traffic — over 400,000 transactions per second at peak — that the network's message-passing infrastructure collapsed. Validators went out of sync. The outage lasted around 17 hours.
Before Solana introduced fee markets, transaction fees were effectively zero — fractions of a penny regardless of network demand. That's by design: low fees are a core part of the Solana thesis.
The problem is that without a meaningful price signal, bots have no reason to limit themselves. On Ethereum, gas fees rise sharply during congestion, pricing out lower-value transactions and creating a natural throttle. A bot calculating expected profit has to factor in the cost of submitting.
On early Solana, there was no such calculation to make. Sending 100,000 transactions cost almost nothing. So that's what happened.
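The bot's decision reduces to a one-line expected-value calculation. All the numbers below are illustrative assumptions, not measured values:

```python
# Back-of-the-envelope bot economics: expected profit of a spam burst.
# win_prob, prize, and fee values are made-up illustration numbers.

def expected_profit(win_prob, prize, fee_per_tx, n_tx):
    """Expected value of firing n_tx transactions at a launch."""
    p_any_win = 1 - (1 - win_prob) ** n_tx  # chance at least one tx lands first
    return p_any_win * prize - fee_per_tx * n_tx

# Near-zero fees (early Solana): 100,000 transactions cost half a dollar,
# so even a tiny per-transaction win probability makes flooding rational.
assert expected_profit(1e-5, 1000.0, 0.000005, 100_000) > 0

# A meaningful per-transaction fee flips the sign: the same flood now
# costs more than its expected payoff, which is the throttle Ethereum has.
assert expected_profit(1e-5, 1000.0, 0.05, 100_000) < 0
```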
Fee markets are a crude solution — they protect the network by taxing users, which conflicts with the accessibility thesis. But the alternative, as Solana learned, is a network that becomes unusable during popular events.
There's a separate constraint worth understanding: client diversity.
Ethereum has multiple independent implementations of the same protocol — Geth, Besu, Erigon, Nethermind, and others. If a bug shows up in one client, it affects some fraction of validators. The others keep running. The network degrades but doesn't collapse.
Solana, until recently, has had essentially one validator client: the implementation maintained by Solana Labs. If there's a consensus bug in that client, it affects every validator simultaneously. That's what happened in June 2022, when a bug in handling "durable nonces" caused validators to process certain transactions nondeterministically, triggering a consensus failure across the network and taking it offline for around four and a half hours.
This isn't a knock on Solana's engineering quality. It's a structural fragility that exists whenever a network depends on a single codebase — regardless of how good that codebase is.
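The structural point connects back to the same 2/3 threshold. A rough model, with hypothetical client names and stake shares, shows why one client is a single point of failure while a well-distributed second client is not:

```python
# Why client diversity matters: a consensus bug halts every validator
# running the affected client, and the network survives only if the
# remaining clients hold more than 2/3 of stake. Stake shares below
# are hypothetical illustration values.

def network_survives(client_shares, buggy_client):
    """True if stake on unaffected clients still exceeds 2/3."""
    surviving = sum(s for c, s in client_shares.items() if c != buggy_client)
    return surviving > 2 / 3

# Single-client world: any consensus bug is a full outage.
assert not network_survives({"solana-labs": 1.0}, "solana-labs")

# Two clients at 70/30: a bug in the majority client still halts the
# network, because only 30% of stake survives.
assert not network_survives({"solana-labs": 0.7, "firedancer": 0.3},
                            "solana-labs")

# But a bug in the minority client leaves 70% of stake running, which
# clears the 2/3 threshold: degraded, not dead.
assert network_survives({"solana-labs": 0.7, "firedancer": 0.3},
                        "firedancer")
```

As the sketch suggests, diversity only pays off once no single client holds more than a third of stake on its own; short of that, it shrinks the blast radius of some bugs rather than all of them.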
QUIC replaced raw UDP as the transport protocol for transaction submission. QUIC has built-in flow control and session management, so validators can throttle individual senders instead of being overwhelmed indiscriminately. It doesn't eliminate spam, but it makes the network much harder to saturate by brute force.
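The effect of per-sender flow control can be approximated with a token bucket. This is an analogy for what connection-level budgets buy a validator, not Solana's actual implementation, and the rates are made up:

```python
# Token-bucket sketch of per-connection flow control, as an analogy for
# what QUIC gives a validator: each sender draws from a refilling budget,
# so a burst from one connection is shed instead of saturating the node.
# Capacity and refill rate are illustrative, not Solana's real limits.

class TokenBucket:
    def __init__(self, capacity, refill_per_tick):
        self.capacity = capacity
        self.tokens = capacity
        self.refill = refill_per_tick

    def tick(self):
        """Advance one time unit, restoring part of the budget."""
        self.tokens = min(self.capacity, self.tokens + self.refill)

    def try_send(self):
        """Spend one token if available; otherwise the tx is dropped."""
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=10, refill_per_tick=2)

# A 1000-transaction burst in one time unit: only the budget of 10
# gets through, and the other 990 are shed at the transport layer.
accepted = sum(bucket.try_send() for _ in range(1000))
assert accepted == 10
```

Contrast this with fire-and-forget UDP, where there is no per-sender budget to enforce and every packet competes for the same socket.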
Localized fee markets were introduced, allowing fees for transactions accessing specific "hot" state (like a popular DEX pool) to rise independently without affecting the whole network. A token launch attracting bot traffic doesn't necessarily drag down everything else anymore.
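The "localized" part is the key design choice, and a toy model makes it concrete. The escalation rule and numbers below are invented for illustration; Solana's real mechanism prices compute units via per-transaction priority fees, not a simple per-account counter:

```python
# Toy model of a localized fee market: the fee escalates with demand on
# each specific piece of state, not globally. The linear escalation rule
# and the numbers are made up for illustration only.

BASE_FEE = 5_000  # lamports, illustrative

def local_fee(pending_per_account, account):
    """Fee grows with contention on the touched account alone."""
    contention = pending_per_account.get(account, 0)
    return BASE_FEE * (1 + contention)

pending = {"popular_dex_pool": 900, "quiet_nft_mint": 0}

# The hot pool gets expensive under bot pressure...
assert local_fee(pending, "popular_dex_pool") == 5_000 * 901
# ...while a transaction touching unrelated state still pays the base fee.
assert local_fee(pending, "quiet_nft_mint") == BASE_FEE
```

A global fee market would make both transactions expensive; the localized version confines the price spike to the contested state.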
Most importantly: Firedancer. Jump Crypto built a second Solana validator client from scratch: different language (C), different architecture, different team. When Firedancer validators are running in production at meaningful scale, a bug in the Solana Labs client will no longer automatically mean a network-wide failure.
Firedancer has been in testnet and is progressively rolling out to mainnet. The timeline for meaningful mainnet client diversity is the clearest forward signal to watch.
Now: Solana has been meaningfully more stable through 2024-2025 than it was in 2021-2022. The QUIC and fee market changes have helped. Whether Firedancer's production deployment becomes the stability floor it's designed to be is the live question.
Next: The pace of Firedancer's mainnet rollout and its performance under real stress are the near-term signal that will either validate or challenge the "Solana's outage problem is solved" thesis.
Later: Whether Solana's performance-first architecture can achieve Ethereum-level resilience at scale is a genuine open question. Not a settled one.
This covers the mechanism behind Solana's outages and the structural changes being made to address them. It doesn't assess whether Solana is a better or worse choice for any particular use case, and it isn't a prediction of future outages or their absence.
The outage history is documented. The architectural changes are real. Whether they're sufficient is a question the next high-load events will answer — not this post.