How to Interpret On-Chain Data

The blockchain records addresses and transactions, not people and intentions. This post explains how raw chain data becomes metrics like active addresses and exchange flows — and the specific ways each one misleads when read literally.

Lewis Jackson

CEO and Founder

On-chain data has a seductive quality. Every transaction is public, every balance is visible, and nothing can be hidden or restated after the fact — so it feels like the one place in finance where you can just look at the truth directly. The access part is real. The interpretation part is where things go wrong, because the blockchain records addresses and value movements, while the questions people actually ask are about people and intentions. Those aren't the same thing, and most bad on-chain analysis comes from treating them as if they were.

This post covers the interpretation layer: how raw chain data gets turned into the metrics you see quoted — active addresses, exchange flows, holder cohorts — and the specific ways each one misleads. The practical skills of navigating a block explorer and decoding an individual transaction are covered separately in how to use a block explorer and how to read a blockchain transaction.

Two Layers: What the Chain Records vs What Gets Derived

It helps to keep two layers distinct. The raw layer is what the blockchain itself records: transactions, addresses, balances, timestamps, contract calls. This layer is factual in the strongest sense — if an address received 50 BTC at block 850,000, that happened, and no analyst can disagree.

Everything you'd actually call a metric lives in the derived layer: numbers produced by aggregating and labeling the raw data. Active addresses per day. Exchange inflows and outflows. Supply held by long-term holders. Coin dormancy. None of these are read off the chain directly — each one is built on top of assumptions about what addresses mean and which entities control them. The raw layer is fact. The derived layer is interpretation wearing fact's clothing, and the assumptions underneath are where the errors hide. (Indexers do the mechanical aggregation work — how blockchain indexers work covers that part.)
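To make the two layers concrete, here is a minimal sketch of how a "daily active addresses" number could be derived from raw transfers. The record layout, the sample data, and the `count_receivers` switch are all assumptions for illustration, not any provider's actual definition. The point is that the same raw rows give different answers depending on a choice made in the derived layer.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical raw-layer records: (unix timestamp, sender, receiver, amount).
TRANSFERS = [
    (1_700_000_000, "addr_a", "addr_b", 1.5),
    (1_700_003_600, "addr_b", "addr_c", 0.7),
    (1_700_090_000, "addr_a", "addr_d", 2.0),
]


def daily_active_addresses(transfers, count_receivers=True):
    """Count distinct addresses seen per UTC day.

    Whether a passive receiver counts as "active" is a definitional
    choice (an assumption of this sketch), not something the chain states.
    """
    seen = defaultdict(set)
    for ts, sender, receiver, _amount in transfers:
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
        seen[day].add(sender)
        if count_receivers:
            seen[day].add(receiver)
    return {day: len(addrs) for day, addrs in sorted(seen.items())}


print(daily_active_addresses(TRANSFERS))
# {'2023-11-14': 3, '2023-11-15': 2}
print(daily_active_addresses(TRANSFERS, count_receivers=False))
# {'2023-11-14': 2, '2023-11-15': 1}
```

Neither output is wrong. They answer slightly different questions, and a dashboard rarely tells you which one it picked.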

The Address Problem

The single most important thing to internalize: an address is not a person. One user can control thousands of addresses — many wallets generate a fresh address for every receive. And one address can represent millions of users: a single exchange omnibus wallet holds customer funds for an entire platform.

This breaks naive readings in both directions. "Active addresses doubled" might mean genuine adoption, or one entity splitting activity across fresh addresses, or an airdrop farmer running scripts. "Supply is concentrating into fewer wallets" might mean whale accumulation, or it might mean retail holders moving coins onto exchanges, collapsing thousands of small balances into one custodial address.

Analytics platforms attack this with entity clustering — heuristics that group addresses likely controlled by the same actor. On Bitcoin's UTXO model, the classic heuristic is co-spending: addresses used as inputs to the same transaction probably share an owner. On account-based chains like Ethereum, clustering leans on behavioral patterns and labeled addresses. These heuristics are genuinely useful, but they're probabilistic. When a dashboard says "entity," it means "our best statistical guess at an entity." Good analysts carry that uncertainty with them; bad ones quote cluster-derived numbers to four significant figures.
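The co-spending heuristic can be sketched with a small union-find structure. This is a simplified illustration under stated assumptions: the input format (one list of input addresses per transaction) and the sample addresses are hypothetical, and production clustering layers many more rules and exclusions on top.

```python
def cluster_by_co_spending(transactions):
    """Group addresses that appear together as inputs to a transaction.

    `transactions` is an iterable of input-address lists, one list per
    transaction (a hypothetical simplified format). Returns sorted clusters.
    """
    parent = {}

    def find(addr):
        # Register unseen addresses as their own root, then walk to the root.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        if not inputs:
            continue
        first = inputs[0]
        find(first)  # make sure single-input addresses are registered too
        for other in inputs[1:]:
            union(first, other)

    clusters = {}
    for addr in parent:
        clusters.setdefault(find(addr), set()).add(addr)
    return sorted((sorted(members) for members in clusters.values()),
                  key=lambda members: members[0])


TX_INPUTS = [["a1", "a2"], ["a2", "a3"], ["b1"], ["c1", "c2"]]
print(cluster_by_co_spending(TX_INPUTS))
# [['a1', 'a2', 'a3'], ['b1'], ['c1', 'c2']]
```

Six addresses collapse into three guessed entities. Note what the sketch cannot know: if the first transaction were a collaborative spend between two unrelated users, `a1`, `a2`, and `a3` would still be merged, and the "entity" count would be confidently wrong.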

Exchange Flows, Dormancy, and the Intent Gap

Exchange flow metrics are the clearest example of the interpretation gap. The mechanics: analytics firms maintain lists of labeled exchange addresses, then measure value moving in and out. The conventional reading is that inflows suggest positioning to sell (coins moved where selling happens) and outflows suggest the opposite — withdrawal to self-custody.

The mechanics are sound. The intent inference is not guaranteed. Coins flow into exchanges to be lent, posted as collateral, or moved between custodians. Exchanges shuffle funds internally between hot and cold wallets, and a newly created, unlabeled cold wallet can register as a massive "outflow" that's actually nothing. Large players who genuinely intend to sell often avoid the visible route entirely and transact over-the-counter, off-chain. So a flow metric tells you where coins moved, with reasonable confidence, and why with much less.
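A rough sketch of the flow mechanics shows how much rides on the label list. Everything here is hypothetical: the address names, the amounts, and the rule that transfers between two labeled addresses are internal. It is meant to show the shape of the calculation, not how any particular firm does it.

```python
def exchange_flows(transfers, exchange_labels):
    """Return (inflow, outflow) relative to a set of labeled exchange addresses.

    Transfers between two labeled addresses are treated as internal
    shuffles and ignored (an assumption of this sketch).
    """
    inflow = outflow = 0.0
    for sender, receiver, amount in transfers:
        from_exchange = sender in exchange_labels
        to_exchange = receiver in exchange_labels
        if to_exchange and not from_exchange:
            inflow += amount
        elif from_exchange and not to_exchange:
            outflow += amount
    return inflow, outflow


# Hypothetical raw transfers: (sender, receiver, amount).
TRANSFERS = [
    ("user_1", "ex_hot", 40.0),        # customer deposit
    ("ex_hot", "user_2", 10.0),        # customer withdrawal
    ("ex_hot", "ex_cold_new", 500.0),  # internal move to a new cold wallet
]

stale_labels = {"ex_hot"}                 # cold wallet not yet labeled
fresh_labels = {"ex_hot", "ex_cold_new"}  # cold wallet labeled

print(exchange_flows(TRANSFERS, stale_labels))  # (40.0, 510.0)
print(exchange_flows(TRANSFERS, fresh_labels))  # (40.0, 10.0)
```

The raw transfers are identical in both calls. The 500-coin "outflow" exists only in the version with the stale label set, and even the corrected numbers say nothing about why the remaining coins moved.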

Dormancy metrics — coin age, long-term holder supply — have the same shape. Coins untouched for five years suddenly moving is an observable fact and often worth noticing. But the move might be a sale, a custody migration, a wallet upgrade, or an estate being settled. The chain shows movement. It does not show motive. If you take one sentence from this post, that's the one.
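Coin-age metrics follow the same pattern, and a small sketch makes the limitation visible. The `Utxo` record, the day-index timestamps, and the 365-day cutoff below are assumptions chosen for illustration; real providers use their own thresholds and entity adjustments. The second function computes coin-days destroyed, a common coin-age measure: coins spent multiplied by the days they sat unmoved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utxo:
    amount: float     # coins held in this output
    created_day: int  # day index when the output was created (hypothetical unit)


def long_term_supply(utxos, today, threshold_days=365):
    """Sum coins in unspent outputs that have not moved for threshold_days or more.

    The 365-day default is an arbitrary cutoff for this sketch.
    """
    return sum(u.amount for u in utxos if today - u.created_day >= threshold_days)


def coin_days_destroyed(spent, today):
    """Coins spent today, weighted by how many days each output was dormant."""
    return sum(u.amount * (today - u.created_day) for u in spent)


TODAY = 1825  # roughly five years after day 0
old_coins = Utxo(amount=100.0, created_day=0)
utxo_set = [old_coins, Utxo(5.0, 1800), Utxo(2.0, 1820)]

print(long_term_supply(utxo_set, TODAY))        # 100.0
print(coin_days_destroyed([old_coins], TODAY))  # 182500.0
```

If the 100 old coins move today, the metric registers 182,500 coin-days destroyed whether the owner sold, switched custodians, or upgraded a wallet. Nothing in the inputs distinguishes those cases, so nothing in the output can.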

This is also why single-metric analysis fails so reliably. Any one signal has multiple innocent explanations; the honest method is confluence — several independent metrics pointing the same direction — plus an explicit answer to "what else could produce this exact pattern?" The same logic applies to individual large wallets, covered in how to track whale wallets.

Where the Constraints Live

Three structural limits bound everything above. First, labeling is private infrastructure. The chain is public, but the address labels that make metrics meaningful are proprietary datasets maintained by analytics firms. Different providers report different exchange flow numbers from identical raw data, which tells you how much interpretation is involved. Second, the chain only sees the chain. Exchange internal ledgers, OTC trades, and custodial reshuffles settle off-chain, so a growing share of economic activity is invisible to mainnet analysis. Third, model differences matter: Bitcoin's UTXO structure supports coin-age analysis natively, while Ethereum's account model makes some lifecycle metrics harder to compute even as it exposes richer contract-interaction data. Metrics don't translate cleanly across chains, though dashboards often present them as if they do.

What's Changing

The measurement surface is fragmenting. Activity migrating to Layer 2s means mainnet metrics capture a shrinking slice of real usage: Ethereum active addresses can look flat while rollup activity grows severalfold. Institutional custody and ETFs concentrate coins into omnibus wallets, distorting holder-cohort metrics in ways the heuristics weren't designed for. And account abstraction plus privacy tooling will erode some clustering heuristics that assume one-key-one-owner. None of this makes on-chain analysis useless. It does mean the derived layer needs constant recalibration, and metrics that were reliable in 2021 quietly mean something different now.

What Would Confirm This Direction

Three developments would confirm it. Analytics platforms would ship cross-layer aggregated metrics as defaults rather than mainnet-only views. The public divergence between providers' flow numbers would narrow as labeling improves. And explicit confidence intervals would start appearing on entity-level metrics, which is the honest direction for the industry.

What Would Break It

A demonstrated systematic failure in a major provider's labeling — exchange flows materially miscounted over a sustained period — would invalidate more than that one metric; it would impeach the derived layer's reliability generally. Privacy adoption at scale (widespread coinjoin-style mixing or shielded transfers on major chains) would break clustering assumptions outright rather than degrading them gradually.

Timing Perspective

Now: when you encounter any on-chain metric, ask two questions before accepting it — what raw data produced this, and what assumptions turned that data into this number? Next: expect L2 fragmentation to make single-chain dashboards progressively less representative; cross-layer views are worth adopting as they mature. Later: account abstraction and privacy tech will force a rebuild of core heuristics. The raw layer stays trustworthy throughout; it's the derived layer that ages.

Boundary Statement

This post explains how on-chain metrics are constructed and where their interpretive assumptions live. It is not a trading methodology, not an endorsement of any analytics platform, and not a claim that any metric predicts price — the relationship between on-chain activity and market outcomes is contested and unstable. The data is public. The meaning is built. Knowing the difference is the skill.
