The Physics of the Order Book: L2, L3, and Sequence Gaps

Why 'Price' is an aggregation of intent. Understanding Level 2 vs Level 3 data, UDP Sequence Gaps, and the Crossed Book phenomenon.


🎯 What You'll Learn

  • Differentiate L1 (Top), L2 (Aggregated), and L3 (Individual) Data
  • Deconstruct a WebSocket Delta Update (`Add`, `Update`, `Delete`)
  • Analyze the 'Crossed Book' state (Bid >= Ask)
  • Implement a Sequence Gap Detector for UDP Feeds
  • Visualize Market Depth using Heatmaps

🔬 Try It: Watch a Flash Crash

See an order book in action. Watch what happens when liquidity evaporates:

📊 Order Book Replay: Flash Crash (ETH/USD)

T+0s: Normal Market
Healthy spread: $0.50. Deep liquidity on both sides.

BIDS (Buy Orders)
  Price      Size
  $2000.00   50
  $1999.50   120
  $1999.00   200
  $1998.50   350
  $1998.00   500

ASKS (Sell Orders)
  Price      Size
  $2000.50   45
  $2001.00   100
  $2001.50   180
  $2002.00   300
  $2002.50   450

Spread: $0.50 | Mid: $2000.25

Introduction

Most traders see a chart. Engineers see a State Machine synchronized across thousands of miles.

The Order Book is not static. It is a living data structure that mutates millions of times per second. To build a trading bot, you don’t just “read” the book. You reconstruct it locally, packet by packet, verifying the integrity of the universe with every sequence number.


The Physics: L2 vs L3 Data

Data comes in resolutions.

Level 1 (Top of Book):

  • “Best Bid is $100. Best Ask is $101.”
  • Physics: Low Bandwidth. Useful for retail UI. Useless for algo trading.

Level 2 (Aggregated Depth):

  • “There are 500 shares at $100. There are 200 shares at $99.”
  • Physics: You know how much is there, but not who is there. Most HFT happens here.

Level 3 (Market by Order):

  • “Order ID 123 (Size 100) added at $100.”
  • “Order ID 456 (Size 50) added at $100.”
  • Physics: Full visibility. Highest bandwidth (Gbps). You can track individual queue positions.
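
To make the three resolutions concrete, here is a minimal sketch that collapses a handful of individual orders (L3) into per-price totals (L2) and then into the top of book (L1). The tuple layout and the names `l3_orders`, `aggregate_to_l2`, and `top_of_book` are assumptions made for illustration, not any exchange's format.

```python
from collections import defaultdict

# Hypothetical L3 view: one tuple per resting order (order_id, side, price, size).
l3_orders = [
    (123, "buy", 100.0, 100),
    (456, "buy", 100.0, 50),
    (789, "buy", 99.0, 200),
    (901, "sell", 101.0, 75),
]

def aggregate_to_l2(orders):
    """Collapse individual orders (L3) into total size per price level (L2)."""
    bids, asks = defaultdict(int), defaultdict(int)
    for _order_id, side, price, size in orders:
        (bids if side == "buy" else asks)[price] += size
    return dict(bids), dict(asks)

def top_of_book(bids, asks):
    """Reduce L2 depth to L1: best bid and best ask (None if a side is empty)."""
    best_bid = max(bids) if bids else None
    best_ask = min(asks) if asks else None
    return best_bid, best_ask

bids, asks = aggregate_to_l2(l3_orders)
print(bids)                     # {100.0: 150, 99.0: 200}
print(asks)                     # {101.0: 75}
print(top_of_book(bids, asks))  # (100.0, 101.0)
```

Each step throws information away: L2 no longer knows that the 150 at $100 came from two orders, and L1 no longer knows about the $99 level at all.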

Deep Dive: Delta Updates & Sequence Gaps

Downloading the full book (Snapshot) can take on the order of 100 ms. That is an eternity at market speed. Instead, we download a Snapshot once and then apply Deltas (changes) on top of it.

The Protocol:

  1. Snapshot: { "bids": [[100, 500]], "seq": 50 }
  2. Delta: { "action": "update", "price": 100, "size": 600, "seq": 51 }
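
The two messages above can be applied in a few lines. This is a sketch under stated assumptions: the example delta carries no side field, so it is treated as a bid update; `add` and `update` are assumed to carry the new total size for the level; and a `delete` action or a size of 0 is assumed to remove it. The helper names `book_from_snapshot` and `apply_delta` are hypothetical.

```python
def book_from_snapshot(snapshot):
    """Build a local bid book (price -> size) and return it with the snapshot's seq."""
    bids = {price: size for price, size in snapshot["bids"]}
    return bids, snapshot["seq"]

def apply_delta(bids, delta):
    """Apply one delta in place and return its seq.

    Assumption: "delete" or size 0 removes the level; "add" and "update"
    both set the level to the new total size.
    """
    if delta["action"] == "delete" or delta.get("size") == 0:
        bids.pop(delta["price"], None)
    else:
        bids[delta["price"]] = delta["size"]
    return delta["seq"]

snapshot = {"bids": [[100, 500]], "seq": 50}
delta = {"action": "update", "price": 100, "size": 600, "seq": 51}

bids, seq = book_from_snapshot(snapshot)
seq = apply_delta(bids, delta)
print(bids, seq)  # {100: 600} 51
```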

The Physics of Gaps: If you receive seq: 50 and then seq: 52, you have lost reality. You cannot just “skip” packet 51. Packet 51 might have been “Sell 1 Million BTC”. If you miss a packet, your local book is corrupted. You must disconnect, flush the local book, and restart from a fresh snapshot.
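
One way to express the rule is a small tracker that classifies every incoming sequence number before the delta is applied. This is an illustrative sketch: the class name `SequenceTracker` and the labels `"ok"`, `"stale"`, and `"gap"` are assumptions, and treating old numbers as harmless duplicates is a design choice rather than something every feed guarantees.

```python
class SequenceTracker:
    """Classify each incoming sequence number relative to the last one applied."""

    def __init__(self, last_seq):
        self.last_seq = last_seq  # e.g. the seq carried by the snapshot

    def check(self, seq):
        if seq <= self.last_seq:
            return "stale"  # duplicate, or already covered by the snapshot
        if seq == self.last_seq + 1:
            self.last_seq = seq
            return "ok"     # next in order, safe to apply
        return "gap"        # at least one message is missing

tracker = SequenceTracker(last_seq=50)
results = [tracker.check(seq) for seq in (50, 51, 52, 54)]
print(results)  # ['stale', 'ok', 'ok', 'gap']
```

On `"gap"` the caller would stop applying deltas, throw the local book away, and rebuild it from a new snapshot.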


The Anomaly: Crossed Markets

In a sane universe, Best Bid < Best Ask. If Bid >= Ask, a trade should have happened. Why do we sometimes see Bid: $100, Ask: $99?

  1. Latency: The trade report packet hasn’t arrived yet.
  2. Exchange Lag: The Matching Engine is overwhelmed and hasn’t processed the cross yet.
  3. Arbitrage: This is happening on two different exchanges. (Buy on A at $99, Sell on B at $100).
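
A local book can flag this state cheaply after every update. The function below is a minimal sketch using the lesson's condition (Bid >= Ask); the name `is_crossed` and the choice to treat a one-sided book as not crossed are assumptions.

```python
def is_crossed(best_bid, best_ask):
    """Return True when best bid >= best ask, the lesson's crossed condition."""
    if best_bid is None or best_ask is None:
        return False  # a one-sided book cannot be crossed
    return best_bid >= best_ask

print(is_crossed(100, 101))   # False
print(is_crossed(100, 99))    # True
print(is_crossed(100, 100))   # True
print(is_crossed(None, 101))  # False
```

A crossed reading on a single exchange is usually a cue to distrust the local book, not to trade on it.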

Code: The Local Book Builder

How to maintain a local L2 book from a stream of updates.

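class SequenceGapError(Exception):
    """Raised when an update arrives out of sequence."""


class OrderBook:
    def __init__(self):
        self.bids = {}  # price -> aggregated size
        self.asks = {}  # price -> aggregated size
        self.last_seq = None  # None until the first message has been seen

    def process_update(self, msg):
        # 1. Sequence gap detection. Compare against None explicitly so a
        #    feed whose sequence starts at 0 is still checked.
        if self.last_seq is not None and msg['seq'] != self.last_seq + 1:
            raise SequenceGapError(
                f"GAP DETECTED! Expected {self.last_seq + 1}, got {msg['seq']}"
            )

        self.last_seq = msg['seq']

        # 2. Apply delta
        side = self.bids if msg['side'] == 'buy' else self.asks
        price = msg['price']

        if msg['size'] == 0:
            side.pop(price, None)  # delete the level if it exists
        else:
            side[price] = msg['size']  # upsert the level

    def get_best_bid(self):
        # Highest bid price, or None when there are no bids
        # (0 would look like a real price).
        return max(self.bids) if self.bids else None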

Practice Exercises

Exercise 1: Bandwidth Calculation (Beginner)

Scenario: An L3 feed sends 100 bytes per order at 50,000 orders/sec. Task: What is the bandwidth requirement? (5 MB/s, or 40 Mbit/s.) What happens if volatility spikes to 1,000,000 orders/sec? (100 MB/s, or 800 Mbit/s. Do you have a 1 Gbps line?)
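
The arithmetic can be checked with a short helper. This sketch counts message payload only, ignores packet headers, and assumes 1 MB = 10^6 bytes; the function name `bandwidth` is hypothetical.

```python
def bandwidth(bytes_per_order, orders_per_sec):
    """Return (megabytes per second, megabits per second) for the payload."""
    bytes_per_sec = bytes_per_order * orders_per_sec
    return bytes_per_sec / 1e6, bytes_per_sec * 8 / 1e6

print(bandwidth(100, 50_000))     # (5.0, 40.0)
print(bandwidth(100, 1_000_000))  # (100.0, 800.0)
```

At 800 Mbit/s of payload, a 1 Gbps line has little headroom left for headers and bursts.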

Exercise 2: The Ghost Order (Intermediate)

Scenario: You miss a “Delete” packet for Order A. Task: Your bot thinks Order A is still there. You try to trade against it. What happens? (The liquidity is not there. Depending on the order type, your order goes unfilled, rests in the book, or fills at a worse price.)

Exercise 3: Crossed Book Arb (Advanced)

Task: Write a script that listens to 2 mock orderbooks. Print “ARBITRAGE” whenever BookA.Bid > BookB.Ask.


Knowledge Check

  1. Why is L3 data “heavier” than L2?
  2. What does a sequence gap imply about your network?
  3. Why can’t you trade against a “Crossed Market” on the same exchange?
  4. What is a “Snapshot” vs a “Delta”?
  5. Why do HFTs prefer UDP over TCP for market data?
Answers
  1. Granularity. L3 sends every single order add/cancel. L2 only sends price level summaries.
  2. Packet Loss. UDP packets were dropped, or the CPU was too slow to read the socket buffer.
  3. Matching Engine Logic. The engine would have matched them instantly. If you see it, it’s a display artifact or a timing race.
  4. State vs Change. Snapshot is the full state (slow). Delta is the change (fast).
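  5. Speed. TCP requires ACKs and retransmits lost data, so one lost packet stalls everything behind it (slow). UDP fires and forgets (fast), and leaves gap detection to you.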

Summary

  • L2 vs L3: Resolution vs Bandwidth trade-off.
  • Sequence Numbers: The heartbeat of data integrity.
  • Reconstruction: The art of keeping your local truth in sync with the exchange.

