The Sub-50µs Cloud Lie

Why cloud vendors' latency claims don't match reality for trading. Real measurements and the hard limits of cloud infrastructure.

Intermediate · 20 min read · Expert Version →

🎯 What You'll Learn

  • Understand why vendor latency claims are misleading
  • Learn how to measure real trading latency
  • Identify cloud infrastructure limitations
  • Know when cloud works and when it doesn't

📚 Prerequisites

Before this lesson, you should understand:

The Marketing vs Reality Gap

Cloud vendors claim “sub-millisecond latency.” Your trading system measures 5-50ms. What’s going on?

AWS claims: "Single-digit millisecond latency"
Your measurement: 15ms to Binance
Reality: Both are "correct" - but measuring different things

This lesson exposes the gap between marketing claims and trading reality.


What You’ll Learn

By the end of this lesson, you’ll understand:

  1. What vendors actually measure - VM-to-VM, not your use case
  2. Real trading latency sources - Network, hypervisor, kernel
  3. How to measure properly - End-to-end, with percentiles
  4. When cloud makes sense - Not for HFT, but maybe for you

The Foundation: What “Latency” Actually Means

Vendors measure inter-VM latency within the same datacenter:

EC2 instance-A → EC2 instance-B (same AZ)
AWS claims: ~50-100µs

What you actually need:

Your EC2 → Internet → Exchange → Processing → Response
Reality: 5-50ms depending on exchange

Marketing latency ≠ application latency


The “Aha!” Moment

Here’s what cloud vendors won’t tell you:

The hypervisor adds 5-20µs of jitter to every network operation. You share physical hardware with other tenants. When they spike, you spike. This variability is invisible in averages but destroys your p99 latency.

Dedicated hardware doesn’t have this problem.

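A quick way to see this jitter for yourself, without any vendor tooling: ask the OS to sleep for a fixed interval in a loop and record how far each wake-up overshoots. This is only a minimal sketch — it captures scheduler and timer noise in general, not the hypervisor in isolation — but on a quiet dedicated box the overshoot tail stays tight, while on a busy shared VM the p99 stretches out.

import time

def sleep_jitter(interval_s=0.001, samples=2000):
    """Measure how far each sleep() overshoots its target, in microseconds."""
    overshoots_us = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(interval_s)
        elapsed = time.perf_counter() - start
        overshoots_us.append((elapsed - interval_s) * 1e6)

    overshoots_us.sort()
    p50 = overshoots_us[samples // 2]
    p99 = overshoots_us[int(0.99 * (samples - 1))]
    # A tight p50-to-p99 gap means quiet hardware; a long tail means contention.
    print(f"p50: {p50:.1f}µs  p99: {p99:.1f}µs  max: {overshoots_us[-1]:.1f}µs")

sleep_jitter()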

Let’s See It In Action: Measuring Real Latency

Measure VM-to-VM (What AWS Claims)

# Install sockperf on two EC2 instances
sudo apt install sockperf

# Server side
sockperf server -i 0.0.0.0 -p 12345

# Client side - measure latency
sockperf ping-pong -i <server-ip> -p 12345 --pps=max -t 60

# Typical AWS result: avg 60µs, p99 150µs

Measure to Exchange (What You Actually Get)

import time
import requests

def measure_exchange_latency(url, n=100):
    """Time n full HTTP requests (DNS + TCP + TLS + round trip), in ms."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        requests.get(url, timeout=5)
        latencies.append((time.perf_counter() - start) * 1000)

    latencies.sort()
    print(f"Min: {latencies[0]:.1f}ms")
    print(f"Avg: {sum(latencies) / len(latencies):.1f}ms")
    print(f"P99: {latencies[int(0.99 * (n - 1))]:.1f}ms")
    print(f"Max: {latencies[-1]:.1f}ms")

# Run from EC2
measure_exchange_latency("https://api.binance.com/api/v3/time")
# Typical: Min 15ms, Avg 25ms, P99 80ms

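Note that each requests.get above pays for DNS, TCP, and TLS setup, which is closer to a cold start than a hot trading loop. A hedged variant (a sketch using a requests.Session so the connection persists) isolates steady-state round trips from connection setup; the warmup count and timeout here are arbitrary choices, not magic numbers:

import time
import requests

def measure_with_keepalive(url, n=100, warmup=3):
    """Measure latency over a persistent (keep-alive) connection, in ms."""
    session = requests.Session()
    for _ in range(warmup):
        session.get(url, timeout=5)   # establish and warm the connection

    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        session.get(url, timeout=5)
        latencies.append((time.perf_counter() - start) * 1000)

    latencies.sort()
    print(f"Min: {latencies[0]:.1f}ms  "
          f"P50: {latencies[n // 2]:.1f}ms  "
          f"P99: {latencies[int(0.99 * (n - 1))]:.1f}ms")

measure_with_keepalive("https://api.binance.com/api/v3/time")

The gap between the two measurements is roughly your per-request connection cost; the keep-alive numbers are the better proxy for a trading loop that holds sockets open.
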
Where Cloud Latency Comes From

Source                  Contribution   Fixable?
Physical distance       1-50ms         Move to colo
Internet routing        1-20ms         Pay for Direct Connect
Hypervisor overhead     5-20µs         Bare metal instance
Kernel network stack    10-50µs        Kernel tuning
Your application        Variable       Code optimization

90% of your latency is location + network path. Optimizing code won’t fix this.


The Noisy Neighbor Problem

Shared infrastructure means shared variability:

Normal operation:
  Your latency: 50µs

Neighbor running ML training:
  Your latency: 200µs (CPU steal)

Neighbor doing heavy I/O:
  Your latency: 500µs (network contention)

This variability is random and unpredictable. Your p99 suffers.

Measuring CPU Steal

# Check if you're losing CPU to other tenants (st = steal, 17th vmstat column)
vmstat 1 | awk 'NR>2 {print "steal:", $17"%"}'

# >0% steal means others are taking your CPU time

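If you want steal time as a number you can log rather than eyeball, here is a minimal sketch that reads /proc/stat directly (Linux-only; assumes the standard field order user nice system idle iowait irq softirq steal):

import time

def read_cpu_times():
    """Return (steal, total) jiffies from the aggregate cpu line of /proc/stat."""
    with open("/proc/stat") as f:
        fields = [int(x) for x in f.readline().split()[1:]]
    # field order: user nice system idle iowait irq softirq steal [guest ...]
    steal = fields[7] if len(fields) > 7 else 0
    return steal, sum(fields[:8])

def monitor_steal(interval_s=1.0):
    """Print steal % once per interval; persistent values > 0 mean noisy neighbors."""
    prev_steal, prev_total = read_cpu_times()
    while True:
        time.sleep(interval_s)
        steal, total = read_cpu_times()
        delta_total = (total - prev_total) or 1
        print(f"steal: {100 * (steal - prev_steal) / delta_total:.2f}%")
        prev_steal, prev_total = steal, total

monitor_steal()
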
AWS Instance Selection

Instance Type    Latency Profile              Monthly Cost
t3.medium        High variability, burst      $30
c6i.2xlarge      Better, still shared         $250
c6i.metal        Bare metal, no hypervisor    $3,000
p4d.24xlarge     Dedicated network            $30,000+

For trading: at minimum a c5n.xlarge or c6i.xlarge with Enhanced Networking (ENA) enabled.


Common Misconceptions

Myth: “Faster instance types = lower latency.”
Reality: Instance type mainly buys CPU and bandwidth, not lower round-trip latency. A t3.micro and a p4d.24xlarge see similar network latency to external destinations; the path dominates.

Myth: “AWS Direct Connect solves all latency problems.”
Reality: Direct Connect reduces internet routing variability (~5-10ms savings) but doesn’t fix hypervisor jitter or distance.

Myth: “My cloud setup is fast enough because average latency is low.”
Reality: Averages hide tail latency. Your p99 or p99.9 is what matters for trading. One 500ms spike per minute is catastrophic.

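To make that last point concrete, here is a toy simulation with synthetic numbers (not a measurement): 2% of requests hit a roughly 500ms stall, the average barely moves, and the p99 tells the real story.

import random

random.seed(42)

# Synthetic latencies: 98% of requests near 20ms, 2% hit a ~500ms stall.
samples = ([random.gauss(20, 3) for _ in range(9800)]
           + [random.gauss(500, 50) for _ in range(200)])

samples.sort()
n = len(samples)
print(f"Avg: {sum(samples) / n:6.1f} ms")               # ~30 ms  - looks fine
print(f"P50: {samples[n // 2]:6.1f} ms")                # ~20 ms  - looks fine
print(f"P99: {samples[int(0.99 * (n - 1))]:6.1f} ms")   # ~500 ms - the real story
print(f"Max: {samples[-1]:6.1f} ms")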

When Cloud Makes Sense

Cloud is Fine For:

  • Swing trading (minutes to days)
  • Backtesting and research
  • Non-latency-sensitive strategies
  • Starting out / proving concepts

Cloud is Not Fine For:

  • Market making
  • HFT strategies
  • Arbitrage (especially cross-exchange)
  • Any strategy where you compete on speed

Honest Latency Budget

If you’re serious about cloud trading:

Fixed costs (can't optimize):
  Distance to exchange: 10-30ms
  Internet routing: 5-15ms
  TLS handshake: 5-10ms
  
Variable costs (can optimize):
  Application code: 0.1-10ms
  Network stack: 0.01-0.1ms
  
Realistic total: 25-70ms
  
Your competitor in colo: 0.1-1ms

You’re 25-700x slower. Accept it or move to colo.

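As a rough sanity check on those figures, a small sketch that sums the budget components above and compares the result to the colocated competitor (plug in your own measured numbers; the quoted 25-70ms total allows some headroom over the raw sum):

# Latency budget from the section above, in milliseconds (low, high).
budget_ms = {
    "distance_to_exchange": (10, 30),
    "internet_routing":     (5, 15),
    "tls_handshake":        (5, 10),
    "application_code":     (0.1, 10),
    "network_stack":        (0.01, 0.1),
}

cloud_low = sum(low for low, _ in budget_ms.values())
cloud_high = sum(high for _, high in budget_ms.values())
colo_low, colo_high = 0.1, 1.0   # colocated competitor, from the text above

print(f"Cloud total: {cloud_low:.1f}-{cloud_high:.1f} ms")
print(f"Speed gap:   {cloud_low / colo_high:.0f}x-{cloud_high / colo_low:.0f}x slower than colo")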

Practice Exercises

Exercise 1: Measure Your Reality

# From your trading server, measure to your exchange
while true; do
  curl -w "%{time_total}\n" -o /dev/null -s https://api.exchange.com/time
  sleep 1
done | tee latency.log

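Once latency.log has a few thousand samples, here is a minimal sketch to turn it into percentiles (assumes one time_total value per line, in seconds, as produced by the loop above):

# Summarize latency.log from Exercise 1 (one curl time_total per line, in seconds).
with open("latency.log") as f:
    samples_ms = sorted(float(line) * 1000 for line in f if line.strip())

n = len(samples_ms)
for label, q in [("P50", 0.50), ("P90", 0.90), ("P99", 0.99)]:
    print(f"{label}: {samples_ms[int(q * (n - 1))]:.1f} ms")
print(f"Max: {samples_ms[-1]:.1f} ms")
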
Exercise 2: Check for Steal Time

# Monitor for 1 hour
vmstat 1 3600 | awk 'NR>2 {print $17}' > steal.log
# Any non-zero values?

Exercise 3: Compare Instance Types

If budget allows:

  • Spin up a c6i.xlarge and a c6i.metal
  • Run the same latency test on both
  • Compare p99 latency

Key Takeaways

  1. Vendor claims measure the wrong thing - VM-to-VM ≠ to-exchange
  2. Hypervisor adds jitter - Shared infrastructure = shared variability
  3. Distance dominates - No amount of tuning fixes 10ms of physics
  4. Know your use case - Cloud works for some strategies, not others

What’s Next?

🎯 Continue learning: Trading Infrastructure First Principles

🔬 Expert version: The Sub-50µs Cloud Lie

Now you know what cloud vendors aren’t telling you. ☁️
