The Sub-50µs Cloud Lie
Why cloud vendors' latency claims don't match reality for trading. Real measurements and the hard limits of cloud infrastructure.
🎯 What You'll Learn
- Understand why vendor latency claims are misleading
- Learn how to measure real trading latency
- Identify cloud infrastructure limitations
- Know when cloud works and when it doesn't
📚 Prerequisites
Before this lesson, you should understand:
The Marketing vs Reality Gap
Cloud vendors claim “sub-millisecond latency.” Your trading system measures 5-50ms. What’s going on?
AWS claims: "Single-digit millisecond latency"
Your measurement: 15ms to Binance
Reality: Both are "correct" - but measuring different things
This lesson exposes the gap between marketing claims and trading reality.
What You’ll Learn
By the end of this lesson, you’ll understand:
- What vendors actually measure - VM-to-VM, not your use case
- Real trading latency sources - Network, hypervisor, kernel
- How to measure properly - End-to-end, with percentiles
- When cloud makes sense - Not for HFT, but maybe for you
The Foundation: What “Latency” Actually Means
Vendors measure inter-VM latency within the same datacenter:
EC2 instance-A → EC2 instance-B (same AZ)
AWS claims: ~50-100µs
What you actually need:
Your EC2 → Internet → Exchange → Processing → Response
Reality: 5-50ms depending on exchange
Marketing latency ≠ application latency
The “Aha!” Moment
Here’s what cloud vendors won’t tell you:
The hypervisor adds 5-20µs of jitter to every network operation. You share physical hardware with other tenants. When they spike, you spike. This variability is invisible in averages but destroys your p99 latency.
Dedicated hardware doesn’t have this problem.
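To see why jitter that is invisible in an average still wrecks the tail, here is a minimal simulation (the 50µs baseline, 2% spike rate, and 450µs spike size are illustrative assumptions, not measured values):

import random
import statistics

random.seed(42)

def percentile(sorted_values, p):
    """Nearest-rank percentile of an already-sorted list."""
    idx = max(int(p / 100 * len(sorted_values)) - 1, 0)
    return sorted_values[idx]

def cloud_sample():
    latency = random.gauss(50, 5)     # same ~50µs baseline as dedicated hardware
    if random.random() < 0.02:        # 2% of operations hit a noisy-neighbor spike
        latency += 450
    return latency

dedicated = sorted(random.gauss(50, 5) for _ in range(100_000))
cloud = sorted(cloud_sample() for _ in range(100_000))

for name, samples in (("dedicated", dedicated), ("cloud", cloud)):
    print(f"{name:>9}: avg {statistics.mean(samples):6.1f}µs  "
          f"p99 {percentile(samples, 99):6.1f}µs  "
          f"max {max(samples):6.1f}µs")

# The averages differ by roughly 20%; the p99 jumps from ~62µs to ~500µs.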
Let’s See It In Action: Measuring Real Latency
Measure VM-to-VM (What AWS Claims)
# Install sockperf on two EC2 instances
sudo apt install sockperf
# Server side
sockperf server -i 0.0.0.0 -p 12345
# Client side - measure latency
sockperf ping-pong -i <server-ip> -p 12345 --pps=max -t 60
# Typical AWS result: avg 60µs, p99 150µs
Measure to Exchange (What You Actually Get)
import time
import requests

def measure_exchange_latency(url, n=100):
    """Measure end-to-end HTTP round-trip latency to an exchange endpoint."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        # Each call opens a fresh connection, so TLS handshake time is included
        requests.get(url, timeout=5)
        latencies.append((time.perf_counter() - start) * 1000)  # ms
    latencies.sort()
    print(f"Min: {latencies[0]:.1f}ms")
    print(f"Avg: {sum(latencies)/len(latencies):.1f}ms")
    print(f"P99: {latencies[max(int(n * 0.99) - 1, 0)]:.1f}ms")  # nearest-rank percentile
    print(f"Max: {latencies[-1]:.1f}ms")

# Run from EC2
measure_exchange_latency("https://api.binance.com/api/v3/time")
# Typical: Min 15ms, Avg 25ms, P99 80ms
Where Cloud Latency Comes From
| Source | Contribution | Fixable? |
|---|---|---|
| Physical distance | 1-50ms | Move to colo |
| Internet routing | 1-20ms | Pay for direct connect |
| Hypervisor overhead | 5-20µs | Bare metal instance |
| Kernel network stack | 10-50µs | Kernel tuning |
| Your application | Variable | Code optimization |
90% of your latency is location + network path. Optimizing code won’t fix this.
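The "physical distance" row is pure physics. As a rough sketch (the great-circle distances and the ~200,000 km/s speed of light in fibre are approximations; real fibre routes are longer, so real RTTs are higher):

# Light in optical fibre travels at roughly 2/3 of c, about 200 km per millisecond.
FIBRE_KM_PER_MS = 200

# Approximate great-circle distances (illustrative, not actual route lengths)
routes_km = {
    "New York -> Chicago": 1_150,
    "Virginia (us-east-1) -> London": 5_900,
    "Virginia (us-east-1) -> Tokyo": 10_900,
}

for route, km in routes_km.items():
    rtt_ms = 2 * km / FIBRE_KM_PER_MS   # best-case round trip over straight-line fibre
    print(f"{route:32s} >= {rtt_ms:5.1f} ms round trip")

# No kernel tuning or faster instance type can beat these lower bounds.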
The Noisy Neighbor Problem
Shared infrastructure means shared variability:
Normal operation:
Your latency: 50µs
Neighbor running ML training:
Your latency: 200µs (CPU steal)
Neighbor doing heavy I/O:
Your latency: 500µs (network contention)
This variability is random and unpredictable. Your p99 suffers.
Measuring CPU Steal
# Check if you're losing CPU to other tenants
# (locate the "st" column from the header instead of hard-coding its position)
vmstat 1 | awk 'NR==2 {for (i=1; i<=NF; i++) if ($i == "st") c=i} NR>2 {print "steal:", $c"%"}'
# >0% steal means other tenants are taking your CPU time
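If you would rather log steal time from Python alongside your trading metrics, here is a minimal sketch that reads /proc/stat directly (Linux only; it relies on the standard layout of the aggregate "cpu" line):

import time

def cpu_times():
    """Aggregate CPU counters from the first line of /proc/stat."""
    with open("/proc/stat") as f:
        fields = f.readline().split()[1:]   # drop the leading "cpu" label
    return [int(x) for x in fields]

def steal_percent(interval=1.0):
    """Percentage of CPU time stolen by the hypervisor over `interval` seconds."""
    before = cpu_times()
    time.sleep(interval)
    after = cpu_times()
    deltas = [a - b for a, b in zip(after, before)]
    total = sum(deltas)
    steal = deltas[7]                       # field 8 of the cpu line is "steal"
    return 100.0 * steal / total if total else 0.0

while True:                                 # simple monitor loop; Ctrl-C to stop
    print(f"steal: {steal_percent():.1f}%")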
AWS Instance Selection
| Instance Type | Latency Profile | Monthly Cost |
|---|---|---|
| t3.medium | High variability, burst | $30 |
| c6i.2xlarge | Better, still shared | $250 |
| c6i.metal | Bare metal, no hypervisor | $3,000 |
| p4d.24xlarge | Dedicated network | $30,000+ |
For trading: Minimum c5n/c6i.xlarge with Enhanced Networking.
Common Misconceptions
Myth: “Faster instance types = lower latency.”
Reality: Instance type affects CPU, not network latency. A t3.micro and p4d.24xlarge have similar network latency to external destinations.
Myth: “AWS Direct Connect solves all latency problems.”
Reality: Direct Connect reduces internet routing variability (~5-10ms savings) but doesn’t fix hypervisor jitter or distance.
Myth: “My cloud setup is fast enough because average latency is low.”
Reality: Averages hide tail latency. Your p99 or p99.9 is what matters for trading. One 500ms spike per minute is catastrophic.
When Cloud Makes Sense
Cloud is Fine For:
- Swing trading (minutes to days)
- Backtesting and research
- Non-latency-sensitive strategies
- Starting out / proving concepts
Cloud is Not Fine For:
- Market making
- HFT strategies
- Arbitrage (especially cross-exchange)
- Any strategy where you compete on speed
Honest Latency Budget
If you’re serious about cloud trading:
Fixed costs (can't optimize):
Distance to exchange: 10-30ms
Internet routing: 5-15ms
TLS handshake: 5-10ms
Variable costs (can optimize):
Application code: 0.1-10ms
Network stack: 0.01-0.1ms
Realistic total: 25-70ms
Your competitor in colo: 0.1-1ms
You’re 25-700x slower. Accept it or move to colo.
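As a back-of-the-envelope check, here is a sketch that adds up a per-hop budget and compares it to a colocated competitor (the per-hop ranges mirror the illustrative figures above and land in the same ballpark as the 25-70ms total; substitute your own measurements):

# (component, best-case ms, worst-case ms) -- illustrative figures, not measurements
budget = [
    ("Distance to exchange", 10,   30),
    ("Internet routing",      5,   15),
    ("TLS handshake",         5,   10),
    ("Application code",      0.1, 10),
    ("Network stack",         0.01, 0.1),
]

best = sum(lo for _, lo, _ in budget)
worst = sum(hi for _, _, hi in budget)
colo_best, colo_worst = 0.1, 1.0            # typical colocated competitor

print(f"Cloud total: {best:.1f}-{worst:.1f} ms")
print(f"Colo total:  {colo_best:.1f}-{colo_worst:.1f} ms")
print(f"Slowdown:    ~{best / colo_worst:.0f}x to ~{worst / colo_best:.0f}x")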
Practice Exercises
Exercise 1: Measure Your Reality
# From your trading server, measure to your exchange
while true; do
  curl -w "%{time_total}\n" -o /dev/null -s https://api.exchange.com/time
  sleep 1
done | tee latency.log
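Once latency.log has some data, you can summarize it with the same percentile treatment used earlier (a small sketch; it assumes one curl time_total value per line, in seconds):

# Summarize latency.log (one time_total value per line, in seconds)
with open("latency.log") as f:
    samples = sorted(float(line) * 1000 for line in f if line.strip())  # ms

def pct(p):
    """Nearest-rank percentile of the sorted samples."""
    return samples[max(int(p / 100 * len(samples)) - 1, 0)]

print(f"samples: {len(samples)}")
print(f"min {samples[0]:.1f}ms  p50 {pct(50):.1f}ms  p99 {pct(99):.1f}ms  max {samples[-1]:.1f}ms")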
Exercise 2: Check for Steal Time
# Monitor steal time for 1 hour (locate the "st" column from the header)
vmstat 1 3600 | awk 'NR==2 {for (i=1; i<=NF; i++) if ($i == "st") c=i} NR>2 {print $c}' > steal.log
# Any non-zero values?
Exercise 3: Compare Instance Types
If budget allows:
- Spin up c6i.xlarge and c6i.metal
- Run same latency test on both
- Compare p99 latency
Key Takeaways
- Vendor claims measure the wrong thing - VM-to-VM ≠ to-exchange
- Hypervisor adds jitter - Shared infrastructure = shared variability
- Distance dominates - No amount of tuning fixes 10ms of physics
- Know your use case - Cloud works for some strategies, not others
What’s Next?
🎯 Continue learning: Trading Infrastructure First Principles
🔬 Expert version: The Sub-50µs Cloud Lie
Now you know what cloud vendors aren’t telling you. ☁️
Questions about this lesson? Working on related infrastructure?
Let's discuss