Linux Memory: The Physics of RAM
Why access to RAM is slow (TLB Misses), how the Kernel cheats with Slab Allocators, and the math behind the OOM Killer.
🎯 What You'll Learn
- Deconstruct the process memory map (VMAs: Virtual Memory Areas)
- Calculate the latency cost of a TLB Miss vs Page Fault
- Analyze the SLAB/SLUB allocator for kernel objects
- Predict OOM Killer behavior using `oom_score`
- Trace a Major Page Fault from CPU to Disk IO
📚 Prerequisites
Before this lesson, you should understand:
Introduction
RAM is a sparsely populated map, not a bucket of bytes. Your 16GB laptop can run processes asking for 100TB of memory. How? The Kernel lies.
Virtual Memory is the art of promising distinct memory addresses to every process while mapping them to the same physical chips, or often to nothing at all.
The Physics: Translation Lookaside Buffer (TLB)
The CPU does not know “Physical RAM”. It only knows Virtual Addresses.
Every memory access requires a translation: Virtual -> Physical.
Reading the page tables from RAM costs on the order of 100ns, because a walk of the multi-level page table takes several memory accesses.
Paying that on every access would roughly double the cost of every load and store.
The Fix: The TLB. A tiny hardware cache that remembers "Virtual Page 5 = Physical Frame 900". TLB Hit: ~0ns overhead. TLB Miss: ~100ns (Page Walk).
Physics: If your working set exceeds what the TLB can map, your program can slow down dramatically (sometimes by half), even if you have infinite RAM. This is why HugePages (2MB instead of 4KB) exist: one entry then covers 512x more address space.
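A back-of-the-envelope calculation makes the coverage argument concrete. The 1,536-entry TLB size below is an assumption (roughly a recent x86 core), not a universal constant:

```python
# Back-of-the-envelope TLB coverage. The 1,536-entry TLB size is an
# assumption (ballpark for a modern x86 core); check `cpuid` for yours.
TLB_ENTRIES = 1536

def coverage(page_size_bytes: int, entries: int = TLB_ENTRIES) -> int:
    """Bytes of address space the TLB can map without a single miss."""
    return entries * page_size_bytes

small = coverage(4 * 1024)        # 4 KiB pages
huge = coverage(2 * 1024 * 1024)  # 2 MiB HugePages

print(f"4 KiB pages: {small // (1024**2)} MiB of coverage")
print(f"2 MiB pages: {huge // (1024**3)} GiB of coverage")
```

With 4 KiB pages the whole TLB covers only a few MiB; any working set larger than that starts paying page-walk costs, which is exactly the gap HugePages close.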
Deep Dive: The SLAB Allocator
`malloc()` is general-purpose and comparatively slow: it must search for a free block of the right size.
The Kernel needs to allocate millions of identical objects (e.g., task_struct, inode).
The Solution: SLAB (and SLUB). Pre-allocated caches of specific object sizes.
- Need a `task_struct`? Pop one off the top of the `task_struct` cache's stack.
- Freeing? Just push it back on the stack.
- Near-zero fragmentation. Zero searching.
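The free-list discipline above can be sketched as a toy cache. `TaskStruct` and `SlabCache` are illustrative stand-ins; the real SLAB/SLUB allocators add per-CPU caches, slab coloring, and NUMA awareness on top of this core idea:

```python
# Toy slab cache: a preallocated pool of fixed-size objects with a
# free list used as a stack. Alloc and free are both O(1) pointer ops.
class TaskStruct:
    __slots__ = ("pid", "state")
    def __init__(self):
        self.pid = 0
        self.state = "NEW"

class SlabCache:
    def __init__(self, obj_type, capacity):
        # Pay the allocation cost once, up front.
        self._free = [obj_type() for _ in range(capacity)]

    def alloc(self):
        return self._free.pop()    # O(1): pop off the free stack

    def free(self, obj):
        self._free.append(obj)     # O(1): push back, ready for reuse

cache = SlabCache(TaskStruct, capacity=64)
t = cache.alloc()
t.pid = 1234
cache.free(t)                      # no search, no coalescing
assert cache.alloc() is t          # the same object comes straight back
```

Note the last line: a freed object is handed out again immediately, still warm in the CPU cache, which is a second (hidden) benefit of slab reuse.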
The Assassin: OOM Killer
When RAM + Swap is full, the Kernel must kill something. It doesn’t pick randomly. It calculates a score.
- Badness: driven mostly by memory usage (up to 1000 points).
- Protection: root processes get a small discount (roughly -30 points on older kernels).
- Adjustment: You can manually set `/proc/<pid>/oom_score_adj` to -1000 to become unkillable (as `sshd` often is).
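You can watch these scores directly. A minimal sketch that reads a process's current score and adjustment from `/proc` (Linux only, hence the guard):

```python
import os

def oom_info(pid="self"):
    """Read a process's OOM score and adjustment from /proc.

    Returns (oom_score, oom_score_adj), or None off-Linux / if the
    pid has vanished.
    """
    base = f"/proc/{pid}"
    try:
        with open(f"{base}/oom_score") as f:
            score = int(f.read())
        with open(f"{base}/oom_score_adj") as f:
            adj = int(f.read())
        return score, adj
    except (FileNotFoundError, PermissionError):
        return None

info = oom_info()
if info:
    score, adj = info
    print(f"oom_score={score} oom_score_adj={adj}")
```

Run it inside a memory-hungry process and you can see its score climb; write -1000 into `oom_score_adj` (as root) and the kernel will pass it over.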
Code: Inspecting the Maps
Where is your heap? Where is your stack?
`cat /proc/self/maps` tells the truth:
```
# address                 perms offset   dev   inode   pathname
00400000-00452000         r-xp  00000000 08:02 173521  /usr/bin/zsh   (Text)
00651000-00652000         r--p  00051000 08:02 173521  /usr/bin/zsh   (Data)
01cea000-01d0d000         rw-p  00000000 00:00 0       [heap]
...
7ffeeb9a0000-7ffeeb9c1000 rw-p  00000000 00:00 0       [stack]
```
Key Insight: Notice `r-xp` (Read-Execute-Private) for code, and `rw-p` for data.
You cannot write to code, and you cannot execute data (the NX Bit). This blunts classic buffer-overflow exploits: an attacker can still overflow a buffer, but the injected payload cannot be executed.
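A small parser makes the format easy to explore programmatically. `parse_maps` is a hypothetical helper, exercised here on sample lines so it runs anywhere; on Linux you can feed it `open("/proc/self/maps").read()`:

```python
# Minimal parser for /proc/<pid>/maps lines, used here to check the
# W^X property: no region should be both writable and executable.
def parse_maps(text):
    regions = []
    for line in text.splitlines():
        parts = line.split(maxsplit=5)
        if len(parts) < 5:
            continue                      # skip malformed lines
        addr, perms = parts[0], parts[1]
        start, end = (int(x, 16) for x in addr.split("-"))
        path = parts[5] if len(parts) == 6 else ""  # anonymous if empty
        regions.append({"start": start, "end": end,
                        "perms": perms, "path": path})
    return regions

sample = """00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/zsh
01cea000-01d0d000 rw-p 00000000 00:00 0 [heap]"""

for r in parse_maps(sample):
    # W^X: writable and executable must never co-occur
    assert not ("w" in r["perms"] and "x" in r["perms"])
```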
Practice Exercises
Exercise 1: The Page Fault (Beginner)
Scenario: malloc(1GB).
Task: Does the OS allocate 1GB of RAM?
No. It allocates "Virtual Promises". Physical RAM is committed only when you touch (write to) the pages, one Minor Page Fault per page. Verify this with `ps` (VSZ vs RSS).
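The VSZ-vs-RSS claim is easy to demonstrate: map a large anonymous region, then watch resident memory jump only once the pages are touched. A minimal sketch, assuming a Linux `/proc` filesystem (hence the guard):

```python
import mmap
import os

PAGE = os.sysconf("SC_PAGE_SIZE")       # usually 4096

def rss_bytes():
    """Resident set size in bytes, from /proc/self/statm (Linux only)."""
    with open("/proc/self/statm") as f:
        return int(f.read().split()[1]) * PAGE

SIZE = 64 * 1024 * 1024                 # 64 MiB of "virtual promises"

if os.path.exists("/proc/self/statm"):
    m = mmap.mmap(-1, SIZE)             # VSZ grows immediately; RSS barely moves
    before = rss_bytes()
    for off in range(0, SIZE, PAGE):    # write one byte per page -> minor faults
        m[off] = 1
    after = rss_bytes()
    print(f"RSS grew by ~{(after - before) // (1024**2)} MiB after touching")
    m.close()
```

The mapping itself is nearly free; the resident-set growth only appears in the touch loop, one minor fault at a time.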
Exercise 2: OOM Survival (Intermediate)
Task:
- Run a process.
- `echo -1000 > /proc/<pid>/oom_score_adj`
- Trigger an OOM condition (safely!).
- Confirm your process survives while others die.
Exercise 3: HugePages (Advanced)
Task: Check `/proc/meminfo` for `HugePages_Total`.
Enable HugePages and run a memory-intensive database (like Postgres). Measure the performance difference (TLB Miss reduction).
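A small helper for reading those counters; `parse_meminfo` is a hypothetical name, tested here against a sample snippet so it runs anywhere. On Linux, point it at the real file with `parse_meminfo(open("/proc/meminfo").read())`:

```python
# Parse /proc/meminfo-style "Key:   value [unit]" lines into a dict
# of integers (the kB unit, where present, is dropped).
def parse_meminfo(text):
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            info[key.strip()] = int(fields[0])
    return info

sample = """HugePages_Total:      128
HugePages_Free:       128
Hugepagesize:       2048 kB"""

stats = parse_meminfo(sample)
print(stats["HugePages_Total"], "huge pages of", stats["Hugepagesize"], "kB")
```

Comparing `HugePages_Free` before and after starting the database tells you whether it actually picked the pages up.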
Knowledge Check
- Why do HugePages improve performance?
- What is the difference between Major and Minor Page Faults?
- Why does the Kernel generally refuse to kill `sshd` during OOM?
- What is the "NX Bit"?
- Does `free()` return memory to the OS immediately?
Answers
- TLB Efficiency. One TLB entry covers 2MB instead of 4KB, so far fewer misses.
- Major: Disk I/O required (Swap/File). Minor: Memory mapping only (First access/COW).
- Safety. `sshd` lets the admin log in and fix the problem; killing it locks you out.
- No-Execute. A hardware flag that prevents code execution in Data/Stack segments.
- No. Usually `glibc` keeps the memory in the heap for reuse. It is returned to the kernel only when the top of the heap can be trimmed (via `brk`) or when a large `mmap`-backed chunk is freed (via `munmap`).
Summary
- Virtual Memory: A lie that enables isolation.
- TLB: The hardware cache that makes the lie fast.
- OOM: The grim reaper of memory leaks.