OSTEP Memory Virtualization Notes
These are my notes on the Memory Virtualization section of the book Operating Systems: Three Easy Pieces.
I spent a week reading through the section and got a lot out of it.
Ch12 A Dialogue on Memory Virtualization
It’s quite fun to see a teacher-student dialogue in a textbook. The chapter introduces memory virtualization through an exchange between professor and student, which kind of reminds me of the Analects of Confucius.
Ch13 The Abstraction: Address Spaces
Programmers want things to be as simple as possible. When we write C and call malloc(), we get a chunk of memory and start using it directly — no need to worry about how that memory came to be, how it’s stored, or how it’s reclaimed. The OS abstracts away the real physical memory and gives us a convenient, easy-to-use virtual address space.
Ch14 Interlude: Memory API
This chapter discusses the basic implementation of malloc() and free().
malloc() takes one argument — the amount of memory to allocate — and returns a pointer to the allocated memory. free() takes a pointer and releases the memory.
I used to wonder why free() only needs a pointer but not a size. The reason is that malloc() records the size of the allocation right next to the memory block it returns, so free() can figure out how much to release just from the pointer.
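One common way allocators pull this off — a toy sketch, not how any particular libc actually does it — is to stash the size in a small header just before the pointer handed back; `my_malloc`/`my_free` are invented names:

```c
#include <stdlib.h>

/* Toy sketch: record the request size in a hidden header right
 * before the returned block, so "free" can recover the size
 * from the pointer alone. */
typedef struct { size_t size; } header_t;

void *my_malloc(size_t size) {
    header_t *h = malloc(sizeof(header_t) + size);
    if (h == NULL) return NULL;
    h->size = size;           /* record the size in the hidden header */
    return (void *)(h + 1);   /* hand out the memory just past it */
}

size_t my_free(void *ptr) {   /* returns the size it recovered */
    header_t *h = (header_t *)ptr - 1;
    size_t size = h->size;
    free(h);
    return size;
}
```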
Ch15 Mechanism: Address Translation
To effectively virtualize memory, the OS must translate virtual addresses into physical addresses. This process is called address translation.
This chapter explains dynamic relocation, which uses two registers:
- base register: holds the physical address where the process's address space begins
- bounds register: holds the size (limit) of the address space
Under this scheme, the physical address equals virtual address + base register.
The hardware’s Memory Management Unit (MMU) handles the translation from virtual to physical addresses. The OS is responsible for allocating space to each process (setting base and bounds registers) and raising exceptions when a process uses an illegal address.
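The check-then-add logic the MMU performs can be sketched in a few lines of C (the struct and register values here are made up for illustration):

```c
#include <stdbool.h>
#include <stdint.h>

/* Dynamic relocation: physical = virtual + base, after a bounds
 * check. A failed check models the hardware raising an exception. */
typedef struct { uint32_t base, bounds; } mmu_t;

bool translate(const mmu_t *mmu, uint32_t vaddr, uint32_t *paddr) {
    if (vaddr >= mmu->bounds) return false; /* illegal address: trap */
    *paddr = vaddr + mmu->base;
    return true;
}
```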
Dynamic relocation has a limitation: the whole address space must be placed in memory as one block, so the unused region between the heap and the stack is allocated but wasted — this is Internal Fragmentation.
Ch16 Segmentation
To mitigate Internal Fragmentation, the OS introduced Segmentation.
Segmentation divides a program into Code, Heap, and Stack segments. The high-order bits of each virtual address indicate the segment number, while the low-order bits indicate the offset within the segment.
There are still problems here:
- Because segments are variable in size, memory gets divided into uneven blocks, leading to External Fragmentation.
- Segmentation still isn't flexible enough. If you have a large but sparse Heap, it takes up a lot of memory space that is effectively wasted.
Ch17 Free-Space Management
This chapter discusses how to manage free memory space.
We use a linked list to track free memory blocks, where each node represents a free block and stores its size plus a pointer to the next free block.
When allocating memory, we find a “suitable” block from the list, split it, and update the list. When freeing memory, we add the block back to the list and consider coalescing it with adjacent free blocks.
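A minimal first-fit search over such a list might look like the sketch below (splitting and coalescing are omitted to keep it short; `node_t` and `first_fit` are names invented here):

```c
#include <stddef.h>

/* A toy free list: each node records a free block's size and a
 * pointer to the next free block. */
typedef struct node {
    size_t size;
    struct node *next;
} node_t;

/* First-fit: walk the list and return the first block that is
 * large enough for the request. */
node_t *first_fit(node_t *head, size_t want) {
    for (node_t *n = head; n != NULL; n = n->next)
        if (n->size >= want)
            return n;
    return NULL; /* no free block is big enough */
}
```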
There are several strategies for finding a “suitable” block: First-Fit, Best-Fit, Worst-Fit. In practice, a widely-used strategy is the Buddy Algorithm.
The Buddy Algorithm’s core idea is to view memory as blocks of power-of-two sizes. When a request comes in, find the smallest power-of-two block that’s large enough. If there’s no exact match, split a larger block into two equal-sized buddies and allocate one. When freeing, if two adjacent buddy blocks are both free, merge them back into a larger block.
(Honestly, this feels a lot like the segment tree data structure commonly used in competitive programming.)
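The two core buddy operations — rounding a request up to a power of two, and locating a block's buddy — can be sketched as follows (helper names are my own):

```c
#include <stdint.h>

/* Buddy allocation serves each request with the smallest
 * power-of-two block that fits. */
uint32_t round_up_pow2(uint32_t n) {
    uint32_t size = 1;
    while (size < n) size <<= 1;
    return size;
}

/* Address of the buddy of the block at `addr` with size `size`
 * (size a power of two, addr aligned to it): because buddies are
 * equal-sized neighbors, flipping the size bit finds the partner. */
uint32_t buddy_of(uint32_t addr, uint32_t size) {
    return addr ^ size;
}
```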
Ch18 Paging: Introduction
To address most memory management problems, there are generally two approaches. One is to split things into variable-sized chunks, as with Segmentation. The other is to split things into fixed-size chunks — this is Paging.
A virtual address under Paging is divided into two parts:
- Virtual Page Number (VPN): identifies the page
- Offset: the position within the page; the number of offset bits is log2 of the page size (e.g., 12 bits for a 4KB page)
The VPN-to-PTE (Page Table Entry) translation process:
VPN = (VirtualAddress & VPN_MASK) >> PAGE_SHIFT
PTEAddr = PageTableBaseRegister + (VPN * sizeof(PTE))
PageTableBaseRegister (PTBR) is the page table base register.
The physical address is computed as:
offset = VirtualAddress & OFFSET_MASK
PhysAddr = (PFN << PAGE_SHIFT) | offset
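Putting the two formulas together for 4KB pages (so PAGE_SHIFT = 12), and assuming the PFN has already been looked up from the page table:

```c
#include <stdint.h>

/* 4KB pages: 12 offset bits, so in a 32-bit virtual address the
 * top 20 bits are the VPN. Mirrors the formulas above. */
#define PAGE_SHIFT  12
#define OFFSET_MASK 0xFFFu

uint32_t vpn_of(uint32_t vaddr)    { return vaddr >> PAGE_SHIFT; }
uint32_t offset_of(uint32_t vaddr) { return vaddr & OFFSET_MASK; }

/* Combine the looked-up physical frame number with the offset. */
uint32_t phys_addr(uint32_t pfn, uint32_t offset) {
    return (pfn << PAGE_SHIFT) | offset;
}
```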
But this design has many performance issues.
Ch19 Paging: Faster Translation (TLBs)
“When we want to make things fast, the OS usually needs some help. And help often comes from the OS’s old friend: the hardware.”
To accelerate virtual-to-physical address translation, hardware introduced the Translation Lookaside Buffer (TLB), which is part of the chip’s Memory Management Unit (MMU). You can think of the TLB as an address-translation cache.
TLB Control Flow Algorithm:
VPN = (VirtualAddress & VPN_MASK) >> PAGE_SHIFT
(Success, TlbEntry) = TLB_Lookup(VPN)
if (Success == True)
if (CanAccess(TlbEntry.ProtectBits) == True)
Offset = VirtualAddress & OFFSET_MASK
PhyAddr = (TlbEntry.PFN << PAGE_SHIFT) | Offset
Register = AccessMemory(PhyAddr)
else
RaiseException(PROTECTION_FAULT)
else
PTEAddr = PTBR + (VPN * sizeof(PTE))
PTE = AccessMemory(PTEAddr)
if (PTE.Valid == False)
RaiseException(SEGMENTATION_FAULT)
else if (CanAccess(PTE.ProtectBits) == False)
RaiseException(PROTECTION_FAULT)
else
TLB_Insert(VPN, PTE.PFN, PTE.ProtectBits)
RetryInstruction()
TLBs are great, but who handles a TLB miss?
There are two approaches:
- Hardware
- Software (the OS)
Historically, CISC architectures typically had the hardware handle it (a hardware-managed TLB, with the hardware walking the page table itself). More modern RISC architectures usually let software handle it: on a TLB miss, the hardware raises an exception, switches to kernel mode, and jumps to a trap handler, which resolves the miss.
A TLB_Entry looks like:
VPN | PFN | other bits
The other bits typically include a valid bit (is this translation entry valid?) and protection bits (permissions — e.g., read and execute for a code page, read and write for a heap page).
When a context switch occurs, the TLB needs to be updated. There are several ways to handle this:
- Simplest: flush all TLB entries
- Add hardware support to tag each TLB entry with an Address Space Identifier (ASID), so entries from different processes can coexist.
Ch20 Paging: Smaller Tables
Consider a 32-bit address space with 4KB pages and 4-byte page table entries.
Since pages are 4KB, the offset takes 12 bits, leaving 20 bits of VPN — so a linear page table needs 2^20 entries. Each entry is 4 bytes, so the page table size is 2^20 × 4B = 4MB. Since every process needs its own page table, a system with 1000 processes would need 1000 × 4MB ≈ 4GB just for page tables. That's enormous memory overhead.
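The same arithmetic as a small helper (the function name is mine):

```c
#include <stdint.h>

/* Size in bytes of a linear page table: one entry per virtual
 * page, i.e. 2^(addr_bits - page_shift) entries of pte_bytes each. */
uint64_t page_table_bytes(unsigned addr_bits, unsigned page_shift,
                          unsigned pte_bytes) {
    uint64_t entries = 1ull << (addr_bits - page_shift);
    return entries * pte_bytes;
}
```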
Solution 1: Larger pages.
- Pros: fewer page table entries, smaller page tables.
- Cons: internal fragmentation gets worse.
Solution 2: Multi-level page tables. Organizing page tables in a tree structure reduces the table size.
For a two-level page table, we have a Page Directory whose entries point to pages of the page table. A virtual address takes the form PD index | PT index | offset: the first part indexes into the Page Directory (each entry is a PDE), the second into a page of the Page Table (each entry is a PTE), and the offset locates the byte within the page.
First, use the Page Directory to find the relevant page table; then use the Page Table to find the physical page. This is essentially a tree structure, and it’s easy to generalize to multi-level page tables. This tree structure significantly reduces the memory cost of storing page tables.
- Pros: dramatically reduces page table memory overhead.
- Cons: increases the number of memory accesses (e.g., a two-level table needs two memory accesses, though TLB can mitigate this).
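As an illustration, assuming the classic x86-style split of 10 + 10 + 12 bits (the actual split depends on the architecture), the three fields can be extracted like this:

```c
#include <stdint.h>

/* Assumed two-level layout for a 32-bit address: 10 bits of
 * page-directory index, 10 bits of page-table index, 12 bits
 * of in-page offset. */
#define OFFSET_BITS 12
#define INDEX_BITS  10

uint32_t pd_index(uint32_t vaddr) {
    return vaddr >> (OFFSET_BITS + INDEX_BITS);        /* top 10 bits */
}
uint32_t pt_index(uint32_t vaddr) {
    return (vaddr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
}
uint32_t page_offset(uint32_t vaddr) {
    return vaddr & ((1u << OFFSET_BITS) - 1);
}
```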
Ch21 Beyond Physical Memory: Mechanisms
The problem this chapter addresses: how can the OS use a large but slow device to transparently provide the illusion of an even larger virtual address space? The large but slow device here is the disk (if disks were faster than memory, why wouldn’t we just use disks as memory?).
Because physical address space is limited, when the OS runs large programs, physical memory may run out. We need to move some pages to Swap Space. Swap space is a large disk region used to store pages that have been swapped out.
In each page table entry, we store a Present Bit, indicating whether the page is currently in physical memory. When a process accesses a page whose Present Bit is 0, a page fault is triggered. The OS’s page fault handler then loads the relevant data from swap space into physical memory.
Ch22 Beyond Physical Memory: Policies
This chapter focuses on cache management strategies.
The book presents the optimal policy, proposed by Belady: always evict the page that will be accessed furthest in the future. But this strategy is impractical in reality because we can’t actually know which page will be accessed furthest in the future.
A simple strategy is FIFO (First-In, First-Out). FIFO’s biggest advantage is its extremely simple implementation, but the downside is that such a naive policy can be tripped up by corner cases.
Random Replacement avoids corner-case issues but sometimes performs poorly.
A widely-used strategy is Least Recently Used (LRU): evict the page that hasn’t been accessed for the longest time. However, implementing this precisely is complex — traversing all pages to check their access times would be prohibitively expensive. We use an approximate strategy instead.
Here the OS needs help from its old friend the hardware again. We introduce the use bit (also called reference bit), which indicates whether a page has been accessed recently. If accessed, it’s set to 1; when to reset it to 0 depends on the OS policy. A simple and practical strategy using this is the Clock Algorithm.
Specifically, pages are organized in a circular linked list, each with a use bit. A pointer traverses the list. When we need to evict a page, we check the use bit of the page the pointer currently points to. If it’s 1, we set it to 0 and move the pointer to the next page. If it’s 0, we evict the current page. This gives us an approximate LRU.
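One sweep of the clock hand can be sketched over a fixed-size array of use bits standing in for the circular list (the function name and representation are my own; real kernels track much more state per frame):

```c
#include <stdbool.h>

/* Clock eviction: advance the hand, clearing use bits of
 * recently-used pages, until a page with use bit 0 is found.
 * Returns the index of the victim page. */
int clock_evict(bool use[], int npages, int *hand) {
    for (;;) {
        if (use[*hand]) {              /* recently used: spare it */
            use[*hand] = false;        /* ...but clear its bit */
            *hand = (*hand + 1) % npages;
        } else {
            int victim = *hand;        /* not used since last sweep */
            *hand = (*hand + 1) % npages;
            return victim;
        }
    }
}
```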
Ch23 Complete Virtual Memory Systems
This chapter discusses how virtual memory is implemented in real operating systems. (Though it felt a bit too advanced for me — a lot of it went over my head.)
The book mainly covers two systems: VAX/VMS and Linux.
VAX/VMS
The VAX-11 uses a 32-bit virtual address space with 512-byte pages, combining segmentation and paging.
Memory layout: divided into P0 (user program and heap), P1 (stack), and S (system).
Page replacement policy:
- No hardware reference bit.
- Uses a segmented FIFO replacement policy (per-process FIFO, with global second-chance lists to compensate for its simplicity).
- Uses clustering to improve disk I/O efficiency.
Loading optimizations:
- Demand Zeroing
- Copy-On-Write
Both are common lazy allocation strategies.
Linux
Memory layout: typical 32-bit Linux divides the address space into user space and kernel space (above 0xC0000000). Kernel logical addresses are directly mapped to physical memory; kernel virtual addresses (e.g., the vmalloc region) are not.
Page tables: four-level page tables. Linux defaults to 4KB pages, but supports huge pages for applications like databases — this requires user configuration.
Other advanced designs and security discussions were genuinely beyond me.
Ch24 Summary Dialogue on Memory Virtualization
Through another teacher-student dialogue, the chapter summarizes the entire Memory Virtualization section.
Here’s my own understanding of the whole process:
The OS wants to make memory easier for programmers to use, so it abstracts physical memory into a virtual address space.
Virtual addresses must be mapped to physical addresses — this mapping process is called address translation.
The two fundamental ways to manage virtual addresses are: segmentation and paging.
Segmentation: divide a program into multiple segments, each with different permissions and sizes.
Paging: divide memory into pages, all of equal size.
Since page tables incur significant memory overhead, multi-level page tables are used to reduce this cost.
Since accessing page tables requires multiple memory accesses, the TLB is used to accelerate address translation.
But caches always face misses, so replacement policies are needed. These include: FIFO, Random, LRU (practically implemented via the Clock Algorithm).
