Memory-Mapped Files

[DRAFT - Section in progress]

Memory-mapped files let an application access a file's contents as if they were a region of its own memory, replacing explicit read() and write() system calls with ordinary loads and stores. The kernel handles paging transparently, bringing file content into memory on demand via page faults.

# Fundamentals

mmap() maps a file (or portion of a file) into the process's address space:

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int fd = open("datafile", O_RDWR);
// offset must be a multiple of the page size
char *addr = mmap(NULL, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
addr[0] = 42;  // Access the file via memory operations: this store modifies it
munmap(addr, length);
close(fd);

Mapping modes:

  • MAP_SHARED: Changes visible to other processes mapping the same file. Writes eventually propagate to the underlying file.
  • MAP_PRIVATE: Copy-on-write. Changes are private to this process and don't affect the underlying file or other mappings; the sketch after this list contrasts the two modes.
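
As a rough illustration of the difference, the sketch below maps the same file twice, once MAP_SHARED and once MAP_PRIVATE, and writes through each. The file name "datafile" and the assumption that it is at least one page long are placeholders for this example, not part of any real interface.

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    int fd = open("datafile", O_RDWR);           // hypothetical example file
    if (fd < 0) { perror("open"); return 1; }

    size_t len = 4096;                           // assume the file spans at least one page
    char *shr  = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,  fd, 0);
    char *priv = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (shr == MAP_FAILED || priv == MAP_FAILED) { perror("mmap"); return 1; }

    priv[0] = 'X';   // copy-on-write: only this private mapping sees the change
    shr[0]  = 'Y';   // shared: other mappings see it, and it eventually reaches the file
    printf("shared[0]=%c private[0]=%c\n", shr[0], priv[0]);

    munmap(shr, len);
    munmap(priv, len);
    close(fd);
    return 0;
}

Run against any writable file of at least one page, this should print shared[0]=Y private[0]=X, and only the 'Y' ever reaches the file on disk.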

# Relationship to Page Cache

Memory-mapped files use the page cache as backing store. When you access a mapped page:

  1. A page fault occurs (the page is not present in the process's page tables)
  2. The kernel checks whether the file page is already in the page cache
  3. If it is: map that page into the process's address space
  4. If not: issue I/O to read the page from disk, add it to the page cache, then map it

This means multiple processes mapping the same file share the same physical pages via the page cache.
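
To watch the demand-paging path in action, the sketch below uses mincore(2) (Linux-specific) to check whether the first page of an assumed example file is resident before and after it is touched. Note that the "before" answer may already be 1 if the file happens to be in the page cache from earlier activity.

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    long page = sysconf(_SC_PAGESIZE);
    int fd = open("datafile", O_RDONLY);          // hypothetical example file
    if (fd < 0) { perror("open"); return 1; }

    char *addr = mmap(NULL, page, PROT_READ, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) { perror("mmap"); return 1; }

    unsigned char vec;
    mincore(addr, page, &vec);                    // is the page-cache page resident yet?
    printf("resident before access: %d\n", vec & 1);

    volatile char c = addr[0];                    // page fault: the kernel loads the page via the page cache
    (void)c;

    mincore(addr, page, &vec);
    printf("resident after access:  %d\n", vec & 1);

    munmap(addr, page);
    close(fd);
    return 0;
}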

# Synchronization Semantics (DRAFT)

TODO: Expand this section

  • With MAP_SHARED, stores made through one mapping become visible to other processes mapping the same file, because all mappings share the same page-cache pages; the data reaches the on-disk file only when the kernel writes those pages back.
  • msync() forces dirty mapped pages to be written back to the file (see the sketch after this list).
  • Without an explicit msync(), there is no guarantee about when, or in what order, modified pages reach disk.
  • Concurrent reads and writes to overlapping regions race unless the processes synchronize themselves.
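
A minimal sketch of the explicit flush, assuming a writable "datafile" of at least one page:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    int fd = open("datafile", O_RDWR);            // hypothetical example file, at least one page long
    if (fd < 0) { perror("open"); return 1; }

    size_t len = 4096;
    char *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) { perror("mmap"); return 1; }

    memcpy(addr, "hello", 5);                     // dirties the shared page in the page cache
    if (msync(addr, len, MS_SYNC) < 0)            // block until the data reaches the file
        perror("msync");

    munmap(addr, len);
    close(fd);
    return 0;
}

MS_ASYNC schedules the writeback instead of waiting for it. Neither call is needed for other processes mapping the same file to see the change; msync() only affects when it reaches the on-disk file.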

# Performance Characteristics (DRAFT)

TODO: Expand with benchmarks and use cases

  • Avoids per-access system call overhead: once the mapping exists, repeated accesses are plain loads and stores.
  • The kernel manages paging transparently; madvise() hints can steer its readahead (see the sketch after this list).
  • Large mappings put pressure on the TLB; huge pages reduce the number of entries needed.
  • When NOT to use: one-pass sequential streaming (plain read() with kernel readahead is usually as fast and simpler) and tiny, scattered random accesses (every touch pays a page fault and pulls in a whole page).
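
A Linux-oriented sketch of steering paging with madvise(2) during a sequential scan; the file name "datafile" and the assumption that it is non-empty are placeholders for this example.

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("datafile", O_RDONLY);          // hypothetical example file, assumed non-empty
    struct stat st;
    if (fd < 0 || fstat(fd, &st) < 0) { perror("open/fstat"); return 1; }

    char *addr = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) { perror("mmap"); return 1; }

    // Hint a sequential scan: the kernel may read ahead more aggressively
    // and reclaim pages behind the scan sooner.
    madvise(addr, st.st_size, MADV_SEQUENTIAL);

    unsigned long sum = 0;
    for (off_t i = 0; i < st.st_size; i++)
        sum += (unsigned char)addr[i];            // each new page is faulted in on demand
    printf("checksum: %lu\n", sum);

    munmap(addr, st.st_size);
    close(fd);
    return 0;
}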

# Use Cases and Pitfalls (DRAFT)

TODO: Add concrete examples