Memory-Mapped Files
[DRAFT - Section in progress]
Memory-mapped files allow applications to access files as if they were regions of memory, eliminating explicit read/write system calls. The kernel handles paging transparently, bringing file content into memory on demand via page faults.
# Fundamentals
mmap() maps a file (or portion of a file) into the process's address space:
// Requires <fcntl.h>, <sys/mman.h>, <unistd.h>
int fd = open("datafile", O_RDWR);
char *addr = mmap(NULL, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
if (addr == MAP_FAILED) { /* handle error */ }
// Access file via memory operations
addr[0] = 42;                  // Modifies the file (MAP_SHARED)
munmap(addr, length);
close(fd);
Mapping modes:
- MAP_SHARED: Changes are visible to other processes mapping the same file, and writes eventually propagate to the underlying file.
- MAP_PRIVATE: Copy-on-write. Changes are private to this process and don't affect the file or other mappings (see the sketch below).
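To make the copy-on-write behavior concrete, here is a minimal sketch (the file name demo.dat is a placeholder, the file is assumed to exist and be non-empty, and error handling is omitted) that writes through a MAP_PRIVATE mapping and then reads the file directly to show the on-disk contents are unchanged:
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    // Hypothetical test file; assumed to exist and contain at least 1 byte.
    int fd = open("demo.dat", O_RDWR);
    char *priv = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

    char before = priv[0];
    priv[0] = 'X';              // Copy-on-write: only this process's copy changes

    char on_disk;
    pread(fd, &on_disk, 1, 0);  // Read byte 0 from the file itself
    printf("mapping sees '%c', file still holds '%c' (was '%c')\n",
           priv[0], on_disk, before);

    munmap(priv, 4096);
    close(fd);
    return 0;
}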
# Relationship to Page Cache
Memory-mapped files use the page cache as backing store. When you access a mapped page:
- A page fault occurs (the page is not yet present in the process's page tables)
- The kernel checks whether the file page is already in the page cache
  - If yes: it maps the existing page into the process's address space
  - If no: it issues I/O to read the page from disk, adds it to the page cache, then maps it
This means multiple processes mapping the same file share the same physical pages via the page cache.
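As an illustrative sketch of that sharing (the file name shared.dat is a placeholder, the file is assumed to exist and be non-empty, and error handling is omitted), the program below maps a file MAP_SHARED in a parent and, independently, in a forked child; the child's store is visible to the parent because both mappings point at the same page-cache page:
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int fd = open("shared.dat", O_RDWR);
    char *map = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    if (fork() == 0) {
        // The child maps the same file on its own; it receives the same
        // physical page from the page cache, so its write is shared.
        int cfd = open("shared.dat", O_RDWR);
        char *cmap = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, cfd, 0);
        cmap[0] = '!';
        munmap(cmap, 4096);
        close(cfd);
        _exit(0);
    }
    wait(NULL);                             // Let the child finish first

    printf("parent reads '%c'\n", map[0]);  // Prints '!'

    munmap(map, 4096);
    close(fd);
    return 0;
}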
# Synchronization Semantics (DRAFT)
TODO: Expand this section
- Writes through a MAP_SHARED mapping become visible to other processes mapping the same file as soon as they reach the shared page-cache page; propagation to disk happens later via writeback
- msync() forces modified pages to be flushed to the underlying file (a sketch follows this list)
- Without an explicit msync()/fsync(), there are no guarantees about when, or in what order, modified pages reach disk
- Concurrent reads and writes to overlapping regions can race; the application must supply its own synchronization
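A minimal sketch of the explicit flush, assuming a placeholder file named datafile that exists and is non-empty (error handling omitted):
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    int fd = open("datafile", O_RDWR);
    char *addr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    addr[0] = 'x';                // Dirties the page in the page cache

    // MS_SYNC blocks until the modified pages have been written back to the
    // file; MS_ASYNC only schedules the writeback and returns immediately.
    msync(addr, 4096, MS_SYNC);

    munmap(addr, 4096);
    close(fd);
    return 0;
}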
# Performance Characteristics (DRAFT)
TODO: Expand with benchmarks and use cases
- Avoids per-access read()/write() system call overhead once pages have been faulted in
- The kernel manages paging transparently; access-pattern hints can be given with madvise() (see the sketch after this list)
- Large mappings create TLB pressure; huge pages reduce the number of TLB entries needed
- When NOT to use: one-pass sequential streaming (plain read() with readahead is usually as fast and simpler) and small, scattered random accesses, where every touch costs a full page fault and pulls in an entire page
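As a sketch of how such hints look in practice (the file name datafile and the sequential access pattern are assumptions made for illustration; error handling is omitted), madvise() lets the application tell the kernel how it intends to touch a mapping:
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("datafile", O_RDONLY);   // Placeholder file; assumed non-empty
    struct stat st;
    fstat(fd, &st);
    size_t len = (size_t)st.st_size;

    char *addr = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);

    // Hint that the mapping will be read front to back: the kernel can read
    // ahead aggressively and reclaim pages behind the access point sooner.
    madvise(addr, len, MADV_SEQUENTIAL);
    // For scattered lookups, MADV_RANDOM disables readahead instead:
    //   madvise(addr, len, MADV_RANDOM);

    // Touch one byte per page, front to back (placeholder for real work).
    volatile char sink = 0;
    for (size_t i = 0; i < len; i += 4096)
        sink ^= addr[i];

    munmap(addr, len);
    close(fd);
    return 0;
}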
# Use Cases and Pitfalls (DRAFT)
TODO: Add concrete examples