Chapter 1

File Systems Deep Dive

Filesystems are the primary interface for storing and accessing data, both within a single machine and across networks. For systems engineers working in production environments, understanding filesystem internals isn't academic: it is what lets you make informed decisions about performance, reliability, and day-to-day operations.

This chapter explores filesystems across three critical dimensions:

  • Taxonomy: Understanding the landscape of filesystem types and their use cases
  • Guarantees: Synchronization semantics and resiliency properties
  • Performance: How design choices impact real-world workloads

We'll go deep into implementation details, examining kernel code, protocol specifications, and operational trade-offs. By the end, you'll understand not just what different filesystems do, but why they behave the way they do, and how to debug them when things go wrong.

# Chapter Sections

Distributed and Network Filesystems

Framework for evaluating network filesystems across semantics, architecture, and failure modes. Covers NFS, Lustre, GPFS, and WekaFS with architecture diagrams and comparison tables. Includes AWS S3 as a counterexample to POSIX filesystems.

ZFS: A Modern Local Filesystem

End-to-end checksumming, copy-on-write architecture, integrated RAID, and snapshots. Understand ZFS's unique features, RAM requirements, and when to choose it over traditional filesystems.

Filesystem Hacks

Battle-tested techniques: squashfs on NFS, tmpfs for performance testing, loop devices, bind mounts, sparse files, and FUSE. Unconventional approaches that solve real problems.
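
As a small taste of the sparse-file technique mentioned above, here is a minimal sketch in Python. The helper names `create_sparse_file` and `logical_and_allocated` are hypothetical, chosen for illustration. It assumes a Linux filesystem that supports holes; on a filesystem without that support, the allocated size may simply equal the logical size.

```python
import os


def create_sparse_file(path: str, size: int) -> None:
    """Create a file whose logical size is `size` bytes without writing any data."""
    with open(path, "wb") as f:
        # ftruncate(2) extends the file; the new range is a hole that reads as zeros.
        f.truncate(size)


def logical_and_allocated(path: str) -> tuple[int, int]:
    """Return (logical size, bytes actually allocated) for `path`."""
    st = os.stat(path)
    # On Linux, st_blocks counts 512-byte units regardless of the filesystem block size.
    return st.st_size, st.st_blocks * 512
```

Comparing the two numbers returned by `logical_and_allocated` is the same check that `ls -l` versus `du` performs: a large gap between them suggests the file contains holes.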

The Linux Page Cache

How the page cache works, read-ahead mechanisms, dirty page writeback, and memory reclaim. Essential for understanding Linux filesystem performance.

Essential Linux Filesystem Semantics

Atomic rename, unlink with open file descriptors, O_TMPFILE, file locking (flock vs fcntl), and sparse files. POSIX and Linux-specific operations that reliable software depends on.
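
To make the atomic-rename idea concrete, the sketch below shows the common write-to-temp-then-rename pattern in Python. The function name `atomic_write` is hypothetical. It assumes a Linux system where a directory can be opened and passed to `fsync`, and it is an illustration of the pattern rather than a complete durability recipe for every filesystem.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so readers see the old or the new content, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # because rename is only atomic within a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # drain Python's userspace buffer
            os.fsync(tmp.fileno())  # ask the kernel to persist the file data
        os.replace(tmp_path, path)  # rename(2): swaps the directory entry atomically
    except BaseException:
        # Remove the temporary file if it was never renamed into place.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Persist the rename itself by syncing the containing directory.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Note that `tempfile.mkstemp` creates the file with mode 0600, so a caller that needs the original permissions would have to restore them before the rename.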

IO Modes: Buffered vs Direct IO

Understanding buffered IO, direct IO (O_DIRECT), synchronous IO (the O_SYNC flag and the fsync() call), and how different filesystems handle these modes.

Memory-Mapped Files

How mmap() works, relationship to page cache, synchronization semantics, and performance characteristics. (Draft section - content in progress)
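
As a preview of the mapping-and-synchronization relationship this section covers, here is a minimal Python sketch. The helper name `mmap_overwrite` is hypothetical. It assumes a shared, writable mapping of an existing non-empty file, and uses `flush()` on the mapping, which corresponds to msync(2) on Linux.

```python
import mmap
import os


def mmap_overwrite(path: str, offset: int, payload: bytes) -> None:
    """Overwrite bytes in an existing file through a shared memory mapping."""
    with open(path, "r+b") as f:
        size = os.fstat(f.fileno()).st_size
        # A mapping cannot grow the file, and an empty file cannot be mapped.
        if size == 0 or offset < 0 or offset + len(payload) > size:
            raise ValueError("write must fall inside an existing, non-empty file")
        # ACCESS_WRITE gives a shared mapping: stores go to the page cache
        # pages that back the file, not to a private copy.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as mapping:
            mapping[offset:offset + len(payload)] = payload
            mapping.flush()  # msync: request writeback of the dirty pages
```

After the call returns, an ordinary `read()` of the file sees the new bytes, because the mapping and buffered reads share the same cached pages.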

Synchronization Guarantees

Coordinating concurrent access across processes and systems. Local POSIX semantics, NFS close-to-open, and distributed locking. (Draft section - content in progress)

Resiliency Guarantees

Surviving crashes, power loss, and corruption. Journaling, copy-on-write, fsync semantics, and checksumming. (Draft section - content in progress)

Performance Characteristics

Block allocation, metadata overhead, caching strategies, IO patterns, and distributed filesystem performance. (Draft section - content in progress)

Deep Dive: ZFS recordsize

What recordsize controls, compression interaction, workload matching, performance measurement, and code walkthrough. (Draft section - content in progress)

Deep Dive: NFS Close-to-Open Consistency

Protocol sequence, edge cases, performance implications, tuning parameters, and code walkthrough. (Draft section - content in progress)

Observability and Debugging

Tool landscape, filesystem-specific debugging tools, and practical scenarios with eBPF/bpftrace examples. (Draft section - content in progress)