Enhance page fault handler to support all region types and lazy allocation #413

@pbalduino

Goal

Expand the page fault handler to recognize all region types (text/data/heap/stack/mmap), distinguish between guard-page hits and legitimate faults, and populate pages appropriately for on-demand allocation. This is the foundation that enables lazy allocation and guard pages.

Current State

The page fault handler has limited capabilities:

  • Basic grow-down/grow-up only: vm_region_handle_page_fault in src/kernel/user/vm_region.c:168-195 only services stack (grow-down) and heap (grow-up) regions by allocating a single zeroed page when a user task touches an unmapped address.

  • Falls back to SIGSEGV: Any fault outside of explicit grow regions falls back to the generic SIGSEGV path in src/kernel/idt.c:176-205, killing the process instead of checking if the fault was in a valid lazy region.

  • No region type awareness: The handler doesn't recognize different region types (text, data, BSS, heap, mmap) or how to populate them differently.

  • No guard page support: Cannot distinguish between:

    • Guard page hits (should remain unmapped and kill process)
    • Protection violations (wrong permissions, should SIGSEGV)
    • Legitimate "needs a new page" events (should allocate)
  • No file-backed region support: Cannot populate pages from ELF segments or VFS-backed mappings, only knows how to zero-fill anonymous pages.

Why This Matters

This enhanced handler is the foundation that lets on-demand paging (#56) work:

  • It becomes the one place that allocates (or refuses) pages at first use
  • Enables mapping regions as reserved-but-uncommitted virtual space
  • Allows guard pages to protect against overflows
  • Supports file-backed lazy loading from ELF segments

Without this enhancement, every page outside the existing stack and heap growth paths must be eagerly allocated and mapped up front.

Definition of Done

1. Recognize All Region Types

  • Extend vm_region_handle_page_fault in src/kernel/user/vm_region.c:168-195 to handle:
    • Text regions (read/execute, file-backed)
    • Data regions (read/write, file-backed with initial data)
    • BSS regions (read/write, zero-filled anonymous)
    • Heap regions (read/write, zero-filled anonymous, grow-up)
    • Stack regions (read/write, zero-filled anonymous, grow-down)
    • Anonymous mmap regions (read/write, zero-filled)
    • File-backed mmap regions (from VFS)

2. Implement Fault Classification

  • Distinguish between different fault types:
    • Guard page hit: Fault address is in a guard region → SIGSEGV (overflow detected)
    • Protection violation: Region exists but access violates permissions → SIGSEGV
    • Lazy allocation fault: Region exists, permissions OK, page not present → allocate
    • Invalid address: No region at fault address → SIGSEGV
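The four outcomes above can be sketched as a single classification function. This is a minimal illustration, not the project's code: `region_sketch_t`, `find_region`, `classify_fault`, and the `VM_PROT_*` flags are hypothetical names, and only the `PF_*` bit positions come from the x86 page-fault error code. It also assumes no copy-on-write, so any fault on a present page is treated as a protection violation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* x86 page-fault error code bits (Intel SDM Vol 3A, Section 4.7). */
#define PF_PRESENT 0x01u /* 0 = page not present, 1 = protection violation */
#define PF_WRITE   0x02u /* access was a write */
#define PF_USER    0x04u /* access came from user mode */
#define PF_INSTR   0x10u /* access was an instruction fetch */

/* Hypothetical permission flags and region descriptor (illustration only). */
#define VM_PROT_READ  0x1u
#define VM_PROT_WRITE 0x2u
#define VM_PROT_EXEC  0x4u

typedef struct {
    uintptr_t start;    /* inclusive */
    uintptr_t end;      /* exclusive */
    uint32_t  prot;     /* VM_PROT_* mask */
    bool      is_guard; /* guard regions are never populated */
} region_sketch_t;

typedef enum {
    FAULT_INVALID_ADDRESS, /* no region covers the address  -> SIGSEGV */
    FAULT_GUARD_HIT,       /* address is in a guard region  -> SIGSEGV */
    FAULT_PROTECTION,      /* access not permitted          -> SIGSEGV */
    FAULT_LAZY_ALLOC       /* valid access, page missing    -> allocate */
} fault_class_t;

/* Linear search; a real kernel would likely use a sorted list or tree. */
static const region_sketch_t *find_region(const region_sketch_t *regions,
                                          size_t count, uintptr_t addr)
{
    assert(regions != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (addr >= regions[i].start && addr < regions[i].end)
            return &regions[i];
    }
    return NULL;
}

static fault_class_t classify_fault(const region_sketch_t *regions, size_t count,
                                    uintptr_t addr, uint32_t error_code)
{
    const region_sketch_t *r = find_region(regions, count, addr);
    if (r == NULL)
        return FAULT_INVALID_ADDRESS;
    if (r->is_guard)
        return FAULT_GUARD_HIT;
    /* Page is already mapped, so the fault is about permissions. */
    if (error_code & PF_PRESENT)
        return FAULT_PROTECTION;
    if ((error_code & PF_WRITE) && !(r->prot & VM_PROT_WRITE))
        return FAULT_PROTECTION;
    if ((error_code & PF_INSTR) && !(r->prot & VM_PROT_EXEC))
        return FAULT_PROTECTION;
    /* Neither write nor instruction fetch: a data read. */
    if (!(error_code & (PF_WRITE | PF_INSTR)) && !(r->prot & VM_PROT_READ))
        return FAULT_PROTECTION;
    return FAULT_LAZY_ALLOC;
}
```

Checking the guard flag before the permission bits keeps guard hits distinguishable in logs even when the guard region carries no permissions at all.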

3. Add Region Metadata for Lazy Loading

  • Extend vm_region_t structure to include:
    • Region type (text/data/BSS/heap/stack/mmap/guard)
    • File-backed information (VFS path, offset, length) for ELF segments
    • Flags indicating lazy allocation vs eager mapping
    • Pointer to backing store (ELF image, VFS file, or NULL for anonymous)
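One possible shape for the extended metadata is sketched below. The existing `vm_region_t` layout is not shown in this issue, so every field and name here (`vm_region_sketch_t`, `vm_backing_t`, the `VM_REGION_*` constants) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region type enum covering the types listed above. */
typedef enum {
    VM_REGION_TEXT,
    VM_REGION_DATA,
    VM_REGION_BSS,
    VM_REGION_HEAP,
    VM_REGION_STACK,
    VM_REGION_MMAP,
    VM_REGION_GUARD
} vm_region_type_t;

/* Hypothetical behavior flags. */
#define VM_REGION_FLAG_LAZY      0x1u /* populate on first touch */
#define VM_REGION_FLAG_GROW_UP   0x2u /* heap-style growth */
#define VM_REGION_FLAG_GROW_DOWN 0x4u /* stack-style growth */

/* Backing store description; store == NULL means anonymous memory. */
typedef struct {
    const void *store;  /* ELF image or VFS file object, or NULL */
    const char *path;   /* VFS path, for diagnostics; may be NULL */
    uint64_t    offset; /* file offset of the region's first byte */
    uint64_t    length; /* bytes backed by the file; the rest is zero-filled */
} vm_backing_t;

typedef struct {
    uintptr_t        start;   /* inclusive, page-aligned */
    uintptr_t        end;     /* exclusive, page-aligned */
    uint32_t         prot;    /* permission bits */
    vm_region_type_t type;
    uint32_t         flags;   /* VM_REGION_FLAG_* */
    vm_backing_t     backing;
} vm_region_sketch_t;

static bool vm_region_is_file_backed(const vm_region_sketch_t *r)
{
    assert(r != NULL);
    return r->backing.store != NULL;
}

static bool vm_region_is_lazy(const vm_region_sketch_t *r)
{
    assert(r != NULL);
    /* Guard regions are never populated, whatever their flags say. */
    return r->type != VM_REGION_GUARD && (r->flags & VM_REGION_FLAG_LAZY) != 0;
}
```

Keeping `length` separate from the region size lets a data segment whose file contents are shorter than its memory size share one region with its zero-filled tail.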

4. Implement Per-Type Page Population

  • Anonymous regions (BSS, heap, stack, anonymous mmap):

    • Allocate zeroed physical frame via pmm_alloc_page()
    • Map with appropriate permissions
    • Mark as present
  • File-backed regions (text, initialized data):

    • Allocate physical frame
    • Read data from ELF segment or VFS file at correct offset
    • Map with appropriate permissions (e.g., R-X for text, RW- for data)
    • Cache loaded pages (future enhancement)
  • Guard pages:

    • Do NOT allocate
    • Trigger SIGSEGV to kill process
    • Log the overflow for debugging
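The population rules above could look roughly like the following user-space sketch. `calloc` stands in for a zeroed frame from `pmm_alloc_page()` (whose real signature is not shown in this issue), an in-memory byte array stands in for the ELF image or VFS file, and `populate_region_t` and `populate_page` are hypothetical names. Mapping the frame into the page tables is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096u

typedef enum { BACKING_ANON, BACKING_FILE, BACKING_GUARD } backing_kind_t;

/* Hypothetical, simplified region descriptor for this sketch. */
typedef struct {
    uintptr_t      start;       /* page-aligned region start */
    backing_kind_t kind;
    const uint8_t *file_data;   /* stand-in for the ELF image or VFS file */
    size_t         file_offset; /* offset of the region start in file_data */
    size_t         file_length; /* bytes backed by the file */
} populate_region_t;

/* Stand-in for a zeroed physical frame; a kernel would use its PMM here. */
static uint8_t *alloc_zeroed_frame(void)
{
    return calloc(1, PAGE_SIZE);
}

/* Returns the populated frame, or NULL for guard regions and on OOM. */
static uint8_t *populate_page(const populate_region_t *r, uintptr_t fault_addr)
{
    assert(r != NULL);
    assert((r->start & (PAGE_SIZE - 1)) == 0);
    assert(fault_addr >= r->start);

    if (r->kind == BACKING_GUARD)
        return NULL; /* never allocate for a guard page */

    uint8_t *frame = alloc_zeroed_frame();
    if (frame == NULL)
        return NULL; /* out of memory: caller decides how to fail */

    if (r->kind == BACKING_FILE) {
        /* Work on the page containing the fault, not the raw address. */
        uintptr_t page = fault_addr & ~(uintptr_t)(PAGE_SIZE - 1);
        size_t region_off = (size_t)(page - r->start);
        if (region_off < r->file_length) {
            size_t n = r->file_length - region_off;
            if (n > PAGE_SIZE)
                n = PAGE_SIZE;
            /* Bytes past the file-backed length stay zero. */
            memcpy(frame, r->file_data + r->file_offset + region_off, n);
        }
    }
    return frame;
}
```

Starting from a zeroed frame in every case means a partially file-backed page (for example the last page of initialized data) gets its zero-filled tail without a separate code path.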

5. Update Generic Page Fault Path

  • Modify src/kernel/idt.c:176-205 to:
    • Extract fault address from CR2
    • Identify faulting process
    • Call enhanced vm_region_handle_page_fault
    • Only fall back to SIGSEGV if handler returns "unhandled"
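The control flow of the updated generic path might be sketched as below. The real handler in src/kernel/idt.c is not shown in this issue, so `page_fault_dispatch`, `vm_fault_result_t`, and the stub handler are hypothetical. In the kernel the fault address would come from CR2 (for example via inline assembly), which cannot run in user space, so this sketch takes it as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for the region-level handler. */
typedef enum { VM_FAULT_HANDLED, VM_FAULT_UNHANDLED } vm_fault_result_t;

/* What the generic path should do once the region handler returns. */
typedef enum { PF_ACTION_RESUME, PF_ACTION_SIGSEGV } pf_action_t;

/* Hypothetical signature for the enhanced region fault handler. */
typedef vm_fault_result_t (*region_fault_fn)(int pid, uintptr_t fault_addr,
                                             uint32_t error_code);

/*
 * Generic path: ask the region handler first and fall back to SIGSEGV
 * only when it reports the fault as unhandled.
 */
static pf_action_t page_fault_dispatch(int pid, uintptr_t fault_addr,
                                       uint32_t error_code,
                                       region_fault_fn handler)
{
    assert(handler != NULL);
    if (handler(pid, fault_addr, error_code) == VM_FAULT_HANDLED)
        return PF_ACTION_RESUME; /* return from the exception and retry */
    return PF_ACTION_SIGSEGV;    /* guard hit, bad permissions, or no region */
}

/* Stub handler for demonstration: one lazy region at 0x600000-0x601fff. */
static vm_fault_result_t stub_region_handler(int pid, uintptr_t fault_addr,
                                             uint32_t error_code)
{
    (void)pid;
    (void)error_code;
    if (fault_addr >= 0x600000u && fault_addr < 0x602000u)
        return VM_FAULT_HANDLED;
    return VM_FAULT_UNHANDLED;
}
```

Passing the handler as a function pointer is only a convenience for testing the flow in isolation; a direct call to `vm_region_handle_page_fault` would serve the same purpose in the kernel.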

6. Add Comprehensive Logging and Diagnostics

  • Log page fault events with:
    • Fault address and type (read/write/execute)
    • Matching region (if any)
    • Handler decision (allocated/rejected/guard hit)
    • Process PID and instruction pointer
  • Add counters for:
    • Lazy allocations per region type
    • Guard page hits
    • Protection violations
    • Total page faults handled
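A rough sketch of the counters and a log line follows. The names (`pf_stats_t`, `pf_stats_record`, `pf_format_log`) and the log format are invented for illustration, and `snprintf` stands in for whatever logging facility the kernel provides. For brevity this sketch counts every rejection as a protection violation; a fuller version would count invalid addresses separately.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PF_WRITE 0x02u /* x86 error code: access was a write */
#define PF_INSTR 0x10u /* x86 error code: access was an instruction fetch */

typedef enum { RK_TEXT, RK_DATA, RK_BSS, RK_HEAP, RK_STACK, RK_MMAP, RK_COUNT } region_kind_t;
typedef enum { PF_ALLOCATED, PF_REJECTED, PF_GUARD_HIT } pf_decision_t;

typedef struct {
    uint64_t lazy_allocs[RK_COUNT]; /* lazy allocations per region type */
    uint64_t guard_hits;
    uint64_t protection_violations;
    uint64_t total_faults;          /* every fault seen by the handler */
} pf_stats_t;

static void pf_stats_record(pf_stats_t *s, region_kind_t kind, pf_decision_t d)
{
    assert(s != NULL && kind < RK_COUNT);
    s->total_faults++;
    switch (d) {
    case PF_ALLOCATED: s->lazy_allocs[kind]++;     break;
    case PF_GUARD_HIT: s->guard_hits++;            break;
    case PF_REJECTED:  s->protection_violations++; break;
    }
}

static const char *access_name(uint32_t error_code)
{
    if (error_code & PF_INSTR) return "execute";
    if (error_code & PF_WRITE) return "write";
    return "read";
}

static const char *decision_name(pf_decision_t d)
{
    switch (d) {
    case PF_ALLOCATED: return "allocated";
    case PF_REJECTED:  return "rejected";
    case PF_GUARD_HIT: return "guard-hit";
    }
    return "unknown";
}

/* Formats one log line; returns snprintf's result. */
static int pf_format_log(char *buf, size_t len, int pid, uintptr_t rip,
                         uintptr_t addr, uint32_t error_code, pf_decision_t d)
{
    return snprintf(buf, len,
                    "pf pid=%d rip=0x%" PRIxPTR " addr=0x%" PRIxPTR
                    " access=%s decision=%s",
                    pid, rip, addr, access_name(error_code), decision_name(d));
}
```

Plain counters like these are cheap enough to leave enabled when the per-fault log line is compiled out or rate-limited.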

7. Error Handling and Edge Cases

  • Handle allocation failures gracefully (OOM)
  • Round the fault address down to a page boundary before mapping
  • Handle faults at region boundaries
  • Prevent infinite fault loops
  • Ensure thread safety for SMP (future)

Implementation Phases

Phase 1: Foundation (High Priority)

  • Extend vm_region_t with region type and lazy allocation metadata
  • Implement fault classification logic
  • Add anonymous region handling (BSS, heap, stack)

Phase 2: File-Backed Support (Medium Priority)

  • Add ELF segment metadata to regions
  • Implement file-backed page loading from ELF image
  • Support loading from VFS for file-backed mmap

Phase 3: Guard Pages (High Priority for Security)

  • Implement guard page detection
  • Add guard regions between canonical address space areas
  • Log overflow attempts

Phase 4: Optimization (Low Priority)

  • Add page cache for file-backed pages
  • Optimize fault path for common cases
  • Reduce logging overhead in production

Example Scenarios

Scenario 1: Stack Growth (Already Works)

User program: pushq %rax
Fault: Address 0x7fffffffe000 (just below stack)
Handler: Recognize stack region with grow-down flag
         Allocate zeroed page
         Map at 0x7fffffffe000
Result: Execution continues

Scenario 2: Lazy BSS Allocation (New)

User program: movl $0, global_bss_var
Fault: Address 0x600040 in BSS region
Handler: Recognize BSS region (anonymous, RW-)
         Allocate zeroed page
         Map at 0x600000 (page-aligned)
Result: Execution continues

Scenario 3: Guard Page Hit (New)

User program: Stack overflow, writing into the guard region
Fault: Address 0x7ffffff00000 (in guard region)
Handler: Recognize guard page
         Log overflow: "Stack overflow at 0x7ffffff00000"
         Return "unhandled"
Result: Process receives SIGSEGV and terminates

Scenario 4: File-Backed Text Page (New)

User program: Jump to unmapped function
Fault: Address 0x400500 in text region
Handler: Recognize text region (R-X, file-backed)
         Allocate physical frame
         Read the page containing 0x400500 from the ELF (the file offset that backs page 0x400000)
         Map at 0x400000 (page-aligned) with R-X permissions
Result: Execution continues

Files to Create/Modify

  • src/kernel/user/vm_region.c:168-195 - Enhance vm_region_handle_page_fault
  • src/kernel/user/vm_region.h - Extend vm_region_t structure
  • src/kernel/idt.c:176-205 - Update generic page fault handler
  • include/kernel/vm_region.h - Add region type enums
  • src/kernel/user/vm_fault.c (new) - Optional: separate fault handling logic
  • docs/architecture/page_fault_handler.md (new) - Document handler behavior

Success Criteria

  • Page faults in lazy regions allocate pages instead of crashing
  • Guard pages correctly trigger SIGSEGV
  • File-backed regions can be populated from ELF/VFS
  • All region types (text/data/BSS/heap/stack/mmap) are supported
  • Comprehensive logging shows fault handling decisions
  • No regression in existing stack/heap fault handling

Dependencies

Blocks

Related Issues

This is the core page fault handling infrastructure that enables on-demand paging and guard pages.

References

  • Linux kernel: handle_page_fault(), do_anonymous_page(), do_fault()
  • xv6: uvmalloc() and trap handling
  • Page fault error codes: Intel SDM Vol 3A, Section 4.7

Metadata

Labels

enhancement: New feature or request
security: Security-related issues and vulnerabilities
