[SCRUM-7]: Add bitmap page allocator with double-free detection.#13
[SCRUM-7]: Add bitmap page allocator with double-free detection.#13carlitodavis wants to merge 17 commits intomainfrom
Conversation
Remove placeholder for multiboot_info structure.
LordHerdier
left a comment
There was a problem hiding this comment.
PR Review — [SCRUM-7] Bitmap page allocator
Acceptance criteria are met, but there is one critical bug that would cause memory corruption in practice.
What's correct
- Bitmap set/clear/test logic is correct
align_up_uintptris correct- Iteration idiom
cur += entry->size + sizeof(entry->size)is right MULTIBOOT_MEMORY_AVAILABLE = 1constant is a good addition (addresses an issue from the PR #11 review)free_pagevalidates alignment and range before operating
Critical: bitmap covers already-in-use pages
page_alloc_init sets managed_base to the aligned start of the first usable region above 1 MiB, which is 0x100000. But the kernel image and the kmalloc heap both live in that same region: the serial output shows Allocator base: 00209000, meaning the heap extends to at least 0x209000. Every page from 0x100000 to 0x208FFF is occupied, yet the bitmap marks them all as free. The first alloc_page() call returns 0x100000, which is inside the kernel image. Writing to any of these pages corrupts live kernel/heap data.
The fix is to start managed_base at align_up(end_of_kmalloc_heap, PAGE_SIZE), or to pre-mark kernel-occupied pages as used in the bitmap after zeroing it.
Moderate: test/debug code left in kernel_main
The double-free smoke test (free_page(p1) called twice, alloc_page result printed, etc.) is hardcoded in kernel_main. Fine for verification, but shouldn't merge into main as-is. Should be removed or gated behind a #ifdef DEBUG.
Minor: bundles PR #11's mmap.c/mmap.h unchanged
This PR includes identical copies of the mmap.c/mmap.h files from the open, unreviewed PR #11. It carries the same issues flagged there (no stored parse output, region_type_name collapsing 5 types into 2, duplicate formatters). If PR #11 merges first with changes, this will need a rebase.
Minor: only the first usable region is managed
page_alloc_init returns immediately after initializing from the first eligible region. For QEMU -m 256M this is fine: the one large region covers ~254 MB. But the single-region limitation isn't documented and will silently ignore memory on fragmented maps (common on real hardware).
Minor: MULTIBOOT_MEMORY_AVAILABLE is the only type constant
Now that page_alloc.c filters on mmap type, the sibling constants (2=reserved, 3=ACPI reclaimable, 4=NVS, 5=bad) should live alongside it in multiboot.h. Type 3 (ACPI reclaimable) is particularly relevant: it's usable memory the allocator currently ignores.
fa9419d to
25b8808
Compare
|
Addressed the critical allocator bug by reserving the kernel/kmalloc heap range in the bitmap before handing out pages, so alloc_page() no longer returns memory overlapping live kernel or heap memory. I also cleaned the smoke test out of the normal boot flow by gating it behind #ifdef DEBUG. Since this branch was rebased onto the updated mmap work, it now also uses the cleaned-up mmap API / formatting utilities from SCRUM-6. Verified in docker-ci that boot still succeeds and the allocator now initializes with kernel/heap pages reserved. Current limitation: the allocator still manages the first large usable region only, which is sufficient for the QEMU -m 256M acceptance environment. |
LordHerdier
left a comment
There was a problem hiding this comment.
Good progress! The bitmap corruption fix looks solid and gating the smoke test behind #ifdef DEBUG is the right call. A few things to sort out before this merges though.
Also, since your previous memory PR went through, could you rebase this onto main before we merge?
Blocking: serial.log shouldn't be committed
This is a generated artifact, pull it out of the PR and add it to .gitignore. Also worth noting the committed file doesn't match the output in the PR description, so it's probably from a stale build anyway.
Should fix: page_alloc_init re-parses the mmap directly
page_alloc_init is walking mb->mmap_addr itself instead of going through mmap_get_regions(). That's what mmap.c is for, two separate parse sites is asking for drift. Worth wiring up to the existing API.
Should fix: reserve_region rounds start the wrong way
align_up_uintptr(start, PAGE_SIZE) on the start address means a non-page-aligned _load_start would leave the partial first page unreserved. Start should round down (or stay as-is), only end needs to round up.
Minor: document the single-region limitation in page_alloc.h
A short note on page_alloc_init saying it only manages the first eligible region would be helpful, makes it clear it's a known constraint rather than something that got missed.
|
Thanks for the feedback!!
Also rebased onto |
LordHerdier
left a comment
There was a problem hiding this comment.
Review: [SCRUM-7] Add bitmap page allocator with double-free detection
Overall: Solid foundation. Bit manipulation is correct, double-free detection and input validation in free_page are good, and the idempotency guard is a nice touch. Two real bugs to fix before merge.
Issues
1. Pages between 1 MB and _load_start (2 MB) are incorrectly left free
managed_base is 0x100000 but reserve_region starts at &_load_start (0x200000). That leaves 256 pages in [0x100000, 0x200000) marked free and allocatable, but the multiboot info struct, module list, and parts of the WAD likely live there. alloc_page() could silently return a page that aliases live data.
Fix: reserve from managed_base to memory_base_address() instead of from _load_start:
reserve_region((uintptr_t)managed_base, (uintptr_t)memory_base_address());2. Multiboot module pages are not reserved
The WAD is loaded somewhere in the usable region by GRUB, with its location in mb->mods_addr. Nothing reserves those pages, so alloc_page() could hand out a page overlapping the WAD. The mods_addr range should be walked and reserved before returning from page_alloc_init.
Minor
3. extern char _load_start: prefer array declaration
extern char _load_start; // current
extern char _load_start[]; // preferred for linker symbolsTaking &_load_start on a bare char is technically UB. Array form is the conventional style for linker symbols and compiles identically.
4. DEBUG smoke test leaks p2
Intentional since the kernel exits right after, but a short comment would make it clear to future readers it's not a bug.
5. Single-region limitation: noted in header, good
Worth a TODO(SCRUM-?) if there's a follow-up ticket planned.
Verdict: Items 1 and 2 could cause silent memory corruption once the allocator is actually used, so those should be fixed before merge. Everything else looks good.
e1521ff to
f43ebe1
Compare
|
Addressed ALL review items"
Verified in docker-ci:
Allocator no longer returns pages overlapping kernel, heap, or mod memory. |
|
Thanks for your work on this! I have a couple issues that'll need to be addressed before this can be merged: Bugspage_alloc.c:13 _load_start declared but never used Only the first usable region above 1 MB is managed. The loop over regions[] returns immediately on the first match. Any secondary usable regions (e.g. above 256 MB) are silently ignored. The header comment acknowledges this but it's a real gap. The system could have GBs of usable RAM that get no allocator coverage. page_alloc.c:107–118 WAD module reservation indentation is broken The body of the if ((mb->flags & MULTIBOOT_INFO_FLAG_MODS) && mb->mods_count > 0) {
struct multiboot_module* mods = // ← at column 4, should be further right
(struct multiboot_module*)mb->mods_addr;Variable shadowing: the outer loop in Style / Minor
Other than the issues mentioned, this is coming along really nicely! Thanks for all your work on this |
THINGS THAT WERE DONE
Checked to verify it was working:
1.) Verified with serial output in QEMU:
OUTPUT:
Kernel Booted Multiboot memory map: base=0x0000000000000000 len=0x000000000009FC00 type=1 usable base=0x000000000009FC00 len=0x0000000000000400 type=2 reserved base=0x00000000000F0000 len=0x0000000000010000 type=2 reserved base=0x0000000000100000 len=0x000000000FEE0000 type=1 usable base=0x000000000FFE0000 len=0x0000000000020000 type=2 reserved base=0x00000000FFFC0000 len=0x0000000000040000 type=2 reserved Memory subsystem initialized Allocator base: 00209000 page_alloc: initialized Allocated page 1: 00100000 Allocated page 2: 00101000 Freed page 1 free_page: double free detectedRun using:
docker make-buildthen use:
make docker-ci