add memtrace use Kconfig #986

Open
greenfool wants to merge 1 commit into OpenXiangShan:master from greenfool:add-memtrace

Conversation

@greenfool

@cebarobot
Member

Hi, thanks for your PR. This is a nice idea to improve our checkpoint flow, but there are still some issues that should be fixed before merging it.

  1. Could you please add a PR description (and commit message) in English? Your PDF file is nice, but it may not be suitable as a PR description.
  2. Could you please add a description to OpenXiangShan/XiangShan-doc? The motivation and results parts of your PDF could be moved there.
  3. There are some problems in your code. I'll point them out later.

For your contribution, I also have some questions:

  1. I'm not sure whether your test results are strong enough to persuade us to switch to your approach. Could you try running more test programs?
  2. Is CacheReplay a boot program like gcptRestorer? How does it work? Can it work in normal simulation environments, like XiangShan's verilator/pldm emulator? Why do you need to modify GEM5?

@greenfool
Author

Hi, thanks for your valuable feedback and recognition of the PR! I agree with all your suggestions and will address each point promptly:

1. PR Description & Commit Message

I’ll add a detailed English description to the PR (including key changes, design logic, and compatibility notes) and rewrite all commit messages in English. This will be completed within the next week.

2. XiangShan-doc Update

I’ll migrate the "Motivation" and "Experimental Results" sections from the PDF to OpenXiangShan/XiangShan-doc (under the appropriate directory). I’ll keep the content concise, structured, and aligned with the doc’s existing style, and I’ll share the doc PR link for your review once it is ready.

3. Code Issues

No problem! Please feel free to point out the specific code issues – I’ll fix them immediately and add corresponding test cases to avoid regressions.

4. Test Results Enhancement

I fully understand the need for stronger validation. I’ll try to expand the test suite.

5. Questions About CacheReplay

  • Is CacheReplay a boot program like gcptRestorer?

    No – CacheReplay is a standalone program, distinct from gcptRestorer. It first generates the microarchitectural state of the cache hierarchy for the corresponding checkpoints under different configurations by parsing and simulating the memtrace. Then, during checkpoint restoration in GEM5, it constructs the exact microarchitectural state of the cache hierarchy using this information, instead of relying on warm-up instructions.

  • How does it work?

    First, in NEMU, we capture all memory accesses during execution and generate a memtrace. CacheReplay then reads this memtrace and simulates it against the specified cache configurations. From this simulation it derives the exact cache microstate (including cache data, tags, LRU timestamps, and more) that the cache hierarchy should be in when the target checkpoint is restored under those configurations. Finally, it writes all of this microstate to a dedicated file. During checkpoint restoration in GEM5, the system reads this pre-generated microstate file and reconstructs the cache hierarchy’s state from it.

  • Compatibility with Verilator/PLDM emulator?

    Currently, this implementation only runs on the XiangShan-specific GEM5 emulator and has not yet been tested on the Verilator or PLDM emulators.

  • Why modify GEM5?

    GEM5 does not natively support the restoration of cache microarchitectural states. To allow GEM5 to read the output file from CacheReplay and restore the corresponding cache microarchitectural state, we need to add a set of additional behaviors to GEM5’s checkpoint restoration process. Importantly, this modification is fully controlled by dedicated configuration options and will not impact any of GEM5’s original functionalities or existing checkpoint workflows.
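
The replay step described under "How does it work?" can be illustrated with a minimal sketch. It is not CacheReplay's actual code: the record layout `trace_rec_t`, the tiny geometry (4 sets, 2 ways, 64-byte lines), the true-LRU policy, and all function names are assumptions chosen for illustration. It only shows how replaying a trace leaves behind tags, dirty bits, and LRU timestamps that could later be written out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical trace record; the PR's real struct is pkt_data_used_small,
 * whose fields are not shown in this thread. */
typedef struct {
    uint64_t paddr;
    uint8_t  is_write;
} trace_rec_t;

/* Assumed toy geometry: 4 sets x 2 ways, 64-byte lines. */
enum { NUM_SETS = 4, NUM_WAYS = 2, LINE_BYTES = 64 };

typedef struct {
    uint64_t tag;
    uint64_t last_used;   /* LRU timestamp: tick of the most recent access */
    uint8_t  valid;
    uint8_t  dirty;
} cache_line_t;

typedef struct {
    cache_line_t lines[NUM_SETS][NUM_WAYS];
    uint64_t tick;        /* incremented once per replayed access */
    uint64_t hits, misses;
} cache_t;

void cache_init(cache_t *c) { memset(c, 0, sizeof *c); }

/* Apply one access: update the LRU timestamp on a hit, otherwise fill an
 * invalid way or evict the least recently used one. */
void cache_access(cache_t *c, const trace_rec_t *r) {
    uint64_t line_addr = r->paddr / LINE_BYTES;
    uint64_t set = line_addr % NUM_SETS;
    uint64_t tag = line_addr / NUM_SETS;
    cache_line_t *ways = c->lines[set];
    cache_line_t *victim = &ways[0];

    c->tick++;
    for (int w = 0; w < NUM_WAYS; w++) {
        if (ways[w].valid && ways[w].tag == tag) {
            ways[w].last_used = c->tick;
            ways[w].dirty |= r->is_write;
            c->hits++;
            return;
        }
        if (!ways[w].valid) {
            if (victim->valid) victim = &ways[w];   /* prefer an empty way */
        } else if (victim->valid && ways[w].last_used < victim->last_used) {
            victim = &ways[w];                      /* otherwise the oldest */
        }
    }
    c->misses++;
    victim->valid = 1;
    victim->tag = tag;
    victim->dirty = r->is_write;
    victim->last_used = c->tick;
}

/* Replay a whole trace; afterwards c->lines holds the warmed-up state. */
void cache_replay(cache_t *c, const trace_rec_t *trace, size_t n) {
    for (size_t i = 0; i < n; i++) cache_access(c, &trace[i]);
}
```

After `cache_replay` returns, the contents of `lines` are the kind of per-line state (tag, dirty bit, LRU timestamp) that a restore step would need; a real tool would also carry the cached data and model every level of the hierarchy.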

Thanks again for your guidance!

@greenfool
Author

feat: Add memory access trace (memtrace) feature to support cache microarchitecture state recovery for accelerated simulation


Overview

This PR adds a memory access trace (memtrace) feature to NEMU. The trace is the basis for directly constructing and restoring the cache's microarchitectural state, so that the warm-up phase can be skipped and simulation runs faster.

Core Changes

  1. Added Kconfig compilation configuration items

    • Added the CONFIG_MEMTRACE and CONFIG_MEMTRACE_PATH configuration options to enable or disable the memtrace feature at compile time and to specify the temporary storage path for memory access traces
    • All newly added code is wrapped in #ifdef CONFIG_MEMTRACE, so the original code logic is untouched when the feature is disabled
  2. Added core data structures and interfaces for memtrace

    • Added the memory access record structure pkt_data_used_small in include/util.h, along with interface declarations for trace file path acquisition, buffer writing, and file flushing
    • Implemented core interfaces including memtrace_dump, memtrace_flush, and memtrace_trapflush in src/utils/memtrace.c, which are responsible for buffer management and persistent output of memory access records
  3. Adaptation for the checkpoint generation process

    • Modified src/checkpoint/serializer.cpp to call memtrace_flush() during PMem serialization, which flushes the complete memory access trace to the corresponding checkpoint directory; also adjusted the uniform checkpoint generation logic so that checkpoints are still generated correctly when the feature is enabled
    • Modified src/isa/riscv64/instr/special.h to call memtrace_trapflush() when good_trap is triggered, which outputs the remaining access records in the buffer to the temporary path to avoid data loss
  4. Memory access trace capture capability

    • Modified src/memory/host-tlb.c and src/memory/paddr.c to capture memory access information on the memory access execution path and call memtrace_dump to write records, so that all memory accesses made by the program are traced
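
To make the buffer management in item 2 concrete, here is a minimal sketch of a buffered record writer with an explicit flush. It is not the code in src/utils/memtrace.c: the `sketch_` names, the record fields, and the 4-record buffer size are hypothetical, and the real signatures of memtrace_dump and memtrace_flush are not shown in this PR text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record layout (16 bytes, no padding). */
typedef struct {
    uint64_t paddr;
    uint32_t len;
    uint32_t is_write;
} sketch_rec_t;

enum { SKETCH_BUF_RECS = 4 };   /* deliberately tiny for illustration */

typedef struct {
    sketch_rec_t buf[SKETCH_BUF_RECS];
    size_t used;      /* records currently buffered */
    size_t written;   /* records already written to `out` */
    FILE *out;
} sketch_trace_t;

/* Write all buffered records to the file; returns 0 on success. */
int sketch_flush(sketch_trace_t *t) {
    if (t->used == 0) return 0;
    size_t n = fwrite(t->buf, sizeof t->buf[0], t->used, t->out);
    int complete = (n == t->used);
    t->written += n;
    t->used = 0;
    return complete ? fflush(t->out) : -1;
}

/* Append one record, flushing first if the buffer is full. */
int sketch_dump(sketch_trace_t *t, uint64_t paddr, uint32_t len,
                uint32_t is_write) {
    if (t->used == SKETCH_BUF_RECS && sketch_flush(t) != 0) return -1;
    t->buf[t->used++] = (sketch_rec_t){ paddr, len, is_write };
    return 0;
}
```

In this sketch the final explicit `sketch_flush` call plays the role described for memtrace_trapflush(): without it, records still sitting in the buffer when the program ends would be lost.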

Feature Description

When the CONFIG_MEMTRACE compilation option is enabled, NEMU records the memory access behaviors of the program in real time during simulation execution. When a checkpoint is generated, the memory access trace file of the corresponding phase is synchronously output to the directory where the checkpoint is located.

The generated memory access trace file can be used with the CacheReplay tool (https://github.com/greenfool/CacheReplay.git) to simulate and generate the cache microarchitecture state under any specified configuration. Furthermore, in full-system simulators such as GEM5, while restoring the program execution state based on the checkpoint generated by NEMU, the runtime state of the cache can be directly restored, completely skipping the traditional warm-up phase. This greatly reduces the total number of instructions that need to be simulated in detail, and achieves a significant improvement in simulation efficiency.

Compatibility Guarantee

  • This feature is disabled by default. When CONFIG_MEMTRACE is not enabled, none of the newly added code is compiled, and the original compilation process, runtime logic, and functionality of NEMU are unaffected
  • All modifications strictly follow the coding specifications of the original code, with no breaking changes, and do not affect the normal use of existing functions
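
The compile-time isolation described above can be sketched as follows. This is an illustration only: `TRACE_ACCESS`, `sketch_trace_access`, and `sketch_paddr_read` are hypothetical names, not the hooks this PR adds to src/memory/paddr.c. CONFIG_MEMTRACE is left undefined here, so the sketch shows the disabled path, where the hook expands to nothing.

```c
#include <assert.h>
#include <stdint.h>

/* CONFIG_MEMTRACE would normally come from the Kconfig-generated header. */
#ifdef CONFIG_MEMTRACE
/* Hypothetical recorder; only declared and called when tracing is enabled. */
void sketch_trace_access(uint64_t paddr, int len, int is_write);
#define TRACE_ACCESS(paddr, len, is_write) \
    sketch_trace_access((paddr), (len), (is_write))
#else
/* Disabled: the hook compiles to a no-op, leaving the access path unchanged. */
#define TRACE_ACCESS(paddr, len, is_write) ((void)0)
#endif

/* Toy little-endian read from a flat memory array, with the trace hook
 * placed on the access path. */
uint64_t sketch_paddr_read(const uint8_t *mem, uint64_t paddr, int len) {
    assert(len >= 1 && len <= 8);
    TRACE_ACCESS(paddr, len, 0);
    uint64_t value = 0;
    for (int i = 0; i < len; i++) {
        value |= (uint64_t)mem[paddr + (uint64_t)i] << (8 * i);
    }
    return value;
}
```

Funnelling the hook through one macro keeps each call site to a single line and avoids scattering #ifdef blocks through the memory access code; whether the PR uses a macro or inline #ifdef blocks is not stated here.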
