This repository contains a low overhead Linux kernel module and userspace tooling for identifying worst case scheduler latency, response time, and IRQ handling delays on PREEMPT / PREEMPT_RT systems.
The module was built to validate real time scheduling assumptions under interference. Specifically, to quickly determine whether observed workloads meet expected latency bounds, and to isolate sources of timing violation such as priority inversion, IRQ contention, or scheduler misconfiguration. For example, rather than using a full trace analysis (trace-cmd) or offline analysis, a developer may do a quick check using the module to see whether their workload is behaving correctly when a new task is added to the cumulative system.
Full writeup (design overview/tradeoffs, module evaluation of correctness, SMP concurrency): writeup.pdf
This is specifically designed for PREEMPT_RT, and evaluted on Raspberry Pi 4 (SMP) running Buildroot's PREEMPT Linux.
- Buildroot 2025.11-git
- Any Raspberry Pi 4 compatible Linux kernel with the PREEMPT or PREEMPT_RT model.
- My OS: Linux buildroot 6.12.41-v8 #1 SMP PREEMPT Thu Nov 20 17:18:47 UTC 2025 aarch64 GNU/Linux
- jerry_rt_module
- Motivation
- Features
- Design Highlights
- Installation
- Usage
- Example Output
- Control Plane Precision
- Conclusion and Further Reading
- Future Work
rt_tracer/irq_button_demois the kernel module made to configure an GPIO on the PI as an IRQ line, and its according handlers.rt_tracer/jerryis the kernel patch needed for the IRQ tracing. TODO: Without kernel patch, the module won't load. I need to make this an optional feature.- Other files are either with the main kernel module or workload folders used to analyze and validate, featured in
writeup.pdf. - Cross compiled binaries by
aarch64-linux-gnu-gccare included in workload folders.
Note: Despite the name, the current evaluation was performed on a PREEMPT (low latency) kernel rather than PREEMPT_RT. As a result, some wording in rt_tracer/writeup.pdf may refer to PREEMPT_RT semantics imprecisely.
The module is kernel agnostic with respect to scheduling model. It observes scheduler and interrupt tracepoints that are present across Linux configurations, and reports timing relationships between events (e.g. wakeup → run, run → sleep).
As a result, it can be used on PREEMPT, PREEMPT_RT, and non-RT kernels to diagnose scheduling behavior. Kernel configuration affects absolute latency values and tail behavior, but does not change the semantics of the metrics being collected. Full evaluation on PREEMPT_RT is left as future work.
Additionally, the IRQ handler response time metric is defined to align with PREEMPT_RT semantics, where IRQ handlers execute in threaded context.
Debugging real time scheduling issues on Linux often requires collecting large kernel traces and manually inspecting them with tools such as trace-cmd or kernelshark. This process is time consuming and makes it difficult to quickly identify which task or event is responsible in a scheduling induced case of delay.
This project aims to provide a lightweight alternative: track worst case latency metrics per PID directly in the kernel, along with a best effort scheduler trace for the worst observed case. This aims to have minimal interference to the running workload.
- Tracks per pid worst case metrics: Schedule in latency, Voluntary sleep response time (wake -> first voluntary sleep), control loop relief time (wake -> timer sleep), IRQ threaded handler execution time
- Records best effort scheduler traces for worst case metrics (need to add with trace and have SLO for the specific metric)
- Allows configuration of a SLO for each metric and records number of violations of SLO
- CAS loops used for max/min updates to avoid global locks
- Trace collection protected by per-metric spinlocks
- Best-effort trace semantics: correctness of numeric metrics prioritized over complete trace capture
This project is designed to run on a Raspberry Pi 4 with a PREEMPT_RT enabled Linux kernel. I used PREEMPT in my evaluations, however, installation and functionality remain the same.
- Install Buildroot and select an existing Raspberry Pi 4 configuration in menuconfig
- Enable a PREEMPT_RT kernel variant in the Buildroot kernel configuration (linux-menuconfig).
- Enable tracepoints inside of the kernel variant, meaning
TRACEPOINTS=yinside linux-menuconfig (make linux-menuconfig and type / in the menu and search to findTRACEPOINTS' dependencies) - Apply the kernel patch provided in this repository:
- Patch location:
rt_tracer/jerry/ - Add the patch according to Buildroot instructions (in 2025 Dec, I moved the
jerryfolder intobuildroot/board/and ranmake)
- Patch location:
- Build the full system image using Buildroot (
make).
Once completed, flash the generated image onto an SD card and boot it on the Raspberry Pi 4. The system should now be running Linux with PREEMPT_RT enabled.
Note: If you already have a PREEMPT_RT kernel running on a Pi 4, you may skip the Buildroot steps and proceed directly to module installation.
-
Build the kernel module (
.ko) as part of the Buildroot build process.Create a new package folder
jerry_rt_moduleinbuildroot/package/and populate with the necessary.mk,Makefile,Config.infiles, and reference thert_tracer/Config.in's name from thepackage/Config.into register it in menuconfig. (Reference Buildroot's process). My module files are all here in the repo (jerry_rt_module.mk, Makefile, rt_main.c, trace_ring_buffer.c/h, Config.in)Do
makeafter the package is setup in buildroot. The module is compiled against the same kernel source tree used to build the kernel, ensuring ABI compatibility.After the first
make, to recompile the module, just domake jerry_rt_module-rebuildatbuildroot/. Don't forget to enable the compilation of it inmenuconfig. -
Copy the resulting module .ko binary to the Pi 4 (e.g. via USB storage or SCP).
This is in the
/buildroot/output/build/[module]-ver/folder.For me, at
/buildroot/output/build/jerry_rt_module-1.0/jerry_rt_module.ko -
Load the module on the Pi:
insmod jerry_rt_module.ko
If you would like to test out tracking IRQ handlers and have a pushup button, you may install this module. This is installed in a similar fashion as above (rt_tracer/irq_button_demo is the module folder).
Loading the module connects GPIO 17 pin on the Pi 4 to hard, soft IRQ handler functions defined in the module. (i.e. Signalling GPIO 17 triggers those functions).
Note: While kernel modules can also be built against installed kernel headers on the target system, this project builds the module directly within Buildroot, which is the recommended approach for embedded systems.
Remark that the module provides per PID real time scheduling diagnostics, including:
- Max/min latency and response metrics
- SLO (upper bound) violation counting
- Worst case trace capture for debugging scheduling anomalies
All interaction is done via sysfs after loading the kernel module.
insmod jerry_rt_module.koReplace jerry_rt_module.ko with the full path to the module binary.
After this step, the module is loaded but no PIDs are tracked yet.
You can track a PID with or without trace capture.
Track without trace
echo "123" > /sys/kernel/jerry_rt_module/add_pid
Track with trace enabled
echo "123" > /sys/kernel/jerry_rt_module/add_pid
Tracking with trace enables worst-case scheduler trace collection when SLO violations occur.
After tracking a PID, you can configure upper bounds (SLOs) for specific metrics. Violations are counted whenever the metric exceeds the configured bound.
Example: Latency bound (nanoseconds)
echo "123 5000" > /sys/kernel/jerry_rt_module/set_slo_latency_boundFormat:
echo "[pid] [bound_in_ns]" > /sys/kernel/jerry_rt_module/set_slo_XXXXX_boundOther supported SLOs
set_slo_response_boundset_slo_response_relief_boundset_slo_irq_handling_bound
Note: If tracing was enabled when adding the PID, the trace desired metric's corresponding SLO must be set for a violation trace to be collected.
Remove a single PID
echo "123" > /sys/kernel/jerry_rt_module/del_pidThis:
- Stops tracking the PID
- Prints all accumulated statistics and traces to the kernel log (can view using dmesg)
Remove the module (stop all tracking)
rmmod jerry_rt_module.koThis:
- Stops tracking all PIDs
- Prints statistics for all tracked PIDs
- Unloads the module
Automatic cleanup on process exit If a tracked process exits normally, the module automatically emits its final statistics using the `sched_process_exit` tracepoint.
# Start tracking PID 22543 with tracing
echo "22543 trace" > /sys/kernel/jerry_rt_module/add_pid
# Set SLO bounds (ns)
echo "22543 2" > /sys/kernel/jerry_rt_module/set_slo_latency_bound
echo "22543 2" > /sys/kernel/jerry_rt_module/set_slo_response_bound
echo "22543 2" > /sys/kernel/jerry_rt_module/set_slo_response_relief_bound
# Let workload run...
# Stop tracking and emit statistics
echo "22543" > /sys/kernel/jerry_rt_module/del_pidjerry_rt_module: PID=22543, Name=background_bloc, minLat=3519, maxLat=25482,
minResponseTimeVoluntarySleepAllTypes=1017797, maxResponseTimeVoluntarySleepAllTypes=1059093,
minResponseTimeVoluntarySleepReliefBased=9223372036854775807, maxResponseTimeVoluntarySleepReliefBased=0
LatencyBound=2, Violations=19434
MAX TRACE LEN: 3
WORST LATENCY TRACE:
[ 0 us] Event: sched_wakeup, pid: 22543, wake_cpu: 1
[ 3 us] Event: sched_switch, preemption: 0, voluntary: 1, prev_pid: 22544 (priority: 49), next_pid: 0 (priority: 120), on_cpu: 2,
[233048.368766] [ 25 us] Event: sched_switch, preemption: 0, voluntary: 0, prev_pid: 0 (priority: 120), next_pid: 22543 (priority: 69), on_cpu: 1,
ResponseBound=2, Violations=18267
MAX TRACE LEN: 3
WORST RESPONSE TIME TRACE:
[ 0 us] Event: sched_wakeup, pid: 22543, wake_cpu: 1
[ = 7 us] Event: sched_switch, preemption: 0, voluntary: 0, prev_pid: 0 (priority: 120), next_pid: 22543 (priority: 69), on_cpu: 1,
[ 1059 us] Event: sched_switch, preemption: 0, voluntary: 1, prev_pid: 22543 (priority: 69), next_pid: 0 (priority: 120), on_cpu: 1,
ResponseReliefBound=2, Violations=0
MAX TRACE LEN: 0
WORST RESPONSE RELIEF TIME TRACE: Traces help identify why a worst-case occurred (CPU, preemption, competing tasks, etc.).
The module intentionally does not strictly synchronize userspace control commands with in flight tracepoint execution.
This means the following boundaries are not precisely defined:
- When tracking exactly begins after
add_pid - When SLO bounds exactly take effect
- When tracking exactly stops after
del_pid
Why this happens:
- Tracepoint handlers execute asynchronously on other CPUs
- Control plane updates (echo commands) are not synchronized against them
- Writes become visible eventually, but not at a deterministic instant
Practical impact:
- An initial latency cycle may be missed after
add_pid - A small number of events may still be counted after
del_pid - SLO checks may briefly use the old or new bound
This behavior is intentional.
The design prioritizes minimal runtime interference over precise control plane boundaries. A future version may tighten this window (e.g. using xchg on PID addition), at the cost of slightly higher overhead.
This module is intended as a practical, low overhead tool for quickly assessing real time scheduling behavior on PREEMPT_RT Linux systems. Rather than replacing full tracing frameworks, it complements them by providing immediate visibility into worst case behavior with minimal disruption to the running workload.
For readers interested in deeper details, including:
- design rationale and tradeoffs
- concurrency and synchronization guarantees
- trace correctness semantics
- overhead evaluation and RT impact analysis
- space/memory considerations
A full technical writeup is available: rt_tracer/writeup.pdf
That document expands on the internal mechanisms, explains known limitations, and motivates the design choices made to balance observability with real time safety.
case_studies and sanity_check_workloads are workloads used for the full writeup.
- Evaluations on a PREEMPT_RT kernel
- Potential races that I haven't caught probably exist
- To allow the use of tgids rather than pids, as there isn’t support for threads spawned under a pid yet.
- To allow for easier IRQ tracing setup without doing a kernel patch.
- Finding a way to use less memory when running the module for more strict embedded environments.
Contributions, feedback, and discussion are welcome (jerryq[zero one zero one with no spaces]@[jee mail like the popular one].com)