Upstream vm interactive bug fixes #1
Open
Nicyzk wants to merge 1200 commits into vm_interactive_bug_fixes from
…otifications Fix periodic job notifications
Don't try to perform preemption for greedy tasks. This improves fairness. Signed-off-by: Changwoo Min <changwoo@igalia.com>
scx_lavd: Don't try preemption for greedy tasks
Re-introduce a softer version of idle polling that keeps re-scheduling the user-space scheduler from ops.update_idle() while it still has pending tasks waiting to be dispatched. This achieves good core utilization with both v6.12 and v6.13 kernels. Signed-off-by: Andrea Righi <arighi@nvidia.com>
…dle-polling scx_rustland_core: re-introduce ops.update_idle()
Minor version bump to include the new backward-compatible changes. Signed-off-by: Andrea Righi <arighi@nvidia.com>
scxtop: fix short option conflict between tick* options.
A FIFO-only variation on scx_simple with CPU selection that prioritizes an idle previous CPU over a fully idle core (as is done in scx_simple and scx_rusty).

scx_prev outperforms a few other schedulers on OLTP workloads run on systems with relatively flat topology (i.e. non-NUMA, single LLC) by changing CPU selection as above and by taking advantage of the more aggressive work conservation (i.e. idle balancing) that comes with sched_ext by default.

It's far from being a full-fledged scheduler, but it demonstrates how a small change to an existing scheduler can improve performance in a real application.

Notes:
- AMD EPYC 7J13 (16-CPU VM) server running v6.12-based UEK-next kernel, scx (688bffc "Merge pull request sched-ext#1192 from devnexen/code_simpl3"), and MySQL Community Edition 8.4[0]
- AMD EPYC 7551 (128-CPU BM) client running BMK[1] (a sysbench-based BenchMark Kit)
- Each data point in the table below represents the average of ten, one-minute runs done after a three-minute warmup. The server is rebooted between each scheduler.
- "cli" means the number of database clients.
- Each %diff column is relative to eevdf.

Representative BMK testcase: sb11-OLTP_RO_10M_8tab-uniform-ps-notrx.sh

throughput

| cli | eevdf (std%) | rusty (std%) | %diff | simple (std%) | %diff | prev (std%) | %diff |
|-|-|-|-|-|-|-|-|
| 16 | 4140 (1%) | 4224 (1%) | 2% | 4276 (2%) | 3% | 4263 (1%) | 3% |
| 32 | 7382 (1%) | 7259 (1%) | -2% | 7314 (1%) | -1% | 7919 (1%) | 7% |
| 48 | 9015 (0%) | 9644 (0%) | 7% | 10055 (0%) | 12% | 10411 (1%) | 15% |
| 64 | 9765 (1%) | 9601 (0%) | -2% | 10214 (0%) | 5% | 10481 (0%) | 7% |

average latency

| cli | eevdf (std%) | rusty (std%) | %diff | simple (std%) | %diff | prev (std%) | %diff |
|-|-|-|-|-|-|-|-|
| 16 | 4 (1%) | 4 (1%) | -2% | 4 (2%) | -3% | 4 (1%) | -3% |
| 32 | 4 (1%) | 4 (1%) | 2% | 4 (1%) | 1% | 4 (1%) | -7% |
| 48 | 5 (0%) | 5 (0%) | -7% | 5 (0%) | -10% | 5 (1%) | -13% |
| 64 | 7 (1%) | 7 (0%) | 2% | 6 (0%) | -4% | 6 (0%) | -7% |

95p latency

| cli | eevdf (std%) | rusty (std%) | %diff | simple (std%) | %diff | prev (std%) | %diff |
|-|-|-|-|-|-|-|-|
| 16 | 4 (3%) | 4 (2%) | -4% | 4 (4%) | -1% | 4 (4%) | -7% |
| 32 | 5 (2%) | 5 (1%) | 1% | 5 (2%) | 1% | 4 (2%) | -11% |
| 48 | 7 (1%) | 6 (1%) | -16% | 5 (1%) | -24% | 5 (1%) | -26% |
| 64 | 9 (3%) | 8 (0%) | -12% | 7 (0%) | -26% | 7 (1%) | -26% |

In the read-only workload, prev consistently outperforms with equal or better throughput and latency across the board.

[0] https://github.com/mysql/mysql-server/tree/8.4
[1] http://dimitrik.free.fr/blog/posts/mysql-perf-bmk-kit.html

Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
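The selection order described in the scx_prev commit message (try the idle previous CPU first, then fall back to another idle CPU) can be illustrated with a small user-space model. This is only a sketch of the policy as the message states it, not the scheduler's BPF code: the function name `select_cpu`, the boolean `idle` and `allowed` slices, and the fallback of scanning CPUs in index order are all assumptions made for the example.

```rust
/// Hypothetical model of the CPU selection order described for scx_prev.
///
/// `idle[cpu]` is true when that CPU is currently idle, and `allowed[cpu]`
/// is true when the task may run there. On success the chosen CPU is
/// claimed (its idle flag is cleared) and returned; `None` means no
/// allowed CPU is idle, so the task would simply be queued.
fn select_cpu(prev_cpu: usize, idle: &mut [bool], allowed: &[bool]) -> Option<usize> {
    let is_allowed = |cpu: usize| allowed.get(cpu).copied().unwrap_or(false);

    // Step 1: prefer the previous CPU whenever it is idle, even if some
    // other CPU (or a fully idle core) is also available.
    if is_allowed(prev_cpu) && idle.get(prev_cpu).copied().unwrap_or(false) {
        idle[prev_cpu] = false;
        return Some(prev_cpu);
    }

    // Step 2: otherwise fall back to any idle CPU the task is allowed on.
    // Scanning in index order is an assumption of this sketch.
    let cpu = (0..idle.len()).find(|&c| idle[c] && is_allowed(c))?;
    idle[cpu] = false;
    Some(cpu)
}
```

With `idle = [false, true, true, false]` and every CPU allowed, a task whose previous CPU is 2 is placed on CPU 2; a second such task falls back to CPU 1, and a third finds no idle CPU.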
Begin migration of CPU heavy tasks from GitHub free runners to self hosted dedicated Linux runners. Start with the build-kernel job as it's pretty simple and doesn't run very often. Will come back for the other copies of `build-kernel` once this is proven.

Changes to enable this:
- Set `runs-on` correctly for this job.
- Switch dependency management to a Nix develop shell for this job. This means we get the same dependencies whether we stay on a self-hosted runner or switch back to GitHub. This is much easier than chasing the ever moving target of software installed on the GitHub runners, and has the added benefit of pinning dependencies.
- Use my branch of nixpkgs with `virtme-ng` packaged. Will upstream this once `virtme-ng` is confirmed working for all of our use cases.
- Bump the cache version number. This isn't really necessary but will mean if this does cause any problems that a revert is cleaner.
- Enables `lookup-only` for the cache kernel step. This means that the cache is never downloaded, which is a good idea given the dedicated server will likely take longer to download the cache. It has the added benefit of being much faster on a hit, and has the same behaviour on a miss.

On request, landing this as an additional job and not using the cache artifacts in this initial merge. Will leave this running for a short time before switching.

| | |
|-|-|
|Old cache miss| [13m2s](https://github.com/sched-ext/scx/actions/runs/13000153631/job/36256954083) |
|New cache miss| [2m55s](https://github.com/sched-ext/scx/actions/runs/13021263130/job/36322236211) |
|Old cache hit | [12s](https://github.com/sched-ext/scx/actions/runs/13016947090/job/36308397960) |
|New cache hit | [6s](https://github.com/sched-ext/scx/actions/runs/13021532927/job/36323025625) |

Test plan:
- Performance looks good.
scx_rustland_core: bump up version to 2.2.6
ci: add build-kernel-nix job using dedicated server
scx_prev: a simple scheduler tested on OLTP workloads
As Andrea points out[0], select_cpu() is never called for such tasks, so this branch is dead code. Remove it. [0] sched-ext#1275 Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
…s_fix scx_prev: delete unused logic for nr_cpus_allowed == 1 tasks
…runs

Currently the Slack notification gets sent even if the failing branch is not `main`. This means that any testing done on these workflows (by temporarily enabling `push`) triggers the notification if they fail or are cancelled.

Replace the `always()` condition with `failure()`. This omits the previous `cancelled()` case, but that should be fine. Will add it back if there are objections. Also add the same filter as the `pages` job to only run on `main`.

Test plan:
- __shrugs__
Release a new scx_utils to fix scx_rustland_core dependency. Signed-off-by: Andrea Righi <arighi@nvidia.com>
scx_utils: Bump up version to 1.0.10
Adds a reusable workflow `build-kernel.yml` to cover the build-kernel job using the new Nix based build process on the dedicated runner.

All of the `build-kernel` jobs are identical except for their git repo and git branch name. Factor these out into a reusable workflow to reduce code duplication.

This also removes any suffixes from the cache, which might (unlikely) increase hit rates. Each build was suffixing their kernels separately even though the build was identical, which was slightly wasteful. It is unlikely any of these repos/branches have identical states though.

Test plan:
- Ran the CI and waited for build-kernel-nix to succeed in each workflow (added a temporary `push:` condition to make sure they all build). Cancelled the rest as this built kernel isn't used yet so there's no point causing extra queuing.
ci: create reusable workflow for nix kernel builds
scx_utils: adding cpu affinity data to Gpu type.
Note that in the ops.enqueue() path, setting a task's slice to zero is risky because we don't know the exact status of the task, so it could cause a zero time slice error as follows:

[ 8271.001818] sched_ext: ksoftirqd/1[70] has zero slice in pick_task_scx()

The zero slice warning is harmful because the sched_ext core ends up setting the time slice to SCX_SLICE_DFL (20 msec), which increases latency spikes. Thus, we do not set the time slice to 0 in the ops.enqueue() path and always rely on scx_bpf_kick_cpu(SCX_KICK_PREEMPT) instead. Also, use 1 (instead of 0) as a marker to perform scx_bpf_kick_cpu().

This should solve the following issue: sched-ext#1283

Signed-off-by: Changwoo Min <changwoo@igalia.com>
Set the preemption information separately for two cases: the CPU entering an idle state, and the CPU being taken by a higher scheduling class. Signed-off-by: Changwoo Min <changwoo@igalia.com>
scx_lavd: Do not set task's time slice to zero at the ops.enqueue() path
When (s64)(after - before) > 0, the code returned the value of the comparison (s64)(after - before) > 0 instead of the intended (s64)(after - before). This happened because the middle operand of the ternary operator was omitted, so the expression evaluated to the condition itself. Add the middle operand, (s64)(after - before), so that the correct time difference is returned. Signed-off-by: Changwoo Min <changwoo@igalia.com>
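The effect of the omitted middle operand can be reproduced in a short Rust model. The original code is C and its exact expression is not shown in the commit message, so this is only a sketch: the function names are hypothetical, and returning 0 when the difference is not positive is an assumption. In C, `cond ? : other` (a GNU extension) yields the value of `cond` itself when it is true, which for a comparison is 1.

```rust
/// Models the buggy C expression `(s64)(after - before) > 0 ?: 0`.
/// With the middle operand omitted, a true condition yields the value of
/// the comparison itself (1), not the time difference.
fn time_delta_buggy(after: u64, before: u64) -> u64 {
    let diff = after.wrapping_sub(before) as i64;
    let cond = diff > 0;
    if cond { cond as u64 } else { 0 }
}

/// Models the fixed expression
/// `(s64)(after - before) > 0 ? (s64)(after - before) : 0`.
/// The else-value of 0 is an assumption of this sketch.
fn time_delta_fixed(after: u64, before: u64) -> u64 {
    // wrapping_sub followed by a signed cast mirrors the (s64) cast in C,
    // so the result stays correct across counter wraparound.
    let diff = after.wrapping_sub(before) as i64;
    if diff > 0 { diff as u64 } else { 0 }
}
```

For `after = 150` and `before = 100`, the buggy version returns 1 while the fixed version returns 50. Both return 0 when `after` is earlier than `before`.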
scx_utils: sched-ext#1281 follow-up for 32 bits.
Signed-off-by: vjabrayilov <vjabrayilov@cs.columbia.edu>
vjabrayilov pushed a commit that referenced this pull request on Mar 18, 2025
The Into implementation was calling Into<&SupportedSched>, which in turn called
Into<SupportedSched>, and so on, recursing without end.
```
#0 0x622450e96149 in scx_loader::_$LT$impl$u20$core..convert..From$LT$scx_loader..SupportedSched$GT$$u20$for$u20$$RF$str$GT$::from::h13ba9d4271e33441 /tmp/scx/rust/scx_loader/src/lib.rs:60:9
#1 0x622450e91af3 in _$LT$T$u20$as$u20$core..convert..Into$LT$U$GT$$GT$::into::h9481856c4f80c765 /home/vl/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/convert/mod.rs:759:9
#2 0x622450e9614a in scx_loader::_$LT$impl$u20$core..convert..From$LT$scx_loader..SupportedSched$GT$$u20$for$u20$$RF$str$GT$::from::h13ba9d4271e33441 /tmp/scx/rust/scx_loader/src/lib.rs:60:9
#3 0x622450e91af3 in _$LT$T$u20$as$u20$core..convert..Into$LT$U$GT$$GT$::into::h9481856c4f80c765 /home/vl/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/convert/mod.rs:759:9
#4 0x622450e9614a in scx_loader::_$LT$impl$u20$core..convert..From$LT$scx_loader..SupportedSched$GT$$u20$for$u20$$RF$str$GT$::from::h13ba9d4271e33441 /tmp/scx/rust/scx_loader/src/lib.rs:60:9
#5 0x622450e91af3 in _$LT$T$u20$as$u20$core..convert..Into$LT$U$GT$$GT$::into::h9481856c4f80c765 /home/vl/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/convert/mod.rs:759:9
#6 0x622450e9614a in scx_loader::_$LT$impl$u20$core..convert..From$LT$scx_loader..SupportedSched$GT$$u20$for$u20$$RF$str$GT$::from::h13ba9d4271e33441 /tmp/scx/rust/scx_loader/src/lib.rs:60:9
#7 0x622450e91af3 in _$LT$T$u20$as$u20$core..convert..Into$LT$U$GT$$GT$::into::h9481856c4f80c765 /home/vl/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/convert/mod.rs:759:9
#8 0x622450e9614a in scx_loader::_$LT$impl$u20$core..convert..From$LT$scx_loader..SupportedSched$GT$$u20$for$u20$$RF$str$GT$::from::h13ba9d4271e33441 /tmp/scx/rust/scx_loader/src/lib.rs:60:9
#9 0x622450e91af3 in _$LT$T$u20$as$u20$core..convert..Into$LT$U$GT$$GT$::into::h9481856c4f80c765 /home/vl/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/convert/mod.rs:759:9
#10 0x622450e9614a in scx_loader::_$LT$impl$u20$core..convert..From$LT$scx_loader..SupportedSched$GT$$u20$for$u20$$RF$str$GT$::from::h13ba9d4271e33441 /tmp/scx/rust/scx_loader/src/lib.rs:60:9
#11 0x622450e91af3 in _$LT$T$u20$as$u20$core..convert..Into$LT$U$GT$$GT$::into::h9481856c4f80c765 /home/vl/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/convert/mod.rs:759:9
#12 0x622450e9614a in scx_loader::_$LT$impl$u20$core..convert..From$LT$scx_loader..SupportedSched$GT$$u20$for$u20$$RF$str$GT$::from::h13ba9d4271e33441 /tmp/scx/rust/scx_loader/src/lib.rs:60:9
#13 0x622450e91af3 in _$LT$T$u20$as$u20$core..convert..Into$LT$U$GT$$GT$::into::h9481856c4f80c765 /home/vl/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/convert/mod.rs:759:9
#14 0x622450e9614a in scx_loader::_$LT$impl$u20$core..convert..From$LT$scx_loader..SupportedSched$GT$$u20$for$u20$$RF$str$GT$::from::h13ba9d4271e33441 /tmp/scx/rust/scx_loader/src/lib.rs:60:9
```
```
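The trace above alternates between a `From<SupportedSched> for &str` impl and the blanket `Into::into`, which is the pattern produced when a `From` impl delegates through `.into()` and the call resolves back to the same impl. A minimal sketch of one way to avoid the cycle follows. The enum variants and scheduler name strings are illustrative placeholders rather than the actual contents of scx_loader, and the real fix in the referenced commit may differ.

```rust
/// Illustrative stand-in for the scx_loader type; the variants are placeholders.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SupportedSched {
    Bpfland,
    Rusty,
}

/// The by-reference conversion does the real work with an explicit match,
/// so it never calls back into another conversion.
impl From<&SupportedSched> for &'static str {
    fn from(sched: &SupportedSched) -> Self {
        match sched {
            SupportedSched::Bpfland => "scx_bpfland",
            SupportedSched::Rusty => "scx_rusty",
        }
    }
}

/// The by-value conversion delegates to the by-reference impl, naming the
/// target impl explicitly. Writing `sched.into()` here instead would resolve
/// to this very impl and recurse without end.
impl From<SupportedSched> for &'static str {
    fn from(sched: SupportedSched) -> Self {
        <&'static str>::from(&sched)
    }
}
```

Both `SupportedSched::Rusty.into()` and `(&SupportedSched::Rusty).into()` then terminate, because only one of the two impls delegates and the other bottoms out in the `match`.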
No description provided.