Sort_Suite is a command-line benchmark tool for comparing and learning about sorting algorithms across multiple dataset shapes and input sizes.
It supports:
- human-readable benchmark tables
- CSV output for graphs and reports
- optional operation counting
- optional Linux PMU statistics
- dataset dump mode for inspecting generated inputs
By default, Sort_Suite is deterministic. It uses a fixed seed unless you explicitly request entropy from the operating system.
./build/sort_suite [options]

If you run the program with no options, it performs a human-readable benchmark run using the default configuration.
- algorithms: bubble, merge, select
- datasets: random, sorted, nearly_sorted
- sizes: 128, 256, 512, 1024, 2048, 4096, 8192, 16384
- trials: 100
- warmup trials: 10
- verification: enabled
- output format: human
- seed: fixed (0xC0FFEE)
- quadratic safety cap: enabled
Run the default benchmark:

./build/sort_suite

Run a CSV benchmark and redirect the results to an output file:

./build/sort_suite --format csv > out.csv

Select specific algorithms and datasets:

./build/sort_suite --algo merge,quick,quick_insert --dataset random,sorted

Choose exact sizes:

./build/sort_suite --sizes 128,256,512,1024

Generate sizes from a range:

./build/sort_suite --range 128:65536:x2

Use OS entropy instead of a fixed default seed:

./build/sort_suite --seed entropy

Enable operation counting and PMU statistics:

./build/sort_suite --count-ops --pmu

Dump a generated dataset instead of benchmarking:

./build/sort_suite --dump-dataset random --n 32

Human output is intended for reading in a terminal. It uses aligned tables and may use ANSI color when writing to an interactive terminal.
Example:
./build/sort_suite --format human

CSV (comma-separated values) output is intended for spreadsheets, graphs, and other machine-readable analysis workflows.
CSV output is designed to remain stable and machine-readable so that results can be imported into spreadsheets, plotting tools, or analysis scripts without manual cleanup.
Example:
./build/sort_suite --format csv > results.csv

Color is relevant only for human-readable output.
Supported modes:
- auto: use color only when writing to a terminal
- always: always attempt color output
- never: disable color output
Example:
./build/sort_suite --color never

Note: ANSI escape sequences are suppressed for CSV output, and for human output that is redirected, even if color is explicitly requested.
Algorithms are selected with the --algo option, using a comma-separated list.
Example:
./build/sort_suite --algo bubble,merge,select

To list all available algorithms:

./build/sort_suite --list-algos

To show more detailed information about an algorithm:

./build/sort_suite --algo-info merge

Algorithm names are defined by the current build. Typical keys include:

- bubble
- merge
- select
- insert
- quick
- quick_insert
- libc_qsort
Not every preset uses every available algorithm.
Datasets are selected with the --dataset option, using a comma-separated list.
Example:
./build/sort_suite --dataset random,sorted,nearly_sorted

To list every available dataset:

./build/sort_suite --list-datasets

Supported dataset keys:

- random
- random_unique
- sorted
- sorted_descending
- random_descending
- nearly_sorted
- few_unique

Accepted aliases:

- reversed -> sorted_descending
- random_reversed -> random_descending
Example:
./build/sort_suite --dataset reversed,random_reversed

You can specify sizes in two ways:
Use --sizes with a comma-separated list:
./build/sort_suite --sizes 128,256,512,1024

Use --range with one of these formats:
start:end:xK
start:end:+K
Examples:
./build/sort_suite --range 128:65536:x2
./build/sort_suite --range 1000:10000:+1000

Rules:
- xK, XK, or *K: multiply by K each step
- +K: add K each step
- start and end must be positive
- the end value is included if the stepping lands on it exactly
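The expansion rules above can be sketched in Python. This is an illustration of the documented behavior, not the tool's actual implementation:

```python
def expand_range(start: int, end: int, step: str) -> list[int]:
    """Expand a start:end:xK or start:end:+K size range per the rules above."""
    if start <= 0 or end <= 0:
        raise ValueError("start and end must be positive")
    k = int(step[1:])
    sizes, n = [], start
    if step[0] in ("x", "X", "*"):       # multiply by K each step
        while n <= end:
            sizes.append(n)
            n *= k
    elif step[0] == "+":                 # add K each step
        while n <= end:
            sizes.append(n)
            n += k
    else:
        raise ValueError("step must be xK, XK, *K, or +K")
    return sizes                         # end is included only if landed on exactly
```

For example, `expand_range(128, 65536, "x2")` yields the ten doubling sizes from 128 through 65536.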
--trials N sets how many timed trials are recorded per benchmark case.
./build/sort_suite --trials 200

--warmup N sets how many warmup trials are run before timing begins.

./build/sort_suite --warmup 20

Warmup trials are used to reduce noise from one-time or early-run effects that can distort timing results. Examples include:
- Cold instruction and data caches
- Branch predictor warmup
- Page faults and memory mapping overhead
- Allocator or runtime startup effects
- Early CPU frequency or scheduling behavior
These warmup runs are executed before recorded timing begins. They are not included in the reported best, mean, or worst times.
Warmup does not make benchmarking perfect, but it usually makes repeated runs a little more stable and more representative of steady-state behavior.
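The warmup-then-measure pattern described above can be sketched as follows. This is a conceptual illustration, not Sort_Suite's actual measurement code:

```python
import time

def benchmark(fn, data, warmup: int = 10, trials: int = 100):
    """Run `warmup` untimed trials, then record `trials` timed runs."""
    for _ in range(warmup):              # untimed: primes caches, allocator, etc.
        fn(list(data))
    times = []
    for _ in range(trials):
        work = list(data)                # fresh copy so each trial sorts the same input
        t0 = time.perf_counter()
        fn(work)
        times.append(time.perf_counter() - t0)
    return min(times), sum(times) / len(times), max(times)
```

The warmup runs happen before any timestamp is taken, so they cannot appear in the reported best, mean, or worst times.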
By default, Sort_Suite verifies that sorted output is correct.
To explicitly enable verification:
./build/sort_suite --verify

To disable verification:

./build/sort_suite --no-verify

Verification checks that the algorithm output is correctly sorted after each benchmark case.
This is useful while developing or comparing implementations, especially when testing new algorithms or tuning hybrid thresholds.
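Conceptually, a sorting verification pass needs to confirm two things: the output is non-decreasing, and it is a permutation of the input (nothing dropped or duplicated). A minimal sketch, not necessarily Sort_Suite's exact check:

```python
from collections import Counter

def verify(original, output) -> bool:
    """True if `output` is a correctly sorted rearrangement of `original`."""
    nondecreasing = all(output[i] <= output[i + 1] for i in range(len(output) - 1))
    return nondecreasing and Counter(output) == Counter(original)
```

The multiset comparison matters: a buggy sort that overwrites elements can produce sorted-looking output that fails the permutation check.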
By default, Sort_Suite uses a fixed, arbitrary, deterministic seed:
0xC0FFEE
You can override this in two ways:
./build/sort_suite --seed 12345
./build/sort_suite --seed 0xDEADBEEF
./build/sort_suite --seed entropy

Entropy seeding reads from the operating system, so repeated runs will practically always generate different datasets.
Use --count-ops to run an additional untimed pass that collects operation
counts.
./build/sort_suite --count-ops

Depending on the algorithm implementation, operation counts may include fields such as:
- comparisons
- swaps
- moves
- passes
- partitions
This pass is not timed. It does not affect the benchmark timing measurements.
Operation counting is an algorithm-level diagnostic pass. It counts logical operations performed by the sorting implementation, such as comparisons, swaps, moves, passes, and partitions.
These counts are useful for understanding algorithm behavior and for comparing theoretical expectations against observed results. For example:
- Bubble sort on sorted input should show very few passes and no swaps
- Selection sort typically performs a fixed number of comparisons regardless of input order
- Merge sort shows structured movement of data even when the input is already sorted
Important: these counts are NOT hardware measurements. They do not represent machine instructions, CPU micro-ops, or exact memory traffic. They are counts defined by the benchmarked algorithm implementation itself.
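The selection sort example above can be made concrete with an instrumented implementation. This is an illustration of the kind of logical counting --count-ops performs, not Sort_Suite's code:

```python
def selection_sort_counted(xs):
    """Selection sort with comparison/swap counters attached."""
    xs = list(xs)
    comparisons = swaps = 0
    n = len(xs)
    for i in range(n - 1):
        lo = i
        for j in range(i + 1, n):
            comparisons += 1          # every pair (i, j) is compared exactly once
            if xs[j] < xs[lo]:
                lo = j
        if lo != i:
            xs[lo], xs[i] = xs[i], xs[lo]
            swaps += 1
    return xs, comparisons, swaps
```

Regardless of input order, the comparison count is always n*(n-1)/2, while the swap count drops to zero on already-sorted input, matching the theoretical expectation stated above.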
Not every algorithm implementation supports operation counting. If counting is unavailable for a selected algorithm, the output will indicate that counting support is missing.
Use --pmu to run an additional untimed pass that collects Linux PMU
statistics.
./build/sort_suite --pmu

Typical PMU-derived values may include:
- cycles
- instructions
- branches
- branch misses
- cache misses
- IPC
- branch miss rate
- scaling factor
This pass is also not timed.
PMU support currently requires Linux and a working perf_event environment.
You can disable PMU explicitly with:
./build/sort_suite --no-pmu

PMU stands for Performance Monitoring Unit. Modern CPUs contain hardware counters that can track very low-level events such as cycles, branches, retired instructions, branch misses, and cache misses.
When --pmu is enabled, Sort_Suite performs an additional, untimed measurement
pass using Linux perf_event support. This allows the program to report
hardware-oriented statistics alongside timing results.
These counters are useful for studying questions like:
- How much work an algorithm causes the CPU to perform
- Whether branch-heavy algorithms suffer from misprediction
- How efficiently instructions are being retired
- Whether different input shapes affect CPU behavior
Derived values may include:
- IPC: Instructions Per Cycle
- Branch miss rate: the fraction of measured branches that were mispredicted
- Scale: a correction factor used when counters were multiplexed by the kernel
Important notes:
- PMU measurements are collected in a separate pass and do not affect timing results
- PMU support requires Linux and a working perf_event environment
- PMU counters describe hardware events during one measurement pass; they are useful, but they are still subject to normal benchmarking noise from the OS and runtime environment
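The arithmetic behind the derived values is simple. In this sketch, `time_enabled` and `time_running` are the fields perf_event reports for how long a counter was enabled versus actually counting; when the kernel multiplexes counters, running is less than enabled and raw counts are scaled up accordingly. An illustration, not Sort_Suite's code:

```python
def derive_pmu_stats(cycles, instructions, branches, branch_misses,
                     time_enabled, time_running):
    """Compute IPC, branch miss rate, and the multiplexing scale factor."""
    scale = time_enabled / time_running if time_running else 0.0
    ipc = instructions / cycles if cycles else 0.0
    branch_miss_rate = branch_misses / branches if branches else 0.0
    return {"ipc": ipc, "branch_miss_rate": branch_miss_rate, "scale": scale}
```

A scale of 1.0 means the counters ran for the whole pass; values above 1.0 indicate the kernel time-sliced the counters and the raw counts were extrapolated.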
Quadratic algorithms can become painfully slow at large input sizes. Sort_Suite therefore applies a default size cap to algorithms flagged as quadratic.
Use:
./build/sort_suite --quadratic-max N

Example:

./build/sort_suite --quadratic-max 4096

Use 0 to disable the cap entirely:

./build/sort_suite --quadratic-max 0

--bubble-max is accepted as a compatibility alias for --quadratic-max.
When a benchmark case is skipped due to this cap, human output and CSV output
mark the row as skipped with reason quadratic_cap.
Hybrid quick/insertion sort can be tuned with:
./build/sort_suite --quick-threshold N

Example:

./build/sort_suite --algo quick_insert --quick-threshold 16

A threshold of 0 disables the insertion-sort fallback.
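The idea being tuned here is that quicksort's recursion overhead outweighs its asymptotic advantage on tiny subarrays, so a hybrid switches to insertion sort below the threshold. A sketch of the technique (not Sort_Suite's actual quick_insert implementation):

```python
def quick_insert(xs, threshold: int = 16):
    """Quicksort that falls back to insertion sort on subarrays <= threshold.
    threshold=0 degenerates to plain quicksort, mirroring --quick-threshold 0."""
    xs = list(xs)

    def insertion(lo, hi):
        for i in range(lo + 1, hi + 1):
            v, j = xs[i], i - 1
            while j >= lo and xs[j] > v:
                xs[j + 1] = xs[j]
                j -= 1
            xs[j + 1] = v

    def sort(lo, hi):
        if hi - lo + 1 <= max(threshold, 1):
            if lo < hi:
                insertion(lo, hi)
            return
        pivot = xs[(lo + hi) // 2]          # middle-element pivot (Hoare scheme)
        i, j = lo, hi
        while i <= j:
            while xs[i] < pivot:
                i += 1
            while xs[j] > pivot:
                j -= 1
            if i <= j:
                xs[i], xs[j] = xs[j], xs[i]
                i += 1
                j -= 1
        sort(lo, j)
        sort(i, hi)

    if xs:
        sort(0, len(xs) - 1)
    return xs
```

Typical thresholds land in the 8-32 range; the sweet spot depends on element size, comparator cost, and cache behavior, which is why the flag exists.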
Sort_Suite provides a few convenience presets.
Fast sanity-check benchmark for interactive use.
./build/sort_suite --preset quick

Current behavior:
- algorithms: bubble, merge, select
- datasets: random, sorted, nearly_sorted
- sizes: 128,256,512,1024,2048,4096
- trials: 50
- warmup: 5
- format: human
Broader CSV-oriented benchmark intended for report graphs.
./build/sort_suite --preset paper

Current behavior:
- algorithms: bubble, merge, select
- datasets: random, random_unique, sorted, sorted_descending, random_descending, nearly_sorted, few_unique
- sizes: 128:65536:x2
- trials: 100
- warmup: 10
- format: csv
Heavier benchmark configuration.
./build/sort_suite --preset full

Current behavior:
- algorithms: bubble, merge, select
- datasets: random, random_unique, sorted, sorted_descending, random_descending, nearly_sorted, few_unique
- sizes: 128:131072:x2
- trials: 300
- warmup: 20
- format: csv
Note: presets overwrite relevant fields such as selected algorithms, datasets, sizes, trials, warmup, and default output format.
Instead of benchmarking, you can generate and print a dataset directly.
Use:
./build/sort_suite --dump-dataset DATASET --n N

Example:

./build/sort_suite --dump-dataset few_unique --n 64

Dump mode prints:
- dataset name
- n
- seed
- sample values
- summary statistics such as minimum, maximum, sortedness, adjacent inversions, and unique count
This is useful for validating dataset generators and visually inspecting input shape.
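The summary statistics listed above are cheap to compute. A sketch of plausible definitions (Sort_Suite's exact formulas, such as how sortedness is normalized, may differ):

```python
def dataset_stats(xs):
    """Summary statistics like those printed in dump mode; xs must be non-empty."""
    inversions = sum(1 for i in range(len(xs) - 1) if xs[i] > xs[i + 1])
    pairs = max(len(xs) - 1, 1)
    return {
        "min": min(xs),
        "max": max(xs),
        "adjacent_inversions": inversions,
        "sortedness": 1.0 - inversions / pairs,   # 1.0 = fully sorted, 0.0 = fully reversed
        "unique": len(set(xs)),
    }
```

Checking these against a dumped dataset is a quick way to confirm a generator is doing what its name implies, e.g. few_unique should show a small unique count.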
Sort_Suite may exit without running benchmarks in the following cases:
- --help
- --list-algos
- --list-datasets
- --algo-info KEY
Invalid options or invalid combinations cause an error exit.
Examples:
- unknown algorithm name
- unknown dataset name
- malformed size range
- missing required --n in dump mode
- --pmu on a non-Linux build
./build/sort_suite
./build/sort_suite --preset paper > results.csv
./build/sort_suite --algo quick,quick_insert,libc_qsort --dataset random --range 256:65536:x2
./build/sort_suite --dataset nearly_sorted --sizes 128,256,512,1024,2048,4096
./build/sort_suite --algo bubble,select --dataset random --sizes 8192,16384 --quadratic-max 0
./build/sort_suite --format csv --count-ops --pmu > diag.csv
./build/sort_suite --dump-dataset random_unique --n 32

- Human output is intended for reading.
- CSV output is intended for analysis tools and graphs.
- Operation counting and PMU collection use separate untimed passes.
- PMU support is Linux-only.
- Default behavior is deterministic unless --seed entropy is used.
- Quadratic algorithms may be skipped at large n unless the cap is raised or disabled.
Some benchmark cases may be skipped intentionally. The most common reason is
quadratic_cap, which prevents very slow quadratic algorithms from running at
large input sizes unless the cap is raised or disabled.
In human-readable output, skipped cases are shown as SKIPPED.
In CSV output, skipped cases are still emitted as rows so that downstream tools
see a complete result grid.
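Because skipped cases are emitted as rows, downstream scripts should partition the grid rather than assume every row carries timings. A sketch using Python's csv module; the column names "status" and "skip_reason" are hypothetical placeholders, so check the header your build actually emits:

```python
import csv

def split_rows(path):
    """Split a results CSV into rows that ran and rows that were skipped.
    Assumes hypothetical "status"/"skip_reason" columns for illustration."""
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    ran = [r for r in rows if r.get("status") != "skipped"]
    skipped = [r for r in rows if r.get("status") == "skipped"]
    return ran, skipped
```

Keeping skipped rows in the grid means a plotting script can still lay out the full algorithm-by-size matrix and render gaps explicitly instead of silently dropping cells.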
Built-in command summary:
./build/sort_suite --help

List algorithms:

./build/sort_suite --list-algos

List datasets:

./build/sort_suite --list-datasets

Show algorithm details:
./build/sort_suite --algo-info merge