diff --git a/scripts/README.md b/scripts/README.md index 4805f9d..34f716e 100644 --- a/scripts/README.md +++ b/scripts/README.md @@ -1,35 +1,123 @@ -## Creating LaTeX tables +## Scripts for Benchmark Analysis -Prerequisite: You should be able to build and run the C++ benchmark. You need Python 3 on your system. +This directory contains scripts for processing and visualizing benchmark results. -Run your benchmark: +### Prerequisites +- Python 3.6+ +- Required Python packages: `pandas`, `numpy`, `matplotlib`, `seaborn` + +You can install the required packages with: + +```bash +pip install pandas numpy matplotlib seaborn ``` -cmake -B build + +### Creating LaTeX Tables + +#### Basic Table Generation + +Run your benchmark and convert the output to a LaTeX table: + +```bash +# Run benchmark +cmake -B build . +cmake --build build ./build/benchmarks/benchmark -f data/canada.txt > myresults.txt + +# Convert to LaTeX table +./scripts/latex_table.py myresults.txt ``` -Process the raw output: +This will print a LaTeX table to stdout with numbers rounded to two significant digits. + +#### Automated Multiple Table Generation +Instead of manually running benchmarks and generating tables, you can use the `generate_multiple_tables.py` script to automate the entire process: + +```bash +# Basic usage with g++ compiler +./scripts/generate_multiple_tables.py g++ ``` -./scripts/latex_table.py myresults.txt + +This script: + +- Automatically compiles the benchmark code with the specified compiler +- Runs multiple benchmarks with different configurations +- Generates LaTeX tables for each benchmark result +- Saves all tables to the output directory + +Options: + +- First argument: Compiler to use (g++, clang++) +- `--build-dir`: Build directory (default: build) +- `--output-dir`: Output directory for tables (default: ./outputs) +- `--clean`: Clean build directory before compilation +- `--march`: Architecture target for -march flag (default: native) + +The script also has several configurable variables at the top of the file: + +- Benchmark datasets (canada, mesh, uniform_01) +- Algorithm filters +- Number of runs +- Volume size + +This is the recommended approach for generating comprehensive benchmark results. + +### Combining Tables + +The `concat_tables.py` script combines separate benchmark tables (mesh, canada, uniform_01) into comprehensive tables: + +```bash +# Basic usage, using tables in ./outputs +./scripts/concat_tables.py ``` -This will print to stdout the table. -The numbers are already rounded to two significant digits, ready to be included in a scientific manuscript. 
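+
+The script looks for the per-dataset `.tex` files produced by
+`generate_multiple_tables.py`, which follow the naming pattern
+`<cpu>_<compiler>_<dataset>_<variant>[_<march>].tex`. As a sketch, assuming a
+CPU model string of `AMD_Ryzen9_9900X`, the grouping would look like:
+
+```text
+# per-dataset inputs
+AMD_Ryzen9_9900X_g++_canada_none.tex
+AMD_Ryzen9_9900X_g++_mesh_none.tex
+AMD_Ryzen9_9900X_g++_uniform_01_none.tex
+# combined output written by concat_tables.py
+AMD_Ryzen9_9900X_g++_all_none.tex
+```
+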
+Options: + +- `--input-dir`, `-i`: Directory containing benchmark .tex files (default: ./outputs) +- `--output-dir`, `-o`: Output directory for combined tables (default: same as input) +- `--exclude`, `-e`: Algorithms to exclude from the output tables + +### Generating Visualization Figures -It is also possible to create multiple LaTeX tables at once with: +The `generate_figures.py` script creates heatmaps and relative performance plots: +```bash +# Generate figures for nanoseconds per float metric +./scripts/generate_figures.py nsf ./outputs ``` -./scripts/generate_multiple_tables.py ` + +Options: + +- First argument: Metric to visualize (`nsf`, `insf`, or `insc`) +- Second argument: Directory containing benchmark result .tex files +- `--output-dir`, `-o`: Directory to save generated figures (default: same as input directory) +- `--exclude`, `-e`: Algorithms to exclude from visualization +- `--cpus`, `-c`: CPUs to include in relative performance plots + +### Extracting Summary Metrics + +The `get_summary_metrics.py` script analyzes raw benchmark files to extract performance metrics: + +```bash +# Analyze all CPUs +./scripts/get_summary_metrics.py ``` -## Running tests on Amazon AWS +Options: + +- `--cpu`: CPU folder name to restrict analysis +- `--input-dir`, `-i`: Directory containing benchmark .raw files (default: ./outputs) +- `--outlier-threshold`, `-t`: Threshold for reporting outliers (default: 5.0%) +- `--dedicated-cpus`, `-d`: CPU folder names considered dedicated (non-cloud) + +### Running Tests on Amazon AWS It is possible to generate tests on Amazon AWS: -``` -./scripts/aws_tests.py +```bash +./scripts/aws_tests.bash ``` This script will create new EC2 instances, run @@ -37,3 +125,30 @@ This script will create new EC2 instances, run save each output to a separate folder, and then terminate the instance. Prerequisites and some user configurable variables are in the script itself. + +### Workflow Example + +A typical complete workflow might look like: + +1. **Generate benchmark results and tables automatically**: + ```bash + # For g++ compiler (compiles and runs benchmarks) + ./scripts/generate_multiple_tables.py g++ --clean + + # For clang++ compiler (compiles and runs benchmarks) + ./scripts/generate_multiple_tables.py clang++ --clean + ``` +2. **Combine tables for better comparison**: + ```bash + ./scripts/concat_tables.py + ``` +3. **Generate visualization figures**: + ```bash + ./scripts/generate_figures.py nsf ./outputs + ``` +4. **Extract summary metrics**: + ```bash + ./scripts/get_summary_metrics.py + ``` + +This automated workflow handles the entire process from compilation to visualization with minimal manual intervention. diff --git a/scripts/concat_tables.py b/scripts/concat_tables.py new file mode 100755 index 0000000..75ba817 --- /dev/null +++ b/scripts/concat_tables.py @@ -0,0 +1,177 @@ +#!/usr/bin/env python3 +""" +Concatenate multiple benchmark result tables into a single comprehensive table. + +This script finds and combines related benchmark results from different datasets +(mesh, canada, uniform_01) into a single LaTeX table for easier comparison. 
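+
+Example (a sketch, assuming the per-dataset tables were produced by
+generate_multiple_tables.py and live in ./outputs):
+
+    ./scripts/concat_tables.py --input-dir ./outputs --output-dir ./outputs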
+""" +import os +import re +import argparse +import pandas as pd + + +def parse_tex_table(filepath): + """Parse a LaTeX table file into a pandas DataFrame.""" + with open(filepath, 'r') as file: + lines = file.readlines() + data_start = False + parsed = [] + for line in lines: + if "\\midrule" in line: + data_start = True + continue + if "\\bottomrule" in line: + break + if data_start and '&' in line: + row = [x.strip().strip('\\') for x in line.split('&')] + if len(row) == 4: + parsed.append({ + 'algorithm': row[0], + 'ns/f': row[1], + 'ins/f': row[2], + 'ins/c': row[3] + }) + return pd.DataFrame(parsed) + + +def clean_cpu_name(cpu_name): + """Clean CPU name for better display in tables.""" + cpu_cleaned = cpu_name.replace("Ryzen9900x", "Ryzen 9900X") + cpu_cleaned = cpu_cleaned.replace("_Platinum", "") + cpu_cleaned = re.sub(r"_\d+-Core_Processor", "", cpu_cleaned) + cpu_cleaned = re.sub(r"_CPU__\d+\.\d+GHz", "", cpu_cleaned) + cpu_cleaned = re.sub(r"\(R\)", "", cpu_cleaned) + return cpu_cleaned.replace("_", " ").replace(" ", " ").strip() + + +def format_latex_table(df, cpu_name, compiler, float_bits, microarch=None, + exclude_algos=None): + """Format the combined data as a LaTeX table.""" + if exclude_algos is None: + exclude_algos = set() + + cpu_cleaned = clean_cpu_name(cpu_name) + caption = f"{cpu_cleaned} results ({compiler}, {float_bits}-bit floats" + if microarch: + caption += f", {microarch}" + caption += ")" + label = f"tab:{re.sub(r'[^a-zA-Z0-9]+', '', cpu_name.lower())}results" + header = ( + "\\begin{table}\n" + " \\centering\n" + f" \\caption{{{caption}}}%\n" + f" \\label{{{label}}}\n" + " \\begin{tabular}{lccccccccc}\n" + " \\toprule\n" + " \\multirow{1}{*}{Name} & \\multicolumn{3}{c|}{mesh} & " + "\\multicolumn{3}{c|}{canada} & \\multicolumn{3}{c}{unit} \\\\\n" + " & {ns/f} & {ins/f} & {ins/c} & " + "{ns/f} & {ins/f} & {ins/c} & {ns/f} & {ins/f} & {ins/c} \\\\ " + "\\midrule\n" + ) + body = "" + for _, row in df.iterrows(): + if row['algorithm'] in exclude_algos: + continue + line = ( + f" {row['algorithm']} & {row['ns/f_mesh']} & " + f"{row['ins/f_mesh']} & {row['ins/c_mesh']} & " + f"{row['ns/f_canada']} & {row['ins/f_canada']} & " + f"{row['ins/c_canada']} & " + f"{row['ns/f_unit']} & {row['ins/f_unit']} & " + f"{row['ins/c_unit']} \\\\\n" + ) + body += line + footer = ( + " \\bottomrule\n" + " \\end{tabular}\\restartrowcolors\n" + "\\end{table}\n" + ) + return header + body + footer + + +def find_combinations(root, pattern=None): + """Find all combinations of benchmark result files that can be combined.""" + if pattern is None: + pattern = re.compile( + r"(.*?)_(g\+\+|clang\+\+)_(mesh|canada|uniform_01)_(none|s)" + r"(?:_(x86-64|x86-64-v2|x86-64-v3|x86-64-v4|native))?\.tex" + ) + # group(1)=cpu, 2=compiler, 3=dataset, 4=variant, 5=microarch (optional) + + combos = [] + for dirpath, _, filenames in os.walk(root): + tex_files = [f for f in filenames if f.endswith('.tex')] + table = {} + for f in tex_files: + m = pattern.match(f) + if m: + cpu, compiler, dataset, variant, microarch = m.groups() + key = (dirpath, cpu, compiler, variant, microarch) + if key not in table: + table[key] = {} + table[key][dataset] = os.path.join(dirpath, f) + for (dirpath, cpu, compiler, variant, microarch), files in table.items(): + if {"mesh", "canada", "uniform_01"}.issubset(files.keys()): + combos.append((dirpath, cpu, compiler, variant, microarch, files)) + return combos + + +def main(): + parser = argparse.ArgumentParser( + description="Concatenate benchmark tables into comprehensive 
tables") + parser.add_argument( + "--input-dir", "-i", default="./outputs", + help="Directory containing benchmark .tex files") + parser.add_argument( + "--output-dir", "-o", + help="Output directory for combined tables (defaults to input directory)") + parser.add_argument( + "--exclude", "-e", nargs="+", + default=["netlib", "teju\\_jagua", "yy\\_double", "snprintf", "abseil"], + help="Algorithms to exclude from the output tables") + args = parser.parse_args() + + input_dir = args.input_dir + output_dir = args.output_dir if args.output_dir else input_dir + exclude_algos = set(args.exclude) + + # Create output directory if it doesn't exist + if not os.path.exists(output_dir): + os.makedirs(output_dir) + + combos = find_combinations(input_dir) + if not combos: + print(f"No matching benchmark files found in {input_dir}") + return + + print(f"Found {len(combos)} combinations to process") + + for dirpath, cpu, compiler, variant, microarch, paths in combos: + df_mesh = parse_tex_table(paths['mesh']) + df_canada = parse_tex_table(paths['canada']) + df_unit = parse_tex_table(paths['uniform_01']) + df_merged = df_mesh.merge( + df_canada, on='algorithm', suffixes=('_mesh', '_canada')) + df_merged = df_merged.merge(df_unit, on='algorithm') + df_merged.rename(columns={ + 'ns/f': 'ns/f_unit', + 'ins/f': 'ins/f_unit', + 'ins/c': 'ins/c_unit' + }, inplace=True) + + float_bits = "32" if variant == "s" else "64" + tex_code = format_latex_table( + df_merged, cpu, compiler, float_bits, microarch, exclude_algos) + + suffix = f"_{microarch}" if microarch else "" + out_path = os.path.join( + output_dir, f"{cpu}_{compiler}_all_{variant}{suffix}.tex") + with open(out_path, "w") as f: + f.write(tex_code) + print(f"[OK] {out_path}") + + +if __name__ == "__main__": + main() diff --git a/scripts/generate_figures.py b/scripts/generate_figures.py new file mode 100755 index 0000000..ddccb8a --- /dev/null +++ b/scripts/generate_figures.py @@ -0,0 +1,238 @@ +#!/usr/bin/env python3 +""" +Generate visualization figures from benchmark results. + +This script creates heatmaps and relative performance plots from benchmark +results stored in LaTeX tables. It helps visualize performance differences +across algorithms, CPUs, and compilers. +""" +import os +import re +import sys +import argparse +import pandas as pd +import numpy as np +import matplotlib.pyplot as plt +import seaborn as sns +from collections import defaultdict + + +def parse_table(filepath, metric_name): + """ + Parse a LaTeX table file to extract benchmark metrics. + + Args: + filepath: Path to the LaTeX table file + metric_name: Metric to extract (nsf, insf, or insc) + + Returns: + Tuple of (cpu_name, compiler, width, results_dict) + """ + metrics = {"nsf": [1, 4, 7], "insf": [2, 5, 8], "insc": [3, 6, 9]} + + with open(filepath, encoding="utf-8") as f: + lines = f.readlines() + + caption_line = next(line for line in lines if '\\caption' in line) + cpu_caption = re.search(r'\\caption\{(.+?) 
results', caption_line) + cpu_name = cpu_caption.group(1).strip() if cpu_caption else "UnknownCPU" + + compiler = "clang++" if "clang++" in os.path.basename(filepath) else "g++" + width = "64" if filepath.endswith("_all_none.tex") else "32" + + testcases = ["mesh", "canada", "unit"] + + results = defaultdict(dict) + in_data = False + for line in lines: + if '\\midrule' in line: + in_data = True + continue + if '\\bottomrule' in line: + break + if in_data: + row = line.strip() + if not row or row.startswith('%') or row.startswith('\\'): + continue + parts = [x.strip() for x in row.split('&')] + if len(parts) < 10: + continue + algo = parts[0] + try: + for i, testcase in zip(metrics[metric_name], testcases): + metric = float(parts[i]) + results[algo][testcase] = metric + except Exception: + continue + return cpu_name, compiler, width, results + + +def plot_relative_performance(df, dataset, cpus_to_plot=None, + outfile="relative_performance.pdf", + considered_suffix="-C-64"): + """ + Create a relative performance plot comparing algorithms. + + Args: + df: DataFrame with benchmark results + dataset: Dataset name (mesh, canada, unit) + cpus_to_plot: List of CPU names to include + outfile: Output file path + considered_suffix: Suffix to filter columns + """ + df = df.copy() + cols_to_keep = [col for col in df.columns + if col.endswith(considered_suffix) and + any(cpu in col for cpu in cpus_to_plot)] + df = df[cols_to_keep] + + # Use a predefined order for algorithms if available + if hasattr(df, 'reindex') and 'algorithm_order' in globals(): + df = df.reindex(algorithm_order) + + if "dragon4" not in df.index: + print("Dragon4 not found in DataFrame; can't normalize!") + return + + df_rel = df.loc["dragon4"] / df + df_rel = df_rel.drop("dragon4", axis=0) + + # Use display name mapping if available + if 'algo_display_map' in globals(): + df_rel.index = [algo_display_map.get(algo, algo) + for algo in df_rel.index] + + plt.figure(figsize=(10, 4)) + for col in df_rel.columns: + plt.plot(df_rel.index, df_rel[col], marker='o', label=col) + plt.ylabel("Rel. speedup (vs. 
Dragon4)") + plt.xlabel("Algorithm") + plt.xticks(rotation=15, ha='right') + plt.legend(loc='upper left', fontsize=10) + plt.tight_layout() + plt.savefig(outfile) + plt.close() + print(f"Generated: {outfile}") + + +def main(): + parser = argparse.ArgumentParser( + description="Generate visualization figures from benchmark results") + parser.add_argument( + "metric_name", choices=["nsf", "insf", "insc"], + help="Metric to visualize (nsf=nanoseconds/float, insf=instructions/float, " + "insc=instructions/cycle)") + parser.add_argument( + "input_dir", "-i", default="./outputs", + help="Directory containing benchmark result .tex files") + parser.add_argument( + "--output-dir", "-o", default=None, + help="Directory to save generated figures (default: same as input directory)") + parser.add_argument( + "--exclude", "-e", nargs="+", default=[], + help="Algorithms to exclude from visualization") + parser.add_argument( + "--cpus", "-c", nargs="+", + default=["Ryzen 9900X", "AMD EPYC 7R13", "Intel Xeon 8488C", + "Apple M4 Max", "Neoverse-V2"], + help="CPUs to include in relative performance plots") + args = parser.parse_args() + + # Set output directory to input directory if not specified + if args.output_dir is None: + args.output_dir = args.input_dir + + # Create output directory if it doesn't exist + if not os.path.exists(args.output_dir): + os.makedirs(args.output_dir) + + # Configuration + sns.set_context("paper", font_scale=1.5) + algorithms_to_exclude = args.exclude + cpus_to_plot = args.cpus + + # Algorithm display name mapping + global algo_display_map + algo_display_map = { + "ryu": "ryƫ", + "double_conversion": "double_conv.", + } + + # Compiler mapping + compiler_map = {"clang++": "C", "g++": "G"} + + # Algorithm order for consistent display + global algorithm_order + algorithm_order = [ + "dragon4", "netlib", "double_conversion", "fmt_format", "grisu3", + "swiftDtoa", "grisu_exact", "schubfach", "ryu", "dragonbox" + ] + + # Find relevant files + relevant_files = [] + for root, _, files in os.walk(args.input_dir): + for file in files: + if file.endswith("_all_none.tex") or file.endswith("_all_s.tex"): + relevant_files.append(os.path.join(root, file)) + + if not relevant_files: + print(f"No relevant .tex files found in {args.input_dir}!") + sys.exit(1) + + # Process files and collect results + all_results = { + "mesh": defaultdict(dict), + "canada": defaultdict(dict), + "unit": defaultdict(dict) + } + + for filepath in relevant_files: + cpu_name, compiler, width, table = parse_table(filepath, args.metric_name) + shortened_compiler = compiler_map[compiler] + colname = f"{cpu_name}-{shortened_compiler}-{width}" + for algo, tc_dict in table.items(): + for testcase, metric in tc_dict.items(): + if testcase in all_results: + all_results[testcase][algo][colname] = metric + + # Create DataFrames from collected results + dfs = {} + for testcase, d in all_results.items(): + df = pd.DataFrame.from_dict(d, orient="index").sort_index(axis=1) + dfs[testcase] = df + + # Generate plots for each testcase + for testcase in ["mesh", "canada", "unit"]: + df = dfs[testcase].copy() + + # Filter algorithms with enough data points + most_common_algos = df.dropna(thresh=10).index + df = df.loc[most_common_algos] + df.index = df.index.str.replace("\\", "", regex=False) + df = df[~df.index.isin(algorithms_to_exclude)] + + # Generate relative performance plot + plot_relative_performance( + df, testcase, cpus_to_plot=cpus_to_plot, + outfile=os.path.join(args.output_dir, + f"relative_performance_{testcase}.pdf") + ) + 
+ # Generate heatmap + df.index = [algo_display_map.get(algo, algo) for algo in df.index] + df_log = df.map(lambda x: None if pd.isna(x) else + (float("nan") if x <= 0 else np.log10(x)) + ).apply(pd.to_numeric, errors="coerce") + plt.figure(figsize=(18, 7), constrained_layout=True) + sns.heatmap(df_log, annot=False, cmap="coolwarm", linewidths=0.1, + cbar_kws={'label': f'log$_{{10}}$({args.metric_name})'}) + plt.ylabel("Algorithm") + plt.savefig(os.path.join(args.output_dir, f"heatmap_{testcase}.pdf")) + plt.close() + print(f"Generated: heatmap_{testcase}.pdf") + + print("All heatmaps and relative performance plots generated.") + + +if __name__ == '__main__': + main() diff --git a/scripts/generate_multiple_tables.py b/scripts/generate_multiple_tables.py index 3250bea..4c8b4b6 100755 --- a/scripts/generate_multiple_tables.py +++ b/scripts/generate_multiple_tables.py @@ -1,13 +1,18 @@ #!/usr/bin/env python3 +""" +Generate multiple benchmark tables with automatic compilation. + +This script automates the process of compiling the benchmark code, +running benchmarks with various configurations, and generating LaTeX tables. +""" import subprocess import os import platform -import sys +import argparse +import shutil from latex_table import generate_latex_table # Configuration -benchmark_executable = './build/benchmarks/benchmark' -output_dir = './outputs' input_files = [ 'data/canada.txt', 'data/mesh.txt', @@ -28,14 +33,30 @@ # ['-F6', '-s'], ] -# Get compiler label from command line -if len(sys.argv) < 2: - print("Usage: ./scripts/generate_multiple_tables.py ") - sys.exit(1) -CompilerLabel = sys.argv[1] + +def parse_args(): + """Parse command line arguments.""" + parser = argparse.ArgumentParser( + description="Compile, run benchmarks, and generate LaTeX tables") + parser.add_argument( + "compiler", help="Compiler to use (g++, clang++)") + parser.add_argument( + "--build-dir", default="build", + help="Build directory (default: build)") + parser.add_argument( + "--output-dir", default="./outputs", + help="Output directory for tables (default: ./outputs)") + parser.add_argument( + "--clean", action="store_true", + help="Clean build directory before compilation") + parser.add_argument( + "--march", default="native", + help="Architecture target for -march flag (default: native)") + return parser.parse_args() def get_cpu_model(): + """Get the CPU model name for the current system.""" env = os.environ.copy() env["LANG"] = "C" @@ -60,19 +81,52 @@ def get_cpu_model(): return "unknown_cpu" -CPUModel = get_cpu_model().replace(' ', '_').replace('/', '-').replace('@', '') -os.makedirs(output_dir, exist_ok=True) +def compile_benchmarks(compiler, build_dir, clean=False, march="native"): + """Compile the benchmark code with the specified compiler.""" + print(f"Compiling benchmarks with {compiler}...") + + # Clean build directory if requested + if clean and os.path.exists(build_dir): + print(f"Cleaning build directory: {build_dir}") + shutil.rmtree(build_dir) + + # Set environment variables for compiler + env = os.environ.copy() + if compiler == "g++": + env["CC"] = "gcc" + env["CXX"] = "g++" + elif compiler == "clang++": + env["CC"] = "clang" + env["CXX"] = "clang++" + + # Configure with CMake + cmake_cmd = [ + "cmake", "-B", build_dir, ".", + f"-DSIMPLE_FAST_FLOAT_BENCHMARK_MARCH={march}" + ] + print(f"Running: {' '.join(cmake_cmd)}") + subprocess.run(cmake_cmd, env=env, check=True) + + # Build with CMake + build_cmd = ["cmake", "--build", build_dir] + print(f"Running: {' '.join(build_cmd)}") + 
subprocess.run(build_cmd, env=env, check=True) + + print("Compilation successful!") # Helper to run a command and return its stdout def run_cmd(cmd): + """Run a command and return its stdout.""" result = subprocess.run(cmd, capture_output=True, text=True) result.check_returncode() return result.stdout # Process a single benchmark invocation and generate .tex -def process_job(label, cmd_args, flags): +def process_job(benchmark_executable, output_dir, cpu_model, compiler_label, + label, cmd_args, flags): + """Run a benchmark and generate LaTeX table.""" # Run the benchmark cmd = [benchmark_executable] + cmd_args + flags print(f"Running: {' '.join(cmd)}", flush=True) @@ -81,7 +135,7 @@ def process_job(label, cmd_args, flags): # Build output file name flag_label = ''.join([f.strip('-') for f in flags]) or 'none' safe_label = label.replace('.', '_') - filename_tex = f"{CPUModel}_{CompilerLabel}_{safe_label}_{flag_label}.tex" + filename_tex = f"{cpu_model}_{compiler_label}_{safe_label}_{flag_label}.tex" filename_raw = filename_tex[:-4] + '.raw' # replace .tex with .raw out_path_tex = os.path.join(output_dir, filename_tex) out_path_raw = os.path.join(output_dir, filename_raw) @@ -98,12 +152,46 @@ def process_job(label, cmd_args, flags): print(f"Written: {out_path_raw}\n", flush=True) -if __name__ == '__main__': +def main(): + """Main function.""" + args = parse_args() + + # Compile the benchmarks + compile_benchmarks( + compiler=args.compiler, + build_dir=args.build_dir, + clean=args.clean, + march=args.march + ) + + # Set up paths and directories + benchmark_executable = f'./{args.build_dir}/benchmarks/benchmark' + output_dir = args.output_dir + os.makedirs(output_dir, exist_ok=True) + + # Get CPU model and clean it for filenames + cpu_model = get_cpu_model().replace(' ', '_').replace('/', '-').replace('@', '') + + # Save compiler information + compiler_info_path = os.path.join(output_dir, f"{args.compiler}.txt") + try: + compiler_version = subprocess.check_output( + [args.compiler, "--version"], text=True) + with open(compiler_info_path, 'w') as f: + f.write(compiler_version) + print(f"Saved compiler info to: {compiler_info_path}") + except Exception as e: + print(f"Warning: Could not get compiler version: {e}") + # File-based benchmarks for filepath in input_files: file_label = os.path.splitext(os.path.basename(filepath))[0] for flags in flag_combinations: process_job( + benchmark_executable=benchmark_executable, + output_dir=output_dir, + cpu_model=cpu_model, + compiler_label=args.compiler, label=file_label, cmd_args=['-f', filepath, '-r', str(runs_r)], flags=flags @@ -113,7 +201,15 @@ def process_job(label, cmd_args, flags): for model in models: for flags in flag_combinations: process_job( + benchmark_executable=benchmark_executable, + output_dir=output_dir, + cpu_model=cpu_model, + compiler_label=args.compiler, label=model, cmd_args=['-m', model, '-v', str(volume_v), '-r', str(runs_r)], flags=flags ) + + +if __name__ == '__main__': + main() diff --git a/scripts/get_summary_metrics.py b/scripts/get_summary_metrics.py new file mode 100755 index 0000000..ebe3778 --- /dev/null +++ b/scripts/get_summary_metrics.py @@ -0,0 +1,261 @@ +#!/usr/bin/env python3 +""" +Extract and summarize benchmark metrics from raw result files. + +This script analyzes benchmark raw output files to extract performance metrics +and provide statistical summaries. It can analyze metrics across different +CPU types (dedicated vs. cloud) and identify outliers. 
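+
+Example (a sketch, assuming raw results live in ./outputs and one CPU folder
+is named apple_m4):
+
+    ./scripts/get_summary_metrics.py --input-dir ./outputs --cpu apple_m4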
+""" +import os +import re +import statistics +import argparse +from collections import defaultdict + + +def get_cpu_type(filename, dedicated_cpus=None): + """ + Determine if a benchmark was run on a dedicated or cloud CPU. + + Args: + filename: Path to the benchmark result file + dedicated_cpus: Set of CPU folder names considered dedicated + + Returns: + String: "dedicated" or "cloud" + """ + if dedicated_cpus is None: + dedicated_cpus = {"apple_m4", "AMD_Ryzen9_9900X"} + + parts = os.path.normpath(filename).split(os.sep) + if len(parts) < 2: + return "unknown" + cpu_folder = parts[-2] + if cpu_folder in dedicated_cpus: + return "dedicated" + return "cloud" + + +def extract_metrics_from_file(filename, percent_metrics=None, + raw_metrics=None): + """ + Extract performance metrics from a benchmark result file. + + Args: + filename: Path to the benchmark result file + percent_metrics: List of metrics reported with percent variation + raw_metrics: List of raw metrics to extract + + Returns: + Tuple of (percent_values, percent_sources, raw_values) + """ + if percent_metrics is None: + percent_metrics = ["MB/s", "c/f", "i/f"] + if raw_metrics is None: + raw_metrics = ["i/c"] + + percent_values = defaultdict(list) + percent_sources = defaultdict(list) + raw_values = defaultdict(list) + algo = None + + with open(filename, "r") as f: + for line in f: + m = re.match(r"([a-zA-Z0-9_]+)\s*:", line) + if m: + algo = m.group(1) + if algo is None: + continue + + for metric in percent_metrics: + regex = rf"{re.escape(metric)}\s*\(\+/-\s*([-+]?\d+\.\d+)\s*%\)" + pmatch = re.search(regex, line) + if pmatch: + val = float(pmatch.group(1)) + percent_values[(algo, metric)].append(val) + percent_sources[(algo, metric)].append((filename, val)) + + for metric in raw_metrics + percent_metrics: + regex = rf"([-\d\.eE]+)\s+{re.escape(metric)}\b" + match = re.search(regex, line) + if match: + value = float(match.group(1)) + raw_values[(algo, metric)].append(value) + return percent_values, percent_sources, raw_values + + +def collect_all_stats(root=".", cpu_filter=None, dedicated_cpus=None): + """ + Collect statistics from all benchmark result files. 
+ + Args: + root: Root directory to search for benchmark files + cpu_filter: Optional CPU folder name to filter results + dedicated_cpus: Set of CPU folder names considered dedicated + + Returns: + Dictionary of collected statistics + """ + all_data = { + "dedicated": { + "percent": defaultdict(list), + "percent_src": defaultdict(list), + "raw": defaultdict(list) + }, + "cloud": { + "percent": defaultdict(list), + "percent_src": defaultdict(list), + "raw": defaultdict(list) + }, + "global": { + "percent": defaultdict(list), + "percent_src": defaultdict(list), + "raw": defaultdict(list) + }, + } + + for dirpath, _, filenames in os.walk(root): + if cpu_filter is not None: + if os.path.basename(dirpath) != cpu_filter: + continue + for fname in filenames: + if fname.endswith(".raw"): + fullpath = os.path.join(dirpath, fname) + cpu_type = get_cpu_type(fullpath, dedicated_cpus) + percent_vals, percent_sources, raw_vals = extract_metrics_from_file(fullpath) + for key, vals in percent_vals.items(): + all_data[cpu_type]["percent"][key].extend(vals) + all_data["global"]["percent"][key].extend(vals) + for key, vals in percent_sources.items(): + all_data[cpu_type]["percent_src"][key].extend(vals) + all_data["global"]["percent_src"][key].extend(vals) + for key, vals in raw_vals.items(): + all_data[cpu_type]["raw"][key].extend(vals) + all_data["global"]["raw"][key].extend(vals) + return all_data + + +def print_stats_block(label, stats, outlier_threshold=5.0, + percent_metrics=None, raw_metrics=None): + """ + Print a formatted block of statistics. + + Args: + label: Block label + stats: Statistics dictionary + outlier_threshold: Threshold for reporting outliers + percent_metrics: List of metrics reported with percent variation + raw_metrics: List of raw metrics to report + """ + if percent_metrics is None: + percent_metrics = ["MB/s", "c/f", "i/f"] + if raw_metrics is None: + raw_metrics = ["i/c"] + + print(f"\n=== {label.upper()} ===") + percent_stats = stats["percent"] + percent_sources = stats["percent_src"] + raw_stats = stats["raw"] + + for metric in percent_metrics: + print(f"\nMetric: {metric}") + algos = sorted(set(a for (a, m) in percent_stats if m == metric)) + for algo in algos: + vals = percent_stats[(algo, metric)] + if vals: + mean = statistics.mean(vals) + median = statistics.median(vals) + print( + f" Algorithm: {algo:15s} [%] " + f"min={min(vals):.2f}%, max={max(vals):.2f}%, " + f"mean={mean:.2f}%, median={median:.2f}% (n={len(vals)})" + ) + all_vals = [v for ((a, m), vs) in percent_stats.items() + if m == metric for v in vs] + if all_vals: + mean = statistics.mean(all_vals) + median = statistics.median(all_vals) + print( + f" [Global][%] min={min(all_vals):.2f}%, " + f"max={max(all_vals):.2f}%, mean={mean:.2f}%, " + f"median={median:.2f}% (n={len(all_vals)})" + ) + outlier_vals = [] + for (algo, m), vs in percent_sources.items(): + if m == metric: + for fname, v in vs: + if v > outlier_threshold: + outlier_vals.append((v, algo, fname)) + if outlier_vals: + print(f" Outliers above {outlier_threshold:.1f}%:") + for v, algo, fname in sorted(outlier_vals, reverse=True): + print(f" {v:.2f}% : {algo} [{fname}]") + + for metric in raw_metrics: + print(f"\nMetric: {metric}") + algos = sorted(set(a for (a, m) in raw_stats if m == metric)) + for algo in algos: + vals = raw_stats[(algo, metric)] + if vals: + mean = statistics.mean(vals) + median = statistics.median(vals) + print( + f" Algorithm: {algo:15s} [raw] " + f"min={min(vals):.4g}, max={max(vals):.4g}, " + f"mean={mean:.4g}, 
median={median:.4g} (n={len(vals)})" + ) + all_vals = [v for ((a, m), vs) in raw_stats.items() + if m == metric for v in vs] + if all_vals: + mean = statistics.mean(all_vals) + median = statistics.median(all_vals) + print( + f" [Global][raw] min={min(all_vals):.4g}, " + f"max={max(all_vals):.4g}, mean={mean:.4g}, " + f"median={median:.4g} (n={len(all_vals)})" + ) + + +def main(): + parser = argparse.ArgumentParser( + description="Summarize metrics from benchmark .raw files") + parser.add_argument( + "--cpu", type=str, + help="CPU folder name to restrict analysis (e.g. 'apple_m4')") + parser.add_argument( + "--input-dir", "-i", default="./outputs", + help="Directory containing benchmark .raw files") + parser.add_argument( + "--outlier-threshold", "-t", type=float, default=5.0, + help="Threshold for reporting outliers (default: 5.0%%)") + parser.add_argument( + "--dedicated-cpus", "-d", nargs="+", + default=["apple_m4", "AMD_Ryzen9_9900X"], + help="CPU folder names considered dedicated (non-cloud)") + args = parser.parse_args() + + dedicated_cpus = set(args.dedicated_cpus) + + if args.cpu: + print(f"\nFiltering for CPU: {args.cpu}\n") + all_data = collect_all_stats( + args.input_dir, cpu_filter=args.cpu, dedicated_cpus=dedicated_cpus) + # Only print "global" block in this mode for clarity + print_stats_block( + args.cpu, all_data["global"], outlier_threshold=args.outlier_threshold) + else: + all_data = collect_all_stats( + args.input_dir, dedicated_cpus=dedicated_cpus) + print_stats_block( + "dedicated", all_data["dedicated"], + outlier_threshold=args.outlier_threshold) + print_stats_block( + "cloud", all_data["cloud"], + outlier_threshold=args.outlier_threshold) + print_stats_block( + "global", all_data["global"], + outlier_threshold=args.outlier_threshold) + + +if __name__ == "__main__": + main() diff --git a/scripts/latex_table.py b/scripts/latex_table.py index 116120b..618589f 100755 --- a/scripts/latex_table.py +++ b/scripts/latex_table.py @@ -1,11 +1,26 @@ #!/usr/bin/env python3 +""" +Convert benchmark output to LaTeX tables. + +This script parses benchmark output data and generates a formatted LaTeX table +with performance metrics (ns/float, instructions/float, instructions/cycle). +It formats numbers to two significant digits for readability. +""" import sys import re import argparse -# Function to format a number to two significant digits def format_to_two_sig_digits(value): + """ + Format a number to two significant digits with appropriate notation. + + Args: + value: The number to format + + Returns: + A string representation of the number with two significant digits + """ if not isinstance(value, (int, float)) or value == 0: return "N/A" @@ -36,8 +51,16 @@ def format_to_two_sig_digits(value): return f"{'-' if is_negative else ''}{abs_value:.1f}e{exponent}" -# Function to parse the raw input data def parse_input(data): + """ + Parse benchmark output data to extract performance metrics. + + Args: + data: Raw benchmark output as a string + + Returns: + List of dictionaries containing parsed metrics for each algorithm + """ lines = data.splitlines() parsed_data = [] current_entry = None @@ -48,7 +71,8 @@ def parse_input(data): if not line or line.startswith("#"): continue - # Match lines that start a new entry (e.g., "just_string : 1365.92 MB/s ...") + # Match lines that start a new entry + # e.g., "just_string : 1365.92 MB/s ..." 
match_entry = re.match(r"(\S+)\s*:\s*[\d.]+\s*MB/s", line) if match_entry: current_entry = {"name": match_entry.group(1)} @@ -75,8 +99,16 @@ def parse_input(data): return parsed_data -# Function to generate LaTeX table def generate_latex_table(raw_input): + """ + Generate a LaTeX table from benchmark output data. + + Args: + raw_input: Raw benchmark output as a string + + Returns: + Formatted LaTeX table as a string + """ data = parse_input(raw_input) latex_table = r""" @@ -86,7 +118,7 @@ def generate_latex_table(raw_input): \midrule """ for entry in data: - name = entry["name"].replace("_", "\\_") # Escape underscores for LaTeX + name = entry["name"].replace("_", "\\_") # Escape underscore for LaTeX ns_per_float = format_to_two_sig_digits(entry['ns_per_float']) if 'ns_per_float' in entry else 'N/A' inst_per_float = format_to_two_sig_digits(entry['inst_per_float']) if 'inst_per_float' in entry else 'N/A' inst_per_cycle = format_to_two_sig_digits(entry['inst_per_cycle']) if 'inst_per_cycle' in entry else 'N/A' @@ -97,9 +129,13 @@ def generate_latex_table(raw_input): return latex_table -if __name__ == "__main__": - parser = argparse.ArgumentParser(description="Generate LaTeX table from performance data") - parser.add_argument("file", nargs="?", help="Optional input file name (if not provided, reads from stdin)") +def main(): + """Parse command line arguments and generate LaTeX table.""" + parser = argparse.ArgumentParser( + description="Generate LaTeX table from performance data") + parser.add_argument( + "file", nargs="?", + help="Optional input file name (if not provided, reads from stdin)") args = parser.parse_args() # Read input data @@ -118,3 +154,7 @@ def generate_latex_table(raw_input): latex_output = generate_latex_table(raw_input) print(latex_output) + + +if __name__ == "__main__": + main() diff --git a/scripts/test_x86_levels.bash b/scripts/test_x86_levels.bash index 734618c..3c954fa 100755 --- a/scripts/test_x86_levels.bash +++ b/scripts/test_x86_levels.bash @@ -1,25 +1,50 @@ #!/bin/bash +# +# This script benchmarks floating-point serialization performance across different +# x86 microarchitecture levels (x86-64, x86-64-v2, x86-64-v3, x86-64-v4, native). +# It compiles and runs benchmarks for each architecture level, then generates +# LaTeX tables from the results. +# +# Usage: ./test_x86_levels.bash +# +# The script will: +# 1. Create an output directory for the specified CPU +# 2. For each architecture level: +# - Compile the benchmarks with the appropriate -march flag +# - Run benchmarks on three datasets (canada, mesh, uniform_01) +# - Save raw results to output files +# 3. Convert all raw results to LaTeX tables +# +# Results are saved in the outputs/ directory. CPU=$1 OutputDir="outputs/${CPU}" -Algorithms="schubfach,dragonbox" +Algorithms="schubfach,dragonbox" # comma-separated list +# Check if CPU name was provided if [ -z "$1" ]; then echo "Usage: $0 " exit 1 fi +# Create output directory mkdir -p ${OutputDir} + +# Test each x86 architecture level for v in x86-64 x86-64-v2 x86-64-v3 x86-64-v4 native; do + # Compile with specific architecture target cmake -B build-${v} -DSIMPLE_FAST_FLOAT_BENCHMARK_MARCH=${v} cmake --build build-${v} echo "Running benchmarks for ${v} on ${CPU}..." 
+ + # Run benchmarks on different datasets ./build-${v}/benchmarks/benchmark -f data/canada.txt -a ${Algorithms} > ${OutputDir}/${CPU}_g++_canada_none_${v}.raw ./build-${v}/benchmarks/benchmark -f data/mesh.txt -a ${Algorithms} > ${OutputDir}/${CPU}_g++_mesh_none_${v}.raw ./build-${v}/benchmarks/benchmark -a ${Algorithms} > ${OutputDir}/${CPU}_g++_uniform_01_none_${v}.raw done +# Convert all raw results to LaTeX tables for f in ${OutputDir}/*.raw; do python3 scripts/latex_table.py "$f" > "${f%.raw}.tex" echo "Converted $f to ${f%.raw}.tex"