Document decision-tree baseline models and default sample sizes #3

Open

hongqin wants to merge 1 commit into main from codex/describe-rf/xgb/catboost-baselines-and-samples

Contributor

hongqin commented Feb 15, 2026

Motivation

Clarify which tree-based baseline models are used for decision-tree experiments and make the default dataset sizing explicit so users can reproduce training and tuning runs.

Description

Add a "Baseline models and sample sizes" subsection to `decision_tree/README.md`. It lists the three baselines, Random Forest (`rf`), XGBoost (`xgb`), and CatBoost (`cat`), and documents the default dataset sizes: with `-num 2000`, each of the five VOCs contributes 2,000 samples, for 10,000 in total, split into 8,000 for training and 2,000 for testing. The default fine-tuning subset is 500 samples per VOC, 2,500 in total.
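The default sizes quoted in the description follow from simple arithmetic, which can be sketched as below. This is only an illustration: the function name `default_sample_sizes` and its parameter names are hypothetical and not taken from the repository, and the 20% test fraction is an assumption inferred from the 8,000 / 2,000 split.

```python
# Hypothetical helper, not part of the repository: it only reproduces the
# arithmetic behind the defaults quoted in the PR description.
def default_sample_sizes(num_per_voc=2000, n_vocs=5,
                         test_fraction=0.2, finetune_per_voc=500):
    """Return the dataset sizes implied by the given per-VOC settings."""
    total = num_per_voc * n_vocs              # samples across all VOCs
    test = int(round(total * test_fraction))  # assumed 80/20 train/test split
    train = total - test
    finetune_total = finetune_per_voc * n_vocs
    return {
        "total": total,
        "train": train,
        "test": test,
        "finetune_total": finetune_total,
    }


sizes = default_sample_sizes()
print(sizes)
```

Changing `num_per_voc` in this sketch shows how a different `-num` value would scale the total, train, and test counts, assuming the split ratio stays fixed.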

Testing

Ran `nl -ba decision_tree/README.md | sed -n '1,90p'` to verify the README content; the command succeeded.
Ran `git status --short && git rev-parse --short HEAD` to confirm the working tree state; the command succeeded.


          Document decision-tree baselines and default sample sizes

f7a9259

hongqin added the codex label with ChatGPT Codex Connector
