Document decision-tree baseline models and default sample sizes #3

Open
hongqin wants to merge 1 commit into main from codex/describe-rf/xgb/catboost-baselines-and-samples

@hongqin hongqin commented Feb 15, 2026

Motivation

  • Clarify which tree-based baseline models are used for decision-tree experiments and make the default dataset sizing explicit so users can reproduce training and tuning runs.

Description

  • Add a "Baseline models and sample sizes" subsection to decision_tree/README.md that lists the baselines (Random Forest rf, XGBoost xgb, CatBoost cat) and documents the default sizes: with -num 2000 applied to each of the five VOCs, the dataset holds 10,000 samples (8,000 train / 2,000 test), and the default fine-tuning subset is 500 per VOC (2,500 total).
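
The default sizes quoted above follow from simple arithmetic, which the sketch below reproduces as a sanity check. The function name `default_sizes` and its parameters are hypothetical and do not come from the repository. The 20% test fraction is an assumption inferred from the 8,000 / 2,000 split, and `-num` is assumed to count samples per VOC.

```python
# Hypothetical helper for checking the documented defaults; not repository code.
def default_sizes(num_per_voc=2000, n_vocs=5, test_fraction=0.2, finetune_per_voc=500):
    """Return the dataset sizes implied by the documented defaults."""
    total = num_per_voc * n_vocs          # 2,000 samples for each of 5 VOCs
    test = round(total * test_fraction)   # assumed 80/20 train/test split
    train = total - test
    finetune_total = finetune_per_voc * n_vocs
    return {
        "total": total,
        "train": train,
        "test": test,
        "finetune_total": finetune_total,
    }


if __name__ == "__main__":
    for name, value in default_sizes().items():
        print(f"{name}: {value}")
```

With the default arguments this yields 10,000 total, 8,000 train, 2,000 test, and 2,500 fine-tuning samples, matching the figures in the description.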

Testing

  • Ran nl -ba decision_tree/README.md | sed -n '1,90p' to verify the README content and it succeeded.
  • Ran git status --short && git rev-parse --short HEAD to confirm the working tree state and it succeeded.
