
Commit b22dc4d

Fix config file loading: require --config flag instead of positional argument (#223)
This PR fixes how configuration files are passed to the training scripts: some of the documented commands work as written, while others crash. Previously, the training YAML file (e.g. `config_qlora.yaml`) was provided as a positional argument (first noticed in the Zephyr-7B-beta QLoRA DPO example):

```shell
accelerate launch ... scripts/dpo.py recipes/zephyr-7b-beta/dpo/config_qlora.yaml
```

In this case, `HfArgumentParser` only receives the command-line flags and never parses `config_qlora.yaml`, leaving `dataset_mixture=None` and triggering the error:

```
ValueError: Either `dataset_name` or `dataset_mixture` must be provided
```

The correct usage is to pass the config file with the `--config` flag:

```shell
accelerate launch ... scripts/dpo.py --config recipes/zephyr-7b-beta/dpo/config_qlora.yaml
```
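For context, the failure mode can be reproduced with a minimal sketch (this is not the handbook's actual `dpo.py`; the `ScriptArguments` dataclass is a hypothetical stand-in, and the real scripts may route the `--config` file differently): `HfArgumentParser` only fills dataclass fields from recognised `--flags`, so a bare positional YAML path ends up in the "remaining" strings, whereas explicitly loading the file (e.g. via `parse_yaml_file`) does populate the fields.

```python
# Minimal sketch of the failure mode -- NOT the real scripts/dpo.py.
# ScriptArguments is a hypothetical stand-in for the handbook's config classes.
from dataclasses import dataclass
from typing import Optional

from transformers import HfArgumentParser


@dataclass
class ScriptArguments:
    dataset_name: Optional[str] = None
    dataset_mixture: Optional[dict] = None


parser = HfArgumentParser(ScriptArguments)

# Positional usage: the YAML path is just an unrecognised token, so no dataclass
# field is populated and dataset_mixture stays None.
args, remaining = parser.parse_args_into_dataclasses(
    args=["recipes/zephyr-7b-beta/dpo/config_qlora.yaml"],
    return_remaining_strings=True,
)
print(args.dataset_mixture)  # None
print(remaining)             # ['recipes/zephyr-7b-beta/dpo/config_qlora.yaml']

# --config-style usage: once the script knows which file was passed, it can load
# the YAML explicitly and the recipe values do populate the dataclass fields
# (assumes the recipe defines dataset_mixture and a transformers version whose
# parse_yaml_file accepts allow_extra_keys).
(args,) = parser.parse_yaml_file(
    "recipes/zephyr-7b-beta/dpo/config_qlora.yaml", allow_extra_keys=True
)
print(args.dataset_mixture)
```

Whatever parser the scripts actually use, the observable symptom matches the traceback above: with the positional form, `dataset_mixture` is never set.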
1 parent 925acdf commit b22dc4d

File tree

1 file changed (+4, -4 lines)

recipes/zephyr-7b-beta/README.md

Lines changed: 4 additions & 4 deletions
````diff
@@ -29,16 +29,16 @@ Train faster with flash-attention 2 (GPU supporting FA2: A100, H100, etc)
 ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/ddp.yaml --num_processes=1 scripts/sft.py --config recipes/zephyr-7b-beta/sft/config_qlora.yaml --load_in_4bit=true
 
 # Step 2 - DPO
-ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/ddp.yaml --num_processes=1 scripts/dpo.py recipes/zephyr-7b-beta/dpo/config_qlora.yaml
+ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/ddp.yaml --num_processes=1 scripts/dpo.py --config recipes/zephyr-7b-beta/dpo/config_qlora.yaml
 ```
 
 P.S. Using Flash Attention also allows you to drastically increase the batch size (x2 in my case)
 
 Train without flash-attention (i.e. via PyTorch's scaled dot product attention):
 ```shell
 # Step 1 - SFT
-ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/ddp.yaml --num_processes=1 scripts/sft.py recipes/zephyr-7b-beta/sft/config_qlora.yaml --load_in_4bit=true --attn_implementation=sdpa
+ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/ddp.yaml --num_processes=1 scripts/sft.py --config recipes/zephyr-7b-beta/sft/config_qlora.yaml --load_in_4bit=true --attn_implementation=sdpa
 
 # Step 2 - DPO
-ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/ddp.yaml --num_processes=1 scripts/dpo.py recipes/zephyr-7b-beta/dpo/config_qlora.yaml --attn_implementation=sdpa
-```
+ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/ddp.yaml --num_processes=1 scripts/dpo.py --config recipes/zephyr-7b-beta/dpo/config_qlora.yaml --attn_implementation=sdpa
+```
````

0 commit comments
