@leisuzz leisuzz commented Aug 26, 2025

What does this PR do?

  1. When using accelerate with DeepSpeed ZeRO-2, I hit this error:
    "ValueError: At least one of the dataloaders passed to accelerate.prepare() has None as batch size. Please set an integer value in train_micro_batch_size_per_gpu in the deepspeed config file or assign integer value to AcceleratorState().deepspeed_plugin.deepspeed_config['train_micro_batch_size_per_gpu']."

    I think the current batch_sampler does not support DeepSpeed, so I added a function to support it.

  2. I've also added a function to support saving checkpoints when using DeepSpeed.
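
For readers hitting the same ValueError, here is a minimal sketch of the workaround the error message itself suggests: make sure `train_micro_batch_size_per_gpu` is an integer before `accelerate.prepare()` is called. The helper name `resolve_micro_batch_size` and the sample config are hypothetical and are not taken from this PR's diff; the function works on a plain dict so it can be run without DeepSpeed installed.

```python
def resolve_micro_batch_size(ds_config, batch_size):
    """Return a copy of ds_config with an integer train_micro_batch_size_per_gpu.

    Hypothetical helper for illustration, not the function added in this PR.
    """
    if isinstance(batch_size, bool) or not isinstance(batch_size, int) or batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    resolved = dict(ds_config)  # shallow copy; the caller's dict is left untouched
    current = resolved.get("train_micro_batch_size_per_gpu")
    # Only fill in the value when it is missing or left as "auto".
    if current is None or current == "auto":
        resolved["train_micro_batch_size_per_gpu"] = batch_size
    return resolved


# Sample config (assumed for illustration): ZeRO stage 2 with an "auto" batch size.
ds_config = {"zero_optimization": {"stage": 2}, "train_micro_batch_size_per_gpu": "auto"}
resolved = resolve_micro_batch_size(ds_config, 4)

# In a real training script, the equivalent assignment named in the error message is:
#   from accelerate.state import AcceleratorState
#   AcceleratorState().deepspeed_plugin.deepspeed_config[
#       "train_micro_batch_size_per_gpu"
#   ] = train_batch_size  # train_batch_size: your per-device batch size (assumed name)
# and it has to run before accelerator.prepare(...) receives the dataloader.
```

An explicitly configured integer is left alone, so a value set in the DeepSpeed config file still takes precedence over the script's default.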

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

leisuzz commented Aug 26, 2025

Cc: @sayakpaul @a-r-r-o-w Please take a look at this PR. Thanks!

@leisuzz leisuzz force-pushed the kontext branch 3 times, most recently from 3950d2e to 3094c2b Compare August 26, 2025 09:01
leisuzz commented Aug 26, 2025

@sayakpaul I've also added DeepSpeed support to the checkpoint load/save functions. Thanks!
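
As background on why checkpointing often needs a DeepSpeed-specific branch: under ZeRO, optimizer state is sharded across ranks and DeepSpeed's checkpoint save is a collective call, so a guard that only lets the main process save can hang or write an incomplete checkpoint. The sketch below shows one way to express that guard. The function name is hypothetical, the string `"DEEPSPEED"` stands in for accelerate's `DistributedType.DEEPSPEED`, and this is not necessarily how the PR implements it.

```python
def should_write_checkpoint(is_main_process, distributed_type):
    """Decide whether this rank should take part in saving a checkpoint.

    Hypothetical helper: with DeepSpeed every rank must participate,
    otherwise only the main process saves.
    """
    return is_main_process or distributed_type == "DEEPSPEED"


# Non-main rank under DeepSpeed still participates in the save.
rank1_deepspeed = should_write_checkpoint(False, "DEEPSPEED")
# Non-main rank under plain multi-GPU training skips it.
rank1_multi_gpu = should_write_checkpoint(False, "MULTI_GPU")
```

In a training loop, the result would gate the call to `accelerator.save_state(...)`.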

@sayakpaul sayakpaul left a comment

Thanks a lot!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

leisuzz commented Aug 28, 2025

@sayakpaul Please approve the workflow, thx :)

leisuzz commented Aug 28, 2025

Hi @sayakpaul, It looks weird but I think the original code does not need the empty line. I added the empty line to it. Thanks :)

leisuzz commented Aug 29, 2025

Cc: @sayakpaul, I ran ruff and make quality, and both passed. Sorry for the trouble and thanks for your help :)


3 participants