Description
Prerequisites
- I have searched the existing issues and confirmed this is not a duplicate.
- I am using the latest version of the MLLM framework.
Bug Description
A series of ImportErrors occurs because of a breaking change in the datasets library (version 3.0 and above). The modelscope library (v1.34.0) tries to import internal names that have been removed or relocated in newer versions of datasets.
Run the training command:
python train.py --model_path ~/path/to/Qwen1.5-0.5B --max_length 1024 --num_samples 128 --output_dir ~/output/
The error message looks like this:
ImportError: cannot import name 'HubDatasetModuleFactoryWithoutScript' from 'datasets.load' (xxx/python3.10/site-packages/datasets/load.py)
- With datasets version 4.6.1, ALL_ALLOWED_EXTENSIONS is missing:
ImportError: cannot import name 'ALL_ALLOWED_EXTENSIONS' from 'datasets.load'
- With datasets version 2.18.0, LargeList is missing:
ImportError: cannot import name 'LargeList' from 'datasets'
- _FEATURE_TYPES is also reported missing:
ImportError: cannot import name '_FEATURE_TYPES' from 'datasets.features.features'
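To help narrow down which of these names are missing in a given environment, a small diagnostic along the following lines might be useful. It is only a sketch: the probe list is copied from the error messages above, the helper names (`installed_version`, `probe`, `report`) are made up for illustration, and `hasattr` is used as an approximation of what `from ... import ...` would find.

```python
import importlib
from importlib import metadata

# (module, attribute) pairs copied from the ImportError messages in this report.
PROBES = [
    ("datasets.load", "HubDatasetModuleFactoryWithoutScript"),
    ("datasets.load", "ALL_ALLOWED_EXTENSIONS"),
    ("datasets", "LargeList"),
    ("datasets.features.features", "_FEATURE_TYPES"),
]


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def probe(module_name, attr):
    """Report whether `attr` is available on `module_name` in this environment."""
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # ImportError, or any other failure raised during import
        return f"module not importable ({exc})"
    return "present" if hasattr(module, attr) else "MISSING"


def report():
    """Collect package versions and the status of each probed name."""
    lines = [f"{pkg}: {installed_version(pkg)}" for pkg in ("datasets", "modelscope")]
    lines += [f"{mod}.{attr}: {probe(mod, attr)}" for mod, attr in PROBES]
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

Running this in the same environment as train.py and attaching the output should show which datasets version is installed and which of the four names it still exposes.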
Steps to Reproduce
Run the training command:
python train.py --model_path ~/path/to/Qwen1.5-0.5B --max_length 1024 --num_samples 128 --output_dir ~/output/
Expected Behavior
The command should export my module as a .safetensors file successfully.
Operating System
Linux
Device
Computer
MLLM Framework Version
V2.0.0
Model Information
No response
Additional Context
No response