diff --git a/ROADMAP.md b/ROADMAP.md
index f94a932..598c66a 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -38,7 +38,7 @@ source venv/bin/activate
 
 # 3. 下载实验数据（可选，部分实验不需要）
 cd modules/common
-python datasets.py --download-all
+python data_sources.py --download-all
 cd ../..
 ```
 
diff --git a/docs/guide/quick-start.md b/docs/guide/quick-start.md
index 9125db9..c2ac435 100644
--- a/docs/guide/quick-start.md
+++ b/docs/guide/quick-start.md
@@ -27,7 +27,7 @@ source venv/bin/activate
 
 # 3. 下载实验数据（可选，部分实验不需要）
 cd modules/common
-python datasets.py --download-all
+python data_sources.py --download-all
 cd ../..
 ```
 
diff --git a/en/ROADMAP.md b/en/ROADMAP.md
index ce0190a..14e783d 100644
--- a/en/ROADMAP.md
+++ b/en/ROADMAP.md
@@ -39,7 +39,7 @@ source venv/bin/activate
 
 # 3. Download experiment data (optional, some experiments do not need it)
 cd modules/common
-python datasets.py --download-all
+python data_sources.py --download-all
 cd ../..
 ```
 
diff --git a/en/docs/guide/quick-start.md b/en/docs/guide/quick-start.md
index 1938547..3a1875e 100644
--- a/en/docs/guide/quick-start.md
+++ b/en/docs/guide/quick-start.md
@@ -20,7 +20,7 @@ source venv/bin/activate
 
 # 3. Download experiment datasets (optional)
 cd modules/common
-python datasets.py --download-all
+python data_sources.py --download-all
 cd ../..
 ```
 
diff --git a/en/modules/01-foundation/index.md b/en/modules/01-foundation/index.md
index c2654e7..9d24d2f 100644
--- a/en/modules/01-foundation/index.md
+++ b/en/modules/01-foundation/index.md
@@ -125,7 +125,7 @@ For each module:
 
 A: Check the following:
 1. Did you activate the virtual environment? `source venv/bin/activate`
-2. Did you download data? `cd modules/common && python datasets.py --download-all`
+2. Did you download data? `cd modules/common && python data_sources.py --download-all`
 3. Are you in the correct folder? Experiments must run inside `experiments/`
 
 **Q: Experiments are too slow?**
diff --git a/en/modules/index.md b/en/modules/index.md
index 2004c1a..9a2d64f 100644
--- a/en/modules/index.md
+++ b/en/modules/index.md
@@ -58,6 +58,23 @@ _(Planned)_
 
 ---
 
+## 📋 System Requirements
+
+### Python Version
+- **Recommended**: Python 3.10+
+- **Minimum**: Python 3.10
+
+Some utility code uses the union type syntax introduced in Python 3.10 (PEP 604, e.g., `str | list`), so it will not run on earlier versions.
+
+### Dependencies
+```bash
+pip install torch requests datasets matplotlib numpy
+```
+
+See: [Environment Setup Guide](../docs/guide/environment-setup.md)
+
+---
+
 ## ⚡ Quick Start
 
 ### Environment setup
@@ -68,7 +85,7 @@ source venv/bin/activate
 
 # 2. Download experiment data (~60 MB)
 cd modules/common
-python datasets.py --download-all
+python data_sources.py --download-all
 ```
 
 ### 30-minute quick experience
@@ -170,10 +187,10 @@ Each design choice answers:
 
 Shared tools live in `modules/common/`:
 
-### datasets.py - Dataset manager
+### data_sources.py - Dataset manager
 
 ```python
-from modules.common.datasets import get_experiment_data
+from modules.common.data_sources import get_experiment_data
 
 # TinyShakespeare
 text = get_experiment_data('shakespeare')
@@ -204,7 +221,13 @@ from modules.common.visualization import (
 )
 ```
 
-See docstrings in each file for details.
+See docstrings in each file or [`modules/common/README.md`](../modules/common/README.md) for details.
+
+#### ⚠️ Migration Notice
+
+**2026-02**: `datasets.py` has been renamed to `data_sources.py` to avoid a naming conflict with the HuggingFace `datasets` library.
+
+For a detailed migration guide, see [modules/common/README.md](../modules/common/README.md) or [PR #20](https://github.com/joyehuang/minimind-notes/pull/20).
 
 ---
 
diff --git a/modules/01-foundation/README.md b/modules/01-foundation/README.md
index 4e1967a..3e60591 100644
--- a/modules/01-foundation/README.md
+++ b/modules/01-foundation/README.md
@@ -125,7 +125,7 @@ keywords: Transformer基础组件, 归一化, 位置编码, 注意力机制, 前
 
 A: 检查以下几点：
 1. 是否激活了虚拟环境？ `source venv/bin/activate`
-2. 是否下载了数据？ `cd modules/common && python datasets.py --download-all`
+2. 是否下载了数据？ `cd modules/common && python data_sources.py --download-all`
 3. 是否在正确的目录？实验需要在 `experiments/` 目录下运行
 
 **Q: 实验太慢怎么办？**
diff --git a/modules/README.md b/modules/README.md
index 3b85d16..4271b89 100644
--- a/modules/README.md
+++ b/modules/README.md
@@ -58,6 +58,23 @@ _（后续扩展）_
 
 ---
 
+## 📋 系统要求
+
+### Python 版本
+- **推荐**: Python 3.10+
+- **最低**: Python 3.10
+
+部分工具代码使用了 Python 3.10+ 的类型注解语法（如 `str | list`），低于此版本将无法运行。
+
+### 依赖安装
+```bash
+pip install torch requests datasets matplotlib numpy
+```
+
+详见：[环境配置指南](../docs/guide/environment-setup.md)
+
+---
+
 ## ⚡ 快速开始
 
 ### 准备环境
@@ -68,7 +85,7 @@ source venv/bin/activate
 
 # 2. 下载实验数据（约 60 MB）
 cd modules/common
-python datasets.py --download-all
+python data_sources.py --download-all
 ```
 
 ### 30 分钟快速体验
@@ -170,10 +187,10 @@ python exp_xxx.py --help
 
 模块提供了以下通用工具（位于 `modules/common/`）：
 
-### datasets.py - 数据集管理
+### data_sources.py - 数据集管理
 
 ```python
-from modules.common.datasets import get_experiment_data
+from modules.common.data_sources import get_experiment_data
 
 # 获取 TinyShakespeare
 text = get_experiment_data('shakespeare')
@@ -204,7 +221,34 @@ from modules.common.visualization import (
 )
 ```
 
-详细文档见各文件的 docstring。
+详细文档见各文件的 docstring 或 [`modules/common/README.md`](./common/README.md)。
+
+#### ⚠️ 迁移说明
+
+**2026-02**: `datasets.py` 已重命名为 `data_sources.py`
+
+如果你的代码使用了旧的导入方式：
+```python
+# 旧代码（会报错）
+from modules.common.datasets import get_experiment_data
+```
+
+请更新为：
+```python
+# 新代码
+from modules.common.data_sources import get_experiment_data
+```
+
+命令行使用也需要更新：
+```bash
+# 旧命令
+python datasets.py --download-all
+
+# 新命令
+python data_sources.py --download-all
+```
+
+**变更原因**: 避免与 HuggingFace `datasets` 库命名冲突，详见 [通用工具文档](./common/README.md#重要变更说明)
 
 ---
 
diff --git a/modules/common/README.md b/modules/common/README.md
new file mode 100644
index 0000000..ae3ce70
--- /dev/null
+++ b/modules/common/README.md
@@ -0,0 +1,122 @@
+# 通用工具 (Common Utilities)
+
+本目录包含所有实验模块共享的工具代码。
+
+## 📋 系统要求
+
+### Python 版本
+- **要求**: Python 3.10+
+
+**注意**: 代码使用了 Python 3.10+ 的类型联合语法（`str | list`），低于此版本将无法运行。
+
+### 依赖库
+- `torch` - PyTorch 深度学习框架
+- `requests` - HTTP 请求（用于数据下载）
+- `datasets` - HuggingFace datasets 库（用于 TinyStories 下载）
+
+安装方法：
+```bash
+pip install torch requests datasets
+```
+
+## 📦 可用工具
+
+### data_sources.py - 数据集管理
+
+提供统一的实验数据接口，支持：
+- TinyShakespeare（经典字符级数据，1MB）
+- TinyStories（现代英文，支持取子集）
+- 合成数据（用于可视化实验）
+
+**使用示例**:
+```python
+from modules.common.data_sources import get_experiment_data
+
+# 获取 TinyShakespeare
+text = get_experiment_data('shakespeare')
+
+# 获取 TinyStories 子集（10MB）
+texts = get_experiment_data('tinystories', size_mb=10)
+
+# 生成合成数据
+text = get_experiment_data('synthetic', size_mb=1)
+```
+
+**命令行使用**:
+```bash
+cd modules/common
+
+# 下载所有数据集
+python data_sources.py --download-all
+
+# 测试单个数据集
+python data_sources.py --dataset shakespeare
+```
+
+### experiment_base.py - 实验基类
+
+提供统一的实验框架，包括：
+- 自动设备检测（CPU/MPS/CUDA）
+- 结果保存（图表 + 指标）
+- 进度显示
+- 可复现性（固定随机种子）
+
+**使用示例**:
+```python
+from modules.common.experiment_base import Experiment
+
+class MyExperiment(Experiment):
+    def __init__(self):
+        super().__init__(
+            name="my_experiment",
+            output_dir="experiments/results"
+        )
+
+    def run(self):
+        # 你的实验代码
+        metrics = {'accuracy': 0.95}
+        self.print_metrics(metrics)
+        self.save_metrics(metrics)
+
+exp = MyExperiment()
+exp.run()
+```
+
+### visualization.py - 可视化工具
+
+提供常用的可视化函数。
+
+**注意**: 此文件目前尚未创建，计划在后续模块中添加。
+
+## ⚠️ 重要变更说明
+
+### datasets.py 已重命名为 data_sources.py (2026-02)
+
+**原因**: 避免与 HuggingFace `datasets` 库的命名冲突，该冲突会导致 TinyStories 数据集下载失败。
+
+**背景**: Python 模块搜索时优先查找当前目录，如果存在本地 `datasets.py`，会导致 `from datasets import load_dataset` 错误导入本地文件而非 HuggingFace 库。
+
+**迁移方法**:
+
+| 旧代码 | 新代码 |
+|--------|--------|
+| `from modules.common.datasets import ...` | `from modules.common.data_sources import ...` |
+| `python datasets.py --download-all` | `python data_sources.py --download-all` |
+
+**注意**:
+- `datasets.py` 文件已完全删除（不再存在于仓库中）
+- 使用旧导入方式会收到标准的 `ModuleNotFoundError`
+- 所有官方文档和实验代码已更新为新文件名
+- Git 历史中仍可通过 `git log -- modules/common/datasets.py` 追溯旧文件
+
+**相关信息**:
+- 问题追踪: GitHub Issue #19
+- 详细讨论: GitHub Pull Request #20
+
+## 📝 贡献指南
+
+在添加新工具时，请：
+1. 在本 README 中添加工具说明
+2. 在文件头部添加清晰的文档字符串
+3. 提供使用示例
+4. 确保工具是通用的，可以被多个模块复用
diff --git a/modules/common/datasets.py b/modules/common/data_sources.py
similarity index 82%
rename from modules/common/datasets.py
rename to modules/common/data_sources.py
index 3eeae1a..7d41840 100644
--- a/modules/common/datasets.py
+++ b/modules/common/data_sources.py
@@ -6,8 +6,22 @@
 - TinyStories（现代英文，支持取子集）
 - 合成数据（用于可视化实验）
 
+⚠️ 文件命名说明：
+    此文件名为 data_sources.py 而非 datasets.py，原因是：
+
+    Python 模块搜索时，当前目录优先级高于 site-packages。
+    如果命名为 datasets.py，则 `from datasets import load_dataset`
+    会优先导入本地文件而非 HuggingFace datasets 库，导致 ImportError。
+
+    重命名为 data_sources.py 避免了此冲突。
+    详见：https://github.com/joyehuang/minimind-notes/pull/20
+
+系统要求：
+    - Python 3.10+（使用了类型联合语法 str | list）
+    - 依赖：requests（TinyShakespeare）、datasets（TinyStories）
+
 使用示例：
-    from modules.common.datasets import get_experiment_data
+    from modules.common.data_sources import get_experiment_data
 
     # 获取 TinyShakespeare
     text = get_experiment_data('shakespeare')
@@ -22,14 +36,13 @@
 from typing import List, Optional
 import json
 
-# 数据缓存目录
+# 数据缓存目录（按需创建，不在模块导入时创建）
 DATA_DIR = Path(__file__).parent / 'data'
-DATA_DIR.mkdir(exist_ok=True)
 
 
 def get_experiment_data(
     dataset: str = 'shakespeare',
-    size_mb: Optional[float] = None,
+    size_mb: Optional[int] = None,
     cache: bool = True
 ) -> str | List[str]:
     """
@@ -38,9 +51,9 @@ def get_experiment_data(
     Args:
         dataset: 数据集名称
             - 'shakespeare': TinyShakespeare (1MB)
-            - 'tinystories': TinyStories (可指定大小)
+            - 'tinystories': TinyStories (可指定大小，单位: MB，必须为整数)
             - 'synthetic': 合成随机数据
-        size_mb: 数据大小限制（仅对 tinystories 有效）
+        size_mb: 数据大小限制（仅对 tinystories 和 synthetic 有效，单位: MB）
         cache: 是否使用缓存
 
     Returns:
@@ -79,6 +92,7 @@ def _get_shakespeare(cache: bool = True) -> str:
 
         # 保存缓存
         if cache:
+            DATA_DIR.mkdir(exist_ok=True)  # 确保目录存在
             cache_file.write_text(text, encoding='utf-8')
             print(f"✅ 已缓存到: {cache_file}")
 
@@ -91,10 +105,14 @@ def _get_shakespeare(cache: bool = True) -> str:
         raise
 
 
-def _get_tinystories(size_mb: float, cache: bool = True) -> List[str]:
+def _get_tinystories(size_mb: int, cache: bool = True) -> List[str]:
     """
     获取 TinyStories 子集
 
+    Args:
+        size_mb: 数据大小（MB），必须为整数
+        cache: 是否使用缓存
+
     注意：需要安装 datasets 库
         pip install datasets
     """
@@ -131,6 +149,7 @@ def _get_tinystories(size_mb: float, cache: bool = True) -> List[str]:
 
         # 保存缓存
         if cache:
+            DATA_DIR.mkdir(exist_ok=True)  # 确保目录存在
             with open(cache_file, 'w', encoding='utf-8') as f:
                 json.dump(texts, f, ensure_ascii=False)
             print(f"✅ 已缓存到: {cache_file}")
@@ -146,10 +165,13 @@ def _get_tinystories(size_mb: float, cache: bool = True) -> List[str]:
         raise
 
 
-def _generate_synthetic(size_mb: float) -> str:
+def _generate_synthetic(size_mb: int) -> str:
     """
     生成合成数据（用于快速测试）
 
+    Args:
+        size_mb: 数据大小（MB），必须为整数
+
     生成简单的重复模式，用于验证模型是否能学习
     """
 
diff --git a/modules/index.md b/modules/index.md
index 11a7a42..99c5b7a 100644
--- a/modules/index.md
+++ b/modules/index.md
@@ -58,6 +58,23 @@ _（后续扩展）_
 
 ---
 
+## 📋 系统要求
+
+### Python 版本
+- **推荐**: Python 3.10+
+- **最低**: Python 3.10
+
+部分工具代码使用了 Python 3.10+ 的类型注解语法（如 `str | list`），低于此版本将无法运行。
+
+### 依赖安装
+```bash
+pip install torch requests datasets matplotlib numpy
+```
+
+详见：[环境配置指南](../docs/guide/environment-setup.md)
+
+---
+
 ## ⚡ 快速开始
 
 ### 准备环境
@@ -68,7 +85,7 @@ source venv/bin/activate
 
 # 2. 下载实验数据（约 60 MB）
 cd modules/common
-python datasets.py --download-all
+python data_sources.py --download-all
 ```
 
 ### 30 分钟快速体验
@@ -170,10 +187,10 @@ python exp_xxx.py --help
 
 模块提供了以下通用工具（位于 `modules/common/`）：
 
-### datasets.py - 数据集管理
+### data_sources.py - 数据集管理
 
 ```python
-from modules.common.datasets import get_experiment_data
+from modules.common.data_sources import get_experiment_data
 
 # 获取 TinyShakespeare
 text = get_experiment_data('shakespeare')
@@ -204,7 +221,13 @@ from modules.common.visualization import (
 )
 ```
 
-详细文档见各文件的 docstring。
+详细文档见各文件的 docstring 或 [`modules/common/README.md`](./common/README.md)。
+
+#### ⚠️ 迁移说明
+
+**2026-02**: `datasets.py` 已重命名为 `data_sources.py`，避免与 HuggingFace datasets 库命名冲突。
+
+详细的迁移指南请参考 [modules/README.md](./README.md) 或 [modules/common/README.md](./common/README.md)。
 
 ---