Environment reference #11

@soberm258

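date: 2026-01-13
A very good beginner RAG project; even as a complete novice I could follow all of it. Here is the environment configuration I used (put together with help from GPT-5.2) so that others can get set up quickly. Note that the model settings in the project's config also need to be modified: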

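# requirements.txt
# Python version: 3.10 / 3.11 recommended (on 3.12, make sure torch / torchvision / modelscope / faiss and the other dependencies have wheels available for your platform)
#
# Notes:
# - On Windows, install `faiss` with conda (pip often has no usable wheel), for example: conda install -c conda-forge faiss-cpu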


# Basic tools
numpy>=1.23,<2
tqdm>=4.66
loguru>=0.7

# Retrieval / tokenization / classical features
jieba>=0.42.1
nltk>=3.8
scikit-learn>=1.3,<2
scipy<2

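# Vector search (faiss)
#faiss-cpu>=1.7.4; platform_system != "Windows" # Compatibility is a concern on Windows; conda install is recommended there

# Embedding / Rerank / LLM (local models)
# torch>=2.0 # choose the build for your platform on the official PyTorch site and install it from there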
torchvision>=0.15
transformers==4.48.3
huggingface-hub>=0.25,<0.26
accelerate>=0.26
sentence-transformers>=2.6

# Sentence splitting (ModelScope)
#modelscope>=1.15 # Note: the project does not use ModelScope by default, so it can be skipped. When I tried installing it, dependencies kept turning up missing, so I simply modified ./tinyrag/sentence_splitter.py as follows:
# def __init__(self,
#              use_model: bool = False,
#              sentence_size=256,
#              model_path: str = "damo/nlp_bert_document-segmentation_chinese-base",
#              device="cpu"
#              ):
#     self.sentence_size = sentence_size
#     self.use_model = use_model
#     if self.use_model:
#         try:
#             from modelscope.pipelines import pipeline
#         except ModuleNotFoundError as e:
#             raise ModuleNotFoundError(
#                 "The sentence-splitting model is enabled (use_model=True), but modelscope or one of its dependencies is missing from the current environment. "
#                 "Please install modelscope and its dependencies."
#             ) from e
#         # assert model_path == "" "model path is empty"
#         self.sent_split_pp = pipeline(
#             task="document-segmentation",
#             model=model_path,
#             device=device
#         )

# Document parsing
PyMuPDF>=1.23
python-docx>=1.1
python-pptx>=0.6.23
markdown>=3.6
beautifulsoup4>=4.12
Pillow>=10.0

# Online APIs (optional)
openai>=1.0
zhipuai>=2.0
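
Since torch and faiss are installed outside of pip in this setup, it can help to confirm afterwards that everything is actually visible to the interpreter. The following is a minimal sketch of such a check, not part of the project: the helper names `python_version_ok` and `missing_modules` are made up for illustration, and the module list is only a sample. Note that some import names differ from the pip names (scikit-learn imports as `sklearn`, PyMuPDF as `fitz`, python-docx as `docx`).

```python
import sys
from importlib import util

# Import names (not pip names) for a sample of the packages listed above.
REQUIRED = ["numpy", "jieba", "sklearn", "torch", "transformers", "faiss", "fitz", "docx"]


def python_version_ok(version=None):
    """Return True for the 3.10 / 3.11 versions recommended above."""
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) in {(3, 10), (3, 11)}


def missing_modules(names, finder=util.find_spec):
    """Return the names that cannot be located, without importing them."""
    return [name for name in names if finder(name) is None]


if __name__ == "__main__":
    if not python_version_ok():
        # Other versions may still work if wheels exist for the platform.
        print("Warning: Python", sys.version.split()[0],
              "is outside the recommended 3.10 / 3.11 range")
    missing = missing_modules(REQUIRED)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Running it inside the activated environment lists whatever still needs to be installed; a missing `faiss` or `torch` points back to the conda and PyTorch site steps noted in the comments.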

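The commented-out `__init__` in the ModelScope section defers the modelscope import until `use_model=True`, so the default path never needs it. What the project does on that default path is not shown here. As a rough illustration only, a rule-based splitter could look like the sketch below. The function names `split_sentences` and `pack_chunks` and the punctuation rule are assumptions, not TinyRAG's actual implementation; only the `sentence_size=256` default is taken from the snippet.

```python
import re

# Assumed rule: a sentence ends after Chinese or English terminal punctuation.
# The English period is left out because it also appears inside numbers.
_SENTENCE_END = re.compile(r"(?<=[。！？!?；;])")


def split_sentences(text):
    """Split text into sentences, keeping the trailing punctuation."""
    return [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]


def pack_chunks(sentences, sentence_size=256):
    """Greedily merge consecutive sentences into chunks of at most
    sentence_size characters. A single longer sentence is kept whole."""
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) > sentence_size:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "你好。今天天气不错！Is it? Yes."
    print(pack_chunks(split_sentences(sample), sentence_size=10))
```

Nothing here depends on modelscope, which is why that line can stay commented out as long as `use_model` stays `False`.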