Description
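There is a problem with the ChatPDF feature: running chatpdf.py with ChatGLM4 (glm-4-9b-chat) fails immediately with an error. The full output is shown below.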
(mindspore) root@autodl-container-bff2469f3e-a4796232:~/autodl-tmp/ChatPDF# python chatpdf.py
Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.749 seconds.
Prefix dict has been built successfully.
Namespace(sim_model_name='shibing624/text2vec-base-multilingual', gen_model_type='chatglm', gen_model_name='/root/autodl-tmp/Qwen_Fine_tuning/ZhipuAI/glm-4-9b-chat', lora_model=None, rerank_model_name='', corpus_files='sample.pdf', chunk_size=220, chunk_overlap=0, num_expand_context_chunk=1)
The following parameters in checkpoint files are not loaded:
['embeddings.position_ids']
Traceback (most recent call last):
  File "/root/autodl-tmp/ChatPDF/chatpdf.py", line 518, in <module>
    m = ChatPDF(
  File "/root/autodl-tmp/ChatPDF/chatpdf.py", line 169, in __init__
    self.gen_model, self.tokenizer = self._init_gen_model(
  File "/root/autodl-tmp/ChatPDF/chatpdf.py", line 208, in _init_gen_model
    tokenizer = tokenizer_class.from_pretrained(gen_model_name_or_path, mirror='modelscope')
  File "/root/miniconda3/envs/mindspore/lib/python3.9/site-packages/mindnlp/transformers/models/auto/tokenization_auto.py", line 772, in from_pretrained
    raise ValueError(
ValueError: Tokenizer class ChatGLM4Tokenizer does not exist or is not currently imported.
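
To help narrow this down, here is a minimal diagnostic sketch that checks whether a given class name can be found in an installed package. The helper `has_class` is hypothetical (it is not part of mindnlp or ChatPDF), and I am assuming that the tokenizer, if the installed mindnlp build supports it, would be exposed from `mindnlp.transformers`; I have not verified that this is where it would live.

```python
import importlib


def has_class(module_name: str, class_name: str) -> bool:
    """Return True if `module_name` can be imported and exposes `class_name`.

    Hypothetical helper for diagnosis only; returns False when the module
    itself is not installed.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, class_name)


if __name__ == "__main__":
    # Assumption: the class would be exported from mindnlp.transformers
    # if the installed mindnlp version supports GLM-4.
    print(has_class("mindnlp.transformers", "ChatGLM4Tokenizer"))
```

If this prints `False`, it would suggest that the installed mindnlp version does not provide the tokenizer class that the glm-4-9b-chat checkpoint asks for, which would match the `ValueError` above.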