This study proposes an innovative framework aimed at breaking through the cognitive bottlenecks of traditional RAG systems, offering significant reference value for researchers and engineers committed to enhancing the depth and breadth of information processing in LLMs. Specifically, professionals in the following fields will benefit from our work:
Researchers in NLP and Information Retrieval: The active memory extraction paradigm proposed in this paper challenges the traditional text processing workflow of "chunk first, then understand", providing a novel research perspective for fields such as document understanding, semantic segmentation, and knowledge representation.
Developers of LLM Applications: Our work directly addresses the core challenges faced by RAG systems in constructing knowledge-intensive applications, such as semantic incompleteness and logical fragmentation of text chunks. It offers a systematic approach to generating high-quality, structured document memories.
Researchers in SLMs: Facing the limitations of SLMs in complex cognitive tasks, we demonstrate, through the reverse construction strategy of the Chain reasoning of Memory extraction (CoM), how to efficiently transfer the deep reasoning capabilities of LLMs to SLMs, opening up new pathways for building lightweight, high-performance intelligent systems.
Scholars in the Interdisciplinary Field of Cognitive Science and AI: The core of this study lies in simulating the cognitive processes of human experts by transforming unstructured text into hierarchical memories. This provides robust support for exploring human-like cognition, knowledge construction, and reasoning mechanisms in machines.
Proposing Active Memory Extraction: We advocate transforming text processing in RAG from passive text chunking to active memory extraction. By simulating domain experts, we first achieve a holistic and macroscopic understanding of documents and then construct structured document memories.
Defining Structured Document Memories: We formally define document memories as a triplet composed of a macroscopic logical outline, highly condensed core content, and semantically coherent atomic chunks.
Constructing the MoM Framework and CoM: We design the MoM framework, which generates high-quality memories through a multi-path sampling and multi-dimensional evaluation mechanism. Furthermore, we employ a reverse reasoning strategy to construct the CoM, thereby endowing SLMs with complex cognitive capabilities.
Designing a Three-Layer Retrieval Mechanism and Providing Theoretical Proof: We develop a three-layer document memory retrieval mechanism encompassing logical outlines, core content, and original text. From a probabilistic modeling perspective, we theoretically demonstrate that this strategy can more effectively reduce information loss and achieve more precise knowledge localization compared to fusing information before retrieval.
- Install dependency packages
pip install -r requirements.txt
- Start the milvus-lite service (vector database)
milvus-server --data /Storage/path/of/the/database
- Download models to corresponding directories.
- Modify various configurations according to your need.
- Run
chunk_*.py
andmom_*.py
to accomplish the text chunking task for domain documents.
CUDA_VISIBLE_DEVICES=0 nohup python chunk_gpt.py >> multifiled/qwen3_14B_set.log 2>&1 &
- Subsequently, execute
quick_start.py
andretrieval.py
to carry out the retrieval and question-answering processes.
CUDA_VISIBLE_DEVICES=1 nohup python quick_start.py
--docs_path 'crud_qwen3_14B_set.json'
--collection_name 'crud_qwen3_14B_set'
--retrieve_top_k 8
--task 'quest_answer'
--construct_index
>> log/mom_crud_qwen3_14B_set.log 2>&1 &
CUDA_VISIBLE_DEVICES=2 nohup python retrieval.py
--data_path 'evaldata/multifieldqa_zh.json'
--save_file 'eval/mom_multifieldqa_zh_qwen3_14B_set.json'
--docs_path 'multifieldqa_zh_qwen3_14B_set.json'
--collection_name 'multifieldqa_zh_qwen3_14B_set'
--retrieve_top_k 8
--construct_index
>> log/mom_multifieldqa_zh_huagong_qwen3_14B_set.log 2>&1 &
- Open and run
chunk.ipynb
, which will conduct a comprehensive quality assessment of the results generated by different chunking strategies.
We conduct extensive experiments on three QA datasets across different domains, including news, finance and so on.
Performance Across Domains: Our MemReader demonstrates outstanding performance in handling pure text QA tasks.
Effectiveness of Evaluation Metrics: The memory evaluation metrics we proposed are proven to effectively assess the quality of memory chunks, providing a reliable basis for the automatic screening of high-quality document memories.
Information Supportiveness of Retrieved Content: The results indicate that the memories extracted and organized by MoM can provide more comprehensive information for downstream tasks.