In the paper, Figure 3 describes the RL pipeline and rollout for MEM1:
“Figure 3: (Top): the RL pipeline used to train MEM1. (Bottom left): The evolution of context in MEM1–old <IS>, <query>, <info> are cleared as new states enter the context. The mechanism is used in the rollout.”
However, in the current implementation:
internal_state only occurs in Mem1/inference/data_pipelines.py, within Mem1Pipeline.run_llm_loop, for inference only: https://github.com/MIT-MI/MEM1/blob/main/Mem1/inference/data_pipelines.py#L109-L143
internal_state is not referenced in the rollout (Mem1/train/rollout/llm_agent/generation_think.py): https://github.com/MIT-MI/MEM1/blob/main/Mem1/train/rollout/llm_agent/generation_think.py#L264-L290
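
For reference, here is a minimal sketch of the behavior I would expect in the rollout based on the Figure 3 caption: at each turn, the old <IS>, <query>, and <info> are dropped and only the newest ones stay in the context. This is not code from the repository. The function names (build_context, rollout_contexts) and the exact tag layout are hypothetical and only illustrate my reading of the caption.

```python
from typing import List, Tuple


def build_context(prompt: str, internal_state: str, query: str, info: str) -> str:
    """Build the context for the next turn from the task prompt and the
    newest state only (hypothetical tag layout taken from the caption)."""
    return (
        f"{prompt}\n"
        f"<IS>{internal_state}</IS>\n"
        f"<query>{query}</query>\n"
        f"<info>{info}</info>"
    )


def rollout_contexts(prompt: str, turns: List[Tuple[str, str, str]]) -> List[str]:
    """Return the context seen after each turn.

    Each turn is (internal_state, query, info). The context is rebuilt
    from scratch every turn, so earlier states are cleared rather than
    appended.
    """
    return [build_context(prompt, is_, q, info) for is_, q, info in turns]


if __name__ == "__main__":
    turns = [
        ("need the capital of France", "capital of France", "Paris"),
        ("capital is Paris; need its population", "population of Paris", "about 2.1 million"),
    ]
    for ctx in rollout_contexts("Answer the question.", turns):
        print(ctx)
        print("---")
```

With this behavior, the context after the second turn contains the second internal state but nothing from the first turn's query or retrieved info, which is what I could not find in generation_think.py.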