mlbio-epfl/LaMer


Meta-RL Induces Exploration in Language Agents

Yulun Jiang*, Liangze Jiang*, Damien Teney, Michael Moor**, Maria Brbić**

Project page | Paper | BibTeX


This repo contains the source code of 🌊LaMer, a Meta-RL framework for training LLM agents to actively explore and adapt to their environment at test time (ICLR '26).



Training

To train the LLM Agent with LaMer:

bash examples/minesweeper/lamer_minesweeper_qwen3_4b.sh

To train the LLM Agent with RL baselines:

bash examples/minesweeper/gigpo_minesweeper_qwen3_4b.sh

See the examples folder for more examples.
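
When comparing LaMer against a baseline, it can help to keep each run's console output. The sketch below is a hypothetical convenience wrapper, not part of this repository: the function name `run_example` and the default `logs` directory are assumptions made for illustration. It runs one example script, copies its output to a log file, and returns the script's own exit status.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the LaMer repo): run one example script
# and keep a copy of its console output in a log file.
run_example() {
    local script="$1"
    local log_dir="${2:-logs}"   # assumed log location, not defined by the repo

    if [ ! -f "$script" ]; then
        echo "script not found: $script" >&2
        return 1
    fi

    mkdir -p "$log_dir"
    local log_file
    log_file="$log_dir/$(basename "$script" .sh).log"

    # tee prints the output live and also writes it to the log file
    bash "$script" 2>&1 | tee "$log_file"
    # return the exit status of the script itself, not of tee
    return "${PIPESTATUS[0]}"
}

# Example usage, run from the repository root:
# run_example examples/minesweeper/lamer_minesweeper_qwen3_4b.sh
# run_example examples/minesweeper/gigpo_minesweeper_qwen3_4b.sh
```

With the default arguments, each run's output would land in `logs/<script name>.log`, so a LaMer run and a baseline run can be inspected side by side.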



Environment

Please follow this note to install and test the agent environments.

Acknowledgements

This work builds on verl, verl-agent, reflexion, and RAGEN. We thank the authors and contributors of these projects for sharing their valuable work.

Citing

If you find our code useful, please consider citing:

@inproceedings{jiang2026metarl,
    title={Meta-RL Induces Exploration in Language Agents},
    author={Yulun Jiang and Liangze Jiang and Damien Teney and Michael Moor and Maria Brbic},
    booktitle={International Conference on Learning Representations},
    year={2026}
}