GitHub - mlbio-epfl/LaMer: [ICLR 2026] Meta-RL Induces Exploration in Language Agents

Meta-RL Induces Exploration in Language Agents

Yulun Jiang*, Liangze Jiang*, Damien Teney, Michael Moor**, Maria Brbić**

Project page | Paper | BibTeX

This repo contains the source code of 🌊LaMer, a Meta-RL framework for training LLM agents to actively explore and adapt to their environment at test time (ICLR '26).

Training

To train the LLM Agent with LaMer:

bash examples/minesweeper/lamer_minesweeper_qwen3_4b.sh

To train the LLM Agent with an RL baseline such as GiGPO:

bash examples/minesweeper/gigpo_minesweeper_qwen3_4b.sh

See the examples folder for more examples.
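As a convenience when trying several launch scripts, the commands above can be wrapped in two small shell helpers. This is only a sketch: `list_examples`, `run_example`, and the `logs/` directory are hypothetical names that are not part of this repository, and the helpers assume they are called from the repository root.

```shell
# Hypothetical helpers, not part of the LaMer repository.

# Print every launch script under a directory (default: examples/), sorted.
list_examples() {
    find "${1:-examples}" -type f -name '*.sh' | sort
}

# Run one launch script and mirror its console output into logs/<name>.log.
# The logs/ location is an arbitrary choice made for this sketch.
run_example() {
    if [ ! -f "$1" ]; then
        echo "no such script: $1" >&2
        return 1
    fi
    mkdir -p logs
    bash "$1" 2>&1 | tee "logs/$(basename "$1" .sh).log"
}
```

For instance, `run_example examples/minesweeper/lamer_minesweeper_qwen3_4b.sh` would start the same training run as the command above while keeping a copy of its output in `logs/lamer_minesweeper_qwen3_4b.log`.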

Environment

Please follow this note to install and test the agent environments.

Acknowledgements

This work builds on verl, verl-agent, reflexion, and RAGEN. We thank the authors and contributors of these projects for sharing their valuable work.

Citing

If you find our code useful, please consider citing:

@inproceedings{jiang2026metarl,
    title={Meta-RL Induces Exploration in Language Agents},
    author={Yulun Jiang and Liangze Jiang and Damien Teney and Michael Moor and Maria Brbic},
    booktitle={International Conference on Learning Representations},
    year={2026}
}

