save and load from factorio by hrshtt · Pull Request #282 · JackHopkins/factorio-learning-environment

hrshtt · 2025-07-27T07:57:54Z

Changes to make cluster/local work with saves

1:1 save directories are created, one for each instance n in range(1,n), at the repo root under .fle/saves/{n-1}/

fle/cluster/local/run-envs.sh

hrshtt · 2025-07-27T08:05:21Z

The only issue right now is that the server start commands for loading a save and for loading a scenario:

START_COMMAND="--start-server-load-latest"
START_COMMAND="--start-server-load-scenario ${SCENARIO}"

can’t be used together for the headless server. When we load a scenario first and create a save using rcon send_command('/save {name}'), the scenario gets embedded into the save automatically.

So we need to do a dry run with the scenario first, generate a save, and then switch to -l for subsequent restarts. This flow makes sense to me, though I’d prefer something cleaner, as I’m unsure how much implicit behavior we should assume without risking weird assumptions.
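A minimal sketch of that flow, assuming each instance has its own saves directory and that saves are stored there as .zip files. The function name `choose_start_command` is hypothetical and not part of this PR; only the two start flags come from the script above.

```python
from pathlib import Path


def choose_start_command(saves_dir: Path, scenario: str) -> str:
    """Pick the headless-server start flag for one instance.

    Assumption: a save produced by the scenario dry run lives in
    saves_dir as a .zip file. If one exists, load the latest save;
    otherwise fall back to the scenario for the first (dry) run.
    """
    has_save = saves_dir.is_dir() and any(saves_dir.glob("*.zip"))
    if has_save:
        return "--start-server-load-latest"
    return f"--start-server-load-scenario {scenario}"
```

Making the choice explicit in one place keeps the "scenario first, save afterwards" behavior visible instead of implicit.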

hrshtt · 2025-07-29T09:56:36Z

@JackHopkins

I've refactored FactorioInstance.

my reasoning for the changes:

FactorioInstance clearly owns the namespace and the game state; that, to me, is the clear definition of its purpose.
Moved out the implementation details of rcon via FactorioServer and the script loading via LuaScriptManager.
Created FactorioServer to own the server runtime and its restarts.
A clear distinction of responsibility lets me make a strong implementation for server restart.

All the changes other than the following are syntactic sugar, moving code around to places that seem to own the functionality better. Changes that might affect functionality:

removing the screenshots logic (imo it's deprecated; I think it expects the existence of a client, but I could be wrong)
creating a transaction context manager (unlikely to be problematic; a context manager is safer than manually starting and ending a transaction)
creating a single class implementation for pre and post hooks via ToolHookRegistry
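To illustrate why a context manager is the safer shape, here is a minimal sketch. It assumes a transaction simply batches commands and flushes them when the block exits cleanly; `FakeRconClient`, `Transaction`, and `transaction` are hypothetical names for illustration, not the classes in this PR.

```python
from contextlib import contextmanager
from typing import Iterator, List


class FakeRconClient:
    """Hypothetical stand-in for an rcon client; records instead of sending."""

    def __init__(self) -> None:
        self.sent: List[str] = []

    def send_command(self, command: str) -> None:
        self.sent.append(command)


class Transaction:
    """Collects commands so they can be flushed together."""

    def __init__(self) -> None:
        self.commands: List[str] = []

    def add(self, command: str) -> None:
        self.commands.append(command)


@contextmanager
def transaction(client: FakeRconClient) -> Iterator[Transaction]:
    tx = Transaction()
    yield tx
    # Reached only if the with-body did not raise:
    # flush the queued commands in order.
    for command in tx.commands:
        client.send_command(command)
```

With explicit start/end calls, an exception between the two can leave a half-built transaction behind; here an exception in the `with` body means nothing is sent.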

hrshtt · 2025-08-08T10:18:07Z

Right now, FLE couples agent sessions (factorio instance) to servers by loosely getting IPs and TCP ports and mapping them 1:1 onto the sessions. This is an ordered mapping, so it's only loose in concept (while running experiments/episodes, the agent always receives the same server before and after resets).

My Factorio-native save & load implementation needs a save directory to bind 1:1 to a server. The save directory binds to the server and couples it more strictly with the agent's currently running session. If we want to use Factorio-native saves and loads, we need to:

1.) spawn servers from saves.
For server spawning, we can convert scenarios to saves as a preliminary setup step for fresh runs. This lets us make save-based spawning the default (rather than the current scenario-based spawning), instead of having two ways of spawning servers (save and scenario).

2.) attach save volumes to access different saves
This is a strict coupling between state/native save and a server, i.e. .fle/saves/0 <-> factorio_0 <-> FLE:factorio instance. This means the instance.py reset logic would also have to own the lifecycle of a saves directory, as it already owns the rest of the reset logic.*

3.) couple server lifecycle with factorio instance
*Letting the instance reset saves creates a weird dependency between servers (factorio_0) and agent sessions (factorio instance):

agent session -(expects)-> alive servers
alive servers -(expects)-> saves to exist
agent session -(mutates/resets)-> saves for the server
saves -(dictates)-> server's restart entrypoint

We can mitigate it by letting the instance own the server spawning logic and strictly coupling it with the server:

agent session -(spawns)-> server
agent session -(mutates/resets)-> saves
agent session -(restarts)-> server using new save entrypoint
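The three arrows above can be sketched as follows. This is an illustration under stated assumptions, not the implementation in this PR: `FakeServer` and `SessionSketch` are hypothetical names, the server is a stand-in that only records start/stop events, and saves are assumed to be .zip files under a per-index directory such as .fle/saves/0.

```python
import shutil
from dataclasses import dataclass, field
from pathlib import Path
from typing import List


@dataclass
class FakeServer:
    """Hypothetical stand-in for a server container such as factorio_0."""

    name: str
    saves_dir: Path
    running: bool = False
    events: List[str] = field(default_factory=list)

    def start(self) -> None:
        self.running = True
        self.events.append("start")

    def stop(self) -> None:
        self.running = False
        self.events.append("stop")


class SessionSketch:
    """Owns one server and its saves directory, bound 1:1 by index."""

    def __init__(self, index: int, saves_root: Path) -> None:
        saves_dir = saves_root / str(index)
        saves_dir.mkdir(parents=True, exist_ok=True)
        self.server = FakeServer(f"factorio_{index}", saves_dir)

    def spawn(self) -> None:
        # agent session -(spawns)-> server
        self.server.start()

    def reset(self, baseline_save: Path) -> None:
        # agent session -(mutates/resets)-> saves
        self.server.stop()
        for old in self.server.saves_dir.glob("*.zip"):
            old.unlink()
        shutil.copy(baseline_save, self.server.saves_dir / baseline_save.name)
        # agent session -(restarts)-> server using new save entrypoint
        self.server.start()
```

Because the session stops the server before touching the saves, the server never restarts against a saves directory that something else is mutating.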

hrshtt · 2025-08-14T19:41:47Z

change-logs:

1. decoupled FactorioInstance into:

FactorioInstance: keeps instance level logic
AgentInstance: keeps the bits that the agent receives

This lets me decouple agent-level resets from instance-level resets, and expose downstream classes like GameSession and AgentSession only to the bits they would touch.

Instances own the complete, pure FLE-coupled logic: the core bits of the FLE-designed API and its interfaces with Factorio.

2. Added Sessions: `GameSession` & `AgentSession`

Motivation:

The Factorio interface of FLE does not care about the infra & services; this session layer lets us couple db & docker sensibly with the game lifecycle.
The RL loop polls the game state, score & production flows from the namespace directly using its own implementation. This doesn't make sense: in fast mode, the values will drift considerably from what is expected unless the polling is timed correctly. A better approach is to take snapshots of the game state, production flows, score etc. around an eval call and only supply the RL loop with the snapshotted values.

Sessions couple infra to FLE and provide a clean contract for the RL loop to use expected values from FLE
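A minimal sketch of the snapshot idea, assuming the state can be read through a single callable. `Snapshot`, `EvalResult`, and `eval_with_snapshots` are hypothetical names for illustration and not the session API in this PR.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Snapshot:
    score: float
    production_flows: Dict[str, float]


@dataclass(frozen=True)
class EvalResult:
    output: str
    before: Snapshot
    after: Snapshot


def eval_with_snapshots(
    read_state: Callable[[], Snapshot],
    run: Callable[[], str],
) -> EvalResult:
    """Read the state immediately before and after one eval call.

    The caller (e.g. the RL loop) only sees these captured values,
    so later game ticks cannot change what it observes.
    """
    before = read_state()
    output = run()
    after = read_state()
    return EvalResult(output=output, before=before, after=after)
```

The RL loop would then compute rewards from `before` and `after` rather than polling the namespace again at some later tick.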

3. Python based `aiodocker` for cluster management

Needed for programmatic lifecycles of cluster in python with the save files.

Minor changes:

FactorioClient wrapper for rcon connection & transactions so that we can manage connection lifecycle outside instance.
- Transactions using context manager, low hanging fruit.
updated pytest.fixtures to provide test specific instances (regular instance, unresearched instance, we can add other test specific fixtures here as needed) rather than creating instances inside tests.
moved db, rcon and docker to fle/services rather than fle/commons, there is a chance we can remove commons entirely or keep it only for shared models
moved all FLE game related environment modules into fle/env/game
- e.g. it didn't make sense for GameState to be part of commons because it's completely interdependent with instance.py
removed cluster directory completely, the only it.
for the most part all sub-modules now only import from scripts in the same level or sub-modules at a lower level:
- eg. all fle/env/game scripts only import from fle/env/game or fle/env/services/rcon

hrshtt · 2025-08-15T09:00:23Z

We would need to move the script loading done over rcon into modular scripts embedded in the scenario's control.lua, because whenever the server restarts we would need to load all the scripts again over rcon, which is slow (5-10 seconds). This would add up with every restart for a game reset.

kantneel · 2025-08-15T18:54:25Z

I like the increased modularity and separation of concerns in the instance and session levels. I also like the additional structure with new classes like AbstractTrajectoryRunner, GameConfig, FactorioClient, ToolHookRegistry etc. These changes definitely improve readability and extensibility.

You mentioned that "there is a minor connection error when running evals, so that is still unstable." Can you clarify what the issue is, and do you think you know the fix for it? The unit tests should definitely pass before we merge this, but I also think we should have functional tests in the form of running eval trajectories to completion and getting expected performance levels from agents.

save and load from factorio

e27d913

hrshtt added 21 commits July 27, 2025 18:49

python port

e831c7f

added environment

9fc3bbb

consolidated image creation and scenario2map approaches

b951905

linting fixes

6f1adec

defaulting to saves

dcf30c0

cleaner

4258838

minor changes

52754fb

moved to aiodocker

3e3ffdd

more async

92b955d

even more async

11dd335

clean up

778b899

fixed rcon issue

44d9529

new location for factorio-server structure

05506fd

added hot reloading control.lua

285c626

FactorioInstance refactor

c8ce309

moved script loading

46d3902

moved script loading even more

46c2de1

tests passing

a9b6586

minor fixes all actions tests passed

7921aea

formatted, no change

124bbe2

api changes + verbose set_speed method

d088d11

hrshtt added 5 commits July 29, 2025 19:19

minor changes

99e5a21

dir/module name change

ed24b00

api change

f48bc22

minor changes

b284424

moved factorio_server out

743921a

hrshtt added 8 commits July 30, 2025 11:31

moved mods and tools to fle/env/factorio

fcfef79

minor update + remove script for now

79ae1cd

restructure

c938a7c

remove cluster directory

46a0c9e

client goes to game

8bc383e

import path changes

05a012e

added config + path changes

437116d

wow

f0799f4

hrshtt added 16 commits August 12, 2025 19:43

a lotta things

d9f9f7b

generic-ify

a9788d6

even more changes

4404fea

renaming factorio_server & namespace

8796a70

snapshot driven eval

35aa145

minor changes

88768bc

minor changes

706e843

shifting code around

fb050de

minor changes

929fb63

explicit semantics

226e6b7

make it run

a102069

rm

7607d7c

fixes to be able to run

c5e3eb5

redo fixtures

2d5000f

update tests

bd58368

almost there

0060f6f

hrshtt requested a review from kantneel August 15, 2025 08:49

kiankyars force-pushed the main branch from 5c0ce82 to dcce4e6 Compare November 26, 2025 18:10

hrshtt wants to merge 54 commits into main from factorio_native_save