Add RL/Gymnasium Integration

Python bindings + Gymnasium wrapper for training RL agents in Scrimmage.

## Why

The old OpenAI Gym integration was removed in Nov 2025 (commits 87ffbf190, e140f2a). Looking at what got deleted, I think the issue was embedding Python inside the C++ sim loop: the Global Interpreter Lock (GIL) had to be acquired for every entity on every timestep, the control flow was messy, etc.

## Proposed approach

Flip the control: Python drives, C++ executes.

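- Use `run_single_step()`, which already exists, to advance the sim one timestep per call from Python
- A simple RLAutonomy plugin that's basically just a mailbox for actions: Python writes the action, the plugin reads it on its next step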
- pybind11 bindings for SimControl/State/Contact
- Gymnasium wrapper on top

The bindings would be opt-in (`-DENABLE_RL_BINDINGS=ON`), keeping the default build clean.
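To make the inverted control flow concrete, here is a minimal pure-Python sketch of the loop. Everything in it is a stand-in: `StubSim` is a hypothetical placeholder for the bound SimControl (its `run_single_step()` takes no arguments here, which may not match the real signature), `ActionMailbox` mimics what the RLAutonomy plugin would do, and the negative-distance reward is just a placeholder. The env class copies the Gymnasium `reset`/`step` return shapes without importing Gymnasium, so it runs with the standard library only; a real wrapper would presumably subclass `gymnasium.Env` and declare observation and action spaces.

```python
import math


class ActionMailbox:
    """Stand-in for the proposed RLAutonomy plugin: stores the latest action."""

    def __init__(self):
        self.action = (0.0, 0.0)

    def post(self, action):
        # Python writes here; the sim reads it on its next step.
        self.action = tuple(action)


class StubSim:
    """Hypothetical placeholder for the bound SimControl. NOT the real API."""

    def __init__(self, mailbox, dt=0.1):
        self.mailbox = mailbox
        self.dt = dt
        self.reset()

    def reset(self):
        self.pos = [0.0, 0.0]
        self.steps = 0

    def run_single_step(self):
        # The sim never calls back into Python; it only consumes the mailbox.
        vx, vy = self.mailbox.action
        self.pos[0] += vx * self.dt
        self.pos[1] += vy * self.dt
        self.steps += 1


class WaypointEnvSketch:
    """Mirrors Gymnasium's reset/step return shapes (assumed task details)."""

    def __init__(self, goal=(1.0, 0.0), max_steps=200, tol=0.05):
        self.mailbox = ActionMailbox()
        self.sim = StubSim(self.mailbox)
        self.goal = goal
        self.max_steps = max_steps
        self.tol = tol

    def _obs(self):
        return (self.sim.pos[0], self.sim.pos[1], self.goal[0], self.goal[1])

    def _dist(self):
        return math.hypot(self.goal[0] - self.sim.pos[0],
                          self.goal[1] - self.sim.pos[1])

    def reset(self, seed=None, options=None):
        self.sim.reset()
        self.mailbox.post((0.0, 0.0))
        return self._obs(), {}

    def step(self, action):
        self.mailbox.post(action)      # 1. Python hands the action to the plugin
        self.sim.run_single_step()     # 2. the sim advances exactly one timestep
        dist = self._dist()            # 3. Python reads the state back
        terminated = dist < self.tol
        truncated = (not terminated) and self.sim.steps >= self.max_steps
        reward = -dist                 # placeholder reward, not a proposal
        return self._obs(), reward, terminated, truncated, {}
```

Calling `env.reset()` and then `env.step((1.0, 0.0))` in a loop moves the stub agent toward the goal one timestep per call. With this design the interpreter is only involved at step boundaries, so the C++ loop would not need to touch the GIL per entity.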

## Scope

Starting small:
- Single agent waypoint navigation
- Maybe 1v1 pursuit evasion
- SAC/TD3 (off-policy; not PPO, since we need sample efficiency)

Multi-agent is a stretch goal.

## What it'd look like

```python
import gymnasium as gym
import scrimmage_gym
from stable_baselines3 import SAC

env = gym.make("Scrimmage-Waypoint-v0", mission="missions/waypoint.xml")
model = SAC("MlpPolicy", env)
model.learn(100_000)
```


