37 commits
f5e64e8
feat: reinforcement learning PR#2; several additions/improvements to …
gabriel-trigo Jun 11, 2025
625a807
Update pyproject.toml
s2t2 Jun 12, 2025
931a1cf
fix: fix linting errors of previous commit
gabriel-trigo Jun 12, 2025
e637727
Update PR Template
s2t2 Jun 23, 2025
e3494ce
Update PR Template
s2t2 Jun 23, 2025
9bcd483
Restore original formatting
s2t2 Jun 23, 2025
ee3aa54
Restore original formatting
s2t2 Jun 23, 2025
aed2490
Clean top of files
s2t2 Jun 24, 2025
365f5bb
Refactor filepaths
s2t2 Jun 24, 2025
2d7eb23
Refactor filepaths
s2t2 Jun 24, 2025
737ada8
Refactor and test temp conversion functions; closes #25
s2t2 Jun 24, 2025
fec51ad
Refactor temp conversion tests
s2t2 Jun 24, 2025
59115bc
Review eval script
s2t2 Jun 24, 2025
8ab99ea
Remove redundant variable setting
s2t2 Jun 24, 2025
1b5a354
Fix failing test
s2t2 Jun 24, 2025
667157c
Repro generate configs script; use absl flags because argparse not wo…
s2t2 Jun 26, 2025
649600d
Update gitignore
s2t2 Jun 26, 2025
5da4ccd
Test config file generation
s2t2 Jun 26, 2025
4b3f27e
Test read config file
s2t2 Jun 26, 2025
b9ae207
Fix file names - remove quote
s2t2 Jul 10, 2025
5d12c7c
Describe the config generation script
s2t2 Jul 10, 2025
b47ef32
Flags WIP
s2t2 Jul 11, 2025
94762ef
Attempt to reproduce starter buffer script; fix #115
s2t2 Jul 28, 2025
5173c3e
Test starter buffer population
s2t2 Jul 29, 2025
4edffd8
Refactor test: use setup, teardown, and temp dir
s2t2 Jul 29, 2025
43490b4
WIP - reproduce train script, run into known issue
s2t2 Aug 11, 2025
9426717
Hotfix known issue
s2t2 Aug 11, 2025
9238d38
Generate example starter buffers for training and testing
s2t2 Aug 12, 2025
f4fb406
WIP - refactor and test RL agent trainer
s2t2 Aug 12, 2025
58dceef
Regenerate starter buffer for testing
s2t2 Aug 13, 2025
c1d92a0
Decrease number of training steps when testing
s2t2 Aug 13, 2025
51992ff
WIP - reproducing eval script - encounter env config errors
s2t2 Aug 15, 2025
9027045
Reproduce eval script
s2t2 Aug 22, 2025
752dbf7
WIP - refactor eval script; need to save schedule policy results char…
s2t2 Aug 22, 2025
15b9a81
feat(rl): fix replay buffer integration, add seeding & tests, and har…
yuktakul04 Oct 14, 2025
dc09ccf
fix(replay): make dm-reverb optional; fallback to TFUniform on macOS/…
yuktakul04 Oct 14, 2025
097acf9
fix(replay): TFUniform fallback with batched observer; dm-reverb opti…
yuktakul04 Oct 14, 2025
30 changes: 26 additions & 4 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -1,6 +1,28 @@
Fixes #\<issue_number_goes_here>
## Description

> It's a good idea to open an issue first for discussion.
[Provide a one sentence summary of the changes implemented.]

- [ ] Tests pass
- [ ] Appropriate changes to documentation are included in the PR
[Link to related issues (e.g. "Closes #123", "Resolves #456").]

## Details

Details:

- [Provide additional details, as applicable.]

- [Provide additional details, as applicable.]

- [Provide additional details, as applicable.]

## Checklist

- [ ] I have read the [Contributor's Guide](https://google.github.io/sbsim/contributing/).
- [ ] I have signed the [Contributor License Agreement](https://cla.developers.google.com/) (first time contributors only).
- [ ] I have set up [pre-commit hooks](https://google.github.io/sbsim/contributing/#pre-commit-hooks) by running `pre-commit install` (one time only), and the pre-commit hooks pass.
- [ ] I have added appropriate [unit tests](https://google.github.io/sbsim/contributing/#testing), and the tests pass.
- [ ] I have added [docstrings](https://google.github.io/sbsim/contributing/#documentation) and updated the documentation as necessary, and I have previewed the [documentation site](https://google.github.io/sbsim/docs-site/) locally to make sure things look good.
- [ ] I have self-reviewed my code (especially important if using AI agents).

---

**Thank you for your contribution!**
29 changes: 21 additions & 8 deletions .gitignore
@@ -25,15 +25,28 @@ data/sb1.zip
data/sb1/

# results files:
*/**/output_data/
*/**/metrics/
**/videos/
**/train/
**/eval/
smart_control/learning/
#*/**/output_data/
#*/**/metrics/
#**/videos/
#**/train/
#**/eval/

smart_control/configs/resources/sb1/train_sim_configs/generated/
# todo: use temp dir instead:
smart_control/configs/resources/sb1/train_sim_configs/generation_test/

smart_control/simulator/videos
smart_control/refactor/data/
smart_control/refactor/experiment_results/

smart_control/reinforcement_learning/data/starter_buffers/*
!smart_control/reinforcement_learning/data/starter_buffers/.gitkeep
!smart_control/reinforcement_learning/data/starter_buffers/default
!smart_control/reinforcement_learning/data/starter_buffers/test

smart_control/reinforcement_learning/data/experiment_results/*
!smart_control/reinforcement_learning/data/experiment_results/.gitkeep

smart_control/reinforcement_learning/data/experiment_eval/*
!smart_control/reinforcement_learning/data/experiment_eval/.gitkeep

# jupyter notebook checkpoints:
smart_control/notebooks/.ipynb_checkpoints/
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -37,4 +37,4 @@ repos:
rev: 0.7.22
hooks:
- id: mdformat
exclude: ^docs/api/
exclude: ^docs/api/|^\.github/
4 changes: 4 additions & 0 deletions docs/api/reinforcement_learning/scripts.md
@@ -1,5 +1,9 @@
# Scripts

::: smart_control.reinforcement_learning.scripts.generate_gin_configs

::: smart_control.reinforcement_learning.scripts.populate_starter_buffer

::: smart_control.reinforcement_learning.scripts.train

::: smart_control.reinforcement_learning.scripts.eval
4 changes: 4 additions & 0 deletions docs/contributing.md
@@ -97,6 +97,10 @@ pytest --disable-pytest-warnings -k your_test_name_here
# ignore specific test files and directories:
pytest --ignore=path/to/your/test.py --ignore=path/to/other/

# display more logs:
pytest --disable-pytest-warnings -s --log-cli-level=INFO path/to/your/test.py
# display all logs:
pytest --disable-pytest-warnings -s --log-cli-level=DEBUG path/to/your/test.py
```

## Linting
125 changes: 125 additions & 0 deletions docs/guides/reinforcement_learning/scripts.md
@@ -0,0 +1,125 @@
# Reinforcement Learning Scripts

## Configuration Generation

By default, when you train an RL agent, it uses the configuration options
defined in the base gin config file (see
"smart_control/configs/resources/\<dataset_id>/sim_config.gin").

However, if you would like different configuration options, you can use the
configuration generation script to create alternative config files with slight
modifications to the base config.

Generate different configuration files to use during training:

```sh
python -m smart_control.reinforcement_learning.scripts.generate_gin_configs
```

By default, the script will use the following parameter grid:

- `time_steps`: `['300']`
- `num_days`: `['1', '7', '14', '30']`
- `start_timestamps`: `['2023-07-06']`

Optionally pass any of these command line flags to customize the parameter grid:

```sh
python -m smart_control.reinforcement_learning.scripts.generate_gin_configs \
--time_steps 300,600,900 \
--num_days 1,7,14 \
--start_timestamps 2023-07-06,2023-08-06,2023-10-06
```

This script will generate a different file for each combination of custom
parameter values you specify. The files will be written to the
"smart_control/configs/resources/\<dataset_id>/train_sim_configs/generated"
directory. Each file name will contain the parameter values you choose (e.g.
"step_300_days_1_start_20230706.gin").
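
The grid expansion and file-naming scheme can be sketched as follows (a
hypothetical illustration using `itertools.product`; the script's actual
internals may differ):

```python
import itertools

# Hypothetical sketch of how the default parameter grid expands into
# config file names; the real generate_gin_configs script may differ.
time_steps = ["300"]
num_days = ["1", "7", "14", "30"]
start_timestamps = ["2023-07-06"]

names = []
for step, days, start in itertools.product(time_steps, num_days, start_timestamps):
    # e.g. "step_300_days_1_start_20230706.gin"
    names.append(f"step_{step}_days_{days}_start_{start.replace('-', '')}.gin")

print(names)
```

With the default grid this yields one file per value of `num_days`, four in
total.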

## Starter Buffer Population

Populate a starter replay buffer with initial exploration data, to provide a
starting point when training RL agents:

```sh
python -m smart_control.reinforcement_learning.scripts.populate_starter_buffer
```

Optionally pass command line flags to customize the buffer name and config:

```sh
python -m smart_control.reinforcement_learning.scripts.populate_starter_buffer \
--buffer_name buffer_xyz \
--config_path smart_control/configs/resources/sb1/sim_config.gin
```

This creates a directory corresponding to the buffer name in
"smart_control/reinforcement_learning/data/starter_buffers".

A "default" starter buffer has been created for example purposes:

```sh
python -m smart_control.reinforcement_learning.scripts.populate_starter_buffer \
--buffer_name default \
--num_runs 5 \
--capacity 50000 \
--steps_per_run 100 \
--sequence_length 2
```

A "test" starter buffer has been created for testing purposes:

```sh
python -m smart_control.reinforcement_learning.scripts.populate_starter_buffer \
--buffer_name test \
--num_runs 1 \
--steps_per_run 1 \
--capacity 100 \
--sequence_length 2
```

## RL Agent Training

Train a reinforcement learning agent, choosing a unique name for the experiment:

```sh
python -m smart_control.reinforcement_learning.scripts.train \
--experiment_name="my-experiment-1"
```

Optionally pass additional flags to customize training:

```sh
python -m smart_control.reinforcement_learning.scripts.train \
--experiment_name="my-experiment-2" \
--starter_buffer_name="default" \
--agent_type="sac" \
--learner_iterations=3 \
--train_iterations=10 \
--collect_steps_per_training_iteration=5
```

This will generate a new experiment results directory under
"smart_control/reinforcement_learning/data/experiment_results/`experiment_name`".
The experiment results directory will contain the following files and
directories:

- "collect" directory
- "eval" directory
- "metrics" directory
- "replay_buffer" directory
- "experiment_parameters.json" file
- "experiment_parameters.txt" file
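
As a quick sanity check after a run, you can report which of the expected
artifacts exist (a hypothetical snippet assuming the layout above; adjust the
experiment name to match yours):

```sh
# Hypothetical check: report which expected artifacts exist for a run.
RESULTS="smart_control/reinforcement_learning/data/experiment_results/my-experiment-1"

for entry in collect eval metrics replay_buffer \
             experiment_parameters.json experiment_parameters.txt; do
  if [ -e "$RESULTS/$entry" ]; then
    echo "found: $entry"
  else
    echo "missing: $entry"
  fi
done
```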

## Evaluation

Evaluate a previously trained agent, specifying an experiment name that
references an existing experiment results directory:

```sh
python -m smart_control.reinforcement_learning.scripts.eval \
--eval_experiment_name my-experiment-1
```

```sh
python scripts/eval.py \
--policy-dir experiment_results/ddpg_train_run-july-6th_2025_04_07-12:50:40/policies/ \
--gin-config /home/gabriel-user/projects/sbsim/smart_control/configs/resources/sb1/generated_configs/config_timestepsec-900_numdaysinepisode-14_starttimestamp-2023-11-06.gin \
--experiment-name ddpg_train-summer_eval-winter
```
4 changes: 2 additions & 2 deletions docs/setup/linux.md
@@ -120,7 +120,7 @@ cd ../..

By default, simulation videos are stored in the "simulator/videos" directory
(which is ignored from version control). If you would like to customize this
location, use the `SIM_VIDEOS_DIRPATH` environment variable.
location, use the `SIM_VIDEOS_DIR` environment variable.

You can pass environment variable(s) at runtime, or create a local ".env" file
and set your desired value(s) there:
@@ -129,7 +129,7 @@ and set your desired value(s) there:
# this is the ".env" file...

# customizing the directory where simulation videos are stored:
SIM_VIDEOS_DIRPATH="/cns/oz-d/home/smart-buildings-control-team/smart-buildings/geometric_sim_videos/"
SIM_VIDEOS_DIR="/cns/oz-d/home/smart-buildings-control-team/smart-buildings/geometric_sim_videos/"
```

## Notebook Setup
4 changes: 2 additions & 2 deletions docs/setup/mac.md
@@ -121,7 +121,7 @@ cd ../..

By default, simulation videos are stored in the "simulator/videos" directory
(which is ignored from version control). If you would like to customize this
location, use the `SIM_VIDEOS_DIRPATH` environment variable.
location, use the `SIM_VIDEOS_DIR` environment variable.

You can pass environment variable(s) at runtime, or create a local ".env" file
and set your desired value(s) there:
@@ -130,7 +130,7 @@ and set your desired value(s) there:
# this is the ".env" file...

# customizing the directory where simulation videos are stored:
SIM_VIDEOS_DIRPATH="/cns/oz-d/home/smart-buildings-control-team/smart-buildings/geometric_sim_videos/"
SIM_VIDEOS_DIR="/cns/oz-d/home/smart-buildings-control-team/smart-buildings/geometric_sim_videos/"
```

## Notebook Setup