| title | Data Clean Env | |
|---|---|---|
| emoji | 🧹 | |
| colorFrom | blue | |
| colorTo | green | |
| sdk | docker | |
| pinned | false | |
| app_port | 8000 | |
| base_path | /web | |
| tags |
|
Data Clean Env is an OpenEnv implementation for training and evaluating Reinforcement Learning (RL) agents on realistic data-cleaning tasks.
Unlike "toy" environments, it simulates the workflow of a data engineer: identifying schema inconsistencies, handling missing values, casting types, and pruning noise from real-world datasets with pandas.
The agent interacts with the environment through atomic, high-level data operations defined in models.py:
| Action | Parameters | Description |
|---|---|---|
| `fill_na` | `column_name`, `value` | Replaces missing values with a specific constant. |
| `drop_na` | `column_name` | Removes rows containing missing data in the target column. |
| `drop_column` | `column_name` | Deletes irrelevant or noisy features from the dataset. |
| `rename_column` | `column_name`, `value` | Fixes naming inconsistencies to match target schemas. |
| `change_type` | `column_name`, `value` | Casts columns to `int`, `float`, or `str` for downstream compatibility. |
| `submit` | - | Finalizes the cleaning process and triggers the programmatic grader. |
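
As a rough illustration of how the actions in the table could map onto pandas calls, here is a minimal sketch. The `apply_action` helper and the `CASTS` lookup are hypothetical names used only for this example; the project's actual dispatch logic may differ.

```python
import pandas as pd

# Hypothetical mapping from the `value` of change_type to a Python type.
CASTS = {"int": int, "float": float, "str": str}


def apply_action(df, action, column_name=None, value=None):
    """Return a new DataFrame with one cleaning action applied (illustrative only)."""
    if action == "submit":
        return df  # nothing to change; grading would happen elsewhere
    if column_name not in df.columns:
        # Assumption: targeting a missing column counts as an invalid operation.
        raise KeyError(f"unknown column: {column_name!r}")
    if action == "fill_na":
        out = df.copy()
        out[column_name] = out[column_name].fillna(value)
        return out
    if action == "drop_na":
        return df.dropna(subset=[column_name])
    if action == "drop_column":
        return df.drop(columns=[column_name])
    if action == "rename_column":
        return df.rename(columns={column_name: value})
    if action == "change_type":
        return df.astype({column_name: CASTS[value]})
    raise ValueError(f"unknown action: {action!r}")


# Example: impute, cast, then drop a junk feature.
df = pd.DataFrame({"age": [30.0, None, 25.0], "junk": [1, 2, 3]})
df = apply_action(df, "fill_na", "age", 0)
df = apply_action(df, "change_type", "age", "int")
df = apply_action(df, "drop_column", "junk")
```

Returning a new DataFrame from each call keeps every action atomic, which makes it easy to compare the state before and after a step.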
The agent perceives the state of the data through a detailed schema:
- `df_schema`: Real-time dictionary of column data types.
- `missing_values`: Current counts of `NaN` values per column.
- `head`: A preview of the first 5 rows to identify formatting patterns.
- `feedback`: Semantic descriptions of the impact of the last action.
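
The observation fields above could be derived from a DataFrame roughly as follows. This is a sketch under assumptions: `build_observation` is a hypothetical helper, and the real environment returns a typed model rather than a plain dictionary.

```python
import pandas as pd


def build_observation(df, feedback=""):
    """Assemble the four observation fields from the current DataFrame (illustrative)."""
    return {
        # Column name -> dtype name, e.g. {"age": "float64"}
        "df_schema": {col: str(dtype) for col, dtype in df.dtypes.items()},
        # Column name -> number of missing entries
        "missing_values": {col: int(n) for col, n in df.isna().sum().items()},
        # First 5 rows as a list of row dictionaries
        "head": df.head(5).to_dict(orient="records"),
        # Free-text description of the last action's effect
        "feedback": feedback,
    }


obs = build_observation(
    pd.DataFrame({"age": [30.0, None], "name": ["a", "b"]}),
    feedback="environment reset",
)
```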
Each task is evaluated by a deterministic programmatic grader that compares the agent's output against a "Gold Standard" target, producing a score strictly between 0.0 and 1.0.
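
One way such a grader might work is sketched below. The specific checks (column presence, dtype match, no missing values, no extra columns) and the `grade` function name are assumptions made for illustration, not the project's actual scoring criteria. Only the clamping to the open interval follows from the description above.

```python
import pandas as pd


def grade(cleaned, target):
    """Fraction of checks passed, clamped so the score stays strictly inside (0, 1)."""
    checks = []
    for col in target.columns:
        present = col in cleaned.columns
        checks.append(present)  # target column exists
        checks.append(present and bool(cleaned[col].dtype == target[col].dtype))  # dtype matches
        checks.append(present and not bool(cleaned[col].isna().any()))  # no missing values
    # One extra check: the output has no columns beyond the target schema.
    checks.append(set(cleaned.columns) <= set(target.columns))
    raw = sum(checks) / len(checks)
    return min(0.99, max(0.01, raw))


target = pd.DataFrame({"age": [30, 0, 25]})
dirty = pd.DataFrame({"age": [30.0, None, 25.0], "junk": [1, 2, 3]})
perfect_score = grade(target.copy(), target)
dirty_score = grade(dirty, target)
```

Because the function depends only on the two DataFrames, repeated calls with the same inputs always return the same score.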
- 🟢 Easy (`easy_clean`):
  - Goal: Basic imputation.
  - Challenge: Fill missing 'age' values.
- 🟡 Medium (`medium_clean`):
  - Goal: Noise reduction.
  - Challenge: Handle missing values across multiple columns and remove "junk" features.
- 🔴 Hard (`hard_clean`):
  - Goal: Full schema alignment.
  - Challenge: Rename columns, perform safe type casting on dirty strings, and handle complex missing value fallbacks.
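
To make "safe type casting on dirty strings" concrete, here is a small pandas sketch. The `safe_cast_numeric` helper, the sample values, and the choice of `0.0` as fallback are hypothetical; the hard task's real data and fallback rules are not shown here.

```python
import pandas as pd


def safe_cast_numeric(series, fallback=0.0):
    """Coerce dirty strings to floats; unparseable or missing entries become `fallback`."""
    # Strip whitespace and thousands separators from string entries only.
    tidy = series.map(lambda v: v.strip().replace(",", "") if isinstance(v, str) else v)
    # errors="coerce" turns anything that still fails to parse into NaN instead of raising.
    return pd.to_numeric(tidy, errors="coerce").fillna(fallback).astype(float)


prices = pd.Series(["10", " 1,200 ", "n/a", None], dtype=object)
clean_prices = safe_cast_numeric(prices)
```

A plain `astype(float)` would raise on the `"n/a"` entry, which is why the coercing conversion is used first.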
Build the production image:

    docker build -t openenv_data_clean:latest -f server/Dockerfile .

Start the environment server:

    docker run -p 8000:8000 openenv_data_clean:latest

We provide a deterministic, zero-temperature baseline script using the OpenAI client:

    export HF_TOKEN="your_huggingface_token"
    export IMAGE_NAME="openenv_data_clean:latest"
    python inference.py

Our reward function is designed for efficient RL convergence:
- Incremental Progress: `+0.1` for every valid schema improvement.
- Penalization: `-0.05` for invalid operations (e.g., targeting non-existent columns).
- Completion Bonus: A final reward scaling with the total grader score `[0.01 - 0.99]`.
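
The reward scheme above can be sketched in a few lines. The function names are hypothetical, and returning `0.0` for a valid action that does not improve the schema is an assumption, since that case is not specified.

```python
STEP_BONUS = 0.1        # valid schema improvement
INVALID_PENALTY = -0.05  # invalid operation, e.g. a non-existent column


def step_reward(valid, improved):
    """Per-step reward for a single non-submit action."""
    if not valid:
        return INVALID_PENALTY
    # Assumption: a valid action with no measurable improvement earns nothing.
    return STEP_BONUS if improved else 0.0


def final_reward(grader_score):
    """Completion bonus on submit, kept within [0.01, 0.99]."""
    return min(0.99, max(0.01, grader_score))


# Example episode: one improvement, one invalid action, another improvement, then submit.
steps = [(True, True), (False, False), (True, True)]
episode_return = sum(step_reward(v, i) for v, i in steps) + final_reward(0.8)
```

In this example the episode return comes to roughly 0.95, most of which is the completion bonus, so the agent is still rewarded mainly for the final quality of the cleaned data.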
- ✅ Typed Models: Fully Pydantic-powered `Observation` and `Action`.
- ✅ API Standard: Implements `step()`, `reset()`, and `state()`.
- ✅ Strict Logs: Emits `[START]`, `[STEP]`, and `[END]` traces exactly as required.
- ✅ Robustness: Handles network timeouts and invalid JSON carefully.
Built with ❤️ for the Meta & Hugging Face OpenEnv Hackathon.