Open-Deep-ML
diff --git a/.github/workflows/format_questions.yml b/.github/workflows/format_questions.yml
Lines changed: 31 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 77 additions & 101 deletions
@@ -0,0 +1,31 @@
+name: Build & Validate Question Bundles
+
+on:
+  push:
+    branches: [main]
+    paths: ['questions/**', 'utils/**', 'schemas/**']
+  pull_request:
+    paths: ['questions/**', 'utils/**', 'schemas/**']
+
+jobs:
+  format-validate:
+    runs-on: ubuntu-latest
+
+    steps:
+    - uses: actions/checkout@v4
+
+    - name: Set up Python
+      uses: actions/setup-python@v5
+      with:
+        python-version: '3.12'
+
+    - name: Install deps
+      run: pip install jsonschema
+
+    # ---------- build ----------
+    - name: Build bundles into build/
+      run: python utils/build_bundle.py
+
+    # ---------- validate ----------
+    - name: Validate bundles
+      run: python utils/validate_questions.py build/*.json
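The build step above turns each question folder into a JSON bundle under `build/`. The diff does not include `utils/build_bundle.py`, so the following is only a minimal sketch of what such a bundler might do. The bundle key names (`learn_section`, `test_cases`, and the `tinygrad_`/`pytorch_` prefixes), the helper names, and the one-file-per-folder output are assumptions for illustration, not the repository's actual implementation.

```python
import json
from pathlib import Path

# Hypothetical file-name -> bundle-key mapping (not taken from the repo).
TEXT_FILES = {
    "description.md": "description",
    "learn.md": "learn_section",
    "starter_code.py": "starter_code",
    "solution.py": "solution",
}
JSON_FILES = {"example.json": "example", "tests.json": "test_cases"}
FRAMEWORKS = ("tinygrad", "pytorch")


def read_files(folder: Path, prefix: str = "") -> dict:
    """Collect the known files present in `folder`, keyed with an optional prefix."""
    out = {}
    for name, key in TEXT_FILES.items():
        path = folder / name
        if path.is_file():
            out[prefix + key] = path.read_text(encoding="utf-8")
    for name, key in JSON_FILES.items():
        path = folder / name
        if path.is_file():
            out[prefix + key] = json.loads(path.read_text(encoding="utf-8"))
    return out


def bundle_question(folder: Path) -> dict:
    """Merge meta.json, the top-level files, and any framework subfolders."""
    bundle = json.loads((folder / "meta.json").read_text(encoding="utf-8"))
    bundle.update(read_files(folder))
    for framework in FRAMEWORKS:
        sub = folder / framework
        if sub.is_dir():
            bundle.update(read_files(sub, prefix=f"{framework}_"))
    return bundle


def build_all(questions_dir: Path, out_dir: Path) -> list:
    """Write one <folder-name>.json per question; skip folders starting with '_'."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for folder in sorted(questions_dir.iterdir()):
        if not folder.is_dir() or folder.name.startswith("_"):
            continue
        target = out_dir / f"{folder.name}.json"
        target.write_text(
            json.dumps(bundle_question(folder), indent=2), encoding="utf-8"
        )
        written.append(target)
    return written
```

Calling `build_all(Path("questions"), Path("build"))` from the repository root would, under these assumptions, produce files that match the `build/*.json` glob passed to the validate step.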
@@ -1,133 +1,109 @@
-# DML-OpenProblem
+# Deep-ML Open Problem Bank
 
-DML-OpenProblem is an open-source repository of problems focused on linear algebra, machine learning, and deep learning. The problems are designed to be solved from scratch, providing a robust learning experience. This project powers the website [Deep-ML](https://www.deep-ml.com/).
+A community-maintained collection of machine learning coding challenges.  
+Each problem lives in its own folder (`questions/<id>_<slug>/`) so contributors can edit Markdown, Python, and JSON files naturally.  
+A build script assembles each folder into a JSON bundle under `build/`, which is used by [deep-ml.com](https://deep-ml.com).
 
-## Table of Contents
+---
 
-- [Installation](#installation)
-- [Usage](#usage)
-- [Project Structure](#project-structure)
-- [Contributing](#contributing)
-- [How to Add an Interactive Learn for a Problem](#how-to-add-an-interactive-learn-for-a-problem)
-- [How to Add C++ Questions](#how-to-add-c-questions)
-- [License](#license)
+## 📁 Repository Layout
 
-## Installation
-
-To get started with DML-OpenProblem, clone the repository and install the necessary dependencies.
-
-```sh
-git clone https://github.com/yourusername/DML-OpenProblem.git
-cd DML-OpenProblem
-pip install -r requirements.txt
 ```
-## Usage
-You can use the repository to create, edit, and solve problems related to linear algebra, machine learning, and deep learning. The problems are structured in directories, each containing relevant files such as learn.html for the learning section and solution.py for the solution code.
-
-### Running the Streamlit App
+.
+├─ questions/
+│  ├─ _template/                ← Copy this to start a new problem
+│  ├─ 101_grpo_objective/
+│  │   ├─ meta.json
+│  │   ├─ description.md
+│  │   ├─ learn.md
+│  │   ├─ starter_code.py
+│  │   ├─ solution.py
+│  │   ├─ example.json
+│  │   ├─ tests.json
+│  │   ├─ tinygrad/
+│  │   │   ├─ starter_code.py
+│  │   │   ├─ solution.py
+│  │   │   └─ tests.json
+│  │   └─ pytorch/
+│  │       ├─ starter_code.py
+│  │       ├─ solution.py
+│  │       └─ tests.json
+│  └─ ...
+│
+├─ schemas/
+│  └─ question.schema.json     ← JSON-Schema used for validation
+│
+├─ utils/
+│  ├─ build_bundle.py          ← folder → build/*.json bundler
+│  ├─ validate_questions.py    ← schema validator
+│  └─ make_question_template.py← template folder generator
+│
+└─ .github/workflows/
+   └─ format_questions.yml     ← GitHub Action: validate on PR/push
+```
 
-To launch the Streamlit application for editing and viewing problems, use the following command:
+---
 
-```sh
-streamlit run app.py
-```
+## 🛠️ Adding a New Question
 
-#### Features
-- Problem Editor: Edit the learn.html and solution.py files for each problem using a web-based code editor.
-- Preview Section: Preview the learning section with LaTeX rendering for mathematical expressions.
-- Save Changes: Save your edits to the corresponding files in the repository.
+1. **Copy the template**
 
-## Project Structure
-```sh
-DML-OpenProblem/
-│
-├── Problems/
-│   ├── 1_matrix_times_vector/
-│   │   ├── learn.md
-│   │   └── solution.py
-│   ├── 2_transpose_matrix/
-│   │   ├── learn.md
-│   │   └── solution.py
-│   └── ... (additional problem directories)
-│
-├── app.py
-├── requirements.txt
-└── README.md
+```bash
+cp -r questions/_template questions/123_my_problem
 ```
-- **Problems/**: Contains directories for each problem. Each directory includes:
-  - `learn.md`: markdown file containing the learning section with explanations and examples.
-  - `solution.py`: Python file containing the solution to the problem along with tests.
-- **requirements.txt**: Lists the dependencies required for the project.
-- **README.md**: This file.
-
-## Contributing
 
-We welcome contributions to improve DML-OpenProblem and [deep-ml.com](https://www.deep-ml.com). If you have a new problem to add or improvements to existing problems, please fork the repository and submit a pull request. All contributions will be displayed on [deep-ml.com](https://www.deep-ml.com). For example, check out this problem: [Divide Dataset Based on Feature Threshold](https://www.deep-ml.com/problem/Divide%20Dataset%20Based%20on%20Feature%20Threshold). A helpfull tool to work on the learn section and know what it would look like on the front end is [https://openproblem-r4vsjwuthdl9a3qzrd4p3m.streamlit.app/](https://dml-openproblem-a5bwuwjh2xeyt5ta5wdiw9.streamlit.app/). Also here is an example of a learn section writing in markdown [Example Problem](https://github.com/Open-Deep-ML/DML-OpenProblem/tree/main/example_problem)
+2. **Fill in the fields**
 
+- `meta.json`: question ID, title, category, difficulty, etc.
+- `description.md`: problem statement
+- `learn.md`: explanation and background
+- `starter_code.py`: the stub that solvers start from
+- `solution.py`: reference implementation
+- `example.json`: input/output + reasoning
+- `tests.json`: list of `{ "test": "...", "expected_output": "..." }`
+- Optional language support under `tinygrad/` and `pytorch/`
 
-### Steps to Contribute
+3. **Run local validation**
 
-1. Fork the repository.
-2. Create a new branch for your feature or bugfix.
-3. Make your changes and commit them with clear and concise messages.
-4. Push your changes to your fork.
-5. Submit a pull request with a detailed description of your changes.
+```bash
+python utils/build_bundle.py && python utils/validate_questions.py build/*.json
+```
 
-### Steps to add a Problem
-1. create an issue, the issue should describe the problem you would like to create and use the label "New Problem"
-2. comment below the issue you would like to work on
-3. We will assign the issue and let you know what number problem to make it
+4. **Open a Pull Request**
 
-### How to Add a Video Solution (Optional)
+CI will build and validate your changes automatically.
 
-1. **Create a Comprehensive Video Solution**:  
-   Your video should clearly explain the concept and provide a step-by-step solution to the problem. Feel free to include additional elements that enhance understanding, such as animations, hand-written examples, or any other visual aids that will help clarify the topic.
+---
 
-2. **Upload the Video to YouTube**:  
-   Once your video is ready, upload it to YouTube. Make sure the video is accessible and properly titled.
+## 🧪 Schema Validation
 
-3. **Include a Link to the Problem**:  
-   In the video description, add a link to the corresponding problem on Deep-ML so that viewers can easily access and try solving the problem themselves.
-   
-5. **Submit the Video Link**:  
-In the corresponding problem folder, create a `.txt` file containing the link to your YouTube video. This will help us easily reference your solution.
+The schema ensures:
 
-## How to Add an Interactive Learn for a Problem (Optional)
+- Required fields are present
+- Optional `tinygrad_*` and `pytorch_*` fields are allowed
+- No invalid or extra fields
 
-1. **Create a Problem Folder**: Navigate to the `Problems/interactive_learn/` directory and create a folder named `problem-N`, where `N` is the problem number assigned to you (e.g., `problem-17`).
+Each question must pass validation before it can be merged.
 
-2. **Add Learning Materials**: Inside the folder, create a `notebook.py` file. This file should include the learning content with explanations, examples, and any required resources for the problem. You could use https://marimo.app/?slug=aojjhb to ensure the file is compatible with `marimo` for HTML-WASM conversion. For example, you can check [problem-4's notebook.py](Problems/interactive_learn/problem-4/notebook.py)
+---
 
-3. **Submit Changes**: Commit the new folder with its contents to your branch and submit a pull request. Ensure your commit messages clearly indicate the addition of the interactive learn for the problem.
+## 🤖 GitHub Actions
 
-4. **Collaborate for Review**: Engage with reviewers for feedback on your pull request. Make any necessary adjustments as suggested.
-## How to Add C++ Questions
+The workflow in `.github/workflows/format_questions.yml` runs:
 
-We are adding C++ support to the problem set, and you can contribute C++ solutions following these guidelines.
+1. `build_bundle.py` – compiles all question folders
+2. `validate_questions.py` – checks for schema and structure errors
 
-### Steps to Add a C++ Solution
+CI fails if anything is invalid.
 
-1. **Select a problem** from the existing Python-based problems.
-2. **Create a C++ solution file** inside the corresponding problem folder, naming it `solution.cpp`.
-3. **Follow the C++ coding guidelines**:
-   - Use **C++17 or later**.
-   - Prefer **Eigen** for matrix operations (or xtensor-blas if necessary).
-   - Ensure **well-structured, readable, and modular code**.
-   - Keep solutions **self-contained** and avoid unnecessary external dependencies.
-   - Format numerical outputs to **4 decimal places** for consistency.
-4. **Test your solution** to ensure correctness.
-5. **Submit a pull request** with a detailed explanation of your solution.
+---
 
-### C++ Coding Rules
+## 📜 License
 
-- Use **Eigen** for matrix computations where applicable.
-- Avoid **excessive STL usage** unless necessary for clarity.
-- Prefer **pass-by-reference** over pass-by-value to improve performance.
-- Ensure **error handling** without crashing the program.
-- Keep solutions **deterministic** and **efficient**.
+All problems are for **educational use only**.  
+See the `LICENSE` file for full terms.
 
-If you have any questions about library choices or implementation details, feel free to start a discussion in the GitHub issues section.
+---
 
-## License
+## 🙋 Need Help?
 
-This project is for educational reasons only. See the [LICENSE](LICENSE) file for details.
+Open an issue or visit our Discord: https://discord.gg/JwMePfMZAV
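The three guarantees listed under Schema Validation (required fields present, optional `tinygrad_*`/`pytorch_*` fields allowed, no other extras) can be illustrated with a short check. The workflow installs `jsonschema` and the layout mentions `schemas/question.schema.json`, so the real validator presumably uses that schema. The stdlib-only sketch below is a stand-in that only mirrors the three rules; the field names in `REQUIRED` and the function name `check_bundle` are hypothetical.

```python
import re

# Hypothetical required fields; the authoritative list is whatever
# schemas/question.schema.json declares.
REQUIRED = {"id", "title", "difficulty", "category", "description", "test_cases"}

# Optional framework-specific fields such as "pytorch_solution".
OPTIONAL_PATTERN = re.compile(r"^(tinygrad|pytorch)_\w+$")


def check_bundle(bundle: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    present = set(bundle)
    # Rule 1: every required field must be present.
    for field in sorted(REQUIRED - present):
        errors.append(f"missing required field: {field}")
    # Rules 2 and 3: extras are allowed only with a framework prefix.
    for field in sorted(present - REQUIRED):
        if not OPTIONAL_PATTERN.match(field):
            errors.append(f"unexpected field: {field}")
    return errors
```

A bundle with all required keys plus `pytorch_solution` yields an empty list, while one that lacks `title` or carries an unprefixed extra key yields one message per problem. This only checks field names; value types and nested structure would still need the real schema.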