@SiddharthaChakrabarty commented Oct 30, 2025

Summary
Adds a new set of MariaDB notebook magics ("MariaDB Magics+") that provide a complete in-kernel ML pipeline (data cleaning, preprocessing, feature engineering, training, and prediction), RAG-based document ingestion and search over embeddings, DB preview/apply/rollback flows, and command-level logging.

Our Solution

  1. New kernel magics implementing the ML pipeline:

    • Data cleaning: %missing, %dropmissing, %fillmissing, %outliers, %dropoutliers, %clipoutliers.
    • Preprocessing: %stats, %standardize, %encode, %normalize.
    • Feature engineering & split: %select_features, %splitdata.
    • Model training: %train_model, %select_model, %evaluate_model, %savemodel, %loadmodel, %predict.
    • Automated pipeline wrapper: %ml_pipeline (auto-select features, split, select best model based on metrics).
  2. RAG/document support:

    • %maria_ingest: ingest documents and store sentence-transformer embeddings (search defaults to k=8).
    • %maria_search, %maria_rag_query: retrieve similar chunks and call the LLM.
  3. DB preview / apply / rollback flow:

    • CLI-like magics to preview changes, apply them to the DB, and roll back using a rollback_token.
  4. In-DB logging:

    • Writes metadata/logs to a magic_metadata table. Users can inspect it with SELECT * FROM magic_metadata.
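
To make the preview/apply/rollback flow and the metadata logging more concrete, here is a minimal sketch of how such a flow could work. It is not the PR's implementation: Python's built-in sqlite3 stands in for MariaDB so the example runs anywhere, and the people and rollback_snapshots tables and the helper names (preview_fill, apply_fill, rollback, log_command) are hypothetical. Only the magic_metadata table name and the rollback_token idea come from the description above.

```python
import json
import sqlite3
import uuid

# Assumption: sqlite3 is a stand-in for MariaDB; all table contents are toy data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, age REAL)")
conn.executemany(
    "INSERT INTO people (id, age) VALUES (?, ?)",
    [(1, 30.0), (2, None), (3, 50.0)],
)
conn.execute("CREATE TABLE magic_metadata (command TEXT, details TEXT)")
# Hypothetical snapshot table holding the pre-change values per rollback token.
conn.execute(
    "CREATE TABLE rollback_snapshots (rollback_token TEXT, id INTEGER, age REAL)"
)
conn.commit()


def log_command(command, details):
    """Record one command and its details (as JSON) in magic_metadata."""
    conn.execute(
        "INSERT INTO magic_metadata (command, details) VALUES (?, ?)",
        (command, json.dumps(details)),
    )


def preview_fill(value):
    """Return (id, old_age, new_age) for rows that would change; writes nothing."""
    rows = conn.execute("SELECT id, age FROM people WHERE age IS NULL").fetchall()
    return [(row_id, old, value) for row_id, old in rows]


def apply_fill(value):
    """Fill missing ages, snapshot the old rows, and return a rollback token."""
    token = uuid.uuid4().hex
    conn.execute(
        "INSERT INTO rollback_snapshots "
        "SELECT ?, id, age FROM people WHERE age IS NULL",
        (token,),
    )
    conn.execute("UPDATE people SET age = ? WHERE age IS NULL", (value,))
    log_command("apply_fill", {"value": value, "rollback_token": token})
    conn.commit()
    return token


def rollback(token):
    """Restore the rows snapshotted under the given token."""
    conn.execute(
        "UPDATE people SET age = ("
        "  SELECT s.age FROM rollback_snapshots AS s"
        "  WHERE s.rollback_token = ? AND s.id = people.id"
        ") WHERE id IN ("
        "  SELECT id FROM rollback_snapshots WHERE rollback_token = ?"
        ")",
        (token, token),
    )
    conn.execute("DELETE FROM rollback_snapshots WHERE rollback_token = ?", (token,))
    log_command("rollback", {"rollback_token": token})
    conn.commit()


print(preview_fill(40.0))   # rows that would change, nothing written yet
token = apply_fill(40.0)    # write the change and keep the token
rollback(token)             # undo it again using the token
print(conn.execute("SELECT * FROM magic_metadata").fetchall())
```

The sketch snapshots only the affected rows, which keeps rollback cheap; whether the actual magics snapshot rows, whole tables, or rely on transactions is not stated in this description.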

For more details, please go through this link: MariaDB Magics+ PPT
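
As a rough illustration of the retrieval step behind the RAG magics, the sketch below ranks stored chunks by cosine similarity to a query embedding and keeps the top k (k=8 is the default mentioned above). This is an assumption about the general technique, not a description of the PR's code: the function names are hypothetical, and toy 3-dimensional vectors stand in for real sentence-transformer embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunks, k=8):
    """Return up to k (score, text) pairs, most similar first.

    `chunks` is a list of (text, embedding) pairs.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy embeddings (hypothetical values) in place of sentence-transformer output.
chunks = [
    ("MariaDB supports window functions.", [0.9, 0.1, 0.0]),
    ("Jupyter kernels speak a messaging protocol.", [0.0, 0.2, 0.9]),
    ("Indexes speed up SELECT queries.", [0.8, 0.3, 0.1]),
]

results = top_k_chunks([1.0, 0.0, 0.0], chunks, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

In a RAG query, the texts of the retrieved chunks would then be placed into the prompt sent to the LLM.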

1. Download Miniconda

        https://docs.conda.io/en/latest/miniconda.html

2. Install Miniconda

         sh ./Miniconda3-latest-Linux-x86_64.sh

3. Create a new conda environment

         conda create -n maria_env python=3.7

4. Activate the new environment

          conda activate maria_env

5. Install Jupyterlab

          conda install -c conda-forge jupyterlab

6. Clone the mariadb_kernel repository

          git clone https://github.com/MariaDB/mariadb_kernel.git

7. Install the requirements (from the cloned mariadb_kernel directory)

          python3 -m pip install -r requirements.txt

8. Install the kernel

          python3 -m pip install mariadb_kernel

9. Install the kernelspec so that the kernel becomes visible to JupyterLab

          python3 -m mariadb_kernel.install

10. Open Jupyter Notebook / Jupyter Lab

          jupyter lab    # or: jupyter notebook

11. Go to the DemoNotebooks directory

         cd DemoNotebooks

12. To learn about all the machine learning commands, try RawMLPipeline.ipynb

        Run the cells of the notebook

13. To learn about the Automated ML Pipeline commands, try AutomatedMLPipeline.ipynb

        Run the cells of the notebook

14. To learn about the RAG commands, try RAG.ipynb

        Run the cells of the notebook
