🦆 MALLARD

Local Data Warehouse · Data Refiner Edition

Drop your CSV, Excel, Parquet, or JSON. Let the duck do the rest.

MALLARD is a 100% local, zero-server data tool built on DuckDB + Streamlit. Upload messy datasets, auto-clean them, explore with interactive charts, run custom SQL, and export the refined result — all without touching the cloud.

🎬 Demo

✨ Features

Multi-format Ingestion — CSV, Excel (.xlsx/.xls), Parquet, JSON.
Zero-Pandas Memory Footprint — Uses DuckDB native SQL pushdowns to process larger-than-RAM datasets without Out-Of-Memory (OOM) crashes.
Auto Deep Clean — Deduplication, empty column removal, and statistical type healing (no hardcoded logic).
Smart Summary — Automated insights and missing-value detection processed instantly inside the DB engine.
Analytics Explorer — Uses smart SQL sampling and native aggregation before rendering charts to keep the browser lightning fast.
Custom SQL — Run any DuckDB query directly against your tables.
Direct-to-Disk Export — Ultra-fast downloads (CSV, Parquet, JSON) bypassing Python RAM using DuckDB's native COPY command.
100% Local — Data never leaves your machine, no cloud upload required.
Standalone .exe — Distributable to Windows users with no Python setup needed.

🚀 Quick Start

Option 1 — Run from source

# 1. Clone repo
git clone https://github.com/KMoex-HZ/mallard.git
cd mallard

# 2. Install dependencies
pip install -r requirements.txt

# 3. Run
streamlit run mallard.py

Option 2 — Windows .exe (no Python needed)

Download MALLARD.zip from Releases
Extract the zip
Double-click MALLARD.exe
Browser opens automatically at http://localhost:8501

📦 Requirements

streamlit==1.45.1
duckdb==1.2.2
pandas==2.2.3
plotly==6.0.1
openpyxl==3.1.5
pyarrow==19.0.1
xlrd==2.0.1
psutil==6.1.1

Python 3.10+ recommended.

🛠️ Build .exe from source

pip install pyinstaller pyinstaller-hooks-contrib
python setup_exe.py

Output: dist/MALLARD/MALLARD.exe
Distribute the entire dist/MALLARD/ folder as a zip.

🧹 How Deep Clean works

Step	What it does
Deduplication	Removes exact duplicate rows via native SQL `DISTINCT`
Empty column removal	Drops columns with zero non-null values instantly via metadata count
Smart Numeric Healing	Uses statistical execution (`TRY_CAST`) to detect and convert dirty VARCHAR numbers into DOUBLE, bypassing hardcoded column names
Wide-format detection	Auto-melts date-header pivot tables into long format

📁 Project Structure

mallard/
├── mallard.py          # Main Streamlit app
├── launcher.py         # Entry point for .exe build
├── setup_exe.py        # PyInstaller build script
├── requirements.txt
└── data/               # Auto-created, drop files here for auto-ingestion

🦆 Why MALLARD?

Most data tools are either too heavy (full BI platforms) or too barebones (just a CSV viewer). MALLARD lives in the middle — a local warehouse that's fast, opinionated, and actually fixes your data before you analyze it.

By leveraging DuckDB's in-process OLAP engine, it allows you to process datasets that would normally crash a standard pandas/Streamlit app.

No Tableau license. No cloud upload. Just a duck and DuckDB.

Author

Khairunnisa Maharani

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
assets		assets
.gitignore		.gitignore
LICENSE		LICENSE
README.MD		README.MD
launcher.py		launcher.py
mallard.py		mallard.py
mallard_icon.ico		mallard_icon.ico
requirements.txt		requirements.txt
setup_exe.py		setup_exe.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🦆 MALLARD

Local Data Warehouse · Data Refiner Edition

🎬 Demo

✨ Features

🚀 Quick Start

Option 1 — Run from source

Option 2 — Windows .exe (no Python needed)

📦 Requirements

🛠️ Build .exe from source

🧹 How Deep Clean works

📁 Project Structure

🦆 Why MALLARD?

Author

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

🦆 MALLARD

Local Data Warehouse · Data Refiner Edition

🎬 Demo

✨ Features

🚀 Quick Start

Option 1 — Run from source

Option 2 — Windows .exe (no Python needed)

📦 Requirements

🛠️ Build .exe from source

🧹 How Deep Clean works

📁 Project Structure

🦆 Why MALLARD?

Author

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages