WILLOWVIBE

We provide simpler solutions

🌿 WillowVibe

Open-source data infrastructure tools. Built in India. Used everywhere.
We build lightweight, self-hosted tooling that gives small data teams enterprise-grade pipeline observability, auditing, and automation — without vendor lock-in or SaaS bills.

🔍 What We Build

WillowVibe is a data engineering & AI tooling studio — solo-founded, contributor-driven, OSS-first.

  • Pipeline Auditing — point-in-time health checks on Airflow + dbt + warehouses; one command, one report
  • Data Observability — continuous monitoring for pipeline health, data freshness, volume anomalies, and schema drift
  • FinOps for Data — tracking Snowflake credits and BigQuery bytes billed, turning cloud cost chaos into actionable visibility
  • AI-Augmented Pipelines — embedding AI at the right layer of the data stack without replacing what already works
  • Open-Source First — every internal tool we build, we ship as OSS so the community benefits

We operate a solo + contributor model — lean by design, moving fast, building things that solve real problems for data teams.


🚀 Projects

🔬 PipelineProbe

Instant Data Pipeline Audit Report for Airflow + dbt + modern warehouses

Run a single command, get a full HTML audit report. PipelineProbe is a read-only CLI audit tool for data engineers who want a fast, objective health check of their pipeline stack — before a migration, after an incident, or as a recurring CI gate.

    pip install pipelineprobe
    pipelineprobe init      # generates pipelineprobe.yml
    pipelineprobe audit     # produces pipelineprobe-report.html

  • Airflow checks — high failure-rate DAGs, missing retries, missing SLAs, stale pipelines
  • dbt checks — models with zero tests, failing test runs, orphaned models
  • Warehouse checks — oversized tables, missing audit timestamps (Postgres, BigQuery, Snowflake)
  • HTML + JSON report — traffic-light severity, health score 0–100, per-issue recommendations
  • CI-ready: fail_on_critical exit code gates for GitHub Actions / GitLab CI
  • Zero mutations — 100% read-only; safe to run against production
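To make the "models with zero tests" check concrete, here is a minimal sketch of how such a rule could be derived from a dbt `manifest.json` artifact. It is not PipelineProbe's actual implementation: the function name `models_without_tests` is hypothetical, and the sketch assumes only the standard manifest layout, where each entry under `nodes` carries a `resource_type` and test nodes list the models they cover under `depends_on.nodes`.

```python
from typing import Any


def models_without_tests(manifest: dict[str, Any]) -> list[str]:
    """Return unique_ids of dbt models that no test node depends on.

    Hypothetical helper for illustration; `manifest` is the parsed
    contents of dbt's target/manifest.json.
    """
    nodes = manifest.get("nodes", {})

    # Every node whose resource_type is "model" is a candidate.
    models = {
        uid for uid, node in nodes.items()
        if node.get("resource_type") == "model"
    }

    # Collect everything that at least one test node depends on.
    tested: set[str] = set()
    for node in nodes.values():
        if node.get("resource_type") == "test":
            tested.update(node.get("depends_on", {}).get("nodes", []))

    # Models never referenced by a test have zero test coverage.
    return sorted(models - tested)
```

In practice the input would come from something like `json.load(open("target/manifest.json"))`. Because the function only reads a parsed artifact, it is read-only in the same sense as the rest of the audit.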

Stack: Python · Typer · Pydantic · httpx · Jinja2 · psycopg2 · dbt artifacts

PipelineProbe Repo · MIT License


🔭 ObservaKit

Self-hosted Data Observability & FinOps Starter Kit for small data teams

ObservaKit gives 1–5 person data teams the 5 core observability pillars — Freshness, Volume, Quality, Schema Drift, and Pipeline Health — in a single docker-compose up. No Monte Carlo. No Metaplane. No SaaS bill.

  • ✅ Freshness Monitor — detects stale tables by tracking max(updated_at)
  • ✅ Volume Anomaly — Z-score detection against 7-day rolling averages
  • ✅ Quality Checks — Soda Core & Great Expectations templates, ready to use
  • ✅ Schema Drift Detector — snapshots information_schema, diffs on every run
  • ✅ Pipeline Health — Airflow/Prefect REST API + OpenTelemetry + Grafana
  • ✅ FinOps Tracker — Snowflake credits & BigQuery bytes billed, natively
  • ✅ Native dbt Integration — parses run_results.json directly, no extra packages
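The Volume Anomaly pillar above can be illustrated with a small sketch of Z-score detection against a 7-day window. This is an assumption-laden example, not ObservaKit's code: the function names, the use of the population standard deviation, and the threshold of 3.0 are all choices made here for illustration.

```python
from statistics import mean, pstdev


def volume_z_score(history: list[int], today: int) -> float:
    """Z-score of today's row count against a trailing window of daily counts.

    `history` is assumed to hold the previous 7 daily row counts.
    """
    mu = mean(history)
    sigma = pstdev(history)  # population std dev; a sample std dev would also work
    if sigma == 0:
        # Perfectly flat history: any deviation at all is infinitely surprising.
        return 0.0 if today == mu else float("inf")
    return (today - mu) / sigma


def is_volume_anomaly(history: list[int], today: int, threshold: float = 3.0) -> bool:
    """Flag the load when |z| exceeds the threshold (3.0 is an assumed default)."""
    return abs(volume_z_score(history, today)) > threshold
```

With a history hovering around 1,000 rows per day, a load of 400 rows would be flagged, while a load of 1,008 rows would not. A Z-score adapts to each table's normal variability, which a fixed percentage threshold cannot do.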

Stack: Python · FastAPI · SQLAlchemy · Alembic · Prometheus · Grafana · Docker Compose · dbt · Airflow / Prefect
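The Schema Drift Detector described in the list above works by snapshotting `information_schema` and diffing on each run. A minimal sketch of the diff step follows. It assumes each snapshot has already been reduced to a mapping of `column_name` to `data_type` for one table (both are standard `information_schema.columns` fields); the function name `diff_schema` and the output shape are hypothetical, not ObservaKit's API.

```python
def diff_schema(previous: dict[str, str], current: dict[str, str]) -> dict[str, list]:
    """Compare two column -> data_type snapshots of the same table.

    Returns added columns, removed columns, and columns whose type changed
    as (column, old_type, new_type) tuples.
    """
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    retyped = sorted(
        (col, previous[col], current[col])
        for col in previous.keys() & current.keys()
        if previous[col] != current[col]
    )
    return {"added": added, "removed": removed, "retyped": retyped}


def has_drift(diff: dict[str, list]) -> bool:
    """True when any category of change is non-empty."""
    return any(diff.values())
```

Storing each snapshot as a plain mapping keeps the comparison independent of the warehouse, so the same diff would apply to any source that exposes `information_schema`.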

ObservaKit Repo · MIT License


🗂️ All Repositories

| Repo | Description | Language | Status |
| --- | --- | --- | --- |
| 🔬 pipelineprobe | Instant pipeline audit CLI — Airflow + dbt + warehouse | Python | active |
| 🔭 ObservaKit | Self-hosted data observability & FinOps starter kit | Python | active |
| 🧰 toolscontainer | Multi-purpose Python utility scripts & automations | Python | maintained |
| 🕷️ scrapy-bot | Scrapy + Flask web scraping bot experiment | Python | archived |
| 💻 online-ide | Lightweight online Python execution environment | Python | experimental |

🛠️ Tech We Work With

| Layer | Tools |
| --- | --- |
| Data Engineering | Python · dbt · Apache Airflow · Prefect · Apache Spark |
| Warehouses | PostgreSQL · Snowflake · BigQuery · DuckDB |
| Observability | Prometheus · Grafana · OpenTelemetry · Soda Core |
| Backend | FastAPI · SQLAlchemy · Alembic · Pydantic |
| Infra & DevOps | Docker · Docker Compose · Terraform · GitHub Actions |
| AI / ML | LangChain · OpenAI APIs · Vector DBs (Qdrant / ChromaDB) |

🌱 Our Open-Source Philosophy

"Build what the ecosystem needs. Share what you build. Let the community make it better."

Every project we open-source follows three rules:

  1. Zero vendor lock-in — runs on infra you own and control
  2. Quickstart in under 10 minutes — if onboarding is painful, it won't get adopted
  3. Progressive complexity — adopt one layer at a time; no all-or-nothing commitment

We actively maintain what we ship. Issues get responses. PRs get reviewed. Roadmaps get published.


🤝 Contributing

All public repos welcome contributions. Best places to start:

  • 🔬 PipelineProbe: good first issues

    • Add a new warehouse connector (Redshift, DuckDB)
    • Add a new rule (task duration outliers, dbt source freshness)
    • Improve the HTML report template
  • 🔭 ObservaKit: good first issues

    • Add a new warehouse connector (Redshift, Delta Lake)
    • Write a Grafana dashboard for a new observability use case
    • Improve documentation or add a real-world example

Read CONTRIBUTING.md before opening a PR.


📬 Get In Touch

We are open to:

  • Collaborations on data tooling, AI pipelines, or observability infra
  • Consulting engagements — data platform audits, pipeline migrations, cost optimization
  • Freelance / contract data engineering for startups and scaleups

| Channel | Link |
| --- | --- |
| 🐙 GitHub | @willowvibe |
| 🔬 PipelineProbe Issues | Open an issue |
| 🔭 ObservaKit Issues | Open an issue |
| 🔐 Security Reports | See SECURITY.md |

🌿 WillowVibe — Bengaluru, India  ·  Building in the open since 2024  ·  Try PipelineProbe 🔬  ·  Star ObservaKit ⭐
