ayushkr03/readme.md

Hi, I'm Ayush 👋

Data Analyst · Financial Firm · 📍 Bengaluru
Building intelligent pipelines with Snowflake, dbt, and LLMs · Former Intern at Ernst & Young · IEEE Published · B.Tech CSE, VIT Vellore


About Me

  • I build end-to-end data pipelines, AI-powered document processing tools, and analytics dashboards that turn messy data into decisions
  • My work spans the full data stack — from ingestion and transformation to LLM-powered extraction, API backends, and business-facing visualizations
  • Contributing to an end-to-end donor management solution for Make-A-Wish India
  • Deeply focused on agentic workflows, intelligent document processing, and text-to-SQL interfaces
  • Based in Bengaluru, India

🚀 Featured Projects

🔄 Invoice Ingestion & Reconciliation Pipeline

Snowflake · dbt · Azure Automation · Document AI · Fivetran · Streamlit

End-to-end pipeline processing multi-vendor invoices from email & SFTP sources. Uses LLM-powered field extraction (Snowflake AI Extract), dbt transformations for lineage and testability, and a Streamlit app for business-user validation and exception handling. Meaningfully reduced end-to-end processing time across daily invoice flows.


🧠 QueryCraft — Text-to-SQL with Dynamic Schema Retrieval

Node.js · Google Gemini · Pinecone · Vector Search

Converts natural language queries into SQL using Google Gemini, with Pinecone vector search for dynamic, metadata-driven schema retrieval. Supports any database structure via admin-uploaded metadata files, eliminating hardcoded table dependencies. Includes dynamic chart visualizations for non-technical stakeholders.
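The sketch below illustrates the metadata-driven retrieval idea in plain Python. It is not QueryCraft's actual code: the project uses Pinecone and Google Gemini on Node.js, while this example swaps in a bag-of-words cosine similarity so it runs with the standard library only. The `TABLE_METADATA` contents and the function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical table descriptions, standing in for an admin-uploaded metadata file.
TABLE_METADATA = {
    "orders": "orders table: order id, customer id, order date, total amount",
    "customers": "customers table: customer id, name, email, signup date",
    "products": "products table: product id, title, category, unit price",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_tables(question, metadata, top_k=2):
    """Return up to top_k table names whose description overlaps the question."""
    q_vec = Counter(tokenize(question))
    scored = [
        (cosine(q_vec, Counter(tokenize(desc))), name)
        for name, desc in metadata.items()
    ]
    scored.sort(reverse=True)
    return [name for score, name in scored[:top_k] if score > 0]


def build_prompt(question, metadata, top_k=2):
    """Assemble an LLM prompt containing only the retrieved schema entries."""
    tables = retrieve_tables(question, metadata, top_k)
    schema = "\n".join(f"- {metadata[t]}" for t in tables)
    return f"Schema:\n{schema}\n\nQuestion: {question}\nSQL:"


if __name__ == "__main__":
    print(build_prompt("total amount of orders by order date", TABLE_METADATA))
```

Because the prompt is built from whatever metadata is loaded, no table names are hardcoded in the retrieval logic. A real deployment would replace the token-count vectors with learned embeddings stored in a vector index.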


🛍️ Product Recommendation Engine (Ernst & Young)

FastAPI · Streamlit · NLP · GitHub Actions · Azure

Built during my internship at EY — end-to-end analysis of online retail data to uncover demand and purchase-behavior trends. Evaluated NLP embedding approaches (CountVectorizer, Word2Vec, Sentence-Transformers) to design a scoring and ranking framework for product recommendations, deployed on Azure with CI/CD via GitHub Actions.


📄 Hybrid Apparel Recommendation System Using Cascaded LLMs (IEEE Access, 2024)

NLP · LLMs · Metadata Filtering · Hybrid Retrieval

Published research proposing a hybrid recommendation system that combines metadata filtering with review-based LLM reasoning to improve apparel discovery accuracy — going beyond traditional collaborative filtering by incorporating cascaded language model inference.

🔗 Read on IEEE Xplore


🤖 Lead Qualification Agent

Copilot Studio · Azure Functions · Power Apps

Agentic workflow that maps inbound job descriptions to firm service lines with explainable match scores. Includes weighted scoring, gap analysis for near-matches, and automated first-touch outreach with human-in-the-loop validation and audit logs.
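As a rough illustration of weighted scoring with gap analysis, here is a minimal Python sketch. It is an assumption-laden stand-in, not the Copilot Studio / Azure Functions implementation: the service lines, keyword weights, `near_match` threshold, and the `score_lead` function are all hypothetical.

```python
# Hypothetical service lines and keyword weights; real ones would come from the firm.
SERVICE_LINES = {
    "data_engineering": {"snowflake": 3.0, "dbt": 2.0, "pipeline": 1.0},
    "cloud_devops": {"azure": 3.0, "kubernetes": 2.0, "terraform": 1.0},
}


def score_lead(description, service_lines, near_match=0.5):
    """Score a job description against each service line.

    The score is the matched keyword weight divided by the total weight.
    Near-matches (near_match <= score < 1.0) also list the missing keywords,
    so a human reviewer can see why the match fell short.
    """
    words = set(description.lower().split())
    results = []
    for line, weights in service_lines.items():
        matched = sorted(k for k in weights if k in words)
        missing = sorted(k for k in weights if k not in words)
        score = sum(weights[k] for k in matched) / sum(weights.values())
        results.append({
            "service_line": line,
            "score": round(score, 2),
            "matched": matched,  # explanation of the score
            "gaps": missing if near_match <= score < 1.0 else [],
        })
    # Best match first
    results.sort(key=lambda r: r["score"], reverse=True)
    return results


if __name__ == "__main__":
    for row in score_lead("Need a snowflake and dbt engineer on azure", SERVICE_LINES):
        print(row)
```

Returning the matched keywords alongside the score keeps each result explainable, and the `gaps` list gives a reviewer something concrete to validate before any outreach is sent.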


📈 Portfolio Monitoring Backend

FastAPI · PostgreSQL · SQLAlchemy · OAuth

FastAPI backend for managing financial data across General Partners, Funds, and Portfolio Companies. Supports multi-tenant access via a multi-schema PostgreSQL architecture with OAuth-based authentication and custom financial statement templates.


🛠️ Tech Stack

Languages

Python SQL JavaScript Java HTML/CSS

Data Engineering

Snowflake dbt Fivetran Alteryx

Cloud & DevOps

Azure Azure Automation GitHub Actions

Backends & Databases

FastAPI Node.js Express.js PostgreSQL MongoDB

AI & ML

Snowflake Cortex Pinecone Copilot Studio Google AI Studio

Analytics & Visualization

Power BI Streamlit Looker Studio GA4


🤝 Let's Connect

I'm always open to discussing data engineering, LLM applications, and the startup ecosystem.

Open to technical bounties: I'm actively looking to take on database systems, Azure, and DevOps challenges.

Reach out on LinkedIn or at pin.ayush@gmail.com


Popular repositories

  1. Lie-Detection-Using-Audio-Classification Public

    This repository deals with Deception Detection using Audio Classification

    Jupyter Notebook

  2. Orket-An-Ecommerce-Website Public

    JavaScript

  3. Reco Public

  4. Data-Ingestion Public

    Python

  5. ayushkr03 Public