Data Analyst · Financial Firm, 📍Bengaluru
Building intelligent pipelines with Snowflake, dbt, and LLMs · Former Intern at Ernst & Young · IEEE Published · B.Tech CSE, VIT Vellore
- I build end-to-end data pipelines, AI-powered document processing tools, and analytics dashboards that turn messy data into decisions
- My work spans the full data stack — from ingestion and transformation to LLM-powered extraction, API backends, and business-facing visualizations
- Contributing to an end-to-end donor management solution for Make-A-Wish India
- Deeply focused on agentic workflows, intelligent document processing, and text-to-SQL interfaces
- Based in Bengaluru, India
Snowflake · dbt · Azure Automation · Document AI · Fivetran · Streamlit
End-to-end pipeline processing multi-vendor invoices from email & SFTP sources. Uses LLM-powered field extraction (Snowflake AI Extract), dbt transformations for lineage and testability, and a Streamlit app for business-user validation and exception handling. Meaningfully reduced end-to-end processing time across daily invoice flows.
Node.js · Google Gemini · Pinecone · Vector Search
Converts natural language queries into SQL using Google Gemini, with Pinecone vector search for dynamic, metadata-driven schema retrieval. Supports any database structure via admin-uploaded metadata files, eliminating hardcoded table dependencies. Includes dynamic chart visualizations for non-technical stakeholders.
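As a rough illustration of the metadata-driven retrieval step (not the project's actual code, which runs on Node.js), here is a minimal Python sketch. It assumes a hypothetical `SCHEMA_METADATA` dictionary standing in for the admin-uploaded metadata file, and it swaps Pinecone's vector search for a plain bag-of-words cosine similarity so it runs without external services. The Gemini call is left out; the sketch stops at the prompt.

```python
import math
import re
from collections import Counter

# Hypothetical admin-uploaded metadata: table name -> description and columns.
SCHEMA_METADATA = {
    "orders": {
        "description": "customer orders with order date and total amount",
        "columns": ["order_id", "customer_id", "order_date", "total_amount"],
    },
    "customers": {
        "description": "customer profile with name, email and city",
        "columns": ["customer_id", "name", "email", "city"],
    },
    "inventory": {
        "description": "warehouse stock levels per product",
        "columns": ["product_id", "warehouse_id", "quantity"],
    },
}


def tokenize(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_tables(question, metadata, top_k=2):
    """Rank tables by similarity between the question and each table's metadata.

    Stand-in for a vector-store query: a real setup would embed the text and
    ask the vector index for its nearest neighbours instead.
    """
    q = tokenize(question)
    scored = []
    for name, meta in metadata.items():
        text = " ".join([name, meta["description"], *meta["columns"]])
        scored.append((cosine(q, tokenize(text)), name))
    scored.sort(reverse=True)
    return [name for score, name in scored[:top_k] if score > 0]


def build_prompt(question, metadata, top_k=2):
    """Assemble an LLM prompt that exposes only the retrieved tables."""
    lines = ["Write one SQL query for the question using only these tables:"]
    for name in retrieve_tables(question, metadata, top_k):
        lines.append(f"- {name}({', '.join(metadata[name]['columns'])})")
    lines.append(f"Question: {question}")
    return "\n".join(lines)
```

Because the prompt is built from whatever the metadata contains, pointing the tool at a different database only needs a different metadata file, with no table names hardcoded in the query path.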
FastAPI · Streamlit · NLP · GitHub Actions · Azure
Built during my internship at EY — end-to-end analysis of online retail data to uncover demand and purchase-behavior trends. Evaluated NLP embedding approaches (CountVectorizer, Word2Vec, Sentence-Transformers) to design a scoring and ranking framework for product recommendations, deployed on Azure with CI/CD via GitHub Actions.
NLP · LLMs · Metadata Filtering · Hybrid Retrieval
Published research proposing a hybrid recommendation system that combines metadata filtering with review-based LLM reasoning to improve apparel discovery accuracy — going beyond traditional collaborative filtering by incorporating cascaded language model inference.
Copilot Studio · Azure Functions · Power Apps
Agentic workflow that maps inbound job descriptions to firm service lines with explainable match scores. Includes weighted scoring, gap analysis for near-matches, and automated first-touch outreach with human-in-the-loop validation and audit logs.
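The weighted scoring and gap analysis can be sketched roughly as below. This is an illustrative Python sketch under stated assumptions, not the Copilot Studio implementation: the `SERVICE_LINES` weights, the `Match` record, and the `match_at` / `near_at` thresholds are all hypothetical, and skills are assumed to be already extracted from the job description.

```python
from dataclasses import dataclass

# Hypothetical service lines: required skill -> weight (each line sums to 1.0).
SERVICE_LINES = {
    "Data Engineering": {"sql": 0.4, "python": 0.3, "snowflake": 0.3},
    "Cloud Advisory": {"azure": 0.5, "devops": 0.3, "security": 0.2},
}


@dataclass
class Match:
    service_line: str
    score: float   # sum of the weights of skills found in the job description
    matched: list  # skills that contributed to the score (the explanation)
    gaps: list     # required skills missing from the job description
    status: str    # "match", "near-match" or "no-match"


def score_job(jd_skills, service_lines, match_at=0.7, near_at=0.4):
    """Score one job description against every service line, best first."""
    jd = {s.lower() for s in jd_skills}
    results = []
    for line, weights in service_lines.items():
        matched = sorted(s for s in weights if s in jd)
        gaps = sorted(s for s in weights if s not in jd)
        score = round(sum(weights[s] for s in matched), 2)
        if score >= match_at:
            status = "match"
        elif score >= near_at:
            status = "near-match"  # surfaced for human review with its gaps
        else:
            status = "no-match"
        results.append(Match(line, score, matched, gaps, status))
    return sorted(results, key=lambda m: m.score, reverse=True)


if __name__ == "__main__":
    for m in score_job(["SQL", "Python", "Azure"], SERVICE_LINES):
        print(m.service_line, m.score, m.status, "missing:", m.gaps)
```

Keeping the matched skills and the gaps on each result is what makes the score explainable: a reviewer sees why a line scored as it did and what a near-match is missing before any outreach goes out.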
FastAPI · PostgreSQL · SQLAlchemy · OAuth
FastAPI backend for managing financial data across General Partners, Funds, and Portfolio Companies. Supports multi-tenant access via a multi-schema PostgreSQL architecture with OAuth-based authentication and custom financial statement templates.
Languages
Data Engineering
Cloud & DevOps
Backends & Databases
AI & ML
Analytics & Visualization
I'm always open to discussing data engineering, LLM applications, and the startup ecosystem.
Open for technical bounties: I'm actively taking on database systems, Azure, and DevOps challenges.
Reach out on LinkedIn or at pin.ayush@gmail.com