LimPark996/README.md

About Me

name: Yumi Park
location: Seoul, Korea
education: M.S. Statistics, Dongguk University
focus: RAG Systems · Video Retrieval · Data Pipelines

background:
  - Designed and built RAG systems for enterprise document search
  - Built Video RAG prototype for gov't R&D project (1st round pass)
  - ML modeling at EdTech company (Woongjin ThinkBig AI Labs)
  - Taught generative AI to non-developers at Samsung C&T AI Academy
  - Wrote technical deep-dives on VQGAN, Transformer, BERT source code (byumm315.tistory.com)

Featured Projects

VideoRAG — AI-Powered Video Retrieval & Synthesis

Hybrid retrieval pipeline combining BM25 + InternVideo2 + ColBERT. Selected for government R&D funding (1st round pass).

InternVideo2 · FAISS · ColBERT · BM25 · Runway · DINOv2 · C2PA · Gradio

  • Indexed 7,010 MSR-VTT videos with hybrid search: BM25 · Dense · WRRF fusion · ColBERT · ITM reranking
  • MSR-VTT 1k-A benchmark: ITC dense alone R@1 3.5% → R@1 44.4% after full ITM (paper: 51.9%; −10.6%p gap attributed to unresolved ITC collapse)
  • Scene Graph → 2-path routing (USE_AS_IS / TRANSFORM) for the PD workstation
  • DINOv2 transition scoring + DreamColour 3D LUT color grading + C2PA ES256 provenance signing
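
The fusion step in the list above can be illustrated with a small sketch of weighted reciprocal rank fusion (WRRF). This is not the project's code: the function name, the weights (0.4 / 0.6), the smoothing constant `k=60`, and the video IDs are all assumptions chosen for illustration. Each retriever contributes `weight / (k + rank)` for every item it returns, and items are sorted by the summed score.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked ID lists with weighted reciprocal rank fusion.

    rankings: dict mapping retriever name -> list of IDs, best first.
    weights:  dict mapping retriever name -> weight (hypothetical values).
    k:        smoothing constant; 60 is a commonly used default, assumed here.
    """
    scores = defaultdict(float)
    for name, ranked_ids in rankings.items():
        weight = weights[name]
        for rank, item_id in enumerate(ranked_ids, start=1):
            scores[item_id] += weight / (k + rank)
    # Sort by descending score; break ties by ID for deterministic output.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Hypothetical result lists from a sparse and a dense retriever.
bm25_hits = ["video7", "video2", "video9"]
dense_hits = ["video2", "video5", "video7"]

fused = weighted_rrf(
    {"bm25": bm25_hits, "dense": dense_hits},
    {"bm25": 0.4, "dense": 0.6},
)
print(fused)
```

Because only ranks enter the formula, BM25 scores and dense similarities never need to be put on a common scale, which is the usual reason for choosing rank fusion over score fusion.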

Repo · Demo


Construction Law RAG Chatbot

AI chatbot over 9 construction regulation PDFs · Samsung C&T AI Academy project · Team lead

Python · FAISS · BM25 · bge-reranker · GPT-4o-mini · Streamlit

  • FAISS + BM25 hybrid retrieval, with bge-reranker reranking to improve precision
  • 7-type query classification via GPT-4o-mini with per-type response strategy branching
  • Designed step-by-step implementation notebooks for non-developer teammates
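
The classify-then-branch pattern described above can be sketched as a dispatch table. Everything here is hypothetical: the type names (`definition`, `procedure`), the handlers, and the keyword-based `classify` function, which only stands in for the GPT-4o-mini classification call. The project's actual seven query types are not listed in this README.

```python
from typing import Callable, Dict


# Hypothetical response strategies, one per query type.
def answer_definition(query: str) -> str:
    return f"[definition] {query}"


def answer_procedure(query: str) -> str:
    return f"[procedure] {query}"


def answer_fallback(query: str) -> str:
    return f"[fallback] {query}"


# Dispatch table: query type -> strategy. Only two illustrative types shown.
STRATEGIES: Dict[str, Callable[[str], str]] = {
    "definition": answer_definition,
    "procedure": answer_procedure,
}


def classify(query: str) -> str:
    """Keyword stand-in for the LLM classifier (assumption, not the real one)."""
    lowered = query.lower()
    if lowered.startswith("what is"):
        return "definition"
    if "how do" in lowered or "procedure" in lowered:
        return "procedure"
    return "other"


def respond(query: str, classifier: Callable[[str], str] = classify) -> str:
    """Classify the query, then route it to the matching strategy."""
    label = classifier(query)
    handler = STRATEGIES.get(label, answer_fallback)
    return handler(query)


print(respond("What is a load-bearing wall?"))
print(respond("How do I file a permit?"))
```

Passing the classifier as a parameter keeps the routing logic testable without any API call; an LLM-backed classifier could be swapped in as long as it returns one of the table's keys.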

Repo


Metal Defect Synthesis

Image synthesis PoC to address manufacturing defect data scarcity

VQGAN · MaskGIT · PyTorch · HuggingFace · Gradio

  • v1 (taming VQGAN + custom MaskGIT) → v2 (LlamaGen VQGAN + Halton-MaskGIT): the migration was driven by codebook incompatibility. Halton-MaskGIT (ICLR 2025) is built for the LlamaGen VQGAN and cannot be combined with the taming VQGAN, and pretrained weights exist only for the LlamaGen variant, which made the switch the practical choice
  • Merged 3 datasets including NEU-DET; 2,659 images → 8× augmentation
  • VQGAN fine-tuning: Edge IoU +10.6%, PSNR +3.1%, SSIM +0.73%
  • MaskGIT training loss reached only 6.77 (target ~4.0): convergence failure, a structural limitation of training a 69M-parameter model on ~21K images
  • Deployed Gradio inpainting demo on HuggingFace Spaces
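
As background on the "Halton" part of Halton-MaskGIT, the sketch below shows how a 2D Halton sequence (bases 2 and 3) can produce a spatially spread-out visiting order over a token grid. It illustrates the low-discrepancy ordering idea only. It is not the paper's scheduler or this project's code, and the function names and the 4×4 grid size are assumptions.

```python
def radical_inverse(index: int, base: int) -> float:
    """Mirror the base-`base` digits of `index` around the radix point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def halton_token_order(grid_size: int) -> list[tuple[int, int]]:
    """Return every (row, col) cell of a square grid in Halton visiting order.

    Points are drawn from the 2D Halton sequence and mapped to grid cells;
    cells that were already visited are skipped.
    """
    order: list[tuple[int, int]] = []
    seen: set[tuple[int, int]] = set()
    index = 1
    while len(order) < grid_size * grid_size:
        # min() guards against a float rounding up to the grid edge.
        row = min(int(radical_inverse(index, 2) * grid_size), grid_size - 1)
        col = min(int(radical_inverse(index, 3) * grid_size), grid_size - 1)
        if (row, col) not in seen:
            seen.add((row, col))
            order.append((row, col))
        index += 1
    return order


order = halton_token_order(4)  # hypothetical 4x4 token grid
print(order[:4])
```

Consecutive cells in this order tend to land far apart, so tokens filled in during the same step are less likely to be neighbours than with a purely random choice.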

Repo Demo


Term Search System

Hybrid search engine for construction standard terminology

Python · FAISS · ColBERT · OpenAI API

  • Weighted fusion: OpenAI Embedding semantic similarity (60%) + ColBERT token-level MaxSim (40%)
  • Achieved accurate standard term retrieval from colloquial queries
  • Circuit Breaker + Rate Limiting for API failure resilience
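
The 60/40 weighted fusion above can be sketched in plain Python. The function names and toy vectors are illustrative, and averaging MaxSim over query tokens (so that both terms share the cosine range) is an assumption of this sketch, not something the README states. Real embeddings would come from the OpenAI Embedding API and a ColBERT encoder.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def maxsim(query_tokens, doc_tokens):
    """ColBERT-style late interaction: best document match per query token.

    The per-token maxima are averaged (assumption) rather than summed.
    """
    best = [max(cosine(q, d) for d in doc_tokens) for q in query_tokens]
    return sum(best) / len(best)


def fused_score(q_vec, d_vec, q_tokens, d_tokens, w_dense=0.6, w_colbert=0.4):
    """Weighted sum of sentence-level cosine and token-level MaxSim."""
    return w_dense * cosine(q_vec, d_vec) + w_colbert * maxsim(q_tokens, d_tokens)


# Toy 2-dimensional vectors standing in for real embeddings.
score = fused_score(
    q_vec=[1.0, 0.0],
    d_vec=[1.0, 0.0],
    q_tokens=[[1.0, 0.0], [0.0, 1.0]],
    d_tokens=[[1.0, 0.0]],
)
print(round(score, 3))
```

In this toy case the sentence-level match is perfect (1.0), while only one of the two query tokens finds a match among the document tokens (MaxSim 0.5), so the fused score is 0.6 × 1.0 + 0.4 × 0.5 = 0.8.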

Repo


Ticketmon — Concert Ticketing Platform

High-concurrency ticketing system · Bootcamp final project · Team lead

Spring Boot · Redis · MySQL · React · Docker · AWS

  • Designed and implemented AI review summarization feature (Together AI → OpenAI migration)
  • Refactored seat management from section-based to grade-based architecture
  • Implemented dynamic seat layout rendering based on venue scale (small / medium / large)

Backend


Work Experience

| Period | Role | Key Work |
| --- | --- | --- |
| 2025.08–11 | AI Instructor, Samsung C&T AI Academy (Elice) | Generative AI curriculum & RAG chatbot training for construction industry professionals |
| 2023.09–2024.12 | Research Team, Woongjin ThinkBig AI Labs | CatBoost difficulty prediction model for exam items (R²=0.57), ALP system reverse analysis |

Skills

AI / RAG / Search

Python · OpenAI · FAISS · HuggingFace · Claude · ChatGPT

ML / Data

NumPy · Pandas · scikit-learn · PostgreSQL

Backend / Infra

Spring Boot · Redis · Docker · AWS · Google Colab · Jupyter Notebook


Connect

LinkedIn · HuggingFace · Blog


Pinned

  1. term-search-system-ver1 (Public)

    Python

  2. cntchatbot_pjt1 (Public)

    (Elice) Samsung C&T AI Academy Lesson Plan 1 - Real Estate Market Trend Report Summarization Q&A AI Agent

    Python 5

  3. cntworkbot_pjt1 (Public)

    (Elice) Samsung C&T AI Academy Lesson Plan 2 - AI Agent for Summarizing Building Regulation Documents and Drafting Compliance Plans

    Jupyter Notebook 4

  4. Yum-CS-Study-Memo/Effective-Java-Study-Memo (Public)

    Effective-Java-Study-Memo