AI-powered content curation + voice briefings for engineers who don't have time to read everything
Live Demo: devcaddie.com
Real-time voice briefings powered by Gemini Live 2.5 Flash with barge-in Q&A capability.
- Script Generation - Airflow generates daily briefing scripts from top-scored articles
- Voice Synthesis - Gemini Live converts scripts to natural audio via Pipecat + Daily WebRTC
- Barge-in Q&A - Users can interrupt mid-briefing to ask questions about articles
- State Sync - UI shows "Listening" vs "Speaking" states for seamless interaction
| Feature | Implementation |
|---|---|
| Deterministic Narration | Pre-generated scripts for consistent delivery |
| Live Q&A | allow_interruptions toggle for barge-in |
| Low Latency | WebRTC via Daily.co for real-time audio |
| Cost: ~$0 | Vertex AI free tier (15 hrs audio/month) |
AI Agent-powered content curation with autonomous scoring and community validation
As a software engineer, I was drowning in 100+ tech articles daily from 50+ sources. 99% were noise. I needed AI to surface the signal.
Built an Agentic Workflow powered by Airflow:
- Scoring Agent - LLM-powered agent (Gemini) that perceives article content, reasons about relevance using user interests, and outputs scores (0-100), topics, and actionability
- Community Enrichment Pipeline - Fetches HackerNews + Lobsters engagement data, calculates community scores, detects viral thresholds
- Orchestration Layer - Airflow DAGs coordinate the workflow on decay schedules (16h → 24h → 7d)
Result: 500 articles → 5-10 must-reads with 95%+ precision
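The Scoring Agent's structured output (score, topics, actionability) is worth validating before it is stored. The following is a minimal sketch, assuming the model returns a JSON object with the hypothetical keys `score`, `topics`, and `actionability`; the project's real schema and prompt are not shown in this README.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ArticleScore:
    score: int              # AI relevance score, expected in 0-100
    topics: list[str]       # normalized topic labels
    actionability: str      # free-form label, passed through as-is


def parse_scoring_response(raw: str) -> ArticleScore:
    """Parse and validate one JSON response from the scoring model.

    The key names used here are assumptions made for illustration.
    """
    data = json.loads(raw)
    score = int(data["score"])
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    # Drop empty topics and normalize case and whitespace.
    topics = [str(t).strip().lower() for t in data.get("topics", []) if str(t).strip()]
    return ArticleScore(
        score=score,
        topics=topics,
        actionability=str(data.get("actionability", "")),
    )
```

Rejecting out-of-range scores at parse time keeps a malformed model response from silently skewing the blended score downstream.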
This is a Single-Agent Pipeline orchestrated by Airflow:
| Component | Type | Role |
|---|---|---|
| Scoring Agent | AI Agent (Gemini) | The cognitive core - perceives article content, reasons about relevance, outputs scores/topics/actionability |
| Community Enrichment | Data Pipeline | Fetches HN + Lobsters engagement data, calculates community scores (deterministic, not AI) |
| Rescore Pipeline | Data Pipeline | Re-checks community scores on decay schedule (16h → 24h → 7d) |
| Airflow DAGs | Orchestrator | Coordinates the agentic workflow - scheduling, retries, dependencies |
Viral Override: When the community score reaches the viral threshold (70 or higher), the system flips the weighting from 70% AI / 30% Community to 30% AI / 70% Community, letting the crowd override the algorithm.
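The weighting flip can be expressed in a few lines. This is a sketch rather than the project's actual code: the function name, the integer rounding, and treating the threshold as inclusive (`>= 70`) are assumptions; only the 70/30 weights and the threshold of 70 come from the description above.

```python
def final_score(ai_score: int, community_score: int, viral_threshold: int = 70) -> int:
    """Blend AI and community scores, flipping the weights for viral articles."""
    # Normal case: 70% AI / 30% Community. Viral case: 30% AI / 70% Community.
    ai_pct = 30 if community_score >= viral_threshold else 70
    community_pct = 100 - ai_pct
    # Integer arithmetic with round-half-up (assumed) avoids float rounding surprises.
    return (ai_pct * ai_score + community_pct * community_score + 50) // 100
```

For example, an article with an AI score of 40 and a community score of 90 blends to 75 under the viral weights, where the default weights would have given 55.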
RSS Feeds (106 feeds via OPML)
        ↓
Airflow DAG (daily @ 13:00 UTC)
        ↓
┌──────────────────────────────────────────────────────────────┐
│ SCORING PIPELINE                                             │
│ ├─ Fetch & Dedupe (SHA-256 URL hashes)                       │
│ ├─ AI Scoring (Gemini 2.5 Flash → structured JSON)           │
│ ├─ Community Enrichment (HN Algolia + Lobste.rs APIs)        │
│ └─ Final Score = weighted(AI, Community) + viral override    │
└──────────────────────────────────────────────────────────────┘
        ↓
BigQuery (articles_scored, daily_briefings, lecture_notes)
        ↓
┌──────────────────────────────────────────────────────────────┐
│ DELIVERY LAYER                                               │
│ ├─ Cloud Run API (FastAPI + static UI)                       │
│ ├─ Voice Briefing (Sidecar VM: Pipecat + Daily + Gemini)     │
│ ├─ Feed Assistant (NL → StruQ → BigQuery)                    │
│ └─ Video Lecture Notes (Gemini vision + GCS snapshots)       │
└──────────────────────────────────────────────────────────────┘
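The Fetch & Dedupe step keys articles on SHA-256 hashes of their URLs. A minimal sketch of that idea follows; the light normalization (trimming whitespace and a trailing slash), the `url_hash` field name, and the in-memory `seen_hashes` set are assumptions for illustration, since the README does not show how URLs are normalized or where seen hashes are stored.

```python
import hashlib


def url_hash(url: str) -> str:
    """Return the SHA-256 hex digest of a lightly normalized URL."""
    # Assumed normalization: trim whitespace and one or more trailing slashes.
    normalized = url.strip().rstrip("/")
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe(articles: list[dict], seen_hashes: set[str]) -> list[dict]:
    """Keep only articles whose URL hash has not been seen; update the seen set."""
    fresh = []
    for article in articles:
        digest = url_hash(article["url"])
        if digest in seen_hashes:
            continue
        seen_hashes.add(digest)
        fresh.append({**article, "url_hash": digest})
    return fresh
```

Hashing gives a fixed-length key that is cheap to compare and store, regardless of how long the original URL is.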
| Component | Technology | Cost |
|---|---|---|
| Orchestration | Apache Airflow 2.8 on GCE | ~$20/mo |
| Backend | FastAPI on Cloud Run | ~$0 (free tier) |
| AI Scoring | Gemini 2.5 Flash (Vertex AI) | ~$1/mo |
| Voice Briefing | Gemini Live + Pipecat + Daily WebRTC | ~$0 (free tier) |
| Storage | BigQuery + Firestore + GCS | ~$0 (free tier) |
Detailed Architecture

CONTENT INTELLIGENCE PLATFORM - AGENTIC WORKFLOW (Airflow-Orchestrated AI Pipeline)

EXTERNAL DATA SOURCES
- RSS
- Hashnode RSS
- Medium RSS
- HN RSS
- Fetched every 6 hours by the Airflow VM

COMPUTE ENGINE VM (e2-medium) - $24/month
- Apache Airflow (Docker)
  - DAG 1: Content Intelligence Pipeline (every 6h)
    - [Fetch RSS] → [Parse] → [Scoring Agent] → [Community] → [Store BigQuery]
    - Scoring Agent: Gemini AI (0-100)
    - Community: HN + Lobsters APIs
    - Final Score Blend: 70% AI + 30% Community (flips if viral)
  - DAG 2: Community Rescore Pipeline (every 8h)
    - [Query BQ: articles needing rescore] → [Batch HN/Lobsters APIs] → [Match URLs locally] → [Update BQ] (viral override if score >= 70)
    - Decay Schedule:
      - 0-48h: check every 16h
      - 2-7d: check daily
      - 7-30d: check weekly
  - DAG 3: LLM Review Observability Pipeline
    - [Ingest Docs] → [Chunk & Embed] → [Vector Store]
    - [User Query] → [RAG Retrieval] → [Gemini Review] → [Store]
    - Query-flow steps feed [Track with MLflow]
  - Scheduler: 6-hour intervals
  - Resources: 2 vCPU, 4 GB RAM
  - Stack: Python, Docker Compose, PostgreSQL (metadata DB)
- Monitoring: Google Cloud Ops agent installed
- Logs: /var/log/airflow/* → Cloud Logging
- Metrics: CPU, Memory, Disk → Cloud Monitoring
- Writes processed data to BigQuery

BIGQUERY (Storage Layer) - free tier
- Dataset: content_intelligence
- Table: articles_scored
  - article_id, title, url, source (STRING)
  - ai_relevance_score (INTEGER 0-100)
  - community_score (INTEGER 0-100)
  - final_score (INTEGER - blended)
  - hn_points, hn_comments, lobsters_points (INTEGER)
  - is_trending, is_viral (BOOL)
  - last_community_check (TIMESTAMP)
  - community_check_count (INTEGER)
  - key_topics (ARRAY<STRING>)
  - ai_reasoning, actionability, content_type (STRING)
- Table: code_review_sessions
  - session_id (STRING)
  - code_snippet (STRING)
  - review_result (STRING)
  - retrieval_context (ARRAY<STRING>)
  - response_time_ms (INTEGER)
  - created_at (TIMESTAMP)
  - metadata (JSON)
- Storage: ~1 GB/month (well within free tier)
- Queried by the Cloud Run API

CLOUD RUN (User-Facing API) - $5/month
- FastAPI Application
- Frontend: Multi-Tab Portfolio Interface
  - Tab 1: Content Intelligence Hub
    - AI Reading Assistant (Chat Interface)
      - Time-based recommendations
      - Trending topics
      - Topic curation
      - URL analyzer
      - What's new detection
    - Architecture diagram
    - Dual-scoring explanation
    - Live demo
    - Links: GitHub, Dev.to post, Medium articles
  - Tab 2: LLM Review Observability
    - Code review demo (submit code → get AI review)
    - RAG pipeline visualization
    - Multi-stage observability
    - Datadog screenshots
    - Architecture diagram
    - Links: GitHub, blog post
  - Tab 3: About Me
    - 15 years backend + platform engineering
    - Data Platform Engineer positioning
    - Synopsys experience (30 services, 150 containers)
    - Certifications (Databricks, CKAD)
    - Tech stack showcase
    - Contact info
- Backend: API Endpoints
  - POST /api/assistant (Content Intelligence chat)
  - GET /api/articles (Query processed articles)
  - GET /api/articles/trending (Trending topics)
  - POST /api/analyze-url (Analyze specific URL)
  - POST /api/review (Code review with RAG)
  - GET /metrics (Prometheus metrics)
  - GET /health (Health check)
- Rate Limiting:
  - 10 requests/minute per IP
  - 50 requests/hour per IP
  - 200 requests/day per IP
- Auto-scaling: 0-10 instances
- Concurrency: 80 requests per instance
- Min instances: 0 (scales to zero when idle)
- Instrumentation:
  - OpenTelemetry (traces)
  - Custom metrics (response time, token usage)
  - MLflow (experiment tracking)
  - Error tracking
- URL: https://content-intelligence-hub.run.app
- Stack: FastAPI, Python, Docker, HTML/CSS/JavaScript
- Sends requests to the Gemini API

GEMINI API (AI Processing) - $6/month
- Model: Gemini 1.5 Flash
- Use Cases:
  - 1. Article Analysis (Content Intelligence)
    - Extract key topics
    - Score personal relevance (0-100)
    - Assess quality signal
    - Generate summary
  - 2. Conversational Assistant
    - Answer user queries
    - Recommend articles
    - Explain trends
  - 3. Code Review Generation (LLM Observability)
    - Analyze code quality
    - Suggest improvements
    - RAG-enhanced context
- Rate Limits:
  - Daily budget cap: $5/day
  - ~1000 requests/month
  - $0.0002 per article

GOOGLE CLOUD OPERATIONS (Observability) - free
- Cloud Trace (Distributed Tracing)
  - API request traces
  - Airflow task traces
  - BigQuery query traces
  - Gemini API call traces
  - End-to-end latency visualization
- Cloud Monitoring (Metrics & Dashboards)
  - VM metrics (CPU, memory, disk)
  - Cloud Run metrics (requests, latency, errors)
  - Custom metrics (article processing rate, relevance scores)
  - BigQuery metrics (query performance)
  - Custom dashboards
- Cloud Logging (Log Aggregation)
  - Airflow logs (scheduler, webserver, tasks)
  - Cloud Run logs (requests, errors, traces)
  - Application logs (debug, info, error)
  - Log-based metrics
- Error Reporting
  - Automatic error detection
  - Error grouping
  - Stack trace analysis
  - Alert integration
- Alerting
  - High error rate (>5%)
  - API latency (>2s p95)
  - VM CPU/Memory (>90%)
  - Daily budget exceeded
  - Airflow DAG failures
- Free Tier Limits:
  - 50 GB logs/month
  - All traces (unlimited)
  - 150 GB metrics ingestion/month

OPTIONAL: DATADOG (Demo Period Only) - 14-day trial
- Used ONLY for Week 3-4 (Hackathon Submission)
- Agent installed on:
  - Compute Engine VM
  - Cloud Run (via OpenTelemetry)
- Purpose:
  - Create beautiful dashboards (screenshots for blog)
  - Record demo video
  - Show multi-stage observability
  - Submit to GCP/Datadog hackathon
- After trial expires:
  - Remove Datadog agent
  - Keep Google Cloud Ops (free forever)
  - Portfolio stays live at $25/month
- AI Relevance Score (0-100): Scoring Agent analyzes content against your subscribed feed interests
- Community Score (0-100): Aggregated from HackerNews points + Lobsters engagement
- Final Score: Blended 70% AI + 30% Community (flips to 30/70 on viral content)
- Viral Override: When community score >= 70, let the crowd override the algorithm
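The README does not give the formula behind the Community Score, so the following is only an illustration of how HackerNews and Lobsters engagement could be folded into a 0-100 value. The weights, the log scaling, and the `SATURATION` constant are all assumptions and do not come from the project; only the input fields (`hn_points`, `hn_comments`, `lobsters_points`) match the stored columns.

```python
import math

# Assumed: weighted engagement at which the score saturates at 100.
SATURATION = 500


def community_score(hn_points: int = 0, hn_comments: int = 0, lobsters_points: int = 0) -> int:
    """Map raw engagement counts to a 0-100 score using log scaling (illustrative)."""
    # Assumed weights: comments count half a point, Lobsters points count double.
    engagement = hn_points + 0.5 * hn_comments + 2 * lobsters_points
    if engagement <= 0:
        return 0
    scaled = math.log1p(engagement) / math.log1p(SATURATION)
    return min(100, round(100 * scaled))
```

Log scaling is used here so that the first few dozen points move the score more than the difference between, say, 900 and 1000 points.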
- Conversational interface
- Time-based recommendations
- Real-time article analysis
- Trend detection
- Agentic Workflow: Scoring Agent + Community Pipelines orchestrated by Airflow
- Decay-based Rescoring: Articles re-checked on schedule (16h → 24h → 7d → stop)
- Batch API Calls: HN Algolia + Lobsters fetched in bulk, matched locally
- Rate-limited API: Prevents abuse
- $0.0002 per article: Batch LLM processing for 10x cost reduction
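The decay-based rescoring rule can be sketched as a function of article age. The tiers below follow the schedule in the detailed architecture (every 16h for the first 48h, daily up to 7 days, weekly up to 30 days, then stop); the function names and the exact boundary handling are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def recheck_interval(age: timedelta) -> Optional[timedelta]:
    """Return how often an article of this age is re-checked, or None to stop."""
    if age < timedelta(hours=48):
        return timedelta(hours=16)
    if age < timedelta(days=7):
        return timedelta(days=1)
    if age < timedelta(days=30):
        return timedelta(days=7)
    return None  # older than 30 days: no further community checks


def needs_rescore(published_at: datetime, last_check: Optional[datetime], now: datetime) -> bool:
    """Decide whether an article is due for another community-score check."""
    interval = recheck_interval(now - published_at)
    if interval is None:
        return False
    if last_check is None:
        return True  # never checked yet
    return now - last_check >= interval
```

A rescore DAG could apply `needs_rescore` to each candidate row using `last_community_check`, so older articles cost progressively fewer API calls.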
- Apache Airflow 2.8
- Google Cloud Run
- Google BigQuery
- Gemini 1.5 Flash
- FastAPI
- Docker
- Reading time: 2 hours → 15 minutes daily
- Relevance accuracy: 94%+
- Processing cost: $0.0002/article
- Total cost: $25/month
What I Learned:
- Agentic workflow design (Scoring Agent + orchestration)
- Apache Airflow for autonomous pipeline coordination
- Cost-effective LLM API usage (batching strategies)
- Community signal aggregation (HN/Lobsters APIs)
- Decay scheduling for rolling score updates
MIT

