# docs: Milestone 1 report, project plan, and charts (HW9) (#68)
# AI Campaign Studio (Flair2) — Project Plan

> Multi-stage AI pipeline that generates social media marketing campaigns.
> Two-person team: Sam (pipeline + frontend) and Jess (infrastructure + distributed systems).

---
## Timeline

```
Mar 25       Mar 28       Apr 4        Apr 8        Apr 11       Apr 15
|            |            |            |            |            |
|--- M1 -----|            |            |            |            |
| MVP Pipeline (Sam)      |            |            |            |
|            |--- M2 -----|            |            |            |
|            | AWS Infra (Jess)       |            |            |
|            |            |--- M3 -----|            |            |
|            |            | Distributed (Both)     |            |
|            |            |            |--- M4 -----|            |
|            |            |            | Frontend (Sam)          |
|            |            |            |            |--- M5 -----|
|            |            |            |            | Experiments (Both)
v            v            v            v            v            v
Start        MVP done     AWS          Pipeline     Frontend +   Experiments
             posting      deployed +   on AWS       feedback     + write-up
             locally      reachable
```

---
## Task Breakdown by Milestone

### M1: MVP Pipeline — Sam (Mar 25 - Mar 28)

Get a working 6-stage pipeline (S1–S6) running locally: analyze trending content, generate scripts, simulate audience voting, and personalize output.

| # | Title | Size | Status | Completed |
|---|-------|------|--------|-----------|
| #16 | Project scaffolding + Pydantic models | S | Done | Mar 27 |
| #17 | Provider interface + Gemini implementation | S | Done | Mar 27 |
| #18 | S1 analyze + S2 aggregate (MapReduce Cycle 1) | M | Done | Mar 27 |
| #19 | S3 generate + S4 vote + S5 rank (MapReduce Cycle 2) | M | Done | Mar 27 |
| #20 | S6 personalize + local runner + CLI | M | Done | Mar 27 |
| #21 | Download dataset + first real pipeline run | M | Done | Mar 27 |
| #22 | Generate + post first video | M | Suspended | -- |
| #60 | Add Kimi (Moonshot) as default reasoning provider | M | Done | Mar 28 |

> M1 status: 7/8 issues closed. #22 suspended (video generation depends on external API availability).
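The provider interface from #17 and #60 lets pipeline stages swap between Gemini and Kimi without code changes. A minimal sketch of the idea, assuming illustrative class and method names (the repo's actual API may differ):

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class CompletionResult:
    """Hypothetical result type: generated text plus token accounting."""
    text: str
    input_tokens: int
    output_tokens: int


class LLMProvider(ABC):
    """Common interface so stages can swap Gemini/Kimi implementations."""

    @abstractmethod
    def complete(self, prompt: str, *, max_tokens: int = 1024) -> CompletionResult:
        ...


class EchoProvider(LLMProvider):
    """Stand-in provider for local tests: echoes the prompt back."""

    def complete(self, prompt: str, *, max_tokens: int = 1024) -> CompletionResult:
        text = prompt[:max_tokens]
        return CompletionResult(text=text,
                                input_tokens=len(prompt.split()),
                                output_tokens=len(text.split()))


provider: LLMProvider = EchoProvider()
result = provider.complete("hook: why MapReduce?")
print(result.text)
```

A real Gemini or Kimi implementation would subclass `LLMProvider` the same way, keeping stage code independent of the chosen model.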
### M2: AWS Infrastructure — Jess (Mar 28 - Apr 4)

Deploy all AWS services via Terraform. Goal: the ALB URL returns 200 from `/api/health`.

| # | Title | Size | Status | Completed |
|---|-------|------|--------|-----------|
| #23 | Terraform project + VPC + IAM roles | M | Done | Mar 28 |
| #24 | S3 bucket + DynamoDB tables | S | Done | Mar 28 |
| #25 | ElastiCache Redis + ECR repository | S | Done | Mar 28 |
| #26 | ECS Fargate + ALB (API service) | L | Done | Mar 28 |
| #27 | ECS Fargate (Celery worker service) | M | Done | Mar 28 |
| #28 | Lambda function for S7 video generation | M | Done | Mar 28 |

> M2 status: 6/6 issues closed. All infrastructure deployed.
### M3: Distributed Pipeline — Both (Apr 4 - Apr 8)

Wire up the API, Celery workers, orchestrator, and distributed-systems features.

| # | Title | Owner | Size | Status |
|---|-------|-------|------|--------|
| #29 | FastAPI routes + infra clients (Redis, S3, DynamoDB) | Both | L | Planned |
| #30 | Celery tasks + orchestrator state machine | Jess | L | Planned |
| #31 | Rate limiter + SETNX cache | Jess | M | Planned |
| #32 | SSE streaming + checkpoint recovery | Both | M | Planned |
| #33 | Multi-user validation (3 concurrent runs) | Both | M | Planned |
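The SETNX cache in #31 relies on Redis's set-if-not-exists semantics: whichever caller wins the SETNX does the expensive work once, and everyone else waits for the cached value. A toy sketch of that single-flight pattern, using an in-memory stub in place of Redis (with redis-py the equivalent call is `client.set(key, value, nx=True, ex=ttl)`):

```python
import time


class FakeRedis:
    """Minimal in-memory stand-in for the two Redis calls this sketch uses."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, nx=False, ex=None):
        if nx and key in self._data:
            return None          # mirrors redis-py: None when NX fails
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)


def cached_compute(client, key, compute):
    """Single-flight cache fill: only the caller that wins SETNX computes."""
    placeholder = "__pending__"
    if client.set(key, placeholder, nx=True, ex=300):
        value = compute()        # we won the lock: do the expensive work
        client.set(key, value)
        return value
    value = client.get(key)      # someone else owns the key
    while value == placeholder:  # naive poll until the winner finishes
        time.sleep(0.01)
        value = client.get(key)
    return value


r = FakeRedis()
print(cached_compute(r, "trends:fitness", lambda: "computed"))    # fills cache
print(cached_compute(r, "trends:fitness", lambda: "recomputed"))  # cache hit
```

Experiment 3 (#43) would stress exactly this path: concurrent users racing on the same key should produce one computation, not several.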
### M4: Frontend + Feedback Loop — Sam (Apr 8 - Apr 11)

Astro frontend with React islands for real-time pipeline visualization.

| # | Title | Size | Status |
|---|-------|------|--------|
| #34 | Astro project scaffold + Cloudflare Pages deploy | S | Planned |
| #35 | Create page (input form + model selection) | M | Planned |
| #36 | Pipeline visualizer (SSE-connected React island) | L | Planned |
| #37 | Voting animation (100-avatar React island) | L | Planned |
| #38 | Results page + video player | M | Planned |
| #39 | Performance tracking page + feedback API | M | Planned |
| #40 | Insights dashboard + runs page | M | Planned |
### M5: Experiments + Write-up — Both (Apr 11 - Apr 15)

Run three distributed-systems experiments and collect data for the course write-up.

| # | Title | Owner | Size | Status |
|---|-------|-------|------|--------|
| #41 | Experiment 1: Multi-tenant backpressure | Both | L | Planned |
| #42 | Experiment 2: Failure recovery + run isolation | Both | L | Planned |
| #43 | Experiment 3: Cross-user cache concurrency | Both | L | Planned |
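Experiment 1 (#41) measures backpressure: what happens when producers submit work faster than workers drain it. The core mechanism can be illustrated with a bounded queue that rejects submissions instead of buffering without limit (all names here are illustrative, not the project's actual API):

```python
import queue

# Tiny buffer on purpose: forces backpressure under a burst of submissions.
jobs: "queue.Queue[int]" = queue.Queue(maxsize=2)
rejected = 0


def submit(job: int) -> bool:
    """Non-blocking submit: report backpressure instead of queueing forever."""
    global rejected
    try:
        jobs.put_nowait(job)
        return True
    except queue.Full:
        rejected += 1
        return False


accepted = [submit(i) for i in range(5)]  # burst of 5 into a buffer of 2
print(accepted)   # [True, True, False, False, False]
print(rejected)   # 3
```

In the real system the bound would sit on the Celery broker or a per-tenant Redis counter; the experiment records how rejection rates and latency change as concurrent runs increase.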
### Cross-cutting

| # | Title | Owner | Size | Status | Completed |
|---|-------|-------|------|--------|-----------|
| #44 | CI/CD pipeline (GitHub Actions) | Jess | S | Done | Mar 28 |
| #50-55 | Pipeline quality polish (prompts, models, personas) | Sam | S-M | Open | -- |

---

## Who Is Doing What

| Team Member | Issues | Scope |
|-------------|--------|-------|
| **Sam** | 14 | M1 pipeline stages (#16-#22, #60), M4 frontend (#34-#40) |
| **Jess** | 9 | M2 AWS infrastructure (#23-#28), M3 orchestrator (#30, #31), CI/CD (#44) |
| **Both** | 6 | M3 integration (#29, #32, #33), M5 experiments (#41, #42, #43) |

**Total tracked issues:** 29, not counting the suspended #22 (+ 6 polish issues)

**Size breakdown:** 6 Small (half day), 15 Medium (1 day), 8 Large (2-3 days)

---
## Critical Path

The longest dependency chain determines the project deadline:

```
#23 --> #25 --> #26 --> #29 --> #30 --> #31 --> #33 --> #41/#42/#43
VPC     Redis   ECS     API     Celery  Rate    Multi-  Experiments
                                        limiter user
```

If any issue on this chain slips, the experiments get compressed. Everything else has float.

Key dependency relationships:
- M3 cannot start until M2 (AWS) is deployed and M1 (pipeline stages) is complete
- M4 frontend scaffold (#34) has no backend dependency — can start early
- M5 experiments require the full distributed system (M3) to be operational
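The chain above can be derived mechanically as the longest path through the issue-dependency DAG. A small sketch, weighting each issue by an assumed day count per size (S=0.5, M=1, L=2.5, matching the size breakdown above; the edge list is illustrative):

```python
from functools import lru_cache

DURATION = {"S": 0.5, "M": 1.0, "L": 2.5}

# issue -> (size, issues it depends on); edges follow the chain above
GRAPH = {
    "#23": ("M", []),
    "#25": ("S", ["#23"]),
    "#26": ("L", ["#25"]),
    "#29": ("L", ["#26"]),
    "#30": ("L", ["#29"]),
    "#31": ("M", ["#30"]),
    "#33": ("M", ["#31"]),
    "#41": ("L", ["#33"]),
}


@lru_cache(maxsize=None)
def finish_time(issue: str) -> float:
    """Earliest finish: own duration plus the slowest prerequisite's finish."""
    size, deps = GRAPH[issue]
    return DURATION[size] + max((finish_time(d) for d in deps), default=0.0)


print(finish_time("#41"))  # total working days along the critical path
```

Running this against the full issue graph would also surface the float on off-path issues: any issue whose finish time is under the critical-path total can slip by the difference without moving the deadline.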
---
## AI Usage in Development

### How AI tools were used

| Area | Tool | What it did |
|------|------|-------------|
| **Architecture design** | Claude Code | Designed 7-stage MapReduce pipeline, wrote architecture spec, created data models |
| **Spec writing** | Claude Code | Drafted all 29 GitHub issues with acceptance criteria, dependency graphs, size estimates |
| **Code generation** | Claude Code (Shannon) | Implemented pipeline stages, provider interfaces, Pydantic models, CLI runner |
| **Code review** | Claude Code | Automated review on every PR; catches type errors, missing edge cases, style issues |
| **CI/CD** | GitHub Actions | Automated lint (ruff) + test (pytest) on every push |
| **Research** | Claude Code | Analyzed viral-content psychology for persona design, evaluated dataset options (TikTok-10M vs Gopher-Lab transcripts) |
| **Debugging** | Claude Code | Diagnosed intermittent Gemini API 500s, implemented retry logic with exponential backoff |
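The retry logic mentioned in the debugging row follows the standard exponential-backoff-with-jitter pattern. A sketch under assumed names (the error type and flaky call stand in for the real Gemini client wrapper):

```python
import random
import time


class TransientAPIError(Exception):
    """Stand-in for a retryable 5xx from the provider."""


def with_backoff(fn, *, retries=4, base=0.5, cap=8.0):
    """Call fn(), retrying on TransientAPIError with capped, jittered backoff."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except TransientAPIError:
            if attempt == retries:
                raise                       # out of retries: surface the error
            delay = min(cap, base * 2 ** attempt)
            time.sleep(delay * random.uniform(0.5, 1.0))  # jitter


calls = {"n": 0}


def flaky():
    """Simulates an endpoint that 500s twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientAPIError("HTTP 500")
    return "ok"


print(with_backoff(flaky))  # succeeds on the third attempt
```

The jitter matters once Celery workers retry concurrently: without it, failed requests re-fire in lockstep and can re-trigger the same overload.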
### Cost-benefit assessment

**Time saved:**
- Architecture + spec phase completed in ~2 days instead of the estimated 5
- Boilerplate code generation (models, interfaces, tests) saved ~1 day per milestone
- Automated PR review catches issues before human review, reducing review cycles

**Review overhead:**
- Every AI-generated PR requires human review, adding ~30 min per PR
- AI occasionally over-engineers solutions, which need simplifying before merge
- Prompt iteration for pipeline stages took 3-4 rounds to get output quality right

**Net assessment:** AI accelerated the project by roughly 40%, primarily in the spec, scaffolding, and review phases. Implementation still requires significant human judgment for prompt engineering and integration decisions.

---
## Current Status Summary (as of March 28, 2026)

| Milestone | Progress | Key Metrics |
|-----------|----------|-------------|
| **M1: MVP Pipeline** | 88% (7/8 closed, 1 suspended) | All 6 stages implemented + tested, Kimi provider integrated, first real pipeline run complete |
| **M2: AWS Infra** | 100% (6/6 closed) | Full Terraform stack: VPC, ECS, ALB, Redis, S3, DynamoDB, Lambda |
| **M3: Distributed** | 0% (0/5) | Not started — on schedule per timeline (starts Apr 4) |
| **M4: Frontend** | 0% (0/7) | Not started — on schedule per timeline (starts Apr 8) |
| **M5: Experiments** | 0% (0/3) | Not started — on schedule per timeline (starts Apr 11) |
| **CI/CD** | Done | GitHub Actions pipeline active |

**Overall numbers:**
- **Issues closed:** 14 of 29 (48%)
- **Test suite:** 45 tests passing
- **CI/CD:** Active on all PRs (ruff lint + pytest)
- **AI providers:** 2 integrated (Kimi for reasoning, Gemini for video generation)
- **Pipeline stages:** 6 of 7 implemented (S1 analyze, S2 aggregate, S3 generate, S4 vote, S5 rank, S6 personalize)

**Risk assessment:** M1 and M2 completed ahead of schedule. The critical path through M3 has adequate buffer. The primary risk is M3-to-M5 integration complexity: the distributed-systems experiments depend on a fully operational multi-user pipeline.
---
```python
"""Generate charts from pipeline test run data for course report."""

from pathlib import Path

import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np

# Output directory
OUT_DIR = Path(__file__).resolve().parent.parent / "docs" / "charts"
OUT_DIR.mkdir(parents=True, exist_ok=True)

# Consistent style
COLORS = {
    "input": "#4A90D9",
    "output": "#E8734A",
    "completed": "#4CAF50",
    "remaining": "#D0D0D0",
    "timing": "#5B7FA5",
}
plt.rcParams.update({
    "font.family": "sans-serif",
    "font.size": 11,
    "axes.spines.top": False,
    "axes.spines.right": False,
    "figure.facecolor": "white",
    "axes.facecolor": "white",
    "savefig.facecolor": "white",
})


def chart_token_usage() -> None:
    """Chart 1: Token usage by pipeline stage (grouped bar)."""
    stages = ["S1\nDiscover", "S3\nStudio Brief", "S4\nVideo Brief", "S6\nEvaluate"]
    input_tokens = [1193, 259, 1731, 673]
    output_tokens = [3945, 1131, 4691, 5347]
    requests = [2, 1, 3, 2]

    x = np.arange(len(stages))
    width = 0.32

    fig, ax = plt.subplots(figsize=(9, 5))
    bars_in = ax.bar(x - width / 2, input_tokens, width, label="Input tokens",
                     color=COLORS["input"], edgecolor="white", linewidth=0.5)
    bars_out = ax.bar(x + width / 2, output_tokens, width, label="Output tokens",
                      color=COLORS["output"], edgecolor="white", linewidth=0.5)

    # Annotate bars with token counts
    for i, (bi, bo) in enumerate(zip(bars_in, bars_out)):
        ax.text(bi.get_x() + bi.get_width() / 2, bi.get_height() + 80,
                f"{input_tokens[i]:,}", ha="center", va="bottom", fontsize=8, color="#555")
        ax.text(bo.get_x() + bo.get_width() / 2, bo.get_height() + 80,
                f"{output_tokens[i]:,}", ha="center", va="bottom", fontsize=8, color="#555")
        # Request count label below stage name
        ax.text(x[i], -900, f"({requests[i]} req)", ha="center", fontsize=8, color="#888")

    ax.set_ylabel("Tokens")
    ax.set_title("LLM Token Usage by Pipeline Stage", fontsize=14, fontweight="bold", pad=15)
    ax.set_xticks(x)
    ax.set_xticklabels(stages, fontsize=10)
    ax.yaxis.set_major_formatter(ticker.FuncFormatter(lambda v, _: f"{int(v):,}"))
    ax.legend(frameon=False, loc="upper left")
    ax.set_ylim(-1000, 6500)

    fig.tight_layout()
    fig.savefig(OUT_DIR / "token_usage_by_stage.png", dpi=300, bbox_inches="tight")
    plt.close(fig)
    print(f"Saved: {OUT_DIR / 'token_usage_by_stage.png'}")


def chart_pipeline_timing() -> None:
    """Chart 2: Pipeline stage duration (horizontal bar)."""
    stages = ["S1 — Discover", "S2 — Curate", "S3 — Studio Brief",
              "S4 — Video Brief", "S5 — Assemble", "S6 — Evaluate"]
    durations = [78, 0.5, 75, 120, 0.5, 146]

    fig, ax = plt.subplots(figsize=(9, 4.5))
    y = np.arange(len(stages))
    bars = ax.barh(y, durations, height=0.55, color=COLORS["timing"],
                   edgecolor="white", linewidth=0.5)

    # Annotate each bar
    for bar, d in zip(bars, durations):
        label = f"{d:.0f}s" if d >= 1 else "<1s"
        x_pos = bar.get_width() + 3
        ax.text(x_pos, bar.get_y() + bar.get_height() / 2, label,
                va="center", fontsize=10, fontweight="bold", color="#333")

    ax.set_yticks(y)
    ax.set_yticklabels(stages, fontsize=10)
    ax.invert_yaxis()
    ax.set_xlabel("Duration (seconds)")
    ax.set_title("Pipeline Stage Duration (2-video test run)", fontsize=14,
                 fontweight="bold", pad=15)
    ax.set_xlim(0, 175)

    # Total annotation
    total = sum(durations)
    ax.text(0.98, 0.02, f"Total: {total:.0f}s (~{total/60:.1f} min)",
            transform=ax.transAxes, ha="right", va="bottom",
            fontsize=10, color="#666", fontstyle="italic")

    fig.tight_layout()
    fig.savefig(OUT_DIR / "pipeline_timing.png", dpi=300, bbox_inches="tight")
    plt.close(fig)
    print(f"Saved: {OUT_DIR / 'pipeline_timing.png'}")


def chart_milestone_progress() -> None:
    """Chart 3: Project milestone progress (horizontal stacked bar)."""
    milestones = [
        "M1 — MVP Pipeline",
        "M2 — AWS Infra",
        "M3 — Distributed",
        "M4 — Frontend",
        "M5 — Experiments",
    ]
    # Issue counts per milestone; must stay in sync with the plan tables above
    closed = [7, 6, 0, 0, 0]
    total = [8, 6, 5, 7, 3]
    remaining = [t - c for t, c in zip(total, closed)]
    pcts = [c / t * 100 for c, t in zip(closed, total)]

    fig, ax = plt.subplots(figsize=(9, 4))
    y = np.arange(len(milestones))
    height = 0.5

    ax.barh(y, closed, height=height, color=COLORS["completed"],
            edgecolor="white", linewidth=0.5, label="Closed")
    ax.barh(y, remaining, height=height, left=closed, color=COLORS["remaining"],
            edgecolor="white", linewidth=0.5, label="Remaining")

    # Annotate with fraction and percentage
    for i in range(len(milestones)):
        label = f"{closed[i]}/{total[i]} ({pcts[i]:.0f}%)"
        ax.text(total[i] + 0.2, y[i], label, va="center", fontsize=10, color="#333")

    ax.set_yticks(y)
    ax.set_yticklabels(milestones, fontsize=10)
    ax.invert_yaxis()
    ax.set_xlabel("Issues")
    ax.set_title("Milestone Progress \u2014 March 28, 2026", fontsize=14,
                 fontweight="bold", pad=15)
    ax.set_xlim(0, 10)
    ax.xaxis.set_major_locator(ticker.MaxNLocator(integer=True))
    ax.legend(frameon=False, loc="lower right")

    fig.tight_layout()
    fig.savefig(OUT_DIR / "milestone_progress.png", dpi=300, bbox_inches="tight")
    plt.close(fig)
    print(f"Saved: {OUT_DIR / 'milestone_progress.png'}")


if __name__ == "__main__":
    chart_token_usage()
    chart_pipeline_timing()
    chart_milestone_progress()
    print("\nAll charts generated.")
```
> **Review comment:** This summary says "M1 status: 7/8 issues closed", but later in the same document the status table lists M1 as "86% (6/7 closed, 1 suspended)" and the M1 task table above includes 8 items with 7 marked Done and 1 Suspended. Please reconcile these counts (and ensure they align with the milestone progress chart data) so the plan is internally consistent.