Binary file added docs/charts/milestone_progress.png
Binary file added docs/charts/pipeline_timing.png
Binary file added docs/charts/token_usage_by_stage.png
288 changes: 288 additions & 0 deletions docs/homework/milestone-1-report.md


190 changes: 190 additions & 0 deletions docs/homework/project-plan.md
# AI Campaign Studio (Flair2) — Project Plan

> Multi-stage AI pipeline that generates social media marketing campaigns.
> Two-person team: Sam (pipeline + frontend) and Jess (infrastructure + distributed systems).

---

## Timeline

```
Mar 25 Mar 28 Apr 4 Apr 8 Apr 11 Apr 15
| | | | | |
|--- M1 -----| | | | |
| MVP Pipeline (Sam) | | | |
| |--- M2 -----| | | |
| | AWS Infra (Jess) | | |
| | |--- M3 -----| | |
| | | Distributed (Both) | |
| | | |--- M4 -----| |
| | | | Frontend (Sam) |
| | | | |--- M5 -----|
| | | | | Experiments (Both)
v v v v v v
Start MVP done AWS deployed Pipeline Frontend + Experiments
posting locally + reachable on AWS feedback + write-up
```

---

## Task Breakdown by Milestone

### M1: MVP Pipeline — Sam (Mar 25 - Mar 28)

Get the first six stages of the 7-stage pipeline running locally: analyze trending content, generate scripts, simulate audience voting, personalize output. (S7 video generation is tracked in #22.)
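
A minimal sketch of one MapReduce cycle, with a local stand-in for the map step (function names are illustrative; in the real pipeline the map step is an LLM call per post):

```python
from concurrent.futures import ThreadPoolExecutor

def analyze(post: str) -> dict:
    # Map step (S1-style): one analysis per post, fanned out in parallel.
    # Stand-in scoring; the real pipeline calls an LLM provider here.
    return {"post": post, "score": len(post)}

def aggregate(results: list[dict]) -> dict:
    # Reduce step (S2-style): collapse per-post results into one summary.
    return {"count": len(results),
            "top": max(results, key=lambda r: r["score"])["post"]}

def mapreduce_cycle(posts: list[str]) -> dict:
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(analyze, posts))
    return aggregate(results)
```

Cycle 2 (#19) has the same shape: generate and vote fan out per candidate, and rank is the reduce.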

| # | Title | Size | Status | Completed |
|---|-------|------|--------|-----------|
| #16 | Project scaffolding + Pydantic models | S | Done | Mar 27 |
| #17 | Provider interface + Gemini implementation | S | Done | Mar 27 |
| #18 | S1 analyze + S2 aggregate (MapReduce Cycle 1) | M | Done | Mar 27 |
| #19 | S3 generate + S4 vote + S5 rank (MapReduce Cycle 2) | M | Done | Mar 27 |
| #20 | S6 personalize + local runner + CLI | M | Done | Mar 27 |
| #21 | Download dataset + first real pipeline run | M | Done | Mar 27 |
| #22 | Generate + post first video | M | Suspended | -- |
| #60 | Add Kimi (Moonshot) as default reasoning provider | M | Done | Mar 28 |

> M1 status: 7/8 issues closed. #22 suspended (video generation depends on external API availability).
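
The provider abstraction behind #17 and #60 can be sketched as an abstract base class. All names below are illustrative, not the repo's actual API:

```python
from abc import ABC, abstractmethod

class Provider(ABC):
    """Minimal LLM provider interface (hypothetical sketch)."""

    @abstractmethod
    def generate(self, prompt: str) -> str:
        """Return the model's completion for `prompt`."""

class EchoProvider(Provider):
    # Stand-in for testing; a Kimi or Gemini implementation would make
    # the vendor API call here instead.
    def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"

def run_stage(provider: Provider, prompt: str) -> str:
    # Stages depend only on the interface, so reasoning and video
    # providers can be swapped without touching stage code.
    return provider.generate(prompt)
```

Under this design, making Kimi the default reasoning provider (#60) only changes which concrete class is constructed.
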

### M2: AWS Infrastructure — Jess (Mar 28 - Apr 4)

Deploy all AWS services via Terraform. Goal: ALB URL returns 200 from /api/health.

| # | Title | Size | Status | Completed |
|---|-------|------|--------|-----------|
| #23 | Terraform project + VPC + IAM roles | M | Done | Mar 28 |
| #24 | S3 bucket + DynamoDB tables | S | Done | Mar 28 |
| #25 | ElastiCache Redis + ECR repository | S | Done | Mar 28 |
| #26 | ECS Fargate + ALB (API service) | L | Done | Mar 28 |
| #27 | ECS Fargate (Celery worker service) | M | Done | Mar 28 |
| #28 | Lambda function for S7 video generation | M | Done | Mar 28 |

> M2 status: 6/6 issues closed. All infrastructure deployed.

### M3: Distributed Pipeline — Both (Apr 4 - Apr 8)

Wire up API, Celery workers, orchestrator, and distributed systems features.

| # | Title | Owner | Size | Status |
|---|-------|-------|------|--------|
| #29 | FastAPI routes + infra clients (Redis, S3, DynamoDB) | Both | L | Planned |
| #30 | Celery tasks + orchestrator state machine | Jess | L | Planned |
| #31 | Rate limiter + SETNX cache | Jess | M | Planned |
| #32 | SSE streaming + checkpoint recovery | Both | M | Planned |
| #33 | Multi-user validation (3 concurrent runs) | Both | M | Planned |
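
The SETNX cache in #31 can be sketched with a dict-backed stand-in for Redis (in redis-py, `r.set(key, value, nx=True)` issues `SET ... NX` and returns `None` when the key already exists). `FakeRedis` and `cached_call` are hypothetical names, not the repo's API:

```python
import json

class FakeRedis:
    """In-memory stand-in exposing the one call the pattern needs."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, nx=False):
        if nx and key in self._data:
            return None  # NX failed: key already set
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)

def cached_call(r, key, compute):
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    value = compute()
    # SETNX: only the first writer's result is stored, so concurrent
    # users observe one consistent cached value (what experiment #43
    # is designed to stress).
    if not r.set(key, json.dumps(value), nx=True):
        return json.loads(r.get(key))
    return value
```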

### M4: Frontend + Feedback Loop — Sam (Apr 8 - Apr 11)

Astro frontend with React islands for real-time pipeline visualization.

| # | Title | Size | Status |
|---|-------|------|--------|
| #34 | Astro project scaffold + Cloudflare Pages deploy | S | Planned |
| #35 | Create page (input form + model selection) | M | Planned |
| #36 | Pipeline visualizer (SSE-connected React island) | L | Planned |
| #37 | Voting animation (100-avatar React island) | L | Planned |
| #38 | Results page + video player | M | Planned |
| #39 | Performance tracking page + feedback API | M | Planned |
| #40 | Insights dashboard + runs page | M | Planned |

### M5: Experiments + Write-up — Both (Apr 11 - Apr 15)

Run three distributed systems experiments and collect data for the course write-up.

| # | Title | Owner | Size | Status |
|---|-------|-------|------|--------|
| #41 | Experiment 1: Multi-tenant backpressure | Both | L | Planned |
| #42 | Experiment 2: Failure recovery + run isolation | Both | L | Planned |
| #43 | Experiment 3: Cross-user cache concurrency | Both | L | Planned |

### Cross-cutting

| # | Title | Owner | Size | Status | Completed |
|---|-------|-------|------|--------|-----------|
| #44 | CI/CD pipeline (GitHub Actions) | Jess | S | Done | Mar 28 |
| #50-55 | Pipeline quality polish (prompts, models, personas) | Sam | S-M | Open | -- |

---

## Who Is Doing What

| Team Member | Issues | Scope |
|-------------|--------|-------|
| **Sam** | 14 | M1 pipeline stages (#16-#22, #60), M4 frontend (#34-#40) |
| **Jess** | 9 | M2 AWS infrastructure (#23-#28), M3 orchestrator (#30, #31), CI/CD (#44) |
| **Both** | 6 | M3 integration (#29, #32, #33), M5 experiments (#41, #42, #43) |

**Total tracked issues:** 29 active (+ 6 polish issues); suspended #22 is excluded from these counts

**Size breakdown:** 6 Small (half day), 15 Medium (1 day), 8 Large (2-3 days)

---

## Critical Path

The longest dependency chain determines the project deadline:

```
#23 --> #25 --> #26 --> #29 --> #30 --> #31 --> #33 --> #41/#42/#43
VPC Redis ECS API Celery Rate Multi Experiments
limit user
```

If any issue on this chain slips, experiments get compressed. Everything else has float.

Key dependency relationships:
- M3 cannot start until M2 (AWS) is deployed and M1 (pipeline stages) is complete
- M4 frontend scaffold (#34) has no backend dependency — can start early
- M5 experiments require the full distributed system (M3) to be operational

---

## AI Usage in Development

### How AI tools were used

| Area | Tool | What it did |
|------|------|-------------|
| **Architecture design** | Claude Code | Designed 7-stage MapReduce pipeline, wrote architecture spec, created data models |
| **Spec writing** | Claude Code | Drafted all 29 GitHub issues with acceptance criteria, dependency graphs, size estimates |
| **Code generation** | Claude Code (Shannon) | Implemented pipeline stages, provider interfaces, Pydantic models, CLI runner |
| **Code review** | Claude Code | Automated review on every PR; catches type errors, missing edge cases, style issues |
| **CI/CD** | GitHub Actions | Automated lint (ruff) + test (pytest) on every push |
| **Research** | Claude Code | Analyzed viral content psychology for persona design, evaluated dataset options (TikTok-10M vs Gopher-Lab transcripts) |
| **Debugging** | Claude Code | Diagnosed Gemini API intermittent 500s, implemented retry logic with exponential backoff |
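
The retry logic mentioned in the debugging row follows a standard pattern. This is a generic sketch, not the repo's implementation; `sleep` is injectable so the delays are observable in tests:

```python
import time

def with_retries(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Retry `call` on exception, sleeping 1s, 2s, 4s, ... between tries."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)
```

In production you would usually also add jitter so concurrent workers don't retry in lockstep against an already-struggling API.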

### Cost-benefit assessment

**Time saved:**
- Architecture + spec phase completed in ~2 days instead of estimated 5
- Boilerplate code generation (models, interfaces, tests) saved ~1 day per milestone
- Automated PR review catches issues before human review, reducing review cycles

**Review overhead:**
- Every AI-generated PR requires human review — adds ~30 min per PR
- AI occasionally over-engineers solutions — need to simplify before merging
- Prompt iteration for pipeline stages required 3-4 rounds to get output quality right

**Net assessment:** AI accelerated the project by roughly 40%, primarily in the spec/scaffolding/review phases. Implementation still requires significant human judgment for prompt engineering and integration decisions.

---

## Current Status Summary (as of March 28, 2026)

| Milestone | Progress | Key Metrics |
|-----------|----------|-------------|
| **M1: MVP Pipeline** | 88% (7/8 closed, 1 suspended) | All 6 stages implemented + tested, Kimi provider integrated, first real pipeline run complete |
| **M2: AWS Infra** | 100% (6/6 closed) | Full Terraform stack: VPC, ECS, ALB, Redis, S3, DynamoDB, Lambda |
| **M3: Distributed** | 0% (0/5) | Not started — on schedule per timeline (starts Apr 4) |
| **M4: Frontend** | 0% (0/7) | Not started — on schedule per timeline (starts Apr 8) |
| **M5: Experiments** | 0% (0/3) | Not started — on schedule per timeline (starts Apr 11) |
| **CI/CD** | Done | GitHub Actions pipeline active |

**Overall numbers:**
- **Issues closed:** 14 of 29 (48%)
- **Test suite:** 45 tests passing
- **CI/CD:** Active on all PRs (ruff lint + pytest)
- **AI providers:** 2 integrated (Kimi for reasoning, Gemini for video generation)
- **Pipeline stages:** 6 of 7 implemented (S1 analyze, S2 aggregate, S3 generate, S4 vote, S5 rank, S6 personalize)

**Risk assessment:** M1 and M2 completed ahead of schedule. The critical path through M3 has adequate buffer. Primary risk is M3-to-M5 integration complexity — the distributed systems experiments depend on a fully operational multi-user pipeline.
156 changes: 156 additions & 0 deletions scripts/generate_charts.py
"""Generate charts from pipeline test run data for course report."""

import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np
from pathlib import Path

# Output directory
OUT_DIR = Path(__file__).resolve().parent.parent / "docs" / "charts"
OUT_DIR.mkdir(parents=True, exist_ok=True)

# Consistent style
COLORS = {
    "input": "#4A90D9",
    "output": "#E8734A",
    "completed": "#4CAF50",
    "remaining": "#D0D0D0",
    "timing": "#5B7FA5",
}
plt.rcParams.update({
    "font.family": "sans-serif",
    "font.size": 11,
    "axes.spines.top": False,
    "axes.spines.right": False,
    "figure.facecolor": "white",
    "axes.facecolor": "white",
    "savefig.facecolor": "white",
})


def chart_token_usage() -> None:
    """Chart 1: Token usage by pipeline stage (grouped bar)."""
    stages = ["S1\nDiscover", "S3\nStudio Brief", "S4\nVideo Brief", "S6\nEvaluate"]
    input_tokens = [1193, 259, 1731, 673]
    output_tokens = [3945, 1131, 4691, 5347]
    requests = [2, 1, 3, 2]

    x = np.arange(len(stages))
    width = 0.32

    fig, ax = plt.subplots(figsize=(9, 5))
    bars_in = ax.bar(x - width / 2, input_tokens, width, label="Input tokens",
                     color=COLORS["input"], edgecolor="white", linewidth=0.5)
    bars_out = ax.bar(x + width / 2, output_tokens, width, label="Output tokens",
                      color=COLORS["output"], edgecolor="white", linewidth=0.5)

    # Annotate bars with token counts and request counts
    for i, (bi, bo) in enumerate(zip(bars_in, bars_out)):
        ax.text(bi.get_x() + bi.get_width() / 2, bi.get_height() + 80,
                f"{input_tokens[i]:,}", ha="center", va="bottom", fontsize=8, color="#555")
        ax.text(bo.get_x() + bo.get_width() / 2, bo.get_height() + 80,
                f"{output_tokens[i]:,}", ha="center", va="bottom", fontsize=8, color="#555")
        # Request count label below stage name
        ax.text(x[i], -900, f"({requests[i]} req)", ha="center", fontsize=8, color="#888")

    ax.set_ylabel("Tokens")
    ax.set_title("LLM Token Usage by Pipeline Stage", fontsize=14, fontweight="bold", pad=15)
    ax.set_xticks(x)
    ax.set_xticklabels(stages, fontsize=10)
    ax.yaxis.set_major_formatter(ticker.FuncFormatter(lambda v, _: f"{int(v):,}"))
    ax.legend(frameon=False, loc="upper left")
    ax.set_ylim(-1000, 6500)

    fig.tight_layout()
    fig.savefig(OUT_DIR / "token_usage_by_stage.png", dpi=300, bbox_inches="tight")
    plt.close(fig)
    print(f"Saved: {OUT_DIR / 'token_usage_by_stage.png'}")


def chart_pipeline_timing() -> None:
    """Chart 2: Pipeline stage duration (horizontal bar)."""
    stages = ["S1 — Discover", "S2 — Curate", "S3 — Studio Brief",
              "S4 — Video Brief", "S5 — Assemble", "S6 — Evaluate"]
    durations = [78, 0.5, 75, 120, 0.5, 146]

    fig, ax = plt.subplots(figsize=(9, 4.5))
    y = np.arange(len(stages))
    bars = ax.barh(y, durations, height=0.55, color=COLORS["timing"],
                   edgecolor="white", linewidth=0.5)

    # Annotate each bar with its duration
    for bar, d in zip(bars, durations):
        label = f"{d:.0f}s" if d >= 1 else "<1s"
        ax.text(bar.get_width() + 3, bar.get_y() + bar.get_height() / 2, label,
                va="center", fontsize=10, fontweight="bold", color="#333")

    ax.set_yticks(y)
    ax.set_yticklabels(stages, fontsize=10)
    ax.invert_yaxis()
    ax.set_xlabel("Duration (seconds)")
    ax.set_title("Pipeline Stage Duration (2-video test run)", fontsize=14,
                 fontweight="bold", pad=15)
    ax.set_xlim(0, 175)

    # Total annotation
    total = sum(durations)
    ax.text(0.98, 0.02, f"Total: {total:.0f}s (~{total/60:.1f} min)",
            transform=ax.transAxes, ha="right", va="bottom",
            fontsize=10, color="#666", fontstyle="italic")

    fig.tight_layout()
    fig.savefig(OUT_DIR / "pipeline_timing.png", dpi=300, bbox_inches="tight")
    plt.close(fig)
    print(f"Saved: {OUT_DIR / 'pipeline_timing.png'}")


def chart_milestone_progress() -> None:
    """Chart 3: Project milestone progress (horizontal stacked bar)."""
    milestones = [
        "M1 — MVP Pipeline",
        "M2 — AWS Infra",
        "M3 — Distributed",
        "M4 — Frontend",
        "M5 — Experiments",
    ]
    # Counts mirror the milestone tables in docs/homework/project-plan.md
    closed = [7, 6, 0, 0, 0]
    total = [8, 6, 5, 7, 3]
    remaining = [t - c for t, c in zip(total, closed)]
    pcts = [c / t * 100 for c, t in zip(closed, total)]

    fig, ax = plt.subplots(figsize=(9, 4))
    y = np.arange(len(milestones))
    height = 0.5

    ax.barh(y, closed, height=height, color=COLORS["completed"],
            edgecolor="white", linewidth=0.5, label="Closed")
    ax.barh(y, remaining, height=height, left=closed, color=COLORS["remaining"],
            edgecolor="white", linewidth=0.5, label="Remaining")

    # Annotate with fraction and percentage
    for i in range(len(milestones)):
        label = f"{closed[i]}/{total[i]} ({pcts[i]:.0f}%)"
        ax.text(total[i] + 0.2, y[i], label, va="center", fontsize=10, color="#333")

    ax.set_yticks(y)
    ax.set_yticklabels(milestones, fontsize=10)
    ax.invert_yaxis()
    ax.set_xlabel("Issues")
    ax.set_title("Milestone Progress \u2014 March 28, 2026", fontsize=14,
                 fontweight="bold", pad=15)
    ax.set_xlim(0, 10)
    ax.xaxis.set_major_locator(ticker.MaxNLocator(integer=True))
    ax.legend(frameon=False, loc="lower right")

    fig.tight_layout()
    fig.savefig(OUT_DIR / "milestone_progress.png", dpi=300, bbox_inches="tight")
    plt.close(fig)
    print(f"Saved: {OUT_DIR / 'milestone_progress.png'}")


if __name__ == "__main__":
    chart_token_usage()
    chart_pipeline_timing()
    chart_milestone_progress()
    print("\nAll charts generated.")