Merged
115 changes: 0 additions & 115 deletions e2e/README.md

This file was deleted.

35 changes: 35 additions & 0 deletions public/educational/extracted/comments_denormalization_tooltip.md
@@ -0,0 +1,35 @@
# Denormalization: Same Data, Different Partition Key

Comments are stored in **two tables** — same data, organized for different queries. This is the core Cassandra data modeling pattern.

## Two Tables, Two Query Patterns

```cql
-- "Show comments on this video" (partition by video)
CREATE TABLE comments (
videoid uuid,
commentid timeuuid,
comment text, userid uuid,
PRIMARY KEY (videoid, commentid)
) WITH CLUSTERING ORDER BY (commentid DESC);

-- "Show all comments by this user" (partition by user)
CREATE TABLE comments_by_user (
userid uuid,
commentid timeuuid,
comment text, videoid uuid,
PRIMARY KEY (userid, commentid)
) WITH CLUSTERING ORDER BY (commentid DESC);
```

**Why duplicate the data?**
- Cassandra has **no JOINs** — you can't query `comments` by `userid` efficiently
- Each table is optimized for exactly one access pattern
- Both use `commentid DESC` for newest-first ordering

## The Trade-off

- **Write cost**: Every comment is written twice (once to each table)
- **Read benefit**: Both queries are single-partition reads — the fastest operation Cassandra offers

**In KillrVideo**: Posting a comment writes to both tables. The video page reads from `comments`, while a user's profile page reads from `comments_by_user`.
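
A minimal sketch of that dual write, assuming a Python backend. The function name `comment_writes` and the statement constants are hypothetical, and the `%s` placeholders follow the style the DataStax Python driver accepts for unprepared statements. The point it illustrates is that both rows must share one `commentid`:

```python
import uuid
from typing import List, Tuple

# Hypothetical statement constants; column names come from the two tables above.
INSERT_BY_VIDEO = (
    "INSERT INTO comments (videoid, commentid, comment, userid) "
    "VALUES (%s, %s, %s, %s)"
)
INSERT_BY_USER = (
    "INSERT INTO comments_by_user (userid, commentid, comment, videoid) "
    "VALUES (%s, %s, %s, %s)"
)


def comment_writes(
    videoid: uuid.UUID, userid: uuid.UUID, comment: str
) -> List[Tuple[str, tuple]]:
    """Build the two INSERTs that store one comment in both tables."""
    # uuid1() is a version-1 (time-based) UUID, which CQL calls a timeuuid.
    commentid = uuid.uuid1()
    return [
        (INSERT_BY_VIDEO, (videoid, commentid, comment, userid)),
        (INSERT_BY_USER, (userid, commentid, comment, videoid)),
    ]
```

Generating the `commentid` once and reusing it keeps the two copies identifiable as the same comment, which matters if one of the writes has to be retried.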
28 changes: 28 additions & 0 deletions public/educational/extracted/latest_videos_timeseries_tooltip.md
@@ -0,0 +1,28 @@
# Time-Series Bucketing for Latest Videos

KillrVideo models "latest videos" as a **time-series table** in Cassandra, using date-based partitioning. Each day gets its own partition, and videos within that day are sorted by timestamp.

## The Data Model

```cql
CREATE TABLE latest_videos (
day date, -- Partition key: one partition per day
added_date timestamp, -- Clustering column: sorts within the day
videoid uuid,
name text,
preview_image_location text,
userid uuid,
PRIMARY KEY (day, added_date, videoid)
) WITH CLUSTERING ORDER BY (added_date DESC, videoid ASC);
```

**Why bucket by day?**
- **Bounded partitions**: Each partition holds one day's videos, preventing unbounded growth
- **Sorted within partition**: `added_date DESC` gives newest-first ordering for free
- **Efficient pagination**: Query today's bucket, then yesterday's, etc.
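
The bucket-by-bucket pagination can be sketched as follows. This is an illustration under assumptions, not KillrVideo's actual code: `fetch_day` is a hypothetical callable that would run a single-partition query such as `SELECT ... FROM latest_videos WHERE day = ? LIMIT ?`, and `max_days` is an assumed cap on how far back to look.

```python
from datetime import date, timedelta
from typing import Callable, Iterator, List


def day_buckets(today: date, max_days: int) -> Iterator[date]:
    """Yield partition keys newest first: today, yesterday, and so on."""
    for offset in range(max_days):
        yield today - timedelta(days=offset)


def latest_videos(
    fetch_day: Callable[[date, int], List[dict]],
    today: date,
    limit: int,
    max_days: int = 7,
) -> List[dict]:
    """Collect up to `limit` rows by reading one day partition at a time."""
    results: List[dict] = []
    for day in day_buckets(today, max_days):
        remaining = limit - len(results)
        if remaining <= 0:
            break
        # Each call is a single-partition read; rows already arrive newest first.
        results.extend(fetch_day(day, remaining)[:remaining])
    return results
```

Capping the walk with `max_days` keeps a quiet period (several days with no uploads) from turning into an unbounded series of empty partition reads.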

## Why Not Just Use the `videos` Table?

The `videos` table uses `videoid` as partition key — great for single-video lookups but terrible for "give me the latest N videos." A full table scan would be required. The `latest_videos` table is a **denormalized copy** optimized for this specific query pattern.

**In KillrVideo**: When a video is submitted, it's written to both `videos` (for lookups) and `latest_videos` (for the feed). This dual-write is a classic Cassandra data modeling pattern.
30 changes: 30 additions & 0 deletions public/educational/extracted/login_credential_lookup_tooltip.md
@@ -0,0 +1,30 @@
# Credential Lookup & Login Counters

Login is a **partition key lookup** on the `user_credentials` table — Cassandra's fastest operation.

## How Login Works in the DB

```cql
-- Step 1: Find credentials by email (O(1) partition key lookup)
SELECT * FROM user_credentials WHERE email = 'user@example.com';

-- Step 2: On failure, increment the counter
UPDATE login_attempts
SET failed_attempts = failed_attempts + 1
WHERE email = 'user@example.com';
```

## Why a Separate Counter Table?

Cassandra requires **counter columns in their own dedicated table** — you cannot mix counters with regular columns. That's why `login_attempts` exists separately from `user_credentials`:

```cql
CREATE TABLE login_attempts (
email text PRIMARY KEY,
failed_attempts counter -- Must be in a counter-only table
);
```

**Key constraint**: Counter tables can only contain the primary key columns and counter columns. No regular text, boolean, or timestamp columns allowed.

**In KillrVideo**: Login performs an O(1) lookup by email, verifies the bcrypt hash in the application layer, and tracks failed attempts in a dedicated counter table for account lockout.
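
A rough sketch of that flow in Python, under stated assumptions: `execute` is a hypothetical callable that runs a CQL statement and returns rows as dicts, and `verify_hash` stands in for whatever bcrypt check the application layer uses. Counting a failure for unknown emails as well is a choice made here for simplicity, not something the source specifies.

```python
from typing import Callable, List

FIND_CREDENTIALS = "SELECT * FROM user_credentials WHERE email = %s"
RECORD_FAILURE = (
    "UPDATE login_attempts SET failed_attempts = failed_attempts + 1 "
    "WHERE email = %s"
)


def login(
    email: str,
    password: str,
    execute: Callable[[str, tuple], List[dict]],
    verify_hash: Callable[[str, str], bool],
) -> bool:
    """Return True on success; otherwise bump the failure counter."""
    # Step 1: partition key lookup by email.
    rows = execute(FIND_CREDENTIALS, (email,))
    row = rows[0] if rows else None
    if (
        row is not None
        and not row.get("account_locked")
        and verify_hash(password, row["password"])
    ):
        return True
    # Step 2: on failure, increment the counter in its dedicated table.
    execute(RECORD_FAILURE, (email,))
    return False
```

The hash comparison stays in the application because Cassandra only stores and returns the `password` column; it does not evaluate it.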
32 changes: 32 additions & 0 deletions public/educational/extracted/playback_stats_counters_tooltip.md
@@ -0,0 +1,32 @@
# Video Playback Stats: Multi-Counter Table

Video statistics use a **dedicated counter table** with four independent counters — each atomically incremented across distributed nodes.

## The Data Model

```cql
CREATE TABLE video_playback_stats (
videoid uuid PRIMARY KEY,
views counter, -- Total view count
total_play_time counter, -- Total seconds watched
complete_views counter, -- Full watch-throughs
unique_viewers counter -- Approximate unique viewers
);

-- Atomic increment (the client sends a delta and never reads the value first):
UPDATE video_playback_stats
SET views = views + 1,
total_play_time = total_play_time + 245
WHERE videoid = 550e8400-...;
```

## Why a Separate Table?

Cassandra enforces a strict rule: **counter columns cannot coexist with regular columns**. That's why playback stats live in `video_playback_stats`, not in the `videos` table.

**Key properties**:
- **Atomic**: Increments never conflict, even across nodes
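- **No client-side read-before-write**: The application sends only the increment and never has to read the current value first. The replica that applies the increment still does a local read internally, so counter writes cost more than regular writes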
- **Eventually consistent**: Counter values converge across replicas

**In KillrVideo**: Each video view atomically increments the `views` counter. The stats cards on the Creator Dashboard aggregate these counters across all of a creator's videos.
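
The dashboard aggregation could look roughly like this. It is a sketch that assumes the rows for a creator's videos have already been read from `video_playback_stats` as dicts; the helper name `aggregate_stats` is hypothetical.

```python
from typing import Dict, Iterable, Optional

COUNTERS = ("views", "total_play_time", "complete_views", "unique_viewers")


def aggregate_stats(rows: Iterable[Dict[str, Optional[int]]]) -> Dict[str, int]:
    """Sum each counter column across a creator's videos."""
    totals = {name: 0 for name in COUNTERS}
    for row in rows:
        for name in COUNTERS:
            # A counter that was never incremented reads back as null (None).
            totals[name] += row.get(name) or 0
    return totals
```

Note that summing `unique_viewers` across videos counts a person once per video they watched, so the total is an upper bound rather than a true unique count.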
37 changes: 37 additions & 0 deletions public/educational/extracted/ratings_data_model_tooltip.md
@@ -0,0 +1,37 @@
# Ratings Data Model: Counters + Individual Records

Video ratings use **two tables** — one for fast aggregate counts, another for individual user ratings with upsert semantics.

## The Two Tables

```cql
-- Aggregate counters (fast average calculation)
CREATE TABLE video_ratings (
videoid uuid PRIMARY KEY,
rating_counter counter, -- Number of ratings
rating_total counter -- Sum of all star values
);

-- Individual ratings (one per user per video)
CREATE TABLE video_ratings_by_user (
videoid uuid,
userid uuid,
rating int,
rating_date timestamp,
PRIMARY KEY (videoid, userid)
);
```

## Composite Primary Key & Upsert

`video_ratings_by_user` uses a **composite primary key** `(videoid, userid)` — this means each user can only have one rating per video. Re-rating the same video **upserts** (overwrites) automatically:

```cql
-- First rating or update — same CQL either way:
INSERT INTO video_ratings_by_user (videoid, userid, rating, rating_date)
VALUES (?, ?, 4, '2025-10-31T10:00:00Z');
```

Cassandra's INSERT is always an upsert: if a row with the same full primary key already exists, it is overwritten, so there is no need for something like PostgreSQL's `INSERT ... ON CONFLICT`.

**In KillrVideo**: Submitting a rating writes to both tables. The counter table gives instant averages; the per-user table prevents duplicate votes and lets users see their own rating.
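
The average itself is not stored; it is derived from the two counters. A minimal sketch, assuming the application computes it after reading a `video_ratings` row (the function name `average_rating` is hypothetical):

```python
from typing import Optional


def average_rating(
    rating_counter: Optional[int], rating_total: Optional[int]
) -> Optional[float]:
    """Average stars = sum of star values / number of ratings."""
    # No row yet, or a counter never incremented, means no ratings so far.
    if not rating_counter:
        return None
    return (rating_total or 0) / rating_counter
```

For example, three ratings whose stars sum to 12 give `average_rating(3, 12) == 4.0`, while a video with no ratings returns `None` instead of dividing by zero.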
35 changes: 35 additions & 0 deletions public/educational/extracted/registration_dual_table_tooltip.md
@@ -0,0 +1,35 @@
# Dual-Table Writes for User Registration

When you register, KillrVideo writes to **two separate tables**. Because Cassandra has no JOINs, each query pattern needs its own table.

## The Two Tables

```cql
-- Table 1: Profile lookups by user ID
CREATE TABLE users (
userid uuid PRIMARY KEY,
email text,
firstname text,
lastname text,
account_status text
);

-- Table 2: Credential lookups by email
CREATE TABLE user_credentials (
email text PRIMARY KEY, -- Partition key is email for login lookups
password text,
userid uuid,
account_locked boolean
);
```

**Why separate?**
- **Different partition keys**: `users` is keyed by `userid`, `user_credentials` by `email`
- **Security**: Credentials are isolated from frequently-accessed profile data
- **Performance**: Login checks only read the small credentials row

## No Multi-Table Transactions

Cassandra has no cross-table ACID transactions with rollback. Here both INSERTs happen independently, so if one fails you get an orphaned record. Production systems handle this with idempotent retries or background reconciliation.

**In KillrVideo**: Registration writes to `users` + `user_credentials` in sequence, and an SAI index on `users.email` enables flexible lookups without yet another table.
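
One way to picture the idempotent-retry approach is the sketch below. It rests on several assumptions: `execute` is a hypothetical callable that runs one CQL statement, the `"active"` status value is invented for illustration, and real code would catch the driver's specific timeout and unavailable errors rather than a bare `Exception`. Retrying is safe because re-running an INSERT with the same primary key and values simply overwrites the row with identical data.

```python
import time
from typing import Callable

INSERT_USER = (
    "INSERT INTO users (userid, email, firstname, lastname, account_status) "
    "VALUES (%s, %s, %s, %s, %s)"
)
INSERT_CREDENTIALS = (
    "INSERT INTO user_credentials (email, password, userid, account_locked) "
    "VALUES (%s, %s, %s, %s)"
)


def write_with_retry(
    execute: Callable[[str, tuple], object],
    cql: str,
    params: tuple,
    attempts: int = 3,
    backoff_seconds: float = 0.0,
) -> None:
    """Run one write, retrying on failure; re-raise after the last attempt."""
    for attempt in range(1, attempts + 1):
        try:
            execute(cql, params)
            return
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(backoff_seconds * attempt)


def register(execute, userid, email, firstname, lastname, password_hash) -> None:
    """Write the profile row, then the credentials row, each with retries."""
    write_with_retry(
        execute, INSERT_USER, (userid, email, firstname, lastname, "active")
    )
    write_with_retry(
        execute, INSERT_CREDENTIALS, (email, password_hash, userid, False)
    )
```

If the second write still fails after all attempts, the `users` row is left without credentials, which is the orphaned-record case that background reconciliation would have to clean up.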
27 changes: 27 additions & 0 deletions public/educational/extracted/trending_video_activity_tooltip.md
@@ -0,0 +1,27 @@
# Time-Series Data for Trending

Trending videos are computed from the `video_activity` table — a time-series that records every view, partitioned by day.

## The Data Model

```cql
CREATE TABLE video_activity (
videoid uuid,
day date, -- Partition key: one partition per day
watch_time timeuuid, -- Clustering column: time-ordered
PRIMARY KEY (day, watch_time)
) WITH CLUSTERING ORDER BY (watch_time DESC);
```

**How trending works**:
1. Every video view writes a row to `video_activity` for that day
2. The trending query reads a day's partition and aggregates view counts per video
3. Time period selection (24h, 7 days, 30 days) determines how many day-partitions to scan

## Why This Design?

- **Bounded partitions**: Each day's activity is in its own partition, preventing unbounded growth
- **Natural time windowing**: "Past 7 days" = read 7 partitions
- **TimeUUID clustering**: Preserves exact ordering within each day

**In KillrVideo**: The trending page queries `video_activity` for the selected time window, counts views per video, and ranks them — all from time-bucketed partitions.
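
The count-and-rank step can be sketched as below, assuming the counting happens in the application. `fetch_day_videoids` is a hypothetical callable that would read one day partition of `video_activity` and return the `videoid` of every row in it.

```python
from collections import Counter
from datetime import date, timedelta
from typing import Callable, Iterable, List, Tuple


def trending(
    fetch_day_videoids: Callable[[date], Iterable[str]],
    today: date,
    days: int,
    top_n: int = 10,
) -> List[Tuple[str, int]]:
    """Count views per video over the last `days` partitions and rank them."""
    counts: Counter = Counter()
    for offset in range(days):
        # One single-partition read per day in the selected window.
        counts.update(fetch_day_videoids(today - timedelta(days=offset)))
    # most_common sorts by view count, highest first.
    return counts.most_common(top_n)
```

With `days=1` this covers the 24h view, and `days=7` or `days=30` widen the window simply by reading more partitions.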