Skip to content

feat: add optional songs filter to content creation pipeline#112

Merged
sweetmantech merged 8 commits intomainfrom
feature/song-filtering
Mar 30, 2026
Merged

feat: add optional songs filter to content creation pipeline#112
sweetmantech merged 8 commits intomainfrom
feature/song-filtering

Conversation

@sidneyswift
Copy link
Copy Markdown
Contributor

@sidneyswift sidneyswift commented Mar 26, 2026

Summary

  • Add optional songs field to createContentPayloadSchema — an array of song slugs
  • Filter song list in selectAudioClip before random pick when songs is provided
  • Pass payload.songs through from createContentTask to selectAudioClip
  • When omitted, all songs remain eligible (backward compatible)

Test plan

  • All 183 existing tests pass
  • Verify songs: ["adhd"] only picks adhd.mp3
  • Verify songs: ["nonexistent"] throws clear error
  • Verify omitting songs preserves current behavior (all songs eligible)

Made with Cursor

Summary by CodeRabbit

  • New Features

    • Users can now optionally specify which songs to include when creating content, filtering available options.
  • Tests

    • Added comprehensive test coverage for song filtering functionality, including edge cases and validation scenarios.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 26, 2026

Warning

Rate limit exceeded

@sweetmantech has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 16 minutes and 43 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 16 minutes and 43 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d93dab7f-304d-4124-9819-b7584adafc8f

📥 Commits

Reviewing files that changed from the base of the PR and between 31dd7ad and c3f2fc4.

📒 Files selected for processing (2)
  • src/content/__tests__/filterSongPaths.test.ts
  • src/content/filterSongPaths.ts
📝 Walkthrough

Walkthrough

This PR refactors the audio clip selection pipeline by consolidating selectAudioClip parameters into a single payload object, centralizing clip duration configuration, and introducing new song filtering capability via the filterSongPaths function to selectively match audio files against requested song slugs.

Changes

Cohort / File(s) Summary
Schema Updates
src/schemas/contentCreationSchema.ts
Added optional songs?: string[] field to createContentPayloadSchema and corresponding CreateContentPayload type.
Audio Selection Refactoring
src/content/selectAudioClip.ts, src/tasks/createContentTask.ts, src/content/testPipeline.ts
Consolidated selectAudioClip signature to accept payload object instead of separate parameters; removed explicit clipDuration argument (now uses DEFAULT_PIPELINE_CONFIG.clipDuration); updated callers accordingly.
Song Filtering Feature
src/content/filterSongPaths.ts, src/content/__tests__/filterSongPaths.test.ts
Added new filterSongPaths function that filters song paths by matching against requested song slugs with case-insensitive, exact matching and comprehensive test coverage validating filtering behavior, edge cases, and error handling.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 A payload unified, so clean and bright,
Songs filtered now with logic tight,
Clip durations centralized with care,
The pipeline dances through the air! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adding an optional songs filter to the content creation pipeline, which is the core feature across all modified files.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/song-filtering

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
src/content/templates/album-record-store/caption-guide.json (1)

7-7: Avoid dual length authorities in prompt constraints.

Line 7 duplicates length policy that is already enforced by CAPTION_LENGTH_INSTRUCTIONS in src/content/generateCaption.ts (Lines 48-56). If those drift, the model gets conflicting instructions.

💡 Suggested change
-    "keep it under 80 characters for short, can go longer for medium/long",
+    "follow the requested caption length exactly (short / medium / long)",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/content/templates/album-record-store/caption-guide.json` at line 7,
Remove the duplicate length constraint from the caption prompt in
caption-guide.json to avoid conflicting instructions with the canonical
CAPTION_LENGTH_INSTRUCTIONS constant in generateCaption.ts; either delete the
"keep it under 80 characters..." entry from the JSON or replace it with a single
reference comment that indicates CAPTION_LENGTH_INSTRUCTIONS is authoritative
(so maintainers update only src/content/generateCaption.ts), and ensure the JSON
no longer defines its own length policy that could drift from
CAPTION_LENGTH_INSTRUCTIONS.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/content/selectAudioClip.ts`:
- Around line 66-69: The filter currently uses path.includes(`/songs/${slug}`)
which does substring matching and can produce false positives; update the filter
in selectAudioClip.ts that processes songPaths (the filter callback using
songs.some(...)) to extract the slug segment after "/songs/" for each path
(e.g., via splitting or a regex) and compare that segment for exact equality
against each slug in the songs array so only exact slug matches are kept.

In `@src/content/templates/album-record-store/caption-guide.json`:
- Around line 8-11: Fix the typos in the caption constraints strings: change "no
punctuation at the end unless its a question mark" to "no punctuation at the end
unless it's a question mark", change "dont describe whats in the image" to
"don't describe what's in the image", and change "dont" occurrences to "don't"
where appropriate in the same JSON template so the prompt text is grammatically
correct and consistent.

In `@src/schemas/contentCreationSchema.ts`:
- Around line 17-18: The songs array currently allows empty or whitespace-only
strings; update the contentCreationSchema's songs definition so each element is
validated as a non-empty, trimmed slug and invalid values are rejected at the
schema boundary. Replace the element schema used in songs with a zod string that
trims input, enforces non-empty (e.g., .nonempty() or .refine(s => s.length >
0)), and validates against a slug pattern (e.g., alphanumerics,
dashes/underscores) using a regex; keep the outer songs as optionally present
(songs: z.array(...).optional()). Ensure the symbol names to change are songs in
contentCreationSchema and use Zod methods (transform/trim, nonempty/refine,
regex) to implement this tightening.

---

Nitpick comments:
In `@src/content/templates/album-record-store/caption-guide.json`:
- Line 7: Remove the duplicate length constraint from the caption prompt in
caption-guide.json to avoid conflicting instructions with the canonical
CAPTION_LENGTH_INSTRUCTIONS constant in generateCaption.ts; either delete the
"keep it under 80 characters..." entry from the JSON or replace it with a single
reference comment that indicates CAPTION_LENGTH_INSTRUCTIONS is authoritative
(so maintainers update only src/content/generateCaption.ts), and ensure the JSON
no longer defines its own length policy that could drift from
CAPTION_LENGTH_INSTRUCTIONS.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 120d4d29-4f5f-4f64-b27f-50bcf8cc68b2

📥 Commits

Reviewing files that changed from the base of the PR and between bcbda15 and 9f8a11b.

📒 Files selected for processing (9)
  • src/content/loadTemplate.ts
  • src/content/selectAudioClip.ts
  • src/content/templates/album-record-store/caption-guide.json
  • src/content/templates/album-record-store/references/captions/examples.json
  • src/content/templates/album-record-store/style-guide.json
  • src/content/templates/album-record-store/video-moods.json
  • src/content/templates/album-record-store/video-movements.json
  • src/schemas/contentCreationSchema.ts
  • src/tasks/createContentTask.ts

Comment on lines +8 to +11
"no punctuation at the end unless its a question mark",
"never sound like a press release or marketing copy",
"never say 'out now' or 'stream now' or 'link in bio'",
"dont describe whats in the image",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Tighten wording in constraint rules for clearer model behavior.

Line 8 (its) and Line 11 (dont, whats) have typos. Since this text is prompt input, cleaner wording improves instruction reliability.

✏️ Suggested change
-    "no punctuation at the end unless its a question mark",
+    "no punctuation at the end unless it's a question mark",
...
-    "dont describe whats in the image",
+    "don't describe what's in the image",
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"no punctuation at the end unless its a question mark",
"never sound like a press release or marketing copy",
"never say 'out now' or 'stream now' or 'link in bio'",
"dont describe whats in the image",
"no punctuation at the end unless it's a question mark",
"never sound like a press release or marketing copy",
"never say 'out now' or 'stream now' or 'link in bio'",
"don't describe what's in the image",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/content/templates/album-record-store/caption-guide.json` around lines 8 -
11, Fix the typos in the caption constraints strings: change "no punctuation at
the end unless its a question mark" to "no punctuation at the end unless it's a
question mark", change "dont describe whats in the image" to "don't describe
what's in the image", and change "dont" occurrences to "don't" where appropriate
in the same JSON template so the prompt text is grammatically correct and
consistent.

Comment on lines +17 to +18
/** Optional list of song slugs to restrict which songs the pipeline picks from. */
songs: z.array(z.string()).optional(),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Harden songs validation to reject empty/invalid slugs.

Right now songs accepts [""] (or whitespace), which can lead to unintended filtering behavior downstream. Validate each slug and reject empty entries at the schema boundary.

🔧 Proposed schema tightening
+const songSlugSchema = z
+  .string()
+  .trim()
+  .min(1, "song slug cannot be empty")
+  .regex(/^[a-z0-9-]+$/i, "song slug must contain only letters, numbers, and hyphens");
+
 export const createContentPayloadSchema = z.object({
   accountId: z.string().min(1, "accountId is required"),
   artistSlug: z.string().min(1, "artistSlug is required"),
   template: z.string().min(1, "template is required"),
   lipsync: z.boolean().default(false),
@@
   /** GitHub repo URL so the task can fetch artist files (face-guide, songs). */
   githubRepo: z.string().url("githubRepo must be a valid URL"),
   /** Optional list of song slugs to restrict which songs the pipeline picks from. */
-  songs: z.array(z.string()).optional(),
+  songs: z.array(songSlugSchema).optional(),
 });
As per coding guidelines "src/**/*.{ts,tsx}: Use Zod for schema validation".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/schemas/contentCreationSchema.ts` around lines 17 - 18, The songs array
currently allows empty or whitespace-only strings; update the
contentCreationSchema's songs definition so each element is validated as a
non-empty, trimmed slug and invalid values are rejected at the schema boundary.
Replace the element schema used in songs with a zod string that trims input,
enforces non-empty (e.g., .nonempty() or .refine(s => s.length > 0)), and
validates against a slug pattern (e.g., alphanumerics, dashes/underscores) using
a regex; keep the outer songs as optionally present (songs:
z.array(...).optional()). Ensure the symbol names to change are songs in
contentCreationSchema and use Zod methods (transform/trim, nonempty/refine,
regex) to implement this tightening.

name: templateName,
imagePrompt,
usesFaceGuide,
customInstruction,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is customInstruction used?

Comment on lines +65 to +76
// Step 1b: Filter to allowed songs if specified
if (songs && songs.length > 0) {
songPaths = songPaths.filter(path =>
songs.some(slug => path.includes(`/songs/${slug}`)),
);
if (songPaths.length === 0) {
throw new Error(
`None of the specified songs [${songs.join(", ")}] were found for artist ${artistSlug}`,
);
}
logger.log("Filtered to specified songs", { songs, matchCount: songPaths.length });
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open Closed Principle

  • actual: song filtering happens in existing selectAudioClip function.
  • required: new function file for filterAudioClips.

Comment on lines +113 to +114
const instruction = template.customInstruction
?? (template.usesFaceGuide ? FACE_SWAP_INSTRUCTION : NO_FACE_INSTRUCTION);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KISS principle

  • actual: customInstruction adds extra complexity, nesting a conditional.
  • required: do not nest conditionals.

Copy link
Copy Markdown
Contributor

@sweetmantech sweetmantech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this PR feels like it's doing 2 different tasks.

Why is songs[] Tasks PR also adding record label store templates?

SRP - Is this a PR for songs[] array or for a new template ?
I suggest: remove record label template from here. Open a separate branch + PR for the task changes.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kiss - what is the purpose of this file change in the context of this pr?

artistSlug: payload.artistSlug,
clipDuration: DEFAULT_PIPELINE_CONFIG.clipDuration,
lipsync: payload.lipsync,
songs: payload.songs,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the purpose of destructuring the payload object?

  • why not update selectAudioClip to directly accept payload so this file doesn't need updated for schema changes?

sidneyswift and others added 6 commits March 30, 2026 17:08
Allow callers to restrict which songs the pipeline picks from by
passing an array of song slugs. When omitted, all songs remain
eligible (preserving existing behavior).

Made-with: Cursor
Remove customInstruction, customMotionPrompt, and album-record-store
template — those belong in separate PRs per SRP feedback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses Open/Closed Principle feedback — song filtering is now a
standalone function with exact slug matching (fixes substring false
positives like "ad" matching "adhd").

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Covers exact slug matching, case insensitivity, whitespace trimming,
org repo encoded paths, and error on no matches.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Avoids destructuring payload fields at the call site — the task file
no longer needs updating when audio-related schema fields are added.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sweetmantech sweetmantech force-pushed the feature/song-filtering branch from e843dd3 to 5ffc9eb Compare March 30, 2026 22:10
}

// Step 1b: Filter to allowed songs if specified
if (songs && songs.length > 0) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is the if statement required if we throw the previous error?

}): Promise<SelectedAudioClip> {
export async function selectAudioClip(
payload: Pick<CreateContentPayload, "githubRepo" | "artistSlug" | "lipsync" | "songs">,
clipDuration: number,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kiss - why not remove the prop and create the variable here? only one value is used

- clipDuration now sourced from DEFAULT_PIPELINE_CONFIG internally
- filterSongPaths handles undefined/empty songs as no-op, removing
  the if guard at the call site
- Added tests for undefined and empty songs cases

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
);
}

logger.log("Filtered to specified songs", { songs, matchCount: filtered.length });
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use shared logstep function

Replace logger.log with logStep for consistent observability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
src/content/filterSongPaths.ts (1)

21-22: Handle normalized-empty song filters consistently.

If songs is provided as whitespace-only values (e.g. [" "]), requested becomes empty and the function throws a “not found” error. Consider short-circuiting on requested.size === 0 to align with the existing empty-array pass-through behavior.

Proposed tweak
   const requested = new Set(songs.map(s => s.trim().toLowerCase()).filter(Boolean));
+  if (requested.size === 0) return songPaths;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/content/filterSongPaths.ts` around lines 21 - 22, In filterSongPaths, the
normalized Set requested can be empty when songs contains only whitespace (e.g.
["   "]) causing a spurious "not found" error; after creating requested (const
requested = new Set(...)), add a short-circuit that if requested.size === 0 then
treat it as the empty-array case and return the original paths (or the same
pass-through behaviour the function uses for an empty songs array) so
whitespace-only filters don't trigger a not-found error.
src/content/__tests__/filterSongPaths.test.ts (1)

73-92: Add a whitespace-only songs case to lock behavior.

Given empty-array behavior is pass-through, it would help to add a test for inputs like [" "] so this edge case stays intentional.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/content/__tests__/filterSongPaths.test.ts` around lines 73 - 92, Add a
test that ensures a songs array containing only whitespace elements is treated
like empty and returns all paths: create a new it block (e.g., "returns all
paths when songs contains only whitespace") that defines songPaths, calls
filterSongPaths(songPaths, ["   "], artistSlug) and expects the result toEqual
songPaths; this uses the existing filterSongPaths function and artistSlug test
variable to lock the whitespace-only behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/content/__tests__/filterSongPaths.test.ts`:
- Around line 73-92: Add a test that ensures a songs array containing only
whitespace elements is treated like empty and returns all paths: create a new it
block (e.g., "returns all paths when songs contains only whitespace") that
defines songPaths, calls filterSongPaths(songPaths, ["   "], artistSlug) and
expects the result toEqual songPaths; this uses the existing filterSongPaths
function and artistSlug test variable to lock the whitespace-only behavior.

In `@src/content/filterSongPaths.ts`:
- Around line 21-22: In filterSongPaths, the normalized Set requested can be
empty when songs contains only whitespace (e.g. ["   "]) causing a spurious "not
found" error; after creating requested (const requested = new Set(...)), add a
short-circuit that if requested.size === 0 then treat it as the empty-array case
and return the original paths (or the same pass-through behaviour the function
uses for an empty songs array) so whitespace-only filters don't trigger a
not-found error.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8b0f009d-6655-4147-b92b-e550e2c4494e

📥 Commits

Reviewing files that changed from the base of the PR and between f86cc7f and 31dd7ad.

📒 Files selected for processing (6)
  • src/content/__tests__/filterSongPaths.test.ts
  • src/content/filterSongPaths.ts
  • src/content/selectAudioClip.ts
  • src/content/testPipeline.ts
  • src/schemas/contentCreationSchema.ts
  • src/tasks/createContentTask.ts
💤 Files with no reviewable changes (1)
  • src/content/testPipeline.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/schemas/contentCreationSchema.ts
  • src/tasks/createContentTask.ts
  • src/content/selectAudioClip.ts

@sweetmantech sweetmantech merged commit ce0efce into main Mar 30, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants