fix: production audit remediation — correctness, reliability, perf, dedup#21

Merged
nathanialhenniges merged 5 commits into main from
claude/nervous-pasteur-a50d2b
Apr 17, 2026
Conversation

nathanialhenniges (Member) commented Apr 17, 2026

Summary

End-to-end production-readiness pass on FluffBoost. Addresses 8 HIGH correctness/reliability/perf issues and 8 duplication findings surfaced by the full audit, plus Docker and observability hardening. Ships behind one bundled PR with tests for the HIGH bugs.

HIGH production bugs fixed

  • Atomic motivation delivery gate: src/worker/jobs/sendMotivation.ts claims each guild via UPDATE … WHERE lastMotivationSentAt IS NULL OR < periodStart before sending. Overlapping worker ticks can no longer cause double-sends.
  • Single-query random quote fetch: ORDER BY random() LIMIT 1 via the new src/utils/quoteHelpers.ts, with an empty-table guard.
  • Fresh embed per guild: buildMotivationEmbed returns a new EmbedBuilder on each call.
  • BullMQ repeatable dedup on startup: src/worker/index.ts removes stale repeatables before re-adding. removeOnFail: { count: 100 }, removeOnComplete: { count: 50 }, explicit WORKER_CONCURRENCY.
  • Graceful shutdown: the src/app.ts SIGTERM/SIGINT handler closes HTTP → shards → postgres → redis. Explicit respawn: true on the sharding manager. Shard disconnect no longer kills the process; transient Redis errors are downgraded to warnings.
  • Deep health endpoint: src/api/routes/health.ts probes DB + Redis with a 1.5s timeout and returns 503 when degraded.
  • Dockerfile HEALTHCHECK — added with curl probe. docker-entrypoint.sh wraps the migration in a postgres advisory lock so concurrent replica boots are safe.
  • DB pool + schema — explicit pool max in src/database/index.ts. Schema adds guild_motivation_channel_idx, suggestion_status_idx, and a SuggestionStatus pgEnum. One new migration generated.
  • Env hardening — snowflake regex on OWNER_ID, MAIN_GUILD_ID, MAIN_CHANNEL_ID, DISCORD_APPLICATION_ID, DISCORD_PREMIUM_SKU_ID, and ALLOWED_USERS. New DATABASE_POOL_MAX, WORKER_CONCURRENCY.
  • Observability — error logs include interactionId and guildId.
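The claim-before-send gate in the first bullet can be sketched as follows. This is an illustrative TypeScript simulation, not the repo's actual drizzle code: an in-memory map stands in for the guilds table, and claimGuild mirrors the conditional UPDATE so the race behavior is observable.

```typescript
// Illustrative sketch of the atomic delivery gate described above.
// The real code issues a drizzle UPDATE; here an in-memory store stands in
// for the guilds table so the overlapping-tick race can be demonstrated.

interface GuildRow {
  id: string;
  lastMotivationSentAt: Date | null;
}

const guilds = new Map<string, GuildRow>([
  ["g1", { id: "g1", lastMotivationSentAt: null }],
]);

// Mirrors: UPDATE guilds SET last_motivation_sent_at = now()
//          WHERE id = $1 AND (last_motivation_sent_at IS NULL
//                             OR last_motivation_sent_at < $periodStart)
// Returns true only for the tick that wins the claim.
function claimGuild(id: string, periodStart: Date, now: Date): boolean {
  const row = guilds.get(id);
  if (!row) return false;
  const eligible =
    row.lastMotivationSentAt === null || row.lastMotivationSentAt < periodStart;
  if (!eligible) return false;
  row.lastMotivationSentAt = now; // the "UPDATE" side effect
  return true;
}

const periodStart = new Date("2026-04-17T00:00:00Z");
const now = new Date("2026-04-17T09:00:00Z");

// Two overlapping worker ticks race for the same guild:
const first = claimGuild("g1", periodStart, now);
const second = claimGuild("g1", periodStart, now);
console.log(first, second); // true false — exactly one tick wins
```

Because the eligibility check and the timestamp write happen in one statement, the losing tick sees the already-advanced timestamp and skips the send.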

Dedup refactor

New shared helpers reduce copy-paste across commands:

  • src/utils/quoteHelpers.ts: getRandomMotivationQuote, resolveQuoteAuthor, buildMotivationEmbed
  • src/utils/replyHelpers.ts: replyWithTextFile
  • src/utils/suggestionHelpers.ts: fetchPendingSuggestion
  • src/utils/premium.ts: buildPremiumUpsell
  • src/utils/ownerGuard.ts: requireApplication
  • src/utils/commandErrors.ts: logCommandError and the withCommandLogging wrapper

Deleted src/utils/trimArray.ts. Applied across admin, owner, premium, setup, and quote command handlers.
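The withCommandLogging lifecycle can be sketched generically. The real helper in src/utils/commandErrors.ts wraps discord.js command handlers and calls safeErrorReply; this hedged, dependency-free version shows only the executing/success/error shape, with a log callback standing in for the project logger:

```typescript
// Hedged sketch of a withCommandLogging-style wrapper; the real helper
// wraps discord.js interactions. This version only shows the lifecycle.

type Logger = (msg: string) => void;

function withCommandLogging<A extends unknown[], R>(
  name: string,
  handler: (...args: A) => Promise<R>,
  log: Logger,
  errorMessage = "Something went wrong." // optional 4th arg, per the reviewer notes
): (...args: A) => Promise<R | undefined> {
  return async (...args: A) => {
    log(`executing ${name}`);
    try {
      const result = await handler(...args);
      log(`success ${name}`);
      return result;
    } catch (err) {
      log(`error ${name}: ${err instanceof Error ? err.message : String(err)}`);
      log(errorMessage); // stands in for the safeErrorReply step
      return undefined;
    }
  };
}

const lines: string[] = [];
const push: Logger = (m) => lines.push(m);

const wrapped = withCommandLogging("quote", async () => 42, push);
const value = await wrapped();

const failing = withCommandLogging("boom", async () => {
  throw new Error("db down");
}, push);
const fallback = await failing();

console.log(value, fallback, lines.length); // 42 undefined 5
```

Handlers stay focused on their own logic, and the wrapper guarantees every command emits the same executing/success/error log triple.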

Tests

  • 243 passing, bun run typecheck clean, bun run lint:check clean.
  • New coverage: quoteHelpers, replyHelpers, suggestionHelpers, commandErrors wrapper, parseHourMinute bounds.
  • Extended: sendMotivation (atomic-gate race behavior), worker/index (repeatable dedup), health (503 probes), shardDisconnect (no-exit), envSchema (snowflake + DATABASE_POOL_MAX).
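As a sketch of the parseHourMinute bounds now under test (the repo's exact return shape may differ), a strict HH:mm parser looks like:

```typescript
// Hedged sketch of a strict "HH:mm" parser like the parseHourMinute helper.
// Returning null on malformed input lets callers bail out instead of
// silently coercing to defaults.

function parseHourMinute(value: string): { hour: number; minute: number } | null {
  // Two-digit hour 00-23 and two-digit minute 00-59, nothing else.
  const match = /^([01]\d|2[0-3]):([0-5]\d)$/.exec(value);
  if (!match) return null;
  return { hour: Number(match[1]), minute: Number(match[2]) };
}

console.log(parseHourMinute("09:30")); // { hour: 9, minute: 30 }
console.log(parseHourMinute("24:00")); // null (hour out of bounds)
console.log(parseHourMinute("9:30"));  // null (strict two-digit form)
```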

New env vars

The updated .env.example documents:

  • DATABASE_POOL_MAX (default 10)
  • WORKER_CONCURRENCY (default 4)
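A hedged sketch of how these vars and the snowflake hardening validate. The real schema lives in src/utils/envSchema.ts; this standalone version only illustrates the rules and defaults:

```typescript
// Illustrative env validation: Discord snowflakes are 17-20 decimal digits,
// and the two new numeric vars fall back to the documented defaults.
// validateEnv and parsePositiveInt are hypothetical names for this sketch.

const SNOWFLAKE = /^\d{17,20}$/;

function parsePositiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw === "") return fallback;
  const n = Number(raw);
  if (!Number.isInteger(n) || n <= 0) throw new Error(`invalid integer: ${raw}`);
  return n;
}

function validateEnv(env: Record<string, string | undefined>) {
  const ownerId = env.OWNER_ID ?? "";
  if (!SNOWFLAKE.test(ownerId)) throw new Error("OWNER_ID must be a snowflake");
  // ALLOWED_USERS: comma-separated list of snowflakes
  const allowedUsers = (env.ALLOWED_USERS ?? "").split(",").map((s) => s.trim());
  if (!allowedUsers.every((id) => SNOWFLAKE.test(id))) {
    throw new Error("ALLOWED_USERS must be comma-separated snowflakes");
  }
  return {
    ownerId,
    allowedUsers,
    databasePoolMax: parsePositiveInt(env.DATABASE_POOL_MAX, 10),
    workerConcurrency: parsePositiveInt(env.WORKER_CONCURRENCY, 4),
  };
}

const cfg = validateEnv({
  OWNER_ID: "123456789012345678",
  ALLOWED_USERS: "123456789012345678,987654321098765432",
});
console.log(cfg.databasePoolMax, cfg.workerConcurrency); // 10 4
```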

Reviewer notes

  • The migration file drizzle/0000_loud_mother_askani.sql is the first migration committed to the repo since Drizzle adoption. It captures the full current schema including the new indexes and pgEnum.
  • src/utils/ownerGuard.ts#requireApplication now returns the ClientApplication (or null) instead of a boolean, so callers no longer need non-null assertions.
  • withCommandLogging covers the standard executing/success/error/safeErrorReply lifecycle. Handlers that need custom error replies can pass errorMessage as the 4th arg.
  • Weekly dedup in scheduleEvaluator still uses dayjs's locale week start (Sunday). Documented in code; not changed behaviorally.

Test plan

  • bun install
  • bun run typecheck
  • bun run lint:check
  • bun test --coverage
  • docker compose up -d then bun run db:migrate to apply the new migration locally
  • bun dev and verify /quote, /setup schedule (premium gate), admin list commands all work
  • curl -fsS http://localhost:3000/api/health returns {status:"ok", db:"ok", redis:"ok"}
  • Stop postgres → health endpoint returns 503 with db:"error"
  • kill -TERM <pid> → observe ordered graceful shutdown in logs
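The health behavior exercised in the last three steps can be sketched as a timeout race. The probe stubs below are placeholders for the real postgres/redis checks in src/api/routes/health.ts:

```typescript
// Hedged sketch of the deep health check: each dependency probe races a
// 1.5s timeout, and any failure degrades the overall status to 503.

async function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("timeout")), ms);
  });
  try {
    return await Promise.race([p, timeout]);
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}

async function healthCheck(
  probes: Record<string, () => Promise<unknown>>,
  timeoutMs = 1500
): Promise<{ statusCode: number; body: Record<string, string> }> {
  const body: Record<string, string> = {};
  let degraded = false;
  await Promise.all(
    Object.entries(probes).map(async ([name, probe]) => {
      try {
        await withTimeout(probe(), timeoutMs);
        body[name] = "ok";
      } catch {
        body[name] = "error";
        degraded = true;
      }
    })
  );
  body.status = degraded ? "degraded" : "ok";
  return { statusCode: degraded ? 503 : 200, body };
}

const healthy = await healthCheck({ db: async () => 1, redis: async () => "PONG" });
const broken = await healthCheck({
  db: async () => { throw new Error("connection refused"); },
  redis: async () => "PONG",
});
console.log(healthy.statusCode, broken.statusCode, broken.body.db); // 200 503 error
```

Probing the dependencies concurrently keeps the worst-case response near the single timeout rather than the sum of both probes.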

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added container health checks monitoring database and Redis connectivity status
    • Enhanced suggestion management system with status tracking and approval workflows
  • Configuration

    • Added DATABASE_POOL_MAX and WORKER_CONCURRENCY environment variables for system tuning
  • Improvements

    • Improved graceful shutdown handling for better service reliability
    • Enhanced activity scheduling and motivation quote management systems

…edup

HIGH production bugs
- Atomic motivation delivery gate prevents double-sends across overlapping
  worker ticks (claim via UPDATE ... WHERE lastMotivationSentAt IS NULL
  OR < periodStart before send).
- Single-query random quote fetch (ORDER BY random() LIMIT 1) with
  empty-table guard and fresh embed per guild.
- BullMQ repeatables deduplicated on startup; removeOnFail capped;
  explicit WORKER_CONCURRENCY.
- SIGTERM/SIGINT graceful shutdown: HTTP -> shards -> postgres -> redis;
  explicit respawn:true; shard disconnect no longer exits the process;
  redis transient errors downgraded to warnings.
- Deep health endpoint probes DB + Redis with 1.5s timeout; returns 503
  when degraded.
- Dockerfile HEALTHCHECK added; migration wrapped in postgres advisory
  lock so concurrent replica boots are safe.
- DB pool max configurable; schema adds indexes on Guild.motivationChannelId
  and SuggestionQuote.status; SuggestionStatus is now a pgEnum.
- Env schema hardened with snowflake regex on all Discord ID vars.
- Error logs carry interactionId + guildId.

Dedup refactor
- New helpers: quoteHelpers, replyHelpers, suggestionHelpers;
  premium.buildPremiumUpsell, ownerGuard.requireApplication,
  commandErrors.{logCommandError, withCommandLogging}.
- Removed trimArray.ts and applied the new helpers across admin,
  owner, premium, setup, and quote command handlers.

Tests
- 243 passing. New coverage for quoteHelpers, replyHelpers,
  suggestionHelpers, commandErrors wrapper, parseHourMinute.
- Extended sendMotivation tests for atomic-gate race behavior,
  worker/index for repeatable dedup, health endpoint for 503 probes,
  shardDisconnect for no-exit behavior, envSchema for snowflake checks.
coderabbitai Bot commented Apr 17, 2026

Warning

Rate limit exceeded

@nathanialhenniges has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 4 minutes and 54 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 4 minutes and 54 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5f3a7a01-a29a-4de5-a137-1234d002bcfb

📥 Commits

Reviewing files that changed from the base of the PR and between 7b4ebc4 and 180624a.

📒 Files selected for processing (14)
  • src/api/routes/health.ts
  • src/bot.ts
  • src/commands/admin/suggestion/list.ts
  • src/commands/admin/suggestion/reject.ts
  • src/commands/owner/premium/testList.ts
  • src/events/commandRegistry.ts
  • src/events/interactionCreate.ts
  • src/events/ready.ts
  • src/events/readyDeps.ts
  • src/utils/quoteHelpers.ts
  • src/utils/replyHelpers.ts
  • tests/commands/admin/suggestion/reject.test.ts
  • tests/events/interactionCreate.test.ts
  • tests/events/ready.test.ts

Walkthrough

This PR implements comprehensive infrastructure improvements including database pool configuration, health check monitoring for database and Redis dependencies, database schema migration with enums and tables, centralized command error logging, new helper utilities for quotes and suggestions, stricter environment validation with snowflake format checks, graceful shutdown logic, and worker job scheduling refactoring with concurrent claim semantics to prevent duplicate motivation sends.

Changes

  • Environment Configuration (.env.example, src/utils/envSchema.ts): Added DATABASE_POOL_MAX (default 10) and WORKER_CONCURRENCY (default 4) env vars with validation; applied strict snowflake regex validation to Discord IDs; updated ALLOWED_USERS to require comma-separated snowflakes.
  • Database Layer (drizzle/0000_loud_mother_askani.sql, src/database/schema.ts, src/database/index.ts, src/database/migrate.ts): New migration with enums (DiscordActivityType, MotivationFrequency, SuggestionStatus) and four tables with constraints/indexes; added the SuggestionStatus type and indexes to the schema; configured connection pool max and timeouts; added advisory locking for migration safety.
  • Dockerfile & Health Monitoring (Dockerfile, src/api/routes/health.ts): Added a container HEALTHCHECK probe; upgraded the health endpoint to concurrently probe the database (SELECT 1) and Redis with a 1.5s timeout, returning 200 with "ok" or 503 with "degraded" status and per-dependency error details.
  • Command Logging & Error Handling (src/utils/commandErrors.ts, src/commands/admin/activity/list.ts, src/commands/admin/quote/list.ts, src/commands/admin/suggestion/approve.ts, src/commands/admin/suggestion/list.ts, src/commands/admin/suggestion/reject.ts, src/commands/owner/premium/testCreate.ts, src/commands/owner/premium/testDelete.ts, src/commands/owner/premium/testList.ts, src/commands/premium.ts, src/commands/quote.ts, src/commands/setup/schedule.ts): Introduced withCommandLogging and logCommandError utilities replacing scattered manual logging; refactored 12 command handlers to use the centralized wrapper; simplified reply patterns via the new replyWithTextFile and buildPremiumUpsell helpers.
  • Helper Utilities (src/utils/quoteHelpers.ts, src/utils/replyHelpers.ts, src/utils/suggestionHelpers.ts, src/utils/premium.ts, src/utils/ownerGuard.ts, src/utils/permissions.ts, src/utils/scheduleEvaluator.ts): Added getRandomMotivationQuote, resolveQuoteAuthor, and buildMotivationEmbed; added a generic replyWithTextFile<T> for text file responses; added fetchPendingSuggestion for suggestion validation; added buildPremiumUpsell for premium UI; added requireApplication and parseHourMinute validators; updated isUserPermitted to async with a getAllowedUsers helper.
  • Application & Process Handling (src/app.ts, src/bot.ts, src/events/shardDisconnect.ts, src/events/interactionCreate.ts): Added graceful shutdown with SIGTERM/SIGINT handlers closing HTTP/Discord/DB/Redis; changed Redis error handling from fatal to transient; added respawn: true to the ShardingManager; updated shard disconnect to warn instead of exit; refined interaction error logging with contextual fields (cmd, interactionId, guildId).
  • Worker System (src/worker/index.ts, src/worker/jobs/setActivity.ts, src/worker/jobs/setActivityCore.ts, src/worker/jobs/sendMotivation.ts): Refactored worker initialization to an async startWorker(queue) returning the Worker; added an ensureRepeatable helper for idempotent job scheduling; extracted setActivityCore into a dedicated module with activity-type mapping; rewrote sendMotivation with atomic guild-claim logic via db.update to prevent concurrent duplicate sends; added WORKER_CONCURRENCY to the worker options.
  • Tests: API & Commands (tests/api/health.test.ts, tests/commands/*/...test.ts — 20 files): Updated the health test to validate multi-dependency probing (db, redis, status); extended command test mocks to include the queryClient export; refactored test helpers with stubBuildPremiumUpsell; updated permissions/schema env tests to validate snowflake format and the new env vars.
  • Tests: Utilities & Workers (tests/utils/commandErrors.test.ts, tests/utils/quoteHelpers.test.ts, tests/utils/replyHelpers.test.ts, tests/utils/scheduleEvaluator.parseHourMinute.test.ts, tests/utils/suggestionHelpers.test.ts, tests/worker/index.test.ts, tests/worker/sendMotivation.test.ts): Added new test suites for the command logging wrapper, quote/suggestion helpers, text-file reply handler, schedule parser, and time-range validation; refactored worker tests to assert repeatable-job cleanup, concurrency config, and claim-based send semantics.
  • Removed (src/utils/trimArray.ts, tests/utils/trimArray.test.ts): Removed the unused trimArray utility and its tests.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

enhancement, refactor, infrastructure

Poem

🐰 A rabbit's tale of better streams,
Where health checks validate all dreams,
With pools and workers, claims so tight,
And graceful exits done just right!
Logs now dance in harmony true,

🚦 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 33.96%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)
  • Description Check: Passed (check skipped; CodeRabbit's high-level summary is enabled).
  • Title Check: Passed. The PR title clearly summarizes the main changes: production audit remediation addressing correctness, reliability, performance, and deduplication issues.


bun:test mock.module mutates the global ES registry and persists across
files. Partial mocks that omitted exports used by other tests caused
TypeErrors like "execute is not a function" and "guildExists is not a
function" when test files ran together in CI (serialized differently
than local).

- Extract setActivityCore into its own module so worker/index.test.ts
  can mock the setActivity.js wrapper without clobbering setActivityCore
  imports used by setActivity.test.ts.
- Fill out partial mocks to include every export the real module
  provides (guildDatabase: guildExists + pruneGuilds + ensureGuildExists;
  command modules: default + named execute + slashCommand).
coderabbitai Bot left a comment

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/commands/admin/suggestion/approve.ts (1)

25-42: ⚠️ Potential issue | 🔴 Critical

Critical: TOCTOU allows duplicate motivationQuotes inserts on concurrent approvals.

fetchPendingSuggestion only reads the status; the UPDATE inside the transaction no longer gates on status = "Pending". Two concurrent approvals (e.g., double-click, two admins) will both see "Pending", both INSERT into motivationQuotes, and both UPDATE the suggestion. Since motivationQuotes has no uniqueness on (quote, author, addedBy), the duplicate row stays in the rotation pool — users see the same motivation twice as often, and every future approval of this specific race case compounds the problem.

Make the update the atomic claim and only insert when the claim succeeds:

🛡️ Suggested fix
-    await db.transaction(async (tx) => {
-      await tx.insert(motivationQuotes).values({
-        quote: suggestion.quote,
-        author: suggestion.author,
-        addedBy: suggestion.addedBy,
-      });
-      await tx
-        .update(suggestionQuotes)
-        .set({
-          status: "Approved",
-          reviewedBy: interaction.user.id,
-          reviewedAt: new Date(),
-        })
-        .where(eq(suggestionQuotes.id, suggestionId));
-    });
+    const claimed = await db.transaction(async (tx) => {
+      const rows = await tx
+        .update(suggestionQuotes)
+        .set({
+          status: "Approved",
+          reviewedBy: interaction.user.id,
+          reviewedAt: new Date(),
+        })
+        .where(and(eq(suggestionQuotes.id, suggestionId), eq(suggestionQuotes.status, "Pending")))
+        .returning({ id: suggestionQuotes.id });
+
+      if (rows.length === 0) {return false;}
+
+      await tx.insert(motivationQuotes).values({
+        quote: suggestion.quote,
+        author: suggestion.author,
+        addedBy: suggestion.addedBy,
+      });
+      return true;
+    });
+
+    if (!claimed) {
+      await interaction.reply({
+        content: "This suggestion was just reviewed by someone else.",
+        flags: MessageFlags.Ephemeral,
+      });
+      return;
+    }

(Add the and helper to the drizzle-orm import.)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/commands/admin/suggestion/approve.ts` around lines 25 - 42, The current
flow reads via fetchPendingSuggestion then blindly inserts into motivationQuotes
inside db.transaction, allowing TOCTOU duplicates; change the transaction to
perform an atomic conditional UPDATE on suggestionQuotes with
.where(eq(suggestionQuotes.id, suggestionId), and(eq(suggestionQuotes.status,
"Pending"))), check the number of rows affected by that update (only proceed
when it reports 1) and only then insert into motivationQuotes; keep the
reviewedBy/reviewedAt set in that atomic update so the claim and state change
are single-step. Also ensure you import the and helper from drizzle-orm (used in
the .where clause) so the combined predicate compiles.
src/worker/jobs/sendMotivation.ts (1)

83-102: ⚠️ Potential issue | 🟠 Major

Claiming before channel validation/send can silently drop a period's delivery.

claimGuild bumps lastMotivationSentAt to "now" before the channel is fetched or the embed is sent. If any of the following occur after a successful claim, the guild will not receive a motivation this period and nothing will retry until the next window:

  • client.channels.fetch rejects (Discord API error, permissions change)
  • The channel is no longer text‑based (return "skipped" at Line 96)
  • channel.send rejects at Line 100 (caught as "rejected" by allSettled)

This is an intentional atomicity trade‑off vs. double‑sends, but it's asymmetric: a transient Discord error effectively becomes a missed daily/weekly/monthly quote. Consider one of:

  1. Validate the channel (cheap) before claiming, then claim+send.
  2. On send failure, reset lastMotivationSentAt back to its previous value (store it before the claim) so the next tick can retry.
  3. Track raced vs claim-but-failed distinctly so operators can see delivery loss in logs/metrics.

The current logging lumps these into failed/skipped without surfacing that the row was already advanced, which will make this hard to diagnose in production.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/worker/jobs/sendMotivation.ts` around lines 83 - 102, Claiming
(claimGuild) occurs before validating/fetching the channel and sending, which
can advance lastMotivationSentAt and silently drop delivery; to fix, read and
cache the guild's current lastMotivationSentAt before calling claimGuild, then
perform client.channels.fetch and channel validation, and only after a
successful channel send (channel.send with buildMotivationEmbed) leave the
claim; if fetch/validation/send fails, call a rollback API (e.g.,
resetLastMotivation or updateGuildLastMotivation) to restore the cached
timestamp so the guild can be retried in the next tick—add the rollback helper
if missing and ensure claimGuild, client.channels.fetch,
channel.isTextBased/isDMBased, and channel.send paths invoke the rollback on any
failure.
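Option 2 above (reset the timestamp on send failure) can be sketched like this. All names here (claim, rollback, deliver) are hypothetical, not the repo's actual helpers, and an in-memory map stands in for the guilds table:

```typescript
// Illustrative sketch: cache the pre-claim timestamp and restore it when the
// send fails, so the next worker tick can retry instead of losing the period.

interface GuildRow { id: string; lastMotivationSentAt: Date | null }

const guilds = new Map<string, GuildRow>([
  ["g1", { id: "g1", lastMotivationSentAt: null }],
]);

// Returns the previous timestamp on a successful claim, or false when another
// tick already claimed this period (mirrors the conditional UPDATE).
function claim(id: string, periodStart: Date, now: Date): Date | null | false {
  const row = guilds.get(id);
  if (!row) return false;
  const prev = row.lastMotivationSentAt;
  if (prev !== null && prev >= periodStart) return false;
  row.lastMotivationSentAt = now;
  return prev; // cached for a possible rollback
}

function rollback(id: string, prev: Date | null): void {
  const row = guilds.get(id);
  if (row) row.lastMotivationSentAt = prev;
}

async function deliver(
  id: string,
  send: () => Promise<void>,
  periodStart: Date,
  now: Date
): Promise<"raced" | "sent" | "failed-will-retry"> {
  const prev = claim(id, periodStart, now);
  if (prev === false) return "raced";
  try {
    await send();
    return "sent";
  } catch {
    rollback(id, prev); // restore so the next tick can retry
    return "failed-will-retry";
  }
}

const periodStart = new Date("2026-04-17T00:00:00Z");
const now = new Date("2026-04-17T09:00:00Z");

const outcome = await deliver("g1", async () => {
  throw new Error("Missing Access"); // transient Discord failure
}, periodStart, now);
const retry = await deliver("g1", async () => {}, periodStart, now);
console.log(outcome, retry); // failed-will-retry sent
```

Returning distinct "raced" vs "failed-will-retry" outcomes also addresses the observability point: operators can count delivery loss separately from benign races.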
🧹 Nitpick comments (24)
tests/commands/admin/quote/create.test.ts (1)

16-16: queryClient mock is unused and misrepresents the real export's type.

src/commands/admin/quote/create.ts only uses db; it never touches queryClient, so this added entry isn't exercised by any test in this file. It also models queryClient as () => Promise<[]>, whereas the real export in src/database/index.ts is a postgres() client instance (ReturnType<typeof postgres>) — a callable tagged-template/SQL function, not a zero-arg function returning an array. If a future test in this file starts depending on queryClient, this shape will silently diverge from production behavior.

Consider either dropping the extra key here, or — if you're standardizing the database mock across the suite to match new consumers (e.g., health checks) — shaping it closer to the real client (e.g., a sinon stub callable as a tagged template, queryClient`SELECT 1`, resolving to []) so tests fail loudly on contract drift.

♻️ Minimal cleanup (drop unused key)
-    mock.module("../../../../src/database/index.js", () => ({ db, queryClient: () => Promise.resolve([]) }));
+    mock.module("../../../../src/database/index.js", () => ({ db }));

(apply to both loadModule and loadModuleNotPermitted)

Also applies to: 31-31

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/commands/admin/quote/create.test.ts` at line 16, The test mock adds a
`queryClient` key that is unused and misrepresents the real export; update the
mock in both `mock.module(...)` calls (used by `loadModule` and
`loadModuleNotPermitted`) to either remove the `queryClient` entry entirely and
only export `db`, or replace it with a stub that mimics the real tagged-template
SQL client (e.g., a callable/stub named `queryClient` that can be invoked as a
tagged-template and resolves to an array) so the test surface matches
`src/database/index.ts`; make the change where
`mock.module("../../../../src/database/index.js", () => ({ db, queryClient: ()
=> Promise.resolve([]) }))` is defined.
src/utils/ownerGuard.ts (1)

35-48: Optional: log the not-ready path for observability.

When client.application is missing, the user gets an ephemeral reply but nothing is recorded. Given this PR’s observability focus (adding interactionId/guildId to error logs), consider emitting a warn-level log here so operators can detect if owner commands are being invoked before the client is fully ready.

Suggested addition
 export async function requireApplication(
   client: Client,
   interaction: CommandInteraction
 ): Promise<ClientApplication | null> {
   if (client.application) {
     return client.application;
   }
 
+  logger.commands.error?.(
+    "requireApplication",
+    "client.application not ready",
+    { userId: interaction.user.id, guildId: interaction.guildId }
+  );
   await interaction.reply({
     content: "Bot application is not ready. Please try again in a moment.",
     flags: MessageFlags.Ephemeral,
   });
   return null;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/ownerGuard.ts` around lines 35 - 48, When client.application is
missing in requireApplication, add a warn-level log to record the not-ready path
for observability: log a message from within requireApplication that includes
context fields such as interaction.id and interaction.guildId (and mention the
function name requireApplication in the message), using the project’s existing
logger (e.g., processLogger or the module’s logger) so operators can detect
owner commands invoked before client readiness; then keep the existing ephemeral
reply and return null.
tests/events/guildDelete.test.ts (1)

14-14: Mock shape doesn't match real queryClient export.

The real queryClient (per src/database/index.ts) is a postgres tagged-template client, not a zero-arg function returning Promise.resolve([]). It happens to be harmless here because guildDeleteEvent only touches db, but if a future change in src/events/guildDelete.ts probes the DB directly (e.g., queryClient`SELECT 1`), this mock will throw "queryClient is not a function"-style errors or silently diverge from production behavior. Consider stubbing it as a tag function, e.g. queryClient: Object.assign(() => Promise.resolve([]), { end: sinon.stub().resolves() }), or omitting it entirely since it isn't used by this handler.

Same applies to line 31.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/events/guildDelete.test.ts` at line 14, The test mock for queryClient
in mock.module does not match the real export shape (it's a postgres
tagged-template function), so update the mock to either omit queryClient
entirely or stub it as a tag function to mirror production (e.g., replace the
zero-arg Promise-returning function with a tag-function shaped stub assigned to
queryClient and provide an end method), ensuring the mock lives alongside the
existing db mock used by guildDeleteEvent; apply the same change to the second
occurrence referenced in the file so future uses of queryClient (e.g.,
queryClient`SELECT 1`) won't throw or diverge from runtime behavior.
tests/helpers.ts (1)

239-239: Move the discord.js import to the top and wrap the long signature.

Imports hoist in ESM so this works, but placing it mid-file is unusual and will surprise readers/linters. Line 242 also exceeds the 120-char line-length guideline.

As per coding guidelines: "Enforce maximum line length of 120 characters".

🧹 Proposed refactor
 import sinon from "sinon";
 import type { SinonStub } from "sinon";
+import { ActionRowBuilder, ButtonBuilder, ButtonStyle, EmbedBuilder } from "discord.js";
@@
-// ── Premium upsell stub (matches real premium.buildPremiumUpsell shape) ────
-import { ActionRowBuilder, ButtonBuilder, ButtonStyle, EmbedBuilder } from "discord.js";
-
-export function stubBuildPremiumUpsell(skuId?: string) {
-  return (opts: { title?: string; description?: string; fields?: { name: string; value: string; inline?: boolean }[] } = {}) => {
+// ── Premium upsell stub (matches real premium.buildPremiumUpsell shape) ────
+
+interface UpsellOpts {
+  title?: string;
+  description?: string;
+  fields?: { name: string; value: string; inline?: boolean }[];
+}
+
+export function stubBuildPremiumUpsell(skuId?: string) {
+  return (opts: UpsellOpts = {}) => {

Also applies to: 242-242

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/helpers.ts` at line 239, Move the import of ActionRowBuilder,
ButtonBuilder, ButtonStyle, and EmbedBuilder to the top of the module (before
any other code) and reformat the import statement so the specifiers are wrapped
across multiple lines to keep the line length under 120 characters (e.g. import
{ ActionRowBuilder, ButtonBuilder, ButtonStyle, EmbedBuilder } from
"discord.js"; -> import { ActionRowBuilder, ButtonBuilder, ButtonStyle,
EmbedBuilder } from "discord.js"; but with each specifier on its own line inside
the braces or split into two shorter imports) so linters and readers won’t be
surprised by a mid-file import and the line-length rule is respected.
src/utils/scheduleEvaluator.ts (1)

23-29: LGTM — strict parsing is the right call.

Regex covers the valid HH:mm space exactly, and returning false from isGuildDueForMotivation on parse failure is safer than silently coercing to defaults (which could cause unintended sends). Consider emitting a warn log once per malformed guild so operators can spot bad rows — optional.

Also applies to: 58-62

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/scheduleEvaluator.ts` around lines 23 - 29, The parse function
parseHourMinute is fine but when it returns null callers like
isGuildDueForMotivation should emit a single warn-level log per malformed guild
to help operators find bad rows; update isGuildDueForMotivation to detect a null
result from parseHourMinute and call the logger (e.g., processLogger.warn or
your app logger) including the guild identifier and the offending time string,
and ensure you only warn once per guild (track warned guild IDs in a Set scoped
to the module or service) to avoid log spam.
Dockerfile (1)

43-45: HEALTHCHECK looks good.

curl is installed in the runtime stage (line 27), ${PORT:-3000} correctly mirrors the app's bind port default, and the 20s start-period gives migrations/startup room. One small consideration: since the deep health endpoint returns 503 when DB/Redis are degraded, transient dependency blips could flap the container to unhealthy and trigger orchestrator restarts. If that becomes noisy, consider probing a lightweight liveness endpoint here and reserving /api/health for readiness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Dockerfile` around lines 43 - 45, HEALTHCHECK currently probes the deep
readiness endpoint (/api/health) which can return 503 for degraded dependencies
and cause flapping; change the Docker HEALTHCHECK (symbol: HEALTHCHECK) to call
a lightweight liveness endpoint (e.g. /api/health/liveness or /api/health/live)
that only verifies the app process is responsive, keep ${PORT:-3000} as-is, and
leave the deep /api/health as the readiness probe used by orchestrator so
transient DB/Redis blips don't mark the container unhealthy.
tests/api/health.test.ts (1)

15-19: Mock shape diverges from production queryClient invocation.

src/api/routes/health.ts calls queryClient`SELECT 1` as a **tagged template** (postgres.js's query builder), whereas the stub here is a plain sinon.stub() invoked as dbStub(strings, ...values). The test still passes because sinon ignores args and resolves, but it doesn't actually verify the production call shape — if someone refactors the probe to e.g. queryClient.unsafe("SELECT 1") or to go through db, these tests will continue to pass while the endpoint breaks.

Optional hardening: assert dbStub.calledOnce and/or that the first arg is an array with a raw property (tagged-template signature), to pin the contract.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/api/health.test.ts` around lines 15 - 19, The test's dbStub must mimic
the tagged-template shape used by queryClient in src/api/routes/health.ts:
replace the plain sinon.stub() with a stub that accepts (strings, ...values) as
a tagged-template handler (or use callsFake to return the resolved row when
invoked with that signature) and add assertions that dbStub was called once and
that the first argument is an array with a raw property (e.g.,
Array.isArray(args[0]) && 'raw' in args[0]) to pin the contract; keep
redisPingStub behavior but ensure you reference dbStub and the queryClient
tagged-template invocation when updating the test.
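A stub matching the tagged-template shape discussed in these comments could look like the following sketch (sinon-free; a plain recording function stands in for the stub):

```typescript
// Sketch of a queryClient stub with postgres.js's tagged-template call shape.
// A tag function receives the template strings array (which carries a .raw
// property) plus interpolated values; Object.assign adds members like end()
// that production teardown code may touch.

type Row = Record<string, unknown>;

const calls: { strings: TemplateStringsArray; values: unknown[] }[] = [];

function makeQueryClientStub(rows: Row[] = []) {
  const stub = (strings: TemplateStringsArray, ...values: unknown[]): Promise<Row[]> => {
    calls.push({ strings, values });
    return Promise.resolve(rows);
  };
  return Object.assign(stub, { end: () => Promise.resolve() });
}

const queryClient = makeQueryClientStub([]);
const result = await queryClient`SELECT 1`;

console.log(result.length, calls.length, Array.isArray(calls[0].strings.raw));
// 0 1 true
```

Because the stub records the template strings array, tests can assert the tagged-template contract (first argument is an array with a raw property) and pin the query text itself.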
src/database/migrate.ts (1)

28-38: Consider a bounded wait on the advisory lock.

pg_advisory_lock blocks indefinitely. If a previous migration container is wedged (crash-looping, stuck on a long DDL, or a leaked connection holding the lock), every subsequent docker-entrypoint.sh startup will hang here with no diagnostic. For a rolling-deploy/replica scenario this can turn a single stuck pod into a cluster-wide startup stall.

Two common mitigations:

  • Use pg_try_advisory_lock in a retry loop with a max-attempts/total-timeout and clear log output when waiting.
  • Or set a session lock_timeout / statement_timeout so the blocked acquire fails fast and the entrypoint exits non-zero (letting the orchestrator surface it).
♻️ Sketch — try-lock with backoff
-try {
-  await db.execute(sql`SELECT pg_advisory_lock(${LOCK_KEY})`);
-  await migrate(db, { migrationsFolder });
-} finally {
+const LOCK_WAIT_MS = 60_000;
+const start = Date.now();
+let acquired = false;
+try {
+  while (Date.now() - start < LOCK_WAIT_MS) {
+    const [row] = await db.execute<{ locked: boolean }>(
+      sql`SELECT pg_try_advisory_lock(${LOCK_KEY}) AS locked`
+    );
+    if (row?.locked) { acquired = true; break; }
+    console.log("Waiting for migration advisory lock...");
+    await new Promise((r) => setTimeout(r, 2000));
+  }
+  if (!acquired) {
+    throw new Error(`Timed out waiting for migration advisory lock after ${LOCK_WAIT_MS}ms`);
+  }
+  await migrate(db, { migrationsFolder });
+} finally {
+  if (acquired) {
     try {
       await db.execute(sql`SELECT pg_advisory_unlock(${LOCK_KEY})`);
     } catch {
       // unlock failure is non-fatal — connection close releases it
     }
+  }
   await sqlClient.end();
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/database/migrate.ts` around lines 28 - 38, The advisory lock call using
pg_advisory_lock(${LOCK_KEY}) can block forever; change the acquire logic in
migrate.ts to use a bounded wait: replace the direct db.execute(sql`SELECT
pg_advisory_lock(${LOCK_KEY})`) with a retry loop that uses
db.execute(sql`SELECT pg_try_advisory_lock(${LOCK_KEY})`) and retries with
backoff up to a max attempts/total timeout (logging each wait), or alternatively
set a session timeout (e.g., execute SET lock_timeout/statement_timeout) before
attempting the lock so a blocked acquire fails fast; ensure the unlock/cleanup
in the finally block still runs and that sqlClient.end() is awaited as before.
src/utils/suggestionHelpers.ts (1)

13-40: Helper assumes the interaction has not been replied/deferred.

fetchPendingSuggestion unconditionally calls interaction.reply(...) on the not-found / already-reviewed paths. If a future caller ever defers the interaction first (e.g., for a slow approval flow), this will throw InteractionAlreadyReplied. Low risk given current callers (approve.ts, reject.ts) reply immediately, but worth either a JSDoc note on the precondition or a interaction.replied || interaction.deferred branch using followUp/editReply.

status.toLowerCase() is safe — status is a notNull enum column in the schema.
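The replied/deferred branch could look like the sketch below. `RepliableLike` and `replyOrFollowUp` are illustrative stand-ins, not discord.js or project APIs:

```typescript
// Hypothetical sketch of the precondition-aware reply path.
interface RepliableLike {
  replied: boolean;
  deferred: boolean;
  reply(payload: unknown): Promise<unknown>;
  followUp(payload: unknown): Promise<unknown>;
}

async function replyOrFollowUp(interaction: RepliableLike, payload: unknown) {
  // reply() throws InteractionAlreadyReplied once the interaction has been
  // acknowledged, so route acknowledged interactions through followUp().
  if (interaction.replied || interaction.deferred) {
    return interaction.followUp(payload);
  }
  return interaction.reply(payload);
}
```

Either this branch or a JSDoc note stating the "not yet replied/deferred" precondition would protect future callers.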

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/suggestionHelpers.ts` around lines 13 - 40, fetchPendingSuggestion
currently always calls interaction.reply on the not-found and already-reviewed
branches which will throw if the interaction was previously deferred/replied;
update fetchPendingSuggestion to check interaction.replied ||
interaction.deferred and, when true, use interaction.followUp for
not-found/already-reviewed messages (or interaction.editReply if you prefer to
replace a deferred placeholder), otherwise call interaction.reply as now;
reference the fetchPendingSuggestion function and the two reply sites handling
"Suggestion with ID ..." and "This suggestion has already been ..." to apply the
conditional branching.
tests/utils/commandErrors.test.ts (1)

17-50: LGTM — covers the main happy/sad paths.

One optional gap: there's no case exercising safeErrorReply's no-op branch (handler throws after the interaction has already been replied/deferred). That path is the most likely way the wrapper can hide user-facing errors, so a small test asserting interaction.reply is not called when interaction.replied = true would round out coverage.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/utils/commandErrors.test.ts` around lines 17 - 50, Add a test
exercising safeErrorReply's no-op branch by using the existing load() and
mockInteraction(), set interaction.replied = true (or mark it as
deferred/replied) before invoking mod.withCommandLogging("cmd", interaction as
never, ...) where the handler throws; then assert logger.commands.error was
called (e.g., calledOnce) and that interaction.reply was NOT called (called ===
false), so you cover the scenario where the wrapper should not attempt a reply
when interaction.replied is true.
tests/utils/env.test.ts (1)

86-116: LGTM — good coverage for the new snowflake/pool validation.

Nice to see the stricter snowflake regex covered for both single IDs and ALLOWED_USERS list entries, plus the pool-max default. One small optional addition: the PR also introduces WORKER_CONCURRENCY in the env schema, which doesn't have an analogous default-value assertion here — might be worth a one-liner for symmetry with the DATABASE_POOL_MAX test.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/utils/env.test.ts` around lines 86 - 116, Add a test mirroring the
DATABASE_POOL_MAX default check: call envSchema.safeParse(validEnv()) and assert
result.success is true, then assert result.data.WORKER_CONCURRENCY equals the
default value defined in the env schema; locate the test near the existing
"should default DATABASE_POOL_MAX to 10 when unset" case and use the same
pattern (envSchema, validEnv) so the new test verifies the WORKER_CONCURRENCY
default.
src/utils/commandErrors.ts (2)

46-71: Handler success/failure is observable only via logs — no return value.

withCommandLogging returns Promise<void> even when the wrapped handler throws (errors are swallowed after logging + reply). Callers downstream can't distinguish "handled failure" from "success," which is fine today since nobody chains off the return, but it also means a handler that returns a value can't bubble it up. Since this is a new centralized wrapper that's applied broadly, consider returning a discriminated result (or at least propagating an "ok" boolean) now to keep future callers flexible.

Non-blocking; flag and move on if you'd rather keep the surface minimal.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/commandErrors.ts` around lines 46 - 71, withCommandLogging
currently swallows handler results and always returns Promise<void>, so callers
can't tell success vs handled failure or receive a returned value; change
withCommandLogging to be generic (e.g., withCommandLogging<T>) and return a
discriminated result (e.g., { ok: true, value: T } | { ok: false, error: unknown
}) instead of void, invoke the provided handler() and on success return { ok:
true, value } (capturing any returned value), and on catch keep calling
logCommandError(commandName, interaction, err) and safeErrorReply(interaction,
errorMessage) but return { ok: false, error: err } so downstream callers can
observe outcome; update callers accordingly to handle the new return shape.

67-70: Secondary failure in safeErrorReply will surface as an unhandled rejection.

If interaction.reply(...) inside safeErrorReply throws (network hiccup, unknown interaction, etc.), the rejection propagates out of withCommandLogging — which is now the top-level boundary for most command handlers. Since the error-logging already happened, it's usually better to catch that reply failure and log it rather than re-raise.

♻️ Suggested fix
   } catch (err) {
     logCommandError(commandName, interaction, err);
-    await safeErrorReply(interaction, errorMessage);
+    try {
+      await safeErrorReply(interaction, errorMessage);
+    } catch (replyErr) {
+      logger.commands.error(
+        commandName,
+        interaction.user.username,
+        interaction.user.id,
+        replyErr,
+        interaction.guildId ?? undefined
+      );
+    }
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/commandErrors.ts` around lines 67 - 70, The catch block in
withCommandLogging currently calls logCommandError(...) then awaits
safeErrorReply(...), but if safeErrorReply(...) rejects that secondary failure
becomes an unhandled rejection; wrap the safeErrorReply(interaction,
errorMessage) call in its own try/catch and, on error, log that secondary
failure (use logCommandError with the same commandName and interaction or a
dedicated logger) instead of rethrowing so reply failures are recorded but do
not propagate past withCommandLogging.
src/events/interactionCreate.ts (1)

34-35: Redundant commandName guard.

On CommandInteraction, commandName is a required non-empty string. The if (!commandName) {return;} guard is dead code; consider removing it to keep the flow clear.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/events/interactionCreate.ts` around lines 34 - 35, Remove the redundant
guard that checks commandName in the interactionCreate handler: because
interaction is a CommandInteraction and commandName is required, delete the
lines "const { commandName } = interaction;" and "if (!commandName) {return;}"
or at minimum remove the if-check and continue using interaction.commandName
directly; update any downstream references to use interaction.commandName (or
keep destructuring without the dead check) inside the interactionCreate function
to simplify the control flow.
src/utils/quoteHelpers.ts (1)

13-20: ORDER BY random() scales poorly.

ORDER BY random() LIMIT 1 sorts the entire MotivationQuote table on each call. Fine today, but if the table grows this becomes an O(N log N) hot-path invocation per /quote and per worker send. Consider an offset-by-count approach (one count query + one OFFSET n LIMIT 1) or PostgreSQL TABLESAMPLE once the table grows. Flagging as optional given current scale.
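The count + OFFSET alternative can be sketched as below. `QuoteDb` is a hypothetical stand-in interface, not the project's drizzle client:

```typescript
// Hypothetical two-query random pick: COUNT(*) then OFFSET n LIMIT 1,
// avoiding the full-table sort of ORDER BY random().
interface QuoteDb {
  count(): Promise<number>;                              // SELECT count(*)
  selectAt(offset: number): Promise<string | undefined>; // OFFSET n LIMIT 1
}

async function randomQuote(db: QuoteDb): Promise<string | null> {
  const total = await db.count();   // one cheap count query
  if (total === 0) return null;     // keep the empty-table guard
  const n = Math.floor(Math.random() * total);
  // A row deleted between count and select yields undefined; treat as empty.
  return (await db.selectAt(n)) ?? null;
}
```

The race between count and select only ever surfaces as a null result, which callers already handle for the empty-table case.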

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/quoteHelpers.ts` around lines 13 - 20, getRandomMotivationQuote
currently uses ORDER BY random() which sorts the whole table; replace it with a
two-step approach inside getRandomMotivationQuote: first run a COUNT query
against motivationQuotes to get totalRows, then pick a random index n =
floor(Math.random() * totalRows) and fetch the single row with
.limit(1).offset(n) (or use PostgreSQL TABLESAMPLE when enabled) to avoid
full-table sorting; ensure you handle totalRows === 0 and return null, and
reference the function name getRandomMotivationQuote and the motivationQuotes
table when making the change.
src/worker/jobs/setActivityCore.ts (1)

22-77: Optional: tighten return shape and avoid the synthetic DB row.

Two minor cleanups:

  1. setActivityCore has no declared return type and returns true / undefined / the result of _logger.warn(...) depending on the branch. Declaring Promise<void> and dropping the return true / return _logger.warn(...) values makes the contract clearer for the caller.
  2. Pushing a synthetic { id: "default", ..., createdAt: new Date() } into the DB-result array mixes real DiscordActivity rows with a fake one whose id isn’t a UUID. It works today but will bite later (e.g., if anything downstream keys off id). A small union or a separate "effective list" avoids that:
♻️ Sketch
-export async function setActivityCore(client: Client, { db: _db, env: _env, logger: _logger }: SetActivityDeps) {
+export async function setActivityCore(
+  client: Client,
+  { db: _db, env: _env, logger: _logger }: SetActivityDeps,
+): Promise<void> {
   try {
     ...
     if (!client.user) {
-      return _logger.warn("Worker", "Client user is not defined, cannot set activity");
+      _logger.warn("Worker", "Client user is not defined, cannot set activity");
+      return;
     }
     ...
-    activities.push({
-      id: "default",
-      activity: defaultActivity,
-      type: defaultActivityType,
-      url: defaultActivityUrl ? defaultActivityUrl : null,
-      createdAt: new Date(),
-    });
-
-    const randomIndex = Math.floor(Math.random() * activities.length);
-    const activity = activities[randomIndex];
+    const pool: Array<{ activity: string; type: string; url: string | null }> = [
+      ...activities,
+      { activity: defaultActivity, type: defaultActivityType, url: defaultActivityUrl ?? null },
+    ];
+    const activity = pool[Math.floor(Math.random() * pool.length)];
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/worker/jobs/setActivityCore.ts` around lines 22 - 77, setActivityCore
currently returns inconsistent shapes and mutates the DB result by pushing a
synthetic row; change the function signature to return Promise<void>, stop
returning values from branches (call _logger.warn/_logger.success and then
return), and avoid inserting a fake DB row into activities — instead build a
separate effectiveActivities array (e.g., const effective = [...activities,
defaultActivityObj]) or pick the default as a fallback variable, then choose a
random element from that effective list; keep usage of getActivityType and
client.user.setActivity as-is and ensure the catch still logs via _logger.error.
src/database/schema.ts (1)

6-52: Enum + index additions LGTM.

suggestionStatusEnum properly types the status column, and the two btree indexes target known hot paths (motivationChannelId for the guild lookup in the send-motivation worker, status for the Pending-suggestion scans used by fetchPendingSuggestion and the stats queries).

One tiny optional refinement: you can derive SuggestionStatus from the enum to keep the values in a single place:

♻️ Optional
-export type SuggestionStatus = "Pending" | "Approved" | "Rejected";
+export type SuggestionStatus = (typeof suggestionStatusEnum.enumValues)[number];
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/database/schema.ts` around lines 6 - 52, Add a derived TypeScript type
for the enum so the allowed status values are sourced from suggestionStatusEnum
rather than duplicated strings: export a type alias named SuggestionStatus that
is inferred from suggestionStatusEnum (so any references to SuggestionStatus in
code use this alias), leaving the existing suggestionStatusEnum,
suggestionQuotes.status column, and index unchanged.
src/worker/index.ts (1)

14-34: Consider migrating to upsertJobScheduler for idempotent scheduler management.

BullMQ v5.16.0+ recommends Queue.upsertJobScheduler() over the remove-then-re-add pattern. It is idempotent by scheduler ID, so you don't enumerate and delete prior repeatables, and it avoids resetting the next-run anchor on every startup (your current code defers the next run by one full interval each deploy — a 10-minute activity interval means the first rotation after a deploy is ~10 minutes out).

♻️ Suggested migration
await queue.upsertJobScheduler(
  name,
  { every: intervalMs },
  { name, data, opts: { removeOnComplete: { count: 50 }, removeOnFail: { count: 100 } } }
);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/worker/index.ts` around lines 14 - 34, The ensureRepeatable function
currently removes all repeatables then re-adds a job which is non-idempotent and
resets the next-run anchor; replace that logic by calling
Queue.upsertJobScheduler instead of enumerating/removing and queue.add. Use the
job scheduler id (reuse the existing name) and pass the schedule ({ every:
intervalMs }) and the job payload/opts (data and removeOnComplete/removeOnFail
counts) so the scheduler is updated idempotently; update the ensureRepeatable
implementation to call queue.upsertJobScheduler(name, { every: intervalMs }, {
name, data, opts: { removeOnComplete: { count: 50 }, removeOnFail: { count: 100
} } }).
src/app.ts (1)

94-99: Prefer redis.quit() for graceful shutdown.

disconnect() tears the socket down immediately, potentially dropping in-flight commands (including BullMQ writes that may be mid-flush). quit() sends a QUIT to the server, waits for the acknowledgement, and then closes — which is what a graceful-shutdown path typically wants. This pattern is already used for PostgreSQL in the preceding lines. You already tolerate failures with a try/catch, so swapping in quit() is low-risk.

♻️ Suggested change
   try {
-    redis.disconnect();
+    await redis.quit();
     logger.info("App", "Redis disconnected");
   } catch (err) {
     logger.warn("App", "Error disconnecting Redis", { error: err });
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app.ts` around lines 94 - 99, Replace the immediate socket teardown with
a graceful Redis shutdown by calling redis.quit() instead of redis.disconnect()
in the teardown block (the try/catch that currently calls redis.disconnect());
keep the existing logger messages and error handling but update the success log
text if desired (e.g., "Redis quit" or keep "Redis disconnected") and ensure the
catch continues to log the caught error via logger.warn with the error object.
tests/worker/sendMotivation.test.ts (2)

77-94: Optional: add an ordering assertion for claim-before-send.

The headline contract of this PR is that db.update (the atomic claim) must run before channel.send for each guild. A call-ordering assertion would guard against a future refactor accidentally reintroducing send-before-claim:

Suggested addition
     expect(db.update.calledOnce).toBe(true);
     expect(sendStub.calledOnce).toBe(true);
+    expect(sendStub.calledAfter(db.update)).toBe(true);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/worker/sendMotivation.test.ts` around lines 77 - 94, Add an ordering
assertion to ensure the atomic claim runs before the send: after invoking
sendMotivation and before final expectations, assert that the db.update call
(the claim) occurred before the channel.send call (sendStub) — e.g., use
sinon.assert.callOrder(db.update, sendStub) or
expect(db.update.calledBefore(sendStub)). This ties the test to the unique
symbols sendMotivation, db.update and sendStub (channel.send) so future
refactors can't reorder claim and send.

130-149: Consider tightening the per-guild isolation assertion.

This test stubs db.update to always return a winning claim, and stubs send to fail then succeed. It only asserts logger.error.called, which would also pass if both calls failed. A stronger assertion would verify exactly one failure and one success were recorded (e.g. logger.error.calledOnce and some indication of the sent=1 count), which is the actual behavior under test.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/worker/sendMotivation.test.ts` around lines 130 - 149, The test should
assert per-guild isolation more strictly: after calling sendMotivation, replace
the loose expect(logger.error.called).toBe(true) with assertions that exactly
one send failure was logged (e.g. logger.error.calledOnce) and that the success
path recorded a single sent count (e.g. logger.info or whichever logging call
records "sent=1" was called with the expected sent=1 indicator); also ensure
db.update/send stubs remain as-is to exercise the first-fail/second-succeed
scenario. Target symbols: sendMotivation, logger, db.update, and the
channel.send stub to locate the assertions to change.
drizzle/0000_loud_mother_askani.sql (1)

1-48: Migration looks consistent with src/database/schema.ts.

Enums, tables, and the two btree indexes (guild_motivation_channel_idx, suggestion_status_idx) match the schema. The claim‑update path in sendMotivation.ts filters by guilds.id (primary key), so no additional index is needed to support it.

Note: updatedAt is only defaulted on insert here; row updates rely on Drizzle's $onUpdate at the application layer rather than a DB trigger. That's consistent with the rest of the project but something to keep in mind if raw SQL writes are ever introduced.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@drizzle/0000_loud_mother_askani.sql` around lines 1 - 48, Migration SQL
matches src/database/schema.ts: enums
DiscordActivityType/MotivationFrequency/SuggestionStatus, tables
DiscordActivity/Guild/MotivationQuote/SuggestionQuote, and indexes
guild_motivation_channel_idx/suggestion_status_idx are correct; no functional
changes required—approve as-is and keep in mind updatedAt is application-managed
via Drizzle $onUpdate (no DB trigger) if raw SQL writes are added later.
src/utils/envSchema.ts (1)

58-64: ALLOWED_USERS validates a trimmed form but doesn't normalize the parsed value.

The refine validates by trimming entries (e.g., "123…, 456…" with space passes), but returns the raw string as-is — consumers must manually .split(",").map(s => s.trim()) or risk ID mismatches.

The sole consumer, src/utils/permissions.ts, already does this (line 10: env.ALLOWED_USERS?.trim()), and tests explicitly verify whitespace handling works. However, requiring every consumer to repeat this logic is fragile.

Consider using .transform() to normalize to string[] so validation and parsing align:

Proposed refactor
   ALLOWED_USERS: z
     .string()
     .optional()
-    .refine(
-      (v) => !v || SNOWFLAKE_LIST.test(v.split(",").map((s) => s.trim()).join(",")),
-      "ALLOWED_USERS must be a comma-separated list of Discord snowflakes"
-    ),
+    .transform((v) => (v ? v.split(",").map((s) => s.trim()).filter(Boolean) : []))
+    .refine(
+      (arr) => arr.every((id) => SNOWFLAKE.test(id)),
+      "ALLOWED_USERS must be a comma-separated list of Discord snowflakes"
+    ),

This is a breaking type change (string → string[]), so src/utils/permissions.ts and call sites need updating.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/envSchema.ts` around lines 58 - 64, ALLOWED_USERS currently
validates trimmed values but leaves the raw string intact; change its schema to
use .transform() to normalize and output a string[] (split by "," and trim each
entry) so validation and parsed value align, update the consumer permissions
logic in src/utils/permissions.ts (which references env.ALLOWED_USERS and
currently calls .trim()) to accept the new string[] shape and update any call
sites/tests accordingly to handle arrays instead of a comma-separated string.
src/worker/jobs/sendMotivation.ts (1)

21-31: Use startOf("isoWeek") instead of startOf("week") for deterministic week boundaries.

startOf("week") defaults to the locale's week start (Sunday in 'en' locale), making the delivery period non-deterministic. For consistent Monday-start ISO 8601 boundaries, import and extend the isoWeek plugin:

Required changes
 import dayjs from "dayjs";
 import utc from "dayjs/plugin/utc.js";
 import timezone from "dayjs/plugin/timezone.js";
+import isoWeek from "dayjs/plugin/isoWeek.js";
 import type { Client } from "discord.js";
 dayjs.extend(utc);
 dayjs.extend(timezone);
+dayjs.extend(isoWeek);
     case "Weekly":
-      return now.startOf("week").utc().toDate();
+      return now.startOf("isoWeek").utc().toDate();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/worker/jobs/sendMotivation.ts` around lines 21 - 31, The weekly boundary
is non-deterministic because periodStart uses dayjs().startOf("week"); update
the file to import the isoWeek plugin from dayjs (e.g., dayjs/plugin/isoWeek)
and call dayjs.extend(isoWeek) once, then change the Weekly branch in
periodStart(guild: Guild) to use startOf("isoWeek") so weeks use ISO
Monday-based boundaries (leave Daily and Monthly branches unchanged and keep
referencing guild.motivationFrequency).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 12b09d08-0e33-40b7-bb22-04bc8c8664dc

📥 Commits

Reviewing files that changed from the base of the PR and between b890fbf and 7b4ebc4.

📒 Files selected for processing (74)
  • .env.example
  • Dockerfile
  • drizzle/0000_loud_mother_askani.sql
  • src/api/routes/health.ts
  • src/app.ts
  • src/bot.ts
  • src/commands/admin/activity/list.ts
  • src/commands/admin/quote/list.ts
  • src/commands/admin/suggestion/approve.ts
  • src/commands/admin/suggestion/list.ts
  • src/commands/admin/suggestion/reject.ts
  • src/commands/admin/suggestion/stats.ts
  • src/commands/owner/premium/testCreate.ts
  • src/commands/owner/premium/testDelete.ts
  • src/commands/owner/premium/testList.ts
  • src/commands/premium.ts
  • src/commands/quote.ts
  • src/commands/setup/schedule.ts
  • src/database/index.ts
  • src/database/migrate.ts
  • src/database/schema.ts
  • src/events/interactionCreate.ts
  • src/events/shardDisconnect.ts
  • src/utils/commandErrors.ts
  • src/utils/envSchema.ts
  • src/utils/ownerGuard.ts
  • src/utils/permissions.ts
  • src/utils/premium.ts
  • src/utils/quoteHelpers.ts
  • src/utils/replyHelpers.ts
  • src/utils/scheduleEvaluator.ts
  • src/utils/suggestionHelpers.ts
  • src/utils/trimArray.ts
  • src/worker/index.ts
  • src/worker/jobs/sendMotivation.ts
  • src/worker/jobs/setActivity.ts
  • src/worker/jobs/setActivityCore.ts
  • tests/api/health.test.ts
  • tests/commands/admin/activity/create.test.ts
  • tests/commands/admin/activity/list.test.ts
  • tests/commands/admin/activity/remove.test.ts
  • tests/commands/admin/quote/create.test.ts
  • tests/commands/admin/quote/list.test.ts
  • tests/commands/admin/quote/remove.test.ts
  • tests/commands/admin/suggestion/approve.test.ts
  • tests/commands/admin/suggestion/list.test.ts
  • tests/commands/admin/suggestion/reject.test.ts
  • tests/commands/admin/suggestion/stats.test.ts
  • tests/commands/owner/testCreate.test.ts
  • tests/commands/premium.test.ts
  • tests/commands/quote.test.ts
  • tests/commands/setup/channel.test.ts
  • tests/commands/setup/schedule.test.ts
  • tests/commands/suggestion.test.ts
  • tests/events/entitlementCreate.test.ts
  • tests/events/entitlementDelete.test.ts
  • tests/events/entitlementUpdate.test.ts
  • tests/events/guildCreate.test.ts
  • tests/events/guildDelete.test.ts
  • tests/events/interactionCreate.test.ts
  • tests/events/ready.test.ts
  • tests/events/shardDisconnect.test.ts
  • tests/helpers.ts
  • tests/utils/commandErrors.test.ts
  • tests/utils/env.test.ts
  • tests/utils/guildDatabase.test.ts
  • tests/utils/quoteHelpers.test.ts
  • tests/utils/replyHelpers.test.ts
  • tests/utils/scheduleEvaluator.parseHourMinute.test.ts
  • tests/utils/suggestionHelpers.test.ts
  • tests/utils/trimArray.test.ts
  • tests/worker/index.test.ts
  • tests/worker/sendMotivation.test.ts
  • tests/worker/setActivity.test.ts
💤 Files with no reviewable changes (2)
  • src/utils/trimArray.ts
  • tests/utils/trimArray.test.ts

Comment thread src/api/routes/health.ts
Comment on lines +10 to +24
function withTimeout<T>(promise: PromiseLike<T>, ms: number, label: string): Promise<T> {
return new Promise<T>((resolve, reject) => {
const timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
Promise.resolve(promise).then(
(value) => {
clearTimeout(timer);
resolve(value);
},
(err) => {
clearTimeout(timer);
reject(err);
}
);
});
}

⚠️ Potential issue | 🟡 Minor

withTimeout does not cancel the underlying probe.

On timeout the outer promise rejects, but the `` queryClient`SELECT 1` `` query and `redisClient.ping()` keep running in the background, still holding a pool connection until the driver itself gives up (`connect_timeout: 10` on Postgres). Under a real DB stall, a burst of health calls can pile up pending queries and make the bad state worse rather than surface it quickly.

Not a blocker given health checks are typically low-QPS, but worth being aware of — especially if this endpoint is wired up to a liveness probe that runs often. For postgres-js you can call .cancel() on the pending query, and for ioredis you can send ping() on a dedicated connection with a short command timeout.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/routes/health.ts` around lines 10 - 24, The withTimeout wrapper
currently only rejects the outer promise but does not cancel the underlying
probe, so long-running queries keep holding resources; update withTimeout to
accept or produce a cancellation hook and use it on timeout (e.g., when wrapping
queryClient `query()` capture the returned query object and call its `.cancel()`
on timeout, and when wrapping `redisClient.ping()` use a dedicated Redis
connection or a command timeout and explicitly close or abort that command on
timeout). Locate the withTimeout function and change its signature to allow
passing in a cancel callback or the cancellable object, ensure the timeout
handler invokes that cancel (calling the postgres-js `.cancel()` on the pending
query and closing/aborting the dedicated ioredis command/connection), and make
sure to clear the timeout when the underlying operation completes or errors.

Comment thread src/api/routes/health.ts
Comment thread src/bot.ts
Comment thread src/commands/admin/suggestion/list.ts Outdated
Comment thread src/commands/admin/suggestion/reject.ts Outdated
Comment thread src/commands/owner/premium/testList.ts
Comment thread src/events/interactionCreate.ts Outdated
Comment thread src/utils/quoteHelpers.ts
Comment thread src/utils/replyHelpers.ts
bun:test's mock.module registry is process-global. When
interactionCreate.test.ts or ready.test.ts top-level-mocked the ten
individual command modules, those mocks leaked into each command's own
test file and the stub-`execute` (or missing `execute`) caused 50
assertion failures on CI Linux (which walks tests/events/ before
tests/commands/).

- Extract command dispatch table to src/events/commandRegistry.ts.
  interactionCreate.ts and ready.ts now import from this single module
  instead of ten command modules.
- interactionCreate.test.ts and ready.test.ts now mock only the
  registry. Command test files are no longer poisoned.
- health.ts: gate raw probe error details behind NODE_ENV !== production
  and log failures server-side via logger.error("API", ...). Note added
  about withTimeout not cancelling the underlying probe — acceptable for
  low-QPS probe cadence.
- bot.ts: gate startWorker(queue) on Events.ClientReady so BullMQ cannot
  dequeue jobs before the Discord client is authenticated.
- admin/suggestion/list.ts: reject invalid status values up front with a
  clear ephemeral error; base emptyMessage on validStatus so users never
  get a misleading "filtered" response for an ignored filter.
- admin/suggestion/reject.ts: fold the Pending check into the UPDATE
  (WHERE id=? AND status='Pending' RETURNING id) and short-circuit when
  no rows were affected so concurrent rejects can't double-post the
  embed + DM + reply.
- owner/premium/testList.ts: reserve OVERFLOW_RESERVE=32 chars in the
  truncation budget so "\n...and N more" cannot push the reply past
  Discord's 2000-char cap.
- events/interactionCreate.ts: narrow with interaction.isCommand() before
  reply/followUp so autocomplete errors don't hit the inner catch as
  "Failed to send error response to user".
- utils/quoteHelpers.ts: log resolveQuoteAuthor failures via logger.warn
  instead of silent catch.
- utils/replyHelpers.ts: empty-state branch now respects the `ephemeral`
  option, matching the populated-list branch.

Tests: new reject.test.ts case covers the zero-rows TOCTOU short-circuit.

Two more sources of bun:test cross-file mock leakage surfaced on the
Ubuntu CI runner that macOS's test discovery order happened to avoid:

- ready.test.ts top-level-mocked guildDatabase.js, poisoning
  utils/guildDatabase.test.ts. Fix: introduce src/events/readyDeps.ts
  (thin re-export shim for pruneGuilds/ensureGuildExists/setActivity);
  ready.ts imports from it; ready.test.ts mocks only the shim.
- reject.test.ts top-level-mocked suggestionHelpers.js, poisoning
  admin/suggestion/approve.test.ts. Fix: rewrite reject.test.ts to
  drive fetchPendingSuggestion via db.select mocks (same pattern
  approve.test.ts already uses), so suggestionHelpers.js stays real
  for every consumer.
@nathanialhenniges nathanialhenniges merged commit 8ef1f21 into main Apr 17, 2026
5 checks passed