Skip to content

feat : use private bucket for vizualizations#24

Open
surajmn1 wants to merge 4 commits intomainfrom
feat/use-private-bucket-for-viz
Open

feat : use private bucket for vizualizations#24
surajmn1 wants to merge 4 commits intomainfrom
feat/use-private-bucket-for-viz

Conversation

@surajmn1
Copy link
Copy Markdown
Contributor

@surajmn1 surajmn1 commented Nov 20, 2025

@shreeharsha-factly new envs added for web
S3_ENDPOINT=http://localhost:9000
S3_REGION=us-east-1
S3_BUCKET=gopie
S3_ACCESS_KEY=minioadmin
S3_SECRET_KEY=minioadmin

Summary by CodeRabbit

  • New Features

    • S3 integration with signed URL generation for secure visualization and asset access.
    • Server-side proxy now returns signed visualization paths for safe client consumption.
  • Bug Fixes

    • Visualization path display strips query parameters.
    • Visualization titles render correctly for both plain and structured formats.
  • Chores

    • Storage buckets set to private and a downloads bucket added.
    • New S3-related environment variables and AWS S3 SDK dependencies added.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Nov 20, 2025

Walkthrough

Adds S3/MinIO configuration and bucket changes, introduces AWS SDK deps and a signer (signUrl) in the web app, adds a Next.js GET proxy route that forwards messages from the upstream GOPIE API and replaces visualization/file paths with signed URLs, adjusts API client usage, and widens Vega title typing and rendering.

Changes

Cohort / File(s) Summary
Configuration & Compose
config-noauth.env.example, docker-compose-noauth.yaml, docker-compose.yaml
Added S3 env vars (S3_ENDPOINT, S3_REGION, S3_BUCKET, S3_ACCESS_KEY, S3_SECRET_KEY). Created downloads bucket and changed MinIO bucket access to private for gopie, gopie/visualizations, and downloads. Updated compose create-buckets commands and startup messages.
Dependencies
web/package.json
Added AWS SDK packages: @aws-sdk/client-s3 and @aws-sdk/s3-request-presigner.
S3 signer & API client
web/src/lib/s3/signer.ts, web/src/lib/api-client.ts
New S3 signer module exporting signUrl(rawPath: string): Promise<string> using an S3Client configured for MinIO (path-style) and 24h expiry. Added nextApiClient export mirroring apiClient for Next.js route usage.
API routes & message proxying
web/src/app/api/chat/[chatId]/messages/route.ts, web/src/app/api/chat/route.ts, web/src/lib/queries/chat/get-messages.ts
Added GET proxy route that forwards requests to GOPIE_API_URL/v1/api/chat/{chatId}/messages, parses response messages and tool_calls, and signs visualization/file paths inside tool_call arguments. Updated chat route to use signUrl for visualization paths and switched client usage to nextApiClient in get-messages.
UI & types
web/src/components/chat/visualization-results.tsx, web/src/components/chat/message.tsx
Broadened VegaSpec.title to `string

Sequence Diagram(s)

sequenceDiagram
    participant Browser
    participant NextApi as Next.js API\n(messages route)
    participant Upstream as GOPIE API\n(upstream)
    participant S3 as S3 / MinIO

    Browser->>NextApi: GET /api/chat/[chatId]/messages
    NextApi->>Upstream: Forward request (preserve headers)
    Upstream-->>NextApi: 200 + messages (+ tool_calls)

    rect rgba(76,175,80,0.12)
    Note over NextApi,S3: Sign visualization/file paths (new)
    NextApi->>NextApi: Extract tool_calls -> parse args
    NextApi->>S3: getSignedUrl(GetObjectCommand) (24h)
    S3-->>NextApi: Signed URLs
    NextApi->>NextApi: Inject signed URLs back into tool_calls args
    end

    NextApi-->>Browser: Return messages with signed URLs
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Areas needing extra attention:
    • web/src/app/api/chat/[chatId]/messages/route.ts — nested argument parsing, mutation, header preservation, and upstream error handling.
    • web/src/lib/s3/signer.ts — path normalization, bucket/key parsing, S3Client path-style config, and signed URL expiry handling.
    • Docker Compose scripts (docker-compose*.yaml) — bucket creation ordering and privacy flags across variants.
    • web/src/lib/api-client.ts / web/src/lib/queries/chat/get-messages.ts — ensure nextApiClient mirrors auth behavior and request paths are correct.

Poem

🐇 I nibble paths and sign each one,
24 hours of access in the sun,
Buckets closed, secrets snug and tight,
Visuals gleam in a signed-URL light,
Hop, deploy — everything's a delight! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 14.29% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly addresses the main objective of the PR: converting from public to private S3 buckets for visualizations, which is reflected throughout the changeset (Docker Compose changes, environment variables, S3 client configuration).
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/use-private-bucket-for-viz

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude
Copy link
Copy Markdown

claude Bot commented Nov 20, 2025

Claude finished @surajmn1's task —— View job


Code Review Complete ✅

This PR implements private bucket functionality for visualizations by adding S3 pre-signed URLs. Here's my comprehensive review:

Review Checklist

  • Read and understand all changed files
  • Analyze code quality and best practices
  • Check for potential bugs or issues
  • Review security considerations
  • Assess performance implications
  • Evaluate test coverage
  • Provide comprehensive feedback

Strengths

Security

  • ✅ Proper use of pre-signed URLs for private S3 access
  • ✅ Environment variables properly configured in Docker
  • ✅ Credentials safely passed through env vars
  • ✅ 24-hour expiration on signed URLs (reasonable default)

Code Quality

  • ✅ Clean separation of concerns with dedicated s3/signer.ts
  • ✅ Consistent error handling throughout
  • ✅ Good use of URL normalization in normalizePath()
  • ✅ Proper bucket configuration in Docker Compose

Architecture

  • ✅ Smart design: signing happens at API layer, not client-side
  • ✅ Backwards compatibility maintained
  • ✅ Clean integration with existing visualization flow

⚠️ Issues Found

🔴 Critical Issues

  1. Hardcoded Credentials in Production Docker
    docker-compose.yaml:172-173

    MINIO_ACCESS_KEY=minioadmin
    MINIO_SECRET_KEY=minioadmin

    Risk: Production secrets exposed in version control
    Fix: Use Docker secrets or external secret management

  2. Missing Error Boundaries
    web/src/lib/s3/signer.ts:54

    console.error("signUrl error:", err);
    return rawPath; // Silent fallback could mask failures

    Risk: Silent failures in production
    Fix: Add proper error reporting/monitoring

🟡 Medium Priority Issues

  1. Potential Memory Leak
    web/src/app/api/chat/route.ts:293-295

    const signedPaths = await Promise.all(
      rawPaths.map((p: string) => signUrl(p))
    );

    Risk: Unbounded concurrent S3 calls
    Fix: Add concurrency limiting (e.g., p-limit)

  2. Missing Input Validation
    web/src/lib/s3/signer.ts:28

    export async function signUrl(rawPath: string): Promise<string> {

    Risk: No validation of bucket/key format
    Fix: Add input sanitization and validation

  3. Docker Security
    docker-compose.yaml:111-112 & docker-compose-noauth.yaml:111-112

    /usr/bin/mc anonymous set private myminio/gopie &&
    /usr/bin/mc anonymous set private myminio/gopie/visualizations

    Note: Good security practice ✅

🟢 Minor Issues

  1. Code Duplication
    Similar error handling patterns repeated across files
    Suggestion: Create shared error handling utilities

  2. Missing JSDoc Comments
    New S3 functions lack documentation
    Suggestion: Add comprehensive documentation


🚀 Performance Considerations

✅ Good

  • Efficient single S3 client instance reuse
  • Reasonable 24-hour cache expiration
  • Path normalization prevents duplicate processing

⚠️ Watch Points

  • Concurrent S3 signing calls could overwhelm API
  • No request caching (consider Redis for frequently accessed paths)

🧪 Test Coverage

❌ Missing Tests

  • No unit tests for signUrl() function
  • No integration tests for new message route
  • No error scenario testing (invalid paths, S3 failures)

Suggestions

// Example test cases needed:
- signUrl() with valid S3 paths
- signUrl() with malformed URLs  
- signUrl() with S3 service failures
- Message route with/without visualization paths
- Environment variable validation

📋 Recommendations

Before Merge

  1. 🔴 MUST FIX: Replace hardcoded credentials in docker-compose.yaml
  2. 🔴 MUST FIX: Add proper error handling to S3 signer
  3. 🟡 SHOULD FIX: Add input validation to signUrl()

Future Improvements

  • Add comprehensive test coverage
  • Implement request concurrency limiting
  • Add performance monitoring for S3 operations
  • Consider implementing signed URL caching

💡 Overall Assessment

This is a well-architected implementation of private S3 bucket functionality. The core design is solid with good separation of concerns and security practices. However, there are some production readiness concerns around error handling and credential management that should be addressed before merge.

Recommendation: ✅ Approve after fixing critical issues


Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
web/src/app/api/chat/route.ts (1)

284-307: Wrap variable declarations in block scope to fix static analysis warning.

The rawPaths and signedPaths declarations inside the switch case should be wrapped in a block to prevent potential variable hoisting issues and accidental access from other cases.

Apply this diff:

                         case "visualization_paths":
                         case "visualization_result":
-                          const rawPaths =
-                            args.paths ||
-                            args.visualization_json_paths?.map(
-                              (v: { json_path?: string }) => v.json_path
-                            ) ||
-                            [];
-
-                          if (rawPaths.length > 0) {
-                            const signedPaths = await Promise.all(
-                              rawPaths.map((p: string) => signUrl(p))
-                            );
-
+                          {
+                            const rawPaths =
+                              args.paths ||
+                              args.visualization_json_paths?.map(
+                                (v: { json_path?: string }) => v.json_path
+                              ) ||
+                              [];
+
+                            if (rawPaths.length > 0) {
+                              const signedPaths = await Promise.all(
+                                rawPaths.map((p: string) => signUrl(p))
+                              );
+
-                            writer.write({
-                              type: "data-visualization",
-                              id: `viz-${toolCall.id || Date.now()}`,
-                              data: {
-                                id: `viz-${toolCall.id || Date.now()}`,
-                                paths: signedPaths,
-                                status: "ready",
-                              },
-                            });
+                              writer.write({
+                                type: "data-visualization",
+                                id: `viz-${toolCall.id || Date.now()}`,
+                                data: {
+                                  id: `viz-${toolCall.id || Date.now()}`,
+                                  paths: signedPaths,
+                                  status: "ready",
+                                },
+                              });
+                            }
                           }
                           break;
🧹 Nitpick comments (4)
web/src/lib/api-client.ts (1)

86-117: Consider extracting shared authentication logic to reduce duplication.

The nextApiClient duplicates the entire beforeRequest hook logic from apiClient. While this works correctly, extracting the shared authentication logic into a reusable function would improve maintainability.

Example refactor:

// Shared authentication logic
function createAuthHook() {
  return (request: Request) => {
    const isAuthEnabled = String(process.env.NEXT_PUBLIC_ENABLE_AUTH).trim() === "true";

    if (!isAuthEnabled) {
      if (!request.headers.get("x-user-id")) {
        request.headers.set("x-user-id", "system");
      }
      if (!request.headers.get("x-organization-id")) {
        request.headers.set("x-organization-id", "system");
      }
      return;
    }

    const token = getGlobalAccessToken();
    if (token && !request.headers.get("Authorization")) {
      request.headers.set("Authorization", `Bearer ${token}`);
    }
    
    const orgId = getGlobalOrganizationId();
    if (orgId && !request.headers.get("x-organization-id")) {
      request.headers.set("x-organization-id", orgId);
    }
  };
}

// Then use in both clients:
export const apiClient = ky.create({
  prefixUrl: process.env.NEXT_PUBLIC_GOPIE_API_URL,
  headers: { "Content-Type": "application/json" },
  timeout: false,
  hooks: { beforeRequest: [createAuthHook()] },
});

export const nextApiClient = ky.create({
  headers: { "Content-Type": "application/json" },
  timeout: false,
  hooks: { beforeRequest: [createAuthHook()] },
});
web/src/lib/s3/signer.ts (2)

1-1: Update the file path in the comment.

The comment references lib/s3-signer.ts but the actual file path is lib/s3/signer.ts.

Apply this diff:

-// lib/s3-signer.ts
+// lib/s3/signer.ts

52-52: Consider making the expiry time configurable.

The 24-hour expiry (expiresIn: 86400) is hardcoded. Consider making this configurable via an environment variable to allow different TTLs for different deployment environments.

+const SIGNED_URL_EXPIRY = parseInt(process.env.S3_SIGNED_URL_EXPIRY || "86400", 10);
+
 // ...
-    return await getSignedUrl(s3, command, { expiresIn: 86400 });
+    return await getSignedUrl(s3, command, { expiresIn: SIGNED_URL_EXPIRY });
web/src/components/chat/visualization-results.tsx (1)

103-104: Consider proper typing instead of as any cast.

The as any cast bypasses type safety and suggests a mismatch between the custom VegaSpec interface and vega-embed's expected types. This could hide type errors at compile time.

Consider these alternatives:

Option 1: Import and use vega-embed's types

import vegaEmbed, { type VisualizationSpec } from "vega-embed";

// Use VisualizationSpec from vega-embed instead of custom VegaSpec
interface VisualizationData {
  path: string;
  spec: VisualizationSpec | null;
  isLoading: boolean;
  error?: string;
}

Option 2: Extend vega-embed's types properly

import vegaEmbed, { type VisualizationSpec } from "vega-embed";

interface VegaSpec extends VisualizationSpec {
  // Add only additional properties not in VisualizationSpec
}

Option 3: If the any cast is truly necessary, document why

// vegaEmbed expects Spec type but our dynamic title format requires broader typing
// Cast is safe here as vega-embed handles both string and object titles
vegaEmbed(vizRefs.current[index]!, updatedSpec as any, embedOptions)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4473633 and 3ba782d.

📒 Files selected for processing (11)
  • config-noauth.env.example (1 hunks)
  • docker-compose-noauth.yaml (1 hunks)
  • docker-compose.yaml (1 hunks)
  • web/package.json (1 hunks)
  • web/src/app/api/chat/[chatId]/messages/route.ts (1 hunks)
  • web/src/app/api/chat/route.ts (2 hunks)
  • web/src/components/chat/message.tsx (1 hunks)
  • web/src/components/chat/visualization-results.tsx (3 hunks)
  • web/src/lib/api-client.ts (1 hunks)
  • web/src/lib/queries/chat/get-messages.ts (2 hunks)
  • web/src/lib/s3/signer.ts (1 hunks)
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Integrate DuckDB WASM for client-side SQL execution in the Next.js frontend

Applied to files:

  • web/package.json
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to config-noauth.env : Use `config-noauth.env` for no-auth development setup

Applied to files:

  • config-noauth.env.example
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Use Vega-Lite for chart visualization in the Next.js frontend

Applied to files:

  • web/src/components/chat/visualization-results.tsx
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to docker-compose-noauth.yaml : Use `docker-compose-noauth.yaml` for no-auth development setup

Applied to files:

  • docker-compose-noauth.yaml
  • docker-compose.yaml
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Use Zustand for global state management and React Query for server state in the Next.js frontend

Applied to files:

  • web/src/lib/queries/chat/get-messages.ts
🧬 Code graph analysis (3)
web/src/app/api/chat/[chatId]/messages/route.ts (1)
web/src/lib/s3/signer.ts (1)
  • signUrl (28-57)
web/src/lib/queries/chat/get-messages.ts (1)
web/src/lib/api-client.ts (1)
  • nextApiClient (86-117)
web/src/app/api/chat/route.ts (1)
web/src/lib/s3/signer.ts (1)
  • signUrl (28-57)
🪛 Biome (2.1.2)
web/src/app/api/chat/route.ts

[error] 285-290: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.

The declaration is defined in this switch clause:

Safe fix: Wrap the declaration in a block.

(lint/correctness/noSwitchDeclarations)

🔇 Additional comments (13)
web/src/lib/queries/chat/get-messages.ts (2)

1-1: LGTM! API client change aligns with new architecture.

Switching from apiClient to nextApiClient correctly routes requests through the Next.js API layer, which enables server-side URL signing before responses reach the client.


236-237: LGTM! Path correctly updated for Next.js API route.

The path change from v1/api/chat/${chatId}/messages to api/chat/${chatId}/messages is correct for Next.js API routes, which will proxy to the backend and sign visualization URLs.

config-noauth.env.example (1)

59-63: LGTM! S3 configuration is appropriate for local development.

The S3 environment variables are correctly configured for local MinIO usage, matching the credentials and endpoint used elsewhere in the Docker Compose setup.

web/src/components/chat/message.tsx (1)

812-812: LGTM! Strips query parameters from visualization display.

The updated logic correctly removes query parameters (including signed URL parameters) from the displayed filename, improving readability while maintaining a fallback to the full path if needed.

web/package.json (1)

15-16: AWS SDK versions verified—no issues found.

Both @aws-sdk/client-s3 and @aws-sdk/s3-request-presigner version 3.933.0 exist in the NPM registry and no security advisories are currently reported. The versions are valid and safe to use.

web/src/lib/s3/signer.ts (1)

16-25: LGTM!

The normalizePath utility correctly handles both full URLs and plain path strings, safely extracting the pathname when a valid URL is provided.

web/src/components/chat/visualization-results.tsx (2)

158-160: LGTM!

The title rendering logic correctly handles both string and object title formats, with a sensible fallback to "Visualization N" when no title is provided.


7-16: Type definition is correct and aligns with Vega-Lite v6.

The Vega-Lite v6 title property accepts either a string or a title-properties object (TitleParams) with fields like text, fontSize, and anchor. The interface correctly defines this union type with explicit common properties and an index signature to accommodate additional properties. No changes needed.

web/src/app/api/chat/[chatId]/messages/route.ts (5)

4-9: LGTM!

The route handler correctly implements the Next.js 15 async params API by awaiting context.params.

Based on learnings


12-23: LGTM!

The header forwarding logic correctly excludes the host header (standard practice for proxying) and ensures a JSON content-type is set.


27-43: LGTM!

The upstream request handling is well-implemented with proper cache settings (no-store) for dynamic data and appropriate error handling that preserves the upstream status code.


100-106: LGTM!

The error handling appropriately logs detailed errors for debugging while returning a generic error message to clients, avoiding information leakage.


45-96: No issues found — the GOPIE API response structure matches the code's expectations.

The Go backend endpoint (/v1/api/chat/{chatID}/messages) returns a SuccessResponse with data as an array of ChatMessage objects, not an object with a messages property. The TypeScript code at line 46 correctly extracts this array with json.data || [], and the defensive check at line 51 appropriately validates each message's choices property before processing.

Comment thread docker-compose-noauth.yaml Outdated
Comment thread docker-compose.yaml Outdated
Comment thread web/src/lib/api-client.ts
if (token && !request.headers.get("Authorization")) {
request.headers.set("Authorization", `Bearer ${token}`);
}
const orgId = getGlobalOrganizationId()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add missing semicolon for consistency.

Line 110 is missing a semicolon after the function call, which is inconsistent with the rest of the codebase style.

Apply this diff:

-        const orgId = getGlobalOrganizationId()
+        const orgId = getGlobalOrganizationId();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const orgId = getGlobalOrganizationId()
const orgId = getGlobalOrganizationId();
🤖 Prompt for AI Agents
In web/src/lib/api-client.ts around line 110, the statement "const orgId =
getGlobalOrganizationId()" is missing a trailing semicolon; add the semicolon to
make it "const orgId = getGlobalOrganizationId();" to match project style and
run the formatter/linter to ensure consistent spacing and line ending.

Comment thread web/src/lib/s3/signer.ts
Comment on lines +6 to +14
const s3 = new S3Client({
region: process.env.S3_REGION || "us-east-1",
endpoint: process.env.S3_ENDPOINT,
credentials: {
accessKeyId: process.env.S3_ACCESS_KEY!,
secretAccessKey: process.env.S3_SECRET_KEY!,
},
forcePathStyle: true, // for MinIO
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Validate required environment variables before use.

The non-null assertions on process.env.S3_ACCESS_KEY! and process.env.S3_SECRET_KEY! will pass undefined to the S3Client if these environment variables are not set, leading to authentication failures at runtime. Additionally, S3_ENDPOINT is required when using MinIO but is currently optional.

Apply this diff to add validation:

+// Validate required environment variables
+const requiredEnvVars = {
+  S3_ACCESS_KEY: process.env.S3_ACCESS_KEY,
+  S3_SECRET_KEY: process.env.S3_SECRET_KEY,
+  S3_ENDPOINT: process.env.S3_ENDPOINT,
+};
+
+for (const [key, value] of Object.entries(requiredEnvVars)) {
+  if (!value) {
+    throw new Error(`Missing required environment variable: ${key}`);
+  }
+}
+
 const s3 = new S3Client({
   region: process.env.S3_REGION || "us-east-1",
-  endpoint: process.env.S3_ENDPOINT,         
+  endpoint: process.env.S3_ENDPOINT!,
   credentials: {
     accessKeyId: process.env.S3_ACCESS_KEY!,
     secretAccessKey: process.env.S3_SECRET_KEY!,
   },
   forcePathStyle: true,                      // for MinIO
 });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const s3 = new S3Client({
region: process.env.S3_REGION || "us-east-1",
endpoint: process.env.S3_ENDPOINT,
credentials: {
accessKeyId: process.env.S3_ACCESS_KEY!,
secretAccessKey: process.env.S3_SECRET_KEY!,
},
forcePathStyle: true, // for MinIO
});
// Validate required environment variables
const requiredEnvVars = {
S3_ACCESS_KEY: process.env.S3_ACCESS_KEY,
S3_SECRET_KEY: process.env.S3_SECRET_KEY,
S3_ENDPOINT: process.env.S3_ENDPOINT,
};
for (const [key, value] of Object.entries(requiredEnvVars)) {
if (!value) {
throw new Error(`Missing required environment variable: ${key}`);
}
}
const s3 = new S3Client({
region: process.env.S3_REGION || "us-east-1",
endpoint: process.env.S3_ENDPOINT!,
credentials: {
accessKeyId: process.env.S3_ACCESS_KEY!,
secretAccessKey: process.env.S3_SECRET_KEY!,
},
forcePathStyle: true, // for MinIO
});
🤖 Prompt for AI Agents
In web/src/lib/s3/signer.ts around lines 6 to 14, the code uses non-null
assertions for S3 credentials and treats S3_ENDPOINT as optional which can yield
undefined values at runtime; validate process.env.S3_ACCESS_KEY and
process.env.S3_SECRET_KEY at module init and throw a clear error if missing,
require process.env.S3_ENDPOINT when using MinIO (forcePathStyle true) and throw
if missing, then use the validated values (no ! assertions) to construct the
S3Client so the client always receives defined credentials and endpoint.

Comment thread web/src/lib/s3/signer.ts
Comment on lines +28 to +57
export async function signUrl(rawPath: string): Promise<string> {
try {
if (!rawPath) return rawPath;

// Strip host if present
const normalized = normalizePath(rawPath);

// Ensure no leading slash
const cleaned = normalized.startsWith("/") ? normalized.slice(1) : normalized;

const slashIndex = cleaned.indexOf("/");
if (slashIndex === -1) {
console.error("signUrl error: Invalid path, no bucket/key", cleaned);
return rawPath;
}

const bucket = cleaned.substring(0, slashIndex);
const key = cleaned.substring(slashIndex + 1);

const command = new GetObjectCommand({
Bucket: bucket,
Key: key,
});

return await getSignedUrl(s3, command, { expiresIn: 86400 });
} catch (err) {
console.error("signUrl error:", err);
return rawPath;
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Consider throwing errors instead of silently returning the original path.

The current implementation returns the original rawPath when signing fails (lines 30, 41, 55). This silent failure pattern could mask configuration issues or signing errors in production, potentially leading to access denied errors when clients attempt to use unsigned URLs with private buckets.

Consider one of these approaches:

Option 1: Throw errors for genuine failures (recommended)

 export async function signUrl(rawPath: string): Promise<string> {
-  try {
-    if (!rawPath) return rawPath;
+  if (!rawPath) return rawPath;
 
-    // Strip host if present
-    const normalized = normalizePath(rawPath);
+  // Strip host if present
+  const normalized = normalizePath(rawPath);
 
-    // Ensure no leading slash
-    const cleaned = normalized.startsWith("/") ? normalized.slice(1) : normalized;
+  // Ensure no leading slash
+  const cleaned = normalized.startsWith("/") ? normalized.slice(1) : normalized;
 
-    const slashIndex = cleaned.indexOf("/");
-    if (slashIndex === -1) {
-      console.error("signUrl error: Invalid path, no bucket/key", cleaned);
-      return rawPath;
-    }
+  const slashIndex = cleaned.indexOf("/");
+  if (slashIndex === -1) {
+    throw new Error(`Invalid S3 path format (no bucket/key): ${cleaned}`);
+  }
 
-    const bucket = cleaned.substring(0, slashIndex);
-    const key    = cleaned.substring(slashIndex + 1);
+  const bucket = cleaned.substring(0, slashIndex);
+  const key    = cleaned.substring(slashIndex + 1);
 
+  try {
     const command = new GetObjectCommand({
       Bucket: bucket,
       Key: key,
     });
 
     return await getSignedUrl(s3, command, { expiresIn: 86400 });
   } catch (err) {
-    console.error("signUrl error:", err);
-    return rawPath;
+    console.error("Failed to sign S3 URL:", err);
+    throw new Error(`Failed to sign S3 URL for ${bucket}/${key}: ${err}`);
   }
 }

Option 2: Log warnings for debugging

If silent fallback is intentional, at least log at warning level:

   } catch (err) {
-    console.error("signUrl error:", err);
+    console.warn("Failed to sign S3 URL, returning original path:", err);
     return rawPath;
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export async function signUrl(rawPath: string): Promise<string> {
try {
if (!rawPath) return rawPath;
// Strip host if present
const normalized = normalizePath(rawPath);
// Ensure no leading slash
const cleaned = normalized.startsWith("/") ? normalized.slice(1) : normalized;
const slashIndex = cleaned.indexOf("/");
if (slashIndex === -1) {
console.error("signUrl error: Invalid path, no bucket/key", cleaned);
return rawPath;
}
const bucket = cleaned.substring(0, slashIndex);
const key = cleaned.substring(slashIndex + 1);
const command = new GetObjectCommand({
Bucket: bucket,
Key: key,
});
return await getSignedUrl(s3, command, { expiresIn: 86400 });
} catch (err) {
console.error("signUrl error:", err);
return rawPath;
}
}
export async function signUrl(rawPath: string): Promise<string> {
if (!rawPath) return rawPath;
// Strip host if present
const normalized = normalizePath(rawPath);
// Ensure no leading slash
const cleaned = normalized.startsWith("/") ? normalized.slice(1) : normalized;
const slashIndex = cleaned.indexOf("/");
if (slashIndex === -1) {
throw new Error(`Invalid S3 path format (no bucket/key): ${cleaned}`);
}
const bucket = cleaned.substring(0, slashIndex);
const key = cleaned.substring(slashIndex + 1);
try {
const command = new GetObjectCommand({
Bucket: bucket,
Key: key,
});
return await getSignedUrl(s3, command, { expiresIn: 86400 });
} catch (err) {
console.error("Failed to sign S3 URL:", err);
throw new Error(`Failed to sign S3 URL for ${bucket}/${key}: ${err}`);
}
}
🤖 Prompt for AI Agents
In web/src/lib/s3/signer.ts around lines 28 to 57, replace the silent fallbacks
that return rawPath (lines ~30, ~41, ~55) with thrown errors so failures surface
to callers: validate inputs and if missing rawPath or malformed path throw a
descriptive Error including the problematic value, and when getSignedUrl or
other operations fail rethrow or throw a new Error that includes the underlying
error message; remove the console.error+return rawPath pattern so callers
receive a rejected Promise instead (alternatively, if a silent fallback is truly
required, log a warning instead of returning silently).

@surajmn1 surajmn1 force-pushed the feat/use-private-bucket-for-viz branch from 3ba782d to af42d6e Compare November 20, 2025 07:17
@claude
Copy link
Copy Markdown

claude Bot commented Nov 20, 2025

Claude finished @surajmn1's task —— View job


Code Review Complete ✅

This PR implements private bucket functionality for visualizations by adding S3 pre-signed URLs. Here's my comprehensive review:

Review Checklist

  • Read and understand all changed files
  • Analyze code quality and best practices
  • Check for potential bugs or issues
  • Review security considerations
  • Assess performance implications
  • Evaluate test coverage
  • Provide comprehensive feedback

Strengths

Security

  • ✅ Proper use of pre-signed URLs for private S3 access
  • ✅ Environment variables properly configured in Docker
  • ✅ Credentials safely passed through env vars
  • ✅ 24-hour expiration on signed URLs (reasonable default)

Code Quality

  • ✅ Clean separation of concerns with dedicated s3/signer.ts
  • ✅ Consistent error handling throughout
  • ✅ Good use of URL normalization in normalizePath()
  • ✅ Proper bucket configuration in Docker Compose

Architecture

  • ✅ Smart design: signing happens at API layer, not client-side
  • ✅ Backwards compatibility maintained
  • ✅ Clean integration with existing visualization flow

⚠️ Issues Found

🔴 Critical Issues

  1. Hardcoded Credentials in Production Docker
    docker-compose.yaml:172-173

    MINIO_ACCESS_KEY=minioadmin
    MINIO_SECRET_KEY=minioadmin

    Risk: Production secrets exposed in version control
    Fix: Use Docker secrets or external secret management

  2. Missing Error Boundaries
    web/src/lib/s3/signer.ts:54

    console.error("signUrl error:", err);
    return rawPath; // Silent fallback could mask failures

    Risk: Silent failures in production
    Fix: Add proper error reporting/monitoring

🟡 Medium Priority Issues

  1. Potential Memory Leak
    web/src/app/api/chat/route.ts:293-295

    const signedPaths = await Promise.all(
      rawPaths.map((p: string) => signUrl(p))
    );

    Risk: Unbounded concurrent S3 calls
    Fix: Add concurrency limiting (e.g., p-limit)

  2. Missing Input Validation
    web/src/lib/s3/signer.ts:28

    export async function signUrl(rawPath: string): Promise<string> {

    Risk: No validation of bucket/key format
    Fix: Add input sanitization and validation

  3. Docker Security
    docker-compose.yaml:111-112 & docker-compose-noauth.yaml:111-112

    /usr/bin/mc anonymous set private myminio/gopie &&
    /usr/bin/mc anonymous set private myminio/gopie/visualizations

    Note: Good security practice ✅

🟢 Minor Issues

  1. Code Duplication
    Similar error handling patterns repeated across files
    Suggestion: Create shared error handling utilities

  2. Missing JSDoc Comments
    New S3 functions lack documentation
    Suggestion: Add comprehensive documentation


🚀 Performance Considerations

✅ Good

  • Efficient single S3 client instance reuse
  • Reasonable 24-hour cache expiration
  • Path normalization prevents duplicate processing

⚠️ Watch Points

  • Concurrent S3 signing calls could overwhelm API
  • No request caching (consider Redis for frequently accessed paths)

🧪 Test Coverage

❌ Missing Tests

  • No unit tests for signUrl() function
  • No integration tests for new message route
  • No error scenario testing (invalid paths, S3 failures)

Suggestions

// Example test cases needed:
- signUrl() with valid S3 paths
- signUrl() with malformed URLs  
- signUrl() with S3 service failures
- Message route with/without visualization paths
- Environment variable validation

📋 Recommendations

Before Merge

  1. 🔴 MUST FIX: Replace hardcoded credentials in docker-compose.yaml
  2. 🔴 MUST FIX: Add proper error handling to S3 signer
  3. 🟡 SHOULD FIX: Add input validation to signUrl()

Future Improvements

  • Add comprehensive test coverage
  • Implement request concurrency limiting
  • Add performance monitoring for S3 operations
  • Consider implementing signed URL caching

💡 Overall Assessment

This is a well-architected implementation of private S3 bucket functionality. The core design is solid with good separation of concerns and security practices. However, there are some production readiness concerns around error handling and credential management that should be addressed before merge.

Recommendation: ✅ Approve after fixing critical issues


Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
web/src/app/api/chat/route.ts (1)

283-307: Wrap case declarations in a block to prevent variable leakage.

The rawPaths and signedPaths variables are declared in a switch case without block scoping, which can erroneously allow other cases to access them.

Apply this diff to wrap the cases in a block:

                         case "visualization_paths":
                         case "visualization_result":
+                          {
                           const rawPaths =
                             args.paths ||
                             args.visualization_json_paths?.map(
                               (v: { json_path?: string }) => v.json_path
                             ) ||
                             [];

                           if (rawPaths.length > 0) {
                             const signedPaths = await Promise.all(
                               rawPaths.map((p: string) => signUrl(p))
                             );

                             writer.write({
                               type: "data-visualization",
                               id: `viz-${toolCall.id || Date.now()}`,
                               data: {
                                 id: `viz-${toolCall.id || Date.now()}`,
                                 paths: signedPaths,
                                 status: "ready",
                               },
                             });
                           }
+                          }
                           break;

Based on static analysis hints.

♻️ Duplicate comments (2)
web/src/lib/s3/signer.ts (2)

6-14: Validate required environment variables before use.

Past review comments have already flagged this critical issue: the non-null assertions on environment variables will pass undefined to the S3Client if not set, leading to authentication failures at runtime.


28-57: Consider throwing errors instead of silently returning the original path.

Past review comments have already flagged this issue: the silent failure pattern could mask configuration issues in production, potentially leading to access denied errors with unsigned URLs.

🧹 Nitpick comments (2)
web/src/components/chat/visualization-results.tsx (1)

103-104: Consider a more specific type assertion.

While the as any cast works, you could use a more specific type assertion that maintains better type safety:

-        // eslint-disable-next-line @typescript-eslint/no-explicit-any
-        vegaEmbed(vizRefs.current[index]!, updatedSpec as any, embedOptions).catch(
+        vegaEmbed(
+          vizRefs.current[index]!, 
+          updatedSpec as Parameters<typeof vegaEmbed>[1], 
+          embedOptions
+        ).catch(

This preserves the type relationship with vegaEmbed while still allowing the flexible title format.

web/src/app/api/chat/[chatId]/messages/route.ts (1)

36-43: Consider including more error details for debugging.

The error response only returns a generic "Upstream error" message without including details from the upstream response, which could make debugging difficult.

     if (!gopieRes.ok) {
       const errorText = await gopieRes.text();
       console.error("GOPIE error:", errorText);
       return NextResponse.json(
-        { error: "Upstream error" },
+        { 
+          error: "Upstream error",
+          details: process.env.NODE_ENV === 'development' ? errorText : undefined
+        },
         { status: gopieRes.status }
       );
     }

This provides better debugging in development while keeping production responses clean.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3ba782d and af42d6e.

⛔ Files ignored due to path filters (2)
  • web/bun.lock is excluded by !**/*.lock
  • web/bun.lockb is excluded by !**/bun.lockb
📒 Files selected for processing (11)
  • config-noauth.env.example (1 hunks)
  • docker-compose-noauth.yaml (1 hunks)
  • docker-compose.yaml (1 hunks)
  • web/package.json (1 hunks)
  • web/src/app/api/chat/[chatId]/messages/route.ts (1 hunks)
  • web/src/app/api/chat/route.ts (2 hunks)
  • web/src/components/chat/message.tsx (1 hunks)
  • web/src/components/chat/visualization-results.tsx (3 hunks)
  • web/src/lib/api-client.ts (1 hunks)
  • web/src/lib/queries/chat/get-messages.ts (2 hunks)
  • web/src/lib/s3/signer.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • web/package.json
  • web/src/lib/queries/chat/get-messages.ts
  • web/src/lib/api-client.ts
  • web/src/components/chat/message.tsx
  • config-noauth.env.example
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Use Vega-Lite for chart visualization in the Next.js frontend
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Integrate DuckDB WASM for client-side SQL execution in the Next.js frontend
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to docker-compose-noauth.yaml : Use `docker-compose-noauth.yaml` for no-auth development setup

Applied to files:

  • docker-compose.yaml
  • docker-compose-noauth.yaml
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to web/**/*.{js,jsx,ts,tsx} : Use Vega-Lite for chart visualization in the Next.js frontend

Applied to files:

  • web/src/components/chat/visualization-results.tsx
🧬 Code graph analysis (2)
web/src/app/api/chat/route.ts (1)
web/src/lib/s3/signer.ts (1)
  • signUrl (28-57)
web/src/app/api/chat/[chatId]/messages/route.ts (1)
web/src/lib/s3/signer.ts (1)
  • signUrl (28-57)
🪛 Biome (2.1.2)
web/src/app/api/chat/route.ts

[error] 285-290: Other switch clauses can erroneously access this declaration.
Wrap the declaration in a block to restrict its access to the switch clause.

The declaration is defined in this switch clause:

Safe fix: Wrap the declaration in a block.

(lint/correctness/noSwitchDeclarations)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: claude-review
🔇 Additional comments (7)
web/src/components/chat/visualization-results.tsx (2)

8-13: LGTM!

The extended VegaSpec type correctly accommodates Vega-Lite's flexible title specification, which allows both a simple string and a rich object format.


158-160: LGTM!

The title display logic correctly handles both string and object formats with appropriate fallbacks.

docker-compose-noauth.yaml (1)

111-113: LGTM!

All buckets are now correctly configured as private, aligning with the PR objective to control access via signed URLs.

docker-compose.yaml (1)

192-198: LGTM!

The bucket creation and privacy configuration is correct. All buckets are now private, aligning with the PR objective to use signed URLs for controlled access.

web/src/app/api/chat/route.ts (1)

293-295: Verify error handling for signing failures.

The signing logic depends on signUrl from web/src/lib/s3/signer.ts, which has critical issues with environment variable validation and error handling. Ensure that signing failures are properly handled here, as the current signer implementation silently returns the original path on error.

Consider adding explicit error handling around the signing operation:

const signedPaths = await Promise.all(
  rawPaths.map(async (p: string) => {
    try {
      return await signUrl(p);
    } catch (err) {
      console.error('Failed to sign visualization path:', p, err);
      // Decide whether to throw or use a fallback
      throw err;
    }
  })
);
web/src/app/api/chat/[chatId]/messages/route.ts (2)

71-93: Verify error handling for URL signing failures.

The signing operations depend on signUrl from web/src/lib/s3/signer.ts, which has critical issues with environment variable validation and error handling. If signing fails, the current implementation will silently use the original (unsigned) path, which will fail when accessing private buckets.

Consider adding explicit error handling:

// ---- visualization_result ----
if (fn.name === "visualization_result") {
  const arr = args.visualization_json_paths;
  if (Array.isArray(arr)) {
    for (const entry of arr) {
      if (entry.json_path) {
        try {
          entry.json_path = await signUrl(entry.json_path);
        } catch (err) {
          console.error('Failed to sign visualization path:', entry.json_path, err);
          throw new Error(`Failed to sign visualization URL: ${err}`);
        }
      }
    }
  }
  
  fn.arguments = JSON.stringify(args);
}

This ensures that signing failures surface properly rather than silently passing unsigned URLs that will fail at access time.


85-93: LGTM with reservations.

The logic for signing visualization paths is correct and uses Promise.all appropriately for concurrent operations. However, ensure that the underlying signUrl function's issues (flagged in previous comments) are addressed to prevent silent failures.

@claude
Copy link
Copy Markdown

claude Bot commented Nov 20, 2025

Claude finished @surajmn1's task —— View job


Code Review Complete ✅

This PR implements private bucket functionality for visualizations by adding S3 pre-signed URLs. Here's my comprehensive review:

Review Checklist

  • Read and understand all changed files
  • Analyze code quality and best practices
  • Check for potential bugs or issues
  • Review security considerations
  • Assess performance implications
  • Evaluate test coverage
  • Provide comprehensive feedback

Strengths

Security

  • ✅ Proper use of pre-signed URLs for private S3 access
  • ✅ Environment variables properly configured in Docker
  • ✅ Credentials safely passed through env vars
  • ✅ 24-hour expiration on signed URLs (reasonable default)

Code Quality

  • ✅ Clean separation of concerns with dedicated s3/signer.ts
  • ✅ Consistent error handling throughout
  • ✅ Good use of URL normalization in normalizePath()
  • ✅ Proper bucket configuration in Docker Compose

Architecture

  • ✅ Smart design: signing happens at API layer, not client-side
  • ✅ Backwards compatibility maintained
  • ✅ Clean integration with existing visualization flow

⚠️ Issues Found

🔴 Critical Issues

  1. Hardcoded Credentials in Production Docker
    docker-compose.yaml:172-173

    MINIO_ACCESS_KEY=minioadmin
    MINIO_SECRET_KEY=minioadmin

    Risk: Production secrets exposed in version control
    Fix: Use Docker secrets or external secret management

  2. Missing Error Boundaries
    web/src/lib/s3/signer.ts:54

    console.error("signUrl error:", err);
    return rawPath; // Silent fallback could mask failures

    Risk: Silent failures in production
    Fix: Add proper error reporting/monitoring

🟡 Medium Priority Issues

  1. Potential Memory Leak
    web/src/app/api/chat/route.ts:293-295

    const signedPaths = await Promise.all(
      rawPaths.map((p: string) => signUrl(p))
    );

    Risk: Unbounded concurrent S3 calls
    Fix: Add concurrency limiting (e.g., p-limit)

  2. Missing Input Validation
    web/src/lib/s3/signer.ts:28

    export async function signUrl(rawPath: string): Promise<string> {

    Risk: No validation of bucket/key format
    Fix: Add input sanitization and validation

  3. Docker Security
    docker-compose.yaml:111-112 & docker-compose-noauth.yaml:111-112

    /usr/bin/mc anonymous set private myminio/gopie &&
    /usr/bin/mc anonymous set private myminio/gopie/visualizations

    Note: Good security practice ✅

🟢 Minor Issues

  1. Code Duplication
    Similar error handling patterns repeated across files
    Suggestion: Create shared error handling utilities

  2. Missing JSDoc Comments
    New S3 functions lack documentation
    Suggestion: Add comprehensive documentation


🚀 Performance Considerations

✅ Good

  • Efficient single S3 client instance reuse
  • Reasonable 24-hour cache expiration
  • Path normalization prevents duplicate processing

⚠️ Watch Points

  • Concurrent S3 signing calls could overwhelm API
  • No request caching (consider Redis for frequently accessed paths)

🧪 Test Coverage

❌ Missing Tests

  • No unit tests for signUrl() function
  • No integration tests for new message route
  • No error scenario testing (invalid paths, S3 failures)

Suggestions

// Example test cases needed:
- signUrl() with valid S3 paths
- signUrl() with malformed URLs  
- signUrl() with S3 service failures
- Message route with/without visualization paths
- Environment variable validation

📋 Recommendations

Before Merge

  1. 🔴 MUST FIX: Replace hardcoded credentials in docker-compose.yaml
  2. 🔴 MUST FIX: Add proper error handling to S3 signer
  3. 🟡 SHOULD FIX: Add input validation to signUrl()

Future Improvements

  • Add comprehensive test coverage
  • Implement request concurrency limiting
  • Add performance monitoring for S3 operations
  • Consider implementing signed URL caching

💡 Overall Assessment

This is a well-architected implementation of private S3 bucket functionality. The core design is solid with good separation of concerns and security practices. However, there are some production readiness concerns around error handling and credential management that should be addressed before merge.

Recommendation: ✅ Approve after fixing critical issues


Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
web/src/lib/s3/signer.ts (2)

19-27: Consider removing redundant non-null assertions after validation.

Since the environment variables are validated on lines 6-17, the non-null assertions (!) on lines 21, 23, and 24 are technically redundant. You could store the validated values and use them directly to avoid the assertions.

 for (const [key, value] of Object.entries(requiredEnvVars)) {
   if (!value) {
     throw new Error(`Missing required environment variable: ${key}`);
   }
 }

+// Store validated values
+const { S3_ACCESS_KEY, S3_SECRET_KEY, S3_ENDPOINT } = requiredEnvVars as Record<string, string>;
+
 const s3 = new S3Client({
   region: process.env.S3_REGION || "us-east-1",
-  endpoint: process.env.S3_ENDPOINT!,
+  endpoint: S3_ENDPOINT,
   credentials: {
-    accessKeyId: process.env.S3_ACCESS_KEY!,
-    secretAccessKey: process.env.S3_SECRET_KEY!,
+    accessKeyId: S3_ACCESS_KEY,
+    secretAccessKey: S3_SECRET_KEY,
   },
   forcePathStyle: true,
 });

64-64: Consider making the URL expiry time configurable.

The signed URL expiry is hardcoded to 86,400 seconds (24 hours). While this is a reasonable default for visualization files, you might want to make it configurable via an environment variable for different use cases.

+const URL_EXPIRY_SECONDS = parseInt(process.env.S3_URL_EXPIRY || "86400", 10);
+
 // ... rest of the code ...

-    return await getSignedUrl(s3, command, { expiresIn: 86400 });
+    return await getSignedUrl(s3, command, { expiresIn: URL_EXPIRY_SECONDS });
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between af42d6e and 31166ea.

📒 Files selected for processing (1)
  • web/src/lib/s3/signer.ts (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: claude-review
  • GitHub Check: Analyze (go)
🔇 Additional comments (2)
web/src/lib/s3/signer.ts (2)

6-17: Environment variable validation successfully implemented.

The validation logic properly addresses the previous review feedback by checking required environment variables at module initialization and throwing descriptive errors if any are missing. This ensures fail-fast behavior before the S3Client is constructed.


41-69: Error handling improvements successfully implemented.

The function now properly throws errors for invalid paths (line 52) and signing failures (line 67) instead of silently returning the original path. This addresses the previous review feedback and ensures that configuration issues or signing errors surface to callers rather than being masked.

@surajmn1 surajmn1 force-pushed the feat/use-private-bucket-for-viz branch from 31166ea to af42d6e Compare November 20, 2025 07:38
@claude
Copy link
Copy Markdown

claude Bot commented Nov 20, 2025

Claude finished @surajmn1's task —— View job


Code Review Complete ✅

This PR implements private bucket functionality for visualizations by adding S3 pre-signed URLs. Here's my comprehensive review:

Review Checklist

  • Read and understand all changed files
  • Analyze code quality and best practices
  • Check for potential bugs or issues
  • Review security considerations
  • Assess performance implications
  • Evaluate test coverage
  • Provide comprehensive feedback

Strengths

Security

  • ✅ Proper use of pre-signed URLs for private S3 access
  • ✅ Environment variables properly configured in Docker
  • ✅ Credentials safely passed through env vars
  • ✅ 24-hour expiration on signed URLs (reasonable default)

Code Quality

  • ✅ Clean separation of concerns with dedicated s3/signer.ts
  • ✅ Consistent error handling throughout
  • ✅ Good use of URL normalization in normalizePath()
  • ✅ Proper bucket configuration in Docker Compose

Architecture

  • ✅ Smart design: signing happens at API layer, not client-side
  • ✅ Backwards compatibility maintained
  • ✅ Clean integration with existing visualization flow

⚠️ Issues Found

🔴 Critical Issues

  1. Hardcoded Credentials in Production Docker
    docker-compose.yaml:172-173

    MINIO_ACCESS_KEY=minioadmin
    MINIO_SECRET_KEY=minioadmin

    Risk: Production secrets exposed in version control
    Fix: Use Docker secrets or external secret management

  2. Missing Error Boundaries
    web/src/lib/s3/signer.ts:54

    console.error("signUrl error:", err);
    return rawPath; // Silent fallback could mask failures

    Risk: Silent failures in production
    Fix: Add proper error reporting/monitoring

🟡 Medium Priority Issues

  1. Potential Memory Leak
    web/src/app/api/chat/route.ts:293-295

    const signedPaths = await Promise.all(
      rawPaths.map((p: string) => signUrl(p))
    );

    Risk: Unbounded concurrent S3 calls
    Fix: Add concurrency limiting (e.g., p-limit)

  2. Missing Input Validation
    web/src/lib/s3/signer.ts:28

    export async function signUrl(rawPath: string): Promise<string> {

    Risk: No validation of bucket/key format
    Fix: Add input sanitization and validation

  3. Docker Security
    docker-compose.yaml:111-112 & docker-compose-noauth.yaml:111-112

    /usr/bin/mc anonymous set private myminio/gopie &&
    /usr/bin/mc anonymous set private myminio/gopie/visualizations

    Note: Good security practice ✅

🟢 Minor Issues

  1. Code Duplication
    Similar error handling patterns repeated across files
    Suggestion: Create shared error handling utilities

  2. Missing JSDoc Comments
    New S3 functions lack documentation
    Suggestion: Add comprehensive documentation


🚀 Performance Considerations

✅ Good

  • Efficient single S3 client instance reuse
  • Reasonable 24-hour cache expiration
  • Path normalization prevents duplicate processing

⚠️ Watch Points

  • Concurrent S3 signing calls could overwhelm API
  • No request caching (consider Redis for frequently accessed paths)

🧪 Test Coverage

❌ Missing Tests

  • No unit tests for signUrl() function
  • No integration tests for new message route
  • No error scenario testing (invalid paths, S3 failures)

Suggestions

// Example test cases needed:
- signUrl() with valid S3 paths
- signUrl() with malformed URLs  
- signUrl() with S3 service failures
- Message route with/without visualization paths
- Environment variable validation

📋 Recommendations

Before Merge

  1. 🔴 MUST FIX: Replace hardcoded credentials in docker-compose.yaml
  2. 🔴 MUST FIX: Add proper error handling to S3 signer
  3. 🟡 SHOULD FIX: Add input validation to signUrl()

Future Improvements

  • Add comprehensive test coverage
  • Implement request concurrency limiting
  • Add performance monitoring for S3 operations
  • Consider implementing signed URL caching

💡 Overall Assessment

This is a well-architected implementation of private S3 bucket functionality. The core design is solid with good separation of concerns and security practices. However, there are some production readiness concerns around error handling and credential management that should be addressed before merge.

Recommendation: ✅ Approve after fixing critical issues


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant