
Integrate rustfs#40

Open
elliot14A wants to merge 2 commits into main from feat/rustfs

Conversation

Contributor

@elliot14A elliot14A commented Dec 18, 2025

Summary by CodeRabbit

  • Chores
    • Migrated storage backend from MinIO to RustFS across deployment and configuration; updated credentials, endpoints, bucket init workflow, and service wiring.
  • Bug Fixes
    • Improved upload URL handling so single-part uploads retain any returned object key when available, producing correct S3-style URLs.


Contributor

coderabbitai Bot commented Dec 18, 2025

Walkthrough

This PR migrates object storage from MinIO to RustFS: environment variables, Docker Compose service definitions, and bucket-init scripts are updated to use RustFS and AWS CLI; web upload URL handling is adjusted to include an uploaded object key when available.

Changes

  • Environment configuration updates (config-auth.env.example, config-noauth.env.example): Replace MinIO credentials/endpoints with RustFS equivalents (e.g., GOPIE_S3_*, COMPANION_AWS_*, and CHAT_* variables moved from minioadmin / http://minio:9000 to rustfsadmin / http://rustfs:9000).
  • Docker Compose service replacement (docker-compose.yaml, docker-compose-auth.yaml, docker-compose-noauth.yaml): Remove the minio service; add a rustfs service (rustfs/rustfs:latest) with ports 9000/9001 and RUSTFS_* env vars. Update service names and depends_on references (minio → rustfs, createbuckets → create-buckets).
  • Bucket creation workflow migration (docker-compose.yaml, docker-compose-auth.yaml, docker-compose-noauth.yaml): Replace the MinIO mc-based init with the AWS CLI: create-buckets now uses amazon/aws-cli:latest, sleeps, then runs aws s3api create-bucket --endpoint-url http://rustfs:9000 .... Env keys updated to AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY.
  • Web upload URL handling (web/src/components/dataset/dataset-upload-wizard.tsx, web/src/components/dataset/file-refresh-wizard.tsx): For single-part non-S3 URLs, append uploadResponse.body.key to form s3://bucket/key when available (previously assumed s3://bucket). Comments updated to reference RustFS.
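The single-part fallback described in that last item can be sketched as a standalone helper (buildS3Url is a hypothetical name with simplified types; the real logic lives inline in the wizard components and reads the key from the Uppy upload store):

```typescript
// Hypothetical helper sketching the upload-URL → s3:// mapping described above.
// Not part of the PR; names and types are simplified for illustration.
function buildS3Url(uploadURL: string, key?: string): string {
  const url = new URL(uploadURL);
  const pathParts = url.pathname.split("/").filter(Boolean);
  if (pathParts.length === 0) {
    throw new Error("upload URL has no path");
  }
  if (pathParts.length >= 2) {
    // e.g. http://rustfs:9000/bucket/some/key → s3://bucket/some/key
    return `s3://${pathParts[0]}/${pathParts.slice(1).join("/")}`;
  }
  // Single path part: bucket only. Use the returned object key when available,
  // otherwise fall back to the bare bucket (the pre-PR behavior).
  return key ? `s3://${pathParts[0]}/${key}` : `s3://${pathParts[0]}`;
}

console.log(buildS3Url("http://rustfs:9000/gopie/data/file.csv")); // s3://gopie/data/file.csv
console.log(buildS3Url("http://rustfs:9000/gopie", "data/file.csv")); // s3://gopie/data/file.csv
```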

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Dev as "docker-compose (orchestrator)"
  participant RustFS as "RustFS (storage)"
  participant Init as "create-buckets (AWS CLI)"
  participant Services as "gopie / companion / others"
  participant Web as "Web UI (upload flow)"

  rect rgba(200,230,201,0.25)
    Dev->>RustFS: start rustfs service (image rustfs/rustfs:latest)
    RustFS-->>Dev: listen on http://rustfs:9000 (S3), :9001 (console)
  end

  rect rgba(200,221,242,0.25)
    Dev->>Init: start create-buckets (depends_on rustfs)
    Init->>RustFS: (after sleep) AWS CLI create-bucket --endpoint-url http://rustfs:9000
    RustFS-->>Init: bucket created (200/201)
    Init-->>Dev: success
  end

  rect rgba(255,243,205,0.25)
    Dev->>Services: start application services (depend on rustfs)
    Services->>RustFS: use RUSTFS_* creds/endpoint for S3 operations
  end

  rect rgba(255,238,230,0.12)
    Web->>Services: upload request -> service returns uploadResponse (may include body.key)
    alt upload URL has single path part
      Web->>Web: if uploadResponse.body.key exists, construct s3://bucket/key
    else
      Web->>Web: fallback to s3://bucket
    end
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • factly/gopie#120 — Modifies URL-parsing logic in web/src/components/dataset/dataset-upload-wizard.tsx; touches the same upload parsing area as this PR.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Title check (✅ Passed): the title 'Integrate rustfs' directly reflects the main objective of the pull request, which is to replace MinIO with RustFS across all configuration and docker-compose files.


claude Bot commented Dec 18, 2025

Claude finished @elliot14A's task.


Pull Request Review

I've completed my analysis of the rustfs integration changes. Here's my comprehensive review:

Todo List:

  • Read all changed configuration files
  • Analyze Docker compose changes
  • Review environment variable configurations
  • Assess security implications
  • Check for best practices and potential issues
  • Provide comprehensive review feedback

Positive Changes

Clean Migration Strategy

  • Consistent replacement of minio references with rustfs across all configuration files
  • Proper service name changes from minio to rustfs and createbuckets to create-buckets
  • Updated dependency references in companion service

Simplified Configuration

  • Removed minio-specific healthchecks and complex mc client setup
  • Streamlined bucket creation using AWS CLI instead of minio mc client
  • Cleaner environment variable naming (RUSTFS_* prefix)

⚠️ Security Concerns

Hardcoded Credentials

High Priority: Default credentials rustfsadmin/rustfsadmin are used across all environments:

# Found in multiple files:
RUSTFS_ACCESS_KEY=rustfsadmin
RUSTFS_SECRET_KEY=rustfsadmin

Recommendation: These should be:

  1. Different for each environment
  2. Generated using secure random strings
  3. Stored as secrets, not in example files

Missing Security Headers

The rustfs service configuration lacks security-focused environment variables that minio typically includes for production deployments.

🐛 Potential Issues

Removed Health Checks

Issue: The minio service had health checks (docker-compose-auth.yaml:122-126 in previous version), but rustfs service doesn't include any.

# Previous minio healthcheck (now removed)
healthcheck:
  test: ["CMD", "mc", "ready", "local"]
  interval: 5s
  timeout: 5s
  retries: 5

Impact: Services depending on rustfs may start before rustfs is fully ready.

Recommendation: Add rustfs-specific health checks or implement readiness probes.

Bucket Creation Timing

Issue: create-buckets service uses a simple sleep 5 before attempting bucket creation:

sleep 5;
aws s3 mb s3://gopie --endpoint-url http://rustfs:9000 || true;

Risk: May fail if rustfs takes longer than 5 seconds to start.

Recommendation: Implement proper wait conditions or retry logic.

Missing Bucket Permissions

Issue: Previous minio setup included public access configuration:

# This line was removed:
/usr/bin/mc anonymous set public myminio/gopie/visualizations

Impact: If the application expects public read access to visualizations, this may break functionality.

Question: Does the application still need public access to the visualizations bucket?

📈 Performance Considerations

Volume Mapping Changes

  • Previous: ./volumes/minio/data:/export and ./volumes/minio/config:/root/.minio
  • Current: ./volumes/rustfs/data:/data

Note: Ensure the new path structure is compatible with rustfs data format.

Port Changes

  • Removed: MinIO Console port 9002
  • Added: RustFS Console port 9001

This is fine, but documentation should be updated to reflect the new console URL.

🏗️ Best Practice Recommendations

  1. Environment-Specific Configurations

    # Example for production
    RUSTFS_ACCESS_KEY=${RUSTFS_ACCESS_KEY}
    RUSTFS_SECRET_KEY=${RUSTFS_SECRET_KEY}
  2. Add Health Checks

    rustfs:
      # ... existing config
      healthcheck:
        test: ["CMD", "curl", "-f", "http://localhost:9000/health"]
        interval: 30s
        timeout: 10s
        retries: 3
  3. Improve Bucket Creation

    # More robust bucket creation
    until aws s3api head-bucket --bucket gopie --endpoint-url http://rustfs:9000 2>/dev/null; do
      aws s3 mb s3://gopie --endpoint-url http://rustfs:9000 || true
      sleep 2
    done

Test Coverage Suggestions

Since this changes core storage infrastructure, consider adding:

  1. Integration tests for S3 API compatibility
  2. Data migration verification tests
  3. Service connectivity tests
  4. Bucket creation verification tests

📊 Summary

Overall Assessment: ✅ Good migration with some security and reliability concerns that should be addressed.

Priority Actions:

  1. 🔴 High: Replace hardcoded credentials with secure alternatives
  2. 🟡 Medium: Add health checks to rustfs service
  3. 🟡 Medium: Improve bucket creation reliability
  4. 🔵 Low: Verify public access requirements for visualizations

The migration successfully replaces minio with rustfs while maintaining configuration consistency. Address the security concerns before merging to production.


@shreeharsha-factly shreeharsha-factly requested review from surajmn1 and removed request for shreeharsha-factly December 18, 2025 09:19
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
docker-compose-auth.yaml (1)

121-138: Consider adding a healthcheck to improve reliability.

The create-buckets service relies on a fixed 5-second sleep before attempting bucket creation. Without a healthcheck on the rustfs service, this approach is fragile and may fail if RustFS takes longer to start.

🔎 Add healthcheck to rustfs service:
  rustfs:
    image: rustfs/rustfs:latest
    restart: always
    ports:
      - "9000:9000" # S3 API
      - "9001:9001" # RustFS Console
    environment:
      - RUSTFS_ACCESS_KEY=rustfsadmin
      - RUSTFS_SECRET_KEY=rustfsadmin
      - RUSTFS_CONSOLE_ENABLE=true
      - RUSTFS_ADDRESS=:9000
    volumes:
      - ./volumes/rustfs/data:/data
+   healthcheck:
+     test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
+     interval: 5s
+     timeout: 5s
+     retries: 5
    networks:
      - gopie

Then update create-buckets to use condition:

  create-buckets:
    image: amazon/aws-cli:latest
    depends_on:
-     - rustfs
+     rustfs:
+       condition: service_healthy

This eliminates the need for the sleep and makes startup more reliable.

docker-compose-noauth.yaml (1)

94-111: Add healthcheck for improved reliability.

Similar to docker-compose-auth.yaml, this relies on a fixed sleep instead of a proper healthcheck. Consider adding a healthcheck to the rustfs service and using condition: service_healthy in the create-buckets dependency.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7cce205 and 32dcac7.

📒 Files selected for processing (5)
  • config-auth.env.example (3 hunks)
  • config-noauth.env.example (3 hunks)
  • docker-compose-auth.yaml (2 hunks)
  • docker-compose-noauth.yaml (2 hunks)
  • docker-compose.yaml (4 hunks)
🧰 Additional context used
🧠 Learnings (7)
📓 Common learnings
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to server/**/*.go : Store dataset files in S3-compatible storage (MinIO) in the Go backend server
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to config-noauth.env : Use `config-noauth.env` for no-auth development setup

Applied to files:

  • config-noauth.env.example
  • config-auth.env.example
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Configure environment variables including NEXT_PUBLIC_COMPANION_URL, NEXT_PUBLIC_GOPIE_API_URL, GOPIE_API_URL, and NEXT_PUBLIC_ENABLE_AUTH

Applied to files:

  • config-noauth.env.example
  • config-auth.env.example
  • docker-compose.yaml
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to server/**/*.go : Store dataset files in S3-compatible storage (MinIO) in the Go backend server

Applied to files:

  • config-noauth.env.example
  • config-auth.env.example
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to docker-compose-noauth.yaml : Use `docker-compose-noauth.yaml` for no-auth development setup

Applied to files:

  • docker-compose-auth.yaml
  • docker-compose-noauth.yaml
📚 Learning: 2025-12-02T11:48:34.955Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: chat-server/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:34.955Z
Learning: Applies to chat-server/**/docker-compose.yaml : Use docker-compose for orchestrating service dependencies in development

Applied to files:

  • docker-compose-auth.yaml
  • docker-compose.yaml
📚 Learning: 2025-07-29T09:36:39.908Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-29T09:36:39.908Z
Learning: Applies to chat-server/{pyproject.toml,requirements.txt} : Use `uv sync` to install dependencies in the chat server

Applied to files:

  • docker-compose-auth.yaml
  • docker-compose.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: claude-review
🔇 Additional comments (5)
config-auth.env.example (1)

53-55: LGTM! Credentials and endpoints consistently updated.

The migration from MinIO to RustFS credentials and endpoints is correctly applied across all three service configurations (GOPIE, COMPANION, and CHAT server).

Also applies to: 92-93, 212-212, 214-215

config-noauth.env.example (1)

27-29: LGTM! RustFS configuration correctly applied.

The credentials and endpoints are consistently updated across all service configurations, matching the changes in config-auth.env.example.

Also applies to: 44-45, 172-172, 174-175

docker-compose-noauth.yaml (1)

78-93: Verify RustFS configuration and consider using a specific version tag.

Same concerns as in docker-compose-auth.yaml:

  1. Verify the RustFS Docker image and environment variables are correct
  2. Consider using a specific version tag instead of :latest for reproducible deployments
docker-compose.yaml (2)

163-177: Verify RustFS configuration and use a specific version tag.

Same concerns as the other docker-compose files regarding the RustFS Docker image, environment variables, and the use of :latest tag.


197-204: Remove this comment. The config.env file is gitignored and therefore not included in the PR—it's a local deployment concern, not something a reviewer can or should verify. The example configuration files (config-auth.env.example and config-noauth.env.example) already specify RustFS credentials consistently across all services, confirming the intended storage backend.

Likely an incorrect or invalid review comment.

Comment thread: docker-compose-auth.yaml
Comment on lines +105 to 120

  rustfs:
    image: rustfs/rustfs:latest
    restart: always
    ports:
-     - "9000:9000"
-     - "9002:9002"
-   volumes:
-     - ./volumes/minio/data:/export
-     - ./volumes/minio/config:/root/.minio
+     - "9000:9000" # S3 API
+     - "9001:9001" # RustFS Console
    environment:
-     - MINIO_ACCESS_KEY=minioadmin
-     - MINIO_SECRET_KEY=minioadmin
+     - RUSTFS_ACCESS_KEY=rustfsadmin
+     - RUSTFS_SECRET_KEY=rustfsadmin
+     - RUSTFS_CONSOLE_ENABLE=true
+     - RUSTFS_ADDRESS=:9000
+   volumes:
+     - ./volumes/rustfs/data:/data
    networks:
      - gopie
-   command: server -console-address :9002 /export
-   healthcheck:
-     test: ["CMD", "mc", "ready", "local"]
-     interval: 5s
-     timeout: 5s
-     retries: 5

Contributor


⚠️ Potential issue | 🔴 Critical

Fix RUSTFS_ADDRESS format to include IP binding.

The RUSTFS_ADDRESS environment variable should be 0.0.0.0:9000 instead of :9000 to properly bind to all interfaces. Additionally, add RUSTFS_CONSOLE_ADDRESS=0.0.0.0:9001 to explicitly configure the console binding. The official RustFS configuration uses RUSTFS_ADDRESS=0.0.0.0:9000 and RUSTFS_CONSOLE_ADDRESS=0.0.0.0:9001.

Also consider adding a healthcheck: "curl -f http://localhost:9000/health && curl -f http://localhost:9001/rustfs/console/health" to ensure proper service readiness before bucket creation attempts. The default RUSTFS_ACCESS_KEY and RUSTFS_SECRET_KEY values are intended for development only — rotate these credentials for production.
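A compose-level sketch of those suggestions follows. The RUSTFS_* names and both health endpoints are taken from this comment, not verified against RustFS documentation, so treat them as assumptions to confirm:

```yaml
rustfs:
  image: rustfs/rustfs:latest
  environment:
    - RUSTFS_ACCESS_KEY=rustfsadmin # development default; rotate for production
    - RUSTFS_SECRET_KEY=rustfsadmin # development default; rotate for production
    - RUSTFS_CONSOLE_ENABLE=true
    - RUSTFS_ADDRESS=0.0.0.0:9000
    - RUSTFS_CONSOLE_ADDRESS=0.0.0.0:9001
  healthcheck:
    test:
      [
        "CMD-SHELL",
        "curl -f http://localhost:9000/health && curl -f http://localhost:9001/rustfs/console/health",
      ]
    interval: 5s
    timeout: 5s
    retries: 5
```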

🤖 Prompt for AI Agents
In docker-compose-auth.yaml around lines 105 to 120, update the rustfs service
environment and readiness: change RUSTFS_ADDRESS from ":9000" to "0.0.0.0:9000",
add RUSTFS_CONSOLE_ADDRESS="0.0.0.0:9001" to bind the console, and add a
healthcheck that probes both endpoints (e.g. curl -f
http://localhost:9000/health && curl -f
http://localhost:9001/rustfs/console/health) so other services wait for
readiness; also note that RUSTFS_ACCESS_KEY/RUSTFS_SECRET_KEY are development
defaults and should be rotated for production.

Comment thread: docker-compose.yaml
Comment on lines +179 to 195

  create-buckets:
    image: amazon/aws-cli:latest
    depends_on:
-     minio:
-       condition: service_healthy
+     - rustfs
    environment:
      - AWS_ACCESS_KEY_ID=rustfsadmin
      - AWS_SECRET_ACCESS_KEY=rustfsadmin
      - AWS_DEFAULT_REGION=us-east-1
    networks:
      - gopie
    entrypoint: >
      /bin/sh -c "
-     echo 'Minio is up. Creating the bucket!!' &&
-     /usr/bin/mc alias set myminio http://minio:9000 minioadmin minioadmin &&
-     /usr/bin/mc mb -p myminio/gopie &&
-     /usr/bin/mc anonymous set public myminio/gopie
+     sleep 5;
+     aws s3 mb s3://gopie --endpoint-url http://rustfs:9000 || true;
+     echo 'Buckets created successfully';
+     exit 0;
      "
Contributor


⚠️ Potential issue | 🟡 Minor

Inconsistent bucket creation and missing healthcheck.

Two issues:

  1. This compose file only creates the gopie bucket (line 192), while docker-compose-auth.yaml and docker-compose-noauth.yaml create both gopie and downloads buckets. Based on config-auth.env.example line 58 (GOPIE_DOWNLOADS_S3_BUCKET=downloads), the downloads bucket appears to be required.

  2. Similar to other files, consider adding a healthcheck instead of relying on sleep 5.

🔎 Add the missing downloads bucket:
  entrypoint: >
    /bin/sh -c "
    sleep 5;
    aws s3 mb s3://gopie --endpoint-url http://rustfs:9000 || true;
+   aws s3 mb s3://downloads --endpoint-url http://rustfs:9000 || true;
    echo 'Buckets created successfully';
    exit 0;
    "
🤖 Prompt for AI Agents
In docker-compose.yaml around lines 179 to 195, the create-buckets service only
creates the gopie bucket and uses a hardcoded sleep rather than a healthcheck;
update it to also create the downloads bucket (aws s3 mb s3://downloads
--endpoint-url http://rustfs:9000) so it matches docker-compose-auth.yaml/noauth
and config-auth.env.example, and replace the sleep hack by adding a proper
healthcheck for the rustfs service (or for this container) that waits for the S3
endpoint on rustfs:9000 to be ready before running the bucket creation commands,
then run both aws s3 mb commands (each tolerant of existing buckets) and exit
successfully.

- mc: <ERROR> Unable to get bucket CORS configuration for rustfs/gopie. GetBucketCors is not implemented yet.

claude Bot commented Dec 23, 2025

Claude finished @elliot14A's task.


Pull Request Review: Integrate rustfs

Todo List:

  • Read all changed configuration files
  • Analyze Docker compose changes
  • Review environment variable configurations
  • Check web component changes
  • Assess security implications
  • Check for best practices and potential issues
  • Provide comprehensive review feedback

Strengths of the Migration

Comprehensive Migration

  • Consistent Updates: All MinIO references properly replaced with RustFS across configuration files, Docker compose services, and environment variables
  • Maintained Functionality: S3-compatible API preserved, ensuring application compatibility
  • Clean Service Definitions: RustFS service properly configured with appropriate ports and environment variables

Good Configuration Structure

  • Environment Separation: Proper configuration for both auth and no-auth environments
  • Service Dependencies: Correct depends_on configurations updated from minio to rustfs
  • Bucket Creation: AWS CLI approach for bucket creation is appropriate for RustFS

🚨 Critical Security Issues

1. Hardcoded Default Credentials (HIGH PRIORITY)

Files: All config and docker-compose files
Issue: Default credentials rustfsadmin/rustfsadmin used across all environments

# Found in multiple files:
RUSTFS_ACCESS_KEY=rustfsadmin
RUSTFS_SECRET_KEY=rustfsadmin

Risk:

  • Production deployments using default credentials are easily compromised
  • Credentials are identical across all services and environments

Recommendation:

  1. Generate unique, strong credentials for each environment
  2. Use environment-specific secrets management
  3. Never commit actual production credentials to repository

2. Missing Encryption Configuration

Issue: No TLS/SSL configuration for RustFS service
Risk: Data transmission in plaintext in production environments


⚠️ Reliability & Performance Issues

3. Missing Health Checks (MEDIUM PRIORITY)

Files: docker-compose-auth.yaml:105-120, docker-compose-noauth.yaml:78-93

Issue: RustFS service lacks health checks, unlike the previous MinIO setup

# Previous MinIO had health checks, RustFS doesn't
rustfs:
  image: rustfs/rustfs:latest
  # Missing healthcheck configuration

Impact:

  • Dependent services may start before RustFS is ready
  • No automatic restart on service failure
  • Poor startup reliability

Recommendation:

rustfs:
  # ... existing config
  healthcheck:
    test: ["CMD", "curl", "-f", "http://localhost:9000/health"]
    interval: 30s
    timeout: 10s
    retries: 3

4. Fragile Bucket Creation

Files: All docker-compose files
Issue: sleep 5 approach is unreliable

# Current approach:
sleep 5;
aws s3 mb s3://gopie --endpoint-url http://rustfs:9000 || true;

Problems:

  • Fixed timeout may be insufficient
  • No retry logic on failure
  • Silent failures due to || true

Recommendation: Implement proper wait-for-service logic or use health check conditions
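A minimal shell sketch of that wait-for-service logic. The retry_until helper is hypothetical, and the commented aws invocation only illustrates how it would slot into the create-buckets entrypoint:

```shell
# retry_until: run a command until it succeeds or a maximum number of
# attempts is reached. Returns non-zero if the command never succeeds.
retry_until() {
  max=$1; shift
  i=0
  until "$@"; do
    i=$((i + 1))
    if [ "$i" -ge "$max" ]; then
      return 1
    fi
    sleep 1
  done
  return 0
}

# Hypothetical use in the create-buckets entrypoint, replacing the fixed sleep:
#   retry_until 30 aws s3api head-bucket --bucket gopie --endpoint-url http://rustfs:9000
```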

5. Inconsistent Bucket Creation

File: docker-compose.yaml:192
Issue: Only creates gopie bucket, while auth/noauth versions create both gopie and downloads

Fix Needed: Add missing downloads bucket creation to match other files


🔧 Configuration Issues

6. Incorrect Address Binding

Files: All docker-compose files
Issue: RUSTFS_ADDRESS=:9000 should be RUSTFS_ADDRESS=0.0.0.0:9000

Impact: May cause binding issues in Docker containers

7. Latest Tag Usage

Files: All docker-compose files
Issue: Using rustfs/rustfs:latest instead of pinned version

Risk: Unpredictable behavior with automatic updates
Recommendation: Use specific version tags for production reliability


📊 Web Component Analysis

8. S3 URL Parsing Logic

Files: web/src/components/dataset/*.tsx

Positive:

  • Comprehensive URL parsing for different S3 endpoint formats
  • Proper handling of RustFS localhost endpoints
  • Good error handling and logging

Areas for Improvement:

  • Complex URL parsing logic could be extracted to utility function
  • Consider adding unit tests for URL parsing scenarios

🏗️ Best Practice Recommendations

9. Environment Variable Consistency

  • Consider using RUSTFS_* prefix consistently instead of mixed AWS_* and RUSTFS_*
  • Add validation for required environment variables

10. Documentation Updates Needed

  • README should document RustFS migration
  • Console URL changed from :9002 to :9001 - update documentation
  • Environment setup instructions need updating

11. Volume Path Changes

Previous: ./volumes/minio/data:/export
Current: ./volumes/rustfs/data:/data

Ensure data migration strategy for existing deployments.


🧪 Testing Recommendations

Missing Test Coverage

  1. S3 API Compatibility Tests: Verify RustFS maintains S3 API compatibility
  2. Service Integration Tests: Test service startup order and dependencies
  3. Bucket Creation Tests: Verify all required buckets are created
  4. Data Migration Tests: Test existing data compatibility

📋 Action Items

Immediate (Before Merge)

  1. 🔴 Replace hardcoded credentials with secure alternatives
  2. 🟡 Add health checks to RustFS service in all compose files
  3. 🟡 Fix missing downloads bucket in docker-compose.yaml
  4. 🟡 Correct RUSTFS_ADDRESS binding format

Short Term (After Merge)

  1. 🔵 Pin RustFS version instead of using latest
  2. 🔵 Implement robust bucket creation logic
  3. 🔵 Add integration tests for S3 functionality
  4. 🔵 Update documentation for new console port and setup

Long Term

  1. 🔵 Add TLS/SSL configuration for production
  2. 🔵 Implement proper secrets management
  3. 🔵 Create data migration guide for existing deployments

📊 Summary

Overall Assessment: ✅ Solid migration with critical security issues to address

The migration successfully replaces MinIO with RustFS while maintaining S3 API compatibility. The implementation is thorough and consistent across all configuration files. However, critical security concerns with default credentials and missing health checks must be addressed before production deployment.

Merge Recommendation: ✅ Approve with required changes - Address security issues and reliability improvements before merging to production.


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 32dcac7 and 6a55464.

📒 Files selected for processing (2)
  • web/src/components/dataset/dataset-upload-wizard.tsx
  • web/src/components/dataset/file-refresh-wizard.tsx
🧰 Additional context used
📓 Path-based instructions (3)
web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.{ts,tsx}: Use Next.js 15 with App Router and React 19 for the web frontend framework
Use TypeScript as the language for the web frontend
Generate and use types from API responses for type safety
Use dynamic imports and code splitting for optimal performance
Load DuckDB WASM on-demand and use web workers for client-side SQL execution

Files:

  • web/src/components/dataset/dataset-upload-wizard.tsx
  • web/src/components/dataset/file-refresh-wizard.tsx
web/src/components/**/*.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/components/**/*.{ts,tsx}: Use TailwindCSS with Radix UI components for styling instead of custom CSS
Use React Hook Form with Zod validation for form handling
Use pre-configured queries and mutations from lib/queries and lib/mutations for API calls
Wrap features in error boundaries to handle errors gracefully
Always handle loading and error states in UI components
Use Next.js Image component for image optimization

Files:

  • web/src/components/dataset/dataset-upload-wizard.tsx
  • web/src/components/dataset/file-refresh-wizard.tsx
web/src/components/dataset/**upload**.{ts,tsx}

📄 CodeRabbit inference engine (web/CLAUDE.md)

Use Uppy for dataset file uploads in upload components

Files:

  • web/src/components/dataset/dataset-upload-wizard.tsx
🧠 Learnings (1)
📚 Learning: 2025-12-02T11:48:50.411Z
Learnt from: CR
Repo: factly/gopie PR: 0
File: web/CLAUDE.md:0-0
Timestamp: 2025-12-02T11:48:50.411Z
Learning: Applies to web/src/components/dataset/**upload**.{ts,tsx} : Use Uppy for dataset file uploads in upload components

Applied to files:

  • web/src/components/dataset/dataset-upload-wizard.tsx
  • web/src/components/dataset/file-refresh-wizard.tsx
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: claude-review
🔇 Additional comments (1)
web/src/components/dataset/file-refresh-wizard.tsx (1)

174-184: Document the intended behavior for single-path bucket URLs and add validation if uploadResponse.body.key should always be present.

The code constructs an s3 URL without a key when uploadResponse?.body?.key is undefined (s3://bucket). This fallback appears intentionally designed—it's implemented identically in both file-refresh-wizard.tsx (line 184) and dataset-upload-wizard.tsx (line 394), each with the comment "assume it's just the bucket," suggesting support for RustFS or similar storage backends that may handle bucket-only URLs.

However, the lack of validation and error handling creates ambiguity: it's unclear whether this fallback is intentional and whether the backend source/s3/refresh endpoint actually accepts key-less s3:// paths. If this fallback is intentional for RustFS:

  • Document it clearly to prevent confusion
  • Add a comment explaining the RustFS use case

If uploadResponse.body.key should always be present for valid operations:

  • Add validation with descriptive error messaging before constructing the URL

Comment on lines +381 to +394

            // Non-S3 URL, might be RUSTFS, presigned URL, or proxy
            // For localhost:9000 (RUSTFS) or similar, the format is usually: http://localhost:9000/bucket
            // Try to extract path assuming format: /bucket
            if (pathParts.length >= 2) {
              const bucket = pathParts[0];
              const key = pathParts.slice(1).join("/");
              s3Url = `s3://${bucket}/${key}`;
            } else if (pathParts.length === 1) {
              // Only one path part, assume it's just the bucket
-             s3Url = `s3://${pathParts[0]}`;
+             const uploadResponse = useUploadStore.getState()
+               .uploadResponse as { body?: { key?: string } } | undefined;
+
+             const key = uploadResponse?.body?.key;
+             s3Url = `${key ? `s3://${pathParts[0]}/${key}` : `s3://${pathParts[0]}`}`;
Contributor


⚠️ Potential issue | 🟠 Major

Validate that the s3:// URL includes a key and consider refactoring duplicated logic.

This code has the same issue as in file-refresh-wizard.tsx: when pathParts.length === 1 and uploadResponse?.body?.key is undefined or empty, it constructs an incomplete s3 URL (s3://bucket). This will likely cause the dataset creation API call at line 425 to fail.

Additionally, this URL parsing logic is duplicated across multiple upload components, which increases maintenance burden.

🔎 Proposed fix with validation
          } else {
            // Non-S3 URL, might be RUSTFS, presigned URL, or proxy
            // For localhost:9000 (RUSTFS) or similar, the format is usually: http://localhost:9000/bucket
            // Try to extract path assuming format: /bucket
            if (pathParts.length >= 2) {
              const bucket = pathParts[0];
              const key = pathParts.slice(1).join("/");
              s3Url = `s3://${bucket}/${key}`;
            } else if (pathParts.length === 1) {
-             // Only one path part, assume it's just the bucket
              const uploadResponse = useUploadStore.getState()
                .uploadResponse as { body?: { key?: string } } | undefined;

              const key = uploadResponse?.body?.key;
-             s3Url = `${key ? `s3://${pathParts[0]}/${key}` : `s3://${pathParts[0]}`}`;
+             if (!key) {
+               throw new Error("Upload response missing object key for single-path URL");
+             }
+             s3Url = `s3://${pathParts[0]}/${key}`;
            } else {

Refactoring suggestion: Consider extracting the S3 URL parsing logic into a shared utility function (e.g., lib/utils/s3-url-parser.ts) to avoid duplication:

// lib/utils/s3-url-parser.ts
export function parseUploadUrlToS3(
  uploadURL: string, 
  uploadResponse?: { body?: { key?: string } }
): string {
  // Consolidated parsing logic here
}

Then use it in both files:

s3Url = parseUploadUrlToS3(uploadURL, useUploadStore.getState().uploadResponse);
🤖 Prompt for AI Agents
In web/src/components/dataset/dataset-upload-wizard.tsx around lines 381 to 394,
the code constructs s3://bucket when pathParts.length === 1 and
uploadResponse?.body?.key is missing, producing an incomplete S3 URL that will
fail later; update the logic to validate that a key exists before forming an
s3://bucket/key URL (if no key, fail early or return an explicit error/undefined
instead of s3://bucket), and refactor this parsing into a shared utility (e.g.,
lib/utils/s3-url-parser.ts with signature parseUploadUrlToS3(uploadURL,
uploadResponse)) so both this file and file-refresh-wizard.tsx call the same
function and avoid duplicated logic.
