
S3 storage backend integration #2

Merged: dcadenas merged 1 commit into main from s3_storage_integration on Apr 15, 2025

Conversation

@dcadenas (Collaborator) commented Apr 15, 2025

Sets up configuration for S3 integration.

Summary by CodeRabbit

  • New Features

    • Added a new script to upload media files to the server with NIP-24242 authentication.
  • Configuration

    • S3 storage backend is now enabled by default in the configuration file.
    • S3 secret access key must be provided via an environment variable.
  • Dependency and Build Changes

    • Simplified feature flags and dependencies; S3 and related dependencies are now always included.
    • Docker build steps and runtime dependencies updated for consistency and expanded support.
  • Refactor

    • Unified storage type handling and removed conditional compilation for S3 storage.
    • Improved error handling and logging for S3 operations.
  • Chores

    • Removed the storage management CLI tool and related configuration options.
    • Updated compose file to remove unused feature flags and add S3 secret environment variable.

@coderabbitai (Bot) commented Apr 15, 2025

Walkthrough

This change set removes optional feature flags and conditional compilation from the project, making S3 storage support and its related dependencies always enabled. The storage-manage CLI binary and its associated code are deleted, and the configuration files now default to S3 storage, with secrets supplied through required environment variables. Build scripts and Docker configurations are simplified by removing feature-specific arguments and enabling default features. Error handling in the S3 storage backend is improved for clarity, a new script is added for uploading media files with authentication, and the codebase is refactored to use updated naming and unconditional inclusion of S3 support.

Changes

| File(s) | Change Summary |
| --- | --- |
| Cargo.toml | Removed storage-manage binary and related features; made S3 and other dependencies non-optional; simplified feature flags and dependencies. |
| src/bin/storage-manage.rs | Deleted file implementing the storage-manage CLI tool and all its public entities. |
| src/settings.rs | Renamed StorageBackendType to StorageType; removed conditional compilation for S3; made S3 settings always present; added Default for S3 settings. |
| src/lib.rs | Removed conditional compilation from the spaces module, making it always included. |
| src/bin/main.rs | Updated import and usage of the storage type enum; removed conditional S3 compilation. |
| src/spaces.rs | Refined error handling for the S3 head_object call, improving error classification and logging. |
| .../Dockerfile, Dockerfile.arm64 | Simplified build steps; removed feature-specific build flags; added runtime dependencies for ARM64. |
| config.yaml | Enabled S3 storage backend by default; added S3 configuration; secret key set via environment variable. |
| docker-compose.yml | Removed build argument for features; added S3 secret key environment variable. |
| scripts/blossom_media_upload.sh | New script for uploading media files to the server with authentication and error handling. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant blossom_media_upload.sh
    participant NAK CLI
    participant BlossomServer

    User->>blossom_media_upload.sh: Provide file, server URL, group ID, secret
    blossom_media_upload.sh->>NAK CLI: Generate auth event (base64)
    blossom_media_upload.sh->>BlossomServer: PUT /media (with file, headers, auth)
    BlossomServer-->>blossom_media_upload.sh: Respond with status, body, headers
    blossom_media_upload.sh->>User: Output result, processed URL, status
```
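For readers who prefer code to diagrams, here is a minimal, hypothetical Rust sketch of the same PUT /media exchange. The `Authorization: Nostr <base64 event>` header follows the Blossom convention; the server URL and the pre-built auth event are placeholders, not code from this PR (the actual implementation is scripts/blossom_media_upload.sh, which generates the event with the nak CLI).

```rust
// Hypothetical sketch of the upload exchange shown in the diagram above.
// This only illustrates the shape of the request; the auth event is
// assumed to be generated and base64-encoded beforehand.
use std::fs;

fn upload_media(
    server: &str,         // e.g. "https://example.com" (placeholder)
    file_path: &str,
    auth_event_b64: &str, // base64-encoded kind-24242 auth event
) -> Result<(), Box<dyn std::error::Error>> {
    let body = fs::read(file_path)?;
    let resp = reqwest::blocking::Client::new()
        .put(format!("{server}/media"))
        .header("Authorization", format!("Nostr {auth_event_b64}"))
        .body(body)
        .send()?;
    println!("HTTP {}", resp.status());
    println!("{}", resp.text()?); // server responds with a JSON blob descriptor
    Ok(())
}
```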

Poem

🐇
Features trimmed and options few,
S3 is now the path we hew.
The storage-manage bunny hops away,
While scripts for uploads save the day.
With simpler builds and config, too,
Our codebase shines—so fresh, so new!
Hooray for change, from me to you!


@coderabbitai (Bot) left a comment

Actionable comments posted: 2

🧹 Nitpick comments (6)
scripts/blossom_media_upload.sh (2)

74-76: Consider expanding MIME type corrections for additional file types.

The script only handles correcting MIME types for markdown files. Consider adding corrections for other common file types that might be incorrectly identified.

```diff
 # If FILE_TYPE is text/plain for a markdown file, correct it (though unlikely for media endpoint)
 if [[ "$FILE_PATH" == *.md ]] && [[ "$FILE_TYPE" == "text/plain" ]]; then
     FILE_TYPE="text/markdown"
+elif [[ "$FILE_PATH" == *.json ]] && [[ "$FILE_TYPE" == "text/plain" ]]; then
+    FILE_TYPE="application/json"
+elif [[ "$FILE_PATH" == *.svg ]] && [[ "$FILE_TYPE" == "text/plain" ]]; then
+    FILE_TYPE="image/svg+xml"
 fi
```

160-164: Don't forget to handle interrupt signals.

While the script cleans up temporary files on normal exit and error paths, it doesn't handle interrupt signals (CTRL+C). Consider adding a trap for proper cleanup on interruption.

```diff
+# Set up trap to clean up temp files on script interruption
+trap 'echo ""; echo "Interrupted, cleaning up..."; rm -f "$TEMP_RESPONSE_FILE" "$TEMP_HEADERS_FILE"; exit 1' INT TERM

 # Clean up temp files
 rm -f "$TEMP_RESPONSE_FILE" "$TEMP_HEADERS_FILE"
```
config.yaml (1)

20-20: Update outdated comment about feature flag requirement

The comment mentions that S3 backend requires the 's3-storage' feature flag, but based on the changes in Cargo.toml and src/settings.rs, this feature flag has been removed and S3 support is now unconditional.

```diff
-# Requires the 's3-storage' feature flag during compilation.
+# S3 storage is now a core feature and always available.
```
src/spaces.rs (1)

174-179: Consider consolidating NotFound checks

There are two separate conditions checking for 404/NotFound status, which could be consolidated into a single check, especially since both conditions result in the same outcome (proceeding with upload).

```diff
-                if inner_error.is_not_found() {
-                    log::debug!(
-                        "Object {} not found (is_not_found() == true, HTTP {}), proceeding with upload",
-                        key,
-                        http_status
-                    );
-                // Fallthrough to upload logic
-                } else if http_status == 404 {
-                    log::debug!(
-                        "Object {} not found (HTTP status 404), proceeding with upload",
-                        key
-                    );
-                    // Fallthrough to upload logic
+                if inner_error.is_not_found() || http_status == 404 {
+                    log::debug!(
+                        "Object {} not found (is_not_found() == {}, HTTP {}), proceeding with upload",
+                        key,
+                        inner_error.is_not_found(),
+                        http_status
+                    );
+                    // Fallthrough to upload logic
```
src/settings.rs (1)

33-33: Update outdated comment about feature flag

This comment mentions that S3 backend settings are "only available if 's3-storage' feature is enabled", but the feature has been removed and S3 support is now unconditional.

```diff
-    /// S3 backend specific settings (only available if 's3-storage' feature is enabled)
+    /// S3 backend specific settings
```
Cargo.toml (1)

28-28: Empty dev feature set

The dev feature is now empty, which could be confusing for developers. Consider either removing it entirely if it's no longer needed or documenting its purpose if it will be used in the future.

Either remove the empty feature entirely or add a comment explaining its purpose:

```diff
-dev = []
+# Either remove this line if not needed, or:
+dev = [] # Reserved for development-only dependencies and features
```
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between deb515e and 389876f.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (11)
  • Cargo.toml (3 hunks)
  • Dockerfile (1 hunks)
  • Dockerfile.arm64 (2 hunks)
  • config.yaml (1 hunks)
  • docker-compose.yml (1 hunks)
  • scripts/blossom_media_upload.sh (1 hunks)
  • src/bin/main.rs (3 hunks)
  • src/bin/storage-manage.rs (0 hunks)
  • src/lib.rs (0 hunks)
  • src/settings.rs (3 hunks)
  • src/spaces.rs (1 hunks)
💤 Files with no reviewable changes (2)
  • src/lib.rs
  • src/bin/storage-manage.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/settings.rs (1)
src/spaces.rs (1)
  • new (32-84)
🔇 Additional comments (20)
scripts/blossom_media_upload.sh (6)

1-16: Excellent implementation of a pause_and_run function.

The function provides a useful debugging and confirmation mechanism for complex commands, enhancing the script's usability during testing and troubleshooting.


40-45: Good dependency check for the nak tool.

The script properly checks for required dependencies before proceeding and provides clear installation instructions when missing.


48-50: Cross-platform file information collection is well implemented.

The script handles both macOS and Linux variants of the stat command for file size, ensuring better compatibility across different environments.


58-65: Secure authentication implementation.

The script correctly uses the nak event tool to generate NIP-24242 authentication with appropriate metadata and expiration settings.


108-135: Comprehensive error handling implementation.

The error handling is thorough, with detailed error messages, response parsing, and proper cleanup of temporary files. The script extracts multiple sources of error information (status code, JSON error messages, custom headers) to provide clear feedback.


145-158: Insightful response processing.

The script intelligently parses the JSON response and notes when the processed media's SHA256 hash differs from the original, which is helpful for understanding server-side transformations.

Dockerfile.arm64 (2)

60-61: Build configuration simplified to use default features.

The build command was simplified to rely on default features instead of explicitly specifying them. This aligns with the S3 storage backend integration, making it a default part of the application.


83-83: Added X11 client libraries for extended functionality.

The addition of libxcb1 and libxcb-shm0 libraries suggests new image processing capabilities. These dependencies are required for certain graphics operations, particularly when using libraries that interact with X11.

Dockerfile (2)

37-40: Performance optimization through native CPU targeting.

Adding the target-cpu=native rustflag enables architecture-specific optimizations, which can significantly improve performance. This is a good practice for Docker builds where the target architecture is known.


42-45: Simplified installation with clear documentation.

The comments clearly document the feature changes, and the installation command is simplified to rely on default features. This makes the build process more maintainable and easier to understand.

src/bin/main.rs (2)

21-21: Type rename improves naming consistency.

Changing from StorageBackendType to StorageType improves naming consistency and simplifies the code. This is a good refactoring choice.


73-94: S3 storage now unconditionally supported.

Removing the conditional compilation for S3 storage makes it a first-class storage option, simplifying the codebase by eliminating conditional paths. This is aligned with the PR objective to integrate S3 storage as a standard backend.

config.yaml (2)

7-7: LGTM: Storage type changed to S3

The configuration has been updated to use S3 as the storage backend which aligns with the PR objective to set up S3 integration.


15-17: Filesystem storage settings correctly commented out

The filesystem storage configuration has been commented out as it's no longer the default storage type, which is consistent with the S3 integration goal.

src/spaces.rs (1)

159-205: Improved error handling for S3 head_object operations

The updated error handling provides better granularity and clarity:

  1. Explicit matching on SdkError::ServiceError type
  2. Proper extraction of HTTP status codes and error details
  3. Improved logging with more context for debugging
  4. Addition of a separate match arm for non-service errors
  5. Better error messages with context for troubleshooting

These improvements will make it easier to diagnose S3-related issues and provide better error reporting.
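To make the list above concrete, here is a hedged sketch of what the consolidated match could look like. The exact SdkError accessors (ServiceError payload, raw(), status()) differ across aws-sdk-s3 releases, and the function signature and error type here are illustrative, not the PR's verbatim code.

```rust
// Illustrative sketch of the head_object error classification described
// above. Accessor names on SdkError vary across aws-sdk-s3 versions;
// adjust to the version pinned in Cargo.toml.
use aws_sdk_s3::error::SdkError;

async fn object_exists(
    client: &aws_sdk_s3::Client,
    bucket: &str,
    key: &str,
) -> anyhow::Result<bool> {
    match client.head_object().bucket(bucket).key(key).send().await {
        Ok(_) => Ok(true),
        // Service errors carry an HTTP response we can classify.
        Err(SdkError::ServiceError(ctx)) => {
            let http_status = ctx.raw().status().as_u16();
            if ctx.err().is_not_found() || http_status == 404 {
                log::debug!(
                    "Object {} not found (HTTP {}), proceeding with upload",
                    key,
                    http_status
                );
                Ok(false)
            } else {
                anyhow::bail!("head_object for {} failed with HTTP {}", key, http_status)
            }
        }
        // Non-service errors: timeouts, connector failures, etc.
        Err(other) => anyhow::bail!("head_object transport error for {}: {}", key, other),
    }
}
```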

src/settings.rs (3)

10-12: Updated default_storage_type function to use the renamed enum

The default_storage_type function has been updated to use the renamed StorageType enum, maintaining the same default behavior.
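As a point of reference, a minimal sketch of what the renamed enum and its default function might look like follows. The variant names and serde attribute are inferred from the config.yaml excerpt later in this review, and the default variant shown is an assumption, not verified against src/settings.rs.

```rust
// Hedged sketch of the renamed enum; the serde attribute and variant
// names are inferred from config.yaml, and the default variant is a
// guess, not verified against the PR's src/settings.rs.
use serde::Deserialize;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Deserialize)]
#[serde(rename_all = "lowercase")]
pub enum StorageType {
    Filesystem,
    S3,
}

fn default_storage_type() -> StorageType {
    StorageType::Filesystem // assumed code default; config.yaml overrides it to s3
}
```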


4-8: ❓ Verification inconclusive

Renamed StorageBackendType to StorageType and removed feature gating

The storage type enum has been renamed to a simpler name and the conditional compilation for S3 has been removed, making S3 storage always available regardless of feature flags.

Let's check if there are any other references to the old enum name in the codebase that need updating:


🏁 Script executed:

```bash
#!/bin/bash
# Search for any remaining references to the old enum name
rg "StorageBackendType" --type rust
```

Length of output: 35


Final Assessment: Updated Enum Name Verified – Please Manually Confirm
The search for StorageBackendType returned no matches, indicating that the renaming to StorageType is complete and the feature gating has been removed as intended. However, because the search output did not provide detailed context, please perform a manual verification to ensure no lingering references (including in comments or non‑Rust files) remain.


92-103: ❓ Verification inconclusive

Added Default implementation for S3StorageSettings

The Default implementation for S3StorageSettings provides empty or default values for all fields, which is useful for initialization but requires users to set all necessary values explicitly before use.

Let's verify that all code paths properly handle the default initialization of S3StorageSettings:


🏁 Script executed:

```bash
#!/bin/bash
# Look for places where S3StorageSettings might be used with default values
rg -A 5 "S3StorageSettings::default" --type rust

# Check the error handling for empty/blank settings
rg -B 10 "access_key_id|secret_access_key|bucket_name" --type rust
```

Length of output: 8192


Ensure proper handling of default S3StorageSettings values

The Default implementation sets fields with empty strings or default values, which means that any code consuming these settings (for example in src/spaces.rs) will receive unvalidated inputs. The verification via rg confirms that default instances are used directly without additional error checks, so it's important that callers explicitly override these defaults to avoid runtime issues.

  • Default values: empty strings for bucket_name, access_key_id, and secret_access_key, and false for force_path_style.
  • Verified usage: Consumers in src/spaces.rs rely directly on these fields without validating them.
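Given the unvalidated defaults noted above, a small guard along these lines could catch the empty-credential case early. The field set mirrors the config.yaml excerpt in this review, the Option types for region and endpoint_url are assumptions based on their "Optional" comments, and validate() is a hypothetical addition, not part of this PR.

```rust
// Sketch of the Default impl described above, plus a hypothetical
// validate() guard; field names mirror config.yaml, and validate() is
// a suggested addition, not code from this PR.
#[derive(Debug, Clone)]
pub struct S3StorageSettings {
    pub region: Option<String>,
    pub endpoint_url: Option<String>,
    pub bucket_name: String,
    pub access_key_id: String,
    pub secret_access_key: String,
    pub force_path_style: bool,
}

impl Default for S3StorageSettings {
    fn default() -> Self {
        Self {
            region: None,
            endpoint_url: None,
            bucket_name: String::new(),
            access_key_id: String::new(),
            secret_access_key: String::new(),
            force_path_style: false,
        }
    }
}

impl S3StorageSettings {
    /// Hypothetical guard: fail fast instead of handing empty
    /// credentials to the S3 client in src/spaces.rs.
    pub fn validate(&self) -> Result<(), String> {
        if self.bucket_name.is_empty()
            || self.access_key_id.is_empty()
            || self.secret_access_key.is_empty()
        {
            return Err(
                "s3 settings incomplete: bucket_name, access_key_id and secret_access_key are required"
                    .into(),
            );
        }
        Ok(())
    }
}
```

Failing fast at settings load keeps the error close to its cause instead of surfacing later as an opaque S3 authentication failure at upload time.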
Cargo.toml (2)

14-14: Simplified default features for consistent S3 support

The default features list has been reduced, removing s3-storage and ranges as separate features. This change aligns with making S3 support unconditional in the codebase.


61-69: ❓ Verification inconclusive

Dependencies for S3 storage made non-optional

Previously optional dependencies for S3 storage are now non-optional, aligning with the unconditional S3 support in the codebase. This will ensure all necessary packages are always available, but it does increase the base dependency footprint.

Let's check the impact on build size and any potential conflicts:


🏁 Script executed:

```bash
#!/bin/bash
# Check dependency tree for the AWS dependencies
cargo tree -e normal aws-sdk-s3 aws-config http-body-0-4 http-range-header

# Look for any potential version conflicts
cargo tree -d
```

Length of output: 284


Manual Verification Needed: Confirm Build Impact and Version Conflicts

The PR now makes S3 dependencies non-optional to support unconditional S3 integration. This change increases the base dependency footprint, so it’s important to confirm that there’s no negative impact on the build (e.g., unwanted bloat or version conflicts).

Notes:

  • The dependency tree checks via the provided cargo commands were inconclusive because the cargo command wasn’t available in the current environment.
  • Action: Please verify in an environment where cargo is installed by running:
    • cargo tree -e normal aws-sdk-s3 aws-config http-body-0-4 http-range-header
    • cargo tree -d

Comment thread docker-compose.yml
Comment on lines +34 to +35
```yaml
# S3 credentials
APP__S3__SECRET_ACCESS_KEY: ${BLOSSOM_DEV_SECRET}
```

💡 Verification agent

🧩 Analysis chain

Ensure S3 credentials are properly secured.

The S3 secret access key is correctly sourced from an environment variable, but make sure this environment variable is properly set and secured. Consider adding validation or a check to ensure the variable is present.


🏁 Script executed:

```bash
#!/bin/bash
# Check if the required environment variable is defined
if [ -z "${BLOSSOM_DEV_SECRET}" ]; then
  echo "ERROR: BLOSSOM_DEV_SECRET environment variable is not set"
  echo "This variable is required for S3 storage to work correctly"
  exit 1
fi
echo "BLOSSOM_DEV_SECRET is set (value hidden for security)"
```

Length of output: 276


Action Required: Secure and properly configure S3 credentials.

The verification script clearly indicates that the BLOSSOM_DEV_SECRET environment variable is not set, causing the container to exit and S3 functionality to fail. Please ensure that this variable is defined—either via a secure environment file or another secure method—before deploying. Additionally, consider adding validation checks in your deployment or container initialization process to catch this error earlier.

  • File: docker-compose.yml (Lines 34-35)
  • Current snippet:
          # S3 credentials
          APP__S3__SECRET_ACCESS_KEY: ${BLOSSOM_DEV_SECRET}
  • Next steps:
    • Verify that BLOSSOM_DEV_SECRET is properly provided in your environment.
    • If using a .env file or CI/CD secrets manager, double-check that the variable is correctly spelled and loaded.
    • Add runtime validations or startup checks to ensure the presence of this critical credential.
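One way to implement the suggested startup check, as a hedged Rust sketch: the variable name APP__S3__SECRET_ACCESS_KEY comes from docker-compose.yml above, but wiring this into process startup (e.g. early in main) is an assumption, not something this PR does.

```rust
// Hedged sketch of a startup check for the S3 secret. The variable name
// comes from docker-compose.yml; calling this from main() is a suggested
// addition, not part of this PR.
fn ensure_s3_secret() -> Result<(), String> {
    match std::env::var("APP__S3__SECRET_ACCESS_KEY") {
        Ok(v) if !v.trim().is_empty() => Ok(()),
        _ => Err(
            "APP__S3__SECRET_ACCESS_KEY is not set; S3 uploads will fail to authenticate".into(),
        ),
    }
}
```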

Comment thread config.yaml
Comment on lines +21 to +28
```yaml
s3:
  region: "us-east-1" # Optional, defaults to provider default
  endpoint_url: "https://nyc3.digitaloceanspaces.com" # Optional, use for non-AWS S3-compatible storage like DigitalOcean Spaces
  bucket_name: "blossom-test"
  access_key_id: "DO801BJEC2FP8YC8B7XD"
  # Secret key is now set via environment variable APP__S3__SECRET_ACCESS_KEY
  secret_access_key: ""
  force_path_style: false # Optional, default is false. Set to true for some S3-compatible services like MinIO
```

💡 Verification agent

🧩 Analysis chain

Verify S3 credentials handling and security

The configuration includes a hardcoded access key ID and references an environment variable for the secret key. While storing the secret key in an environment variable is good practice, exposing the access key ID in the config file might be a security concern if this file is committed to version control.

Are there plans to also move the access key ID to an environment variable? Additionally, please verify that the DigitalOcean Space (based on the endpoint URL) and credentials are properly configured and secured:


🏁 Script executed:

#!/bin/bash
# Check if the environment variable for the secret key is set in docker-compose.yml
grep -A 10 "environment:" docker-compose.yml | grep "APP__S3__SECRET_ACCESS_KEY"

# Check if there are any environment variables for access key ID
grep -A 10 "environment:" docker-compose.yml | grep -i "access_key"

# Look for sensitive keys in git history (only look for potential patterns, don't display actual keys)
git log -p | grep -i "secret\|key\|token\|password" | wc -l

Length of output: 256


Security Issue: Hardcoded Access Key ID Persists in Config

  • The secret key is correctly externalized via the environment variable (as confirmed by the docker-compose.yml content).
  • However, the S3 configuration in config.yaml still hardcodes the access_key_id ("DO801BJEC2FP8YC8B7XD"). This could expose sensitive credential information if the file is committed to version control.
  • Please consider moving the access key ID to an environment variable (or another secure store) to ensure both parts of the S3 credentials are handled consistently.
  • Additionally, double-check that the DigitalOcean Spaces endpoint (https://nyc3.digitaloceanspaces.com) and its associated credentials are correctly and securely configured.

dcadenas merged commit 7395935 into main on Apr 15, 2025. 4 checks passed.