
fix: stream database dumps to support large databases#128

Open
avasis-ai wants to merge 1 commit into outerbase:main from avasis-ai:fix/streaming-database-dump

Conversation

@avasis-ai

Summary

Fixes #59 — Database dumps fail on large databases due to loading the entire dump into memory.

Changes

  • Streaming response: Replaces in-memory string accumulation with a ReadableStream response, avoiding memory exhaustion on large databases
  • Chunked data fetching: Uses LIMIT/OFFSET batches (500 rows) to fetch table data incrementally instead of loading entire tables at once
  • SQL injection fix: Properly escapes table identifiers with double-quoting and parameterizes schema lookups (the original code interpolated table names directly into SQL strings)
  • Proper SQL dump format: Wraps output in BEGIN TRANSACTION/COMMIT for atomicity
  • Correct Content-Type: Uses text/sql instead of application/x-sqlite3 since the output is SQL text, not a binary SQLite file
  • Filters internal tables: Excludes tmp_* tables from the dump output
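The identifier-escaping fix above can be sketched roughly as follows. This is a minimal illustration of the technique, not the PR's actual code; `escapeIdent` is a hypothetical helper name. Table and column names cannot be bound as parameters, so they are double-quoted with embedded quotes doubled (per SQL/SQLite quoting rules), while row values go through `?` placeholders.

```typescript
// Hypothetical helper: quote a SQL identifier, doubling any embedded
// double quotes so the name cannot break out of the quoted context.
function escapeIdent(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Identifiers are escaped; values stay parameterized via placeholders.
const table = 'user "data"';
const sql = `SELECT * FROM ${escapeIdent(table)} LIMIT ? OFFSET ?`;
```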

How it works

  1. Pre-fetches the list of user tables, so connection errors surface early as a 500 response
  2. Creates a ReadableStream that iterates through tables one at a time
  3. For each table, fetches data in batches of 500 rows using LIMIT ? OFFSET ?
  4. Streams SQL INSERT statements as they're generated, flushing in ~8KB chunks
  5. Memory usage stays constant regardless of database size
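The steps above can be sketched as a single `ReadableStream` producer. This is an illustrative sketch, not the PR's implementation: `fetchBatch` is a hypothetical stand-in for the Durable Object's query call, and the 500-row batch size and ~8KB flush threshold mirror the description above.

```typescript
// Sketch of the streaming dump, assuming a hypothetical
// fetchBatch(table, limit, offset) helper that runs a
// `SELECT * FROM <table> LIMIT ? OFFSET ?` query.
type Row = Record<string, unknown>;

const BATCH_SIZE = 500;       // rows per LIMIT/OFFSET query
const FLUSH_BYTES = 8 * 1024; // buffer ~8KB of SQL text per enqueued chunk

// Render a JS value as a SQL literal (strings single-quoted, '' doubled).
const sqlValue = (v: unknown): string =>
  v === null ? "NULL"
  : typeof v === "number" ? String(v)
  : `'${String(v).replace(/'/g, "''")}'`;

function dumpStream(
  tables: string[],
  fetchBatch: (table: string, limit: number, offset: number) => Promise<Row[]>,
): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  let buffer = "";

  return new ReadableStream<Uint8Array>({
    async start(controller) {
      // Enqueue the buffer once it crosses FLUSH_BYTES (or unconditionally
      // at the end), keeping memory bounded regardless of database size.
      const flush = (force = false) => {
        if (buffer && (force || buffer.length >= FLUSH_BYTES)) {
          controller.enqueue(encoder.encode(buffer));
          buffer = "";
        }
      };

      buffer += "BEGIN TRANSACTION;\n";
      for (const table of tables) {
        const ident = `"${table.replace(/"/g, '""')}"`; // escape identifier
        for (let offset = 0; ; offset += BATCH_SIZE) {
          const rows = await fetchBatch(table, BATCH_SIZE, offset);
          for (const row of rows) {
            const values = Object.values(row).map(sqlValue).join(", ");
            buffer += `INSERT INTO ${ident} VALUES (${values});\n`;
            flush();
          }
          if (rows.length < BATCH_SIZE) break; // last (possibly short) batch
        }
      }
      buffer += "COMMIT;\n";
      flush(true);
      controller.close();
    },
  });
}
```

Such a stream could then be returned directly, e.g. `new Response(dumpStream(tables, fetchBatch), { headers: { "Content-Type": "text/sql" } })`, so no full dump string ever exists in memory.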

Testing

All 9 dump tests pass, including new tests for:

  • Batched data fetching
  • NULL value handling
  • Boolean value handling
  • Special characters in table names

/claim #59

- Replace in-memory string accumulation with ReadableStream response
- Fetch table data in batches (LIMIT/OFFSET) to reduce memory pressure
- Properly escape SQL identifiers and values (fixes injection vulnerability)
- Wrap dump in BEGIN TRANSACTION/COMMIT for atomicity
- Fix Content-Type to text/sql for SQL text dumps
- Skip internal tmp_* tables from dump output

/claim outerbase#59
@avasis-ai
Author

Hi @Brayden — following up. This PR streams database dumps using ReadableStream + batched fetching to handle large databases without OOM. Fixes #59. Ready for review.

@avasis-ai
Author

Hey @Brayden — noticed you merged a couple other PRs recently. This one's been sitting for a bit, happy to rebase if main has moved. The streaming approach should handle databases up to the DO limit without OOM. Let me know if you'd like any changes.

