-
Notifications
You must be signed in to change notification settings - Fork 0
Description
Data Persistence Architecture - Migration from File-Based to Database-First
Summary
Our current data persistence architecture has critical limitations that prevent scaling and risk data loss. After comprehensive analysis of requirements, constraints, and technology options, we recommend migrating to a PostgreSQL + Apache AGE architecture for production deployment.
Current Issues (HIGH Priority):
- 5-minute data loss window (gateway saves every 5 min)
- No version history or point-in-time recovery
- Project-level granularity prevents lazy loading
- File-based storage without transactions or backup strategy
- No search capability across projects or semantic search
Key Requirements Analyzed
Must-Have Constraints:
- Cloud-agnostic (no AWS/GCP/Azure-specific services)
- Gateway-agnostic architecture (gateways are stateless, can serve any org)
- Schema-less storage (modules define unknown data structures at runtime)
- Lazy loading (cannot load entire projects into memory)
- CRDT metadata retention (24h only, with snapshot compaction)
- Full-text AND semantic search ("car" matches "vehicle")
- Entity-level persistence (not project-level)
- Built-in backup and recovery strategy
Graph Nature of Our Platform:
- Whiteboard with nodes and edges (literal graph structure)
- Entity relationships across modules
- Dependency tracking ("what depends on this?")
- Version trees (v1 → v2 → v3)
- Path finding and traversal queries are common
Recommended Architecture: PostgreSQL + Apache AGE
Why This Combination:
-
Apache AGE = Graph Extension for PostgreSQL
- No new infrastructure (just PostgreSQL extension)
- SQL for structured queries + Cypher for graph queries
- Single database = unified backup/restore/monitoring
- Cloud-agnostic, open-source (Apache 2.0)
-
Addresses All Requirements:
- ✅ Schema-less: JSONB for entity properties
- ✅ Graph queries: Native Cypher support
- ✅ Entity-level storage: Load entities on-demand
- ✅ Full-text search: PostgreSQL tsvector
- ✅ Semantic search: pgvector extension + Ollama embeddings
- ✅ Versioning: Entity history with deltas
- ✅ CRDT compaction: Hourly snapshots, 24h operation retention
- ✅ Gateway-agnostic: Centralized PostgreSQL
-
Operational Simplicity:
- Already using PostgreSQL (no learning curve)
- Same backup tools (pgBackRest)
- Same monitoring and alerting
- Gradual adoption (add graph queries where beneficial)
-
Cost-Effective:
- $50-100/month (VPS + storage)
- Scales to 1K organizations, 10K projects, 1M+ entities
- No additional licensing (Apache 2.0)
Technology Stack (Tier 2 - Recommended)
Primary Database: PostgreSQL 15+ with Apache AGE extension
- Core data (users, orgs, projects)
- Entity storage (JSONB for schema-less data)
- Graph relationships (Cypher queries)
- Full-text search (tsvector + GIN indexes)
- Vector search (pgvector extension)
- Entity versioning (history table with deltas)
- CRDT operation log (24h retention with auto-cleanup)
Object Storage: MinIO (S3-compatible, self-hosted)
- Large snapshots (>1MB)
- Binary files and attachments
- Backup destination
Full-Text Search: Meilisearch
- Fast typo-tolerant search
- Better UX than PostgreSQL FTS alone
- Faceting and filtering
Semantic Search: pgvector + Ollama
- Self-hosted embeddings (nomic-embed-text model)
- "car" matches "vehicle" via vector similarity
- Cloud-agnostic (no OpenAI dependency)
CRDT Management: Yjs + PostgreSQL
- In-memory collaboration
- Operations logged to PostgreSQL (30s flush)
- Hourly snapshot compaction
- Auto-cleanup after 24h
Backup: pgBackRest
- Incremental backups
- Point-in-time recovery
- Automated verificationMigration Path
Phase 1: Foundation
- Install Apache AGE extension in PostgreSQL
- Create new schema (entities, entity_history, entity_relationships)
- Add pgvector extension for semantic search
- Set up MinIO for object storage
Phase 2: Migration
- Migrate organization data from JSON files to database tables
- Create graph nodes/edges from existing entities
- Migrate permissions and OAuth data
- Verify data integrity
Phase 3: Search & Indexing
- Deploy Meilisearch
- Set up Ollama for embeddings
- Index existing entities for full-text and semantic search
- Build search API endpoints
Phase 4: Gateway Integration
- Update gateway to use new persistence layer
- Implement lazy loading with LRU cache
- Add CRDT operation logging
- Deploy snapshot compaction jobs
Phase 5: Monitoring & Optimization
- Set up monitoring and alerting
- Optimize query performance
- Load testing and tuning
- Documentation and runbooks
Alternative Considered: PostgreSQL + Neo4j
If graph performance becomes critical (>5 level traversals, graph algorithms needed), we can upgrade to dual database architecture:
- PostgreSQL for source of truth
- Neo4j for graph projection
- Change Data Capture (CDC) for sync
- Cost: $100-200/month, Complexity: ⭐⭐⭐⭐
When to Reconsider: Only if AGE performance becomes a bottleneck or we need advanced graph algorithms (PageRank, community detection).
Expected Outcomes
Reliability:
- Zero data loss (ACID transactions)
- Point-in-time recovery to any moment
- Automated backups with verification
Performance:
- Sub-second entity queries
- Lazy loading enables 10K+ entity projects
- Graph traversal 10x faster than recursive CTEs
Scalability:
- 1K organizations, 10K projects, 1M+ entities
- Linear scaling with data growth
- Entity-level granularity prevents memory issues
Maintainability:
- Single database (PostgreSQL)
- Standard tools and practices
- Clear monitoring and alerting
Next Steps
- Review and approve this architectural direction
- Allocate resources for implementation
- Set up staging environment for testing
- Begin Phase 1 (foundation work)
References
-
Full Analysis:
doc/architecture/DATA_PERSISTENCE_ANALYSIS.md(2500+ lines)- Complete requirements and constraints
- 40+ technologies analyzed
- 6 architecture patterns evaluated
- Algorithms and strategies for each aspect
- Trade-off analysis and cost modeling
-
Apache AGE: https://age.apache.org/
-
pgvector: https://github.com/pgvector/pgvector
-
Meilisearch: https://www.meilisearch.com/
-
MinIO: https://min.io/
-
Ollama: https://ollama.ai/
Labels: enhancement, architecture, database, high-priority
Milestone: Production Readiness
Assignee: TBD