diff --git a/README.md b/README.md index 28146ea863..71c5b5c233 100644 --- a/README.md +++ b/README.md @@ -4,43 +4,207 @@ FoundationDB is a distributed database designed to handle large volumes of struc To learn more about FoundationDB, visit [foundationdb.org](https://www.foundationdb.org/) -# FoundationDB Record Layer - -The Record Layer is a Java API providing a record-oriented store on top of FoundationDB, -(very) roughly equivalent to a simple relational database, featuring: - -* **Structured types** - Records are defined and stored in terms of - [protobuf](https://developers.google.com/protocol-buffers/) messages. -* **Indexes** - The Record Layer supports a variety of different index - types including value indexes (the kind provided by most databases), - rank indexes, and aggregate indexes. Indexes and primary keys can - be defined either via protobuf options or programmatically. -* **Complex types** - Support for complex types, such as lists and - nested records, including the ability to define indexes against - such nested structures. -* **Queries** - The Record Layer does not provide a query language, however - it provides query APIs with the ability to scan, filter, and sort - across one or more record types, and a query planner capable of - automatic selection of indexes. -* **Many record stores, shared schema** - The Record Layer provides the - ability to support many discrete record store instances, all with - a shared (and evolving) schema. For example, rather than modeling a - single database in which to store all users' data, each user can be - given their own record store, perhaps sharded across different FDB - cluster instances. -* **Very light weight** - The Record layer is designed to be used in a - large, distributed, stateless environment. The time between opening - a store and the first query is intended to be measured in milliseconds. -* **Extensible** - New index types and custom index key expressions - may be dynamically incorporated into a record store. - -The Record Layer may be used directly or provides an excellent foundational -layer on which more complex systems can be constructed. +# FoundationDB Record Layer (FRL) + +FRL provides a **relational database with SQL support** built on top of FoundationDB, featuring: + +* **SQL Database** - SQL support with JDBC connectivity for defining schemas, + querying data, and managing tables using familiar SQL syntax. The SQL API is under + active development with frequent enhancements. +* **Advanced Data Types** - Beyond standard SQL types (STRING, INTEGER, FLOAT, BOOLEAN), + FRL supports: + * **Nested Structures** - User-defined struct types that can be nested arbitrarily deep + * **Arrays** - Collections of primitives or complex types + * **Vectors** - Fixed-dimension numerical vectors for ML embeddings and similarity search +* **Schema Templates** - Reusable schema definitions that enable multi-tenant architectures + where each tenant gets their own database instance with a shared, evolvable schema. +* **Intelligent Query Planning** - Automatic index selection and query optimization with + support for JOINs, aggregations (COUNT, SUM, etc.), GROUP BY, and ORDER BY. + Queries are efficiently executed using index-backed operations without in-memory sorting. +* **Indexes** - Rich indexing capabilities including value indexes, rank indexes, aggregate + indexes, and indexes on nested fields. Indexes are materialized views that update incrementally. +* **Scalable Architecture** - Designed for distributed, stateless environments with + millisecond-level store initialization. Perfect for applications managing thousands + of discrete database instances. +* **ACID Transactions** - Full transactional semantics inherited from FoundationDB, + with support for continuations for efficiently paging through large result sets. + +## Quick Start with SQL + +```java +// Connect via JDBC +String url = "jdbc:embed:/__SYS?schema=CATALOG"; +Connection conn = DriverManager.getConnection(url); + +// Define a schema template with tables and indexes +conn.createStatement().execute(""" + CREATE SCHEMA TEMPLATE my_template + CREATE TABLE customers ( + customer_id BIGINT, + name STRING, + email STRING, + PRIMARY KEY(customer_id) + ) + CREATE INDEX email_idx AS + SELECT email FROM customers ORDER BY email + """); + +// Create a database and schema +conn.createStatement().execute( + "CREATE DATABASE /my_app/production"); +conn.createStatement().execute( + "CREATE SCHEMA /my_app/production/main WITH TEMPLATE my_template"); + +// Insert and query data +PreparedStatement insert = conn.prepareStatement( + "INSERT INTO customers VALUES (?, ?, ?)"); +insert.setLong(1, 1); +insert.setString(2, "Alice"); +insert.setString(3, "alice@example.com"); +insert.executeUpdate(); + +ResultSet rs = conn.createStatement().executeQuery( + "SELECT * FROM customers WHERE email = 'alice@example.com'"); +``` + +## Key Features + +### Multi-Tenant Schema Templates + +Schema templates enable efficient multi-tenant architectures: + +```sql +-- Define the template once +CREATE SCHEMA TEMPLATE user_data_template + CREATE TABLE documents (id BIGINT, content STRING, PRIMARY KEY(id)) + CREATE INDEX content_idx AS SELECT content FROM documents ORDER BY content; + +-- Create separate database instances for each tenant +CREATE DATABASE /tenant/user_1; +CREATE SCHEMA /tenant/user_1/data WITH TEMPLATE user_data_template; + +CREATE DATABASE /tenant/user_2; +CREATE SCHEMA /tenant/user_2/data WITH TEMPLATE user_data_template; +``` + +Each tenant's data is completely isolated with its own database, yet all share +the same schema definition for easy management and evolution. + +### Advanced Type System + +Define complex, nested data structures: + +```sql +-- Define custom struct types +CREATE TYPE AS STRUCT address ( + street STRING, + city STRING, + postal_code STRING +) + +CREATE TYPE AS STRUCT contact_info ( + email STRING, + phone STRING, + mailing_address address +) + +-- Use in tables with arrays and nesting +CREATE TABLE users ( + user_id BIGINT, + name STRING, + contacts contact_info ARRAY, + PRIMARY KEY(user_id) +) +``` + +### Vector Support for ML Applications + +Store and query high-dimensional vectors for embeddings and similarity search: + +```sql +CREATE TABLE embeddings ( + doc_id BIGINT, + content STRING, + embedding_half VECTOR(128, HALF), -- 16-bit precision + embedding_float VECTOR(768, FLOAT), -- 32-bit precision + embedding_double VECTOR(1024, DOUBLE), -- 64-bit precision + PRIMARY KEY(doc_id) +) +``` + +Vectors are inserted via JDBC PreparedStatements and can be efficiently stored +and retrieved using the FoundationDB backend. + +### Index-Backed Query Execution + +FRL's query planner intelligently selects indexes to execute +queries efficiently: + +```sql +-- Queries use indexes automatically +SELECT name FROM customers WHERE email = 'alice@example.com'; +-- Uses email_idx if available + +-- JOINs using comma-separated FROM clause +SELECT c.name, o.order_id +FROM customers c, orders o +WHERE c.customer_id = o.customer_id; + +-- Aggregations backed by indexes +SELECT category, COUNT(*) +FROM products +GROUP BY category; +-- Requires ordered index or primary key on category for streaming aggregate, or a aggregate index for direct retrieval + +-- ORDER BY requires index or primary key order +SELECT * FROM customers ORDER BY email; +-- Requires index on email (like email_idx above) +``` + +**Important**: FRL does not perform in-memory sorting or aggregation. Operations like ORDER BY, GROUP BY, +and aggregates require underlying indexes to provide the required ordering. + +## Architecture Notes + +FRL is designed for: +* **Horizontal scalability** - Thousands of independent database instances +* **Low latency** - Millisecond-level initialization and query execution +* **Stateless services** - No server-side state; all data in FoundationDB +* **Schema evolution** - Templates can evolve over time (template evolution features + coming to relational layer; currently available via advanced Record Layer API) + +## Advanced: Direct Record Layer API + +For applications requiring fine-grained control over storage layout, index +maintenance, or features not yet available in the SQL Relational layer, the Record Layer +provides a low-level Java API using Protocol Buffers. + +**Note**: This API is maintained for advanced use cases but is being positioned +as a lower-level alternative to the SQL interface. Features available only +through this API will migrate to the SQL layer over time. Long-term support of this lower-level API is not guaranteed once equivalent features are available at the Relational Layer. + +Key Record Layer API features: +* **Protobuf-based schema definition** - Define records using `.proto` files +* **Programmatic index management** - `IndexMaintainer` extension points +* **Custom query components** - Extend the query planner +* **Schema evolution** - `MetaDataEvolutionValidator` for safe schema changes +* **Low-level control** - Direct access to FoundationDB operations + +See [Record Layer Documentation](https://foundationdb.github.io/fdb-record-layer/Overview.html) for details. ## Documentation +* **Getting Started** - [SQL Quick Start](https://foundationdb.github.io/fdb-record-layer/GettingStarted.html) +* **SQL Reference** - [SQL Commands and Data Types](https://foundationdb.github.io/fdb-record-layer/SQL_Reference.html) +* **Schema Templates** - [Databases, Schemas, and Templates](https://foundationdb.github.io/fdb-record-layer/reference/Databases_Schemas_SchemaTemplates.html) +* **Advanced: Record Layer API** - [Record Layer Overview](https://foundationdb.github.io/fdb-record-layer/Overview.html) * [Documentation Home](https://foundationdb.github.io/fdb-record-layer/) * [Contributing](CONTRIBUTING.md) * [Code of Conduct](CODE_OF_CONDUCT.md) * [License](LICENSE) +## Getting Help + +* **Documentation Issues**: [Submit Documentation Feedback](https://github.com/FoundationDB/fdb-record-layer/issues) +* **Bugs & Feature Requests**: [GitHub Issues](https://github.com/FoundationDB/fdb-record-layer/issues) +* **Community**: [FoundationDB Community Forums](https://forums.foundationdb.org/)