crate · amotl · Aug 23, 2025 · Oct 15, 2025 · Oct 15, 2025 · Oct 15, 2025
diff --git a/docs/start/query/index.md b/docs/start/query/index.md
@@ -48,7 +48,7 @@ CrateDB is not just a real-time analytics database, it’s a powerful platform t
 
 aggregations
 ad-hoc
-search
+search/index
 ai-integration
 Performance <performance>
 ```

diff --git a/docs/start/query/search/fulltext.md b/docs/start/query/search/fulltext.md
@@ -0,0 +1,153 @@
+(start-fulltext)=
+# Full-text search
+
+:::{div} sd-text-muted
+CrateDB enables real-time full-text search at scale.
+:::
+
+Unlike exact-match filters, **full-text search** allows **fuzzy, linguistic matching** on human language text. It tokenizes input, analyzes language, and searches for **tokens, stems, synonyms**, etc.
+
+CrateDB supports powerful full-text search capabilities directly via the `FULLTEXT` index and the `MATCH()` SQL predicate. This allows you to **combine unstructured search with structured filtering and aggregations**—all in one query, with no need for external search systems like Elasticsearch.
+
+This applies equally to log messages, customer feedback, machine-generated data, and IoT event streams.
+
+## Why CrateDB for Full-text Search?
+
+| Feature               | Benefit                                           |
+| --------------------- | ------------------------------------------------- |
+| Full-text indexing    | Tokenized, language-aware search on any text      |
+| SQL + search          | Combine structured filters with keyword queries   |
+| JSON support          | Search within nested object fields                |
+| Real-time ingestion   | Search new data immediately—no sync delay         |
+| Scalable architecture | Built to handle high-ingest, high-query workloads |
+
+## Common Query Patterns
+
+### Basic Keyword Search
+
+```sql
+SELECT id, message
+FROM logs
+WHERE MATCH(message, 'authentication failed');
+```
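+
+The `MATCH()` predicate is intended for columns that carry a full-text index. As a minimal sketch, the `logs` table queried in this guide could be defined as follows; the column types and the `standard` analyzer are assumptions for illustration, not the schema of an actual dataset.
+
+```sql
+-- Hypothetical schema for the `logs` table used in the examples.
+CREATE TABLE logs (
+    id TEXT PRIMARY KEY,
+    "timestamp" TIMESTAMP WITH TIME ZONE,
+    service TEXT,
+    -- The full-text index makes `message` searchable with MATCH().
+    message TEXT INDEX USING FULLTEXT WITH (analyzer = 'standard')
+);
+```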
+
+### Combine with Structured Filters
+
+```sql
+SELECT id, message
+FROM logs
+WHERE service = 'auth'
+  AND MATCH(message, 'token expired');
+```
+
+### Search Nested JSON
+
+```sql
+SELECT id, payload['comment']
+FROM feedback
+WHERE MATCH(payload['comment'], 'battery life');
+```
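+
+For a nested field to be searchable this way, it needs its own full-text index. A possible table definition is sketched below; the `sentiment` sub-column, the column types, and the `english` analyzer are assumptions for illustration.
+
+```sql
+-- Hypothetical schema: a full-text index on a sub-column of an OBJECT.
+CREATE TABLE feedback (
+    id TEXT PRIMARY KEY,
+    payload OBJECT(DYNAMIC) AS (
+        "comment" TEXT INDEX USING FULLTEXT WITH (analyzer = 'english'),
+        sentiment TEXT
+    )
+);
+```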
+
+### Aggregate Search Results
+
+```sql
+SELECT COUNT(*)
+FROM tickets
+WHERE MATCH(description, 'login')
+  AND priority = 'high';
+```
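+
+Besides counting matches, results can be ranked by relevance. The sketch below reuses the `tickets` table and sorts by the `_score` system column, which CrateDB computes for rows matched by `MATCH()`. The `id` column is an assumption.
+
+```sql
+-- Return the ten most relevant tickets for the search term.
+SELECT id, description, _score
+FROM tickets
+WHERE MATCH(description, 'login')
+ORDER BY _score DESC
+LIMIT 10;
+```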
+
+## Real-World Examples
+
+### Log and Event Search
+
+Search logs for error messages across microservices:
+
+```sql
+SELECT timestamp, service, message
+FROM logs
+WHERE MATCH(message, 'connection reset')
+ORDER BY timestamp DESC
+LIMIT 100;
+```
+
+### Customer Feedback Analysis
+
+Count support messages that mention slow performance, grouped by their stored sentiment label:
+
+```sql
+SELECT payload['sentiment'], COUNT(*)
+FROM feedback
+WHERE MATCH(payload['message'], 'slow performance')
+GROUP BY payload['sentiment'];
+```
+
+### Anomaly Investigation
+
+Search across telemetry events for unexpected patterns:
+
+```sql
+SELECT *
+FROM device_events
+WHERE MATCH(payload['error_message'], 'overheat');
+```
+
+## Language Support and Analyzers
+
+CrateDB supports language-specific analyzers, enabling more accurate matching across different natural languages. You can specify analyzers during table creation or at query time.
+
+```sql
+CREATE TABLE docs (
+    id INTEGER,
+    text TEXT INDEX USING FULLTEXT WITH (analyzer = 'english')
+);
+```
+
+To use a specific analyzer in a query:
+
+```sql
+SELECT *
+FROM docs
+WHERE MATCH(text, 'power outage') USING best_fields WITH (analyzer = 'english');
+```
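+
+When the built-in analyzers do not fit, a custom one can be defined with `CREATE ANALYZER`. The following is a sketch only: the names `log_analyzer` and `raw_logs` are hypothetical, and the chosen tokenizer and token filters are just one plausible combination.
+
+```sql
+-- Hypothetical analyzer: split on whitespace, lowercase, then stem.
+CREATE ANALYZER log_analyzer (
+    TOKENIZER whitespace,
+    TOKEN_FILTERS (lowercase, kstem)
+);
+
+-- Use the custom analyzer in a full-text index.
+CREATE TABLE raw_logs (
+    id INTEGER,
+    message TEXT INDEX USING FULLTEXT WITH (analyzer = 'log_analyzer')
+);
+```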
+
+## Indexing and Performance Tips
+
+| Tip                              | Why It Helps                              |
+| -------------------------------- | ----------------------------------------- |
+| Use `TEXT` with `FULLTEXT` index | Enables tokenized search                  |
+| Index only needed fields         | Reduce indexing overhead                  |
+| Pick appropriate analyzer        | Match the language and context            |
+| Use `MATCH()` not `LIKE`         | Full-text is more performant and relevant |
+| Combine with filters             | Boost performance using `WHERE` clauses   |
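+
+To search several fields with a single predicate, a named full-text index can span multiple columns. The sketch below is illustrative; the `articles` table, its columns, and the `title_body_ft` index name are hypothetical.
+
+```sql
+-- Hypothetical table with one composite full-text index over two columns.
+CREATE TABLE articles (
+    id INTEGER PRIMARY KEY,
+    title TEXT,
+    body TEXT,
+    INDEX title_body_ft USING FULLTEXT (title, body) WITH (analyzer = 'english')
+);
+
+-- Query the named index and narrow the result with a structured filter.
+SELECT id, title, _score
+FROM articles
+WHERE MATCH(title_body_ft, 'power outage')
+  AND id > 1000
+ORDER BY _score DESC;
+```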
+
+## Further reading
+
+:::::{grid} 1 3 3 3
+:margin: 4 4 0 0
+:padding: 0
+:gutter: 2
+
+::::{grid-item-card} {material-outlined}`article;1.5em` Reference
+:columns: 3
+- {ref}`crate-reference:sql_dql_fulltext_search`
+- {ref}`crate-reference:fulltext-indices`
+- {ref}`crate-reference:predicates_match`
+- {ref}`crate-reference:ref-create-analyzer`
+::::
+
+::::{grid-item-card} {material-outlined}`link;1.5em` Related
+:columns: 3
+- {ref}`start-geospatial`
+- {ref}`start-vector`
+- {ref}`start-hybrid`
+::::
+
+::::{grid-item-card} {material-outlined}`read_more;1.5em` Read more
+:columns: 6
+- [How CrateDB differs from Elasticsearch]
+- [Tutorial: Full-text search on logs]
+- {ref}`FTS feature details <fulltext-search>`
+- {ref}`Data modeling with FTS <model-fulltext>`
+::::
+
+:::::
+
+
+[How CrateDB differs from Elasticsearch]: https://archive.fosdem.org/2018/schedule/event/cratedb/
+[Tutorial: Full-text search on logs]: https://community.cratedb.com/t/storing-server-logs-on-cratedb-for-fast-search-and-aggregations/1562
diff --git a/docs/start/query/search/geo.md b/docs/start/query/search/geo.md
@@ -0,0 +1,104 @@
+(start-geospatial)=
+# Geospatial search
+
+:::{div} sd-text-muted
+Query geospatial data through SQL, combining ease of use with advanced capabilities.
+:::
+
+CrateDB enables geospatial search using **Lucene’s prefix tree** and **BKD tree** indexing structures. With CrateDB, you can:
+
+* Store and index geographic **points** and **shapes**
+* Perform spatial queries using **bounding boxes**, **circles**, **donut shapes**, and more
+* Filter, sort, or boost results by **distance**, **area**, or **spatial relationship**
+
+See the {ref}`data-modelling` section for details of data types and how to insert data.
+
+## Querying Geospatial Data
+
+CrateDB supports several SQL functions and predicates to work with geospatial data:
+
+| Function                               | Description                                                                      |
+| -------------------------------------- | -------------------------------------------------------------------------------- |
+| `distance(p1, p2)`                     | Computes the distance (in meters) between two points using the Haversine formula |
+| `within(shape, region)`                | Checks if a shape is fully within another shape                                  |
+| `intersects(shape1, shape2)`           | Checks if two shapes intersect                                                   |
+| `area(shape)`                          | Returns the area of a given shape in square degrees using geodetic awareness     |
+| `latitude(point)` / `longitude(point)` | Extracts lat/lon from a `GEO_POINT`                                              |
+| `geohash(point)`                       | Returns a 12-character geohash representation of a point                         |
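+
+As an illustration of how these functions combine, the sketch below lists the five places closest to a reference point inside a bounding polygon. The `places` table, its `location` column of type `GEO_POINT`, and the coordinates are assumptions.
+
+```sql
+-- Hypothetical table: places(name TEXT, location GEO_POINT).
+SELECT
+    name,
+    latitude(location) AS lat,
+    longitude(location) AS lon,
+    -- distance() returns meters; the reference point is given as WKT (lon lat).
+    distance(location, 'POINT (13.4050 52.5200)') AS distance_m
+FROM places
+WHERE within(location, 'POLYGON ((13.0 52.3, 13.8 52.3, 13.8 52.7, 13.0 52.7, 13.0 52.3))')
+ORDER BY distance_m
+LIMIT 5;
+```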
+
+### MATCH Predicate
+
+CrateDB provides a `MATCH` predicate for geospatial relationships:
+
+```sql
+-- Find parks that intersect with a given region
+SELECT name
+FROM parks
+WHERE MATCH(area, 'POLYGON ((...))') USING intersects;
+```
+
+Supported relations: `INTERSECTS`, `DISJOINT`, `WITHIN`.
+
+## Example: Finding Nearby Cities
+
+The following query finds the 10 closest capital cities to the current location of the International Space Station:
+
+```sql
+SELECT
+  city AS "City Name",
+  country AS "Country",
+  DISTANCE(i.position, c.location)::LONG / 1000 AS "Distance [km]"
+FROM demo.iss i
+CROSS JOIN demo.world_cities c
+WHERE capital = 'primary'
+  AND ts = (SELECT MAX(ts) FROM demo.iss)
+ORDER BY 3 ASC
+LIMIT 10;
+```
+
+## Indexing Strategies
+
+CrateDB supports multiple indexing strategies for `GEO_SHAPE` columns:
+
+| Index Type          | Description                                                  |
+| ------------------- | ------------------------------------------------------------ |
+| `geohash` (default) | Hash-based prefix tree for point-based queries               |
+| `quadtree`          | Space-partitioning using recursive quadrant splits           |
+| `bkdtree`           | Lucene BKD tree for efficient bounding box and range queries |
+
+You can choose and configure the indexing method when defining your table schema.
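+
+A sketch of such a definition is shown below, using a `parks` table with an `area` column of type `GEO_SHAPE`. The `quadtree` choice and the `precision` value are illustrative assumptions, not recommendations.
+
+```sql
+-- Hypothetical schema: pick the index type per GEO_SHAPE column.
+CREATE TABLE parks (
+    name TEXT,
+    area GEO_SHAPE INDEX USING quadtree WITH (precision = '10m')
+);
+```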
+
+### Performance Note
+
+While CrateDB can perform **exact computations** on complex geometries (e.g. large polygons, geometry collections), these can be computationally expensive. Choose your index strategy carefully based on your query patterns.
+
+For full details, refer to the {ref}`Geo Shape column definition <crate-reference:data-types-geo-shape>` section in the reference documentation.
+
+## Further reading
+
+:::::{grid} 1 3 3 3
+:margin: 4 4 0 0
+:padding: 0
+:gutter: 2
+
+::::{grid-item-card} {material-outlined}`article;1.5em` Reference
+:columns: 3
+- {ref}`crate-reference:data-types-geo-point`
+- {ref}`crate-reference:data-types-geo-shape`
+- {ref}`crate-reference:sql_dql_geo_search`
+::::
+
+::::{grid-item-card} {material-outlined}`link;1.5em` Related
+:columns: 3
+- {ref}`start-fulltext`
+- {ref}`start-vector`
+- {ref}`start-hybrid`
+::::
+
+::::{grid-item-card} {material-outlined}`read_more;1.5em` Read more
+:columns: 6
+- {ref}`Geospatial feature details <geospatial-search>`
+- {ref}`Data modeling with geospatial data <model-geospatial>`
+::::
+
+:::::
diff --git a/docs/start/query/search/hybrid.md b/docs/start/query/search/hybrid.md
@@ -0,0 +1,120 @@
+(start-hybrid)=
+# Hybrid search
+
+:::{div} sd-text-muted
+Combine vector similarity (kNN) and term-based full-text (BM25)
+searches in a single SQL query.
+:::
+
+While **vector search** provides powerful semantic retrieval based on machine learning models, it's not always optimal, especially when models are not fine-tuned for a specific domain. On the other hand, **traditional full-text search** (e.g., BM25 scoring) offers high precision on exact or keyword-based queries, with strong performance out of the box. **Hybrid search** blends these approaches, combining semantic understanding with keyword relevance to deliver more accurate, robust, and context-aware search results.
+
+Hybrid search is particularly effective for **knowledge bases, product or document search, multilingual content search, FAQ bots and semantic assistants**, and **AI-powered search experiences**. It allows applications to go beyond keyword matching, incorporating vector similarity while still respecting domain-specific terms.
+
+CrateDB supports **hybrid search** by combining **vector similarity search** (kNN) and **term-based full-text search** (BM25) in a single SQL query. CrateDB lets you implement hybrid search natively in SQL using **common table expressions (CTEs)** and **scoring fusion techniques**, such as:
+
+* **Convex combination** (weighted sum of scores)
+* **Reciprocal rank fusion (RRF)**
+
+## Supported Search Capabilities in CrateDB
+
+| Search Type           | Function      | Description                                    |
+| --------------------- | ------------- |------------------------------------------------|
+| **Vector search**     | `KNN_MATCH()` | Finds vectors closest to a given vector        |
+| **Full-text search**  | `MATCH()`     | Uses Lucene's BM25 scoring                     |
+| **Geospatial search** | `MATCH()`     | For shapes and points (see: Geospatial search) |
+
+CrateDB enables all three through **pure SQL**, allowing flexible combinations and advanced analytics.
+
+## Example: Hybrid Search in SQL
+
+The following query sketches the basic structure of a hybrid search: two CTEs collect the vector and BM25 results, and the outer query joins them and combines their scores:
+
+```sql
+WITH 
+    vector_results AS (
+        SELECT id, title, content, 
+               _score AS vector_score
+        FROM documents
+        WHERE KNN_MATCH(embedding, [0.2, 0.1, ..., 0.3], 10)
+    ),
+    bm25_results AS (
+        SELECT id, title, content, 
+               _score AS bm25_score
+        FROM documents
+        WHERE MATCH(content, 'knn search')
+    )
+
+SELECT 
+    v.id,
+    v.title,
+    bm25_score,
+    vector_score,
+    0.5 * bm25_score + 0.5 * vector_score AS hybrid_score
+FROM 
+    bm25_results b
+JOIN 
+    vector_results v ON v.id = b.id
+ORDER BY 
+    hybrid_score DESC
+LIMIT 10;
+```
+
+You can adjust the weighting (`0.5`) depending on your desired balance between keyword precision and semantic similarity.
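+
+BM25 scores are unbounded, while vector similarity scores are typically bounded, so a plain weighted sum can be dominated by one side. One possible remedy, sketched here as an assumption rather than a documented recipe, is to divide each score by the maximum within its result set before combining. The three-dimensional `embedding` column, the query vector, and the 0.4/0.6 weights are illustrative only.
+
+```sql
+-- Hypothetical schema; real embeddings usually have hundreds of dimensions.
+CREATE TABLE documents (
+    id INTEGER PRIMARY KEY,
+    title TEXT,
+    content TEXT INDEX USING FULLTEXT WITH (analyzer = 'english'),
+    embedding FLOAT_VECTOR(3)
+);
+
+WITH
+    vector_results AS (
+        SELECT id, title,
+               -- Scale scores so that the best match gets 1.0.
+               _score / MAX(_score) OVER () AS vector_score
+        FROM documents
+        WHERE KNN_MATCH(embedding, [0.2, 0.1, 0.3], 10)
+    ),
+    bm25_results AS (
+        SELECT id,
+               _score / MAX(_score) OVER () AS bm25_score
+        FROM documents
+        WHERE MATCH(content, 'knn search')
+    )
+SELECT
+    v.id,
+    v.title,
+    0.4 * b.bm25_score + 0.6 * v.vector_score AS hybrid_score
+FROM bm25_results b
+JOIN vector_results v ON v.id = b.id
+ORDER BY hybrid_score DESC
+LIMIT 10;
+```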
+
+## Sample Results
+
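+### Hybrid Scoring (Convex Combination)
+
+The hybrid scores in this sample correspond to a weighting of `0.4 * bm25_score + 0.6 * vector_score` on scores that appear to be normalized to the range 0 to 1, rather than the equal weighting used in the query above.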
+
+| hybrid\_score | bm25\_score | vector\_score | title                                         |
+| ------------- | ----------- | ------------- | --------------------------------------------- |
+| 0.7440        | 1.0000      | 0.5734        | knn\_match(float\_vector, float\_vector, int) |
+| 0.4868        | 0.5512      | 0.4439        | Searching On Multiple Columns                 |
+| 0.4716        | 0.5694      | 0.4064        | array\_position(...)                          |
+
+### Reciprocal Rank Fusion (RRF)
+
+| final\_rank | bm25\_rank | vector\_rank | title                                         |
+| ----------- | ---------- | ------------ | --------------------------------------------- |
+| 0.03278     | 1          | 1            | knn\_match(float\_vector, float\_vector, int) |
+| 0.03105     | 7          | 2            | Searching On Multiple Columns                 |
+| 0.03057     | 8          | 3            | Usage                                         |
+
+:::{note}
+RRF rewards documents that rank highly across multiple methods,
+regardless of exact score values.
+:::
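+
+A reciprocal rank fusion query can be sketched along the following lines. It assumes a `documents` table with a full-text indexed `content` column and a three-dimensional `embedding` column; the query vector and the CTE names are hypothetical. The constant `60` is consistent with the sample values above, since `1/61 + 1/61 ≈ 0.03278`.
+
+```sql
+WITH
+    bm25_ranked AS (
+        SELECT id, title,
+               -- Rank 1 is the best BM25 match.
+               RANK() OVER (ORDER BY _score DESC) AS bm25_rank
+        FROM documents
+        WHERE MATCH(content, 'knn search')
+    ),
+    vector_ranked AS (
+        SELECT id,
+               -- Rank 1 is the nearest vector.
+               RANK() OVER (ORDER BY _score DESC) AS vector_rank
+        FROM documents
+        WHERE KNN_MATCH(embedding, [0.2, 0.1, 0.3], 10)
+    )
+SELECT
+    -- Only the rank positions enter the fused score, not the raw scores.
+    1.0 / (60 + b.bm25_rank) + 1.0 / (60 + v.vector_rank) AS final_rank,
+    b.bm25_rank,
+    v.vector_rank,
+    b.title
+FROM bm25_ranked b
+JOIN vector_ranked v ON v.id = b.id
+ORDER BY final_rank DESC
+LIMIT 10;
+```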
+
+## Further reading
+
+:::::{grid} 1 3 3 3
+:margin: 4 4 0 0
+:padding: 0
+:gutter: 2
+
+::::{grid-item-card} {material-outlined}`article;1.5em` Reference
+:columns: 3
+- {ref}`crate-reference:sql_dql_fulltext_search`
+- {ref}`crate-reference:fulltext-indices`
+- {ref}`crate-reference:predicates_match`
+- {ref}`crate-reference:scalar_knn_match`
+- {ref}`crate-reference:scalar_vector_similarity`
+- {ref}`crate-reference:type-float_vector`
+::::
+
+::::{grid-item-card} {material-outlined}`link;1.5em` Related
+:columns: 3
+- {ref}`start-fulltext`
+- {ref}`start-geospatial`
+- {ref}`start-vector`
+::::
+
+::::{grid-item-card} {material-outlined}`read_more;1.5em` Read more
+:columns: 6
+- [Doing Hybrid Search in CrateDB]
+- {ref}`Hybrid search feature details <hybrid-search>`
+::::
+
+:::::
+
+
+[Doing Hybrid Search in CrateDB]: https://cratedb.com/blog/hybrid-search-explained