From 11e641f4f23df52d0c91ed0ca29a59e2026a333f Mon Sep 17 00:00:00 2001 From: Walter Behmann Date: Wed, 1 Oct 2025 00:09:50 +0200 Subject: [PATCH 01/15] Sharding: Add section "Sizing considerations" --- docs/performance/sharding.md | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/docs/performance/sharding.md b/docs/performance/sharding.md index 249dec73..0b0cadb0 100644 --- a/docs/performance/sharding.md +++ b/docs/performance/sharding.md @@ -15,6 +15,42 @@ the type of hardware you're using. :::{NOTE} This guide assumes you know the basics. +## Sizing considerations + +General principles requires careful consideration for cluster +sizing and architecture. +Keep the following things in mind when building your sharding strategy. +Each shard incurs overhead in terms of open files, RAM allocation, and CPU cycles +for maintenance operations. + +### Shard size vs. number of shards + +The optimal approach balances shard count with shard size. Individual shards should +typically contain 3-70 GB of data, with 10-50 GB being the sweet spot for most +workloads. In large clusters, this often means fewer shards than total CPU cores, +as larger shards can still be processed efficiently by multiple CPU cores during +query execution. +Smaller shards also result in reduced Lucene index efficiency, which can adversely +affect computed search term relevance. + +### CPU-to-shard ratio + +If most nodes have more shards per table than they have CPUs, the cluster can +experience performance degradations. +For example, on clusters with substantial CPU resources (e.g., 8 nodes × 32 CPUs += 256 total CPUs), creating 256+ shards per table often proves counterproductive. +If you don't manually set the number of shards per table, CrateDB will make a +best guess, based on the assumption that your nodes have two CPUs each. + +### 1000 shards per node limit + +To avoid _oversharding_, CrateDB by default limits the number of shards per node to +1_000 as a critical stability boundary. Any operation that would exceed that limit +leads to an exception. +For an 8-node cluster, this allows up to 8_000 total shards across all tables. +Approaching this limit typically indicates a suboptimal sharding strategy rather +than optimal performance tuning. See also relevant documentation about +{ref}`table reconfiguration ` wrt. sharding options. If you are looking for an intro to sharding, see {ref}`sharding `. From 8375f1c3e99a992535d55308f06ed6330c93d431 Mon Sep 17 00:00:00 2001 From: Andreas Motl Date: Wed, 1 Oct 2025 00:12:55 +0200 Subject: [PATCH 02/15] Sharding: Modernize and refactor, just regular copy-editing --- docs/admin/sharding-partitioning.md | 36 ++++--- docs/performance/scaling.md | 16 ++-- docs/performance/sharding.md | 139 ++++++++++++++++------------ 3 files changed, 101 insertions(+), 90 deletions(-) diff --git a/docs/admin/sharding-partitioning.md b/docs/admin/sharding-partitioning.md index 5031a1d7..3ff1db4f 100644 --- a/docs/admin/sharding-partitioning.md +++ b/docs/admin/sharding-partitioning.md @@ -1,6 +1,6 @@ (sharding-partitioning)= -# Sharding and Partitioning +# Sharding and Partitioning 101 ## Introduction @@ -92,23 +92,7 @@ INSERT INTO second_table (ts, val) VALUES (1620415701974, 2.31); We can see that there are now 8 shards for the table `second_table` in the cluster. -:::{danger} -**Over-sharding and over-partitioning** - -Sharding can drastically improve the performance on large datasets. -However, having too many small shards will most likely degrade performance. 
-Over-sharding and over-partitioning are common flaws leading to an overall -poor performance. - -**As a rule of thumb, a single shard should hold somewhere between 5 - 50 -GB of data.** - -To avoid oversharding, CrateDB by default limits the number of shards per -node to 1000. Any operation that would exceed that limit, leads to an -exception. -::: - -## How to choose your sharding and partitioning strategy +## Strategy An optimal sharding and partitioning strategy always depends on the specific use case and should typically be determined by conducting @@ -123,9 +107,21 @@ Then, to calculate the number of shards, you should consider that the size of ea shard should roughly be between 5 - 50 GB, and that each node can only manage up to 1000 shards. -### Time series example +:::{caution} +**Over-sharding and over-partitioning** + +Sharding can drastically improve the performance on large datasets. +However, having too many small shards will most likely degrade performance. +Over-sharding and over-partitioning are common flaws leading to an overall +poor performance. + +Learn how to discover an optimal sharding strategy for your dataset +in the {ref}`sharding-guide`. +::: + +## Example -To illustrate the steps above, let's use them on behalf of an example. Imagine +Let's create a basic sharding strategy on behalf of a concrete example. Imagine you want to create a *partitioned table* on a *three-node cluster* to store time series data with the following assumptions: diff --git a/docs/performance/scaling.md b/docs/performance/scaling.md index 33fd8387..6ecc8ff8 100644 --- a/docs/performance/scaling.md +++ b/docs/performance/scaling.md @@ -48,25 +48,21 @@ large result sets from CrateDB][fetching large result sets from cratedb]. ## Number of shards -In CrateDB data in tables and partitions is distributed in storage units that we -call shards. +In CrateDB data in tables and partitions is distributed in storage units +called "shards". -If we do not specify how many shards we want for a table/partition CrateDB will +If we do not specify how many shards we want for a table/partition, CrateDB will derive a default from the number of nodes. -CrateDB also has replicas of data and this results in additional shards in the -cluster. - Having too many or too few shards has performance implications, so it is very -important to get familiar with the {ref}`Sharding Performance Guide -`. +important to get familiar with the {ref}`sharding-guide`. In particular, there is a soft limit of 1000 shards per node; so table schemas, partitioning strategy, and number of nodes need to be planned to stay well below this limit, one strategy can be to aim for a configuration where even if one node -in the cluster is lost the remaining nodes would still have less than 1000 shards. +in the cluster is lost, the remaining nodes would still have less than 1000 shards. -If this was not considered when initially defining the tables we have the +If this was not taken into account when initially defining the tables, we have the following considerations: - changing the partitioning strategy requires creating a new table and copying diff --git a/docs/performance/sharding.md b/docs/performance/sharding.md index 0b0cadb0..043a8578 100644 --- a/docs/performance/sharding.md +++ b/docs/performance/sharding.md @@ -1,20 +1,39 @@ (sharding-guide)= - (sharding-performance)= -# Sharding Performance Guide +# Sharding performance guide -This document is a sharding best practice guide for CrateDB. 
+:::{div} sd-text-muted +Applying sharding can drastically improve the performance on large datasets. +::: -A brief recap: CrateDB tables are split into a configured number of shards, and -then these shards are distributed across the cluster. +This document is a sharding best practice guide for CrateDB. +A brief recap: CrateDB tables are split into a configured number of shards. +These shards are distributed across the cluster to optimize concurrent and +parallel data processing. -Figuring out how many shards to use for your tables requires you to think about -the type of data you're processing, the types of queries you're running, and -the type of hardware you're using. +Whenever possible, CrateDB will parallelize query workloads and distribute them +across the whole cluster. The more CPUs this query workload can be distributed +across, the faster the query will run. -:::{NOTE} +:::{seealso} This guide assumes you know the basics. +If you are looking for an intro to sharding, see also the +{ref}`sharding-partitioning` and the +{ref}`sharding reference ` documentation. +::: + + +## General recommendations + +To avoid running your clusters with too many shards or too large shards, +implement the following guidelines as a rule of thumb: + +- Use shard sizes between 10GB and 50GB. + +- Keep the number of records on each shard below 200 million. + + ## Sizing considerations General principles requires careful consideration for cluster @@ -52,15 +71,56 @@ Approaching this limit typically indicates a suboptimal sharding strategy rather than optimal performance tuning. See also relevant documentation about {ref}`table reconfiguration ` wrt. sharding options. -If you are looking for an intro to sharding, see {ref}`sharding -`. +### Partitions + +If you are using {ref}`partitioned tables `, +note that each partition is clustered into as many shards as you configure +for the table. + +For example, a table with four shards and two partitions will have eight +shards that can be commonly queried across. But a query that only touches +one partition will only query across four shards. + +How this factors into balancing your shard allocation will depend on the +types of queries you intend to run. + +### Replicas + +CrateDB also has replicas of data and this results in additional shards in +the cluster. + +For the purposes of calculating how many shards a table should be clustered +into, you can typically ignore replica partitions as these are not usually +queried across for reads. + + +## Caveats + +:::{caution} +:class: hero +Balancing the number and size of your shards is important for the performance +and stability of your CrateDB clusters. ::: -## Optimising for query performance +### Avoid imbalances -(sharding-under-allocation)= +- Too many shards can degrade search performance and make the cluster unstable. + This is referred to as _oversharding_. + +- Very large shards can slow down cluster operations and prolong recovery times + after failures. -### Under-allocation is bad +Finding the right balance when it comes to sharding will vary on a lot of +things. While it's generally advisable to slightly over-allocate, we +recommend to benchmark your particular setup to find the sweet spot to +implement an appropriate sharding strategy. + +Figuring out how many shards to use for your tables requires you to think about +the type of data you're processing, the types of queries you're running, and +the type of hardware you're using. 
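+
+To get a feeling for these factors, it can help to inspect how the shards of
+an existing table are spread across the cluster. A minimal sketch, assuming a
+table named `metrics` (substitute one of your own tables), using the real
+`sys.shards` system table:
+
+```sql
+-- Shards and record counts per node for one table.
+SELECT node['name'] AS node_name,
+       count(*) AS shards,
+       sum(num_docs) AS records
+FROM sys.shards
+WHERE table_name = 'metrics'
+GROUP BY node['name'];
+```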
+ +(sharding-under-allocation)= +### Avoid under-allocation :::{CAUTION} If you have fewer shards than CPUs in the cluster, this is called @@ -68,10 +128,6 @@ If you have fewer shards than CPUs in the cluster, this is called of CrateDB. ::: -Whenever possible, CrateDB will parallelize query workloads and distribute them -across the whole cluster. The more CPUs this query workload can be distributed -across, the faster the query will run. - To increase the chances that a query can be parallelized and distributed maximally, there should be at least as many shards for a table than there are CPUs in the cluster. This is because CrateDB will automatically balance shards @@ -81,7 +137,8 @@ In summary: the smaller your shards are, the more of them you will have, and so the more likely it is that they will be distributed across the whole cluster, and hence across all of your CPUs, and hence the faster your queries will run. -### Significant over-allocation is bad +(sharding-over-allocation)= +### Avoid too much over-allocation :::{CAUTION} If you have more shards per table than CPUs, this is called *over-allocation*. A @@ -93,48 +150,10 @@ When you have slightly more shards per table than CPUs, you ensure that query workloads can be parallelized and distributed maximally, which in turn ensures maximal query performance. -However, if most nodes have more shards per table than they have CPUs, you -could actually see performance degradation. Each shard comes with a cost in -terms of open files, RAM, and CPU cycles. Smaller shards also means small shard -indexes, which can adversely affect computed search term relevance. - -For performance reasons, one thousand shards per table per node is considered -the highest recommended configuration. If you exceed this you will experience a -failing cluster check. - -### Balancing allocation - -Finding the right balance when it comes to sharding will vary on a lot of -things. And while it's generally advisable to slightly over-allocate, it's also -a good idea to benchmark your particular setup so as to find the sweet spot. - -If you don't manually set the number of shards per table, CrateDB will make a best guess, -based on the assumption that your nodes have two CPUs each. - -:::{TIP} -For the purposes of calculating how many shards a table should be clustered -into, you can typically ignore replica partitions as these are not usually -queried across for reads. -::: - -:::{CAUTION} -If you are using {ref}`partitioned tables `, -note that each partition is -clustered into as many shards as you configure for the table. - -For example, a table with four shards and two partitions will have eight -shards that can be commonly queried across. But a query that only touches -one partition will only query across four shards. - -How this factors into balancing your shard allocation will depend on the -types of queries you intend to run. -::: - (sharding-ingestion)= +## Optimising for ingestion -## Optimising for ingestion performance - -As with [Optimising for query performance], when doing heavy ingestion, it is +When doing heavy ingestion, it is good to cluster a table across as many nodes as possible. However, [we have found][we have found] that ingestion throughput can often increase as the table shard per CPU ratio on each node *decreases*. @@ -144,7 +163,7 @@ sizes, batch insert size, and the hardware. In particular: using solid-state drives (SSDs) instead of hard-disk drives (HDDs) can massively increase ingestion throughput. 
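+
+As a sketch of the batching idea, reusing the `second_table` example from the
+sharding and partitioning guide: inserting many rows per statement usually
+outperforms issuing one statement per row.
+
+```sql
+-- One round trip carries a whole batch of rows.
+INSERT INTO second_table (ts, val)
+VALUES (1620415701974, 2.31),
+       (1620415702974, 2.35),
+       (1620415703974, 2.28);
+```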
-It's a good idea to benchmark your particular setup so as to find the sweet +We recommend to benchmark your particular ingest workload to find the sweet spot. [we have found]: https://cratedb.com/blog/big-cluster-insights-ingesting From 27b5c3925f09e948ef698fcf83a9d3b025065c1b Mon Sep 17 00:00:00 2001 From: Andreas Motl Date: Wed, 1 Oct 2025 23:13:39 +0200 Subject: [PATCH 03/15] Sharding: Remove "Smaller shards also result in reduced efficiency..." --- docs/performance/sharding.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/docs/performance/sharding.md b/docs/performance/sharding.md index 043a8578..d20dc06e 100644 --- a/docs/performance/sharding.md +++ b/docs/performance/sharding.md @@ -49,8 +49,6 @@ typically contain 3-70 GB of data, with 10-50 GB being the sweet spot for most workloads. In large clusters, this often means fewer shards than total CPU cores, as larger shards can still be processed efficiently by multiple CPU cores during query execution. -Smaller shards also result in reduced Lucene index efficiency, which can adversely -affect computed search term relevance. ### CPU-to-shard ratio From fdbf9bdf07bf8a4eca40978c7fec198e15db6034 Mon Sep 17 00:00:00 2001 From: Andreas Motl Date: Wed, 1 Oct 2025 23:17:26 +0200 Subject: [PATCH 04/15] Sharding: Remove 3-70 GB recommendation. Recommend 10-50 GB. --- docs/admin/sharding-partitioning.md | 4 ++-- docs/feature/cluster/index.md | 2 +- docs/performance/sharding.md | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/admin/sharding-partitioning.md b/docs/admin/sharding-partitioning.md index 3ff1db4f..9b806c7e 100644 --- a/docs/admin/sharding-partitioning.md +++ b/docs/admin/sharding-partitioning.md @@ -104,7 +104,7 @@ for a benchmark. - Calculate the throughput Then, to calculate the number of shards, you should consider that the size of each -shard should roughly be between 5 - 50 GB, and that each node can only manage +shard should roughly be between 10 - 50 GB, and that each node can only manage up to 1000 shards. :::{caution} @@ -132,7 +132,7 @@ time series data with the following assumptions: Given the daily throughput is around 10 GB/day, the monthly throughput is 30 times that (~ 300 GB). The partition column can be day, week, month, quarter, etc. So, assuming a monthly partition, the next step is to calculate the number of shards -with the **shard size recommendation** (5 - 50 GB) and the **number of nodes** in +with the **shard size recommendation** (10 - 50 GB) and the **number of nodes** in the cluster in mind. With three shards, each shard would hold 100 GB (300 GB / 3 shards), which is above diff --git a/docs/feature/cluster/index.md b/docs/feature/cluster/index.md index 75b2af93..04fb9f23 100644 --- a/docs/feature/cluster/index.md +++ b/docs/feature/cluster/index.md @@ -109,7 +109,7 @@ data loss, and to improve read performance. ## Synopsis With a monthly throughput of 300 GB, partitioning your table by month, and using six shards, each shard will manage 50 GB of data, which is -within the recommended size range (5 - 50 GB). +within the recommended size range (10 - 50 GB). Through replication, the table will store three copies of your data, in order to reduce the chance of permanent data loss. diff --git a/docs/performance/sharding.md b/docs/performance/sharding.md index d20dc06e..bc2fbe85 100644 --- a/docs/performance/sharding.md +++ b/docs/performance/sharding.md @@ -45,7 +45,7 @@ for maintenance operations. ### Shard size vs. 
number of shards The optimal approach balances shard count with shard size. Individual shards should -typically contain 3-70 GB of data, with 10-50 GB being the sweet spot for most +typically contain 10-50 GB of data, being the sweet spot for most workloads. In large clusters, this often means fewer shards than total CPU cores, as larger shards can still be processed efficiently by multiple CPU cores during query execution. From 710e7c5fcc47d2b3a63559dacb8a61c425fddd99 Mon Sep 17 00:00:00 2001 From: Andreas Motl Date: Wed, 1 Oct 2025 23:18:50 +0200 Subject: [PATCH 05/15] Sharding: Explain "1000 shards per node" limit as "protection limit" --- docs/performance/sharding.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/performance/sharding.md b/docs/performance/sharding.md index bc2fbe85..4582c141 100644 --- a/docs/performance/sharding.md +++ b/docs/performance/sharding.md @@ -62,7 +62,7 @@ best guess, based on the assumption that your nodes have two CPUs each. ### 1000 shards per node limit To avoid _oversharding_, CrateDB by default limits the number of shards per node to -1_000 as a critical stability boundary. Any operation that would exceed that limit +1_000 as a protection limit. Any operation that would exceed that limit leads to an exception. For an 8-node cluster, this allows up to 8_000 total shards across all tables. Approaching this limit typically indicates a suboptimal sharding strategy rather From 08c76fd4c89181b9ac98217dc84658ff6b7b2d00 Mon Sep 17 00:00:00 2001 From: Andreas Motl Date: Wed, 1 Oct 2025 23:33:13 +0200 Subject: [PATCH 06/15] Sharding: Relocate notes about finding the right balance to "general" --- docs/performance/sharding.md | 31 ++++++++++++++----------------- 1 file changed, 14 insertions(+), 17 deletions(-) diff --git a/docs/performance/sharding.md b/docs/performance/sharding.md index 4582c141..bd777cba 100644 --- a/docs/performance/sharding.md +++ b/docs/performance/sharding.md @@ -33,6 +33,20 @@ implement the following guidelines as a rule of thumb: - Keep the number of records on each shard below 200 million. +Finding the right balance when it comes to sharding will vary on a lot of +things. While it is generally advisable to slightly over-allocate, we +recommend to benchmark your particular setup to find the sweet spot to +implement an appropriate sharding strategy. + +Figuring out how many shards to use for your tables requires you to think about +the type of data you are processing, the types of queries you are running, and +the type of hardware you are using. + +- Too many shards can degrade search performance and make the cluster unstable. + This is referred to as _oversharding_. + +- Very large shards can slow down cluster operations and prolong recovery times + after failures. ## Sizing considerations @@ -100,23 +114,6 @@ Balancing the number and size of your shards is important for the performance and stability of your CrateDB clusters. ::: -### Avoid imbalances - -- Too many shards can degrade search performance and make the cluster unstable. - This is referred to as _oversharding_. - -- Very large shards can slow down cluster operations and prolong recovery times - after failures. - -Finding the right balance when it comes to sharding will vary on a lot of -things. While it's generally advisable to slightly over-allocate, we -recommend to benchmark your particular setup to find the sweet spot to -implement an appropriate sharding strategy. 
- -Figuring out how many shards to use for your tables requires you to think about -the type of data you're processing, the types of queries you're running, and -the type of hardware you're using. - (sharding-under-allocation)= ### Avoid under-allocation From 20aa42a1110ed11ea672ad2dca0c21ec4ecd52dc Mon Sep 17 00:00:00 2001 From: Andreas Motl Date: Wed, 1 Oct 2025 23:33:48 +0200 Subject: [PATCH 07/15] Sharding: Rename bottom section and relocate subsection --- docs/performance/sharding.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/performance/sharding.md b/docs/performance/sharding.md index bd777cba..ba541bf3 100644 --- a/docs/performance/sharding.md +++ b/docs/performance/sharding.md @@ -106,7 +106,7 @@ into, you can typically ignore replica partitions as these are not usually queried across for reads. -## Caveats +## Notes :::{caution} :class: hero @@ -146,7 +146,7 @@ workloads can be parallelized and distributed maximally, which in turn ensures maximal query performance. (sharding-ingestion)= -## Optimising for ingestion +### Optimize for ingestion When doing heavy ingestion, it is good to cluster a table across as many nodes as possible. However, [we have From bcd13a6a92396758ef4b5a0b3ccf222dae96ad92 Mon Sep 17 00:00:00 2001 From: Andreas Motl Date: Wed, 1 Oct 2025 23:35:44 +0200 Subject: [PATCH 08/15] Sharding: Add miniature section about "segments" --- docs/performance/sharding.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/docs/performance/sharding.md b/docs/performance/sharding.md index ba541bf3..eb018276 100644 --- a/docs/performance/sharding.md +++ b/docs/performance/sharding.md @@ -105,6 +105,10 @@ For the purposes of calculating how many shards a table should be clustered into, you can typically ignore replica partitions as these are not usually queried across for reads. +### Segments + +The number of segments within a shard affects query performance because more +segments have to be visited. ## Notes From 1227850d7cae0886ac7e53accf98122fe341c9d1 Mon Sep 17 00:00:00 2001 From: Andreas Motl Date: Wed, 1 Oct 2025 23:40:57 +0200 Subject: [PATCH 09/15] Sharding: Remove statement about ignoring replica partitions --- docs/performance/sharding.md | 4 ---- 1 file changed, 4 deletions(-) diff --git a/docs/performance/sharding.md b/docs/performance/sharding.md index eb018276..067caa24 100644 --- a/docs/performance/sharding.md +++ b/docs/performance/sharding.md @@ -101,10 +101,6 @@ types of queries you intend to run. CrateDB also has replicas of data and this results in additional shards in the cluster. -For the purposes of calculating how many shards a table should be clustered -into, you can typically ignore replica partitions as these are not usually -queried across for reads. - ### Segments The number of segments within a shard affects query performance because more From ca082cceb99f8c919f0225e097a57cd3a210a231 Mon Sep 17 00:00:00 2001 From: Andreas Motl Date: Wed, 1 Oct 2025 23:49:34 +0200 Subject: [PATCH 10/15] Sharding: Implement suggestions by CodeRabbit --- docs/admin/sharding-partitioning.md | 6 +++--- docs/performance/sharding.md | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/admin/sharding-partitioning.md b/docs/admin/sharding-partitioning.md index 9b806c7e..76e2931b 100644 --- a/docs/admin/sharding-partitioning.md +++ b/docs/admin/sharding-partitioning.md @@ -103,9 +103,9 @@ for a benchmark. 
- Identify the record size - Calculate the throughput -Then, to calculate the number of shards, you should consider that the size of each -shard should roughly be between 10 - 50 GB, and that each node can only manage -up to 1000 shards. +Then, to calculate the number of shards, consider that each shard should +roughly be between 10 – 50 GB, and that each node can manage +up to 1_000 shards by default. :::{caution} **Over-sharding and over-partitioning** diff --git a/docs/performance/sharding.md b/docs/performance/sharding.md index 067caa24..4402868e 100644 --- a/docs/performance/sharding.md +++ b/docs/performance/sharding.md @@ -29,7 +29,7 @@ If you are looking for an intro to sharding, see also the To avoid running your clusters with too many shards or too large shards, implement the following guidelines as a rule of thumb: -- Use shard sizes between 10GB and 50GB. +- Use shard sizes between 10 GB and 50 GB. - Keep the number of records on each shard below 200 million. @@ -50,7 +50,7 @@ the type of hardware you are using. ## Sizing considerations -General principles requires careful consideration for cluster +General principles require careful consideration of cluster sizing and architecture. Keep the following things in mind when building your sharding strategy. Each shard incurs overhead in terms of open files, RAM allocation, and CPU cycles @@ -59,7 +59,7 @@ for maintenance operations. ### Shard size vs. number of shards The optimal approach balances shard count with shard size. Individual shards should -typically contain 10-50 GB of data, being the sweet spot for most +typically contain 10-50 GB of data, being the sweet spot for most workloads. In large clusters, this often means fewer shards than total CPU cores, as larger shards can still be processed efficiently by multiple CPU cores during query execution. From d4ca40a66e51b8b208ec25360601b59bc65dfbbd Mon Sep 17 00:00:00 2001 From: Andreas Motl Date: Tue, 7 Oct 2025 11:29:15 +0200 Subject: [PATCH 11/15] Sharding: Implement suggestions by Marios --- docs/admin/sharding-partitioning.md | 2 +- docs/performance/sharding.md | 6 +++++- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/docs/admin/sharding-partitioning.md b/docs/admin/sharding-partitioning.md index 76e2931b..58d7340b 100644 --- a/docs/admin/sharding-partitioning.md +++ b/docs/admin/sharding-partitioning.md @@ -121,7 +121,7 @@ in the {ref}`sharding-guide`. ## Example -Let's create a basic sharding strategy on behalf of a concrete example. Imagine +Let's create a basic sharding strategy by using a concrete example. Imagine you want to create a *partitioned table* on a *three-node cluster* to store time series data with the following assumptions: diff --git a/docs/performance/sharding.md b/docs/performance/sharding.md index 4402868e..deb2ae55 100644 --- a/docs/performance/sharding.md +++ b/docs/performance/sharding.md @@ -100,6 +100,10 @@ types of queries you intend to run. CrateDB also has replicas of data and this results in additional shards in the cluster. +By default, CrateDB uses the replica setting `0-1` on newly created tables, +so it will end up with twice the number of shards configured. The more +replicas you add, the higher is the multiplier (x3, x4, etc.) how you +compute required capacities. ### Segments @@ -133,7 +137,7 @@ the more likely it is that they will be distributed across the whole cluster, and hence across all of your CPUs, and hence the faster your queries will run. 
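+
+One way to check for under-allocation is to compare a table's shard count with
+the CPUs available in the cluster. A minimal sketch, assuming a table named
+`metrics`, using the `sys.shards` and `sys.nodes` system tables:
+
+```sql
+-- Primary shards of one table vs. CPUs in the whole cluster.
+SELECT (SELECT count(*) FROM sys.shards
+        WHERE table_name = 'metrics' AND "primary" = true) AS primary_shards,
+       (SELECT sum(os_info['available_processors']) FROM sys.nodes) AS cluster_cpus;
+```
+
+If `primary_shards` is clearly below `cluster_cpus`, queries on that table
+cannot be spread across all CPUs at once.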
 
 (sharding-over-allocation)=
-### Avoid too much over-allocation
+### Avoid extensive over-allocation

From 580221a1a4135c597ea4fa266dd8070dfa8b1597 Mon Sep 17 00:00:00 2001
From: Andreas Motl
Date: Tue, 7 Oct 2025 11:46:11 +0200
Subject: [PATCH 12/15] Chore: Fix broken link reference

---
 docs/integrate/locust/tutorial.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/integrate/locust/tutorial.md b/docs/integrate/locust/tutorial.md
index 7a9fa670..48db2f8c 100644
--- a/docs/integrate/locust/tutorial.md
+++ b/docs/integrate/locust/tutorial.md
@@ -479,7 +479,7 @@ If you want to download the locust data, you can do that on the last tab.
 When you want to run a load test against a CrateDB Cluster with multiple queries,
 Locust is a great and flexible tool that lets you quickly define a load test and
 see what numbers regarding users and RPS are possible for that particular setup.
-[CrateDB CLI tools]: https://cratedb.com/docs/crate/clients-tools/en/latest/connect/cli.html#cli
+[CrateDB CLI tools]: project:#cli
 [DBeaver]: https://dbeaver.io
 [fully-managed]: https://console.cratedb.cloud/
 [Locust]: https://locust.io

From fdc8076a9a3cdadc22b58906101a4aeb552eef61 Mon Sep 17 00:00:00 2001
From: Andreas Motl
Date: Tue, 7 Oct 2025 11:47:54 +0200
Subject: [PATCH 13/15] Sharding/Replicas: Implement suggestions by Marios and CodeRabbit

---
 docs/performance/sharding.md | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/docs/performance/sharding.md b/docs/performance/sharding.md
index deb2ae55..6e48f998 100644
--- a/docs/performance/sharding.md
+++ b/docs/performance/sharding.md
@@ -98,12 +98,18 @@ types of queries you intend to run.
 
 ### Replicas
 
-CrateDB also has replicas of data and this results in additional shards in
-the cluster.
-By default, CrateDB uses the replica setting `0-1` on newly created tables,
-so it will end up with twice the number of shards configured. The more
-replicas you add, the higher is the multiplier (x3, x4, etc.) how you
-compute required capacities.
+CrateDB uses replicas for both data durability and query performance. When a
+node goes down, replicas ensure no data is lost. For read operations, CrateDB
+randomly distributes queries across both primary and replica shards, improving
+concurrent read throughput.
+
+Each replica adds to the total shard count in the cluster. By default, CrateDB
+uses the replica setting `0-1` on newly created tables, resulting in twice the
+number of configured shards. The more replicas you add, the higher the
+multiplier (x3, x4, etc.) for capacity planning.
+
+See the {ref}`replication reference `
+documentation for more details.

From 2849eb7f62797d0d2fa090fbc7fce4f1e9729c0d Mon Sep 17 00:00:00 2001
From: Andreas Motl
Date: Tue, 7 Oct 2025 12:02:41 +0200
Subject: [PATCH 14/15] Sharding/Shard-per-CPU: Implement suggestions by Marios

---
 docs/performance/sharding.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/performance/sharding.md b/docs/performance/sharding.md
index 6e48f998..d9d85a7f 100644
--- a/docs/performance/sharding.md
+++ b/docs/performance/sharding.md
@@ -64,7 +64,7 @@ workloads. In large clusters, this often means fewer shards than total CPU cores,
 as larger shards can still be processed efficiently by multiple CPU cores during
 query execution. 
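+
+To verify where your tables stand against this range, shard sizes can be read
+from the `sys.shards` system table. A minimal sketch (sizes are reported in
+bytes):
+
+```sql
+-- Shard count plus average and largest shard size per table, in GiB.
+SELECT table_name,
+       count(*) AS shards,
+       avg(size) / (1024.0 * 1024.0 * 1024.0) AS avg_size_gib,
+       max(size) / (1024.0 * 1024.0 * 1024.0) AS max_size_gib
+FROM sys.shards
+GROUP BY table_name
+ORDER BY max_size_gib DESC;
+```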
 
-### CPU-to-shard ratio
+### Shard-per-CPU ratio
 
 If most nodes have more shards per table than they have CPUs, the cluster can
 experience performance degradations.
 For example, on clusters with substantial CPU resources (e.g., 8 nodes × 32 CPUs
 = 256 total CPUs), creating 256+ shards per table often proves counterproductive.
 If you don't manually set the number of shards per table, CrateDB will make a
 best guess, based on the assumption that your nodes have two CPUs each.
+The general advice is to start your calculation with one shard per CPU.

From 1d73a79669dfd9409cbbc9670810fb6420651980 Mon Sep 17 00:00:00 2001
From: Andreas Motl
Date: Tue, 7 Oct 2025 14:49:09 +0200
Subject: [PATCH 15/15] Sharding: Use "5-50 GB" size recommendation

---
 docs/admin/sharding-partitioning.md | 4 ++--
 docs/feature/cluster/index.md       | 2 +-
 docs/performance/sharding.md        | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/admin/sharding-partitioning.md b/docs/admin/sharding-partitioning.md
index 58d7340b..ce0a542d 100644
--- a/docs/admin/sharding-partitioning.md
+++ b/docs/admin/sharding-partitioning.md
@@ -104,7 +104,7 @@ for a benchmark.
 - Calculate the throughput
 
 Then, to calculate the number of shards, consider that each shard should
-roughly be between 10 – 50 GB, and that each node can manage
+roughly be between 5-50 GB, and that each node can manage
 up to 1_000 shards by default.
 
 :::{caution}
@@ -132,7 +132,7 @@ time series data with the following assumptions:
 Given the daily throughput is around 10 GB/day, the monthly throughput is 30
 times that (~ 300 GB). The partition column can be day, week, month, quarter,
 etc. So, assuming a monthly partition, the next step is to calculate the number of shards
-with the **shard size recommendation** (10 - 50 GB) and the **number of nodes** in
+with the **shard size recommendation** (5-50 GB) and the **number of nodes** in
 the cluster in mind.
 
 With three shards, each shard would hold 100 GB (300 GB / 3 shards), which is above
diff --git a/docs/feature/cluster/index.md b/docs/feature/cluster/index.md
index 04fb9f23..4a725387 100644
--- a/docs/feature/cluster/index.md
+++ b/docs/feature/cluster/index.md
@@ -109,7 +109,7 @@ data loss, and to improve read performance.
 ## Synopsis
 With a monthly throughput of 300 GB, partitioning your
 table by month, and using six shards, each shard will manage 50 GB of data, which is
-within the recommended size range (10 - 50 GB).
+within the recommended size range (5-50 GB).
 
 Through replication, the table will store three copies of your data, in order
 to reduce the chance of permanent data loss.
diff --git a/docs/performance/sharding.md b/docs/performance/sharding.md
index d9d85a7f..1716ef16 100644
--- a/docs/performance/sharding.md
+++ b/docs/performance/sharding.md
@@ -29,7 +29,7 @@ If you are looking for an intro to sharding, see also the
 To avoid running your clusters with too many shards or too large shards,
 implement the following guidelines as a rule of thumb:
 
-- Use shard sizes between 10 GB and 50 GB.
+- Use shard sizes between 5 GB and 50 GB.
 
 - Keep the number of records on each shard below 200 million.
 
@@ -59,7 +59,7 @@ for maintenance operations.
 ### Shard size vs. number of shards
 
 The optimal approach balances shard count with shard size. Individual shards should
-typically contain 10-50 GB of data, being the sweet spot for most
+typically contain 5-50 GB of data, which is the sweet spot for most
 workloads. In large clusters, this often means fewer shards than total CPU cores,
 as larger shards can still be processed efficiently by multiple CPU cores during
 query execution.
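+
+As a closing sketch, the guidelines above translate into DDL roughly as
+follows. The table name, columns, and shard/replica numbers are illustrative
+assumptions (about 300 GB per month, six shards of roughly 50 GB each), not a
+definitive recommendation:
+
+```sql
+-- Partition by month; each monthly partition is clustered into six shards.
+CREATE TABLE metrics (
+   ts TIMESTAMP WITH TIME ZONE,
+   month TIMESTAMP WITH TIME ZONE GENERATED ALWAYS AS date_trunc('month', ts),
+   val DOUBLE PRECISION
+) CLUSTERED INTO 6 SHARDS
+PARTITIONED BY (month)
+WITH (number_of_replicas = 1);
+```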