Skip to content

FK-based region-column inference #20018

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Aug 1, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 9 additions & 8 deletions src/current/_includes/v25.3/misc/table-storage-parameters.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,12 @@
| Parameter name | Description | Data type | Default value |
|------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------|-----------|---------------|
| `exclude_data_from_backup` | Exclude the data in this table from any future backups. | Boolean | `false` |
| <a name="storage-parameter-schema-locked"></a> `schema_locked` | Disallow [schema changes]({% link {{ page.version.version }}/online-schema-changes.md %}) on this table. Enabling `schema_locked` can help [improve performance of changefeeds]({% link {{ page.version.version }}/create-changefeed.md %}#disallow-schema-changes-on-tables-to-improve-changefeed-performance) running on this table. | Boolean | `false` |
| `sql_stats_automatic_collection_enabled` | Enable [automatic statistics collection]({% link {{ page.version.version }}/cost-based-optimizer.md %}#enable-and-disable-automatic-statistics-collection-for-tables) for this table. | Boolean | `true` |
| `sql_stats_automatic_collection_min_stale_rows` | Minimum number of stale rows in this table that will trigger a statistics refresh. | Integer | 500 |
| `sql_stats_automatic_collection_fraction_stale_rows` | Fraction of stale rows in this table that will trigger a statistics refresh. | Float | 0.2 |
| `sql_stats_forecasts_enabled` | Enable [forecasted statistics]({% link {{ page.version.version }}/show-statistics.md %}#display-forecasted-statistics) collection for this table. | Boolean | `true` |
| Parameter name | Description | Data type | Default value |
|----------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|---------------|
| `exclude_data_from_backup` | Exclude the data in this table from any future backups. | Boolean | `false` |
| <a name="storage-parameter-schema-locked"></a> `schema_locked` | Disallow [schema changes]({% link {{ page.version.version }}/online-schema-changes.md %}) on this table. Enabling `schema_locked` can help [improve performance of changefeeds]({% link {{ page.version.version }}/create-changefeed.md %}#disallow-schema-changes-on-tables-to-improve-changefeed-performance) running on this table. | Boolean | `false` |
| `sql_stats_automatic_collection_enabled` | Enable [automatic statistics collection]({% link {{ page.version.version }}/cost-based-optimizer.md %}#enable-and-disable-automatic-statistics-collection-for-tables) for this table. | Boolean | `true` |
| `sql_stats_automatic_collection_min_stale_rows` | Minimum number of stale rows in this table that will trigger a statistics refresh. | Integer | 500 |
| `sql_stats_automatic_collection_fraction_stale_rows` | Fraction of stale rows in this table that will trigger a statistics refresh. | Float | 0.2 |
| `sql_stats_forecasts_enabled` | Enable [forecasted statistics]({% link {{ page.version.version }}/show-statistics.md %}#display-forecasted-statistics) collection for this table. | Boolean | `true` |
| `infer_rbr_region_col_using_constraint` | For [`REGIONAL BY ROW`]({% link {{ page.version.version }}/table-localities.md %}#regional-by-row-tables) tables, automatically populate the hidden `crdb_region` column on `INSERT`, `UPDATE`, and `UPSERT` by looking up the region of the referenced parent row. Set this parameter to the name of a [foreign key]({% link {{ page.version.version }}/foreign-key.md %}) constraint on the table that includes the `crdb_region` column. The foreign key cannot be dropped while the parameter is set. | String | `NULL` |

The following parameters are included for PostgreSQL compatibility and do not affect how CockroachDB runs:

Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
In a _regional by row_ table, individual rows are optimized for access from different home regions. Each row's home region is specified in a hidden [`crdb_region` column]({% link {{ page.version.version }}/alter-table.md %}#crdb_region), and is by default the region of the [gateway node]({% link {{ page.version.version }}/architecture/life-of-a-distributed-transaction.md %}#gateway) from which the row is inserted. The `REGIONAL BY ROW` setting automatically divides a table and all of [its indexes]({% link {{ page.version.version }}/table-localities.md %}#indexes-on-regional-by-row-tables) into [partitions]({% link {{ page.version.version }}/partitioning.md %}) that use `crdb_region` as the prefix.
In a _regional by row_ table, each row is optimized for access from a specific home region. Each row's home region is specified in a hidden [`crdb_region` column]({% link {{ page.version.version }}/alter-table.md %}#crdb_region), which defaults to the region of the [gateway node]({% link {{ page.version.version }}/architecture/life-of-a-distributed-transaction.md %}#gateway) that inserted the row. The `REGIONAL BY ROW` setting automatically [partitions]({% link {{ page.version.version }}/partitioning.md %}) the table and all of [its indexes]({% link {{ page.version.version }}/table-localities.md %}#indexes-on-regional-by-row-tables) by region using `crdb_region` as the partition key prefix.

Use regional by row tables when your application requires low-latency reads and writes at a row level where individual rows are primarily accessed from a single region. For an example of a table in a multi-region cluster that can benefit from the `REGIONAL BY ROW` setting, see the `users` table from the [MovR application]({% link {{ page.version.version }}/movr.md %}), which could store users' data in specific regions for better performance.
Use regional by row tables when individual rows are frequently accessed from a single region, and your application requires low-latency reads and writes at the row level. A typical `REGIONAL BY ROW` use case is the `users` table in the [MovR application]({% link {{ page.version.version }}/movr.md %}), where user data can be co-located with the user's region for better performance.

To take advantage of regional by row tables:

Expand All @@ -10,7 +10,7 @@ To take advantage of regional by row tables:

- [Turn on auto-rehoming for regional by row tables]({% link {{ page.version.version }}/alter-table.md %}#turn-on-auto-rehoming-for-regional-by-row-tables). A row's home region will be automatically set to the gateway region of any [`UPDATE`]({% link {{ page.version.version }}/update.md %}) or [`UPSERT`]({% link {{ page.version.version }}/upsert.md %}) statements that write to those rows.

For instructions showing how to set a table's locality to `REGIONAL BY ROW` and configure the home regions of its rows, see [`ALTER TABLE ... SET LOCALITY`]({% link {{ page.version.version }}/alter-table.md %}#crdb_region).
For instructions showing how to set a table's locality to `REGIONAL BY ROW` and configure the home regions of its rows, refer to [`ALTER TABLE ... SET LOCALITY`]({% link {{ page.version.version }}/alter-table.md %}#crdb_region).

For more information on regional by row tables, see the [Cockroach Labs blog post](https://www.cockroachlabs.com/blog/regional-by-row/).

Expand Down
168 changes: 167 additions & 1 deletion src/current/v25.3/alter-table.md
Original file line number Diff line number Diff line change
Expand Up @@ -2571,6 +2571,10 @@ SELECT crdb_region, id FROM {table};
UPDATE {table} SET crdb_region = 'eu-west' WHERE id IN (...)
~~~

{{site.data.alerts.callout_success}}
{% include_cached new-in.html version="v25.3" %} CockroachDB can also [infer a row's home region from a foreign key constraint](#infer-a-rows-home-region-from-a-foreign-key).
{{site.data.alerts.end}}

To add a new row to a regional by row table, you must choose one of the following options.

- Let CockroachDB set the row's home region automatically. It will use the region of the [gateway node]({% link {{ page.version.version }}/architecture/life-of-a-distributed-transaction.md %}#gateway) from which the row is inserted.
Expand Down Expand Up @@ -2614,6 +2618,168 @@ ALTER TABLE rides ADD COLUMN region crdb_internal_region AS (

{% include {{page.version.version}}/sql/locality-optimized-search.md %}

#### Infer a row's home region from a foreign key

{{site.data.alerts.callout_info}}
{% include feature-phases/preview.md %}
{{site.data.alerts.end}}

The [`infer_rbr_region_col_using_constraint` table storage parameter]({% link {{ page.version.version }}/with-storage-parameter.md %}#table-parameters) lets a [`REGIONAL BY ROW`]({% link {{ page.version.version }}/table-localities.md %}#regional-by-row-tables) child table automatically set the hidden `crdb_region` column by looking up the referenced parent row. The parameter must be set to the name of a [foreign key]({% link {{ page.version.version }}/foreign-key.md %}) constraint on the child table that includes the `crdb_region` column.

{{site.data.alerts.callout_info}}
`infer_rbr_region_col_using_constraint` and [auto-rehoming](#turn-on-auto-rehoming-for-regional-by-row-tables) are mutually exclusive. Enable one mode per `REGIONAL BY ROW` table, not both.
{{site.data.alerts.end}}

1. Start a [multi-node, multi-region `cockroach demo` cluster]({% link {{ page.version.version }}/cockroach-demo.md %}#start-a-multi-region-demo-cluster):

{% include_cached copy-clipboard.html %}
~~~ shell
cockroach demo --global --nodes 9 --insecure
~~~

The command opens an interactive SQL shell connected to the temporary, in-memory cluster that is running with three simulated regions.

1. In the SQL shell, create a multi-region `demo` database:

{% include_cached copy-clipboard.html %}
~~~ sql
CREATE DATABASE demo PRIMARY REGION "us-east1" REGIONS "us-west1", "europe-west1";
USE demo;
~~~

1. Verify the regions that are available to the `demo` database:

{% include_cached copy-clipboard.html %}
~~~ sql
SHOW REGIONS FROM DATABASE demo;
~~~

~~~
database | region | primary | secondary | zones
-----------+--------------+---------+-----------+----------
demo | us-east1 | t | f | {b,c,d}
demo | europe-west1 | f | f | {b,c,d}
demo | us-west1 | f | f | {a,b,c}
(3 rows)
~~~

1. Enable the following cluster setting:

{% include_cached copy-clipboard.html %}
~~~ sql
SET CLUSTER SETTING feature.infer_rbr_region_col_using_constraint.enabled = true;
~~~

1. Create a parent table that is `REGIONAL BY ROW`:

{% include_cached copy-clipboard.html %}
~~~ sql
CREATE TABLE parent (
id INT PRIMARY KEY,
data TEXT
) LOCALITY REGIONAL BY ROW;
~~~

1. Create a child table whose rows should live in the same region as their parent rows. The table's foreign key **must** include `crdb_region`, and the `infer_rbr_region_col_using_constraint` parameter must be set to the name of the foreign key (in this example, `fk_parent`):

{% include_cached copy-clipboard.html %}
~~~ sql
CREATE TABLE child (
id INT PRIMARY KEY,
parent_id INT,
info TEXT,
CONSTRAINT fk_parent FOREIGN KEY (crdb_region, parent_id) REFERENCES parent (crdb_region, id)
) WITH (infer_rbr_region_col_using_constraint = 'fk_parent')
LOCALITY REGIONAL BY ROW;
~~~

1. Insert a row into the parent table that lives in the primary region (`us-east1`):

{% include_cached copy-clipboard.html %}
~~~ sql
INSERT INTO parent (id, data, crdb_region) VALUES (1, 'east row', 'us-east1');
~~~

1. Run the following command to view IDs and connection strings for all nodes in the cluster:

{% include_cached copy-clipboard.html %}
~~~
\demo ls
~~~

~~~
node 1:
(webui) http://127.0.0.1:8080

Application tenant:
(cli) cockroach sql --insecure -d cluster:demoapp/movr
(sql) postgresql://root@127.0.0.1:26257/movr?options=-ccluster%3Ddemoapp&sslmode=disable
(sql/jdbc) jdbc:postgresql://127.0.0.1:26257/movr?options=-ccluster%3Ddemoapp&sslmode=disable&user=root
(sql/unix) postgresql://root:unused@/defaultdb?host=%2FUsers%2Fryankuo%2F.cockroach-demo&options=-ccluster%3Ddemoapp&port=26257
(rpc) 127.0.0.1:26357

...

node 9:
(webui) http://127.0.0.1:8088

Application tenant:
(cli) cockroach sql --insecure -p 26265 -d cluster:demoapp/movr
(sql) postgresql://root@127.0.0.1:26265/movr?options=-ccluster%3Ddemoapp&sslmode=disable
(sql/jdbc) jdbc:postgresql://127.0.0.1:26265/movr?options=-ccluster%3Ddemoapp&sslmode=disable&user=root
(sql/unix) postgresql://root:unused@/defaultdb?host=%2FUsers%2Fryankuo%2F.cockroach-demo&options=-ccluster%3Ddemoapp&port=26265
(rpc) 127.0.0.1:26365
~~~

1. Follow one of the `webui` URLs in the preceding output and open the [**Network Latency** Page]({% link {{ page.version.version }}/ui-network-latency-page.md %}) in the DB Console. This page displays the region associated with each node ID.

In a **new** terminal, connect to a node in a non-primary region (in this example, `us-west1` or `europe-west1`) using its connection string:

{% include_cached copy-clipboard.html %}
~~~ shell
cockroach sql --insecure -p 26265 -d cluster:demoapp/demo
~~~

1. Verify that the node you're connected to is in a non-primary region:

{% include_cached copy-clipboard.html %}
~~~ sql
SHOW LOCALITY;
~~~

~~~
locality
----------------------------
region=europe-west1,az=d
~~~

1. Insert a row into the child table from the node you're connected to:

{% include_cached copy-clipboard.html %}
~~~ sql
INSERT INTO child (id, parent_id, info) VALUES (10, 1, 'hello from gateway');
~~~

1. View the value of the hidden `crdb_region` column. Even though the row was inserted from a non-primary gateway region, CockroachDB uses the foreign-key constraint to look up and copy the parent row's region into the child row:

{% include_cached copy-clipboard.html %}
~~~ sql
SELECT id, parent_id, crdb_region FROM child;
~~~

~~~
id | parent_id | crdb_region
-----+-----------+--------------
10 | 1 | us-east1
~~~

1. If you later need to disable the behavior, [reset the storage parameter](#reset-storage-parameter):

{% include_cached copy-clipboard.html %}
~~~ sql
ALTER TABLE child RESET (infer_rbr_region_col_using_constraint);
~~~

#### Turn on auto-rehoming for `REGIONAL BY ROW` tables

{{site.data.alerts.callout_info}}
Expand All @@ -2634,7 +2800,7 @@ SET enable_auto_rehoming = on;
Once enabled, the auto-rehoming behavior described here has the following limitations:

- It **will only apply to newly created `REGIONAL BY ROW` tables**, using an `ON UPDATE` expression that is added to the [`crdb_region`](#crdb_region) column. Existing `REGIONAL BY ROW` tables will not be auto-rehomed.
- The [`crdb_region`](#crdb_region) column from a [`REGIONAL BY ROW`](#set-the-table-locality-to-regional-by-row) table cannot be referenced as a [foreign key]({% link {{ page.version.version }}/foreign-key.md %}) from another table.
- The [`crdb_region`](#crdb_region) column from a [`REGIONAL BY ROW`](#set-the-table-locality-to-regional-by-row) table cannot be referenced as a [foreign key]({% link {{ page.version.version }}/foreign-key.md %}) from another table. This also means the [`infer_rbr_region_col_using_constraint` table storage parameter]({% link {{ page.version.version }}/with-storage-parameter.md %}#table-parameters) cannot be used to [infer a row's home region](#infer-a-rows-home-region-from-a-foreign-key).

To enable auto-rehoming for an existing `REGIONAL BY ROW` table, manually update it using an [`ALTER TABLE ... ALTER COLUMN`](#alter-column) statement with an `ON UPDATE` expression:

Expand Down
Loading
Loading