
Commit ddfc251

doc: describe schema reload implementation
This patch adds a small document describing how space and sharding schemas work in the module. It covers what schemas are and how we store, use, and reload them in requests. Closes #253
1 parent a577434 commit ddfc251

doc/dev/schema.md

Lines changed: 227 additions & 0 deletions

# Database schema information design document

Two types of schema are used in ``crud`` requests: the ``net.box`` space
schema and the ``ddl`` sharding schema. If a change occurs in one of
them, router instances should reload the schema and reevaluate
the request using the updated one. This document clarifies how each schema
is obtained, used, and reloaded.

## Table of Contents
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->

- [Space schema](#space-schema)
  - [How schema is stored](#how-schema-is-stored)
  - [When schema is used](#when-schema-is-used)
  - [How schema is reloaded](#how-schema-is-reloaded)
  - [When schema is reloaded and operation is retried](#when-schema-is-reloaded-and-operation-is-retried)
  - [Alternative approaches](#alternative-approaches)
- [Sharding schema](#sharding-schema)
  - [How schema is stored](#how-schema-is-stored-1)
  - [When schema is used](#when-schema-is-used-1)
  - [How schema is reloaded](#how-schema-is-reloaded-1)
  - [When schema is reloaded and operation is retried](#when-schema-is-reloaded-and-operation-is-retried-1)
  - [Alternative approaches](#alternative-approaches-1)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

## Space schema

Related links: [#98](https://github.com/tarantool/crud/issues/98),
[PR#111](https://github.com/tarantool/crud/pull/111).

### How schema is stored

Every ``crud`` router is a [``vshard``](https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/)
router; the same applies to storages. In ``vshard`` clusters, spaces are
created on storages. Thus, each storage has a schema (the space list,
space formats, and indexes) on it.

Every router has a [``net.box``](https://www.tarantool.io/en/doc/latest/reference/reference_lua/net_box/)
connection to each storage it can interact with. (They can be
retrieved with a [``vshard.router.routeall``](https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/vshard_router/#lua-function.vshard.router.routeall)
call.) Each ``net.box`` connection holds the space schema of the instance
it is connected to. The router can access the space schema through the
``net.box`` connection object, as the sketch below illustrates.
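
A minimal sketch of this access pattern (not ``crud``'s actual code; the
space name ``customers`` is an assumed example):

```lua
-- Sketch: reading the space schema a router sees through its net.box
-- connections. Assumes vshard.router.cfg() has already been called.
local vshard = require('vshard')

for _, replicaset in pairs(vshard.router.routeall()) do
    local conn = replicaset.master.conn
    -- net.box mirrors the remote instance's schema in conn.space.
    local space = conn.space.customers
    if space ~= nil then
        for _, field in ipairs(space:format()) do
            print(field.name, field.type)
        end
    end
end
```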

### When schema is used

Space schema is used
- to flatten objects on `crud.*_object*` requests (see the sketch after
  this list);
- to resolve update operation fields if updating
  by field name is not supported natively;
- to calculate ``bucket_id`` to choose a replicaset for a request
  (together with sharding info);
- in the `metadata` field of a request response, so it can be used later
  for `crud.unflatten_rows`.
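
As an illustration of the flattening step, a helper could look like this
(a simplified, hypothetical sketch; ``crud``'s real implementation also
handles ``bucket_id`` and other details):

```lua
-- Hypothetical helper: maps an object to a tuple using the space format.
local function flatten(obj, space_format)
    local tuple = {}
    for i, field in ipairs(space_format) do
        local value = obj[field.name]
        if value == nil and field.is_nullable ~= true then
            return nil, ('Field %q is mandatory'):format(field.name)
        end
        tuple[i] = value
    end
    return tuple
end

-- Usage: flatten({id = 1, name = 'Elizabeth'}, space:format())
```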

### How schema is reloaded

``net.box`` schema reload works as follows. For each request (we use ``call``s
to execute our procedures) from client (router) to server (storage), we
receive a schema version together with the response data. If the schema
versions mismatch, the client reloads the schema. The request is not retried.
``net.box`` reloads the schema before returning a synchronous ``call`` result
to the user, so the next request in the fiber will use the updated schema.
(See [tarantool/tarantool#6169](https://github.com/tarantool/tarantool/issues/6169).)

``crud`` cannot implicitly rely on ``net.box`` reloads: ``crud`` requests
use the ``net.box`` space schema to build ``net.box`` ``call`` request data,
so if the schema is not relevant anymore, a part of this request data needs
to be recalculated. ``crud`` uses the connection ``reload_schema`` handle
(see [PR#111 comment](https://github.com/tarantool/crud/pull/111#issuecomment-765811556))
to ping the storage and update the `net.box` schema if it is outdated. ``crud``
reloads each replicaset schema. If there are several requests to reload the
schema, only one reload fiber is started and every request waits for its
completion.
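
The single-fiber deduplication can be sketched like this (assumed names;
the real logic lives in ``crud``'s schema module and differs in details):

```lua
-- Simplified sketch of a deduplicated schema reload.
local fiber = require('fiber')

local reload_in_progress = false
local reload_done = fiber.cond()

local function reload_schema(replicasets)
    if reload_in_progress then
        -- A reload is already running: wait for it to finish
        -- instead of starting another fiber.
        reload_done:wait()
        return
    end
    reload_in_progress = true
    for _, replicaset in pairs(replicasets) do
        -- The reload_schema handle pings the instance and refreshes
        -- the cached net.box schema if it is outdated.
        replicaset.master.conn:reload_schema()
    end
    reload_in_progress = false
    reload_done:broadcast()
end
```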

### When schema is reloaded and operation is retried

The basic logic is as follows: if something has failed and a space schema
mismatch could be the reason, reload the schema and retry. If it doesn't
help after ``N`` retries (currently ``N`` is ``1``), pass the error to the
user.

Retry with reload is triggered
- before a network request, if an action that depends on the schema has failed:
  - space not found;
  - object flatten on a `crud.*_object*` request has failed;
  - ``bucket_id`` calculation has failed for any reason;
  - updating by field name is not supported natively and field id
    resolution has failed;
- after a network request, if the operation has failed on the storage,
  hash check is on, and the hashes mismatch.

Let's talk a bit more about the last one. To enable the hash check, the user
should pass the `add_space_schema_hash` option to a request. This option is
always enabled for `crud.*_object*` requests. If the hashes mismatch, it
means the router schema of this space is inconsistent with the storage
space schema, so we reload it. For ``*_many`` methods, reload and retry
happen only if all tuples have failed with a hash mismatch; otherwise,
errors are passed to the user.

Retries with reload are processed by wrappers that wrap "pure" crud operations
and utilities. Retries are counted per function, so in fact there could be more
reloads than ``N`` for a single request. For example, `crud.*_object*` code looks
like this:
```lua
local function reload_schema_wrapper(func, ...)
    for i = 1, N do
        local status, res, err, reload_needed = pcall(func, ...)
        if err ~= nil and reload_needed then
            -- A schema mismatch could be the reason: reload and retry.
            process_reload()
            -- ...
        else
            return res, err
        end
    end
    -- ...
end

function crud.insert_object(args)
    local flatten_obj, err = reload_schema_wrapper(flatten_obj_func, space_name, obj)
    -- ...
    return reload_schema_wrapper(call_insert_func, space_name, flatten_obj, opts)
end
```

For `crud.*_object_many`, each tuple flatten is retried separately (though it
is hard to imagine a real case of multiple successful flatten retries):
```lua
function crud.insert_object_many(args)
    for _, obj in ipairs(obj_array) do
        -- Each object flatten gets its own retry-with-reload loop.
        local flatten_obj, err = reload_schema_wrapper(flatten_obj_func, space_name, obj)
        -- ...
    end
    -- ...
    return reload_schema_wrapper(call_insert_many_func, space_name, flatten_obj_array, opts)
end
```

### Alternative approaches

One of the alternatives considered was to ping a storage instance on
each request to refresh the schema (see [PR#111 comment](https://github.com/tarantool/crud/pull/111#discussion_r562757016)),
but it was declined due to performance degradation.

## Sharding schema

Related links: [#166](https://github.com/tarantool/crud/issues/166),
[PR#181](https://github.com/tarantool/crud/pull/181),
[#237](https://github.com/tarantool/crud/issues/237),
[PR#239](https://github.com/tarantool/crud/pull/239),
[#212](https://github.com/tarantool/crud/issues/212),
[PR#268](https://github.com/tarantool/crud/pull/268).

### How schema is stored

Again, a ``crud`` cluster is a [``vshard``](https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/)
cluster. Thus, data is sharded based on some key. To extract the key
from a tuple and compute a [``bucket_id``](https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/vshard_architecture/),
we use [``ddl``](https://github.com/tarantool/ddl) module data.
The ``ddl`` module schema describes the sharding key (how to extract the data
required to compute ``bucket_id`` from a tuple based on the space schema)
and the sharding function (how to compute ``bucket_id`` based on that data).
This information is stored on storages in the ``_ddl_sharding_key`` and
``_ddl_sharding_func`` Tarantool spaces. The ``crud`` module uses these
spaces to fetch the sharding schema: thus, you are not obliged to use the
``ddl`` module and can set up the ``_ddl_*`` spaces manually, if you want.
The ``crud`` module stores sharding info in plain Lua tables on routers
and storages.
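
For illustration, here is how sharding metadata could be declared through
the ``ddl`` module (a hedged sketch: the ``customers`` space and its format
are assumptions, not something this document prescribes):

```lua
-- Illustrative only: ddl.set_schema() fills _ddl_sharding_key and
-- _ddl_sharding_func, which crud then reads.
local ddl = require('ddl')

local ok, err = ddl.set_schema({
    spaces = {
        customers = {
            engine = 'memtx',
            is_local = false,
            temporary = false,
            format = {
                {name = 'id', type = 'unsigned', is_nullable = false},
                {name = 'bucket_id', type = 'unsigned', is_nullable = false},
                {name = 'name', type = 'string', is_nullable = false},
            },
            indexes = {
                {name = 'primary', type = 'TREE', unique = true,
                 parts = {{path = 'id', type = 'unsigned', is_nullable = false}}},
                {name = 'bucket_id', type = 'TREE', unique = false,
                 parts = {{path = 'bucket_id', type = 'unsigned', is_nullable = false}}},
            },
            sharding_key = {'id'},
            -- sharding_func is optional; without it crud falls back to
            -- vshard.router.bucket_id_strcrc32.
        },
    },
})
```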

### When schema is used

Sharding schema is used
- to compute ``bucket_id``. Thus, it is used each time we need to execute
  a non-map-reduce request[^1] and `bucket_id` is not specified by the user.
  If there is no sharding schema specified, we use the defaults: the sharding
  key is the primary key, and the sharding function is
  ``vshard.router.bucket_id_strcrc32`` (see the sketch below).

[^1]: It includes all ``insert_*``, ``replace_*``, ``update_*``,
``delete_*``, ``upsert_*``, and ``get_*`` requests. ``*_many`` requests
also use the sharding schema to find a storage for each tuple. ``select``,
``count`` and ``pairs`` requests use sharding info if user conditions
contain an equality condition on the sharding key. (The user can still
force map-reduce with `force_map_call`; in this case the sharding schema
won't be used.)
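
The default computation can be sketched as follows (the tuple layout is an
assumed example):

```lua
-- Sketch of the default bucket_id derivation: the sharding key is the
-- primary key, the function is bucket_id_strcrc32. Assumes a configured
-- vshard router.
local vshard = require('vshard')

local tuple = {1, box.NULL, 'Elizabeth'}  -- {id, bucket_id, name}
local sharding_key = {tuple[1]}           -- primary key field(s) only
local bucket_id = vshard.router.bucket_id_strcrc32(sharding_key)
```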

### How schema is reloaded

Storage sharding schema (internal ``crud`` Lua tables) is updated on
initialization (the first time the info is requested) and each time someone
changes ``_ddl_sharding_key`` or ``_ddl_sharding_func`` data: we use
``on_replace`` triggers, as sketched below.
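
A simplified, hypothetical sketch of such a trigger (the cache table and
tuple layout are assumptions for illustration):

```lua
-- Storage-side cache update: _ddl_sharding_key tuples are assumed
-- to look like {space_name, sharding_key}.
local sharding_key_cache = {}

local function on_sharding_key_replace(old_tuple, new_tuple)
    if new_tuple ~= nil then
        sharding_key_cache[new_tuple[1]] = new_tuple[2]
    elseif old_tuple ~= nil then
        sharding_key_cache[old_tuple[1]] = nil
    end
end

box.space._ddl_sharding_key:on_replace(on_sharding_key_replace)
```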

Routers fetch the sharding schema if the cache hasn't been initialized yet
or each time a reload is requested. A reload can be requested by ``crud``
itself (see below) or by the user (with the ``require('crud.common.sharding_key').update_cache()``
or ``require('crud.common.sharding_func').update_cache()`` handles).
These handles were deprecated after automatic reload was introduced.

The sharding information reload procedure always fetches all sharding keys
and all sharding functions regardless of the reason that triggered the reload.
If there are several requests to reload the info, only one
reload fiber is started and every request waits for its completion.

### When schema is reloaded and operation is retried

Retry with reload is triggered
- if router and storage schema hashes mismatch on a request that
  uses sharding info.

Each request that uses sharding info passes sharding hashes from
the router along with the request. If the hashes mismatch the storage ones,
the storage returns a specific error, as sketched below. The request is
retried if it receives a hash mismatch error from the storage. If it doesn't
help after ``N`` retries (currently ``N`` is ``1``), the error is passed to
the user. For ``*_many`` methods, reload and retry happen only if all tuples
have failed with a hash mismatch; otherwise, errors are passed to the user.
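
The storage-side check can be sketched like this (the error class name and
the cache are assumptions for illustration; ``crud`` uses the
[``errors``](https://github.com/tarantool/errors) module for its error
classes):

```lua
-- Sketch: compare the hash sent by the router with the storage's one.
local errors = require('errors')

local ShardingHashMismatchError = errors.new_class('ShardingHashMismatchError')

local function check_sharding_hash(space_name, router_hash, storage_hashes)
    if router_hash ~= storage_hashes[space_name] then
        -- The router reloads sharding info and retries on this error.
        return nil, ShardingHashMismatchError:new(
            'sharding info for %q is outdated', space_name)
    end
    return true
end
```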

Retries with reload are processed by wrappers that wrap "pure" crud
operations, the same as in space schema reload. Since there is only one
use case for sharding info, there is only one wrapper per operation:
around the main "pure" crud function that makes the network call.

### Alternative approaches

There were different implementations of working with the hash: we tried
to compute it on the fly instead of storing pre-computed values (both on
the router and the storage), but the pre-computed approach with triggers
was better in terms of performance. Another option was to ping the storage
and verify sharding info relevance with a separate call before sending the
request, but it was also declined due to performance degradation.
