docs/mysql-external-table.md
Lines changed: 80 additions & 28 deletions
@@ -1,48 +1,88 @@
1
1
# MySQL External Table
2
2
3
-
## Overview
3
+
## Overview
4
4
5
5
Timeplus can read or write MySQL tables directly. This unlocks a set of new use cases, such as:
6
6
7
-
- Use Timeplus to efficiently process real-time data in Kafka/Redpanda, apply flat transformation or stateful aggregation, then write the data to the local or remote MySQL for further analysis or visualization.
8
-
-Enrich the live data with the static or slow-changing data in MySQL. Apply streaming JOIN.
9
-
- Use Timeplus to query historical or recent data in MySQL.
7
+
- **Stream Processing**: Use Timeplus to efficiently process real-time data in Kafka/Redpanda, apply flat transformations or stateful aggregations, then write the processed data to a local or remote MySQL for further analysis or visualization.
8
+
- **Data Enrichment**: Enrich live streaming data with static or slow-changing data from MySQL using streaming JOINs.
9
+
- **Unified Analytics**: Use Timeplus to query historical or recent data in MySQL alongside your streaming data for comprehensive analytics.
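The data enrichment case can be sketched as a streaming JOIN. This is a minimal illustration, not a tested recipe: the stream `orders`, the MySQL external table `mysql_customers`, and all column names are hypothetical, and the sketch assumes an external table can appear directly on the right side of a JOIN.

```sql
-- Hypothetical names: `orders` is a Timeplus stream,
-- `mysql_customers` is a MySQL external table.
SELECT
  o.order_id,
  o.amount,
  c.customer_name
FROM orders AS o
JOIN mysql_customers AS c
  ON o.customer_id = c.id;
```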
10
10
11
-
This integration is done by introducing "External Table" in Timeplus. Similar to [External Stream](/external-stream), there is no data persisted in Timeplus. However, since the data in MySQL is in the form of table, not data stream, so we call this as External Table. Currently, we support MySQL and ClickHouse. In the roadmap, we will support more integration by introducing other types of External Table.
11
+
This integration is done by introducing "External Table" in Timeplus. Similar to [External Stream](/external-stream), there is no data persisted in Timeplus. It is called an "External Table" because the data in MySQL is structured as a table rather than a stream.
12
+
13
+
The implementation is built on top of StorageMySQL with connection pooling and failover support.
12
14
13
15
## Create MySQL External Table
14
16
15
17
```sql
16
-
CREATE EXTERNAL TABLE name
17
-
SETTINGS type='mysql',
18
-
address='host:port',
19
-
user='..',
20
-
password='..',
21
-
database='..',
22
-
config_file='..',
23
-
table='..',
24
-
replace_query=false, -- optional, if it is ture, use REPLACE INTO instead of INSERT INTO
25
-
on_duplicate_clause='..', -- optinal, set the expression for ON DUPLICATE KEY
26
-
pooled_connections=16; -- optional, the maximum pooled connections to the database. Default 16.
18
+
CREATE EXTERNAL TABLE
19
+
table_name
20
+
SETTINGS
21
+
type='mysql',
22
+
address='host:port',
23
+
[ database='..', ]
24
+
[ table='..', ]
25
+
[ user='..', ]
26
+
[ password='..', ]
27
+
[ replace_query=false, ]
28
+
[ on_duplicate_clause='..', ]
29
+
[ pooled_connections=16, ]
30
+
[ config_file='..', ]
31
+
[ named_collection='..' ]
27
32
```
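As a concrete illustration of the syntax above, the following sketch creates an external table for a MySQL server on the local machine. The external table name `mysql_orders`, the credentials, and the database and table names are placeholders, not values from a real deployment.

```sql
-- All names and credentials below are placeholders.
CREATE EXTERNAL TABLE mysql_orders
SETTINGS
  type='mysql',
  address='localhost:3306',
  database='shop',
  table='orders',
  user='root',
  password='secret123';
```

Because `table` is optional, omitting it here would make Timeplus look for a MySQL table named `mysql_orders`.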
28
33
29
-
The required settings are type and address. For other settings, the default values are
34
+
### Required Settings
35
+
36
+
- **type** (string) - Must be set to `'mysql'`.
37
+
- **address** (string) - MySQL server address in the format `'host:port'`. The default port is 3306.
38
+
39
+
### Database Settings
40
+
- **user** (string, default: `'default'`) - MySQL username.
41
+
- **password** (string, default: `''`) - MySQL password.
42
+
- **database** (string, default: `'default'`) - MySQL database name.
43
+
- **table** (string, default: external table name) - Remote MySQL table name. If omitted, the external table name is used.
44
+
- **replace_query** (bool, default: `false`) - Flag that converts `INSERT INTO` queries to `REPLACE INTO`. If `true`, the query is executed as `REPLACE INTO`. If `false`, the query is executed as `INSERT INTO`.
45
+
- **on_duplicate_clause** (string, default: `''`) - The `ON DUPLICATE KEY on_duplicate_clause` expression that is added to the `INSERT` query. Can be specified only with `replace_query=false`. Example: `UPDATE c=c+1`. See the [MySQL documentation](https://dev.mysql.com/doc/refman/8.4/en/insert-on-duplicate.html) for the expressions you can use with the `ON DUPLICATE KEY` clause.
46
+
- **pooled_connections** (uint64, default: `16`) - Maximum pooled TCP connections.
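To show how `replace_query` and `on_duplicate_clause` fit together, here is a sketch that reuses the `UPDATE c=c+1` example from the list above. The external table name `page_hits`, the database name, and the credentials are hypothetical, and the sketch assumes the MySQL table has a unique key plus a counter column `c`.

```sql
-- Hypothetical: the MySQL table `page_hits` has a unique key and a counter column `c`.
CREATE EXTERNAL TABLE page_hits
SETTINGS
  type='mysql',
  address='localhost:3306',
  database='web',
  user='root',
  password='secret123',
  replace_query=false,                 -- required when on_duplicate_clause is set
  on_duplicate_clause='UPDATE c=c+1';  -- appended after ON DUPLICATE KEY
```

With these settings, an insert that collides with an existing key would be expected to increment `c` instead of failing.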
47
+
48
+
### Configuration Management Settings
30
49
31
-
- 'default' for `user`
32
-
- '' (empty string) for `password`
33
-
- 'default' for `database`
34
-
- If you omit the table name, it will use the name of the external table
- **named_collection** (string, default: `''`) - Name of a pre-defined named collection configuration.
35
52
36
-
The `config_file` setting is available since Timeplus Enterprise 2.7. You can specify the path to a file that contains the configuration settings. The file should be in the format of `key=value` pairs, one pair per line. You can set the MySQL user and password in the file.
53
+
The `config_file` setting is available since Timeplus Enterprise 2.7. You can specify the path to a configuration file that contains the configuration settings. The file should be in the format of `key=value` pairs, one pair per line. You can set the MySQL user and password in the file.
37
54
38
-
Please follow the example in [Kafka External Stream](/kafka-source#config_file).
55
+
Example configuration file content:
39
56
40
-
You don't need to specify the columns, since the table schema will be fetched from the MySQL server.
57
+
```ini
58
+
address=localhost:3306
59
+
user=root
60
+
password=secret123
61
+
database=production
62
+
```
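An external table could then reference that file as sketched below. The file path is hypothetical, and the sketch assumes that settings supplied by the file (including `address`) do not need to be repeated in the DDL.

```sql
-- Hypothetical path; the file holds key=value pairs like the example above.
CREATE EXTERNAL TABLE mysql_orders
SETTINGS
  type='mysql',
  config_file='/etc/timeplus/mysql.conf',
  table='orders';
```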
63
+
64
+
The `named_collection` setting is available since Timeplus Enterprise 3.0. Similar to `config_file`, you can specify the name of a pre-defined named collection that contains the configuration settings.
65
+
66
+
Example named collection definition:
67
+
68
+
```sql
69
+
CREATE NAMED COLLECTION
70
+
mysql_config
71
+
AS
72
+
address='localhost:3306',
73
+
user='root',
74
+
password='secret123',
75
+
database='production';
76
+
```
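A sketch of an external table that uses the `mysql_config` collection defined above follows. The external table name and the MySQL table name are placeholders, and the sketch assumes the connection settings come entirely from the collection.

```sql
-- `mysql_config` is the named collection from the example above;
-- `mysql_orders` and `orders` are placeholder names.
CREATE EXTERNAL TABLE mysql_orders
SETTINGS
  type='mysql',
  named_collection='mysql_config',
  table='orders';
```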
77
+
78
+
### Columns Definition
79
+
80
+
You don't need to specify the columns in the external table DDL, since the table schema will be fetched from the MySQL server.
41
81
42
82
Once the external table is created successfully, you can run the following SQL to list the columns:
43
83
44
84
```sql
45
-
DESCRIBE name
85
+
DESCRIBE table_name;
46
86
```
47
87
48
88
:::info
@@ -51,6 +91,12 @@ The data types in the output will be Timeplus data types, such as `uint8`, inste
51
91
52
92
:::
53
93
94
+
:::info
95
+
96
+
Timeplus fetches and caches the MySQL table schema when the external table is attached. When the remote MySQL table schema changes (e.g., adding columns, changing data types, dropping columns), you must **restart** to reload the updated schema.
97
+
98
+
:::
99
+
54
100
You can define the external table and use it to read data from the MySQL table, or write to it.
55
101
56
102
## Connect to a local MySQL {#local}
@@ -83,8 +129,14 @@ Limitations:
83
129
1. tumble/hop/session/table functions are not supported for External Table (coming soon)
84
130
2. scalar or aggregation functions are performed by Timeplus, not the remote MySQL
85
131
3. `LIMIT n` is performed by Timeplus, not the remote MySQL
132
+
4. No query predicate pushdown to MySQL (planned for future versions)
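The query below illustrates what these limitations mean in practice. The external table `mysql_orders` and its columns are hypothetical.

```sql
-- Hypothetical table and columns.
-- Per the limitations above, the WHERE filter, count(), and LIMIT
-- are all evaluated by Timeplus rather than by MySQL.
SELECT status, count() AS order_count
FROM mysql_orders
WHERE amount > 100
GROUP BY status
LIMIT 10;
```

Since nothing is pushed down, a query like this may transfer more rows from MySQL than the small result suggests.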
86
133
87
134
## Write data to MySQL {#write}
135
+
MySQL external tables support standard INSERT operations with the following behaviors:
- **Replace Mode**: When `replace_query=true`, uses `REPLACE INTO` instead of `INSERT INTO`
139
+
- **On Duplicate Key**: Custom conflict resolution with `on_duplicate_clause`
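A direct insert might look like the sketch below. The external table `mysql_orders` and its columns are hypothetical.

```sql
-- Hypothetical external table and columns.
INSERT INTO mysql_orders (order_id, status, amount)
VALUES (1001, 'paid', 250.0);
```

If the external table was created with `replace_query=true`, the same statement would be sent to MySQL as `REPLACE INTO`.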
88
140
89
141
You can run a regular `INSERT INTO` to add data to a MySQL table. However, it's more common to use a Materialized View to send the streaming SQL results to MySQL.
90
142
@@ -110,9 +162,9 @@ SELECT * FROM some_source_stream
*`max_insert_block_size` - The maximum block size for insertion, i.e. maximum number of rows in a batch. Default value: 65409
114
-
*`max_insert_block_bytes` - The maximum size in bytes of block for insertion. Default value: 1 MiB.
115
-
*`insert_block_timeout_ms` - The maximum time in milliseconds for constructing a block(a block) for insertion. Increasing the value gives greater possibility to create bigger blocks (limited by `max_insert_block_bytes` and `max_insert_block_size`), but also increases latency. Negative numbers means no timeout. Default value: 500.
165
+
- `max_insert_block_size` - The maximum block size for insertion, i.e. the maximum number of rows in a batch. Default value: 65409.
166
+
- `max_insert_block_bytes` - The maximum size in bytes of a block for insertion. Default value: 1 MiB.
167
+
- `insert_block_timeout_ms` - The maximum time in milliseconds for constructing a block for insertion. Increasing the value makes bigger blocks more likely (limited by `max_insert_block_bytes` and `max_insert_block_size`), but also increases latency. A negative number means no timeout. Default value: 500.