diff --git a/documentation/concept/decimal.md b/documentation/concept/decimal.md new file mode 100644 index 000000000..bf15b266a --- /dev/null +++ b/documentation/concept/decimal.md @@ -0,0 +1,340 @@ +--- +title: Decimal +description: + This document describes how to work with the decimal data type in QuestDB, + including precision and scale, literal syntax, arithmetic operations, and best + practices for financial and scientific calculations. +--- + +QuestDB provides a `decimal` data type for exact numeric calculations, useful +for financial computations, scientific measurements, and any scenario where +precision matters. This page explains how to use decimals effectively, including +syntax, operations, and performance considerations. + +## What are decimals? + +Decimals are fixed-point numbers that maintain exact precision during arithmetic +operations. Unlike floating-point types (`float` and `double`), decimals avoid +rounding errors by storing numbers as scaled integers internally. This makes +them ideal for monetary calculations where accuracy is critical. + +## Decimal type in QuestDB + +QuestDB implements decimals with the syntax `DECIMAL(precision, scale)`: + +- **Precision**: Total number of significant digits (1-76) +- **Scale**: Number of digits after the decimal point (0-precision) + +For example, `DECIMAL(10, 2)` can store values from -99,999,999.99 to +99,999,999.99. + +If neither the precision and scale are provided, the type defaults to a +precision of 18 and a scale of 3. + +### Storage + +QuestDB automatically selects the optimal storage size based on the decimal's +precision: + +| Precision | Storage Size | Internal Type | +| ------------ | ------------ | ------------- | +| 1-2 digits | 1 byte | DECIMAL8 | +| 3-4 digits | 2 bytes | DECIMAL16 | +| 5-9 digits | 4 bytes | DECIMAL32 | +| 10-18 digits | 8 bytes | DECIMAL64 | +| 19-38 digits | 16 bytes | DECIMAL128 | +| 39-76 digits | 32 bytes | DECIMAL256 | + +## Decimal literals + +QuestDB requires the `m` suffix to distinguish decimal literals from +floating-point numbers: + +```questdb-sql +-- Decimal literals use the 'm' suffix +SELECT 123.45m; -- Decimal value 123.45 +SELECT 0.001m; -- Decimal value 0.001 +SELECT 1000000.00m; -- Decimal value 1,000,000.00 + +-- Without 'm' suffix, numbers are treated as double +SELECT 123.45; -- Double value (floating-point) +``` + +:::important + +Always use the `m` suffix for decimal literals. QuestDB does not implicitly +convert doubles to decimals to prevent unintended precision loss. + +::: + +## Creating tables with decimals + +Define decimal columns by specifying precision and scale: + +```questdb-sql +CREATE TABLE transactions ( + id LONG, + amount DECIMAL(15, 2), -- Up to 999,999,999,999.99 + tax_rate DECIMAL(5, 4), -- Up to 9.9999 (e.g., 0.0875 for 8.75%) + quantity DECIMAL(10, 3), -- Up to 9,999,999.999 + timestamp TIMESTAMP +) timestamp(timestamp); +``` + +## Working with decimals + +### Basic arithmetic + +Decimal arithmetic maintains precision automatically: + +```questdb-sql +-- Insert transaction data +INSERT INTO transactions VALUES + (1, 99.99m, 0.0875m, 2.500m, now()), + (2, 150.00m, 0.0625m, 1.750m, now()), + (3, 1250.50m, 0.0875m, 10.000m, now()); + +-- Arithmetic operations maintain precision +SELECT + amount, + amount * tax_rate AS tax_amount, + amount + (amount * tax_rate) AS total, + amount * quantity AS extended_amount +FROM transactions; +``` + +### Precision and scale in operations + +QuestDB automatically determines the result precision and scale for decimal +operations based on the operands: + +#### Addition and subtraction + +- **Scale**: Maximum scale of the operands +- **Precision**: Maximum precision of the operands (scaled) + 1 + +```questdb-sql +-- Addition with different scales +SELECT 10.5m + 1.234m; -- scale: max(1, 3) = 3, Result: 11.734 + +-- Adding DECIMAL(10,2) + DECIMAL(8,2) → DECIMAL(11,2) +SELECT 99999999.99m + 999999.99m; -- Result has precision 11, scale 2 +``` + +The additional precision digit allows the result to accommodate potential +overflow (e.g., 99.9 + 99.9 = 199.8 requires 4 digits instead of 3). + +#### Multiplication + +- **Scale**: Sum of the scales of both operands +- **Precision**: Sum of the precision of both operands + +```questdb-sql +-- Multiplication adds scales +SELECT 10.50m * 1.25m; -- scale: 2 + 2 = 4, Result: 13.1250 + +-- DECIMAL(5,2) * DECIMAL(4,2) → DECIMAL(9,4) +SELECT 100.50m * 12.34m; -- Result: 1240.1700 +``` + +#### Division + +- **Scale**: Maximum scale of the operands + +```questdb-sql +-- Division uses maximum scale +SELECT 10.50m / 2.0m; -- scale: max(2, 1) = 2, Result: 5.25 + +-- Division may truncate beyond the scale +SELECT 10.00m / 3.00m; -- Result: 3.33 (limited to scale 2) +``` + +### Comparison operations + +Decimals support all standard comparison operators: + +```questdb-sql +-- Find high-value transactions +SELECT * FROM transactions WHERE amount > 1000.00m; + +-- Find specific tax rates +SELECT * FROM transactions WHERE tax_rate = 0.0875m; + +-- Range queries +SELECT * FROM transactions WHERE amount BETWEEN 100.00m AND 500.00m; +``` + +## Type casting + +### Explicit casting + +Convert between numeric types using `CAST`: + +```questdb-sql +-- From integer to decimal +SELECT CAST(100 AS DECIMAL(10, 2)); -- Result: 100.00 + +-- From double to decimal (use with caution - may lose precision) +SELECT CAST(123.456789 AS DECIMAL(8, 3)); -- Result: 123.457 + +-- From decimal to other types +SELECT CAST(99.99m AS INT); -- Result: 99 (truncate) +SELECT CAST(99.99m AS DOUBLE); -- Result: 99.99 (as floating-point) +``` + +### Important casting rules + +- **No implicit conversion from double/float**: Must use explicit `CAST` or + decimal literals +- **Integer to decimal**: Safe, no precision loss, the decimals have a scale of + 0 +- **Double to decimal**: May lose precision due to floating-point representation +- **Between decimal types**: Automatic when precision/scale allows + +## Performance considerations + +### Advantages + +- **Exact results**: Perfect for financial calculations and accounting +- **Predictable behavior**: No surprising rounding errors +- **Regulatory compliance**: Meets requirements for exact monetary calculations + +### Trade-offs + +- **Slower than floating-point**: Typically slower than `double` operations +- **More storage**: May use more space than `float` for equivalent range +- **Complex operations**: Division have overhead + +### Performance tips + +- **Use appropriate precision**: Don't over-specify precision beyond your needs +- **Keep precision ≤ 18 when possible**: DECIMAL64 operations are faster than + DECIMAL128/256 + +## Common use cases + +### Financial calculations + +```questdb-sql +-- Portfolio valuation with exact arithmetic +CREATE TABLE portfolio ( + symbol SYMBOL, + shares DECIMAL(12, 4), -- Fractional shares supported + price DECIMAL(10, 2), -- Stock price + commission DECIMAL(7, 2), -- Trading fees + timestamp TIMESTAMP +) timestamp(timestamp); + +-- Calculate exact portfolio value +SELECT + symbol, + shares, + price, + shares * price AS position_value, + shares * price - commission AS net_value, + sum(shares * price) OVER () AS total_portfolio_value +FROM portfolio +WHERE timestamp = now(); +``` + +### Cryptocurrency trading + +```questdb-sql +-- ETH trading with high precision (18 decimals like wei) +CREATE TABLE crypto_trades ( + trade_id LONG, + pair SYMBOL, + eth_amount DECIMAL(28, 18), -- ETH with full wei precision + usdt_price DECIMAL(12, 2), -- USDT price per ETH + fee_rate DECIMAL(5, 4), -- Trading fee (e.g., 0.001 for 0.1%) + gas_fee_eth DECIMAL(18, 18), -- Gas fee in ETH + timestamp TIMESTAMP +) timestamp(timestamp); + +-- Calculate trade values with exact precision +SELECT + trade_id, + eth_amount, + usdt_price, + eth_amount * usdt_price AS trade_value_usdt, + eth_amount * usdt_price * fee_rate AS fee_usdt, + eth_amount * usdt_price * (1.0m - fee_rate) AS net_value_usdt, + eth_amount - gas_fee_eth AS net_eth_received +FROM crypto_trades; +``` + +### Scientific measurements + +```questdb-sql +-- High-precision sensor data +CREATE TABLE sensor_readings ( + sensor_id SYMBOL, + measurement DECIMAL(20, 10), -- 10 decimal places of precision + calibration_factor DECIMAL(6, 5), + timestamp TIMESTAMP +) timestamp(timestamp); + +-- Apply calibration with exact arithmetic +SELECT + sensor_id, + measurement, + measurement * calibration_factor AS calibrated_value, + avg(measurement) OVER (PARTITION BY sensor_id) AS avg_reading +FROM sensor_readings +SAMPLE BY 1h; +``` + +## Best practices + +### When to use decimals + +**Use decimals for:** + +- Financial data (prices, amounts, exchange rates) +- Accounting calculations +- Scientific measurements requiring exact precision +- Regulatory compliance scenarios +- Any calculation where rounding errors are unacceptable + +**Avoid decimals for:** + +- Scientific calculations requiring extensive math functions +- Performance-critical analytics on large datasets +- Approximate values where precision isn't critical +- Coordinates or measurements where float precision suffices + +### Design guidelines + +1. **Choose appropriate precision and scale** + + ```questdb-sql + -- Good: Matches business requirements + CREATE TABLE prices ( + amount DECIMAL(10, 2) -- Cents precision for USD + ); + + -- Avoid: Excessive precision + CREATE TABLE prices ( + amount DECIMAL(30, 15) -- Unnecessary for most use cases + ); + ``` + +2. **Use the 'm' suffix consistently** + + ```questdb-sql + -- Good: Clear decimal literals + INSERT INTO prices VALUES (99.99m); + + -- Error: Missing 'm' suffix + INSERT INTO prices VALUES (99.99); -- Treated as double, will fail + ``` + +3. **Explicit casting when mixing types** + + ```questdb-sql + -- Good: Explicit cast + SELECT amount + CAST(10 AS DECIMAL(10, 2)) FROM prices; + + -- Good: Use decimal literal + SELECT amount + 10.00m FROM prices; + ``` diff --git a/documentation/pgwire/pgwire-intro.md b/documentation/pgwire/pgwire-intro.md index 0152c5e02..f5a374d0d 100644 --- a/documentation/pgwire/pgwire-intro.md +++ b/documentation/pgwire/pgwire-intro.md @@ -7,54 +7,84 @@ description: import { Clients } from "../../src/components/Clients" -QuestDB implements the PostgreSQL wire protocol (PGWire) to allow clients to connect to QuestDB using PostgreSQL client -libraries. This is a great way to get started with QuestDB, as it allows you to use existing PostgreSQL clients and +QuestDB implements the PostgreSQL wire protocol (PGWire) to allow clients to +connect to QuestDB using PostgreSQL client libraries. This is a great way to get +started with QuestDB, as it allows you to use existing PostgreSQL clients and libraries. - + -When using PGWire with QuestDB, there are a few important things to know and the rest of this document will cover them -in more detail. +When using PGWire with QuestDB, there are a few important things to know and the +rest of this document will cover them in more detail. ### Querying vs. Ingestion -The PGWire interface is primarily recommended for querying data from -QuestDB. For data ingestion, especially for high-throughput scenarios, QuestDB recommends using its clients that -support the [InfluxDB Line Protocol (ILP)](/docs/ingestion-overview/). These are optimized for fast data insertion. +The PGWire interface is primarily recommended for querying data from QuestDB. +For data ingestion, especially for high-throughput scenarios, QuestDB recommends +using its clients that support the +[InfluxDB Line Protocol (ILP)](/docs/ingestion-overview/). These are optimized +for fast data insertion. ### Timestamp Handling -QuestDB stores all timestamps internally in [UTC](https://en.wikipedia.org/wiki/Coordinated_Universal_Time). -However, when transmitting timestamps over the PGWire protocol, QuestDB represents them as `TIMESTAMP WITHOUT TIMEZONE`. -This can lead to client -libraries interpreting these timestamps in their local timezone by default, potentially causing confusion or incorrect -data representation. Our language-specific guides provide detailed examples on how to configure your client to correctly -interpret these timestamps as UTC. +QuestDB stores all timestamps internally in +[UTC](https://en.wikipedia.org/wiki/Coordinated_Universal_Time). However, when +transmitting timestamps over the PGWire protocol, QuestDB represents them as +`TIMESTAMP WITHOUT TIMEZONE`. This can lead to client libraries interpreting +these timestamps in their local timezone by default, potentially causing +confusion or incorrect data representation. Our language-specific guides provide +detailed examples on how to configure your client to correctly interpret these +timestamps as UTC. -We realize the current behavior is not ideal and we are actively working on improving it. In the meantime, we -recommend that you set the timezone in your client library to UTC to ensure consistent handling of timestamps. +We realize the current behavior is not ideal and we are actively working on +improving it. In the meantime, we recommend that you set the timezone in your +client library to UTC to ensure consistent handling of timestamps. ### PGWire vs. SQL Semantics -While QuestDB supports the PGWire protocol for communication, its SQL dialect and feature -set are not identical to PostgreSQL. QuestDB is a specialized time-series database and does not support all SQL -features, functions, or data types that a standard PostgreSQL server does. Always refer to the QuestDB SQL +While QuestDB supports the PGWire protocol for communication, its SQL dialect +and feature set are not identical to PostgreSQL. QuestDB is a specialized +time-series database and does not support all SQL features, functions, or data +types that a standard PostgreSQL server does. Always refer to the QuestDB SQL documentation for supported operations. ### Forward-only Cursors -QuestDB's cursors are forward-only, differing from PostgreSQL's support for scrollable cursors (which allow -bidirectional navigation and arbitrary row access). With QuestDB, you can iterate through query results sequentially -from start to finish, but you cannot move backward or jump to specific rows. Explicit DECLARE CURSOR statements for -scrollable types, or operations like fetching in reverse (e.g., Workspace BACKWARD), are not supported. +QuestDB's cursors are forward-only, differing from PostgreSQL's support for +scrollable cursors (which allow bidirectional navigation and arbitrary row +access). With QuestDB, you can iterate through query results sequentially from +start to finish, but you cannot move backward or jump to specific rows. Explicit +DECLARE CURSOR statements for scrollable types, or operations like fetching in +reverse (e.g., Workspace BACKWARD), are not supported. -This limitation can impact client libraries that rely on scrollable cursor features. For example, Python's psycopg2 -driver might encounter issues if attempting such operations. For optimal compatibility, choose drivers or configure -existing ones to use forward-only cursors, such as Python's asyncpg driver. +This limitation can impact client libraries that rely on scrollable cursor +features. For example, Python's psycopg2 driver might encounter issues if +attempting such operations. For optimal compatibility, choose drivers or +configure existing ones to use forward-only cursors, such as Python's asyncpg +driver. ### Protocol Flavors and Encoding -The PostgreSQL wire protocol has different implementations and options. When your -client library allows, prefer the Extended Query Protocol over the Simple Query Protocol. Additionally, for optimal -performance and type fidelity, choose clients that support BINARY encoding for data transfer over TEXT encoding -whenever possible. The specifics of how to configure this will vary by client library. +The PostgreSQL wire protocol has different implementations and options. When +your client library allows, prefer the Extended Query Protocol over the Simple +Query Protocol. Additionally, for optimal performance and type fidelity, choose +clients that support BINARY encoding for data transfer over TEXT encoding +whenever possible. The specifics of how to configure this will vary by client +library. + +### Decimal + +To insert `decimal` values via PGWire, you must either use the `m` suffix to +indicate that the value is a decimal literal or cast the value to `decimal`. For +example: + +```questdb-sql +INSERT INTO my_table (decimal_column) VALUES (123.45m); -- Using 'm' suffix +INSERT INTO my_table (decimal_column) VALUES (CAST($1 AS DECIMAL(18, 3))); -- Using CAST over bind parameter +``` + +In the text format, PostgreSQL clients send decimal values as strings. +Currently, QuestDB parses these strings as `double` values and doesn't +implicitly convert them to `decimal` to avoid unintended precision loss. So you +must explicitly cast `double` values to `decimal` in your SQL queries when +inserting into `decimal` columns. diff --git a/documentation/reference/api/ilp/columnset-types.md b/documentation/reference/api/ilp/columnset-types.md index d3ef9c4b0..a4665adb0 100644 --- a/documentation/reference/api/ilp/columnset-types.md +++ b/documentation/reference/api/ilp/columnset-types.md @@ -2,7 +2,7 @@ title: InfluxDB Line Protocol Columnset Value Types sidebar_label: Columnset value types description: - Describes all support value types in InfluxDB Line Protocol columnset. + Describes all supported value types in InfluxDB Line Protocol columnset. --- This page lists the supported InfluxDB Line Protocol columnset value types and @@ -11,7 +11,7 @@ details about type casting. If a target column does not exist, QuestDB will create a column using the same type that the ILP client sends. -Type casts that cause data loss will cause entire line to be rejected. +Type casts that cause data loss will cause the entire line to be rejected. ## Integer @@ -34,16 +34,16 @@ The line above will be accepted and `96i` will be cast to `short`. ### Cast table -The following `cast` operations are supported when existing table column type is -not `long`: +The following `cast` operations are supported when the existing table column +type is not `long`: -| | `byte` | `short` | `int` | `long` | `float` | `double` | `date` | `timestamp` | -| :-------- | :----- | :------ | :---- | :------- | :------ | :------- | :----- | :---------- | -| `integer` | cast | cast | cast | `native` | cast | cast | cast | cast | +| | `byte` | `short` | `int` | `long` | `float` | `double` | `date` | `timestamp` | `decimal` | +| :-------- | :----- | :------ | :---- | :------- | :------ | :------- | :----- | :---------- | :-------- | +| `integer` | cast | cast | cast | `native` | cast | cast | cast | cast | cast | ## Long256 -Custom type, which correspond to QuestDB type `long256`. The values are hex +Custom type, which corresponds to QuestDB type `long256`. The values are hex encoded 256-bit unsigned integer values with `i` suffix. For example: ```shell @@ -69,16 +69,44 @@ conventional double value would. ### Cast table -The following `cast` operations are supported when existing table column type is -not `double`: +The following `cast` operations are supported when the existing table column +type is not `double`: -| | `float` | `double` | -| :------ | :------ | :------- | -| `float` | cast | `native` | +| | `float` | `double` | `decimal` | +| :------ | :------ | :------- | :-------- | +| `float` | cast | `native` | cast | + +## Decimal + +Decimal values, which correspond to QuestDB type `decimal`. The values are +required to have an `d` suffix. For example: + +```shell +trade,ticker=BTCUSD price=30000.50d 1638202821000000000\n +``` + +When the column does not exist, it will be created with the `decimal` type using +the default precision of 18 and scale of 3. To specify custom precision and +scale, create the table upfront: + +```questdb-sql +CREATE TABLE trade (ticker SYMBOL, price DECIMAL(18, 2)); +``` + +The line above will be accepted and `30000.50` will be stored as `decimal`. + +### Cast table + +The following `cast` operations are supported when the existing table column +type is not `decimal`: + +| | `decimal` | `float` | `double` | +| :-------- | :-------- | :------ | :------- | +| `decimal` | `native` | cast | cast | ## Boolean -These value correspond to QuestDB type `boolean`. In InfluxDB Line Protocol +These values correspond to QuestDB type `boolean`. In InfluxDB Line Protocol `boolean` values can be represented in any of the following ways: | Actual value | Single char lowercase | Single char uppercase | Full lowercase | Full camelcase | Full uppercase | @@ -94,8 +122,8 @@ sensors,location=south warning=false\n ### Cast table -The following `cast` operations are supported when existing table column type is -not `boolean`: +The following `cast` operations are supported when the existing table column +type is not `boolean`: | | `boolean` | `byte` | `short` | `int` | `float` | `long` | `double` | | :-------- | :-------- | :----- | :------ | :---- | :------ | :----- | :------- | @@ -105,7 +133,7 @@ When cast to numeric type, boolean `true` is `1` and `false` is `0` ## String -These value correspond to QuestDB type `varchar`. They must be enclosed in +These values correspond to QuestDB type `varchar`. They must be enclosed in quotes. The following characters in values must be escaped with a `\`: `"`, `\n`, `\r` and `\`. For example: @@ -127,17 +155,17 @@ String values must be UTF-8 encoded before sending. ### Cast table -The following `cast` operations are supported when existing table column type is -not `varchar`: +The following `cast` operations are supported when the existing table column +type is not `varchar`: -| | `varchar` | `char` | `string` | `geohash` | `symbol` | `uuid` | -|:---------|:----------|:-------|:---------|:----------|:---------|--------| -| `string` | `native` | cast | cast | cast | cast | cast | +| | `varchar` | `char` | `string` | `geohash` | `symbol` | `uuid` | `decimal` | +| :------- | :-------- | :----- | :------- | :-------- | :------- | ------ | :-------- | +| `string` | `native` | cast | cast | cast | cast | cast | cast | ### Cast to CHAR String value can be cast to `char` type if its length is less than 2 characters. -The following example are valid lines: +The following examples are valid lines: ```shell trade,ticker=BTCUSD status="A" 1638202821000000000\n @@ -151,8 +179,8 @@ The result: | 1638202821000000000 | BTCUSD | A | | 1638202821000000001 | BTCUSD | `null` | -Casting strings with 2 or more characters to `char` will cause entire line to be -rejected. +Casting strings with 2 or more characters to `char` will cause the entire line +to be rejected. ### Cast to GEOHASH @@ -271,9 +299,42 @@ The `uuid` column is populated with `uuid` values: When the `string` value is not a valid UUID, the entire line will be rejected. +### Cast to DECIMAL + +String values can be cast to the `decimal` type when all the following are true: + +- The destination column exists. +- The destination column type is `decimal`. +- The `string` values are valid IEEE-754 decimal values. + +```questdb-sql +CREATE TABLE trade ( + ticker SYMBOL, + price DECIMAL(18, 2), + timestamp TIMESTAMP +) TIMESTAMP(timestamp) PARTITION BY HOUR; +``` + +Send messages including decimal values as `string`: + +```shell +trade,ticker="BTCUSD" price="30000.50" 1638202821000000000\n +trade,ticker="BTCUSD" price="29999.99" 1638402821000000000\n +``` + +The `price` column is populated with `decimal` values: + +| timestamp | ticker | price | +| :-------------------------- | :----- | :------- | +| 2021-11-29T16:20:21.000000Z | BTCUSD | 30000.50 | +| 2021-12-01T23:53:41.000000Z | BTCUSD | 29999.99 | + +When the `string` value is not a valid IEEE-754 decimal value, the entire line +will be rejected. + ## Timestamp -These value correspond to QuestDB type `timestamp`. Timestamp values are epoch +These values correspond to QuestDB type `timestamp`. Timestamp values are epoch `microseconds` suffixed with `t`. In this example we're populating _non-designated_ timestamp field `ts1`: @@ -290,12 +351,12 @@ CREATE TABLE (loc SYMBOL, ts TIMESTAMP) TIMESTAMP(ts) PARTITION BY DAY; When we send: -```shell title="Sending mixed desginated timestamp values" +```shell title="Sending mixed designated timestamp values" tracking,loc=north ts=2000000000t 1000000000\n tracking,loc=south ts=3000000000t\n ``` -The result in `columnset` value always wins: +The `columnset` value always wins: | loc | ts | | :---- | :--------- | diff --git a/documentation/reference/function/numeric.md b/documentation/reference/function/numeric.md index 4b3cf3b0c..cb79d867e 100644 --- a/documentation/reference/function/numeric.md +++ b/documentation/reference/function/numeric.md @@ -40,8 +40,8 @@ FROM long_sequence(3); ## ceil / ceiling -`ceil(value)` or `ceiling()` returns the smallest integer greater than, or equal -to, a specified numeric expression. +`ceil(value)` or `ceiling(value)` returns the smallest integer greater than, or +equal to, a specified numeric expression. **Arguments:** @@ -49,7 +49,7 @@ to, a specified numeric expression. **Return value:** -Return value type is `double`. +Returns `double`, or `decimal` if the operand is `decimal`. **Examples:** @@ -63,7 +63,7 @@ SELECT ceil(15.75) as RoundedUp; ## exp -`exp()` returns the exponential value of a specified numeric expression. +`exp(value)` returns the exponential value of a specified numeric expression. **Arguments:** @@ -85,8 +85,8 @@ SELECT exp(2) as Exponent; ## floor -`floor()` returns the largest integer less than or equal to a specified numeric -expression. +`floor(value)` returns the largest integer less than or equal to a specified +numeric expression. **Arguments:** @@ -94,7 +94,7 @@ expression. **Return value:** -Return value type is `double`. +Returns `double`, or `decimal` if the operand is `decimal`. **Examples:** @@ -102,24 +102,23 @@ Return value type is `double`. SELECT floor(15.75) as RoundedDown; ``` -| RoundedUp | -| --------- | -| 15 | - +| RoundedDown | +| ----------- | +| 15 | ## greatest -`greatest(args...)` returns the largest entry in a series of numbers. +`greatest(args...)` returns the largest entry in a series of numbers. `null` will be returned only if all of the arguments are `null`. **Arguments:** -- `args...` is a variable-size list of `long` or `double` values. +- `args...` is a variable-size list of `long`, `double` or `decimal` values. **Return value:** -Return value type is `double` or `long`. +Return value type is `double`, `long` or `decimal`. **Examples:** @@ -128,25 +127,22 @@ SELECT greatest(11, 3, 8, 15) ``` | greatest | -|----------| +| -------- | | 15 | - - ## least `least(args...)` returns the smallest entry in a series of numbers. - `null` will be returned only if all of the arguments are `null`. **Arguments:** -- `args...` is a variable-size list of `long` or `double` values. +- `args...` is a variable-size list of `long`, `double` or `decimal` values. **Return value:** -Return value type is `double` or `long`. +Return value type is `double`, `long` or `decimal`. **Examples:** @@ -155,10 +151,9 @@ SELECT least(11, 3, 8, 15) ``` | least | -|-------| +| ----- | | 3 | - ## ln `ln(value)` return the natural logarithm (**log*e***) of a given number. @@ -181,7 +176,6 @@ SELECT ln(4.123) | -------------- | | 1.416581053724 | - ## log `log(value)` return the base 10 logarithm of a given number. @@ -200,14 +194,17 @@ Return value type is `double`. SELECT log(100) ``` -| log | -| ------------ | -| 2 | +| log | +| --- | +| 2 | :::note -Some databases use `LOG` to refer to the natural logarithm and `LOG10` for the base 10 logarithm. QuestDB follows PostgreSQL conventions and uses `LOG` for base 10 and `LN` for natural logarithm. -::: +Some databases use `LOG` to refer to the natural logarithm and `LOG10` for the +base 10 logarithm. QuestDB follows PostgreSQL conventions and uses `LOG` for +base 10 and `LN` for natural logarithm. + +::: ## power @@ -251,7 +248,7 @@ the `round_up` and `round_down` values. **Return value:** -Return value type is `double`. +Returns `double`, or `decimal` if the operand is `decimal`. **Examples:** @@ -291,7 +288,7 @@ FROM dbl; **Return value:** -Return value type is `double`. +Returns `double`, or `decimal` if the operand is `decimal`. **Examples:** @@ -315,12 +312,12 @@ FROM dbl; | 0.069361448 | 0 | 0 | 0 | 0 | 0.06 | | 4.003627053 | 0 | 0 | 4 | 4 | 4 | | 86.91359825 | 0 | 80 | 86 | 86.9 | 86.91 | -| 376.3807766 | 400 | 370 | 376 | 376.3 | 376.38 | +| 376.3807766 | 300 | 370 | 376 | 376.3 | 376.38 | ## round_half_even `round_half_even(value, scale)` - returns the **closest** value in the specified -scale. It uses the "half up" tie-breaking method when the value is exactly +scale. It uses the "half even" tie-breaking method when the value is exactly halfway between the `round_up` and `round_down` values. **Arguments:** @@ -333,7 +330,7 @@ halfway between the `round_up` and `round_down` values. **Return value:** -Return value type is `double`. +Returns `double`, or `decimal` if the operand is `decimal`. **Examples:** @@ -384,7 +381,7 @@ FROM dbl; **Return value:** -Return value type is `double`. +Returns `double`, or `decimal` if the operand is `decimal`. **Examples:** @@ -413,8 +410,9 @@ FROM dbl; ## sign `sign(value)` returns sign of the argument, that is: + - -1 for negative value -- 0 for zero +- 0 for zero - +1 for positive value **Arguments:** @@ -432,7 +430,7 @@ SELECT x-3 arg, sign(x-3) from long_sequence(5) ``` | arg | sign | -|-----|------| +| --- | ---- | | -2 | -1 | | -1 | -1 | | 0 | 0 | @@ -482,6 +480,6 @@ Return value type is `double`. SELECT sqrt(4000.32) ``` -| log | +| sqrt | | ---------------- | | 63.2480829749013 | diff --git a/documentation/reference/function/random-value-generator.md b/documentation/reference/function/random-value-generator.md index c8d178607..a03f5bd69 100644 --- a/documentation/reference/function/random-value-generator.md +++ b/documentation/reference/function/random-value-generator.md @@ -34,6 +34,7 @@ QuestDB supports the following random generation functions: - [rnd_uuid4](#rnd_uuid4) - [rnd_ipv4](#rnd_ipv4) - [rnd_double_array](#rnd_double_array) +- [rnd_decimal](#rnd_decimal) ## Usage @@ -659,14 +660,14 @@ rnd_ipv4('22.43.200.9/16', 0) Generates a `DOUBLE` array with random elements. There are two main forms: -1. `rnd_double_array(nDims, [ nanRate, [ maxDimLength ] ])` — generates an array with - the specified dimensionality and random dimension lengths, as well as random - elements. `nanRate` and `maxDimLength` are optional parameters. The default - `nanRate` is zero and the default `maxDimLength` is 16. +1. `rnd_double_array(nDims, [ nanRate, [ maxDimLength ] ])` — generates an array + with the specified dimensionality and random dimension lengths, as well as + random elements. `nanRate` and `maxDimLength` are optional parameters. The + default `nanRate` is zero and the default `maxDimLength` is 16. 2. `rnd_double_array(nDims, nanRate, 0, dim1Len, dim2Len, dim3Len, ...)` — - generates an array of fixed size with random elements. Note the dummy argument 0, - its is needed to disambiguate from other forms. + generates an array of fixed size with random elements. Note the dummy + argument 0, its is needed to disambiguate from other forms. **Examples:** @@ -695,3 +696,32 @@ SELECT rnd_double_array(2, 0, 0, 2, 5); [0.2928431722534959, 0.4269209916086062, 0.08520276767101154, 0.5371988206397026, 0.5786689751730609] ] ``` + +## rnd_decimal + +- `rnd_decimal(precision, scale, nanRate)` - generates a random **positive** + `decimal` between 0 and the maximum value representable by the given precision + and scale. + +**Arguments:** + +- `nanRate` is an `int` defining the frequency of occurrence of `NaN` values: +- `0`: No `NaN` will be returned. +- `1`: Will only return `NaN`. +- `N > 1`: On average, one in N generated values will be `NaN`. + +**Return value:** + +Return value type is `decimal`. + +**Examples:** + +```questdb-sql title="Random decimal" +SELECT rnd_decimal(8, 2, 0) FROM long_sequence(5); +SELECT rnd_decimal(8, 2, 4) FROM long_sequence(5); +``` + +``` +6618.97 5037.02 7118.16 9024.15 537.05 +null 734.74 787.93 null 789.92 +``` diff --git a/documentation/reference/sql/alter-table-change-column-type.md b/documentation/reference/sql/alter-table-change-column-type.md index 5d4cd9f39..312582d61 100644 --- a/documentation/reference/sql/alter-table-change-column-type.md +++ b/documentation/reference/sql/alter-table-change-column-type.md @@ -65,15 +65,15 @@ result in data loss marked with `L`. Numeric types support a wide range of conversions, but many of them can result in the data / precision loss. -| From \ To | boolean | byte | short | int | float | long | double | date | timestamp | -| --------- | ------- | ---- | ----- | --- | ----- | ---- | ------ | ---- | --------- | -| boolean | | X | X | X | X | X | X | X | X | -| byte | L | | X | X | X | X | X | X | X | -| short | L | L | | X | X | X | X | X | X | -| int | L | L | L | | L | X | X | X | X | -| float | L | L | L | L | | L | X | L | L | -| long | L | L | L | L | L | | L | X | X | -| double | L | L | L | L | X | L | | L | L | +| From \ To | boolean | byte | short | int | float | long | double | date | timestamp | decimal | +| --------- | ------- | ---- | ----- | --- | ----- | ---- | ------ | ---- | --------- | ------- | +| boolean | | X | X | X | X | X | X | X | X | | +| byte | L | | X | X | X | X | X | X | X | X | +| short | L | L | | X | X | X | X | X | X | X | +| int | L | L | L | | L | X | X | X | X | X | +| float | L | L | L | L | | L | X | L | L | L | +| long | L | L | L | L | L | | L | X | X | X | +| double | L | L | L | L | X | L | | L | L | L | Conversions between `TIMESTAMP` and `DATE` types and numeric types are fully supported. Timestamp values are represented in microseconds since the EPOCH, @@ -84,10 +84,10 @@ Additionally, when converting from `BOOLEAN` values to numerics, `false` is represented as `0`, and `true` is represented as `1`. On the way back `0` and `NULL` are converted to `false` and all other values converted to `true`. -| From \ To | boolean | byte | short | int | float | long | double | date | timestamp | -| --------- | ------- | ---- | ----- | --- | ----- | ---- | ------ | ---- | --------- | -| date | L | L | L | L | L | X | X | | X | -| timestamp | L | L | L | L | L | X | X | L | | +| From \ To | boolean | byte | short | int | float | long | double | date | timestamp | decimal | +| --------- | ------- | ---- | ----- | --- | ----- | ---- | ------ | ---- | --------- | ------- | +| date | L | L | L | L | L | X | X | | X | | +| timestamp | L | L | L | L | L | X | X | L | | | Conversions to `SYMBOL`, `STRING` and `VARCHAR` are supported from most of the data types. @@ -103,6 +103,7 @@ data types. | date | X | X | X | | timestamp | X | X | X | | double | X | X | X | +| decimal | | X | X | | ipv4 | X | X | X | | char | X | X | X | | uuid | X | X | X | @@ -113,11 +114,11 @@ data types. However conversion from `SYMBOL`, `STRING` and `VARCHAR` to other types can result in `NULL` values for inconvertable string values. -| From \ To | boolean | byte | short | char | int | float | long | date | timestamp | double | uuid | -| --------- | ------- | ---- | ----- | ---- | --- | ----- | ---- | ---- | --------- | ------ | ---- | -| string | L | L | L | L | L | L | L | L | L | L | L | -| varchar | L | L | L | L | L | L | L | L | L | L | L | -| symbol | L | L | L | L | L | L | L | L | L | L | L | +| From \ To | boolean | byte | short | char | int | float | long | date | timestamp | double | uuid | decimal | +| --------- | ------- | ---- | ----- | ---- | --- | ----- | ---- | ---- | --------- | ------ | ---- | ------- | +| string | L | L | L | L | L | L | L | L | L | L | L | L | +| varchar | L | L | L | L | L | L | L | L | L | L | L | L | +| symbol | L | L | L | L | L | L | L | L | L | L | L | | When column type change results into range overflow or precision loss, the same rules as explicit [CAST](/docs/reference/sql/cast/) apply. diff --git a/documentation/reference/sql/cast.md b/documentation/reference/sql/cast.md index d66421fc9..25bab46f0 100644 --- a/documentation/reference/sql/cast.md +++ b/documentation/reference/sql/cast.md @@ -88,10 +88,14 @@ Implicit casting also prevents data loss. When an operation involves multiple types, the resulting type will be the smallest possible type so that no data is lost. +## Casting table + The below chart illustrates the explicit and implicit cast available in QuestDB: + ![Table showing the different possibilities the cast function supports, those are defined by an input and output types](/images/docs/castmap.jpg) + ```questdb-sql title="Queries" SELECT 1234L + 567, diff --git a/documentation/reference/sql/create-table.md b/documentation/reference/sql/create-table.md index 623611a62..a2ba72210 100644 --- a/documentation/reference/sql/create-table.md +++ b/documentation/reference/sql/create-table.md @@ -297,7 +297,7 @@ When specifying a column, a name and [type definition](/docs/reference/sql/datatypes/) must be provided. The `symbol` type may have additional optional parameters applied. -![Flow chart showing the syntax of the different column types](/images/docs/diagrams/columnTypeDef.svg) +![Flow chart showing the syntax of the different column types](/images/docs/diagrams/typeDef.svg) ### Symbols @@ -375,8 +375,7 @@ CREATE TABLE trades ( [**symbol capacity**](/docs/concept/symbol/#usage-of-symbols) are different settings. - The index capacity value should not be changed, unless a user is aware of all - the implications. -::: + the implications. ::: See the [Index concept](/docs/concept/indexes/#how-indexes-work) for more information about indexes. diff --git a/documentation/reference/sql/datatypes.md b/documentation/reference/sql/datatypes.md index 744cf58af..e6ce995e7 100644 --- a/documentation/reference/sql/datatypes.md +++ b/documentation/reference/sql/datatypes.md @@ -4,28 +4,29 @@ sidebar_label: Data types description: Data types reference documentation. --- -| Type Name | Storage bits | Nullable | Description | -|-------------------|-----------------|----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `boolean` | `1` | No | Boolean `true` or `false`. | -| `ipv4` | `32` | Yes | `0.0.0.1` to `255.255.255.255` | -| `byte` | `8` | No | Signed integer, `-128` to `127`. | -| `short` | `16` | No | Signed integer, `-32,768` to `32,767`. | -| `char` | `16` | Yes | `unicode` character. | -| `int` | `32` | Yes | Signed integer, `-2,147,483,648` to `2,147,483,647`. | -| `float` | `32` | Yes | Single precision IEEE 754 floating point value. | -| `symbol` | `32` | Yes | A symbol, stored as a 32-bit signed index into the symbol table. Each index corresponds to a `string` value. The index is transparently translated to the string value. Symbol table is stored separately from the column data. | -| `varchar` | `128 + utf8Len` | Yes | Length-prefixed sequence of UTF-8 encoded characters, stored using a 128-bit header and UTF-8 encoded data. Sequences shorter than 9 bytes are fully inlined within the header and do not occupy any additional data space. | -| `string` | `96+n*16` | Yes | Length-prefixed sequence of UTF-16 encoded characters whose length is stored as signed 32-bit integer with maximum value of `0x7fffffff`. | -| `long` | `64` | Yes | Signed integer, `-9,223,372,036,854,775,808` to `9,223,372,036,854,775,807`. | -| `date` | `64` | Yes | Signed offset in **milliseconds** from [Unix Epoch](https://en.wikipedia.org/wiki/Unix_time). | -| `timestamp` | `64` | Yes | Signed offset in **microseconds** from [Unix Epoch](https://en.wikipedia.org/wiki/Unix_time). | -| `double` | `64` | Yes | Double precision IEEE 754 floating point value. | -| `uuid` | `128` | Yes | [UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier) values. See also [the UUID type](#the-uuid-type). | -| `binary` | `64+n*8` | Yes | Length-prefixed sequence of bytes whose length is stored as signed 64-bit integer with maximum value of `0x7fffffffffffffffL`. | -| `long256` | `256` | Yes | Unsigned 256-bit integer. Does not support arithmetic operations, only equality checks. Suitable for storing a hash code, such as crypto public addresses. | -| `geohash()` | `8`-`64` | Yes | Geohash with precision specified as a number followed by `b` for bits, `c` for chars. See [the geohashes documentation](/docs/concept/geohashes/) for details on use and storage. | -| `array` | See description | Yes | Header: 20 + 4 \* `nDims` bytes. Payload: dense array of values. Example: `DOUBLE[3][4]`: header 28 bytes, payload 3\*4\*8 = 96 bytes. | -| `interval` | `128` | Yes | Pair of timestamps representing a time interval. Not a persisted type: you can use it in expressions, but can't have a database column of this type. | +| Type Name | Storage bits | Nullable | Description | +| ------------------------------- | --------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `boolean` | `1` | No | Boolean `true` or `false`. | +| `ipv4` | `32` | Yes | `0.0.0.1` to `255.255.255.255` | +| `byte` | `8` | No | Signed integer, `-128` to `127`. | +| `short` | `16` | No | Signed integer, `-32,768` to `32,767`. | +| `char` | `16` | Yes | `unicode` character. | +| `int` | `32` | Yes | Signed integer, `-2,147,483,648` to `2,147,483,647`. | +| `float` | `32` | Yes | Single precision IEEE 754 floating point value. | +| `symbol` | `32` | Yes | A symbol, stored as a 32-bit signed index into the symbol table. Each index corresponds to a `string` value. The index is transparently translated to the string value. Symbol table is stored separately from the column data. | +| `varchar` | `128 + utf8Len` | Yes | Length-prefixed sequence of UTF-8 encoded characters, stored using a 128-bit header and UTF-8 encoded data. Sequences shorter than 9 bytes are fully inlined within the header and do not occupy any additional data space. | +| `string` | `96+n*16` | Yes | Length-prefixed sequence of UTF-16 encoded characters whose length is stored as signed 32-bit integer with maximum value of `0x7fffffff`. | +| `long` | `64` | Yes | Signed integer, `-9,223,372,036,854,775,808` to `9,223,372,036,854,775,807`. | +| `date` | `64` | Yes | Signed offset in **milliseconds** from [Unix Epoch](https://en.wikipedia.org/wiki/Unix_time). | +| `timestamp` | `64` | Yes | Signed offset in **microseconds** from [Unix Epoch](https://en.wikipedia.org/wiki/Unix_time). | +| `double` | `64` | Yes | Double precision IEEE 754 floating point value. | +| `uuid` | `128` | Yes | [UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier) values. See also [the UUID type](#the-uuid-type). | +| `binary` | `64+n*8` | Yes | Length-prefixed sequence of bytes whose length is stored as signed 64-bit integer with maximum value of `0x7fffffffffffffffL`. | +| `long256` | `256` | Yes | Unsigned 256-bit integer. Does not support arithmetic operations, only equality checks. Suitable for storing a hash code, such as crypto public addresses. | +| `geohash()` | `8`-`64` | Yes | Geohash with precision specified as a number followed by `b` for bits, `c` for chars. See [the geohashes documentation](/docs/concept/geohashes/) for details on use and storage. | +| `array` | See description | Yes | Header: 20 + 4 \* `nDims` bytes. Payload: dense array of values. Example: `DOUBLE[3][4]`: header 28 bytes, payload 3\*4\*8 = 96 bytes. | +| `interval` | `128` | Yes | Pair of timestamps representing a time interval. Not a persisted type: you can use it in expressions, but can't have a database column of this type. | +| `decimal(, )` | `8`-`256` | Yes | Decimal floating point with user-specified precision and scale. | ## N-dimensional array @@ -75,27 +76,28 @@ enforce is configurable via `cairo.max.array.element.count`, with the default of Many nullable types reserve a value that marks them `NULL`: -| Type Name | Null value | Description | -| ---------------- | -------------------------------------------------------------------- | ---------------------------------------------------------------------------------------- | -| `float` | `NaN`, `+Infinity`, `-Infinity` | As defined by IEEE 754 (`java.lang.Float.NaN` etc.) | -| `double` | `NaN`, `+Infinity`, `-Infinity` | As defined by IEEE 754 (`java.lang.Double.NaN`, etc.) | -| `long256` | `0x8000000000000000800000000000000080000000000000008000000000000000` | The value equals four consecutive `long` null literals. | -| `long` | `0x8000000000000000L` | Minimum possible value a `long` can take, -2^63. | -| `date` | `0x8000000000000000L` | Minimum possible value a `long` can take, -2^63. | -| `timestamp` | `0x8000000000000000L` | Minimum possible value a `long` can take, -2^63. | -| `int` | `0x80000000` | Minimum possible value an `int` can take, -2^31. | -| `uuid` | `80000000-0000-0000-8000-000000000000` | Both 64 highest bits and 64 lowest bits set to -2^63. | -| `char` | `0x0000` | The zero char (`NUL` in ASCII). | -| `geohash(byte)` | `0xff` | Valid for geohashes of 1 to 7 bits (inclusive). | -| `geohash(short)` | `0xffff` | Valid for geohashes of 8 to 15 bits (inclusive). | -| `geohash(int)` | `0xffffffff` | Valid for geohashes of 16 to 31 bits (inclusive). | -| `geohash(long)` | `0xffffffffffffffff` | Valid for geohashes of 32 to 60 bits (inclusive). | -| `symbol` | `0x80000000` | Symbol is stored as an `int` offset into a lookup file. The value `-1` marks it `NULL`. | -| `ipv4` | `0.0.0.0` (`0x00000000`) | IPv4 address is stored as a 32-bit integer and the zero value represents `NULL`. | -| `varchar` | `N/A` | Varchar column has an explicit `NULL` marker in the header. | -| `string` | `N/A` | String column is length-prefixed, the length is an `int` and `-1` marks it `NULL`. | -| `binary` | `N/A` | Binary column is length prefixed, the length is a `long` and `-1` marks it `NULL`. | -| `array` | `N/A` | Array column marks a `NULL` value with a zero in the `size` field of the header. | +| Type Name | Null value | Description | +| ---------------- | -------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ | +| `float` | `NaN`, `+Infinity`, `-Infinity` | As defined by IEEE 754 (`java.lang.Float.NaN` etc.) | +| `double` | `NaN`, `+Infinity`, `-Infinity` | As defined by IEEE 754 (`java.lang.Double.NaN`, etc.) | +| `long256` | `0x8000000000000000800000000000000080000000000000008000000000000000` | The value equals four consecutive `long` null literals. | +| `long` | `0x8000000000000000L` | Minimum possible value a `long` can take, -2^63. | +| `date` | `0x8000000000000000L` | Minimum possible value a `long` can take, -2^63. | +| `timestamp` | `0x8000000000000000L` | Minimum possible value a `long` can take, -2^63. | +| `int` | `0x80000000` | Minimum possible value an `int` can take, -2^31. | +| `uuid` | `80000000-0000-0000-8000-000000000000` | Both 64 highest bits and 64 lowest bits set to -2^63. | +| `char` | `0x0000` | The zero char (`NUL` in ASCII). | +| `geohash(byte)` | `0xff` | Valid for geohashes of 1 to 7 bits (inclusive). | +| `geohash(short)` | `0xffff` | Valid for geohashes of 8 to 15 bits (inclusive). | +| `geohash(int)` | `0xffffffff` | Valid for geohashes of 16 to 31 bits (inclusive). | +| `geohash(long)` | `0xffffffffffffffff` | Valid for geohashes of 32 to 60 bits (inclusive). | +| `symbol` | `0x80000000` | Symbol is stored as an `int` offset into a lookup file. The value `-1` marks it `NULL`. | +| `ipv4` | `0.0.0.0` (`0x00000000`) | IPv4 address is stored as a 32-bit integer and the zero value represents `NULL`. | +| `varchar` | `N/A` | Varchar column has an explicit `NULL` marker in the header. | +| `string` | `N/A` | String column is length-prefixed, the length is an `int` and `-1` marks it `NULL`. | +| `binary` | `N/A` | Binary column is length prefixed, the length is a `long` and `-1` marks it `NULL`. | +| `array` | `N/A` | Array column marks a `NULL` value with a zero in the `size` field of the header. | +| `decimal` | `N/A` | Minimal value of the underlying decimal type, impossible to reach through arithmetic as it is always out-of-range. | To filter columns that contain, or don't contain, `NULL` values use a filter like: diff --git a/documentation/reference/sql/latest-on.md b/documentation/reference/sql/latest-on.md index ed5206844..4b24128df 100644 --- a/documentation/reference/sql/latest-on.md +++ b/documentation/reference/sql/latest-on.md @@ -19,7 +19,7 @@ where: column. - `columnName` list used in the `PARTITION BY` part of the clause is a list of columns of one of the following types: `SYMBOL`, `STRING`, `BOOLEAN`, `SHORT`, - `INT`, `LONG`, `LONG256`, `CHAR`. + `INT`, `LONG`, `LONG256`, `CHAR`, `DECIMAL`. ## Description diff --git a/documentation/sidebars.js b/documentation/sidebars.js index 4f2521eb7..4bfbe8486 100644 --- a/documentation/sidebars.js +++ b/documentation/sidebars.js @@ -553,6 +553,7 @@ module.exports = { "concept/designated-timestamp", "concept/geohashes", "concept/array", + "concept/decimal", "concept/indexes", "concept/interval-scan", "concept/jit-compiler", diff --git a/scripts/generate_type_cast_chart.py b/scripts/generate_type_cast_chart.py new file mode 100644 index 000000000..43b745a07 --- /dev/null +++ b/scripts/generate_type_cast_chart.py @@ -0,0 +1,98 @@ +import matplotlib.pyplot as plt +import matplotlib.patches as mpatches + + +markdown_table = """ +| From \\ To | String | Boolean | Char | Byte | Short | Int | Long | Long256 | Float | Double | Decimal | Date | Timestamp | Symbol | Binary | +| --------- | ------ | ------- | ----- | ----- | ----- | ----- | ----- | ------- | ----- | ------ | ------- | ----- | --------- | ------ | ------ | +| String | | `E` | `E*` | `E` | `E` | `I` | `I` | `I` | `I` | `I` | `E` | `I` | `I` | `I` | `N/A` | +| Boolean | `I` | | `I` | `I` | `I` | `I` | `I` | `I` | `I` | `I` | `N/A` | `I` | `I` | `I` | `N/A` | +| Char | `I` | `N/A` | | `E*` | `I` | `I` | `I` | `I` | `I` | `I` | `N/A` | `I` | `I` | `I` | `N/A` | +| Byte | `I` | `E*` | `I` | | `I` | `I` | `I` | `I` | `I` | `I` | `I` | `I` | `I` | `I` | `N/A` | +| Short | `I` | `E*` | `E*` | `I` | | `I` | `I` | `I` | `I` | `I` | `I` | `I` | `I` | `I` | `N/A` | +| Int | `E` | `E*` | `E*` | `E*` | `E*` | | `I` | `I` | `I*` | `I` | `I` | `I` | `I` | `I` | `N/A` | +| Long | `E` | `E*` | `E*` | `E*` | `E*` | `E*` | | `I` | `E*` | `I*` | `I` | `I` | `I` | `E` | `N/A` | +| Long256 | `E` | `E*` | `E*` | `E*` | `E*` | `E*` | `E*` | | `E*` | `E*` | `N/A` | `E*` | `E*` | `E*` | `N/A` | +| Float | `E` | `N/A` | `E*` | `E*` | `E*` | `I*` | `I*` | `I*` | | `I` | `E*` | `I*` | `I*` | `I` | `N/A` | +| Double | `E` | `N/A` | `E*` | `E*` | `E*` | `E*` | `I*` | `I*` | `E*` | | `E*` | `I*` | `I*` | `E` | `N/A` | +| Decimal | `E` | `N/A` | `N/A` | `E!` | `E!` | `E!` | `E!` | `N/A` | `E*` | `E*` | | `N/A` | `N/A` | `N/A` | `N/A` | +| Date | `E` | `E*` | `E*` | `E*` | `E*` | `E*` | `I` | `I` | `E*` | `I*` | `N/A` | | `I` | `E` | `N/A` | +| Timestamp | `E` | `E*` | `E*` | `E*` | `E*` | `E*` | `I` | `I` | `E*` | `I*` | `N/A` | `I*` | | `E` | `N/A` | +| Symbol | `I` | `E` | `E` | `E` | `E` | `E` | `I` | `I` | `E` | `I` | `N/A` | `I` | `I` | | `N/A` | +| Binary | `N/A` | `N/A` | `N/A` | `N/A` | `N/A` | `N/A` | `N/A` | `N/A` | `N/A` | `N/A` | `N/A` | `N/A` | `N/A` | `N/A` | `N/A` | +""" + +# --- Parse Markdown into matrix --- +lines = [line.strip() for line in markdown_table.strip().split("\n") if line.strip()] +header = [h.strip() for h in lines[0].split("|")[1:-1]] # remove leading/trailing pipe +rows = [] + +matrix = {} +for line in lines[2:]: # skip header + separator + parts = [p.strip().strip("`") for p in line.split("|")[1:-1]] + row_name, values = parts[0], parts[1:] + matrix[row_name] = dict(zip(header[1:], values)) # map To-cols to values + rows.append(row_name) + +cols = header[1:] + +# --- Color mapping --- +color_map = { + "": "white", + "I": "#a6d96a", # implicit + "E": "#fdae61", # explicit + "I*": "#313695", # implicit precision loss + "E*": "#d73027", # explicit precision loss + "E!": "#e08214", # explicit warning + "N/A": "#7f7f7f" # grey +} + +# --- Plot --- +fig, ax = plt.subplots(figsize=(5, 4)) + +for i, row in enumerate(rows): + for j, col in enumerate(cols): + val = matrix[row].get(col, "") + ax.add_patch(plt.Rectangle( + (j, i), 1, 1, + facecolor=color_map.get(val, "white"), + edgecolor="black", linewidth=0.5 + )) + +# Configure ticks +ax.set_xticks([i + 0.5 for i in range(len(cols))]) +ax.set_yticks([i + 0.5 for i in range(len(rows))]) +ax.set_xticklabels(cols, rotation=90) +ax.set_yticklabels(rows) + +# Put X axis labels at the top +ax.xaxis.tick_top() + +ax.set_xlim(0, len(cols)) +ax.set_ylim(0, len(rows)) +ax.invert_yaxis() +ax.set_aspect("equal") + +# Axis labels +ax.set_xlabel("TO", labelpad=20) +ax.xaxis.set_label_position("top") + +ax.set_ylabel("FROM", labelpad=20) +ax.yaxis.set_label_position("left") + +# Legend centered at bottom +legend_elements = [ + mpatches.Patch(facecolor="#a6d96a", edgecolor="black", label="Implicit"), + mpatches.Patch(facecolor="#fdae61", edgecolor="black", label="Explicit"), + mpatches.Patch(facecolor="#313695", edgecolor="black", label="Implicit (precision loss)"), + mpatches.Patch(facecolor="#d73027", edgecolor="black", label="Explicit (precision loss)"), + mpatches.Patch(facecolor="#7f7f7f", edgecolor="black", label="N/A") +] +ax.legend(handles=legend_elements, + bbox_to_anchor=(0.5, -0.05), loc="upper center", ncol=3, frameon=False) + +# Save directly to file +path = "../static/images/docs/castmap.jpg" +plt.savefig(path, dpi=300, bbox_inches="tight") +plt.close() +print("file written to: " + path) diff --git a/static/images/docs/castmap.jpg b/static/images/docs/castmap.jpg index 920118f31..a96ad001f 100644 Binary files a/static/images/docs/castmap.jpg and b/static/images/docs/castmap.jpg differ diff --git a/static/images/docs/diagrams/.railroad b/static/images/docs/diagrams/.railroad index 6f011c785..ae0ee44b0 100644 --- a/static/images/docs/diagrams/.railroad +++ b/static/images/docs/diagrams/.railroad @@ -99,7 +99,7 @@ typeDef ::= ( 'boolean' | 'byte' | 'short' | 'char' | 'int' | 'float' | ('symbol' (('CAPACITY' distinctValueEstimate) ('CACHE' | 'NOCACHE')?)? inlineIndexDef?) | - 'varchar' | 'string' | 'long' | 'date' | 'timestamp' | 'double' | 'binary' | 'long256' | 'geohash()' + 'varchar' | 'string' | 'long' | 'date' | 'timestamp' | 'double' | 'binary' | 'long256' | 'geohash()' | 'decimal(, )' ) indexDef diff --git a/static/images/docs/diagrams/columnTypeDef.svg b/static/images/docs/diagrams/columnTypeDef.svg deleted file mode 100644 index 056784c3a..000000000 --- a/static/images/docs/diagrams/columnTypeDef.svg +++ /dev/null @@ -1 +0,0 @@ -booleanbyteshortcharintfloatsymbolCAPACITYdistinctValueEstimateCACHENOCACHEinlineIndexDefstringlongdatetimestampdoublebinarylong256geohash(<size>) \ No newline at end of file diff --git a/static/images/docs/diagrams/createTableDef.svg b/static/images/docs/diagrams/createTableDef.svg index 16d59812b..a17e6a669 100644 --- a/static/images/docs/diagrams/createTableDef.svg +++ b/static/images/docs/diagrams/createTableDef.svg @@ -1,4 +1,4 @@ - + \ No newline at end of file diff --git a/static/images/docs/diagrams/declare.svg b/static/images/docs/diagrams/declare.svg index 367699a4c..86040c80d 100644 --- a/static/images/docs/diagrams/declare.svg +++ b/static/images/docs/diagrams/declare.svg @@ -1,91 +1,55 @@ - - - - - - DECLARE - - - @variable - - - := - - - - expression - - - - , - - - - withExpr - - - - - selectExpr - - - - - + + + + + DECLARE + + + @variable + + + := + + + expression + + , + + + withExpr + + + selectExpr + + + \ No newline at end of file diff --git a/static/images/docs/diagrams/fromTo.svg b/static/images/docs/diagrams/fromTo.svg index d55ce75e1..8c18e63fc 100644 --- a/static/images/docs/diagrams/fromTo.svg +++ b/static/images/docs/diagrams/fromTo.svg @@ -1,90 +1,53 @@ - - - - - - - someSampleBySelectQuery... - - - - FROM - - - - lowerBound - - - - TO - - - - upperBound - - - - TO - - - - upperBound - - - - - + + + + + someSampleBySelectQuery... + + FROM + + + lowerBound + + TO + + + upperBound + + TO + + + upperBound + + + \ No newline at end of file diff --git a/static/images/docs/diagrams/innerLeftJoin.svg b/static/images/docs/diagrams/innerLeftJoin.svg new file mode 100644 index 000000000..fd169f467 --- /dev/null +++ b/static/images/docs/diagrams/innerLeftJoin.svg @@ -0,0 +1,80 @@ + + + + + + + + + INNER + + + LEFT + + + JOIN + + + table + + ( + + + sub-query + + ) + + + ON + + + column + + + operator + + + anotherColumn + + AND + + + ( + + + column + + , + + + ) + + + + \ No newline at end of file diff --git a/static/images/docs/diagrams/insertInto.svg b/static/images/docs/diagrams/insertInto.svg index 9e634f9c5..69cefeeb8 100644 --- a/static/images/docs/diagrams/insertInto.svg +++ b/static/images/docs/diagrams/insertInto.svg @@ -1,4 +1,4 @@ - + - - - - - INSERT - - - batch - - - batchCount - - INTO - - - tableName - - ( - - - columnName - - , - - - ) - - - VALUES - - - ( - - - value - - , - - - ) - - - , - - - SELECT - - - queryDef - - - mainQuery - - ; - - - + + + + + INSERT + + + ATOMIC + + + BATCH + + + number + + o3MaxLag + + + numberWithUnit + + INTO + + + tableName + + ( + + + columnName + + , + + + ) + + + VALUES + + + ( + + + value + + , + + + ) + + + , + + + SELECT + + + queryDef + + + mainQuery + + ; + + + \ No newline at end of file diff --git a/static/images/docs/diagrams/typeDef.svg b/static/images/docs/diagrams/typeDef.svg index c0301dd23..c1ddb869d 100644 --- a/static/images/docs/diagrams/typeDef.svg +++ b/static/images/docs/diagrams/typeDef.svg @@ -1,4 +1,4 @@ - + \ No newline at end of file diff --git a/static/images/docs/diagrams/update.svg b/static/images/docs/diagrams/update.svg index abfbff034..1b550106f 100644 --- a/static/images/docs/diagrams/update.svg +++ b/static/images/docs/diagrams/update.svg @@ -1,4 +1,4 @@ - + - - - - - UPDATE - - tableName - - - SET - - - columnName - - = - - - expression - - - , - - - + + + + + UPDATE + + tableName + + + SET + + + columnName + + = + + + expression + + + , + + + FROM + + joinTable1 + + + JOIN + + joinTable2 + + + ON + + joinCondition2 + + + WHERE + + filter + + + \ No newline at end of file diff --git a/static/images/docs/diagrams/withAsInsert.svg b/static/images/docs/diagrams/withAsInsert.svg index ff4e54201..cba7937de 100644 --- a/static/images/docs/diagrams/withAsInsert.svg +++ b/static/images/docs/diagrams/withAsInsert.svg @@ -1,183 +1,87 @@ - - - - - - WITH - - - - subQueryName - - - - AS - - - ( - - - - subQuery - - - - ) - - - , - - - INSERT - - - ATOMIC - - - BATCH - - - - number - - - - o3MaxLag - - - - numberWithUnit - - - - INTO - - - - tableName - - - - SELECT - - - - queryDef - - - - - subQueryName - - - - ; - - - - + + + + + WITH + + + subQueryName + + AS + + + ( + + + subQuery + + ) + + + , + + + INSERT + + + ATOMIC + + + BATCH + + + number + + o3MaxLag + + + numberWithUnit + + INTO + + + tableName + + SELECT + + + queryDef + + + subQueryName + + ; + + + + \ No newline at end of file