Commit 84cf35c

[FSTORE-1817] Online row size validation adds too much overhead for varbinary (#507)
1 parent 7c0c041 commit 84cf35c

1 file changed: 13 additions, 1 deletion

docs/user_guides/fs/feature_group/data_types.md

Lines changed: 13 additions & 1 deletion
@@ -139,10 +139,22 @@ The byte size of each column is determined by its data type and calculated as fo
 | VARCHAR(LENGTH) | LENGTH * 4 |
 | VARCHAR(LENGTH) charset latin1; | LENGTH * 1 |
 | TEXT | 256 |
-| VARBINARY(LENGTH) | LENGTH / 1.4 |
+| VARBINARY(LENGTH) | LENGTH |
 | BLOB | 256 |
 | other | 8 |

+!!! note "VARCHAR / VARBINARY overhead"
+
+    For VARCHAR and VARBINARY data types, an additional 1 byte is required if the size is less than 256 bytes. If the size is 256 bytes or greater, 2 additional bytes are required.
+
+    Memory allocation is performed in groups of 4 bytes. For example, a VARBINARY(100) requires 104 bytes of memory:
+
+    - 100 bytes for the data itself
+    - 1 byte of overhead
+    - Total = 101 bytes
+
+    Since memory is allocated in 4-byte groups, storing 101 bytes requires 26 groups (26 × 4 = 104 bytes) of allocated memory.
+

 #### Pre-insert schema validation for online feature groups
 For online enabled feature groups, the dataframe to be ingested needs to adhere to the online schema definitions. The input dataframe is validated for schema checks accordingly.
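
The sizing rule added in this change (base size, plus 1 or 2 bytes of overhead, rounded up to a multiple of 4) can be checked with a short Python sketch. The function names below are hypothetical and are not part of the Hopsworks API. The sketch also assumes that the 256-byte threshold applies to the column's byte size from the table (for example `LENGTH * 4` for a default-charset VARCHAR), which the note does not state explicitly.

```python
def varlen_overhead(size_bytes: int) -> int:
    """Overhead per the note: 1 byte below 256 bytes, otherwise 2 bytes."""
    return 1 if size_bytes < 256 else 2


def allocated_bytes(size_bytes: int, group: int = 4) -> int:
    """Add the overhead, then round up to a whole number of 4-byte groups."""
    total = size_bytes + varlen_overhead(size_bytes)
    return (total + group - 1) // group * group


def varbinary_bytes(length: int) -> int:
    """VARBINARY(LENGTH) has a base size of LENGTH bytes."""
    return allocated_bytes(length)


def varchar_bytes(length: int, bytes_per_char: int = 4) -> int:
    """VARCHAR(LENGTH): LENGTH * 4 by default, LENGTH * 1 for charset latin1.

    Assumption: the overhead threshold is evaluated on the byte size,
    not on the declared character length.
    """
    return allocated_bytes(length * bytes_per_char)


if __name__ == "__main__":
    # Reproduces the worked example from the note: 100 + 1 = 101 -> 104.
    print(varbinary_bytes(100))
```

Under these assumptions, `varbinary_bytes(100)` gives 104, matching the example in the note, and `varbinary_bytes(256)` gives 260 (256 + 2 = 258, rounded up to the next multiple of 4).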
