compress.q
This library provides functions to assist with the compression of HDB splayed and partitioned tables.
Provides a symbol reference to a default compression mode for each supported compression type:
| Symbol | Compression | 
|---|---|
| `none | (0; 0; 0) | 
| `qipc | (17; 1; 0) | 
| `gzip | (17; 2; 7) | 
| `snappy | (17; 3; 0) | 
| `lz4hc | (17; 4; 9) | 
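As a rough illustration of how these triples are used, the sketch below builds a lookup equivalent to the table and passes one triple to `set`. The name `compressDefaults` and the path `/tmp/example/col` are hypothetical and are not part of the library. Each triple is read as (logicalBlockSize; algorithm; zipLevel), matching the column names reported by the statistics functions.

```q
/ Hypothetical lookup mirroring the table above (not the library's own variable)
compressDefaults:`none`qipc`gzip`snappy`lz4hc!(0 0 0;17 1 0;17 2 7;17 3 0;17 4 9)

/ Each value is (logicalBlockSize; algorithm; zipLevel)
compressDefaults`gzip    / 17 2 7

/ kdb+ accepts (file; logicalBlockSize; algorithm; zipLevel) on the left of set
/ qipc is used here because it needs no external compression library
(`:/tmp/example/col,compressDefaults`qipc) set til 100000
```

Joining the file symbol with the triple produces the four-item list that `set` expects when writing a compressed file.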
Provides the compression statistics (via -21!) for all columns in the specified splayed table folder, including any additional columns for nested lists or anymaps, and returns a table.
Any uncompressed column will have a null compressedLength value.
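For reference, the underlying `-21!` call can be tried directly on single files. This is a minimal sketch with illustrative paths under `/tmp/example`; it shows only the kdb+ primitive, not how the library assembles its result table.

```q
/ Write one uncompressed and one qipc-compressed vector (paths are illustrative)
`:/tmp/example/raw set til 100000
(`:/tmp/example/zipped;17;1;0) set til 100000

/ Compressed file: dictionary with compressedLength, uncompressedLength,
/ algorithm, logicalBlockSize and zipLevel
-21!`:/tmp/example/zipped

/ Uncompressed file: -21! returns an empty dictionary
-21!`:/tmp/example/raw
```

The empty dictionary for an uncompressed file is consistent with the null compressedLength shown for uncompressed columns.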
q) .compress.getSplayStats `:/tmp/hdb/2021.10.29/trade
column compressedLength uncompressedLength compressMode algorithm logicalBlockSize zipLevel
-------------------------------------------------------------------------------------------
time   8000560          8000016            qipc         1         17               0
sym    1182897          10101304           qipc         1         17               0
price  8000560          8000016            qipc         1         17               0
vol    3500240          8000016            qipc         1         17               0
q) .compress.getSplayStats `:/tmp/hdb/2021.01.23/trade
column compressedLength uncompressedLength compressMode algorithm logicalBlockSize zipLevel
-------------------------------------------------------------------------------------------
time                    16                 none         0         0                0
sym                     4096               none         0         0                0
price                   16                 none         0         0                0
vol                     16                 none         0         0                0
Provides the compression statistics (via -21!) for all columns in all tables within the specified partition of the specified HDB. The returned table is the same as that of .compress.getSplayStats, with part and table columns added.
Note that this function is not par.txt aware. If using a segmented HDB, the hdbRoot parameter should be the segment root.
q) select sum uncompressedLength by part, table from .compress.getPartitionStats[`:/tmp/hdb; 2021.01.23]
part       table    | uncompressedLength
--------------------| ------------------
2021.01.23 tbl      | 4392
2021.01.23 tbl10    | 40
2021.01.23 tbl2     | 40
2021.01.23 trade    | 4144
2021.01.23 tradeComp| 4144
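The same result table can be aggregated in other ways. The sketch below estimates an on-disk compression ratio per table; it assumes compress.q is loaded and that partition 2021.10.29 exists under the HDB root, and it relies only on the column names shown in the output above. The names `stats` and `ratio` are illustrative.

```q
/ Assumes compress.q is loaded and the partition exists under this HDB root
stats:.compress.getPartitionStats[`:/tmp/hdb; 2021.10.29]

/ Ratio below 1 means the table takes less space on disk than uncompressed
/ Uncompressed columns have a null compressedLength, which sum skips,
/ so a partly compressed table will report a ratio that is too low
select ratio:sum[compressedLength]%sum uncompressedLength by part, table from stats
```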
Compresses a splayed table.
- `compressType`: Either a symbol (one of `none`, `qipc`, `gzip`, `snappy`, `lz4hc`) or a 3-element integer list describing the compression type
- `options`: A dictionary of options to modify the function's behaviour
  - `recompress`: If true, any compressed files will be recompressed (default is `false`)
  - `inplace`: If true, `targetSplayPath` can be the same as `sourceSplayPath` (default is `false`)
  - `parallel`: Compress columns within the splay in parallel if none of the columns have `copy` write mode (default is `true`)
  - `dryrun`: If true, don't actually run the compression; just return the result table describing what would be done. Note the `dryrun` column in the result table will be true in this case (default is `false`)
  - `gc`: If true, perform a garbage collection after compression (default is `false`)

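As a hedged illustration of the options dictionary, the call below previews a run without compressing anything. It assumes compress.q is loaded and follows the argument order used in this README's own example (source path, target path, compression type, options). The paths and the names `opts` and `res` are illustrative.

```q
/ Preview only: with dryrun set, the result describes what would be done
opts:`dryrun`recompress!10b
res:.compress.splay[`:/tmp/hdb/2021.11.08/trade; `:/tmp/hdb/2021.11.08/tradeComp; `gzip; opts]

/ Count the planned action per write mode
select n:count i by writeMode from res
```

Options left out of the dictionary fall back to the defaults listed above.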
The function doesn't always compress every column in the splay. It returns a table describing the operation performed on each column; `writeMode` details what was done and why:
- `compress`: The file was compressed
  - The file is uncompressed, or is compressed and the `recompress` option is true
- `copy`: The file was copied (via the OS-specific copy command)
  - The file is either empty (`0 = count`) or is already compressed and the `recompress` option is missing or false
- `ignore`: The file was ignored
  - The file would have been copied (as above) but the operation is in place, so there is nothing to do
  - Additional files for nested lists should not be compressed directly; they are created when the primary list is compressed
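
The rules above can be sketched as a small decision function. This is not the library's implementation: `writeModeFor` and its boolean arguments are hypothetical, and the sketch assumes that the empty-file rule takes precedence when a file is both empty and uncompressed.

```q
/ Hypothetical helper mirroring the documented rules; not part of compress.q
/ Arguments are boolean atoms describing one column file and the options in effect
writeModeFor:{[isCompressed;isEmpty;isNestedExtra;recompress;inplace]
  $[isNestedExtra; `ignore;                                          / e.g. prices#, written with the primary list
    (not isEmpty) and (not isCompressed) or recompress; `compress;   / non-empty and (uncompressed or recompress)
    inplace; `ignore;                                                / would be a copy onto itself
    `copy]}

writeModeFor[0b;0b;0b;0b;0b]    / `compress: plain uncompressed column
writeModeFor[1b;0b;0b;0b;0b]    / `copy: already compressed, recompress off
writeModeFor[1b;0b;0b;0b;1b]    / `ignore: same as above but in place
```

q evaluates right to left, so the second condition reads as "not empty, and either uncompressed or recompress requested".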
 
q) .compress.getSplayStats `:/tmp/hdb/2021.11.08/trade
column  compressedLength uncompressedLength compressMode algorithm logicalBlockSize zipLevel
--------------------------------------------------------------------------------------------
time                     8016               none         0         0                0
sym                      44696              none         0         0                0
price                    8016               none         0         0                0
vol                      8016               none         0         0                0
prices                   12096              none         0         0                0
prices#                  30096              none         0         0                0
q) .compress.splay[`:/tmp/hdb/2021.11.08/trade; `:/tmp/hdb/2021.11.08/tradeComp; `lz4hc; ()!()]
...
column  source                             target                                 compressed inplace empty writeMode dryrun parallel time
---------------------------------------------------------------------------------------------------------------------------------------------------------
time    :/tmp/hdb/2021.11.08/trade/time    :/tmp/hdb/2021.11.08/tradeComp/time    0          0       0     compress  0      1        0D00:00:00.008962000
sym     :/tmp/hdb/2021.11.08/trade/sym     :/tmp/hdb/2021.11.08/tradeComp/sym     0          0       0     compress  0      1        0D00:00:00.010372000
price   :/tmp/hdb/2021.11.08/trade/price   :/tmp/hdb/2021.11.08/tradeComp/price   0          0       0     compress  0      1        0D00:00:00.009941000
vol     :/tmp/hdb/2021.11.08/trade/vol     :/tmp/hdb/2021.11.08/tradeComp/vol     0          0       0     compress  0      1        0D00:00:00.009440000
prices  :/tmp/hdb/2021.11.08/trade/prices  :/tmp/hdb/2021.11.08/tradeComp/prices  0          0       0     compress  0      1        0D00:00:00.004127000
prices# :/tmp/hdb/2021.11.08/trade/prices# :/tmp/hdb/2021.11.08/tradeComp/prices# 0          0       0     ignore    0      1
q) .compress.getSplayStats `:/tmp/hdb/2021.11.08/tradeComp
column  compressedLength uncompressedLength compressMode algorithm logicalBlockSize zipLevel
--------------------------------------------------------------------------------------------
time    8072             8016               lz4hc        4         17               9
sym     10907            44696              lz4hc        4         17               9
price   7011             8016               lz4hc        4         17               9
vol     4864             8016               lz4hc        4         17               9
prices  4115             12096              lz4hc        4         17               9
prices# 212              30096              lz4hc        4         17               9
Compresses multiple splayed tables within a HDB partition.
- `tbls`: Either a list of tables to compress, or `COMP_ALL` to compress all tables
- `options`: A dictionary of options to modify the function's behaviour
  - `recompress`: If true, any compressed files will be recompressed (default is `false`)
  - `inplace`: If true, `sourceRoot` can be the same as `targetRoot` (default is `false`)
  - `srcParTxt`: If true, any `par.txt` in `sourceRoot` will be used to find the specified partition (default is `true`)
  - `tgtParTxt`: If true, any `par.txt` in `targetRoot` will be used to write the specified partition (default is `true`)
  - `parallel`: Compress columns within the splay in parallel if none of the columns have `copy` write mode (default is `true`)
  - `dryrun`: If true, don't actually run the compression; just return the result table describing what would be done. Note the `dryrun` column in the result table will be true in this case (default is `false`)
  - `gc`: If true, perform a garbage collection after compression (default is `false`)

NOTE: There is no interaction with the sym file in the source or target HDBs with this function. It is expected that the sym file is shared across both the source and target.
The same information is returned as .compress.splay with part and table columns added.
Copyright (C) Sport Trades Ltd 2017 - 2020, John Keys and Jaskirat Rajasansir 2020 - 2024