Tables On Disk

Question: A table can be written to disk as a single serialized data file, a splayed table, or a partitioned table. As a rule of thumb, a data file suits small tables (under 1 million rows), a splayed table suits medium tables (under 100 million rows), and a partitioned table suits large tables (over 100 million rows). Given the table 't' below, save it as a file 'data', a splayed table 'splay', and a date-partitioned table 't' whose partition is the date found in the data. All of these files and directories should sit in the same directory.

More Information:

https://code.kx.com/q4m3/14_Introduction_to_Kdb+/#140-overview

Example

q)t:([]time:2020.01.01+asc 10?.z.T;sym:10?`AAPL`IBM;px:10?200f;size:10?200j)

/ your statements

q)\l .
q)\ls -l
"total 16"
"drwxr-xr-x 3 alvi alvi 4096 Jun  5 06:38 2020.01.01"
"-rw-r--r-- 1 alvi alvi  344 Jun  5 06:58 data"
"drwxr-xr-x 2 alvi alvi 4096 Jun  5 06:59 splay"
"-rw-r--r-- 1 alvi alvi   17 Jun  5 06:21 sym"
q)select from data / file
time                          sym  px       size
------------------------------------------------
2020.01.01D01:37:00.256000000 IBM  155.4261 0
2020.01.01D02:19:02.023000000 AAPL 16.45375 197
2020.01.01D02:41:42.304000000 AAPL 102.6404 113
2020.01.01D02:41:53.390000000 IBM  98.95657 192
2020.01.01D03:45:55.156000000 AAPL 173.3113 74
2020.01.01D05:24:32.733000000 IBM  128.2995 55
2020.01.01D06:09:51.007000000 AAPL 181.6542 150
2020.01.01D06:36:18.791000000 IBM  195.9219 51
2020.01.01D06:56:52.410000000 AAPL 61.54981 4
2020.01.01D06:58:25.413000000 AAPL 73.04546 46
q)select from splay / splayed
time                          sym  px       size
------------------------------------------------
2020.01.01D01:37:00.256000000 IBM  155.4261 0
2020.01.01D02:19:02.023000000 AAPL 16.45375 197
2020.01.01D02:41:42.304000000 AAPL 102.6404 113
2020.01.01D02:41:53.390000000 IBM  98.95657 192
2020.01.01D03:45:55.156000000 AAPL 173.3113 74
2020.01.01D05:24:32.733000000 IBM  128.2995 55
2020.01.01D06:09:51.007000000 AAPL 181.6542 150
2020.01.01D06:36:18.791000000 IBM  195.9219 51
2020.01.01D06:56:52.410000000 AAPL 61.54981 4
2020.01.01D06:58:25.413000000 AAPL 73.04546 46
q)select from t where date=2020.01.01 / partitioned
date       time                          sym  px       size
-----------------------------------------------------------
2020.01.01 2020.01.01D01:37:00.256000000 IBM  155.4261 0
2020.01.01 2020.01.01D02:19:02.023000000 AAPL 16.45375 197
2020.01.01 2020.01.01D02:41:42.304000000 AAPL 102.6404 113
2020.01.01 2020.01.01D02:41:53.390000000 IBM  98.95657 192
2020.01.01 2020.01.01D03:45:55.156000000 AAPL 173.3113 74
2020.01.01 2020.01.01D05:24:32.733000000 IBM  128.2995 55
2020.01.01 2020.01.01D06:09:51.007000000 AAPL 181.6542 150
2020.01.01 2020.01.01D06:36:18.791000000 IBM  195.9219 51
2020.01.01 2020.01.01D06:56:52.410000000 AAPL 61.54981 4
2020.01.01 2020.01.01D06:58:25.413000000 AAPL 73.04546 46
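
The directory listing above can also be cross-checked from inside q. The following is a small sketch, assuming the three objects have already been written and the directory has been loaded with \l . as in the session above; it only reads from disk and from the .Q namespace, and the comments describe what each call is expected to report rather than exact output.

```q
/ run at the q prompt after \l . in the database root
key `:.            / entries in the root: the date partition, data, splay, sym
key `:splay        / one file per column, plus the .d file
get `:splay/.d     / column order recorded for the splayed table
.Q.pf              / partition field; `date for a date-partitioned database
.Q.pv              / partition values found on disk
.Q.pt              / names of the partitioned tables
type get `:data    / 98h, since the file deserializes to a plain table
```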

Solution
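
No solution text is included here, so what follows is one possible approach rather than the reference answer. It is a minimal sketch that assumes q was started in an empty working directory, that the statements are typed at the prompt (not loaded from a script stored in that same directory), and that every symbol column is enumerated against a sym file in the root. The name savePart is a hypothetical helper introduced only for illustration.

```q
/ table from the question
t:([]time:2020.01.01+asc 10?.z.T;sym:10?`AAPL`IBM;px:10?200f;size:10?200j)

/ 1. single serialized file: file handle without a trailing slash
`:data set t

/ 2. splayed table: trailing slash on the handle,
/    symbol columns enumerated against ./sym by .Q.en
`:splay/ set .Q.en[`:.] t

/ 3. date partition: one <date>/t/ directory per date found in the time column
/    savePart is a hypothetical helper name, not a library function
savePart:{[d] (hsym `$string[d],"/t/") set .Q.en[`:.] select from t where d=`date$time}
savePart each distinct `date$t`time

/ map the directory; from here on t refers to the partitioned table
\l .
```

Writing the partition with set keeps the rows in their original time order, which matches the expected output above. .Q.dpft is a common alternative for writing a partition, but it sorts each partition by the chosen field and applies the parted attribute, so the row order would differ.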

Tags:
tables