cancel
Showing results for 
Search instead for 
Did you mean: 

Performance using a table to combine where phrases

eohara_kdb
New Contributor III

Hi all,

Passing a table in to a where clause instead of the individual values significantly affects performance, even compared to using & to join all the where phrases into one. What is the query doing under the covers?

1#.eoh.f
date       SRC    PRICE_TYPE INDEX1 INDEX2 CONTRACT1 PROFILE CONTRACT_TYPE
--------------------------------------------------------------------------
2023.06.28 market dev  	     JPM           2023.10.M         BLA        

// PUBLICATIONS is date-partitioned, and doesn't have attributes on the columns we filter on

\ts select from PUBLICATIONS where ([]date;SRC;PRICE_TYPE;INDEX1;INDEX2;CONTRACT1;PROFILE;CONTRACT_TYPE) in 1#.eoh.f
1176 177568

\ts select from PUBLICATIONS where date=2023.06.28,src=`market,PRICE_TYPE=`dev,INDEX1=`JPM,INDEX2=`,CONTRACT1=`2023.10.M,PROFILE=`,CONTRACT_TYPE=`BLA
5 2101760

\ts select from PUBLICATIONS where (date=2023.06.28)&(src=`market)&(PRICE_TYPE=`dev)&(INDEX1=`JPM)&(INDEX2=`)&(CONTRACT1=`2023.10.M)&(PROFILE=`)&CONTRACT_TYPE=`BLA
531 179664

\ts select from PUBLICATIONS where (date in 2023.06.28)&(SRC in `market)&(PRICE_TYPE in `dev)&(INDEX1 in `JPM)&(INDEX2 in `)&(CONTRACT1 in `2023.10.M)&(PROFILE in `)&CONTRACT_TYPE in `BLA
504 179664

Using kdb+ 3.5 2017.10.11

Thanks,

Eoghan

1 ACCEPTED SOLUTION

davidcrossey
Moderator Moderator
Moderator

I would presume optimizations such as left to right sub-clause filtering as mentioned here https://code.kx.com/q4m3/9_Queries_q-sql/#9333-multiple-where-subphrases are lost when filtering using a table/dict lookup.

i.e. we need to check all of the columns specified in the lookup simlataneouly, instead of passing each consective filter to the next sub-phrase.

To visualize differently:

q)show tab:([]c1:`a`b`c;c2:1 2 3)
c1 c2
-----
a  1
b  2
c  3

q)tab in 1#tab
100b

q)select from tab where tab in 1#tab
c1 c2
-----
a  1

 Also worth having a look at the parse trees:

 

q)parse "select from PUBLICATIONS where date=2023.06.28,src=`market,PRICE_TYPE=`dev,INDEX1=`JPM,INDEX2=`,CONTRACT1=`2023.10.M,PROFILE=`,CONTRACT_TYPE=`BLA"
?
`PUBLICATIONS
,((=;`date;2023.06.28);(=;`src;,`market);(=;`PRICE_TYPE;,`dev);(=;`INDEX1;,`JPM);(=;`INDEX2;,`);(=;`CONTRACT1;,`2023.10.M);(=;`PROFILE;,`);(=;`CONTRACT_TYPE;,`BLA))
0b
()

q)parse "select from PUBLICATIONS where ([]date;SRC;PRICE_TYPE;INDEX1;INDEX2;CONTRACT1;PROFILE;CONTRACT_TYPE) in 1#.eoh.f"
?
`PUBLICATIONS
,,(in;(+:;(!;,`date`SRC`PRICE_TYPE`INDEX1`INDEX2`CONTRACT1`PROFILE`CONTRACT_TYPE;(enlist;`date;`SRC;`PRICE_TYPE;`INDEX1;`INDEX2;`CONTRACT1;`PROFILE;`CONTRACT_TYPE)));(#;1;`.eoh.f))
0b
()

 

View solution in original post

2 REPLIES 2

gyorokpeter-kx
Contributor
Contributor

If you have a list of where clauses separated by a comma, it is executed sequentially, successively reducing the number of rows processed. On the other hand if you use the & operator, all operations will run on the full table.

davidcrossey
Moderator Moderator
Moderator

I would presume optimizations such as left to right sub-clause filtering as mentioned here https://code.kx.com/q4m3/9_Queries_q-sql/#9333-multiple-where-subphrases are lost when filtering using a table/dict lookup.

i.e. we need to check all of the columns specified in the lookup simlataneouly, instead of passing each consective filter to the next sub-phrase.

To visualize differently:

q)show tab:([]c1:`a`b`c;c2:1 2 3)
c1 c2
-----
a  1
b  2
c  3

q)tab in 1#tab
100b

q)select from tab where tab in 1#tab
c1 c2
-----
a  1

 Also worth having a look at the parse trees:

 

q)parse "select from PUBLICATIONS where date=2023.06.28,src=`market,PRICE_TYPE=`dev,INDEX1=`JPM,INDEX2=`,CONTRACT1=`2023.10.M,PROFILE=`,CONTRACT_TYPE=`BLA"
?
`PUBLICATIONS
,((=;`date;2023.06.28);(=;`src;,`market);(=;`PRICE_TYPE;,`dev);(=;`INDEX1;,`JPM);(=;`INDEX2;,`);(=;`CONTRACT1;,`2023.10.M);(=;`PROFILE;,`);(=;`CONTRACT_TYPE;,`BLA))
0b
()

q)parse "select from PUBLICATIONS where ([]date;SRC;PRICE_TYPE;INDEX1;INDEX2;CONTRACT1;PROFILE;CONTRACT_TYPE) in 1#.eoh.f"
?
`PUBLICATIONS
,,(in;(+:;(!;,`date`SRC`PRICE_TYPE`INDEX1`INDEX2`CONTRACT1`PROFILE`CONTRACT_TYPE;(enlist;`date;`SRC;`PRICE_TYPE;`INDEX1;`INDEX2;`CONTRACT1;`PROFILE;`CONTRACT_TYPE)));(#;1;`.eoh.f))
0b
()