Interaction between peach and other optimisations

erichards
New Contributor II

I understand there are various parallel optimisations that happen under the hood when running with some number of secondary threads, e.g. summing across multiple partitions. How do these interact with peach?

 

For example:

disk0/hdb/par.txt --> disk1/hdb/partitions , disk2/hdb/partitions

disk1/hdb/partitions/1-3-5

disk2/hdb/partitions/2-4-6

If I ran a query such as

select sum price by sym where int within (1;4)

and I had two secondary threads available, thread #1 would retrieve data from partitions 1, 3 on disk 1, and thread #2 would retrieve data from partitions 2, 4 on disk 2 to maximise I/O throughput.

 

But if my queries were wrapped in peach, would this still be possible, given peach would be using all available threads, e.g.

{x[]} peach (
    {select sum price by sym where int within (1;4)}; 
    {select sum price by sym where int within (5;6)}
)

 

So are there situations when using peach can reduce performance? Thank you

1 ACCEPTED SOLUTION

The parallelism can only go one layer deep.

i.e. these two statements end up executing the same path. In the first one, the inner `peach` can only run like an `each`, as it is already in a thread:

 

 

 

data:8#enlist til 1000000

\ts {{neg x} peach x} peach data
553 1968
\ts {{neg x} each x} peach data
551 1936
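As a quick sanity check, the sketch below confirms that the two forms return the same result and shows how to see how many secondary threads the session has. It assumes q was started with secondary threads (for example `q -s 4`); the names `d`, `r1` and `r2` are only illustrative.

```q
/ number of secondary threads available to peach
/ (with 0, peach simply behaves like each)
system"s"

/ smaller data than above so the check runs quickly
d:8#enlist til 1000

/ nested peach versus each inside an outer peach
r1:{{neg x} peach x} peach d
r2:{{neg x} each x} peach d

/ identical results: only the outer peach distributes work
r1~r2
```

The timings will differ between machines, but the results should always match.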

 

 

 

For queries, map-reduce will still be used to reduce the memory load of your nested queries when they run inside a `peach`, even though the sub-parts are not run in parallel.

https://code.kx.com/q4m3/14_Introduction_to_Kdb%2B/#1437-map-reduce 
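To illustrate the idea only, here is a hand-rolled sketch of map-reduce for an average over three made-up "partitions". It is not how kdb+ implements it internally; the list `parts` and the name `partials` are assumptions for the example.

```q
/ three hypothetical partitions of a price column
parts:(1 2 3f;4 5f;6 7 8 9f)

/ map: compute a partial (sum;count) per partition
partials:{(sum x;count x)} each parts

/ reduce: combine the partials into the overall average
(sum partials[;0])%sum partials[;1]

/ same answer as averaging the fully materialised column
avg raze parts
```

Only the small partial results need to be held together, which is where the memory saving comes from.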

 

 

Where you choose to put your `peach` can be important and change the performance of your execution.

 

My example actually runs better without `peach`, because the overhead of passing data around outweighs the cost of `neg`, which is a simple operation:

\ts {{neg x} each x} each data
348 91498576

 

`.Q.fc` exists to help in these cases:

\ts {.Q.fc[{neg x};x]} each data
19 67110432

https://code.kx.com/q/ref/dotq/#fc-parallel-on-cut   
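A minimal sketch of what `.Q.fc` gives you: it cuts the vector into chunks, applies the function to each chunk in parallel and joins the results back together, so the output should match applying the function to the whole vector. The names `v` and `r` are illustrative, and without secondary threads it simply applies the function directly.

```q
/ a long vector to work on
v:til 1000000

/ apply neg chunk-wise in parallel, then join the chunks back together
r:.Q.fc[{neg x};v]

/ same result as applying neg to the whole vector
r~neg v
```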

 

And in fact, since `neg` has native multithreading and operates on vectors and vectors of vectors, it is best left on its own:

\ts neg each data
5 67109216


\ts neg data
5 67109104
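Because `neg` is atomic, it reaches into nested lists by itself, so all of these forms should agree. A small sketch with illustrative data (the name `d` is an assumption):

```q
/ a small list of vectors
d:3#enlist til 5

/ neg applied directly to the nested list matches explicit iteration
(neg d)~neg each d

/ and matches the peach version too
(neg d)~{neg x} peach d
```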

 

This example is of course extreme, but it does show that each use case deserves some thought about where to iterate and where to place `peach`.


3 REPLIES

erichards
New Contributor II

I guess a more succinct version of my question is "what happens to native parallelisations when running queries inside an instance of peach?"

Many thanks for the reply and examples.

 

in fact, since `neg` has native multithreading and operates on vectors and vectors of vectors, it is best left on its own


This is what I was keen to understand, and it's useful to know that there are cases when you may be better off without peach.