cancel
Showing results for 
Search instead for 
Did you mean: 

How to calculate correlation every few rows

Emily_Wilson
New Contributor
Hi All,

I have a table which looks like this:
num  sym1       sym2    sym3.    sym4
-----------------------------------------------------------------
1         1.5           -0.1.       9.4.            3.1
2         2.5          - 0.9        3.1             0.5
3         5.4            2.9.       -0.3.           1.0
4         0.5            0.1.       -0.4.           1.4
5         0.3            0.0         3.6.            0.4

How can I calculate the correlation between sym1 and sym2, sym1 and sym3, sym1 and sym4 for every 3 rows? 
(i.e, the correlation between sym1 row 1-3 and sym2/sym3/sym4 row 1-3, 
the correlation between sym1 row 2-4 and sym2/3/4 row 2-4,
the correlation between sym1 row 3-5 and sym2/3/4 row 3-5)

Any help will be very much appreciated!! 
4 REPLIES 4

dotsch
New Contributor
One way to do it:

select num, s1 cor' s2, s1 cor' s3, s1 cor' s4 from -2 _ select num, s1:sym1 i+\:0 1 2 ,s2:sym2 i+\:0 1 2 ,s3:sym3 i+\:0 1 2 ,s4:sym4 i+\:0 1 2 from t

Regards,
Andras


Thank you Andras!

matthew_clark
New Contributor
Hi Emily, 

I can refer you to this link here

Programming idioms – Knowledge Base – kdb+ and q documentation - Kdb+ and q documentation (kx.com)

which talks about sliding windows to aggregate over (primarily for monadic functions). I had a quick play around and have got a code that works based on some of this. However, I am sure you are likely to find some improvements to my code. Nevertheless, I am sure that this is enough to get you started.

// function to print out sliding window values
swin:{[w;s]{ 1_x,y }\[w#0;s]}

// function to calculate correlation between two columns from table
cf:{[w;table;syms]cor'[(w-1)_swin[w;(flip table)[syms[0]]];(w-1)_swin[w;(flip table)[syms[1]]]]}

You may wish to choose a better way to format the output, but here is some examples of how the output looks:

// Rolling correlation between sym1 and sym2 (for rows 1-3, 2-4 and 3-5)
q)cf[3;table;(`sym1`sym2)]
0.9002607 0.7809604 0.999991

// Rolling correlation for sym1 with each of sym2, sym3 and sym4 (for rows 1-3, 2-4 and 3-5)
q)cf[3;table]'[`sym1,'`sym2`sym3`sym4]
0.9002607  0.7809604   0.999991
-0.9025613 -0.08046509 -0.5109631
-0.5570187 -0.346547   0.1490308

Hope this gives you a good place to start.

All the best,
Matthew


Thank you so much Matthew! I will try it out!