cancel
Showing results for 
Search instead for 
Did you mean: 

xbar & timestamp manipulation

JP
New Contributor III

Hi,

I have a minute <DT(timestamp),O,H,L,C> table <tt>, from which I need to aggregate/process data for different intervals. In my case, as I need the equivalent of a <binr> (eg. hourly data from 10:00 to 11:00 must show on the output DT line yyyy.mm.ddD11:00:00), I use an offset value in the <select> statement.

No problem for any interval <= daily and weekly, however I don't understand what is going on for a 'non-standard' 2-day interval. Below is some basic data around 2013 new-year (01/01/13 is a Tuesday).

/Daily interval... OK
10#select first O,last C by DT:(1 xbar DT.date)+0D16:00 from tt       
DT                           | O       C      
-----------------------------| ---------------
2012.12.31D16:00:00.000000000| 1405.22 1425.69
2013.01.02D16:00:00.000000000| 1426.19 1461.36
2013.01.03D16:00:00.000000000| 1462.42 1459.07
2013.01.04D16:00:00.000000000| 1459.37 1466.1 
2013.01.07D16:00:00.000000000| 1466.47 1461.77
2013.01.08D16:00:00.000000000| 1461.89 1457.05
2013.01.09D16:00:00.000000000| 1457.15 1461.04
2013.01.10D16:00:00.000000000| 1461.02 1471.99
2013.01.11D16:00:00.000000000| 1472.12 1471.71
2013.01.14D16:00:00.000000000| 1472.05 1470.79

/2-Day interval... odd output
5#select first O, last C by DT:(2 xbar DT.date)+1D16:00 from tt
DT                           | O       C      
-----------------------------| ---------------
2012.12.31D16:00:00.000000000| 1405.22 1425.69  	/ok
2013.01.02D16:00:00.000000000| 1426.19 1459.07 		/close?... should be 1461.36 (from 01.02)
2013.01.04D16:00:00.000000000| 1459.37 1466.1  		/open? ... should be 1462.42 (from 01.03)
2013.01.06D16:00:00.000000000| 1466.47 1461.77		/Sunday data? taken from following Monday?
2013.01.08D16:00:00.000000000| 1461.89 1461.04 		/open? ... should be 1466.37 (from 01.07)

Questions:
- Isn't <xbar> expected to use ONLY the available DT.dates to build its bins, like it does for minute/hourly/daily/weekly intervals?

- How should the <select> be built to yield, for either one of the following cases:

a) if using 2-day buckets of calendar days, including when market is closed (no data): 
DT                           | O       C      
-----------------------------| ---------------
2012.12.31D16:00:00.000000000| 1405.22 1425.69 		/from 12.31 to 01.01 (nothing on 01.01, so shows on 12.31)
2013.01.03D16:00:00.000000000| 1426.19 1459.07 		/from 01.02 to 01.03
2013.01.04D16:00:00.000000000| 1459.37 1466.1  		/from 01.04 to 01.05 (nothing on 01.05, so shows on 01.04)
2013.01.07D16:00:00.000000000| 1466.47 1461.77  	/from 01.06 to 01.07 (nothing on 01.06)
2013.01.09D16:00:00.000000000| 1461.89 1461.04 		/from 01.08 to 01.09  
NOTE:
In the above example, 2012.12.31 is 'day 0'; if 2012.12.31 is 'day 1', obviously different results, but same concept would apply 

b) OR... what I would prefer, if using 2-day buckets of available data: 
DT                           | O       C      
-----------------------------| ---------------
2013.01.02D16:00:00.000000000| 1405.22 1461.36 		/from 12.31 to 01.02 (nothing on 01.01) 
2013.01.04D16:00:00.000000000| 1462.42 1466.1 		/from 01.03 to 01.04 
2013.01.08D16:00:00.000000000| 1466.47 1457.05 		/from 01.07 to 01.08 (01.05 and 06 is a week-end)
2013.01.10D16:00:00.000000000| 1457.15 1471.99 		/from 01.09 to 01.10 
2013.01.14D16:00:00.000000000| 1472.12 1470.79 		/from 01.11 to 01.14 (01.12 and 13 is a week-end)

Unless there is a relatively standard way to present data for 'n-day' intervals that I am unaware of, either one would be fine with me, as long as I understand how it works(!)

Thx
JP

 

2 REPLIES 2

cillianreilly
New Contributor III

I'm not sure what your complete data set looks like - but using a best guess, I get an expected result.

q)tt:([]DT:`timestamp$2012.12.31 2013.01.02 2013.01.03 2013.01.04 2013.01.07 2013.01.08 2013.01.09 2013.01.10 2013.01.11 2013.01.14;O:1405.22 1426.19 1462.42 1459.37 1466.47 1461.89 1457.15 1461.02 1472.12 1472.05;C:1425.69 1461.36 1459.07 1466.1 1461.77 1457.05 1461.04 1471.99 1471.71 1470.79)
q)select first O,last C by DT:(2 xbar DT.date)+1D16:00 from tt                                                     DT                           | O       C
-----------------------------| ---------------
2013.01.01D16:00:00.000000000| 1405.22 1425.69    // open 31st, close 31st
2013.01.03D16:00:00.000000000| 1426.19 1459.07    // open 2nd, close 3rd
2013.01.05D16:00:00.000000000| 1459.37 1466.1     // open 4th, close 4th
2013.01.07D16:00:00.000000000| 1466.47 1461.77    // open 7th, close 7th
2013.01.09D16:00:00.000000000| 1461.89 1461.04    // open 8th, close 9th
2013.01.11D16:00:00.000000000| 1461.02 1471.71    // open 10th, close 11th
2013.01.15D16:00:00.000000000| 1472.05 1470.79    // open 14th, close 14th

I also don't see how you can have 2012.12.31D16:00:00 in both results (1 day/2 day buckets), as it shouldn't appear in the 2-day bucket query (unless you have some other row dated 2012.12.30 in your table you're not showing us?)

q)(2 xbar 2012.12.31)+0D16:00 1D16:00
2012.12.31D16:00:00.000000000 2013.01.01D16:00:00.000000000

This looks like some data quality issue somewhere rather than issue with xbar - if you can provide a small stand alone sample of your data that replicates the problem, I can take a look using that.

JP
New Contributor III

Feeling embarrassed in here... This is entirely my mistake; no data missing, but faulty statement transcription in the forum's editor, while playing with the offset. The 2-day interval statement select does indeed yield the correct output.

Thank you for your time.