
Storing Price + Fundamental Data

r691175002
New Contributor
I'm fairly new to databases. My current system uses HDF5 and is really starting to hit its limits.

I have roughly 15 years of data for 1000 securities.  Each security has ~100 fields of varying frequency:

Daily: Open, Close, Adjusted close, Volume, VWAP, ...
Weekly: Short interest, Market cap, ...
Monthly: Industry/Sector, Financial statement data, ...

There will additionally be a large number of values that are calculated from this data (moving averages, ratios of fields, etc...).

The most common query will be for 30-50 fields re-sampled to a given frequency.


What should the tables look like? Is it okay to have 100+ columns in a table, or should I spread them out somehow?
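For context, the resampling query described above might look something like this in q (the table, column names, and sample data are hypothetical, just to illustrate the shape of the query):

```q
/ hypothetical daily table
daily:([] date:2015.06.01+til 10; sym:10#`IBM; close:100+10?1f; volume:10?1000)

/ resample to weekly: last close and total volume per sym per 7-day bucket
select last close, sum volume by sym, week:7 xbar date from daily
```

`xbar` buckets the dates into 7-day intervals, so the same query pattern works for any target frequency by changing the bucket size.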
2 REPLIES

effbiae
New Contributor
It's fine to have many columns; use whatever naturally fits your data.
Maybe start by directly exporting/importing CSVs.
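A minimal sketch of that CSV round trip in q (the file path, column names, and type string are assumptions for illustration):

```q
/ write a small table out as CSV
t:([] sym:`IBM`MSFT; close:168.5 46.2)
`:prices.csv 0: csv 0: t

/ read it back, declaring sym as symbol (S) and close as float (F),
/ with enlist csv meaning the first line is a header row
t2:("SF"; enlist csv) 0: `:prices.csv
```

Getting the type string right on import matters; everything you don't declare comes back as text.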

david_demner
New Contributor
Hi Ryan,

It's totally fine to have 100+ columns in a table. In a column store, a query only reads the columns it references, even if that's 30 or 40% of them.

You should start with all splayed tables (one per data frequency) since it doesn't sound like much data for kdb.
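Splaying a table to disk is a one-liner; a sketch, with the database path and schema as assumptions:

```q
/ hypothetical daily table
daily:([] date:2015.06.01+til 5; sym:5#`IBM; close:100+5?1f)

/ splay: each column is stored as its own file under db/daily/
/ .Q.en enumerates the symbol column against the db's sym file,
/ which is required before splaying any table with symbol columns
`:db/daily/ set .Q.en[`:db; daily]

/ load the database; queries then map only the column files they touch
\l db
select date, close from daily
```

The trailing slash on `` `:db/daily/ `` is what tells `set` to splay rather than write a single file.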

Your monthly and weekly tables could be stored as sparse daily tables and filled or aj'd to the desired frequency. That would be more flexible than, say, forcing market cap to be weekly when it really changes daily and you might want to track it as such for some large-cap stocks. Or maybe one company releases earnings annually, so it doesn't fit the monthly scheme you're assuming. You can easily convert between frequencies in kdb.
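The fill/aj idea above could be sketched like this, with hypothetical tables and values:

```q
/ daily prices and a sparse weekly market-cap table (in $bn)
daily:([] sym:10#`IBM; date:2015.06.01+til 10; close:100+10?1f)
weekly:([] sym:2#`IBM; date:2015.06.01 2015.06.08; mktcap:180.5 182.3)

/ as-of join: attach the most recent mktcap at or before each daily date
aj[`sym`date; daily; weekly]
```

Each daily row picks up the latest weekly value available as of that date, so the same join works unchanged if the market-cap series later becomes daily.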

Good luck,

David

From: Ryan Turner
Sent: Tuesday, June 16, 2015 19:33
To: personal-kdbplus@googlegroups.com
Reply To: personal-kdbplus@googlegroups.com
Subject: [personal kdb+] Storing Price + Fundamental Data
