Climate Data

joelpoplin
New Contributor

I'm curious whether kdb+/q would be an appropriate solution for spatial time series data. Climate data is typically 4-D and can be quite large. I would like to be able to do some in-memory computations and also query at much faster speeds than the current standard structures allow (GRIB or NetCDF files with Python xarray).

1 ACCEPTED SOLUTION

rocuinneagain
Valued Contributor

Yes, it could be used.

 

To test this, you could look at PyKX, which provides an easy Python interface.

A two-minute example of passing a Dataset to q is shown below.

PyKX allows registering custom conversions, so you could create a function that passes the Dataset to q in exactly the form you want, instead of passing it all as a dictionary as in my example.

 

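import pykx as kx
import xarray as xr
import numpy as np
import pandas as pd
# Build a small example Dataset: one 5x5 variable "foo" over dimensions x and y
ds = xr.Dataset(
    {"foo": (("x", "y"), np.random.rand(5, 5))},
    coords={
        "x": [10, 20, 30, 40, 50],
        "y": pd.date_range("2000-01-01", periods=5),
        "z": ("x", list("abcde")),
    },
)
# Convert the Dataset to a plain Python dict and assign it to the q variable ds
kx.q['ds'] = kx.toq(ds.to_dict())
# Inspect the resulting q dictionary
kx.q('ds')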
pykx.Dictionary(pykx.q('
coords   | `x`y`z!+`dims`attrs`data!((,`x;,`y;,`x);(()!();()!();()!());(10 20..
attrs    | ()!()
dims     | `x`y!5 5
data_vars| (,`foo)!+`dims`attrs`data!(,`x`y;,()!();,(0.7412575 0.2054306 0.10..
'))
# Pull the data of every coordinate and flip the result into a table
kx.q('flip ds[`coords;;`data]')
pykx.Table(pykx.q('
x  y                             z
----------------------------------
10 2000.01.01D00:00:00.000000000 a
20 2000.01.02D00:00:00.000000000 b
30 2000.01.03D00:00:00.000000000 c
40 2000.01.04D00:00:00.000000000 d
50 2000.01.05D00:00:00.000000000 e
'))
# Index into the nested dictionary to get the raw 5x5 values of foo
kx.q('ds[`data_vars;`foo;`data]')
pykx.List(pykx.q('
0.7412575 0.2054306   0.1009393 0.8792678 0.04105999
0.1811459 0.01659637  0.2406029 0.4900055 0.551788  
0.6303767 0.0702013   0.6831359 0.5961667 0.3722388 
0.9255059 0.9202499   0.5055902 0.9767793 0.7440498 
0.7331576 0.003197568 0.4939932 0.5433492 0.01175784
'))
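
As a rough illustration of the "exactly the form you wish" idea above, here is a minimal sketch of flattening a 4-D grid into long-form columns (one row per grid cell), which is a shape that maps naturally onto a q table. It uses only the Python standard library so it can be run anywhere. The function name flatten_grid, the dimension names, and the sample values are all hypothetical and not part of PyKX or xarray.

```python
from itertools import product


def flatten_grid(coords, values):
    """Flatten N-D nested lists into long-form columns.

    coords: dict mapping dimension name -> list of coordinate labels,
            given in the same order as the nesting of `values`.
    values: nested lists whose shape matches the coordinate lengths.
    Returns a dict of equal-length lists: one per dimension plus "value".
    """
    names = list(coords)
    columns = {name: [] for name in names}
    columns["value"] = []
    # Visit every cell of the grid in row-major order
    for index in product(*(range(len(coords[n])) for n in names)):
        cell = values
        for i in index:
            cell = cell[i]  # descend one nesting level per dimension
        for name, i in zip(names, index):
            columns[name].append(coords[name][i])
        columns["value"].append(cell)
    return columns


# Hypothetical 2 x 2 x 1 x 2 grid over (time, level, lat, lon)
coords = {
    "time": ["2000-01-01", "2000-01-02"],
    "level": [850, 500],
    "lat": [53.5],
    "lon": [-6.0, -5.5],
}
values = [
    [[[1.0, 2.0]], [[3.0, 4.0]]],
    [[[5.0, 6.0]], [[7.0, 8.0]]],
]

columns = flatten_grid(coords, values)
for name, column in columns.items():
    print(name, column)
```

A dict of equal-length lists like this could then be handed to kx.toq and flipped into a table in q, in the same way the coordinates are flipped in the example above. For real data you would probably not hand-roll this: xarray's Dataset.to_dataframe() produces a similar long-form layout, and a function along these lines is the kind of thing you might register as a custom conversion.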

 
