cancel
Showing results for 
Search instead for 
Did you mean: 

About HDB maintenance

carfield1
New Contributor
Hi all, as far as I aware, there is no command to delete some records in HDB, I have to load all records to memory, modify it, and save them back, is that correct?

Or I just miss some tools that already available?
5 REPLIES 5

wp
New Contributor
correct

manni_patel
New Contributor

KP
New Contributor
You can load all records into memory or modify the splayed column files directly. It's a bit fiddly as for this you need to know the row index of the data you're no longer interested in. Here's an example:

// some random data
n
:1000;
mytbl
:([]id:n?0Ng;name:n?`foo`bar`baz;sz:n?50;px:n?100.0);
.Q.dpft[`
:db;2015.07.25;`name;`mytbl];

mytbl
:([]id:n?0Ng;name:n?`foo`bar`baz;sz:n?50;px:n?100.0);
.Q.dpft[`
:db;2015.07.26;`name;`mytbl];

// load hdb
\l db

// select data from mytbl table
show
select count i, t:wsum[sz;px] by date, name from mytbl

// get rid of name=`bar in 2015.07.25 partition
idx
:1+(select i from mytbl where date = 2015.07.25, name = `bar)`x

// drop from the list (name) and enumerate sym
`:./2015.07.25/mytbl/name set `sym?last idx _ get `:./2015.07.25/mytbl/name;

// drop indexes from other columns
`
:./2015.07.25/mytbl/id set last idx _ get `:./2015.07.25/mytbl/id;
`
:./2015.07.25/mytbl/px set last idx _ get `:./2015.07.25/mytbl/px;
`
:./2015.07.25/mytbl/sz set last idx _ get `:./2015.07.25/mytbl/sz;

// bar data has been removed from hdb
show select count i, t:wsum[sz;px] by date, name from mytbl

carfield1
New Contributor
Thanks a lot for the information!

Tom_Martin
New Contributor
New Contributor
You can delete records without loading the whole table into memory, but it's a non-atomic operation so if you do choose to do it I'd recommend copying the partition into a staging directory, doing the delete and then copying it back into your main hdb directory.

q)t:([]sym:`a`b`c`d;a:1 2 3 4;b:10 20 30 40)
q).Q.dpft[`:.;.z.D;`sym;`t]
`t
q)\l .
q)t
date       sym a b
-------------------
2015.07.26 a   1 10
2015.07.26 b   2 20
2015.07.26 c   3 30
2015.07.26 d   4 40

//Let's say we want to delete where sym=`b. Get the index of the rows you want to remove
q)inds:exec x from select i from t where sym=`b
q)inds
,1

//Then figure out the indices of the rows that you don't want to remove
q)keep:(exec til first x from select count i from t where date=2015.07.26) except inds
q)keep
0 2 3

//Now iterate over each column in the parition, keeping the relevant indices and discarding the ones we want to remove
q).[;();@;keep] each `:./2015.07.26/t/sym`:./2015.07.26/t/a`:./2015.07.26/t/b
`:./2015.07.26/t/sym`:./2015.07.26/t/a`:./2015.07.26/t/b

//Reload the HDB and the sym=`b row is gone.
q)\l .
q)t
date       sym a b
-------------------
2015.07.26 a   1 10
2015.07.26 c   3 30
2015.07.26 d   4 40

Thanks
Tom