real-time database multi-site replication

LamHinYan
New Contributor
feed --> ticker plant --> rdbLocal
                      --> rdbRemote

Consider the topology above: the feed sends real-time data to the ticker plant, and two rdb instances subscribe to the ticker plant. The link between the ticker plant and rdbRemote is unreliable and may go offline a few times a day for a few minutes. How can I make sure that rdbRemote is eventually consistent with rdbLocal? Eventually consistent means that, given sufficient time, rdbRemote can catch up with rdbLocal. The local link is much faster (higher throughput and lower latency) than the remote link. Would you recommend another topology? Thx.

3 Replies

LamHinYan
New Contributor
From a theoretical perspective, am I asking for all of CAP, which is impossible? Note that eventual consistency is different from consistency.

https://en.wikipedia.org/wiki/CAP_theorem

pressjonny0
New Contributor
It depends on how “eventual” it can be. You could leave rdbRemote as best effort and, when the data rolls to history (assuming it does), sync up the historical partition.

Simon wrote some code for recovering between servers if one of the TPs crashed: http://code.kx.com/wsvn/code/contrib/simon/tickrecover/recover.q.  You could maybe do something similar to this, though I imagine your case is much easier given the data is a replica - you would just have to find the gaps, copy the missing segment(s), then re-sort the table(s). 
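The gap-and-copy idea above could be sketched roughly as below. This is only illustrative, not Simon's recover.q: the host, port, and the assumption that timestamps uniquely identify rows are all hypothetical (real code would want a proper sequence key, since duplicate timestamps are common in tick data).

```q
/ hypothetical end-of-day repair run on rdbRemote
/ host/port of rdbLocal and the time-based matching are assumptions
gapFix:{[t]
  lh:hopen `:rdbLocalHost:5011;                             / handle to rdbLocal (illustrative)
  / ask rdbLocal for rows whose timestamps we do not have
  missing:lh({[t;c] select from t where not time in c};t;exec time from value t);
  t upsert missing;                                         / fill the gaps
  hclose lh;
  `time xasc t }                                            / re-sort the table in place

gapFix `trade
```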

I would maybe be tempted to change the model though. Instead of pushing async to rdbRemote, have it pull synchronously on a periodic timer. It could pull either from rdbLocal or from a separate process. You could build a local process which subscribes to the TP and keeps the TP messages in memory, exactly as received. rdbRemote can then pull them synchronously on a slower timer and execute each one. Once they have been read by rdbRemote they can be dropped from memory. The reason for keeping a message list rather than converting to tables is to preserve the sequencing, and so control messages (e.g. u.end) can be processed as well.
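The buffer-and-pull model could look something like the sketch below. All names and ports are illustrative assumptions, and a production version would need error trapping and a cap on buffer growth while the remote link is down.

```q
/ buffer process (illustrative): subscribes to the TP and keeps
/ each message in memory exactly as received, in order
buf:()                                      / pending raw messages
upd:{[t;x] buf,:enlist(`upd;t;x)}           / data messages, stored untouched
.u.end:{[d] buf,:enlist(`.u.end;d)}         / control messages kept in sequence

/ called synchronously by rdbRemote; returns and drops up to n messages
pull:{[n] r:n sublist buf; buf::n _ buf; r}
```

```q
/ on rdbRemote (illustrative): pull on a slow timer, replay in order
h:hopen `::5012                             / port of the buffer process (assumed)
.z.ts:{value each h(`pull;500)}             / value applies (`upd;t;x) as upd[t;x]
\t 5000                                     / every 5 seconds
```

Because the pull is synchronous, a dropped link just means the buffer grows until the next successful pull, after which rdbRemote replays the backlog in sequence.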

(Whatever you choose, you will likely also have to change the startup procedure for rdbRemote, as I imagine it won't have access to the TP log for replay.)

Thanks 

Jonny

AquaQ Analytics 

Is it ok to mix kdb32 and kdb64 instances in a topology? Do kdb32 and kdb64 speak the same protocol? Is a single ipc request atomic? Does the server buffer the whole payload sent from the client before decoding and evaluating? Thx.