Avoiding Polling with Message-Oriented Middleware

dan
New Contributor
<p>I have about a year of experience developing KDB systems and have never dealt with data feeds.</p>
<p>I want to set up a producer/consumer relationship with a system that is not written in KDB, where that system is the producer. My job is to be the &quot;real time data analysis resource&quot; for this larger system.</p>
<p>The producer system will be writing data into PostgreSQL running on AWS's Relational Database Service (RDS).</p>
<p>I <strong>could</strong> ask the producer system to put messages on a MOM queue so that I don't need to poll PostgreSQL for changes.</p>
<p>I would like to avoid polling mostly to get closer to real time, so that I don't have to wait for the next poll before I start processing.</p>
<ul>
<li>What <a href="https://en.wikipedia.org/wiki/Message-oriented_middleware">MOM</a> queue system should I suggest to the project?
<ul>
<li><a href="https://en.wikipedia.org/wiki/Amazon_Simple_Queue_Service">SQS</a></li>
<li><a href="https://en.wikipedia.org/wiki/RabbitMQ">RabbitMQ</a></li>
<li>Something else, ideally something else that is open source.</li>
</ul></li>
<li><p>If I use SQS, that seems to involve <a href="https://en.wikipedia.org/wiki/Comet_(programming)">long polling</a>. Does it make sense to &quot;long poll&quot; an SQS queue to avoid polling PostgreSQL? In terms of performance and cost, I assume that long polling behaves more like an event-based system than like conventional polling. Is that right?</p></li>
<li><p>Should the KDB instances that process these messages and the updates in PostgreSQL be the same instances that do the data analysis? (I assume it is better to have separate instances handle everything else, so that the data analysis instances are not loaded with other processing.)</p></li>
<li><p>What packages and code samples/examples should I leverage for building a real-time data warehouse?</p></li>
</ul>
7 REPLIES 7

Flying
New Contributor III
Looks like all you need is a bridge that gets data out of your PostgreSQL and pushes it to a kdb+tick infrastructure.

I.e., you need a feedhandler that can receive data from PostgreSQL/MOM/whatever and publish it to a tickerplant.
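
To make the "bridge" idea a bit more concrete, here is a minimal sketch of the feedhandler side in Python. It only shows the shape of the job: pull rows from some source, pivot them into columns, and hand them over as an update call. It assumes a standard kdb+tick setup where the tickerplant's update entry point is `.u.upd`; the table name `trade`, the column names, and the `send` callable are hypothetical stand-ins (in practice `send` would wrap whatever IPC client you use to send an async message to the tickerplant).

```python
def rows_to_columns(rows, columns):
    """Pivot row dicts into one list per column, in the given column order."""
    return [[row[c] for row in rows] for c in columns]


def make_upd_message(table, rows, columns):
    """Build a (function, table, data) triple for a tickerplant update call.

    Assumption: the receiving tickerplant exposes `.u.upd`, as kdb+tick does.
    """
    return (".u.upd", table, rows_to_columns(rows, columns))


def run_feedhandler(source, send, table, columns, batch_size=2):
    """Pull rows from `source`, batch them, and hand each batch to `send`.

    `source` is any iterable of row dicts (rows read from PostgreSQL, messages
    taken off a queue, ...). `send` is a hypothetical callable that delivers
    one message to the tickerplant.
    """
    batch = []
    for row in source:
        batch.append(row)
        if len(batch) >= batch_size:
            send(make_upd_message(table, batch, columns))
            batch = []
    if batch:  # flush the final partial batch
        send(make_upd_message(table, batch, columns))
```

For a quick dry run you can pass `sent.append` as `send` and inspect the messages it collects before wiring in a real connection.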

quintanar401
New Contributor
You can use any service you want. We use JMS via a Java feedhandler (you can leverage the Apache Camel library, for example) and from q itself via a shared lib.

So we were considering Amazon SQS, which is HTTP-based and therefore requires you to (long) poll the queue.
But we would like to receive data in true "push" fashion, where each received message raises an event in my q program.
Is there some way to arrange that? Perhaps a particular MQ product or adapter, or maybe there is just a slot in
the ".z" namespace where you would put a handler?

You'll have to write a C lib (there is an example of how to create a handler in the Interfaces section of the wiki), or you can use a non-q feedhandler and send async messages from it.

On Wednesday, July 5, 2017 at 8:22:22 UTC+3, dan wrote:
So we were considering Amazon SQS, which is HTTP-based and therefore requires you to (long) poll the queue.
But we would like to receive data in true "push" fashion, where each received message raises an event in my q program.
Is there some way to arrange that? Perhaps a particular MQ product or adapter, or maybe there is just a slot in
the ".z" namespace where you would put a handler?


Is there any good reason not to use SQS and have a KDB instance do the long polling? Has anyone tried it?
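
On the cost side of that question, a rough back-of-the-envelope model may help. The sketch below counts how many receive requests a consumer issues under short polling versus long polling. It is a simplified simulation, not SQS client code: it assumes each long-poll call returns the moment a message is available (or after the wait time expires) and returns at most one message per call. The default `wait_s=20.0` mirrors the 20-second maximum wait that SQS allows on a receive call; the function names and the example numbers are made up for illustration.

```python
import math


def short_poll_requests(horizon_s, interval_s):
    """Requests issued when polling every `interval_s` seconds for `horizon_s` seconds."""
    return math.ceil(horizon_s / interval_s)


def long_poll_requests(arrivals_s, horizon_s, wait_s=20.0):
    """Requests issued when each call blocks until a message arrives or `wait_s` elapses.

    Simplifications (assumptions of this model): a call returns immediately
    when a message is available, and delivers at most one message.
    """
    pending = sorted(arrivals_s)
    now, requests, i = 0.0, 0, 0
    while now < horizon_s:
        requests += 1
        deadline = now + wait_s
        if i < len(pending) and pending[i] <= deadline:
            now = max(now, pending[i])  # call returns as soon as the message is there
            i += 1
        else:
            now = deadline  # nothing arrived: call returns empty at the timeout
    return requests
```

With two messages in a minute, for example, `long_poll_requests([5, 30], 60)` issues far fewer requests than `short_poll_requests(60, 1)`, while in this model each message is still picked up as soon as it arrives rather than at the next poll tick. That is the sense in which long polling behaves more like an event-driven consumer than like conventional polling.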

Flying
New Contributor III
IMHO, writing a lib and embedding it into kdb+ limits your choice of programming languages, whereas a stand-alone feedhandler gives you the freedom to choose whichever programming language/technology best interfaces with your SQS.

Furthermore, since you need to perform long polling, your implementation will most probably need to run on its own thread. Sending information across threads in kdb+ (you need to send the received info into kdb+'s main thread for callback execution) involves serialization/deserialization of information anyway. So the overhead of a stand-alone feedhandler is not actually much higher than that of an embedded lib.
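
The threading pattern described above (a poller blocked on its own thread, with callbacks executed on the main thread) can be sketched generically. The Python below only illustrates the hand-off shape; it says nothing about kdb+ internals, and in Python no serialization is involved. All names (`make_fetch`, `run_poller`, `drain`, `demo`) are hypothetical, and an in-memory `queue.Queue` stands in for the remote queue being long-polled.

```python
import queue
import threading


def make_fetch(source, wait_s=0.1):
    """Return a fetch() that blocks up to `wait_s` for a message (the 'long poll')."""
    def fetch():
        try:
            return source.get(timeout=wait_s)
        except queue.Empty:
            return None  # timed out with nothing to deliver
    return fetch


def run_poller(fetch, inbox, stop):
    """Background thread: block in fetch() and hand each result to the main thread."""
    while not stop.is_set():
        msg = fetch()
        if msg is not None:
            inbox.put(msg)


def drain(inbox, on_message, expected):
    """Main thread: run the callback for each message, in arrival order."""
    for _ in range(expected):
        on_message(inbox.get(timeout=5))


def demo():
    """Push three messages through the poller thread and collect them on the main thread."""
    source, inbox, stop = queue.Queue(), queue.Queue(), threading.Event()
    for m in ("a", "b", "c"):
        source.put(m)
    poller = threading.Thread(
        target=run_poller, args=(make_fetch(source), inbox, stop), daemon=True
    )
    poller.start()
    seen = []
    drain(inbox, seen.append, expected=3)
    stop.set()
    poller.join(timeout=2)
    return seen
```

The point of the sketch is that the poller never runs the callback itself; it only enqueues, and the consumer side decides when to process. A stand-alone feedhandler has the same split, with the IPC connection to kdb+ playing the role of `inbox`.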

Flying
New Contributor III
You can write a lib in C or C++, which can be loaded into your q program.

Alternatively, you can write a pure C/C++/C#/Java/etc. program, and push the data directly into your q instance via c.obj/c.cs/c.java/etc. interface.