<p>I have about a year of experience developing KDB systems and have never dealt with data feeds.</p>
<p>I want to set up a producer / consumer relationship with a system written in something other than KDB, where that system is the producer. It is my job to be the "real time data analysis resource" for this larger system.</p>
<p>The producer system will be writing data into PostgreSQL running on AWS's Relational Database Service (RDS).</p>
<p>I <strong>could</strong> ask the producer system to put messages on a MOM queue so that I don't need to poll PostgreSQL for changes.</p>
<p>I would like to avoid polling, mainly to get closer to real time: I don't want to wait for the next poll before I can start processing.</p>
<ul>
<li>What <a href="https://en.wikipedia.org/wiki/Message-oriented_middleware">MOM</a> queue system should I suggest to the project?
<ul>
<li><a href="https://en.wikipedia.org/wiki/Amazon_Simple_Queue_Service">SQS</a></li>
<li><a href="https://en.wikipedia.org/wiki/RabbitMQ">RabbitMQ</a></li>
<li>Something else, ideally something else that is open source.</li>
</ul></li>
<li><p>If I use SQS, that seems to involve <a href="https://en.wikipedia.org/wiki/Comet_(programming)">long polling</a>. Does it make sense to "long poll" an SQS queue in order to avoid polling PostgreSQL? I assume that, in terms of both performance and cost, long polling behaves more like an event-based system than like conventional polling. Is that right?</p></li>
<li><p>Should the KDB instances that process these messages and the updates in PostgreSQL be the same instances that do the data analysis? (I assume it is better to have separate instances handle everything else, so that all other processing is offloaded from the data analysis instances.)</p></li>
<li><p>What packages and code samples/examples should I leverage for building a real-time data warehouse?</p></li>
</ul>
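<p>To make the long-polling question concrete, here is a minimal sketch of how I am picturing the trade-off. It is a pure simulation with no AWS calls: the function names, the arrival times, and the 1-second polling interval are all made up for illustration, and it ignores network and processing time. The 20-second wait is the maximum long-poll wait I understand SQS to allow. It counts how many requests each strategy issues (SQS bills per request) and how long a message waits before it is picked up.</p>

```python
# Hypothetical simulation: all names and numbers here are illustrative
# assumptions, not part of any real SQS client. Times are in milliseconds.

def short_poll(arrivals, interval, horizon):
    """Poll at 0, interval, 2*interval, ... up to horizon.

    Returns (number of requests, mean pickup latency in ms).
    """
    requests = horizon // interval + 1
    # Each message waits until the first poll at or after its arrival.
    latencies = [-(-t // interval) * interval - t for t in arrivals]
    return requests, sum(latencies) / len(latencies)


def long_poll(arrivals, wait, horizon):
    """Issue a request that blocks until a message arrives or `wait` elapses.

    Each request returns at most one message. Returns
    (number of requests, mean pickup latency in ms).
    """
    arrivals = sorted(arrivals)
    t, i, requests, latencies = 0, 0, 0, []
    while t < horizon:
        requests += 1
        if i < len(arrivals) and arrivals[i] <= t + wait:
            # The open request returns as soon as the message is available.
            picked_up = max(t, arrivals[i])
            latencies.append(picked_up - arrivals[i])
            t = picked_up
            i += 1
        else:
            # Nothing arrived within the wait window: an empty response.
            t += wait
    return requests, sum(latencies) / len(latencies)


# Four made-up messages arriving over one minute.
ARRIVALS_MS = [1500, 7200, 7300, 31000]
HORIZON_MS = 60_000

if __name__ == "__main__":
    print("short poll, 1 s interval:", short_poll(ARRIVALS_MS, 1000, HORIZON_MS))
    print("long poll, 20 s wait:    ", long_poll(ARRIVALS_MS, 20_000, HORIZON_MS))
```

<p>With these made-up numbers the short poller issues 61 requests and messages wait 500 ms on average, while the long poller issues 7 requests and picks each message up as it arrives. That is the behaviour I am assuming when I say long polling looks more like an event-based system; I would like to know whether it holds up in practice.</p>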