I would like to share a use case here:
There is a TP whose memory consumption is abnormally large (we have another TP with the same FH whose memory usage is roughly 5% of this one; that TP is a v4.5 stand-alone installation). We are using KX Platform 4.5 in a cluster installation. After executing .utils.gc, only a little space was released.
1. Is there any other way to reduce the TP's memory consumption?
2. What is taking up so much space in the TP?
Thanks for your question! It's great to understand your use case better.
A few questions so I can help you with your query - are there any other differences between the two TPs?
If you are using a Linux server, have you seen the recommended NUMA and THP settings here: https://code.kx.com/q/kb/linux-production/ ?
Thank you for pointing me to the whitepaper; I'll dig into it later.
The main difference between the two TPs is that the one consuming more memory runs in batching mode ("pubFreq"=50ms) and is part of a four-server cluster.
Moreover, this TP also has higher CPU usage (60%–70% on average) compared to the other one, whose CPU usage is around 35%.
In our experience, stand-alone mode always performs better than the same processes (TP and RDB) running in a cluster.
I would like to trouble you with a few questions:
1. Referring to the attached picture, is it the TP binding to multiple memory nodes instead of one that caused the high numa_miss count?
2. Since the CPU is bound to node 0, should I do the same for memory, e.g. `numactl --membind=0 --cpunodebind=0`?
3. How do I make step 2 work in Platform? Which "reserved param" should I change?
thank you very much!
High memory usage on the TP is usually an indication of a slow subscriber: memory builds up as messages sit in the output queues. Check in each TP:
sum each .z.W
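`.z.W` is a dictionary of IPC handles mapped to the data waiting in their output queues; in older kdb+ versions each handle maps to a list of queued message sizes, hence `sum each` to get total bytes per handle. A large total against one handle identifies the slow subscriber. A minimal check (the 100MB threshold below is purely illustrative):

```q
/ total bytes waiting per open handle; a large value flags a slow subscriber
sum each .z.W

/ handles with more than ~100MB queued (threshold is an illustrative choice)
where 104857600 < sum each .z.W
```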
Coincidentally, we encountered an issue this week where an RDB that was too busy negatively impacted the TP and caused data to build up in the TP's outbound buffer.
As best practice, you can monitor the memory usage of subscribers by running .Q.w, which returns memory stats in readable form - https://code.kx.com/q/ref/dotq/#qw-memory-stats
.Q.gc is also very useful and can help with in-memory capacity issues. https://code.kx.com/q/ref/dotq/#qgc-garbage-collect
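For example, run in the subscriber process (the keys shown in the comment are from the linked docs; actual figures will vary):

```q
/ memory stats dictionary: used, heap, peak, wmax, mmap, mphy, syms, symw
.Q.w[]

/ attempt to return unused heap blocks to the OS; returns the bytes freed
.Q.gc[]
```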
Thanks for sharing with the community. Hope this helps. 🙌
I'm not sure that is the best way to monitor a slow-subscriber issue. Generally you don't want to call .Q.gc on a tickerplant, especially in 24/6 or 24/7 operation. If slow subscribers are an issue, they should subscribe to a chained tickerplant, which drops slow subscribers. The main tickerplant should be protected at all costs to ensure continuous and easily recoverable data capture.
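A minimal sketch of that protective pattern - a chained TP (or any intermediary) periodically dropping subscribers whose output queue grows too large. The threshold and timer interval are illustrative assumptions, not Platform defaults:

```q
/ illustrative limit: 200MB of queued output per subscriber
maxQueue:200*1024*1024

/ close any handle whose output queue exceeds the limit;
/ the usual .z.pc / subscription cleanup then removes the subscriber
dropSlow:{hclose each where maxQueue<sum each .z.W}

/ run the check once a second on the timer
.z.ts:dropSlow
\t 1000
```

This keeps memory pressure on the chained TP bounded while the main tickerplant continues uninterrupted data capture.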
Thanks @matt_moore 👍
The particular situation we encountered related to a client connecting to the RDB by opening an Analyst session and mounting the HDB. Despite the data being on disk, the sym file was loaded into memory, and it was quite large; other assigned variables were also kept in memory. It was this one-off manual action that left the RDB busy for a lengthy period, rather than a persistent slow-subscriber issue, and this led to the build-up of data in the TP's outbound buffer. We made use of .Q.w and .Q.gc in the RDB process.
This prompted an internal investigation into the memory limits of an Analyst session; its status is work in progress.