Replies: 7 comments 11 replies
-
It is not really possible to optimize the Prometheus golang client to achieve a result similar to kube-state-metrics, since kube-state-metrics removes all of the decoration the client adds to make metrics simpler to use. The idea we had when we started working on the optimization of cAdvisor was to create a whole new Metric type with a set of new interfaces, providing a new workflow for more optimized but less practical const metrics, so that it doesn't overlap with the existing experience users are used to and doesn't break any clients. FWIW, I started hacking on a library that brings a new implementation of const metrics, but it will take some time before there is something ready to be consumed.
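Roughly, the direction is the kind of undecorated exposition that kube-state-metrics produces: write the text format directly instead of going through client_golang's Desc and MetricVec machinery. A minimal sketch (the metric name, value, and handler wiring below are illustrative only, not the actual library):

```go
package main

import (
	"fmt"
	"net/http"
)

// rawMetricsHandler writes pre-rendered exposition-format lines directly,
// skipping the per-scrape Metric/Desc allocations that client_golang does.
// In practice the lines would come from a periodically refreshed store.
func rawMetricsHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprint(w, "# HELP kepler_container_joules_total Example energy counter.\n")
	fmt.Fprint(w, "# TYPE kepler_container_joules_total counter\n")
	fmt.Fprintf(w, "kepler_container_joules_total{container=%q} %g\n", "example", 42.0)
}

func main() {
	http.HandleFunc("/metrics", rawMetricsHandler)
	http.ListenAndServe(":8888", nil)
}
```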
Caching is definitely the right way to go here, and just for the sake of correctness, cAdvisor doesn't use caching today, which is why it faces the same problem as you do. We tried to introduce one implementation, but it wasn't optimized enough in terms of memory and the additional resource usage was concerning.
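For reference, a caching collector along those lines could be sketched like this; the TTL, the collectFn hook, and the overall shape are assumptions for illustration, not cAdvisor's or Kepler's actual code:

```go
package collector

import (
	"sync"
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

// cachingCollector recomputes its const metrics at most once per ttl and
// replays the cached slice on every scrape in between, trading a little
// freshness for lower allocation and GC pressure.
type cachingCollector struct {
	mu        sync.Mutex
	ttl       time.Duration
	lastBuild time.Time
	cached    []prometheus.Metric
	collectFn func() []prometheus.Metric // the expensive metric construction
}

// Describe is intentionally empty: this makes it an "unchecked" collector,
// which client_golang allows when descriptors are not known up front.
func (c *cachingCollector) Describe(ch chan<- *prometheus.Desc) {}

func (c *cachingCollector) Collect(ch chan<- prometheus.Metric) {
	c.mu.Lock()
	if c.cached == nil || time.Since(c.lastBuild) > c.ttl {
		c.cached = c.collectFn()
		c.lastBuild = time.Now()
	}
	metrics := c.cached
	c.mu.Unlock()

	for _, m := range metrics {
		ch <- m
	}
}
```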
That's definitely something great to have, especially if you do the separation based on the expected scrape frequency, but it won't solve the issue in the long run, since advertising a potentially higher scrape frequency also means that the slow code path will be hit more often.
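For the split itself, something like two registries served on separate paths would do; the paths and which collectors go where are assumptions:

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// One registry for cheap, high-level metrics scraped frequently and one
	// for detailed per-container metrics scraped less often. Paths are
	// illustrative.
	fastReg := prometheus.NewRegistry()
	slowReg := prometheus.NewRegistry()

	// fastReg.MustRegister(...) and slowReg.MustRegister(...) with the
	// appropriate collectors.

	http.Handle("/metrics", promhttp.HandlerFor(fastReg, promhttp.HandlerOpts{}))
	http.Handle("/metrics/detailed", promhttp.HandlerFor(slowReg, promhttp.HandlerOpts{}))
	http.ListenAndServe(":8888", nil)
}
```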
-
Thank you @dgrisonnet for the info.
Would love to collaborate on this. I think a flow would be:
-
https://github.com/prometheus/client_golang/blob/671a2f0568cb89c0fc8ef21c826b228dacbe8516/prometheus/registry.go#L491
-
Is implementing caching still on the roadmap for Kepler? While integrating Kepler with OpenTelemetry and PEAKS, I have faced similar issues of Prometheus getting overloaded when fetching metrics. I am open to giving this a try if the community thinks implementing a cache is on the roadmap. Thanks
-
@husky-parul I think the discussion here was related to minimizing memory utilization in Kepler. But, yes, minimizing Prometheus utilization is also important.
@mcalman ^^
-
Sorry for the late update, but I got time for further investigation tonight.
If everything is kept behind a pointer, we should have a chance to avoid the channel copy and instead manage those objects in a map structure keyed by ID with pointer values, to avoid GC as much as possible?
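Something like the following sketch is what I mean: keep the descriptors in a map keyed by ID so they are allocated once, and only wrap the current value per scrape (the ID scheme and metric name are made up for illustration):

```go
package collector

import (
	"sync"

	"github.com/prometheus/client_golang/prometheus"
)

// descCache keeps one *prometheus.Desc per metric/label-set ID so the
// descriptor (and its label hashing) is allocated only once; each scrape
// then only wraps the current value in a const metric.
type descCache struct {
	mu    sync.Mutex
	descs map[string]*prometheus.Desc
}

func (c *descCache) get(id, name, help string, labels []string) *prometheus.Desc {
	c.mu.Lock()
	defer c.mu.Unlock()
	if d, ok := c.descs[id]; ok {
		return d
	}
	if c.descs == nil {
		c.descs = make(map[string]*prometheus.Desc)
	}
	d := prometheus.NewDesc(name, help, labels, nil)
	c.descs[id] = d
	return d
}

// Usage inside Collect, per container ID (illustrative):
//   d := cache.get(containerID, "kepler_container_joules_total", "...", []string{"container"})
//   ch <- prometheus.MustNewConstMetric(d, prometheus.CounterValue, value, containerName)
```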
-
Hey there! It looks like you're facing scaling issues with the Prometheus client for Kepler metrics. Based on the discussion with @simonpasquier, there are a few potential solutions:
- Optimize the Prometheus golang client: consider using an optimized golang client like the one used by kube-state-metrics to reduce memory usage.
- Implement more caching: cache metrics that do not change frequently, similar to what cAdvisor does, to reduce both metric creation and garbage collection overhead.
- Separate high-level and low-level metrics: export detailed metrics and high-level metrics on different endpoints to customize scraping and reduce overall overhead.
Hope these suggestions help! If you need further assistance, feel free to reach out.
-
Based on the discussion with @simonpasquier, metrics exported by Kepler are likely to hit the same scaling issue as kube-state-metrics (best explained in this issue). @simonpasquier offered three directions: