Replies: 7 comments 11 replies
-
It is not really possible to optimize the Prometheus golang client to achieve a result similar to kube-state-metrics, since kube-state-metrics removes all of the decoration the client adds to make metrics simpler to use. The idea we had when we started working on the optimization of cAdvisor was to create a whole new Metric type with a set of new interfaces, providing a new workflow for more optimized but less practical const metrics, so that it doesn't overlap with the existing experience users are used to and doesn't break any clients. FWIW, I started hacking on a library that brings a new implementation of const metrics, but it will take some time before there is something ready to be consumed.
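Roughly, the direction is the kind of undecorated exposition that kube-state-metrics produces: write the text format directly instead of going through client_golang's Desc and MetricVec machinery. A minimal sketch (the metric name, value, and handler wiring below are illustrative only, not the actual library):

```go
package main

import (
	"fmt"
	"net/http"
)

// rawMetricsHandler writes pre-rendered exposition-format lines directly,
// skipping the per-scrape Metric/Desc allocations that client_golang does.
// In practice the lines would come from a periodically refreshed store.
func rawMetricsHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprint(w, "# HELP kepler_container_joules_total Example energy counter.\n")
	fmt.Fprint(w, "# TYPE kepler_container_joules_total counter\n")
	fmt.Fprintf(w, "kepler_container_joules_total{container=%q} %g\n", "example", 42.0)
}

func main() {
	http.HandleFunc("/metrics", rawMetricsHandler)
	http.ListenAndServe(":8888", nil)
}
```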
Caching is definitely the right way to go here, and just for the sake of correctness, cAdvisor doesn't use caching today, which is why it faces the same problem as you do. We tried to introduce one implementation, but it wasn't optimized enough in terms of memory and the additional resource usage was concerning.
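For reference, a caching collector along those lines could be sketched like this; the TTL, the collectFn hook, and the overall shape are assumptions for illustration, not cAdvisor's or Kepler's actual code:

```go
package collector

import (
	"sync"
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

// cachingCollector recomputes its const metrics at most once per ttl and
// replays the cached slice on every scrape in between, trading a little
// freshness for lower allocation and GC pressure.
type cachingCollector struct {
	mu        sync.Mutex
	ttl       time.Duration
	lastBuild time.Time
	cached    []prometheus.Metric
	collectFn func() []prometheus.Metric // the expensive metric construction
}

// Describe is intentionally empty: this makes it an "unchecked" collector,
// which client_golang allows when descriptors are not known up front.
func (c *cachingCollector) Describe(ch chan<- *prometheus.Desc) {}

func (c *cachingCollector) Collect(ch chan<- prometheus.Metric) {
	c.mu.Lock()
	if c.cached == nil || time.Since(c.lastBuild) > c.ttl {
		c.cached = c.collectFn()
		c.lastBuild = time.Now()
	}
	metrics := c.cached
	c.mu.Unlock()

	for _, m := range metrics {
		ch <- m
	}
}
```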
That's definitely something great to have, especially if you do the separation based on the expected scrape frequency, but it won't solve the issue in the long run, since advertising a potentially higher scrape frequency also means that the slow code path will be hit more often.
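For the split itself, something like two registries served on separate paths would do; the paths and which collectors go where are assumptions:

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// One registry for cheap, high-level metrics scraped frequently and one
	// for detailed per-container metrics scraped less often. Paths are
	// illustrative.
	fastReg := prometheus.NewRegistry()
	slowReg := prometheus.NewRegistry()

	// fastReg.MustRegister(...) and slowReg.MustRegister(...) with the
	// appropriate collectors.

	http.Handle("/metrics", promhttp.HandlerFor(fastReg, promhttp.HandlerOpts{}))
	http.Handle("/metrics/detailed", promhttp.HandlerFor(slowReg, promhttp.HandlerOpts{}))
	http.ListenAndServe(":8888", nil)
}
```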
-
Thank you @dgrisonnet for the info.
Would love to collaborate on this. I think a flow would be:
-
https://github.com/prometheus/client_golang/blob/671a2f0568cb89c0fc8ef21c826b228dacbe8516/prometheus/registry.go#L491
-
Is implementing caching still on the roadmap for Kepler? While integrating Kepler with OpenTelemetry and PEAKS, I have faced similar issues of Prometheus getting overloaded when fetching metrics. I am open to giving this a try if the community thinks implementing a cache is on the roadmap. Thanks
-
@husky-parul I think the discussion here was related to minimizing memory utilization in Kepler. But, yes, minimizing Prometheus utilization is also important.
@mcalman ^^
-
Sorry for the late update, but I got time for further investigation tonight.
If everything is kept behind a pointer, we should have a chance to avoid the channel copy and instead manage those objects in a map structure keyed by ID with pointer values, to avoid GC as much as possible?
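Something like the following sketch is what I mean: keep the descriptors in a map keyed by ID so they are allocated once, and only wrap the current value per scrape (the ID scheme and metric name are made up for illustration):

```go
package collector

import (
	"sync"

	"github.com/prometheus/client_golang/prometheus"
)

// descCache keeps one *prometheus.Desc per metric/label-set ID so the
// descriptor (and its label hashing) is allocated only once; each scrape
// then only wraps the current value in a const metric.
type descCache struct {
	mu    sync.Mutex
	descs map[string]*prometheus.Desc
}

func (c *descCache) get(id, name, help string, labels []string) *prometheus.Desc {
	c.mu.Lock()
	defer c.mu.Unlock()
	if d, ok := c.descs[id]; ok {
		return d
	}
	if c.descs == nil {
		c.descs = make(map[string]*prometheus.Desc)
	}
	d := prometheus.NewDesc(name, help, labels, nil)
	c.descs[id] = d
	return d
}

// Usage inside Collect, per container ID (illustrative):
//   d := cache.get(containerID, "kepler_container_joules_total", "...", []string{"container"})
//   ch <- prometheus.MustNewConstMetric(d, prometheus.CounterValue, value, containerName)
```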
-
Hey there! It looks like you're facing scaling issues with the Prometheus client for Kepler metrics. Based on the discussion with @simonpasquier, there are a few potential solutions:
- Optimize the Prometheus golang client: consider using an optimized golang client like the one used by kube-state-metrics to reduce memory usage.
- Implement more caching: cache metrics that do not change frequently, similar to what cAdvisor does, to reduce both metric creation and garbage collection overhead.
- Separate high-level and low-level metrics: export detailed metrics and high-level metrics on different endpoints to customize scraping and reduce overall overhead.
Hope these suggestions help! If you need further assistance, feel free to reach out.
-
Based on the discussion with @simonpasquier, metrics exported by Kepler are likely to hit the same scaling issue as kube-state-metrics (best explained in this issue). @simonpasquier offered three directions: