Metrics all zero #750
Replies: 10 comments 11 replies
-
Here I exec into the kepler-exporter pod and use curl to check the exposed metrics:
[root@kepler-exporter-497tm /]# curl localhost:9102/metrics | grep kepler
# TYPE kepler_container_bpf_block_irq_total counter
kepler_container_bpf_block_irq_total{container_id="07b9182c5a9f37d2a685c78a5372ec87878dab074f744627944bbb0343afc43c",container_name="prometheus-adapter",container_namespace="monitoring",pod_name="prometheus-adapter-648959cd84-wsvrq"} 0
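To see quickly whether every Kepler sample is zero or only some of them, the curl output can be summarized per metric name. The following is a minimal sketch, not part of Kepler: the function name `summarize` and the `kepler_` prefix default are illustrative assumptions, and the parser handles only the plain `name{labels} value` lines of the Prometheus text format.

```python
import re
from collections import Counter

# Matches 'name{labels} value' or 'name value'; a trailing timestamp is ignored.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)')


def summarize(metrics_text, prefix="kepler_"):
    """Count zero and non-zero samples per metric name (hypothetical helper)."""
    zero, nonzero = Counter(), Counter()
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if not match or not match.group(1).startswith(prefix):
            continue
        try:
            value = float(match.group(3))
        except ValueError:
            continue  # not a numeric sample, ignore it
        (nonzero if value != 0 else zero)[match.group(1)] += 1
    return zero, nonzero
```

Passing the saved curl output to `summarize` returns two counters. If `nonzero` is empty, the exporter itself is reporting zeroes, which points at the node rather than at Prometheus or Grafana.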
-
Oddly, Prometheus seems to indicate that it is scraping non-zero values.
-
Grafana is connected to Prometheus, and when I test the data source it says it is working...
-
Is the code in the repo broken?
-
@ajgillette I have seen issues
kubectl exec -ti -n kepler daemonset/kepler-exporter -- bash -c "curl localhost:9102/metrics > /tmp/k.log; grep kepler_container_joules /tmp/k.log |sort -k 2 -g"
Since Prometheus can see the Kepler metrics, it is odd that Grafana still shows zeroes. Can you check on your Prometheus? Can you also check which CPU model you are using?
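The same ordering as the `grep ... | sort -k 2 -g` pipeline can be reproduced offline on the saved output. This is a rough sketch under stated assumptions: `sorted_samples` is a made-up name, and it assumes each sample line ends with its value and carries no trailing timestamp, as in the curl output shown in this thread.

```python
def sorted_samples(metrics_text, name_prefix="kepler_container_joules"):
    """Return (value, series) pairs for matching samples, smallest value first."""
    rows = []
    for line in metrics_text.splitlines():
        # Comment lines start with '#', so they never match the prefix.
        if not line.startswith(name_prefix):
            continue
        # Assumption: the value is the last space-separated field.
        series, _, raw_value = line.rpartition(" ")
        try:
            rows.append((float(raw_value), series))
        except ValueError:
            continue  # skip anything that does not end in a number
    return sorted(rows)
```

If the last pair in the result still has a value of `0.0`, no container has any energy attributed to it.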
-
Mantisnet-tawon is a 5G cell site k8s communication analyzer, complete with 5G protocol decoders.
-
@ajgillette can we close this discussion? Or move this to an issue to close this discussion?
-
Yes, I moved the discussion to a different group.
On Sep 5, 2023, at 9:06 AM, Marcelo Carneiro do Amaral wrote:
@ajgillette can we close this discussion? Or move this to an issue to close this discussion?
-
I am working with a checkout to get Kepler running on a three-node cluster. Everything is installed and appears to be working, but when I check the metrics they are all present with a value of zero. Here's the kepler-exporter log:
kubectl logs -n kepler kepler-exporter-497tm
I0623 14:04:14.517972 1 gpu.go:46] Failed to init nvml, err: could not init nvml: error opening libnvidia-ml.so.1: libnvidia-ml.so.1: cannot open shared object file: No such file or directory
I0623 14:04:14.523257 1 acpi.go:67] Could not find any ACPI power meter path. Is it a VM?
I0623 14:04:14.531838 1 exporter.go:151] Kepler running on version: 2a32491
I0623 14:04:14.531884 1 config.go:210] using gCgroup ID in the BPF program: true
I0623 14:04:14.531947 1 config.go:212] kernel version: 6.3
I0623 14:04:14.532011 1 config.go:172] kernel source dir is set to /usr/src/kernels
I0623 14:04:14.532047 1 exporter.go:169] EnabledBPFBatchDelete: true
I0623 14:04:14.532134 1 rapl_msr_util.go:136] input/output error
I0623 14:04:14.532198 1 power.go:64] Not able to obtain power, use estimate method
I0623 14:05:44.604315 1 exporter.go:182] Initializing the GPU collector
I0623 14:05:44.605053 1 watcher.go:67] Using in cluster k8s config
perf_event_open: No such file or directory
I0623 14:05:46.683436 1 bcc_attacher.go:108] failed to attach perf event cpu_cycles_hc_reader: failed to open bpf perf event: no such file or directory
perf_event_open: No such file or directory
I0623 14:05:46.683500 1 bcc_attacher.go:108] failed to attach perf event cpu_ref_cycles_hc_reader: failed to open bpf perf event: no such file or directory
perf_event_open: No such file or directory
I0623 14:05:46.683543 1 bcc_attacher.go:108] failed to attach perf event cpu_instr_hc_reader: failed to open bpf perf event: no such file or directory
perf_event_open: No such file or directory
I0623 14:05:46.683589 1 bcc_attacher.go:108] failed to attach perf event cache_miss_hc_reader: failed to open bpf perf event: no such file or directory
I0623 14:05:46.683617 1 bcc_attacher.go:171] Successfully load eBPF module with option: [-DMAP_SIZE=10240 -DNUM_CPUS=6]
I0623 14:05:46.700159 1 exporter.go:226] Started Kepler in 1m32.168360643s
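Several lines in this log describe a power or counter source that could not be opened. A small scan like the one below groups them so they are easier to report. It is only a sketch: the `SIGNATURES` table and the `scan` function are illustrative assumptions, the substrings are copied from the log above, and the notes are a plain reading of those messages rather than an official Kepler diagnosis.

```python
# Substring to look for, paired with a short note on what the line reports.
# The substrings come from the log in this thread; the notes are assumptions.
SIGNATURES = [
    ("Failed to init nvml", "NVML library not found, so no GPU readings"),
    ("Could not find any ACPI power meter path", "no ACPI power meter (the log asks: is it a VM?)"),
    ("rapl_msr_util.go", "the RAPL MSR reader logged an error"),
    ("Not able to obtain power, use estimate method", "no power source, falling back to estimation"),
    ("failed to attach perf event", "a hardware performance counter could not be opened"),
]


def scan(log_text):
    """Group log lines by the signature they match (hypothetical helper)."""
    hits = {}
    for line in log_text.splitlines():
        for needle, note in SIGNATURES:
            if needle in line:
                hits.setdefault(note, []).append(line.strip())
    return hits
```

Feeding it the output of `kubectl logs -n kepler kepler-exporter-497tm` gives a dictionary keyed by note. In the log above every entry in the table would match at least once.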