-
@Yanbo0101 Sounds good. I think we can do it at the node level first to expose the QAT metrics. You can create a new metric, e.g.
-
@rootfs and all community folks, @Yanbo0101's idea above can be regarded as a continuation of the earlier discussion thread on power attribution for CPU built-in accelerators. Currently, power/energy consumption measurement still relies on RAPL, so the accelerator-specific metrics we collect may help with power attribution analysis and estimation. This also relates to the feature requirement here. For Intel SPR and later CPU models, accelerator-specific metrics such as AMX metrics and QAT metrics are not related to energy itself, so the collection mechanism could differ from the current Kepler metrics collectors. We can continue the discussion here on both mechanism and code organization, and hear more from the community, before we propose RFC issues.
-
Hi, I am collecting performance metrics for the QAT (Intel® QuickAssist Technology) accelerator on Intel SPR CPUs.
For Intel SPR CPUs, each physical CPU has a QAT accelerator. When the telemetry feature is turned on, each QAT device exposes metrics such as latency, bandwidth, encryption/decryption utilization, and compression/decompression utilization. These metrics are updated once per second under the /sys/devices/* directory.
My idea is to add a config option in Kepler that lets users decide whether to turn on this feature. When the feature is turned on, Kepler collects the accelerator metrics at the node level.
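To make the idea concrete, here is a minimal sketch in Go of what an opt-in, node-level reader could look like. Everything named here is an assumption for illustration only: the `Config` fields, the `parseTelemetry` and `collectQAT` helpers, and the two-column `name value` text layout are hypothetical, not existing Kepler settings or the actual QAT telemetry format, and the telemetry path is left as a placeholder for the caller to supply.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Config holds the hypothetical opt-in switch described above.
// Neither field is an existing Kepler option.
type Config struct {
	EnableQATMetrics bool   // when false, no QAT telemetry is read
	TelemetryPath    string // placeholder; the real location depends on the driver
}

// parseTelemetry turns "name value" lines into a map.
// The two-column text layout is an assumption, not the real file format.
func parseTelemetry(text string) (map[string]uint64, error) {
	metrics := make(map[string]uint64)
	scanner := bufio.NewScanner(strings.NewReader(text))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) != 2 {
			continue // skip blank lines and lines that do not have two columns
		}
		value, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("metric %q: %w", fields[0], err)
		}
		metrics[fields[0]] = value
	}
	return metrics, scanner.Err()
}

// collectQAT reads one device's telemetry file, but only when the
// feature is enabled; a disabled collector returns (nil, nil).
func collectQAT(cfg Config) (map[string]uint64, error) {
	if !cfg.EnableQATMetrics {
		return nil, nil
	}
	data, err := os.ReadFile(cfg.TelemetryPath)
	if err != nil {
		return nil, err
	}
	return parseTelemetry(string(data))
}
```

Since the counters are refreshed once per second, a caller would presumably invoke `collectQAT` on a matching interval and publish the returned values as node-level metrics; how that plugs into the existing collectors is the open question.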
Would this solution be acceptable to the community? Better ideas are welcome. Thanks.