Multiple periodic readers increment the counter abnormally #5866
Comments
We should try to create a simpler reproduction (not using the collector) to make this easier to root-cause.
@dashpole Can I pick this task?
@pree-dew feel free to work on this. It is likely going to be tricky to root-cause, but any help is appreciated.
I am able to reproduce this with the SDK.
At the same time, the same metric shows values of 4 and 12, while the correct value is 3. The same setup gives the correct value with 1 reader. So far, the observation is that it is happening with
@dashpole I have added a test case to reproduce the behaviour; it fails as of now but will pass once the fix goes out. I debugged the issue, and this is my understanding of it: Step 1 Step 2 Step 3 Step 4 This happens because the instrument is the same for both pipelines: before the previous value gets cleared here, the next pipeline picks up the value set by pipeline 1 and adds to it, making the value incorrect. With the above understanding, I am able to reproduce the issue. Let me know what you think.
Aggregations for each reader should be isolated from each other. A meter has a single resolver for int64: opentelemetry-go/sdk/metric/meter.go Line 59 in 1a964cc
The resolver has a list of pipelines: opentelemetry-go/sdk/metric/pipeline.go Lines 600 to 604 in 1a964cc
There is one pipeline for each reader: opentelemetry-go/sdk/metric/pipeline.go Lines 562 to 569 in 1a964cc
The callback is registered separately with each pipeline: opentelemetry-go/sdk/metric/pipeline.go Lines 572 to 577 in 1a964cc
During Collect, produce() should iterate over each callback on the pipeline: opentelemetry-go/sdk/metric/pipeline.go Lines 123 to 128 in 1a964cc
Some things to check:
@dashpole Sure, I will check these points. So far, what I have observed is that the instance (opentelemetry-go/sdk/metric/pipeline.go Line 368 in 1a964cc) is the same for both readers even though there are 2 separate pipelines, and it internally calls the measure function for both pipelines.
Added some printing in the sum aggregation:
measure is being called for both pipelines when only one pipeline is collected.
I did some more digging. The issue is that the Instrument we return to users from opentelemetry-go/sdk/metric/meter.go Lines 139 to 160 in 97f8401
Then, each time ObserveInt64 is called from our callback, it calls observe on the instrument provided by the user (which is our opentelemetry-go/sdk/metric/meter.go Line 558 in 97f8401
Observe iterates over all of the measure functions: opentelemetry-go/sdk/metric/instrument.go Lines 328 to 330 in 97f8401
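The leak described above can be illustrated with a toy model. This is a minimal sketch, not the SDK's code: `pipeline`, `instrument`, `collect`, and `buggyCollects` are hypothetical stand-ins, and the aggregation is assumed to add each observation to pending state that is cleared on collection. Because the shared instrument holds the measure functions of every pipeline, collecting one pipeline also writes into the other.

```go
package main

import "fmt"

// pipeline is a toy stand-in for one reader's aggregation state.
type pipeline struct {
	pending int64 // value observed since the last collect
}

// measure adds an observation to this pipeline's pending state.
func (p *pipeline) measure(v int64) { p.pending += v }

// instrument models the single observable handed to the user.
// It holds the measure functions of every pipeline.
type instrument struct {
	measures []func(int64)
}

// observe fans out to all measure functions, regardless of which
// pipeline is currently being collected. This is the modeled bug.
func (i *instrument) observe(v int64) {
	for _, m := range i.measures {
		m(v)
	}
}

// collect runs the callback, then reads and clears the pipeline's state.
func collect(p *pipeline, callback func()) int64 {
	callback()
	out := p.pending
	p.pending = 0
	return out
}

// buggyCollects collects reader 1, reader 2, then reader 1 again.
func buggyCollects() []int64 {
	p1, p2 := &pipeline{}, &pipeline{}
	inst := &instrument{measures: []func(int64){p1.measure, p2.measure}}
	callback := func() { inst.observe(3) } // the correct value is always 3

	return []int64{
		collect(p1, callback), // 3
		collect(p2, callback), // 6: includes 3 leaked during p1's collect
		collect(p1, callback), // 6: includes 3 leaked during p2's collect
	}
}

func main() {
	fmt.Println(buggyCollects()) // [3 6 6]
}
```

In this model the reported value depends on how collections of the two readers interleave, which is consistent with seeing different wrong values for the same metric.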
I actually fixed a very similar bug, where callbacks were registered using WithCallback(), in #4742.
This issue #4742 looks similar. So, since we are appending the measure functions for all pipelines, they get called even if only 1 pipeline's collect is called. As a solution, we essentially have to call only the measure function corresponding to the pipeline's reader. Is my understanding correct?
To fix it, we need to ensure that only callbacks associated with the pipeline are called during the pipeline's But here, the instrument comes from the user and must be useable for callbacks for any pipeline, so we can't do that. The only other option, which is accessible from opentelemetry-go/sdk/metric/meter.go Line 444 in 97f8401
We might need to split the The opentelemetry-go/sdk/metric/meter.go Line 152 in 97f8401
And the |
Got it, so basically
It is probably easiest if observer has a reference to its pipeline, and the map[observableID]measure is on the pipeline. We probably need to leave appendMeasure alone for now, since the measures on the instrument are used for validation. But right next to where we call appendMeasure, you should update the map[observableID]measure on the pipeline. During
Got it, and that makes sense. I will make changes accordingly.
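The suggestion above can be sketched with a toy model. This is a minimal illustration under assumptions, not the actual patch: `observableID` is simplified to a string, `measure` to a plain function type, and `pipeline`, `observer`, `register`, `collect`, and `isolatedCollects` are hypothetical names. The observer carries a reference to the pipeline being collected and looks up the measure function in that pipeline's own map, so an observation never reaches another reader's aggregation.

```go
package main

import "fmt"

// observableID identifies an observable instrument (simplified to a string).
type observableID string

// measure records one observation into a pipeline's aggregation.
type measure func(int64)

// pipeline is a toy stand-in for one reader's state. Each pipeline
// owns its own map from instrument ID to measure function.
type pipeline struct {
	pending  int64
	measures map[observableID]measure
}

func newPipeline() *pipeline {
	return &pipeline{measures: make(map[observableID]measure)}
}

// register stores a measure function for id on this pipeline only.
func (p *pipeline) register(id observableID) {
	p.measures[id] = func(v int64) { p.pending += v }
}

// observer is passed to the callback and knows which pipeline is collecting.
type observer struct {
	pipe *pipeline
}

// observe records v only in the observer's own pipeline.
func (o observer) observe(id observableID, v int64) {
	if m, ok := o.pipe.measures[id]; ok {
		m(v)
	}
}

// collect runs the callback with a pipeline-bound observer,
// then reads and clears the pipeline's state.
func collect(p *pipeline, callback func(observer)) int64 {
	callback(observer{pipe: p})
	out := p.pending
	p.pending = 0
	return out
}

// isolatedCollects collects reader 1, reader 2, then reader 1 again.
func isolatedCollects() []int64 {
	p1, p2 := newPipeline(), newPipeline()
	const id observableID = "example.counter"
	p1.register(id)
	p2.register(id)
	callback := func(o observer) { o.observe(id, 3) }

	return []int64{
		collect(p1, callback),
		collect(p2, callback),
		collect(p1, callback),
	}
}

func main() {
	fmt.Println(isolatedCollects()) // [3 3 3]
}
```

In this model every collection reports 3 regardless of how the readers interleave, because the callback can only write through the observer bound to the collecting pipeline.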
Description
When multiple (periodic) readers are present, metrics increase abnormally.
Originally reported here: open-telemetry/opentelemetry-collector#11327
Steps To Reproduce
Run the OpenTelemetry collector with the configuration provided in the bug above.
Expected behavior
Metrics should not increase abnormally.