Add: exponential backoff for CAS operations on floats #1661
+235
−39
What problem this PR is solving
Hi!
Issues like cockroachdb/cockroach#133306 pushed me to investigate whether the client metrics library can be optimized for CPU usage and performance, especially under contention. That is mostly what this PR is about.
Proposed changes
Add a form of exponential backoff to the tight loops around CAS operations. This should reduce contention and, as a result, improve latency and lower CPU consumption. It is a well-known trick used in many projects that deal with concurrency.
In addition, this logic was refactored into a single place, because atomic updates of float64 values occur in several parts of the codebase.
Results
All results were collected on an AWS c7i.2xlarge instance.
main vs proposed for histograms:
The changes lead to a slight increase in latency in the single-threaded case but significantly decrease it in the contended case.
main vs proposed for summaries:
The same is true for summaries.
Some additional illustrations:
The gap between the implementations widens as the number of parallel/contending goroutines grows.
A side-by-side comparison of the backoff and no-backoff implementations is in the atomic_update_test.go file.
CPU profiles for
go test -bench='BenchmarkHistogramObserve*' -run=^# -count=6
old top:
new top:
Downsides
Everything comes at a price. In this case, time.Sleep may introduce an additional syscall (I am not sure about that).
Further improvements
Try runtime.Gosched() first and only then time.Sleep. I tried this, but the results did not look impressive.
@vesari
@ArthurSens
@bwplotka
@kakkoyun