Document interoperability for OpenTelemetry sampling specifications #4243

Open
jmacd opened this issue Oct 3, 2024 · 1 comment
Labels
sig-issue spec:trace Related to the specification/trace directory

Comments

@jmacd
Contributor

jmacd commented Oct 3, 2024

What are you trying to achieve?

OpenTelemetry has pending tracing specification work in #4162 and #4166. These specification documents give vendors with existing sampling systems no detail on how to interoperate with them.

What did you expect to see?

We have one existing prototype that shows a viable approach, and this approach could be documented in the form of a supplementary guideline.

In this example, the Probabilistic Sampler processor has this snippet of code:

func randomnessFromBytes(b []byte, hashSeed uint32) sampling.Randomness {
	hashed32 := computeHash(b, hashSeed)
	hashed := uint64(hashed32 & bitMaskHashBuckets)

	// Ordinarily, hashed is compared against an acceptance
	// threshold i.e., sampled when hashed < scaledSamplerate,
	// which has the form R < T with T in [1, 2^14] and
	// R in [0, 2^14-1].
	//
	// Here, modify R to R' and T to T', so that the sampling
	// equation has identical form to the specification, i.e., T'
	// <= R', using:
	//
	//   T' = numHashBuckets-T
	//   R' = numHashBuckets-1-R
	//
	// As a result, R' has the correct most-significant 14 bits to
	// use in an R-value.
	rprime14 := numHashBuckets - 1 - hashed

	// There are 18 unused bits from the FNV hash function.
	unused18 := uint64(hashed32 >> (32 - numHashBucketsLg2))
	mixed28 := unused18 ^ (unused18 << 10)

	// The 56 bit quantity here consists of, most- to least-significant:
	// - 14 bits: R' = numHashBuckets - 1 - hashed
	// - 28 bits: mixture of unused 18 bits
	// - 14 bits: original `hashed`.
	rnd56 := (rprime14 << 42) | (mixed28 << 14) | hashed

	// Note: by construction:
	// - OTel samplers make the same probabilistic decision with this r-value,
	// - only 14 out of 56 bits are used in the sampling decision,
	// - there are only 32 actual random bits.
	rnd, _ := sampling.UnsignedToRandomness(rnd56)
	return rnd
}

To address this issue, we should document and vet the process used above, which synthesizes an explicit randomness value to justify an alternative consistent sampling decision. In the legacy sampler,

  1. a 32-bit FNV hash is computed,
  2. the least-significant 14 bits of the hash form the value R in [0, 2^14-1],
  3. the sampling probability is scaled to an acceptance threshold T in [1, 2^14],
  4. ShouldSample() is defined as R < T.
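
The legacy decision described in these four steps can be sketched as follows. This is a minimal illustration, not the processor's actual code: `hash14`, `scaledThreshold`, and `legacyShouldSample` are hypothetical names, and the choice of FNV-1a with the seed written before the trace ID is an assumption, since `computeHash` is not shown in this issue.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

const (
	numHashBucketsLg2  = 14
	numHashBuckets     = 1 << numHashBucketsLg2 // 2^14
	bitMaskHashBuckets = numHashBuckets - 1
)

// hash14 is a hypothetical stand-in for computeHash followed by masking.
// Assumption: FNV-1a over the seed bytes and then the ID bytes.
func hash14(id []byte, seed uint32) uint32 {
	var seedBytes [4]byte
	binary.LittleEndian.PutUint32(seedBytes[:], seed)
	h := fnv.New32a()
	h.Write(seedBytes[:]) // Write on a hash.Hash never returns an error
	h.Write(id)
	return h.Sum32() & bitMaskHashBuckets // keep the least-significant 14 bits
}

// scaledThreshold maps a probability onto T in [1, 2^14].
func scaledThreshold(probability float64) uint32 {
	t := probability * numHashBuckets
	if t < 1 {
		return 1
	}
	if t > numHashBuckets {
		return numHashBuckets
	}
	return uint32(t)
}

// legacyShouldSample applies the legacy rule: sample when the 14-bit
// hash is strictly below the scaled sampling probability.
func legacyShouldSample(id []byte, seed uint32, probability float64) bool {
	return hash14(id, seed) < scaledThreshold(probability)
}

func main() {
	traceID := []byte{0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08}
	fmt.Println("sampled at 25%:", legacyShouldSample(traceID, 0, 0.25))
}
```

Because the hash never exceeds 2^14-1, a probability of 1.0 (scaled to 2^14) always samples, which is the boundary case the inverted form has to preserve.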

In the interoperability logic shown above,

  1. The acceptance threshold T is inverted into a rejection threshold, T' = 2^14 - T.
  2. The 14-bit hash value R is inverted into R' = 2^14 - 1 - R, which becomes the most-significant 14 bits of an explicit randomness value.
  3. The least-significant 42 bits of the explicit randomness value are filled with consistent pseudo-randomness from the original hash value.
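
One way to vet the inversion is to check exhaustively that it preserves every decision. The sketch below assumes the 14-bit ranges given in the code comments (R in [0, 2^14-1], T in [1, 2^14]) and uses hypothetical helper names. It also checks that the contents of the low 42 bits cannot change the outcome, because the widened threshold has zeros in those positions.

```go
package main

import "fmt"

const (
	numHashBucketsLg2 = 14
	numHashBuckets    = uint64(1) << numHashBucketsLg2 // 2^14
	lowBits           = 56 - numHashBucketsLg2         // 42 filler bits
)

// legacyDecision is the original form: sample when R < T.
func legacyDecision(r, t uint64) bool { return r < t }

// invert maps (R, T) to (R', T') as described in the code comments.
func invert(r, t uint64) (rPrime, tPrime uint64) {
	return numHashBuckets - 1 - r, numHashBuckets - t
}

// otelDecision is the specification's form: sample when T' <= R'.
func otelDecision(randomness, threshold uint64) bool {
	return threshold <= randomness
}

// countMismatches compares both forms for every (R, T) pair, with the
// low 42 bits of the randomness set to all zeros and to all ones.
func countMismatches() int {
	lowMask := uint64(1)<<lowBits - 1
	mismatches := 0
	for t := uint64(1); t <= numHashBuckets; t++ {
		for r := uint64(0); r < numHashBuckets; r++ {
			rPrime, tPrime := invert(r, t)
			want := legacyDecision(r, t)
			threshold56 := tPrime << lowBits
			if otelDecision(rPrime<<lowBits, threshold56) != want {
				mismatches++
			}
			if otelDecision(rPrime<<lowBits|lowMask, threshold56) != want {
				mismatches++
			}
		}
	}
	return mismatches
}

func main() {
	fmt.Println("mismatches:", countMismatches())
}
```

The check follows from simple algebra: R < T is the same as R <= T-1, and subtracting both sides from 2^14-1 gives 2^14-T <= 2^14-1-R, that is, T' <= R'.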

Later in the code path

  • The corresponding OTel sampling (rejection) threshold value is formed
  • Both the randomness and threshold values are encoded in TraceState.
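
The encoding step can be sketched as below. This does not use the collector's `sampling` package; the function names are hypothetical. It assumes the TraceState format from the OpenTelemetry probability sampling specification as I understand it: an `ot` entry with `th` (threshold, up to 14 hex digits with trailing zeros removed) and `rv` (randomness, exactly 14 hex digits).

```go
package main

import (
	"fmt"
	"strings"
)

// encodeThreshold renders a 56-bit rejection threshold as hex with
// trailing zeros removed. Precondition: t56 < 2^56.
func encodeThreshold(t56 uint64) string {
	s := strings.TrimRight(fmt.Sprintf("%014x", t56), "0")
	if s == "" {
		return "0" // a zero threshold rejects nothing (100% sampling)
	}
	return s
}

// encodeRandomness renders a 56-bit randomness value as exactly
// 14 hex digits. Precondition: r56 < 2^56.
func encodeRandomness(r56 uint64) string {
	return fmt.Sprintf("%014x", r56)
}

// otTraceState builds the "ot" TraceState member from both values.
func otTraceState(r56, t56 uint64) string {
	return "ot=th:" + encodeThreshold(t56) + ";rv:" + encodeRandomness(r56)
}

func main() {
	// Legacy T = 2^13 (50%) inverts to T' = 2^14 - 2^13 = 2^13,
	// which is widened to 56 bits by shifting left 42 bits.
	tPrime := uint64(1<<14 - 1<<13)
	rnd56 := uint64(0x3fff)<<42 | 0x1234
	fmt.Println(otTraceState(rnd56, tPrime<<42))
}
```

Under these assumptions a legacy 50% setting yields a one-digit threshold, because only the top hex digit of the widened T' is non-zero.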

This logic is enabled when the incoming TraceState carries no existing OpenTelemetry sampling fields. The sampler reports an error when the arriving TraceState already has OpenTelemetry sampling information (a threshold or an explicit randomness value).
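
A guard of this kind might look like the following sketch. The function names and the error value are hypothetical, and the parsing is deliberately simplified: it assumes comma-separated `key=value` members and an `ot` member whose value is a `;`-separated list of `name:value` fields.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errAlreadySampled is a hypothetical error for this sketch.
var errAlreadySampled = errors.New("tracestate already carries OpenTelemetry sampling fields")

// hasOTelSamplingFields reports whether the "ot" member of a
// tracestate header contains a threshold (th) or randomness (rv).
func hasOTelSamplingFields(tracestate string) bool {
	for _, member := range strings.Split(tracestate, ",") {
		key, value, ok := strings.Cut(strings.TrimSpace(member), "=")
		if !ok || key != "ot" {
			continue
		}
		for _, field := range strings.Split(value, ";") {
			name, _, _ := strings.Cut(field, ":")
			if name == "th" || name == "rv" {
				return true
			}
		}
	}
	return false
}

// checkIncoming returns an error when the synthesized randomness
// would conflict with sampling information that is already present.
func checkIncoming(tracestate string) error {
	if hasOTelSamplingFields(tracestate) {
		return errAlreadySampled
	}
	return nil
}

func main() {
	fmt.Println(checkIncoming("vendor=abc"))
	fmt.Println(checkIncoming("vendor=abc, ot=th:8"))
}
```

Refusing to overwrite existing fields keeps the synthesized randomness from contradicting a decision that an upstream sampler has already recorded.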

@danielgblanco
Contributor

@jmacd is this something that you'd be willing to sponsor as part of the Sampling SIG?


3 participants