What should a trace id and span id look like? #5

Open
codefromthecrypt opened this issue Sep 14, 2016 · 6 comments

Comments

@codefromthecrypt
Member

from a question: spring-cloud/spring-cloud-sleuth#400 (comment)

@codefromthecrypt
Member Author

codefromthecrypt commented Sep 14, 2016

B3 ids are fixed-length, lower-hex encoded values, e.g. "48485a3953bb6124". These are easier to copy/paste than numeric values or UUIDs, which contain hyphens. They are expected to be fully random. This is important because some samplers are probabilistic and assume each bit is equally likely to be set.

Traditionally, the start of a trace (root span) has the same value for trace id and span id. The root span has no parent id. Its child would share a trace id with its parent, but provision a new span id.

Ex. root span:

X-B3-TraceId: 48485a3953bb6124
X-B3-SpanId: 48485a3953bb6124

And its child

X-B3-TraceId: 48485a3953bb6124
X-B3-ParentSpanId: 48485a3953bb6124
X-B3-SpanId: 42e1e27066118385
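
To make the root/child relationship concrete, here is a minimal Python sketch, assuming ids come from the standard library's `secrets` module. The function names are made up for illustration and are not part of any tracer's API.

```python
import secrets


def new_id_64() -> str:
    """Return a random 64-bit id as 16 lower-hex characters."""
    return secrets.token_hex(8)


def root_headers() -> dict:
    """Root span: trace id and span id share one value, and there is no parent id."""
    span_id = new_id_64()
    return {"X-B3-TraceId": span_id, "X-B3-SpanId": span_id}


def child_headers(parent: dict) -> dict:
    """Child span: same trace id, parent's span id as parent id, fresh span id."""
    return {
        "X-B3-TraceId": parent["X-B3-TraceId"],
        "X-B3-ParentSpanId": parent["X-B3-SpanId"],
        "X-B3-SpanId": new_id_64(),
    }
```

Calling `child_headers(root_headers())` yields three headers shaped like the child example above.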

Since spans are contained within the namespace of a trace, and traces usually have on the order of hundreds of spans or fewer, there's little likelihood that a 64-bit span id will ever clash.
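
As a rough check on that likelihood, the birthday approximation p ≈ 1 - exp(-n(n-1) / 2^(bits+1)) estimates the chance of at least one clash among n fully random ids. The sketch below is only an estimate under that randomness assumption; the function name and the sample counts are illustrative, not measurements.

```python
import math


def collision_probability(n: int, bits: int) -> float:
    """Approximate chance that at least two of n random ids of the given width collide."""
    pairs = n * (n - 1) // 2          # number of id pairs that could clash
    x = pairs / 2 ** bits             # expected number of clashing pairs
    return -math.expm1(-x)            # 1 - exp(-x), computed accurately for tiny x


# 500 spans sharing one trace, 64-bit span ids: vanishingly small.
spans_in_one_trace = collision_probability(500, 64)

# A hypothetical one billion ids, at 64 bits versus 128 bits.
ids_64 = collision_probability(10**9, 64)
ids_128 = collision_probability(10**9, 128)
```

With these hypothetical counts, the 64-bit case at a billion ids comes out at a few percent, while the 128-bit case stays negligible.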

However, 64-bit trace identifiers can clash in high-traffic circumstances, such as client-originated traces (devices or cars, for example) or very high volume websites (like Twitter). For this reason, 128-bit support will be added for trace identifiers (via #1). When it is added, these identifiers will follow this convention:

  • the lower 64bits of the trace id will be the same as the root span id.

Ex. 128 bit root span:

X-B3-TraceId: 463ac35c9f6413ad48485a3953bb6124
X-B3-SpanId: 48485a3953bb6124

This allows a "compatibility mode": a system that chooses to look only at the lower 64 bits of a 128-bit trace id sees exactly the same ids as under prior practice. NOTE: At the time of this writing, 128-bit ids are not in use yet, and won't be until at least #1 is merged.
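
The convention can be sketched in Python as follows, assuming random ids from the standard library's `secrets` module. `root_headers_128` and `lower_64` are hypothetical helper names used only for illustration.

```python
import secrets


def root_headers_128() -> dict:
    """128-bit root span: the low 64 bits of the trace id equal the root span id."""
    span_id = secrets.token_hex(8)    # 16 lower-hex chars, reused as the low half
    high_bits = secrets.token_hex(8)  # 16 more lower-hex chars for the high half
    return {"X-B3-TraceId": high_bits + span_id, "X-B3-SpanId": span_id}


def lower_64(trace_id: str) -> str:
    """What a 64-bit-only system sees: the last 16 hex characters."""
    return trace_id[-16:]
```

Applied to the example above, `lower_64("463ac35c9f6413ad48485a3953bb6124")` gives `"48485a3953bb6124"`, which matches the root span id.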

@yurishkuro

@adriancole prepending high 64bits to X-B3-TraceId might break id parsing in the non-upgraded clients, which makes it quite hard to deploy somewhere where tracing is already rolled out, because 100s of microservices cannot all upgrade the client libraries overnight. Do you have any thoughts on a possible (incremental) upgrade path?

@codefromthecrypt
Member Author

So the node that starts the trace decides whether to use 128 bits or not.
What I described works when high bits of all zeros are not serialized and
the node that starts the trace chooses not to start a 128-bit trace.

The thinking is that users should do a wave of updates where tracers toss
the high bits of an X-B3-TraceId that is larger than 64 bits on ingest.
Once that's in, the node that starts traces can start them at 128 bits,
even if it is lossy on the other side. This is a trivial change in any
language, and a better alternative than permanently defining an additional
B3 trace id header.

This probably hints at an operations story where you do analysis on the
propagation mode of a tracer.
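
To illustrate how small the change is, here is a hedged Python sketch of a tolerant reader that tossing high bits on ingest would need, plus a writer that leaves out all-zero high bits. The function names and the strict 16-or-32 character validation are assumptions for illustration, not any tracer's actual behavior.

```python
LOWER_HEX = set("0123456789abcdef")


def ingest_trace_id(header_value: str) -> str:
    """Read X-B3-TraceId, keeping only the low 64 bits (last 16 hex chars)."""
    if len(header_value) not in (16, 32) or not set(header_value) <= LOWER_HEX:
        raise ValueError(f"malformed X-B3-TraceId: {header_value!r}")
    return header_value[-16:]  # toss the high bits, if any


def serialize_trace_id(high: int, low: int) -> str:
    """Write a trace id, omitting the high 64 bits when they are all zeros."""
    low_hex = format(low, "016x")
    if high == 0:
        return low_hex
    return format(high, "016x") + low_hex
```

A node upgraded only this far keeps joining traces started at 128 bits, though the high bits are lost when it propagates the id downstream.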


@codefromthecrypt
Member Author

The general tradeoff is this:

  • cost: time (e.g. large brownfield deployments like Twitter won't be
    able to flip the switch to 128-bit until Finagle updates to at least
    toss high bits)
  • benefit: one coherent user-level trace id (serialized the same in
    X-B3-TraceId, and used in API queries and JSON)

One mitigator of the time cost is that we are in open source and can
update tracers quite quickly, especially if only tossing high bits. Also,
the sooner we start, the sooner convergence starts. The later we start,
the later we get tolerant reading libraries out there, and the longer
convergence takes.

For example, the good thing about B3 being historically under-specified is
that many people have had to change their code in the last year. Many of
these people are still active in Zipkin and are able to upgrade their apps.
Also, client-originated traces are a novelty for many, so the deployment
challenge is patching servers for the most part. The same folks that
updated their servers to fix a B3 goof earlier this year can make a change
to toss or support longer trace ids.

@yurishkuro

Thanks. I had a different approach in mind, sending an alternative header in parallel with the old one during the transition, but I like your approach better. Both have a similar time horizon: you can't flip the switch until the first wave of upgrades reaches critical mass, and once the switch is flipped, the non-upgraded holdout services won't be able to parse the header and will start new traces.

@codefromthecrypt
Member Author

added #6 for tracking library updates to 128bit
