RFC: Transition to 128bit trace ids (still 64bit span ids) #1262
spamming some folks of interest, of course not exhaustive @nicmunroe @felixbarny @shakuzen @yurishkuro @jcarres-mdsol @abesto @eirslett @kristofa @michaelsembwever @kevinoliver @mosesn @anuraaga @marcingrzejszczak @prat0318 @devinsba @basvanbeek @schlosna @ewhauser @klingerf @bogdandrutu @clehene |
One idea to bridge this in the current model is to add another u64 field: traceIdHigh. This would be somewhat easy to implement, and we can default it to zero. When 128-bit ids are used, they would be split between traceIdHigh and traceId (in data structures). In storage, they can either be concatenated or stored separately. |
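For illustration, a minimal sketch (hypothetical class and method names, not Zipkin code) of how a 128-bit id could be split between traceIdHigh and traceId, with the high part defaulting to zero:

```java
import java.math.BigInteger;

/** Hypothetical helper showing the traceIdHigh/traceId split described above. */
class TraceId128 {
  final long traceIdHigh; // upper 64 bits; zero when the id is only 64 bits wide
  final long traceId;     // lower 64 bits; the field that exists today

  TraceId128(long traceIdHigh, long traceId) {
    this.traceIdHigh = traceIdHigh;
    this.traceId = traceId;
  }

  /** Parses a 16- or 32-character lower-hex id; shorter ids leave the high field at zero. */
  static TraceId128 fromLowerHex(String hex) {
    if (hex.length() == 32) {
      long high = new BigInteger(hex.substring(0, 16), 16).longValue();
      long low = new BigInteger(hex.substring(16), 16).longValue();
      return new TraceId128(high, low);
    }
    return new TraceId128(0L, new BigInteger(hex, 16).longValue());
  }
}
```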
actually, I don't like traceIdHigh as it would leave identifiers in a transitional state (and make querying awkward). I think it would be best to obviate the old traceId field with a wider one rather than append to it. |
I like this–in practice we run into collisions frequently. I'm not sure I like increasing the size of the thrift frame. What would you estimate the overall increase in storage size would be for TBinary and for TCompact? |
+1 from me as we have had to go from 128-bit to 64-bit trace IDs when pushing internally traced things into Zipkin |
Interop with wingtips should work out of the box since trace IDs are modeled internally as strings, and we can add some config to wingtips to auto-generate trace IDs as 128 bit if desired by the user. So no concerns from me. 👍 |
@adriancole do you have an alternative proposal for thrift? |
I didn't mean to imply not using thrift. I meant adding another field with 11: optional binary traceId128 |
For example, the trace context (SpanId) field would need to look at the …
At the moment, we only use big-endian TBinaryProtocol. This simplifies a …
If we go for a separate "binary traceId128" field, we add 26 bytes per thrift (3 …
Since no-one is likely indexing on thrift offsets, traceIdHigh might not be …
Assuming we'd need a separate index (minimally +16 bytes), I'd say the …
on the topic of TBinaryProtocol vs TCompactProtocol … |
I do not know how realistic the chances of clashing in a 2**64 space are. But most other systems are moving to 128-bit uuids, so for the sake of uniformity this sounds like a good change to me |
incidentally, just peeked at a work-in-progress for GRPC trace propagation. They are using 2 64bit fields to capture the 128bit trace id.
An aside, but they don't share the same span id across RPC calls, so only need to propagate the parent (as opposed to the parent and the current id)
|
Yes, I did mean an extension to the current thrift. String sounds rather wasteful. |
@yurishkuro there is no encoded string type in thrift. string (TType 11) fields are binary |
ps on high-low: unless I'm missing something, there's effectively no difference between traceIdHigh = 0 and traceIdHigh = null, so even if the field is optional in IDL, I'd just make it default to zero (ex in java). The reason is that it is only used when traceId is set, and if you are packing 128 bits, shifting 0 into the high side is the same as shifting nothing into the high side. |
What about these?

    struct BinaryAnnotation {
      1: string key,
      2: binary value,
      ...
    }

In the generated classes those will be … |
@yurishkuro didn't mean to distract the issue. you are right that in thrift IDL there's a "binary" compiler hint that says the STRING field is being used for opaque bytes. I changed my example above. |
Just for reference, Stackdriver Trace uses a 32-byte string in their protobuf API. 16 bytes wasted, but compared to the rest of the span, maybe not a big deal. https://cloud.google.com/trace/api/reference/rpc/google.devtools.cloudtrace.v1 I tend to have a preference for readable IDs since they're used very often in debugging, so I find the string type appropriate for it, but I can also see arguments for wanting to optimize the data over the wire. |
Yeah, I'm beginning to lean toward hi/lo. |
cool thing in thrift is we can actually rename traceId to lo without breaking the wire format (field names aren't serialized, only their ids)
|
I haven't seen anyone against this, so here's a proposal:
In thrift and zipkin.Span, add a field.. this can happen immediately
Make all of the below that accept trace ids check length and prefer the lower 64 bits
Change all of the storage backends to use a 128 bit traceId index. Pad 64 bit trace ids to 128 bit on write. Toss higher bits on read (see the sketch below).
After all the above is complete.. we can start writing the longer ids as a matter of course.
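To make the storage step concrete, here's a hedged sketch (assumed helper names) of padding on write and tossing high bits on read:

```java
// Illustrative helpers for the pad-on-write / truncate-on-read steps above.
class TraceIdIndex {
  /** Pads a 64-bit lower-hex id to the 32-character form a 128-bit index column would store. */
  static String padTo128(String lowerHex) {
    return lowerHex.length() == 32 ? lowerHex : "0000000000000000" + lowerHex;
  }

  /** Tosses the higher bits so callers that only understand 64-bit ids keep working. */
  static String truncateTo64(String lowerHex) {
    return lowerHex.length() <= 16 ? lowerHex : lowerHex.substring(lowerHex.length() - 16);
  }
}
```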
|
In Cloud Foundry, we already use a 128-bit trace id, in uuid format. To add support for Zipkin, we're having to add a second trace id because zipkin only supports a 64-bit id. It would be preferable to only have one trace id. |
Note this is for 128 bit ids, not uuid (which are not 128 bits of randomness). This design is focused on permitting 128-bit fully random ids, not the uuid shape. Zipkin has binary annotations (key,value) where different-shaped ids can be recorded. |
Probably worth calling out in general that the text form of ids is almost always different from storage, except where the store is json (ex elasticsearch).
So when we talk about traceIdHigh, it is the thrift binary form (which most won't ever see) and is far smaller. The second field is better for compatibility. Also, many dbs will store keys as fixed-width binary and defer to functions to render them in hex as needed.
In the http and json world, most will only know about the lowerhex form, being blissfully unaware of anything else. They will see a lowerhex id like they do now, and later one up to twice as long.
That's the working design anyway.
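As an illustration of the fixed-width binary point (an assumption about how a store might lay out the key, not any particular backend):

```java
import java.nio.ByteBuffer;

// Stores can keep the key as fixed-width bytes and render hex only for display.
class FixedWidthKey {
  static byte[] toBytes(long traceIdHigh, long traceId) {
    return ByteBuffer.allocate(16).putLong(traceIdHigh).putLong(traceId).array(); // big-endian, 16 bytes
  }
}
```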
|
One more thing, as the question of log correlation always comes up.
For offline correlation, any different-shaped id (or simply a different id) needs to be written once per trace. B3 doesn't define a log correlation key name.
For example, say I am a frontend and I have chosen to correlate logs with the name cfTraceId and a value that is mostly 128-bit, in UUID form. I would write that field as-is once to storage.. maybe I also write the sampling method, too.
Ex span.addBinaryAnnotation("cfTraceId", "abcdef-....")
span.addBinaryAnnotation("sampleRate", "0.1")
Users don't interact with B3, and shouldn't know it intimately if we do our jobs right. In fact a future B4 will almost certainly not separate trace id into a separate header!
Anyway, in my user manual, I tell users to search for cfTraceId if I want them to search with UUIDs as opposed to the raw zipkin traceId. Ex in the zipkin search panel: cfTraceId=abcded-... In my impl I make sure logs are using that key in their MDC etc.
As long as this is logged once, it will return unambiguously a single trace.
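A hedged sketch of the above (the Span interface here is hypothetical, standing in for whatever client API is in use):

```java
// Hypothetical tracer API, only to illustrate writing the correlation key once per trace.
interface Span {
  void addBinaryAnnotation(String key, String value);
}

class LogCorrelation {
  /** Tags the root span once so a search on cfTraceId=<uuid> unambiguously returns the trace. */
  static void correlate(Span rootSpan, java.util.UUID cfTraceId, double sampleRate) {
    rootSpan.addBinaryAnnotation("cfTraceId", cfTraceId.toString());
    rootSpan.addBinaryAnnotation("sampleRate", Double.toString(sampleRate));
  }
}
```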
|
So I will open a new issue on Monday for implementation, and at that point I will summarize things I've mentioned for docs.
Here's one thing that might not be obvious. Currently, zipkin trace ids are logged as fixed 16-character lowerhex in json. When you copy/paste this id into something with full-text search, it comes back. When we widen that to 128 bit and start encoding, we might want to still encode ids with high bits unset as 16 chars, as opposed to padding to 32.
Ex. Instead of one with traceId '12ab3..4' and an upgraded one with traceId '0000...12ab..34', use 16-char lowerhex encoding. That way, those who currently do full-text search on zipkin ids are likely to copy/paste a value that works even in a system not fully upgraded.
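A minimal sketch of that rule, assuming the hi/lo split from earlier in the thread:

```java
// Sketch of the encoding rule above: 16 characters when the high bits are unset, otherwise 32.
class TraceIdEncoding {
  static String toLowerHex(long traceIdHigh, long traceId) {
    if (traceIdHigh == 0L) return padLeft(Long.toHexString(traceId));
    return padLeft(Long.toHexString(traceIdHigh)) + padLeft(Long.toHexString(traceId));
  }

  static String padLeft(String hex) {
    StringBuilder result = new StringBuilder();
    for (int i = hex.length(); i < 16; i++) result.append('0');
    return result.append(hex).toString();
  }
}
```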
If we did this, the cool part is that the edge (the originator of a trace, aka the tracer that makes a root trace) makes the final decision on whether or not 128-bit ids are supported. Ex if it only sends 64 bits of id, the system will work on the brownfield. When ready, that edge tracer flips a switch and now 128-bit ids are used.
Thoughts?
|
Sorry, one more thing I should have said in the beginning.. the audience of …
Keep in mind that while most who want to integrate with Zipkin use B3 …
For example, Jaeger can log to zipkin, but uses a single concatenated …
It is important to note that while we are discussing intimate details of …
So, there are three audiences for this change.. |
There is a sub-group of the above who want to know how to do log correlation. |
here's a work-in-progress for a java library that does 128bit B3 propagation based on finagle/brave. It doesn't yet do http headers. Note that thrift or json serialization isn't in scope of propagation. |
Added some notes about how 128bit ids will look in practice openzipkin/b3-propagation#5 |
This bridges support for 128bit trace ids by not failing on their receipt. Basically, this throws out any hex characters to the left of the 16 needed for the 64bit trace id. See #1262
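Roughly, the behavior looks like this (illustrative helper, not the actual patch):

```java
// Sketch of the bridge behavior: hex characters left of the rightmost 16 are thrown out on receipt.
class LenientTraceId {
  static long lower64(String traceIdHex) {
    String lower = traceIdHex.length() <= 16
        ? traceIdHex
        : traceIdHex.substring(traceIdHex.length() - 16);
    return Long.parseUnsignedLong(lower, 16); // JDK 8+
  }
}
```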
moving to implementation in #1298 |
fyi, there's a tentative understanding that sampling (when traceId is a function) will only consider the lower 64-bits. The assumption is that the resolution is good enough and it prevents us from needing to redefine samplers. (recommended by @basvanbeek and I agree) |
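To show what "only consider the lower 64-bits" could mean in practice (a sketch, not the actual Brave or Finagle sampler):

```java
// Illustrative only: a deterministic sampler keyed on the lower 64 bits, ignoring traceIdHigh.
class Lower64Sampler {
  static boolean isSampled(long traceIdHigh, long traceId, float rate) {
    // traceIdHigh is deliberately unused: 64 bits give enough resolution, and existing
    // samplers keyed on the original traceId field keep working unchanged.
    long bound = (long) (rate * Long.MAX_VALUE);
    return Math.abs(traceId) < bound; // note: Math.abs(Long.MIN_VALUE) stays negative, so that id is never sampled
  }
}
```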
throwing something out there.. what if we adopted a default behavior to make the 128bit trace ID exactly how Amazon X-Ray does? In this case, it would be possible to parse into an X-Ray ID, and still not impact us (as folks only sample lower 64 anyway). Ex. first 8 hex are epoch seconds, next 24 are random. I suspect the cost would be time to get the epoch secs. Also, we'd not know if the trace was generated like this (in order to parse it), but anyway, there'd be a chance. Worth it? Not worth it? |
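For concreteness, a sketch of that generation scheme (assuming the hi/lo fields; the layout is the one described above, not code from X-Ray):

```java
import java.util.concurrent.ThreadLocalRandom;

// First 8 hex chars (32 bits) are epoch seconds, the remaining 96 bits are random.
class XRayStyleTraceId {
  static long[] next() { // returns {traceIdHigh, traceId}
    ThreadLocalRandom random = ThreadLocalRandom.current();
    long epochSeconds = System.currentTimeMillis() / 1000L;
    long traceIdHigh = (epochSeconds << 32) | (random.nextInt() & 0xffffffffL);
    long traceId = random.nextLong();
    return new long[] {traceIdHigh, traceId};
  }
}
```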
Intriguing. |
Interesting, however I would rather think about adding a new layer for ID generation, where one implementation would be AWS X-Ray style, instead of adopting it from them. What do you think? |
Do Amazon docs explain why recording the time is useful? |
@yurishkuro I think the main reason was something like this:
|
That sounds like an odd reason. The collection pipeline already knows when it receives the spans and can establish a TTL from that point, passing the timestamp in the headers doesn't add much there. There must be some other reason. |
@yurishkuro when I asked why they have the timestamp in the id, given they already have timestamps in the Span, they said something like this: they want to make decisions (analysis, keeping or not keeping the span, maybe where to store it; there can be nice improvements based on this, e.g. a way to store the latest 1h of traces and then re-index, etc.) based on when the trace started. |
I see. Makes sense. Thanks. |
In the case of Zipkin, there is a large number of users on Amazon, and a lot of the time we generate 128-bit ids w/o any consistency anyway. Some copy/paste the 64bit, etc. The rationale I was told by Abhishek is that otherwise the backend will reject the trace. The timestamp is used to store the data and enable even partitions in the backend. So, basically not using 32 of the 128 bits for the timestamp has a severe penalty if trying to use X-Ray propagation. 96 bits of random is great for probably 99.9% of deployments, and also doesn't impact sampling as most only look at the lower 64 bits anyway.
If we had great support for a different way of 128-bit trace ID generation, I'd not suggest something like this, but as we don't, it seems a very good reason to recommend and track one. Of course folks can make trace ID generation pluggable, but I think the tradeoff wins in favor of making things align with the largest cloud.
Taking this forward, for example, I'd open a ticket on B3 with a suggested generation format for 128-bit trace IDs, with this rationale, and whoever wants to do this could originate trace IDs that Amazon won't toss. Remember, it isn't just X-Ray, it is also ALB, API Gateway and other products that use the same ID format (if only for log correlation). This means even if users never use X-Ray storage (which some sites can choose to, as it is possible to convert zipkin data to X-Ray data), at least propagation will pass and you can ID-correlate and pull data in a pinch.
I think of this as tactical, as it is easy to do and can be done even today. It allows us to tunnel AWS IDs through clients written for B3. Trace-Context stuff is more strategic as there's more going on there. This sort of integration shows almost exactly the pattern needed for trace-context, too.
Does anyone not want me to suggest why a tracer might put a timestamp in the first 32 of 128 bits? If I did, I'd mention it isn't fully random etc, and also maybe some sanity-check tricks. Personally, I think not telling people about this is more damaging than telling them, as there's no easier path to Amazon integration (at least that I can think of). |
Sounds awesome to me. I see no problem following their format; for better or worse AWS is a big part of internet traffic, and making your system work well with theirs will be a win-win for everyone |
@jcarres-mdsol so I have been toying around and hopefully by the end of the day I can verify zipkin and x-ray interop works (a prereq to actually suggesting this in more than a hypothetical way). Will share as soon as done, but I see no road-blocks.
Other notes: some concern was raised by @jcchavezs about encoding sensitivity. Here's the response to this. Amazon's format is version stamped. If the version isn't one, we can't continue the trace anyway (until that version is understood). We have choices, including adding a tag about the linked ID before the trace is restarted. Amazon have a very good api compat track record. I am in no way concerned about X-Ray trace format v1 disappearing next year. We don't know what v2 will be anyway, and who knows, it might be the …
So, basically there are two parts.. the easy one is just using a different algo for root IDs. The second part is for AWS interop... if a library must use AWS-compat IDs, then it needs to restart traces which don't include valid epoch seconds. The second part only impacts sites running on AWS, and can be handled by an optional tracer plugin. |
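A sketch of the "valid epoch seconds" check mentioned above (the acceptance window is an assumption, not documented X-Ray behavior):

```java
// If the first 32 bits don't decode to a recent epoch timestamp, an AWS-compat tracer
// would restart the trace rather than propagate the incoming id.
class XRayCompatCheck {
  static boolean hasPlausibleEpoch(long traceIdHigh, long nowMillis) {
    long epochSeconds = traceIdHigh >>> 32;   // the first 8 hex characters
    long nowSeconds = nowMillis / 1000L;
    long maxAgeSeconds = 30L * 24 * 60 * 60;  // assumed acceptance window, not an X-Ray constant
    return epochSeconds <= nowSeconds && nowSeconds - epochSeconds <= maxAgeSeconds;
  }
}
```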
@adriancole so what kind of interop are you targeting, a Zipkin-instrumented system talking to AWS SaaS? |
moving the conversation here now. hope you don't mind!
#1754
|
Many times over the last years, we've had folks ask about 128 bit trace ids.
There have been a few use-cases mentioned...
Doing this comes at a cost of figuring out how to do interop, but that's well-charted territory:
Practically speaking we'd need to do the following to make this possible:
Provided we don't creep scope to also change span and parent id encoding, I think we can actually do this, and end up with an incremental win on the model that can help us bridge into systems that use wider ids increasingly often.