Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instrument gRPC server span status codes #1127

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

damemi
Copy link
Contributor

@damemi damemi commented Sep 25, 2024

Ref #172

@damemi damemi requested a review from a team as a code owner September 25, 2024 14:37
@damemi damemi force-pushed the grpc-server-status-code branch 2 times, most recently from 637da87 to 0863a1e Compare September 25, 2024 14:43
void *stream_ptr = get_argument(ctx, 2);
u32 stream_id = 0;
bpf_probe_read(&stream_id, sizeof(stream_id), (void *)(stream_ptr + stream_id_pos));
struct grpc_request_t *grpcReq = bpf_map_lookup_elem(&streamid_to_grpc_events, &stream_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be better to look here at the grpc_events map.
If I understand this correctly, this probe should be called between the entry probe of handleStream (where an entry is created for the stream in the grpc_events map) and the return probe of handleStream.
If we can't find an entry it means the entry probe didn't run properly or didn't run at all, hence in that case we won't have a valid start time. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question. If handleStream isn't called (say the server hits an error before that) then we might still want to record the status from WriteStatus. Maybe in that case, this probe should check if an event already exists in grpc_events and if not start a new span with the status?

It seems like WriteStatus might be used in places besides handleStream (SendMsg and RecvMsg are 2 instances I found). Do you think these are relevant? If not then I agree with your suggestion

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that depends on the RPC life cycle the application is using.
The creation of a new stream can be called on the initialization phase of the instrumented process, that is also a case where we'll miss the entry probe of handleStream.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The creation of a new stream can be called on the initialization phase of the instrumented process, that is also a case where we'll miss the entry probe of handleStream

That sounds like a case where we would want to create a new trace in the WriteStatus probe, is that what you mean?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure. We won't have the RPC method in that case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WriteStatus does have the same Stream argument that is passed to handleStream, so we could try to get the RPC method from there. I just pushed an update that does that, ptal. I think this would be neat to cover the cases where we miss the original initialization like you mentioned. If not I'm all for just simplifying this down

return 0;
}

UPROBE_RETURN(http2Server_WriteStatus, struct grpc_request_t, grpc_events, events, 2, stream_ctx_pos, false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about this return probe, since we already have UPROBE_RETURN(server_handleStream...).
The UPROBE_RETURN macro sends the event to user-space so I don't think we need to have it twice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants