Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update data capture to match path pattern #60

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

shashank11p
Copy link
Contributor

@shashank11p shashank11p commented Apr 7, 2021

Update DataCapture to match path pattern and use repeated DataCapture in AgentConfig for multiple data capture rules.
This can be used to define custom data capture rules for specific paths.
For example: If a user does not want to capture any data only for a specific endpoint that matches a path pattern, they can set the fields in the data capture rule with this path pattern to false.

config.proto Outdated Show resolved Hide resolved
config.proto Outdated

// CustomDataCaptureEndpoint represents a custom data capture rule for an endpoint that matches
// the given url pattern and host header
message CustomDataCaptureEndpoint {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about calling it Endpoint. Rule is better I think? But I'm open to other names too.

config.proto Outdated
message CustomDataCaptureEndpoint {

// data_capture describes the data being captured by instrumentation for this endpoint
DataCapture data_capture = 1;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's nice to describe this in terms of agent's DataCapture config instead of just having a single flag like capture_payload. This has implication on what we need to propagate: hypertrace/specification#7. It will not just be a single 0-1 flag to propagate in the header. Can we propagate the DataCapture definition? Otherwise we can propagate a matched rule ID, which downstream modules can lookup its DataCapture config definition of. Maybe the latter is nicer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would avoid to mix capture and propagation here because that is overkill. Locally services know what to capture and I don't think upstream should override that.

config.proto Outdated
DataCapture data_capture = 1;

// url_pattern describes the url pattern to match for this endpoint
google.protobuf.StringValue url_pattern = 2;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should move path_pattern and host_header to a separate message. Something like Match (there must be a better name) which can be oneof different ways of matching against a span. One of them being UrlPathMatch which contains these two. This way the match behavior is more extensible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not do that. What identify the capture is the matching criteria, so having same settings for many matching criteria sounds more like a microoptimization. I'd keep it as it is.

config.proto Outdated
@@ -127,3 +130,17 @@ message JavaAgent {
// filter_jar_paths is the list of path to filter jars, separated by `,`
repeated google.protobuf.StringValue filter_jar_paths = 1;
}

// CustomDataCaptureEndpoint represents a custom data capture rule for an endpoint that matches
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear this overrides whatever you set in the DataCapture config? How do you resolve conflicting rules (e.g. 2 matching rules with different DataCapture)? All this must be specified here.

Copy link

@davexroth davexroth Apr 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking we shouldn't introduce a new top level message here at all. Instead we should move this into the existing DataCapture message. I propose something along the lines of:

message DataCapture {
    Message http_headers = 1;
    Message http_body = 2;
    Message rpc_metadata = 3;
    Message rpc_body = 4;
    google.protobuf.Int32Value body_max_size_bytes = 5;
    google.protobuf.StringValue path_match = 6;
    repeated google.protobuf.StringValue headers_match = 7; 
}

and then change

message AgentConfig {
...
    repeated DataCapture data_capture = 3;
...
}

I know changing data_capture to a repeated field is a breaking change, but it's not a widely deployed config yet, and will solve a lot of complexity in terms of defaults, precedence etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do this then we will not have a default at all and only have rules for the matches that are given in config. What about when there is no rule given in config for a path?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default config would be a rule which matches everything. For the case of no config, the default in the agent would be capture everything.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what you are proposing @davexroth is that if user does not pass a value for DataCapture (meaning empty list) we match everything with certain values for {http|grpc}_{headers|body}. I am curious of what would happen (it's possible I think) if the user wants to disable data capture for all, I am inclined to say I would just not pass anything and not expect a default to override that. I don't feel like a repeated is intuitive.

I would go for keeping the simple config for data capture and then a repeated message, same as DataCapture but also adding a path_regex_matcher and header_regex_matcher.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jcchavezs I think disabling all still works pretty well with repeated. If path_match and headers_match are unspecified it means it will match all spans so disabling all would be specifying all headers and body message fields to false and not specifying path_match and headers_match.

Copy link
Contributor

@jcchavezs jcchavezs Apr 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So to be clear, what we are saying is that if the user does not pass any data capture we apply the default (capture all on everything) and if the user wants to disable data capture for everything then they have to create a data capture rule with no regex and everything in off which in practice is a empty object (at least in go, zero values are false), I think that is more confusing.

I honestly would prefer to be explicit on everything so having one DefaultDataCapture (which user can set or coming from the defaults) and then SpecificDataCapture there passing a regex for either path or header is mandatory.

Can we get a simple example using this branch on how that would look like in code? I think that is the best to decide what option to take.

Back to the question about what if we have many matching, we can describe that the ordering matters so when there is a match all the rest is discarded. This means we need to have a "order" concept in the UI and probably a way to reorder them. BTW ordering is respected by proto (see https://developers.google.com/protocol-buffers/docs/proto3#specifying_field_rules)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is an example in javaagent hypertrace/javaagent#301

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not sure about requiring two messages, one for DefaultDataCapture and one for SpecificDataCapture - as it's the same thing.

If there's a feeling that being explicit is required, we could have two fields, default_data_capture and data_capture even though this can still be achieved in one field, and I'm not sure if it really add that's much clarity, and adds more combinations to worry about - what happens if data_capturee is defined but default_data_capture isn't. Having two messages has the same problem.

Even with two fields or messages, the agent still need to define what the default behavior is if the default field/message isn't defined.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default_data_capture does not have to be defined by the user because the agents generate it, The spec defines what the default behavior is.

config.proto Outdated

// CustomDataCaptureEndpoint represents a custom data capture rule for an endpoint that matches
// the given url pattern and host header
message CustomDataCaptureEndpoint {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about PerEndpointDataCapture?

config.proto Outdated
DataCapture data_capture = 1;

// url_pattern describes the url pattern to match for this endpoint
google.protobuf.StringValue url_pattern = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How will this work for RPC?

ENV_VARS.md Outdated Show resolved Hide resolved
@shashank11p shashank11p changed the title Add custom data capture endpoints Update data capture to match path pattern Apr 9, 2021
// The first rule in config that matches for a path should be used if multiple rules match.
// Therefore the default rule (if present) should be the last rule.
// If no default rule is given, the default behaviour should be to allow data capture for all elements
repeated DataCapture data_capture = 3;
Copy link
Member

@pavolloffay pavolloffay Apr 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed feelings from the above description. It's not vague per se, but not elegant either.

I would keep the original data_capture as the default one and add a new field e.g. endpoint_data_capture of a new type

message EndpointDataCapture{
   DataCapture data_capture = 1;
  // endpoint 
  // host 
  // probably also the sampling config 
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will also need whitelisted headers in the config https://github.com/hypertrace/specification/blob/main/agent/tracestate.md#whitelisted-headers. This should probably go to DataCapture

Copy link

@ryanericson ryanericson Apr 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's take allowlisting headers separately, for the purpose of core mode processing it's even ok for it to be not configurable at the moment. I've opened https://traceableai.atlassian.net/browse/ENG-8716 to track it though.

@pavolloffay
Copy link
Member

pavolloffay commented Apr 13, 2021

Two examples of config

service_name: emojis
propagation_formats:
  - B3
  - TRACECONTEXT
 reporting:
   endpoint: traceable-agent.traceableai:4317/
   trace_reporter_type: OTLP
data_capture: // the default case
    http_headers: false
endpoint_data_capture: // the override for a single endpoint
   - data_capture: 
        http_body: false 
     endpoint: /foo/bar

A single

service_name: emojis
propagation_formats:
  - B3
  - TRACECONTEXT
 reporting:
   endpoint: traceable-agent.traceableai:4317/
   trace_reporter_type: OTLP
data_capture:
   - data_capture: // the override for a single endpoint
        http_body: false
      endpoint: /foo/bar
   - data_capture: // the dafault case
      http_headers: false

@pavolloffay pavolloffay self-requested a review April 13, 2021 10:50
@davexroth
Copy link

It's a personal preference, but I think the combined case reads more intuitively.

With the first example, there's now two concepts a user needs to know - data_capture and endpoint_data_capture. How are these different? What's the implication of choosing one over the other? The concept of an endpoint isn't really something any of the agents have had to date.

Also, at an implementation level, now you have two types you need to pass around, meaning that all the code that deals with data capture needs to handle both cases, leading to potentially duplicated code and bugs where the developer doesn't realize they need to handle both types of data_capture configuration.

@ryanericson
Copy link

My preference is to just have repeated DataConfiguration. It's generic and therefore extensible in the future if we want to provide overriding data capture config based on something other than endpoint e.g. exit call, sql, etc.
I see it as the one place for data capture config, default, endpoint/url based or any kind of override in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants