chore: sanitize http.url attribute values #3487

Closed

Conversation

daniel-white

Which problem is this PR solving?

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

I added a map from semantic attribute keys to value-sanitizing functions. This also works for one-off values.
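
A minimal sketch of the approach described above, assuming a lookup keyed by attribute name. `sanitizers`, `stripCredentials`, and the simplified `AttributeValue` type are illustrative stand-ins, not the code in this PR; `sanitizeAttribute` reuses the name seen in the tests under review, but its body here is an assumption.

```typescript
// Simplified stand-in for the attribute value type (assumption, not the real API type).
type AttributeValue =
  | string
  | number
  | boolean
  | Array<string | number | boolean | null | undefined>;

type Sanitizer = (value: AttributeValue) => AttributeValue;

// Remove the userinfo part of a URL string; leave every other value untouched.
function stripCredentials(value: AttributeValue): AttributeValue {
  if (typeof value !== 'string') {
    return value;
  }
  try {
    const url = new URL(value);
    url.username = '';
    url.password = '';
    return url.toString();
  } catch {
    // Not a parseable absolute URL: return the input unchanged.
    return value;
  }
}

// Hypothetical registry: attribute key -> sanitizing function.
const sanitizers = new Map<string, Sanitizer>([['http.url', stripCredentials]]);

// Apply the registered sanitizer for this key, if there is one.
function sanitizeAttribute(key: string, value: AttributeValue): AttributeValue {
  const sanitizer = sanitizers.get(key);
  return sanitizer ? sanitizer(value) : value;
}
```

Note that `URL` serialization can normalize the string (for example by adding a trailing slash to a bare origin), so the output may differ from the input in more than the credentials.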

Note: I'm not sure whether you want to classify this as a breaking change. I'm also not sure why my tests are failing. Any VS Code + Mocha tips appreciated!

Fixes #2000

Short description of the changes

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • added unit tests

Checklist:

  • Followed the style guidelines of this project
  • Unit tests have been added
  • [ ] Documentation has been updated

@daniel-white daniel-white requested a review from a team December 14, 2022 03:13
@linux-foundation-easycla

linux-foundation-easycla bot commented Dec 14, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: daniel-white / name: Daniel A. White (b4c4cbc)

@daniel-white
Author

i'm not sure what i need to do - i accepted as an individual contributor for the cla, but it says it is not enabled for this project.

@legendecas
Member

Are you signing the CLA with the email used to commit?

@daniel-white daniel-white force-pushed the chore/sanitize-url branch 3 times, most recently from 89aea3a to e74790b Compare December 17, 2022 01:24
Comment on lines 128 to 155
it('should return the input string when the value is not a valid url', () => {
  const out = sanitizeAttribute(
    SemanticAttributes.HTTP_URL,
    'invalid url'
  );

  assert.strictEqual(out, 'invalid url');
});
Member

Can you add log message validations to the test?

let copyVal: AttributeValue;

if (Array.isArray(val)) {
  copyVal = val.slice();
Member

In this case, we also want to return, no?

Comment on lines 131 to 115
if (!isValidPrimitiveAttributeValue(val)) {
  diag.warn(
    `Invalid attribute value set for key: ${SemanticAttributes.HTTP_URL}. Unable to sanitize complex value.`
  );

  out = val;

  return out;
}
Member

why do you need to check that if you only accept strings?

Author

the URL constructor only takes a string or other URL instances.
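
To illustrate the point about the constructor's input, here is a small sketch of a guard that narrows an unknown value before calling `new URL`. The helper name `toUrlOrUndefined` is hypothetical and does not appear in the PR.

```typescript
// Hypothetical helper: only strings and URL instances are handed to the constructor.
function toUrlOrUndefined(val: unknown): URL | undefined {
  if (typeof val !== 'string' && !(val instanceof URL)) {
    // Numbers, booleans, arrays, etc. are rejected up front instead of being coerced.
    return undefined;
  }
  try {
    return new URL(val);
  } catch {
    // The constructor throws a TypeError for strings that are not absolute URLs.
    return undefined;
  }
}
```

Without the type check, a non-string value would reach the constructor and be converted to a string first, which is probably not what a sanitizer wants.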

@daniel-white
Author

thanks @osherv for the review - i'm working on your feedback.

do you have any tips on how i can debug in vscode?

@osherv
Member

osherv commented Dec 19, 2022

@daniel-white sure! unfortunately i'm working in Webstorm, i can help you debug with Webstorm 😅

@daniel-white daniel-white force-pushed the chore/sanitize-url branch 3 times, most recently from 90a1bc7 to 6b9e85e Compare December 21, 2022 21:27
@daniel-white
Author

i know my tests/lint are failing. however, i'd like to confirm that this is the pattern you all would like to go with

Member

@legendecas legendecas left a comment

Sorry for the late review. Thank you for working on this!

However, as the attribute http.url is part of the semantic conventions, but not part of the Trace SDK specification, I believe it would be more straightforward and performant to explicitly sanitize the credentials of the semantic convention attribute http.url in the instrumentations instead.

We can still export sanitizeHttpUrl from @opentelemetry/core to reduce duplication.
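
A rough sketch of what sanitizing at the instrumentation level could look like. Everything here is an assumption for illustration: `SpanLike` is a local stand-in for the real span interface, `sanitizeHttpUrl` is written from scratch (no such export is confirmed to exist in @opentelemetry/core), and `recordRequestUrl` is a hypothetical instrumentation hook.

```typescript
// Local stand-in for the span interface; real code would use Span from @opentelemetry/api.
interface SpanLike {
  setAttribute(key: string, value: string): this;
}

// Hypothetical shared helper: strip username and password from a URL string.
function sanitizeHttpUrl(rawUrl: string): string {
  try {
    const url = new URL(rawUrl);
    url.username = '';
    url.password = '';
    return url.toString();
  } catch {
    // Not a parseable absolute URL: leave it as it is.
    return rawUrl;
  }
}

// Hypothetical instrumentation hook: sanitize where the attribute is collected,
// so the SDK itself does not need to know about http.url.
function recordRequestUrl(span: SpanLike, rawUrl: string): void {
  span.setAttribute('http.url', sanitizeHttpUrl(rawUrl));
}

// Tiny in-memory span used only to demonstrate the call.
class RecordingSpan implements SpanLike {
  readonly attributes: Record<string, string> = {};

  setAttribute(key: string, value: string): this {
    this.attributes[key] = value;
    return this;
  }
}

const span = new RecordingSpan();
recordRequestUrl(span, 'https://user:secret@example.com/orders?id=1');
```

With this split, SDK users who set `http.url` by hand keep full control over the value, and only the instrumentation pays the parsing cost.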

@@ -94,3 +122,34 @@ function isValidPrimitiveAttributeValue(val: unknown): boolean {

return false;
}

function sanitizeHttpUrl(val: AttributeValue): AttributeValue | undefined {
Member

Would this return undefined?

@@ -279,7 +285,7 @@ export class Span implements api.Span, ReadableSpan {
* @param value Attribute value
* @returns truncated attribute value if required, otherwise same value
*/
private _truncateToSize(value: SpanAttributeValue): SpanAttributeValue {
Member

In the SDK, we still keep the deprecated SpanAttributeValue because the SDK is required to support the older version of @opentelemetry/api.

Would you mind reverting these unrelated changes? Thank you!

Author

i can.

describe('http.url', () => {
  it('should remove username and password from the url', () => {
    const out = sanitizeAttribute(
      SemanticAttributes.HTTP_URL,
Member

Would you mind converting those test cases into table-based tests to reduce the repetition between tests?

const testData = [
  {
    name: 'should remove username and password from the url',
    input: 'http://user:pass@host:9000/path?query#fragment', 
    expect: 'http://host:9000/path?query#fragment',
  },
  ...
];

for (const item of testData) {
  it(item.name, () => {
    ...
  });
}
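
One way the elided parts of the table above might be filled in, as a standalone sketch. The two cases come from the tests quoted in this review; `sanitizeHttpUrl`, `Case`, and `runCases` are hypothetical, and a plain loop stands in for Mocha's `it` so that the block runs on its own.

```typescript
// Hypothetical function under test: strips credentials, returns invalid input unchanged.
function sanitizeHttpUrl(value: string): string {
  try {
    const url = new URL(value);
    url.username = '';
    url.password = '';
    return url.toString();
  } catch {
    return value;
  }
}

interface Case {
  name: string;
  input: string;
  expected: string;
}

const testData: Case[] = [
  {
    name: 'should remove username and password from the url',
    input: 'http://user:pass@host:9000/path?query#fragment',
    expected: 'http://host:9000/path?query#fragment',
  },
  {
    name: 'should return the input string when the value is not a valid url',
    input: 'invalid url',
    expected: 'invalid url',
  },
];

// Run every row and collect a message for each mismatch.
function runCases(cases: Case[]): string[] {
  const failures: string[] = [];
  for (const item of cases) {
    const actual = sanitizeHttpUrl(item.input);
    if (actual !== item.expected) {
      failures.push(`${item.name}: got ${actual}`);
    }
  }
  return failures;
}

const failures = runCases(testData);
```

Under Mocha, the loop body would instead be `it(item.name, () => assert.strictEqual(actual, item.expected))`, so that each row is reported as its own test.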

Author

sure that would be possible.

let out: AttributeValue | undefined;

if (typeof val !== 'string') {
  diag.warn(
Member

I'd find these diag warnings in the value sanitizer unexpected. I think it should still be valid for trace-sdk users to set arbitrary valid attribute values as http.url. http.url is part of the semantic conventions, not part of the SDK specification.

@open-telemetry/javascript-approvers WDYT?

Author

according to #2000, it's part of the specification. i'm open to all of your guidance on how to approach this

Member

Yes, it's part of the semantic convention specification, not the trace sdk specification. So IMHO the constraints should only be applied to instrumentations that collect those attributes.

@github-actions

github-actions bot commented Apr 3, 2023

This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the stale label Apr 3, 2023
@github-actions

This PR was closed because it has been stale for 14 days with no activity.

Successfully merging this pull request may close these issues.

HTTP Span Attributes: http.url must not contain username / password
3 participants