Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encourage use of de-facto standards for query string values #4

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

quii
Copy link

@quii quii commented Aug 25, 2020

It's a big smell that we have code where we build up URLs using string concatenation and joins and even worse having to parse it on the HTTP server side.

Instead we should be using built-in, safe, battle-tested URL builders which usually let you automatically represent multiple query values as described in the PR.

Here is an example in the Go world.

It's a big smell that we have code where we build up URLs using string concatenation and joins and even worse having to parse it on the HTTP server side. 

Instead we should be using built-in, safe, battle-tested URL builders which usually let you automatically represent multiple query values as described in the PR.
@quii quii changed the title Encourage use of de-factor standards Encourage use of de-factor standards for query string values Aug 25, 2020
@quii quii changed the title Encourage use of de-factor standards for query string values Encourage use of de-facto standards for query string values Aug 25, 2020
Copy link

@tomyan tomyan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm all for it

@quii
Copy link
Author

quii commented Aug 25, 2020

I've added an example of a client and server sending params that you can run and mess around with here

The important thing to notice is I'm not building/parsing strings, it's all taken care of

@aceakash
Copy link

aceakash commented Aug 25, 2020

I like it.

I like commas more.

Consider the semantics of the API you're building. If your API is using these query strings as filters (which is very common), and different filters (param names) are ANDed together while values within the same filter are ORed, this leads to what I perceive as an inconsistency.

e.g. ? countries=spain & countries=germany & sectors=retail & dateFrom=2020-08-01
To the human eye, this reads as the resource should be tagged for spain AND germany AND retail AND should be post-August.

Using commas serves as a better DSL for this use-case and also allows you to formulate ORs and ANDs for the same filter.
e.g. ? countries=spain,germany & countries=india,france is interpreted as (spain or germany) AND (india or france). How do you do that with the other scheme?

Another small reason I like it even if you don't need the filter value in multiple contexts, is because of the co-location of values: all the values for a param are in one place, not scattered across the query string. I accept, though, that this mostly matters when a human is reading it.

For the above reasons, I feel the overhead of a req.query['countries'] && req.query['countries'].split(',') is worth it.

@tomyan
Copy link

tomyan commented Aug 25, 2020

@aceakash I'm sold, @quii what do you think?

@quii
Copy link
Author

quii commented Aug 26, 2020

I think these are great points @aceakash .

Using commas serves as a better DSL for this use-case and also allows you to formulate ORs and ANDs for the same filter.
e.g. ? countries=spain,germany & countries=india,france is interpreted as (spain or germany) AND (india or france). How do you do that with the other scheme?

Does this come up in practice? Normally we just OR within a facet and then AND across them, which can be easily expressed the de-facto way. I wonder if querystring parameters are always going to be expressive enough in general. Perhaps for more complicated queries we should consider the slightly wonky (but valid) way elasticsearch does it where you send more structured queries in the request body.

I guess I feel a key property of REST is "being of the web" because it brings consistency and simplicity, and I worry that this kind of complexity just bleeds everywhere, both for clients and servers.

I also don't think we should overly favour "human readable" URLs over what we can construct with code. In practice we call APIs with code we write, or better still using built-in libraries where we don't have to think about the transport too much; whether we're separating with , or 💩

Take this form. It couldn't be simpler and when the browser sends it, it encodes it in the de-facto way as you'd hope and expect.

<form method="get" action="/">
    <input name="country" type="text" />
    <input name="country" type="text" />
    <input type="submit" value="Submit" />
</form>

To use our APIs with commas, I am going to have to start writing some client side code and mess around with things to do what should be a simple API call.

TBF, it's not a hill I am especially willing to die on (but I will whinge about it forever). Perhaps we need some more views to shut me up :)

@aceakash
Copy link

aceakash commented Aug 26, 2020

Does this come up in practice?

Not that much, until it does 😊. I would expect it to surface on more search/filter-heavy APIs like ISS. But it's not a sticking point in general, I agree.

it's not a hill I am especially willing to die on

Same here - this is as bike-shedding as it gets. Let's just put it to a vote and get on with more important issues.

@quii
Copy link
Author

quii commented Aug 26, 2020

The default way that browsers and most clients/servers work as described will work for 90% of our use cases so I think we should just encourage using that.

If you need to do something that doesn't work with that, well figure something out for your use case. Let's not prescribe something non standard which'll cause more effort for everyone just for the sake of keeping things consistent for some edge cases.

Is there a votey thing on GH?

@aceakash
Copy link

Sold - let's go with ?country=spain&country=germany as the guideline and look at deviations on a case-by-case basis.

@ivoreis
Copy link

ivoreis commented Aug 26, 2020

To add on top of what @aceakash said, and for the simplest case, having comma-separated values reduces character/word duplication. If you need to debug/test the URL on a browser this will have an impact since browsers are limited to a finite amount of characters.

Besides, both HAL and JSONApi have examples of using comma-separated values:

@quii
Copy link
Author

quii commented Aug 27, 2020

While there is no definitive standard, most web frameworks allow multiple values to be associated with a single field (e.g. field1=value1&field1=value2&field2=value3).[5][6]

For each field of the form, the query string contains a pair field=value. Web forms may include fields that are not visible to the user; these fields are included in the query string when the form is submitted

This convention is a W3C recommendation

https://en.wikipedia.org/wiki/Query_string

@apfrancis
Copy link
Contributor

I was on holidays last week - and not checking GH issues(?!)

From my point of view I'd be happy to use what we feel gives the best developer experience. The original guidance (such as it is) for came from looking at the JSON API and HAL specifications already liked to above.

In addition to the point made by @ivoreis it's important to mention that AWS imposes url length limits on API Gateway.

https://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html

API Gateway quotas for configuring and running a REST API

Resource or operation Default quota Can be increased
Length, in characters, of the URL for an edge-optimized API 8192 No
Length, in characters, of the URL for a regional API 10240 No

Maybe this is tolerable for most use cases but it might be worth adding a word of warning in this document, as I can see this catching people out. If there is potential in the use case for complex sets of parameters then suggest other approaches to make such requests?

@apfrancis
Copy link
Contributor

worth also mentioning that Node's inbuilt querystring library parses arrays into multiple querystring keys/values

querystring.stringify({ foo: 'bar', baz: ['qux', 'quux'], corge: '' });
// Returns 'foo=bar&baz=qux&baz=quux&corge='

but there's also the popular query-string module which supports all of the common ways of representing arrays in a url - https://github.com/sindresorhus/query-string#arrayformat

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants