Add support for "Expect: 100-continue" header #679

johan-bjareholt · 2023-11-07T12:32:08Z

Works well and is very easy to use: just set the Expect header like this:

let huge_string = "abcde ".repeat(500);
let req = ureq::post("http://127.0.0.1:5000")
    .set("Expect", "100-continue");
let res = req.send_string(&huge_string).unwrap();
println!("res: {:?}", res);
let body = res.into_string().unwrap();
println!("body: {}", body);

To-do

- A timeout to send the body even if the server does not understand "Expect: 100-continue"
  - If the client times out waiting for the HTTP/1.1 100 Continue, proceed with sending the body anyway.
  - (optional) Make the timeout configurable (1000ms default)
- Follow the spec and on 417 resend the whole request, but without the "Expect: 100-continue" header.
- (optional) Set "Expect: 100-continue" by default
  - Only set it if there is a body, and then only if the body has a known size over X MB, or always if the size is not known

johan-bjareholt · 2023-11-07T13:11:04Z

This is a follow up to issue #676

src/response.rs

src/unit.rs

algesten · 2023-11-08T13:29:32Z

Thanks for looking into this!

Overall looks like good changes. You can use the test.sh script in the root to roughly run the same tests the CI will do.

algesten

Some comments.

Was thinking a bigger refactor might be clearer.

What if the part that just reads the header returns a new type ResponseHead (not public)? Then there are two methods to "complete" that type into the public Response: without_body (for expect-100) and with_body (for the normal case).

It would potentially be cleaner than trying to make Response be slightly different depending...
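
A rough sketch of the split suggested above. Everything here is hypothetical: `ResponseHead`, this simplified `Response`, and their fields are illustrative stand-ins, not ureq's actual internals. It only shows how one head type could be completed in two ways.

```rust
use std::io::{self, Read};

/// Hypothetical, crate-private result of parsing only the status line and headers.
#[derive(Debug)]
struct ResponseHead {
    status: u16,
    headers: Vec<(String, String)>,
}

/// Hypothetical stand-in for the public response type.
struct Response {
    status: u16,
    headers: Vec<(String, String)>,
    body: Box<dyn Read>,
}

impl ResponseHead {
    /// Complete an interim response (for example 100 Continue) that carries no body.
    fn without_body(self) -> Response {
        Response {
            status: self.status,
            headers: self.headers,
            body: Box::new(io::empty()),
        }
    }

    /// Complete a normal response by attaching the reader that yields the body.
    fn with_body(self, body: impl Read + 'static) -> Response {
        Response {
            status: self.status,
            headers: self.headers,
            body: Box::new(body),
        }
    }
}

impl Response {
    /// Read the whole body into a string.
    fn into_string(mut self) -> io::Result<String> {
        let mut s = String::new();
        self.body.read_to_string(&mut s)?;
        Ok(s)
    }
}
```

With this shape, the expect-100 path would call `without_body` on the interim head and the normal path would call `with_body`, so `Response` itself never has to behave differently depending on the case.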

src/response.rs

jsha · 2023-11-08T16:46:41Z

Thanks so much for the PR! I'll review it soon.

jsha · 2023-11-08T17:23:53Z

Oops, I posted that before I saw @algesten had already provided a review! But I will still plan to find some time and take a look soon.

johan-bjareholt · 2023-11-09T16:31:32Z

I removed the "consume" argument from "read_response_head". It only existed because I was testing against a broken HTTP server that for some strange reason responded with "100 Continue" twice... It took me embarrassingly long to realise this; I finally found it in Wireshark.

I have now tested this code against a libsoup2 server and a python flask http server, works fine there.

algesten · 2023-11-09T21:53:27Z

I think this looks pretty good. Wonder if @jsha agrees?

jsha

This also looks good to me! Thanks so much for writing it.

jsha · 2023-11-10T01:35:56Z

Looks like you've got a clippy error. Mind fixing that and I'll merge?

johan-bjareholt · 2023-11-10T07:35:39Z

Looks like you've got a clippy error. Mind fixing that and I'll merge?

Done!

johan-bjareholt · 2023-11-10T07:43:19Z

Also, regarding the broken server: I'm not sure if this is a Flask bug or something else. To be more exact, it's broken when using "default" Flask (app.run), but works fine when using Flask with werkzeug "serving.make_server". Might be a bug in Flask?

testclient

fn request() {
    let string = "abcde ".repeat(5);

    let req = ureq::post("http://127.0.0.1:5000/")
        .set("Expect", "100-continue");
    let res = req.send(string.as_bytes()).unwrap();
    println!("res2: {:?}", res);
    assert_eq!(res.status(), 200);
    let body = res.into_string().unwrap();
    assert_eq!("Hello, world!", body);
}

fn main() {
    request();
}

#[test]
fn test_a() {
    for i in 1..1000 {
        println!("\nRun {i}\n");
        request();
    }
}

Flask with werkzeug serving.make_server (works)

#!/usr/bin/env python3

from flask import Flask, request
from werkzeug import serving

app = Flask(__name__)

@app.route('/', methods = ['POST'])
def a():
    return "Hello, world!"


if __name__=='__main__':
    app.use_reloader = False
    server = serving.make_server(
        host="0.0.0.0",
        port=5000,
        app=app,
        ssl_context=None)
    server.serve_forever()

Flask with app.run (broken)

#!/usr/bin/env python3

from flask import Flask, request

app = Flask(__name__)

@app.route('/', methods = ['POST'])
def a():
    return "Hello, world!"


if __name__=='__main__':
    app.run()

Wireshark of broken server

POST / HTTP/1.1
Host: 127.0.0.1:5000
User-Agent: ureq/2.8.0
Accept: */*
Expect: 100-continue
accept-encoding: gzip
Transfer-Encoding: chunked

HTTP/1.1 100 Continue

HTTP/1.1 100 Continue

1e
abcde abcde abcde abcde abcde 
0

HTTP/1.1 200 OK
Server: Werkzeug/2.2.2 Python/3.9.2
Date: Fri, 10 Nov 2023 07:39:39 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 13
Connection: close

What happens is that because we get two 100 Continue responses, our ureq client consumes the first one as it should and then returns the second one as the final response. The test client's assertion then fails because it expects a 200 OK.

algesten · 2023-11-10T08:04:02Z

I might have dreamt this, but I seem to recall that curl waits for a while and then sends the body even without a 100 response. This is to interoperate with servers that don't understand expect-100.

https://www.rfc-editor.org/rfc/rfc7231.html#section-5.1.1

algesten · 2023-11-10T08:12:17Z

Ah yes CURLOPT_EXPECT_100_TIMEOUT_MS defaults to 1000ms

johan-bjareholt · 2023-11-10T08:13:54Z

Sounds reasonable, the only risk I see with this is if the server is just slow with responding with the 100-continue. So it might be a bit racy, seems like it would be important in that case to be able to configure the timeout?

algesten · 2023-11-10T08:15:42Z

I think there are two things we want to consider.

A timeout to send body anyway.
Follow the spec and on 417 resend the request without the expect-100 header.

algesten · 2023-11-10T11:52:21Z

Sounds reasonable, the only risk I see with this is if the server is just slow with responding with the 100-continue. So it might be a bit racy

I assume the racy-ness is normal behavior. The server would need to handle the case where the client starts sending the body regardless of the answer. In that respect 417 is only best effort.

seems like it would be important in that case to be able to configure the timeout?

Agree. We should put it as an option in Agent.

jsha · 2023-11-10T22:49:34Z

I think the racy-ness can be resolved like so:

If the client timed out waiting for the HTTP/1.1 100 Continue, then when it tries to read headers again after sending the body, it's possible it may receive an HTTP/1.1 100 Continue at that time, followed by the final headers. If the "reading headers after body" step gets a 100 (vs any other code), it should respond to that by reading headers a second time, and returning those.
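
A minimal sketch of the "read headers again on a late 100" idea, assuming a simplified head parser that only extracts the status code. `read_head_status` and `read_final_status` are hypothetical helpers for illustration, not ureq functions.

```rust
use std::io::{self, BufRead};

/// Read one response head (status line plus headers up to the blank line)
/// and return its status code. Header values are discarded in this sketch.
fn read_head_status(reader: &mut impl BufRead) -> io::Result<u16> {
    let mut line = String::new();
    reader.read_line(&mut line)?;
    // The status line looks like "HTTP/1.1 100 Continue".
    let status = line
        .split_whitespace()
        .nth(1)
        .and_then(|code| code.parse::<u16>().ok())
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "bad status line"))?;
    // Consume header lines until the empty line that ends the head (or EOF).
    loop {
        line.clear();
        let n = reader.read_line(&mut line)?;
        if n == 0 || line == "\r\n" || line == "\n" {
            break;
        }
    }
    Ok(status)
}

/// Keep reading heads until one is not "100 Continue", so an interim
/// response that arrives late (or twice) is skipped instead of returned.
fn read_final_status(reader: &mut impl BufRead) -> io::Result<u16> {
    loop {
        let status = read_head_status(reader)?;
        if status != 100 {
            return Ok(status);
        }
    }
}
```

Fed the byte stream from the Wireshark capture above (two 100 Continue heads followed by 200 OK), `read_final_status` would return 200 and leave the reader positioned at the body.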

Here's the most up-to-date RFC (substantively the same as RFC 7231 on this issue): https://www.rfc-editor.org/rfc/rfc9110#field.expect

A client MUST NOT generate a 100-continue expectation in a request that does not include content.
A client that will wait for a 100 (Continue) response before sending the request content MUST send an Expect header field containing a 100-continue expectation.
A client that sends a 100-continue expectation is not required to wait for any specific length of time; such a client MAY proceed to send the content even if it has not yet received a response. Furthermore, since 100 (Continue) responses cannot be sent through an HTTP/1.0 intermediary, such a client SHOULD NOT wait for an indefinite period before sending the content.
A client that receives a 417 (Expectation Failed) status code in response to a request containing a 100-continue expectation SHOULD repeat that request without a 100-continue expectation, since the 417 response merely indicates that the response chain does not support expectations (e.g., it passes through an HTTP/1.0 server).

I think we're seeing that the "time out if no early response" SHOULD is actually pretty necessary for decent operation, so let's add it. @algesten any thoughts on wanting to include it as part of this PR or as a follow-on?

I think the "retry on 417" SHOULD is good but I think it's not as urgent.
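
For reference, the "retry on 417" behaviour could look roughly like the sketch below. The `Request` struct and the `send` callback are assumptions made for this example (a stand-in for whatever actually performs the request); only the retry rule itself comes from the RFC text quoted above.

```rust
/// Minimal request model for this sketch; not ureq's real type.
#[derive(Clone, Debug, PartialEq)]
struct Request {
    headers: Vec<(String, String)>,
    body: Vec<u8>,
}

/// Send `req` via the caller-supplied `send` function and return the status.
/// If the answer is 417 (Expectation Failed) and the request carried
/// "Expect: 100-continue", repeat it once without the Expect header.
fn send_with_417_retry<F>(req: &Request, mut send: F) -> u16
where
    F: FnMut(&Request) -> u16,
{
    let status = send(req);
    let had_expect = req.headers.iter().any(|(name, value)| {
        name.eq_ignore_ascii_case("expect") && value.eq_ignore_ascii_case("100-continue")
    });
    if status != 417 || !had_expect {
        return status;
    }
    // Retry exactly once, with the expectation removed.
    let mut retry = req.clone();
    retry.headers.retain(|(name, _)| !name.eq_ignore_ascii_case("expect"));
    send(&retry)
}
```

The retry happens at most once, so a server that keeps answering 417 cannot cause a loop.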

algesten · 2023-11-11T09:09:24Z

Splitting those apart.

A client MUST NOT generate a 100-continue expectation in a request that does not include content.

We should definitely follow this. No expect unless there's a body.

A client that will wait for a 100 (Continue) response before sending the request content MUST send an Expect header field containing a 100-continue expectation.

✅ Doing that already.

A client that sends a 100-continue expectation is not required to wait for any specific length of time; such a client MAY proceed to send the content even if it has not yet received a response.

I think we should follow curl behavior here. Have a 1000ms default timeout and make it configurable on Agent.

Furthermore, since 100 (Continue) responses cannot be sent through an HTTP/1.0 intermediary, such a client SHOULD NOT wait for an indefinite period before sending the content.

Same as previous. But yeah. We shouldn't wait forever, even if it's not a MUST.

A client that receives a 417 status code in response to a request containing a 100-continue expectation SHOULD repeat that request without a 100-continue expectation, since the 417 response merely indicates that the response chain does not support expectations (e.g., it passes through an HTTP/1.0 server).

I think we should be good citizens and do this too, even if it's not a MUST. However on this point I can be convinced otherwise. Doesn't seem like a big deal though.

Whether the work is added on to this PR or a new PR, I don't think matters much. Either way I think we need to do this before shipping a new version with this functionality.

Just throwing out there: Should we do like curl and always send expect-100 headers and wait 1000ms when we have content?

@johan-bjareholt You've already done a lot here. Thanks! It's up to you whether you want to take on these further requirements. Let us know if you intend to continue, or we will work out a plan for the rest of the work. No pressure!

johan-bjareholt · 2023-11-12T13:54:27Z

Whether the work is added on to this PR or a new PR, I don't think matters much. Either way I think we need to do this before shipping a new version with this functionality.

The pro of not merging it would be that then there wouldn't be a hurry to fix it before the next release, as it doesn't become a blocker. If we don't find it to be complete enough for a release, maybe we shouldn't merge it?

Just throwing out there: Should we do like curl and always send expect-100 headers and wait 1000ms when we have content?

Doing it unconditionally seems to go against the point of "Expect: 100-continue", which is to let the server reject large requests early so that network bandwidth and processing are not wasted. Doing it only if the request payload is bigger than X megabytes would seem reasonable to me, however.

@johan-bjareholt You've already done a lot here. Thanks! It's up to you whether you want to take on these further requirements. Let us know if you intend to continue, or we will work out a plan for the rest of the work. No pressure!

Thanks, I highly appreciate the fast and thorough support!

I would love to help out some more, but the coming week will be a bit busy for me. So if you have some patience, I could continue working on this. For my use-case, the current solution works but the suggested improvements would also help.

jsha · 2023-11-13T19:00:33Z

The pro of not merging it would be that then there wouldn't be a hurry to fix it before the next release, as it doesn't become a blocker. If we don't find it to be complete enough for a release, maybe we shouldn't merge it?

Good point. I agree.

Just throwing out there: Should we do like curl and always send expect-100 headers and wait 1000ms when we have content?

Doing it unconditionally seems to go against the point of "Expect: 100-continue", which is to let the server reject large requests early so that network bandwidth and processing are not wasted. Doing it only if the request payload is bigger than X megabytes would seem reasonable to me, however.

Yeah, I agree here too. Here's what the curl docs say:

curl sends this Expect: header by default if the POST it will do is known or suspected to be larger than just minuscule. curl also does this for PUT requests.

I like this idea. We could do this for the known-length send methods like send_bytes. For send we won't know if the body will be minuscule or not. We could assume that send will always be non-minuscule, because the point of passing an impl Read is that the data is too big or too dynamic to fit in a &[u8] easily.
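
As a sketch of that heuristic: the `BodySize` enum, the function name, and the 1 MiB threshold below are all assumptions for illustration. The thread only says "X MB" and "minuscule", so the cut-off is a placeholder.

```rust
/// What the client knows about the request body before sending it.
#[derive(Clone, Copy, Debug)]
enum BodySize {
    /// No body at all (for example a plain GET).
    None,
    /// Length known up front, as with a byte slice or string.
    Known(u64),
    /// Streaming body whose length is not known, as with an `impl Read`.
    Unknown,
}

/// Hypothetical cut-off; the discussion leaves the real value open.
const EXPECT_THRESHOLD_BYTES: u64 = 1024 * 1024;

/// Decide whether to add "Expect: 100-continue" by default.
fn should_send_expect(size: BodySize) -> bool {
    match size {
        // RFC 9110: MUST NOT send the expectation without content.
        BodySize::None => false,
        // Small known-length bodies are not worth the extra round trip.
        BodySize::Known(len) => len > EXPECT_THRESHOLD_BYTES,
        // Assume a streamed body is large enough to be worth it.
        BodySize::Unknown => true,
    }
}
```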

I would love to help out some more, but the coming week will be a bit busy for me. So if you have some patience, I could continue working on this. For my use-case, the current solution works but the suggested improvements would also help.

We're happy to wait. Let's say if you come back to it in the next few weeks, great; if not we'll pick up your PR and run with it. Thanks!

johan-bjareholt · 2023-12-18T12:45:24Z

I have unfortunately been swamped the past few weeks, so I have not been able to work on this. Nor do I see myself having time to start working on the rest before the middle of February. If someone else has time to look at it, that'd be much appreciated.

johan-bjareholt · 2024-02-06T10:58:37Z

Since this is unfortunately taking so long, would it be an option to merge this and create a new issue for the things missing?
Considering that this feature is only enabled if you explicitly set the header, it shouldn't break anything.

johan-bjareholt · 2024-03-22T16:46:13Z

Tried to rebase to fix merge conflict, but tests are failing on main again.

Pushed a fix for main in a separate PR, #742

There does not seem to be any good reason to take ownership of it. Makes us able to remove a clone and a TODO comment.

src/response.rs

src/unit.rs

johan-bjareholt · 2024-04-11T07:59:08Z

Added some unit tests, implemented support for handling 417 according to spec and added a TODO in the PR description.

johan-bjareholt · 2024-04-11T11:56:30Z

src/unit.rs

+                match response.status() {
+                    100 => debug!("Got 100-continue, proceeding with body"),
+                    200 => {
+                        // TODO: How should we handle this case?


Any idea how to best handle this case?

If a server does not understand the "Expect: 100-continue" header, it will wait for the body indefinitely. To solve this issue, we add a shorter timeout on reading the response status+headers and if that timeout is hit we send the body anyway.
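
One possible shape for that decision, as a sketch only. It assumes the head read is performed with a short read timeout (for example via `TcpStream::set_read_timeout`) and that its outcome is passed in as an `io::Result<u16>`. The names `Next`, `after_expect_wait` and `DEFAULT_EXPECT_TIMEOUT` are hypothetical, and treating any non-100 status as final is this sketch's assumption, since the 200 case is still an open question here.

```rust
use std::io;
use std::time::Duration;

/// Default wait, mirroring curl's CURLOPT_EXPECT_100_TIMEOUT_MS.
const DEFAULT_EXPECT_TIMEOUT: Duration = Duration::from_millis(1000);

/// What to do after waiting for the interim response.
#[derive(Debug, PartialEq)]
enum Next {
    /// Got 100 Continue, or gave up waiting: send the body.
    SendBody,
    /// The server already answered with another status; skip the body.
    Final(u16),
}

/// Map the outcome of the timed head read to the next step.
fn after_expect_wait(head: io::Result<u16>) -> io::Result<Next> {
    match head {
        Ok(100) => Ok(Next::SendBody),
        Ok(status) => Ok(Next::Final(status)),
        // A timed-out socket read surfaces as WouldBlock on Unix and
        // TimedOut on Windows; in both cases send the body anyway.
        Err(e) if matches!(e.kind(), io::ErrorKind::WouldBlock | io::ErrorKind::TimedOut) => {
            Ok(Next::SendBody)
        }
        // Any other I/O error is a real failure.
        Err(e) => Err(e),
    }
}
```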

johan-bjareholt · 2024-04-11T13:01:41Z

@algesten @jsha Feel free to review this again, I think it has all the critical features now.

johan-bjareholt · 2024-04-19T08:17:43Z

@algesten @jsha Ping 🙂

algesten · 2024-04-22T06:50:08Z

@johan-bjareholt I haven't forgotten. Just been a lot lately.

johan-bjareholt · 2024-06-13T11:42:48Z

Sorry for nagging, just wanted to ping this again.
I understand if you still have a lot to do.

algesten · 2024-06-28T17:14:02Z

I'm not ignoring this. I'm stalling! 🙈

algesten · 2024-06-28T17:21:22Z

To elaborate: ureq is now quite a popular crate, and I'm largely alone in maintaining it. This PR changes some of the inner workings and I've become increasingly hesitant to do such things (I don't know if you saw the fallout from trying to fix our test cases with hootbin).

Which isn't to say we don't want it, but I have this inertia/emotional block to get over.

johan-bjareholt · 2024-07-08T15:52:21Z

I understand.
Is there anything I can do to help you get over it? Is there any code these changes might impact that we need more unit tests for?
For what it's worth, I've been using this patch continuously for a few months now without any issues, with over 160 million requests and terabytes of data, if that gives any comfort. My use case might be very specific though.

algesten · 2024-10-12T10:25:25Z

@johan-bjareholt

The ureq 3.x rewrite, which is now in main, should support expect-100-continue. Sorry for messing you around on this PR.

johan-bjareholt · 2024-10-12T12:09:51Z

It's ok, what's most important is that it's now supported. Thanks!

johan-bjareholt force-pushed the expect-100-continue branch from 567e647 to e2c7fee Compare November 8, 2023 09:45

algesten requested changes Nov 8, 2023

johan-bjareholt force-pushed the expect-100-continue branch from e2c7fee to bd0fa29 Compare November 8, 2023 13:58

johan-bjareholt force-pushed the expect-100-continue branch from bd0fa29 to 20bc250 Compare November 9, 2023 16:27

jsha approved these changes Nov 10, 2023

johan-bjareholt force-pushed the expect-100-continue branch from 20bc250 to 046df14 Compare November 10, 2023 07:27

algesten mentioned this pull request Nov 15, 2023

http_interop: Implement Request conversion for http::request::Parts #669

Merged

6 tasks

algesten mentioned this pull request Dec 15, 2023

Update rustls for RISCV64 support #689

Closed

johan-bjareholt mentioned this pull request Feb 6, 2024

Timeout not always respected #700

Closed

johan-bjareholt force-pushed the expect-100-continue branch from 046df14 to 2103584 Compare March 22, 2024 16:27

response: Don't take ownership of unit in do_from_stream

0955dfe

There does not seem to be any good reason to take ownership of it. Makes us able to remove a clone and a TODO comment.

johan-bjareholt force-pushed the expect-100-continue branch 2 times, most recently from b5244bd to 0daee34 Compare March 26, 2024 10:47

johan-bjareholt force-pushed the expect-100-continue branch from 0daee34 to c8e731b Compare March 27, 2024 14:34

johan-bjareholt force-pushed the expect-100-continue branch 4 times, most recently from 468ff94 to 31ba284 Compare April 11, 2024 07:58

johan-bjareholt force-pushed the expect-100-continue branch 2 times, most recently from 09c739a to 5419ffd Compare April 11, 2024 08:06

johan-bjareholt added 3 commits April 11, 2024 14:06

Add support for "Expect: 100-continue" header

353d727

Retry on 417 for "Expect: 100-continue"

d808fa0

johan-bjareholt force-pushed the expect-100-continue branch from 5419ffd to d54a264 Compare April 11, 2024 12:08

algesten closed this Oct 12, 2024
