
wasi: use File.Poll for all blocking FDs in poll_oneoff #1606

Closed
evacchi wants to merge 6 commits from the poll-all-fds branch

Conversation

@evacchi (Contributor) commented Jul 31, 2023

This includes stdin and pipes, but also sockets.

  • This also updates RATIONALE.md.
  • It also fixes some issues in the fsapi.File wrappers for sockets on POSIX and Windows (esp. the nonblocking flag was not always set correctly).
  • I think with this we could also close #1500 (support non-blocking files).

Further refinements are possible (e.g. supporting poll for reading), but most of the work is in place now.

Rationale

Instead of special-casing stdin, we can now use File.Poll() on basically every file. I am limiting this to files that report blocking = true because, otherwise, Read() is expected not to block (i.e. it is allowed to return EAGAIN).

Notice that, previously, we knew we only handled stdin, so we always invoked Poll only on that one File.

  • This uncovered another interesting issue: in some cases, the event struct could be written at the wrong offset, because we precomputed the offset. If for some reason one of the events was not written back, we would leave an empty gap; this was hard to notice because of the special treatment reserved for blocking stdin, and because in many cases we did not test more than one fd at a time (we did not simulate an unreported event).
  • The other issue is that, with a nonzero timeout, the poll syscall blocks an OS thread until it returns.

Now, we iterate over each File that reports as "blocking" and invoke its Poll() method. However:

  • we no longer issue a blocking call to sysfs.poll() with the given timeout.
  • instead, we emulate time.After() and time.Tick(), but using sys.Nanosleep()
  • we call sysfs.poll() with a zero timeout every time the tick fires (similarly to how this is handled on the Windows side for WSAPoll), until the timeout is reached
  • This allows us to honor the configured sys.Nanosleep() instead of relying on poll's native timeout regardless of the config settings (in fact, some test cases were broken because they did not configure sys.Nanosleep() properly, if at all). A sketch of this loop follows.
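
As a minimal self-contained sketch of the loop just described (pollWithTicks, pollOnce, and nanosleep are illustrative stand-ins, not wazero's actual identifiers):

package sketch

import "time"

// pollWithTicks emulates a poll timeout without ever blocking inside the
// syscall: it repeatedly issues a zero-timeout poll and sleeps one tick via
// the supplied nanosleep until an fd is ready or the timeout elapses.
func pollWithTicks(
	pollOnce func() (ready bool, err error), // e.g. sysfs.poll with a 0 timeout
	nanosleep func(ns int64), // e.g. the module's configured sys.Nanosleep
	timeout, tick time.Duration,
) (bool, error) {
	for elapsed := time.Duration(0); ; elapsed += tick {
		ready, err := pollOnce() // returns immediately rather than parking the OS thread
		if ready || err != nil {
			return ready, err
		}
		if elapsed >= timeout {
			return false, nil // timeout reached with nothing ready
		}
		nanosleep(int64(tick)) // yields to the Go runtime between attempts
	}
}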

Alternative ways to do this would be:

  • scatter/gather by spawning a goroutine for each File.Poll() with the given timeout; however, if that timeout leverages sysfs.poll(), this in turn issues a blocking call, taking over the underlying OS thread until the timeout is reached (which may eventually consume all resources).
  • call poll({fd_1, ..., fd_n}), which, as already discussed in RATIONALE.md, is not an abstraction at the right level.

In fact, for our purposes, we may also consider exporting File.Poll(Flag) with no timeout, defaulting to 0ms and avoiding the risk of blocking altogether (a possible shape is sketched below).
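
For illustration, one possible shape of that timeout-less variant (placeholder types standing in for the sys types used elsewhere in this PR's diff; this is not the current fsapi.File signature):

package sketch

// Pflag and Errno stand in for the sys types referenced in the diff below.
type Pflag uint32
type Errno uint32

// pollable is hypothetical: Poll takes only the flag and always behaves as a
// zero-timeout readiness check, so it can never block the calling thread.
type pollable interface {
	Poll(flag Pflag) (ready bool, errno Errno)
}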

Finally, because emulating poll(2) on all platforms is not a goal, it might be possible to further refine this by replacing the Windows implementation of sysfs.poll() with ad-hoc versions, basically removing the wrapper: e.g. without handling an arbitrary timeout (since this would now be handled in poll_oneoff), and/or handling specific file types directly instead of detecting them automatically in sysfs.poll(); in other words, possibly going straight from WinTcp*File.Poll() to WSAPoll(), etc.

This includes stdin, pipes but also sockets. Updates RATIONALE.md

Signed-off-by: Edoardo Vacchi <[email protected]>
@evacchi (Contributor, Author) commented Aug 1, 2023

I added more test cases, and I found some issues with the socket files on both Windows and POSIX re: setting the nonblocking flag. I also added a wasi test case with zig-cc. Tomorrow I'll try to figure out whether I can write another implementation in Rust and/or Go that actually invokes poll with multiple FDs (I am not sure the higher-level APIs actually do that).

EDIT: heh, I tried to write a simple example with gotip but I couldn't figure out how to make it call poll_oneoff 😬

package main

import (
	"fmt"
	"io"
	"net"
	"os"
)

// main simply runs mainMixed here; the actual test binary may dispatch differently.
func main() {
	if err := mainMixed(); err != nil {
		panic(err)
	}
}

// mainMixed is an explicit test of a blocking socket + stdin pipe.
func mainMixed() error {
	// Get a listener from the pre-opened file descriptor.
	// The listener is the first pre-open, with a file-descriptor of 3.
	f := os.NewFile(3, "")
	l, err := net.FileListener(f)
	defer f.Close()
	if err != nil {
		return err
	}
	defer l.Close()

	ch1 := make(chan error)
	ch2 := make(chan error)

	go func() {
		// Accept a connection
		conn, err := l.Accept()
		if err != nil {
			ch1 <- err
			return
		}
		defer conn.Close()

		// Do a blocking read of up to 32 bytes.
		// Note: the test should write: "wazero", so that's all we should read.
		var buf [32]byte
		n, err := conn.Read(buf[:])
		if err != nil {
			ch1 <- err
			return
		}
		fmt.Println(string(buf[:n]))
		close(ch1)
	}()

	go func() {
		b, err := io.ReadAll(os.Stdin)
		if err != nil {
			ch2 <- err
			return
		}
		os.Stdout.Write(b)
		close(ch2)
	}()
	err1 := <-ch1
	err2 := <-ch2
	if err1 != nil {
		return err1
	}
	if err2 != nil {
		return err2
	}
	return nil
}

@codefromthecrypt (Contributor):

Thanks for digging into the integration tests. This type of code/behavior is hard to pin down, and that's exactly where the extra tests come in: to establish an "implementation quorum". Please ping back when you feel things are settled or need a hand from someone else (even if technically over my head ;))

Signed-off-by: Edoardo Vacchi <[email protected]>
Signed-off-by: Edoardo Vacchi <[email protected]>
@evacchi force-pushed the poll-all-fds branch 2 times, most recently from bbb7b33 to 65d5b3c on August 2, 2023 19:02
@evacchi (Contributor, Author) commented Aug 2, 2023

Ok, I added a test for gotip, and I have also figured out something for Rust (using tokio-rs/mio).

  • I verified (with --hostlogging poll and also via a debugger) that they actually exercise poll_oneoff.
  • The tests are not all the same, but they are ~similar.
  • The one that's most different is the one for gotip, because I can't tell if there is a straightforward way to make sure that all FDs are checked at once (spoiler: they aren't!). At least it's checking more than one subscription at a time, but they will all be in nonblocking mode, so it doesn't really follow the new code path, except for the timers.

I think at this point this is ready for review. It may still lack a bit of polish but your feedback is welcome.

EDIT: I have updated the top post.

Signed-off-by: Edoardo Vacchi <[email protected]>
Signed-off-by: Edoardo Vacchi <[email protected]>
@evacchi marked this pull request as ready for review August 2, 2023 19:29
@evacchi requested a review from mathetake as a code owner August 2, 2023 19:29
// if the fd is Stdin, and it is in non-blocking mode,
// do not ack yet, append to a slice for delayed evaluation.
blockingStdinSubs = append(blockingStdinSubs, evt)
writeEvent(outBuf[outOffset:], evt)
@evacchi (Contributor, Author):

writeEvents has been simplified: we already pass the buffer at the right offset (see the sketch below).
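
A minimal sketch of the idea (the types and helper below are placeholders, not the PR's actual writeEvent plumbing): the output offset advances only when an event is actually written, so an unreported event can no longer leave a 32-byte gap.

package sketch

const eventRecordSize = 32 // size of one wasi event record

type event struct {
	ready  bool
	record [eventRecordSize]byte
}

func writeReadyEvents(outBuf []byte, evts []event) (nevents uint32) {
	outOffset := 0
	for _, e := range evts {
		if !e.ready {
			continue // nothing written, offset not advanced: no gap is left
		}
		copy(outBuf[outOffset:], e.record[:]) // stand-in for writeEvent
		outOffset += eventRecordSize
		nevents++
	}
	return nevents
}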

// and we don't need to wait for the timeout: clear it.
if readySubs != 0 {
timeout = 0
sysCtx := mod.(*wasm.ModuleInstance).Sys
@evacchi (Contributor, Author):

the section below has been reordered for clarity

Comment on lines +142 to +149
} else if file.File.IsNonblock() {
writeEvent(outBuf[outOffset:], evt)
nevents++
} else {
writeEvent(outBuf, evt)
readySubs++
// If the fd is blocking, do not ack yet,
// append to a slice for delayed evaluation.
fe := &filePollEvent{f: file, e: evt}
blockingSubs = append(blockingSubs, fe)
@evacchi (Contributor, Author):

these have been reordered for clarity: first, we avoid the double negation (!IsNonblock()); second, the two immediate writes are in the if and else if branches, while the else handles the special case of "delayed" processing.

0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0,
@evacchi (Contributor, Author):

fdReadSubFd is now used in other poll tests (see above), e.g. to create multiple records; for such records to be valid, we zero-pad the byte slice to the right size (32 bytes).

Contributor:

Suggested change
0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, // pad to record size (32 bytes)

@@ -47,7 +47,7 @@ func Test_sockAccept(t *testing.T) {
t.Run(tc.name, func(t *testing.T) {
ctx := experimentalsock.WithConfig(testCtx, experimentalsock.NewConfig().WithTCPListener("127.0.0.1", 0))

mod, r, log := requireProxyModuleWithContext(ctx, t, wazero.NewModuleConfig())
mod, r, log := requireProxyModuleWithContext(ctx, t, wazero.NewModuleConfig().WithSysNanosleep())
@evacchi (Contributor, Author):

poll_oneoff now respects SysNanosleep; thus, it has to be configured properly (see the example below).
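
For context, a minimal example of the configuration in question (not the test helper itself): WithSysNanosleep wires sys.Nanosleep to a real sleep, which the reworked poll_oneoff relies on between its zero-timeout polls.

package sketch

import "github.com/tetratelabs/wazero"

func newTestConfig() wazero.ModuleConfig {
	// Without this, sys.Nanosleep may not actually sleep, and the poll loop's
	// ticks would not behave as the test expects.
	return wazero.NewModuleConfig().WithSysNanosleep()
}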

@@ -100,7 +100,6 @@ func syscallConnControl(conn syscall.Conn, fn func(fd uintptr) (int, sys.Errno))
// because they are sensibly different from Unix's.
func newTCPListenerFile(tl *net.TCPListener) socketapi.TCPSock {
w := &winTcpListenerFile{tl: tl}
_ = w.SetNonblock(true)
@evacchi (Contributor, Author):

we no longer default to nonblock on sockets (it's not necessary)

Comment on lines +136 to +138
func (f *winTcpListenerFile) Poll(flag sys.Pflag, timeoutMillis int32) (ready bool, errno sys.Errno) {
return _pollSock(f.tl, flag, timeoutMillis)
}
@evacchi (Contributor, Author):

implement Poll properly for sockets

Comment on lines +120 to +122
if ready, errno := f.Poll(sys.POLLIN, 0); !ready || errno != 0 {
return nil, sys.EAGAIN
}
@evacchi (Contributor, Author):

we can rewrite this in terms of f.Poll()

@evacchi (Contributor, Author) commented Aug 3, 2023

oh, I was almost forgetting: while running make build.examples.zig-cc I noticed that the DWARF example was mistakenly being overwritten (I think the new build might have stripped the DWARF symbols), so we'll need to check that

@codefromthecrypt (Contributor) left a comment:

This looks good; I just might be missing a decision point around the poll interval. Right now, we seem to be polling via an immediate poll followed by a 100ms sleep.

100ms is a very long time, so I suspect this should be a lot shorter (probably even 100us could be too long). It is worse because the external sleep approach guarantees it will take that long.

So, I'm mainly wondering why poll with a short timeout isn't used, and whether it is a defect or we are trying to make the native side use the fake clock.

IMHO, since timeout is a parameter of poll in the Poll API, how the timeout is implemented is up to the backend, which may choose to use a real or fake clock to sleep, or a native poll. If in any case we are not able to trust the implementation of the poll timeout and are avoiding it for that reason, I would try to make it very clear why not, because in the worst case it can feel like "spaghetti around a problem" to do external orchestration of a feature defined in the poll documentation (the timeout).

go closeChAfter(sysCtx, timeout, timeoutCh)

pollInterval := 100 * time.Millisecond
ticker := time.NewTicker(pollInterval)
Contributor:

fyi: in closeChAfter we are intentionally using the context clock, but this will use a real one...

@evacchi (Contributor, Author):

whoops!

block the carrier thread of the goroutine, preventing any means
to support context cancellation directly.

We obviate this by invoking `poll` with a 0 timeout repeatedly,
Contributor:

I don't fully understand this. Why not poll with 100ms vs. poll zero + sleep? Are you suggesting that the poll implementation blocks too long even if you pass 100ms? If so, maybe the above paragraph needs to clarify this?

@evacchi (Contributor, Author):

yes, if we put a 100ms timeout, then the syscall will block for 100ms, which means it will also block the underlying OS thread. I will add a clarification.

0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0,
Contributor:

Suggested change
0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, // pad to record size (32 bytes)

@evacchi (Contributor, Author) commented Aug 4, 2023

100ms is a very long time, [...]
So, I'm mainly wondering why poll with a short timeout isn't used, and whether it is a defect or we are trying to make the native side use the fake clock.

100ms is completely arbitrary; I picked it because that's what I used in the Windows impl (which is now irrelevant, since it is being invoked here with 0ms). We could certainly use a smaller delay; I am not an expert at all here.

IMHO, since timeout is a parameter of poll in the Poll API, how the timeout is implemented is up to the backend, which may choose to use a real or fake clock to sleep, or a native poll. If in any case we are not able to trust the implementation of the poll timeout and are avoiding it for that reason, I would try to make it very clear why not, because in the worst case it can feel like "spaghetti around a problem" to do external orchestration of a feature defined in the poll documentation (the timeout).

yeah, the main issue is that the real poll actually hogs a real OS thread; that means, e.g., if you invoke poll_oneoff with a long delay, it won't return control to the Go runtime until the underlying poll has returned. This way we avoid relying on the syscall's native delay, and we give the Go runtime more chances to take over if necessary (e.g. to schedule other goroutines).

so this has more to do with the interaction between Go and the underlying syscall than with how the syscall is actually implemented at the OS level. I will add some notes to the RATIONALE.

the "bonus" is that by using the ctx clock we are also respecting that clock.

@codefromthecrypt (Contributor) commented Aug 4, 2023

This way we avoid relying on the syscall's native delay

What I'm probably missing here is that this is a timeout. What I'm thinking (and I could be wrong) is that a timeout is the worst case, while a delay means it is blocked regardless.

So say the timeout is 100us and the "file is ready to write" event happens at +10us. Using the clock-sleep approach, you have to wait an extra 90us anyway. If this is correct, then it seems we are trying too hard not to use poll's timeout even when Go uses it, and in the process we force a longer delay than necessary (blocking for the worst case even when an event occurs before it). What am I missing?

@ncruces (Collaborator) commented Aug 4, 2023

My understanding of the problem/solution is this:

If you poll with timeout:

  • the guest unblocks as soon as an event happens;
  • but while waiting…
  • the host has an OS thread blocked from doing anything else;
  • the host can only cancel the guest at timeout intervals.

If you poll zero and sleep the timeout:

  • the guest can only unblock at timeout intervals;
  • but while sleeping…
  • the host OS thread is free to do other things;
  • the host can cancel the guest at any time.

Basically this looks like a choice/balance between giving priority to host or guest resources.

For a single guest (like browser) environment, the first one would be the right call, hands down.
For scaling in the backend, I'd be inclined to go with the second, if we can't do better.

@evacchi (Contributor, Author) commented Aug 4, 2023

Since this introduces a significant change in how we handle poll_oneoff, I will close this PR for now and instead contribute the tests, small cleanups, and fixes that were part of it, without modifying how poll_oneoff handles FDs. We can always revisit this :)

@codefromthecrypt (Contributor) commented Aug 5, 2023

#1606 (comment) from @ncruces basically captures the concern at the crux of this issue.

I would say that both the abandoned code and the description seem to say that blocking via the syscall timeout is bad, yet I believe this is actually what Go does in its net poller.

It makes me wonder how anyone would be able to choose a good value to block the client, and why anyone would do this if there were only one blocking event. The guessed interval would always be at least a little wrong: you are choosing to wait either not long enough or too long, unless there's a constant stream of data.

If the project as a whole wants to prevent use of the timeout parameter in the syscall (basically to always block for no time and guess an interval to sleep for, with a syscall per guess), then I feel like the API should change and remove the ability to give a timeout (i.e. remove the param from File.Poll). As it stands, it makes even less sense to have a timeout parameter and not use it.

I think personally I've said enough on this and I'll leave any decision up to you all; just maybe decide once and for all in a couple of months? Because we are exposing the filesystem API, it should make sense why Poll would be exposed yet never use the timeout value, here or in a potential multi-poll scenario.

@codefromthecrypt (Contributor):

Whatever action we take, I believe that if we decide wazero should not support integrated syscall pollers, we should be very loud about it. It is a different direction than I expected, considering the syscall layer otherwise acts like syscalls. I was expecting a comment like the one Go has on an emulated thing, versus emulating poll behavior and intentionally not allowing native polling by doing so above where someone has control of the impl (above the fs API).

Basically, I would suggest studying the topic in Go (like here), making a decision, and then rationalizing, both in the API and in the RATIONALE, why specifically only poll is done like this, while other places like blocking Read will still block the calling thread, etc. I have an idea that folks can figure out a justification; I just don't want to moderate it.

@mathetake deleted the poll-all-fds branch August 8, 2023 02:15