-
-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix dereferencing of unaligned FILE_NAME_INFO #51
base: master
Are you sure you want to change the base?
Conversation
if res == 0 { | ||
return false; | ||
} | ||
let name_info: &FILE_NAME_INFO = &*(name_info_bytes.as_ptr() as *const FILE_NAME_INFO); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that a smaller change here would be to just use (name_info_bytes.as_ptr() as *const FILE_NAME_INFO).read_unaligned()
, right? Then you don't need to reason about uninitialized memory and what not, which seems like overkill here IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to use uninitialized memory here anyway - mem::zeroed()
would be just fine. I personally prefer to use MaybeUninit
when dealing with OS apis, as it helps flag the invariants we're actually interested in. Even without the type there, this function entirely relies on the buffer being written to. That said, I'm happy to make the change to mem::zeroed
if you decide this breaks the complexity budget :)
Edit: On second thought mem::zeroed()
is overkill too 😄 we can simply initialize the struct itself. In the new commit, it turns out this drastically reduces the unsafety, limiting it to only the winapi calls.
As for a read_unaligned
, I think using the actual type rather than a byte buffer makes the intention clearer.
let size = mem::size_of::<FILE_NAME_INFO>();
// Imo, this takes no less effort to grok
let mut name_info_bytes = vec![0u8; size + MAX_PATH * mem::size_of::<WCHAR>()];
The main problem with using read_unaligned
would be that the invariants it requires are actually more complex than dereferencing a struct we've just initialized. i.e. The change you suggest would also be unsound: s
would now have a buffer overflow because only 1 WCHAR
would be read from the buffer (based on winapi's FILE_NAME_INFO
struct). We could absolutely make read_unaligned
work (we'd need to use something like name_info_bytes.as_ptr().add(4)
on line 143) but my concern would be that the safety is actually more nuanced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
w.r.t. to read_unaligned
, I was suggesting it specifically as a replacement to the raw pointer dereference. If read_unaligned
is unsound without some pointer arithmetic, then so is the raw pointer dereference. The key here is that this PR is trying to solve two problems (unaligned dereference and getting rid of heap alloc) despite the title of the PR only referring to one (unaligned dereference). Getting rid of heap allocs seems like a questionable thing to me, which means the smallest diff for this PR to address the unaligned deference is to simply use an unaligned load.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, I'm fine with fixing the heap alloc issue here. While it's not clear to me why fixing it is important, there is little cost to doing so I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That certainly wasn't the intention! I totally agree that it's good to keep the scope of these changes small, I just found that trying to align the vector properly was trickier than doing it on the stack, as I needed to intertwine the alignment with the generic type I was putting in the vector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Also, I know you're only skimming so I don't blame you for misreading the change, but the difference between a read_unaligned
and the previous pointer dereference is that it means we have a FILE_NAME_INFO
instead of a &FILE_NAME_INFO
. This makes name_info.FileName.as_ptr()
below unsound because now it's moved away from the actual path data! Point being, I'm glad we don't have to have any more nasty unsafe here :P)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. Tricksy indeed. You're totally right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does look better to me than before, thanks. :-)
Note that I am not the maintainer of this crate. This crate is just a critical dependency for me, so I try to keep my eye on it.
if res == 0 { | ||
return false; | ||
} | ||
let name_info: &FILE_NAME_INFO = &*(name_info_bytes.as_ptr() as *const FILE_NAME_INFO); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
w.r.t. to read_unaligned
, I was suggesting it specifically as a replacement to the raw pointer dereference. If read_unaligned
is unsound without some pointer arithmetic, then so is the raw pointer dereference. The key here is that this PR is trying to solve two problems (unaligned dereference and getting rid of heap alloc) despite the title of the PR only referring to one (unaligned dereference). Getting rid of heap allocs seems like a questionable thing to me, which means the smallest diff for this PR to address the unaligned deference is to simply use an unaligned load.
unsafe fn msys_tty_on(fd: DWORD) -> bool { | ||
use std::{mem, slice}; | ||
|
||
fn msys_tty_on(fd: DWORD) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this should be unsafe
. It's taking a raw file descriptor without any guarantees about the validity of that file descriptor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fd
isn't used as a file descriptor internally, just as an argument to GetStdHandle
which maps the handle id to the real runtime handle. I've documented why that's sound in its own unsafe
block, which I'm fairly confident upholds winapi
s contracts.
Most winapi
APIs are actually very forgiving with handles, and will simply return errors when they're invalild.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sunfishcode What do you think about this? Particularly with respect to your new I/O safety RFC, should a routine like this be marked as unsafe
even though there is no possibility of UB for any value of fd
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This... is a subtle one. It's true there's no possibility of UB here, or even corrupted I/O. On the other hand, the overall pattern here of using a pre-existing I/O handle without being given explicit access to it is the kind of thing I/O safety wants to avoid. On the first hand though, atty's own public API does this, by taking an enum instead of a handle; it's not unsafe
, and this is doing essentially the same thing.
So I think it's reasonable to leave this as a safe function for now. But I also think it makes sense to explore APIs where the user passes in a handle.
struct FILE_NAME_INFO { | ||
FileNameLength: DWORD, | ||
FileName: [WCHAR; MAX_PATH] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know why the winapi
definition isn't equivalent to this? I'm worried we're missing something given that winapi
chose a different representation and it's not clear why to me. cc @retep998
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the way winapi normally models C's flexible array members, as it's typically the most flexible (heh) option. The actual max length of file name that GetFileInformationByHandleEx
will write is controlled by the fourth parameter, which could well be less than the full MAX_PATH
length (and could even be longer! It's possible to enable even longer path lengths, but our pty
s shouldn't need to worry about that :))
GetStdHandle(fd) | ||
}; | ||
let res = unsafe { | ||
// Safety: handle is valid, and buffer length is fixed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: The handle is not necessarily valid, GetStdHandle
can return an INVALID_HANDLE_VALUE
(ref docs) and it's fragile to rely on this api not doing anything wrong when passed invalid data. I would be more comfortable with this error being checked in rust before calling the system api.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GetFileInformationByHandleEx
should be well-behaved here, as documented in the comments. Handling an invalid handle is one of its error states. That said, I'd never really argue against defensive programming especially around winapi!
Either way, I probably won't do anything more with PR without input from a maintainer
The `atty` crate seems unmaintained. There's https://rustsec.org/advisories/RUSTSEC-2021-0145 filed against it, which `cargo-deny` complains about. A fix for that has been open for well over a year without being fixed (softprops/atty#51). The `is-terminal` crate is a fork of `atty` with that fixed, plus some other changes.
The `atty` crate seems unmaintained. There's https://rustsec.org/advisories/RUSTSEC-2021-0145 filed against it, which `cargo-deny` complains about. A fix for that has been open for well over a year without being fixed (softprops/atty#51). The `is-terminal` crate is a fork of `atty` with that fixed, plus some other changes. Since we also depend on `atty` via `clap`, I also added an exception to the `cargo-deny` config.
The `atty` crate seems unmaintained. There's https://rustsec.org/advisories/RUSTSEC-2021-0145 filed against it, which `cargo-deny` complains about. A fix for that has been open for well over a year without being fixed (softprops/atty#51). It turns out the functionality is also available via the `crossterm` crate (thanks, @yuja), which we already depend on. Since we also depend on `atty` via `clap`, I also added an exception to the `cargo-deny` config.
The `atty` crate seems unmaintained. There's https://rustsec.org/advisories/RUSTSEC-2021-0145 filed against it, which `cargo-deny` complains about. A fix for that has been open for well over a year without being fixed (softprops/atty#51). It turns out the functionality is also available via the `crossterm` crate (thanks, @yuja), which we already depend on. Since we also depend on `atty` via `clap`, I also added an exception to the `cargo-deny` config.
Fixes #50 as the problem was confirmed by a user on the community discord. I'm reasonably confident there aren't any problems with the new code, but it's always a good idea to have this checked 😀