Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

caprevoke: merge dev 2023-07-24 #1766

Closed
wants to merge 2,906 commits into from
Closed

Conversation

brooksdavis
Copy link
Member

PR for CI

jrtc27 and others added 30 commits June 29, 2023 10:37
Whilst the kernel can support any number of COMPAT_FOO, world can only
build a single libfoo. Upstream this isn't such an issue, since the only
option is lib32 anyway, but downstreams, such as CheriBSD, may wish to
support multiple at the same time. Thus, adjust the top-level Makefiles
to turn _LIBCOMPAT into a _LIBCOMPATS list that gets iterated over, and
adjust bsd.compat.mk to support this use-case.

For the normal NEED_COMPAT/WANT_COMPAT case, LIBCOMPATFOO remain set and
refer to the requested compat's, preserving the current interface. For
the top-level Makefiles those variables are no longer set (since there
is no longer "the" compat) and only the per-compat ones are available.

Reviewed by:	brooks, jhb, imp
Differential Revision:	https://reviews.freebsd.org/D40571

(cherry picked from commit 91d7edd)
Makefile.libcompat now supports building multiple sets of compat
libraries so this is no longer needed. It is, however, rather academic
for now given CHERI-MIPS was removed (the only CHERI architecture whose
base architecture has LIB32 support in FreeBSD), at least until upstream
gains LIB32 for arm64.aarch64.
get_arm64_tls pokes at the PCB for TPIDR_EL0 so needs the relevant
definitions. Currently we have a downstream diff to sys/systm.h that
pulls in machine/pcb.h, but that was only needed for CHERI-MIPS and so
should go away, which exposes this missing include.
Upstream doesn't pull it in here; this diff dates back to CHERI-MIPS,
which stored DDC in the PCB and thus __USER_CAP needed its definition.
However, neither Morello nor CHERI-RISC-V get it from the PCB and so do
not need this include (nor is it a generic thing that's needed), so we
should not have this as a downstream diff any more.

This fixes building lib32 libprocstat on amd64. Currently it dies
building zfs_defs.c, since it transitively includes machine/fpu.h via
sys/systm.h and, because it hackily defines _KERNEL, gets kernel
prototypes for things like fpu_save_area_free which reference struct
savefpu, but x86/fpu.h will give the i386 definition, which is a union
not a struct, giving a tag mismatch error. This issue once again
highlights how awful this approach is, and how fragile the whole thing
is when you start modifying system headers. We should probaby invest
efforts in fixing this mess upstream given the continued friction seen
around libprocstat/zfs downstream.
The situation for cleandir is a bit of a mess. The top-level build runs
cleandir normally and re-runs it per-libcompat under LIBCOMPATWMAKE.
This includes all programs, not just libraries, and so we end up in
nonsense situations. As a result, bsd.compat.mk is stubbed out for
cleandir, which in turn means it won't define libcompats, yet the top
level has defined _LIBCOMPATS, and so we end up trying to index it and
error out. Thus, skip this code in that instance, since cleandir isn't
referencing any of these targets. Note that, prior to multi-libcompat
support, we'd just reference the undefined libcompat and it would expand
to the empty string (which isn't an error in the contexts it was done,
other than the one removed in 86ddc08), but now we have to be more
careful.

This should all be reworked upstream so cleandir doesn't do stupid
things and these hacks can go (including reverting the .warn that
otherwise gets hit in bsd.compat.mk back to an .error).

Fixes:	cb783c2 ("Generalise libcompat to be a list rather than a single option")
Fixes:	#1729
Fixes:	c5c2d3d ("c18n: Insert trampolines for PLT function calls")
These add a bit to the build time, but we want to check that we're not
regressing the lib32 build as has happened in the past. We may wish to
flip the default in cheribuild (or remove the option entirely) at some
point in the future once we don't need to support versions where the
build is broken.
The local parts of rc already skip .sample files; we add .pkgsave to the
list, and add logic for base.

Thanks to @RhodiumToad for getting this started.

Differential Revision: https://reviews.freebsd.org/D27962
Reviewed by: imp
Pull Request: freebsd/freebsd-src#662

(cherry picked from commit 3693d91)
It was hard to tell what the if conditions were.
When the parent is GEN1, capdirty pages will also be GEN1, but now the
child will be GEN0 so there will be a mismatch leading to faulted capdirty
pages being revoked when revocation isn't in progress.  If no revocation
is triggered before the fault happens then vm_map.vm_cheri_revoke_test may
be NULL leading to a panic.
This is a potentially expensive check we we might not want to keep it
on by default even under invariants.
When called from exec_new_vmspace() to empty an unshared vmspace prior
to exec, the vm_map will be initialized with a zeroed enqueue and
dequeue epoch in the info page. When the process had previously revoked
and we failed to reset the state and zero the kernel's notion of the
epoch we ended up in an infinite loop were the kernel would see no work
to do (passed start epoch in the past) and thus would not update the
info page resulting in no progress being made.

One could alternatively reset the state in exec_new_vmspace(), but the
extra store seems harmless in the vm_map_free() case.
Verify that the empty vmspace is not revoking and has a zero epoch.
If the epoch is open then vm_cheri_revoke_test has been set and we need
to copy it was that we can test capabilities when we come back to finish
this epoch.
If the epoch is open we may be in the midst of a revocation and thus
rev_entry may be set in old_map.  In that case, set the corresponding
entry in new_map.
Add a somewhat unprincipled set of __has_feature(capabilities) checks to
allow cheri/revoke.h to be included unconditionally by libsysdecode's
implementation.
Support the flags arguments of cheri_revoke(2) and
cheri_revoke_get_shadow(2).  Use compressed names (sysdecode_cr*()) as
otherwise the formatting of the manpage is completely trashed (mandoc
uses too much space between columns).
Add two new modes to cheribsdtest_cheri_revoke_lib_run which start or
end a revocation of the entire array (rather than looping over it to
process a random number of times (average should be)).  Use this to open
an epoch prior to forking and close it in the child process.
kevans91 and others added 28 commits July 20, 2023 17:55
Simply resume waiting for events rather than exiting if we took a signal
here.

This at least fixes running programs under daemon(8) in the face of
suspend/resume, which I suspect hits us with a spurious EINTR rather
than a signal anyways.

Reported and tested by:	manu
Fixes:	8935a39 ("daemon: use kqueue for all events")
Merge commit 484e64f7e7b2 from llvm-project (by Roland McGrath):

  [libc++] Use __is_convertible built-in when available

  llvm/llvm-project#62396 reports that
  GCC 13 barfs on parsing <type_traits> because of the declarations
  of `struct __is_convertible`.  In GCC 13, `__is_convertible` is a
  built-in, but `__is_convertible_to` is not.  Clang has both, so
  using either should be fine.

  Reviewed By: #libc, philnik

  Differential Revision: https://reviews.llvm.org/D149313

Reported by:	Mark Millard <[email protected]>
MFC after:	3 days
mfiutil(8) in theory can work on devices attached to mrsas(4) but
mrsas(4) is missing the FreeBSD mfi(4) ioctl support.  Once that
support is added to mrsas(4) then mrsasutil(8) can manage mrsas(4)
attached devices.  So this commit depends on that.  When mrsasutil(8)
is run it automatically opens /dev/mrsas0 instead of /dev/mfi0.
Add -D <device> and -t <type> flag to optionally specify mrsas or mfi to
work with the existing -u <unit>.  Device is the device node with or
without /dev/

PR:			https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=265794
Differential Revision:	https://reviews.freebsd.org/D36343
Tested by:		Dan Mahoney <[email protected]>
Reviewed by:		jhb
If the i-node number (d_fileno) for a file on the server did
not fit in 32bits, it would be truncated to the low order 32bits
for the NFSv3 Readdir and ReaddirPlus RPC replies.
This is no longer correct, given that ino_t is now 64bits.

This patch fixes this by sending the full 64bits of d_fileno
on the wire in the NFSv3 Readdir/ReaddirPlus RPC reply.

PR:	271174
Reported by:	[email protected]
Tested by:	[email protected]
MFC after:	2 weeks
Some highlights from NEWS:

 ** Added OpenSSL 3.0 compatibility.
 ** Removed OpenSSL 1.0 compatibility.
 ** Support for FIDO 2.1 "minPinLength" extension.
 ** Support for COSE_EDDSA, COSE_ES256, and COSE_RS1 attestation.
 ** Support for TPM 2.0 attestation.
 ** Support for device timeouts; see fido_dev_set_timeout().
 ** New API calls:
  - es256_pk_from_EVP_PKEY;
  - fido_cred_attstmt_len;
  - fido_cred_attstmt_ptr;
  - fido_cred_pin_minlen;
  - fido_cred_set_attstmt;
  - fido_cred_set_pin_minlen;
  - fido_dev_set_pin_minlen_rpid;
  - fido_dev_set_timeout;
  - rs256_pk_from_EVP_PKEY.
 ** Reliability and portability fixes.
 ** Better handling of HID devices without identification strings; gh#381.

Relnotes:       Yes
Sponsored by:   The FreeBSD Foundation
Some highlights from NEWS:

 ** bio: fix CTAP2 canonical CBOR encoding in fido_bio_dev_enroll_*();
    gh#480.
 ** New API calls:
  - fido_dev_info_set;
  - fido_dev_io_handle;
  - fido_dev_new_with_info;
  - fido_dev_open_with_info.
 ** Documentation and reliability fixes.
 ** Support for TPM 2.0 attestation of COSE_ES256 credentials.

Relnotes:       Yes
Sponsored by:   The FreeBSD Foundation
If ftype is VM_PROT_WRITE | VM_PROT_WRITE_CAP but the PTE has PTE_W
clear, we'll accidentally skip over the "goto done" and end up returning
1 as if we did something.  For the same ftype, even if PTE_W is set, we
won't assert PTE_D.  Fix by treating VM_PROT_WRITE as the flag it is
rather than a whole value.
Now that we're tracking capdirty, it might be informative to show
userland.  Once upon a time, we thought this might be part of the
revoker implementation, but now we think it's just diagnostics.
* If we're allocating a page to store capabilities to it, assume we're
  about to do that in the near future and let hardware mark it CAPDIRTY,
  rather than requiring another trap to mark it CAPSTORE first.

* Do the same thing if we're populating a page to store capabilities to
  it.

* If we're on the soft_fast path, we might be upgrading a page from not
  storing capabilities to storing capabilities, so mark it CAPSTORE here,
  too.

* Otherwise, if we're about to land a page, assume that we want to keep
  the existing CAPSTORE status and so zero VM_PROT_WRITE_CAP
When we're fishing around at the MI layer to see if we can advise
pmap_enter to install a superpage of some form, vm_page_ps_test runs
predicates across the contiguous collection of pages.  Add a predicate
that all pages are considered possibly cap-bearing (i.e., are all
PGA_CAPDIRTY).

When we're attempting to map a superpage in a cap-store permitting
way, use this predicate to make sure that all constituent pages are
PGA_CAPDIRTY, just as we would have if we'd mapped them individually.
The swap pager might directly set tags on physical pages, bypassing
the MMU.  Go ahead and mark the vm_page as CAPDIRTY (and CAPSTORE) in
such cases.
When first mapping a page with abstract capability store permission,
should we map it CAPSTORE?  If we think that it's likely to experience
actual capability stores soon, yes; if not, then these cap-devoid pages
will create extra work for the (eventual) revoker.  Add sysctl so we
can experiment; default to minimizing traps (vm.capstore_on_alloc=1).
This function replaces the pattern:

	if (pmap_pte_dirty(pmap, pte))
		vm_page_dirty(m);

In this commit it's a (mostly) straightfoward simplification replacing
all but one call to vm_page_dirty().  A follow on change will use it to
set the capability dirty state (capdirty) as required.

Co-authored-by: Nathaniel Filardo <[email protected]>
Co-authored-by: Brett Gutstein <[email protected]>
This function replaces the pattern:

	if ((l3e & PTE_D) != 0)
		vm_page_dirty(m);

In this commit it's a (mostly) straightfoward simplification replacing
all but one call to vm_page_dirty().  A follow on change will use it to
set the capability dirty state (capdirty) as required.

Co-authored-by: Nathaniel Filardo <[email protected]>
While here, move EXCP_STORE_AMO_CAP_PAGE_FAULT traps to the ordinary
data_abort() path.  This builds directly on VM_PROT_WRITE_CAP,
unlike in the MIPS analogue, for which the equivalent of
EXCP_STORE_AMO_CAP_PAGE_FAULT is a C2E trap, not a TLB trap.
Set ATTR_SC on kernel mappings rather than CDBM.  We don't currently
need to track capdirty in kernel so just don't worry about it rather
than attemping to emulate in a trap handler.

Fixes: capdirty morello: support for capdirty tracking
This includes the merge of capdirty and tracing support into dev as well
as updates from upstream.
@brooksdavis
Copy link
Member Author

Oops, pushed this by accident before...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.