
Resource proxy mutability, Mutex trait and closure captures (async fn) #36

Open · perlindgren opened this issue Jun 11, 2020 · 7 comments

Resource proxy mutability

Mutable references to resource proxies

Currently in RTIC/RTFM we need a mutable reference to the resource proxies:

#[task(binds = GPIOA, resources = [shared])]
fn gpioa(mut c: gpioa::Context) {
    // the lower priority task requires a critical section to access the data
    c.resources.shared.lock(|shared| {
        // data can only be modified within this critical section (closure)
        *shared += 1;
        c.resources.shared.lock(|shared2| { // <- here Rust would spot the error (*ERROR*)
            // both `shared` and `shared2` would point to the same location
        });
    });
}

Rust requires that no two mutable references in scope point to the same location; the borrow checker ensures this at compile time.

Pros

  • Statically checked, error messages upfront.
  • Zero run-time cost (no ref counting).
  • ...

Cons

  • The corresponding Mutex trait requires a mutable reference to the resource proxy, which has proven difficult to adopt for general-purpose use.
  • A mutable reference cannot be captured by multiple closures, making it difficult for async functions to use RTIC resources.
  • ...

Alternative approach

Currently, lock optimization is done by the LLVM backend using "metadata" holding the current priority. This is achieved by passing an additional parameter (priority) that is later optimized out:

pub unsafe fn lock<T, R>(
    ptr: *mut T,
    priority: &Priority,
    ceiling: u8,
    nvic_prio_bits: u8,
    f: impl FnOnce(&mut T) -> R,
) -> R {
    let current = priority.get();

    if current < ceiling {
        if ceiling == (1 << nvic_prio_bits) {
            // ceiling is the maximum priority: use a full interrupt-free section
            priority.set(u8::max_value());
            let r = interrupt::free(|_| f(&mut *ptr));
            priority.set(current);
            r
        } else {
            // raise BASEPRI to the ceiling for the duration of the closure
            priority.set(ceiling);
            basepri::write(logical2hw(ceiling, nvic_prio_bits));
            let r = f(&mut *ptr);
            basepri::write(logical2hw(current, nvic_prio_bits));
            priority.set(current);
            r
        }
    } else {
        // already at or above the ceiling: no hardware lock needed
        f(&mut *ptr)
    }
}

We could think of a similar approach, but applied to the resource proxies. In this case it would be a reference counter (in fact, a `locked: bool` would do). All resource proxies in the context could be initialized to `false`, set to `true` when locked, and reset to `false` when unlocked. When locking, we could panic in case `locked == true`, indicating the *ERROR* in the code above.
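As a minimal sketch of the idea (illustrative names only; the real proxies are generated by the RTIC macros):

use core::cell::Cell;

// Illustrative proxy: `locked` starts out false for each task invocation.
pub struct Proxy<'a, T> {
    ptr: *mut T,            // points at the shared resource
    locked: &'a Cell<bool>, // per-proxy re-lock detector
}

impl<'a, T> Proxy<'a, T> {
    // Note `&self`, not `&mut self` -- the relaxation discussed below.
    pub fn lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        if self.locked.replace(true) {
            panic!("Resource locked in sequential context"); // the *ERROR* case
        }
        // (the real implementation would also raise the priority ceiling here)
        let r = f(unsafe { &mut *self.ptr });
        self.locked.set(false);
        r
    }
}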

Pros

  • We could adhere to a relaxed requirement on the resource proxy (now `&resource_proxy` instead of `&mut resource_proxy`), allowing the Mutex trait to be relaxed accordingly.
  • It would allow sharing of resource proxies among closures, such as async functions (see the sketch after this list).
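With `&self`, a single proxy can be captured by reference in several closures; the borrow checker is satisfied, and the run-time flag catches any actual nested locking. A hedged sketch, reusing the illustrative Proxy above:

fn demo(proxy: &Proxy<'_, u32>) {
    // Two closures capture the same proxy by shared reference -- this would
    // be rejected by the borrow checker if `lock` took `&mut self`.
    let inc = || proxy.lock(|v| *v += 1);
    let dec = || proxy.lock(|v| *v -= 1);
    inc();
    dec();
}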

Cons

  • Run-time checking instead of static analysis. (We cannot be sure without further analysis that the code will not panic.)
  • Memory OH (likely a byte for each resource, or maybe even a word depending on padding). Even if LLVM optimizes out the check, the memory for the `locked` field will still be allocated (but never touched).
  • Run-time OH in case LLVM fails to optimize out the check.

Soundness

Since we panic if a "double lock" occurs, the approach should be safe.

Complexity of implementation

Fairly low.

Breaking change?

The change will likely not break current code, since mutable references would merely be downgraded to immutable references.

@perlindgren commented:

Looking closer at the implementation: the resource proxy could be created for each task invocation, meaning that the OH of the extra field would be only a stack allocation, never stored statically, so it's not really an issue (at least not as much as a static allocation would be).

@perlindgren commented Jun 12, 2020:

Another positive implication is that this approach can be seen as a limited run-time soundness verification. I.e., if the borrow checker were somehow broken, we would at least catch this by a panic. This is not a very strong guarantee, though, since it only checks the case of nested re-locking, not other soundness issues that a broken borrow checker could cause. However, if we store `locked` with the actual resource (stored statically), then we would actually have run-time verification of soundness across tasks. This brings OH, as the checks would be done for each resource lock, but that might be a price worth paying for safety/security-critical applications.
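A rough sketch of what such cross-task verification could look like, assuming the flag lives next to the statically allocated resource (illustrative only, not branch code):

use core::cell::UnsafeCell;
use core::sync::atomic::{AtomicBool, Ordering};

// The lock flag is stored with the resource itself, in static memory,
// so a double lock is caught across tasks, not just within one task.
pub struct CheckedResource<T> {
    data: UnsafeCell<T>,
    locked: AtomicBool,
}

unsafe impl<T: Send> Sync for CheckedResource<T> {}

impl<T> CheckedResource<T> {
    pub const fn new(value: T) -> Self {
        CheckedResource {
            data: UnsafeCell::new(value),
            locked: AtomicBool::new(false),
        }
    }

    pub fn lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        // `swap` returns the previous value: true means the lock is held.
        if self.locked.swap(true, Ordering::Acquire) {
            panic!("double lock detected");
        }
        // (in RTIC the priority-ceiling protocol makes the access itself safe)
        let r = f(unsafe { &mut *self.data.get() });
        self.locked.store(false, Ordering::Release);
        r
    }
}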

Notably, the big difference between FreeRTOS and SafeRTOS is that the latter is certified to standards (automotive etc.) by run-time checking/verification (and by crippling the API functionality/surface). In our case the OH would be very small and predictable (and we do not need to cripple the API functionality/surface), since the API is sound by design. The only potentially unsound operations in RTIC reside in small and well-defined blocks of unsafe code.

@perlindgren commented Jun 13, 2020:

There is now a testing branch for the aforementioned approach: cortex-m-rtic/immutable_resource_proxies, with the corresponding rtic-core/immutable_resource_proxies.

The Mutex trait no longer takes `&mut self`, but the more relaxed `&self`.
Previously, Exclusive implemented the Mutex trait; I removed this for the time being, as it is unclear how, and if, we want this at all. Exclusive already holds an `&mut`; we would temporarily need to downgrade mutability to fit the Mutex trait and restore it for the closure application. I am not sure whether a transmute could be used to restore the mutability, or if we could do it in a cleaner way. (RefCell hands out `&mut T`, so maybe looking there could give some hints, or maybe I'm missing the obvious; see the sketch below.)
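One conceivable shape, loosely following the RefCell pattern (a sketch under my own assumptions, not the branch code): store the `&mut T` as a raw pointer, so that a `&self` method can hand back a `&mut T`, guarded by the locked flag.

use core::cell::Cell;
use core::marker::PhantomData;

pub struct Exclusive<'a, T> {
    ptr: *mut T,                       // erased &'a mut T
    locked: Cell<bool>,                // guards against aliasing the &mut T
    _lifetime: PhantomData<&'a mut T>, // ties us to the original borrow
}

impl<'a, T> Exclusive<'a, T> {
    pub fn new(r: &'a mut T) -> Self {
        Exclusive {
            ptr: r,
            locked: Cell::new(false),
            _lifetime: PhantomData,
        }
    }

    pub fn lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        if self.locked.replace(true) {
            panic!("Re-lock of Exclusive resource");
        }
        // Safety: the `locked` flag ensures the &mut T is never aliased.
        let r = f(unsafe { &mut *self.ptr });
        self.locked.set(false);
        r
    }
}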

In any case, the Boolean `locked` is handled by a Cell, initialized to `false` for each sequential context (task invocation). The auto-generated code implementing `lock` now looks like this:

impl<'a> rtic::Mutex for resources::shared<'a> {
    type T = u32;
    #[inline(always)]
    fn lock<R>(&self, f: impl FnOnce(&mut u32) -> R) -> R {
        #[doc = r" Priority ceiling"]
        const CEILING: u8 = 2u8;
        if unsafe { self.get_locked() } {
            ::core::panicking::panic("Resource locked in sequential context");
        };
        unsafe {
            self.set_locked(true);
        }
        let r = unsafe {
            rtic::export::lock(
                &mut shared,
                self.priority(),
                CEILING,
                lm3s6965::NVIC_PRIO_BITS,
                f,
            )
        };
        unsafe {
            self.set_locked(false);
        }
        r
    }
}

self here is the resource proxy. We check whether the lock is taken; if so, we panic. Otherwise we claim the lock (set_locked(true)), perform the real hardware lock (executing f), and restore the locked state to false after returning.

The lock3 example (cargo run --example lock3 --target thumbv7m-none-eabi) works: all is well, and it behaves like the original lock example, as there is no re-locking of resources.

The lock4 example (cargo run --example lock4 --target thumbv7m-none-eabi) blows up at run-time, as expected:

A
B - shared = 1
panicked at 'Resource locked in sequential context', examples/lock4.rs:12:1

Source:

c.resources.shared.lock(|shared| {
    // data can only be modified within this critical section (closure)
    *shared += 1;

    // GPIOB will *not* run right now due to the critical section
    rtic::pend(Interrupt::GPIOB);

    hprintln!("B - shared = {}", *shared).unwrap();

    c.resources.shared.lock(|_shared| {}); // here we try to re-lock
});

I tested this both in dev and release mode.

It will be interesting to see how well LLVM copes with the added complexity regarding optimization, and if it does not, what a realistic overhead would be.

The gpioa code of lock2.rs compiles to:

000000f0 GPIOA:
      f0:       push    {r7, lr}
      f2:       mov     r7, sp
      f4:       movs    r0, #192
      f6:       bl      #336
      fa:       movw    r0, #0
      fe:       movt    r0, #8192
     102:       ldr     r1, [r0]
     104:       adds    r1, #1
     106:       str     r1, [r0]
     108:       movw    r0, #57856
     10c:       movs    r1, #2
     10e:       movt    r0, #57344
     112:       str     r1, [r0]
     114:       movs    r0, #224
     116:       bl      #304
     11a:       movs    r1, #38
     11c:       movs    r0, #24
     11e:       movt    r1, #2
     122:       bl      #270
     126:       movs    r0, #0
     128:       pop.w   {r7, lr}
     12c:       b.w     #282 <__basepri_w>

This code includes pending a task and exiting the simulator. A more realistic example is lock2_minimal.rs.

Here we get:

000000f0 GPIOA:
      f0:       push    {r7, lr}
      f2:       mov     r7, sp
      f4:       movs    r0, #192
      f6:       bl      #324
      fa:       movw    r0, #0
      fe:       movt    r0, #8192
     102:       ldr     r1, [r0]
     104:       adds    r1, #1
     106:       str     r1, [r0]
     108:       movs    r0, #224
     10a:       bl      #304
     10e:       movs    r0, #0
     110:       pop.w   {r7, lr}
     114:       b.w     #294 <__basepri_w>

As a comparison, the same example under vanilla RTIC gives:

000000f0 GPIOA:
      f0:       push    {r7, lr}
      f2:       mov     r7, sp
      f4:       movs    r0, #192
      f6:       bl      #324
      fa:       movw    r0, #0
      fe:       movt    r0, #8192
     102:       ldr     r1, [r0]
     104:       adds    r1, #1
     106:       str     r1, [r0]
     108:       movs    r0, #224
     10a:       bl      #304
     10e:       movs    r0, #0
     110:       pop.w   {r7, lr}
     114:       b.w     #294 <__basepri_w>

At least for simple examples there is zero OH incurred by immutable_resource_proxies: LLVM was able to remove all the reference counting, i.e., the setting and clearing of the locked Cell.

As it's past 2 AM, I will rest at this.

Feel free to experiment with the relaxed resource proxies. I look forward to further discussions, evaluations, etc., and I hope it will play nicely with other developments, like goodbye Exclusive, lock optimization, etc.

@perlindgren commented:

Closed by mistake ...

@perlindgren commented Jun 13, 2020:

By the way, the bl is calling the read/write of BASEPRI; cargo objdump somehow lost the named symbols to external code in the disassembly. (I should have run with the inline asm on nightly instead; then you see exactly what's going on.) We should have a closer look at the fences; I am not entirely sure they are correct/sufficient.

@perlindgren commented Jun 15, 2020:

Update.

The immutable_resource_proxies branches of rtic-core and cortex-m-rtic now include Exclusive, following the same pattern as Mutex (i.e., panic on re-lock).

Currently, RTIC hands out `&mut T` for exclusively owned resources. These can now be wrapped in an `Exclusive<&'a mut T>` data structure, supporting Deref, DerefMut, and Mutex lock. It is thus possible to pass them on to external code (e.g., HAL implementations) that internally locks the resources where needed.
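A hedged usage sketch, assuming the relaxed trait signature from the branch (lock taking `&self`, with associated type `T`): HAL-side code only needs the Mutex bound and locks internally.

// Hypothetical HAL-side function: generic over anything implementing the
// relaxed Mutex trait, be it a resource proxy or an Exclusive wrapper.
fn hal_update(shared: &impl rtic::Mutex<T = u32>) {
    shared.lock(|v| *v = v.wrapping_add(1));
}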

In this way we can support a symmetric API. In the future we might choose to hand out Exclusive instead of `&mut T`. This should also work well with #30, allowing one to explicitly state the set of Exclusive resources (and hence direct Deref/DerefMut will never break).

The examples lock5..lock9 demonstrate various safety features.

No OH measurements have been done yet, but since Exclusive is less complex than the general Mutex resource proxies, LLVM should have an easier time optimizing away the underlying Cell and its counterparts.

The panics could be implemented as calls into a linked library that is not present; any such calls could then be spotted at compile (link) time.
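A minimal sketch of the trick (my reconstruction, not the branch code): the panic path calls an external re_lock symbol that is deliberately never defined, so it survives to link time only if LLVM cannot prove the call dead.

extern "Rust" {
    // Deliberately left undefined: if this call is not optimized out,
    // the linker reports `undefined symbol: re_lock`.
    fn re_lock() -> !;
}

fn check_not_locked(locked: &core::cell::Cell<bool>) {
    if locked.replace(true) {
        // replaces the panic on the re-lock path
        unsafe { re_lock() }
    }
}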

A re-lock would then cause a linking error:

❯ cargo run --example lock8 --target thumbv7m-none-eabi --release
   Compiling rtic-core v0.3.0 (/home/pln/rust/grepit/rtic-core)
   Compiling cortex-m-rtic v0.5.3 (/home/pln/rust/grepit/cortex-m-rtic)
error: linking with `rust-lld` failed: exit code: 1
  |
  = ...88J/libcortex_m-19f1dd21ecd80693.rlib" "--end-group" "/home/pln/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/thumbv7m-none-eabi/lib/libcompiler_builtins-fa0816b847f38cf8.rlib" "-Tlink.x" "-Bdynamic"
  = note: rust-lld: error: undefined symbol: re_lock
          >>> referenced by lock8.cmw4tywi-cgu.0
          >>>    

Notice, this is a poor man's solution, but it gives a static guarantee that the program will never reach the external re_lock function (here LLVM has proven the program free of re-locks). Of course, this works only in release builds, and it can also show false positives: LLVM may not be able to prove re-lock freedom even though the program would never run into the re-locks at run-time (so a correct program could be rejected). It can thus be seen as opt-in (or maybe opt-out) functionality, provided behind a feature. This is currently not implemented though.

@perlindgren commented:

Update.

As a proof of concept, there is now a feature link_fail. With the feature enabled, building the lock8 example fails at link time:

  Compiling cortex-m-rtic v0.5.3 (/home/pln/rust/grepit/cortex-m-rtic)
error: linking with `rust-lld` failed: exit code: 1
  |
  = note: "rust-lld" "-flavor" "gnu" "-L" "/home/pln/.rustup/toolchains...fa0816b847f38cf8.rlib" "-Tlink.x" "-Bdynamic"
  = note: rust-lld: error: undefined symbol: re_lock
          >>> referenced by lock8.9vno1x9n-cgu.0
          >>>               /home/pln/rust/grepit/cortex-m-rtic/target/thumbv7m-none-eabi/release/examples/lock8-6bf696e4b09f1abf.lock8.9vno1x9n-cgu.0.rcgu.o:(GPIOC)

While, without the feature enabled, the example builds, runs, and panics at run-time:

cargo run --example lock8 --target thumbv7m-none-eabi --release 
    Finished release [optimized] target(s) in 0.02s
     Running `qemu-system-arm -cpu cortex-m3 -machine lm3s6965evb -nographic -semihosting-config enable=on,target=native -kernel target/thumbv7m-none-eabi/release/examples/lock8`
A
B - shared = 1
panicked at 'Re-lock of Exclusive resource', /home/pln/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/macros/mod.rs:34:9

I have not implemented link_fail for the Mutex panics, but it could be done in a similar fashion. I am not sure the name link_fail is the best; perhaps link_fail_on_re-lock would be more descriptive, but I got lazy!
