written on November 23, 2022
As a Rust programmer you are probably quite familiar with how references
work in Rust. If you have a value of type T
you can generally get
various references to it by using the ampersand (&
) operator on it. In
the most trivial case &T
gives you just that: a reference to T
. There
are however cases where you can get something else. For instance String
implements Deref<Target=&str>
which lets you also get a &str
from
it and that system also can be extended to work with mutable references as
well.
This dereferencing system also lets one work through another type. For instance mutexes in Rust are pretty convenient as a result:
let value: Mutex<u32> = Mutex::new(0);
// acquire the mutex into a guard object
let guard = value.lock()?;
// this "derefs" the guard into &mut u32
*guard += 42;
There are however cases where this neat system does not work: in particular you probably ran into this limitation with thread locals. You would expect a thread local to work this way:
thread_local! {
static value: RefCell<u32> = RefCell::new(0);
}
// borrow the cell and write into it.
*value.borrow_mut() += 42;
However unfortunately a thread local (called a LocalKey
) does not
implement Deref
. Instead you have to do this:
thread_local! {
static value: RefCell<u32> = RefCell::new(0);
}
// borrow the cell and write into it.
value.with(|value| {
*value.borrow_mut() += 42;
});
And it annoys me a lot. It's annoying not only with thread locals but also many other situations where you really would like to be able to deref but it's not possible. But why is that? And is there a better way?
I maintain a crate called fragile. The purpose of this crate is
allow you to do something that Rust doesn't want you to do: to send a
non Send
-able type safely to other threads. That sounds like a terrible
idea, but there are legitimate reasons for doing this and there are
benefits to it.
There are lots of interfaces that through abstractions require that your
types are Send
and Sync
which means that it needs to be send-able to
another thread and self synchronized. In that case you are required to
provide a type that fulfills this purpose. But what if the type does not
actually cross a thread boundary or not in all cases?
A common use for this are errors. Most error interfaces require that
errors are Send
and Sync
. Yet sometimes auxiliary information that
you want to provide just doesn't want to be this. My crates lets you put
a reference to that into your error anyways and you can at runtime safely
access the value for as long as you are on the same thread.
It accomplishes this in two ways with two different types:
Fragile
puts the value into type itself and lets you send a value into
another thread and back. Crucially you need to send it back if your
value has a destructor because if the value gets dropped on the wrong
thread fragile
will abort your process.
Sticky
is similar, but it puts the value into a thread local instead.
For as long as you are on the same thread you can access your value just
fine, on another thread it will error. Crucially though if the type
gets dropped on the wrong thread it will temporarily leak until the
originating thread shuts down and clears up the value. Not great, but
quite useful for some cases.
For Fragile
you can do this:
let val = Fragile::new(true);
assert_eq!(*val.get(), true);
This works, because the value is implicitly constrained by the lifetime of
the encapsulating object. However for Sticky
an issue arises and it has
to do with intentional leakage. Rust permits any object to live for as
long as the process does by explicit leakage with the Box::leak
API.
In that case you get a 'static
lifetime. Because Sticky
does not
directly own the data it points to, this means that through that API you
can make the lifetime of the Sticky
outlast the backing data which is in
the thread. This means that if Sticky
had the same API as Fragile
you
could create a crash in no time:
// establish a channel to send data from the thread back
let (tx, rx) = std::sync::mpsc::channel();
std::thread::spawn(move || {
// this creates a sticky
let sticky = Box::new(Sticky::new(Box::new(true)));
// leaks it
let static_sticky = Box::leak(sticky);
// and sets the now &'static lifetime to the contained value back
tx.send(static_sticky.get()).unwrap();
})
.join()
.unwrap();
// debug printing will crash, because the thread shut down and the
// reference points to invalid memory in the former thread's TLS
dbg!(rx.recv().unwrap());
This obviously is a problem and embarassingly that was missed entirely when the API was first created.
This is the same reason why thread locals won't let you deref something.
Because you could put something in there which gets leaked to 'static
lifetime and then the thread comes in and cleans up.
The reason with()
gets around this is that it can guarantee that a
reference that it passes to the closure, cannot escape it. This works,
but it's incredibly inconvenient. Here an example from MiniJinja
about how annoying this API really can be:
pub(crate) fn with<R, F: FnOnce() -> R>(f: F) -> R {
STRING_KEY_CACHE.with(|cache| {
STRING_KEY_CACHE_DEPTH.with(|depth| {
// do something here
f()
})
})
}
This is quite a lot of rightward drift. I need two nested functions to access two thread locals. Incidently I also create a similar API frustration to my caller because internally I need to do work that needs cleaning up.
Surely there must be a better way? And I believe there is. We should be
able to let the user "prove" that their lifetime is not 'static
. For
that we just need to create a utility vehicle that can never be 'static
and then that non static reference can be passed to all functions to
entangle the lifetimes accordingly.
The solution in fragile
uses zero sized token objects on the stack to
accomplish this. A StackToken
is a value that cannot be safely
constructed, it can only be created through a macro on the stack which
immediately takes a reference:
pub struct StackToken {
_marker: std::marker::PhantomData<*const ()>,
}
impl StackToken {
#[doc(hidden)]
pub unsafe fn __private_new() -> StackToken {
StackToken {
_marker: std::marker::PhantomData,
}
}
}
#[macro_export]
macro_rules! stack_token {
($name:ident) => {
#[allow(unsafe_code)]
let $name = &unsafe { $crate::StackToken::__private_new() };
};
}
The stack token itself is zero sized so it occupies no space. It also
is !Send
and !Sync
. That it's !Sync
is important. There are
two things that matter: one is that this type cannot be safely constructed.
The only way to get one is the stack_token!
macro:
stack_token!(scope);
This will create basically a let &scope = StackToken { ... }
on the
stack safely. From that point onwards any function that receives a
&StackToken
can be assured that this has a lifetime that is never static
and constrained to a stack frame. The token expresses basically that the
thread lifes for at least as long as the lifetime of that borrow. Since threads
won't randomly shut down and clean up the stack while code still references it,
this lets us create safe borrowing APIs like this:
pub fn get<'stack>(&'stack self, _proof: &'stack StackToken) -> &'stack T;
With this trick the lifetime is constrained and we are allowed to give out
references to the thread local which is exactly what Sticky
does. So
you can use it like this:
stack_token!(scope);
let val = Sticky::new(true);
assert_eq!(*val.get(scope), true);
And a hypothetical thread local API supporting stack tokens would change the example from above to this:
pub(crate) fn with<R, F: FnOnce() -> R>(f: F) -> R {
stack_token!(scope);
let cache = STRING_KEY_CACHE.get(scope);
let depth = STRING_KEY_CACHE_DEPTH.get(scope);
// do something here
f()
}
In some ways it would be really nice to be able to have first class
support for this. In the same way as 'static
is a special lifetime, one
could imagine there was a 'caller
or 'stack
lifetime that does this
automatically for us:
pub fn get(&'caller self) -> &'caller T;
In that case we wouldn't need to create this token at all. However there are some questions with that, in particular to which scope this should point when nested scopes are involved.
However even without syntax support maybe it would be conceivable to have
a standardized way to restrict lifetimes without having to use closures by
having something like an explicit StackToken
as part of the standard
library. Then also the build-in thread locals could provide access
through such an API. Here is what this could look like.
So here is an important question: is this sound? The answer is “unclear” as it makes a statement about relationships of stacks to threads that's not entirely explored. To quote Ralf Jung on a reddit thread about this topic:
So this is yet another case where Rust will have to decide -- either Stack Tokens are sound, or
mk_static
is sound, but not both.
What is mk_static
? mk_static
is a hypothetical function that lets you
make any reference static for as long as you're guaranteed not to return:
pub fn mk_static<T: 'static>(t: &T, f: impl FnOnce(&'static T)) {
struct DropBomb;
impl Drop for DropBomb {
fn drop(&mut self) {
std::process::abort();
}
}
let _bomb = DropBomb;
f(unsafe { std::mem::transmute(t)});
}
If such an API was sound then it would render the guarantees that stack tokens want invalid. So today neither of those things are clear, but one of them would have to be declared invalid for the other to work.
On a personal level I find the possibilities that stack tokens provide to be
more valuable than mk_static
but there are probably reasons to decide either
way.