#Borrow checker - mutable hashmap recursion

1 messages · Page 1 of 1 (latest)

amber geyser
#
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct PigId(u32);

struct Pig {
    parent: Option<PigId>,
    is_disease_free: bool,
    /// This is `None` if the pig hasn't been checked for safety yet.
    is_safe_to_eat: Option<bool>,
}

fn get_or_calculate_safety(pigs: &mut HashMap<PigId, Pig>, id: PigId) -> bool {
    let pig = pigs.get_mut(&id).unwrap();
    if let Some(is_safe) = pig.is_safe_to_eat {
        is_safe
    } else {
        let parent_is_disease_free_or_nonexistent = match &pig.parent {
            None => true,
            Some(parent) => get_or_calculate_safety(pigs, *parent),
        };
        let is_safe = pig.is_disease_free && parent_is_disease_free_or_nonexistent;
        pig.is_safe_to_eat = Some(is_safe);
        is_safe
    }
}```
This fails the borrow checker, and I understand why. Is there a way to achieve what I'm trying to do here (mutable recursion up the tree) without extra unnecessary hashmap lookups?
#

(i.e. a pig is safe to eat if it and all of its ancestors are disease free)

buoyant cedar
#

Can make is_safe_to_eat a RefCell<Option<bool>. Can't think of any way of doing this with the current structure though.

amber geyser
#

so there's no safe zero-cost solution here?

plush dagger
#

you'll just lose thread-safety

#

(Sync)

amber geyser
#

huh
TIL

#

so RefCell enables interior mutability with runtime checks
and Cell does it just by dropping Sync?

#

that's cool

buoyant cedar
#

They both drop Sync

olive folio
#

Cell's primary characteristic is that you can never get a reference to its contents

#

(hence why RefCell is called that; it is a Cell-with-references)

amber geyser
#

yeah that makes sense
thanks

olive folio
#

because you can never get a reference, Cell is mostly useful for types that are Copy, but Option<bool> is such a type

#

also, as a general principle: the thing you are doing here is essentially caching (remembering a computed value for the future), and when you are doing caching, interior mutability is often part of the solution

amber geyser
#

will keep that in mind

#

...so it turns out I need this hashmap to be thread-safe in the wider context of the program
Is there a way to e.g. turn a &mut HashMap<Key, bool> into a &HashMap<Key, Cell<bool>>? In a scoped way

#

I'm guessing not

buoyant cedar
#

It should theoretically be possible to make HashMap with Cells Sync as long as you only allow modification to the Cells from an &mut HashMap access. Not sure what the best way to do this is, though.

amber geyser
#

oh I can just mutex it

#

right?

#

and then not use locks

#

just get_mut

buoyant cedar
#

Yeah, but that might be slower than doing two lookups

#

You'd use get

amber geyser
#

oh wait yea

#

hmm

buoyant cedar
#

Unless you mean Mutex::get_mut, you can't use that for the same reason as the original code

amber geyser
#

yeah

plush dagger
# amber geyser so `RefCell` enables interior mutability with runtime checks and `Cell` does it ...

Yeah, Cell enables interior mutability by dropping Sync and restricting the API a lot (only by-value operations; which for bool or Option<bool> are still quite usable). If you need references, then at that point things would get potentially dangerous even with Sync dropped. So a kind of Cell<bool> runtime guard is used to protect the "critical sections" around acquiring such references: that's RefCell<T>

#

If you need thread-safety, you could use AtomicBool or AtomicU8 as thread-safe versions of Cell<bool> and Cell<u8> respectively (or ::crossbeam::atomic::AtomicCell<Option<bool>>); and for what is worth, an AtomicU8 is often the mechanism enabling the thread-safe version of RefCell, that is, Rwlock<T> (or Mutex<T> with AtomicBool)

amber geyser
#

right

buoyant cedar
amber geyser
#

here I think SyncWrapper is enough actually
since I only need to do the internal-mutability thing if I already have a &mut reference to the thread-safe wrapper

plush dagger
#

In your case, I'm afraid you'll have to pay one of these three things:

  • double lookup;
  • atomic mutations;
  • non-Syncness
buoyant cedar
#
  • unsafe
trail pilotBOT
#

The only things that are different in Unsafe Rust are that you can
• Dereference raw pointers.
• Call unsafe functions (including C functions, compiler intrinsics, and the raw allocator).
• Implement unsafe traits.
• Mutate statics.
• Access fields of unions.

The reason these operations are relegated to Unsafe is that misusing any of these things will cause the ever dreaded Undefined Behavior. Invoking Undefined Behavior gives the compiler full rights to do arbitrarily bad things to your program. You definitely should not invoke Undefined Behavior.

https://doc.rust-lang.org/nomicon/what-unsafe-does.html

buoyant cedar
#

oh

#

my bad lol

amber geyser
#

lol

plush dagger
amber geyser
#

ooh

#

similar yeah

#

but not the same I think

plush dagger
#

And I doubt it would help in your use case, since in both cases the origin of the reference is &mut, so you'd be back to no interior mutability

buoyant cedar
#

I think we still need SyncWrapper since we have &mut HashMap but use that to get two &Cells

plush dagger
#

I guess SyncWrapper could help you if your "recursion" were &-based, but the entrypoint to it, &mut based

amber geyser
#

yeah that's basically the situation here

#

the recursion doesn't need to be thread safe, but in the wider context of the program, the hashmap needs to be put in a thread-safe type

#

and SyncWrapper can do that zerocost

plush dagger
#

?eval

use std::cell::Cell as Mut;
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct PigId(u32);

struct Pig {
    parent: Option<PigId>,
    is_disease_free: bool,
    /// This is `None` if the pig hasn't been checked for safety yet.
    is_safe_to_eat: Mut<Option<bool>>,
}

use sync_wrapper::SyncWrapper;
mod sync_wrapper {
    pub
    struct SyncWrapper<T>(T);

    impl<T> From<T> for SyncWrapper<T> { fn from(v: T) -> Self { Self(v) }}

    impl<T> SyncWrapper<T> { pub fn get_mut(&mut self) -> &mut T { &mut self.0 }}

    unsafe // Safety: no public non-unsafe `&`-based API
    impl<T> Sync for SyncWrapper<T> {}
}

type ThreadSafeMap<K, V> = SyncWrapper<HashMap<K, V>>;

fn get_or_calculate_safety(pigs: &mut ThreadSafeMap<PigId, Pig>, id: PigId) -> bool {
    get_or_calculate_safety_(pigs.get_mut(), id)
}

fn get_or_calculate_safety_(pigs: &HashMap<PigId, Pig>, id: PigId) -> bool {
    let pig = pigs.get(&id).unwrap();
    if let Some(is_safe) = pig.is_safe_to_eat.get() {
        is_safe
    } else {
        let parent_is_disease_free_or_nonexistent = match pig.parent {
            None => true,
            Some(parent) => get_or_calculate_safety_(pigs, parent),
        };
        let is_safe = pig.is_disease_free && parent_is_disease_free_or_nonexistent;
        pig.is_safe_to_eat.set(Some(is_safe));
        is_safe
    }
}
ashen turtleBOT
#
()```
amber geyser
#

the inner get_or_calculate_safety_ fn can just take a regular map, doesn't need to be thread-safe wrapped

plush dagger
amber geyser
#

got it

#

yeah that looks about right

#

thank you very much

#

now to implement this for my real code, which is sadly pigless and much less simple

#

I feel like std should have something like SyncWrapper?

plush dagger
amber geyser
#

a cornucopia of synchronisation primitives

#

guess it makes sense that not all of them make it into std

plush dagger
#

reëntrant mutex is very interesting, because it's API is kind of &ReentrantMutex<T> -> &T, which seems silly. But it's precisely for this reason: it syncrhonizes against parallel accesses, but does let same-thread concurrent accesses occur (which is why it can't yield a &mut T); and because it does guard against parallel accesses, it is Sync even if T is not

plush dagger
#

I find ReentrantMutex to have a very interesting signature, even though I think I've never used it in practice 😅

#

(it's T : Send => ReentrantMutex<T> : Sync; although theoretically, it could be T : Send or Sync => ReentrantMutex<T> : Sync)

amber geyser
#

for my use-case requiring &mut is fine
(the use-case is bevy)

amber geyser
#

advanced problem time:

pub struct Foo {
    pub bar: u8,
    pub baz: Cell<u16>,
}

pub struct Foos {
    map: SyncWrapper<HashMap<u32, Foo>>,
}

impl Foos {
    pub fn get_foo(&mut self, key: u32) -> Option<&Foo> {
        self.map.as_mut().get(u32)
    }

    pub fn get_bar(&self, key: u32) -> Option<&u8> {
        todo!()
    }
}```
#

this would need unsafe I'm assuming

#

ack, SyncWrapper doesn't let you get a raw pointer for unsafe shenanigans

#

time to impl Sync manually then

#
pub struct Foo {
    pub bar: u8,
    pub baz: Cell<u16>,
}

pub struct Foos {
    map: HashMap<u32, Foo>,
}

impl Foos {
    pub fn get_foo(&mut self, key: u32) -> Option<&Foo> {
        self.map.get(u32)
    }

    pub fn get_bar(&self, key: u32) -> Option<&u8> {
        self.map.get(u32).map(|foo| &foo.bar)
    }
}

// SAFETY: `baz` can only be accessed through `&mut Foos`
unsafe impl Sync for Foos {}```