#Fixing UB in random crates
1 messages · Page 1 of 1 (latest)
does it need to avoid sync ever, or just where you want to print?
could you write to a thread local String?
or a static mut [u8; N]
yeah I'm trying to debug a data race
need to know more details and println! is changing things
Please complain here that debugging data races sucks https://github.com/rust-lang/miri/issues/2205
It will encourage me to work on it
can you disable the data race detector
no I'm using it lol
okay this works:
const EMPTY: String = String::new();
static mut PRINT_I: AtomicUsize = AtomicUsize::new(0);
static mut PRINTS: [String; 100] = [EMPTY; 100];
pub fn println(x: String) {
unsafe {
let i = PRINT_I.fetch_add(1, Ordering::Relaxed);
PRINTS[i] = x;
}
}
pub fn flush() {
unsafe {
for x in PRINTS.iter() {
println!("{}", x);
}
}
}
yes
I mean, it's sound
actually, I need to call flush right before the data race happens 
there's no way to tell miri to keep running after a race is there
wait, can I open stdout as a File 
pub fn println(x: String) {
unsafe {
let mut f = std::fs::File::from_raw_fd(1);
f.write_all(x.as_bytes()).unwrap();
std::mem::forget(f);
}
}

it works
data racing to detect the data race
https://twitter.com/sunshowers6/status/1550545015667011584 that's pretty cool, running each test in isolation
Super excited to announce that nextest now works with the Miri interpreter for @rustlang! Simply install the latest version of cargo-nextest, and Miri from master (nightly coming soon). Then, run:
cargo miri nextest run
thankfully they clarified that they weren't about to spin up THREAD_COUNT miri instances at once :P

which would oom so fast you wouldn't believe
cargo-nextest is something my instinctual reaction of “just contribute to cargo” has forbade me from trying so far
but maybe I should
There are many big very hard challenges around interesting this into cargo, that's why it is its own thing
Out of curiosity, what are they? Is there anything stopping them just having cargo test-next with an entirely different codebase than cargo test?
I don't remember specifics
Cargo nextest is really nice, use it. However it doesn't run doctests
alright, I’ll stomach it :P
And I think it is intended to move fast for now rather than be stuck by cargo
Eg cargo has unstable apis for ages now, and I can't use them in CI properly. Nextest just worked to get my nice junit output
also conrad you may be interested in the discussion starting about here <#rust-discussions-3 message>
fair point
it seems like we need better processes for development of tools like Cargo so experimentation can usefully happen in-tree
I think nextest also doesn't guarantee that your tests will work. If you have some sync in your tests it might fail because nextest is more parallel by default. Our test suite at work required fixing to use nextest
In fact, I had to debug annoying parallel test failures today. -j 1 all tests passed. In parallel a whole module of tests failed :(
The solution ended up being making a separate database table for each test 
For complicated reasons that I can't really explain but it was my whole day debugging why sometimes the test passed and sometimes it failed
sounds fun
Essentially we run background workers that process jobs in batches, we made a change that intentionally failed the batch job (which gets picked up later), but since its synced in the DB, some other tests were processing it in a different test context, so the actual test didn't detect the work being done
So we isolated the DBs so only that one test could process the jobs it scheduled
The only difference is that nextest runs different test targets in parallel
If your tests can't tolerate running in parallel that's mad sketch
test runner that runs tests concurrently with themselves to try and find tests that can't handle that
as in, running the same test twice
in parallel :P
cargo test already runs tests within a target in parallel, in the same process, which is super surprising to some people
i suppose the solution nowadays is a static mutex of () for each Thing that can't handle being run in parallel
or better yet fix your tests so you operate on isolated things
I think nextest breaks the static mutex trick
You can't synchronize across test targets with it
Shmem locks
nextest runs each test in a separate process
And it's glorious
Henlo
@haughty mica https://miri.saethlin.dev/ub?crate=ferris-says&version=0.2.1 is old
ferris_says already updated their smallvec version 9 days ago
sighhh is miri not available on latest nightly again
Also I don't use that, just FYI
wdym?
It's also possible I just need to rerun it
smallvec is also not tested on latest release, miri also can't seem to find UB on latest release of it
Right but what's exactly the point of still having an older release on the UB page 
i think i might be able to send a pr to update smallvec in rhai to prevent UB
The code that renders that page is broken, I broke it a few days ago
groan tests in miri are so slow
they are a lot faster than they once were
Not that many
@haughty mica a comment on the StackCache reads:
/// This maps items to locations in the borrow stack. Any use of this still needs to do a
/// probably-cold random access into the borrow stack to figure out whatPermissionan
///SbTaggrants. We could avoid this by also storing thePermissionin the cache, but
/// most lookups into the cache are immediately followed by access of the full borrow stack anyway.
isn't this incorrect, now that Item is bit packed and therefore the whole item is stored in the stack?
This is my second delegate 

At work we're trying to make a short list of things to tell people not to do here, as sort of like ground rules. The most-agreed-upon one is
Don't rubber-stamp a PR just because Ben approved it
Btw I am trying to review this but English is hard
The nextest maintainer is a machine
also
do you know anything about the -Zmiri-retag-fields flag
Yes
do they also fork themselves to run in parallel?
should i be going and investigating the cases where the stdlib fails with it
or is that a known issue
Does it?
yes
it... retags fields on non-box types 
Please report them
I think I already ran the stdlib tests with it
Oh fuck yeah
alloc is where i expect the more spicy stuff to be in any case
Yeah
i'll let core tests run
see if anything fun is found there
either a failure or an OOM
take your pick
where do miri failures go, rust-lang/rust ?
I can't figure out how to prevent my vec-box branch from running into this so I though it was just my fault because I could have sworn the whole thing passed cleanly before
Yes probably
The flag works as intended I think
ralf Has Awakened (i got 2 notifs from him replying to things i'm in, trying to push them forward)
I can write this up and credit you if that's what you're asking
nah i'll do it
Now this is podracing
With my unix-exec branch, nextest now works perfectly with Miri
okay reported
https://github.com/rust-lang/rust/issues/99701
Using https://github.com/rust-lang/miri-test-libstd, running MIRIFLAGS="-Zmiri-retag-fields" ./run-test.sh alloc --all-targets collections::vec_deque::tests::test_drain, I get run...
i didn't ping ralf because i didn't want to annoy him even though i don't think he'd mind because i just know he'll find it anyways
Holy shit is this a retag of the base tag as SRO???
That would be 
Ah yes, you have passed this VecDeque to a function, IT IS NOW IMMUTABLE
Oh wait
No this is so confusing
We're trying to use the base tag for writes and we can't because we got a SRO retag from a function argument
aren't we passing ownership of the Drain
oh
hm
drain internally is using an Iter (immutable reference)
probably
field retagging is rough because the span is just like "this line" and it could have recursed through who knows how much type
l m a o
does
does using iter invalidate deque
There's no invalidation here
right
You just can't write through deque while iter exists, because normally you could invalidate iter but since it is a function argument, you cannot
I think this might be fixed by making the iterator wrap a raw slice
can you move the iteration logic all to a RawIter and then have the others just be thin wrappers over that
that turn a *mut T into whatever is valid for them
also iter::adapters::flatten::test_trusted_len_flatten still running
no sign of stopping
so i guess TODO: work out why that takes actually fucking hours to run under miri
let iter = [(), (), ()].iter().flat_map(|_| [(); 1000]);
assert_trusted_len(&iter);
assert_eq!(iter.size_hint(), (3000, Some(3000)));
ah
i guess that's probably it
Test passes now
Yeah that's it
I commented with my explanation
Someone will have fun patching this
Ohhh I think I get it. it's basically
pub fn foo(x: &u32, y: *mut u32) {
y.write(42);
}
where x and y alias?
and that's not okay because if you pass something as an argument it must hold until the return of the function?
Yes
because we want funny llvm attributes 
we do a little optimization
I was looking a bit at sled for entertainment because I've heard it's very cursed and found this gem of an issue https://github.com/spacejam/sled/issues/1407
that is pretty funny ngl
the mem::transmute paw curls a finger as it passes the size check
hey fewwis can you eval ```rust
unsafe {
let mut vec = Vec::from_raw_parts(1 as *mut u8, 0, 10);
vec.push(1);
}
timeout: the monitored command dumped core
/playground/tools/entrypoint.sh: line 11: 7 Segmentation fault timeout --signal=KILL ${timeout} "$@"
holy shit that's a segfault in Vec::push, better open an issue
a size check on transmute simply isn't enough
we also need a vibe check
// in std::mem
enum Vibe {
Inty,
Pointery,
Uninity
}
error[E0512]: cannot transmute between types of different vibes
--> src/main.rs:3:5
|
3 | std::mem::transmute::<u64, &[u8; 8]>(0)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: source type: `u64` (inty vibes)
= note: target type: `&[u8; 8]` (pointery vibes)
For more information about this error, try `rustc --explain E0512`.
this would be a great addition to our memory model, since it's very vibey already
wait could we panic on any transmute from an integer type to a non-ZST reference
transmutes don't bless the reference with provenance
(it's fine for a ZST tho)
and if we're saying it's dereferencable....
they should not but has this been decided?
this is about ptrtoint but it does say provenance is removed
and i'm fairly sure if we don't emit a inttoptr on a transmute, this is currently LLVM UB
how do we compile from_exposed_addr?
given recent events, the answer is probably "wrongly"
but how should we compile it?
llvm has inttoptr and ptrtoint instructions, doesn't it?
?llvmir
#![feature(strict_provenance)]
pub fn owo(x: usize) -> *const u8 {
std::ptr::from_exposed_addr(x)
}
pub fn uwu(x: usize) -> *const u8 {
unsafe { std::mem::transmute(x) }
}
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
define i8* @_ZN7example3owo17h6e6f82da7fd68509E(i64 %x) unnamed_addr #0 {
%0 = inttoptr i64 %x to i8*
ret i8* %0
}
attributes #0 = { mustprogress nofree norecurse nosync nounwind nonlazybind readnone uwtable willreturn "probe-stack"="__rust_probestack" "target-cpu"="x86-64" }
!llvm.module.flags = !{!0, !1}
!0 = !{i32 7, !"PIC Level", i32 2}
!1 = !{i32 2, !"RtLibUseGOT", i32 1}
hmmmm
does a transmute do an inttoptr too?
hey fewwis can you llvmir ```rust
pub fn uwu(x: usize) -> *const u8 {
unsafe { std::mem::transmute(x) }
}
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
define i8* @_ZN7example3uwu17h0d2aef5dcb2270caE(i64 %x) unnamed_addr #0 {
%0 = inttoptr i64 %x to i8*
ret i8* %0
}
attributes #0 = { mustprogress nofree norecurse nosync nounwind nonlazybind readnone uwtable willreturn "probe-stack"="__rust_probestack" "target-cpu"="x86-64" }
!llvm.module.flags = !{!0, !1}
!0 = !{i32 7, !"PIC Level", i32 2}
!1 = !{i32 2, !"RtLibUseGOT", i32 1}
yeah
wait where's the uwu
?llvmir
pub fn uwu(x: [usize;2]) -> [*const u8; 2] {
unsafe { std::mem::transmute(x) }
}
pub fn owo(x: [usize;2]) -> [*const u8; 2] {
x.map(|x| x as *const u8)
}
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
%"unwind::libunwind::_Unwind_Exception" = type { i64, void (i32, %"unwind::libunwind::_Unwind_Exception"*)*, [6 x i64] }
%"unwind::libunwind::_Unwind_Context" = type { [0 x i8] }
define void @_ZN7example3uwu17h1bbbc5917c46399fE([2 x i8*]* noalias nocapture noundef writeonly sret([2 x i8*]) dereferenceable(16) %0, [2 x i64]* noalias nocapture noundef readonly dereferenceable(16) %x) unnamed_addr #0 {
%1 = bitcast [2 x i64]* %x to i8*
%2 = bitcast [2 x i8*]* %0 to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 8 dereferenceable(16) %2, i8* noundef nonnull align 8 dereferenceable(16) %1, i64 16, i1 false)
ret void
}
define void @_ZN7example3owo17h3a66012a35f73094E([2 x i8*]* noalias nocapture noundef writeonly sret([2 x i8*]) dereferenceable(16) %0, [2 x i64]* noalias nocapture noundef readonly dereferenceable(16) %x) unnamed_addr #0 personality i32 (i32, i32, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality {
%1 = bitcast [2 x i64]* %x to <2 x i64>*
%2 = load <2 x i64>, <2 x i64>* %1, align 8
%3 = inttoptr <2 x i64> %2 to <2 x i8*>
%4 = bitcast [2 x i8*]* %0 to <2 x i8*>*
store <2 x i8*> %3, <2 x i8*>* %4, align 8, !alias.scope !2, !noalias !7
ret void
}
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i1 immarg) #1
declare noundef i32 @rust_eh_personality(i32, i32 noundef, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*) unnamed_addr #2
attributes #0 = { mustprogress nofree nosync nounwind nonlazybind uwtable willreturn "probe-stack"="__rust_probestack" "target-cpu"="x86-64" }
attributes #1 = { argmemonly mustprogress nofree nounwind willreturn }
attributes
```Output too large. Godbolt link: <https://godbolt.org/z/1sPb53fWM>
removed probably
for being the exact same function as owo
nice
they'll point to the same function
deduplicated, yeah
now
so transmute already knows the types' vibes internally 
is this okay
does this inttoptr
or is it literally
if you are a single pointer
then we'll inttoptr you
otherwise
you're on your own
use as
what does bitcast do 
presumably a pointer cast
HACK(eddyb) LLVM doesn't like
bitcasts between pointers and non-pointers.
lol
yeah it looks like UB
to bitcast between ints and pointers
very cool hack thank you eddyb
so basically, uwu takes two arguments, both pointers
a pointer to the start array as a [2 x i64]*, and a pointer to the return place as a [2 x i8*]*
we cast both of those pointers to i8* so we can pass them to memcpy, then we do the memcpy
then we return
so yeah
provenance is not kept
I think this is related to the "LLVM IR can't implement memcpy" issue
to which the proposed solution are byte types
you could then emit bytecast for all transmutes and LLVM would do the right thing
well, not quite
if you want to turn a usize into a *const u8 in a way that recovers exposed provenance, you need to actively do an operation on that usize
you can't just cast it
ptr <-> int transmutes aren't UB
but integers don't hold provenance
you could bytecast it
does a bytecast do an implicit ptrtoint
the proposed semantics are that bytecast from int to ptr does ptrtoint
it's basically a device to select the right cast for you
safety: U is not larger than T
unsafe fn transmute<T>(item: T) -> U {
let mu = ManuallyDrop::new(item);
let p = &mu as *const ManuallyDrop<T> as *const T as *const U;
p.read_unaligned()
}
does this do a ptrtoint?
depends on how you lower it
probably yeah
i mean
does every read_unaligned have to magically
you can transform this "addrof ptrcast read" into "a_real_transmute"
at which point there is only one place for it to happen
but yeah
so literally all memory access goes through bytes
yes
unlike C, where only memory accesses at character type go through bytes 
Rust has no strict aliasing after all
this doesn't seem sound in the general case
but i can't think of a counterexample
where it's literally impossible to confuse llvm into not inserting an inttoptr
even through, say, union reads
sure, but bytes have runtime types
not in the "typed memory" sense
but a uninit byte is different from a raw byte is different from a pointer byte
what are you talking about exacty 522?
AM semantics?
LLVM semantics?
rust to llvmir lowering?
the proposal is to make LLVM always detect any form of ptr to int transmutes
as i understand it
yes
at the AM level we have
enum Byte { Int(u8), Ptr(u8, Prov), Uninit }
the memory read_unaligned is reading from is of type Int
but we want a pointer, so we need to convert it into Ptr
converting - inttoptr
that's exactly what the byte types are for
to reflect the possibility of having integer, pointer and/or padding bytes in the same type
or poison etc.
the upshoot is that pointer types need not be able to carry integers and vice-versa
no, we're introducing a third type in the middle
no, in the abstract machine
exactly so integers don't need to have provenance
if i do a ptr to int transmute, honestly, i'd expect that to be a no-op
what is a "no-op"?
ptr2int transmute is ```rust
match byte {
Ptr(addr, _prov) => Int(addr),
_ => unreachable!(),
}
no actual physical bytes are being changed
so in that sense it is a no-op
a bunch of magic voodoo metastate is being changed
which we have decided we're fine with doing all over the place
I view the byte types as basically
#[non_exhaustive]
union Byte {
Int,
Pointer,
Uninit,
Poison,
Undef,
Padding,
Null,
Nil,
NullButFromAnotherAddressSpace,
MoreLLVMWaysToSpellUninit,
}
it's type erasure for typed memory
https://github.com/rust-lang/rust/issues/93881#issuecomment-1193987540 wtf is going on here
😵💫
the playground doesn't display any message
it has the exact same rustc version than i have (rustc 1.62.1 (e092d0b6b 2022-07-16))
(it doesn't display any message on nightly either)
As an intrinsic 
That's still the bad error, no?
Not sure, but there was recently a PR merged to add track_caller for Miri
Might be relevant?
hmm, that's weird, because the playground is broken on stable as well
The good news is that nextest seems to be working
The bad news is that everything takes forever now
Some crates now take more than an hour
Maybe I should cap runtime per-crate as well as per-test

Can you integrate JUnit output and github actions so that test failures look nice?
Um what
Why are you asking about JUnit
Nextest has stuff for it
I assumed that's why you were using it for your miri thingy (does it have a better name?) so that you'd have machine-readable output
Lol no
I'm using nextest because it has per-test timeouts and process isolation for tests
I do realize that I didn't give any context for that lmao
I had an entire convo in my head and you got exactly none of it
All I care about is the exit code and parsing Miri diagnostics
Also I want the website to look just like your terminal would if you ran the tests yourself
you wanted a good name for it saethlin, here's one 
Everything was fine until the Miri nation attacked
that sounds fun
Milar (Mylar) because it:
- Strengthens the ecosystem
- Stretches programs/the machines its running on
- Makes crates more reliable
Hmmmmmmmmmmmmmmmmm
Oooooooooooo
Is https://github.com/petgraph/petgraph/pull/505 a good fix for https://miri.saethlin.dev/ub?crate=petgraph&version=0.6.2? I tried to make it verify the safety more statically. I'm not sure if I could safely make the slice splits _unchecked, although running the benchmarks seems to have a negligible speed difference between the original code and mine
the swap could probably be https://doc.rust-lang.org/std/primitive.slice.html#method.swap_with_slice ?
A dynamically-sized view into a contiguous sequence, [T]. Contiguous here means that elements are laid out so that every element is the same distance from its neighbors.
see if that optimises to be good enough
swap_with_slice is just an assert + swap_nonoverlapping, and the slice splitting already guarentees they don't overlap and are of equal length
a smaller fix would be to just do some pointer arithmetic on the first pointer to get the second pointer, but this works as well if it works
I did try pointer.add() but Miri still gave me the same error
#[cfg_attr(miri, ignore = "Can't open files with isolation")]
MIRIFLAGS="-Zmiri-disable-isolation"
at least in the bit of the code anyways
or include_bytes!()
depends how big the file is
I guess it just seems a bit redundant, although iirc there's no performance impact to never-taken branches 🤔
> wc -c tests/res/*
1999999 tests/res/graph_1000n_1000e.txt
1999999 tests/res/graph_1000n_1000e_iso.txt
19999 tests/res/graph_100n_100e.txt
19999 tests/res/graph_100n_100e_iso.txt
4039996 total
I think I got that from https://blog.cloudflare.com/branch-predictor/, might have misinterpreted what they meant
it depends a lot, but often it is "none"
Takes too long to run in Miri
could you maybe make the graph a little smaller in miri? would that be fast enough?
(if cfg!(miri) { small } else { big })
or using #[cfg(miri)]
I wouldn't really know enough about the crate to change the test so that it still passes :/
i am clueless about gaphs as well
I disabled that one because it was 16gb of RAM and 10 minutes in on my PC which I thought seemed a bit much
that is a lot indeed
but yeah try to do it safely as 522 mentioned
btw
secret rust trick
- node_adjacencies[pos..pos + old_node_capacity]
+ node_adjacencies[pos..][..old_node_capacity]
Ooh
I looked away for a minute and miri ate 30gb of ram 

yea sounds right
Building in release it seems like the assertion gets eliminated completely any way 
yeah, that's also often the case
I feel like there should be a clippy lint for this?
maybe
?clippy
fn main() {
let list = [1, 2, 3, 4];
dbg!(&list[1..1 + 2]);
dbg!(&list[1..][..2]);
}
it would be a subjective lint
i don't think it should be on by default
since i think it is somewhat more confusing
it's a little more confusing if you learn about it the first time
but it makes code a lot nicer once you know it
does it behave worse for bounds checks
well
bounds checks it's still just a min and a max
but does LLVM get the same range info
does it understand that that's start..len
hey fewwis can you godbolt ```rust
pub fn uwu(a: &[u8], x: usize, y: usize) -> &[u8] {
&a[x..x + y]
}
pub fn owo(a: &[u8], x: usize, y: usize) -> &[u8] {
&a[x..][..y]
}
example::uwu:
push rax
mov rax, rcx
add rax, rdx
jb .LBB0_3
cmp rax, rsi
ja .LBB0_4
add rdi, rdx
mov rax, rdi
mov rdx, rcx
pop rcx
ret
.LBB0_3:
lea rcx, [rip + .L__unnamed_1]
mov rdi, rdx
mov rsi, rax
mov rdx, rcx
call qword ptr [rip + core::slice::index::slice_index_order_fail@GOTPCREL]
ud2
.LBB0_4:
lea rdx, [rip + .L__unnamed_1]
mov rdi, rax
call qword ptr [rip + core::slice::index::slice_end_index_len_fail@GOTPCREL]
ud2
example::owo:
push rax
mov rax, rsi
sub rsi, rdx
jb .LBB1_3
cmp rsi, rcx
jb .LBB1_4
add rdi, rdx
mov rax, rdi
mov rdx, rcx
pop rcx
ret
.LBB1_3:
lea rcx, [rip + .L__unnamed_2]
mov rdi, rdx
mov rsi, rax
mov rdx, rcx
call qword ptr [rip + core::slice::index::slice_start_index_len_fail@GOTPCREL]
ud2
```Output too large. Godbolt link: <https://godbolt.org/z/Ts8hf847c>
smh we should be able to have two uwus
basically the same
make an rfc
"RFC 1234 More Uwu"
this does look interesting
uwu has a check for the range being reversed and that the end is not OOB
yeah
oh it only needs to check the end if overflow doesn't happen I guess?
hm
what is it doing
clippy/perf we got it
yeah
but like
uwu only does a range order (overflow) check and a end check
this seems actually non-optimal
owo does a start check for the first slice and an end check for the second
these are semantically different
that's really interesting though that it removes the start check entirely, because the end check also fails iff the range is in order
bstr
Pre release at least
I noticed it on the site
The one on the site is just UCG 133 but there's another use outside provenance

Yup!
since we also have to lint for let _ = std::mem::uninitialized so you can't just look for calls :(
ah well
now we just need to get sign-off on "uninit u8 is UB" that would be good i think

hmmmm
or i could make deprecated take a "future_warning" parameter
which makes it a FCW
that's weird
i'll just make a new lint
No you don't, just add the attribute and it will be fine
Then justify the rule with the attribute
This is how we've done the rest of Rust semantics why stop now
eh, why even need attributes or anything
just do it
like box
"safe code can't exercise this behaviour therefore it can't happen"
justify
bold assumption any Rust semantics are actually justified
If anyone's looking for something to review, the not-so-vaporlang Roc opened its repo and it's got a decent amount of unsafe
nothing obviously bad from skimming through it a little
Yeah, though they also use unsafe_op_in_unsafe. Sketchiest things to me are some weird wasm deref of int macros, but I don't know wasm well enough to say since it's a weird arch. Also this weird MU swap
https://github.com/roc-lang/roc/blob/641daab02c90b446d8dfad302a8c57c6dfab9f0d/crates/roc_std/src/lib.rs#L155
I recreated this in Miri and it didn't complain with the default settings. But also this function is sort of useless for non-safety reasons and I may make a PR removing it.
couldn't they just make a ManuallyDrop self and then move out of the field 
but also, mem::forget is an anti pattern
I'm sure they don't know that
most people think it's an anti-pattern because it causes leaks but the cases where you don't leak are often even worse
it's an anti-pattern because whenever you need to forget something it's usually because you don't want its drop code to run
and that's often because it has been put into some kind of invalid state, or it needs to not invalidate other things
if it's in an invalid state, moving it around is sketchy. and if it needs to now invalidate other things, moving it around is sketchy.
stable vec_into_raw_parts when 
this is such an obvious API and the available alternatives are all terrible footguns
bikeshedding over stupid details

don't die 
@haughty mica have you looked into hashbrown being broken under field retagging
oh lol it's actually very trivial
mem::forget 
#[forbid(clippy::mem_forget)] when 
at least in unsafe code 
and now ci fails because of a rust windows bug 
Fixing sb errors: 15 minutes of staring at the error to understand SB, 1 minute of actual changes.
yes, mem::forget->ManuallyDrop<T> 
thanks to the nice diagnostics i immediately knew the fix. there was "pointer was invalidated by a retag" pointing at mem::forget 
i've opened a pr already but pc-windows-gnu is currently broken prevent CI from passing
(the toolchain itself)
(chrisd opened up a fix pr on rust-lang/rust)
I did spin::SpinMutex yesterday, one of the issues was that and was easy, the other was related to locks holding mut-refs instead of pointers
Which took me a bit to understand when exactly you got overlap, but once it clicked I now feel like I understand the SB model better than I did

Ah very cool
(Aside: I should run field-retags on my crates, I'm sus about at least one of them)
If they're published you can just look at them on the site
More reason to publish your stuff 😉
how many crates do you run currently?
All of them
How would I look for a specific crate? ctrl-f doesn't seem to do it :P
oh lmao
that explains why my browser OOMs sometimes on the front page 
neat
You probably need to resort to the main listing on https://miri.saethlin.dev
It is large
you probably want to curl and grep it
the browser will be very sad
why do so many of those error 
worse error rate than crater
Is there an AnyMap alternative which is more sound that we could recommend?
I fixed the hasher impl on mine BTW 
good
I temporarily enabled my admin powers to deploy it without asking anyone

It's fine. People trust me almost entirely with rust related things
I'm our Rust BDFL
conrad learned a new word 

Having a look at https://miri.saethlin.dev/ub?crate=minimal-lexical&version=0.2.1 - a lot of the errors come from calling .as_(mut_)ptr() through Deref:
impl ops::Deref for StackVec {
type Target = [bigint::Limb];
#[inline]
fn deref(&self) -> &[bigint::Limb] {
// SAFETY: safe since `self.data[..self.len()]` must be initialized
// and `self.len() <= self.capacity()`.
unsafe {
let ptr = self.data.as_ptr() as *const bigint::Limb;
slice::from_raw_parts(ptr, self.len())
}
}
}
Almost all of the errors go away if I just add .as_ptr() and .as_mut_ptr() on the struct itself, but I'm not sure if that's a good fix, or if there's an issue with the deref implementation
self.data being a [MaybeUninit<u64>; 62]
No that sounds like the right thing to do to me
This is slice deref right?
That retags the whole slice which people often don't want
Yeah
@haughty mica why do so many normal crates error out like that?


I should add a kind of dashboard page so that it's obvious when I screw up like this
Good news is the ETA is only 17 hours
Oh wait 19 hours
22 hours
Back to 15
I really played myself here
what was the problem?

At a guess, I was dorking with the commit that I build Miri from and got too creative
Btw if you are a sicko you can view the logs of a crate with a URL like this, because this is just cloudfront+s3:
https://miri.saethlin.dev/logs/stuff/0.2.0.html
No need to load the big list of all crates, but also no ability to navigate

"How has this crate been building for hours?"
╰ ➤ du -h target/debug/build/arpabet_cmudict-b9b99059bb1900c4/out/codegen.rs
38M target/debug/build/arpabet_cmudict-b9b99059bb1900c4/out/codegen.rs

Just got a bunch of ICEs because rustdoc can't create a temp dir 😩
Okay I think the above situation is finally fixed
@haughty mica how often are the tests on the ub miri website updated?
When I damn well feel like it
If you have a request for something just let me know and I can probably do it quickly
or when the run feels really bad and errors everything forcing you to rerun 
Yeah that happens
Usually I don't let things get older than a few days, but that's in terms of published crates not Miri versions
Running newly published crates is a lot easier than re-running the universe
arc-swap has an update which hopefully fixes the data-race
Cool, but the maintainer hasn't published it
looks like they have - v 1.51
1.5.1 has been up on the site for a while
The site is rarely more than a few days behind
This again
oh, https://miri.saethlin.dev/no-sb/ub is different
no hashbrown release yet 
Oh yeah no-sb/ub is very different
that one is old
the no-sb page is very out of date, I have like never updated it lol
I'll update it sooon
it would be cool if there was some kind of "state" at least for the top crates like "fixed on master", "pr open" or whatever
but that would be effort to maintain 
It would be a crazy amount of CPU work to do
I spend hardly any time maintaining the code for this thing, but like if you saw what this does to my power bill
could that be done client-side?
Do what on client side?
Seeing if it's fixed would involve running Miri
not seeing if it's fixed, more just if it's out of date (there's a new version released)
for example, what's the status of the top ub crates
inb4 compile miri to wasm and run it client side
help: <5091> was created by a SharedReadOnly retag at offsets [0x18..0x30]
--> src/lib.rs:156:42
|
51 | ptr::copy_nonoverlapping(&hole_guard.v[index + 1], &mut hole_guard.v[index], 1);
| ^^^^^^^^^^^^^^^^^^^^^^^^
help: <5091> was later invalidated at offsets [0x0..0x60] by a Unique retag
--> src/lib.rs:156:73
|
51 | ptr::copy_nonoverlapping(&hole_guard.v[index + 1], &mut hole_guard.v[index], 1);
lmao classic (scopeguard)
This is probably technically possible
The biggest obstacle is that Miri dynamically links to the rustc libraries
Exploit the users browser and upload + run miri natively
@grim copper updated now
cheers
sweet, so csv looks like the next top contender
That one is already spoken for, sadly
updated on master
ah
just like hashbrown 
I can't find it, can you send a link?
ah, the no-sb
There's a
#[allow(clippy::many_single_char_names, clippy::cast_ptr_alignment)]
Above the function
oh they also do unsound tuple transmutes
of course
this crate is pretty sus
overall

I'm more worried about the alloc-[no-]stdlib crates
looks like they're used in dropbox and they're effectively just casting uninit memory
and brotli 
at least plotting is not usually connected to the internet
I think I fixed the alloc-no-stdlib crates? And they just didn't publish?
Or maybe not
opened jan 30
looks like brotli is in its reverse dependencies
Ah I see
Well from what I recall this is UB in the tests, not the actual code
Same as scopeguard
yeah, but the test looks like its just using one of the allocators provided by the crate
allocate array -> write to some of it -> read from it -> UB
seems like it?
looks like it can leak data from the stack
allocate array -> don't write to it -> read from it -> data leak
no sure why the other tests didn't fail though, so maybe there's some condition to it
I haven't looked at the details
oh wait the failing case is on the heap not the stack
even worse
eg. if you can read uninit heap memory, you could eg. read private keys out of the process that have been deallocated
If you feel strongly about this, talk to RustSec
The group is bigger and more active recently, I'm sure if you open an issue on rustsec/advisory-db at least pinkforest will engage with you
I'll have to investigate it more thorougly, which I may be able to do a bit tomorrow
I'd like to at least have a somewhat working exploit POC before I go to rustsec
Well if it's an unsound library, exploits are not really a thing?
???
what u mean?
also dropbox has a bug bounty, so if I can make something juicy I may be able to payout
A very unsound library like this may be used in a way which is completely correct
And in fact that tends to be the common case
right yeah. An exploit would just involve using the library in a way that resembles a real-world scenario and leaking some simulated info
I don't think that would alter anyone's assessment of this issue
Everyone in rustsec will understand that this is use of uninitialized memory, which can be used to leak secrets
And it is unlikely that Dropbox will care, unless you can actually make one of their services leak a secret
File a rust sec and get 10.0 critical because "we don't know if it's misused like this and have to assume the worst"

Okay no-sb/ub is up to date again with the top 10k crates
Wild I feel like there are at least 3 buffer overflows in here
simd again?

People do UB for strange reasons
I always run these tests with randomize-layout
And I'm looking for invalid pointer offset errors
Because any buffer overflow will probably be one of those before you get to the OOB access
lol
Wait, I found something?
Yeah randomize-layout turns up a fair bit of smelly code
It was all worth it
Did the padding thing ever land? I don't think so right
now make it randomize fat pointers (DO NOT)
It does, actually
I got stuck on some funky ci error at some point
I was actually just looking at it
No
why
...because it doesn't?
but I thought std does union transmutes of fat pointers
ig it's fine as long as you ensure that union has the same layout as fat pointers?
which uhhh
yeah i think fat pointer layout is something we just gotta specify
well
what is the current layout
?eval
let x: [usize; 2] = unsafe { std::mem::transmute(&[0] as &[i32]) };
x
Running `target/debug/playground`
[93972223787100, 1]
I mean is de facto specified we won't change it and no one will

It matters for all the old code that won't be upgraded
true
it will become less important but there's still no reason to change it
so Let's Specify It
(Someone should write an RFC but it will not be me because oh god it would be worse than offsetof)
Lol
I'm not sure it would be
At the very least, if the same person would appear on my RFC I won't be patient at all with them
a lot of people have been told that unspecified layout of repr(Rust) is great (and it is great indeed!)
i would feel that many people would immediately be opposed to specify it - though if the RFC did a good job of explaining the reasons behind it it could work
Specifying all rust layouts would be an exceedingly bad idea
Specifying slices/str, no brainer
Trait objects, ehhhhhh
We still haven't figured out trait objects fully
trait objects are the interesting thing that people actually rely on though
slice fat pointers don't really make sense, we have from_raw_parts/x as *const _, x.len()
i mean we could of course specify them to be a repr c tuple but i don't think that's worth actually spending effort on since i dont think people actually rely on that
yet we won't change the layout of dyn fat pointers because changing it would be a Bad Idea and Break A Lot Of Things
Oh look it's the same time as with C++ ABI
#[repr(C)]
pub struct TraitObject {
pub data: *mut (),
pub vtable: *mut (),
}
slice is same except vtable is len: usize
more like rust struct Ptr<M: Metadata> { data: *mut (), meta: M, }
if its a slice, that meta is a usize, if its a trait object a ptr to vtable, if it is a normal pointer it's zero sized
Oh lordy me
This is 100% ub https://github.com/GraphiteEditor/Graphite/blob/master/node-graph/borrow_stack/src/lib.rs
They alloc a pointer, throw it into a Box::from_raw() and then pop() drops a Box::from_raw() pointing into the original allocation
It also decrements the length after deallocating the element which seems uh… wrong
decrementing the length afterwards is 
where's the problem here?
There’s a Box<[T; N]> and it basically does drop(Box::from_raw(&boxed_array[n]))
so it loses write provenance?
So you can’t deallocate a portion of an allocation
You also can’t write to said deallocated portion
It’s almost poetic
Where are the tests
L M A O they have the default test still in the file
Empty safety comments too 
Wait a fucking second
What does this even do
I feel like that totally-not-a-dealloc is trying to call the drop impl
All that code that uses this is commented-out 

Yah, I’m not really even sure what this could even do
Popping doesn’t give you the value back
pop isn't even called anywhere
(having a look at this now)
nor is get...
is this all just dead code
yes
@haughty mica lmao i love the favicon
It is very funny to me how some C dependencies have built setups that do not handle cgroups correctly
Yes, I do in fact have 64 cores under there, but if you spawn 64 processes you are going to get them all scheduled onto one core and you're going to like it
their build setup may be older than cgroups and they've never bothered to fix it 
blindly spawning nproc many threads is a time-honored tradition 
wait does nproc not respect cgroups
Print the number of processing units available to the current process, which may be less than the number of online processors
unless they're using --all
I could be writing down the crates that do this but eh
yeah
there's a million ways to do this wrong and ignore cgroups
parse /proc/cpuinfo or whatever
just parse some text files 
linux moment
average unix philosopher

great, yet another lint to ignore 
if i have ```rust
struct Blah(*mut ());
impl Blah{
fn as_ptr(&self) -> *mut () {
self.0
}
}and dorust
blah.as_ptr() as *mut u8
English is not your first language right
I can't explain why but your tense choices here are odd
yes
maybe they are wrong
Since as_ptr took a &self, the pointer won't have write permissions
In English it is common to say
Since as_ptr takes a &self, the pointer won't have write permissions
yeah that does sound weird
I guess the idea is that you haven't made the call yet, so it's not right to use the past tense
If you were interrupting execution in the middle of the call with an error, your tenses would be correct
well, you have made the call when you're doing the cast
but I guess takes makes more sense
You wrote the call, but you're not executing it
yeah
Why do people have crates which hork in crater and dump gigabytes of logs
WHY
I'm uploading ASan logs and it's going very slow because
└── 100.00% [10.28 GB] ── logs
├── 51.55% [5.3 GB] ── clacks_mtproto
├── 11.69% [1.2 GB] ── pleingres
├── 4.68% [480.47 MB] ── ap_kcp
├── 0.56% [57.76 MB] ── goose
├── 0.35% [35.84 MB] ── braque
├── 0.32% [32.97 MB] ── cifar_10_loader
├── 0.29% [30.09 MB] ── noise_search_deps_librocksdb-sys
├── 0.21% [21.38 MB] ── asserhttp
├── 0.14% [14.27 MB] ── injrs
├── 0.13% [13.17 MB] ── mqf
└── 0.10% [10.7 MB] ── fil_builtin_actors_bundle

At least I only need to upload this once because the crate is dead
good.
Ok here we go https://miri.saethlin.dev/asan/ub
It needs a LOT of TLC still
For example, filtering out all the false positives due to custom stack mapping
And the "requested allocation size" crashes
And I need backtraces and diagnosis for the SIGILL crashes
Bad diagnosis
Anyway, someone should probably report all these security vulnerabilities 

If someone scrapes this with a bot and reports every crate that uses loom 
time to open like 20 issues on loom 
https://miri.saethlin.dev/asan/ub?crate=integer-encoding&version=3.0.4 hmm, why is this failing??
What do you mean by that?
Oh jorge fixed it already
... maybe
Anyway, the site is still valuable as a "do not use" flag
If they publish the fix, I'll update the site
lul
it's unmaintained has a rustsec and no one uses it so i guess its fine
Oh I should add the init checks
Lot more crates about to apppear 💀
543 crates in the list now
Only 218 of those are uninit issues
Yeah I'm changing the URL
Hope nobody posted URLs anywhere yet 
which ones?
also
maybe have the root page be a landing page
The asan ones
with some text about what the site is doing and what the links are
Yes
Yes yes yes
This has been on my mind for a while and I would welcome a PR or a writeup by someone else
I just haven't gotten to it
Is it still "lol fuck your browser" though
my browser (chrome) is feeling fucked
it's actually kinda usable though
just slow
Yeah you can thank Zurr for that
after waiting a few seconds to load
Furiosly refreshing my URL waiting for AWS infrastructure to warm up
furiously refreshing my email waiting for AWS job offer to come through
Ah it turns out I messed up in multiple ways
Between my 0% CPU usage, open window, and lack of pants, it's getting kinda cold in here now 
Turn up that CPU. Also, "lack of pants" is a lot more concerning in British English
I'm quite aware of the different meaning in British English 

ok my feet are kinda yellow, definitely need to get dressed

I'm remembering that my "simple static site" actually involves a buttload of different AWS stuff
it's web scale
I just stuck a few panics into parsing paths where some terminal escape programs are like "fuck it don't handle this" and holy shit the strange stuff that's in these logs I have
For example, why does this one build script print environment variables,
Except when there are spaces,
It replaces them with 0x1F, the Unit Separator
I'm learning about most of these by reading the alacritty source code, because it helpfully documents what these all are then fully ignores most of them
Swap foreground and background colors; inconsistent emulation
Sounds like I can ignore it then!

oh fuck oh fuck oh fuck
I decided to run my tests through miri. Doing crypto algs through miri is so slow
And it's using a lot of memory 
Very cool
If you're willing to share it I can add it to the benchmark suite if it is unique
It does 4000 iterations of pbkdf2 basically
So uhh, it makes sense that it's slow
Ehhhh
There's "Miri is slow" then there's all the issues I've been closing about how Miri is slow
I'll let it run to completion. in debug the test took 0.13s
well, it's at least 3 minutes
11 minutes and still going
miri sat at 8GB of memory 
Yes we need better self-profiling support for this
Can you drop the workload size here a whole bunch?
Oh count
sure
Let's try 64
The problem is that the Rust-level profiling support that exists doesn't work if you have to kill the interpreter
15seconds
lmao ikr
stream ciphers still can't be represented with const generics
so typenum is necessary
Do you have any static/once_cell stuff in here?
not afaik
I don't even see any big arrays
So this is very strange
At a guess you have the SRW pileup issue
It's <linear it seems. I upped the number to 256 and it went to 45s
That's to be expected nowadays 😉
Miri shouldn't be much worse than linear, the big nonlinearity issue was solved
I guess my 4096 case would take 12minutes, but I think memory was becoming an issue at that point
I'll look into this later. Probably need to hack up the interpreter to figure out what's going on
nice
At a guess, there is a huge borrow stack somewhere
The perf probably doesn't go to shit because the lookup cache saves you, but the GC can't fix the memory growth issue
Yeah I don't see how
for _ in 1..rounds {
let mut prfc = prf.clone();
prfc.update(&salt);
salt = prfc.finalize_fixed();
xor(chunk, &salt);
}
would cause issues 
You like most Rust programmers are probably not thinking about what your code looks like when compiled without any optimizations
In any case, this is a good benchmark because it looks like it stresses the retag_return_place code path a lot, which I know I made slower with the stack cache but I couldn't justify doing fiddly optimization of it then
And even better, no one except you is thinking about what the borrow stack sizes look like 
Well yeah
Partly because it used to be that the whole story was "borrow stacks are always huge"
I would go with RustCore of the 3 just to keep the name close to Rust. But I'm infamously bad with names already so :P
@haughty mica "miri-tools" crying on the corner
Exactly
It's now comfortably more than just Miri too
I need to properly integrate the ASan option
I just had an idea for a way to print clean borrow stacks in Miri (at least with isolation)
Have a flag that sets the "print-alloc-id" and then a function like fn miri_print_stacks() which prints the stacks for the configured alloc id
This would not have tags around for getting the alloc id
You'd have to get the alloc id from a previous run, either through the usual function or just by doing a little ub
You can already do this btw
miri_get_alloc_id just returns a u64, you can print it yourself
I dislike this general strategy because it doesn't work if you have nondeterministic behavior, but I agree it works most of the time
Also, thoughts on how the GC should interact with this?
Should we have another magic function to turn it off temporarily? Warn if it's on and you print stacks?
Or run the GC right before each print?
root is now a landing page with a search bar. I can't get it to behave on mobile yet, but I'm sure it's just me being bad at webdev.
The page is still flipping huge. Might need to address that.
I'm now pulling --no-default-features and --features from [package.metadata.docs.rs]
I see there's also a thing for rustdoc-args which I am ignoring for now. Seems interesting though.
I also have an ANSI-to-HTML renderer which is good enough to handle cargo nextest run so that'll be up soon
The downside is that this has exploded runtime because I'm only enforcing a 1-minute per-test timeout
Any thoughts on what to do if a crate has both metadata.docs.rs and metadata.playground?
Do like a... union?
Maybe give playground precedence
L M A O idna has its own test harness
RIP
Your tests will just never run under Miri
Startup time is somewhere beyond 3 hours and 8 GB of memory
You know, it kind if is tools now
Maybe that's just the copium kicking in
we're at 7.5GB and counting rn
That code for whatever reason is hitting some code path that defeats the stack cache
I'm not certain but it may be a GC interaction
Maybe that's just my paranoia
idk, but I'm gonna run it for a few hours to see.
I'm only 1 hour in
but my desktop is reasonably beefy

Latest ETA update: 36 days
I should probably be more strategic than "rerun every version of every crate with nextest"

Neat
Yay, another lint to ignore everywhere 
How would you go about fixing https://miri.saethlin.dev/ub?crate=lru&version=0.8.1? It seems to take a pointer into a Box, then insert the pointer and the box itself into a map, moving the box and invalidating the pointer, which is later dereferenced


