#Fixing UB in random crates
1 messages · Page 3 of 1
maybe don't make it generic
hey ferris can you godbolt ```rust
use std::mem::MaybeUninit;
struct MaybeUninitConsts<T>(T);
impl<T> MaybeUninitConsts<T> {
const UNINIT: MaybeUninit<T> = MaybeUninit::uninit();
}
pub fn foo() -> [MaybeUninit<u8>; 128] {
[<MaybeUninitConsts<u8>>::UNINIT; 128]
}
example::foo:
mov rax, rdi
ret
hm, nice
hey ferris can you godbolt ```rust
use std::mem::MaybeUninit;
pub fn foo() -> [MaybeUninit<u8>; 128] {
[MaybeUninit::uninit(); 128]
}
example::foo:
mov rax, rdi
ret
hm, nice
not sure whether i trust it 100% though 
#[unstable(feature = "maybe_uninit_uninit_array", issue = "none")]
the fuck core, ???
ah, a tracking issue was added recently
oh god i hate how basically every call to copy or copy_nonoverlapping to the same allocation has SB issues because of course it fucking does, people are reasonable to write code that has these issues
self.buffer.get_unchecked_mut(len)
can we get a clippy lint please 
wait, what is it doing
wtf
this is straight up reading uninit memory
what
ah, it needs to read it as MU, then it's fine
rerunning this beauty #932319394724479037 message
writing extremely cursed (fish) shell scripts again to sort the ub crates by downloads
curl https://miri.saethlin.dev/ub > miri.html
curl -L https://static.crates.io/db-dump.tar.gz -o dump.tar.gz
tar -xzf dump.tar.gz
cat miri.html | sd '.*crate">(.*?) (.*?)</div><div class="status">UB: (.*?)</div.*' '"$1","$3"' > ub.csv
for c in (cat ./ub.csv) ;
set NAME (echo $c | xsv select 1)
set COUNT (xsv search "^$NAME\$" s_crates.csv | xsv select -n 2 | sed -n '2p')
echo "$COUNT $NAME $UB"
end > downloads.txt
cat downloads.txt | sort -nr > result.txt
They're already sorted by recent downloads
lol
well, but it doesn't show the download count so my script is still not useless 
🔵 💙 📘

simdjson needs unsoundness fixed
simdjson is a bit annoying because it can't really be mirid
How do you know it has a soundness issue @west phoenix ?
i found it handing out non-utf8 Strings
which isn't that bad but it's still unsoundness
still gotta report that
Yeah that's not great
Well, first I saw it allowed clippy::uninit_vec
and that means there is definitely ub somewhere else
There’s a lotta sketchy stuff in that crate tbh
The nice thing is that it’s got extensive tests & fuzzing so changes are easy
There's UB right there, if rustc starts adding noundef
the uninit vec stuff was fixed iirc
Yes, they fixed it shortly after I opened an issue, so they are definitely open for soundness fixes
yeah, we currently do emit noundef for the same types that mem::uninitialized panics for
anything with an invalid value
i don't think noundef does much right now but it would be interesting to see if any benchmarks improve if we emit it everywhere we can
But where can we
Oh fucking hell
pub(crate) fn parse_str_<'invoke>(
input: &'de [u8],
data: &'invoke [u8],
buffer: &'invoke mut [u8],
mut idx: usize,
) -> Result<&'de str> {
use ErrorType::{InvalidEscape, InvalidUnicodeCodepoint};
let input: &mut [u8] = unsafe { std::mem::transmute(input) };
lmaooooo
#[allow(
clippy::if_not_else,
mutable_transmutes,
clippy::transmute_ptr_to_ptr,
clippy::too_many_lines,
clippy::cast_ptr_alignment,
clippy::cast_possible_wrap,
clippy::if_not_else,
clippy::too_many_lines
)]
mutable_transmutes is just another annoying lint just like the rest right
mutable_transmutes is literally just a UB detecting lint
btw this crate is used as an optional dep of serenity (the discord bot library), so like, is probably being used in production rn
serenity != production
probably
and by probably, the only reason I'm not running it on my 36k server bot is because it had some deserialization issues (before I found out about this)
Also in general I'm not worried about these sorts of issues "in production", I'm worried about them going forward, or how they reflect on Rust overall
What do you mean by "issues"?
Oh, snowflakes
I don't remember exactly but those do tickle some uncommon scenario for deserializers
Oh god it actually copies into input https://github.com/simd-lite/simd-json/blob/ae48dc4ee34c8cc698c73ac599bcfdd648a53cd4/src/avx2/deser.rs#L144-L146
And it uses clone_from_slice for &[u8] which is not awful but just why
So they're aware but don't care?
yes
well
they might not know it's UB
https://docs.rs/simd-json/latest/simd_json/serde/fn.from_str.html
parses a str using a serde deserializer. note that the slice will be rewritten in the process and might not remain a valid utf8 string in its entirety.
parses a str using a serde deserializer. note that the slice will be rewritten in the process and might not remain a valid utf8 string in its entirety.
agreed
If input is a static string literal wont writing into it crash instantly
I suspect that scenario is avoided somehow


this is what happens when you port a c++ library 
It’s sad that it’s so cursed, lemire’s libraries deserve better than a crappy port
https://github.com/ParkMyCar/compact_str/pull/100#pullrequestreview-1002178051
I spent most of the time fixing the arc module
i know the author personally, and i wouldn't be surprised if this is true
once_cell now does pointer stuffing with as-casts 😩
Is that good or bad
Bad, because it's such a core library that this update will cause a lot of things to die under -Zmiri-tag-raw-pointers
Gotcha
(I didn't know if this was a "yay, it no longer uses transmute!" or something)
Ah. I've yet to run into a crate that does pointer transmutes which I can't replace with wrapping operations. I'm sure they exist, but haven't hit one yet.
Miri thinks one of the once_cell tests deadlocks
That is probably not great
Oh Truuuuuu
better than makefiles at least
eh they're pretty bad
extremely messy when they get big, and it's really easy to write them in a platform dependent way
Don't get me wrong, Cargo is so much better. I've just seen too many "Makefiles but with my personal nits fixed" things
these are the sort of -Zmiri-measureme that i love 
that's a lot of time spent offsetting pointers 
these zero cost abstractions become very much not zero cost under miri
What crate is this
compact_str
there's probably a huge string in the test suite, I didn't take a closer look
that's wild 
I dont know shit about provenance and stuff, so noob question here:
ptr-int transmute
int-to-ptr cast
Does that mean transmute::<*const T, usize> is UB, but usize as *const T is not?
What about the other other directions, transmute from int to ptr and cast from ptr to int?
cast from ptr to int is definitely OK, it will probably be deprecated though
transmuting pointers to integers has been made ok recently
int to ptr transmute I think is okay, it probably just creates a pointer with zero provenance
oh that’s nice. so it has the semantics of .addr()?
casting integers to pointers is complicated
yep, if you look at the implementation of .addr, it's now a transmute 
nice, this makes a lot of sense!
what makes transmute there better than as usize?
It doesn’t expose the address of the pointer, theoretically allowing more aggressive compiler optimizations
oversimplified explanation
provenance is an extra made up part of pointers that control what permissions they have to access
if you cast an integer to a pointer, what provenance does it get?
it looks whether a provenance has been "exposed" for this address, if yes, it gets it, if no, the pointer will have no provenance and can therefore not be dereferenced
to expose a provenance, cast a pointer to an integer using as
the transmute doesn't expose this
this is all used for compiler optimizations, so the transmute could (in the future) get slightly better compiler optimizations
transmute is claiming “this pointer is just an integer” where as (or other methods) tell the compiler you want to actually do things on pointers
transmute is "just give me the address btw"
as is "give me the address and make sure that I can cast this back to a pointer later"
ah i think get it, also just noticed there's an explanation of provenance in the docs now
No I think it was made not okay at all
A pointer is an address plus some abstract machine state that matters to the compiler and may or may not exist at runtime (on platforms like CHERI). So while the address part can always fit in a usize, you have to do Other Things to operate on that other state.
Oh that's interesting
as you can see @tawny coyote, these kinds of rules are still in progress 
yep but very interesting, I'll read through some of the stuff
before this comment, transmuting a pointer to an integer was considered ub 
if you want a better explanation of all this, read through ralfs blog posts
The heart of the problem is that observing the address of a pointer has implications for what optimizations you're allowed to do
But to be honest all this stuff is in the weeds, because stacked borrows with untagged still doesn't support noalias, and I'm waiting for stacked borrows with wildcard to land before I ask Ralf if that supports noalias
C lets you do whatever you want with pointers and that inhibits so many optimizations
well no, not really
ima just read https://plv.mpi-sws.org/rustbelt/stacked-borrows/paper.pdf for maximum coolness
Well
good luck, though I doubt that you'll understand much (i don't either
)
The problem with the C rules is that it's not clear how restrict is valid
Well every pointer is basically exposed and you can do things like serialize them, isn’t it
nah
and nobody uses restrict even if that helps some
you have to expose them manually still
under PNVI-ae-udi
which is what c will probably get
I’ve only ever seen restrict used in libc
funnily c provenance is still not really settled as well 
It's used here and there in libraries
And I’m sure that nobody calls it right
and it's used in all good cuda code, so I've heard from our local cuda wizard
c still has provenance, you're not allowed to go out of bounds with pointers, still have to expose them for int2ptr casts, so c pointers are limited
but not as limited as rust pointers
Probably
It's still possible for the the compiler/lang teams to just say "oh dear we can't break anyone's code!" and simply remove all noalias from the compiler and let everyone live with the regression.

Wait a second, Mr. Weak Memory Effects pasted 2 backtraces into std::sync::mpsc
This is bizarre
now, i wouldn't be surprised if std::sync::mpsc was deadlocking 
tf, running just the deadlocking test does not make it deadlock
Anyway
but it's definitely this test, ignoring it makes it not deadlock
So if transmuting ints is chill can bytemuck add them back?
i don't think the transmute in the other direction is cool
i wonder why the stampede_once is disabled under miri
ah, i guess the old miri scheduler didn't support it
it works now
it doesn't deadlock with all miri seeds, but it happens to deadlock you run the entire test suite with seed 0
the miri seed 🅱️ reproduces the issue with just the one test
the test is doing some funky things with channels, threads and oncecells
it does look good to my brain, though i don't think my brain is very good at this
hmmmmmmmmmm, this is interesting
if i add a SeqCst fence to the initializer it looks great ("looks great" meaning that it passes 0x11 seeds and then fails because, uhm, deallocating while item is protected: [SharedReadWrite for <222825> (call 65038)] (somewhere in mpsc
)
the fun thing is that there's a nice comment on initialize
Safety: synchronizes with store to value via SeqCst read from state,
yet the inner method only usesAcquires
Ralf thinks this might also be a futex issue
Comment in the zulip
I have not seen that protector error, but there is an issue with dangling Arc
I coaxed a SIGSEGV out of another Facebook codebase ayyyyyyy
And a Solana crate lmaooo
Time to crash the blockchain
Oh boy oh boy -Zrandomize-layout + -Zbuild-std is starting to turn up crates that SIGILL
That's almost certainly misuse of an unsafe API in the standard library
I already see one crate we use at work which is just so exciting
In case anyone wants to take a crack at some of these in the meantime:
abomonation/0.7.3
plotters-bitmap/0.3.1
encode_unicode/0.3.6
safe-transmute/0.11.2
fallible_collections/0.4.4
plotters/0.3.1
heapless/0.7.13
wasmer/2.3.0
typed-index-collections/3.0.3
slice-deque/0.3.0
swc_ecma_transforms_compat/0.102.0
swc_ecma_transforms_optimization/0.128.0
parquet/15.0.0
I already know about the issues in abomonation, heapless, and safe-transmute. The others don't jump out at me as familiar
parquet is apache so hopefully they should be able to fix issues well enough
also it underpins arrow / datafusion / polars
speedy web segfault
Web3isgoinggreat
swc isn't web3 crap, it's the web compiler/bundler for web2 things
web(-3.0)
so, I'd actually care about swc 
also it's a typescript compiler
with no typechecking 
hah didn't know that. I guess it kinda makes sense is compile speed is a priority though
I guess your language server can do the type checking instead..
The parquet crate has a lot of soundness problems historically
I mean like users opening tickets about segfaults
In the case of parquet it seems kind of like they didn't care at first
They do now and one developer is trying to reimplement the whole thing
Creating a mutable reference or doing write through a raw pointer removes all tags for the memory in question that post-date the source of the reborrow or the pointer for the access
This is not a good behavior in SB
what does SB stand for?
oh stacked borrows
nvm
I wonder if we could get a mod to pin https://miri.saethlin.dev/ in this channel - it's a super cool tool!
It's in the opening comment I think
yes
can you get to the opening comment without scrolling way up?
Just ask mod to pin it
@tender nimbus are you able to pin here?
<@&631915156854538260>
can you pin the original post
this is also something for forum feedback
thanks 
cheers m8
I wonder how many of these errors still apply if you ignore provenance
looks like even stuff like once_cell is failing due to provenance checks
yeah, most of these are provenance related
There are also tons of nullptr derer because bindgen tests
Not very many
738
I should upload the version of this with SB disabled
yeah that would be very useful as well
not seeing so many uninit memory (http) is soo nice 
is SB == provenance stuff? I always kinda thought that provenance is some subset of SB?
sb is a model for handling provenance basically
the concept of provenance itself is not rust specific (also happens in every other compiled language with optimizing compilers)
rerunning my script on the httpless data reveals the new top ub causer
yeah, I understand that much, I just think I wasn't understanding the relationship, but that makes sense
bytes already has fix they just need a new version release 
🚀p🚀e🚀r🚀f🚀o🚀r🚀m🚀a🚀n🚀c🚀e🚀
perf (but we didn't profile)
wait... is the rust-crypto crate different from the rust crypto org?
it is isn't it...
https://github.com/DaGenix/rust-crypto/ not in the org
ffs
no, zero init
oh right it's a u32
last updated 2016
don’t think there’s merging that
oh
Security advisory database for Rust crates published through https://crates.io
oof

i think you're better off PRing these people https://crates.io/crates/rust-crypto/reverse_dependencies
most downloaded rev dependency last updates 2 years ago
of course it's a merkle tree crate and is definitely in every rust blockchain project
💯
merkletree had its last update a month ago
oh I guess only last release 2 years ago
what is the number on the left?
how many crates failed miri because they had the bad dep on the right
ah I see
IMO the most useful metric would be sorting unsound crates by recent downloads
they are already sorted by downloads on the website
idk whether it's total or recent though
oh, didn't know that
you're not the only one 
checking out the swcs
relax i'll handle the prs if you don't want to
can you do rollups for edits of my forum post
nice, i got the sigill as well
what forum post 😛
every time I see a rollup I think of these
the op of this thread
linkez-moi
tis pinned
fun, it sigills somwhere in slice indexing
uuuhm, why does it hit a ud2 instruction after a ret
ah because a branch nvm
ah wait you mean saethlin's repo
i mean this 
ye that

the goods news
a bunch of tests pass now
the bad news
it still sigills at some point
it's really nice that cargo tells you the name of the test binary
ah, the same pattern 20 lines above
so lmao, it's not even the randomize-layout that's going on here but the debug asserts from build-std
and it passes!
ah yes, love it when ci does that
as you can see, im a very busy person on github with so many notifications
ah github notifications the thing that nobody checks
I obsessively check my notifications
It's the only way to stay on top of like 10 PRs
same
i check it via my email
To each their own. My email is far too messy for that
ya fair
i wouldn't mind using github notifications, i thought it would solve my problems but ugh the UI just makes me not remember it exists
good, good
Good that it faults abomination, nice to have confirmation it’s working properly
I still have no idea how to fix the padding thing
No one’s given me any help I know how to act on
My extension of randomize that adds random padding before fields
There’s a bug with option on guaranteed niche types rn
But I don’t know how to express "if is option with guaranteed niche T"
Well that’s the thing, it’s just option specifically
I think you're making this too hard for yourself
And I’m not at all familiar with the compiler so I don’t know anything
The biggest issue is that I just don’t know what’s going on
You could just skip everything that looks like Option.
I for one just stomp around the docs until I find something that looks useful
For example, if I search for layout there are a few helpful looking functions, depending on what you have https://doc.rust-lang.org/stable/nightly-rustc/rustc_middle/?search=layout
The “main crate” of the Rust compiler. This crate contains common type definitions that are used by the other crates in the rustc “family”. Some prominent examples (note that each of these modules has their own README with further details).
Perhaps
Hum, I don’t think so
Option is just repr rust
I’m also not entirely sure what my function "knows" about
I don’t know if it knows it’s within an option or anything since it’s just calculating an aggregate type layout
some of those actually just build-std debug assertions instead of randomize layout (swc was just a debug assertion)
I would just express "if it's option, don't"
Even if there is no niche
Eh, that's sub-optimal
I suspect most of them are just from turning on the stdlib debug assertions
Though it's possible that adding randomize-layout in there produced some problems which are only detected by the debug assertions
Also potentially true, you could check with build-std without randomize
It's worth noting that the only other way I have to detect layout problems is looking for a SIGSEGV from a randomize-layout run, which basically only happens when you confuse pointers
Looking for test failures is too hard
I plan on diagnosing all of these crashes individually, so while it might be interesting to do build-std without randomize I think I'll power through
This is the function I'm working in btw if you have any ideas https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/layout/struct.LayoutCx.html#method.univariant_uninterned
I'm just confused tbh
This is a cry for help
Ask more detailed follow-up questions on the zulip
People there are trying to be helpful but they probably are assuming you know more than you do
You're also not in compiler/help
Ah whoops
They probably don't mind much
You could ask someone to move it for you
But also if someone hasn't suggested it already they probably don't care that much
It sure is cool that the authors of cryptography crates test their code with sanitizers and/or Miri
Sarcasm, I assume?
It's a brown M&M angle
I'm not familiar with that reference
"If I came backstage, having been one of the architects of this lighting and staging design, and I saw brown M&Ms on the catering table, then I guarantee the promoter had not read the contract rider, and we would have to do a serious line check" of the entire stage setup, Roth said.
Ahhh, smart approach
Yah, I guess that's fairly indicative of them caring about safety and whatnot
Yes this is a harmless little UB in your test, but the fact that it's in here tells me that nobody is using ASan on this library
why would one even use ASan or Miri
my code is perfect, i don't need such commonfolk tools
How do I put up an advisory that just says "holy shit do not use this crate please"
Abomonation be like
Whiplash from
- Use after free
- Protector error due to inserting into a hashmap while holding a reference across the insert
Starters I guess?
I donno the term
I want to run asan on github ci, do you have any examples of that
Normal cargo test but with
env:
RUSTFLAGS: -Zsanitizer=address
Wait really?
Yes that's all you have to do
You don't need to run it under anything special?
Nope, just a nightly toolchain and a flag
ASan supports MacOS and on Windows it's supported by clang
You can mix ASan and UBsan, which is not supported on Rust because there's no reason to
But other than that no you cannot mix them, you need to do them one at a time because their shadow memory runtimes collide
Gotcha
arc_swap has great CI 
make a 9.8 critical cve
, just like the one from failure
Holy shit it got a 9.8 for being EoL
Oh it's type confusion. That's still probably not a 9.8 but it's not just for being unmaintained
it got a 9.8 for "if the programmer willingly exploits a weakness in a library they are using, they can cause ub in safe code!"
which is very yikes
To be fair, rustsec is part of the problem here
can we give every C library ever a 9.8 CVE 
The fact that rustsec keeps saying "attack vector: network" is probably a big part of the problem here
attack vector: funny developer
i will suggest an improvement to the failure advistory
i mean
i have no idea what to even fill out here
report it to wg-sec they should handle it 😛
like, there are no attack vectors
Attack vector: Network + Confidentiality: High means you can use this vuln to dump the whole contents of a web server
eenie meenie minie moe
and if a developer willingly writes bad code
then everything would possibly be the vector
every single CVSS score they put out is like this
Like, look at this: https://rustsec.org/advisories/RUSTSEC-2019-0035.html
Security advisory database for Rust crates published through https://crates.io
congrats nils, you are now a member of wg-sec
Unaligned access means you can... totally own a server over the network apparently?
inb4 updating the log4shell advisory to do
yeah see, if the memory is not aligned as you want, it could rebel! and if memory rises up against you, everything is bad
FYI there are 3 log4shell CVEs, people lost their minds over all 3, and 2 of them are rated lower than most of the RustSec CVSS scores
aye
see, the log4shell attack are only hypothetical
who in their right might would put up a java webserver????
yet when I grep through my code for __private_get_type_id__ , I get hundreds of results
this is critical to web integrity
i'm giving it low here because idk, when a developer fucks this up maybe they could get hacked
why is there no attack vector "access to source code"
Oh right, because then the CVE would make no sense!
as_mut_ptr considered harmful: Now without Stacked Borrows
oh no what happened
Two crates doing a let buffer = Thing::new().as_mut_ptr();
Ah found another crate that does the same
is this a clippy lint
not like I expect people that don't run Miri to run clippy
But one can hope
We should really have a lint against this
No idea how 'tokio has a race condition' gets 8.1 while unaligned memory access gets >9.x
With no evidence that this even leads to practical miscompilation given anywhere. (there's double frees with less score)
Incorrect hash in sha2 gets 9.8 by means of 'Availability: High', 'Confidentiatlity: High'.
I have yet to discover how to reveal information or crash anything with this when it swapped two blocks of data during hashing.
CVE for libraries is entirely political, change my mind
I agree, libraries should probably not get CVEs
I think I could make an argument that all my unsound advisories can be prompted to a 9.8 CSS CVE by their logic but I don't have the patience for that
Do you know why miri wouldn't recognize -Z miri-strict-provenance?
- name: Run miri
uses: actions-rs/cargo@v1
env:
OS: ${{ matrix.os }}
PROPTEST_CASES: "10"
MIRIFLAGS: "-Z miri-strict-provenance -Z miri-check-number-validity"
with:
command: miri
args: test --all-features
error: unknown debugging option: `miri-strict-provenance`
Error: unknown debugging option: `miri-strict-provenance`
error: test failed, to rerun pass '--lib'
Error: The process '/home/runner/.cargo/bin/cargo' failed with exit code 1
i think you need to write it as -Zmiri-thing
miriflags parsing 
i think it just does a split space
Lovely
It's awful
What's the isolation one, Zmiri-disable-isolation?
Yes
yes
They're all in the readme
Lol, there's literal XSS vuln in the database with lower scores on confidentiality than the above race conditions. smh.
you're making fun of the race condition
but it gets even worse
a soundness issue (realistically non-issue) got a 9.8
So any as_ptr() as *mut T will get the same, then?
i mean, that's an actual stacked borrows issue (but doesn't deserve a cve of course)
what I'm talking about is
"a macro assumed that you wouldn't create a function with a very special name, and if you did create that function (you have to do this on purpose) then you can cause ub in safe rust"
Nvm that it's an actual sb issue. We already found it can't escape the local analysis in llvm<14, so practically it can't cause miscompilation.
Casting pointers around is always valid. Are you referring to doing a write through that pointer?
referring to returning this pointer via a library interface as &mut _
And in |x: &mut [u8]| &mut *(x[..4].as_ptr() as *mut [u8; 4])
which is UB according to SB but the llvm compilation will never know that the pointer can't be writtent through
That's what I'm trying to say, I have no idea how to handle those advisories for that reason.
only things that can actually break should be an advisory
complexity: Low, impact: None?
isn't complexity about attack complexity?
the difficult thing is that future Rust compiler can make it break, and would then have to bump up all the issues
even compiling the same thing with gcc might do it, I don't know
There's no way to express that dependency in cve, because it's not built for libraries
I'm not sure they even deserve complexity: low
It's more like complexity: unknown
The exactly miscompilation that may arise isn't known, and even if one does occur I have no idea what the odds are that it would even get deployed
And what's the user interaction?
these attack vectors really really aren't made for these kinds of things
like, not at all
how the fuck should i know whether my library ub is exploitable over the network or not
Correct, they're designed for applications not libraries
Source of the Rust file src/lib.rs.

Another one that I cannot patch
Is this self referential too
Yes, but the stack use after return is much worse
The self referential stuff could be patched over with some AliasiableBox or whatnot
Also the GitHub repo doesn't exit anymore so this one too will not be patched
Yikes, is that meant to be some form of sentry?
What do you mean by that?
Is that a single linked list, and it tries to use out as a form of termination?
Yes I think it's supposed to be a linked list... of some sort
Oh hell
Source of the Rust file src/de.rs.
Apparently mpsc has a double free?
That's not a double free
Oh yah, a "protected deallocation"
Protectors are a direct expression of the dereferenceable LLVM attribute that goes on references
So why would this be happening?
Does this go away if you pass -Zmiri-disable-weak-memory-emulation
Related, I thought miri supported windows threads?
It's possible that it doesn't. All I know is the Windows support is very flaky
Darn
The problem with Windows is that there are precious few contributors for it
not yet
but @golden summit is implementing them right now
No it does not
This is happening on mac btw
That's very exciting
We'll see if it happens on linux, I assume not
lmao
warning: associated function is never used: `name_cstr`
--> /Users/runner/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/src/rust/library/std/src/sys/unix/fs.rs:763:8
|
763| fn name_cstr(&self) -> &CStr {
| ^^^^^^^^^
|
= note: `#[warn(dead_code)]` on by default
warning: `std` (lib) generated 1 warning
wat
Oh yah, is -Zmiri-symbolic-alignment-check good though?
Define good
Should I use it
It has false positives, so I wouldn't suggest it normally. You need to manually diagnose each thing it finds, or run Miri a bunch of times with different -Zmiri-seed values
Alright cool
It tends to not find false positives because most people do not manually align things, but when they do 
What flags would you recommend for CI then I guess is a better question
-Zmiri-tag-raw-pointers -Zmiri-disable-isolation
how does symbolic-alignment-check work, what does it do?
It checks alignment symbolically 
That is unironically how it works though
You can deduce a lot of things from symbolic execution
Basically, even though you don't know the exact pointer values, you know some attributes it has and the invariants of the operations performed on it
Like, if I have <ptr align(16)> and I perform <ptr align(16)> + 16 I know that it's still aligned since forall { P: Ptr }, P % 16 == 0, (P + 16) % 16 == 0
oh god, mockall is super cursed
What it says is "a pointer to a u8 can never be used for a read which required an alignment greater than 1"
Oh
Frankly it's amazing that this doesn't just hork on everything
If you inspect the address of a pointer to walk it up to a correct alignment, it has no idea and you get a false positive
Although I guess technically it could actually use concolic execution here
Yes it could and that would be awesome
Misaligned pointers are very common in the Rust ecosystem and they're a pain to debug because Miri only catches them sometimes, and Ralf has a theoretical concern about a better alignment strategy for what people usually do
The official solution is "run Miri a bunch with different seeds"
yeah alignment is hard because you can hardly detect randomly getting correct alignment
@neon tiger implement concolic pointer alignment checks in miri
Honestly I think just maximally misaligning allocations would be way better
It's a very easy implementation and anecdotally it doesn't suffer the concern Ralf has
that does sound like the best solution
Robust and accurate checks and perturbations give the best results
I need to bug libs about doing that in the standard library
today i wrote another very cursed shell script 
whether a list of repos contains ub
Oh no it do be broken on linux
That's with -Zmiri-tag-raw-pointers -Zmiri-disable-isolation
unsafe fn tm_array<T, U, const N: usize>(array: [T; N]) -> [U; N] {
let array = ManuallyDrop::new(array);
unsafe { array.as_ptr().cast::<[U; N]>().read() }
}
let uninit = MaybeUninit::<[T; N]>::uninit();
let uninit = unsafe { tm_array::<T, MaybeUninit<T>, N>(MaybeUninit::assume_init(uninit)) };
wtf is this doing
oof
I was looking at the weak memory stuff and thinking "wow it's so cool that we can do this now but kinda sad that it's random"
That's a transmute
Without the size check
does transmute not like const generics?
Transmute does not like generics
It's faster 😎
yea
I don't grasp the code at the bottom
it's trying really hard to do ```rust
let uninit: [MU<T>; N] = unsafe { MaybeUninit::uninit().assume_init() };
OH
(but accidentally makes an uninit t)
Nice
fun
I think the top cause of use after free is people trying to test their "clears the memory on drop" things
Not it it's a type that owns the data
I fixed the one in ed25519-dalek because it's non-owning
ah
The right way to test them is probably with a custom allocator actually
ah yea
let ptr: *const u8 = mem::transmute(&self.to_be());
You do not love to see it
what the
:)
but why

my unsafe code typechecks so it must be right
pro tip: if code doesn't type check, insert transmute or transmute_copy to fix it
Tell std to stabilise the MU APIs already for arrays
I'm tired of always rewriting them
same
same
uhm
This person tried to hack around a lifetime error and wrote a use after free instead 
I'm having a miri moment

It's been running for 40 minutes
Doesn't that kinda undermine the checking
Yes but also if this never finishes or OOMs the machine it's on that also undermines the checking
Fair
otherwise, if you have a few tests that take very long but most tests don't, cfg(miri)ing them out is also fine
Do what you can on this test case and head on to other code is how I think of it
Ah yep, 2048 iterations with 32 threads
i understand why miri doesn't like it
get that down to a lot less iterations and threads 
Don't shrink the threads
Also what code are you running I want to profile it
If you use less threads it's harder to hit some race conditions
hmm I wonder if MIRI would get noticibly faster if you threw it into souper
No
I mean yes, but not a lot
I have a PR that takes the edge off of SB but it's not a fix
I guess it's mainly from the algorithm yeah
No it's mainly because the implementation leaks tags
It seems to be hit-or-miss though, probably seed based
I just want to watch it run
The problem with SB is that its design mandates a linear search, and because the runtime doesn't know when a pointer goes away, those linear searches grow in size over the course of program execution
For small programs you don't notice SB overhead hardly at all, then for large programs it eventually eats all your memory
Can't you at some points stop searching?
Like, once you hit your required access you can stop for something like a read, right?
Yes, that's what it does
You search top down and you add to the top, so if you're constantly searching for the same pointer but also creating new pointers, because for example you're reborrowing... yeah.
To be clear, you don't search for acccess in SB you search for tags
I was meaning stuff like SharedRead or whatever
Store the tags in a vec and simd search them
That does not fix the algorithmic problem
This is true, but isn't the linear nature fundamental?
I have a PR that does some clever caching and it neutralizes the SB overhead for some programs
The linear nature is fundamental, but the leak isn't. Oli and I think there is some way to implement a garbage collector for tags
Ahhh, gotcha
@haughty mica will you publish the disable-stacked-borrows run on your website?

I still vote for a volunteer-based thingy
I'd totally run a container for you, I'm sure others would as well
Yeah on one core though?
Really what I should do is physically move my computer farther from my bed so I can run this while I'm asleep
You know if people want to help, figuring out some kind of caching scheme for build artifacts that doesn't also let crates corrupt it when I run their tests would be a big help to the runtime speed
Containers can have more than one core
sccache?
I tried it and I couldn't observe a significant improvement. Probably ~64 processes banging away on one directory that's mounted into 64 different Docker containers isn't fast
Gotcha
iirc it has s3 support, you could hook it up to mimio (fake s3, we use it in docs.rs for local tests)
You know what, the other thing I should do is actually profile this
Or maybe do cross-crate scheduling by yourself
Gather all (or some) of the crates that you need tested and build each of their dependency graphs
Merge all the graphs together and then you have a build plan
Yeah see that seems complicated
This is true
I would rather have something that's slow and not buggy than something that's really complicated but slightly faster
Fair
I mean it's theoretically the best you can do
Minimal work done, maximal allowed concurrency
Would be nice if there was a way to have the most common dependencies precompiled
Wait isn't that called a workspace
Lol like what the playground does
Cargo abomination incoming 
Clone them all into one repo and make them one big 'ole workspace
cargo miri test --all

Ah that's true
true, this might be a good idea
I think it's better, cargo may combine their build plans into one big boi
Also some crates stomp around on the filesystem so I'm cautious of just merging them
On the other hand, are shared target dirs safe for multiple compilations in parallel
No
Yes
Really?
You will simply not get any parallelism
Doesn't cargo take out a lockfile
there's a lock
Touche'
that's not great
how often are you really compiling 2 different projects at once
Im not, but miri-tools is

I can't see SB craziness in my profile of this, and it eventually dies on some classic as_mut_ptr invalidation
Might just be general interpreter slow
Miri is still dying with the changes if you want to look at it again ig https://github.com/vmware/database-stream-processor/actions/runs/2481434847
Yeah uh all I see is a gray screen waiting for output
I'm curious to know what it's running right now but I cannot tell
Only has the top 10,000 but go nuts https://miri.saethlin.dev/no-sb/ub
#3 0x00005567c2e03e92 in core::slice::raw::from_raw_parts<u8> (data=0x0, len=0) at src/slice/raw.rs:93
#4 0x00005567c2c1cb4d in font_kit::loaders::freetype::Font::rasterize_glyph (self=0x7ffd363d3c48, canvas=0x7ffd363cfcf8,
glyph_id=3, point_size=9.67741966, transform=..., hinting_options=...,
// Safety:
// we just allocated enough capacity and data_len is correct.
unsafe { escape_field(bytes, self.quote_char, &mut self.data[data_len..]) }
I love when the safety comments are simply wrong
Really makes me wonder about the merits of requiring them on every unsafe if people just write wrong safety comments
@tender nimbus if you or anyone else fixes a SIGILL please shoot me a message or link me the PR or something, I want to track what bugs these are finding
At least if the unsafe code itself looks OK at a glance but the reasoning in the safety comment is totally off, that'll make me look extra hard
🤷 Check out the above code from polars_io
It's a very common mistake but it's just a wrong safety comment
Yeah cool but if I view that web page I need you to know that I cannot figure out what is being run
If you could name the test that was still running maybe I could help, but this level of slowdown is typical
Const eval is ~1000x slowdown, and SB is ~infinite in the general case

arc_swap 
You have found the SIGILL in the two swc crates yourself but here's the fix: <1https://github.com/swc-project/swc/pull/4943>
wtf, invalid pointer derefs inside vec in alacritty 
running just the data race test of arc-swap in miri makes it uaf instead
fun
TSan finds the data race as well
yikes
they do run tsan in ci though 
ah there is an issue
so they say that this is a false positive
this goes way above my head 
https://github.com/vorner/arc-swap/issues/76 i opened an issue i guess
I thought arc-swap fixed the SB issue but just didn't release it
no
Oh I swear I did a PR, darn
i'm sure saethwin dreams about SB issues
I wish my dreams were that calm
who doesn't
thank you discord
please stop using relative message counts for the replies
Yah it’s kinda annoying that gh doesn’t show in-progress logs
But it’s the exchange test that it’s stalled on
Oh my bad I totally didn’t realize you had to log in to view logs
Because normally right these tests are pretty quick so it doesn't matter but doctests have that nice warning when they run for a while
I’ll send the log archive, gh doesn’t leak secrets in logs right?
Yah, nothing else has any trouble
Just paste me the name of the test that was stuck
Even all four of the sanitizers are super quick
Yeah they're only like 2x slowdown
sanitizers slow a program down
unlike miri, which basically brings it to a halt
operator::communication::exchange::tests::test_exchange
Yah I was just confirming that this is very much a miri issue, nothing else cares
Yeah totally, that's why I think it would be best to do these sorts of hacks in cargo-miri
Oh this is SB thrashing
Just a perf top --pid $(pgrep miri | tail -n1)

Those top 3 functions are linear in the runtime of the borrow stack
Total memory usage holding steady though that's nice
perf says 97% of runtime in SB, but because SB also trashes your cache it's closer to 100%
is it better on your branch?
It's better
I think I never implemented a fix for the linear behavior of find_first_write_incompatible
This still destroys the cache, though I could fix that, though I think Ralf would be unhappy about it
Curious that this doesn't stress iter_mut
why, what would you do?
Well first of all this needs SoA
lmao
This loop is just a scan of Permission which is a 4-variant enum, but it's actually scanning a Vec<Item> where Item is 24 bytes

why would this make ralf unhappy? because it would make the code harder to read?
It makes the code harder to hack on
Dammit, why can't I remember the word?
It's for when there's a bad thing and you've found a situation that exacerbates it to horrible levels
The changes Ralf wants to make to SB are very much in the guts of how all this lookups into the borrow stacks work
If you make the api nice enough it could be a moot point?
Specifically in software, it's not an edge case but close?
Dammit
there was a soa derive crate that made this kind of not completely horrible i think?
Whatever, of course we made another piece of code that thrashes rust stuff lmao
SB is a big tangled mess of state so yes you could do that but it's not easy and I'm trying to do it
why
Sounds like a good benchmark
I thought so too but no one took me up on it
was the bug fixed
wg-compiler-perf didn't want it?
Donno
or what caused this abomination
Lemme find the issue
Maybe it was this? https://github.com/rust-lang/rust/issues/78925
Totally, it's still a really big issue though
Other teams were disliking us because of our compile times
It's an issue with the architecture of your code
There's really not much the compiler could do to help you here
the first timely test already times out in miri 
timely uses abomonation
Timely also has its own ub apart from abomonation
The consolidation code is ub off the top of my head
"potential" unaligned memory access
no response for over 6 months
That's typical
Frank doesn't care
materialize isn't getting owned so...
Hum?
It's not a security issue
Ahh gotcha
I mean, abomonation is a massive issue if he ever wants to actually utilize timely's distribution mechanics
It will shit the bed, I've tried before
I honestly do not understand the use case for all this
For timely?
I've worked in code that needs to pump data to a serialization format and looking at the system holistically I would never use abomonation because that doesn't help with my hot path
Oh yah, same
times out?
btw I'm running it and this is in the interpreter not SB lol
It's vaguely hot since it's used within exchange operators, but they're miniscule in comparison to building/maintaining indexes and anything using indexes like joins or aggregation
For me (and maybe this is just because of our cost structure) arranging the data to be compressible and compressing it always dominates feeding it to a serializer
Also shilling dbsp, we're faster and use less memory than timely, and we actually have theory & math behind our stuff that's understandable
Yep, most definitely
We're also working on disk-backed persistence and out-of-core indexes, meaning we can work over more data than main memory can hold which timely can't do since it's 100% in-memory
but how can you be faster than timely, you don't even do horrible ub??? 🚀

Could my miri thing be because of thread::yield_now()?
Maybe that's screwing with miri
What is your miri thing
It taking multiple hours?
No, that's just Miri being slow and SB being ~infinitely slow
Ah gotcha
You need to shrink the working set of the test or cfg it out
usually, there is literally nothing behind miri being slow
just.. miri being slow
I'm decreasing the number of rounds it does
reducing the work miri will have to do will make the test faster
The dbsp SRW blocks are thousands of tags long 
You're welcome for your bench case 
Pog
🎉
https://github.com/kolloch/crate2nix may get you this
caveat: it's Nix, so you'll have to dick around with it forever to get it to work
lol
but it will get you obviously-sound per-crate caching
obviously because the cache is keyed by all inputs, by construction
That could be good, yah
And I guess nix would help with the volunteer problem, people could just spin it up
I'm now obsessing about a good way to share build artifacts, thanks yall
because it's cool
flurry has UB and the only reason it wasn't detected is because they cfg_attr(miri, ignore) some tests
:)
i'll look into this
it's at least a fresh form of UB (deallocation with wrong layout)
wait
it's a seize issue?
hmmm
yeah, sieze's tests fail miri
which is weird since they have miri in CI
well
maybe
i'm still looking into this
but yeah weird
oh
wtf
this is allocating a vec and then deallocating it with Box::from_raw ??
no
is what not allowed
wait
why is this making a vec
fn allocate_bucket<T>(size: usize) -> *mut Entry<T> {
Box::into_raw(
(0..size)
.map(|_| Entry::<T> {
present: AtomicBool::new(false),
value: UnsafeCell::new(MaybeUninit::uninit()),
})
.collect(),
) as *mut _
}
can you collect into a box?
wtf is this
yes you can
impl<I> FromIterator<I> for Box<[I]> {
fn from_iter<T: IntoIterator<Item = I>>(iter: T) -> Self {
iter.into_iter().collect::<Vec<_>>().into_boxed_slice()
}
}
lmao
okay so
is Entry dynamically sized
how does it know the length
it is not
oh
is this
okay
it's allocating a *mut [Entry<T>] (or similar)
where the size is not 1
and then it's deallocating it as a *mut Entry<T>
ah
tf
and that doesn't blow up in practice
because malloc/free does not care
about layout
that's pretty cursed
but miri does
okay so
we do know how big the bucket we just made is
where is allocate_bucket used
i might just change that to return *mut [Entry<T>]
and then cast that to a *mut Entry<T> as needed
hmmm
nah i'll just deallocate using thread.bucket_size
how do you deallocate a boxed slice from a pointer
don't i need ptr metadata for that
ptr::slice_from_raw_parts

currently looking at some cursed alacritty shit
they read pointer bytes as normal bytes
not good
i might need to pass -Zmiri-allow-ptr-int-transmute for this to pass
since we might be putting pointer bytes in atomics
(integer atomics)
yep
a lot better than the issue before
or not, actually?
oh nice
😵💫
strange
well
do the rest of the tests pass allowing ptr-int transmute
they do not
data race
deallocate / read
race
ah, this uses a custom qword memcpy
but using usize instead of MU<usize>
MIRIFLAGS="-Zmiri-disable-isolation -Zmiri-allow-ptr-int-transmute -Zmiri-disable-weak-memory-emulation -Zmiri-ignore-leaks"
tests pass with this
Could be the weak memory emulation bug again
is that a stdlib bug or a miri bug
Miri
if we implement memcpy using MaybeUninit, is it mumcpy?
it's MaybeCopy

hi
miri's complaining about the Box::from_raw
i can get (current) miri to pass with no errors but only if i enable like 3 ignores
that part of the code is pretty much just vendoring the thread-local crate
ah so thread-local is broken too?



