#Fixing UB in random crates

1 messages · Page 4 of 1

tough leaf
#

i'll check that out

manic tangle
#

in Drop?

tough leaf
#

i'll make a PR

#

easier to show you there

#

you're allocating a Box<[Entry]> and then making a *mut [Entry] and then throwing away the length by casting it to *mut Entry

#

so when you Box::from_raw the *mut Entry, the layout is wrong unless the thread.bucket_size is 1

manic tangle
#

but in drop it does ptr::from_raw_part_mut

#

oh wait

tough leaf
#

this isn't in drop

manic tangle
#

is this in the race to allocate

tough leaf
#

yes

manic tangle
#

that... would be my fault

tough leaf
#

in get_or

#

does the pinned miri version not catch this?

#

hmmm

manic tangle
#

okay it's not released yet in thread-local I don't think

#

why doesn't ci catch this though ferrisThonk

tough leaf
#

fuck if i know

#

does old miri just not check this?

manic tangle
#

miri from 6 months ago ferrisThonk

haughty mica
#

thread-local is sketch

manic tangle
#

how

haughty mica
#

Oh that's just old version never mind

manic tangle
#

the cve?

tough leaf
#

yeah old miri just doesn't notice this

#

for some reason

haughty mica
#

Old Miri does not have weak memory emulation or preemption

#

There's probably a futex bug in new Miri

manic tangle
#

oh wait

haughty mica
#

Old Miri definitely checks for that, if it hits that code path

#

But just saying, the weak memory stuff literally has more code coverage

manic tangle
#

that'a true the cas probably never fails without weak memory

#

or preemption

#

preemption not weak memory

tough leaf
#

ohhh right

#

yes

#

that makes sense

haughty mica
#

-Zmiri-preemption-rate=0

#

It's in the readme

manic tangle
#

your pr fails cargo fmt lol

haughty mica
#

The usual

#

Sorry to see you're not a sigma format on save enjoyer

tough leaf
#

fuck

#

oh god what is that formatting

#
-   let to_free = std::ptr::slice_from_raw_parts_mut(new_bucket, thread.bucket_size);
+   let to_free =
+       std::ptr::slice_from_raw_parts_mut(new_bucket, thread.bucket_size);
haughty mica
#

That's normal when you toggle the setting for it

tough leaf
#
let _ = Box::from_raw(std::ptr::slice_from_raw_parts_mut(
    new_bucket,
    thread.bucket_size,
));
#

okay that's reasonable

manic tangle
#

also remove std:: ferrisBorrowCheck

tough leaf
#

oh is ptr imported

#

yes

#

there

manic tangle
#

thanks

#

should I enable miri strict-provenance?

haughty mica
#

If you want

#

I think it's mostly tag-raw-pointers with worse errors

#

That's partly my fault

tough leaf
#

hmmmm
strict provenance fails
because it's saying something isn't a valid pointer

it's a union

#

is this
doing a ptr to int transmute?

manic tangle
#

🤫

#

it might be doing a ptr to int transmute but only for comparisons

#

the invalid pointer issue is a different bug that I am fixing

west violet
#

I've been thinking, maybe you should let contributors PR in miri results to a git repo that the website runs off of

#

Then you could automate the running part and have it produce a file/archive of the relevant results that they can then pr

manic tangle
#

okay now miri is saying some weird things and I'm not sure if it's a weak memory emulation bug or not

haughty mica
haughty mica
#

Yay I found the UB in heapless that 0xide punted on fixing

#

Woooo

#

Go me

#

Yay

ruby jacinth
#

u detectowed the ub ferrisHeartEyes

haughty mica
#

I love how you can debug some of these just from the backtrace in gdb

#
#0  core::slice::raw::from_raw_parts::{closure#0}<u32> ()
    at /home/ben/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/intrinsics.rs:2009
#1  0x00005568b59e8e69 in core::ops::function::FnOnce::call_once<core::slice::raw::from_raw_parts::{closure_env#0}<u32>, ()> ()
    at /home/ben/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:248
#2  0x00005568b5a1baa7 in core::intrinsics::const_eval_select<(), fn(), core::slice::raw::from_raw_parts::{closure_env#0}<u32>, ()> (arg=(), _called_in_const=0x0, called_at_rt=...)
    at /home/ben/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/intrinsics.rs:2375
#3  0x00005568b591cf42 in core::slice::raw::from_raw_parts<u32> (data=0x7f25b0006593, len=0)
    at /home/ben/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/raw.rs:93
#4  0x00005568b42c1e46 in parquet::util::hash_util::crc32_hash (bytes=..., seed=0) at src/util/hash_util.rs:115
#

This one is a 9.8 CVE ferrisClueless

#

Once again, easy from the backtrace

#0  core::slice::index::{impl#2}::get_unchecked::{closure#0}<u8> () at src/intrinsics.rs:2009
#1  0x00005623ca330885 in core::ops::function::FnOnce::call_once<core::slice::index::{impl#2}::get_unchecked::{closure_env#0}<u8>, ()> () at src/ops/function.rs:248
#2  0x00005623ca348e87 in core::intrinsics::const_eval_select<(), fn(), core::slice::index::{impl#2}::get_unchecked::{closure_env#0}<u8>, ()> (arg=(), _called_in_const=0x7fa8975fb620, called_at_rt=...) at src/intrinsics.rs:2375
#3  0x00005623ca357f37 in core::slice::index::{impl#2}::get_unchecked<u8> (self=0, slice=...) at src/slice/index.rs:225
#4  0x00005623ca358327 in core::slice::{impl#0}::get_unchecked<u8, usize> (self=..., index=0) at src/slice/mod.rs:405
#5  0x00005623ca1b882c in encode_unicode::utf8_char::Utf8Char::from_slice_start_unchecked (src=...) at src/utf8_char.rs:450
#

Wait a second actually maybe no

#

Should have trusted myself, I was right, this test is just some incredible facepalm material

#
#3  0x000055ef3792c877 in core::intrinsics::copy_nonoverlapping<u64> (src=0x7fbc74000d39, dst=0x7fbc7a9fd588, count=1)

facepalm

golden summit
#

hey saethlin are you gonna test on windows after threading is merged, or are you gonna wait for file io too?

haughty mica
#

Test on windows as a host?

#

Or as a target?

golden summit
#

target

#

it would be nice to have something like the shim wishlist issue

haughty mica
#

I'll do anything

golden summit
#

but for windows

haughty mica
#

Oh I see the angle

#

For sure

#

I'll run some stuff tomorrow

#
#3  0x0000557dc42fae22 in core::slice::raw::from_raw_parts_mut<u8> (data=0x0, len=0)

You absolutely do not love to see it

#
#3  0x00005572598b101c in core::intrinsics::copy<wasmer_vm::vmcontext::VMFunctionImport> (src=0x8, dst=0x7f5df8001b34, count=0)
#

wat

pastel lily
#

Copying a zst “allocated” at dangling to stack?

haughty mica
#

Copy 0 of them though? thonk

#

That's not a number of bytes that's a number of instances of T

#

Hit another cryptography library sigh

#

Oh this one is just an actual SIGILL, it's not one of my debug assertions firing

#

I've found one or two crates with bindings that seem to be just broken because they build but SIGILL or SIGSEGV

#

idk what's up with that tbh

hybrid forge
#

Hello, is there any reason someone would use partially unitialised arrays over ArrayVec beside the extra dependency? Because that's what I see a lot in crates

tough leaf
#

what crates are doing this im gonna melobonk them

hybrid forge
#

It's a weak reason to introduce unsafe code tbh, especially given the fact arrayvec is already in their dependency tree in most cases. I wonder if it wouldn't be better to point at it in rust docs too?

#

For some embed-targeted crates it's understandable (to avoid the dep) but that is seldom

haughty mica
#

Where are you seeing this? An example would help

tough leaf
#

i'm tempted to push for more mem::uninit panics tbh

now that http has been fixed

thorny venture
#

Do ittt

thorny venture
tough leaf
#

generic storage
oh hi

#

it's you

#

:D

thorny venture
#

Well, I first wrote an RFC for an array vec

#

I only became a storage shill after the controversy

tough leaf
#

one thing that might be a good idea is to submit API compatible versions of old crates

#

API and ideally MSRV

#

like, crossbeam-queue 0.1.2 is fucked

#

but if you run cargo update, you don't get a newer version

#

yeah so

game plan

  1. go through the historical crater run and look at all failures
  2. for all root causes, try and get a semver compatible but fixed crate released
  3. fire off a crater run with the stricter checks and hope there's not 400 regressions
#

if a cargo update fixes it, I don't care

#

maybe straight up say "this panic seems to come from crossbeam-queue 0.1.2, you can run cargo update -p crossbeam-queue to update to a fixef version"

#

in the panic

haughty mica
#

Phew

#

Good luck with that plan

#

As an aside I'm amazed how many people took my UAF fixes

tough leaf
#
struct Inner<T> {
    bytes: [u8; 64],
                                                                       
    /// `[T; 0]` ensures alignment is at least that of `T`.
    /// `PhantomData<T>` signals that `CachePadded<T>` contains a `T`.
    _marker: ([T; 0], PhantomData<T>),
}
                                                                       
impl<T> Inner<T> {
    fn new(t: T) -> Inner<T> {
        assert!(mem::size_of::<T>() <= mem::size_of::<Self>());
        assert!(mem::align_of::<T>() <= mem::align_of::<Self>());
                                                                       
        unsafe {
            let mut inner: Self = mem::uninitialized();
            let p: *mut T = &mut *inner;
            ptr::write(p, t);
            inner
        }
    }
}
#

aaaaaaaa

#

why is this not just holding a T

#

to be fixed size?

haughty mica
#

Where is this code?

#

crossbeam?

tough leaf
#

crossbeam-utils

haughty mica
#

Oh I see the problem

#

Also wtf

tough leaf
#

they have a MSRV of 1.12

#

at this point in time

#

oh right

#

it needs to be 64 bytes long

#

in newer versions of rust you can just repr(align) it

#

well

#

i could create Inner by zero initing bytes

#

that's still unsound if you have padding bytes in T

#

but honestly it's the best i can do
hmmm

#

this is the full list of regressions filtered by unique types

that's.... a lot less than i was expecting

#

so it looks like four root causes for zero init, 3 of them being crossbeam

haughty mica
#

That does not mean your plan is easy

tough leaf
#

oh no it's not
but it's doable

west violet
haughty mica
#

Requires const fn

#

msrv too low

west violet
#

Ahhh

tough leaf
#

yeah the MSRV is far too low for that
the correct solution is just #[repr(align = 64)]

haughty mica
#

I now know of 3 crates which pass overlapping ranges to copy_nonoverlapping

west violet
#

lmao

tough leaf
#

oh also would an abort in unreachable_unchecked be of any value or in practice is that just an abort anyways

haughty mica
#

That would be useful

#

No need to take any chances

#

It's definitely a ud2 in debug mode already but why risk it

tough leaf
#

that'd just be assert_unsafe_precondition!(false); i assume

haughty mica
#

Eh

#

Or just call abort

tough leaf
#

fair, yeah

haughty mica
#

Lot of this stuff just hurts my brain

#

At least I've categorized all the SIGILL crates now

#
# use of abomonation
abomonation/0.7.3
abomonation_derive/0.5.0
differential-dataflow/0.12.0

# wild get-unchecked in a Map implementation?
heapless/0.7.13

# copy_nonoverlapping called on overlapping ranges
dmsort/1.0.1
lightws/0.6.4
symbolic-debuginfo/8.7.3 # dmsort

# null slice
rustler_sys/2.2.0
filebuffer/0.4.0
multiboot/0.8.0
spectrum-analyzer/1.2.4
rusty_v8/0.32.1
plotters/0.3.1
plotters-bitmap/0.3.1
tract-core/0.17.0
tract-hir/0.17.0
tract-linalg/0.17.0

# get_unchecked into an empty slice
compression/0.1.5
fallible_collections/0.4.4
garando_syntax/0.1.0
mp4parse/0.12.0
quickersort/3.0.1
slice-deque/0.3.0
stun_codec/0.2.0
stun_codec_blazh/0.1.13
swc_ecma_transforms_compat/0.104.1
swc_ecma_transforms_optimization/0.130.0
swc_ecma_transforms_testing/0.91.0
typed-index-collections/3.0.3
rav1e/0.5.1
secp256k1-plus/0.5.7 # arrayvec 0.3
polars/0.22.8
encode_unicode/0.3.6 # test-only and... why

# out-of-bounds get_unchecked, index = len
chess/3.2.0
grin_secp256k1zkp/0.7.11

# slice::from_raw_parts with misaligned pointer
commitlog/0.2.0
lua_bind_hash/1.0.1
malloc_buf/1.0.0
pelite/0.9.0
safe-transmute/0.11.2
parquet/16.0.0
datafusion/9.0.0

# Misaligned pointer passed to copy/copy_nonoverlapping
nanoserde/0.1.30
wasmer/2.3.0
wasmer-middlewares/2.3.0
as-ffi-bindings/0.2.4

# Requires a bunch of libraries, someone else can take this
opencv/0.65.0
west violet
haughty mica
#

debug mode usually means no optimization

west violet
haughty mica
#

I didn't hit them

#

They... Do not have much test coverage

#

That's probably why

west violet
#

This is true

haughty mica
#

It's very interesting that I only found one wild OOB slice access

west violet
#

I’d imagine that’s something pretty easy to find in most scenarios

#

That is, it actually manifests as a bug

haughty mica
#

slice access

#

The memory access would be valid if they didn't go through get_unchecked

west violet
#

Huh?

haughty mica
#

get_unchecked is UB to index out of bounds, even if there is memory out there

west violet
#

Oh I see

haughty mica
#

Some codebases do vec[vec.len()..].get_unchecked(0) to get the first uninit element for example

haughty mica
#

Oh font-kit is a servo crate. Nice

#

You know what, I should just try building servo

#

I'm sure Firefox would be much more exciting

pastel lily
#

hahah

haughty mica
#
diff --git a/rust-toolchain b/rust-toolchain
index abcacd9bd1..bf867e0ae5 100644
--- a/rust-toolchain
+++ b/rust-toolchain
@@ -1 +1 @@
-nightly-2022-04-11
+nightly
#

That toolchain predates the debug assertions

tender nimbus
haughty mica
#

🤷

tender nimbus
hybrid forge
tough leaf
#

added a list of all the bad crates there

west violet
haughty mica
#

The issue with all of these is convincing maintainers to release a patch for an ancient version

west violet
#

Or yank them

haughty mica
#

That's even less of a fix

west violet
#

Fair

knotty oar
#

only supported versions though?

haughty mica
#

What does supported mean?

knotty oar
#

depends on what the author defines - i nmost cases just the latest major version (and all its minor ones)

haughty mica
#

Yeah so that's my concern with 522's plan

tender nimbus
#

"we want to break those crates" could be a convincing argument for a maintainer to backport (even more so if we deliver branches with the backports)

OTOH, "don't" could be the response to such a request

tough leaf
#

true
i'll let llvm updates break the code instead

west violet
#

Writes pass targeting this code

tough leaf
#

honestly depending how aggressive llvm (and us) gets with noundef i wouldn't be surprised if even http's use breaks

#

(now patched, but it's been UB for like.... ever)

tender nimbus
#

only took a few pings to get it fixed ferrisballSweat

haughty mica
#

Get what fixed?

#

http?

thorny venture
#

Rust

tender nimbus
tender nimbus
#

vvv like this one

#

@thorny venture fix rust

thorny venture
#

Done

thorny venture
tender nimbus
#

https safety could still be improved by the 3 open PRs but at least it's not about soundness anymore

haughty mica
#

Oh really?

#

Doesn't it still do & -> &mut

tender nimbus
haughty mica
#

It has a single Iter type which powers both shared and mutable iteration. You can't do that soundly.

tender nimbus
#

uh

#

for the header map?

#

that one has IterMut, Iter and IntoIter

#

according to git blame, for years

haughty mica
#
#[derive(Debug)]
pub struct Iter<'a, T> {
    inner: IterMut<'a, T>,
}
#

That cannot work

tender nimbus
#

oohh

#

is there a fix pr for this ws well

haughty mica
#

I do not recall

#

The fix is pretty easy if you want to code it up, it just looks like awful programming unless you understand SB

#

It's really just copy+paste another definition of IterMut over Iter then duplicate all its methods
Possibly a macro would be nice

#

I definitely implemented this at least once just to convince myself that was the problem

#

Because you never know if you understand a bug until you've fixed it, you know

thorny venture
#

Why not IterPtr then turn into &mut or & respectively

tough leaf
#

you can't just clueless yourself

thorny venture
#

Fr though I'm saddened by how complex the iterator trait is for efficiency

#

Specialisation pls

pastel lily
tender nimbus
pastel lily
#

you can always clueless yourself

tender nimbus
haughty mica
#

It feels like somehow the Windows stuff is running slower. That doesn't make any sense

#

Maybe it's just that I'm back at the top of the list where things have more tests

golden summit
#

huh.

#

createthread aint even on the list

haughty mica
#

The list goes on

#

Oh

golden summit
#

theres a lot of file system stuff

haughty mica
#

I run things with --jobs=1 --test-threads=1

golden summit
#

GetFullPathNameW is painful to implement, the last time i tried it

haughty mica
#

And I think somehow Miri ends up running tests serially anyway

#

I should really raise the parallelism to 2 or something, to make it easier to find races

golden summit
#

if i remember right

#

CreateIoCompletionPort is probably tokio

#

i think it would be funny tho if i got async file io in miri on windows before linux

haughty mica
#

Yeah tokio just dominates everything

golden summit
#

but also that sounds incredibly painful to implement

haughty mica
#

Yes

#

Could you just return an error from the case you don't want to handle?

golden summit
#

oh yeah i remembered why GetFullPathNameW is up there so high

#

std calls it every time you need to use a path for anything

#

so all file io stuff is being covered up by it

#

except CreateIoCompletionPort

#

somehow

#

well thats probably being manually done by tokio

#

oh you dont actually need a file for CreateIoCompletionPort

west violet
#

Yah, ports are io objects, they're not files

haughty mica
#
3284 counts
(  1)     1684 (51.3%, 51.3%):  GetFullPathNameW
(  2)      633 (19.3%, 70.6%):  CreateIoCompletionPort
(  3)      304 ( 9.3%, 79.8%):  GetFileInformationByHandleEx
(  4)      195 ( 5.9%, 85.7%):  GetModuleFileNameW
(  5)       60 ( 1.8%, 87.6%):  WSAStartup
(  6)       46 ( 1.4%, 89.0%):  SleepConditionVariableSRW
(  7)       27 ( 0.8%, 89.8%):  CompareStringOrdinal
(  8)       25 ( 0.8%, 90.6%):  GetTickCount64
(  9)       19 ( 0.6%, 91.1%):  CreateFileW
( 10)       18 ( 0.5%, 91.7%):  AcquireCredentialsHandleA
( 11)       17 ( 0.5%, 92.2%):  Sleep
( 12)       15 ( 0.5%, 92.7%):  CreateNamedPipeW
( 13)       13 ( 0.4%, 93.1%):  CryptAcquireContextW
( 14)       13 ( 0.4%, 93.5%):  WakeAllConditionVariable
( 15)       13 ( 0.4%, 93.8%):  log1p
( 16)       12 ( 0.4%, 94.2%):  CreatePipe
( 17)       11 ( 0.3%, 94.5%):  CreateSemaphoreA
( 18)       10 ( 0.3%, 94.9%):  FileTimeToSystemTime
( 19)       10 ( 0.3%, 95.2%):  SHGetKnownFolderPath
( 20)       10 ( 0.3%, 95.5%):  SetThreadErrorMode
( 21)       10 ( 0.3%, 95.8%):  zx_vmo_create
( 22)        8 ( 0.2%, 96.0%):  GetCurrentProcess
( 23)        8 ( 0.2%, 96.3%):  GetModuleHandleA
( 24)        7 ( 0.2%, 96.5%):  GetProcessHeap
( 25)        7 ( 0.2%, 96.7%):  LocalAlloc
( 26)        7 ( 0.2%, 96.9%):  SystemTimeToTzSpecificLocalTime
( 27)        5 ( 0.2%, 97.0%):  AddVectoredExceptionHandler
( 28)        5 ( 0.2%, 97.2%):  GetComputerNameExW
( 29)        5 ( 0.2%, 97.4%):  llvm.x86.sse.cmp.ps
( 30)        5 ( 0.2%, 97.5%):  signal
( 31)        5 ( 0.2%, 97.7%):  zx_channel_create
( 32)        4 ( 0.1%, 97.8%):  CertCreateCertificateContext
( 33)        4 ( 0.1%, 97.9%):  CertOpenStore
( 34)        4 ( 0.1%, 98.0%):  GetAdaptersAddresses
( 35)        4 ( 0.1%, 98.1%):  GetBinaryTypeW
( 36)        4 ( 0.1%, 98.3%):  llvm.x86.sse.sqrt.ss
( 37)        4 ( 0.1%, 98.4%):  zx_cprng_draw
( 38)        4 ( 0.1%, 98.5%):  zx_time_get
( 39)        3 ( 0.1%, 98.6%):  llvm.x86.aesni.aeskeygenassist
( 40)        3 ( 0.1%, 98.7%):  zx_event_create
( 41)        3 ( 0.1%, 98.8%):  zx_port_create
( 42)        2 ( 0.1%, 98.8%):  CreateFileA
( 43)        2 ( 0.1%, 98.9%):  FormatMessageW
( 44)        2 ( 0.1%, 99.0%):  MultiByteToWideChar
( 45)        2 ( 0.1%, 99.0%):  NotifyIpInterfaceChange
( 46)        2 ( 0.1%, 99.1%):  clang_createIndex
( 47)        2 ( 0.1%, 99.1%):  objc_getClass
( 48)        2 ( 0.1%, 99.2%):  zx_ticks_get
( 49)        2 ( 0.1%, 99.3%):  zx_timer_create
( 50)        1 ( 0.0%, 99.3%):  CloseHandle
( 51)        1 ( 0.0%, 99.3%):  CryptStringToBinaryA
( 52)        1 ( 0.0%, 99.4%):  FreeConsole
( 53)        1 ( 0.0%, 99.4%):  GetCurrentProcessId
( 54)        1 ( 0.0%, 99.4%):  GetLogicalProcessorInformation
( 55)        1 ( 0.0%, 99.5%):  GetModuleHandleExW
( 56)        1 ( 0.0%, 99.5%):  GetNativeSystemInfo
( 57)        1 ( 0.0%, 99.5%):  GetUserDefaultLocaleName
( 58)        1 ( 0.0%, 99.5%):  LoadLibraryW
( 59)        1 ( 0.0%, 99.6%):  OpenClipboard
( 60)        1 ( 0.0%, 99.6%):  PFXImportCertStore
( 61)        1 ( 0.0%, 99.6%):  PQinitSSL
( 62)        1 ( 0.0%, 99.7%):  RegOpenKeyExW
( 63)        1 ( 0.0%, 99.7%):  _getmaxstdio
( 64)        1 ( 0.0%, 99.7%):  llvm.x86.sse42.pcmpestri128
( 65)        1 ( 0.0%, 99.8%):  objc_allocateClassPair
( 66)        1 ( 0.0%, 99.8%):  objc_allocateProtocol
( 67)        1 ( 0.0%, 99.8%):  objc_getClassList
( 68)        1 ( 0.0%, 99.8%):  sel_registerName
( 69)        1 ( 0.0%, 99.9%):  strerror
( 70)        1 ( 0.0%, 99.9%):  zx_cprng_add_entropy
( 71)        1 ( 0.0%, 99.9%):  zx_eventpair_create
( 72)        1 ( 0.0%,100.0%):  zx_fifo_create
( 73)        1 ( 0.0%,100.0%):  zx_socket_create
#

@golden summit see above, got some SetThreadErrorMode but no CreateThread

#

That list is 2,000 crates

#

I'm going to rerun with --jobs=2 --test-threads=2 to see if things change

golden summit
#

i think SetThreadErrorMode can be a noop

haughty mica
#

I just wonder if that's blocking CreateThread?

#

Only 10 calls though

#

Oh

#

It's from the crates libloading and dlopen

#

Never mind

golden summit
#

id expect most threads to go through std

#

how the fuck is it calling GetFileInformationByHandleEx

#

theres no handles to call it on

haughty mica
#

Here's the crates that do this:

arboard/2.1.1
arrayref/0.3.6
atty/0.2.14
authenticator/0.3.1
base32/0.4.0
bollard/0.12.0
bv/0.11.1
colored/2.0.0
env_logger/0.9.0
indicatif/0.17.0-rc.11
jsonpath_lib/0.3.0
jsonrpc-client-transports/18.0.0
maxminddb/0.23.0
mdbook/0.4.18
multipart/0.18.0
pretty/0.11.3
pretty-lint/0.1.1
radix_fmt/1.0.0
serde-xml-rs/0.5.1
solana-frozen-abi/1.10.25
structopt/0.3.26
trust-dns-resolver/0.21.2
yamux/0.10.1
#

env_logger seems like a good place to look

#

The call is in atty

#
     Running tests/init-twice-retains-filter.rs (target/miri/x86_64-pc-windows-msvc/debug/deps/init_twice_retains_filter-511f5a5722b421d8.exe)
thread 'main' panicked at 'unsupported Miri functionality: can't call foreign function: GetModuleFileNameW', /root/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sys/windows/os.rs:239:28
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
     Running tests/log-in-log.rs (target/miri/x86_64-pc-windows-msvc/debug/deps/log_in_log-aaa47fd10b166f41.exe)
thread 'main' panicked at 'unsupported Miri functionality: can't call foreign function: GetFileInformationByHandleEx', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/atty-0.2.14/src/lib.rs:132:15
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
     Running tests/log_tls_dtors.rs (target/miri/x86_64-pc-windows-msvc/debug/deps/log_tls_dtors-a50580c8219b890c.exe)
thread 'main' panicked at 'unsupported Miri functionality: can't call foreign function: GetFileInformationByHandleEx', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/atty-0.2.14/src/lib.rs:132:15
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
     Running tests/regexp_filter.rs (target/miri/x86_64-pc-windows-msvc/debug/deps/regexp_filter-165a09d07488d5d8.exe)
thread 'main' panicked at 'unsupported Miri functionality: can't call foreign function: GetModuleFileNameW', /root/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sys/windows/os.rs:239:28
golden summit
#

oh hmm

#

its probably calling it on stderr/stdout

haughty mica
#

That sounds about right

#

Rerunning everything, emergency grocery run, I'll be back later

haughty mica
#

Still no CreateThread, 381 crates in

#

OH

#

Wrong search

#
╰ ➤ rg unsupported | cut -d' ' -f2- | rev | cut -d' ' -f2- | rev | counts
632 counts
(  1)      626 (99.1%, 99.1%): 'main' panicked at 'unsupported Miri functionality: can't create threads on Windows',
(  2)        3 ( 0.5%, 99.5%): 'main' panicked at 'unsupported Miri functionality: can't call foreign function: GetModuleFileNameW',
(  3)        2 ( 0.3%, 99.8%): 'main' panicked at 'unsupported Miri functionality: can't call foreign function: GetFileInformationByHandleEx',
(  4)        1 ( 0.2%,100.0%): 'main' panicked at 'unsupported Miri functionality: can't call foreign function: GetFullPathNameW',```
#

There's your answer

golden summit
#

oh yeah thats right

#

theres a special error for it

haughty mica
#

"cache everything, in num_cpus build directories" turns out to be exceedingly bad

#

Ran myself out of disk space in like 15 minutes

west violet
#

Because I think we should explore it for finding ub in the ecosystem, it's especially nice since it can potentially catch the end consumers of things that aren't on github, the actual binaries being used by people

tough leaf
west violet
#

That's doing a pretty big disservice

tough leaf
#

and by nicer i mean propritary

west violet
#

CodeQL is a program database, basically

#

And you can write queries against that program database to find pretty much anything

#

It's also not really proprietary, it's oss

tough leaf
west violet
#

It's MIT?

tough leaf
#

is it?

west violet
#

Yah

tough leaf
#

so i can run it on closed source code?

#

despite their ToS saying you can't

west violet
#

I mean, it's MIT

#

I don't see any tos on the repo

tender nimbus
#

i guess you can't use their hosted database thing for everything

#

but if you do it all yourself, it does look like its fine

tough leaf
#

You can create CodeQL databases yourself for any project that's under an OSI-approved open source license.

#

hmmm

tender nimbus
tough leaf
#

the thick plottens...

west violet
#

That's besides the point though

tough leaf
#

The CodeQL CLI (including the CodeQL engine) is hosted in a different repository and is licensed separately

#

yep
open source frontend talking to a closed source core

#

well, not open source, idk if it's closed

west violet
#

Weird

#

REGARDLESS

#

It could be really useful in the search for ub

tender nimbus
#

if it supported rust ferrisClueless

west violet
#

Yah, that's the sucky part

tender nimbus
#

well, they actually use some rust ferrisBut
but don't support querying it

west violet
#

But like, it's got dataflow analysis and stuff, that could be amazingly useful

haughty mica
tough leaf
#

in any case
it would be good if clippy had the power to do stuff like this

west violet
#

"Woah there cowboy, this branch is 100% reachable but you put an unreachable_unchecked there!"

tender nimbus
haughty mica
#

Yeah but... so could clippy

tender nimbus
#

yes

haughty mica
#

A lot of the static patterns I'm aware of are incredibly blatant, everything else is subtle

#

Put it this way: I'll be convinced this is worthwhile when I stop finding things with Miri

#

Dynamic analysis is great because you can do it without false positives. Static analysis has the opposite problem, it can both miss things and have false positives. The singular advantage is that you don't need test cases.

tender nimbus
#

miri is obviously better
but codeql wouldn't need extensive test coverage and doesn't care about miri being slow
but yeah, as long as codeql is not a thing with rust, it's not getting anywhere anyways

haughty mica
#

There's some research group doing static analysis looking for security issues in the ecosystem

#

I mostly am aware of them because they submitted a broken clippy lint, but I have seen them do correct things elsewhere

tender nimbus
#

ferrisBut what did the lint do

haughty mica
#

It was something with transmute

#

It flagged a lot of valid code as UB

tough leaf
#

wait what

#

this is literally UB isn't it

#

well

#

depending on unspecified layout

west violet
#

Indeed

tender nimbus
#

what

#

why

#

did they "fix" it

tough leaf
west violet
#

Someone should try to break it with -Z randomize-layout

tough leaf
#

and the abi of MaybeUninit<u32> is Union

#

and the abi of u32 is "integer"

#

ABI's different, therefore lint

#

the current code checks the align of the abi only

#

uhhhh

#

maybe

#

yes

#

the old code also would have linted on a bool to u8 transmute

#

because those have different ABIs

#

specifically, the valid_range

tender nimbus
#

oh, yeah that is stupid

tough leaf
#

regardless

#

the layout of vec is unspecified

#

soooo

tough leaf
#

hmm

#

can you make a field's size depend on T

#

well

#

typeid is not const is it

#

wait shit it is

#

?play

#![feature(generic_const_exprs, core_intrinsics)]
struct FuckYouVec<T> where [(); (std::intrinsics::type_id::<T>() % 8) as usize]: {
  a: T,
  lol: [u8; (std::intrinsics::type_id::<T>() % 8) as usize],
}
small apexBOT
#
warning: the feature `generic_const_exprs` is incomplete and may not be safe to use and/or cause compiler crashes
 --> src/main.rs:1:12
  |
1 | #![feature(generic_const_exprs, core_intrinsics)]
  |            ^^^^^^^^^^^^^^^^^^^
  |
  = note: `#[warn(incomplete_features)]` on by default
  = note: see issue #76560 <https://github.com/rust-lang/rust/issues/76560> for more information

error: `std::intrinsics::type_id` is not yet stable as a const fn
 --> src/main.rs:3:34
  |
3 | struct FuckYouVec<T> where [(); (std::intrinsics::type_id::<T>() % 8) as usize]: {
  |                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = help: add `#![feature(const_type_id)]` to the crate attributes to enable

error[E0310]: the parameter type `T` may not live long enough
 --> src/main.rs:3:34
  |
3 | struct FuckYouVec<T> where [(); (std::intrinsics::type_id::<T>() % 8) as usize]: {
  |                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ...so that the type `T` will meet its required lifetime bounds
  |
help: consider adding an explicit lifetime bound...
  |
3 | struct FuckYouVec<T: 'static> where [(); (std::intrinsics::type_id::<T>() % 8) as usize]: {
  |                    +++++++++

error: `std::intrinsics::type_id` is not yet stable as a const fn
 --> src/main.rs:5:14
  |
5 |   lol: [u8; (std::intrinsics::type_id::<T>() % 8) as usize],
  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = help: add `#![feature(const_type_id)]` to the crate attributes to enable

error[E0310]: the parameter type `T` may not live long enough
 --> src/main.rs:5:14
  |
5 |   lol: [u8; (std::intrinsics::type_id::<T>() % 8) as usize],
  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ...so that the type `T` will meet its required lifetime bounds
  |
help: consider adding an explicit lifetime bound...
  |```Output too large. Playground link: <https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=5316b77b0f364001e128d17266e07eb8>
tough leaf
#

gah

west violet
#

Actually MaybeUninit is transparent so like, this may be ok?

tough leaf
#

uhhhhh

tender nimbus
#

according to the current rules, it's not ok
but, honestly, it should be

#

you should be allowed to transmute vecs

west violet
#

Vec is repr(Rust)

tough leaf
#

tbh i Kinda agree but also kinda not

#

vec being special like this would be kinda weird

tender nimbus
#

vec tells you that its layout is two usizes and a pointer, and if T is repr(transparent) then like... it will literally never ever break

#

on the other hand making weird exceptions is weird

haughty mica
#
test pool::tests::grow ... error: Undefined Behavior: dereferencing pointer failed: alloc53 has size 31, so pointer to 136 bytes starting at offset 330947 is out-of-bounds
   --> /home/ben/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/ptr/non_null.rs:381:18
    |
381 |         unsafe { &*self.as_ptr() }
    |                  ^^^^^^^^^^^^^^^ dereferencing pointer failed: alloc53 has size 31, so pointer to 136 bytes starting at offset 330947 is out-of-bounds
    |
    = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
    = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
#

Wow that uh sounds bad

tender nimbus
#

uhm, crate?

haughty mica
#

heapless

tough leaf
#

again

haughty mica
#

No I'm just actually running Miri on it

#

I punted on it because of problems like this before

tender nimbus
#
Undefined Behavior: type validation failed at .value: encountered uninitialized bytes, but expected initialized plain (non-pointer) bytes
  --> target/miri/x86_64-unknown-linux-gnu/debug/build/if_rust_version-679cd1ad5881e595/out/generated.rs:114:33
   |
10 |     let mut foo: u32 = unsafe { ::std::mem::MaybeUninit::uninit().assume_init() };
   |                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ type validation failed at .value: encountered uninitialized bytes, but expected initialized plain (non-pointer) bytes

wow, they used MaybeUninit, progress!

tough leaf
#

lmaooooo

haughty mica
#

Hey now

tough leaf
#

honestly if you're gonna do that i'd rather people just use mem::uninitialized

#

at least then i can panic

haughty mica
#

Isn't that code supposed to be like that though?

tough leaf
#

what, UB?

haughty mica
#

Yes

#

I think this is like a generated test to see if things compile/run

tender nimbus
#

it's if_rust_version

tough leaf
#
fn test_uninit() {
    unsafe fn init_bool(x: *mut bool) {
        std::ptr::write(x, true);
    }

    let b: bool = {
        if_rust_version! { >= 1.36 {
            let mut x = std::mem::MaybeUninit::<bool>::uninit();
            unsafe { init_bool(x.as_mut_ptr()); }
            unsafe { x.assume_init() }
        } else {
            let mut x : bool = unsafe { std::mem::uninitialized() };
            unsafe { init_bool(&mut x as *mut bool) }
            x
        }}
    };
    assert_eq!(b, true);
}
#

they have this test

#

which is correct (if your rust version is higher than 1.36)

#

and i think literally panics if it's not

#

?eval unsafe { std::mem::uninitialized::<bool>() }

small apexBOT
#
thread 'main' panicked at 'attempted to leave type `bool` uninitialized, which is invalid', src/main.rs:2:10
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
tough leaf
#

wait

#

if you're lower than 1.36

#

you wouldn't have that panic

#

lmao

tender nimbus
#

it's just a doc example

#
# use if_rust_version::if_rust_version;
if_rust_version!{ >= 1.36 { use std::mem::MaybeUninit; }}
// ...
if_rust_version!{ < 1.36 {
    let mut foo: u32 = unsafe { ::std::mem::uninitialized() };
} else {
    let mut foo: u32 = unsafe { ::std::mem::MaybeUninit::uninit().assume_init() };
}}

but an, uhm, bad one

tough leaf
#

brb having UB in your tests make them pass even though your code doesn't work

tender nimbus
#

this is a terrible doc example honestly

tough leaf
#

yes

#

honestly

#

if you use mem::uninitialized your code is sus

tender nimbus
#

but i cant be bothered to pr it

proper belfry
#

like, “if your Rust version is 1.36, cause UB”

#

“but if your Rust version is above 1.36, we can take advantage of those new features to…also cause UB!”

thorny venture
#

There should be a crate that offers maybeuninit APIs that can toggle it's underlying data

#

either it's backed on std::mem::maybeuninit, or it just uses UB

#

Also, if it has the slice/array APIs I'd use it

tough leaf
#

the ability to be UB on older versions

thorny venture
#

Yeah specifically the msrv support based on a build script to auto enable features

tender nimbus
#

ok but
who cares about pre-MU versions at this point

tough leaf
#

rt

thorny venture
#

There's always someone

tender nimbus
#

that shit's 3 years old and almost every other crates has higher MSRV

thorny venture
#

Generic array still exists

tough leaf
#

debian oldoldstable is apparently 1.41

#

that's beyond MaybeUninit

tender nimbus
tough leaf
#

if fuckin debian stable has it

#

so does everyone else

thorny venture
#

I still haven't actually managed to port rust crypto to const generics rip

tough leaf
tender nimbus
#

I've heard they're getting ready for rust 1.0 soon, I'm excited! though it's a bit annoying, their latest upgrade forced me to use Box instead of ~, that was a huge change

thorny venture
#

Wow rust really is verbose

tender nimbus
#

to be honest the ~ worked perfectly fine, i don't get why they've even upgraded

#

like, i was programming just fine

thorny venture
#

Should just be heap noalias T

tender nimbus
#

but then - boom- rust upgrade and everything was broken

tender nimbus
#
|r| &**r as *const _ as *mut Context

oh come on

#

please at least read the nomicon
even the nomicon says that this is bad

#
thread_local!(
    /// each thread has it's own generator context stack
    static ROOT_CONTEXT: Box<Context> = {
        let mut root = Box::new(Context::new());
        let p = &mut *root as *mut _;
        root.parent = p; // init top to current
        root
    }
);

ugh

#

yeah that's just cursed, not gonna touch that

#

(generator)

#
Undefined Behavior: type validation failed at .value[0]: encountered uninitialized bytes
   --> /root/.cargo/registry/src/github.com-1ecc6299db9ec823/solana-sdk-1.10.24/src/packet.rs:101:28
    |
101 |             data: unsafe { std::mem::MaybeUninit::uninit().assume_init() },
    |                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ type validation failed at .value[0]: encountered uninitialized bytes

lmao solana

#
 unsafe { mem::transmute::<&SpannedExpr<Id>, &SpannedExpr<Id>>(&*self.expr) }

the fuck???

#

uhhhhhh, the borrow checker is not happy removing some of those transmutes

#

it's a fucking abstract syntax tree what are they doing ferrisSob

haughty mica
#

Where?

tender nimbus
#

gluon

haughty mica
#

Nice

thorny venture
tender nimbus
#

although the type says something about Owned* so it's probably fine perhaps maybe idk didn't investigate closer

pastel lily
#

Somewhere before 0.11

haughty mica
#

dmsort released with all my fixes

haughty mica
#

Yoooooo tract passes a null pointer to memcpy with size 0

#

That's an old C bug

tender nimbus
pastel lily
#

actually what does C say about that

tough leaf
#

uhhhhh

#

even if the size is 0 the pointers still need to be valid according to Someone Online

pastel lily
#

lol

tender nimbus
#

Yeah that's ub ferrisballSweat

west violet
#

Yah, zero is always an invalid pointer

tough leaf
#

except for free and realloc

#

:)

#

free(null) is allowed

west violet
#

Thanks C™️

haughty mica
#

Try to fix UB in a crate

#

Add a SIGSEGV

#

Try to run tests with ASan

#

Find different UB

tender nimbus
west violet
#

Wow Saethlin, dbsp absolutely thrashes miri

haughty mica
#

Which part though

haughty mica
#

To be clear, "this test causes Miri to timeout/consume all the memory" is not interesting in itself.
If it's doing this for a new reason, that's interesting

west violet
#

Ah gotcha

haughty mica
#

Ok new problem

#
Analysing: /home/ben/miri-tools

└── 100.00% [164.13 GB] ── miri-tools
   ├── 85.00% [139.51 GB] ── shared-target
   ├── 7.47% [12.26 GB] ── logs
   ├── 6.74% [11.07 GB] ── cache
   └── 0.76% [1.25 GB] ── target
#

Tried just doing everything in a shared target dir, you know, for caching and speed

#

While it is definitely more efficient, this disk usage is intolerable

haughty mica
#

Oh dear the codebase also dies under MSan, before it dies under ASan

hybrid forge
thorny venture
#

I don't know if there is a storage rfc just yet

thorny venture
#

Some people extend it further to generic dyn storage so you can have unsized stuff on the stack but it has align issues

hybrid forge
#

I see. I wonder if that kind of trait would be generic enough so that bitvec for example could be rewritten with it

thorny venture
#

Unlikely. Bitvec is very specific

west violet
#

I don't see why bitvec would care tbh

#

It has memory, it's happy

thorny venture
#

I think they mean a bit storage trait that auto impls bitvec with regular vec

#

You definitely can use storage with your own bitvec wrapper, but you need to implement the bit twiddling in the vec and not the storage

west violet
#

You mean like std::vector<bool>?

thorny venture
#

Ja

west violet
#

Please no

thorny venture
#

Well, that's why you have bitvec and not Vec<bool>

west violet
#

Oh yah, totally agree

#

I mean, you can probably implement vector<bool> given specialization, but the allocator is fairly unrelated to that

tough leaf
#

vector<bool> feels like a good idea in a language that is higher level than C++ or rust is

west violet
#

It doesn't matter how you get your memory, just how you interact with it

tough leaf
#

like, a pure FP language can do it just fine if you have no concept of references

pastel lily
#

yeah

haughty mica
#

So I'm running ASan just plain old ASan and overriding with -Copt-level=0 and my results are ~50% coroutine libraries that ASan doesn't understand and ~50% merkle tree implementations that recurse and blow the stack

tender nimbus
#

the fuck 😵‍💫

#

this is like UB on every level

pastel lily
#

what

#

this ever worked?!?!

#

like yea it's UB but I would have assumed this also just... wouldn't produce the expected output

stable arrow
#

if you never mutate the vec the capacity is almost never read … ?

pastel lily
#

ah

#

I guess

tender nimbus
#

Don't forget about the fact that this is extremely UB (hopefully with a crash?) when you init this with more than 0 elements ferrisForgor

#

also was Vec::new not const when this was written?

haughty mica
#

There's another crate like this which was written because the author learned about transmute before once_cell

haughty mica
#

I just found an off-by-one error that absolutely takes the cake

#

Fortunately the crate isn't too big, only about a million downloads

west phoenix
#

Ooooh

haughty mica
#

https://github.com/gtk-rs/gtk-rs-core/pull/720/files

diff --git a/glib/src/gstring.rs b/glib/src/gstring.rs
index 9890a05fe50..241f0283f38 100644
--- a/glib/src/gstring.rs
+++ b/glib/src/gstring.rs
@@ -700,7 +700,7 @@ impl From<&str> for GString {
         unsafe {
             // No check for valid UTF-8 here
             let copy = ffi::g_malloc(s.len() + 1) as *mut c_char;
-            ptr::copy_nonoverlapping(s.as_ptr() as *const c_char, copy, s.len() + 1);
+            ptr::copy_nonoverlapping(s.as_ptr() as *const c_char, copy, s.len());
             ptr::write(copy.add(s.len()), 0);
 
             GString(Inner::Foreign {
#

I have 29 reports, and I think 17 of them are that bug

#

facepalm Mozilla why

pastel lily
#

what

pastel lily
#

how

#

what

#

I know we all do off by 1

#

but this one is... something

stable arrow
#

that reads to me as someone thinking "and copy the existing \0" (except that there is no such thing, for &str)

#

I notice they're also not checking for nulls inside the string, but that's a semantic problem rather than memory unsafety

pastel lily
#

ohhh

#

that could make sense for someone coming from C

#

does strlen include the \0 normally (I don't think so?)

tough leaf
#

no it does not

haughty mica
#

If the + 1 copies the null, why does the code then write one

stable arrow
#

I don't mean to claim it's self-consistent code, just that I can imagine how someone might write a + 1 out of habit

haughty mica
#

Haha okay

tender nimbus
#

what null terminators do to sane programmers

knotty oar
#

lol

tender nimbus
#

ah yes, "c programmers\0"

ruby jacinth
haughty mica
#
53:==4409==ERROR: AddressSanitizer: requested allocation size 0xffffffffffffffff (0x800 after adjustments for alignment, red zones etc.) exceeds maximum supported size of 0x10000000000 (thread T61 (tests::read_to_))

lmao

#

Of course it's actually from fallible-collections

#

That crate is so sus

pastel lily
tender nimbus
#
        let ptr = alloc(layout)?.as_ptr() as *mut T;
        unsafe {
            core::ptr::write(ptr, t);
#

oh nvm that's legal

#

it's a NonNull

#

i was already scared

haughty mica
#

Yeah it's technically fine because the allocator just fails, but it's extra amusing because the size ASan reports wraps

tender nimbus
#

test vec::tests::try_clone_oom ... error: Undefined Behavior: invalid metadata in wide pointer: slice is bigger than largest supported object lmao

pastel lily
#

oh no

haughty mica
#
// FIXME: EXPLAIN SAFETY
tender nimbus
#

(they from_raw_parts usize::MAX)

haughty mica
#

This is what I mean when I say safety comments are dubious

tender nimbus
#

oh

haughty mica
#

I'm just not sure anyone wrote some unsafe code, it was wrong, and they went to type out the Safety comment are were like "oh wait yeah I can't do this actually"

tender nimbus
#

ignoring that issue, at least the tests pass disable-sb ferrisClueless

#

default miri is ok as well

haughty mica
#

Btw some fraction of the invalid pointer offset Miri errors are buffer overflows

#

Miri won't report a buffer overflow because people do invalid offsets first

tender nimbus
#

i think everyone can imagine what raw pointer tagging miri thinks about this

haughty mica
#

Also the invalid offsets are definite UB in any case

#

Irrespective of aliasing model, because of GEPI

tough leaf
haughty mica
#
// TODO: document why this is ok

Narrator: it is not ok

tender nimbus
#

now i see why this is still a todo

#

it's simply impossible

haughty mica
#

Omg this is just a bad memmem implementation

#

Aaaaaaaa

#

Just use the good crate

tough leaf
haughty mica
#

I've actually heard C coders insist that SIMD buffer overflows don't count

tough leaf
#

well, you literally can't SIMDify strlen if you don't say "welll..... if we read it but don't use it, it's fine"

#

that being said i think it's usually written in asm

#

where stuff like that is fine

#

ish

haughty mica
#

Uhhhhh

#

Yeah

#

So about that

#

The one in glibc is not

tender nimbus
#

i mean, it's glibc, they are allowed to do these cursed things if they're very careful ferrisClueless

haughty mica
#

wtf is chewing up all my CPU
Oh I'm running an FFT library with ASan

#

At least it's not a solana crate, I'll just renice this instead of kiling it

thorny venture
#

Who's fft library?

#

I can't remember if I even published mine so it's ok 😅

haughty mica
#

ark-poly

#

Not to worry, you'll get a PR if one of your crates has a problem

tough leaf
#

this is what my github homepage is

#

you forking repos

haughty mica
#

I've exited the popular crates and there's a lot more UB down here

#

Same story every time

tough leaf
#

i'm having another pop at sentry's crates

#

for fun

#

i don't think they have too many problems directly it's just their deps

#

and those have been largely fixed to my knowledge

#

pop as in fuzzing

#

not UB

haughty mica
#

Yeah sentry is all about processing untrusted data isn't it

tough leaf
#

yes

#

yes it is

#

untrusted data from processes who just crashed

haughty mica
#

Do they run sentry on sentry?

tough leaf
#

presumably

haughty mica
#

Don't count on it

#

Customers specifically try to run our network sensors out-of-band, and we've occasionally told them that this is a bad idea because the sensor won't monitor itself

#

Btw --cpu-shares slaps

#

Computer is now very usable with all the cores pinned by various containers

tough leaf
#

oh look
they have a subtract with overflow in a parser

#

that doesn't count

#

i'll run in release to get more real issues

pastel lily
#

though that should be changed to use an explicit method or type

#

because it might not be the intent that it overflows

tough leaf
#

oh true

#

but "hey i found a debug assertion" is not as interesting as "hey i found a segfault"

pastel lily
#

yeah

tough leaf
#

which i did, once

in C++ code

pastel lily
#

but now you have a value of several billion that’s presumably not meant to be several billion

tough leaf
#

breakpad really living up to its name

#

sure

pastel lily
#

oh yea and you got called awesome in a blog post ferrisOwO

tough leaf
#

yes

haughty mica
#

I'm now asking questions like "How did this ever work"

#

Computers are wild

west violet
#

Too bad miri isn't more performant, fuzzing under miri could be amazing

#

Plus if you got really creative you could probably take advantage of miri knowing what's going on to give yourself SAGE-style path guided fuzzing

west violet
#

Still sad sage isn't public

tough leaf
#

you won't get fast

#

so you need to optimise for making the most of every execution you get

tender nimbus
haughty mica
#

I was misreading it, the buffer overflow is not nearly as large as I thought at first

#

But it does overflow by like 24 bytes

west violet
#

Wowza

tender nimbus
#

only a little overflow
we can take that

haughty mica
#

It's a SIMD loop and the loop bound is wrong

#

But I looked at the tests and this library really needs fuzzing

tough leaf
#

HI

#

IM HERE

#

WHAT'D YOU NEED

tender nimbus
#

one fuzzing, please

haughty mica
tough leaf
#

okay :3

haughty mica
#

Image conversion

tough leaf
#

oooh

#

they don't

#

cool

haughty mica
#

I have yet to find one

west violet
#

AWS ACE time ferrisVibe

tough leaf
#

i've not once found ACE through a fuzzer

#

in rust

west violet
#

That's fair, it's a little harder in rust

haughty mica
#

I'm mostly suspicious of how this handles odd sizes/strides. The color transformation is just basic bit twiddling

#

Oh wait they do have a security policy lmao

tough leaf
#

i don't care

#

:)

haughty mica
#

Yeah uh so this link takes me to a big page and I can't figure out what to do from here 😂

#

Okay going for the generic AWS one

tough leaf
#

You can not use any other library function (also in other threads) while the initialization is in progress. Failure to do so result in undefined behaviour

#

> on a safe function

#

static mut GLOBAL_STATE: GlobalState = GlobalState

#

there you go

#

that'll do it

west phoenix
#

I was literally about to post that lmao

haughty mica
#

Yeah I saw that

#

I was trying to figure out how to do that in a fuzzer and was like naw

tough leaf
#

once_cell in a static

#

well

#

once

#
static INIT_ONCE: std::sync::Once = std::sync::Once::new();
...

INIT_ONCE.call_once(|| {
    dcp::initialize();
});

#

the fuzzer itself isn't multi-threaded so this seems fine

haughty mica
#

Ah

tough leaf
#

also i think maybe libfuzzer has a init callback

#

since i'm assuming C++ code also wants to do setup code

haughty mica
#

I sent AWS a message. If they don't get back to me in 24 hours, PR goes up.

haughty mica
#
                        vertices: [[180, 200], [260, 200], [260, 150], [180, 150]].as_mut_ptr(),
#

I'm deep enough that I'm now seeing jetscii's users

#

😩

tender nimbus
pastel lily
#

who wouldn't be careful

tender nimbus
#

i would be very careful

haughty mica
#

Somewhere in this file is a write which is a stack buffer overflow

#

I will not be debugging this

#

Holy shit this other crate is mostly use after frees

#

Every other line is CString::new().as_ptr()

#

Ok I made it sound worse than it is, they only did the pattern 37 times

tender nimbus
#

why do people write unsafe code

tender nimbus
#

for example, 38 times

golden summit
#

or god forbid 39 times

knotty oar
#

oh damn can't imagine 42 times, that would be the way of life

tender nimbus
#

yoooo #960647106056581150 message @haughty mica ferrisFlushed

#

(on the eyre community discord)

tough leaf
tender nimbus
#

good

knotty oar
#

lol

thorny venture
#

when the behaviour is uwu
ferrisUwU

tender nimbus
#

uwundefined behaviour

haughty mica
#

Very cool

#

Also smallvec released a new version so they can be off the site soon

tender nimbus
thorny venture
#

I wonder if I can get small vec + storage to work ferrisBigBrain

#

(safely owo)

haughty mica
#

You mean tinyvec?

tame jewel
#

And then started to write an alternative anyways ferrisHmm

tough leaf
#

Lazy in the stdlib is nightly only 😭

#

i'll tell them to use once_cell or lazy_static then I guess

tame jewel
#

Indeed. Did it pass miri fuzzing yet? The whole unsafe business with pointers seems very nonlocal … and hardly motivated.
No bounds anywhere, all slices immediately cast to raw pointers on function entry. The simd modules do a lot of bounds-check logic themselves, instead of being shared and passing some sort of wrapper. Tbh, one too many complexities to seriously take a look at soundness.

tough leaf
#

i don't think it would run under miri because SIMD?

tame jewel
#

There's even a custom bswap(i32) in there

#

Jeez.

tough leaf
#

oh, it also uses cpuid

tame jewel
#

And it's unsafe.

tough leaf
#

and miri does not support cpuid

tame jewel
#

Yep, not quite yet, wasn't there work-in-progress? I don't quite know where it's at

tough leaf
#

anyways it's corporate code i don't work for free :)

#

except when i do (see: sentry)

tame jewel
#

Ah, portable simd is implemented but none of the arch specific versions

tame jewel
tough leaf
#

oh should i fix them too

#

:D

tame jewel
#

A little update on this: the issue of supporting SIMD in MIR-based Rust verification tools came up at the Rust Verification Workshop at ETAPS, and there is a lot of interest in finding a shared solution. One avenue we are considering is to have a pure Rust implementation of those intrinsics; tools consuming MIR or LLVM IR should be able to compile that implementation to the required IR and replace intrinsic calls with that IR.

_Originally posted by @RalfJung in https://github.com/rust-lang/miri/issues/662#issuecomment-817122692_

GitHub

constant evaluation error: tried to call a function with ABI RustIntrinsic using caller ABI PlatformIntrinsic --> …/nightly-2019-03-15-x86_64-apple-darwin/lib/rustlib/src/rust/src/libcore/.....

tough leaf
#

honestly i'm not that happy with the state of preallocation OOMs

#

and like

#

i don't know a sound way to fix it

tame jewel
tough leaf
#

haha

#

yeah idk

#

i wonder if just taking a generic allocator that you use for all allocations then implementing limits as an allocator with a cap would be good

#

you'd need falliable allocation then though

#

on all data structures you want to use

tame jewel
#

Also I was more referring to new stuff, color buffers and color management. Decoders and Encoders are a totally different deal.
Well, slowly. What recently came up: we really need a unified way to have core-io that's SemVer compatible with std-io..
So that crates can offer std-io as a feature with no SemVer incompatibility from adding the feature.
And we'd also really like having optional-Seek-readers.

tame jewel
#

There's many details with specifics of how the allocator is used, depending on data structure and you don't want that in the interface

#

Or all the interfaces to have all the details

tough leaf
#

wait hm?

#

not sure what you mean

tame jewel
#

Well, sometimes you allocate as an optimization

tough leaf
#

do you mean the Allocator trait is bad

tame jewel
#

It's suboptimal to implement that policy via an allocator, what I'm saying.

tough leaf
#

oh hm

#

ig

tame jewel
#

There's contextual information you'll want to have, or share with other parts, and those shouldn't all be owned by the allocator handle type, and writing an unsafe wrapper around the allocator (because the allocator trait is unsafe) is a rather unfortunate too

#

What about concurrent allocation? Do you then use atomics to track actual use or is a rough policy (up to a constant factor, or a constant overhead) good enough?
Performance implications of that? How to you even express 'allocate n bytes, but really if you can't then m bytes are enough, just please never block any code, and also I'd like to store the result there because some work-stealing task elsewhere is going to take that and finish the rest you couldn't allocate with worse performance afterwards instead`'.
Nah, it's omplicated enough to work that out in a dedicated unit of code without be constrainted by the allocator interface.

#

The trait is actually in a fine place. If anything, I'm question if we should not decouple stabilizing the new allocator trait/concept from stabilizing the type parameters on standard containers more. Because providing a mechanism of alignment-changing-realloc is so much more relevant than being able to customize all std-containers with arbitrary policies. If you don't like the policy choices that std can offer then you'll write your own containers and this is fine. Tbf, the fact that C++ containers are the most relevant comparison make me worried that this ends up copying too much without having the same reasons.
In the end, Linux and other projects will code their own containers and allocator handles anyways. (And be it just to interoperate with vtable-based C libraries which, btw, mostly work by taking three simple method pointers alloc,realloc,dealloc.) The most relevant question is then: which shared abstractions can we offer so that programming these is as safe and sound as possible. Anything else doesn't have the same priority in my eyes.

tender nimbus
#

@haughty mica do you plan on using permissive provenance in the next run once it becomes stable?

haughty mica
#

I could use it now

#

Answer is... maybe?

#

It's worth a try

tender nimbus
#

I think having such a run would be useful

haughty mica
#

Actually this would be a great way to find stuff

tender nimbus
#

Not displaying the trivially fine ptr2int "UB" while still catching interesting cases around pointer misuse

haughty mica
#

Well the thing is that previously, ptr2int was actually UB

#

SB with Untagged does not permit -Zmutable-noalias, which was on by default

tender nimbus
#

I meant int2ptr ferrisballSweat

haughty mica
#

ptr2int2ptr actually

#

Yeah

tender nimbus
#

The kind of thing no provenance model would dare to break

#

Well, no "real model" in the tower of weakenings

#

strict provenance very much breaks it ferrisOwO

tough leaf
#

the two examples are XOR linked lists and FFI

tender nimbus
#

I'm not talking about theoretical but actual code

#

Which could totally be rewritten, but doesn't have to

tough leaf
#

okay, what's actual code that doesn't follow it and can't

#

well

#

AtomicPtr needs manipulation ops

#

which it's getting

haughty mica
#

XOR linked lists are dumb and people should stop bringing them up

#

You can always invent some diabolical use case that breaks any reasonable model. The reality is that they are incredibly niche, and if you can't implement XOR linked lists in Rust that is a fine price to pay for a reasonable memory model.

tough leaf
#

agreed

haughty mica
#

FFI is a lot more interesting

tough leaf
#

you'd need a "this for FFI purposes is a (int, long, whatever) but for opt purposes is a pointer" ig

haughty mica
#

FFI is the boundary of the Rust AM so I really don't know if you can just fib and use your platform knowledge to keep everyone happy, though that's more a question for #932319394724479037

#

Ok got 4840 crates in the queue

haughty mica
#

int2ptr is still unsound because angelic nondet isn't checkable

#

I'm going to use the default mode, and probably come up with some other kind of categorization

haughty mica
#

lmao found some legit issue in crossbeam-epoch

#

I like this

haughty mica
#

I think a large fraction of crates are graduating to timeouts ugh

#

Maybe I should version the bucket

haughty mica
#

Ralf is forking concurrency stuff again

#

10/16 wildcard errors so far are string_cache

tender nimbus
west violet
#

It's a cool project so I'd really like to see it sound-ified tbh

haughty mica
#

Not a lot of tests

haughty mica
#

1,000 crates left to run in the previously-ub list

#

I'm rerunning all the previously UB crates with new Miri

tame jewel
#

rayon 1.5.3
UB: deallocating while item is protected: [SharedReadOnly for <412516> (call 112693)] (rayon-core-1.9.3), deallocating while item is protected: [SharedReadOnly for <4513936> (call 1177075)] (rayon-core-1.9.3), deallocating while item is protected: [SharedReadOnly for <716428> (call 189531)] (rayon-core-1.9.3)
Okay, what's that about ferrisSweat

tough leaf
#

is this the atomic bomb

#

in Arc

haughty mica
#

Could be

#

Oh wait no

#

Rayon does the same pattern as Arc, which I think was made valid recently

#

Updates like that are tricky to do

#

I have built in a way to rerun just crates with new versions but it's so hard to know if the validity of a crate has changed due to changes in the model

#

Miri's weak memory emulation uncovered a lot of data races that were previously undetected, and also I'm not running things with more that one thread so that will probably uncover more too

#

Just a few minutes left now ablobhungry

#

I think I'm misattributing a lot of UB to once_cell

#
histogram over kind of UB
uninitialized memory
  773 ##################################################
SB-use-outside-provenance
  719 ##############################################
null pointer dereference
  695 ############################################
SB-invalidation
  608 #######################################
type validation failed: encountered uninitialized raw pointer
  254 ################
&->&mut
  152 #########
misaligned pointer dereference
  144 #########
dereferencing pointer failed: [hex] is not a valid pointer
  132 ########
SB-uncategorized
  100 ######
invalid pointer offset
   74 ####
#

That's a lot of use outside provenance now

#

I almost wonder if that's incorrect because the wildcard diagnostics are incomplete

#

Of 68 crates which report wildcard errors, 32 are due to string_cache

haughty mica
#

Oh exposed tags need to still be valid

#

Oh dear me

#

R I P string_cache I will work on something else

haughty mica
#

L M A O

I just realized that because I never delete stuff from the bucket, build logs for old crates stick around

haughty mica
tame jewel
#

Huh. I can't reproduce it locally. Maybe it requires a very recent miri …

#

Or it's the --test-threads=2 argument

haughty mica
#

I can repro with just isolation off, on the latest nightly

tame jewel
#

It doesn't always happen at the exact same test for me.. Which one is it for single threaded tests?

#

That is, is it deterministic then? I really hope so

haughty mica
#

I would not count on it being deterministic

#

You need to disable isolation for num_cpus to work

#

Why did you set -Zmiri-backtrace=full?

tame jewel
haughty mica
#

Ah. It's supposed to be the analogue of RUST_BACKTRACE=full

#

It's meant to prune away frames which are just noise unless you're debugging the Rust runtime. If it's pruning away anything you care about that would be very important to know

haughty mica
tame jewel
#

Hm, yes, might be a quick and subtle ux win

#

Is there some way to find the line number corresponding to the SharedReadOnly tags?

#

I.e. get miri to print how those came to be on the stack in the first place

haughty mica
#

What report are you looking at?

tame jewel
#

The one from rayon. I'm wondering if we can find out which reference is still alive when the stack frame is deallocated

haughty mica
#

OH

#

Yeah

#

:/

#

I punted on implementing something for protector errors, but if you're rerunning locally you can do a MIRIFLAGS=-Zmiri-track-pointer-tag=9230543 but you'll end up with an FnEntry retag so things might look a bit odd

tame jewel
#
   --> rayon-core/src/latch.rs:372:5
    |
372 | /     fn set(&self) {
373 | |         L::set(self);
374 | |     }
    | |_____^ created tag 2782933
    |
    = note: inside `<&latch::LockLatch as latch::Latch>::set` at rayon-core/src/latch.rs:372:5
note: inside `<job::StackJob<&latch::LockLatch, [closure@rayon-core/src/registry.rs:465:17: 469:18], ((), ())> as job::Job>::execute` at rayon-core/src/job.rs:114:9
   --> rayon-core/src/job.rs:114:9
    |
114 |         this.latch.set();
    |         ^^^^^^^^^^^^^^^^

Is there maybe a worker thread alive, still waiting before on a mutex? (in rayon-core/src/latch.rs:LockLatch::lock it tries to lock a mutex).

haughty mica
#

I'm hesitant to answer that confidently

#

I suspect this is the same problem as Arc had

#

Strange... I feel like this whole things should end up being retagged SRW because it's all atomics

west violet
#

@haughty mica since there’s no perf degradation in my miri pr is it good?

haughty mica
#

No (I'd rather discuss this elsewhere, because this is not about fixing UB)

tame jewel
haughty mica
#

So this needs either a fix in rayon-core or the full fix Ralf suggested

tame jewel
#

Hm okay, finally understood the relation. Makes sense. What if Drop of the lock guard would be guaranteed to occur like a tail-call, through the whole call chain?
Then the order would be first removing the stack protectors or parents, then actually making the atomic subtract visible. Well, seems hard to solve in a decently safe manner, without turning every parameter into *const _..

haughty mica
#

Yes that is exactly it

tame jewel
#

But it's still confusing why it would only show up for that test, not for all other tests.

haughty mica
#

If you comment that test out, do the others after it all pass?

tame jewel
#

Many other's do, yes. join_context_second leaks memory but works

#

There's a large chunk of tests that show as already passed in your site too.
What's the condition for another thread having taken a task, and why is it so consistent in that case

haughty mica
#

I will offer up that in the Arc example linked above, one needed hundreds of iterations to get the timing that actually hits the bug.

I wonder if setting -Zmiri-preemption-rate=1 or some other very high value makes these more likely to hit

#

Or I wonder if there's some other way to make this more likely

tough leaf
#

simd-json's tests get SIGILLed when you build-std

#

hmmmm

#
unsafe {
    input_buffer
        .as_mut_slice()
        .get_unchecked_mut(..len)
        .clone_from_slice(input);
    *(input_buffer.get_unchecked_mut(len)) = 0;
    input_buffer.set_len(len);
};

hmmmm

#

yes this is a get_unchecked_mut out of bounds

knotty oar
#

lol

tough leaf
#

now, what's the least bad way to do this....

#

there's MaybeUninit::write_slice but that's unstable

#

pointer methods it is

tender nimbus
#

a little bit of ptr::copy_nonoverlapping maybe

tough leaf
#

yep

tough leaf
#

oh no this seems like the kind of program that Really depends on indexing out of bounds

#

and their CI seems fucked rn

#

fu

#

n

tender nimbus
#

simd-json is very cursed

#

but the maintainer is open to fixes

tough leaf
#

oh i know
there's just

a lot to fix

tender nimbus
#

A Whole Lot

#

And Miri can't help ferrisballSweat

tough leaf
#

might add a compiler option for verbose errors on failure of a unsafe precondition

#

idk

#

so it just panics

#

with a nice error message

tender nimbus
#

Would be good

haughty mica
#

Miri could help

#

It's just simd 4head

tough leaf
#

that and no one's going to be running a webserver in prod under miri, but you might run at least some under debug assertions

tender nimbus
#

yeah one would run a webserver in Miri because it can't even work ferrisSob

tough leaf
haughty mica
#

None

tough leaf
#

the correct answer is none, yes

#

this feels like a thing the stdlib could do better at

haughty mica
#

Please complain to libs

#

I'd be happy to improve the debug assertion output but I would want a way to pick which version is in use

#

I have no idea what that would be

tough leaf
#

well, we have #[rustc_inherit_overflow_checks] don't we

#

i've got no idea how that works

haughty mica
#

No I mean the abort version is faster and I want to be able to select that

tough leaf
#

oh

haughty mica
#

But we should have a way to emit something helpful too

tough leaf
#

i mean right now it could be a feature flag on core

haughty mica
#

The UX is brutal but yes.

tough leaf
#

well, if you're building std it's not much worse

#

but yes

haughty mica
#

Just don't ask for too much help in the errors

#

You can have a stringify of the expression that's false and a panic but that's probably it, unless we do every one individually

tough leaf
#

the way i imagine it, it would be a SIGILL with no formatting under stock settings (or maybe under a "immediate_abort" setting, idk), and a debug_assert showing as much info as we have and that would be useful
so yes that would include string formatting imo

#

like, if you go to read from an unaligned address, we should probably mention the address

haughty mica
#

Argh so you want a case by case check

tough leaf
#

wdym case by case
every function needs hand-written debug assertions specifically for that function, no?

haughty mica
#

But not case by case formatting

tough leaf
#

isn't case by case formatting what debug_assert does

#

is this for perf reasons or maintenance reasons

haughty mica
#

Me being lazy reasons

tough leaf
#

ahhh

#

so the issue is you'd need 3 assertions there

#

instead of just the one

haughty mica
#

Yup

tough leaf
#

ah right

haughty mica
#

Really you need 5

#

And many pointer APIs need null + aligned

tough leaf
#

those assertions also don't check that src/dest + sizeof(T)*count doesn't overflow
but i Highly Highly Highly doubt any crate is doing that, seeing as that is going to segfault

#

but yeah

#

i think macros for common cases would be good
like

pub const unsafe fn copy_nonoverlapping<T>(src: *const T, dst: *mut T, count: usize) {
    extern "rust-intrinsic" {
        #[rustc_const_stable(feature = "const_intrinsic_copy", since = "1.63.0")]
        pub fn copy_nonoverlapping<T>(src: *const T, dst: *mut T, count: usize);
    }

    // SAFETY: the safety contract for `copy_nonoverlapping` must be
    // upheld by the caller.
    unsafe {
        debug_assert_aligned_non_null!(src, dest);
        debug_assert_nonoverlapping!(src, dest, count);

        copy_nonoverlapping(src, dst, count)
    }
}
#

maybe don't make debug_assert_aligned_non_null variadic that seems maybe confusing

#

but this is both easier to read imo, as well as would easily allow verbose formatting on failure

#

and then one-off debug assertions can either be moved to a new macro or have the definition of debug_assert changed to have a abort() depending on the feature flag

#

since we already have quite a few one-off debug assertions already for stuff like Option::unwrap_unchecked

#

those probably aren't as hot of a path as copy_nonoverlapping though

haughty mica
#

Lmaoooo my SB optimizations make more crates appear as UB because their tests don't time out anymore

pastel lily
#

lmao

tough leaf
#

stonks

#

memory sanitizer has no way to go "hey, tell me if any of the bytes in this value are uninit" :(
it doesn't even seem to care about noundef metadata??

west violet
#

Ok, I actually thought up another use for CodeQL with rust, one that could actually be super valuable

#

It could act as a replacement for crater in some circumstances

#

It can answer questions like "how many people call X", "how much does this pattern get used", etc.

#

Even stuff like "most ignored lint" and whatnot

knotty oar
#

sounds interesting

haughty mica
#

Wait doesn't Google do this with C++ internally

tender nimbus
#

Meta would not be able to do this since they have no idea where there code is apparently ferrisBut

west violet
#

Code databases aren't exactly new tech

haughty mica
#

Yeah I'm sure Facebook does not have such a tool

#

I know that Google has a good way to do queries and rewrites of theirs, based on semantic analysis

#

You can't get hung up on generics, for example

tough leaf
#

does llvm really just refuse to let you enable msan checks on noundef loads
is that not implemented, rather

#

im upsetti

tender nimbus
tough leaf
#

i did ask in the llvm discord

#

basically "heyyyyy is there a way we can do this"

#

because like

#

it would be Very Good

tender nimbus
#

implement it ferrisClueless

tough leaf
#

no, llvm is C++

#

if i did my MIR sanitizer

#

then that could do it

#

by knowing what bits of every value are init

#

and touching them

#

(well, branching on their value)

#

-fsanitize-address-param-retval