How unsafe is this? | Rust Programming Language Community | Page 1

safe wyvern Jun 11, 2025, 6:24 PM

#

I am writing a Rust extension for Python using PyO3 and I am planning to finalise my release for v1.0 which introduces the following async API.

// function signature of `future_into_py` as reference
// pub fn future_into_py<F, T>(py: Python, fut: F) -> PyResult<Bound<PyAny>>
// where
//    F: Future<Output = PyResult<T>> + Send + 'static,
//    T: for<'py> IntoPyObject<'py>

fn hash_async<'a>(&self, py: Python<'a>, bytes: &'a [u8]) -> PyResult<Bound<'a, PyAny>> {
    let seed = self.seed;
    let hasher = self.hasher;
    let bytes_static = unsafe { std::mem::transmute::<&'a [u8], &'static [u8]>(bytes) };

    future_into_py(py, async move { gxhash(hasher, bytes_static, seed) })
}

It uses std::mem::transmute to transmute a bytes array, owned by the Python interpreter, into a static lifetime so that it can be used in future_into_py which is essentially a wrapper for tokio::spawn. I understand that doing this is extremely dangerous but my rationale here is that bytes has the same lifetime as py, which is the Python interpreter. As a Rust extension for Python, should I even care what Tokio is doing with my bytes array if the Python interpreter no longer exists?

AFAIK, the only other way to make this work is to clone bytes which I am not willing to do because it is an order of magnitude slower than just passing the bytes array directly.

My question is:

What are the possible implications for my Python users, if at all?
Is there a better way to do this without affecting performance?

umbral estuary Jun 11, 2025, 6:31 PM

#

I'm not familiar with pyo3, but this is almost certainly a use-after-free bug

#

Python<'a> doesn't mean that the entire Python session extends over 'a, it means that the Python GIL is held for 'a.
https://docs.rs/pyo3/latest/pyo3/marker/struct.Python.html

#

if you want to perform the hashing asynchronously then you also have to incrementally read the bytes from the Python world, with the lock held.

#

actually it looks like you can use https://docs.rs/pyo3/latest/pyo3/types/struct.PyBytes.html

#

which says it is an immutable block of bytes, so if your async task holds it, it can borrow &[u8] from that
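
The ownership idea in that message can be sketched without PyO3. In the toy model below, `Arc<[u8]>` stands in for `Py<PyBytes>` (a reference-counted handle to immutable bytes), `std::thread::spawn` stands in for the spawned task (it has a similar `'static` requirement), and `toy_hash` is a hypothetical placeholder for `gxhash`. None of these names come from PyO3; this is only an analogy for why a task that holds the owner can borrow `&[u8]` soundly.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical placeholder for `gxhash` (64-bit FNV-1a), not the real algorithm.
fn toy_hash(bytes: &[u8], seed: u64) -> u64 {
    bytes.iter().fold(0xcbf2_9ce4_8422_2325 ^ seed, |state, &byte| {
        (state ^ u64::from(byte)).wrapping_mul(0x0100_0000_01b3)
    })
}

/// Hashes on another thread while that thread shares ownership of the buffer.
fn hash_on_thread(buffer: Arc<[u8]>, seed: u64) -> u64 {
    // `thread::spawn` requires `'static`. Moving the `Arc` into the closure
    // satisfies that bound because the task owns a handle, not a borrow.
    let handle = thread::spawn(move || {
        // Borrowed from the handle the task owns: no bytes are copied here.
        let slice: &[u8] = &buffer;
        toy_hash(slice, seed)
    });
    handle.join().expect("hashing thread panicked")
}
```

Cloning an `Arc` bumps a reference count rather than copying the bytes, which is roughly what moving a `Py<PyBytes>` into the task does.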

safe wyvern Jun 11, 2025, 6:35 PM

#

umbral estuary if you want to perform the hashing asynchronously then you also have to incremen...

Hmm.. According to my tests, the GIL doesn't seem to be held once future_into_py is called.

umbral estuary Jun 11, 2025, 6:35 PM

#

sorry, let me clarify: I meant you would take the GIL repeatedly to read chunks of bytes

#

however, PyBytes looks like a much better option than doing any of that

#

https://docs.rs/pyo3/latest/pyo3/struct.Py.html#method.as_bytes

#

so your code should be something like

fn hash_async<'a>(&self, py: Python<'a>, bytes: Py<PyBytes>) -> PyResult<Bound<'a, PyAny>> {
    let seed = self.seed;
    let hasher = self.hasher;

    future_into_py(py, async move {
        let slice = {
            let py = todo!("you need to get a new `Python` here somehow");
            bytes.as_bytes(py)
        };
        gxhash(hasher, slice, seed)
    })
}

I don't know how the part I marked todo! is supposed to work though

safe wyvern Jun 11, 2025, 6:43 PM

#

Yeah, I am testing it right now. Praying it doesn't clone 😄

safe wyvern Jun 11, 2025, 7:04 PM

#

as_bytes indeed does not copy.

pub(crate) fn as_bytes(self) -> &'a [u8] {
    unsafe {
        let buffer = ffi::PyBytes_AsString(self.as_ptr()) as *const u8;
        let length = ffi::PyBytes_Size(self.as_ptr()) as usize;
        debug_assert!(!buffer.is_null());
        std::slice::from_raw_parts(buffer, length)
    }
}

#

Anyways, it seems to work. Performance seems to have taken a 25% hit, but it's not as bad as cloning.

safe wyvern Jun 11, 2025, 7:26 PM

#

umbral estuary so your code should be something like ```rust fn hash_async<'a>(&self, py: Pytho...

Thank you, Kevin. I've attempted to solve this problem on and off for months.

fn hash_async<'a>(&self, py: Python<'a>, bytes: Py<PyBytes>) -> PyResult<Bound<'a, PyAny>> {
    let seed = self.seed;
    let hasher = self.hasher;

    future_into_py(py, async move { gxhash(hasher, Python::with_gil(|py| bytes.as_bytes(py)), seed) })
}

unborn cipher Jun 11, 2025, 9:13 PM

#

it's unclear why this function is async

#

depending on what gxhash is doing you'll probably be better off forgetting that async exists and using allow_threads instead

#

@safe wyvern take a look at https://pyo3.rs/v0.25.0/parallelism.html

safe wyvern Jun 11, 2025, 9:23 PM

#

unborn cipher it's unclear why this function is async

In Python, await is a necessary keyword (+ dropping the GIL) to not block the event loop. So yeah, this is 100% intended.

#

Rayon would be more useful for something like a hash_batch function.

unborn cipher Jun 11, 2025, 9:26 PM

#

safe wyvern In Python, `await` is a necessary keyword (+ dropping the GIL) to not block the ...

...which is useful because that means when you are unable to make progress at this time, you can let the event loop do something else. But what you do is hashing, which is a cpu bound operation that is always able to make progress

#

I'm thinking it would be more useful to let the event loop run while you are hashing at the same time

safe wyvern Jun 11, 2025, 9:27 PM

#

If you are hashing in a web server, do you want to let it block other unrelated requests from coming through?

unborn cipher Jun 11, 2025, 9:27 PM

#

Assuming you're actually using multiple threads

safe wyvern Jun 11, 2025, 9:27 PM

#

unborn cipher I'm thinking it would be more useful to let the event loop run *while* you are h...

That is exactly what I am doing here.

#

Python has no way of knowing that the event loop can continue without using async.

unborn cipher Jun 11, 2025, 9:28 PM

#

But you are holding the gil the entire time.

safe wyvern Jun 11, 2025, 9:28 PM

#

future_into_py drops the GIL.

unborn cipher Jun 11, 2025, 9:28 PM

#

Things aren't happening at the same time, they're just taking turns

safe wyvern Jun 11, 2025, 9:30 PM

#

unborn cipher Things aren't happening at the same time, they're just taking turns

If you do asyncio.gather() on two hash units of work, it will compute in parallel.

unborn cipher Jun 11, 2025, 9:31 PM

#

safe wyvern If you do `asyncio.gather()` on two hash units of work, it will compute in paral...

Where did you read this? The documentation disagrees

unborn cipher Jun 11, 2025, 9:32 PM

#

safe wyvern `future_into_py` drops the GIL.

...and reacquires it when it comes time to actually execute the future

#

Any time you have a Python token in scope, either explicitly or implicitly, that means no other python code gets to run
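
What "holding the lock while doing CPU-bound work" implies can be shown with a toy model. Here a plain `Mutex` named `GIL` stands in for the interpreter lock, `toy_hash` is a hypothetical placeholder for `gxhash`, and `hash_all` is an illustrative helper; none of this is PyO3 code. The sketch counts how many threads are hashing at the same moment: if every thread keeps the lock for the whole hash, that peak can never exceed one.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Toy stand-in for the GIL: a single process-wide lock.
static GIL: Mutex<()> = Mutex::new(());

/// Hypothetical placeholder for `gxhash` (64-bit FNV-1a), not the real algorithm.
fn toy_hash(bytes: &[u8], seed: u64) -> u64 {
    bytes.iter().fold(0xcbf2_9ce4_8422_2325 ^ seed, |state, &byte| {
        (state ^ u64::from(byte)).wrapping_mul(0x0100_0000_01b3)
    })
}

/// Hashes every input on its own thread and reports the peak number of
/// threads that were hashing at the same moment.
fn hash_all(inputs: &[Vec<u8>], hold_lock: bool) -> (Vec<u64>, usize) {
    let active = AtomicUsize::new(0);
    let peak = AtomicUsize::new(0);
    let hashes: Vec<u64> = thread::scope(|scope| {
        let workers: Vec<_> = inputs
            .iter()
            .map(|input| {
                let (active, peak) = (&active, &peak);
                scope.spawn(move || {
                    let guard = GIL.lock().unwrap();
                    // Either keep the lock for the whole hash, or give it up first.
                    let guard = if hold_lock { Some(guard) } else { drop(guard); None };
                    let now = active.fetch_add(1, Ordering::SeqCst) + 1;
                    peak.fetch_max(now, Ordering::SeqCst);
                    let hash = toy_hash(input, 0);
                    active.fetch_sub(1, Ordering::SeqCst);
                    drop(guard);
                    hash
                })
            })
            .collect();
        workers.into_iter().map(|worker| worker.join().unwrap()).collect()
    });
    (hashes, peak.load(Ordering::SeqCst))
}
```

With `hold_lock = true` the threads take turns; with `hold_lock = false` they are free to overlap, although whether they actually do depends on the scheduler and the core count.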

safe wyvern Jun 11, 2025, 9:37 PM

#

unborn cipher ...and reacquires it when it comes time to actually execute the future

The GIL is there to stop another thread from interpreting the Python bytecode. gxhash is written in Rust. The interpreter does not read any Python bytecode during that operation. So other Python operations are allowed to run in parallel.

#

I have a bunch of tests that show that it does not block the event loop while hashing a 10 GB file and I have compared it with a non-async variant as well.

#

https://github.com/winstxnhdw/gxhash/tree/main/py-gxhash

Feel free to try it out yourself.

unborn cipher Jun 11, 2025, 9:43 PM

#

are you not using pyo3_asyncio

safe wyvern Jun 11, 2025, 9:45 PM

#

unborn cipher are you not using pyo3_asyncio

That’s long been unmaintained

unborn cipher Jun 11, 2025, 9:47 PM

#

yeah that's good, that you aren't using it

safe wyvern Jun 11, 2025, 9:52 PM

#

unborn cipher Where did you read this? The documentation disagrees

I can’t find where I read it anymore but I still have an email with someone on the CPython core team about this.

unborn cipher Jun 11, 2025, 9:52 PM

#

safe wyvern The GIL is there to stop another thread from interpreting the Python bytecode. `...

Just so you know, this is not how it works in Python. This only happens if you explicitly release the gil

#

which the pyo3 async crate might end up doing? It's not entirely clear to me, I wasn't personally involved with it

safe wyvern Jun 11, 2025, 9:53 PM

#

I think PyO3 implicitly drops the GIL if you aren’t using with_gil.

unborn cipher Jun 11, 2025, 9:53 PM

#

safe wyvern I can’t find where I read it anymore but I still have an email with someone on t...

Right, but that's not what asyncio.gather does

#

gather works concurrently, not parallel

#

that's why I linked https://pyo3.rs/v0.25.0/parallelism.html

safe wyvern Jun 11, 2025, 9:55 PM

#

unborn cipher gather works concurrently, not parallel

Yes, I am aware but in this case, it will work in parallel because hash_async spawns a Rust thread.

#

As you know, gather essentially just runs the coroutines and waits for a callback.

#

It doesn’t know what’s happening in the coroutine, especially if it’s calling non-Python code

unborn cipher Jun 11, 2025, 9:56 PM

#

safe wyvern I think PyO3 implicitly drops the GIL if you aren’t using `with_gil`.

it does not, you can have nested with_gil calls. if you drop an inner call it's still possible the gil is held
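
The nesting argument can be illustrated with a small re-entrancy model. The `DEPTH` counter, `with_lock` and `is_held` below are hypothetical names invented for this sketch, assuming the lock behaves like a per-thread nesting counter; it is not how PyO3 is implemented. The point is only that leaving an inner call does not imply the lock has been released.

```rust
use std::cell::Cell;

thread_local! {
    /// How many nested `with_lock` calls this thread is currently inside.
    static DEPTH: Cell<usize> = Cell::new(0);
}

/// Hypothetical re-entrant acquire: only the outermost call would really take
/// or release the lock; inner calls just adjust the counter.
fn with_lock<T>(body: impl FnOnce() -> T) -> T {
    DEPTH.with(|depth| depth.set(depth.get() + 1));
    let output = body();
    DEPTH.with(|depth| depth.set(depth.get() - 1));
    output
}

/// True while the current thread is inside at least one `with_lock` call.
fn is_held() -> bool {
    DEPTH.with(|depth| depth.get() > 0)
}
```

In this model, `is_held()` is still true after an inner `with_lock` returns, as long as an outer call is on the stack.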

safe wyvern Jun 11, 2025, 9:57 PM

#

unborn cipher it does not, you can have nested with_gil calls. if you drop an inner call it's ...

Yeah, I am not sure either. It seems to work fine on my single-worker web server. I’ll probably run PyStack on it to confirm this before releasing v1.0

unborn cipher Jun 11, 2025, 9:58 PM

#

Also the gil is always held when a pyfunction is called, you can put a python token in the signature

unborn cipher Jun 11, 2025, 10:05 PM

#

safe wyvern Yeah, I am not sure either. It seems to work fine on my single-worker web server...

are you actually using async for other things than just this hashing function?

safe wyvern Jun 11, 2025, 10:05 PM

#

unborn cipher are you actually using async for other things than just this hashing function?

Yeap

unborn cipher Jun 11, 2025, 10:12 PM

#

i think you would be better off having a regular function (in which you call py.allow_threads and do the hashing in that), and use it with asyncio.to_thread

safe wyvern Jun 11, 2025, 10:12 PM

#

unborn cipher i think you would be better off having a regular function (in which you call py....

Yeah, that was the initial API but I didn’t like the DX and most Python devs don’t understand that you can push a nogil function into a thread to get parallelism.

unborn cipher Jun 11, 2025, 10:13 PM

#

like

fn hash<'a>(&self, py: Python<'a>, bytes: &'a [u8]) -> ... {
    let seed = self.seed;
    let hasher = self.hasher;
    py.allow_threads(|| { hash(bytes) })
}
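
A rough model of that blocking shape, using only the standard library: the `GIL` mutex, `without_lock` (a hypothetical analogue of `py.allow_threads`) and `toy_hash` (a placeholder for `gxhash`) are all assumptions made for illustration. What the sketch tries to show is that a blocking function can borrow `bytes` normally, with no `transmute`, because it does not return until the hash is done.

```rust
use std::sync::{Mutex, MutexGuard};

/// Toy stand-in for the GIL: a single process-wide lock.
static GIL: Mutex<()> = Mutex::new(());

/// Hypothetical placeholder for `gxhash` (64-bit FNV-1a), not the real algorithm.
fn toy_hash(bytes: &[u8], seed: u64) -> u64 {
    bytes.iter().fold(0xcbf2_9ce4_8422_2325 ^ seed, |state, &byte| {
        (state ^ u64::from(byte)).wrapping_mul(0x0100_0000_01b3)
    })
}

/// Hypothetical analogue of `py.allow_threads`: release the lock, run `work`,
/// then take the lock back before returning.
fn without_lock<T>(
    guard: MutexGuard<'static, ()>,
    work: impl FnOnce() -> T,
) -> (MutexGuard<'static, ()>, T) {
    drop(guard);
    let output = work();
    (GIL.lock().unwrap(), output)
}

/// Blocking hash: `bytes` is an ordinary borrow, which is sound because this
/// function does not return until the hash has been computed.
fn hash_blocking(bytes: &[u8], seed: u64) -> u64 {
    let guard = GIL.lock().unwrap(); // models "the GIL is held on entry"
    let (_guard, hash) = without_lock(guard, || toy_hash(bytes, seed));
    hash
}
```

On the Python side, a function of this shape is what would be handed to `asyncio.to_thread`, as suggested above.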

#

Anyway, was fun diving into the async runtimes crate, I'm much less familiar with it

safe wyvern Jun 11, 2025, 10:14 PM

#

Yeap

#

I ran a benchmark on both of these implementations

#

and I had the same performance on both of them

safe wyvern Jun 11, 2025, 11:20 PM

#

unborn cipher that's why I linked https://pyo3.rs/v0.25.0/parallelism.html

I ran PyStack on the program and as you can see (thread 1477088), there is no contention for the Python thread. I also found a bug(?) in pyo3_async_runtimes. It seems that tokio is spawning more threads than necessary.

Anyways, I looked through the pyo3_async_runtimes codebase, and like you said, they never drop the GIL anywhere. My guess is that any threads spawned outside of Python will not contend with the GIL and Python will somehow know to let other bytecode run?

📎 message.txt

unborn cipher Jun 11, 2025, 11:27 PM

#

it looks like it ends up calling the spawn method on the runtime (like tokio), so it might get to run on a different thread

safe wyvern Jun 11, 2025, 11:29 PM

#

unborn cipher it looks like it ends up calling the spawn method on the runtime (like tokio), s...

Yeah, I always thought you still have to explicitly drop the GIL for it to actually work.

#

And as expected, running all 4 hash units in parallel with asyncio.gather saturates 4 CPU cores

unborn cipher Jun 11, 2025, 11:37 PM

#

I'd suggest offloading this to spawn_blocking maybe

safe wyvern Jun 11, 2025, 11:48 PM

#

unborn cipher I'd suggest offloading this to spawn_blocking maybe

That'd be nice but I'd have to reimplement all the annoying Python async context and cancellation stuff. I tried running 24 hash units (double my core count) in parallel to see if there would be thread starvation, but it seems tokio properly queues them without any extra intervention.
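
The queueing behaviour described there (more hash units than worker threads, with the surplus waiting its turn) can be modelled with a fixed set of scoped threads pulling indices from a shared counter. This is a minimal sketch and not how Tokio schedules tasks; `hash_with_pool` and `toy_hash` are hypothetical names, and `toy_hash` is only a placeholder for `gxhash`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Hypothetical placeholder for `gxhash` (64-bit FNV-1a), not the real algorithm.
fn toy_hash(bytes: &[u8], seed: u64) -> u64 {
    bytes.iter().fold(0xcbf2_9ce4_8422_2325 ^ seed, |state, &byte| {
        (state ^ u64::from(byte)).wrapping_mul(0x0100_0000_01b3)
    })
}

/// Hashes all inputs using at most `workers` threads. Each worker claims the
/// next unprocessed index, so extra inputs simply wait until a worker is free.
fn hash_with_pool(inputs: &[Vec<u8>], workers: usize) -> Vec<u64> {
    let next = AtomicUsize::new(0);
    let results = Mutex::new(vec![0u64; inputs.len()]);
    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| loop {
                // Claim one unit of work; stop once the queue is exhausted.
                let index = next.fetch_add(1, Ordering::Relaxed);
                match inputs.get(index) {
                    Some(input) => {
                        let hash = toy_hash(input, 0);
                        results.lock().unwrap()[index] = hash;
                    }
                    None => break,
                }
            });
        }
    });
    results.into_inner().unwrap()
}
```

Running 24 inputs through 4 workers produces the same hashes as a sequential loop; only the wall-clock time changes with the number of workers.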

unborn cipher Jun 11, 2025, 11:53 PM

#

you can do quite a bit of blocking things in tokio's task pool before things really degrade

safe wyvern Jun 12, 2025, 12:12 AM

#

I'll think about it

#

If I am going to rewrite and use spawn_blocking, I might as well use Rayon instead.
