#Segfault under load


charred root
#

We use Bun in production and have been fans, but we are getting segfaults under load and have no good way to trace them back to a root cause. We initially suspected the Bun S3 client, but after swapping that out for Amazon's Node S3 client, the issue persisted.

We currently do use the Bun SQL plus Postgres stack quite heavily, as well as a few child processes talking to each other. How should we proceed?

We tried taking samples using the inspect flag and weren't able to glean much from it.


Line | Meaning
--- | ---
panic: Segmentation fault at address 0x0 | Null pointer dereference inside native code (JIT or C/C++). This is inside Bun, not your TypeScript.
oh no: Bun has crashed. This indicates a bug in Bun, not your code. | Bun’s own crash handler agrees (and provides a report URL).
Crash report URL | https://bun.report/1.2.21/...
Main child exited with signal 'SIGILL' | Supervisor saw SIGILL (illegal instruction). You have two signals reported: Bun says SIGSEGV, Fly supervisor saw SIGILL. This mismatch can happen if the runtime’s crash handler re-raises a different signal or if JIT emitted a bad instruction; either way it’s still a runtime bug.
could not unmount /rootfs: EINVAL | Benign during teardown; not the cause.
reboot: Restarting system | The Fly.io micro-VM restarted after the crash. Expected if the app is set to auto-restart.
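
To see for yourself which signal the process actually dies with, independent of what the platform supervisor reports, one option is a thin wrapper that spawns the app and logs how it ended. This is only a minimal sketch: `describeExit` and `runAndReport` are hypothetical helper names, and it assumes the standard `node:child_process` API (which Bun also implements).

```typescript
import { spawn } from "node:child_process";

// Turn an exit code / signal pair into a single log line.
function describeExit(code: number | null, signal: string | null): string {
  if (signal !== null) return `killed by signal ${signal}`;
  return `exited with code ${code ?? "unknown"}`;
}

// Spawn a command, pass its stdio through, and resolve with a
// description of how it ended (hypothetical helper, not part of Bun).
function runAndReport(cmd: string, args: string[]): Promise<string> {
  return new Promise((resolve, reject) => {
    const child = spawn(cmd, args, { stdio: "inherit" });
    child.on("error", reject);
    child.on("exit", (code, signal) => resolve(describeExit(code, signal)));
  });
}
```

Something like `runAndReport("bun", ["<your app>"]).then(console.log)` would then print the exit signal next to the crash output, which makes a SIGSEGV versus SIGILL mismatch easier to pin down.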

indigo igloo
#

If you could paste the complete crash report URL and the logs that include the list of features in use, that would be helpful.

It’s hard to say what causes this without seeing more details

#

You could try canary, but I can’t think of any specific crashes in Bun.SQL. There was a bug we fixed in canary when reading time columns with the binary Postgres protocol, but I don’t think that could lead to a crash.

#

cc @placid pilot

indigo igloo
solemn orchid
#

hey, yes, that is the issue, and no, we are not using fetch on requests that return a 101. We use the WebSocket API, and fetch only for regular HTTP requests.

solemn orchid
#

i can reproduce the crash with a stress test against our development server. what can i do to get more information?
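
For anyone wanting to reproduce something similar, a stress test along these lines can be sketched in a few lines of TypeScript. This is not the script used in this thread; the function names and the target URL are hypothetical, and it assumes only the global `fetch`. A connection-level failure (status `null`) is the likely symptom when the server process crashes mid-run.

```typescript
type Outcome = { ok: boolean; status: number | null };

// Count successes, HTTP-level failures, and connection-level failures.
function summarize(outcomes: Outcome[]) {
  let ok = 0, httpErrors = 0, connectionErrors = 0;
  for (const o of outcomes) {
    if (o.status === null) connectionErrors++;
    else if (o.ok) ok++;
    else httpErrors++;
  }
  return { ok, httpErrors, connectionErrors };
}

// Issue one request; a thrown error means the connection itself failed.
async function hit(url: string): Promise<Outcome> {
  try {
    const res = await fetch(url);
    await res.arrayBuffer(); // drain the body so the connection is released
    return { ok: res.ok, status: res.status };
  } catch {
    return { ok: false, status: null };
  }
}

// Run `total` requests with at most `concurrency` in flight at once.
async function stress(url: string, total: number, concurrency: number) {
  const outcomes: Outcome[] = [];
  let next = 0;
  async function worker() {
    while (next < total) {
      next++;
      outcomes.push(await hit(url));
    }
  }
  await Promise.all(Array.from({ length: concurrency }, () => worker()));
  return summarize(outcomes);
}
```

A run might look like `stress("http://localhost:3000/", 10000, 200).then(console.log)`, with the URL, request count, and concurrency adjusted to whatever load triggers the crash.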

indigo igloo
#

Can you pick a random PR and then run

bunx bun-pr <number|branch|sha> --asan

and then run

bun-asan-latest <your app>

instead of

bun <your app>

and then hopefully it crashes with a big red message earlier on

#

For example

bunx bun-pr 22397 --asan