#exit 4294967295

tall drum
still patio
#

how reliably does this happen for you? does it happen locally?

tall drum
#

Yeah it happens every time I try locally

still patio
#

well at least we have that!

tall drum
#

Haha yeah with enough printlns the truth will be revealed

still patio
#

correction, the *runc.ExitError is constructed by the caller from the value returned by Wait, but I think it's still just this one path...

tall drum
#

I'm so mad right now... I added a println above the uint32 conversion and now I can't get it to repro anymore........

tall drum
#

there could of course be other uint32 conversions somewhere
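
For anyone reading along later, here is a minimal sketch of why a uint32 conversion would produce the number in the thread title. It assumes the status reaching the conversion is -1, which is what Go's `ProcessState.ExitCode()` reports when the process has not exited or was killed by a signal. `toUint32Status` is a made-up helper for illustration, not the actual code path being debugged.

```go
package main

import "fmt"

// toUint32Status is a hypothetical stand-in for a lossy conversion of an
// int exit status to uint32.
func toUint32Status(status int) uint32 {
	return uint32(status)
}

func main() {
	// -1 wraps around to the largest uint32 value.
	fmt.Println(toUint32Status(-1)) // 4294967295
	// Ordinary exit codes pass through unchanged.
	fmt.Println(toUint32Status(1)) // 1
}
```

So an "exit 4294967295" is most likely a -1 in disguise rather than a real exit code from the process.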

tall drum
still patio
#
    // (If the command exits with a non-success status, or Cancel
    // returns an error that wraps os.ErrProcessDone, Wait and similar methods
    // continue to return the command's usual exit status.)
#

oh, they don't set Cancel anyway

tall drum
still patio
#

nice!

is this a SIGCHLD bubbling up in a weird way or something? is it possible for Wait() to be interrupted without the process actually exiting? sounds strange, but we're at the layer where we do strange things

#

time for chatgpt

tall drum
still patio
#

In Linux, the wait() system call is used to make a parent process wait for its child processes to complete. Under normal circumstances, wait() will only return when a child process exits, either by terminating normally or by being killed by a signal.

However, there are certain situations in which wait() may return without the child process actually exiting:

  1. Signal interruption: If a signal is delivered to the parent process while it is waiting, and the parent process has a signal handler installed for that signal, the wait() system call may be interrupted and return prematurely. In this case, the return value would be -1, and errno would be set to EINTR. You can handle this case by checking for EINTR and restarting the wait() call if necessary.
  2. Using waitpid(): If you are using the waitpid() function with the WNOHANG option, it will return immediately if no child process has exited. In this case, the return value will be 0, indicating that no child process has changed state. To properly handle this situation, you can use a loop to continuously call waitpid() until a valid return value is obtained.

In summary, while it is possible for wait() to return without a child process actually exiting, this typically occurs due to signal interruptions or specific options used with waitpid(). By checking the return value and errno, you can handle these cases and ensure your parent process waits for the child process to exit as intended.

#

seems like those situations would be handled for us 🤔

still patio
#

so that fixes a case where you exec() with pdeathsig in thread A, something else runs in that thread, and the thread exits while the rest of the program runs?

still patio
#

i don't remember the rules for when threads spawn/exit in Go, I want to say it could fluctuate with I/O though?

#

to keep it nonblocking

tall drum
#

Yeah the runtime will make new threads and throw old ones away as it needs, for i/o and some internals like gc (I think)

#

Which basically means that the number of threads that the runtime has to throw away is probably much higher when CNI is in use....

still patio
#

oo good find

tall drum
#

I'll give it a few more goes since it seemed to be somewhat random locally

#

But promising...

still patio
#

nice. brb grepping for Pdeathsig in all of my code

tall drum
tall drum
still patio
#

awesome!

still patio
#

and this explains why people could see it randomly before - it always could have happened, because our go-runc dependency didn't lock the OS thread. and then the bump just made it way more likely to happen because way more threads were exiting.
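
A rough sketch of the pattern being described, for the record. On Linux the parent-death signal is tied to the thread that forked the child, so if the Go runtime retires that thread the child gets the signal even though the program is still running. Pinning the goroutine with `runtime.LockOSThread` for the whole start-to-wait window avoids that. `runPinned` is a hypothetical helper written for this note, not the actual go-runc change, and it is Linux-only because of `Pdeathsig`.

```go
package main

import (
	"fmt"
	"os/exec"
	"runtime"
	"syscall"
)

// runPinned is a hypothetical helper: it starts the command with Pdeathsig
// set and keeps the spawning goroutine locked to its OS thread until the
// command has been waited on.
func runPinned(name string, args ...string) error {
	done := make(chan error, 1)
	go func() {
		// Keep this goroutine on one OS thread so the thread that forked
		// the child stays alive until Wait returns.
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()

		cmd := exec.Command(name, args...)
		cmd.SysProcAttr = &syscall.SysProcAttr{Pdeathsig: syscall.SIGKILL}
		if err := cmd.Start(); err != nil {
			done <- err
			return
		}
		done <- cmd.Wait()
	}()
	return <-done
}

func main() {
	// Assumes a `true` binary is on PATH.
	fmt.Println(runPinned("true"))
}
```

The important part is that Start and Wait happen on the same locked goroutine; unlocking before Wait would reopen the window where the thread can go away.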

ruby aspen
#

This was great to read! Will add my comment in the PR - Discord is ephemeral.

still patio
#

great job on this fix, I validated it with Bass over the weekend. 👍

(after going on a huge adventure to support pointing Bass to Buildkit forks, which involved adding support for Dockerfiles and a bunch of other stuff. I'm now in an extremely deep rabbit hole, but a lot of features are coming out, and the amount of Bash in my repo is decreasing, so whee)

tall drum