#Upgrade to v0.19.10 breaks my module


eager orchid
#

I have a working module, but after upgrading my engine to v0.19.10 the module is suddenly broken.
I can't find any related errors anywhere, so I'm hoping someone has an idea where this is coming from:

```
cleanup failed: "runc delete container": /usr/local/bin/runc did not terminate successfully: exit status 1: container does not exist
```
hidden hornet
eager orchid
#

It is in codebuild.

hidden hornet
#

looks to me like an issue related to a stateful engine somehow

eager orchid
#

*AWS codebuild

hidden hornet
#

do you persist the engine state directory somewhere by any chance?

#

does the module work on your local machine?

eager orchid
#

no nothing is persistent

hidden hornet
#

?

eager orchid
#

I need some time to test that. I did notice that it works when running on GitHub-hosted runners.

#

It is running locally

#

I did notice the codebuild runners seem to use a pretty old kernel version, so I'll try to get a newer one there and see what happens.

hidden hornet
#

let us know how it goes

eager orchid
#

Okay, this is where I landed.

  • Codebuild containers regardless of the image all report a 4.14 kernel
  • Apparently Dagger since 0.19.10 needs kernels > 5.1
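A quick way to check a runner for this before upgrading is to compare the running kernel against a minimum version. This is only a minimal sketch: the function names are made up for illustration, and the 5.2 threshold assumes the commonly documented kernel release in which `open_tree`/`move_mount` first appeared (which matches "> 5.1").

```python
from __future__ import annotations

import os
import re

# Assumption: open_tree/move_mount first shipped in Linux 5.2.
MIN_KERNEL = (5, 2)


def parse_kernel_release(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a `uname -r` style string."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_is_new_enough(release: str, minimum: tuple[int, int] = MIN_KERNEL) -> bool:
    """Return True if the release is at least `minimum` (major, minor)."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    # Inside a container this reports the host's kernel, not the image's.
    release = os.uname().release
    status = "ok" if kernel_is_new_enough(release) else "too old"
    print(f"kernel {release}: {status}")
```

Because containers share the host kernel, running this inside any build image reports the version of the underlying host.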

I believe this is because of the usage of open_tree/move_mount in PR #11545 (“fix invalid overlay mounts”).

This breaks backwards compatibility, but 4.14 is quite an old kernel, I believe 😅

GitHub

Been working on fixing #8265, which has been an issue since forever but seems to have become much easier to trigger recently (I'd highly suspect due to the boltdb fix making race conditions...

stiff rune
#

I will look at codebuild quick, it seems wild they would still be on 4.14

#

@eager orchid which of these runtimes are you using? https://docs.aws.amazon.com/codebuild/latest/userguide/available-runtimes.html

I think it must be on Amazon Linux 2, which is pinned to 4.14 apparently.

But if you change to any of the Amazon Linux 2023 versions or Ubuntu 7 then you will be on a kernel at 6.x and should be good

eager orchid
#

No, I had a chat with support today. The underlying hosts are 4.14.

#

That’s what I get if I do a uname -r

#

I was surprised too

stiff rune
# eager orchid No i had a chat with support today. The underlying hosts are 4.14

Right, if you use Amazon Linux 2 then it's 4.14, but it seems you can also change the image to Amazon Linux 2023 or Ubuntu 7, in which case the kernel will be newer.

I attached what Claude claims is the way to change that, but can't verify since I don't have Codebuild setup anywhere.

(Also, the reason I feel okay pushing back against AWS Support is because I used to work in AWS Support years ago 😄)

eager orchid
#

I tried that. The reason it doesn't help is that those are containers, so the kernel is that of the underlying host, which is 4.14.

#

I tried all the different codebuild container images 😅

#

AWS support confirmed the container hosts are running kernel 4.14 and they don’t have a public timeline on when they upgrade.

stiff rune
eager orchid
#

Yes indeed. And indeed they are very confusing 😀

#

I can try to create a PR that checks whether the new mount API is supported and, if not, uses the previous implementation. Or force it via an env var?
Would that be OK?
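For illustration only, the "probe, else fall back, with an env var override" idea could look roughly like the sketch below. It is not the engine's code (which is Go), and several things are assumptions: the syscall number 428 for `open_tree` (true on x86_64 and arm64 as far as I know, other architectures may differ), the env var name, and the helper names, all of which are hypothetical. Note also that a container seccomp profile can make the probe report `ENOSYS` or `EPERM` even on a new kernel, so this is a heuristic.

```python
import ctypes
import errno
import os

# Assumption: open_tree syscall number on x86_64 and arm64.
SYS_OPEN_TREE = 428

# Hypothetical variable name, not an actual Dagger setting.
FORCE_LEGACY_ENV = "EXAMPLE_FORCE_LEGACY_MOUNT"


def open_tree_available() -> bool:
    """Return False if the kernel answers ENOSYS for open_tree."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    ctypes.set_errno(0)
    # Deliberately invalid arguments: we only care whether the kernel
    # recognizes the syscall, not whether the call succeeds.
    result = libc.syscall(
        ctypes.c_long(SYS_OPEN_TREE),
        ctypes.c_int(-1),
        ctypes.c_char_p(None),
        ctypes.c_uint(0),
    )
    if result >= 0:
        os.close(result)  # unexpected success: release the descriptor
        return True
    return ctypes.get_errno() != errno.ENOSYS


def choose_mount_strategy(environ=os.environ, probe=open_tree_available) -> str:
    """Pick "new" or "legacy"; the env var always forces "legacy"."""
    if environ.get(FORCE_LEGACY_ENV, "").lower() in ("1", "true", "yes"):
        return "legacy"
    return "new" if probe() else "legacy"
```

Passing `environ` and `probe` as parameters keeps the decision logic testable without touching the real kernel.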

stiff rune
#

I'd be okay with that for this particular situation, but generally speaking we can't promise to be compatible with 4.14. Like I said, we already added open_tree usage to a different API a while back, so that still won't work if you ever need it. There were also recent performance fixes that relied on APIs in 5.x kernels.

#

Basically, happy to try to avoid this one particular problem you hit in v0.19.10, but generally speaking your experience using the engine is gonna be far from ideal in terms of functionality+performance

eager orchid
#

I am aware, but I suppose I won't be the only one running this on codebuild containers?
But I also think AWS should really update this

stiff rune
# eager orchid I am aware but I suppose I will not be the only running this on codebuild contai...

Well AL2 (which I'm guessing must be the underlying AMI of the host) is itself going to be EOL on June 30th, 2026. And even AL2 barely supports 4.14, you're supposed to use 5.10!

So yeah I really really hope Codebuild updates soon, for their own sake 😅

That makes me feel a bit better that we won't be stuck in this situation indefinitely though, so I'm happy to put a workaround for the issue you hit.

I appreciate the offer to implement it but I'm happy to do it quick, it'll be faster since I know the code.

eager orchid
#

That would be even better of course, since it would take me some time.