#Thread about `WithMountedX` and
1 messages Β· Page 1 of 1 (latest)
Yes itβs working as designed. Of course that doesnβt mean the design canβt be improved π
Yep, if you want to actually publish/export /tmp/foo you want WithDirectory, not WithMountedDirectory, so that it actually copies it into the image
gotcha, makes sense. I had the idea that Export didn't work for mounted dirs as well as volumes. I know for sure it doesn't work for cache volumes, wasn't aware it worked for mounted dirs
yup yup - it's used pretty often to make something available to the build process without leaving it in the published image. Though there are other ways of doing that too, e.g. just doing WithFile with the result of some separate build, and maybe that's clearer
Personally I consider export not working with cache volumes to be the bug. If you can get a Directory, the full Directory API should apply to it.
Is this the main difference between the two functions? This part still feels confusing in our docs and i'd love to clarify for the future.
(same for WithFile/WithMountedFile)
i think this is pretty much the difference - the other main difference is that they both can have completely (unintentionally) different caching behaviors:
https://github.com/dagger/dagger/issues/6421
I am more confused after reading that π
yep, that's nearly 100% of what I internalize between the two - one copies, the other mounts
agh ok ignore me then π tl;dr there are some weird bugs in the mounted versions that don't cache as expected in some scenarios
π
ok i have a suggestion around this π could we limit WithMountedDirectory to mount a read-only copy of the directory? (or just not allow propagation between steps, so it resets to the previous option)
- this then lines up exactly with how
--mount type=bindworks today in dockerfiles - the caching issue in https://github.com/dagger/dagger/issues/6421 entirely goes away (side note, i'm trying to follow up on what the upstream issue that blocks this is)
- there becomes a clear difference to document about why these are different -
WithMountedDirectoryessentially becomes the much more efficient option
to me, it's a bit weird that you can write to the mounted directory anyways - most of the use cases i can think of, you don't want to do this anyways
if you do need to, WithDirectory feels like the much clearer way to do it
maybe there's some context that i'm missing around why we want the mounting behavior, but at the moment, it really feels like there are more (really complicated) papercuts around the difference between mounting/non-mounting than there are benefits to having both at all
@lament badge Example footgun for you:
I had multiple assets to copy into a container:
// 'services/**/package.json',
// '.yarn',
// '.yarnrc.yml',
// 'package.json',
// 'config/**/*.json',
// 'babel',
// 'test-utils',
// 'npm-packages-offline-cache',
// 'yarn.lock',
My lazy way of doing so is something like this (code incomplete, but illustrates the point):
const dir = client.host().directory('.', {
include: [
'services/**/*.json',
'.yarn',
'.yarnrc.yml',
'package.json',
'npm-packages-offline-cache',
'config/**/*.json',
'yarn.lock',
// bla bla
],
});
await client.container().from('node').withEntryPoint(['/bin/bash', '-c']).withWorkDir('/root/foo').withMountedDirectory("/root/foo", dir).withExec('yarn install') // bla bla
- The node_modules are written into /root/foo
- Publish the image, no node modules!
- Crash an app in production.
I can fix this by preparing a pipeline that sets certain folders/files as being a mount. To be honest it's a bit tedious to have to do that, but it's better than having it fail in production.
My gut reaction is a β to match the behavior of buildkit.
I think that makes sense. I like it! Not sure about missing context here as well though.
Just double checking, is the current behavior not equivalent to --type=bind,rw=1?
no, data propagates through - if i write to a mounted directory, it'll continue to be there in the next WithExec step
this isn't the case with dockerfiles
I use WithMountedDirectory in cases where I either don't intend to publish an image and just don't want to worry about the perf penalty of a full copy with WithDirectory, and/or intentionally don't want the mounted contents in the published image[^1]. I wouldn't feel great about changing it to align with less flexible tools; the copy-on-write semantics and automatic propagation are pretty nice to work with in my experience. Read-only mounts usually just annoy me.
@cursive terrace Isn't that fixed by "just" using WithDirectory? Is the issue that if you discover WithMountedDirectory without knowing about WithDirectory you might not notice the difference? What if we renamed it to WithShallowCopiedDirectory or something? (strawman)
[^1]: caveat: arguably you should use that content in a separate pipeline and copy only the output of that pipeline into the target container, but that can sometimes require a bit of a refactor; it's nice to have options
Isn't that fixed by "just" using WithDirectory?
No because I don't want to bloat the published image with assets that I don't want to ship.
Is the issue that if you discover WithMountedDirectory without knowing about WithDirectory you might not notice the difference? What if we renamed it to WithShallowCopiedDirectory or something? (strawman)
I'm not following sorry, can you be more specific?
perf penalty of a full copy
Is there a perf penalty? I think this should be pretty efficient, right? It should use buildkit's merge op, which is already using overlayfsjust like a mount would(mounts are bind mounts, right)
Oh true we changed that a while back
did we make a docs ticket eventually? i can't manage to find one
Got it, nevermind, didn't notice the main issue was that it led you to believe that the things you installed would propagate to publishing. I'm confused as to what the actual fix here would be though, i.e. what you mean by matching the behavior of buildkit. Just failing earlier?
Yes, if it's equivalent to RUN --mount=type=bind then it should be read-only and fail. That would have allowed me to fix the problem sooner before it hits production.
When you use buildkit under the hood, it sets certain expectations for me as to how Dagger works. But when I come across footguns like this it just leaves me feeling frustrated, as it puts into question if the concepts of buildkit are actually transferable to Dagger or not.
If you do still want this feature, it should be an opt-in, not a default. Currently the setup is opaque and confusing.
FWIW, I didn't actually understand the difference between withMountedDirectory and withDirectory until recently. I've been using Dagger for 2 years...
Makes sense - what I was getting at with the "Is the issue that ..." question was that it is opt-in since it's the longer-named withMountedDirectory, so if folks are interpreting it as "the normal way to add directories to a container" that's an issue with the API naming and/or documentation. Sounds like that's the case.
I wouldn't really say we're trying to 100% accurately reflect Buildkit behavior; it's an implementation detail, our goal is to provide a more expressive API on top of it. There are pros and cons to this of course. One big pro of that is that we can implement optimizations like the one @lament badge mentioned where WithDirectory under the hood uses a MergeOp to efficiently copy instead of a llb.Copy. We were able to implement that optimization without changing the API.
What about this: can you make it so that Dagger defaults to readonly, and you optin to readwrite? Because it doesn't come across in that name withMountedDirectory that it is an opt-in to a read write mount.
Or alternatively, no defaults allowed! You must explicity decide what mount type you want e.g.
withMountedDirectory(.... , {
type: "readwrite"
})
That way you communicate what it's capable of.
What are the chances you would have just passed 'read-write' anyway though, knowing that you need to write to it π
Or do you think the act of passing that arg would have cued you to wonder whether it's the right thing to do?
The latter.
I don't have a strong opinion on how to address this, but to clarify priorities:
-
It's important for the API to be intuitive and consistent. If developers are confused that is bad and we should fix it.
-
It's not as important for familiarity with buildkit to translate perfectly to familiarity with Dagger (as @deft mural mentioned).
In your case you're caught in between the two, which can certainly be frustrating.
I just wanted to be transparent and set expectations.
It's important for the API to be intuitive and consistent. If developers are confused that is bad and we should fix it.
hence the reason for my thread here. The fact that Export and Publish works differently depending on which object you call it and what With* things you use is what brought my attention and seems like Ronan's frustrations. π¬
@cursive terrace how did we overcome this? I can't recall now π€
I changed the pipeline to explicity define certain folders/files as being mounts as opposed to the entirety of the folder.