#How to copy embed.FS into container


subtle sandal
#

I need to copy an entire embedded directory into the container; is that possible in a single step?

hollow ridge
#

not in a single step, but it should be doable by walking over it:

var e embed.FS

dir := c.Directory()

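err := fs.WalkDir(e, ".", func(path string, entry fs.DirEntry, err error) error {
    // WalkDir reports traversal errors through err; entry may be nil then.
    if err != nil {
        return err
    }
    // Skip directories; only files are added.
    if entry.IsDir() {
        return nil
    }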

    content, err := e.ReadFile(path)
    if err != nil {
        return err
    }

    dir = dir.WithNewFile(path, dagger.DirectoryWithNewFileOpts{
        Contents: string(content),
    })

    return nil
})
subtle sandal
#

Thanks. Do you think the separate steps create more layers?

hollow ridge
#

yeah, each file will be its own layer

#

@daring mural (confirmed above) do you know of any way to collapse these into one layer with buildkit? 🤔

#

just tried to make 1000 files and it failed with ....withNewFile.withNewFile.withNewFile.withNewFile.withNewFile.entries mount options is too long

daring mural
#

We'd need to find a way to expose that in our API though

hollow ridge
#

ah that seems right, yeah

daring mural
#

Probably accepting a list of files/dirs to make rather than just one at a time

#

We need a better way to support this sort of thing in the long term though; Mkfile breaks once you reach 12MB or something like that (gRPC limit)

hollow ridge
#

good point

#

maybe it's better to think of it as an expensive thing so it's used sparingly

daring mural
#

I'm having thoughts about doing something horrible with memfds and then doing a localdir sync from /proc/self/fd/ but i highly doubt that's a good idea

hollow ridge
#

lol - yeah, if you have a localdir you can use Container.withCopiedDirectory instead (once we add that)

tough flint
#

We could also remove support for mkfile altogether, and require that you create a local directory or file first, then copy it up? let the native platform deal with the rest

#

it would probably spare us a bunch of edge cases and feature requests that are hard to get right (see this thread) but really aren’t super differentiated since every language standard library can already do it

#

the alternative is that we build a proper file streaming primitive on top of our socket/service primitive. That's inevitable IMO if we want to get it right, not be limited by file size etc. But is it really worth it?

hollow ridge
#

i really like having mkfile around, it's nice to avoid the filesystem for installing small mostly-static content (e.g. config files). extensions for example could probably make great use of that, whereas having to write to a real filesystem first would be kind of clunky (I hate dealing with tmpfiles)

tough flint
#

Don’t you think layers are on their way out as a meaningful lever for optimization though? With production images getting smaller, and alternative distribution transports being explored (i.e. rsync-style transfer at file granularity, bittorrent, ipfs...)?

daring mural
# tough flint Don’t you think layers are on their way out as a meaningful lever for optimizati...

Yes and no. There were attempts a few years ago at an oci image spec v2 that would incorporate all that stuff, but it got shot down pretty rapidly. Too much inertia from all the big players I think. All the cool new tech like stargz and dragonfly still use layers and oci registries at the end of the day even though they aren't really very useful for them.

I suspect that there will always be some notion of "a bundle of files", it's probably more that those bundles won't be forcibly arranged in a single linear order in the future. I can see a plausible path where "images" all become single-layers and you instead compose filesystems by efficiently merging those individual layers together (aka mergeop).

Either way, all these possibilities are years down the line. I don't think we need to rush into changing our API right now to expose the differences being discussed above, we can wait for compelling/urgent use cases to arise. I'm honestly fine with the current behavior; I think the cases where you could create thousands of files will mostly be solved by having include/exclude on local import

tough flint
#

Yeah I don't have a clear picture of precisely how removing layers in the backend could help - feels like it would be far away in any case. But I still am uneasy about locking layers into our API, because that closes the door preemptively to any experimentation in that direction.

If we are successful in getting a large chunk of the devops ecosystem targeting our API (directly or indirectly), this creates an opportunity to shift large chunks of the underlying state of the art, using our API as a facade / pivot point. So I'm sensitive to anything that might reduce the potential for doing that in the future.

#

For example in the context of "magical caching", maybe in 6 months we realize "oh sh%&^&$t we could do 10x more magic if we could just ignore the concept of layers"

#

Sorry, not very tangible arguments. For me it's more about creating and preserving opportunity space for the engineering geniuses to do their thing in the future

daring mural
# tough flint Yeah I don't have a clear picture of precisely how removing layers in the backen...

That's totally fair and I'm very excited by all those possibilities of abstracting over newer+better backends.

I would suspect that if we add something now to our API that allows for layer level optimization (i.e. an API that accepts multiple file ops in a single call), then if in the future we switch to a backend where layer optimizations don't matter any more, that just means we have a redundant API rather than a broken one, which is not the utter end of the world.

Either way, agree we don't need to rush into this immediately. It can be safely punted (and perhaps never tackled if the need never arises).

tough flint
#

"punted to death" 😄

reef pagoda
#

One year later 😅 Is this still the best way to mount an embed.FS? It works but it seems to break the dagger TUI. All rendering just stops whenever that part of the dag gets evaluated. It does complete though.

┣─╮
│ ▽ from debian:sid
│ █ [0.51s] resolve image config for docker.io/library/debian:sid
│ ┻
┻
• Cloud URL: https://dagger.cloud/runs/901de493-0621-418f-954d-9a2745d1120f
• Engine: e35a33edeb8d (version v0.8.7)
⧗ 2m29.3s ✔ 348 ∅ 39