#Hey @Erik Sipsma


rotund citrus
#

I guess I'm confused about exactly how you'd use fscopy.Copy in Directory.Chown. Directory.Chown makes a new overlay layer and then runs chown on all the files, which triggers an overlay copy-up and changes the owner. If you use fscopy.Copy you'd also be copying all the data to a new ref and resetting the metadata, which should, if anything, be a little slower (in theory) since overlay copy-up is in-kernel.

#

I believe you that it's faster for some reason but I'd really want to understand why filepath.WalkDir + os.Lchown is slower

#

might be easiest if you send out a PR w/ the fix so I can see exactly what the change is

rotund citrus
#

@night vapor you nerd-sniped me here. I'm using my eBPF PR to debug further and can confirm that, for some unknown reason, each of the chown syscalls (i.e. the time spent in the kernel to run a single chown) is averaging about 8ms, so 8ms*100k = 800s (using your repro)

#

8ms is an absolute eternity though

#

something very very strange is going on

#

I can show you in lounge if you want and are around

night vapor
#

yes please

night vapor
rotund citrus
#
time="2025-12-12T00:57:15Z" level=info msg=filetracer dur=1.8us op=LSM_SETATTR process=dagger-engine tgid=1520166
time="2025-12-12T00:57:15Z" level=info msg=filetracer dur=7.98ms op=OVL_COPY_UP process=dagger-engine tgid=1520166
time="2025-12-12T00:57:15Z" level=info msg=filetracer dur=1.3us op=LSM_SETATTR process=dagger-engine tgid=1520166
time="2025-12-12T00:57:15Z" level=info msg=filetracer dur=7.99ms op=OVL_SETATTR process=dagger-engine tgid=1520166
time="2025-12-12T00:57:15Z" level=info msg=filetracer dur=8.01ms gid=1000 op=CHOWN path=/tmp/buildkit-mount623003268/src/file12935.txt process=dagger-engine tgid=1520166 uid=1000

drilling deeper, it seems like the copy-up is extremely slow for some reason (I could understand if these were large files, but they are each 1 byte)...

#

Why is overlay fsyncing after every chown???

time="2025-12-12T01:05:20Z" level=info msg=filetracer dur=7.90ms op=VFS_FSYNC process=dagger-engine tgid=1532820

hyperthinkspin

#

I will laugh if we get another performance improvement by not syncing to disk as much

#

I guess that's just how overlay works unless you pass the volatile option... My manual test earlier with hand-created overlays wasn't correct because I was doing it under /tmp, which is tmpfs on my system, so fsync is free
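
As a rough sketch of what passing the volatile option looks like, the overlay mount data string just gains a `volatile` entry next to lowerdir/upperdir/workdir. `overlayOptions` is a hypothetical helper and the paths are placeholders; whether the engine's mounts can or should use this is not settled here, and a volatile overlay gives up crash-safety of the upper dir in exchange for skipping syncs.

```go
package main

import (
	"fmt"
	"strings"
)

// overlayOptions (hypothetical helper) builds the option string passed as
// mount data for an overlay mount. With volatile set, the kernel omits
// sync calls to the upper filesystem.
func overlayOptions(lower []string, upper, work string, volatile bool) string {
	opts := []string{
		"lowerdir=" + strings.Join(lower, ":"),
		"upperdir=" + upper,
		"workdir=" + work,
	}
	if volatile {
		opts = append(opts, "volatile")
	}
	return strings.Join(opts, ",")
}

func main() {
	// Placeholder paths, for illustration only.
	fmt.Println(overlayOptions([]string{"/lower"}, "/upper", "/work", true))
	// prints: lowerdir=/lower,upperdir=/upper,workdir=/work,volatile
}
```

When repeating the manual test, the upperdir and workdir should sit on a disk-backed filesystem rather than tmpfs, otherwise the fsync cost disappears either way.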

night vapor
#

My PR is not totally wrong, it just doesn't work with the named uids (still digging on that). And for the number-based uids, it seems to be fast-er

night vapor
#

It's driving me crazy too 🤣 Ok, timebox on my end has been entirely used, time to resume lazy git 😢