#I'll echo @jedevc 's surprise that that'
1 messages · Page 1 of 1 (latest)
It's certainly worth discussing to make sure we get it right. In retrospect I regret not double-checking that we were all on the same page on which context we were talking about.
It's certainly worth discussing to make sure we get it right. In retrospect I regret not double-checking that we were all on the same page on which context we were talking about.
Taking it from the very top, do we agree there are 3 4 possible choices on the table:
- Callee module context
- Immediate caller module context
- Top-level module context
- Client workdir?
My assumption was that we all wanted option 3: top-level module context
But now seems like a good time to revisit that assumption 🙂
(adding @weary knoll )
Let me test my interpretation of 1, since I think I think I can imagine a use case for it: imagine you're consuming an application like Bass in your module, and you just want to do something like dag.Bass().Binary() to mount its binary into a container. I would expect that to load up the Bass context dir for it to build its binary, and for it to not have anything to do with my own module's context dir
Does that track at all with this discussion, or is that a simpler case?
Right now, for a top level module we can either get:
- The context directory (.git pwd)
- The root dir (dagger.json pwd)
For dependency that use context dir, I would assume they can get the same context as their parent.
I also agree with you... meaning it's going to be different than the parent context
But how do we make a difference between them?
Depends on if we also have use cases for 2/3/4 I think - if it turns out 1 is the only practical use case for example, I guess defaultPath in each module is always just resolved relative to that module's context dir
My main concern with dependent modules gaining access to 2/3/4 is it seems like pretty tight coupling - these path specifications tend to be acutely aware of repo layout etc., and maybe there are legit use cases for that (generic tooling for common paths like .github), but it feels more brittle
I would suggest confirming whether my description of 1 is accurate, and then trying to come up with use cases for 2/3/4 to fully scope this thing out
So basically a slight variation on dag.CurrentModule().Source() right? (not scoped to the source dir, in case it's different from the module root)
right, critically not scoped to the source dir, this would be the full git repo (modulo ways to scope that down, monorepos etc, not sure what the state of the art is there right now)
Well in some cases, the callee module context is also the caller module context. So a lot of this depends on project layout
Specifically what happens if I call an external module in the context of my app repo
dagger call build-> all good, callee=callerdagger call -m github.com/sre-team/platform build-> oh shit which context do I use
ah right - the reusable toolchains case?
Not to mention the inverted control flow which we're discussing with Fastly, where the "platform" module would import the app module instead of the other way around
OK just to be clear, I am contemplating the possibility that I was wrong and that the callee context is actually fine
I'm just scared that we get it wrong and cross a horribly one-way door
Not just a door but one of these parking lot spiked barriers:
I'm still getting a scratch dir with callee context, I need to find out why then
Some strawmen:
- Module author can specify path source (callee, caller, user, local, etc) - probably confusing? But maybe only toolchain modules need to think about this?
- When you install a module you can override its context dir (use this for toolchain modules). But what if the module needs both?
Yeah I think this boils down to toolchain modules (in other words the pattern of dagger call -m FOO from the root of your app)
I can't think of anything else that "callee context" breaks (should double-check that)
So one possible option is we just kill the toolchain module use case
I do think it's a valid and interesting use case, but yeah cutting scope has helped before, and maybe there's a path to it in the future when we can focus solely on it
It doesn't seem possible to satisfy both at once without a way to distinguish
We could add a context-type metadata to make that happen
OK I don't want to hold this up. Let's assume we're giving up on the "toolchain modules", at least for now. So let's make the implementation work with "callee context" (so the opposite of what I was saying before), and see how it goes. If you can all keep the "contextual module use case" on your radar, so we watch for the other end of the tradeoff, I would appreciate it.
Concrete consequence of this tradeoff: it's no longer a good idea (probably) to have your app repos run dagger call -m github.com/acme/platform webapp build or similar. That only affects larger organizations, and there are other patterns being considered. So we'll see if it becomes a problem.
so just to confirm, at the moment this is the only blocker left, correct?
Yes!
we also need to make sure we have tests for the above as well!
if we can, i'd like to also have a test for Directory.AsModule() and see what happens with that as it loads a context directory, since this is another related codepath to the above
@dapper hound Hey, do you know if there's a way from *ModuleSource to know if it's a loaded as a dependency or from the host?
This would be super helpful for the implementation because:
- If it's local soure -> I resolve it the current way
- If it's a dependency -> I call
ContextDirectory()method, and get the directory at thesourceRootPath(callee context)
However, I don't know how to make the difference for now, I could do it by a check on resolveContextPathFromModule: if it returns /scratch, that means it's a module loaded a dependency but that doesn't look right
I kinda found a way, not great but it works
// If relHostPath is empty, it means the module source is called as a dependecy of another module.
// So we scope the context to the module root directory (callee's context).
if relHostPath == "" {
return src.AsLocalSource.Value.RootSubpath, src.AsLocalSource.Value.RootSubpath, true, nil
}
Lol
With a parameter isDep set to true of false if relHostPath is empty
@dapper hound In case it's a dep module, I can retrieve the ContextDir, meaning that I can also get up to the parent module's
In that case, should I point to this one if the user set a /? And the dep module's if it's relative with . ?
For example
test/ <- context dir (in case of dep module)
dagger.json
dep/ <- root dir (in case of dep module)
dagger.json
i don't think we can use ContextDirectory
we had to build LoadContext so as to avoid this for local sources
I tried to add a file foo.txt inside the module and it worked
because ContextDirectory has all the filters/etc from the module sdk right?
Oh yeah, that's true..
I forget about that one...
But then what can I do? Because I cannot use buildkit.LocalImport either
LocalImport check on the host right? But from a depmodule, there's no host
well i guess we need localimport for context directories to import all the way from the host
but, we need to restrict the paths to read
and get the subpath from the host
actually not the host, the top-most place where the local directory was defined? maybe?
hm i guess a ModuleSource needs to know which session it comes from 🤔
yeah this is it - then localimport pulls from there
Hmm how can I do that?
good question, it also really screws up caching 😛
So hang on, that means the session is different for each module?
imagine this case - suppose i do Directory.AsModule - there is no client
even though it's a local source, it doesn't have a local fs to pull from
yeah, i'm drawing blanks here, one moment
Oh wait, I got something interesting
This is what I got when I do LocalImport with a path to /
i think this is what i was getting at back in https://github.com/dagger/dagger/pull/7744#discussion_r1670959469
This is an implementation of #7647
API changes
This PR updates the Function and FunctionArg from our engine API to support defaultPathFromContext
Function
withArg(
""&...
I can go in /src/src and there I can get the context dir
A LocalSource should track where the context directory it's been loaded from is. This removes the complexity of this specific dance.
I think I fixed that withRelHostPath
We should create a ScratchSource type that can be created for Directory.AsModule - this type doesn't map to a directory on the host, so we should avoid all chances of confusion, which this PR steps into.
I don't get it
right, a couple things to remember:
- a local module can have local deps (which need resolving on the host)
- a git module can also have local deps (which need resolving relative to the git directory, not the host)
- any module can load a module using
AsDirectory(which needs resolving relative to that directory, not the host)
Is the content there filtered?
i think the method you're using only solves the first of these
I see
So how can I solve that?
Btw, for git module, I didn't did any tests but I made a difference in LoadContext between the 2
actuallllly, i think point 2 is maybe not an issue, when we load dependencies, we do the translation to Git sources on the fly
so that's probably fine
point 3 is still an issue, but bleh
sorry, live thinking through this right now
No problem, I know what it is 🤣
hm, thinking through this, i think there's a subtle caching issue here that deserves a moment
suppose - we have two clients - e.g. in two different modules, they have exactly the same ContextDirectory, but they have some minor file difference (in an excluded file or something)
now, today, we can safely just combine these together
they're the same, they have the same id
but now this is an issue - if we do LoadContext on both of them, they have different results
so this is also solved by having LocalDirectorySource having a ClientID in it that indicates who loaded it
but this also removes any legitimate caching cases, which feels like it would slow down our tests, and users reloading modules
but i'm also struggling to see what the alternative is if we want to do lazy loading of module context directory data at all
Is there a chance that we merge the context dir PR only working with caller context, add a big warning on a doc page to say (it's still in progress) (or like a flag or something in the CLI to enable it) and discuss that altogether in a follow up?
I feel the more we dig into context dir, the larger it gets with impact all over the engine and design decision that we're not sure about right now
yeah, was thinking somewhat similarly - could we explicitly disable it for module dependencies? and only have it for the top-most module?
i dunno if that effectively makes it useless for @limber swan
I don't know, but I seems much more complex that is was originally, and might break more things than actually solving a problem lmao
Maybe going a bit slower, merging step by step is the right decision, I don't know
@shut condor I think your opinion can help too, there's a lot of context to digest but if you got some time, that would be amazing 🚀
hm, the more i think about this, it feels like this is what we're gonna have to do, i can't think of another way that avoids the cache conflict issue we're introducing in this case (and horribly, this is actually still an issue without doing the deps, hm)
👋 the feature is not finished without dependencies I think, but whether to do an intermediary merge is an implementation issue, it depends if it helps you get to the finish line faster and more reliably
but it feels not great to have a client id embedded into a core object, we don't do this anywhere else, so i'm not sure what other implications this might have
we spend a lot of effort with blob sources to avoid this exact scenario
I don't know if we'll want to advertise the feature yet, need to think about that
I do worry a little about how impossible this feature feels. It's the first time this happens, it smells like technical debt and makes me wonder if I've been underestimating how much we have
Maybe it's a sign we need to fix some internal core issues before building more?
maybe 🙂 but i'm not sure what those internal core issues are
The PR fixes one by keeping the host path the module is loaded from, which was a big blocker
The other right now is, how to access an unfiltered context dir from a dep
Maybe another way around would be to have both filered and unfiltered context dir?
Is this something possible?
I guess not, it would not be efficient.. we need to load it on demand...
yeah, was thinking about this just now too 😄
um, yeah, the problem feels like:
- either, there is no laziness - a module source contains everything it will ever need to container
- or, we have laziness, but then everything needs to accurately track exactly where it came from, otherwise we can't find it
Is there a way to access the host from a dep? We did it with the Host API no? So it might be possible?
you can't do it from inside the module source code, but again, even if you could, i don't think that solves the issue
since not all modules are loaded from the host
For git it's a bit different, we have to clone the repo to get it, so by default we fetch everything
which is equivalent to "no laziness"
yeah, that's the easy case 😄
"easy case"
yeah okay, i'm gonna writeup something on github, with some more structure from this conversation so we can catch up other folks on it
Thanks a lot!
i feel like we can do this, but both ways of resolving the issue feel tricky, and i think we need to make sure with other people that we make the right clal
i think i should have realized this issue earlier, i think i voiced it wrong when we started
No problem! You're doing great!
ok thanks guys, keep me posted please
isn't that what we've been doing? this is me holding back on asking for big features 😁
Hahahaha true
okay, massive writeup in https://github.com/dagger/dagger/pull/7744
@knotty fjord @shut condor if you're around to take a look, that would help with unblocking 😦 maybe i'm overthinking this, for some reason, i'm struggling with understanding this feature 😢
re-reading some old comments, i think this is maybe what erik was thinking this might look like
we can keep this moving forwards, even if we're not sure where the id comes from
here's what needs doing @weary knoll
- modify
LocalModuleSourceso that it has aCallerClientIDfield - this can be populated duringModuleSource.ResolveFromCaller - during
LoadContext, get that ID from the ModuleSource, and pass it intoLocalImport- you'll have to modify the signature of it (as suggested by erik here: https://github.com/dagger/dagger/pull/7744#issuecomment-2189738829)- if there is no ID! then we can fallback to
ContextDirectory
- if there is no ID! then we can fallback to
- we need tests for this case. we don't need all languages, just go-to-go is fine. we need to make sure that calling modules like this works.
- a test for a module calling a module that uses a context directory like this
- a test for a module calling a module, that is loaded from
AsModule
that should do it
we might come back and change the first point, but. we can fix that later, it'll be easy.
this is how we resolve this issue
It's funny that you reference Directory.asModule in your writeup @dapper hound, I always felt like that should be called Directory.asContext instead, since that directory is not actually being used as a module
Directory.asContext().loadModule("./foo")
Probably immaterial to this implementation issue, but thought I'd mention it. Funny that sometimes API design smells, point to the same place where you find code smell later
yeah, i feel like i would like to refactor every part of module loading 🙂
there's just so many moving parts, it's hard to understand what goes where
lemme know if there's still some confusion here @weary knoll, i think i've finally gotten my head round it
I just want to triple-check that the writeup is about the implementation issues and not about accomodating my earlier request to change from callee context to caller context or top-level context. Like I said before I can live with the tradeoff of killing the "toolchain modules" use case for now, to remove constraints for implementation and get the thing shipped. Just double-checking that this gets through clearly
confirmed yup
yeah, this issue was technically present from the first implementation of this
the current implementation isn't doing callee or caller context, it's getting the contents of the caller's scratch directory - which definitely isn't right
What I don't understand is: the context dir only comes up as an argument. you can already pass local dirs as arguments. So why is this (possibly) local dir different? why is it an issue that that particular argument may have a local client id (codename for "llb local" I suppose?) and not the others
because when we pass a local directory explicitly in it's been unlazied
we've already loaded the entire thing into the engine already
Got back on the GH pr: https://github.com/dagger/dagger/pull/7744#issuecomment-2297277098
This is an implementation of #7647
API changes
This PR updates the Function and FunctionArg from our engine API to support defaultPathFromContext
Function
withArg(
""&...
Put together a fix, details here: https://github.com/dagger/dagger/pull/7744#issuecomment-2298374925 It's a pretty tiny diff all said and done.
As described there, there are obscure corner cases that are truly, deeply detached from anything in the real world (afaict), so I think it would be a genuine waste of our time to care right now. I spent a while internally hemming and hawing about ways of avoiding them before I realized they just don't matter to users and thus shouldn't matter to us. If I'm missing something that would mean they matter, please let me know, worth a double check from others.
@weary knoll my suggestion would be to try that diff I linked to in the comment and add on an extra fix for some path handling when it comes to deps. Let me know if I can help at all tomorrow. I'll be afk from 1pm to about 3pm but otherwise happy to chat about it.
This is an implementation of #7647
API changes
This PR updates the Function and FunctionArg from our engine API to support defaultPathFromContext
Function
withArg(
""&...
Yeah sounds good to me 🙂
We should keep an eye on the new context dir tests though, to see how/if they flake (different module names sounds good to me 🎉 as a deflake attempt)
Thank!s Will try that asap
hmmm, i can think of a semi-legitimate use case for the caching thing:
- kubernetes setup as we recommend it and use it today, with one big dagger instance shared between multiple jobs
- the job makes some changes to a context directory, and then runs the dagger job
- potential weirdness ensues
but it feels pretty tricky to stumble into, agreed we shouldn't block on this at all, it's a fiddly amount of work
Shouldn’t be an issue because it’s only a problem when multiple modules have the same source digest within the same session. Across separate sessions you can load different modules with the same source digest and diff context dirs/files and everything should work as expected.
The implication here is that practically speaking the only way to hit the corner case is when using nested execs.
And the only user that would be using nested execs to load the same module source with different context dirs/files in the same session is our integ tests
ahhhhh right, you're correct
i suppose if we allow caching calls across different sessions, then it becomes a bit more of a concern?
Yes but that would be but one small part of all hell breaking loose if/when we do that cross session dagql caching
There would be a lot of similar problems then all over
(Not to say we can’t solve those and shouldn’t do that, just that this new thing doesn’t make the situation worse)
yeah cool, we're on the same page then, yup 😄 (i had half a message saying exactly that typed out and got distracted by a test run finishing)
thanks for helping out ❤️ i hadn't quite clocked how this was similar to secrets/sockets
@shut condor Ok so I applied your changes and made couple of tests and it seems quite inconsistent, here's an example of a trace: https://dagger.cloud/Quartz/traces/46513cfd7c242f94f7799254a41bcfbb
I often see the error:
failed to set call inputs: failed to load contextual arg "modSrcDir": failed to load contextual directory "./dagger/sub": failed to resolve context path: failed to stat source root: failed to stat path: failed to get requester session: session for "bep44r1jaiq9ci83xfzsy12gw" not found
Or
failed to set call inputs: failed to load contextual arg "file": failed to load contextual directory "/": failed to import local module src: rpc error: code = Unavailable desc = error reading from server: io: read/write on closed pipe
Is this flakes or something related to me not correctly handling paths? I'm not sure if there's an actual implemention bug on context dir or on the session side.
I'll dig more indeed, but because it's inconsisent makes me unsure about what's really happening :/
Hm okay I just verified with the repro Helder posted earlier. I will pull this down again and try to repro.
Eating lunch atm and have to go to dr w/ Alice from 1-3, so might be delayed in response but will get back asap
I see this issue when running the current context dir test suite, not Helder's repro
I wanted to verify first if it was still working for simpler case haha
Right makes sense, I will use that to see what’s going wrong
Thank you so much!
Checkout to my PR and dagger -m .dagger call --source=.:default test custom --pkg="./core/integration" --run="Module/TestContextDirectory"
Tests are in module_test.go, starting line 6773
(Dr took waaaay longer than expected, so just getting back to this now)
It was a doozy but figured out what's going on with the new approach for tracking client IDs (my original theory this morning was wrong). It breaks due to caching essentially. We were getting old clients that were from nested execs loading the same module. They aren't running anymore so they are all gone and lead to those errors.
It can actually be avoided by just giving each subtest it's own dagger client instance (which separates the sessions and avoids the dagql caching that causes cached nested clients to be used). I will use that as a fallback solution since it's what users would be doing anyways.
Attempting something more robust that would avoid this issue and probably be simpler overall, but if it doesn't pan out then I'll just push that fallback option
Yeah, the integ tests pass now and the problem with deps is fixed: https://github.com/dagger/dagger/pull/7744#issuecomment-2306374250
The fix is actually so stupidly simple that I'm mad I tried anything else first, but oh well 🤷♂️
nice!
beyond a certain level of effort tracking down a bug, it becomes a certainty that you'll feel stupid once you find it. 😛
c'est la vie 
@weary knoll Let me know if the fix makes sense to you. Don't want to just push stuff without you understanding, so happy to explain more if helpful.
I'm not actually sure there is an additional bug with paths either, I may have just been misunderstanding the way the context dirs paths work (forgot that if I set the dir to / then it's the .git dir and not the module source dir).
I added a new integ test for this dep case, but just in Go so far. I think Justin also mentioned some other integ tests that were worth adding too. But hopefully this is it and no more gotchas lurking 🙂
forgot that if I set the dir to / then it's the .git dir and not the module source dir
Now that you've remembered, does it feel unintuitive?
Paths are eternally confusing, to a surprising degree. But it actually does make intuitive sense that / is the root of the context and . is the module source dir. I can't even imagine what else we'd do.
What I forgot too was that the ignorepaths needed to be relative to the defaultPath. That's the same though in that it's just confusing because paths are always confusing; I can't think of any different approach that would be more intuitive
I think we just need some good docs showing the behavior in the various permutations of defaultPath+ignore values (absolute and relative)
Then it should click relatively quick I think
What I forgot too was that the ignorepaths needed to be relative to the defaultPath
ooooh. But what happens if I have an +ignore but an explicit argument? Is the ignore applied to the argument?
This is for a follow up
The fix totally make sense! I'm adding the integration tests, making sure all work fine and we'll be good to go 🚀
@shut condor I ran tests, I assert that they pass just fine! Amazing