#core api
1 messages · Page 1 of 1 (latest)
@unique nimbus @shy cobalt @lucid elbow @sullen warren - copying the last thread, sorry for the noise!
Alex and I are gonna sync up at 1:30 PST
Can’t make it at 1:30 but I’ll be available later in the afternoon if you want to chat, around 3-ish
That's 6 PM @lucid elbow 's time (not sure if that's in work hours or not) so we just met really quick right now, but happy to chat more whenever everyone has time
@shy cobalt one thing, do you agree that ContainerID is similar to FSID in that it's "content-addressed" (i.e. product of all the params) rather than something more like a random uuid? That's what I was thinking we meant there but want to double check we're on same page
I'll be around at 6pm so happy to do another sync up then too if it's helpful, planning to work til 8pm today (had a late start). maybe most days. still figuring that out. 😄
yes 👍
@sullen warren (when you are back) was about to merge this but want to check you don't have any in-progress comments or thoughts first: https://github.com/dagger/dagger/pull/3137
while we're looking at PRs related to core API... 😇 https://github.com/dagger/dagger/pull/3120
It's #2 in my queue! 🙂
oh nice, I was wondering about that when translating the existing tests
@unique nimbus you mentioned implementing these side-by-side, which actually looks pretty easy since the current API is namespaced under core {} and all these new things will become top-level instead. is that the idea? context: I started with a cp -a core/ corev2/ and planned to rename later but it looks like that's not necessary
what I did in my little piece (git) was bundle both old and new API for that area of functionality in the same file
(and btw I think those should be subdirectories..)
Yep exactly, you shouldn't need to namespace anything under core anymore
would we run into lots of circular dependencies if we break things up into tiny packages?
I think it would require some refactoring of various bits and pieces to do that without cycles, yeah. Definitely don't feel obligated to change that as part of your current effort, whatever makes the most sense to you is good
We can rearrange things after once the whole picture is available
Yes agreed. I initially had a small attempt in my PR, but removed it for this reason
I’m free now if it’s not too late for you @lucid elbow. Can do tomorrow otherwise
i'm around! ill hop in dev-audio
Joining now! /cc @unique nimbus if you're around as well
I'm opening a PR with the tweaks we discussed earlier: dockerfile, withRoot
(on gql files only, no implementation)
Just to recap:
-
OK to ship native core API clients before a complete codegen solution (possibly at the risk of creating a gap between core clients and extension clients)... but not right now, first focus on shipping the new core API, use it with raw gql queries, then in 1-2 weeks see where we stand
-
OK to start bottom up with core API, raw gql queries etc, but we need to aim for a "minimum intuitive experience" for each language, which involves limited use of raw gql. (semi-quoting @lucid elbow here)
- Erik suggested focusing cloud experience on raw gql queries first, because you need to build a lot to get that to work end-to-end; then incrementally add more client languages, hopefully with the benefit of a more polished DX by then
👍
fyi the Go dx I want to try on the side is a “flat” model where you register pipeline steps, with inputs and outputs, and the sdk generates a graphql schema at its leisure from that.
also a testing ground for the “it’s all pipelines and steps” terminology to avoid the weirdness @lucid elbow mentioned of developing something called an “extension” on day one
Sounds good, the sketch in my head from over the weekend is something sort of close to what you get with Typescript; there's just methods attached to go structs. In the simple case you basically just define one method and you're done, never realize you are even defining a graphql schema, but you can also expand to the full possibilities of generic graphql by adding more methods, hierarchies of them, etc.
I also had some ideas around something called dagger.Session that might help make the difference between scripts and extensions shrink, but I'm not sure about it yet
We can both give it a shot and compare notes

@shy cobalt did you open an issue/discussion regarding container { from } and the other thing we were talking about yesterday? Lost track with the channels moving around
a PR
@shy cobalt re https://github.com/dagger/dagger/pull/3120#discussion_r981534165 - will this change for the better when I switch these to go:embed the *.gql files? will that force us to start with ErrNotImplemented stubs right off the bat?
(it should at least prevent drift)
i can just mess around and find out, but figured i'd connect these lines of thought
I guess it depends what you’re doing with the embedded gql files
(wasn’t aware of that part)
just returning their string content from Schema() like we're doing today. i just like having syntax highlighting for'em, but it'd also be nice to gravitate towards api/ being the source of truth. it'll mean some regressions until we get all the pieces in place, for example changing git.gql to have tree return a Filesystem with a comment saying "TODO: turn this into Directory"
Ah, I see! that's awesome
regressions is the wrong word, more like...compromises 😛
This is actually vital information for @hollow oracle who is trying to build automated API reference docs
oh cool, yeah that's probably easier with static .gql files. i also tried getting graphql LSP working but it was giving me cryptic errors. an adventure for another day
IMHO we should move them inside core while we do the merge, rather than having them in a different directory
e.g.
git.schema.go (go:embed git.graphql) // (might drop the .schema.go at that point I guess? so just git.go?)
git.graphql
either way is fine, really. Just thought it'd help maintain some consistency between them if they co-exist in the same package/directory
how about core/schema/*.gql and core/git.go which go:embeds schema/git.gql?
oh and yeah, .gql vs .graphql?
now is probably the time for bikeshedding 😛
Shouldn't matter as gql is fully introspectable and most tools work with introspection (e.g. all the features we get with graphiql are through introspection). I think it's probably best for our tooling to use standard introspection rather than requiring the core API to have raw schema files alongside (otherwise the day we decide that we want to e.g. use code-first or whatever is going to be extremely painful to migrate the tooling)
Whatever you think is best, no opinion really. @unique nimbus ?
github code search datapoint
No opinion either as long as vscode & co are happy about it 🙂 I think gqlgen (the most popular implementation Go library) uses .graphqls -- whatever the s at the end means
Yeah that sounds good to me 👍
ooo, maybe we should use that
or does that imply there's like one monolithic schema file?
no, gqlgen has both monolithic and modular using .graphqls -- originally i thought the s was for modular (plural of graphql?) but apparently it's just schema
@shy cobalt added some comments in there. Not really bikeshedding, just raising a few points, I don't have very good alternatives
here's a PR for the aforementioned reorg, also switches the new git impl to it: https://github.com/dagger/dagger/pull/3164
Re: https://github.com/dagger/dagger/pull/3151#discussion_r981757443
Any objections to having:
Container {
fs
withFS
}
(fs is lowercase in one and uppercase in the other)
If everyone is OK with it, I will switch to that instead of root / withRoot which has other issues as mentioned by @sullen warren
welp, went to GitHub's API to try and figure out what the consistent thing would be, and...
sadTrombone
they seem to mostly just capitalize initialisms like a word though (withFs)
Finally finished updating that typed resolver PR, let me know if there's any disagreements on this point @lucid elbow @sullen warren : https://github.com/dagger/dagger/pull/3155#discussion_r981788184
If not I'll merge it as is so we can get the rebase pain over
approved!
to provide an answer, no objections here, i agree with the user root confusion risk. withFS vs withFs doesn't matter to me as long as we're consistent, the github folks may have gone with Fs style because of their Rails background for all I know
Makes sense, I updated the PR accordingly
@lucid elbow let me know if your embed PR gets merged, I'll rebase mine
Also, if there are other small pieces of the core API work that can be easily farmed out, similar to git, I am interested 🙂 It's nice to have a reasonably sized contribution that is useful but also low-pressure. I suspect I'm not the only aspiring core contributor interested in those
my plan so far is to start with Directory since that seems to be the thing present everywhere (Container is Directory + image config, File is Directory + path, etc)
HTTP seems like a good option here
i'd say anything else is fair game if you can come up with stand-ins until that's implemented (Filesystem seems ok?)
merged it
btw a heads up, some of the Container fields are more work than they seem (which is the point, they are easy to use but there is an implementation cost for us). Mostly stuff related to mounts
Talked to Alex about that yesterday, said that we don't have to handle every case immediately. He also apparently dealt with somewhat similar issues in Bass though, so we are in good hands 🙂
eg. Container { directory } is probably the most work (but worth it IMO)
Just wanted to flag for later, that a lot of thought went into those details (interactions between container and mount specifically) and all of them of course are up for discussion.
So it's worth reviewing and commenting in detail, you won't be looking at random noise
hm, I wonder if that will mean Directory is also not "just" an LLB definition since it might need to include a SourcePath from its origin mount point that needs to be passed in to future places it's mounted in to
...something like that
i'll burn that bridge when i get to it
Yes it will need to be exactly that, same as File essentially
awesome, on the same page then. and i guess we'll need to find what exact mount the given path refers to based on the container's mounts
Yeah, including when mounts are on top of other mounts, symlinks, etc.
actually it could be llb, involving the execution of our shim with the mounts in place and instrumented to copy the desired content out
(whether that’s a good idea is another matter)
quick note: for directory { withNewFile(path: "a/b/c") } I'll have it create a/b for you. one alternative is to make withDirectory's DirectoryID optional and do withDirectory(path: "a") { withDirectory(path: "a/b") { withFile(path: "a/b/c") { id } } } but that feels a bit clunky.
also it might be a good idea to make the withDirectory DirectoryID optional anyway
I think automatically making the missing parent dirs is alright. It also could be an option like MakeParents; same idea as -p flag in mkdir
does llb have that option?
It does for llb.Mkdir, not Mkfile, but probably just an oversight
yeah, right now i'm just chaining a separate llb.Mkdir with llb.WithParents if it detects a / in the path
an option would be good too, we can chat more about it, just trying to get minimal tests passing for now 🙂
we might want one for file permissions too
opened an issue for that: https://github.com/dagger/dagger/issues/3167
here's a PR with Directory progress so far, this might be a good time to context-switch to start bootstrapping other types so we can start working more in parallel: https://github.com/dagger/dagger/pull/3168
☝️ now has the beginnings of File too, also added more Directory implementation since I realized parts of its API will also need the LLB definition + Path structure to DirectoryID.
i've just been rolling on ahead but feel free to critique any part of it
Awesome, will take a look in the next hour
one change I'm thinking of making, since it cost me an hour of troubleshooting: right now DirectoryID("") is treated like scratch. I think it should just be invalid.
technically what happened was I misnamed one of the fields for the schema args struct, so it'd also be nice if we errored on unknown fields, but if an empty DirectoryID were invalid I'd at least get an error instead
Could it be nil?
i'm not sure about nil, but I think the distinction between DirectoryID and DirectoryID! should still be respected, since we need an optional representation
maybe it's still "" under the hood and we just error on it instead of special-casing it to llb.Scratch
ok I see. I had this in mind with query { directory(id: DirectoryID) }
(note the optional id)
right right
I assumed "optional" under the hood just means "defaults to null"
it's a scalar, so it seems to just become the 0-value for a string, but maybe it accepts a pointer too?
the main difference is that instead of doing a no-op and returning a &Directory{ID: ""} now it'll actually build a valid ID from llb.Scratch, which could be done by just checking if the arg is ""
Ah! I see
huh, apparently I need to do llb.Scratch().File(llb.Mkdir("./", 0755)) before encoding to a DirectoryID; with just llb.Scratch() I get cannot marshal empty definition op
that feels a little weird 🤔
backing out of that, i'll just be more careful with arg fields. 😛
I think with gqlgen there's a distinction between zero value and not provided (nil pointer); not sure if we can do the same with our router
e.g. directory(id: DirectoryID) (optional id field in gql)
with (pseudo code) func directory(id DirectoryID) we get DirectoryID(""), whereas with func directory(id *DirectoryID) we get nil
(i.e. pointers allow to distinguish between "zero value was explicitly provided" and "not set")
I think since we're marshalling/unmarshalling json in router, and the json encoder has this same behavior, it should also work on our resolvers
I made a project tab to track core API implementation work: https://github.com/orgs/dagger/projects/2/views/67 - feel free to pick up anything that's unassigned!
Also if any ideas come up feel free to add it to the backlog, but it's kind of tricky since this tab is filtered. You have to add a draft ("+ Add item" or Ctrl+Space), which will warn you about it being invisible, then click "Open Item" before the alert disappears, convert it to an issue, and add the area/core and cloak labels
also, calling this out separately for feedback: https://github.com/dagger/dagger/issues/3184
Container progress: https://github.com/dagger/dagger/pull/3190
@tropic bobcat @obtuse hawk @jaunty iris example of a lightweight open-source feature discussion
not everything needs to have its own channel
@unique nimbus a depressing conclusion to the test timeout: i bumped it to 20 minutes and now it's fine?!
i wonder if the github runners don't have enough cores, and now we've got a bunch of tests
the tests seem to pause on t.Parallel() which might contribute to the overall timeout
Oh... uh yeah that's highly unfortunate. Just a random thing to check, if you run locally do you see high memory usage or anything?
Only asking cause that happened to us in the past with GHA
Also I think one time running w/out race caused our tests to speed up by an enormous amount
checking
~600MB peak usage (this is just eyeballing htop, nothing scientific), that's with it running in parallel across 16 cores, trying with -parallel=2 now
hmm roughly the same
Okay yeah that's probably totally fine then, was just a shot in the dark based on past experience. The only other thing worth trying would be disabling -race when running in GHA, just to see if it makes a difference.
We should definitely figure this out soon, but if the problem only occurs in CI then I'm personally okay merging as is (with the timeout bump) just because it doesn't make sense to block the new API on it
i could maybe switch a test from golang:1.19 to a smaller image to save some time, too
btw, are image fetches cached across all tests?
They are all hitting the same buildkit, so yes they should be in theory
i'll try without -race and without the bumped timeout first, don't really know a canonical small image that sets PWD and/or ENV
i think it'd be great if our tests could finish in under 10 minutes, but i kinda think we're limited by the runner's parallelism more than anything else at this point
I think we use golang:1.19.1-alpine to build the shim anyways, so if you use that then it's no extra pulling needed
oh cool, will do
Yeah possibly... someone should invent a way to create your own self-hosted CI runner that operates almost like a loop 🙂
opting to keep -race and just bump the timeout with more explanation in the commit message. i'm unlikely to ever run -race locally so I'd rather CI catch it, even if it slows things down (since things can really be subtly broken otherwise)
Yep sounds good to me
I just tried this:
{
container {
from(address: "alpine:3.16.2") {
rootfs {
file(path: "/etc/alpine-release") {
contents
}
}
}
}
}
but I'm getting a "failed to load cache key: no match for platform in manifest: not found",
Any idea of what it could be @lucid elbow ?
I see the tests are doing this, so it must be me
Hmm maybe I'm missing the r.platform when marshaling an llb.State somewhere
Yeah. Although you have a test that’s doing exactly that same query, weird
Oh woops sorry I forgot the time zone. Have a great weekend!
np! you too
@tranquil pivot 👆
Getting the same issue (from your tests), on my bundling buildkit PR. I'll dig on it tomorrow. If you have any hint @lucid elbow, on how to debug that, interested 😇 ?
Context: we spawn a daggerd binary compiled with GOOS=linux, and serve it from an alpine:3.16 image now. It now fails locally (Mac M1, and in the CI)
During that time, I'll take time to also upgrade the buildkit version to
v10.4@unique nimbus
Sounds good, I am about to go catch up on PRs from others, I will take a look at this after that to see if we can get this unblocked
thanks, wish I could help but I suspect it's related to M1, which I don't have. my only guess is that there are some (llb.State).Marshal calls that aren't propagating the platform but it's a shot in the dark
Could be NewContainer which marshals without platform and then it's called from everywhere (e.g. from)
I was just looking at this and filling in a few Marshal calls (just to dedupe effort)
https://github.com/dagger/dagger/pull/3226 @tranquil pivot
require.Equal(t, expected, actual) vs require.Contains(t, haystack, needle) is an odd design
for tracking this down and fixing it!
two more container API PRs ready, rebased both now that withMountedDirectory is merged so they can be reviewed independently: https://github.com/dagger/dagger/pull/3219 + https://github.com/dagger/dagger/pull/3217
oh actually I still need to add that fix for overlapping mounts to 3217, so maybe we can aim to merge 3219 first
Awesome! I think there was a merge "conflict", branch doesn't build anymoe
(conflict between that and the http PR)
Oh weird, looking at it
@unique nimbus @sullen warren @shy cobalt gentle poke that these are both ready to go now whenever you've got time (rebased + tests/lint passing + fixed mount overlap propagation issue)
In my TODO queue for later today (dedicating a few hours to code-first schema atm)
@shy cobalt question re: Secret / SecretID: is SecretID a recipe for how to fetch the secret value? so it might be a base64-encoded payload (just like the other IDs) and its content is either a FileID, or the name of an env var to fetch from the host, or whatever other ways we come up with for fetching secrets?
Yes that's the idea. I think there was an open question on whether to support a special "use this inline value" recipe. But IMO it's safer to not support that, so that we are 100% sure that no matter what, there is no secret content in an ID
(this constraint led to adding host { envVariables } to compensate for the inability to pass secret content as a query argument
cool cool, ty!
fyi: I'm going to just remove the old secrets implementation since it's pretty far from what we want and it's a lot simpler to just have one secret store configured (in this case, one that resolves the SecretID to its value using the gateway). any issue with that?
@lucid elbow @shy cobalt @unique nimbus a few notes on the API, now that I'm using it in code:
exectakes a mix of positional args and an input object. While it's non typical in GraphQL, it's very weird in codegen (using the "LLB optional args" technique, you end up having to pass a structure to it), like so:
alpine.Exec([]string{"cat", "/etc/alpine-release"}, api.WithContainerExecOpts(api.ExecOpts{
RedirectStderr: "/foo",
}))
I suggest switching everything to positional (optional and non optional) so we could have the following:
alpine.Exec([]string{"cat", "/etc/alpine-release"}, api.WithContainerExecRedirectStderr("/foo"))
- There's a bunch of optional fields which shouldn't be optional. Examples:
withEntrypoint(args: [String!]),withUser(name: String),withWorkdir(path: String),withoutVariable(name: String), maybe others
- I forgot about 3 🙂
for 2), withEntrypoint is a little funny 'cause it's actually optional for unsetting it. maybe we should have withoutEntrypoint for that instead. I wonder if that's true for withUser and withWorkdir too? withoutVariable is definitely an oversight
yeah, i see where it's coming from from a query point of view. In code it's a little bit funny though: "if you don't pass this optional argument, then it's going to REMOVE the field"
e.g. container.WithEntrypoint() means "remove the entrypoint from this container"
yeah, you end up calling withEntrypoint { } but ... it does the opposite 😆
for those three, we could either add withoutEntrypoint, withoutUser, and withoutWorkdir - or just make them all required have the user configure empty values. so withEntrypoint(args: []), withUser(name: ""), withWorkdir(path: "")
it makes no difference to the image config, since they're already non-pointers
leaning towards just making them required, so I'll make that change in my existing PR
yeah it made sense at the time but I see why it would be confusing
like windows 95 where you had to click “start” then “stop”
@lucid elbow Wasn't able to fix "keep the same order as the schema" yet, so I resorted to sorting :/ At least it gives a stable API, but the arguments don't match the API intention
Are there viable alternatives to go positional parameters to pass field arguments?
structs, which are easier on our end, but there's downsides and the DX ends up being worse
- LLB-style
dir.WithNewFile("/foo")
dir.WithNewFile("/foo", dagger.WithDirectoryWithNewFileContents("blah"))
Pros: Nicest DX
Cons: Discoverability
- Mandatory positional, optional struct with non-zeros
dir.WithNewFile("/foo", nil)
dir.WithNewFile("/foo", &dagger.DirectoryWithNewFileOpts{Contents: "blah"})
Pros: Less magic
Cons: Must pass nil around pretty often when NOT providing args
- Everything in struct (pointer for optionals)
dir.WithNewFile(dagger.DirectoryWithNewFileOpts{Name: "/foo"})
dir.WithNewFile(dagger.DirectoryWithNewFileOpts{Name: "/foo", Contents: dagger.Optional("blah")})
Pros: Least magic
Cons: Sucks to pass optional arguments (e.g. in Go you can't pass &"foo", need to create a string first then pass the pointer, or use a helper function)
I think that's fine honestly. I don't think raw graphql queries enforce that you provide args in a certain order, so it's not that crazy. We can also fix that problem later. It will involve hijinks like providing the original graphql schema files in addition to the introspection json (or perhaps having our generated schema files include directives that annotate the order, etc.). But it's fixable
Yeah. It was that or hours of fighting with graphql-go ...
Yeah don't do that, violence is not the answer
that's how fields are represented in graphql-go:
type FieldDefinitionMap map[string]*FieldDefinition
absolutely everything goes through that piece of code
so yeah, mega refactoring
can we attach any kind of metadata to the spec? like foo(path: String!, contents: String!): Foo! @order(params: ["path", "contents"])
Yeah that's what I was thinking, have the code-first schema generator include that. Provided directives show up in the introspection json then it should work I think
fyi: I'm basically done with secrets, but it has led me to Host (for host env secrets), which also has toplevel API collision. this one is harder to remove since it's also used by cloak generate - do we still need that?
more generally I think we're getting to the point where it'll be time to retire FSID soon, which I think leads to SDK updates and such
I was temporarily disabling cloak generate in my PR; @sullen warren is going to use it again I think but in a whole new form, so wouldn't worry about anything it's doing right now.
Host is still used to load local extensions and other local dirs, so we need a replacement for it
yeah, there's local dirs setup code in the engine code that uses the host API too, I figured I would just remove that until we figure out the new local dirs API
or are you saying local extensions will require local dirs beyond just workdir too?
oh no they should only need workdir, sorry I misunderstood
just rebased https://github.com/dagger/dagger/pull/3217, now to wait 20 minutes 😛
@lucid elbow @shy cobalt hey, what's the difference between container.rootfs.directory and container.directory?
and if there's a container.directory, shouldn't there be a container.file as well?
e.g. I can do container.Directory("/etc").File("alpine-release") but it's container.RootFS().File("/etc/alpine-release")
adding container.file seems reasonable to me (same mount magic)
opened https://github.com/dagger/dagger/issues/3266 for that