#Host access APIs
I am using “posix” as shorthand for “the unix APIs for accessing host resources from a program”: files and directories addressable by path in a local VFS; environment variables by name; command execution by path and arguments.
Deno implements posix APIs because it runs programs on a single host, and Deno programmers expect full programmatic access to that host's resources via the APIs they know. The Deno designers therefore cannot create their own APIs for accessing host resources, even though doing so could make Deno more secure, more scalable and easier to program. They would be sacrificing adoption, which makes it a non-starter.
Dagger does not run your pipelines on a single host (or at least it does not promise to), and developers of CI/CD pipelines do not expect full access to a single host which they know does not exist. Therefore we should not expose a concept of a single host in our APIs, specifically a single-host VFS and network stack.
The issue is not just about preventing host resource access. That framing implies that there was an API for requesting host access in the first place, which in turn implies there is a “host” in the traditional posix sense. I challenge those assumptions.
In other words: there is no host. If we imply to our developers that there is one, that will only lead to misunderstandings, confusion, and ultimately a bad developer experience. And yes, some security issues along the way as well.
Finally getting around to replying....
Seems like every time I think I have Dagger figured out you throw me a curve ball 😆. Reading this I'm thinking of "invocation vs execution" and "client vs server". Dagger must be invoked somewhere: it could be my local machine, it could be a container within GitLab, but either way, is the invoking compute not the "host"? The pipeline executing in buildkit ("server") may execute on a different machine but still maintains a bidirectional stream with dagger ("client"). Is the client not a host? I'm not understanding "there is no host". If there were no host, Dagger would never read a local file system or socket like it does today.
Yes apologies for the confusion. There are 2 programs at play:
- The `dagger` client, a Go program which runs on a host machine and accesses its host resources via the Go `os` package
- The plan, a Cue program loaded by the `dagger` client and executed by the buildkit VM
The Cue API we are discussing is really a Cue binding to the underlying Buildkit VM API. Buildkit has the capability to call back to the client to access some resources on the invoking client’s host. But that API is very restrictive and includes a mandatory indirection. Buildkit has no concept of a host filesystem, for example. The sandboxed executable code (in the LLB format) requests access to an opaque input directory ID; then at runtime, and at runtime only, the invoking client may map each requested directory ID to an actual directory on its own host. At no time does the LLB executed inside the buildkit sandbox have any ability to access the invoking client’s filesystem. In fact it lacks even an API to make such a request.
I think we should expose the reality of how the buildkit VM works in its Cue bindings.
I realize that perhaps I should have started with this exposé 🙂 Sorry
TLDR: the Dagger runtime API is really a Cue binding to the buildkit VM. The buildkit VM has a very specific model for accessing resources that is very different from the usual posix-style API. Our Cue API for Buildkit should embrace Buildkit’s execution model as faithfully as it should embrace Cue’s development model.
Got it, that makes more sense. @crisp solar explained that very thing the other day. So I can't really continue until I take a deep dive into the buildkit execution model. Thanks for clarifying!
No worries we are taking the winding road together 😉
I can give you a quick tour if you’d like
sure!
ok give me 5mn I’ll join the audio room
same
that's the scariest gif I've ever seen
Yeah it seemed cute in small form