#the gang solves cache invalidation
Here's a pattern I've thought about before, feels like a similar approach but a different manifestation:
// uncached
func (m *Foo) WithLatestPackage(name string) (*Foo, error) {
	version, err := resolveVersion(name)
	if err != nil {
		return nil, err
	}
	// ...
	return m.WithPackageVersion(name, version), nil
}

// cached
func (m *Foo) WithPackageVersion(name, version string) *Foo {
	// ...
}
This doesn't work today because calling sibling methods just calls the real code - it doesn't go out-then-back-in, so there's no opportunity to cache the self-call. This pattern feels more intuitive to me though; the CacheMe early-exit approach seems like a bit of a brain-twister (kind of like a reexec).
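To make the "out-then-back-in" idea concrete, here's a rough sketch where the self-call goes through a dispatcher that owns the cache instead of calling the sibling directly. Everything in it is hypothetical (`dispatcher`, `call`, the stubbed `resolveVersion`, the string results); it is not how Dagger dispatches calls, just the shape of the pattern.

```go
package main

import (
	"fmt"
	"strings"
)

// dispatcher is a hypothetical stand-in for the engine: every call routed
// through it is keyed by function name plus arguments and memoized.
type dispatcher struct {
	cache map[string]string
	runs  map[string]int // how many times each function body actually ran
}

func newDispatcher() *dispatcher {
	return &dispatcher{cache: map[string]string{}, runs: map[string]int{}}
}

// call runs fn at most once per (name, args) key.
// The key format is naive (args containing "," could collide); fine for a sketch.
func (d *dispatcher) call(name string, args []string, fn func() string) string {
	key := name + "(" + strings.Join(args, ",") + ")"
	if v, ok := d.cache[key]; ok {
		return v
	}
	d.runs[name]++
	v := fn()
	d.cache[key] = v
	return v
}

// resolveVersion is a stub for the impure lookup; a real one would ask a registry.
func resolveVersion(name string) string { return "1.2.3" }

// withPackageVersion is the cached function: its body only runs via d.call.
func withPackageVersion(d *dispatcher, name, version string) string {
	return d.call("withPackageVersion", []string{name, version}, func() string {
		return fmt.Sprintf("installed %s@%s", name, version)
	})
}

// withLatestPackage is uncached: it re-resolves every time, then re-enters
// the dispatcher so the pinned call can hit the cache.
func withLatestPackage(d *dispatcher, name string) string {
	d.runs["withLatestPackage"]++
	return withPackageVersion(d, name, resolveVersion(name))
}

func main() {
	d := newDispatcher()
	withLatestPackage(d, "curl")
	withLatestPackage(d, "curl")
	// The uncached wrapper ran twice, the cached body only once.
	fmt.Println(d.runs["withLatestPackage"], d.runs["withPackageVersion"])
}
```

The resolve step still runs on every call, but as long as it returns the same version the expensive part is a cache hit.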
This also came up while I was experimenting with v2 IDs, by the way: I wanted to make it so that container.from always returns a pure ID even when given an impure input. The plan there was to have container.from resolve the ref and then return an ID that "pretends" you passed the digest directly: instead of the ID embedding the automated "query ID" like [container, from("golang", tainted=true)] it embeds [container, from("golang@sha256:...")]. Very similar situation to this example.
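A tiny sketch of that "pretend you passed the digest" idea, assuming a made-up `fromID` helper, a made-up bracketed ID format, and a placeholder digest. None of this is the real v2 ID encoding.

```go
package main

import (
	"fmt"
	"strings"
)

// fromID builds a hypothetical ID string for container.from. If the ref is
// not already pinned to a digest, it resolves one first, so the ID never
// embeds the impure (tag-only) input.
func fromID(ref string, resolveDigest func(string) string) string {
	if !strings.Contains(ref, "@sha256:") {
		ref = ref + "@" + resolveDigest(ref)
	}
	return fmt.Sprintf("[container, from(%q)]", ref)
}

func main() {
	// Placeholder resolver; a real one would ask the registry.
	resolve := func(string) string { return "sha256:abc123" }

	fmt.Println(fromID("golang", resolve))
	fmt.Println(fromID("golang@sha256:abc123", resolve))
}
```

Both calls end up with the same ID, which is the point: the impure input only affects which digest gets resolved, not the shape of the ID.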
interesting, how would you control cached/not cached? just arguments I guess
not enough though…
Yeah, at least in this simple example the 'cache buster' is just a regular old argument. And I didn't specify at all how you tell Dagger not to cache WithLatestPackage (or vice versa)
right that last part is what I was wondering about
We could have a special type of argument, maybe?
func (m *Foo) WithLatestPackage(name string, buster Buster) (*Foo, error) {
	version, err := resolveVersion(name)
	if err != nil {
		return nil, err
	}
	// ...
	return m.WithPackageVersion(name, version), nil
}
just guessing there might be useful info to pass along, and it's kind of neat to represent it as just another parameter
Simplest thing might just be the current time; then you can just truncate it to your desired caching granularity
doesn’t work because the caller inherits the problem all the way up the call stack to the CLI (or web client etc)
This would be an automated param, not a manually provided one; Dagger would recognize the special type of the arg (Buster), use that to hint to the schema that this is a tainted/non-cached function, and automatically pass in the value
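The "recognize the special type" part could look something like the sketch below, which uses reflection to check whether a function takes a `Buster`. The `Buster` type and the `needsBuster` helper are hypothetical; nothing here is existing Dagger SDK behavior.

```go
package main

import (
	"fmt"
	"reflect"
)

// Buster is a hypothetical marker type: a function that accepts one is
// treated as tainted/non-cached and gets the value filled in automatically.
type Buster string

type Foo struct{}

func (m *Foo) WithLatestPackage(name string, buster Buster) (*Foo, error) { return m, nil }

func (m *Foo) WithPackageVersion(name, version string) *Foo { return m }

// needsBuster reports whether fn is a function with a Buster parameter.
func needsBuster(fn any) bool {
	t := reflect.TypeOf(fn)
	if t == nil || t.Kind() != reflect.Func {
		return false
	}
	busterType := reflect.TypeOf(Buster(""))
	for i := 0; i < t.NumIn(); i++ {
		if t.In(i) == busterType {
			return true
		}
	}
	return false
}

func main() {
	m := &Foo{}
	fmt.Println(needsBuster(m.WithLatestPackage))  // takes a Buster
	fmt.Println(needsBuster(m.WithPackageVersion)) // does not
}
```

With a check like this, the schema generator could flag the function as non-cached and hide the argument from callers, so nobody up the stack has to supply it.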
We are in agreement that the only allowed name of that argument would be keaton right?
😂 - there was a really unfortunate choice of words playing on tainted before I wised up
Is it just me, or is there a blog post hidden in this thread?
(This is sort of a continuation of my message here since I realized this is probably a more relevant thread: #1156315986023170129 message)
I think it's important to keep in mind that "nocache" and "cache busters" are not really a feature or desirable, they're hacks that we've grown reliant on to deal with the fact that we don't have a good way of adding a dependency on external state.
Back in the "environment" iteration of Zenith, we talked about possibly adding a "deployment" entrypoint type that implemented the pattern of "get external state, plug it in as an input to the cache key". AFAICT that targets the same problem we're running into now with always caching unconditionally: we want to re-run code when some external state changes, but we can't implement that because we have no way of unconditionally checking that external state (right?).
If you continue thinking along those lines, I think you end up somewhere pretty close to the idea of Buster as an input type, except more generalized. A very rough sketch being:
// This is a pre-defined type, not written by user
// No affinity for the names or signature as is, just a strawman
type ExternalState interface {
	GetState() error
}

// The rest below is user code
func (m *Foo) Deploy(state *MyExternalState) error {
	// ...
}

type MyExternalState struct {
	CurrentVersion string
}

func (st *MyExternalState) GetState() error {
	// do something, set st.CurrentVersion
	return nil
}
The idea above is that when a Function has a parameter that satisfies the ExternalState interface, GetState will be called on it once per session (to avoid the critical performance problems I mentioned here: #1156315986023170129 message) and the populated value will automatically be set as the input parameter when Deploy is called. We can decide whether external callers should be able to override it or whether it should only be settable internally. I could go either way (it may have some relationship w/ default value support too).
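To illustrate the once-per-session part and how the state would feed the cache key, here's a minimal sketch. The `session` type, the `cacheKey` helper, the `fetches` counter and the placeholder version string are all made up for the example; only `ExternalState` and `MyExternalState` come from the strawman above.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

type ExternalState interface {
	GetState() error
}

type MyExternalState struct {
	CurrentVersion string
	fetches        int // counts real lookups, only here to show the once-per-session effect
}

func (st *MyExternalState) GetState() error {
	st.fetches++
	st.CurrentVersion = "v1.4.0" // placeholder; a real one would query the deployment target
	return nil
}

// session is a hypothetical per-session holder that resolves the external
// state at most once and remembers the outcome.
type session struct {
	once sync.Once
	err  error
}

func (s *session) resolve(st ExternalState) error {
	s.once.Do(func() { s.err = st.GetState() })
	return s.err
}

// cacheKey folds the resolved state into the key, so the function re-runs
// only when the external state actually changes.
func cacheKey(fn string, st *MyExternalState) string {
	sum := sha256.Sum256([]byte(fn + "\x00" + st.CurrentVersion))
	return hex.EncodeToString(sum[:])
}

func main() {
	s := &session{}
	st := &MyExternalState{}

	// Two calls in the same session trigger a single GetState.
	for i := 0; i < 2; i++ {
		if err := s.resolve(st); err != nil {
			panic(err)
		}
	}
	fmt.Println("fetches:", st.fetches)
	fmt.Println("key:", cacheKey("Deploy", st))
}
```

If the external version is unchanged between sessions, the key is unchanged and Deploy stays cached; no buster needed.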
If you truly wanted something to never be cached, you could just implement the same "cache buster" pattern of random id or unix nanosecond time, etc. However, I think if we had this level of power, we would no longer want to encourage that type of behavior unless there's no other option for whatever reason, since you should hopefully be able to do something more specialized and fine grained that will result in the best performance (both for you and anyone importing your module).