# How to control caching?

vale hazel
#

When I was running Dagger on GHA, caching was not a thing because GitHub would restart Dagger every time. Now that I'm self-hosted, the Dagger daemon is caching tasks, but too aggressively: it is skipping tasks that it should not.

I was looking at the docs, and they mention "introducing a volatile time variable at a specific point in the Dagger workflow" as a hack. How can I declare the keys and values that gate cache invalidation in a more principled way? Workflow engines like Airflow, Dagster and Flyte use the task inputs to determine whether the task can be cached. Does Dagger take a similar approach?

subtle adder
#

Yes, the approach is very similar. The one caveat is that Dagger currently only caches operations invoked through dag.Xxx calls (this is being improved). Anything in your functions that runs outside a dag.Xxx call will always execute.

Going back to caching: yes, Dagger invalidates the cache when any of the inputs to a pipeline step change. The invalidation then cascades to every other vertex in the DAG that depends on the invalidated vertex.
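
The input-based invalidation described above can be pictured with a toy model. This is only a sketch under stated assumptions, not Dagger's implementation: the `cache_key` and `pipeline_keys` helpers and the step names are hypothetical. Each step's key is a hash of its operation name and the keys of its inputs, so changing the source changes every key downstream of it and nothing else.

```python
import hashlib


def cache_key(op: str, *input_keys: str) -> str:
    """Derive a step's key from its operation name and its inputs' keys."""
    digest = hashlib.sha256()
    for part in (op, *input_keys):
        digest.update(part.encode())
        digest.update(b"\0")  # separator so ("ab", "c") differs from ("a", "bc")
    return digest.hexdigest()


def pipeline_keys(source_contents: str) -> dict[str, str]:
    """Compute keys for a hypothetical four-step pipeline."""
    source = cache_key("source", source_contents)
    # Hypothetical step that does not depend on the source at all.
    deps = cache_key("install-deps", "requirements-v1")
    test = cache_key("test", source, deps)
    build = cache_key("build", test)
    return {"source": source, "deps": deps, "test": test, "build": build}


before = pipeline_keys("print('v1')")
after = pipeline_keys("print('v2')")

# Steps whose key changed would have to run again; the rest could be reused.
changed = sorted(name for name in before if before[name] != after[name])
print(changed)  # ['build', 'source', 'test']
```

In this model the "install-deps" step keeps its key because none of its inputs changed, which is the behaviour a task-input-based cache is meant to give.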

merry blaze
vale hazel
#

Is Directory an appropriate cache key, so that the cache is invalidated when the files change? Previously I was using it as a class property. Now it seems the correct approach is to pass it to every affected function. I'm going to try using the class properties as defaults for these duplicated function parameters to avoid repeating myself.

subtle adder
vale hazel
#

Can I avoid passing class properties to every function that depends on them? Merely defining them in the class __init__ does not suffice.

subtle adder
#

If you use defaultPath (https://docs.dagger.io/api/default-paths/?sdk=typescript) in the module constructor, you won't have to pass that property on every call.

It is possible to assign a default path for a Directory or File argument in a Dagger Function. Dagger will automatically use this default path when no value is specified for the argument. The Directory or File loaded in this manner is not merely a string, but the actual filesystem state of the directory or file.

vale hazel
#

@subtle adder When I did not pass the Annotated DefaultPath arguments, I got "missing 2 required positional arguments". Is it supposed to be Annotated[dagger.Directory] = DefaultPath("/") rather than Annotated[dagger.Directory, DefaultPath("/")]? That is how you would define a default in Python.

subtle adder
vale hazel
#

Let's say I have

@function
async def test_and_build_backend(
    self,
    backend_dir: Annotated[Directory, DefaultPath("foo")],
    env_toml_path: Annotated[File, DefaultPath("bar")],
    pr_number: str = "latest",
) -> Container:
    ...

I can't successfully call test_and_build_backend without passing the first two parameters.

subtle adder
#

@vale hazel Looking at the TS docs here, it seems like you're using the wrong annotation? https://docs.dagger.io/api/default-paths/

vale hazel
#

I'm on Python. I feel like you've defined the type but not its default value.

fair whale
#

Yeah, for Python that looks correct using the Annotated[Directory, DefaultPath("foo")] format.

vale hazel
#

How can that work? In Python, default values are provided like def f(foo=bar). We've merely annotated foo, without providing bar. So Python complains.

fair whale
#

@sinful galleon may be able to speak to the implementation, but I believe it boils down to the fact that the defaults are resolved on the Dagger engine side before they hit the function.
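
To illustrate how a default carried in Annotated metadata can work even though Python itself sees no default value, here is a minimal stdlib-only sketch. It is not the Dagger SDK: the `DefaultPath` class below is a hypothetical stand-in, the directory is modelled as a plain string, and `call_with_resolved_defaults` plays the role of whatever layer fills in the argument before the function runs.

```python
import inspect
from typing import Annotated, get_type_hints


class DefaultPath:
    """Hypothetical stand-in for a default-path marker (not dagger.DefaultPath)."""

    def __init__(self, path: str) -> None:
        self.path = path


def grep_dir(directory: Annotated[str, DefaultPath("/")], pattern: str = "foo") -> str:
    # The directory is a plain string here purely for illustration.
    return f"grep {pattern} {directory}"


def call_with_resolved_defaults(func, **given):
    """Fill in missing arguments from Annotated metadata, then call func."""
    hints = get_type_hints(func, include_extras=True)
    kwargs = dict(given)
    for name in inspect.signature(func).parameters:
        if name in kwargs:
            continue
        # Annotated types expose their extra arguments as __metadata__.
        for meta in getattr(hints.get(name), "__metadata__", ()):
            if isinstance(meta, DefaultPath):
                kwargs[name] = meta.path
    return func(**kwargs)


# Going through the resolving layer works without passing the directory.
via_framework = call_with_resolved_defaults(grep_dir)

# Calling the function directly bypasses that layer, so Python complains.
try:
    grep_dir()
    direct_error = None
except TypeError as exc:
    direct_error = str(exc)

print(via_framework)
print(direct_error)
```

In this sketch the annotation only describes a default; something outside the function has to read it. A direct Python call skips that step, which would be consistent with the "missing required positional arguments" error reported earlier in the thread.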

subtle adder
fair whale
#

I just made this based on the boilerplate module code

@function
async def grep_dir(
    self,
    directory_arg: Annotated[Directory, DefaultPath("/")],
    pattern: str = "foo",
) -> str:
    ...

I'm able to just run dagger call grep-dir without specifying anything for either arg.

vale hazel
#

You are calling the function from the CLI, so directory_arg probably gets populated; I'm calling one function from another. If the CLI is the only way defaults are supported, the documentation could be clarified.

merry blaze
vale hazel
#

How do you call one intra-module function from another through the Dagger API?

subtle adder
fair whale
subtle adder
vale hazel
#

@subtle adder As stated above, the reason I'm using these annotated Dagger objects is to properly declare the cache keys. Before, I was just using the class properties.

I hope inserting the Dagger API between internal function calls won't make IDE debugging more difficult. In the meantime, you might want to add a note in the docs.

I'll revisit this question once the functionality is added.

subtle adder
vale hazel
#

Yes, but you can't use them as function defaults, as in f(foo = self.bar). Am I following you? Or are you saying I don't need them at all, in which case why were the tasks being skipped? I thought I had to declare the function's inputs as cache keys.

subtle adder
vale hazel
#

If I call f(None), won't the cache ignore the call, regardless of whether the contents of default_src have changed, since None has been processed before?

subtle adder
#

Having said that, Helder will surely know if there's a better way to handle this.

subtle adder
#

They always execute.

#

What gets cached is the dag.* calls within your functions
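
The distinction can be sketched with a toy model in plain Python. This is an analogy only, not Dagger code: `cached_operation` is a hypothetical stand-in for a dag.* call, memoized on its input, and `build` stands in for a module function whose own body is ordinary code.

```python
from functools import lru_cache

# Counters recording how often each piece of code actually ran.
calls = {"function_body": 0, "cached_operation": 0}


@lru_cache(maxsize=None)
def cached_operation(source_digest: str) -> str:
    """Stand-in for a dag.* call: it only runs when its input is new."""
    calls["cached_operation"] += 1
    return f"built:{source_digest}"


def build(source_digest: str) -> str:
    """Stand-in for a module function: its plain code runs on every call."""
    calls["function_body"] += 1
    return cached_operation(source_digest)


build("abc")
build("abc")  # same input: the inner operation is served from the cache
build("def")  # new input: the inner operation runs again

print(calls)  # {'function_body': 3, 'cached_operation': 2}
```

Here the outer function ran three times while the memoized operation ran only twice, once per distinct input.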

subtle adder
fair whale
sinful galleon
#

I could possibly extend it with class DefaultPath(dagger.Directory), but I don't quite remember now what the problem with that was. I believe it's to discourage you from trying to use it directly, but there was also a caching chicken-and-egg situation.

sinful galleon
vale hazel
#

Thanks for everyone's support. I look forward to this functionality.