#Why do I have to set my container to a variable for .with_exec to execute?

ancient frost
#

The observation is that some scenarios in which I'm using .with_exec to write to a file using echo don't always work.

This always works: the container is created with client.container().from_() and the .with_exec command is chained onto it:

        ubuntu_container = (
                            client.container()
                            .from_("ubuntu:latest")
                            .with_exec(
                                [
                                    "sh",
                                    "-c",
                                    (
                                        "echo -n Hello World > /tmp/sandbox_test.txt"
                                    ),
                                ])
                        )
        file_contents = await ubuntu_container.file("/tmp/sandbox_test.txt").contents()
        print(f"==== file contents 1 /n| {file_contents}")

What also works is if I take my ubuntu_container object and run a .with_exec command and set the result of that to a variable:

        ubuntu_container_with_newFile = ubuntu_container.with_exec(
            [
                "sh",
                "-c",
                (
                    "echo -n Hello World 2 > /tmp/sandbox_test2.txt"
                ),
            ])
        # This works, and creates sandbox_test2.txt
        newFile_contents = await ubuntu_container_with_newFile.file("/tmp/sandbox_test2.txt").contents()
        print(f"==== New file contents 1 /n| {newFile_contents}")

What does not work is calling .with_exec without assigning the result to a variable. Reading the file afterwards throws an error because the file does not exist:

        ubuntu_container.with_exec(
            [
                "sh",
                "-c",
                (
                    "echo -n Hello World > /tmp/sandbox_test3.txt"
                ),
            ])

What am I not understanding?
I would assume you can use the same obj/var to execute multiple commands against the container.

eager rock
#

Hey @ancient frost 👋 Did you exclude the await in the last snippet? Try adding a sync() after the with_exec()

ancient frost
#

@eager rock , I feel that I have tried that but will try again real quick.

eager rock
#

What I think you're bumping into is that the DAG is lazily evaluated (This might be the most updated issue describing this: https://github.com/dagger/dagger/issues/3617). So in your original code, the awaited contents() is the call that triggers the evaluation of the dag. In the second snippet, it's effectively the same as the first since you're reassigning ubuntu_container. The with_exec does not directly change ubuntu_container, it returns a new Container with the modified attributes, which is why it needs to be assigned to a new variable (or reuse the same variable)
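
The "returns a new Container" point can be sketched without Dagger at all. The ToyContainer class below is a hypothetical stand-in, not the Dagger SDK; it only models the idea that with_exec leaves the original object untouched and hands back a new one.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ToyContainer:
    """Hypothetical immutable container; NOT the Dagger SDK."""

    steps: Tuple[str, ...] = ()

    def with_exec(self, command: str) -> "ToyContainer":
        # Return a NEW object with the extra step; self is left untouched.
        return ToyContainer(self.steps + (command,))


base = ToyContainer().with_exec("echo one > /tmp/1.txt")

# Result discarded: base still carries only the first step.
base.with_exec("echo three > /tmp/3.txt")
discarded = base.steps

# Result kept: the new object carries both steps.
kept = base.with_exec("echo two > /tmp/2.txt")

print(discarded)
print(kept.steps)
```

Reading a file from base after the discarded call corresponds to the failing snippet in the question: the step that would have created the file was never attached to the object being read.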

ancient frost
#

@eager rock

┃ ==== New file contents 1 /n| Hello World 2
┃ ==== New file contents 2/n| Hello World 2
┃ Traceback (most recent call last):
┃   File "/mnt/c/Projects/baseballhistory/CI/.venv/lib/python3.10/site-packages/dagger/client/_core.py", line 149, in execute
┃     result = await self.conn.session.execute(query)
┃   File "/mnt/c/Projects/baseballhistory/CI/.venv/lib/python3.10/site-packages/dagger/client/_session.py", line 126, in execute
┃     return await (await self.get_session()).execute(query)
┃   File "/mnt/c/Projects/baseballhistory/CI/.venv/lib/python3.10/site-packages/gql/client.py", line 1231, in execute
┃     raise TransportQueryError(
┃ gql.transport.exceptions.TransportQueryError: {'message': 'lstat /tmp/buildkit-mount3511360911/tmp/sandbox_test2.txt: no such file or directory', 'locations': [{'line': 5, 'column': 5}], 'path': ['container', 'file']}
#

@eager rock That would make sense with the Lazy loading.

eager rock
#

Yeah that makes sense, it was the last bit that was the real issue

The with_exec does not directly change ubuntu_container, it returns a new Container with the modified attributes, which is why it needs to be assigned to a new variable (or reuse the same variable)
So when you're calling file().contents(), it's on the ubuntu_container from before the file was created

ancient frost
#

Ok, so after the "initial" chaining of events, if you execute some commands, you always have to set that to a new container var to force it to execute?

#

(or reuse the same variable)

Ok..I reread that

eager rock
#

Not exactly, it's more about what you're asking the dag to execute. For example

a = client.container().from_("foo")
b = a.with_exec(["echo", "b"])
c = b.with_exec(["echo", "c"])
await c.with_exec(["echo", "d"]).sync()
e = await c.sync()  # <-- does not contain d
#

hopefully that is more clear, but happy to dig in further 🙂

#

and this is identical to the above

abc = client.container().from_("foo").with_exec(["echo", "b"]).with_exec(["echo", "c"])
await abc.with_exec(["echo", "d"]).sync()
e = await abc.sync()  # <-- does not contain d
ancient frost
#

await abc.with_exec("echo", "d").sync() does not actually fire, right? Because of the lazy loading?

eager rock
#

It does, the sync() is what triggers the evaluation. Basically anything that needs await will have an execution. But in both cases it would be "abcd" followed by "abc"
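
A toy model may make the "abcd" followed by "abc" ordering easier to see. LazyChain below is a hypothetical class written for illustration, not part of Dagger: with_exec only records a step, and sync is the only call that evaluates anything.

```python
class LazyChain:
    """Hypothetical lazy pipeline: steps are recorded, not run, until sync()."""

    executed = []  # shared log of every evaluation, for illustration only

    def __init__(self, steps=()):
        self.steps = tuple(steps)

    def with_exec(self, name):
        # Only records the step on a new chain; nothing runs here.
        return LazyChain(self.steps + (name,))

    def sync(self):
        # Evaluation happens here, over exactly the steps in this chain.
        LazyChain.executed.append("".join(self.steps))
        return self


abc = LazyChain().with_exec("a").with_exec("b").with_exec("c")
assert LazyChain.executed == []  # nothing has run yet

abc.with_exec("d").sync()  # evaluates a, b, c, d
abc.sync()                 # evaluates a, b, c only

print(LazyChain.executed)  # ['abcd', 'abc']
```

The second sync never sees "d" because "d" was attached to a chain that was not kept.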

ancient frost
#

Oh...I think I get it and I think I was missing some of the other Dagger sauce fundamentals.

Which is "when to use sync". I know it "executes" a pipeline but maybe I was hung up on what the definition of a pipeline is.

I considered my pipeline to be the contents of the entire file, whatever I was trying to accomplish vs a single step in that file.

In my demo I'm loading my source code from host into one container to build, then referencing the output of that in another container to do code scanning and I don't have sync anywhere, but the chaining probably forces the execution. And I considered that whole thing to be my pipeline.

eager rock
#

Ah yeah, that's the lazy DAG model. In your original pipeline, the thing triggering the execution is actually the contents() call. Walking back the dag from there, it needs the file(), which depends on the container, which has the with_execs, so all of that gets executed. But in the case where the with_exec isn't assigned to a variable, it's basically dangling and isn't part of the DAG executed for contents()

#

And then when the sync() is added, the with_exec does get executed, but still not in the same DAG as the contents(), so the file doesn't exist there

ancient frost
#

To sum it up, it would be a preferred practice then, as you stated, to reuse the same variable or create a new one if you alter the DAG.

eager rock
#

Yes, exactly 🙂 that's the most common pattern for sure. Using new variables is handy for example if you're creating a base image to reuse in multiple pipelines. Like a python base with some stuff preinstalled.

base = client.container().from_("python").with_exec(["apk", "add", "foo"]).with_directory("/src", src).with_workdir("/src")
test = await base.with_exec(["pytest"]).sync()
build = await base.with_exec(["bar"]).sync()
ancient frost
#

Thank you for this, it is making a lot more sense now.

eager rock
#

Happy to help! It sounds like we can make some improvements to our docs 🙂

ancient frost
#

If you have a minute to help me dive in just a tad more. Here is code that is failing and I'm not sure how to handle it.

I'm using Snyk CLI to run my OSS package vulnerability scan and it is finding out of date packages/security issues and in the same command it's supposed to be writing a file with the results, that I later want to export and digest. It is appropriately flagging the exit_code as 1 and maybe it's tossing an error as well. I'm trapping the error, but what you are saying is the file I'm trying to reference is not accessible in my error catch.

async def run_snyk_cli(snyk_cli: dagger.Container, snyk_exe_path: str, snyk_api: str) -> None:
    authed_container = await snyk_cli.with_exec([snyk_exe_path, "auth", snyk_api]).sync()
    try:
        result_container = await authed_container.with_exec(
            [
                "/usr/local/bin/snyk",
                "test",
                "--file=obj/project.assets.json",
                "--sarif-file-output=oss-report.json",
                "--debug"
            ]
        ).sync()
        oss_file = result_container.file("oss-report.json")
        file_size = await oss_file.size()
        print(f"==== OSS file size 1 | {file_size} ===")
    except dagger.ExecError as e:
        oss_file = authed_container.file("oss-report.json")
        file_size = await oss_file.size()
        print(f"==== OSS file size 2 | {file_size} ===")
#

The print in the except block also throws an error, because the file doesn't exist, and that all lines up with what you said above.

eager rock
#

I'm trying and failing to find a github issue I know we have that describes what you're encountering here. I'll follow up again when I find it.

Does snyk create the oss-report.json even when it fails? The exit code 1 just signals that there's some sort of policy failure?

ancient frost
#

It does appear to write the file, regardless of if it finds problems or not. I got this from their support:

The exit code is only set when the process exits, by which point it should have done all its work (unless there is some sort of race condition).
 
More likely the file is being written somewhere and then discarded, or it’s being written somewhere on the filesystem but not where you expect, or the file write is failing due to a read-only filesystem.

I was going to treat the report the way I used to treat JUnit XML in the past: run the analysis, then read the results in my mainline and process them with my Python code to determine the next step (pass, fail, etc.).
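
As a rough sketch of that "read the results and decide" step, the snippet below counts results per severity level in a SARIF-style document. The sample report, the rule IDs, and the should_fail policy are all made up for illustration; a real Snyk report will contain more fields, and treating a missing level as "warning" is an assumption.

```python
import json

# Made-up SARIF-like sample for illustration; a real report will differ.
SAMPLE_REPORT = """
{
  "version": "2.1.0",
  "runs": [
    {
      "results": [
        {"ruleId": "EXAMPLE-1", "level": "error"},
        {"ruleId": "EXAMPLE-2", "level": "warning"},
        {"ruleId": "EXAMPLE-3", "level": "error"}
      ]
    }
  ]
}
"""


def count_levels(report_text: str) -> dict:
    """Count results per severity level across all runs."""
    report = json.loads(report_text)
    counts: dict = {}
    for run in report.get("runs", []):
        for result in run.get("results", []):
            # Assumption: a result without a level is treated as a warning.
            level = result.get("level", "warning")
            counts[level] = counts.get(level, 0) + 1
    return counts


def should_fail(counts: dict, max_errors: int = 0) -> bool:
    """Hypothetical policy: fail when errors exceed the allowed maximum."""
    return counts.get("error", 0) > max_errors


counts = count_levels(SAMPLE_REPORT)
print(counts)
print("fail" if should_fail(counts) else "pass")
```

In the pipeline, report_text would be whatever the exported oss-report.json contains.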

eager rock
#

Awesome, thanks for the context.
So yeah part 1 of the problem is what you were describing earlier. The file should be in result_container but not in authed_container.

The second part, which I was looking for the github issue for, is that there's currently no way to call result_container.file(), because the DAG will "fail" on the snyk exit code 1 and never make it far enough to see the file. There are a few ways to work around that today. One is to avoid the exit code, as long as you have some other way to determine whether the pipeline should pass or fail. That would look something like with_exec(["sh", "-c", "/usr/local/bin/snyk test --file=obj/project.assets.json --sarif-file-output=oss-report.json --debug || true"])
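
The effect of the trailing || true can be checked locally with plain sh, outside of Dagger. In this sketch the "scanner" is a hypothetical one-liner that writes a report and then exits 1; it stands in for any tool that reports findings through its exit code.

```python
import subprocess
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as workdir:
    # Stand-in for a scanner: writes a report, then exits 1 (hypothetical).
    scanner = "echo findings > report.txt; exit 1"

    # Without the workaround the shell reports the scanner's exit code.
    plain = subprocess.run(["sh", "-c", scanner], cwd=workdir)

    # Parentheses make `|| true` apply to the scanner as a whole.
    masked = subprocess.run(["sh", "-c", f"({scanner}) || true"], cwd=workdir)

    # The report is written in both cases; only the exit code differs.
    report_exists = (Path(workdir) / "report.txt").exists()

print(plain.returncode, masked.returncode, report_exists)  # 1 0 True
```

The trade-off is that the exit code is lost, so pass or fail has to be decided from the report itself.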

#

We see people do this sometimes with test frameworks like pytest as well to get the test report like you mentioned

ancient frost
#

one is to avoid the exit code , Does Dagger see the setting of the exit_code as 1 as an error? Is this snyk cli actually tossing an error because it did find issues (this is asking your guess, I can test this out or ask them).

Basically, I'm not sure I can avoid it because something is tossing the error and I guess I have no way to get to the container to get to the file in this case.

eager rock
#

Yeah exit code 1 is a failure, so dagger will see that exec as failed

I'm not sure I can avoid it because something is tossing the error and I guess I have no way to get to the container to get to the file in this case.

I think if you reassign authed_container = await authed_container.with_exec(... instead of assigning to result_container you should have it in the except, right?

ancient frost
#

Just finished testing.

Experiment #1
With your suggestion:

authed_container = await authed_container.with_exec(... instead of assigning to result_container 

I was still not able to see the file in my catch block.

Experiment #2
Replacing my .with_exec with your version

with_exec(["sh", "-c", "/usr/local/bin/snyk test --file=obj/project.assets.json --sarif-file-output=oss-report.json --debug || true"])

Worked like a champ. The error was swallowed and the file was there.

I guess I should use this, but will have to keep an eye out for the existing Dagger issue to be fixed.

What are your thoughts?

eager rock
#

but will have to keep an eye out for the existing Dagger issue to be fixed.
Yes totally agree, we need a better way to handle this in dagger. There are some other ways to use the same workaround to have a better idea if the command failed like snyk ... && printf OK > /ok || printf FAIL > /ok (https://github.com/dagger/dagger/issues/3192#issuecomment-1308469426)

GitHub

We currently expose this API: type Container { exitCode: Int } ...but the only possible value returned here is null or 0. If the command exits nonzero Buildkit will error instead. We could have the...
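
The marker-file variant of the workaround can also be tried locally with plain sh. The run_with_marker helper below is hypothetical, and it writes to a relative file named status inside a temporary directory instead of /ok so that it runs without root.

```python
import subprocess
import tempfile
from pathlib import Path


def run_with_marker(command: str, workdir: str):
    """Run command via sh and record OK or FAIL in a marker file.

    Returns (wrapper_exit_code, marker_text).
    """
    # Parentheses keep the marker logic applied to the command as a whole.
    script = f"({command}) && printf OK > status || printf FAIL > status"
    completed = subprocess.run(["sh", "-c", script], cwd=workdir)
    marker = (Path(workdir) / "status").read_text()
    return completed.returncode, marker


with tempfile.TemporaryDirectory() as workdir:
    passing = run_with_marker("true", workdir)
    failing = run_with_marker("exit 1", workdir)

print(passing)  # (0, 'OK')
print(failing)  # (0, 'FAIL')
```

The wrapper exits 0 either way, so the exec step is not treated as failed, while the marker file still records whether the wrapped command succeeded.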

ancient frost
#

@eager rock Thank you again for your help on this.