#File prints content in log

heady wraith
#

I am performing a git diff in one container and then including that file in the other container.

But dagger prints the whole file content into the log, which is not acceptable within our CI/CD pipeline.

Setting the silent option hides all other output as well, e.g. the test results that are sent to stdout.

There was the same question before: #1349120663884926986 message
But without any real answer on the actual question.

    @staticmethod
    async def execute_git_command(root_dir: "dagger.Directory", command: list[str]) -> str:
        """Execute a generic git command using a Dagger container."""
        output_file = "/tmp/output.txt"  # noqa: S108
        output = (
            await MonorepoDagger.git_container(root_dir)
            .with_exec(command, redirect_stdout=output_file)
            .file(output_file)
            .contents()
        )
        return output.strip()

This will print the file content to the log.

This then is included in the other container:

.with_file("diff.txt", dag.file("diff.txt", contents=git_diff))

Which prints the file content to the log again.

How to avoid this? What is the recommended approach?

heady wraith
#

This happens even if I just do something like:

directory.with_file("uv.lock", root_dir.file("uv.lock"))

I get the whole content of uv.lock in the log output.

I never want file content to be printed to the log. How to disable this?

heady wraith
#

The log prints something like this:

96  : ┆ Directory.file(path: "uv.lock"): File!
96  : ┆ Directory.file DONE [0.0s]
97  : ┆ File.contents: String!
97  : ┆ File.contents DONE [0.0s]
94  : ┆ File.contents DONE [0.0s]
... file content is printed here
green carbon
heady wraith
#

Hi @green carbon, thanks for sharing. The issue also focuses on secrets, which is not a concern for me. Hence, it does not address the issue I'm having.

green carbon
#

> Setting the silent option hides all other output as well, e.g. the test results that are sent to stdout.

FWIW, if your Dagger function returns your test output via stdout, then silent should also work.

#

i.e:

package main

import (
    "context"
    "dagger/foo/internal/dagger"
)

type Foo struct{}

func (m *Foo) Foo() *dagger.Container {
    dag.Directory().WithNewFile("foo.txt", "here's some file contents").File("foo.txt").Contents(context.Background())
    return dag.Container().From("alpine").WithExec([]string{"echo", "hello"})
}

Calling `dagger -s call foo stdout` will print "hello" but will not print the contents of the file.

heady wraith
#

I have a Dagger function which 1. builds, 2. runs the linter, 3. runs tests, and 4. runs a publishing step.

When I set silent, then only the output of the publishing step is printed. Not the output of the linting and tests.
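
Following the suggestion above that silent mode still prints whatever the function returns, one possible approach is to collect the stdout of each step and return everything as a single string. The sketch below is only an illustration: `run_steps` and the `== name ==` section headers are hypothetical, and `ctr` is assumed to behave like a `dagger.Container` (a chainable `with_exec` and an async `stdout()` that yields the output of the last executed command). It is typed as `Any` so the snippet does not depend on the SDK.

```python
from typing import Any


async def run_steps(ctr: Any, steps: dict[str, list[str]]) -> str:
    """Run each named step in order and return all stdout as one report string.

    Assumption: `ctr` behaves like dagger.Container (with_exec + async stdout).
    """
    sections = []
    for name, command in steps.items():
        # Chain the exec so later steps build on the result of earlier ones.
        ctr = ctr.with_exec(command)
        # stdout() refers to the command that was just added to the chain.
        output = await ctr.stdout()
        sections.append(f"== {name} ==\n{output.rstrip()}")
    # The combined report is what the function would return to the caller.
    return "\n\n".join(sections)
```

Returning this string from the Dagger function should make the lint and test output visible in silent mode as well, at the cost of buffering each step's output in memory.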

#

But to get to the core here: Why does it print the file content to the log at all? What is the use case to do so?

#

Files usually can be quite large. Printing them to the log does not seem like a good idea in any case.

green carbon
#

from the observability POV, it's usually quite handy to have the contents of the Contents call in the trace

heady wraith
#

I get that, when I do it explicitly. But in this case I'm not even executing stdout() or contents().

What I'm missing is a way I can be specific about what I want to be printed to the console.
I tested my own "prints" in python but they are also not visible when selecting the silent option.

#

Am I getting you right that atm. when I want to use files, I have to accept that they will be printed to the log in full and there is no way to disable that or workaround it?

#

This is also an issue for our monitoring: it records all logs, so the file contents become a storage problem.

green carbon
heady wraith
#

I checked again and there is a .contents() further down in my code. That might be the source. But is there any way other than .contents() to read a file into a variable in the Python SDK?

uv_lock_dict = tomli.loads(await uv_lock.contents())
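
For the git diff case from the first message, where the content is only passed on to another container, a possible way around `.contents()` is to hand the File reference over directly. This is a sketch under assumptions: `attach_git_output` is a hypothetical helper, both arguments are assumed to behave like `dagger.Container` (typed as `Any` so the snippet has no SDK dependency), and it relies on the observation in this thread that the content appears in the log because of the `File.contents` call.

```python
from typing import Any

OUTPUT_PATH = "/tmp/output.txt"  # same scratch path as in the original snippet


def attach_git_output(
    git_ctr: Any, target_ctr: Any, command: list[str], dest: str = "diff.txt"
) -> Any:
    """Mount the output of `command` into `target_ctr` without reading it into Python.

    Assumption: both containers behave like dagger.Container.
    """
    # Reference the redirected output as a File; File.contents() is never called.
    output_file = git_ctr.with_exec(command, redirect_stdout=OUTPUT_PATH).file(OUTPUT_PATH)
    # Pass the File reference straight to the other container.
    return target_ctr.with_file(dest, output_file)
```

Unlike the original `execute_git_command`, this does not strip surrounding whitespace from the output, since the content never reaches Python.
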
keen loom
#

Wait, not using modules?

heady wraith
#

Modules? 😅

green carbon
keen loom
#

Dagger modules. Seems like you're running with dagger run python script.py or just python script.py?

heady wraith
#

I'm running with dagger call

#

and dagger functions

#

> Yeah, you can export the file and read it with Python.

Uh, so there is no way to directly assign the file content and instead I have to first export to a temp file on the disk and then read that back in again?

#

Would be great to have an alternative to "contents" that does not log the file content

#

or add an option to contents(log_content=False) which is set to True by default.

#

If you want I can open a feature request on GitHub

green carbon
#

Yes, that's the only way for the moment. When you call Export within a Dagger module, it doesn't get exported to the host machine but to the module sandbox instead. So something like this should work:

await file.export("myfile.toml")
with open("myfile.toml", "rb") as f:
    toml_dict = tomli.load(f)

> Would be great to have an alternative to "contents" that does not log the file content

Once this (https://github.com/dagger/dagger/issues/10376) is implemented, we'll have a better way 🙏

heady wraith
#

For now I wrote a small helper function to do the export and read of a dagger file:

    @staticmethod
    async def read_file(file: File) -> str:
        """Export a Dagger File to a temporary file and return its content as a string."""
        with tempfile.NamedTemporaryFile(delete=False) as tmp:
            tmp_path = tmp.name

        try:
            await file.export(tmp_path)
            async with await anyio.open_file(tmp_path) as f:
                return await f.read()
        finally:
            if os.path.exists(tmp_path):
                os.remove(tmp_path)

Seems to work and my logs no longer contain file content 👍

keen loom
#

Each time you make a function call, it runs in its own container (instance), and your workdir is an empty /scratch directory. The container is not reused if you run a second function, so there's no need for a temporary file, and no need to delete it afterwards since it's not going to be persisted. (A tempfile could be a nice way to get a random file name without making another request for the actual file's name, but it adds a little bit of overhead.) The workdir's purpose is exactly these types of operations.

So you can simplify it a little bit if you want, and also, I suggest you use anyio for the os parts as well so it's non-blocking. For example:

    async def read_file(file: File) -> str:
        """Export a Dagger File to a temporary file and return its content as a string."""
        name = uuid.uuid4().hex
        await file.export(name)
        return await anyio.Path(name).read_text()
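
To try the export-then-read pattern outside of a Dagger module, here is a minimal runnable sketch. Everything in it besides the pattern itself is an assumption for illustration: `FakeFile` is a hypothetical stand-in that only mimics an async `export()` method, `demo` is a made-up driver, and `asyncio.to_thread` with `pathlib` replaces `anyio.Path` so that only the standard library is needed.

```python
import asyncio
import os
import tempfile
import uuid
from pathlib import Path


class FakeFile:
    """Hypothetical stand-in for dagger.File; it only mimics an async export()."""

    def __init__(self, data: str) -> None:
        self._data = data

    async def export(self, path: str) -> str:
        # Write the content to the requested path, as an export would.
        Path(path).write_text(self._data, encoding="utf-8")
        return path


async def read_file(file: FakeFile) -> str:
    """Export the file under a random name in the workdir and read it back."""
    name = uuid.uuid4().hex
    await file.export(name)
    # to_thread keeps the blocking read off the event loop (stdlib substitute for anyio).
    return await asyncio.to_thread(Path(name).read_text, encoding="utf-8")


def demo(text: str) -> str:
    """Run read_file inside a throwaway directory, mimicking an empty scratch workdir."""
    previous = os.getcwd()
    with tempfile.TemporaryDirectory() as scratch:
        os.chdir(scratch)
        try:
            return asyncio.run(read_file(FakeFile(text)))
        finally:
            # Leave the directory before it is removed.
            os.chdir(previous)
```

Calling `demo('name = "example"\n')` returns the same string that was passed in, without the content going through a `contents()` call.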