I'm curious - would this sort of project be a good candidate for using Dagger to manage running a suite of LLM code model evaluation tests across multiple languages? https://github.com/bigcode-project/bigcode-evaluation-harness/blob/main/Dockerfile-multiple
#Use case for Dagger in OSS project?
For sure. I recently gave a talk on using Dagger for testing and validation. If you go to the demos channel, you'll find two other posts on GPU and ML workflows in Dagger.
The benefit here is that you can build up the testing environment once, and then add the tests as little layers on top, each clean and ephemeral. This can happen inside the loops of your testing matrix, so every combination starts from the same base and nothing leaks between runs.