#Integration Test


finite patrol
#

I am new to dagger and interested in writing integration tests. I've gone through the tutorials and understand how to build a container and how to run its unit tests. I am struggling to figure out how to do integration testing though. I would like to start up multiple containers, send data to them and check the results. I looked at services, but that doesn't really match my processing. We have a pipeline, where data flows through several containers. In the integration test, I would like to stand up all of the pipeline steps and then run data through.

Is there a "good" or canonical way to do this in Dagger? Are there any tutorials?

jolly carbon
#

What does run the data through look like in your case? Are you making HTTP requests?

A bit of a self plug: I wrote an article a few weeks ago doing end to end tests for an API with a connection to a db. You can skip to the dagger part and see if that helps you, if not I'm happy to understand a bit better what you are trying to accomplish! https://blog.matiaspan.dev/posts/exploring-dagger-streamlining-ci-cd-pipelines-with-code/

finite patrol
#

I do signal processing where the data passes through a series of transformation stages. Each step applies a mathematical algorithm to the data. These are not the actual transformations, but to give an example, the input to the first stage of the pipeline is a string message "Hello World". Let's say that the stages do the following:

  1. Compress the input data using gzip
  2. Add a CRC32.
  3. Convert data to format for wireless transmission

All of these stages are linear and we can think of it as building up layers in an onion. We have very high throughput data, so the stages are connected via a high speed bus/message broker. To check the data, we can invert the steps in reverse order - peeling off the onion layers. We can (and do) write unit tests for the individual stages, but I would like to have an integration test for the whole system.
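
The layering described above can be sketched in plain Python. This is only an illustration of the "onion" idea, not the real stages: the gzip and CRC32 steps follow the example in the message, while base64 is an assumed stand-in for the wireless transmission format, and all function names are hypothetical.

```python
import base64
import gzip
import struct
import zlib


def stage_compress(data: bytes) -> bytes:
    """Stage 1: compress the payload with gzip."""
    return gzip.compress(data)


def stage_add_crc(data: bytes) -> bytes:
    """Stage 2: append a big-endian CRC32 of the payload."""
    return data + struct.pack(">I", zlib.crc32(data))


def stage_encode(data: bytes) -> bytes:
    """Stage 3: stand-in for the wireless format (base64 is an assumption)."""
    return base64.b64encode(data)


def run_pipeline(message: bytes) -> bytes:
    """Apply the three stages in order, as the pipeline would."""
    return stage_encode(stage_add_crc(stage_compress(message)))


def peel(frame: bytes) -> bytes:
    """Invert the stages in reverse order and return the original payload."""
    data = base64.b64decode(frame)
    body, (crc,) = data[:-4], struct.unpack(">I", data[-4:])
    if zlib.crc32(body) != crc:
        raise ValueError("CRC32 mismatch")
    return gzip.decompress(body)
```

A checker could then assert that `peel(run_pipeline(b"Hello World"))` returns `b"Hello World"`, or compare the frame against a known good value.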

#

The ideal test (in my mind) would stand up containers for stages 1, 2, and 3. The stages would be connected in the same way that we use them in the production system. Once they are up, it would send a series of string messages to the first stage. It would process the message and send it to the second stage. We would have a comparison container or routine that would connect to the output of the last stage. This could compare against known good output values, or it could peel back all of the onion layers and make sure that we get back the original value.

I've tried using docker compose in the past, but it didn't go very well. The two big issues were ensuring that the processing stages + external services are all up and ready before sending messages, and doing something useful with the output results. I wanted to run an automated test that failed if the final data was invalid or if it didn't receive enough data within a certain amount of time.
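
One common way to handle the "is it up yet?" problem is to poll the service's TCP port until it accepts a connection or a deadline passes. The sketch below uses only the standard library; `wait_for_port` is a hypothetical helper name, and a successful connect only shows that something is listening, not that the application behind it is fully ready.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.2) -> bool:
    """Return True once a TCP connect succeeds, False if the deadline passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means a listener is bound to the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable, or timed out: wait and try again.
            time.sleep(interval)
    return False
```

For a NATS broker this might be called as `wait_for_port("nats", 4222)` before sending any messages, where `"nats"` is an assumed hostname and 4222 is the default NATS client port.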

I am intrigued by dagger because I can orchestrate a task using a real programming language. So it feels like I can do what I want. I am also open to thinking about the problem differently if that helps.

jolly carbon
#

Okay. That sounds like an interesting use case. How does the data travel between the stages? In the example you sent, we have the initial "Hello World" message that gets sent to stage:compress-gzip, is that through an HTTP API that stage:compress-gzip exposes? Once the stage:compress-gzip is done, how is the data plugged into stage:crc32? Is it via an orchestrator that knows when stage:compress-gzip is done or are the stages connected to each other at start time?

finite patrol
#

The stages are connected at startup. We generally use either ZMQ or jetstream to connect them.

#

In thinking it through, I believe that in the jetstream case (since there is a broker), I just need to have everything depend on a nats server service. It doesn't matter when the containers come up, because the broker makes sure they get the data. Once the service is up, I can very easily inject messages into the broker.

#

The ZMQ case is more complicated because it is brokerless.

jolly carbon
#

Overall, if the process is about connecting containers together so that they can communicate and validating results, I think it's a great use case for Dagger. In your case it will probably require a bit more work to get containers up and running than the example I sent before.

#

The one thing I'm still not sure about is validation. Since this is an integration test, after the stages are done you want to make sure that the result is the expected one. In your case, where is the result written to? Is it also sent via a message queue?

finite patrol
#

Yes the result would be sent to the message queue. I can have a final validation container that I insert at the end of the pipeline. That container will read the final messages and have access to whatever info is needed to validate. I'd like to check the return code of a script that it runs to decide if the test passed.
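
A validation routine along these lines could return a process exit code. This is a sketch under assumptions: `check_messages` and its `receive` callback are hypothetical names, and in a real checker `receive` would wrap a subscription on the message queue rather than the in-memory queue used in the usage note.

```python
import queue
import time
from typing import Callable, Iterable


def check_messages(receive: Callable[[float], bytes],
                   expected: Iterable[bytes],
                   deadline_s: float) -> int:
    """Return 0 if every expected message arrives, in order, before the
    deadline; return 1 on a mismatch or when data runs out in time."""
    deadline = time.monotonic() + deadline_s
    for want in expected:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return 1  # overall deadline exceeded
        try:
            got = receive(remaining)  # block for at most `remaining` seconds
        except queue.Empty:
            return 1  # not enough data arrived in time
        if got != want:
            return 1  # final data is invalid
    return 0
```

With an in-memory `queue.Queue` named `inbox`, the callback can be `lambda t: inbox.get(timeout=t)`. The script's entry point would pass the returned value to `sys.exit` so that the container's return code reflects pass or fail.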

#

I think I understand enough to put something together now. Thanks for the help.

jolly carbon
#

No problem! If you don't mind sharing your experience here afterwards that would be great! It seems to be an interesting use case where Dagger should make the experience better. If it does not, that is super valuable feedback for us.

finite patrol
#

I'm running into a problem implementing the pipeline. In my test case, I have a data insertion container, a check container, and a single container in my pipeline. I create the jetstream broker as a service. I'm using the Python API so I'll use its asyncio terminology. I await a data insertion container. Then I want to start up the container under test and the check container.

The container under test runs forever, but the check container is designed to exit after it validates the messages that it receives. What I would like to do is launch both containers at the same time, wait for the check container to finish, and then tell the container under test to shut down.

I have tried various ways to do this without success. I can create a task group, but it looks to me like I have to wait for the whole task to finish. I can't just wait on one container to finish.

Is there an easy way to launch multiple containers in dagger and wait on just one of them to complete?
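
In plain asyncio terms, waiting on just one of several tasks can be done with `asyncio.wait` and `return_when=asyncio.FIRST_COMPLETED`. The sketch below uses generic coroutines as stand-ins for awaiting the containers; `run_until_checker_exits` is a hypothetical helper, and whether cancelling the task actually stops a long-running Dagger container is an assumption that is not verified here.

```python
import asyncio
from typing import Any, Coroutine


async def run_until_checker_exits(checker: Coroutine[Any, Any, Any],
                                  long_running: Coroutine[Any, Any, Any]) -> Any:
    """Run both coroutines concurrently, return the checker's result,
    and cancel whatever is still running afterwards."""
    checker_task = asyncio.create_task(checker)
    background = asyncio.create_task(long_running)
    done, _pending = await asyncio.wait(
        {checker_task, background}, return_when=asyncio.FIRST_COMPLETED
    )
    try:
        if checker_task not in done:
            raise RuntimeError("container under test exited before the checker")
        return checker_task.result()  # re-raises if the checker failed
    finally:
        # Cancelling a finished task is a no-op, so cancel both for simplicity.
        for task in (checker_task, background):
            task.cancel()
        await asyncio.gather(checker_task, background, return_exceptions=True)
```

Unlike a task group, this returns as soon as the first task finishes instead of waiting for all of them.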

golden iris
#

👋 IIUC you have something like this, correct?

producer > test container > checker container

and messages flow through a jetstream container that all the components above have access to.

so basically the only thing you need to do is to validate that the checker container actually finishes, correct?

#

so if you make your producer, test and jetstream containers services and you bind them to the checker container that should work I think

#

cc @finite patrol forgot to reply

finite patrol
#

Thanks for the help guys. I'm going to present this tomorrow and see what people think about it.

finite patrol
#

Yes. I have a case of jetstream broker + producer -> test_container -> checker. I extended it to several test_containers.

My initial feedback would be:

  1. I got a lot of errors saying that it couldn't find a host when trying to connect to a service. I am guessing that this was because of an error in my dagger file, but I spent a chunk of time thinking that there was a networking issue.

     lookup 0s2j4lfdrtdeu.tj2m15o8h8vd2.dagger.local on 10.87.0.1:53: no such host

  2. I'm a little confused about the order in which information is printed to the screen. I tried printing a success message at the end of the test, but it printed out in the middle of the output data.

  3. It would be nice to have an alternative to with_service_binding for when I don't care about the IP of the container. Maybe something like - with_dependency(container)

  4. I really like the intuitive syntax. Gitlab/Kubernetes yaml can be really obtuse. I have shown a few people my code and it is really easy to follow what is happening.

  5. Using a real programming language allows us to do a lot of things that other products (such as docker-compose and argo workflows) can't.

  6. I wonder if I could use dagger in production systems to periodically run complicated system health checks in a similar manner to what I did here.