#service healthcheck failing too fast


forest turtle

Hi,

I'm trying out Dagger for local development and am having some issues with the healthcheck of a service.
I'm defining a service using the official Kafka image like this:

    kafkaService := dag.Container().
        From("apache/kafka:3.8.0").
        WithMountedCache("/tmp/kraft-combined-logs", dag.CacheVolume("kafka-data")).
        WithExposedPort(9092).
        WithExposedPort(19092).
        AsService()

This works fine when I run it with Docker Compose (with the appropriate environment variables), but with Dagger the healthcheck always fails:

      ✘ 9092/tcp 19092/tcp 0.9s
      ! checking for port 9092/tcp: context canceled
      ┃ 16:44:07 WRN port not ready error="dial tcp 10.87.0.35:9092: connect: connection refused" elapsed=14.421625ms                                                                                                                                                                                                          
      ┃ 16:44:07 WRN port not ready error="dial tcp 10.87.0.35:9092: connect: connection refused" elapsed=118.574917ms                                                                                                                                                                                                         
      ┃ 16:44:07 WRN port not ready error="dial tcp 10.87.0.35:9092: connect: connection refused" elapsed=310.343542ms                                                                                                                                                                                                         
      ┃ 16:44:07 WRN port not ready error="dial tcp 10.87.0.35:9092: connect: connection refused" elapsed=497.743584ms   

Kafka needs something like 5s to start up properly, but it seems like the healthcheck fails too fast. Is there a way to control the healthcheck intervals? Or am I missing something?

fast current

found the issue vincent, this works:

    // Test binds the Kafka service to a client container.
    func (m *KTest) Test(ctx context.Context) *dagger.Container {
        kafkaService := m.KafkaCtr(ctx).
            WithExposedPort(9092).
            WithExposedPort(19092).
            AsService()

        return dag.Container().
            WithServiceBinding("kafka", kafkaService).
            From("apache/kafka:3.8.0").
            WithExec([]string{"ls", "-la"})
    }

    // KafkaCtr returns the Kafka container on its own, so it can also be
    // inspected separately from the service.
    func (m *KTest) KafkaCtr(ctx context.Context) *dagger.Container {
        return dag.Container().
            From("apache/kafka:3.8.0").
            // Owner: "1000" keeps the cache volume from being owned by root,
            // so Kafka can write to it.
            WithMountedCache("/tmp/kraft-combined-logs", dag.CacheVolume("kafka-data"), dagger.ContainerWithMountedCacheOpts{Owner: "1000"})
    }

the fix is passing dagger.ContainerWithMountedCacheOpts{Owner: "1000"} to WithMountedCache; otherwise the cache volume mounted for Kafka is owned by root and the Kafka container can't write to it

forest turtle

ah thank you, I’ll try that! was there an error somewhere that indicated this? because I ran with -vvv and didn’t see anything, but it’s possible I just didn’t look carefully enough

fast current

I saw this with -vvv:

    ┃ ===> Using provided cluster id 5L6g3nShT-eMCtK--X86sw ...                                            
    ┃ Formatting /tmp/kraft-combined-logs with metadata.version 3.8-IV0. Error while writing meta.properties file /tmp/kraft-combined-logs: /tmp/kraft-combined-logs/bootstrap.checkpoint.tmp
      ✘ 9092/tcp 19092/tcp 1.1s
      ! checking for port 9092/tcp: context canceled

^ you can see the "Error while writing meta.properties" message above

after getting that, I refactored my example into what I posted above so I could run `dagger call kafka-ctr terminal` and get a shell in the Kafka container to understand what was happening

from there it was regular debugging, and I realized that the /tmp/kraft-combined-logs folder was being created with incorrect permissions