#DNS Resolution Failing


fallow tinsel
#

I'm having some errors with java tests failing to resolve DNS and so trying to understand DNS resolution in dagger a little better.

I exec'd into my dagger engine, and then into the target container namespace via runc

I see the following in DNS settings.

root@buildkitsandbox:/src# cat /etc/resolv.conf
search tg71nu3c0flau.dagger.local buildkit.svc.cluster.local svc.cluster.local cluster.local us-west-2.compute.internal
nameserver 10.87.0.1
options ndots:5

root@buildkitsandbox:/src# cat /etc/hosts
127.0.0.1    localhost buildkitsandbox
::1    localhost ip6-localhost ip6-loopback

10.87.0.18    redis-cluster

10.87.0.20    db

10.87.0.19    localstack

I can resolve redis-cluster via ping, but not nslookup

root@buildkitsandbox:/src# ping redis-cluster
PING redis-cluster (10.87.0.18) 56(84) bytes of data.
64 bytes from redis-cluster (10.87.0.18): icmp_seq=1 ttl=127 time=0.034 ms
64 bytes from redis-cluster (10.87.0.18): icmp_seq=2 ttl=127 time=0.046 ms
64 bytes from redis-cluster (10.87.0.18): icmp_seq=3 ttl=127 time=0.032 ms
root@buildkitsandbox:/src# nslookup
> redis-cluster
Server:        10.87.0.1
Address:    10.87.0.1#53

** server can't find redis-cluster: NXDOMAIN

Why would nslookup not work in this case? This behavior mirrors the problems I'm seeing in my tests: some libraries resolve the name fine (database), but others do not (redis), which suggests they are using different resolution methods somewhere.

A search of the Discord here shows some previous mentions of an experimental flag, but that appears to be missing from the current codebase.

All that said, any ideas? Perhaps a team member could help provide some detail about the dns internals?
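For context on the ping vs. nslookup difference: ping goes through the system resolver (on glibc that follows /etc/nsswitch.conf, so /etc/hosts is consulted), while nslookup sends a query straight to the nameserver and never reads /etc/hosts. A minimal Java sketch of the same split, assuming the JDK's bundled JNDI DNS provider is available; the class and method names here are made up for illustration:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

// Hypothetical helper class, not part of Dagger or Jedis.
class ResolverComparison {

    // System resolver path (what ping and most Java clients use).
    // Returns an empty list if the name cannot be resolved.
    static List<String> systemLookup(String host) {
        List<String> out = new ArrayList<>();
        try {
            for (InetAddress a : InetAddress.getAllByName(host)) {
                out.add(a.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // unresolved: leave the list empty
        }
        return out;
    }

    // DNS-only path (comparable to nslookup): queries the configured
    // nameserver for A records and ignores /etc/hosts entirely.
    static List<String> dnsOnlyLookup(String host) {
        List<String> out = new ArrayList<>();
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.dns.DnsContextFactory");
        env.put("com.sun.jndi.dns.timeout.initial", "1000"); // milliseconds
        env.put("com.sun.jndi.dns.timeout.retries", "1");
        try {
            DirContext ctx = new InitialDirContext(env);
            try {
                Attributes attrs = ctx.getAttributes(host, new String[] {"A"});
                Attribute a = attrs.get("A");
                if (a != null) {
                    for (int i = 0; i < a.size(); i++) {
                        out.add(String.valueOf(a.get(i)));
                    }
                }
            } finally {
                ctx.close();
            }
        } catch (NamingException e) {
            // NXDOMAIN, timeout, or no nameserver: leave the list empty
        }
        return out;
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "redis-cluster";
        System.out.println("system resolver: " + systemLookup(host));
        System.out.println("DNS only:        " + dnsOnlyLookup(host));
    }
}
```

If the first line prints an address and the second prints an empty list, the name only exists in /etc/hosts, which would match the NXDOMAIN from nslookup above.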

#

after update to 10.1 (both client and server) I see the same behavior

91: [265.6s]     redis.clients.jedis.exceptions.JedisConnectionException: Failed to connect to any host resolved for DNS name.
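The wording of that exception suggests it may be raised after every resolved address has been tried, not only when the lookup itself fails. That is an assumption about Jedis internals, not something confirmed in this thread. A rough stdlib-only sketch that separates the two cases (the class name and the default host/port in main are illustrative):

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic, independent of Jedis.
class ConnectProbe {

    // Resolves the host via the system resolver, then attempts a TCP
    // connection to every returned address and records the outcome.
    static List<String> probe(String host, int port, int timeoutMs) {
        List<String> results = new ArrayList<>();
        InetAddress[] addrs;
        try {
            addrs = InetAddress.getAllByName(host);
        } catch (UnknownHostException e) {
            results.add(host + ": resolution failed");
            return results;
        }
        for (InetAddress addr : addrs) {
            String target = addr.getHostAddress() + ":" + port;
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress(addr, port), timeoutMs);
                results.add(target + " connected");
            } catch (IOException e) {
                results.add(target + " connect failed: " + e.getMessage());
            }
        }
        return results;
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "redis-cluster";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 30000;
        for (String line : probe(host, port, 2000)) {
            System.out.println(line);
        }
    }
}
```

A "resolution failed" line points at DNS; a "connect failed" line points at reachability of the resolved address and port instead.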
sharp shoal
#

AFAIK the Java resolution system should also use /etc/hosts to resolve names

#

what's the part in your java test that's failing particularly?

fallow tinsel
#

redis tests. frustratingly, I can connect via redis-cli

#

redis cluster instantiation looks like this FWIW

  companion object {
    private val redis by lazy {
      val hosts = "redis-cluster:30000,redis-cluster:30001,redis-cluster:30002,redis-cluster:30003,redis-cluster:30004,redis-cluster:30005"
      JedisCluster(
        hosts.toTestHosts().toHostPorts(), // just splits the string
        DefaultJedisClientConfig
          .builder()
          .build()
      )
    }
  }
fallow tinsel
#

hah! nice

#

If it helps, our current setup just runs docker-compose with a redis cluster, and that is able to resolve no issue

sharp shoal
#

@fallow tinsel which Java docker image are you using?

fallow tinsel
#

Corretto 8

#

err... it's more like debian:bullseye + Corretto

#
root@d5995a003654:/src# java -version
openjdk version "1.8.0_402"
OpenJDK Runtime Environment Corretto-8.402.08.1 (build 1.8.0_402-b08)
OpenJDK 64-Bit Server VM Corretto-8.402.08.1 (build 25.402-b08, mixed mode)
sharp shoal
#
GitHub

Redis Java client. Contribute to redis/jedis development by creating an account on GitHub.

#

is the java docker image you're using public @fallow tinsel ?

fallow tinsel
#

no :/

sharp shoal
#

ok, I'll do a quick test with the same JDK version of a public image

#

and see what happens

fallow tinsel
#

which JDK image are you gonna use? I could try the same, just rebasing ours off that

sharp shoal
#

they seem to be based on alpine though 😕

fallow tinsel
#

damn.

before it's lost to my terminal history, here's a little more context. I found this issue, which seems somewhat similar and mentions that /etc/nsswitch.conf was missing. That wasn't the case for me, though I don't know if there's some misconfiguration in the content

root@buildkitsandbox:/src# vim /etc/nsswitch.conf
# /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference' and `info' packages installed, try:
# `info libc "Name Service Switch"' for information about this file.

passwd:         files
group:          files
shadow:         files
gshadow:        files

hosts:          files dns
networks:       files

protocols:      db files
services:       db files
ethers:         db files
rpc:            db files

netgroup:       nis
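For reference, the `hosts: files dns` line means glibc checks /etc/hosts first and only then falls back to DNS, which is consistent with ping resolving redis-cluster. A small sketch that pulls the lookup order out of the file (the class and method names are made up for illustration):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for inspecting the name service switch config.
class NsswitchHosts {

    // Returns the sources listed on the "hosts:" line, in lookup order.
    // Comments are stripped and bracketed action tokens such as
    // [NOTFOUND=return] are skipped. Returns an empty list if no
    // "hosts:" line is present.
    static List<String> hostsSources(List<String> lines) {
        for (String raw : lines) {
            String line = raw;
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash);
            }
            line = line.trim();
            if (!line.startsWith("hosts:")) {
                continue;
            }
            List<String> sources = new ArrayList<>();
            for (String tok : line.substring("hosts:".length()).trim().split("\\s+")) {
                if (!tok.isEmpty() && !tok.startsWith("[")) {
                    sources.add(tok);
                }
            }
            return sources;
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get("/etc/nsswitch.conf"));
        System.out.println("hosts lookup order: " + hostsSources(lines));
    }
}
```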
sharp shoal
#

so @fallow tinsel this seems to work:

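// Assumes the generated Dagger `dag` client is in scope and that
// "context" and "fmt" are imported.
func main() {
    ctx := context.TODO()

    // Run a stock Redis image as a service.
    redis := dag.Container().From("redis").WithExposedPort(6379).AsService()

    // Compile and run DNSClient with the service bound under the alias "redis".
    out, err := dag.Container().From("amazoncorretto:8-al2023-jdk").
        WithWorkdir("/app").
        WithFile("DNSClient.java", dag.Host().File("DNSClient.java")).
        WithServiceBinding("redis", redis).
        WithExec([]string{"javac", "DNSClient.java"}).
        WithExec([]string{"java", "DNSClient", "redis"}).
        Stdout(ctx)
    if err != nil {
        panic(err)
    }
    fmt.Print(out)
}

// DNSClient.java
import java.net.InetAddress;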
import java.net.UnknownHostException;

public class DNSClient {
    public static void main(String[] args) {
        InetAddress address = null;
        try {
            address = InetAddress.getByName(args[0]);
            System.out.println("host address: "+address.getHostAddress());
        } catch (UnknownHostException e) {
            throw new RuntimeException(e);
        }

    }
}
#

trying your image now

fallow tinsel
#

is zenith going to behave differently here somehow? I'm not using it

sharp shoal
#

nope, it's the same codepath for service resolution

#

that code snippet seems to work with your image as well

#

interesting.. I'll try an actual JedisCluster example a bit later

fallow tinsel
#

Hate to nag but were you able to try the reproduction?

#

I even simplified it and am still completely stumped. redis-cli works fine, but jedis can't handle it

fallow tinsel
#

yeah like I even just added a test to test dns resolution and pinging all ports successfully....

34: [54.6s] JedisTest > test redis-cluster IP resolution() STANDARD_OUT
34: [54.6s]     IP addresses for redis-cluster:
34: [54.6s]     10.87.0.81
34: [54.6s]     Ping result for port 30000: PONG
34: [54.6s] 
34: [54.6s]     Ping result for port 30001: PONG
34: [54.6s] 
34: [54.6s]     Ping result for port 30002: PONG
34: [54.6s] 
34: [54.6s]     Ping result for port 30003: PONG
34: [54.6s] 
34: [54.6s]     Ping result for port 30004: PONG
34: [54.6s] 
34: [54.6s]     Ping result for port 30005: PONG
.....

34: [54.6s] JedisTest > test a simple redis ping() STANDARD_OUT
34: [54.6s]     2024-03-15 17:54:50 DEBUG ConnectionFactory:62 - Error while makeObject
34: [54.6s]     redis.clients.jedis.exceptions.JedisConnectionException: Failed to connect to any host resolved for DNS name.

so must be something in jedis I guess, hard to know where to go from here.

perhaps the redis cluster is telling it to go to some node by a hostname that isn't the service alias. edit: not sure this is it, using cluster mode on redis-cli doesn't show this behavior
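One way to check that hypothesis would be to look at the addresses the cluster itself advertises (for example the output of `redis-cli -p 30000 cluster nodes`) and see whether they are reachable from the test container. A minimal sketch that extracts the advertised address from each line, assuming the usual `<id> <ip:port@cport[,hostname]> <flags> ...` layout; the class name is made up, and the sample line in the usage note is illustrative, not taken from this cluster:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Hypothetical helper for reading `cluster nodes` output from stdin.
class ClusterNodesAddresses {

    // Returns the "host:port" a node advertises to clients, or null if
    // the line does not have at least two whitespace-separated fields.
    static String advertisedAddress(String nodeLine) {
        String[] fields = nodeLine.trim().split("\\s+");
        if (fields.length < 2) {
            return null;
        }
        String addr = fields[1];
        // Drop the cluster bus port (and anything after it, such as an
        // optional ",hostname" suffix).
        int at = addr.indexOf('@');
        if (at >= 0) {
            addr = addr.substring(0, at);
        }
        return addr;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null) {
            String addr = advertisedAddress(line);
            if (addr != null) {
                System.out.println(addr);
            }
        }
    }
}
```

For an illustrative line such as `07c37d... 127.0.0.1:30004@31004 slave ...`, this yields `127.0.0.1:30004`. If the advertised addresses differ from what redis-cluster resolves to in the test container, a cluster-aware client could end up connecting somewhere unreachable even though the initial lookup succeeds.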

sharp shoal
fallow tinsel
#

no worries - kind of half rubber ducking here hah. appreciate the help though

sharp shoal
#

hey @fallow tinsel ! did you ever get to overcome this networking issue? catching up with old threads, I realized I never came back to this one