#Set custom DNS resolvers


empty mica
#

I'm currently attempting to get dagger to talk to some services on my tailscale network. I'm able to get a dagger container to join the tailnet just fine. However, the tailnet uses split DNS to point some internal domains at specific nameservers. Without being able to access those nameservers, the dagger code won't be able to resolve tailnet names.

Once joined to the tailnet, the dagger code is able to manually resolve the names if I point it at the right nameserver, but this obviously doesn't work if e.g. the code was going to be used for arbitrary CI pipelines. Perhaps I'm being optimistic here, but it seems like I Just(tm) need to be able to add a resolver line to /etc/resolv.conf. However, no matter what I do, I can't effect any changes in this direction. I think this is related to dagger/buildkit taking control of resolv.conf. Changes seem to be overwritten with a basic DNS config and some dagger-specific services stuff.

Any ideas here? I see that there was some discussion around split dns resolution a while back, but it didn't seem like there was any solution then.

grim current
#

if that's the case you can run umount /etc/resolv.conf before running tailscaled so the agent is able to change /etc/resolv.conf accordingly.

Keep in mind that replacing the container's default DNS server will break some features around Dagger services, since they rely on the DNS server that the engine injects by default.

empty mica
#

@grim current Thanks, that's helpful. I had been trying to make this happen somehow via the dagger API, but none of the things I tried worked. If I use umount as you suggest I can modify the resolv.conf as needed. When running the tailscale container on its own (i.e. accessing the tailnet from within that container), everything seems to work. I rewrite resolv.conf to use nameserver 100.100.100.100, then some external DNS like 1.1.1.1, and then append dagger's 10.87.0.1 nameserver, along with the .ts.net and .dagger.local search paths that were originally there.

With that config, I can access the tailnet, resolve tailscale short names, tailscale FQDNs, and tailscale split DNS correctly. Yay!
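For reference, the resolver ordering described above can be sketched as a small helper that renders the file contents. This is only an illustration: `render_resolv_conf` is a made-up name, not part of any Dagger or Tailscale API, and the search domains (`example-tailnet.ts.net`, `example.dagger.local`) are hypothetical placeholders for whatever was in the original file.

```python
def render_resolv_conf(nameservers, search_domains):
    """Render resolv.conf text: one 'nameserver' line per server, in order,
    followed by a single 'search' line (omitted if there are no domains)."""
    lines = [f"nameserver {ns}" for ns in nameservers]
    if search_domains:
        lines.append("search " + " ".join(search_domains))
    return "\n".join(lines) + "\n"


# Resolvers are consulted in the order listed.
NAMESERVERS = [
    "100.100.100.100",  # Tailscale's resolver, first so split DNS applies
    "1.1.1.1",          # external DNS
    "10.87.0.1",        # Dagger's nameserver, as reported in this thread
]
# Hypothetical placeholders for the search paths originally in the file.
SEARCH = ["example-tailnet.ts.net", "example.dagger.local"]

if __name__ == "__main__":
    print(render_resolv_conf(NAMESERVERS, SEARCH), end="")
```

Note that the glibc resolver only honors the first three nameserver entries, so a list like this one is already at the limit.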

#

Now for the more complicated part: Resolving DNS from another container. In this scenario, I'm running the same tailscale container as above, but using it as a sidecar to proxy tailnet traffic and DNS. I run tailscaled with a SOCKS5 proxy, add with_exposed_port(...) and as_service(). Then in another 'worker' container, I do with_service_binding('ts', tailscale_service) and bring it up. This also works. That is, it successfully proxies traffic to the tailscale sidecar. Yay!

Where the fun stops, is if I try to use DNS from the worker container. I set up a DNS server in the tailscale service container to proxy the same nameservers as /etc/resolv.conf before and expose it on port 53. From within the tailscale container, this DNS server successfully resolves all the tailscale DNS when explicitly queried. Unfortunately if I try to query the DNS server on 53 from the worker container, it doesn't work, and I'm not sure why.

Even fetching the dagger hostname of the tailscale service via the dagger API, resolving it manually to an IP, and inserting that as a nameserver in /etc/resolv.conf (nameserver <resolved dagger ip>) does not work. All queries just get 'connection refused'. I can't just use the 'ts' service name in resolv.conf, because clients don't know how to parse that; they want an IP.

It seems like it ought to work, and it's pretty close, but so far no luck. 😕
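One way to narrow down a 'connection refused' like this is to query the sidecar directly over both UDP and TCP, since DNS clients normally try UDP first. The sketch below uses only the Python standard library; `build_dns_query` and `probe` are illustrative names, and the server IP and hostname passed in are whatever applies in the setup being debugged.

```python
import socket
import struct


def build_dns_query(name, query_id=0x1234):
    """Build a minimal DNS query packet for an A record (RFC 1035)."""
    # Header: id, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 1, 1)


def probe(server, name, port=53, timeout=2.0):
    """Send the query over UDP, then TCP; return a status string per transport."""
    query = build_dns_query(name)
    results = {}
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.settimeout(timeout)
            s.connect((server, port))
            s.send(query)
            results["udp"] = f"reply, {len(s.recv(512))} bytes"
    except OSError as exc:  # includes timeouts and connection refused
        results["udp"] = f"failed: {exc}"
    try:
        with socket.create_connection((server, port), timeout=timeout) as s:
            # DNS over TCP prefixes each message with a 2-byte length.
            s.sendall(struct.pack("!H", len(query)) + query)
            results["tcp"] = f"reply, {len(s.recv(514))} bytes"
    except OSError as exc:
        results["tcp"] = f"failed: {exc}"
    return results
```

Calling `probe("<resolved dagger ip>", "<some tailnet name>")` from the worker container shows which transport, if either, reaches the DNS server. If TCP answers while UDP fails (or the reverse), it may be worth checking which protocol the service port was exposed with.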

grim current
#

cc @tame lance . Not sure if you were able to figure it out

empty mica
#

@grim current coming back to this after a while, after fiddling more I was able to get the remote DNS resolution to work correctly. However it turns out that, unless there is a way to manage the dagger networking (I can't find any docs on this), resolving names doesn't help, since the dagger interfaces don't seem to have routes to the tailscale IPs. I was able to solve this by having the CI containers resolve over socks5h (which tailscale seems to have fixed!) and having the tailscale service container run an exit node to some dedicated, real node that can do the routing. It's a hack but it will work for now.
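As a rough sketch of the socks5h approach: client tools are usually pointed at a SOCKS proxy through environment variables, and the `socks5h` scheme (a curl convention) asks the proxy to resolve hostnames instead of the client. The helper name `socks_proxy_env` and port `1055` are assumptions for illustration; `ts` is the service binding alias used earlier in this thread.

```python
def socks_proxy_env(host, port, remote_dns=True):
    """Return proxy environment variables pointing at a SOCKS5 endpoint.

    With remote_dns=True the 'socks5h' scheme is used, so hostnames are
    resolved by the proxy (here, the tailscale sidecar) rather than locally.
    """
    scheme = "socks5h" if remote_dns else "socks5"
    url = f"{scheme}://{host}:{port}"
    # Some tools read the upper-case name, others the lower-case one.
    return {"ALL_PROXY": url, "all_proxy": url}


if __name__ == "__main__":
    # 'ts' is the service alias from this thread; 1055 is an assumed port.
    for key, value in socks_proxy_env("ts", 1055).items():
        print(f"{key}={value}")
```

With the Python SDK, each pair could then be applied to a container step with `with_env_variable(name, value)`. Not every tool honors `ALL_PROXY`, so some steps may still need an explicit proxy option.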

Using this approach, I am now able to create pipelines that talk to my tailnet by setting things like ALL_PROXY on the various container steps, and/or providing an explicit proxy endpoint provided by the dagger service. It pretty much works as expected. However, I have now noticed that things like publishing a container to a private registry on the tailnet don't work. I'm assuming this is because the publish command is running in the runtime container (?), which doesn't have proxying set up, e.g. I can't set ALL_PROXY, so dagger doesn't know anything about the tailscale name, even though there is a managed service running which would happily proxy the traffic if it asked. So if the publish API doesn't have a built-in way to set a socks proxy, and I can't set proxy directives for the container running the publish, am I dead in the water? It seems like the only way would be to run publish in a custom container with the appropriate proxying set up, but then that will be some kind of dagger-in-docker-in-dagger setup which seems very unpalatable... thoughts?

grim current
#

This is something we should improve, so that the automatic provisioning process picks up these variables on its own.

empty mica
# grim current: setting the *PROXY variables when provisioning the engine...

Hmm thanks. I tried setting up a custom runner with ALL_PROXY set and pointing the CLI to that. Immediately, I run into a problem. The runner tries to do a bunch of stuff before my code runs such as fetch module source, dependencies, etc. These steps are apparently running within the runner, because they are trying to proxy to my ALL_PROXY location. However this fails, because the socks endpoint specified is not running yet, because my pipeline code needs to start that service. If I had a static tailscale proxy running somewhere the runner could access, I suppose it could be made to work, but this is definitely not the setup I was building. I pass an automatically generated authkey into the pipeline, which allows starting a proxy service joined to the tailnet for that CI workflow.

This also confuses me a bit. If those setup steps are happening within the runner, and it works normally (without ALL_PROXY), then that implies that the runner is sharing the host's networking, because the code is being pulled from a location on the tailnet. But if that's the case, then why is it that publish() fails to make network calls to the same tailnet host? I must be missing something... is there another layer of container-ception going on in between?

grim current
#

However this fails, because the socks endpoint specified is not running yet, because my pipeline code needs to start that service. If I had a static tailscale proxy running somewhere the runner could access, I suppose it could be made to work, but this is definitely not the setup I was building. I pass an automatically generated authkey into the pipeline, which allows starting a proxy service joined to the tailnet for that CI workflow.

Ok, now I know what seems to be the issue. It's basically related to this: https://github.com/dagger/dagger/issues/6411

We've tried addressing it in the past, but none of the avenues that we explored ended up giving a reasonably acceptable outcome. Given that I've seen this one surfacing recently, I'll bring it up to the team this week to see if someone might have a better way to solve it.

In the meantime, you should be able to get past the publish blocker by following the suggestion here: https://github.com/dagger/dagger/issues/6411#issuecomment-2354072245. It's not ideal but it's not that bad of a stopgap for now 🙏

GitHub

Related to: #5235 #6271 There's a workaround using docker commands, but having support for Service would make it much simpler to manage.

#

cc @late comet. Since it'd be nice to chat about this one this week 🙌