#service de-duping
I think that currently services will be deduplicated not only within a single session but also across them (right?). If so, that would mean if two clients are concurrently running pipelines that spin up a db as a service, the db would be shared across both clients, which probably results in unexpected behavior quite often.
I was then thinking that clients could at least just set an env var or something on the service instance that uniquely identifies the client/pipeline instance somehow, but then realized that would change the service's content-based DNS name and thus always invalidate the cache for the execs depending on the service (right?)
If that's all true, I feel like maybe we should just dedupe services ourselves inside the session rather than relying on the buildkit solver deduping. It's unfortunate not to be able to take advantage of the solver there, but I can't think of another way to get consistent behavior.
Yeah this is all true
Once we have the new progress stream integrated it'll be really easy to implement per-session deduping, since we can forward gateway container stdout/stderr to a custom progress vertex, and just use an in-memory lock/map
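For illustration, the in-memory lock/map idea could look roughly like the sketch below. Everything here is hypothetical: `Service`, `sessionServices`, and the `start` callback are made-up names, not existing Dagger or buildkit APIs, and the sketch assumes one `sessionServices` value is created per session and that services are keyed by a content digest.

```go
package main

import "sync"

// Service is a hypothetical handle to a running service container.
type Service struct {
	Digest string // content digest of the service definition (assumed key)
}

// entry tracks a single start attempt for one service digest.
type entry struct {
	once sync.Once
	svc  *Service
	err  error
}

// sessionServices dedupes service starts within one session.
// A separate instance per session keeps services from being shared across sessions.
type sessionServices struct {
	mu      sync.Mutex
	running map[string]*entry
}

func newSessionServices() *sessionServices {
	return &sessionServices{running: make(map[string]*entry)}
}

// Start returns the service already registered for digest, or calls start
// exactly once and remembers the result. Concurrent callers for the same
// digest block until the first start finishes.
func (s *sessionServices) Start(digest string, start func() (*Service, error)) (*Service, error) {
	s.mu.Lock()
	e, ok := s.running[digest]
	if !ok {
		e = &entry{}
		s.running[digest] = e
	}
	s.mu.Unlock()

	// The map lock is released before starting, so a slow start of one
	// service does not block lookups for other services.
	e.once.Do(func() { e.svc, e.err = start() })
	return e.svc, e.err
}
```

Note that this sketch also remembers a failed start; a real implementation would probably want to evict the entry on error and when the service stops.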
The remaining concern would be making sure we don't end up with the exact same hostname across sessions; I think you could end up with funky load balancing behavior there. I guess we could inject a per-session value?
I guess we could inject a per-session value?
Yeah I think that would make sense, just include a random id generated once per session in the hash. If we're using gateway containers now we wouldn't need to worry about that impacting the caching of the service itself anymore.
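A minimal sketch of what "include a random id generated once per session in the hash" might look like. The function names, the SHA-256 choice, and the 32-character truncation are all assumptions for illustration, not how hostnames are actually derived today.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
)

// newSessionID returns a random ID, meant to be generated once per session.
func newSessionID() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// serviceHostname mixes the session ID into the content-based hash, so the
// same service definition gets a different hostname in each session while
// staying stable within a session.
func serviceHostname(sessionID, serviceDigest string) string {
	h := sha256.New()
	h.Write([]byte(sessionID))
	h.Write([]byte{0}) // separator so ("ab", "c") and ("a", "bc") hash differently
	h.Write([]byte(serviceDigest))
	// Truncated to 32 hex characters to stay well under the 63-character DNS label limit.
	return hex.EncodeToString(h.Sum(nil))[:32]
}
```

Within a session, `serviceHostname(id, digest)` is deterministic, so repeated references to the same service agree on the name; across sessions the names differ, which avoids the shared-hostname load balancing concern.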
But actually, I guess we do need to worry about the hostname still invalidating the execs depending on it... Worst case there are some tricks to inject information into an exec without it impacting caching: the http proxy settings in LLB don't break the cache (would be a total abuse, but could make it work robustly nonetheless I think), secret values don't actually invalidate the cache when it comes to the raw LLB I think, etc.
Oh yeah good point. Hmm - maybe we could resolve hostnames to IPs using the per-session map and just inject them in /etc/hosts? We already do something like that for service aliases
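To make the /etc/hosts idea concrete, here is a rough sketch of rendering hosts entries from a per-session hostname-to-IP map. `hostsEntries` and the example addresses are hypothetical; the sketch assumes the exec keeps seeing the stable content-based hostname and only the resolved IP differs per session. How the rendered file gets injected without affecting the exec's cache key is not addressed here.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// hostsEntries renders /etc/hosts lines ("IP<TAB>hostname") from a
// per-session map of service hostname -> container IP.
func hostsEntries(ips map[string]string) string {
	names := make([]string, 0, len(ips))
	for name := range ips {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random; sort for stable output

	var b strings.Builder
	for _, name := range names {
		fmt.Fprintf(&b, "%s\t%s\n", ips[name], name)
	}
	return b.String()
}
```

For example, `hostsEntries(map[string]string{"db": "10.87.0.2"})` yields the single line `10.87.0.2<TAB>db`, which could be appended to the container's hosts file.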
@abstract gulch Have you seen any cases where you use Hostname / Entrypoint without WithServiceBinding? That's something that works today, but might be hard to support depending on what option we go with here
Hostname / Entrypoint
Presuming you meant Hostname/Endpoint?
If so, no I guess I haven't seen that. But I'd probably be okay with those not having meaning if WithServiceBinding isn't set. Or we could refactor the API a little too of course (maybe rename WithServiceBinding() to just Service(ctx) that returns the endpoint?) But whatever works! Let me know if I'm misunderstanding the issue
Oops yeah that's what I meant. I just remember seeing Hostname/Endpoint used in some of our bootstrapping/test code, and was wondering about folks who run dockerd too, wanted to make sure they're all binding the service. Sounds like probably