so im futzing around trying to emulate | Dagger

radiant venture Nov 12, 2024, 11:14 PM

#

also worth mentioning that the stats reported here are kinda weird, like it's reporting disk and network stats BEFORE the install is complete? cc @light pine

uncut herald Nov 12, 2024, 11:33 PM

#

If you mean the fact that the next line already has a checkmark (✔ Container.withExec(args: ["sh") - that should be fixed by https://github.com/dagger/dagger/pull/8442 which adds a "pending" state

re: the metrics, I'm guessing those show up live, and don't have to wait for the command to finish?

radiant venture Nov 12, 2024, 11:35 PM

#

yeah that's what i mean

#

and yeah after further poking, the polling displays the first poll and can change by the end of the exec

#

so we're all good, and i do have something here that might be a failing manual test... the one thing i'm worried about is whether netem changes to shit will actually show up in CNI/netns stats.... like i think it might be too userspacy, so CNI shit doesn't consider emulated packet loss to be real packet loss?

#

either that or i have a bug where nothing displays lolsob

#

packets at least wire all the way out, but dropped seems to always be 0

#

i could honestly merge this whole PR right now without the packet loss, or just test fake packet loss #s coming out of the sampler

light pine Nov 12, 2024, 11:51 PM

#

You might need to apply netem to your whole host, or at least the engine container you are testing in. e.g. after ./hack/dev, do docker exec dagger-engine.dev tc qdisc <rest of cmd>

radiant venture Nov 12, 2024, 11:52 PM

#

i can try that for sure

light pine Nov 12, 2024, 11:52 PM

#

radiant venture i could honestly merge this whole PR right now without the packet loss, or just ...

That's fine too, my suggestion above might be worth a shot but otherwise I wouldn't worry too much yet

radiant venture Nov 12, 2024, 11:53 PM

#

i kinda suspect it'll give the same results, the tc qdisc changes already persist to the whole engine afaict

#

it's annoying af because it makes apt-get take FOREVER w/ 25% packet loss lmao

light pine Nov 12, 2024, 11:55 PM

#

I think with what you have above it would apply to the eth1 device of the container's netns, which I agree might not end up counting in the metrics. But if you did it on the network device for the whole engine container then packet loss should be "transparent" to the withExec netns device (theoretically, hard to say for sure since there's a lot going on here)

radiant venture Nov 12, 2024, 11:56 PM

#

yeah, trying now

#

gotta blow the whole thing away first though

radiant venture Nov 13, 2024, 12:33 AM

#

still appears to always report 0, gonna test the wiring by hardcoding a value

radiant venture Nov 13, 2024, 5:18 PM

#

found actually good documentation for tc http://tcn.hypert.net/tcmanual.pdf

#

i think what's going on here is that netem stats are collected to /proc/netem/stats, and maybe kept separate from the interface stats? at the very least, i think it does make sense that causing packet loss on the host interface doesn't show up in the container's interface stats- the veth never knew the packets existed so it can't count their drops. internal to the container, i think there's a similar logic going on– dropped packet counting is done by each interface's driver, and netem obviously exists outside of driver-land...
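
(editor's sketch) For reference, the per-interface drop counters being discussed are the ones Linux exposes in /proc/net/dev. Below is a minimal parser for that file, assuming the standard 8 receive + 8 transmit column layout. Whether the engine's sampler reads this file or gets the same counters another way is an assumption, and `parse_net_dev` / `read_drops` are hypothetical helper names, not anything from the PR.

```python
from pathlib import Path

# Column order of the per-interface counters in /proc/net/dev.
RX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "frame", "compressed", "multicast"]
TX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "colls", "carrier", "compressed"]


def parse_net_dev(text):
    """Return {iface: {"rx": {...}, "tx": {...}}} from /proc/net/dev contents."""
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, counters = line.split(":", 1)
        values = [int(v) for v in counters.split()]
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, values[:8])),
            "tx": dict(zip(TX_FIELDS, values[8:16])),
        }
    return stats


def read_drops(iface, path="/proc/net/dev"):
    """Read the (rx, tx) drop counters for one interface (Linux only)."""
    dev = parse_net_dev(Path(path).read_text())[iface]
    return dev["rx"]["drop"], dev["tx"]["drop"]
```

Calling `read_drops("eth1")` inside the exec's netns before and after the command would show whether the interface-level `drop` column moves at all while netem is active.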

#

i'm like 95% confident in my statement about packet loss on the host interface, but only like 65% on the in-container netem assertion... gpt-4o mini disagrees lmao

light pine Nov 13, 2024, 6:23 PM

#

Yeah I wouldn't worry about it, we can just collect the metric for now. If it's literally always zero (which will become easy to see once we forward metrics to the cloud) then we can rm it

radiant venture Nov 13, 2024, 6:30 PM

#

another thing, having now read a bunch of docs and paged in my dusty af networks brain... there's a difference between packet drops and packet loss that makes all this shit make way more sense– drops are interface-scoped, usually happen bc of out-of-order delivery or queue exhaustion. loss can occur anywhere on the RT– including other interfaces and boxes, so when I tell it "tc qdisc add dev eth1 root netem loss 25%" ... that is not 100% equivalent to asking for 25% drops, it's asking to emulate the network losing packets, not necessarily this specific interface dropping them
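
(editor's sketch) One way to check where the emulated loss is being counted: the qdisc keeps its own counters, which `tc -s qdisc show dev eth1` prints on a `Sent ... (dropped N, ...)` line, separate from the interface counters. As far as I know netem's loss shows up there. The parser below is a rough sketch of pulling that number out. The sample output in the usage note is illustrative, not captured from a real run, and `parse_qdisc_stats` is a hypothetical helper name.

```python
import re

# Matches the stats line printed by `tc -s qdisc show`, e.g.
#   Sent 7350 bytes 75 pkt (dropped 25, overlimits 0 requeues 0)
STATS_RE = re.compile(
    r"Sent (?P<bytes>\d+) bytes (?P<packets>\d+) pkt "
    r"\(dropped (?P<dropped>\d+), overlimits (?P<overlimits>\d+) "
    r"requeues (?P<requeues>\d+)\)"
)


def parse_qdisc_stats(output):
    """Return a list of (qdisc_kind, counters) from `tc -s qdisc show` output."""
    results = []
    kind = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("qdisc "):
            kind = line.split()[1]  # e.g. "netem", "fq_codel"
            continue
        match = STATS_RE.search(line)
        if match and kind is not None:
            counters = {k: int(v) for k, v in match.groupdict().items()}
            results.append((kind, counters))
            kind = None  # wait for the next "qdisc ..." header line
    return results
```

Fed something shaped like `qdisc netem 8001: root refcnt 2 limit 1000 loss 25%` followed by the `Sent ...` line, it returns one `("netem", {...})` entry. If `dropped` climbs there while the interface counters stay at 0, that would fit the drops-vs-loss distinction above.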

#

knowing that, i might be able to properly force drops with iptables

#

in other words, i asked @uncut herald the wrong question in the PR "open to ideas on how to simulate packet loss"

#

and the tui is also printing the wrong word

#

if we actually wanna give packet loss numbers, tcp retransmits are a better proxy, but that's kinda wonky and only works for tcp+tx, for udp i think you've got nothing
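
(editor's sketch) The retransmit idea could look roughly like this: sample the `Tcp:` rows of /proc/net/snmp before and after the exec and compare the `RetransSegs` delta against the `OutSegs` delta. This assumes the usual two-row header/values layout of that file. `parse_tcp_snmp` and `retransmit_ratio` are hypothetical names, and it only covers TCP transmit, same caveat as above.

```python
def parse_tcp_snmp(text):
    """Return the Tcp counters from /proc/net/snmp contents as a dict."""
    rows = [line.split()[1:] for line in text.splitlines() if line.startswith("Tcp:")]
    if len(rows) < 2:
        raise ValueError("no Tcp header/value pair found")
    header, values = rows[0], rows[1]
    return {name: int(value) for name, value in zip(header, values)}


def retransmit_ratio(before, after):
    """Retransmitted segments relative to OutSegs between two samples."""
    sent = after["OutSegs"] - before["OutSegs"]
    retrans = after["RetransSegs"] - before["RetransSegs"]
    return retrans / sent if sent > 0 else 0.0
```

Usage would be two reads of /proc/net/snmp from inside the netem'd netns, one on each side of the command, passed through `parse_tcp_snmp` and then `retransmit_ratio`. It's a proxy, not a loss measurement: retransmits also fire on delay and reordering.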
