#Correctness of high throughput reading stdin and outputting to stdout.

grand jasper
#

.write() is allowed to return after a partial write even if no error occurred. .write_all() wraps a loop which checks the return value of .write() and calls it again if it returned early without an error, else returns the error.
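A minimal sketch of that retry loop, not the actual std source. `Trickle` is a hypothetical writer made up for the example; it accepts at most 3 bytes per call so every write is partial. `write_all_sketch` is likewise an illustrative name.

```rs
use std::io::{self, ErrorKind, Write};

/// Hypothetical writer that accepts at most 3 bytes per call,
/// so callers always see partial writes.
struct Trickle(Vec<u8>);

impl Write for Trickle {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = buf.len().min(3);
        self.0.extend_from_slice(&buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Keep calling `write` until the whole slice has been accepted.
fn write_all_sketch<W: Write>(w: &mut W, mut buf: &[u8]) -> io::Result<()> {
    while !buf.is_empty() {
        match w.write(buf) {
            // Zero bytes accepted for a non-empty slice: give up.
            Ok(0) => {
                return Err(io::Error::new(
                    ErrorKind::WriteZero,
                    "failed to write whole buffer",
                ))
            }
            // Partial (or full) write: advance past what was accepted.
            Ok(n) => buf = &buf[n..],
            // Interrupted by a signal: retry the same slice.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let mut w = Trickle(Vec::new());
    write_all_sketch(&mut w, b"hello world")?;
    assert_eq!(&w.0[..], &b"hello world"[..]);
    Ok(())
}
```

With a bare `.write()` the caller would have to do this bookkeeping itself, which is why `.write_all()` is usually the right call when the whole chunk must go out.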

The weirder thing in your code is that you're using BufReader and BufWriter to read into and write from a buffer, which basically just adds more work for no real benefit.

#

Each read (assuming the read returns exactly CHUNK_SIZE bytes) first copies via syscall into reader's buffer, then gets copied to buffer, then gets copied to writer's buffer before finally being copied to stdout via syscall.

tame shell
grand jasper
#

Though if you try to write 64KiB in one go through a BufWriter when a single byte remains unflushed in a 1KiB buffer, then what happens?
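One way to find out is to put a recording sink under a `BufWriter` and look at the write sizes that reach it. This is a sketch under assumptions: `Recorder` and `write_sizes` are hypothetical names for the experiment, and the pattern it observes is an implementation detail of the std version I know, not a documented guarantee.

```rs
use std::io::{self, BufWriter, Write};

/// Hypothetical sink that records the size of every write it receives.
struct Recorder(Vec<usize>);

impl Write for Recorder {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.0.push(buf.len());
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Leave 1 byte pending in a 1 KiB BufWriter, then push 64 KiB through it,
/// and return the sizes of the writes that reached the inner sink.
fn write_sizes() -> io::Result<Vec<usize>> {
    let mut w = BufWriter::with_capacity(1024, Recorder(Vec::new()));
    w.write_all(&[0u8; 1])?; // small enough to stay in the buffer
    w.write_all(&vec![0u8; 64 * 1024])?; // larger than the buffer capacity
    w.flush()?;
    Ok(w.get_ref().0.clone())
}

fn main() -> io::Result<()> {
    println!("{:?}", write_sizes()?);
    Ok(())
}
```

As far as I can tell the pending byte gets flushed on its own first, and the 64KiB slice is then handed to the inner writer directly instead of being copied through the 1KiB buffer.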

tame shell
#

yeah, both stdin() and stdout() are buffered underneath

grand jasper
#

Ok(0) specifically means EOF, so this shouldn't happen unless stdin is closed (or the user presses ^D).

tame shell
#

also they lock a mutex on each access

grand jasper
#

You can avoid most of the overhead of the mutex by holding the locks outside the loop via std{in,out}().lock().
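Roughly what that looks like, as a sketch rather than the original poster's code. `copy_chunks` is a hypothetical name, and the 64KiB value for `CHUNK_SIZE` is an assumption. Both locks are taken once in `main`, and the loop reads straight into one buffer with no extra `BufReader`/`BufWriter` layer.

```rs
use std::io::{self, ErrorKind, Read, Write};

// Assumed chunk size; the value is only for illustration.
const CHUNK_SIZE: usize = 64 * 1024;

/// Copy everything from `input` to `output` through a single buffer.
fn copy_chunks<R: Read, W: Write>(mut input: R, mut output: W) -> io::Result<u64> {
    let mut buf = vec![0u8; CHUNK_SIZE];
    let mut total = 0u64;
    loop {
        let n = match input.read(&mut buf) {
            // Ok(0) with a non-empty buffer means end of input.
            Ok(0) => break,
            // A read may return fewer bytes than CHUNK_SIZE.
            Ok(n) => n,
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // write_all handles partial writes; only the filled part is sent.
        output.write_all(&buf[..n])?;
        total += n as u64;
    }
    output.flush()?;
    Ok(total)
}

fn main() -> io::Result<()> {
    // Lock once, outside the loop, instead of on every read/write call.
    let stdin = io::stdin().lock();
    let stdout = io::stdout().lock();
    copy_chunks(stdin, stdout)?;
    Ok(())
}
```

Taking the reader and writer as generic parameters keeps the loop testable with in-memory slices and `Vec<u8>`.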

tame shell
#

if you're reading in chunks of CHUNK_SIZE at once it makes no sense to use BufReader/BufWriter

#

the from_raw_fd handles are actually reasonable
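For reference, a Unix-only sketch of that kind of handle. It is not necessarily how the original code did it; `raw_stdout` is a hypothetical helper, and the safety comment states an assumption about the process, not something the compiler checks.

```rs
use std::fs::File;
use std::io::{self, Write};
use std::mem::ManuallyDrop;
use std::os::unix::io::FromRawFd;

/// Unbuffered, lock-free handle to file descriptor 1.
/// ManuallyDrop keeps File's destructor from closing stdout.
fn raw_stdout() -> ManuallyDrop<File> {
    // SAFETY (assumption): fd 1 is open and stays open for the life of the
    // process, and this File is never dropped, so the fd is never closed here.
    ManuallyDrop::new(unsafe { File::from_raw_fd(1) })
}

fn main() -> io::Result<()> {
    let mut out = raw_stdout();
    // Each write_all goes to write(2) without passing through std's stdout buffer.
    out.write_all(b"written straight to fd 1\n")?;
    Ok(())
}
```

Mixing this with `println!` or `io::stdout()` in the same program can reorder output, since those go through a separate buffer.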

grand jasper
#

I think the buffering on stdin and stdout is actually from libc rather than Rust, as in the same buffering that can be adjusted via the stdbuf command. There should be a way to set these streams to unbuffered mode, though I think you need an external crate.

grand jasper
tame shell
#

actually, yes

tame shell
#

that's probably good enough

#

?play
```rs
use std::io;
let buf: [u8; 5] = *b"Hello";
io::copy(&mut &buf[..4], &mut io::stdout().lock());
```

severe gateBOT
#

```
Hell
```
tame shell
#

would there really be such a big difference between 8KiB and 64KiB?

#

you're literally reimplementing io::copy with a slightly larger buffer

#

uhh no

#

write_all is good

#

but the whole function can be replaced with io::copy
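Something like this, as a sketch: `pipe_through` is a hypothetical wrapper name, and the only real work is the single `io::copy` call, which runs its own read/write loop with its own internal buffer.

```rs
use std::io::{self, Read, Write};

/// Copy everything from `input` to `output` and report the byte count.
fn pipe_through<R: Read, W: Write>(input: &mut R, output: &mut W) -> io::Result<u64> {
    let n = io::copy(input, output)?;
    output.flush()?;
    Ok(n)
}

fn main() -> io::Result<()> {
    // Lock both streams once for the whole copy.
    let mut stdin = io::stdin().lock();
    let mut stdout = io::stdout().lock();
    pipe_through(&mut stdin, &mut stdout)?;
    Ok(())
}
```

Whether a hand-rolled loop with a larger buffer beats this is something to measure rather than assume.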

#

stdin and stdout you have are still buffered

#

you don't really have access to stdin's buffer

#

if you're processing it then yeah, an intermediate buffer is needed

#

somewhere

#

uhh

#

I don't follow

#

if something is already buffered, then yes, it's going to fill only the internal buffer

#

if it's the first thing you do on stdin, then it's likely that it's doing the optimization