#How to Write a ReadOnlySpan<T> to Disc

iron wing
#

I have an array of unmanaged structs (T[]) that I’d like to write to disc. What is the most efficient way to do this, without creating any new allocations?

My initial thought was to somehow get a ReadOnlySpan<byte> pointing to the data, and then pass that to FileStream.Write(). However, while the data is entirely blittable, it’s not stored in a byte[], but a T[]. And converting from T[] to byte[] (or from ReadOnlySpan<T> to ReadOnlySpan<byte>) seems to require new allocations, or otherwise be potentially expensive.

Am I missing something? Is there a simple way to write a ReadOnlySpan<T> to disc, without a cast or new allocations?

Thanks for any help!

inner raft
#

You can reinterpret cast a ReadOnlySpan<T> to a ReadOnlySpan<byte> using MemoryMarshal.AsBytes. That won't allocate.
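
A minimal sketch of that suggestion, assuming .NET Core 2.1 or later (where Stream.Write accepts a ReadOnlySpan<byte>). The class name SpanFileWriter and the path parameter are hypothetical, chosen only for illustration. The reinterpretation does not copy the element data, although opening the FileStream itself still allocates.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of any library.
public static class SpanFileWriter
{
    // T : unmanaged guarantees the struct contains no object references,
    // which MemoryMarshal.AsBytes requires.
    public static void Write<T>(string path, ReadOnlySpan<T> items) where T : unmanaged
    {
        // Reinterpret the same memory as bytes; no element data is copied.
        ReadOnlySpan<byte> bytes = MemoryMarshal.AsBytes(items);

        using var stream = new FileStream(path, FileMode.Create, FileAccess.Write);
        stream.Write(bytes);
    }
}
```

The bytes land on disk in the in-memory layout of T, so the file is only portable between machines with the same endianness and struct layout.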

#

And of course you can always use unsafe to reinterpret cast pointers to whatever you want.
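
For comparison, a rough sketch of the unsafe route, assuming the project is compiled with AllowUnsafeBlocks enabled. UnsafeWriter is a hypothetical name. The fixed statement pins the array only for the duration of the synchronous write.

```csharp
using System;
using System.IO;

// Hypothetical helper; requires <AllowUnsafeBlocks>true</AllowUnsafeBlocks>.
public static unsafe class UnsafeWriter
{
    public static void Write<T>(Stream stream, T[] items) where T : unmanaged
    {
        // Pin the array so the GC cannot move it while we hold a raw pointer.
        fixed (T* p = items)
        {
            // View the pinned memory as bytes; checked() guards against overflow.
            var bytes = new ReadOnlySpan<byte>(p, checked(items.Length * sizeof(T)));
            stream.Write(bytes);
        }
    }
}
```

For this particular job MemoryMarshal.AsBytes does the same thing without needing an unsafe context.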

iron wing
#

Thank you! Appreciate it.

Is MemoryMarshal.AsBytes performant? Is that something I should use sparingly?

#

Also: if I’m writing using async/await, will I need to pin the array or span before writing?

icy umbra
#

If the array ref is in scope with the span it should be fine

#

I think it's unlikely to be relocated, but if it can be pinned, that would make it fully safe.

inner raft
#

There's no risk or extra cost if the span is stack allocated rather than heap allocated, it's just a reinterpret cast. The span you get back will have the same lifetime and point to the same memory as the one you gave it.

icy umbra
#

What I am reading about ReadOnlySpan makes me think it cannot be used in async functions, so the marshal approach may be best there.

inner raft
#

There is no File write API that is both async and accepts a Span, because of this rule.

iron wing
#

Thank you both, btw. I’ve learned more in this thread than after a lot of google searching.

icy umbra
#

Yea that restriction does

iron wing
#

Since async is commonly used for IO operations, what do people usually do? Do they just send a reinterpreted array to an async Write method?

#

I had imagined that this was a very common case (wanting to write an array of a custom struct to disc asynchronously, without new allocations). Part of me expected that there would be a well established API for doing that, and lots of knowledge about it. I figured I just hadn’t found the right search terms yet. But I’m starting to wonder if maybe this use case isn’t as common as I assumed.

#

Passing around naked array references seems a bit risky, maybe smelly? But if Span is off the table for async, then what else can be done?

icy umbra
#

A managed language is going to have these issues, so it's not surprising it's this tricky to avoid heap allocation.

inner raft
# iron wing I had imagined that this was a very common case (wanting to write an array of a ...

If I'm writing an array of structs onto disk, I'm probably doing it as part of a binary file that contains other information, like a header, other arrays, etc. In that case, I would be writing to a Stream, maybe with a BinaryWriter or a custom writer class. I wouldn't use the async methods, instead I would perform the writing synchronously on a background thread.

I can wrap that whole thread in an async method so I can await it, if I want.
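
One way this could look, as a sketch only: BackgroundSave and SaveAsync are hypothetical names, and Task.Run is used here as a stand-in for "a background thread" (it borrows a thread-pool thread rather than creating a dedicated one). The span is created inside the synchronous lambda, so it never has to live across an await.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;
using System.Threading.Tasks;

// Hypothetical helper, not part of any library.
public static class BackgroundSave
{
    // Returns a Task the caller can await; the write itself is synchronous.
    public static Task SaveAsync<T>(string path, T[] items) where T : unmanaged
    {
        return Task.Run(() =>
        {
            using var stream = new FileStream(path, FileMode.Create, FileAccess.Write);

            // Reinterpret the array as bytes without copying the elements.
            stream.Write(MemoryMarshal.AsBytes(new ReadOnlySpan<T>(items)));
        });
    }
}
```

The lambda captures the array, so the caller should not modify it until the returned task completes. The closure and the Task are small allocations, but the element data itself is never copied.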

iron wing
# inner raft If I'm writing an array of structs onto disk, I'm probably doing it as part of a...

This is my use case too. And it was also my approach…but every piece of advice I could find online has suggested to not explicitly use threads for IO-bound operations. That, instead, IO is what async was designed for. (And that threading should be reserved for work that can be parallelized).

This has been one of the most confusing things for me to understand, while trying to decide on a non-blocking approach to saving and loading game data. My initial intuition was to use background threads or the thread pool. That makes a lot of intuitive sense to me. But having not used async much before, I figured I was just naïve.

…and learning that async often uses threadpool behind the scenes anyway just confused things even further.

Is there any good reason to use async over scheduling a task directly on the threadpool, when it comes to IO?

inner raft
# iron wing This is my use case too. And it was also my approach…but every piece of advice I...

The benefit of using async/await is that it's easier than managing many threads, and it can avoid wasting a whole thread just to wait for a resource. This becomes more important if you're writing something like a web server that is serving thousands of clients, each wanting to read or write to different files. It would be overkill to create a thread for each client, where each thread might spend most of its time just waiting for IO to finish.

But for writing to a single file with a sequential data stream, as fast as possible for the fastest save and load times, dedicating a thread to that is reasonable.

iron wing
#

I sincerely appreciate the straightforward answers here. Cutting through a lot of headaches I’ve had over the past few weeks.

hardy kettle
#

yeah async file IO isn't just doing the same thing as you can do by putting it on a worker thread, it should be using the platform support for async IO events or whatever the platform does and not blocking any thread while it waits

hardy kettle
#

for one or two files it's probably fine either way, it'll be different if you access lots of files at once