#I built a distributed message streaming platform from scratch that's faster than Kafka

11 messages · Page 1 of 1 (latest)

rain sky
#

I've been working on Walrus, a message streaming system (think Kafka-like) written in Rust. The focus was on making the storage layer as fast as possible.

it has:

1.2 million writes/second on a single node, scales horizontally the more nodes you add

Beats both Kafka and RocksDB in benchmarks (see graphs in README)

How it's fast:

The storage engine is custom-built instead of using existing libraries. On Linux, it uses io_uring for batched writes. On other platforms, it falls back to regular pread/pwrite syscalls. You can also use memory-mapped files if you prefer(although not recommended)

Each topic is split into segments (~1M messages each). When a segment fills up, it automatically rolls over to a new one and distributes leadership to different nodes. This keeps the cluster balanced without manual configuration.

Distributed setup:

The cluster uses Raft for coordination, but only for metadata (which node owns which segment). The actual message data never goes through Raft, so writes stay fast. If you send a message to the wrong node, it just forwards it to the right one.

You can also use the storage engine standalone as a library (walrus-rust on crates.io) if you just need fast local logging.

I also wrote a TLA+ spec to verify the distributed parts work correctly (segment rollover, write safety, etc).

Code: https://github.com/nubskr/walrus

would love to hear your thoughts on it :))

GitHub

🦭 High Performance distributed message streaming engine - GitHub - nubskr/walrus: 🦭 High Performance distributed message streaming engine

bronze panther
#

Cool, thanks for sharing your progress on this! Are your segment files based on message count instead of file size? And if so what made you choose that? For SierraDB, I also have segmented files that rollover, but they're based on configurable file sizes (256 MB by default)

#

I also like your commands. I wonder if you've considered implementing the protocol on top of RESP3 (Redis' protocol), so every language immediately has client support for your database, and redis-cli can be used?
https://redis.io/docs/latest/develop/reference/protocol-spec/
I think that could be a big win for accessibility

rain sky
#

segments are message count based for now because walrus as a storage engine doesnt really exposes any 'size' based metrics, so adding that on top of walrus would have been like swimming upstream. some workaround/architectural rewrite would need to be done in the future

#

also, I would highly recommend writing a tla+ spec (if you havent already) for sierradb, it helps a lot in figuring our architectural bugs before you even start implementing your distributed systems

bronze panther
winter folio
rain sky
#

yes, the syntax is cursed, but it is what it is I guess

humble steeple
#

for more distributed setup