Optimization tips for my implementation of Brainf*** in Rust | Rust Programming Language Community | Page 1

tranquil nacelle May 18, 2023, 3:21 PM

#

Hi, I am an intermediate in Rust and I am trying to write a fast brainf*** interpreter in Rust.

A problem which I've noticed is that the interpreter becomes slow when running mandelbrot.bf from the example/ of my project (took 2m 07 seconds to finish executing on my machine)

Code -> https://github.com/vs-123/bf-rs

I would appreciate any tips to improve the code

crystal flare May 18, 2023, 3:32 PM

#

are you compiling in --release mode?

tranquil nacelle May 18, 2023, 3:35 PM

#

crystal flare are you compiling in --release mode?

With --release mode, it goes fast in the beginning but gets slower in the middle

It took 1 min 39 seconds

crystal flare May 18, 2023, 3:36 PM

#

that's expected

tranquil nacelle May 18, 2023, 3:37 PM

#

Can it go faster than that?

crystal flare May 18, 2023, 3:37 PM

#

it finishes in 15 seconds on my pc

#

the code is fine

tranquil nacelle May 18, 2023, 3:37 PM

#

Oh okay

tranquil nacelle May 18, 2023, 3:38 PM

#

crystal flare it finishes in 15 seconds on my pc

So it depends on the computer specs?

crystal flare May 18, 2023, 3:38 PM

#

I guess

dreamy condor May 18, 2023, 7:28 PM

#

If you're asking about microoptimization, note that your Command struct is 16 bytes, assuming you're on 64-bit.

#

Consider if it'd make sense to split it up and store the loop offsets separately so you could use memory more efficiently

#

For example, maybe just have a HashMap<usize, usize> to look up the extra loop data for a program counter position, and thus keep Command down to 1 byte.
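A minimal sketch of that side-table idea, assuming a fieldless `Command` enum and a hypothetical `parse` function (neither is taken from the bf-rs repository). Bracket positions go into a `HashMap<usize, usize>` keyed by program-counter position, so each command stays one byte:

```rust
use std::collections::HashMap;
use std::mem::size_of;

/// Fieldless commands: eight variants fit in a single byte.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum Command {
    Increment,
    Decrement,
    MoveLeft,
    MoveRight,
    Output,
    Input,
    LoopStart,
    LoopEnd,
}

/// Size of one command in bytes (1 with the enum above).
fn command_size() -> usize {
    size_of::<Command>()
}

/// Parses the source into commands plus a side table that maps each
/// bracket's index to the index of its partner.
/// Returns `None` if the brackets are unbalanced.
fn parse(source: &str) -> Option<(Vec<Command>, HashMap<usize, usize>)> {
    let mut commands = Vec::new();
    let mut jumps = HashMap::new();
    let mut open_brackets = Vec::new();
    for byte in source.bytes() {
        let command = match byte {
            b'+' => Command::Increment,
            b'-' => Command::Decrement,
            b'<' => Command::MoveLeft,
            b'>' => Command::MoveRight,
            b'.' => Command::Output,
            b',' => Command::Input,
            b'[' => Command::LoopStart,
            b']' => Command::LoopEnd,
            _ => continue, // everything else is a comment
        };
        let index = commands.len();
        match command {
            Command::LoopStart => open_brackets.push(index),
            Command::LoopEnd => {
                let start = open_brackets.pop()?;
                jumps.insert(start, index);
                jumps.insert(index, start);
            }
            _ => {}
        }
        commands.push(command);
    }
    if open_brackets.is_empty() {
        Some((commands, jumps))
    } else {
        None
    }
}
```

Whether the hash lookup on every loop jump costs more than the smaller commands save is something to measure; a plain `Vec<usize>` indexed by position would be another option.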

#

Looping over chars is also costing you in complexity. Replace it with source.as_bytes().iter().copied().enumerate(), and change the match arms to things like b'+' => output.push(Command::Increment), instead, and your parse should go much faster.
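A short sketch of the byte-based loop described above, with an assumed `Command` enum and a hypothetical `parse_bytes` helper. All eight commands are ASCII, and the bytes of a multi-byte UTF-8 character are never in the ASCII range, so non-ASCII comments are skipped without any decoding:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Command {
    Increment,
    Decrement,
    MoveLeft,
    MoveRight,
    Output,
    Input,
    LoopStart,
    LoopEnd,
}

/// Returns each command together with its byte offset in the source.
fn parse_bytes(source: &str) -> Vec<(usize, Command)> {
    let mut output = Vec::new();
    for (offset, byte) in source.as_bytes().iter().copied().enumerate() {
        let command = match byte {
            b'+' => Command::Increment,
            b'-' => Command::Decrement,
            b'<' => Command::MoveLeft,
            b'>' => Command::MoveRight,
            b'.' => Command::Output,
            b',' => Command::Input,
            b'[' => Command::LoopStart,
            b']' => Command::LoopEnd,
            _ => continue, // comment byte, including non-ASCII text
        };
        output.push((offset, command));
    }
    output
}
```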

#

Instead of let mut memory = [0u8; 30_000];, I'd suggest let mut memory = [0u8; u16::MAX as usize + 1];, and change your pointer to u16. Then you can use memory[pointer as usize] and the compiler will remove the bounds checks, since every possible u16 value is a valid index.

#

(Your branch predictor is probably making those bounds checks pretty cheap anyway, but you'll save code size which might help)
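A sketch of the `u16` pointer idea, using a hypothetical `Tape` type. The array needs `u16::MAX as usize + 1` cells so that every `u16` value is in range, which is what gives the compiler a chance to drop the bounds check. The wrapping pointer arithmetic is an assumption of this sketch; the original code uses `saturating_add`.

```rust
/// One cell for every value a u16 can take (65,536 cells).
const TAPE_LEN: usize = u16::MAX as usize + 1;

struct Tape {
    cells: [u8; TAPE_LEN],
    pointer: u16,
}

impl Tape {
    fn new() -> Self {
        Tape { cells: [0; TAPE_LEN], pointer: 0 }
    }

    /// Moves right, wrapping around at the end of the tape.
    fn move_right(&mut self, count: u16) {
        self.pointer = self.pointer.wrapping_add(count);
    }

    /// Moves left, wrapping around at the start of the tape.
    fn move_left(&mut self, count: u16) {
        self.pointer = self.pointer.wrapping_sub(count);
    }

    /// Adds to the current cell with wrapping arithmetic.
    fn add(&mut self, amount: u8) {
        // `pointer as usize` is at most 65,535, always below TAPE_LEN.
        let cell = &mut self.cells[self.pointer as usize];
        *cell = cell.wrapping_add(amount);
    }

    fn get(&self) -> u8 {
        self.cells[self.pointer as usize]
    }
}
```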

#

Other than that, you should make Command be Copy. Anything two usizes or fewer is definitely better copied than worrying about references (and that'll be even more true if you make it just a byte, like I mentioned above).

#

Other than that, if you want to go faster in something like this you'll need to make the code more complicated to better take advantage of common patterns in the input

#

For example, >>>>>>>>>> is common, so instead of

            Command::MoveRight => {
                pointer = pointer.saturating_add(1);
            }

you might do

            Command::MoveRight => loop {
                pointer = pointer.saturating_add(1);
                if commands[command_index + 1] != Command::MoveRight { break }
                command_index += 1;
            }

so you only go back to the big match when changing to a different kind of command
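The loop above indexes `commands[command_index + 1]`, which would panic if the run ends at the last command. A bounds-safe variant, sketched here as a hypothetical standalone function with a reduced two-variant `Command` enum, uses `slice::get` instead:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Command {
    MoveRight,
    Other, // stands in for every other command in this sketch
}

/// Consumes a whole run of `MoveRight` starting at `command_index`.
/// Returns the index of the last command in the run and the new pointer.
fn consume_move_right_run(
    commands: &[Command],
    mut command_index: usize,
    mut pointer: usize,
) -> (usize, usize) {
    loop {
        pointer = pointer.saturating_add(1);
        // `get` returns None past the end, so the last command is safe.
        if commands.get(command_index + 1) != Some(&Command::MoveRight) {
            break;
        }
        command_index += 1;
    }
    (command_index, pointer)
}
```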

molten hedge May 18, 2023, 10:09 PM

#

tranquil nacelle Hi, I am an intermediate in Rust and I am trying to write a fast brainf*** inter...

I am curious... how fast is my bf macro with that bf program ferrisHmm

#

/*u-macro BF*/macro_rules!b{(f$($c:tt)*)=>
{{let mut x=([0u8;1<<15],0,stdout(),stdin(
));use std::io::*;$(b!{x$c};)*}};($f:tt->)
=>{b!($f-);b!($f>)};($f:tt<-)=>{b!($f<);b!
($f-)};($f:tt<<)=>{$f.1-=2};($f:tt>>)=>{$f
.1+=2};($f:tt..)=>{b!($f.);b!($f.)};($f:tt
>)=>{$f.1+=1};($f:tt<)=>{$f.1-=1};($f:tt+)
=>{$f.0[$f.1]+=1};($f:tt-)=>{$f.0[$f.1]-=1
};($f:tt.)=>{$f.2.write(&[$f.0[$f.1]]).and
($f.2.flush()).ok()};($f:tt[$($c:tt)*])=>{
while$f.0[$f.1]>0{$(b!{$f$c};)*}};($f:tt,)
=>{$f.3.read(&mut$f.0[$f.1..=$f.1]).ok()};
($f:tt$x:tt)=>{};}

pub fn main() {b!(f
    ++++++++[>++++[>++>+++>+++>+<<<<-]>+>->+>>+[<]<-]>>.>
    >---.+++++++..+++.>.<<-.>.+++.------.--------.>+.>++.
)}

Could someone run it and see ferrisPlead . I don't have a computer to test on right now

unique vale May 18, 2023, 11:00 PM

#

molten hedge ```rust /*u-macro BF*/macro_rules!b{(f$($c:tt)*)=> {{let mut x=([0u8;1<<15],0,st...

Your macro: about 2.69 microseconds/iteration
Cekofe's impl (pre-parsed): about 2.37 microseconds/iteration
Cekofe's impl (parsing + interpreting): about 4.48 microseconds/iteration

I am currently playing minecraft (CPU spikes abound), so take this with a tablespoon of salt.
Here's the zipped benchmark suite if you want to test it on your machine (unzip -> cargo bench)

📎 benchfuck.zip

molten hedge May 18, 2023, 11:19 PM

#

unique vale Your macro: about 2.69 microseconds/iteration Cekofe's impl (pre-parsed): about ...

Thanks ferrisOwO

fossil steppe May 18, 2023, 11:51 PM

#

JIT compiler when hyperdaax

#

I'd be quite interested to see a cranelift implementation ^^

unique vale May 19, 2023, 12:58 AM

#

fossil steppe JIT compiler when hyperdaax

I do have a JIT-like thing for a simple stack-based RPN-like math language with BF-inspired loops; it probably wouldn't be too much work to make actual BF out of it. Well, the . and , would be a bit hard to JIT, but could just call out to prewritten functions for that.
https://github.com/zachs18/Simple-RPN-Math-Compiler-Rust

dreamy condor May 19, 2023, 1:12 AM

#

fossil steppe I'd be quite interested to see a cranelift implementation ^^

cranelift is surprisingly nice; probably wouldn't be that hard.

#

Or go easy mode and translate to WASM, and use one of those runtimes to JIT it

fossil steppe May 19, 2023, 2:17 AM

#

That's too easy lol

tranquil nacelle May 19, 2023, 7:22 AM

#

dreamy condor For example, `>>>>>>>>>>` is common, so instead of ```rust Command:...

I was planning on implementing that but then I thought that the main loop would handle them anyways, so I don't have to check whether this command_index refers to the last one and then I should use commands[command_index + 1]

dreamy condor May 19, 2023, 7:23 AM

#

tranquil nacelle I was planning on implementing that but then I thought that the main loop would ...

Well, it depends if you want clear code or fast code :)

I agree that the extra loops like this would be uglier, but they might be faster.

tranquil nacelle May 19, 2023, 7:24 AM

#

dreamy condor Well, it depends if you want clear code or fast code :) I agree that the extra ...

Hmm I'll try it

tranquil nacelle May 19, 2023, 7:24 AM

#

dreamy condor Looping over `char`s is also costing you in complexity. Replace it with `source...

This one improved the performance a lot

#

It now takes 47 seconds to execute the mandelbrot

dreamy condor May 19, 2023, 7:25 AM

#

tranquil nacelle This one improved the performance a lot

Wow, really? I would have thought that most of the time was in execution, not parsing.

#

(You're running in --release, right?)

tranquil nacelle May 19, 2023, 7:25 AM

#

Yes with release it is 47 seconds

dreamy condor May 19, 2023, 7:25 AM

#

(Oh, you already said yes above)

tranquil nacelle May 19, 2023, 7:26 AM

#

molten hedge ```rust /*u-macro BF*/macro_rules!b{(f$($c:tt)*)=> {{let mut x=([0u8;1<<15],0,st...

That's cool

#

Let me try it real quick

molten hedge May 19, 2023, 7:30 AM

#

From previous testing I know that llvm has a hard time optimizing the Rust code the macro makes

tranquil nacelle May 19, 2023, 7:30 AM

#

It fails to run the mandelbrot brainfuck program

tranquil nacelle May 19, 2023, 7:30 AM

#

molten hedge From previous testing I know that llvm has a hard time optimizing the Rust code ...

I think it does not have addition overflow checking

#

Or saturating addition/subtraction

molten hedge May 19, 2023, 7:31 AM

#

tranquil nacelle I think it does not have addition overflow checking

Run it release

tranquil nacelle May 19, 2023, 7:31 AM

#

molten hedge Run it release

I did

molten hedge May 19, 2023, 7:31 AM

#

It may need a bigger memory strip

tranquil nacelle May 19, 2023, 7:31 AM

#

I'll try it again

molten hedge May 19, 2023, 7:31 AM

#

unique vale Your macro: about 2.69 microseconds/iteration Cekofe's impl (pre-parsed): about ...

You could try this one

#

Oh also you may have to offset the starting index into the buffer

#

Some bf programs use some space before the starting index

tranquil nacelle May 19, 2023, 7:33 AM

#

Ngl tho that's a really cool macro lol

#

It reminds me of the donut C code

#

Oh now it works

#

I had overridden the release profile

molten hedge May 19, 2023, 7:33 AM

#

It's the 3rd version, the idea was to make a bf interpreter that fits in a discord message and works with the ?play command

tranquil nacelle May 19, 2023, 7:34 AM

#

Cool

molten hedge May 19, 2023, 7:34 AM

#

tranquil nacelle I had overridden the release profile

Yeah making it work with overflow on debug took more code, so I just force release ferrisBut

tranquil nacelle May 19, 2023, 7:34 AM

#

Do you know a way to improve the compilation speed?

molten hedge May 19, 2023, 7:35 AM

#

Not really

#

This macro eats rustc and llvm alive with a big enough bf program

tranquil nacelle May 19, 2023, 7:35 AM

#

molten hedge This macro eats rustc and llvm alive with a big enough bf program

Ye mandelbrot program is pretty big

molten hedge May 19, 2023, 7:36 AM

#

tranquil nacelle Ye mandelbrot program is pretty big

I have a playground with it running a copy of Conway's game of life ferrisOwO https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=485b505093c41e369af28b98ed182d0c

#

Macros are very fun

tranquil nacelle May 19, 2023, 7:38 AM

#

molten hedge I have a playground with it running a copy of Conway's game of life ferrisOwO...

That's very cool

tranquil nacelle May 19, 2023, 8:37 AM

#

I updated the source code, can anyone review it please?

worthy venture May 19, 2023, 9:17 AM

#

tranquil nacelle I updated the source code, can anyone review it please?

you can change let mut loop_stack = Vec::new(); to let mut loop_stack = Vec::with_capacity(source.matches('[').count());

#

but the slowness is in interpret() and not in parse()

#

try writing to a buffer and, once you are done, display it in one go (instead of printing line by line)
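A small sketch of buffered output using the standard library's `BufWriter`. The `emit_cells` function is hypothetical; in an interpreter the `.` command would write into the same kind of buffered sink and flush it once at the end.

```rust
use std::io::{self, BufWriter, Write};

/// Writes every byte through a buffer and flushes once at the end,
/// instead of issuing one write to the sink per byte.
fn emit_cells<W: Write>(sink: W, cells: &[u8]) -> io::Result<()> {
    let mut out = BufWriter::new(sink);
    for &cell in cells {
        out.write_all(&[cell])?;
    }
    out.flush()
}
```

For real output the sink could be `std::io::stdout().lock()`, for example `emit_cells(std::io::stdout().lock(), b"Hi\n")`.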

tranquil nacelle May 19, 2023, 9:47 AM

#

Guys I optimized it a lot

#

It got a huge performance boost

#

It takes 15 seconds for me to do the mandelbrot

tranquil nacelle May 19, 2023, 9:47 AM

#

worthy venture but the slowness is in interpret() and not in parse()

I changed the whole structure now

#

I added support for consecutive commands

#

which improved the performance a lot

tranquil nacelle May 19, 2023, 9:49 AM

#

unique vale Your macro: about 2.69 microseconds/iteration Cekofe's impl (pre-parsed): about ...

I will benchmark with this using the updated one

#

Also I pushed the updated source

tranquil nacelle May 19, 2023, 9:52 AM

#

worthy venture but the slowness is in interpret() and not in parse()

Yes I improved it a lot now

#

📎 benchfuck-updated.zip

dreamy condor May 19, 2023, 11:06 PM

#

Ah, I see, doing the consecutive folding in the parsing instead of the interpreting. Makes sense.
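Folding consecutive commands at parse time might look roughly like this. The enum shape follows the `Increment(usize)` style shown later in the thread, but `parse_folded` and `try_merge` are hypothetical names, and loop jump offsets are left out to keep the sketch short.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Command {
    Increment(usize),
    Decrement(usize),
    MoveLeft(usize),
    MoveRight(usize),
    Output,
    Input,
    LoopStart,
    LoopEnd,
}

/// Adds `next` into `last` when both are the same counted command.
/// Returns true if the merge happened.
fn try_merge(last: &mut Command, next: Command) -> bool {
    match (last, next) {
        (Command::Increment(n), Command::Increment(m))
        | (Command::Decrement(n), Command::Decrement(m))
        | (Command::MoveLeft(n), Command::MoveLeft(m))
        | (Command::MoveRight(n), Command::MoveRight(m)) => {
            *n += m;
            true
        }
        _ => false,
    }
}

/// Parses the source, collapsing runs such as `+++` into `Increment(3)`.
fn parse_folded(source: &str) -> Vec<Command> {
    let mut output: Vec<Command> = Vec::new();
    for byte in source.bytes() {
        let next = match byte {
            b'+' => Command::Increment(1),
            b'-' => Command::Decrement(1),
            b'<' => Command::MoveLeft(1),
            b'>' => Command::MoveRight(1),
            b'.' => Command::Output,
            b',' => Command::Input,
            b'[' => Command::LoopStart,
            b']' => Command::LoopEnd,
            _ => continue,
        };
        if let Some(last) = output.last_mut() {
            if try_merge(last, next) {
                continue;
            }
        }
        output.push(next);
    }
    output
}
```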

tranquil nacelle May 20, 2023, 7:31 AM

#

dreamy condor Ah, I see, doing the consecutive folding in the parsing instead of the interpret...

That reduced the interpretation time a lot

dreamy condor May 20, 2023, 7:57 AM

#

Hmm, you know what that means? You could actually unify the two states.

 #[derive(Debug, Clone, Copy)]
 pub enum Command {
-    Increment(usize),
-    Decrement(usize),
+    Increment(u8),
     MoveLeft(usize),
     MoveRight(usize),

#

Because Decrement(1) and Increment(u8::MAX) do exactly the same thing, so there's no need for both.

tranquil nacelle May 20, 2023, 8:13 AM

#

dreamy condor Because `Decrement(1)` and `Increment(u8::MAX)` do exactly the same thing, so th...

Why not i8 for negatives?

dreamy condor May 20, 2023, 8:18 AM

#

Because you're using wrapping_add anyway, so having the same type as you're using for a memory cell will make your life easier.

#

i8::wrapping_add and u8::wrapping_add are exactly the same operation. LLVM doesn't even have different instructions for them.
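A quick way to see the equivalence, with two hypothetical helper functions: a decrement by `count` is stored as the `u8` that wraps around to the same result, and applying it is a single `wrapping_add`.

```rust
/// Encodes "decrement by `count`" as the equivalent wrapping increment.
fn decrement_as_increment(count: u8) -> u8 {
    0u8.wrapping_sub(count)
}

/// Applies a unified increment to a cell.
fn apply(cell: u8, amount: u8) -> u8 {
    cell.wrapping_add(amount)
}
```

For example, `decrement_as_increment(1)` is 255, and `apply(0, 255)` gives 255, the same as decrementing a zero cell with wrapping.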

#

I'm personally opposed to signed values unless absolutely forced, but do whatever.

tranquil nacelle May 20, 2023, 8:21 AM

#

Hmmm ok

#

I'll implement it rn

tranquil nacelle May 20, 2023, 9:08 AM

#

I updated the source code

#

Can anyone review it please?

#

The time dropped to 12 seconds for the mandelbrot

tranquil nacelle May 20, 2023, 9:28 AM

#

I used unsafe code to optimize it even more

#

Guys now it takes less than 10 seconds for me!

vivid carbon May 20, 2023, 9:42 AM

#

ferrisballSweat

tranquil nacelle May 20, 2023, 9:46 AM

#

vivid carbon ferrisballSweat

Yo

#

I merged the MoveLeft and MoveRight to a single variant

molten hedge May 20, 2023, 3:11 PM

#

tranquil nacelle Guys now it takes less than 10 seconds for me!

That's awesome ferrisOwO

tranquil nacelle May 20, 2023, 4:49 PM

#

molten hedge That's awesome ferrisOwO

: )

vivid carbon May 20, 2023, 9:32 PM

#

you could try output buffering like recommended before, but trying to preallocate part of the buffer, since you know how many writes at least you have to do

dusky mural May 20, 2023, 9:42 PM

#

dreamy condor If you're asking about microoptimization, note that your `Command` struct is 16 ...

just curious why it's 16 bytes and not 8 since isize and usize are both 8?
Is it storing the variant + alignment?

vivid carbon May 20, 2023, 9:42 PM

#

you could also try compressing consecutive adds and subtracts into one

dreamy condor May 20, 2023, 9:44 PM

#

dusky mural just curious why it's 16 bytes and not 8 since `isize` and `usize` are both 8? I...

If it has a usize in it, it has to be at least 8. But it also needs extra bits to store which command, so that means it needs to be bigger than 8.
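This can be checked with `std::mem::size_of`. The enum below is an assumed shape in which loop commands carry a `usize` jump target; it is not copied from the repository.

```rust
use std::mem::size_of;

/// Assumed layout: loop commands carry their jump target inline.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
enum Command {
    Increment,
    Decrement,
    MoveLeft,
    MoveRight,
    Output,
    Input,
    LoopStart(usize),
    LoopEnd(usize),
}

/// Size of the whole enum: payload, discriminant, and padding.
fn command_size() -> usize {
    size_of::<Command>()
}

/// Size of the payload alone.
fn payload_size() -> usize {
    size_of::<usize>()
}
```

On a typical 64-bit target `payload_size()` is 8 and `command_size()` is 16: the discriminant needs at least one extra byte, and the size is then rounded up to the 8-byte alignment of `usize`.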

dusky mural May 20, 2023, 9:45 PM

#

@tranquil nacelle would recommend checking out nom as a parser for this. I've played around it for a bf parser and the code ends up being v elegant where you can parse your input into basically a Vec<Command> and then run it after.
This could make it easier to optimise multiple operations of + - > and < as mentioned above ^

dreamy condor May 20, 2023, 9:45 PM

#

(Assuming on x64, so that usize is 8 bytes.)

vivid carbon May 20, 2023, 9:45 PM

#

dusky mural @tranquil nacelle would recommend checking out [nom](https://github.com/rust...

maybe also precompute other things like eliminating loops if you know that it wont be executed

dusky mural May 20, 2023, 9:53 PM

#

I might also recommend making the interpreter a struct, this could allow you for stepping through instructions and debugging the memory during execution

dusky mural May 20, 2023, 10:05 PM

#

vivid carbon you could also try compressing consecutive adds and subtracts into one

just checked and this compresses the mandelbrot code from 11,451 instructions to 4,073 so this could definitely give a performance increase

vivid carbon May 20, 2023, 10:15 PM

#

there are also some bf patterns which can be optimized

#

for example [-] which sets a cell to zero

#

https://esolangs.org/wiki/Brainfuck_algorithms
Here are more

#

if you can detect them you will be able to optimize a lot
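Detecting `[-]` can be done as a pass over the already-parsed commands. This sketch assumes the unified `Increment(u8)` representation with wrapping cells and adds a hypothetical `Clear` variant; it would have to run before loop jump offsets are computed, since it changes command positions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Command {
    Increment(u8),
    Move(isize),
    Output,
    Input,
    LoopStart,
    LoopEnd,
    Clear, // hypothetical extra command: set the current cell to zero
}

/// Replaces loops like `[-]` or `[+]` with a single `Clear`.
fn fold_clear_loops(commands: &[Command]) -> Vec<Command> {
    let mut output = Vec::with_capacity(commands.len());
    let mut index = 0;
    while index < commands.len() {
        match &commands[index..] {
            // With wrapping u8 cells, any odd step eventually reaches
            // zero, so the loop always terminates with the cell cleared.
            // An even step can spin forever, so those loops are kept.
            [Command::LoopStart, Command::Increment(step), Command::LoopEnd, ..]
                if *step % 2 == 1 =>
            {
                output.push(Command::Clear);
                index += 3;
            }
            _ => {
                output.push(commands[index]);
                index += 1;
            }
        }
    }
    output
}
```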

dreamy condor May 20, 2023, 10:24 PM

#

vivid carbon you could also try compressing consecutive adds and subtracts into one

Oh, I thought that had already happened as part of unifying them into one operator (#1108776450611486804 message)

vivid carbon May 20, 2023, 10:33 PM

#

dreamy condor Oh, I thought that had already happened as part of unifying them into one operat...

no i mean +++---- into - :)

dreamy condor May 20, 2023, 10:36 PM

#

Well collapsing consecutive ones had already turned that into Increment(3)+Decrement(4), so I figured that when it turned to Increment(3)+Increment(252) that had already happened to have it be Increment(255).
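With a single `Increment(u8)` variant, merging neighbours is one `wrapping_add`, and `+++----` ends up as a wrapping add of 255. A sketch with a reduced enum and a hypothetical `merge_adjacent_increments` pass, which also drops pairs that cancel out entirely:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Command {
    Increment(u8),
    Other, // stands in for every other command in this sketch
}

/// Merges neighbouring increments and removes ones that sum to zero.
fn merge_adjacent_increments(commands: &[Command]) -> Vec<Command> {
    let mut output: Vec<Command> = Vec::with_capacity(commands.len());
    for &command in commands {
        match (output.last().copied(), command) {
            (Some(Command::Increment(total)), Command::Increment(amount)) => {
                output.pop();
                let sum = total.wrapping_add(amount);
                if sum != 0 {
                    output.push(Command::Increment(sum));
                }
            }
            _ => output.push(command),
        }
    }
    output
}
```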

#

Though actually, the only reason for code to have that would be aesthetics, right?

#

That mandelbrot demo has no +- or -+ sequences at all.

vivid carbon May 20, 2023, 10:37 PM

#

dreamy condor Though actually, the only reason for code to have that would be aesthetics, righ...

yes

dusky mural May 20, 2023, 10:37 PM

#

dreamy condor That mandelbrot demo has no `+-` or `-+` sequences at all.

no, proper code probably won't

vivid carbon May 20, 2023, 10:38 PM

#

dusky mural no, proper code probably won't

yea but is there a cost for collapsing the operators if somebody did that?

dusky mural May 20, 2023, 10:39 PM

#

vivid carbon for example [-] which sets a cell to zero

has 120 instances of [-] so that might be next thing to try

vivid carbon May 20, 2023, 10:40 PM

#

ig we are at that point where we need to detect such complex structures, which may be nested, that we need a parser like nom

dreamy condor May 20, 2023, 10:41 PM

#

vivid carbon ig we are at that point where we need to detect such complex structures, which m...

Don't need it in the parser, though

#

This is just a classic "have an IR and optimize it" compiler problem.

#

Heck, could just turn the whole thing into a classic CFG if you wanted.

#

Or do it with https://en.wikipedia.org/wiki/Rewriting#Term_rewriting_systems

vivid carbon May 20, 2023, 10:45 PM

#

dreamy condor Heck, could just turn the whole thing into a classic CFG if you wanted.

which may be a good idea. it's inherently inefficient to work in the context of a "standard bf vm". if we add new instructions to the vm like loops, ifs etc we could still work on the same array, though more efficiently, since we can implement some algorithms in rust instead of running them in bf

dusky mural May 20, 2023, 10:46 PM

#

during runtime convert bf code to rust then compile and run it for optimal performance

vivid carbon May 20, 2023, 10:47 PM

#

this:

temp0[-]
temp1[-]
x[temp0+temp1+x-]temp0[x+temp0-]
temp1[
 code
temp1[-]]

could be just optimized into a simple if like that

Command::If(usize) // if cell is 0 skip usize commands

vivid carbon May 20, 2023, 10:49 PM

#

dusky mural during runtime convert bf code to rust then compile and run it for optimal perfo...

i think @tranquil nacelle wants to run a bf vm and not compile to code, else he would write it as a proc macro

vivid carbon May 20, 2023, 10:53 PM

#

vivid carbon this: ``` temp0[-] temp1[-] x[temp0+temp1+x-]temp0[x+temp0-] temp1[ code temp1[...

there are a lot more operations that are expensive if we execute the commands normally instead of detecting the pattern and, for example, replacing a computation that takes several loops with a simple if check in the vm

tranquil nacelle May 21, 2023, 6:28 AM

#

dusky mural @tranquil nacelle would recommend checking out [nom](https://github.com/rust...

Actually I'm trying to make this without using external crates

tranquil nacelle May 21, 2023, 6:32 AM

#

vivid carbon you could also try compressing consecutive adds and subtracts into one

Hmmm I'll try that

tranquil nacelle May 21, 2023, 7:10 AM

#

vivid carbon for example [-] which sets a cell to zero

I implemented that just now