#Software Rasterizer on an FPGA implementation of a CPU

distant pasture
#

I am implementing a CPU as SpinalHDL FPGA code that I intend to run my software rasterizer on. I wrote a nearly complete GCC backend targeting the CPU; it has a custom instruction set.

dense bridge
#

if you go into the settings of your repo, there is something called "social preview"; there you could upload a screenshot/picture of your thing in action

#

that will then show up as the embed, rather than this boring github default

#

if you want

distant pasture
#

that's a good idea

#

but I'll need to actually render something cool first

dense bridge
#

just modify your first message again, then it should also show up as the thumbnail of your post in #1019722539116802068

#

hehe

#

ye just saying

distant pasture
#

I'll keep that in mind

#

I also want to make an FPGA-based GPU

dense bridge
#

you should boop @ nemez about that part

distant pasture
#

I wrote my software rasterizer to learn how the 3D rendering pipeline works

dense bridge
#

they reside in #hardware

distant pasture
#

oh

dense bridge
#

and were supposed to release their GPU 6 years ago already

distant pasture
#

hm

#

I'm considering making my GPU a partially MIMD machine

#

using the Flare32 CPU as the base for the programmable part of it

distant pasture
#

oh

dense bridge
#

: >

distant pasture
#

I am thinking of adding SIMD within a register instructions to my CPU

#

and using GCC's autovectorization for my software rasterizer
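
For anyone unfamiliar with "SIMD within a register", here is a rough Python model of the idea: four 8-bit lanes packed into one 32-bit word are added without carries leaking between lanes. The name `swar_add8` and the lane layout are made up for this sketch and are not taken from the Flare32 instruction set.

```python
MASK32 = 0xFFFFFFFF
LOW7 = 0x7F7F7F7F   # low 7 bits of every 8-bit lane
HIGH1 = 0x80808080  # top bit of every 8-bit lane


def swar_add8(a: int, b: int) -> int:
    """Add four packed 8-bit lanes; each lane wraps modulo 256."""
    # Add the low 7 bits of each lane; a carry can only reach the lane's own top bit.
    low = (a & LOW7) + (b & LOW7)
    # Fold in the top bit of each lane with XOR, which drops the carry-out.
    return (low ^ ((a ^ b) & HIGH1)) & MASK32
```

A loop that adds byte arrays element by element is the kind of code an autovectorizer could map onto an instruction like this, provided the backend describes the operation to the compiler.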

dense bridge
#

if nemez can't deliver, make sure to pull all the information you can from them

#

we might get a GPU then hehe

distant pasture
#

heh

dense bridge
#

there is also another dude on the server who has a working FPGA

#

with hdmi thingies

distant pasture
#

fpga-based GPU?

dense bridge
#

Aleok i think

#

i sink so

distant pasture
#

ah

dense bridge
#

try the searchbox

distant pasture
#

so I'm actually wanting to make a whole FPGA-based "PC"

#

using my AMD/Xilinx ZCU104 dev board

#

I'm probably going to modify the CPU I've written to be 64-bit

#

and have virtual memory

#

...and also I think out of order execution but maybe more cores and a smaller CPU core would be better

#

I was wanting to port an existing Unix OS of some sort, maybe NetBSD

dense bridge
#

you have to start somewhere

distant pasture
#

yeah

dense bridge
#

don't try to rewrite and reinvent everything at the same time : >

distant pasture
#

well

#

I have already written a GCC backend for my CPU

dense bridge
#

you copied it

#

and modified bits of it to work for your thing

distant pasture
#

sure

severe lark
#

certainly start with a "big enough" CPU first before starting the GPU part KEKW

distant pasture
#

heh

dense bridge
#

: )

dense bridge
#

yeah

distant pasture
#

pretty much

#

so I do plan on starting smaller

dense bridge
#

FPGA business is the UE6 Clone and MMORPGs of every graphics programmer

distant pasture
#

I just plan on open sourcing everything :)

#

but this project is more for me than other people

dense bridge
#

all good

distant pasture
#

not thinking of starting a business... my job pays well enough :)

dense bridge
#

that's what we do here anyway

distant pasture
#

ah

#

yeah

dense bridge
#

make sure to boop the people for your FPGA thing 🙂

severe lark
#

so the current thing so far is in-order, I see a cache too? not bad 😛

distant pasture
#

yeah the caches are not fully implemented yet

#

but the rest of the CPU is mostly implemented

#

I've implemented an instruction cache before

#

for a previous CPU project

#

also yeah, in-order

#

for now

#

I have a whole SpinalHDL library that I wrote

#

I wrote up a 2D GPU in SpinalHDL

#

check out libcheesevoyage on my github, in the hw/spinal/libcheesevoyage/gfx folder

#

the 2D GPU is mostly finished, but needs a little more work on it

#

it does function, doesn't seem to have any bugs left

#

it rendered this

severe lark
#

very nice

distant pasture
#

it is composed of three primary kinds of pipelines, and always has at least one of each type

#

it's good enough that I could use it to implement an FPGA Game Boy

#

mostly

dense bridge
#

inb4, 10000 LOC of SetPixel(1, 1, AlmostWhite); SetPixel(2, 1, AlmostWhite) ... SetPixel(120, 54, Blue); ... 😄

#

jk

distant pasture
#

haha

#

it's tilemap and sprite based

#

It doesn't actually render into a framebuffer, but into scanline buffers. Though there is a mode to treat the block RAM of tiles used by the tilemaps as a framebuffer
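
To make the scanline-buffer idea concrete, here is a small Python sketch that builds one row of pixels from a tilemap instead of writing into a full framebuffer. The 8x8 tile size, the wrap-around behaviour, and the name `render_scanline` are assumptions for illustration; this is not the libcheesevoyage implementation.

```python
TILE_SIZE = 8  # assumed tile width and height in pixels


def render_scanline(tilemap, tiles, y, width):
    """Return one row of pixels for scanline y.

    tilemap is a 2D list of tile indices; tiles is a list of
    TILE_SIZE x TILE_SIZE pixel grids. Coordinates wrap around the map.
    """
    line = []
    tile_row = tilemap[(y // TILE_SIZE) % len(tilemap)]
    for x in range(width):
        tile = tiles[tile_row[(x // TILE_SIZE) % len(tile_row)]]
        line.append(tile[y % TILE_SIZE][x % TILE_SIZE])
    return line
```

Only one line of pixels exists at a time, so the storage needed is a scanline buffer rather than a whole frame.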

#

that mode is less usable when you have a lot of tilemaps

#

like say 4 or 8

#

the number of tilemaps has to be a power of two

#

besides the actual graphics themselves, configuring it is mostly a matter of setting the sizes of the block RAM arrays inside it, the RGB format, and more

#

it also includes stuff like whether or not the fancier features are enabled

#

such as the color math and affine sprites

#

still need to go back and update my code that implements affine sprites to make it use my new code

rich mesa
#

neat project

#

I want to do some FPGA stuff someday since I did a lot of HW coursework but it's just low on the project list right now

distant pasture
#

the first version of the CPU is almost done

#

I also want to rename it just to "Flare"

#

or "Flare64"

#

and make the registers and math 64-bit

#

you can already do a 64-bit add or subtract in two instructions

#

due to the flags register

#

...that's actually how I informed GCC how to do 64-bit add/sub
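
A quick Python model of the two-instruction 64-bit add: the first 32-bit add sets the carry flag and the second one consumes it. The helper names `add32` and `add64` are hypothetical and only stand in for an add followed by an add-with-carry; the real Flare mnemonics are not shown in this thread.

```python
MASK32 = 0xFFFFFFFF


def add32(a, b, carry_in=0):
    """32-bit add returning (result, carry_out), like an ALU with a carry flag."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32


def add64(a_lo, a_hi, b_lo, b_hi):
    """64-bit add built from two 32-bit adds chained through the carry."""
    lo, carry = add32(a_lo, b_lo)      # first instruction: add, sets carry
    hi, _ = add32(a_hi, b_hi, carry)   # second instruction: add with carry
    return lo, hi
```

For example, `add64(0xFFFFFFFF, 0, 1, 0)` carries out of the low word and gives `(0, 1)`. Subtraction works the same way with a borrow.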

dense bridge
#

all this fpga stuff will be useful for all the hardware supporting srs

rich mesa
#

Yes in the sense that if you don't have a custom PCIe FPGA card to run the SRS firmware, you can't play the game

distant pasture
#

what is srs?

distant pasture
#

I'm gonna make a full system

#

using my Flare CPU

dense bridge
#

srs is a secret government project

rich mesa
#

But one day I would like to attempt making an FPGA-based PCIe peripheral to do like custom hardware acceleration for game systems just for fun

distant pasture
#

ah

distant pasture
#

so what I'm thinking of doing for my programmable GPU is actually just taking my existing CPU with an in order pipeline and have a ton of cores arranged in a MIMD fashion

distant pasture
#

It looks to me like MIMD with no OS or OoOX and small cores could work well

#

I was going to hook up all the cores via the Banana Memory Bus

distant pasture
#

had to update the GNU Binutils port

#

I'm switching over to little endian

distant pasture
#

good news, a conforming Vulkan implementation can implement floats such that denormals are flushed to zero

#

This means I can have my FPUs be really small

#

and thus fit more cores
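
Here is what flush-to-zero means at the bit level, as a Python sketch on float32 bit patterns. The function name `flush_denormal` is made up for this example; it only models the behaviour and says nothing about how the FPU hardware would be built.

```python
import struct


def flush_denormal(x: float) -> float:
    """Round x to float32, then replace any denormal with a signed zero."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent == 0 and mantissa != 0:
        bits &= 0x80000000  # denormal: keep only the sign bit
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

The hardware saving comes from skipping the normalization shifts that denormal inputs and results would otherwise need.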

distant pasture
#

well, I've changed my mind and will support denormals

#

I don't think they'll add that much hardware

#

not with me reusing my existing clz

#

and barrel shifter

#

for them
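
As a sketch of why a clz unit and a shifter are enough for denormal inputs: count the leading zeros of the mantissa, shift left until the leading 1 sits where the hidden bit would be, and lower the exponent by the same amount. The names `clz32` and `normalize_denormal` are assumptions for this Python model, not the actual hardware modules.

```python
def clz32(x: int) -> int:
    """Count leading zeros of a 32-bit value (32 for zero)."""
    return 32 - (x & 0xFFFFFFFF).bit_length()


def normalize_denormal(mantissa: int):
    """Normalize a nonzero float32 denormal mantissa (23 bits).

    Returns (significand, exponent) with the leading 1 at bit 23, so that
    value == significand * 2 ** (exponent - 23).
    """
    shift = clz32(mantissa) - 8                   # distance from leading 1 to bit 23
    significand = (mantissa << shift) & 0xFFFFFF  # the left shift a barrel shifter does
    exponent = -126 - shift                       # denormals start at exponent -126
    return significand, exponent
```

After this step the rest of the datapath can treat the value like any normal number with a wider exponent range.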

distant pasture
#

It is small and fast

#

it doesn't take a whole 2D array in hardware

#

it's a lot smaller than the standard barrel shifter design while still reasonably being single cycle

distant pasture
#

so, got my old systemverilog code converted over to SpinalHDL

#

I'm gonna make a small (LUT size) floating point unit

#

...state machine based

distant pasture
#

I added a "forwarding" mode to my PipeMemRmw SpinalHDL module that makes it include the stuff needed for hooking up a register file into an in-order CPU pipeline. This mode doesn't stall upon a hazard, but rather forwards data that hasn't yet made it into the SRAM to earlier stages of the pipeline.
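
A toy Python model of the forwarding idea, for readers who have not met it before: writes take a couple of cycles to land in the register file, and a read checks the writes still in flight before falling back to the stored value. The class `ForwardingRegFile` and its fixed `latency` are hypothetical; this is not how PipeMemRmw is structured internally.

```python
from collections import deque


class ForwardingRegFile:
    """Register file whose writes land after `latency` ticks, with forwarding."""

    def __init__(self, num_regs=16, latency=2):
        self.regs = [0] * num_regs
        self.latency = latency
        self.in_flight = deque()  # (ticks_left, index, value), oldest first

    def write(self, index, value):
        self.in_flight.append((self.latency, index, value))

    def read(self, index):
        # The newest in-flight write to this register wins over the stored value.
        for _, idx, value in reversed(self.in_flight):
            if idx == index:
                return value
        return self.regs[index]

    def tick(self):
        # Advance one cycle and commit writes whose delay has elapsed.
        self.in_flight = deque((t - 1, i, v) for t, i, v in self.in_flight)
        while self.in_flight and self.in_flight[0][0] <= 0:
            _, idx, value = self.in_flight.popleft()
            self.regs[idx] = value
```

Without the loop in `read`, a dependent instruction would have to stall until `tick` had committed the write.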

distant pasture
#

so it's not 100% working yet. Will have to continue it tomorrow.

distant pasture
#

got it working

distant pasture
#

or not

#

the assertion based formal verification tools found bugs

distant pasture
#

and... I changed my design, and I'm getting closer to it working!

distant pasture
#

seems to be working entirely now

distant pasture
#

I used sby (SymbiYosys) with the abc pdr engine to formally verify it

#

that's how I know it completely works now!

distant pasture
#

and now I'm doing some testing of my PipeMemRmw module with a fake "instruction set" intended to simulate the way stalling will happen in my CPU

rich mesa
#

Can you spit out some block diagrams of your design or something neat we can look at

distant pasture
#

I'm going with something akin to the Classic RISC Pipeline

rich mesa
#

Or rasterizer output or something

distant pasture
#

hm

#

well

#

I can show you pics of the rasterizer as it ran on my laptop

#

the soft CPU isn't done yet

#

you're right

#

I haven't posted my software rasterizer yet

rich mesa
#

Neat

distant pasture
#

technically I posted it in #software-rasterization

#

but not in this thread

#

also I drew that texture

#

I still need to learn Blender

#

I started learning it about a month ago when I was on vacation

rich mesa
#

Yea takes time

#

I was doing hard surface modeling in blender just for fun long before I was ever programming

distant pasture
#

ah

#

I've been programming for half my life

#

...well a little more than half my life

distant pasture
#

and so I'm getting close to ready to work on the CPU again!

#

there's an important FPGA code module I wrote that I plan on using for the CPU in more than one location

distant pasture
#

I am currently working on the CPU

distant pasture
#
GitHub

A 32-bit CPU being developed in SpinalHDL. Contribute to fl4shk/flare_cpu development by creating an account on GitHub.

distant pasture
#

at this time I've made a ton of progress

#

I've switched CPU instruction sets though.

#

I wrote another GCC port and Binutils port.

#

would be good to try rendering this with my software renderer:

#

made this model yesterday

#

with magicavoxel

#

I think turning my sprites into 3D models this way would work out well

#

decided to squish his head

distant pasture
#

more work is needed for building libstdc++

#

Since apparently I didn't get access to std::vector and other containers

low marten
#

Ur cpu is in scala right?
Is there a scala to verilog transpiler or something
I wanna try simulating it

#

I don't have access to an FPGA any more :(

distant pasture
#

SpinalHDL is a Scala library that acts as an FPGA language/HDL: it generates Verilog or VHDL from hardware described with it

#

so it doesn't translate arbitrary Scala code directly into Verilog

low marten
#

ahh unlucky I guess

distant pasture
#

well

#

that translation is very hard to implement for a language like Scala

#

you may like PipelineC though

#

which doesn't translate... arbitrary C code into Verilog

#

but it does translate a certain kind of C code into VHDL I believe?

#

hey I just got libstdc++ to build for my CPU, with support for std::vector!

distant pasture
low marten
#

so is ur cpu 8bit retro style? or a modern 64bit cpu?

#

Did you copy some existing architecture or make it from scratch?

distant pasture
#

It's only got integer instructions though

low marten
#

still interesting nonetheless

distant pasture
#

GCC is a lot easier to port than you might think

#

Binutils was the second hardest because dealing with ELF was somewhat difficult

distant pasture
#

should I do (1) make a GPU or (2) add vector instructions and do software rendering?

low marten
#

GPU

rich mesa
#

What's the difference really

#

I guess by GPU you mean hw rasterization

distant pasture
#

yes I do mean HW rasterization

#

based on my SW rasterizer

#

...I'm thinking of doing both, starting with the vector ops

distant pasture
#

well

distant pasture
#

or PipelineC

distant pasture
#

started converting my software rasterizer into PipelineC

distant pasture
#

well

#

I determined I don't need square roots in the GPU itself

#

I can instead leave that to the application!

#

though

#

I'm thinking of including an inverse square root LUT as a non-cached block RAM

#

which would speed up things!
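
A possible shape for such an inverse square root table, modelled in Python: reduce the input to the range [1, 4) by pulling out an even power of two, then index a small table. The 256-entry size, the 16 fractional bits, and the names `INV_SQRT_LUT` and `inv_sqrt` are all assumptions for this sketch, not decisions from the project.

```python
import math

LUT_BITS = 8     # assumed table size: 256 entries
FRAC_BITS = 16   # assumed fixed-point fraction bits per entry

# Entry i approximates 1/sqrt(m) for m in [1, 4), sampled at bucket midpoints.
INV_SQRT_LUT = [
    round((1 << FRAC_BITS) / math.sqrt(1.0 + 3.0 * (i + 0.5) / (1 << LUT_BITS)))
    for i in range(1 << LUT_BITS)
]


def inv_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for x > 0 with range reduction plus the table."""
    mant, exp = math.frexp(x)      # x == mant * 2**exp, mant in [0.5, 1)
    m, e = mant * 2.0, exp - 1     # m in [1, 2)
    if e % 2:                      # make the exponent even
        m, e = m * 2.0, e - 1      # m in [1, 4)
    index = int((m - 1.0) / 3.0 * (1 << LUT_BITS))
    # 1/sqrt(m * 2**e) == (1/sqrt(m)) * 2**(-e/2), and e is even here.
    return INV_SQRT_LUT[index] / (1 << FRAC_BITS) * 2.0 ** (-(e // 2))
```

With these sizes the result stays within about one percent of the true value; a larger table or one Newton-Raphson step would tighten that if lighting needs it.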

distant pasture
#

I'm converting this project into making a full hardware rasterizer!

#

with my Nim-To-PipelineC converter

#

the source code is such that it'll count as a software rasterizer as well!