#Software Rasterizer on an FPGA implementation of a CPU
if you go into the settings of your repo, there is something called "social preview" there you could upload a screenshot/picture of your thing in action
that will then show up as the embed, rather than this boring github default
if you want
just modify your first message again, then it should also show up as the thumbnail of your post in #1019722539116802068
hehe
ye just saying
you should boop @ nemez about that part
I wrote my software rasterizer to learn how the 3D rendering pipeline works
they reside in #hardware
oh
and were supposed to release their GPU 6 years ago already
hm
I'm considering making my GPU a partially MIMD machine
using the Flare32 CPU as the base for the programmable part of it
I'm sure it's been longer than that
oh
: >
I am thinking of adding SIMD within a register instructions to my CPU
and using GCC's autovectorization for my software rasterizer
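For anyone unfamiliar with "SIMD within a register", here is a minimal sketch of the idea, not the actual Flare instructions. It assumes four unsigned 8-bit lanes packed into one 32-bit word; the function name `swar_add_u8x4` is made up for illustration.

```python
def swar_add_u8x4(a: int, b: int) -> int:
    """Add four packed unsigned 8-bit lanes held in 32-bit words, wrapping per lane.

    Hypothetical helper: the lane layout (4 x 8-bit) is an assumption.
    """
    high = 0x80808080  # top bit of each lane
    low = 0x7F7F7F7F   # low 7 bits of each lane
    # Add only the low 7 bits of every lane, so a carry can reach the lane's
    # own top bit but can never cross into the neighbouring lane.
    partial = (a & low) + (b & low)
    # Fold the top bits back in with XOR, which is addition without carry-out.
    return (partial ^ ((a ^ b) & high)) & 0xFFFFFFFF


print(hex(swar_add_u8x4(0x01FF7F80, 0x01010180)))
```

A hardware version would do the same thing by cutting the carry chain at lane boundaries, which is why this kind of instruction tends to be cheap to add to an existing ALU.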
if nemez can't deliver, make sure to pull all the information you can from them
we might get a GPU then hehe
heh
fpga-based GPU?
ah
try the searchbox
so I'm actually wanting to make a whole FPGA-based "PC"
using my AMD/Xilinx ZCU104 dev board
I'm probably going to modify the CPU I've written to be 64-bit
and have virtual memory
...and also I think out of order execution but maybe more cores and a smaller CPU core would be better
I was wanting to port an existing Unix OS of some sort, maybe NetBSD
you have to start somewhere
yeah
don't try to rewrite and reinvent everything at the same time : >
sure
certainly start with a "big enough" CPU first before starting the GPU part 
heh
: )
every compiler ever
yeah
FPGA business is the UE6 Clone and MMORPGs of every graphics programmer
I just plan on open sourcing everything :)
but this project is more for me than other people
all good
not thinking of starting a business... my job pays well enough :)
thats what we do here anyway
make sure to boop the people for your FPGA thing 🙂
so the current thing so far is in-order, I see a cache too? not bad 😛
yeah the caches are not fully implemented yet
but the rest of the CPU is mostly implemented
I've implemented an instruction cache before
for a previous CPU project
also yeah, in-order
for now
I have a whole SpinalHDL library that I wrote
I wrote up a 2D GPU in SpinalHDL
check out libcheesevoyage on my github, in the hw/spinal/libcheesevoyage/gfx folder
the 2D GPU is mostly finished, but needs a little more work on it
it does function, doesn't seem to have any bugs left
it rendered this
very nice
it is composed of three primary kinds of pipelines, and always has at least one of each type
it's good enough that I could use it to implement an FPGA Game Boy
mostly
inb4, 10000loc of SetPixel(1, 1, AlmostWhite); SetPixel(2, 1, AlmostWhite) ... SetPixel(120, 54, Blue); ... 😄
jk
haha
it's tilemap and sprite based
It doesn't render into a framebuffer, but into scanline buffers. Though there is a mode to treat the block RAM of tiles used by the tilemaps as a framebuffer
that mode is less usable when you have a lot of tilemaps
like say 4 or 8
the number of tilemaps has to be a power of two
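A rough software sketch of what "rendering into scanline buffers" from tilemaps can look like. This is not the libcheesevoyage code: the 8x8 tile size, palette index 0 meaning transparent, and the name `render_scanline` are all assumptions made for illustration.

```python
TILE = 8  # assumed tile size in pixels (hypothetical)


def render_scanline(y, tilemaps, tiles, width):
    """Fill one scanline buffer from a stack of tilemaps.

    tilemaps: list of 2D grids of tile indices, drawn back to front.
    tiles:    list of TILE x TILE grids of palette indices (0 = transparent).
    """
    line = [0] * width
    for tilemap in tilemaps:            # later tilemaps draw on top
        row = tilemap[y // TILE]        # the row of tiles this scanline crosses
        for x in range(width):
            tile = tiles[row[x // TILE]]
            pixel = tile[y % TILE][x % TILE]
            if pixel != 0:              # skip transparent pixels
                line[x] = pixel
    return line


blank = [[0] * TILE for _ in range(TILE)]
solid = [[3] * TILE for _ in range(TILE)]
background = [[1, 0], [0, 1]]           # 2x2 tiles = 16x16 pixels
print(render_scanline(0, [background], [blank, solid], 16))
```

Only one line of pixels exists at a time, so the memory cost is a few line buffers instead of a whole framebuffer.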
besides the actual graphics themselves, configuring it is mostly setting the sizes of the block RAM arrays inside of it, the RGB format, and more
it also includes stuff like whether or not the fancier features are enabled
such as the color math and affine sprites
still need to go back and update my code that implements affine sprites to make it use my new code
neat project
I want to do some FPGA stuff someday since I did a lot of HW coursework but it's just low on the project list right now
oh thanks
the first version of the CPU is almost done
I also want to rename it just to "Flare"
or "Flare64"
and make the registers and math 64-bit
you can already do a 64-bit add or subtract in two instructions
due to the flags register
...that's actually how I informed GCC how to do 64-bit add/sub
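To spell out the two-instruction 64-bit add: add the low halves, keep the carry flag, then add the high halves plus that carry. A minimal sketch, where `add32` stands in for a 32-bit add that also produces a carry flag (the function names are made up, not Flare mnemonics):

```python
MASK32 = 0xFFFFFFFF


def add32(a, b, carry_in=0):
    """Model a 32-bit add: returns (result, carry_out flag)."""
    total = a + b + carry_in
    return total & MASK32, total >> 32


def add64(a_lo, a_hi, b_lo, b_hi):
    """64-bit add built from two 32-bit adds chained through the carry flag."""
    lo, carry = add32(a_lo, b_lo)       # first instruction: sets the carry flag
    hi, _ = add32(a_hi, b_hi, carry)    # second instruction: consumes the carry
    return lo, hi


print(add64(0xFFFFFFFF, 0x00000000, 0x00000001, 0x00000000))
```

Subtraction works the same way with a borrow instead of a carry.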
all this fpga stuff will be useful for all the hardware supporting srs
Yes in the sense that if you don't have a custom PCIe FPGA card to run the SRS firmware, you can't play the game
what is srs?
so what I'm gonna do is not the PCIE FPGA card stuff yet
I'm gonna make a full system
using my Flare CPU
srs is a secret government project
It's my FPS game project that has become a symbol of scope creep lol
But one day I would like to attempt making an FPGA-based PCIe peripheral to do like custom hardware acceleration for game systems just for fun
ah
so what I'm thinking of doing for my programmable GPU is actually just taking my existing CPU with an in order pipeline and have a ton of cores arranged in a MIMD fashion
It looks to me like MIMD with no OS or OoOX and small cores could work well
I was going to hook up all the cores via the Banana Memory Bus
good news, a conforming Vulkan implementation can implement floats such that denormals are flushed to zero
This means I can have my FPUs be really small
and thus fit more cores
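For reference, flush-to-zero on a 32-bit float is just a check on the raw bits: exponent field all zeros with a nonzero mantissa means denormal, and it gets replaced with a signed zero. A minimal sketch (the function name is made up):

```python
def flush_denormal_f32(bits: int) -> int:
    """Flush a denormal IEEE 754 binary32 bit pattern to a zero of the same sign."""
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent == 0 and mantissa != 0:
        return bits & 0x80000000  # keep only the sign bit
    return bits                   # normals, zeros, infinities, NaNs pass through


print(hex(flush_denormal_f32(0x80000001)))
```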
well, I've changed my mind and will support denormals
I don't think they'll add that much hardware
not with me reusing my existing clz
and barrel shifter
for them
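A sketch of why clz plus a shifter covers most of denormal input handling: count the leading zeros of the mantissa, shift it up until the hidden bit position is set, and lower the exponent by the same amount. This assumes binary32 and is only a model of the idea, not the author's FPU; `normalize_denormal` is a made-up name.

```python
def clz32(x: int) -> int:
    """Count leading zeros of a 32-bit value (clz32(0) == 32)."""
    return 32 - x.bit_length()


def normalize_denormal(mantissa: int):
    """Turn a nonzero 23-bit denormal mantissa into (24-bit significand, exponent).

    The result represents significand * 2**(exponent - 23), with bit 23 set.
    """
    shift = clz32(mantissa) - 8     # distance to move the top set bit to bit 23
    significand = mantissa << shift
    exponent = -126 - shift         # denormals start at the minimum exponent, -126
    return significand, exponent


print(normalize_denormal(1))
```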
https://github.com/fl4shk/snow64_cpu/blob/master/hdl%2Fsrc%2Fsnow64_extra_arith_log_modules.src.sv#L94 here's my old barrel shifter code
It is small and fast
it doesn't take a whole 2D array in hardware
it's a lot smaller than the standard barrel shifter design while still reasonably being single cycle
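I haven't traced the linked code, so this is not necessarily that design, but one common way to avoid a full 2D mux array is a logarithmic shifter: one stage per bit of the shift amount, each stage either shifting by a fixed power of two or passing the value through. A minimal model, assuming a left shift with the amount smaller than the width:

```python
def barrel_shift_left(value: int, amount: int, width: int = 32) -> int:
    """Logarithmic left shifter: log2(width) stages of 2:1 muxes.

    Assumes 0 <= amount < width and width is a power of two.
    """
    mask = (1 << width) - 1
    stage = 0
    while (1 << stage) < width:
        if (amount >> stage) & 1:                   # mux select for this stage
            value = (value << (1 << stage)) & mask  # shift by 1, 2, 4, 8, ...
        stage += 1
    return value


print(hex(barrel_shift_left(0x0000FFFF, 12)))
```

For 32 bits that is 5 stages of 32 two-input muxes, all combinational, so it can still finish in one cycle.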
so, got my old systemverilog code converted over to SpinalHDL
I'm gonna make a small (LUT size) floating point unit
...state machine based
I added a "forwarding" mode to my PipeMemRmw SpinalHDL module that makes it include the stuff needed for hooking up a register file into an in-order CPU pipeline. This mode doesn't stall upon a hazard, but rather forwards data that's not made it into the SRAM to earlier in the pipeline.
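A toy software model of the forwarding idea, for readers who haven't seen it: writes take a few cycles to land in the SRAM, and a read checks the in-flight writes first (newest first) before falling back to the stored value. This is not PipeMemRmw itself; the class name, the 2-cycle latency, and the method names are assumptions for illustration.

```python
from collections import deque


class ForwardingRegFile:
    """Register file whose writes land after a delay, with read forwarding."""

    def __init__(self, num_regs=16, write_latency=2):
        self.regs = [0] * num_regs
        self.in_flight = deque()  # [cycles_left, index, value], oldest first
        self.write_latency = write_latency

    def write(self, index, value):
        self.in_flight.append([self.write_latency, index, value])

    def read(self, index):
        # Forward from the newest pending write to this register, if any.
        for _, pending_index, value in reversed(self.in_flight):
            if pending_index == index:
                return value
        return self.regs[index]

    def tick(self):
        # Advance one cycle and commit any writes that have finished.
        for entry in self.in_flight:
            entry[0] -= 1
        while self.in_flight and self.in_flight[0][0] <= 0:
            _, index, value = self.in_flight.popleft()
            self.regs[index] = value


rf = ForwardingRegFile()
rf.write(3, 42)
print(rf.read(3), rf.regs[3])  # forwarded value vs. what the SRAM holds so far
```

Without the forwarding path, the read would have to stall until the write reached the SRAM.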
so it's not 100% working yet. Will have to continue it tomorrow.
got it working
and... I changed my design, and I'm getting closer to it working!
seems to be working entirely now
I used sby and abc pdr to formally verify it
that's how I know it completely works now!
and now I'm doing some testing of my PipeMemRmw module with a fake "instruction set" intended to simulate the way stalling will happen in my CPU
Can you spit out some block diagrams of your design or something neat we can look at
I'm going with something akin to the Classic RISC Pipeline
Or rasterizer output or something
hm
well
I can show you pics of the rasterizer as it ran on my laptop
the soft CPU isn't done yet
you're right
I haven't posted my software rasterizer yet
Neat
technically I posted it in #software-rasterization
but not in this thread
also I drew that texture
I still need to learn Blender
I started learning it about a month ago when I was on vacation
Yea takes time
I was doing hard surface modeling in blender just for fun long before I was ever programming
ah
I've been programming for half my life
...well a little more than half my life
and so I'm getting close to ready to work on the CPU again!
there's an important FPGA code module I wrote that I plan on using for the CPU in more than one location
I am currently working on the CPU
https://github.com/fl4shk/flare_cpu/blob/main/docs/flare_cpu/flare_pipeline_structure.dot
https://github.com/fl4shk/flare_cpu/blob/main/docs/flare_cpu/flare_pipeline_structure.pdf
at this time I've made a ton of progress
I've switched CPU instruction sets though.
I wrote another GCC port and Binutils port.
would be good to try rendering this with my software renderer:
made this model yesterday
with magicavoxel
I think turning my sprites into 3D models this way would work out well
decided to squish his head
more work is needed for building libstdc++
Since apparently I didn't get access to std::vector and other containers
you made gcc generate instructions for ur fpga cpu ?
yes
Wow
That's cool
hey, thanks!
tbh, it was harder to develop the CPU itself
Ur cpu is in scala right?
Is there a scala to verilog transpiler or something
I wanna try simulating it
I don't have access to an FPGA any more :(
in SpinalHDL
SpinalHDL is a Scala library that is an FPGA language/HDL
so it doesn't translate arbitrary Scala code directly into Verilog
ahh unlucky I guess
well
that translation is very hard to implement for a language like Scala
you may like PipelineC though
which doesn't translate... arbitrary C code into Verilog
but it does translate a certain kind of C code into VHDL I believe?
hey I just got libstdc++ to build for my CPU, with support for std::vector!
I find this really impressive 😅
so is ur cpu 8bit retro style? or a modern 64bit cpu?
it's actually 32-bit
Did you copy some existing architecture or make it from scratch?
There's bits and pieces comparable to other CPUs, mainly RISC ones
It's only got integer instructions though
still interesting nonetheless
the CPU was the hardest part to develop, I think
GCC is a lot easier to port than you might think
Binutils was the second hardest because dealing with ELF was somewhat difficult
should I do (1) make a GPU or (2) add vector instructions and do software rendering?
GPU
yes I do mean HW rasterization
based on my SW rasterizer
...I'm thinking of doing both, starting with the vector ops
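The thread doesn't say which algorithm the software rasterizer uses, so treat this as a generic sketch of one common approach: edge functions evaluated per pixel. The inner test is the same three multiply-subtract expressions for every pixel, which is the kind of loop that maps well to either vector ops or parallel hardware. The names `edge` and `rasterize` are made up, and sampling happens at integer pixel coordinates to keep it short.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area term: which side of edge A->B the point P lies on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def rasterize(v0, v1, v2, width, height):
    """Return the (x, y) pixels covered by triangle v0, v1, v2."""
    covered = []
    area = edge(*v0, *v1, *v2)
    if area == 0:
        return covered                  # degenerate triangle
    for y in range(height):
        for x in range(width):
            w0 = edge(*v1, *v2, x, y)
            w1 = edge(*v2, *v0, x, y)
            w2 = edge(*v0, *v1, x, y)
            if area > 0:
                inside = w0 >= 0 and w1 >= 0 and w2 >= 0
            else:                       # opposite winding order
                inside = w0 <= 0 and w1 <= 0 and w2 <= 0
            if inside:
                covered.append((x, y))
    return covered


print(len(rasterize((0, 0), (4, 0), (0, 4), 5, 5)))
```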
well
or PipelineC
started converting my software rasterizer into PipelineC