#Sega Dreamcast

rain obsidian
#

But better than having all of the random giant triangles everywhere.

#

Core running at 30 MHz again. I tried it at 40 MHz, but there were quite a few pixel glitches, and vert lines.

#

A lot of the slow-down is actually the emulator atm, as the above video is with the emu not waiting for the FPGA to finish each frame

#

skmp did a tweak of the emu, to enable the use of "Fastmem".

#

But I haven't tried that yet.

#

ARM core(s) are at the default 800 MHz atm.

#

I did run the Overclocking scripts before, but I don't know if they continue working after a power cycle. I guess not?

#

I just tried the 1.2 GHz overclock script, and it instantly crashed. lol

#

Trying 1 GHz.

#
/root# cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq
1000000
1000000
#

Nope, those overclock settings don't stick, after loading the core.

#

Just had to run the script from the shell.

#

Trying C Taxi.

#

Running at about 28% of full speed atm.

#

(at the BIOS logo)

#

Some weirdness going on. I think frame sync is still enabled.

#

Did a make clean.

#

That's more like it. Emu runs at 100% speed, at the BIOS logo.

#

In fact, maybe a bit too fast. lol

#

40% at the Sega logo.

#

Around 8-16% in-game.

#

So not great... yet.

#

The ARM cores aren't as fast as I was hoping.

#

I'll try the Fastmem version now...

#

oic, it's on a separate branch.

#
export CC='/opt/gcc-arm-10.2-2020.11-x86_64-arm-none-linux-gnueabihf/bin/arm-none-linux-gnueabihf-gcc'
export PATH=$PATH:/opt/gcc-arm-10.2-2020.11-x86_64-arm-none-linux-gnueabihf/bin
cmake -DCMAKE_TOOLCHAIN_FILE=tc.cmake ..
#

oww

#

Might need to add -lrt in CMakeLists

#
set(CMAKE_CXX_FLAGS "-lrt")
#

It should be clear by now - I have no idea what I'm doing, when it comes to compilers. lol

#

Last build didn't work.

#

librt still missing.

#

And it would have to be the ARM version, I would think?

#

Same as the old version, so maybe the whole vmem_posix thing isn't meant to be enabled, for the ARM build?

#

Doh! lol

#

Compiling.

#

booo!

#
${swrl}/hw/mem/_vmem.cpp
#

Needs to use the erm, embedded version of _vmem, rather than the x86/x64 thingy. Dunno.

#

Nope

#

That _vmem file was already included, elsewhere in the CMakeLists file.

#
int main(int argc, char* argv[])
{
    #if HOST_OS == OS_LINUX
    void common_linux_setup();
    common_linux_setup();
    #endif
    
    set_user_config_dir(".");
    set_user_data_dir(".");
    add_system_config_dir(".");
    add_system_data_dir(".");

    ParseCommandLine(argc, argv);
    cfgOpen();

    libswirl_init();
    libswirl_loop(argc == 1 ? "": argv[1]);
    return 0;
}
#

Old version doesn't have that common_linux_setup stuff.

#

So I'll give up for now, and give skmp chance to take a look.

#

Actually, I'll try using the new _vmem files in the old build.

#

Maybe not. Way too many changes...

#

Gonna leave it for tonight, sorry.

#

Was just hoping to see how fast the emu ran with Fastmem.

#

(with or without the FPGA rendering running)

#

DC logo, with the ARM set at the default 800 MHz...

#
VREG = 03 ARMRST 00
ARM7: Invalidating cache
SetWindowText: reicast git/n - 35.46 - 28.19 - V: 16.90 (1.41, NTSC480i59.94) R: 11.93+0.00 VTX: 0.00 , MIPS: 0.00

SetWindowText: reicast git/n - 11.38 - 87.80 - V: 52.63 (1.01, NTSC480i59.94) R: 51.64+0.00 VTX: 0.00 , MIPS: 0.00

SetWindowText: reicast git/n - 11.29 - 88.56 - V: 53.09 (1.03, NTSC480i59.94) R: 51.10+0.00 VTX: 0.00 , MIPS: 0.00
#

ARM set to 1 GHz...

#
VREG = 03 ARMRST 01
VREG = 03 ARMRST 01
VREG = 03 ARMRST 00
ARM7: Invalidating cache
SetWindowText: reicast git/n - 30.19 - 33.12 - V: 19.85 (1.37, NTSC480i59.94) R: 14.39+0.00 VTX: 0.00 , MIPS: 0.00

SetWindowText: reicast git/n - 9.18 - 108.83 - V: 65.24 (1.01, NTSC480i59.94) R: 64.24+0.00 VTX: 0.00 , MIPS: 0.00

SetWindowText: reicast git/n - 9.11 - 109.76 - V: 65.79 (1.03, NTSC480i59.94) R: 63.30+0.00 VTX: 0.00 , MIPS: 0.00
#

I think the second number is the percentage speed, vs a real DC?

#

So that's a decent speed-up. About what is expected for the 25% increase in ARM clock freq.
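
A quick way to check that against the logs above. This is only a sketch: it assumes the second number after `git/n` really is the percentage speed (as guessed above), and the `speed_percent` helper is made up for this example.

```python
import re

# Title-bar lines copied from the two logs above (800 MHz and 1 GHz runs).
LINE_800 = ("SetWindowText: reicast git/n - 11.29 - 88.56 - V: 53.09 "
            "(1.03, NTSC480i59.94) R: 51.10+0.00 VTX: 0.00 , MIPS: 0.00")
LINE_1000 = ("SetWindowText: reicast git/n - 9.11 - 109.76 - V: 65.79 "
             "(1.03, NTSC480i59.94) R: 63.30+0.00 VTX: 0.00 , MIPS: 0.00")


def speed_percent(line):
    """Return the second number after 'git/n' (assumed to be % of real-DC speed)."""
    match = re.search(r"git/n - ([\d.]+) - ([\d.]+) - V:", line)
    if match is None:
        raise ValueError("unrecognised title-bar line")
    return float(match.group(2))


measured = speed_percent(LINE_1000) / speed_percent(LINE_800)
clock_ratio = 1000 / 800  # 1 GHz vs the default 800 MHz

print(f"measured speed-up: {measured:.3f}x")
print(f"clock ratio:       {clock_ratio:.3f}x")
```

The measured ratio lands just under the clock ratio, which fits the emu being mostly CPU-bound at the DC logo.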

#

Need to do some profiling of the emu, when running on the ARM.

#

Don't get me wrong, it's amazing to see it running at all.

#

But I'm wondering which part of the code takes the longest to execute.

#

Seeing that there's no actual rendering being done by the emu, and no sound stuff enabled.

#

Or maybe the ARM is still there, as it has to give feedback to the games, to keep them happy?

#

I think the basic summary is - the ARM cores on the Cyc V suck. lol

#

We'll just have to call it "Lazy Taxi".

#

Oh dear...

#
/media/fat/reicast# ../Scripts/mister_mem_oc.sh
Current frequency is 800000 KHz
Frequency successfully set to 1000000 KHz.
/media/fat/reicast# ../Scripts/mister_mem_timings.sh
***** BEFORE *****
  tCL=7, tRP=6, tRCD=6, tRAS=14
  tRFC=120, tFAW=15, tRRD=3, tAL=0, tCW=7
  tWTR=4, tWR=6, tREFI=3120
  tCCD=4, tMRD=4, tRC=20, tRTP=3
  Min Power Save Cycles=0, tXPDLL=3, tXS=512
#

mister_mem_oc seemed to work.

#

mister_mem_timings, not so much.

#

The whole screen went wonky, then it rebooted. lol

#

A nice extra speed boost.

#

(DC logo)

#

So that's with both the ARM and DDR3 set to 1 GHz.

#

But not the mem timings, as that probably only works for slower mem freq?

#

I spoke too soon…

#

Makes me wonder about whether a heatsink for the RAM would help.

rain obsidian
#

ARM core(s) overclocked to 1 GHz. Memory left at 800 MHz, else it explodes.

#

FPGA renders not in-sync with the emu, for the above vid. ^

#

Scenes with mostly flat-shaded polys render quite fast now.

#

Emu in sync with (waiting for) FPGA renders...

#

So the FPGA is still a fair bit slower than the emu is running.

#

Maybe ~10 FPS, at 30 MHz.

#

It should be in-sync, but it now looks like it's displaying the backbuffer.

#

Hence the gaps in the hud, etc.

#

emu is running at 30% speed, on that BIOS menu. With "wait for FPGA render" turned off.

#

5.12 milliseconds (per frame) being wasted atm, waiting for the Tile writeback.

#

When it could be double-buffered, so it can be rendering the next Tile, whilst writing the previous one.
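
One way the 5.12 ms figure could come about, as a rough sketch. The resolution, the 32x32 tile size and the 512 clocks per tile writeback are assumptions for this example, not values read out of the core.

```python
SCREEN_W, SCREEN_H = 640, 480   # assumed output resolution
TILE = 32                       # assumed tile size in pixels (32x32)
WRITEBACK_CYCLES = 512          # assumed clocks spent writing back one tile
CORE_HZ = 30_000_000            # core clock mentioned in the chat

tiles = (SCREEN_W // TILE) * (SCREEN_H // TILE)
wasted_s = tiles * WRITEBACK_CYCLES / CORE_HZ

print(f"{tiles} tiles -> {wasted_s * 1e3:.2f} ms per frame spent on writeback")
```

With a second tile buffer, that writeback time would overlap with rendering the next tile instead of adding to the frame time.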

rain obsidian
#

Trying direct FB writes, bypassing the Tile ARGB buffer.

#

Doesn't work too well on the sim. lol

rain obsidian
#

Fixed.

#

I was hopping over too many states in isp_state, trying to get stuff to run faster.

#

But forgetting that it skipped most of the FB writes, because I'm bypassing the Tile ARGB buffer.

#

Had to restart the Quartus compile.

#

This is just a test, to see how much faster it can render, when it's not using the Tile buffer.

#

So it won't be able to do Alpha blending etc. but will write the pixel colour directly to the FB.

rain obsidian
#

With the in-sync thing enabled, the BIOS menu is doing around 12% fullspeed.

#

Which is pretty good, considering it's at 30 MHz.

#

The end goal being 100 MHz, which will be tough, but we'll see.

rain obsidian
#

Takes forever to get into Sanic.

#

Because the FMV plays so slowly.

#

The FMV won't actually display anything yet, because I haven't implemented the YUV thing.

#

You can see how much slower it is, with textures enabled.

#

So the texture cache is the next thing on the list.

#

lol

#

I really dislike unskippable intros.

#

With textures disabled, it's doing roughly 5-12 FPS.

#

With textures on, it's only about 3-4 FPS.

#

Sleeping soon. zzzzz

rain obsidian
steady grail
#

I know you keep saying the Dreamcast goal is 100 MHz, but considering the original CPU was 200 MHz, do you think it's possible to run at full speed?

dusk bolt
#

FPGA is mostly filled with gpu so both gpu and cpu are out of the question for mister

rich kindle
#

Thus the only chance will be a hybrid approach with the FPGA just being the GPU and perhaps the sound chip. And the CPU and the rest being emulated by the ARM cores (a mini version of reicast currently running on them)

#

If this turns out to be too slow I suggested a "half-rez" SD mode, but ElectronAsh isn't convinced yet of its necessity.

#

(which is a good sign)

#

If a possible SD mode is too slow then DC is simply not feasible at all on MiSTer. But the PowerVR2 research could be used for a next gen MiSTer.

surreal sluice
#

Yeah I assumed this was more of a research / experimental thing that could maybe pay off on a bigger board but I haven’t been following that long

polar goblet
#

There should be a feature where the Dreamcast core makes obnoxious disc drive noises.

I want my games to be grinded into dust for the authentic Dreamcast experience.

rain obsidian
#

As there might be ways of getting an SH4 core to execute more instructions in parallel.

#

Where it could then run at a lower freq, and still be fast enough for most games.

#

But right now, the SH4 is never going to fit alongside the GPU, sound, and all of the other logic.

#

Unless somebody knows how to massively reduce the logic used for the inTri and interp blocks.

#

I'm convinced it's possible, I just suck at maths.

#

Not even ChatGPT knows quite how to do it. lol

rain obsidian
#

But at that point, you almost might as well run a Dreamcast or NAOMI. lol

#

The same FPGA platform would be able to run tons of the existing cores, though.

#

It just needs enough IO, vs the DE10.

#

The SH4 "hat" board I built, uses up both 40-pin GPIO headers, and even a few of the Arduino header signals.

#

So it can't even be used alongside an SDRAM module.

#

(although the hat board does have its own SDRAM for the SH4 to use.)

#

I really do think the GPU logic could be shrunk down a lot, I just don't know enough about the maths etc., to say for sure.

#

On the face of it, there really isn't THAT much other active logic for the GPU.

#

I did have that starting to run a bit of BIOS code, but it would need tons of other registers and logic, to get it to run much further.

#

ie. to the point of it starting to write display lists into VRAM.

#

I looked on the MAME debugger a few weeks back, to see what it accessed first.

#

I can't remember exactly what now. lol

stone plaza
#

!

tired glacier
#

isn't the memory bandwidth needed also above what misterfpga can provide?

rain obsidian
#

You can only really mitigate the latency by adding caches for the most frequently-accessed stuff, like textures.

#

So you Burst-read say 32 Words of data at a time, into the cache.

#

You still have that initial handful of clock cycles of latency, but after that, the data is read pretty much contiguously from that small chunk of DDR3.

#

Once the data is in the cache, it's basically zero latency.

#

ie. The address changes, then the data is available on the very next clock cycle.
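
A toy model of that idea, just to show where the cycles go. It is a sketch, not the actual cache: the single-line layout, the 7-cycle start-up latency and the `BurstCache` name are all assumptions for illustration.

```python
class BurstCache:
    """Toy single-line read cache that fills itself with one burst read."""

    def __init__(self, memory, line_words=32, burst_latency=7):
        self.memory = memory                # backing store (stands in for DDR3)
        self.line_words = line_words        # words fetched per burst
        self.burst_latency = burst_latency  # assumed start-up cost, in cycles
        self.base = None                    # first address held in the line
        self.line = []
        self.cycles = 0                     # running cycle count

    def read(self, addr):
        base = addr - (addr % self.line_words)
        if base != self.base:
            # Miss: pay the initial latency once, then one word per cycle.
            self.line = self.memory[base:base + self.line_words]
            self.base = base
            self.cycles += self.burst_latency + self.line_words
        # Hit (or freshly filled line): data is ready on the next cycle.
        self.cycles += 1
        return self.line[addr - base]


memory = list(range(1024))
cache = BurstCache(memory)
values = [cache.read(a) for a in range(64)]   # sequential reads
uncached = 64 * 7                             # paying the latency on every word

print(cache.cycles, "cycles cached vs", uncached, "uncached")
```

The win only shows up when the reads are mostly contiguous, which is why it suits textures and codebooks better than random accesses.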

#

DDR3 on the DE10 usually runs at 400 MHz, and is 32-bit wide.

#

400 MHz * 4 Bytes at a time = 1,600 MB/sec.

#

But it's DDR, so transfers data on both the rising and falling edges of the clock.

#

So the peak transfer rate is 3,200 MB/s (3.2 GBytes/sec !!)

#

(or about 2.98 Gibibytes/sec, or whatever the stupid name is for it now. lol)

#

The SDRAM used as VRAM on the Dreamcast runs at 100 MHz.

final oar
#

16MB main memory, 8MB graphics memory and 2MB sound on dreamcast right?

rain obsidian
#

Only half is read/written at a time, for parameters (Vertex and other data), so 400 MB/s.

#

For textures, it's read as the full 64-bit wide data, so 800 MB/s, peak.

#

Yep.

#

NAOMI is 32MB main, 16MB VRAM, 8MB sound.

#

I guess they needed quite a lot for sound samples on NAOMI, as it can't use CD/GD audio?

#

So, plenty enough bandwidth for DC.
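
The same sums in code form, using only the clock speeds and bus widths quoted above. The `peak_mb_per_s` helper is just a name for this example, and these are theoretical peaks, not measured throughput.

```python
def peak_mb_per_s(clock_mhz, bus_bytes, transfers_per_clock=1):
    """Peak bandwidth in MB/s (1 MB = 10**6 bytes)."""
    return clock_mhz * bus_bytes * transfers_per_clock


ddr3 = peak_mb_per_s(400, 4, transfers_per_clock=2)  # DE10 DDR3: 32-bit, both edges
vram_params = peak_mb_per_s(100, 4)                  # DC VRAM, 32-bit half (params)
vram_textures = peak_mb_per_s(100, 8)                # DC VRAM, full 64-bit (textures)

gib_per_s = ddr3 * 10**6 / 2**30                     # same figure in binary units

print(ddr3, vram_params, vram_textures)
print(f"DDR3 peak: {gib_per_s:.2f} GiB/s, {ddr3 / vram_textures:.0f}x the DC texture peak")
```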

#

The latency of DDR seems to be a bit sucky, though.

#

With the core running at 30 MHz atm, it's taking roughly 5-7 clock cycles of latency to read any random word from DDR3.

#

For every single word (64-bit).

#

If the core was running at 100 MHz eventually (which could be very hard), that 7 (max) clock cycles will turn into about 23.

#

(from the POV of the core, vs DDR3 running at the same 400 MHz)
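
A small sketch of that conversion. It assumes the DDR3 latency is a fixed amount of wall-clock time, so only the number of core clocks it spans changes. `cycles_at` is a made-up helper name.

```python
def cycles_at(new_hz, cycles, old_hz):
    """Convert a latency counted in old_hz clock cycles into new_hz cycles."""
    latency_s = cycles / old_hz
    return latency_s * new_hz


for cycles in (5, 7):
    ns = cycles / 30e6 * 1e9
    at_100 = cycles_at(100e6, cycles, 30e6)
    print(f"{cycles} cycles @ 30 MHz = {ns:.0f} ns = {at_100:.1f} cycles @ 100 MHz")
```

If part of that latency is really clock-domain crossing on the core side rather than DDR3 itself, the cycle count at 100 MHz could come out lower than this.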

#

I don't know why it sucks quite as much as that. lol

#

Or I could be way off, and the DDR controller will handle it, and the latency isn't that bad.

#

The old-people (SDR) SDRAM module needs 8 clock cycles to read/write any random word.

#

But each module is only 16-bit wide.

final oar
#

yeah its weird, i lack knowledge on how mister works but DDR3 at 400mhz should theoretically be about 10ns, must be something to do with how it's shared on mister

rain obsidian
#

So maybe the DDR3 is always roughly 7-8 clocks max, even at a faster core freq. It's a good question.

#

What we do know is, even with ARM Linux sharing DDR3, the BUSY signal almost never gets asserted.

#

I can Burst write tiles into the DDR3 Framebuffer, for 512 clock cycles, and only see BUSY asserted for one or two clock cycles.

#

Still, with the core at only 30 MHz, though.

#

I really need to do some more calcs on how to speed it up, though.

#

I didn't think it would be quite as hard as this, to get closer to the same speed as the sim.

#

I need to add a texture cache next, even if it's quite a simple one.

#

The Codebook cache is pretty much already in place, and it Burst-reads 256 Words at a time (64-bit).

#

Every time the texture Base address changes, as that's where the VQ Codebook sits.

#

The Parameter Cache isn't really a cache at all right now.

#

It just blindly stores the vertex params for each triangle. The Tag just increments on each incoming triangle within the current tile.

#

(so the Tag can be used as the address for the param buffer.)
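
Roughly what that looks like as a software model. This is a sketch only: the `ParamBuffer` class and the dict fields are hypothetical, and the real thing is a block of on-chip RAM rather than a Python list.

```python
class ParamBuffer:
    """Per-tile store of triangle params, addressed by an incrementing tag."""

    def __init__(self):
        self.entries = []

    def start_tile(self):
        self.entries.clear()          # tags restart at 0 for every tile

    def push(self, params):
        tag = len(self.entries)       # the tag doubles as the write address
        self.entries.append(params)
        return tag

    def lookup(self, tag):
        return self.entries[tag]      # later stages read the params back by tag


buf = ParamBuffer()
buf.start_tile()
tag_a = buf.push({"verts": [(0, 0), (8, 0), (0, 8)], "tex_base": 0x1000})
tag_b = buf.push({"verts": [(8, 8), (16, 8), (8, 16)], "tex_base": 0x2000})

print(tag_a, tag_b, hex(buf.lookup(tag_b)["tex_base"]))
```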

#

Both the ISP and TSP need to know some of the param word stuff, and the verts for each triangle.

#

The ISP needs to know them, to do the inTri and Z interp calcs.

#

The TSP needs to know the texture base address, the X,Y, and UV of each vert, and reads the resulting Z values from the Tag/Z buffer.

#

Another pair of interp blocks for UV.

#

And then even more, for Gouraud, which I haven't even attempted to fit in the FPGA yet. lol

#

I need to bite the bullet soon, and just add the texture cache. Probably using the Codebook cache as the basis. It's practically the same thing.

#

The Sega PDF suggests how many words it pre-reads for each type of data.

#

And it's not very much. Maybe 32 Words at a time, usually.

#

VQ compression can fit 32 texels per 64-bit word, so it does a lot to reduce the amount of data you need to read for each texture.

#

I guess only TWO 64-bit Words to be read, for an 8x8 texture, using VQ.

#

Largest texture size on PVR2 is 1024x1024, apparently.

#

Which is ridiculous, even with VQ. lol

#

That would need to read 32,768 Words (64-bit), even with VQ enabled.

#

256KBytes.
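
The VQ numbers above, worked through. The sketch assumes each 8-bit VQ index selects a 2x2 block of texels, which is what gives 32 texels per 64-bit word, and it counts the codebook separately.

```python
TEXELS_PER_INDEX = 4        # assumed: one 8-bit index covers a 2x2 texel block
WORD_BYTES = 8              # 64-bit words
CODEBOOK_WORDS = 256        # codebook burst size mentioned earlier


def vq_index_words(width, height):
    """64-bit words of VQ index data for a width x height texture."""
    index_bytes = (width * height) // TEXELS_PER_INDEX
    return index_bytes // WORD_BYTES


for size in (8, 1024):
    words = vq_index_words(size, size)
    print(f"{size}x{size}: {words} words = {words * WORD_BYTES} bytes of indices")

print(f"codebook: {CODEBOOK_WORDS * WORD_BYTES} bytes on top")
```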

#

An example of the DDRAM_BUSY signal, during a Tile -> FB writeback.

#

OK, so it was three clock cycles, out of 512. lol

#

Actually a small bug there, because I should be holding in the same state, until BUSY goes Low again.

#

The DDR controller supposedly takes care of Burst Writes now.

#

By bunching the writes together (if they are on consecutive addresses).

#

So you just keep writing as fast as possible, and the controller mostly takes care of it.

#

For Burst reads, you have to decide how much contiguous data you want to request at a time.

#

Which is easy for some stuff like the Codebook cache, as that's always reading 256 Words.

#

But vertex params and textures are quite variable-length.

rain obsidian
#

That's better.

#

It now pauses in the same isp_state, if the BUSY signal is High.
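
A minimal model of that behaviour, assuming one write per clock when BUSY is low. The cycle numbers where BUSY goes high are made up for the example.

```python
def writeback_cycles(num_words, busy_cycles):
    """Count clocks needed to write num_words, stalling whenever BUSY is high.

    busy_cycles is the set of cycle numbers on which BUSY is asserted.
    """
    cycle = 0
    written = 0
    while written < num_words:
        if cycle not in busy_cycles:
            written += 1      # write accepted, move on to the next word
        # else: hold in the same state and retry the same word next cycle
        cycle += 1
    return cycle


busy = {100, 101, 300}                 # three hypothetical BUSY cycles
total = writeback_cycles(512, busy)
print(total)
```

Without the hold, the words that landed on those BUSY cycles would simply be dropped, which is the sort of thing that shows up as missing pixels.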

#

I can't tell if that has made much difference on-screen, as I keep messing with the ASCAL thing.

#

And now it has vertical scanlines again.

#

But the BUSY signal gets asserted far less often than I thought.

#

Sometimes it barely happens at all, through many frames of renders.

#

Although the Framebuffer writes are happening on alternate clock cycles atm, so it could go faster.

#

Anywho. Texture cache.

#

Gonna try to hook something up.

rain obsidian
#

FPGA around 90% full... and it couldn't compile.

#

And that wasn't even with the texture cache block enabled, because it just gave a black screen when I tried it last night.

#

So not sure what else would have changed so much to make it fail to compile.

#

I removed some stuff from SignalTap, as that usually helps.

#

Also had to add some waits for "z_clear_wait", as it was often trying to write to the Tag/Z buffer during a clear.

#

It has to clear all of the Z values, at the start of each new tile.

#

So Z starts at zero, then it gets compared (for each pixel) to each incoming (visible pixel) triangle Z value.

#

It does the "32 pixels at once" thing for the Tag/Z buffer, so only takes 32 clock cycles to do the clear.
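
As a sketch of the clear and the per-pixel compare. The "larger Z wins" rule and the one-row-per-clock layout are assumptions here, and `depth_test` is a made-up name, not a block in the core.

```python
TILE_W = TILE_H = 32
PIXELS_PER_CYCLE = 32        # assumed: one 32-pixel row of the tile per clock


def clear_tile():
    """Return a cleared Z buffer and the clocks the clear would take."""
    z_buf = [0.0] * (TILE_W * TILE_H)
    cycles = (TILE_W * TILE_H) // PIXELS_PER_CYCLE
    return z_buf, cycles


def depth_test(z_buf, tag_buf, x, y, z, tag):
    """Keep the triangle's tag if its Z beats the stored one (larger wins here)."""
    i = y * TILE_W + x
    if z > z_buf[i]:
        z_buf[i] = z
        tag_buf[i] = tag
        return True
    return False


z_buf, clear_cycles = clear_tile()
tag_buf = [None] * (TILE_W * TILE_H)

print(clear_cycles)
print(depth_test(z_buf, tag_buf, 3, 4, 0.5, tag=0))    # first triangle is visible
print(depth_test(z_buf, tag_buf, 3, 4, 0.25, tag=1))   # smaller Z loses
```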

#

I really do need to emulate Burst transfers on the sim soon.

#

So I can keep the sim and Quartus version 100% in-sync.

#

So then, I should be able to check everything in the sim, fix most of the worst graphical glitches, etc.

#

I don't know if I can finish this, tbh.

#

But I also don't think it could be re-written very easily. There aren't too many ways to do it.

#

Just have to keep on chipping away at it.

#

But you can almost guarantee, if I get to a point where it runs quite fast on the FPGA, and the renders look half-decent, an uber dev will just release a full Dreamcast core. lol

#

'cos stuff like that has happened many times in the past.

#

(Arkanoid is one example, but I was happy to help get that done.)

nocturne lodge
#

thank you for your time in this 🙂

rain obsidian
#

I will carry on, and see what I can do.

#

It's just frustrating atm, as I know what is possible now, but it's hard to get everything implemented, especially with it running so low on FPGA resources, and the very long compile times.

#

Right now, Quartus is having trouble again, and the compile has been going for nearly 90 minutes.

#

That usually means it won't finish, so I might as well cancel it, and figure out some stuff I can remove temporarily.

#

This is why I really want a MiSTer setup with a larger FPGA.

#

I kind of sort of have one already, but with only 77K LEs.

#

Using a QMtech Cyc V module.

#

I also have a Cyc IV module with 150K LEs, but haven't tried compiling cores for that yet.

#

The problem with both modules is - there is no DDR mem.

#

Only on the Orange Pi, which isn't directly shared with the FPGA.

#

So none of the usual stuff like ASCAL can work.

#

Hell, I'd even replace the Cyc V on a DE10 with a larger one.

#

Which might not be quite as bad as it sounds.

steady grail
#

I appreciate your diligence and the hard work you put into this community! I just hope the MiSTer FPGA platform evolves over time, so that the FPGA chips become bigger and MAYBE (idk how cores exactly work) could one day be capable of running Dreamcast, PS2 and Xbox with ease.

edgy pilot
#

Likewise, it's been amazing seeing what you can do and push the limits of the mister.

rain obsidian
#

Thanks. 😉

#

I still believe a LOT of logic could be saved, in the interp and inTri blocks.

#

Just needs the right dev, and somebody who actually knows maths, rather than me. lol

#

Also on the todo list, is to make a note of all of the min and max float values for X,Y,Z, for each of the example renders.

#

So I can figure out how many fractional and integer bits it needs, and where there could be an overflow etc.

#

I knew it was going to be quite hard to work with fixed-point numbers, as floats can represent a VERY wide range.

#

But so far, the ranges don't seem to be too extreme.
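
Once the min/max values are logged, picking the bit widths is mostly mechanical. A rough sketch follows; the example range and the 1/256 resolution are invented numbers, not values from the renders.

```python
import math


def fixed_point_format(min_val, max_val, resolution):
    """Return (integer_bits, fractional_bits) for a signed fixed-point format.

    integer_bits excludes the sign bit; resolution is the smallest step needed.
    """
    magnitude = max(abs(min_val), abs(max_val))
    if magnitude < 1:
        integer_bits = 0
    else:
        integer_bits = math.floor(math.log2(magnitude)) + 1
    fractional_bits = max(0, math.ceil(-math.log2(resolution)))
    return integer_bits, fractional_bits


def to_fixed(value, fractional_bits):
    """Scale a float to an integer with the given number of fractional bits."""
    return round(value * (1 << fractional_bits))


int_bits, frac_bits = fixed_point_format(-120.5, 760.25, 1 / 256)
print(int_bits, frac_bits)
print(to_fixed(760.25, frac_bits))
```

Intermediate products in the interp maths need more headroom than the inputs, so the widths found this way are only a lower bound.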

#

Currently working on testing an HP Z800 Dual-Xeon workstation.

#

I've not owned any Workstation-class stuff before.

#

And no ECC memory, not since I had the Octane and O2, around 2008.

#

It's not really worth trying to repair the original Z800 PSU.

#

Almost all of the rails are 12V anyway, then a low-current -12V, and 5V Standby.

#

I might not have a 12V PSU with enough current to even get the machine to POST atm, but it's worth a try.

#

I'm just curious to see if an older dual-socket Xeon can still be useful for Quartus or whatever. Probably not. lol

#

Poor Dreamcast sat on the bench. I haven't even tested the Atomiswave LAN adapter yet.

#

And the DVD drive from the Marantz DVD player, which I was intending to try reading GD-ROMs with.

#

Too many projects.

edgy pilot
#

I also think your 'hat' thing is an interesting idea

#

It gives me Sega CD/32x vibes.

#

Imagine if one could just make a hat and connect 1 or 2 raspberry pi zeros for extra performance.

#

(I have no clue how any of this works, and I'm probably chatting absolute rubbish)

rain obsidian
#

This 12V PSU is only rated at 16.5 Amps.

tired glacier
#

interesting stuff, but yeah just realized I don't really know what the other cores do with their video out but I guess it has to end up in the DDR3 🙂

rain obsidian
#

Very unlikely to be quite enough to get the Dual Xeon to post, but I'll try without the CPUs in first.

tired glacier
#

yeah it's important to try to limit all the calculations to as few bits as possible. You can even see this happening on modern GPUs

rain obsidian
#

The main issue there is, having enough IO pins to do everything.

#

Each 40-pin header on the DE10 has 36 IOs.

#

(then two Grounds, 3V3, and 5V)

#

Which might be just about enough for a 32-bit bus, Clock, and a few control signals.

#

So that could be one way of getting more ARM power to run the emu.

#

It's just... it's never going to feel quite the same as having the vast majority of it on the FPGA.

#

Which isn't necessarily a bad thing, but you know what I mean. lol

rain obsidian
#

ASCAL has two or three Framebuffers in DDR3, to do the upscaling (or downscaling) etc.

#

But ASCAL also has the option of displaying an existing Framebuffer in DDR3, which is what I'm using atm.

#

Normally, that feature would often be used for displaying the Linux framebuffer from the ARM side.

#

The ARM just has to plonk the Framebuffer at a certain address, then ASCAL is set up to read from that address.

#

ASCAL supports 16-bit, 24-bit, 32-bit?, and Paletted colour, when reading the Framebuffer.

#

I'm using 16BPP for PVR2, but kind of read from each 32-bit word. It's still not quite right, hence I only see half the horizontal resolution atm.

rain obsidian
#

After quite a lot of hassle - the Xeon "POSTed" earlier.

#

But with the measly 16 Amp PSU, it gets quite hot within a few minutes, so I didn't want to leave it on.

#

I ordered an HP DPS-800 PSU instead, which is known to fit in the old case.

#

Lots of RAM errors showed up, but I don't know if it just needs "training" for that.

#

Could probably run the whole thing off a car battery for quite a while. lol

#

It's almost all 12V rails, plus an Amp or two for 5V Standby.

#

And a -12V rail, which is probably only used for the old-people serial port, and certain sound cards with opamps on, etc.

#

I doubt the speed of the Dual Xeons will be super impressive, but I don't know.

#

Just interesting to own a "Workstation" class machine, even if it's old.

brisk jay
#

Could someone please explain to me exactly what's being discussed here? I gather it's partially using the ARM and may run slow? Would it be too slow for use like a normal core? Could this become a real Dreamcast core on a DE-10 successor?

dusk bolt
rain obsidian
#

Bump

rich kindle
#

He bumped!!! 😊

pseudo tinsel
#

hehe

#

so I am back home

#

I should in theory be able to load up the core in the tools

rich kindle
#

Great to see you back. καλωσόρισμα

#

(hope that makes sense 🤔😆)

pseudo tinsel
#

it kinda does and kinda doesn't 😛

rich kindle
#

The waitress in my favourite heavy metal pub in my home town was from Greece. She educated us, thus all my Greek is due to her.

pseudo tinsel
#

haha

rich kindle
#

Ash wanted to experiment with that branch but couldn't get it working

rain obsidian
#

Not for ARM, at least.

pseudo tinsel
#

yeah it is a bit tricky

#

i'd need ssh access to a mister to figure out

floral vale
#

I like Dreamcast

rain obsidian
#

I think I probably need to tidy up this craphole of a room, and the other room, before I look into DC again. lol

#

It hasn't been this untidy in quite a while. I thought I was making progress, until I got burnout a few months back.

edgy pilot
#

GameCube
Zx Spectrum

What's not to love

#

You have a very nice collection

rain obsidian
#

tbh, I haven't powered up most of that stuff in the past 2-3 years.

#

There's a Vectrex at the back, but it has a fault on the logic board.

edgy pilot
rain obsidian
#

It was fully working originally, but I helped out a friend many years ago (easily 8-9 years ago now), by sending him my working board, and he sent his faulty one back. I never got around to repairing it.

#

Exactly.

#

And, tbh, I rarely play games either now. I find projects more rewarding overall, that's during the times when I don't have burnout.

#

Good job the detail isn't great in the photo, as it's like an anti-doxxing filter. lol

edgy pilot
#

Although, there is one game that's keeping one of my older consoles connected up to my TV. Resident Evil Code Veronica on the GameCube. (Plus I like how easy it is to mod a GameCube. I have the GCloader + a gcvideo device for a beautiful digital out picture)

Ironically, it also has a Dreamcast release. 😉

#

But yeah, it's always nice to see the more uncommon/non-mainstream consoles/retro stuff in people's collection. That Vectrex does look nice.

rain obsidian
#

Last time this bench was this "tidy" was about four months ago...

#

So that's all in the same room, with the bench, and bed.

#

Each of the two rooms is so small, I can lay flat, and touch both walls. lol

#

Actually, I need to blur an address on that last one.

edgy pilot
#

Haha, don't worry. All I saw was your very blue co op card.

#

It's why I don't really send pictures of my collections or anything. Knowing my luck I'll leave some personal info lying around. (Which has a high chance of happening)

rain obsidian
#

co op card, was my squeegee, for the solder paste.

#

I have so many PCB projects to build, and almost zero motivation.

#

New GD Emu, boards for the DIY AV Receiver, etc.

edgy pilot
#

I'd be gutted if I ended up breaking my co op card though. No discounts.

rain obsidian
#

lol

#

I only got it for Download fest last year.

#

Used it probably twice, and never since.

#

In fact, I think I forgot to even use the card.

#

These could actually be quite nice rooms, if I bothered to keep them tidy. sigh

#

Could do with new curtains, too.

edgy pilot
#

I think this year will be the year I'll learn how to do some soldering.

Collecting retro games/consoles and soldering go hand in hand. I've got probably 5 spare PS1s I could practice modchipping on after I feel confident enough.

Then I could actually diagnose and attempt to fix my old Dreamcast. Unfortunately it's no longer dreaming anymore. (Powers fine, no video out)

#

Used to work though

edgy pilot
#

Least it works though!

rain obsidian
#

I do find soldering quite therapeutic at times.

#

Also, getting into using BGA chips, and using solder stencils, isn't half as bad as you might think. 😉

edgy pilot
#

My first attempt at using a soldering iron was to desolder caps and tsop (bridge 2 solder points) in my OG xbox

rain obsidian
#

The only thing is, when using a stencil, you really want to make 100% sure you have ALL of the components ready.

edgy pilot
#

Needless to say, it didn't go well at all.

rain obsidian
#

You can obv manually solder quite a lot after the fact, but not really with the stencil, and you'll already have paste / solder on some pads, unless you mask off the stencil.

#

I first did that TSOP thing probably around 2005, I think?

#

I've never owned an Xbox modchip - I only ever did the TSOP, and then the Font exploit.

#

And it was always fine, for what I used.

#

Easy to load ISOs on the HDD, and use XBMC etc.

#

The OG Xbox was my main media player, until about 2012. lol

#

Soldering really is mostly about prep, using (extra) flux where needed, and a half-decent soldering station.

#

I pretty much only need to use a small chisel tip for most stuff, on the Metcal.

#

It's like a 1.78mm wide chisel tip, or something.

#

And 0.5mm diam solder. Nothing ever larger than that.

#

And a lot of people think they can use a high-power "60 Watt" iron, and it'll be fine, but it's often not.

#

Especially the cheapo mains irons I started with, which got wayyyy too hot, to the point where the tip was almost glowing red.

#

And they would char up instantly. That's no good.

#

And they had the big screw holding the tip in, which would melt connectors etc. lol

edgy pilot
#

Yeah. I think I had some rubbish flux, a bargain bin £10 eBay special soldering iron and just no skill.

My plan is to learn on boards you can buy from AliExpress, then after I feel confident I would buy a few cheap sports cartridge based games that use a battery for saving. (As I'd mostly be replacing batteries). Then start replacing them. I'll probably pick up the Pinecil V2 soldering iron. Looks good

rain obsidian
#

Definitely best learning on junk boards you don't care too much about.

#

For a lot of SMD component removal, I use the hot air station.

#

Just have to be patient with it, and don't set it much higher than say 350C.

#

(I have to set mine fairly high, as I don't think the temp display is very accurate.)

#

And don't try to prise off chips etc. Unless they have a small spot of glue underneath (Saturn SDRAM, for example), they should lift off with almost no effort, once the solder on all pins is molten.

edgy pilot
#

Yeah, sweet. Thank you for your tips. Much appreciated

rain obsidian
#

The KSGER soldering stations look fine, for the money. I almost bought one the other month, but there is a new version now, and I wasn't sure which type (brand) of tips people tend to go for now.

#

I only ever switch to a larger tip, for really heavy-duty wire soldering, or large ground planes / heatsinks.

#

(on the Metcal, I mean)

#

But the Metcal is getting very old now. The PSU on my previous one already failed years ago, and is still in pieces. lol

#

I'm sure the KSGER T12 (or newer) would do fine.

#

But, ironically, the KSGER is already a kind of clone of other stuff, but there are clones-of-clones.

#

ie. versions which look similar, but with not-so-good build quality, especially in the PSU.

#

Some don't even hook up Mains earth to the tip.

#

(which isn't great for ESD, but it does help prevent shorting stuff, if you forget to turn the power off to the board. lol)

#

(should obv never do that anyway)

rain obsidian
#

If it's been powered off / unplugged from the Mains for quite some time, it should be safe enough to take apart.

#

Take the few screws out of the PSU, don't lose the plastic insulator sheet underneath it.

#

Clean the pins that poke up from the motherboard, using IPA, or contact cleaner.

#

tbh, I generally even scrape the pins on all sides a bit, then clean with IPA.

#

But you need to be careful not to allow any metal shavings onto any of the PCBs.

edgy pilot
#

Sweet thank you for the suggestion. Hopefully I can fix it as it's been region free modded to play my Japanese Resident Evil imports.

rain obsidian
#

Then, when you plug the PSU back on (don't forget the plastic sheet underneath. lol) - kind of shove the PSU on and off the pins a few times, then put the screws back.

#

Oh, and if you need to remove the motherboard for any reason, be aware that some of the screws are longer than the others.

#

Specifically the ones that hold the metal part of the GD drive to the top metal shield of the mobo

#

Else, this can happen. See if you can spot it. lol

#

Hint: Just below the RGB/AV port, and just to the left of the SH4 CPU.

#

I feel quite motivated tonight, to tidy the rooms.

#

But I can't, because it's nearly midnight. sigh

#

#sleepingpatterns

#

I bought a CPAP machine years ago, which I'm sure would help, but I never got on with it.

#

It's probably how a dog feels, when it hangs its head out of a car window, with the car doing 70 MPH.

#

Air being forced down you when you try to breathe out. lol

#

And yes, I am a bit bored tonight.

#

Need to quietly move a few bits around, like the two or three DVD players I bought recently.

#

With a view to getting rid / reselling them.

rich kindle
rain obsidian
#

It's a long story. lol

rain obsidian
#

I do think it's possible to read the High Density track on a GD disk.

#

Using the Marantz DVD player mech I bought a few months back.

#

But as usual, I didn't quite get to that point, of hooking up the serial connections, and FPGA.

#

If I can just do a proof-of-concept for that, I'll finish the PCB design for the "new" CD/GD/DVD drive for the DC.

#

Unfortunately, it has to use some fairly old chips, but probably still quite a lot of new-old stock available.

#

Could eventually lead to a Dreamcast that can play DVDs, too.

#

(and read DVDs, for larger homebrew, like DCA III.)

late cargo
dense shard
rain obsidian
#

Or I think to select it from the Settings menu?

#

Japanese Cake BIOS, I think?

nocturne cave
#

I keep updating ScummVM, unfortunately I don't think there is a way to get rid of screen tearing without modifying the framebuffer to support double-buffering.

rain obsidian
#

I did wonder about that again recently.

#

The PVR "core" thing can actually sync the emu to Vsync, sort of...

#

When the emu writes to the framebuffer in DDR3, I check the last few bytes on the FPGA side.

#

That triggers the FPGA to start rendering a frame, then I write some bytes back to DDR3, which tells the emu that the frame is done.

#

It's possible something like that could work for ScummVM and other stuff.

#

But you'd have to be fairly sure the emu can always run fast enough to render frames > 60 FPS maybe?

#
    // Write the magic number to the evil PVR register. ;)
    //
    // parameter TEST_SELECT_addr = 16'h0018; // RW  Test - writing this register is prohibited.
    //
    pvr_regs[ 0x18 ] = 0xCA;
    pvr_regs[ 0x19 ] = 0xFE;
    pvr_regs[ 0x1a ] = 0xBA;
    pvr_regs[ 0x1b ] = 0xBE;
    
    // Copy the PVR regs directly ABOVE the 8MB VRAM (in DDR3).
    memcpy(vram+offs_8meg, pvr_regs, pvr_RegSize);
    
    // Trigger the PVR reg copy on the core, then render the frame...
    // (the core should then clear these to 0x00, after the frame is rendered.)
    vram[ (offs_8meg-8)+0 ] = 0xCA;
    vram[ (offs_8meg-8)+1 ] = 0xFE;
    vram[ (offs_8meg-8)+2 ] = 0xBA;
    vram[ (offs_8meg-8)+3 ] = 0xBE;
    vram[ (offs_8meg-8)+4 ] = 0xCA;
    vram[ (offs_8meg-8)+5 ] = 0xFE;
    vram[ (offs_8meg-8)+6 ] = 0xBA;
    vram[ (offs_8meg-8)+7 ] = 0xBE;
#
void rend_end_render() {
    // wait for render to end
    // interrupts get fired automatically
    
    while ( emu_vram[ (offs_8meg-8)+0 ] != 0x00) {}
    
    FrameCount++;
    //printf("rend_end_render\n");
}
#

It's quite rough, but it works. Obviously the FPGA renders WAYYYYY slower than the emu can run atm, though.
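
#

A slightly more defensive sketch of the same handshake, for reference. It assumes the trigger bytes sit in the last 8 bytes below the 8MB mark (as above) and that the shared DDR3 is accessed through a `volatile` pointer, so the compiler has to re-read the byte on every poll. The names `frame_trigger`, `frame_wait_done` and `max_polls` are made up for this example and aren't from the real emu code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_TRIGGER_LEN 8u

/* Hypothetical helper: write CA FE BA BE twice into the 8 bytes just
 * below offs_8meg, which is what the core watches for. */
void frame_trigger(volatile uint8_t *vram, size_t offs_8meg)
{
    static const uint8_t magic[4] = { 0xCA, 0xFE, 0xBA, 0xBE };
    assert(offs_8meg >= FRAME_TRIGGER_LEN);
    for (size_t i = 0; i < FRAME_TRIGGER_LEN; i++)
        vram[(offs_8meg - FRAME_TRIGGER_LEN) + i] = magic[i % 4];
}

/* Poll the first trigger byte until the core clears it to 0x00.
 * Returns true if the frame finished, false if max_polls ran out,
 * so a hung core can't lock up the emu forever. */
bool frame_wait_done(const volatile uint8_t *vram, size_t offs_8meg,
                     unsigned long max_polls)
{
    while (max_polls--) {
        if (vram[offs_8meg - FRAME_TRIGGER_LEN] == 0x00)
            return true;
    }
    return false;
}
```

The timeout is the only real change: the original loop spins forever if the core never clears the bytes.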

lapis bison
#

At first I was quite excited about ScummVM on MiSTer, because I thought the upscaler would be used.

Is it possible to have software running on the ARM side which uses the FPGA scaler?

rain obsidian
#

ScummVM technically kind of already did/does use the ASCAL scaler.

#

ASCAL can be set to display a framebuffer in DDR3.

#

And the DDR3 is shared between the ARM Linux and FPGA side.

#

So Scumm running on the Linux side just renders frames to an area of DDR3, and ASCAL displays that.

#

So ASCAL can do the upscaling / downscaling stuff.

#

It's just they are not in-sync, so you end up with screen tearing, unfortunately.

#

I do think it's fixable, but I don't think I'll personally have time to take a look.

#

I'd also have to find the code in ScummVM which does the final rendering, and see if it could be made to wait for the FPGA/ASCAL to finish reading the previous frame.
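
#

Roughly what that wait could look like, as a sketch only. It assumes two framebuffers in DDR3 and some flag the display side sets once it has finished reading a whole frame. How (or whether) ASCAL can signal that isn't covered here, so `scanout_done`, `flip_state` and `try_flip` are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state shared between the renderer and the scaler. */
typedef struct {
    int  front;         /* buffer index the scaler is currently reading */
    bool scanout_done;  /* set by the display side after a full frame was read */
} flip_state;

/* Index of the buffer the renderer may safely draw into. */
int back_buffer(const flip_state *s)
{
    assert(s->front == 0 || s->front == 1);
    return s->front ^ 1;
}

/* Try to present the back buffer. Only succeeds once the display has
 * finished reading the current front buffer, so the swap never lands
 * halfway through a frame (which is what shows up as tearing). */
bool try_flip(flip_state *s)
{
    if (!s->scanout_done)
        return false;          /* caller waits and retries */
    s->front ^= 1;
    s->scanout_done = false;   /* the new front buffer hasn't been read yet */
    return true;
}
```

The renderer would draw into `back_buffer()`, then call `try_flip()` until it returns true before starting the next frame.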

nocturne cave
lapis bison
#

So is it possible to have the same filters as the cores?

rich kindle
#

The channel is way too empty without Ash 👾

ripe stump
#

I hope ash is doing ok. ❤️

golden cradle
#

Is he just taking a break or did something happen?

ripe stump
#

Dunno

lime mango
#

probably fine. he works in bursts of extreme inspiration

golden cradle
#

That seems like his MO. 😎

rich kindle
#

@rain obsidian , grabulosaure showed git activity recently (IntV branch) thus maybe he is back from AWOL

rain obsidian
#

Not doing too great, tbh.

#

I had another bout of what seems to be the same kind of low folate thing.

#

Leg weakness, shortness of breath, brain fog, blurry vision, dizziness.

#

Not half as bad as a few years back, but still not fun.

#

I had to start on Folic Acid again, Iron, and multivits, etc.

#

Gradually getting better, I think. But it's hard to think.

#

This all started happening after either the second vax shot, or after covid itself.

#

AFAIK, I might have only had covid once, which was at Download fest in 2021, about six months after the vax.

#

Trying not to make this a "political" thing, but I'm not sure what to think any more about any of it.

#

I lost my eldest brother to Covid in December 2021, which has obviously also had a very negative impact on all of us.

#

And no, he didn't have the vax. He refused it.

#

And I feel like I still can't freely talk about this online, else certain people will say "told you so" about him not having the vax, or "told you so" about me (and the three other people in this household, and most of the rest of the family) having it.

#

And I still have the trip to Thailand in just over a week, which I'm just about well enough to go on.

#

So I've just been trying my best to not let the depression drag me down, trying to drink more water, and eat a bit better.

#

Still not exercising, which is a huge problem. (EDIT: Quite hard to do, when you have problems with going outside, and when your legs have been buckling under you for weeks.)

#

I've been like this, pretty much since about four months ago, when I had a sudden burst of energy, and got about five PCB projects sent off at once.

#

I haven't built more than one of them since.

#

Could even be due to this, as I had Hypoglycemia as a kid, and have Asperger's etc.

#

Just interesting to me that I never had anything quite like this, until after the vax / covid era.

#

A lot of the symptoms feel VERY similar to what I had a few weeks after the second vax. Coincidence? We'll probably never know.

#

Might even be triggered as part of long covid.

#

I'm not back from Thailand until the middle of April, if I survive. lol

#

So will be even quieter until then

final oar
#

Hang in there Ash, Covid was f'd up. I lost my grandad to it. Not everyone gets political about it, screw people who do IMO 🙂

twin crest
#

I really hope things get better for you Ash, I lost family to covid and a close friend now has long covid. Don’t need to get political to say that covid times were pretty awful.

wispy vault
#

Get well soon Ash, we appreciate what you do here but your health matters more!

golden cradle
#

Take care and do what you gotta for you. 🙏 Things will get better and of course, Dreamcast some day too 😎😅

ripe stump
#

take care of yourself, ash. ❤️

dense shard
#

@rain obsidian I joke around but I love your work ethic and posts. Looking forward to seeing them again when you feel better. ❤️

signal scaffold
#

Hope you feel better man

edgy pilot
#

Hope you get better soon mate 🙂

late cargo
#

Sending good vibes to ElectronAsh.

ivory gate
#

I hope you have a good trip and start feeling better soon, Ash!

halcyon creek
#

Aye, take it easy mate, rest as much as you can, get well soon!

ripe stump
#

welcome back, buddy

vocal vine
#

missed you my guy, hope you’re feeling better

late cargo
#

You're back! We missed you! Hope you got the rest you needed. 🫂

steady grail
#

Welcome back @rain obsidian! Hope you're doing well and health is good! ✌️

austere shuttle
#

How was Thailand?

rain obsidian
#

Thanks, all.

rain obsidian
#

But, also, after the first full day of rest, I was helping my brother on the allotment plots again, dismantling and moving a greenhouse.

#

Songkran was... nuts.

#

But in a good way.

#

Very crowded at some points, but we really got into it on the second day in Bangkok.

#

The first day of Songkran, we were actually in Lamai, on Samui.

#

It did really help with my mental health, to get me out of a rut.

#

But I need to work on my diet and exercise now, especially to get rid of the belly.

#

I think that was the street parallel to Khaosan Road?

#

Hard to remember exactly where we were, as we stayed in four different hotels.

#

Bangkok for a few days, then Korat, visited Isaan, then back to Korat for 6 days.

#

Then down to Pattaya via coach.

#

Then back to Bangkok, to get a flight to Samui

#

We missed the earthquake by about four days.

#

We were in Pizza Hut in "The Mall" in Korat when it happened.

#

My sis-in-law said she thought the building was moving, then we immediately noticed it swaying slightly, and everyone was dizzy, including the staff.

#

Checked our phones, and saw the quake in Myanmar and Bangkok. Couldn't believe it.

#

When we stayed in Bangkok again for the last few days, the lintel in my room had a giant crack along it.

#

Very glad we missed the main quake, tbh.

#

Somehow I didn't have a repeat of the leg weakness etc. when I was away. Maybe just due to moving around every day, and taking Vit B complex.

#

And I was drinking at least 7 bottles of Leo beer almost every day for a month. lol

#

Only added a few cocktails to that on certain nights, not too often. Only really got properly drunk about three times.

#

We went via Air China for the main flights, via Beijing.

#

Flights and staff were fine, except the first Beijing to Bangkok leg, where the air con wasn't working at ALL, in the middle part of the plane.

#

It was genuinely one of THE worst experiences of my entire life. lol

#

Couldn't breathe for hours, I was passing out, lips tingling etc.

#

Bloody plane just would NOT go below the clouds for hours. I had my first real panic attack in the airport.

#

Aside from that, it was just the usual travelling thing of not getting enough sleep before flights, then not being able to sleep on the plane.

#

(crappy 737 for that short leg, but they thankfully fixed the AC for the flight back to Beijing. Then it was a much newer A350 back to London.)

#

Anyway, Dreamcast...

#

Not sure when I'll get back to working on it, but I do intend to at some point.

#

To make any real progress, I'd ideally need to collab with somebody.

strange trench
#

Is Dreamcast core even a possibility? Or do you mean for a potential successor to the de 10 nano?

rich kindle
#

I guess that this experiment would rather feature a hybrid approach. High level emulation on the arm side and a low level emulation of the powerVR on the FPGA side.

#

But it is not even guaranteed that the MiSTer can simulate the PowerVR alone.

#

But it is an experiment that nobody dared before. The outcome is unknown.

signal scaffold
#

I think it's more just laying the groundwork for a next gen Mister

rich kindle
#

After all it is open source. If something can get working on the MiSTer, then good. If next gen consoles profit from the work, then even better.

valid idol
low widget
frail raft
#

It would need to be bridged with a raspberry pi or something over the GPIO to do a lot of the file stuff we offload to the ARM side. I don't know if all the extra logic would be as helpful as components to help with 3d processing.

lime mango
#

the memory situation is pretty big downgrade from current mister

frail raft
lime mango
#

not relevant for the atum a3 nano

frail raft
#

It's not specifically listed in the compatibility list but that might be because it's too new. It was listed under compatible attachments on the Atum 3 page

lime mango
#

the atum3 nano doesn't have an fmc connector

dense shard
#

Omg everyone is posting that everywhere

frail raft
#

Terasic spam works lmao

dense shard
#

lol true

maiden granite
#

The agilex 3 is kinda the cheaper budget line compared to the agilex 5 which will be the cyclone v successor more or less.

lime mango
#

I predict both the price and feature set of the "de25-nano" are going to be disappointing

maiden granite
#

Same. Especially for us yanks due to tariffs.

lime mango
#

there's also the even more disappointing possibility that this is what they consider the de10-nano equivalent

maiden granite
#

Nah

#

Doesn't even have hps

#

The de10-nano having hps and ddr3 positioned it very strategically.

lime mango
#

interestingly the agilex 7 dev board has hdmi 2.1 output

frail raft
#

Perhaps the solution is to come up with the specs we need and the next time and every time Terasic sends out spam we have everyone reply with that spec list.

lime mango
#

they already know

frail raft
#

They need a reminder. Every time they contact us about a board we can't use, we contact them about what we want. As much as you don't like having this flood of responses on the discord, they will not abide a flood of emails.

dense shard
#

Please no mail campaigns. They’re aware of us of course and are happy to sell de10-nanos to more people, but we are not their target market in the slightest. They have project goals and priorities that don’t necessarily line up with what the MiSTer project needs, and that’s ok.

lime mango
#

and they've directly spoken to developers

dense shard
#

Alexey (Sorg), the MiSTer project lead, is already involved and has written a technical review of the de25-standard to help influence its development.

https://www.terasic.com.tw/cgi-bin/page/archive.pl?Language=English&CategoryNo=&No=1365#contents

#

lol everyone in that lineup are department heads and professors of highly regarded universities. And then it’s Sorg.

#

Who’s the best dressed out of all of them lol

rain obsidian
#

That is by far the sharpest suit.

rain obsidian
#

LOL

#

I really want to work on DC some more, but a bit stuck again.

#

I also really want to see how the uber devs organize a project.

#

eg. Do they still keep paper notes and diagrams, for example?

#

I also need to hook up a DC mobo to the new o'scope, to see the basic VRAM timings.

#

I think that could give some good info on the internal state machines of PVR2.

#

ie. how much it pre-fetches (on average) for textures and vertex params.

#

How often it fetches the framebuffer for displaying via the DAC, and what the burst lengths are.

#

That could give me more pointers on how to structure the core stuffs.

#

OK, so writing to the tag buffer is about as fast as it's going to get.

#

Where it processes 32 "pixels" worth at once (a whole tile row).

#

It also writes the params for each Tag value to the param buffer.

#

It's the texturing part which is slow atm, due to the DDR / SDRAM latency.

#

So I need to separate the TSP stuff next.

#

Then figure out exactly which params need to be shoved into the TSP.

#

The "Texture Address" module contains most of the TSP stuff really, but it currently resides within the ISP Parser file / module.

signal scaffold
#

You're doing God's work sir

rain obsidian
#
Param Buffer stores these values atm, per Tag...

ISP Instruction Word.
TSP Instruction Word.
TCW (Texture Control Word).
X,Y,U0,V0,Base Colour,Offset Colour, for Verts A,B,C.
vocal vine
rain obsidian
#

During the texturing phase, the outputs from the param buffer go via the Z,U,V interp modules, UV clamp, then to the Texture Address module.

#

Most of that just to generate the UV coords for the texture look-up.

#

The texture address module does the VRAM address generation, then decodes the colour info from that data.

#

(Texture Address module also contains the palette, Codebook cache, and final colour blending logic.)

#

Wondering how much of that module could be done using LUTs.

#

ie. small "ROMs" for various combinations of UV input bits.

#

This would be one of the first candidates...

#
    // NOTE: Need to add 3 to tex_u_size in all of these LUTs, because the mipmap table starts at a 1x1 texture size, but tex_u_size==0 is the 8x8 texture size.
    case (tex_u_size+3)
        0:  mipmap_byte_offs_norm <= 20'h6;        // 1 texel
        1:  mipmap_byte_offs_norm <= 20'h8;        // 2 texels
        2:  mipmap_byte_offs_norm <= 20'h10;     // 4 texels
        3:  mipmap_byte_offs_norm <= norm_offs_1024[05:0];    //    20'h30;     // 8 texels
        4:  mipmap_byte_offs_norm <= norm_offs_1024[07:0];    //    20'hb0;     // 16 texels
        5:  mipmap_byte_offs_norm <= norm_offs_1024[09:0];    //   20'h2b0;     // 32 texels
        6:  mipmap_byte_offs_norm <= norm_offs_1024[11:0];    //   20'hab0;     // 64 texels
        7:  mipmap_byte_offs_norm <= norm_offs_1024[13:0];    //  20'h2ab0;     // 128 texels
        8:  mipmap_byte_offs_norm <= norm_offs_1024[15:0];    //  20'haab0;     // 256 texels
        9:  mipmap_byte_offs_norm <= norm_offs_1024[17:0];    // 20'h2aab0;     // 512 texels
        10: mipmap_byte_offs_norm <= norm_offs_1024[19:0];    // 20'haaab0;     // 1024 texels
    endcase
    
    // mipmap table mux (or zero offset, for non-mipmap)...
    mipmap_byte_offs <= (!is_mipmap) ? 0 :
                          (vq_comp) ? (mipmap_byte_offs_norm>>3) :    // Note: The mipmap byte offset table for VQ textures is just mipmap_byte_offs_norm[]>>3.
             (is_pal4 | is_pal8) ? (mipmap_byte_offs_norm>>1) :    // Note: The mipmap byte offset table for PAL4 or PAL8 is just mipmap_byte_offs_norm[]>>1.
                                                  mipmap_byte_offs_norm;
#

It took me a very long time to simplify that. lol

#

It used to be three separate case blocks.

#
wire [19:0] norm_offs_1024 = 20'haaab0;

reg [19:0] mipmap_byte_offs_norm;
//reg [19:0] mipmap_byte_offs_vq;    // The VQ mipmap offset table is just norm[]>>3, so I ditched the table.
//reg [19:0] mipmap_byte_offs_pal;    // The palette mipmap offset table is just norm[]>>1, so I ditched the table.
#

Could replace the norm_offs_1024[05:0] stuff with the constant value.

#

I guess it would already be inferred as a small "ROM" block, tbh. Just in registers atm.
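
#

A small C model of that table, to show the bit-slice trick. It only assumes what is in the Verilog above: the three smallest entries are special cases, and every other entry is the low `2*level` bits of `0xAAAB0`. The function name and the `level` parameter (meaning `tex_u_size + 3`) are just for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NORM_OFFS_1024 0xAAAB0u  /* same constant as norm_offs_1024 */

/* level: 0 = 1x1 texture ... 10 = 1024x1024 (i.e. tex_u_size + 3). */
uint32_t mipmap_offs_norm(unsigned level)
{
    assert(level <= 10);
    switch (level) {
    case 0:  return 0x6;
    case 1:  return 0x8;
    case 2:  return 0x10;
    default:
        /* Keep the low 2*level bits, like norm_offs_1024[2*level-1:0]. */
        return NORM_OFFS_1024 & ((1u << (2 * level)) - 1u);
    }
}
```

For example, `mipmap_offs_norm(4)` gives `0xB0` and `mipmap_offs_norm(7)` gives `0x2AB0`, matching the commented-out constants in the case block.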

#

The type of mipmap offset "table" is chosen based on whether the texture is VQ-compressed, or uses a palette, or is uncompressed.

#

So that could for sure be combined into a small ROM / LUT.

#

(mipmapping isn't implemented at all yet, but I still need to calc the start offset for the texture data.)

#

If a texture is VQ-compressed, it also adds an offset of 2,048, since it stores the Codebook before the actual texture data.

ripe stump
#

Oh shit, he’s back at it

vocal vine
#

and we love him for it

rain obsidian
#

lol

#

Not quite at the level I was months ago, but I'll keep chipping away at it when I can.

#

It was a huge boost, when skmp got the emu working on the ARM side, for sure.

#

Currently trying to further simplify / combine the logic here.

#

Because every time it does a <= assign, it's adding latency to this module.

#

Which will be making the renders look more crappy atm.

#

If you stare at the code for long enough, you can start to see more patterns for combining things.

#

For example, I could probably get rid of the extra shifts in some places, like ((12'd2048 + mipmap_byte_offs)<<2).

#

Need to try combining these.

#
    // mipmap table mux.
    mipmap_byte_offs <= (!is_mipmap) ? 0 :                            // Non-Mipmapped. Zero offset.
                           (vq_comp) ? (mipmap_byte_offs_norm>>3) :    // Note: The mipmap byte offset table for VQ textures is just mipmap_byte_offs_norm[]>>3.
                 (is_pal4 | is_pal8) ? (mipmap_byte_offs_norm>>1) :    // Note: The mipmap byte offset table for PAL4 or PAL8 is just mipmap_byte_offs_norm[]>>1.
                                            mipmap_byte_offs_norm;    // Uncompressed?
    
    // Twiddled or Non-Twiddled.
    twop_or_not <= (vq_comp) ? ((12'd2048 + mipmap_byte_offs)<<2) + twop :
          (is_pal4 || is_pal8 || is_twid) ? (mipmap_byte_offs>>1) + twop :    // I haven't figured out why this needs the >>1 yet. Oh well.
                                         mipmap_byte_offs + non_twid_addr;
#

If the texture is non-mipmapped, that overrides the texture start offset, and just starts at 0.

#

If it's VQ-compressed (which apparently always uses mipmaps?), then add the offset from the table, with a shift.

#

Else, 4BPP or 8BPP paletted, so different shift.

#

Else, uncompressed, so no shift.

#

The actual clamped U/V coords get shoved into the "twop" calc.

#

"twop_or_not" gets shifted yet again, to generate the final texture look-up address...

#
    // Shift twop_or_not, based on the number of nibbles, bytes, or words to read from each 64-bit vram_din word.
    texel_word_offs <= (vq_comp) ? (twop_or_not)>>5 : // VQ = 32 TEXELS per 64-bit VRAM word. (1 BYTE per FOUR Texels).
                       (is_pal4) ? (twop_or_not)>>4 : // PAL4   = 16 TEXELS per 64-bit word. (4BPP).
                       (is_pal8) ? (twop_or_not)>>3 : // PAL8   = 8  TEXELS per 64-bit word. (8BPP).
                                   (twop_or_not)>>2;  // Uncomp = 4  TEXELS per 64-bit word (16BPP).
#

I feel like a lot of this could be combined into one LUT. lol
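
#

One way the per-format shifts could collapse into a single table, sketched in C. The shift amounts are taken straight from the two muxes above (the mipmap table shift and the texels-per-64-bit-word shift). The enum, struct and function names are invented for the example, and the twop / codebook offset parts are left out.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { FMT_UNCOMP16 = 0, FMT_PAL8, FMT_PAL4, FMT_VQ, FMT_COUNT } tex_fmt;

typedef struct {
    uint8_t mip_shift;   /* shift applied to mipmap_byte_offs_norm */
    uint8_t word_shift;  /* shift from texel index to 64-bit word index */
} fmt_shifts;

/* One small "ROM": a row per texture format. */
static const fmt_shifts FMT_LUT[FMT_COUNT] = {
    [FMT_UNCOMP16] = { 0, 2 },  /* 4 texels per 64-bit word  */
    [FMT_PAL8]     = { 1, 3 },  /* 8 texels per 64-bit word  */
    [FMT_PAL4]     = { 1, 4 },  /* 16 texels per 64-bit word */
    [FMT_VQ]       = { 3, 5 },  /* 32 texels per 64-bit word */
};

/* Which 64-bit VRAM word a given texel index falls into. */
uint32_t texel_word_offs(tex_fmt fmt, uint32_t texel_index)
{
    assert(fmt < FMT_COUNT);
    return texel_index >> FMT_LUT[fmt].word_shift;
}

/* Mipmap start offset for this format, from the "norm" table value. */
uint32_t mip_offs_for_fmt(tex_fmt fmt, uint32_t norm_offs)
{
    assert(fmt < FMT_COUNT);
    return norm_offs >> FMT_LUT[fmt].mip_shift;
}
```

In Verilog the same idea would be one case on the format bits that drives both shift amounts, instead of two separate priority muxes.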

#

I'll try to combine the first block of code/logic first. I'll have to stare at it for a few hours.

#

Not quite right.

#

Oh, apparently you can have non mip-mapped VQ textures.

#

Or something.

#

Old code...

rain obsidian
#

Ahh, it was due to 16-bit uncompressed textures for the Daytona logo etc.

#

But it can use a twiddled or non-twiddled texture address.

rain obsidian
#

Took me all that time, just to ditch the "twop_or_not" logic.

#

But I still managed to break some of the textures more.

#

Wall texture is very rough now.

late cargo
#

We are so back! 🤘

rain obsidian
#

Not quite at full steam. lol

#

I worked on it for quite a few hours last night, and got hardly anywhere.

#

I couldn't fix the issue with the wall textures (VQ + mipmap) without reverting to the older code.

#

I'm once again reaching the limits of my knowledge. I'm happy to get this far, as I didn't think I'd even manage to render anything at first.

#

It could be doable to get the speed of PVR fast enough to be playable, if the emulator part on ARM was improved as well.

#

But it might never hit "full speed" on the DE10, even with the hybrid emu approach.

forest echo
#

Something simpler that you may be interested in as I know you like Sega and midi, is Saturn had a midi adaptor and a couple of titles supported plugging in a midi keyboard to use that could potentially be supported in the core with usb midi keyboard (Main already supports midi keyboards etc.)

https://github.com/MiSTer-devel/Saturn_MiSTer/issues/385

#

I am also looking for someone who can read Mame code and comments to figure out what chips are actually inside the Casio Loopy that are used for playing games (and not the printer parts) as the main CPU is an SH1 and I have a suspicion the other parts may not be as complex as previously thought. Interestingly the console is described as basically being a Casio keyboard under the hood, which if true is quite funny. Would be good to figure out what is actually in this thing. 🙂

https://discord.com/channels/647909397477195803/1363955274020290621

valid idol
valid idol
#

Wait, I wonder what will happen to the VMU here? Like, will it be able to be seen with DS emulator-like settings on positioning it, or will it just be not visible at all?

#

Also would SNAC be able to handle the data load of VMU communications for a Dreamcast controller?

#

Prolly none of these questions matter, given that you aren’t even hoping to get more than what sounds like the GPU on the FPGA, and aren’t hoping for the core to run at beyond half speed.

rain obsidian
#

twop, I believe, just means "Twiddle Operation".

#

Twiddling is the swapping of certain bits. In this case, it's the texture address.

#

They interleave the bits of the U and V coord values, which changes how the texture is accessed from VRAM.

#

It makes the texture data more localized, even when a texture is rotated on-screen.

#

ie. It makes the accesses more contiguous, so it doesn't have to skip so many lines of texture in VRAM.

#

This helps keep the data within the same VRAM "page", which in turn helps with burst transfers.

#

(if you go beyond the current VRAM page, it incurs extra clock cycles, to activate the next page / row.)
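
#

The bit interleave itself is simple enough to sketch in C. This assumes a square texture up to 1024x1024, and it assumes the V bits land in the even bit positions with the U bits in the odd ones. That ordering is my assumption for the example, so check it against the real twop logic before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* Spread the low 10 bits of v so that bit i moves to bit 2*i. */
uint32_t spread_bits10(uint32_t v)
{
    uint32_t out = 0;
    for (unsigned i = 0; i < 10; i++)
        out |= ((v >> i) & 1u) << (2 * i);
    return out;
}

/* Twiddled texel index: U and V bits interleaved.
 * Assumption: V goes to the even bits, U to the odd bits. */
uint32_t twiddle(uint32_t u, uint32_t v)
{
    assert(u < 1024 && v < 1024);
    return spread_bits10(v) | (spread_bits10(u) << 1);
}
```

With this ordering, the four texels of any aligned 2x2 block end up at four consecutive indices, and the same holds for 4x4, 8x8 and so on, which is why neighbouring texels tend to stay in the same VRAM page.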

#

I've been wanting for ages, to hook up a real DC to the o'scope, to check the basic VRAM timings for stuff.

#

But I haven't got my brain back into "fine soldering" mode yet. lol

#

tbh, things like the VMU are wayyyyy secondary to everything else atm.

#

But the VMU could just be added as an on-screen overlay later.

#

Or, real DC joypads would be hooked up via a simple SNAC adapter. It's only a few pins.

#

I'm still trying to figure out an efficient-ish way to do the Tag sorting, before the info gets sent to the TSP for texturing.

#

But it looks like any way I choose is going to use a lot of FPGA logic.

#

It's quite a tricky problem, and I'm not used to that kind of thing with coding.

#

The Tag buffer just represents which primitives (polygons) exist within the current tile.

#

And obviously which pixels of the tile each polygon covers.

#

You can't really send any of that info to the texturing unit until ALL of the polygons within the tile have been processed.

#

Since that is how it does the HSR (Hidden Surface Removal).

#

It does process 32 "pixels" in parallel, though, so gets a nice speed boost from that.

#

The problem is, to make VRAM access efficient, it needs to group together all of the same Tag values for texturing, so it doesn't have to keep re-reading the VQ Codebook or palette, nor vertex info from VRAM.

#

Quite a hard thing to do, if you think about it. Especially trying to do it without too much logic.

#

I already generate a bitmask for each new polygon written to the Tag buffer.

#

That bitmask denotes which pixels of each tile row (part of) the polygon is visible in.

#

For each new incoming polygon, I would have to write a new bitmask, but also negate the bitmasks of any previous polygon Tags that get overwritten.

#

That would mean updating (up to 32) 32-bit bitmasks at once.

#

Which would really need to be done in parallel as well.

#

I guess that wouldn't be too bad, if it was only working on one row at a time.

#

Even then, once the whole tile is processed (in the Tag buffer), the TSP would still need to increment through each Tag + Bitmask, to do the texturing.
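
#

A software model of that per-row bitmask update, as a sketch. It assumes 32 slots per tile row, each holding a Tag and a 32-bit mask, with a mask of 0 meaning the slot is free. The struct and function names are hypothetical, and the loop stands in for what would be 32 parallel AND gates in the FPGA.

```c
#include <assert.h>
#include <stdint.h>

#define ROW_SLOTS 32  /* at most 32 distinct Tags can be visible in a 32-pixel row */

typedef struct {
    uint16_t tag[ROW_SLOTS];   /* Tag value held in each slot */
    uint32_t mask[ROW_SLOTS];  /* bit x set = pixel x shows this Tag; 0 = slot free */
} row_masks;

/* Record a new visible span: clear its pixels from every older slot,
 * then store the new Tag in the first free slot.
 * Returns the slot used, or -1 if the span covers no pixels. */
int row_add_span(row_masks *r, uint16_t tag, uint32_t new_mask)
{
    int free_slot = -1;
    if (new_mask == 0)
        return -1;
    for (int i = 0; i < ROW_SLOTS; i++) {
        r->mask[i] &= ~new_mask;        /* all slots at once in hardware */
        if (r->mask[i] == 0 && free_slot < 0)
            free_slot = i;
    }
    /* Masks never overlap, so after the clear at most 31 slots are in use. */
    assert(free_slot >= 0);
    r->tag[free_slot]  = tag;
    r->mask[free_slot] = new_mask;
    return free_slot;
}
```

A span that gets completely overwritten ends up with a mask of 0, so its slot is reused automatically by the next incoming span.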

signal scaffold
rain obsidian
#

No easy way to skip to active bits in the bitmask either, unless it uses an RLE scheme.

#

Yes, that's true. You could overlay stuff outside of the 4:3 image.

#

That would only really work on the HDMI output though.

#

Unless you output to the CRT as anamorphic or something. lol

signal scaffold
#

Yeah CRT would be tricky.

You'd probably just have to overlay it in a corner of the screen

#

If you allow for the option of positioning it wherever you'd like, on a case by case basis for CRT gamers they could position it in the best spot possible

rain obsidian
#

I think the only real way to solve this Tag sorting problem, is to use a kind of linked list.

#

Since it won't always have 32 different Tags per row.

#

But can have up to 32.

#

It would need a way to denote how many of the 32 entries are active.

#

If it's only say 5 unique tags in the current row, it can say "OK, we're done for this row", then move to the next.

#

Or, screw all of that, and just hope a proper texture cache can keep the speed high enough. lol

#

Random wireless light gun photo, FTW.

#

Using the "AliExpress Special" light gun board.

pseudo tinsel
#

oh hi @rain obsidian

#

I don't implement tag sorting myself

#

instead i have a larger fpu cache

rain obsidian
#

Yeah, it's because I can only read one 64-bit word of texture per clock tick.

#

Even at 100 MHz, if it spends too long re-reading the Codebook or params, the overheads are too great.

#

Fastest it can render a 640x480 image at 100 MHz, texturing only one pixel at a time...

#

640 x 480 = 307,200 pixels

#

307,200 pixels x 10 ns (one clock at 100 MHz) = 3,072,000 ns = 3.072 milliseconds.

#

I guess that's 325 FPS. lol
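
#

The same back-of-envelope sum as a tiny C function, so the cycles-per-pixel figure can be varied. It assumes a fixed cost per pixel and no stalls at all, so it's strictly a best case. `best_case_fps` and its parameters are names made up here.

```c
#include <assert.h>

/* Best-case frames per second when every pixel costs a fixed number of
 * clock cycles and nothing else stalls the renderer. */
double best_case_fps(unsigned width, unsigned height,
                     double clock_hz, double cycles_per_pixel)
{
    assert(width > 0 && height > 0);
    assert(clock_hz > 0.0 && cycles_per_pixel > 0.0);
    double frame_seconds = (double)width * height * cycles_per_pixel / clock_hz;
    return 1.0 / frame_seconds;
}
```

`best_case_fps(640, 480, 100e6, 1.0)` comes out at about 325.5. If each pixel took 4 cycles instead (a purely illustrative number), it would drop to about 81.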

#

But yeah, it's tricky to get decent frame rates without Tag sorting.

#

I've tried to think of all kinds of ways to approach it.

#

But most ideas involve using too much FPGA logic.

#

Latest ChatGPT suggestion...

#

Which is pretty much what I tried before, but couldn't get working.

#

I don't think it would help much, keeping a bitmask for which pixels each Tag relates to in a tile row.

#

Because the TSP would still need to check each bitmask bit to know whether to write the pixel or not.

#

The pixels in a tile row aren't necessarily contiguous (per Tag / polygon) either, of course.

#

Since parts of previous poly spans can be overwritten by new ones, leaving sporadic pixels from the previous tag(s).

#

However...

#

With the older Verilog, I was getting very good (theoretical) frame rates in the sim.

#

Since it had no latency for accessing random Words from VRAM etc.

#

But mainly, when I added the leading zeros / trailing zeros thing.

#

That allowed it to jump to the starting pixel of a span, but then it was also using a full Z-buffer, which is cheating.
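
#

For reference, the trailing-zeros trick in software form. This is only a model of the idea (in the FPGA it would be a priority encoder, not a loop), and the function names are made up. It visits only the set bits of a 32-bit row mask, so sparse masks cost fewer steps than checking all 32 positions.

```c
#include <assert.h>
#include <stdint.h>

/* Portable count-trailing-zeros; mask must be non-zero. */
unsigned ctz32(uint32_t mask)
{
    assert(mask != 0);
    unsigned n = 0;
    while ((mask & 1u) == 0) {
        mask >>= 1;
        n++;
    }
    return n;
}

/* Write the x positions of all set bits into xs[] (room for 32) and
 * return how many there were. Only set bits cost an iteration. */
unsigned active_pixels(uint32_t mask, uint8_t xs[32])
{
    unsigned count = 0;
    while (mask != 0) {
        xs[count++] = (uint8_t)ctz32(mask);
        mask &= mask - 1u;   /* clear the lowest set bit */
    }
    return count;
}
```

The `mask &= mask - 1` step is what lets it hop over gaps left by overwritten spans without testing every pixel in between.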

#

The only other way, would be to store a list of all of the X/Y coords of all pixels relating to each Tag.

#

Which could be up to 1,024 values (32x32 pixel tile)... per Tag.

#

And I currently have to store the vertex params for about 256 (or more) Tags in the param cache.

#

So that's a LOT of data. lol

#

OK, so...

#

Ignoring transparent stuff for now, the inTri thing is boolean, right?...

#

ie. either the Tag gets written to the Tag buffer (per-pixel), or it doesn't, depending on the Depth Compare.

#

Trying to figure out if I can just send a Tag buffer row directly to the TSP.

#

Then let it overwrite pixels in the framebuffer, based on the inTri result.

#

No, probably not.

#

That's not really "deferred" rendering at all.

rain obsidian
#

Do we know if the real HW can actually read texels from four different addresses at once?

#

Since there are four address busses.

#

(would only be a 16-bit wide value from each VRAM chip, ofc.)

#

Another example of the Tag buffer.

#

Which represents the lower-right tile being drawn on Sanic.

#

You can kind of spot the pattern of Sanic's spike, and right hand.

#

A "Tag" is simply a value which increments once for each incoming polygon (or partial polygon) within the current tile.

#

With Tag values up to 0xB6 (182) in the example above, that's quite a lot of crap to store in the param cache.

#

(the param cache stores the ISP and TSP params, Vertex X,Y,Z,U,V, Base Colour, and Offset Colour, for every tag within a tile. 😮 )

#

I basically need an algo which could determine which pixels in the Tag buffer relate to a given Tag value, so it can efficiently group them together.

#

And then ofc do the same for all of the other Tag values in the buffer.
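
#

The brute-force version of that grouping, sketched in C for one tile row. It assumes the Tag buffer row is 32 entries and that a Tag fits in 16 bits (both assumptions for the example), and it simply compares every pixel against the wanted Tag. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TILE_W 32

/* Bit x of the result is set when pixel x of the row holds `tag`. */
uint32_t row_mask_for_tag(const uint16_t row[TILE_W], uint16_t tag)
{
    assert(row != NULL);
    uint32_t mask = 0;
    for (unsigned x = 0; x < TILE_W; x++) {
        if (row[x] == tag)
            mask |= (uint32_t)1 << x;
    }
    return mask;
}
```

In hardware that is 32 comparators per row, which is cheap for one Tag but gets expensive if it has to be repeated for every Tag value in the tile.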

#

They did apparently use an RLE approach on newer PVR stuff. Not sure if PVR2 did.

#

I keep coming back to the linked-list thing.

#

It already calculates the inTri bits for all 32 pixels in a tile row, within one clock cycle.

#

But when a new poly span comes along that overwrites part of an old one, it needs to "evict" those inTri bits from the list

#

(and if all of the inTri bits of an existing span get overwritten, it needs to mark that as a free entry.)
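
A minimal software sketch of that eviction step, in Python rather than Verilog. It assumes one 32-bit inTri mask per Tag for a single tile row, held in a plain dict; `write_span` and `row_masks` are hypothetical names, not anything from the core. The loop over every existing entry is exactly the part that would be expensive in logic.

```python
TILE_W = 32
FULL = (1 << TILE_W) - 1  # all 32 pixels of a tile row


def write_span(row_masks, new_tag, new_mask):
    """Record a new span in one tile row.

    row_masks: dict mapping Tag -> 32-bit inTri mask for this row.
    new_mask:  bits of the new span that passed the depth compare.
    """
    for tag in list(row_masks):
        # Evict any pixels of older spans that the new span overwrites.
        row_masks[tag] &= ~new_mask & FULL
        if row_masks[tag] == 0:
            # Every bit was overwritten, so the entry becomes free.
            del row_masks[tag]
    if new_mask:
        row_masks[new_tag] = new_mask
    return row_masks


masks = {}
write_span(masks, 1, 0x0000FFFF)
write_span(masks, 2, 0x00000FF0)
print({t: hex(m) for t, m in masks.items()})
```

After the second call, Tag 1 keeps only the pixels that Tag 2 did not cover.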

rain obsidian
#

Only got as far as boosting the theoretical frame rate last night, from 15.6 to around 23 FPS.

#

And that was just skipping some states in the ISP parser.

#

So there isn't much point looking into Tag sorting and other stuff, until I can figure out how to do a pipeline.

#

Due to the latency of processing each thing, it's taking too many states to output a new pixel.

#

I need to figure out exactly what needs to pass between each stage, then do a pipeline thing.

edgy pilot
#

Well I'm pretty sure Resident Evil Code Veronica (PAL version) runs at 25fps. Theoretically we only need 2 more frames (/JK)

It's always great to read your write ups and seeing pictures. All that seems so technical to me.

rain obsidian
#

So it will take a hit of a few clock cycles at the start of processing a tile, but then it will be a contiguous stream of pixels written to the ARGB tile buffer (per Tag, ignoring any wait times for texture cache misses / DDR latency).

#

(and then double-buffered, so it can flush the finished tile to DDR via burst transfer, whilst allowing the opposite tile buffer to get written to for the next tile.)

#

Oh, and cleaned up the render a bit, but the wall textures are still messed up, after I tweaked the texture address module.

#

I think I'm gonna struggle with any of it from this point, tbh.

#

It was amazing seeing skmp's emulator running on MiSTer, though.

#

Even with the very low frame rates, mainly due to my crusty "GPU".

#

I might have to pause working on this again, and get some PCB projects built.

#

Including the new GD Emu + HDMI board.

#

Whilst helping with the allotments, so I didn’t get much done in the rooms. lol

#

I only got as far as clearing some of the stuff from the floor, into a box. sigh

edgy pilot
#

Love the random Psu haha.

rain obsidian
#

Really hard not to dox yourself these days, with all the mail-order stuff. Oh well. lol

#

Oh yeah, I bought a US Dreamcast PSU a while ago.

#

To do some testing on noise + heat, when the 12V rail isn't being used.

#

I need to be sure I don't get it mixed up with the 240V ones. lol

edgy pilot
#

Seeing all that electronic stuff always reminds me that I need to learn how to solder.

One day I will. Got a lot of stuff planned first.

rain obsidian
#

Bits of poor Game Gears everywhere, too.

#

DIY 3-chip DLP projector probably won't ever happen at this rate.

#

I never did manage to get ANY pixels shifted from an FPGA onto the DMD.

#

But at least my own DMD board does work, only driven by a commercial projector directly (Optoma).

#

I think we've done quite well in this little house.

#

And I helped move most of it. lol

#

Don't really EVER want to build a single-pane glass greenhouse again.

rain obsidian
#

It was terrifying. My brother already dropped some panes of glass on his ankle ('cos he didn't wait for me one morning.)

edgy pilot
#

That picture

#

No way

#

The black and white

#

In your pub

#

My old Barbers used to have a wall filled with pictures like that hung up. He had the same one iirc.

#

What a blast from the past.

#

Nice to see the Guinness on tap too haha.

rain obsidian
#

Helped build the conservatory (and pick it all up), the summerhouse for the bar, the bar itself, the other summerhouse, all of the sheds, the planters, everything in the bar, moved the wood and slabs, bought the corner sofa for the decking two weeks ago.

#

Taps have never really worked. lol

#

Brother bought a beer cooler a few years ago, but it had a coolant leak right away. sigh

edgy pilot
#

I reckon you just need some classic Wetherspoons plates and you're all good to go.

rain obsidian
#

So just use the fridge and optics.

#

lol

edgy pilot
#

Looks amazing

rain obsidian
#

He rarely drinks spirits, and this is the first time in about 18 months that we've even bought a few bottles, for his birthday last weekend.

#

So basically, I'm the main person who has drank the spirits in the bar... about eight times over. lol

#

Just over a long time.

#

The only thing that makes my eye twitch, is the fact he decided to offset the shelves at the back of the bar.

#

I don't get it. I never did. lol

#

He wanted to put a different amp horizontal, on the left-hand side.

#

But there are SO many other ways to control the amp if it was stashed under the bar etc., or just buy a smaller one.

#

I like symmetry. lol

#

I used to really hate Guinness in my teens, but I love it now.

#

(planters and stuff not finished yet. We only just moved them from another plot yesterday. Rolled my ankle... again. lol)

#

The summerhouse on the plot has a faux-leather sofa, TV, and stereo, with Mission floorstanding speakers. lol

#

Just don't tell them.

#

We also somehow managed to get the allotment plot directly behind our garden.

#

And they even let us put a few steps up, and a gate. 😮

rain obsidian
#

Then my brother managed to get the plot next to that, so it's almost like we have a 60ft square garden. lol

#

So yeah, that's another reason why I struggle to get retro / electronics projects done.

#

Don't really want to do much more heavy lifting atm.

edgy pilot
#

That does make sense

rain obsidian
#

I didn't decorate the bar, though. I didn't have much say in it tbh. lol

edgy pilot
#

The decorations look amazing TBF. That black and white sign really stands out.

rain obsidian
#

That corner sofa, btw...

#

The day before, we saw one at a garden centre... They were asking £1,500 for it, almost the same.

#

My brother had a favourite search set up on Faceplant Marketplace.

#

We got it for £100. lol

edgy pilot
#

Now that's a deal and a half right there.

rain obsidian
#

So both good and bad memories, but it for sure helped us get through it.

#

Right, I might actually have a Guinness in a sec. lol

#

Catch you all soon.

edgy pilot
#

Take care mate.

austere shuttle
rain obsidian
#

Thanks.

#

It was a lifesaver during lockdowns.

pseudo tinsel
#

all textures are addressed in 64 bit units

#

I assume it always reads from the texture cache in 64-bit units, and that the texture cache can provide 4x64 bit units for bilinear

#

if miss, then it refills

#

same for the VQ cache

stone plaza
#

wow, still at it

#

dreamcast by christmas

rain obsidian
#

Yes, Christmas 2038.

valid idol
rain obsidian
#

I think I might finally have figured out a method for Tag sorting.

#

I couldn't find a better image atm, but you get the idea.

#

When the HSR process is done, you end up with a full Tag buffer.

#

Each Tag value just relates to a specific polygon (or partial).

#

But during the HSR processing, the Tags won't always be in contiguous spans/rows, because previous Tag values get overwritten if the next polygon span comes along (and if certain pixels from the new poly span pass the Depth-compare test).

#

For every incoming polygon, I currently store the ISP/TSP/TCW, Vertex, and Colour params in a buffer/cache.

#

Which is obviously very wasteful, but I'll figure that out later.

#

Even storing only 256 sets of params gives half-decent renders.

#

Anywho, for doing the Tag sorting, I just have to start rendering pixels from the FRONT Tag, then going front-to-back.

#

For each row in the Tag buffer, I'll also store a bitmask (inTri word) for which pixels in the row the Tag is active in.

#

So then I can just work my way from the "front” Tag layer backwards.

#

Each time a pixel is filled in (in the Tile ARGB buffer), it clears that pixel's bit in the bitmask of the next lowest Tag (an AND with the NOT of the filled pixels).

#

Once all of the bitmask bits in a row are zero, it will just skip that row, until ALL rows are zero, then it knows it's rendered all pixels within the tile.
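
Here is a rough Python model of that front-to-back pass, as one possible reading of the idea. It assumes the highest Tag value is the "front" layer (it was written last, so it passed the depth compare most recently), and that each Tag has 32 row masks of the pixels it wrote. Instead of updating every lower Tag's mask, this sketch keeps one `remaining` mask per row, which gives the same result. All names are hypothetical.

```python
TILE = 32
FULL = (1 << TILE) - 1


def resolve_front_to_back(tag_masks):
    """tag_masks: dict Tag -> list of 32 row masks (pixels written by that Tag).

    Returns dict Tag -> list of 32 row masks of the pixels that Tag
    finally owns, visiting Tags from the front (highest) backwards.
    """
    remaining = [FULL] * TILE          # pixels not yet shaded, per row
    visible = {}
    for tag in sorted(tag_masks, reverse=True):
        rows = []
        for y in range(TILE):
            hit = tag_masks[tag][y] & remaining[y]
            remaining[y] &= ~hit & FULL   # mark those pixels as done
            rows.append(hit)
        if any(rows):
            visible[tag] = rows
        if not any(remaining):
            break                      # every row is zero: tile finished
    return visible


background = [FULL] * TILE
poly = [0xFF, 0xFF] + [0] * 30
front = [0x0F] + [0] * 31
out = resolve_front_to_back({0: background, 1: poly, 2: front})
print(hex(out[1][0]), hex(out[0][0]))
```

The early `break` is the "until ALL rows are zero" condition: once nothing is left to shade, the lower Tags are never visited.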

#

Kind of hard to explain, but at least now I have something to work with.

#

Need to go out now, to help my brother fix the brakes on his van.

#

Oh, could also store a 32-bit bitmask for which rows of the Tag buffer contain active pixels of the current Tag. That way, the TSP should be able to skip whole rows.
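
A small sketch of that per-row mask, again as a Python model with hypothetical names. Bit `y` of the result is set when row `y` has at least one active pixel, and `active_rows` walks only the set bits so empty rows cost nothing.

```python
def row_mask(in_tri_rows):
    """Build a 32-bit mask with bit y set if row y has any active pixel."""
    mask = 0
    for y, bits in enumerate(in_tri_rows):
        if bits:
            mask |= 1 << y
    return mask


def active_rows(mask):
    """Yield the indices of set bits, lowest first, skipping empty rows."""
    while mask:
        low = mask & -mask             # isolate the lowest set bit
        yield low.bit_length() - 1
        mask ^= low


rows = [0] * 32
rows[3] = 0x10
rows[31] = 0x01
print(list(active_rows(row_mask(rows))))
```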

#

I’m still in the van. lol

#

For transparent polys, I think that can work, too. Just calculating the pixel colour as it works through each layer.

#

Failing that, it would have to render from the ‘back’ Tag layer, but that’s harder.

#

And now, as always when I get a eureka moment, I’m stuck bleeding the brakes on a Ford POS.

#

I think I might be able to reuse the inTri logic during the TSP stage. At the same time, checking each inTri bitmask to generate the ‘per row’ bitmask.

rain obsidian
#

@halcyon creek It's actually a Ford Transit, but the "joke" is from Men In Black (1997)...

valid idol
forest echo
#

Not sure if the CPU has been, but that isn't required to make a core of it. There is one half done that someone did as a uni project. "Just" needs someone with the skill level to handle a CPU to pick it up.

valid idol
#

Or was the old "Genesis" core just based off of a REALLY bad guess on the hardware?

forest echo
#

The old Genesis core was excellent

#

The new core is a really inefficient way to code a core, doing a wire-by-wire recreation, but it did fix a few timing bugs. There is a reason that, after that, Sorg didn't want to port over the SMS done this way

ripe stump
#

is the current SMS core based on the genesis core?

forest echo
#

No, it is its own thing

ripe stump
#

oh it's SMS and game gear that are the same core, yeah?

forest echo
#

Yeah

#

A Nuked SMS core would lose a lot of things that the current SMS core does, would be two steps forward, three steps back

ripe stump
#

Yeah, that's a weird balance. I imagine the wire-by-wire recreation is the reason that the core is so full, and doesn't have room for more stuff

forest echo
#

Yeah, the Megadrive core is really inefficient because it is wire by wire, and as a result it is much bigger than it needs to be, so takes up way more space than the old one. Also being coded that way makes it much harder to add extra features to before even considering the lack of space left

polar goblet
#

People are really forgetting that accuracy is always the priority when it comes to the MiSTer project.

#

If you really want to pump games with extra options, just boot up a software emulator instead.

valid idol
#

The VMU cannot be more complicated than the Sega Genesis, so ritual sacrifice (decapping) of one for a hyper-accurate core doesn’t seem too far fetched to me.

forest echo
#

The reason there isn't a core is because nobody with the skillset to make the core is interested, or hasn't decided they want to spend their time on it.

#

If this Sanyo CPU is from the late 90s, then it may not be possible to decap and get good pictures of, as once you get to the mid 90s the number of levels on chips increased so it isn't feasible to decap them and get pictures of all the layers as they are stacked on top of each other. Unlike chips from the 80s that were one layer.

#

Even if someone did a decap, and it worked, and good photos were taken, someone with the skills to trace the whole thing out would need to do that which takes a long time. Then someone would need to turn that into FPGA code.

#

The end result, if even possible, would be less efficient and harder to read than building a core using the datasheet, other resources like Mame and probing the hardware.

#

For the SMS core, it would be better if someone were to fix the two remaining test fails, possibly using the Nuked core as a reference, than throw it out and port over the Nuked one, losing all the additional mappers, GG support, peripherals, as well as extra features like cheats which would be convoluted to try to add into the Nuked core.

rain obsidian
#

I'm still stuck on this stupid problem.

#

I managed to speed up the render times on the sim yesterday, from around 15 FPS up to 26 FPS.

#

But most of that was "cheating", by removing unneeded states, etc.

#

I don't think there is any simple way to generate the RLE output from the polygon spans on-the-fly (as the spans get written to the Tag buffer).

#

Without it using a LOT of logic.

#

But maybe it's not too bad to just store the inTri bits for all 32 rows in some BRAM.

#

That would require 128 Bytes worth, per Tag.

#

A render like the Daytona one has a max Tag value per-tile of around 600.

#

So call it 768 Tags in BRAM, times 128 Bytes = 96 KBytes.
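
The arithmetic above can be checked with a few lines of Python. This only restates the numbers already given (32x32 tile, one bit per pixel, 768 Tags); the function names are made up for the example.

```python
TILE_W = 32
TILE_H = 32


def bytes_per_tag():
    """One inTri bit per pixel, for every row of the tile."""
    return TILE_W * TILE_H // 8


def bram_bytes(num_tags):
    """Total storage for the inTri bitmasks of num_tags Tags."""
    return num_tags * bytes_per_tag()


print(bytes_per_tag())            # bytes per Tag
print(bram_bytes(768) / 1024)     # KBytes for 768 Tags
```

So 768 Tags at 128 Bytes each comes to 98,304 Bytes, which is the 96 KBytes quoted.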

#

It would still need to render the "Tags" in front-to-back order.

#

Because if it was back-to-front (ie. processing from the first incoming polygon to the last), there's no simple way of updating ALL of the previous inTri bitmasks when spans get overwritten by newer polys.

#

Converting the finished Tag buffer (example above) to RLE could be doable as well, but I think it would take too many clock cycles.

#

If I temporarily disable the final rendering output in the sim, the Daytona render gets a frame time which would hit 70 FPS, but that's only if the core could hit 100 MHz.

#

Basically, the whole reason to group together the Tag values, is so that you only need to read the texture into cache ONCE.

#

(well, you have to read another chunk of the same texture if need-be, but you get the idea.)

#

With the current scheme of just rendering each Tag value as it appears in the Tag buffer, it would obviously be even slower on the FPGA, due to the DDR / SDRAM latency etc. That RAM is only really efficient when doing longer Burst transfers.

#

I tried looking through multiple patents to see if I could find how they optimized things.

#

But didn't find too much detail on the Tag sorting part.

#

Maybe doing the RLE thing on the Tag buffer isn't so bad.

#

As it takes a lot of clock cycles just to read in a chunk of texture or VQ Codebook anyway.

#

That could be enough time to offset the time taken for the RLE thing.

#

(RLE is important, not only for sending less data to the TSP, but so the TSP can quickly skip over pixels and rows which are not covered by the current Tag / polygon.)

#

Doing the RLE conversion on a 32-bit inTri bitmask looks fine. ChatGippity can help with that.
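
For reference, a minimal Python sketch of run-length encoding a single 32-bit inTri word. It assumes bit 0 is the leftmost pixel of the row (the real bit order may differ) and returns `(start, length)` pairs for each run of set bits. By construction it can never emit a run of length zero.

```python
def mask_to_runs(mask, width=32):
    """Convert an inTri bitmask into a list of (start_x, length) runs."""
    runs = []
    x = 0
    while x < width:
        if (mask >> x) & 1:
            start = x
            # Extend the run while consecutive bits stay set.
            while x < width and (mask >> x) & 1:
                x += 1
            runs.append((start, x - start))
        else:
            x += 1
    return runs


print(mask_to_runs(0b01110011))
```

The TSP could then jump straight to each `start` and shade `length` pixels, skipping everything in between.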

#

But I don't think there are many ways around storing the 32 inTri words for every incoming poly / tag.

#

It's sort of currently doing this.

#

But with a single Tag value assigned to each triangle (ie. the same value across all active pixels of the triangle).

#

Obviously any incoming triangles that pass the depth test will overwrite the previous triangle's pixels.

#

(I do also store the Z values for each pixel. The Tag buffer is a combined Tag / Z buffer really.)

#

I can't use the Z values to help with the RLE thing, because that would be a LOT of data.

#

What we do know, is the Tag value of the last poly spans written to the Tag buffer.

#

I can't think of a better way, than to render from the "front" polygon, then working front-to-back.

#

(marking off how many pixels in each row have been shaded, until all pixels are done.)

rain obsidian
#

This is probably obvious to most of you, but code from ChatGPT rarely works as stated. lol

#

I get that it's the users that "train" the language model, but it still has a way to go yet.

signal scaffold
#

It'll be wild when it gets there though. It's only a matter of time.

rain obsidian
#

Tried some ChatGippity code earlier. Didn't work.

#

Well, didn't quite work.

#

It outputs RLE length = 0, which should never happen.

#

Eventually gets down to a row where the "02" tags start.

#

But even those don't combine RLE values into one group. lol

#

I think this might be a doable method, as it could start the process whilst the TSP is still fetching a chunk of texture for the first Tag (polygon) found.

#

I need to test a smaller part of code from ChatGippity first.

#

Just need it to output valid RLE values for the stuff currently in the (completed) Prim-Tag buffer.

#

Might be enough logic spare later, to have it do that on more than one value at a time.

#

I realize this might never get close to "full speed" on MiSTer, btw, even with skmp's port of the emu part on the ARM.

#

But if I could even get it running well enough to hit 15-20 FPS in most games, at say 50 MHz, that would be a great proof-of-concept.

#

And hopefully inspire others to collab, and aim at Dreamcast for a future FPGA board maybe.

#

Failing that, Robert might just release a full Dreamcast core out-of-the-blue. lol

valid idol
forest echo
#

What you could do is some research into what the games are all on the VMU, make a list and highlight the best ones, and also what Homebrew was made for it, and do a write up in the VMU thread. If a good case is made there is interesting stuff on there then there is more chance of a dev deciding to pick it up.

#

I could be wrong but I don't think the CPU in it is used elsewhere, but would be worth double checking that.

rain obsidian
#

Still working on this.

#

I have a slightly different strategy.

#

I asked ChatGippity to write some Verilog that simply generates RLE output from the 1,024 Tags in the Tag buffer.

#

For some tiles, all 1,024 pixels will have the same Tag value (like some of the "sky" texture tiles in Daytona, for example).

#

That would take a minimum of 1,024 clock cycles to determine that, so that's the worst-case latency for when the first RLE value gets output.
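
A software model of that scan, as a sketch only. It assumes the Tag buffer is read one entry at a time in raster order and emits `(tag, length)` pairs; `rle_tags` is a hypothetical name. A run can only be emitted once it ends, so a tile with one Tag throughout produces its single run after all 1,024 entries have been read.

```python
def rle_tags(tag_buffer):
    """Run-length encode a flat list of Tag values in raster order."""
    runs = []
    for tag in tag_buffer:
        if runs and runs[-1][0] == tag:
            runs[-1][1] += 1           # same Tag: extend the current run
        else:
            runs.append([tag, 1])      # new Tag: start a run of length 1
    return [tuple(r) for r in runs]


sky_tile = [7] * 1024
print(rle_tags(sky_tile))
print(rle_tags([1, 1, 2, 2, 2, 1]))
```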

#

But, you can offset that latency...

#

By just having two Tag buffers, and doing the double-buffering thing.

#

So you would incur that 1,024 clock cycle delay at the start of processing, but then the ISP would be able to continue processing the NEXT tile whilst the TSP is doing the texturing for the current tile.
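
To put rough numbers on the double-buffering idea, here is a simple cycle-count model in Python. The per-tile cycle counts are invented for illustration. It assumes exactly two Tag buffers: the ISP can only start tile `i` once the TSP has finished with tile `i-2`, and the TSP can only start a tile once the ISP has finished it.

```python
def serial_cycles(isp, tsp):
    """Single buffer: ISP and TSP take turns on every tile."""
    return sum(isp) + sum(tsp)


def pingpong_cycles(isp, tsp):
    """Two Tag buffers: ISP works on tile i+1 while TSP textures tile i."""
    isp_end, tsp_end = [], []
    for i in range(len(isp)):
        start = isp_end[i - 1] if i >= 1 else 0
        if i >= 2:
            start = max(start, tsp_end[i - 2])   # wait for a free buffer
        isp_end.append(start + isp[i])
        t_start = isp_end[i]
        if i >= 1:
            t_start = max(t_start, tsp_end[i - 1])
        tsp_end.append(t_start + tsp[i])
    return tsp_end[-1] if tsp_end else 0


isp = [1024] * 4   # hypothetical cycles per tile in the ISP
tsp = [1024] * 4   # hypothetical cycles per tile in the TSP
print(serial_cycles(isp, tsp), pingpong_cycles(isp, tsp))
```

With equal stage times, the overlapped version pays the first tile's ISP latency once and then finishes one tile per TSP pass.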

#

The only problem with all of this, is how to handle the translucent polys. lol

#

Once I have the ISP outputting RLE reliably, I can then separate the texture_address unit, and shove all of that into its own TSP module.

#

Oh, and the RLE code from ChatGypsy does apparently group the Tags together, which is exactly what I need.

#

(that way, the TSP only needs to read the chunks of texture for ONE tag value at a time, which will prevent it wasting time re-reading stuff from VRAM.)
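
A sketch of what "grouping the Tags together" could look like in software, assuming a flat 1,024-entry Tag buffer in raster order. Each Tag ends up with a list of `(row, start_x, length)` spans, so all pixels that share a texture can be shaded together. This is a Python model with hypothetical names, not the generated Verilog.

```python
from collections import defaultdict


def group_runs_by_tag(tile, width=32):
    """Return dict Tag -> list of (row, start_x, length) spans."""
    groups = defaultdict(list)
    for y in range(len(tile) // width):
        row = tile[y * width:(y + 1) * width]
        x = 0
        while x < width:
            start = x
            tag = row[x]
            # Extend the span while the Tag value stays the same.
            while x < width and row[x] == tag:
                x += 1
            groups[tag].append((y, start, x - start))
    return dict(groups)


tile = [0] * 1024
tile[32 * 1 + 4:32 * 1 + 8] = [5] * 4   # a short span of Tag 5 on row 1
groups = group_runs_by_tag(tile)
print(groups[5])
```

Iterating over `groups` one Tag at a time means the texture for that Tag only has to be fetched while its own spans are being shaded.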

#

Depending on how the texture cache is written, it would be possible later, to render two, four, or more pixels at once.

#

Then dump the finished Tile ARGB back to DDR.

#

I know this project has been going on for a very long time already, but I don't just want to let it die (yet).

#

I'll continue to work on it when I can.