#How much memory in the 3060 8 or 12. I


oblique pulsar

I've got qwen3:8b working with a 32k context on an RTX 3070, using K and V cache quantization at Q4_0. I've also been making changes to make the context smaller so it can handle longer multi-turn conversations locally; I'll see if the maintainers want the lazy-loaded tools I've got working. It's fine at reminders, and I've got it checking emails and such at about 15 tok/sec.
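For anyone wondering why the Q4_0 cache matters on an 8 GB card, here is a rough back-of-the-envelope sketch of the KV cache size at 32k context. The architecture numbers (36 layers, 8 KV heads, head dimension 128) are my assumption for Qwen3-8B and should be checked against the model card. The 4.5 bits per value figure assumes the GGML-style Q4_0 layout (blocks of 32 values stored in 18 bytes). It ignores model weights and any runtime overhead, so treat it as an estimate only.

```python
# Rough KV cache size estimate; all model numbers below are assumptions.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bits_per_value):
    """Bytes needed to store K and V for n_ctx tokens across all layers."""
    # Factor of 2 covers the separate K and V tensors.
    values = 2 * n_layers * n_kv_heads * head_dim * n_ctx
    return values * bits_per_value / 8

F16_BITS = 16.0
# Assumed GGML-style Q4_0: 32 values packed into 18 bytes (16 data + 2 scale).
Q4_0_BITS = 18 * 8 / 32  # 4.5 bits per value

# Assumed Qwen3-8B shape (verify against the model config).
QWEN3_8B = dict(n_layers=36, n_kv_heads=8, head_dim=128)

if __name__ == "__main__":
    n_ctx = 32 * 1024
    for name, bits in (("f16", F16_BITS), ("q4_0", Q4_0_BITS)):
        size = kv_cache_bytes(n_ctx=n_ctx, bits_per_value=bits, **QWEN3_8B)
        print(f"{name}: {size / 2**30:.2f} GiB")
```

Under those assumptions the cache comes to about 4.5 GiB at f16 versus roughly 1.3 GiB at Q4_0, which is the difference between not fitting and fitting alongside a quantized 8B model on 8 GB.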

shell pasture