Hello all, I'm just setting up my hardware: a Lenovo P350 Tiny (10th-gen i5), and I'm planning to use either a Yeston 3050 LP 6 GB or a modded RTX A2000 LP 6 GB GPU. The issue is that I want to use the Qwen2.5 3B LLM as a conversation assistant at Q6_K_L, and that alone requires about 2.5 GB.
If I also run Wyoming Whisper with the medium-int8 model, I have read that the model consumes at least 4-5 GB of VRAM. So I'm scratching my head here, as I really cannot stretch my budget to a better GPU.
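To make the arithmetic I'm worried about concrete, here is a rough sketch using only the figures quoted above. The Whisper range is just what I have read, not something I have measured, and the helper name `headroom_gb` is only for illustration.

```python
# Rough VRAM budget using only the numbers quoted in this post.
GPU_VRAM_GB = 6.0               # either candidate card (both 6 GB)
LLM_GB = 2.5                    # Qwen2.5 3B at Q6_K_L, as quoted above
WHISPER_GB_RANGE = (4.0, 5.0)   # medium-int8, figure I have read (unverified)


def headroom_gb(total_gb, *loads_gb):
    """Return what is left of total_gb after subtracting every load."""
    return total_gb - sum(loads_gb)


# Best case uses the low end of the Whisper estimate, worst case the high end.
best_case = headroom_gb(GPU_VRAM_GB, LLM_GB, WHISPER_GB_RANGE[0])
worst_case = headroom_gb(GPU_VRAM_GB, LLM_GB, WHISPER_GB_RANGE[1])

print(f"Headroom: {worst_case:+.1f} GB to {best_case:+.1f} GB")
```

A negative result means the two models would not fit together if the 4-5 GB figure is right. This does not count the LLM's context cache or Piper, so the real picture could be tighter.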
Could anyone running Wyoming Whisper on a GPU please let me know how much VRAM the small and medium models consume on your card? Also, any idea how much Piper consumes if you are running it on the GPU too?
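If it makes reporting easier, something like the sketch below should list per-process VRAM use. I'm assuming `nvidia-smi` is available on the host and that its `--query-compute-apps` CSV output has the three fields requested; the function names are my own, and processes inside containers may show up without a usable name.

```python
import subprocess

# nvidia-smi query for per-process GPU memory, in MiB because of "nounits".
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Turn 'pid, name, MiB' lines into (pid, name, mib) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        try:
            # Split from both ends so a comma in the process name is tolerated.
            pid, rest = line.split(",", 1)
            name, mem = rest.rsplit(",", 1)
        except ValueError:
            continue  # line does not have three fields
        pid, name, mem = pid.strip(), name.strip(), mem.strip()
        if not (pid.isdigit() and mem.isdigit()):
            continue  # skip entries such as "[N/A]"
        rows.append((int(pid), name, int(mem)))
    return rows


def read_gpu_processes():
    """Run nvidia-smi and return the parsed per-process memory list."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_compute_apps(result.stdout)
```

Calling `read_gpu_processes()` while Whisper (and Piper, if it is on the GPU) is loaded should give one entry per process with its memory in MiB, which is exactly the number I'm after.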