#Newbie needs help with AI stuff, any help is welcome


lost ivy

Hello! I hope I’m in the right place where someone will (hopefully) have the patience 😄 to help me with this whole AI thing. I’m very interested in this stuff and I’ve researched a lot on my own. However, for the last 5–6 days I’ve been hitting a wall and I can’t see a way out.

What’s going on: I’m into programming—learning and experimenting—and, as with everything else, AI has gotten involved because it can help a lot. I looked into it and found a combo where I use the Cline extension for Visual Studio Code, and LM Studio for all the AI parts. Now, here’s the problem: I downloaded several models and tested them, but it’s just not working as it should. You might ask, what exactly? For example, with some models I’ve had to wait up to 30 minutes for a response, and when it finally comes it’s 4–5 letters at a time every 10 seconds. That’s when I realized I’m definitely doing something wrong.

So I’d really appreciate it if someone willing and eager to help could jump in—I'd be very grateful. From what I understand, it all comes down to the settings in LM Studio, but I just can’t figure out where and what I’m doing wrong. Below I’ll share screenshots of the models I’ve downloaded so far, as well as my PC specs. If you need any additional information, just let me know. Thanks in advance!

Below are my PC specifications:

CPU: AMD Ryzen 7 9800X3D
RAM: 32 GB 6400 MHz
GPU: RTX 3060 12 GB

ornate shuttleBOT

Your hardware (RTX 3060 12GB, Ryzen 7 9800X3D, 32GB RAM) is capable of running many AI models, but slow responses (e.g., 4–5 letters every 10 seconds) usually mean the model does not fit in your GPU's 12GB of VRAM, so part of it runs from system RAM on the CPU, or that the settings in LM Studio are not optimized. Try smaller, 4-bit quantized models (e.g., 7B or 13B parameter models, not 30B+), and make sure LM Studio is set to use your GPU, not the CPU. Also check that the context length and batch size are not set too high for your VRAM, and avoid running other GPU-intensive apps at the same time. For best speed, use models in GGUF format with a quantization like Q4_K_M or Q5_K_M, and set the number of GPU layers as high as your VRAM allows without causing out-of-memory errors.
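As a rough back-of-the-envelope check of which sizes fit in 12GB, here is a minimal Python sketch. The figures are assumptions, not measurements: about 4.8 bits per weight for a Q4_K_M-style quant, and a flat 1.5GB allowance for context (KV cache) and runtime buffers. The function names are made up for illustration, and real usage depends on the model and the context length you set.

```python
def estimate_vram_gb(params_billion, bits_per_weight, overhead_gb=1.5):
    """Rough VRAM estimate in GB: quantized weights plus a flat overhead.

    overhead_gb is an assumed allowance for KV cache and buffers;
    it grows with context length in practice.
    """
    weights_gb = params_billion * bits_per_weight / 8  # billions of params * bytes per param
    return weights_gb + overhead_gb


def fits_in_vram(params_billion, bits_per_weight, vram_gb=12.0, overhead_gb=1.5):
    """True if the rough estimate stays within the given VRAM budget."""
    return estimate_vram_gb(params_billion, bits_per_weight, overhead_gb) <= vram_gb


# Assumed ~4.8 bits per weight for a 4-bit "K" quant.
for size in (7, 13, 30):
    needed = estimate_vram_gb(size, 4.8)
    verdict = "fits" if fits_in_vram(size, 4.8) else "does not fit"
    print(f"{size}B: about {needed:.1f} GB, {verdict} in 12 GB")
```

With these assumptions, 7B and 13B models stay under 12GB while a 30B model does not, which matches the sizes suggested above. A long context window can push a borderline model over the limit.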

If you want step-by-step help with LM Studio settings, or want to know which specific models to try, let me know! For more details on model selection and VRAM requirements, see Unsloth's model selection guide.

lament reef

Are you able to verify that it's running on the GPU and not CPU?
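One way to check from outside LM Studio, assuming the NVIDIA driver is installed and `nvidia-smi` is on your PATH, is to read VRAM usage and GPU utilization while the model is generating. The sketch below wraps the `nvidia-smi` query in Python; the helper names are just for illustration, and it assumes every field comes back as a plain number.

```python
import subprocess

# nvidia-smi query: one CSV line per GPU, no header, no units (memory in MiB, utilization in %).
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.used,memory.total,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_line(line):
    """Parse one CSV line such as 'NVIDIA GeForce RTX 3060, 6120, 12288, 87'."""
    name, used, total, util = [field.strip() for field in line.split(",")]
    return {
        "name": name,
        "used_mib": int(used),
        "total_mib": int(total),
        "util_pct": int(util),
    }


def read_gpu_stats():
    """Run nvidia-smi and return one dict per GPU (needs an NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return [parse_gpu_line(line) for line in result.stdout.splitlines() if line.strip()]


# Usage, while the model is generating a reply:
#   for gpu in read_gpu_stats():
#       print(gpu)
```

If used memory barely changes after loading the model and utilization stays near zero during generation, the model is most likely running on the CPU.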