#Finding the largest model file my computer can run

tepid hollow
#

What is the largest model I can run with 6 GB of CPU RAM plus 2 GB of integrated GPU VRAM? In general, can you only use model files that are smaller than your total RAM? I saw a term called "lazy loading". If I understand it correctly, does it let you use a model whose size is larger than your RAM?

obsidian mortarBOT
#

Hi @tepid hollow. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

The size of the model you can run depends on both your CPU and GPU capabilities. The system requirements for running GPT4All models include a CPU that supports AVX or AVX2 instructions and enough RAM to load a model into memory (source (https://docs.gpt4all.io/gpt4all_help/faq.html#what-are-the-system-requirements)).

However, the model size is not the only factor that determines whether a model can run on your system. The model's architecture and quantization type also play a role. For example, one user reported that some models, such as dolphin 13b, would not work while falcon 13b worked fine, even though both are 13-billion-parameter models. This was attributed to differences in model size and GPU type (source (https://github.com/nomic-ai/gpt4all/issues/2345)).

As for "lazy loading", it is a method that loads only the necessary parts of a program into memory, which can help to reduce the amount of RAM used. However, the GPT4All documentation does not mention this feature, so it's unclear whether GPT4All supports it.
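
One common form of lazy loading is memory-mapping a file, so the operating system reads pages from disk only when they are touched. The sketch below is a generic Python illustration of that idea using a small temporary file. It is not GPT4All's loader, and the helper name `read_slice_lazily` is hypothetical.

```python
import mmap
import os
import tempfile


def read_slice_lazily(path, offset, length):
    """Return `length` bytes starting at `offset` using a read-only memory map.

    The file is mapped, not read in full; the OS loads only the pages
    that are actually accessed.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return bytes(mm[offset:offset + length])


# Build a tiny stand-in for a model file (assumption: real files are far larger).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 4096 + b"weights!" + b"\x00" * 4096)
    demo_path = tmp.name

chunk = read_slice_lazily(demo_path, 4096, 8)
os.remove(demo_path)
print(chunk)
```

Memory mapping does not make slow storage fast: if the mapped file is larger than free RAM, pages are repeatedly re-read from disk during use.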

In general, it's recommended to try different models and see which ones work best with your system. You can find a list of supported models on the GPT4All website (https://gpt4all.io) or in the models directory (https://raw.githubusercontent.com/nomic-ai/gpt4all/main/gpt4all-chat/metadata/models3.json) (source (https://docs.gpt4all.io/old/index.html#models)).

Remember, even if a model can technically run on your system, it may not perform optimally if your system resources are stretched to their limits. It's always a good idea to leave some overhead for other system processes.

wide pier
#

You can try Qwen2-1.5b-instruct, which should fit into your RAM.

#

Maybe phi-3-mini-instruct, if you are lucky.

tepid hollow
#

Yeah, that should fit perfectly into my 2 GB of GPU VRAM. I also managed to run Llama 7B GGUF (q4_0), which has a file size of 4 GB, possibly because I have another 6 GB of CPU RAM.

#

But I can't run anything beyond 7B q4_0. I'm wondering if that's the theoretical maximum for my specs? Or maybe there are other tweaks to run even larger models? 🤔

wide pier
#

No, that's probably the max or close to the max. You also need some RAM for your operating system and other apps that are running in the background.
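
A rough back-of-the-envelope check of this, as a sketch: q4_0 stores each block of 32 weights in 18 bytes, which is about 4.5 bits per weight. The 1.5 GiB reserved for the operating system and the round parameter counts are assumptions for illustration, and the function names are hypothetical.

```python
GIB = 1024 ** 3
Q4_0_BITS_PER_WEIGHT = 4.5  # 32 weights packed into 18 bytes


def estimate_weights_gib(n_params, bits_per_weight=Q4_0_BITS_PER_WEIGHT):
    """Approximate size of the quantized weights in GiB."""
    return n_params * bits_per_weight / 8 / GIB


def fits_in_ram(n_params, total_ram_gib, os_reserve_gib=1.5):
    """True if the weights alone fit after reserving RAM for the OS (assumed value)."""
    return estimate_weights_gib(n_params) <= total_ram_gib - os_reserve_gib


# Nominal parameter counts; real models differ slightly.
for name, n_params in [("1.5B", 1.5e9), ("7B", 7e9), ("13B", 13e9)]:
    size = estimate_weights_gib(n_params)
    print(f"{name}: ~{size:.2f} GiB, fits in 6 GiB RAM: {fits_in_ram(n_params, 6)}")
```

Under these assumptions a 7B q4_0 model fits in 6 GB of RAM while a 13B q4_0 model does not, which matches what was observed here. The estimate ignores the context cache and runtime overhead, so the real limit is somewhat tighter.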

#

And conversing fills up your RAM too. The longer the conversation goes, the more you will need.
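
To put a number on this, here is a minimal sketch of the key/value cache size for a Llama-7B-like model. The shape (32 layers, hidden size 4096), 16-bit cache values, and no grouped-query attention are all assumptions, and `kv_cache_bytes` is a hypothetical helper, not a GPT4All function.

```python
def kv_cache_bytes(n_tokens, n_layers=32, hidden_size=4096, bytes_per_value=2):
    """Bytes needed to cache keys and values for `n_tokens` of context.

    Each token stores one key vector and one value vector per layer.
    """
    return 2 * n_layers * hidden_size * bytes_per_value * n_tokens


for n_tokens in (256, 1024, 2048):
    mib = kv_cache_bytes(n_tokens) / 1024 ** 2
    print(f"{n_tokens} tokens of context: ~{mib:.0f} MiB")
```

With these assumed values the cache costs 0.5 MiB per token, so a 2048-token context needs about 1 GiB on top of the weights. Some runtimes may reserve this memory up front for the configured context length rather than growing it as the conversation proceeds.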

#

The "Large" in Large Language Models is real 😄

tepid hollow
#

Thanks! I also saw this post on github

#

I read thru it several times but can't figure out if it means anything significant.

#

If I understood the post correctly, I guess much of the model is loaded into page files

#

Which I guess will affect the speed a lot, but I'm not sure cuz I didn't see them mentioning the speed 😂

wide pier
#

Anything on your HDD or SSD is much slower than RAM or VRAM.

#

Is there a reason why your hardware is THAT limited?

tepid hollow
#

Yeah, that's just the computer I have, so I'm trying to find the largest model that it can possibly run 😭

wide pier
#

what's the name of the motherboard?

#

what's your budget to buy hardware?

tepid hollow
#

It's

ASUSTeK COMPUTER INC.
X512DA
1.0

and I've $0 tbh... I don't have an allowance, and it's still years until I can make my own money

#

I also read a bit about possibly connecting my computer to another, even older computer we have at home, which has 2 GB of RAM and no VRAM

#

But based on what I read, if there's too much difference in the hardware, it's only going to make the performance worse

wide pier
tepid hollow
#

Yup, vivobook 15

#

I think that's honestly the best way right now 😂