#im trying to load some models to do some light testing but i keep getting a warning message:
This warning message doesn't really mean anything useful...
But it's a bit strange that you're loading something in 5 shards. You're probably loading an uncompressed (unquantized) model, and it looks like the app dies while loading it.
Try using models with the -GGUF suffix.
What's your hardware setup?
unsure, been a while since i got this thing
Hm, 11GB GTX card, you definitely need to use -GGUF models.
Something in the 12B range, I think. Perhaps more if you're OK with sacrificing a lot of speed.
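To see roughly why the 12B range is suggested for an 11GB card, here is a back-of-the-envelope sketch. The bits-per-weight numbers are assumptions (about 4.5 bits for a typical 4-bit GGUF quant, 16 bits for an unquantized fp16 model) and the function name is hypothetical. Real files carry some extra overhead, and the context cache needs VRAM on top of the weights.

```python
def approx_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Very rough size of the model weights in GB (10^9 bytes)."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


if __name__ == "__main__":
    # Assumed bits-per-weight values, for illustration only.
    cases = [("fp16, unquantized", 16.0), ("~4-bit GGUF quant", 4.5)]
    for label, bits in cases:
        print(f"12B at {label}: ~{approx_weights_gb(12, bits):.1f} GB")
```

Under these assumptions a 12B model is about 24 GB unquantized but under 7 GB as a 4-bit quant, which is the difference between not fitting and fitting on this card.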
ye, thing is, i used to mess around with llms and got it all to work, just can't remember how for the life of me
ight
ill download one and get back to you
alright
so
still not working
different problem though
ill post the cmd message i got and then head to bed for now, it's late
- Use the llama.cpp loader
- Make sure you limit the context to some reasonable value, for example 8192.
Not sure what the field is called now, maybe still n-ctx. It's the field where the value is currently set to ~1 million.
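A rough sketch of why a ~1 million context is a problem: the key/value cache grows linearly with the context length. The formula below is the usual transformer KV-cache estimate, and the architecture numbers (40 layers, 8 KV heads, head size 128, 2-byte cache values) are hypothetical, not taken from any specific model. The loader may allocate memory differently in practice.

```python
def kv_cache_bytes(n_ctx, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Estimate KV-cache size: one key and one value vector per layer per token."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * n_ctx


# Hypothetical architecture, for illustration only.
ARCH = dict(n_layers=40, n_kv_heads=8, head_dim=128)

if __name__ == "__main__":
    for n_ctx in (8192, 1_048_576):
        gib = kv_cache_bytes(n_ctx, **ARCH) / 2**30
        print(f"n_ctx={n_ctx}: ~{gib:.2f} GiB of cache")
```

With these assumed numbers, 8192 tokens of context costs about 1.25 GiB, while ~1 million tokens would need about 160 GiB, far beyond an 11GB card.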
update
did that
no binaries
how do i fix this
also
sorry if this is annoying to deal with for you man
cmd_windows.bat -> pip install -r requirements/full/requirements.txt
hm, do you have several copies of the app installed?
if you do, make sure you run this command in the same copy you're actually launching
and restart the app
If that doesn't help, then try cmd_windows.bat -> pip install https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.14.0/llama_cpp_binaries-0.14.0+cu124-py3-none-win_amd64.whl
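To check that the install landed in the copy you're actually running, something like the sketch below can be run with python from the same cmd_windows.bat prompt. It assumes the distribution name matches the wheel filename (llama_cpp_binaries); the helper name is hypothetical.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Shows which Python environment (and therefore which app copy) is active.
    print("Python in use:", sys.executable)
    # Distribution name assumed from the wheel filename.
    print("llama_cpp_binaries:", installed_version("llama_cpp_binaries"))
```

If the second line prints None, the binaries are not installed in that environment, and the path on the first line shows which copy the prompt belongs to.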