help-forum
Loading larger models
Hardware
Linux
Setup
Traceback
Windows
Long Term Memory Extension Error, Please help me or santa will kill you in your sleep.
All models suddenly failing to load "unpack requires a buffer of 4 bytes"
Gradio
Windows
Webui is too slow, but not consuming too much resources
Windows
Need to integrate the webui into Python code
Windows
Yi-34B 200k based models double spaces in notebook mode
ElevenLabs - Still supported?
Hello to everyone, I run ggml-model-q4_0.gguf. It is talking in very poor English (-it seems to me
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 451: character maps to
Can't install: No module named 'conda'
Text output error □□□□□□□□□□□□
"load_model_wrapper" & "IndexError: list index out of range" errors
Setup
Issue with rocBLAS on amd rx 580
dolphin-mixtral8x7b_4km not loading --please help!
Linux
Setup
AMD
Continue generation via API
Trying to run TheBloke_dolphin-2.5-mixtral-8x7b-GPTQ getting an error with every model loader
slow generation on GTX4080
PC resources maxing out, Dolphin-mixtral extremely slow
Running out of VRAM even with --gpu-memory set
Linux
Slow text generation in latest versions with llama.cpp
Prompt from Python script
Prompts & Characters
Windows
ERROR: byte not found in vocab: ''Segmentation fault (core dumped)
Taskweaver WebUI Question
Superbooga loading txt files added permanent?
Why doesn't it stream the tokens?
Windows
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=80): Max retries exce
Windows
Getting no answers. Stuck in "Is typing.."
Windows
I can't get multimodal to run
is it possible to Run Mixtral8x7B?
Why am I getting this error? When I try to load any preset I get this. Reinstalling the interface to
Mistral MOE loading?
ggml_new_object: not enough space in the context's memory pool
can I use the new opensource Mistral 8x7B model that is supposedly better than gpt 3.5 locally?
Hardware
Setup
Windows
Trouble downloading/loading
OobaBot not connecting
AutoAWQ?
Add mixtral compatibility
Optimal way of running a LLM
Hardware
Windows
slow answer...Hello all of you.
Does anyone have a python token streaming example code using the API?
Inference speed slows down drastically as context window fills up.
Windows
large Context Testing
Windows
Seeking a webui to permit remote queries to vector db
Problem installing llama.cpp
Linux
Slow Answers
Hardware
Setup
LLaMA
Windows
Wizard-Lm-7B Generating random gibberish on 1660 ti
Windows
How to work with the api?
Prompts & Characters
Can't chat
Linux
Gradio
How to prevent AI replies from including emoticons or emotions
LLaMA
Windows