help-forum

805 threads · Page 12 of 17

2 tokens/second 17 messages
KeyError: 'bos_token_id' 2 messages
Windows
VRAM & "CUDA out of memory." 40 messages
Hardware Setup
Cuda error 2 5 messages
general performance-review 9 messages
Windows
unsupported tensor dtype in loras LIMARP with Exllama 7 messages
How to convert ggml to ggml v3? 4 messages
Can not run llama2 7b parameters on windows webui (updated) 56 messages
Can't install models with WebUI 26 messages
Hardware Setup Windows
Could not find the quantized model in .pt or .safetensors (Solved, wrong model loader.) 2 messages
Windows
Loading TheBloke_Llama-2-70B-chat-GPTQ… 10 messages
Windows
[RESOLVED] OSError: [WinError -1073741795] Windows Error 0xc000001d 38 messages
Setup Windows
any recommendations? 22 messages
All of a sudden my screen now goes off randomly while running ooba 9 messages
Hardware
I can't figure out what's wrong. It won't Load Models! 11 messages
In chat-instruct mode, where do you add memory of important details about the user the LLM is talking to? 7 messages
Can't get any models to work in Text Generation WebUI on Windows 11 laptop 3 messages
Windows
How to install GPTQ-for-Llama with venv on Linux? 3 messages
Linux Setup LLaMA
Failed building wheel for quant-cuda 6 messages
Setup Windows
.safetensors models produce nonsense output. 16 messages
524 error with public api extension - any insight? 2 messages
Windows
New to WebUI, Conda errors 2 messages
Windows
Auto Installer - Listen on Lan 4 messages
Linux Setup
Are there any existing functions to submit prompts in batches? 10 messages
Linux Prompts & Characters
Question about Perplexity Results 42 messages
max_new_tokens increased length 6 messages
Gibberish/Empty responses when clicking "continue". 8 messages
LLaMA Windows
Error when loading model (WizardLM Uncensored Falcon 40B) 54 messages
Hardware Setup Windows
Issues using OpenAI API extension 7 messages
silero_tts repeating voice clip bug 3 messages
models won't load 51 messages
Hi I enabled public_api in Text Generation WebUI from RunPod service and I need the URL 2 messages
Hardware Setup
can't start web UI without loading a model, Bug? 4 messages
Windows
llama.cpp not using GPU despite having BLAS = 1 (Linux, GGML) 121 messages
Linux LLaMA AMD
What are Ooga's default preset values? 4 messages
What models can I use? 25 messages
Hardware Windows
What prompts does Chat mode use instead of Instruct mode? 3 messages
Linux Prompts & Characters
I fine-tuned the model and I got CUDA out of memory 4 messages
Linux
When training a LoRA in Alpaca format, how do I ensure the conversation history is maintained? 5 messages
KeyError: 'lm_head.weight' when attempting to load Guanaco 33B with loader other than Transformers 4 messages
Linux LLaMA
Show prompts in Console? 3 messages
Expected inference speeds with a 3090Ti / ExLlama setup 106 messages
Getting error while trying to create a LoRA based on TheBloke_WizardLM-33B-V1-0-Uncensored-SuperHOT-8 5 messages
Linux Setup Windows
Errors when running SUPERHOT models using Exllama 7 messages
Cpu bottleneck 2 messages
Hardware
How exactly does the "Start reply with" feature work? 21 messages
Runtime error while trying to boot alpaca 5 messages
Setup Windows
CUDA out of memory 8 messages
Windows
ERROR:Failed to load the model 3 messages
Conda environment is empty 30 messages
Setup Windows