help-forum
help with server.py --options
How do I use oobabooga with CPU
GUYS I HAVE A 1650 graphics card, 4GB VRAM, I wanna run a 6B or 7B model
Hiding GUI elements?
Gradio
llama 2 tends to continue and modify the prompt without me asking it to
GGML Performance with slow RAM
API reacts differently for different model loaders
LLaMA
Windows
Building Wheels fails
Linux
AMD
Running a 13b model on a 3060
Hardware
Windows
api call stuck in character and very deterministic
Gradio
LLaMA
Windows
0.10 Tokens a second, help would be appreciated..
Japanese-StableLM-Instruct-Alpha-7B setup and error
Linux
Setup
Responses Stop after a particular token count
Gradio
Windows
Model not regenerating previous responses with 'regenerate' button
IndexError: list index out of range
NotImplementedError: Cannot copy out of meta tensor; no data!
Linux
mirostat perf seems slow compared to regular sampling.
Error trying to install | AttributeError: module 'os' has no attribute 'statvfs'
Hardware
Windows
Drop vram cache via API?
Linux
[FIXED] Using GGML model (llama.cpp) with 4k tokens only checks the first 2k tokens and gives an error.
LLaMA
Prompts & Characters
Windows
Can't train LoRA. Two warnings ('--load-in-8-bit-' and wrong kind of model) then an error.
Windows
which setting will make response less formal?
Prompts & Characters
Windows
Is there a way to always show the model dropdown?
Gradio
Unable to connect on Sillytavern via phone/different local PC
Llama.cpp
LLaMA
Windows
2GB VRAM
Windows
GPTQ ExLlama Won't Split Between GPUs
Linux
Setup
silero_tts : WARNING:socket.send() raised exception
Windows
Any sort of local framework for generating chats?
Windows
How do I know if an LLM model is 4-bit or 8-bit?
A way to reduce prompt eval time? Incredibly bad generation speed when truncation begins.
How does text-generation-webui choose a GGML file from a directory of many different q values?
Data Generation Using GPT-3.5
LLaMA
Prompts & Characters
Train the llama v2 model using the web UI on Raw text
Hardware
LLaMA
Windows
How to run inference on large models such as 30B/70B (running into OOM using transformers), thanks!
Windows Ooba issue
unable to load llama2 model
What software is used to train or fine tune llama 2?
Illegal instruction (core dumped)
Need help allocating VRAM
Hardware
Windows
Guides on Training/Dataset building (Text, web/document scrapes)?
Linux
Setup
LoRA training issue with Alpaca chat format on ExLlama loader
Gradio
LLaMA
Training based on my books
Windows
How big is one model?
Gradio
Setup
Windows
Running Downloaded Model on Linux and GPTQ and CUDA Requirement?
Linux
Setup
Error with running AI on Ooba [ISSUE RESOLVED]
Installation error
Windows
Where do I get LoRAs?
Setup
Windows
One word answers in long chat
Prompts & Characters
Question about prompt formatting when using chat history/for bot purposes...
Linux
Prompts & Characters