help-forum

805 threads · Page 11 of 17

help with server.py --options 3 messages
How do I use oobabooga with CPU 2 messages
GUYS I HAVE A 1650 graphics card, 4GB VRAM, I wanna run a 6B or 7B model 5 messages
Hiding GUI elements? 10 messages
Gradio
llama 2 tends to continue and modify the prompt without me asking it to 2 messages
GGML Performance with slow RAM 17 messages
API reacts differently for different model loader 5 messages
LLaMA Windows
Building Wheels fails 5 messages
Linux AMD
Running a 13b model on a 3060 10 messages
Hardware Windows
api call stuck in character and very deterministic 2 messages
Gradio LLaMA Windows
0.10 Tokens a second, help would be appreciated.. 5 messages
Japanese-StableLM-Instruct-Alpha-7B setup and error 4 messages
Linux Setup
Responses Stop after a particular token count 13 messages
Gradio Windows
Model not regenerating previous responses with 'regenerate' button 3 messages
IndexError: list index out of range 4 messages
NotImplementedError: Cannot copy out of meta tensor; no data! 2 messages
Linux
mirostat perf seems slow compared to regular sampling. 4 messages
Error trying to install | AttributeError: module 'os' has no attribute 'statvfs' 6 messages
Hardware Windows
Drop vram cache via API? 4 messages
Linux
[FIXED] Using GGML Model (llama.cpp) with 4k tokens only checks the first 2k tokens and gives an error. 3 messages
LLaMA Prompts & Characters Windows
Can't train Lora. Two warnings ('--load-in-8-bit-' and wrong kind of model) then an error. 2 messages
Windows
which setting will make response less formal? 2 messages
Prompts & Characters Windows
Is there a way to always show the model dropdown? 6 messages
Gradio
Unable to connect on Sillytavern via phone/different local PC 2 messages
Llama.cpp 7 messages
LLaMA Windows
2Gb Vram 52 messages
Windows
GPTQ ExLlama Won't Split Between GPUs 5 messages
Linux Setup
silero_tts : WARNING:socket.send() raised exception 7 messages
Windows
Any sort of local framework for generating chats? 2 messages
Windows
How do I know if an LLM model is 4-bit or 8-bit? 14 messages
A way to reduce prompt eval time? Incredibly bad generation speed when truncation begins. 66 messages
How does text-generation-webui choose a GGML file from a directory of many different q values. 4 messages
Data Generation Using GPT-3.5 6 messages
LLaMA Prompts & Characters
Train the llama v2 model using the web UI on Raw text 2 messages
Hardware LLaMA Windows
how to infer large models such as 30b/70b (running into oom using transformers) thanks ! 3 messages
Windows Ooba issue 11 messages
unable to load llama2 model 2 messages
What software is used to train or fine tune llama 2? 2 messages
Illegal instruction (core dumped) 15 messages
need help allocating vram 5 messages
Hardware Windows
Guides on Training/Dataset building (Text, web/document scrapes)? 9 messages
Linux Setup
LoRA training issue with Alpaca chat format on ExLlama loader 18 messages
Gradio LLaMA
Training based on my books 533 messages
Windows
How big is one model? 42 messages
Gradio Setup Windows
Running Downloaded Model on Linux and GPTQ and CUDA Requirement? 34 messages
Linux Setup
Error with running AI on Ooba [ISSUE RESOLVED] 52 messages
Installation error 14 messages
Windows
Where do I get Loras? 5 messages
Setup Windows
One word answers in long chat 5 messages
Prompts & Characters
Question about prompt formatting when using chat history/for bot purposes... 15 messages
Linux Prompts & Characters