llama 13b 4bit 128 amd | Text Generation WebUI | Page 1

vestal smelt Mar 29, 2023, 5:29 PM

#

have you followed the steps at https://github.com/oobabooga/text-generation-webui/wiki/LLaMA-model#4-bit-mode to install GPTQ-for-LLaMa?

GitHub

LLaMA model

A gradio web UI for running Large Language Models like GPT-J 6B, OPT, GALACTICA, LLaMA, and Pygmalion. - oobabooga/text-generation-webui

#

it looks like you either haven't done that yet, or just have an outdated copy of it

vestal smelt Mar 29, 2023, 11:20 PM

#

it's not -required-, it's always optional, but it's helpful and it's compatible with multiple models

#

to my knowledge it's the only way to get 4bit models loaded rn

#

there are multiple versions of GPTQ available, including a newer PyTorch variant

civic grail Mar 31, 2023, 10:28 PM

#

there is an issue with some new functions that were added in GPTQ-for-LLaMa that ROCm can't handle. YellowRose has a patched fork for ROCm/HIP here: https://github.com/YellowRoseCx/GPTQ-for-LLaMa

GitHub

4 bits quantization of LLMs using GPTQ. Contribute to YellowRoseCx/GPTQ-for-LLaMa development by creating an account on GitHub.
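
To decide whether the ROCm fork above applies, it can help to check which backend the installed PyTorch was built for. This is a minimal sketch, not something from the wiki: `detect_backend` and `suggest_gptq_source` are hypothetical helper names, and the only PyTorch attributes it reads are `torch.version.hip` and `torch.version.cuda`, which are version strings on ROCm and CUDA builds respectively and `None` otherwise.

```python
from typing import Optional


def detect_backend(hip_version: Optional[str], cuda_version: Optional[str]) -> str:
    """Classify a PyTorch build from its torch.version.hip / torch.version.cuda values."""
    if hip_version:
        return "rocm"
    if cuda_version:
        return "cuda"
    return "cpu"


def suggest_gptq_source(backend: str) -> str:
    """Map a backend label to the GPTQ-for-LLaMa source mentioned in this thread."""
    if backend == "rocm":
        # Patched ROCm/HIP fork linked in the message above.
        return "https://github.com/YellowRoseCx/GPTQ-for-LLaMa"
    if backend == "cuda":
        # Assumption: on NVIDIA, follow the 4-bit section of the webui wiki.
        return "the GPTQ-for-LLaMa repository linked from the webui wiki"
    return "no GPU build of PyTorch detected"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        backend = detect_backend(
            getattr(torch.version, "hip", None),
            getattr(torch.version, "cuda", None),
        )
        print(backend, "->", suggest_gptq_source(backend))
```

Run it inside the same Python environment the webui uses; a ROCm build of PyTorch should report `rocm`, an NVIDIA build `cuda`.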

atomic cliff Apr 1, 2023, 1:34 AM

#

i get this error when i try to run LLaMA 7B in 4-bit on my NVIDIA card
