#GPT4-x-Alpaca30b does a weird text token thingy idk what to describe it as


plush osprey
#

Using MetaIX/GPT4-X-Alpaca-30B-4bit on a 4090, and this is the conversation

Hello!
Hi there, how can I help you?texttalk-to-me-ai-instructions:> Hello!texttalk-to-me-ai-response:< Hi there, how can I help you?texttalk-to-me-ai-instructions:> Hi there, how can I help you?texttalk-to-me-ai-instructions:> Hello!texttalk-to-me-ai-response:< Hi there, how can I help you?texttalk-to-me-ai-instructions:>

lilac goblet
#

Models went off the rails for me like this when I was using an older version of GPTQ (e.g. the one Oobabooga tells you to use). Try switching to the new GPTQ (CUDA or Triton, your choice) by replacing the GPTQ-for-LLaMA repo in the repositories folder with your desired branch.

Speed will likely suffer, but you weren't getting awesome results here. :X

#

Your GPU won't be implicated either way. It's software.

plush osprey
lilac goblet
plush osprey
#

yeah i found it

#

i git cloned it into it, is that all?

lilac goblet
plush osprey
#

ooh? a new one?

lilac goblet
#

Yea, I'll try it out soon. I'm not sure what makes it faster or if it breaks output too.

plush osprey
#

hmm i'll try

lilac goblet
#

With triton, all you have to do is pip install -r requirements.txt. With cuda, it's still python setup.py install.
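
Roughly, the swap plus install could be sketched like this in Python. This is just an illustration, not an official script: the function name gptq_swap_plan, the folder name under repositories, and the idea of passing the repo URL and branch in as arguments are all assumptions, so fill in whatever repo and branch you actually want. It only builds the commands, it doesn't run anything.

```python
from pathlib import Path


def gptq_swap_plan(repositories_dir, repo_url, branch, backend):
    """Build (but do not run) the commands for swapping in a GPTQ branch.

    Hypothetical helper: repo_url and branch are supplied by the caller,
    and the folder name below is an assumption about the local layout.
    """
    # Assumed folder name inside the repositories folder.
    target = Path(repositories_dir) / "GPTQ-for-LLaMA"

    # Install step differs per backend, as described in the message above.
    if backend == "triton":
        install = ["pip", "install", "-r", "requirements.txt"]
    elif backend == "cuda":
        install = ["python", "setup.py", "install"]
    else:
        raise ValueError(f"unknown backend: {backend!r}")

    return {
        # Move or delete the old folder first; git will not clone into a
        # non-empty directory.
        "clone": ["git", "clone", "--branch", branch, repo_url, str(target)],
        # Run the install command from inside the freshly cloned folder.
        "install": install,
        "install_cwd": target,
    }
```

To actually run it, you'd pass each command to subprocess.run(cmd, check=True), using install_cwd as the cwd for the install step.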

lilac goblet
plush osprey
#

oh, too bad it can't be run on default windows

left halo
#

I don't know too much about 30b. It would be a dream for me to be able to run that. However, when I was looking into what exactly the 30b can do, I saw others were having trouble with it, and someone explained that it was trained on different data than the 13b.