wont load model on latest version | Text Generation WebUI | Page 1

pale meteor Jun 19, 2023, 6:58 PM

#

All I get when trying to start the webui is nothing (as shown in attached img)
can anyone help?

#

https://github.com/oobabooga/text-generation-webui/issues/2749

cedar dust Jun 19, 2023, 7:20 PM

#

Does it load if you use the --loader gptq-for-llama flag?

pale meteor Jun 19, 2023, 7:21 PM

#

cedar dust Does it load if you use the `--loader gptq-for-llama` flag?

where do I pass the flag to?

cedar dust Jun 19, 2023, 7:25 PM

#

Should be a section at the top of webui.py for it. You can also manually select the gptq-for-llama loader in the webui when loading the model.
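
For reference, a rough sketch of what that edit might look like. The variable name `CMD_FLAGS` and the `--chat` default are assumptions about how the one-click installer's webui.py was laid out at the time, so check your own copy. `add_flag` is a hypothetical helper, only here to show the idea.

```python
from typing import Optional

# Assumption: webui.py keeps its launch flags in a single string near the top.
CMD_FLAGS = "--chat"


def add_flag(flags: str, flag: str, value: Optional[str] = None) -> str:
    """Append a flag (and optional value) unless the flag is already present."""
    parts = flags.split()
    if flag in parts:
        return flags
    parts.append(flag)
    if value is not None:
        parts.append(value)
    return " ".join(parts)


# The flag suggested in the thread.
CMD_FLAGS = add_flag(CMD_FLAGS, "--loader", "gptq-for-llama")
print(CMD_FLAGS)
```

In practice you would simply type the flag into the string by hand; the helper only makes the intended result explicit.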

pale meteor Jun 19, 2023, 7:25 PM

#

Here?

cedar dust Jun 19, 2023, 7:25 PM

#

Yes

pale meteor Jun 19, 2023, 7:26 PM

#

cedar dust Jun 19, 2023, 7:29 PM

#

It may be that the model is simply incompatible with the GPTQ-for-LLaMa version that the webui uses. I saw someone else say that they were able to load that model, but I've yet to be able to myself.

#

I'll re-download it and see if I can figure out how they did it.

#

You might also look at this: https://huggingface.co/notstoic/pygmalion-13b-4bit-128g/discussions/2

cedar dust Jun 19, 2023, 7:37 PM

#

pale meteor

I'm able to load that model with both gptq-for-llama and AutoGPTQ. You should try the suggestion mentioned in the link from my previous message.

pale meteor Jun 19, 2023, 7:44 PM

#

not getting anything again

cedar dust Jun 19, 2023, 8:06 PM

#

pale meteor not getting anything again

What about without the --loader gptq-for-llama flag?

pale meteor Jun 19, 2023, 9:12 PM

#

cedar dust What about without the `--loader gptq-for-llama` flag?

Does that with that flag

cedar dust Jun 19, 2023, 9:12 PM

#

pale meteor Does that with that flag

What about without the flag.

#

You can also try with --loader autogptq and if that doesn't work, then I don't know what else to do besides using exllama. Setup instructions for exllama here: #general message
And here: https://github.com/oobabooga/text-generation-webui/blob/main/docs/ExLlama.md#installation

pale meteor Jun 19, 2023, 9:33 PM

#

cedar dust You can also try with `--loader autogptq` and if that doesn't work, then I don't...

how do i disable the autoload for the model?

#

because with the 7B model i get a different error

cedar dust Jun 19, 2023, 9:34 PM

#

--model None
You are supposed to be able to just remove --model, but I think it is bugged.
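
A small sketch of that workaround, assuming the flags live in one space-separated string. `disable_autoload` is a hypothetical helper name; the only thing taken from the thread is that `--model None` stops the automatic load.

```python
def disable_autoload(flags: str) -> str:
    """Return flags with any `--model <name>` pair replaced by `--model None`."""
    kept = []
    skip_next = False
    for part in flags.split():
        if skip_next:
            skip_next = False  # this token was the old model name, drop it
            continue
        if part == "--model":
            skip_next = True
            continue
        kept.append(part)
    return " ".join(kept + ["--model", "None"])
```

For example, `disable_autoload("--chat --model pygmalion-13b-4bit-128g")` gives `--chat --model None`.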

pale meteor Jun 19, 2023, 9:37 PM

#

cedar dust `--model None` You are supposed to be able to just remove `--model`, but I think...

Thats the error i get when loading the 7B model

📎 message.txt

cedar dust Jun 19, 2023, 9:38 PM

#

pale meteor Thats the error i get when loading the 7B model

Probably need to set the correct groupsize for the model. Most are 128 or None.

pale meteor Jun 19, 2023, 9:39 PM

#

cedar dust Probably need to set the correct groupsize for the model. Most are `128` or `Non...

changed it to none and same error

cedar dust Jun 19, 2023, 9:39 PM

#

Can you link the model?

pale meteor Jun 19, 2023, 9:40 PM

#

cedar dust Can you link the model?

https://huggingface.co/TehVenom/Pygmalion-7b-4bit-GPTQ-Safetensors

cedar dust Jun 19, 2023, 9:43 PM

#

Groupsize none should be the correct setting. AutoGPTQ will detect the settings on its own most of the time. GPTQ-for-LLaMa needs wbits set to 4 and groupsize set to None.
If all else fails, exllama may be the only option.
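
Many GPTQ uploads hint at these settings in the folder name (for example `4bit-128g`). A minimal sketch of reading that hint, assuming the uploader followed that naming habit; `guess_gptq_settings` is a hypothetical helper, not part of the webui, and the model card is still the authority.

```python
import re
from typing import Optional, Tuple


def guess_gptq_settings(model_name: str) -> Tuple[Optional[int], Optional[int]]:
    """Guess (wbits, groupsize) from a name such as 'pygmalion-13b-4bit-128g'.

    Returns None for a value that the name does not mention.
    """
    wbits = re.search(r"(?<![a-z0-9])(\d+)-?bit", model_name, re.IGNORECASE)
    group = re.search(r"(?<![a-z0-9])(\d+)g(?![a-z0-9])", model_name, re.IGNORECASE)
    return (
        int(wbits.group(1)) if wbits else None,
        int(group.group(1)) if group else None,
    )
```

For the two models in this thread, the 13B name yields wbits 4 and groupsize 128, while the 7B name yields wbits 4 and no groupsize, which matches the "None" setting above.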

#

I'll download the model and see what settings work for me.

pale meteor Jun 19, 2023, 9:44 PM

#

So AutoGPTQ managed to load it, but it makes gibberish

#

here is the 13B model btw
https://huggingface.co/notstoic/pygmalion-13b-4bit-128g

pale meteor Jun 19, 2023, 9:46 PM

#

pale meteor So AutoGPTQ managed to load it, but it makes gibberish

now its simply not replying

cedar dust Jun 19, 2023, 9:46 PM

#

Interesting. There is a universal tokenizer that you can download that may fix that. Go to the Model tab and enter this to download it: oobabooga/llama-tokenizer
After it is downloaded, reload the model and the webui should load the tokenizer with it.

pale meteor Jun 19, 2023, 9:49 PM

#

cedar dust Interesting. There is a universal tokenizer that you can download that may fix t...

just stuck on this

cedar dust Jun 19, 2023, 9:50 PM

#

pale meteor just stuck on this

Not sure what that's about, but you can download it manually if you have to. https://huggingface.co/oobabooga/llama-tokenizer

#

Just make sure to put the files in models\oobabooga_llama-tokenizer\
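
If the in-app download keeps hanging, something like the sketch below might do the manual download. It assumes the `huggingface_hub` package is installed and that the script runs from the text-generation-webui folder; the function names are made up for illustration.

```python
from pathlib import Path


def local_model_dir(repo_id: str, models_root: str = "models") -> Path:
    """Map 'user/name' to the folder naming used above: models/user_name."""
    return Path(models_root) / repo_id.replace("/", "_")


def download_tokenizer(repo_id: str = "oobabooga/llama-tokenizer") -> Path:
    """Fetch the repo into the models folder (needs network access)."""
    # Imported lazily so the path helper works without the package installed.
    from huggingface_hub import snapshot_download

    target = local_model_dir(repo_id)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

Calling `download_tokenizer()` should leave the files in `models/oobabooga_llama-tokenizer`.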

cedar dust Jun 19, 2023, 9:54 PM

#

pale meteor So AutoGPTQ managed to load it, but it makes gibberish

Just to check, make sure that 7B model only has one model file in its folder. The one you want is Pygmalion-7B-GPTQ-4bit.act-order.safetensors
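
A quick way to check this, as a sketch. The list of extensions is an assumption about what counts as a weight file; adjust it if your model uses something else.

```python
from pathlib import Path
from typing import List

# Assumption: these are the extensions a loader might mistake for model weights.
WEIGHT_SUFFIXES = {".safetensors", ".pt", ".bin"}


def weight_files(model_dir: str) -> List[str]:
    """Return the names of weight-like files directly inside model_dir, sorted."""
    return sorted(
        p.name
        for p in Path(model_dir).iterdir()
        if p.is_file() and p.suffix.lower() in WEIGHT_SUFFIXES
    )
```

If the returned list has more than one entry, move the extra files out of the folder before loading.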

pale meteor Jun 19, 2023, 9:55 PM

#

cedar dust Not sure what that's about, but you can download it manually if you have to. htt...

cedar dust Jun 19, 2023, 9:56 PM

#

pale meteor

That should be good.

pale meteor Jun 19, 2023, 9:56 PM

#

looks like the 7B model works now

cedar dust Jun 19, 2023, 9:57 PM

#

That's good. Not sure why the 13B isn't working.

#

You might try enabling the auto-devices option when loading it with AutoGPTQ.

pale meteor Jun 19, 2023, 9:57 PM

#

Now gptq-for-llama works with the 7B model

#

let me try load the 13B

#

13B still doesnt work

#

it gives the same memory error

cedar dust Jun 19, 2023, 10:00 PM

#

How much memory do you have?

#

VRAM

pale meteor Jun 19, 2023, 10:00 PM

#

24gb vram

cedar dust Jun 19, 2023, 10:01 PM

#

It should definitely not be giving a memory error. You might have to set up exllama. It can load just about any GPTQ model and is the fastest way to run them.

pale meteor Jun 19, 2023, 10:01 PM

#

guide on how to install?

cedar dust Jun 19, 2023, 10:03 PM

#

pale meteor guide on how to install?

Get the compiler listed here: #general message
Then check if exllama is in \text-generation-webui\repositories. If it isn't, then run cmd_windows.bat and enter this command:

git clone https://github.com/turboderp/exllama .\text-generation-webui\repositories\exllama

#

Another link for the compiler: https://aka.ms/vs/17/release/vs_BuildTools.exe

#

Make sure to select the C++ option when installing.

#

Once that is done, you should be able to select the exllama option to load the model in the webui.
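
The "check the repositories folder, clone if missing" step can be sketched like this. `clone_command` is a hypothetical helper; it only builds the git command from the message above and leaves running it (inside cmd_windows.bat) to you.

```python
from pathlib import Path
from typing import List, Optional

EXLLAMA_URL = "https://github.com/turboderp/exllama"


def clone_command(install_root: str) -> Optional[List[str]]:
    """Return the git command to run, or None if exllama is already present."""
    target = Path(install_root) / "text-generation-webui" / "repositories" / "exllama"
    if target.is_dir():
        return None
    return ["git", "clone", EXLLAMA_URL, str(target)]
```

Here `install_root` is the folder that contains `text-generation-webui`, which is an assumption about the one-click installer layout.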

pale meteor Jun 19, 2023, 10:06 PM

#

cedar dust Make sure to select the C++ option when installing.

this one?

cedar dust Jun 19, 2023, 10:06 PM

#

yes

pale meteor Jun 19, 2023, 10:07 PM

#

the full 8GB worth? dont have all that much disk space

cedar dust Jun 19, 2023, 10:07 PM

#

Use the other link I gave then: https://aka.ms/vs/17/release/vs_BuildTools.exe

#

It is 1gb

pale meteor Jun 19, 2023, 10:07 PM

#

thats the one im using

cedar dust Jun 19, 2023, 10:08 PM

#

Damn that sucks. Microsoft doesn't care about disk space.

#

The exllama devs think it's cool to have to compile their software.

pale meteor Jun 19, 2023, 10:09 PM

#

cedar dust The exllama devs think it's cool to have to compile their software.

is there any precompiled?

cedar dust Jun 19, 2023, 10:09 PM

#

Nope. It is designed to compile when it is used. Kinda dumb that they made it that way.

pale meteor Jun 19, 2023, 10:09 PM

#

theres everything to be installed with the link

cedar dust Jun 19, 2023, 10:10 PM

#

MSVC and the Windows 11 SDK are really all you need, but I don't know enough about the other stuff to know what is safe to disable.

pale meteor Jun 19, 2023, 10:11 PM

#

cedar dust MSVC and the Windows 11 SDK are really all you need, but I don't know enough abo...

any other way to install?

cedar dust Jun 19, 2023, 10:12 PM

#

Nope. Microsoft doesn't provide any other way of installing it. Build Tools is the smallest install they offer.

#

Eventually I'll try to redesign the exllama code to make a pre-compiled version. But that will take a while as I've never done something like that before.

pale meteor Jun 19, 2023, 10:14 PM

#

cedar dust Eventually I'll try to redesign the exllama code to make a pre-compiled version....

that would be super useful

cedar dust Jun 19, 2023, 10:17 PM

#

Took a cursory glance at the code, and it doesn't seem too difficult to do.

pale meteor Jun 19, 2023, 10:24 PM

#

cedar dust MSVC and the Windows 11 SDK are really all you need, but I don't know enough abo...

okay have the MS stuff installed, what next?

#

seems like its working but its only using ~45% power of the gpu

cedar dust Jun 19, 2023, 10:31 PM

#

pale meteor seems like its working but its only using ~45% power of the gpu

I would assume that it is CPU bottlenecked, but I don't know much about how exllama works. It's pretty new.

pale meteor Jun 19, 2023, 10:32 PM

#

cedar dust I would assume that it is CPU bottlenecked, but I don't know much about how exll...

cpu is running at about ~30-40%

#

normal ram usage is high though

cedar dust Jun 19, 2023, 10:48 PM

#

Well... I managed to compile exllama as an independent module. Just have to get the webui to load it.

pale meteor Jun 19, 2023, 10:52 PM

#

alright

#

Does your ram get hit this hard when running exllama?

cedar dust Jun 19, 2023, 10:58 PM

#

Yeah it's pretty bad on RAM usage.

pale meteor Jun 19, 2023, 10:58 PM

#

cedar dust Yeah it's pretty bad on RAM usage.

anyway to tone it down a bit?

cedar dust Jun 19, 2023, 11:00 PM

#

No clue. It might be a bug in how the webui uses exllama. Don't know enough about it to know for sure.

pale meteor Jun 19, 2023, 11:02 PM

#

well i have to head to sleep
i'll try other models to see what works

pale meteor Jun 20, 2023, 9:17 PM

#

@cedar dust Do you run TavernAI?

cedar dust Jun 20, 2023, 9:18 PM

#

pale meteor @cedar dust Do you run TavernAI?

I use SillyTavern

pale meteor Jun 20, 2023, 9:18 PM

#

ah alr

pale meteor Jun 20, 2023, 9:52 PM

#

cedar dust I use SillyTavern

do you know the difference between TavernAI and SillyTavern?

cedar dust Jun 20, 2023, 9:53 PM

#

More features and direct support for text-generation-webui.

#

Has some cool extensions, not that I've ever used them.

pale meteor Jun 20, 2023, 9:54 PM

#

cedar dust More features and direct support for text-generation-webui.

do you know if i can export chats from tavernai to sillytavern?

cedar dust Jun 20, 2023, 9:56 PM

#

Not sure. Don't see any mention of it in the docs.

cedar dust Jun 20, 2023, 10:00 PM

#

pale meteor do you know if i can export chats from tavernai to sillytavern?

There is a way to import a chat. Don't know if it supports TavernAI's chats or not.
It wouldn't surprise me though, given that SillyTavern started as a fork of TavernAI.

pale meteor Jun 20, 2023, 10:00 PM

#

cedar dust There is a way to import a chat. Don't know if it supports TavernAI's chats or n...

how would i import chats?

cedar dust Jun 20, 2023, 10:07 PM

#

pale meteor how would i import chats?

Select a character, then press the three-bar button in the bottom left. Press View past chats and you should see a button in the top right of the pop-up to import.

pale meteor Jun 20, 2023, 10:10 PM

#

cedar dust I use SillyTavern

im not using a proxy or anything like that

cedar dust Jun 20, 2023, 10:13 PM

#

Probably just have to wait a while and try again. npmjs servers probably just having issues.

pale meteor Jun 20, 2023, 10:20 PM

#

cedar dust Probably just have to wait a while and try again. npmjs servers probably just ha...

any other ideas on how to fix?

cedar dust Jun 20, 2023, 10:21 PM

#

pale meteor any other ideas on how to fix?

VPN maybe? Not really sure what to do.

cedar dust Jun 20, 2023, 10:22 PM

#

pale meteor any other ideas on how to fix?

You can try editing the Start.bat script and change the second line to @rem call npm install
Since this is your first time running, it may just fail due to missing packages.

#

You can also try running this command in cmd: npm config delete proxy
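
A sketch of the Start.bat change, working on the file's text rather than the file itself. It assumes the line reads exactly `call npm install`, as described above; `disable_npm_install` is a made-up name. As noted, skipping the install on a first run may just fail because the packages are missing.

```python
def disable_npm_install(bat_text: str) -> str:
    """Prefix '@rem ' to any line that is exactly 'call npm install'."""
    patched = [
        "@rem " + line if line.strip().lower() == "call npm install" else line
        for line in bat_text.splitlines()
    ]
    return "\n".join(patched)
```

To undo it, remove the `@rem ` prefix from that line again.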

pale meteor Jun 20, 2023, 10:36 PM

#

cedar dust You can try editing the `Start.bat` script and change the second line to `@rem c...

Getting this now

cedar dust Jun 20, 2023, 10:40 PM

#

pale meteor Getting this now

Just remove the @rem and try again later if the npm config delete proxy doesn't fix it. Not much else to do since it is refusing to connect to install the requirements.

pale meteor Jun 20, 2023, 10:41 PM

#

cedar dust Just remove the `@rem ` and try again later if the `npm config delete proxy` doe...

are you able to share the node modules with me?
like the folder

cedar dust Jun 20, 2023, 10:47 PM

#

pale meteor are you able to share the node modules with me? like the folder

📎 node_modules.7z

pale meteor Jun 20, 2023, 10:48 PM

#

cedar dust

thx

pale meteor Jun 20, 2023, 11:19 PM

#

@cedar dust is there any way to continue the convo in sillytavern, as in letting the AI continue talking?

cedar dust Jun 20, 2023, 11:20 PM

#

pale meteor @cedar dust is there any way to continue the convo in sillytavern, as in...

It sometimes doesn't work very well, but you can press the generate button with the input box blank.

#

It also has a multi-gen mode in the settings that is intended to allow for longer responses. Live text streaming doesn't work with it though.

ashen forum Jun 21, 2023, 2:49 PM

#

Does anyone have any idea what the hell this is all about? Last week the model was working. Today I updated the webui and...
bin J:\oobabooga_windows\installer_files\env\lib\site-packages\bitsandbytes\libbitsandbytes_cuda117.dll
2023-06-21 16:47:54 INFO:Loading settings from J:\oobabooga_windows\mysettings.yaml...
2023-06-21 16:47:54 INFO:Loading TheBloke_chronos-wizardlm-uc-scot-st-13B-GPTQ...
Traceback (most recent call last):
  File "J:\oobabooga_windows\text-generation-webui\server.py", line 1007, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "J:\oobabooga_windows\text-generation-webui\modules\models.py", line 65, in load_model
    output = load_func_map[loader](model_name)
  File "J:\oobabooga_windows\text-generation-webui\modules\models.py", line 197, in huggingface_loader
    model = LoaderClass.from_pretrained(checkpoint, **params)
  File "J:\oobabooga_windows\installer_files\env\lib\site-packages\transformers\models\auto\auto_factory.py", line 484, in from_pretrained
    return model_class.from_pretrained(
  File "J:\oobabooga_windows\installer_files\env\lib\site-packages\transformers\modeling_utils.py", line 2449, in from_pretrained
    raise EnvironmentError(
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory I:\Textmodels\TheBloke_chronos-wizardlm-uc-scot-st-13B-GPTQ.
Press any key to continue . . .

#

So I cannot load the webui.

#

OK, it was a compatibility problem. You need to delete or rename config-user.yaml:
https://github.com/oobabooga/text-generation-webui/issues/2795#issuecomment-1600999075
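
Renaming is the safer of the two options, since the old settings stay around as a backup. A minimal sketch; the path is passed in because where config-user.yaml lives depends on your install, and `set_aside_user_config` is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def set_aside_user_config(config_path: str) -> Optional[Path]:
    """Rename config-user.yaml to config-user.yaml.bak.

    Returns the backup path, or None if the file does not exist.
    """
    src = Path(config_path)
    if not src.is_file():
        return None
    dst = src.with_name(src.name + ".bak")
    src.replace(dst)  # overwrites an older backup if one exists
    return dst
```

The webui should then start without the stale per-model settings.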

pale meteor Jun 21, 2023, 3:54 PM

#

@cedar dust keep getting this now

#

ig it times out or something

pale meteor Jun 21, 2023, 4:11 PM

#

calm wave Jun 21, 2023, 4:18 PM

#

pale meteor

@pale meteor

#

can you help me rq ?

pale meteor Jun 21, 2023, 4:18 PM

#

calm wave can you help me rq ?

sure

calm wave Jun 21, 2023, 4:18 PM

#

pale meteor sure

kk wait a little

#

#

basically

#

i can load non 4b-128g models

#

but when i try to load one

#

#

it get stuck on that

#

Or

#

it send me a weird error

#

#

here's what happen

#

basically it don't load the model

#

(and i tried to update peft)

#

so if you can please help me lmao

calm wave Jun 21, 2023, 4:20 PM

#

calm wave it send me a weird error

#

this is the error

#

it would send

#

before

#

but i managed to fix it i think

pale meteor Jun 21, 2023, 4:20 PM

#

cedar dust Get the compiler listed here: https://discord.com/channels/1089972953506123937/1...

start up here to install exllama
worked for me

calm wave Jun 21, 2023, 4:20 PM

#

pale meteor start up here to install exllama worked for me

I did

#

but uhh...

#

nothing changed

#

it only got worse

calm wave Jun 21, 2023, 4:21 PM

#

calm wave

this started when i installed exllama

pale meteor Jun 21, 2023, 4:21 PM

#

reinstall without exllama?

calm wave Jun 21, 2023, 4:21 PM

#

i tried

#

and it just don't load

#

the model

pale meteor Jun 21, 2023, 4:21 PM

#

can i get the full error?

calm wave Jun 21, 2023, 4:22 PM

#

pale meteor can i get the full error?

it don't happen anymore but basically

#

you have it in the screenshot

#

it's the full thing

#

so idk what i did wrong

#

but this happened after the update.

pale meteor Jun 21, 2023, 4:24 PM

#

whats the link to download the model?

calm wave Jun 21, 2023, 4:24 PM

#

https://huggingface.co/mayaeary/pygmalion-6b-4bit-128g

#

and that's what happen now

#

Traceback (most recent call last):
  File "B:\webui\text-generation-webui\server.py", line 62, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
  File "B:\webui\text-generation-webui\modules\models.py", line 65, in load_model
    output = load_func_map[loader](model_name)
  File "B:\webui\text-generation-webui\modules\models.py", line 271, in AutoGPTQ_loader
    return modules.AutoGPTQ_loader.load_quantized(model_name)
  File "B:\webui\text-generation-webui\modules\AutoGPTQ_loader.py", line 55, in load_quantized
    model = AutoGPTQForCausalLM.from_quantized(path_to_model, **params)
  File "B:\webui\installer_files\env\lib\site-packages\auto_gptq\modeling\auto.py", line 82, in from_quantized
    return quant_func(
  File "B:\webui\installer_files\env\lib\site-packages\auto_gptq\modeling\_base.py", line 773, in from_quantized
    accelerate.utils.modeling.load_checkpoint_in_model(
  File "B:\webui\installer_files\env\lib\site-packages\accelerate\utils\modeling.py", line 1094, in load_checkpoint_in_model
    checkpoint = load_state_dict(checkpoint_file, device_map=device_map)
  File "B:\webui\installer_files\env\lib\site-packages\accelerate\utils\modeling.py", line 946, in load_state_dict
    return safe_load_file(checkpoint_file, device=list(device_map.values())[0])
  File "B:\webui\installer_files\env\lib\site-packages\safetensors\torch.py", line 261, in load_file
    result[k] = f.get_tensor(k)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 6.00 GiB total capacity; 394.10 MiB already allocated; 4.60 GiB free; 396.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
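
To read the numbers in that last line more easily, here is a small sketch that pulls them out of the message and converts them to MiB. `parse_cuda_oom` is a hypothetical helper written for this message format only.

```python
import re
from typing import Dict

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def parse_cuda_oom(message: str) -> Dict[str, float]:
    """Extract the memory figures (in MiB) from a PyTorch CUDA OOM message."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    figures = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            figures[key] = float(match.group(1)) * UNIT_TO_MIB[match.group(2)]
    return figures


msg = (
    "CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 6.00 GiB total capacity; "
    "394.10 MiB already allocated; 4.60 GiB free; 396.00 MiB reserved in total by PyTorch)"
)
info = parse_cuda_oom(msg)
print(info)
```

In this message, reserved and allocated memory are almost equal, so the `max_split_size_mb` fragmentation hint probably does not apply here. The failure also happens with only about 394 MiB allocated, which is far below the 6 GiB card's capacity.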

#

idfk why

pale meteor Jun 21, 2023, 4:24 PM

#

whats your gpu vram?

calm wave Jun 21, 2023, 4:25 PM

#

6

#

but before i could run it

#

easily

#

with it only using 4.5

#

so yeah, i can def run it

pale meteor Jun 21, 2023, 4:26 PM

#

maybe too many apps running, try closing some
and try enabling auto-devices in the webui.py file

calm wave Jun 21, 2023, 4:26 PM

#

Traceback (most recent call last):
  File "B:\webui\text-generation-webui\server.py", line 62, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
  File "B:\webui\text-generation-webui\modules\models.py", line 65, in load_model
    output = load_func_map[loader](model_name)
  File "B:\webui\text-generation-webui\modules\models.py", line 271, in AutoGPTQ_loader
    return modules.AutoGPTQ_loader.load_quantized(model_name)
  File "B:\webui\text-generation-webui\modules\AutoGPTQ_loader.py", line 55, in load_quantized
    model = AutoGPTQForCausalLM.from_quantized(path_to_model, **params)
  File "B:\webui\installer_files\env\lib\site-packages\auto_gptq\modeling\auto.py", line 82, in from_quantized
    return quant_func(
  File "B:\webui\installer_files\env\lib\site-packages\auto_gptq\modeling\_base.py", line 773, in from_quantized
    accelerate.utils.modeling.load_checkpoint_in_model(
  File "B:\webui\installer_files\env\lib\site-packages\accelerate\utils\modeling.py", line 1094, in load_checkpoint_in_model
    checkpoint = load_state_dict(checkpoint_file, device_map=device_map)
  File "B:\webui\installer_files\env\lib\site-packages\accelerate\utils\modeling.py", line 946, in load_state_dict
    return safe_load_file(checkpoint_file, device=list(device_map.values())[0])
  File "B:\webui\installer_files\env\lib\site-packages\safetensors\torch.py", line 261, in load_file
    result[k] = f.get_tensor(k)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 6.00 GiB total capacity; 394.10 MiB already allocated; 4.60 GiB free; 396.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

#

now another one

#

hehe

#

...

calm wave Jun 21, 2023, 4:27 PM

#

pale meteor maybe too many apps running, try closing some and try enable auto-devices in the...

well i tried

#

but ig i can try again

#

since i managed to literally make it load some models

#

(before it just wouldn't load anything)

#

but did this start happening to other ppl too after the update?

#

coz it would work normally for me until that

pale meteor Jun 21, 2023, 4:28 PM

#

mine broke after the update on my 12gb gpu

calm wave Jun 21, 2023, 4:28 PM

#

pale meteor mine broke after the update on my 12gb gpu

How tf...

pale meteor Jun 21, 2023, 4:28 PM

#

I got my P40 bc of it lol

calm wave Jun 21, 2023, 4:28 PM

#

12gb could run a 13b-4bit-128g easily

#

and the model would only use like 8-9gb vram/13gb

#

i've literally seen a guy running a 30B model

#

on a 16gb gpu

#

or something like that

#

like i can run normal models

#

but 4bit-128g ones can't
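
Rough arithmetic behind those VRAM figures, as a sketch. It counts only the quantized weights (parameters times bits per weight) and ignores everything else a loader needs, such as the context cache, activations, and the small per-group overhead of a 128 groupsize, so real usage will be higher.

```python
def weight_gib(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the quantized weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return round(total_bytes / 2**30, 2)


for size in (6, 13, 30):
    print(f"{size}B at 4-bit: about {weight_gib(size, 4)} GiB of weights")
```

That gives roughly 2.79 GiB for 6B, 6.05 GiB for 13B, and 13.97 GiB for 30B, which is consistent with a 13B 4-bit model landing around 8-9 GB in use and a 30B one being a tight fit on 16 GB.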
