#Llama CPP GPU offloading doesn't work for me. (Windows)
output:
llama.cpp: loading model from models\birdup_pygmalion-7b-q5_1-ggml-v5\pygmalion-7b-q5_1-ggml-v5.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 2048
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 9 (mostly Q5_1)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.07 MB
llama_model_load_internal: mem required = 6612.59 MB (+ 1026.00 MB per state)
...................................................................................................
llama_init_from_file: kv self size = 1024.00 MB
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
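The last part of that output is a list of build flags, and it includes `BLAS = 0`. A rough way to check this programmatically is sketched below. The helper name `parse_system_info` is made up for illustration, and reading `BLAS = 0` as "this build has no BLAS/GPU backend compiled in" is an assumption about how llama.cpp reports its build, not something confirmed in this thread.

```python
def parse_system_info(line: str) -> dict[str, int]:
    """Parse a 'NAME = 0 | NAME = 1 |' style flag line into a dict.

    Hypothetical helper: the format is taken from the log output above.
    """
    flags = {}
    for part in line.split("|"):
        if "=" not in part:
            continue  # skip the empty piece after the trailing '|'
        name, _, value = part.partition("=")
        flags[name.strip()] = int(value.strip())
    return flags


# Flag line copied from the output above.
line = (
    "AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | "
    "FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | "
    "WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |"
)

flags = parse_system_info(line)
if flags.get("BLAS") == 0:
    # Assumption: BLAS = 0 suggests a CPU-only build, so offloading has no effect.
    print("BLAS = 0: this build may have been compiled without GPU/BLAS support")
```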
llama-cpp-python version: 0.1.65.
I don't know how to find the version of llama.cpp itself. I used the webui one-click installer.
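For the Python package at least, the installed version can be read from package metadata. This is a minimal sketch using only the standard library. The helper name `installed_version` is made up for illustration, and it has to be run with the same Python environment the webui uses, otherwise it will report a different install or nothing at all. It reports the llama-cpp-python version only, not the version of the llama.cpp code bundled inside it.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of a package, or None if missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None if the package is not in this environment.
print(installed_version("llama-cpp-python"))
```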
In my env I have llama-cpp-python 0.1.63, but I haven't updated in a while. Also, you might want to install VS Build Tools 2019 and 2022 (it should work with 2019 alone, but if it doesn't, get 2022 as well).