So I've been trying to track down this bug for the past day. For some reason, when I create a new swipe or regenerate a message in SillyTavern, it generates basically the same thing, maybe changing a word or a sentence. Changing pretty much every parameter on SillyTavern's side doesn't do anything. I then launched Postman and sent an API request to the completions endpoint and got a response; I changed the seed and got basically the same text back. However, using the OobaBooga webui with the same character from SillyTavern, I can regenerate and get a new generation each time. If I copy those settings from the OB webui to SillyTavern, it still doesn't fix the issue in SillyTavern. Does anyone know what I'm missing?
#Using API (SillyTavern) with identical settings to OobaBooga webUI creates identical gens in API
What loader are you using? Could be related:
https://github.com/oobabooga/text-generation-webui/issues/5451
Yeah I am using llamacpp
Although it looks like when I use llamacpp_HF, it makes the character talk like it barely understands English: "she leaneds closer captureshis gazewith haert stopping intensity"
LOL that's very odd, is it like q2 or something? Is the non HF coherent?
I'm not sure what Q2 is. But yeah, the non-hf one is pretty coherent and makes some pretty unique generations (well at least the first time) and seems to stay on topic.
And you didn't set the temperature super high with the HF loader? What do your settings look like? It may still be an issue with the base llama.cpp wrapper that needs updating.
Nope, initially it was, which made it even worse, but I changed to another preset that makes it look a bit more ok
Assuming that SillyTavern is sending the parameters correctly, which I would assume it is
That seems like a really high min p value, but otherwise it looks like it should work. Dynamic temp is disabled, I assume, based on that empty box?
Can you try using it in text gen directly and see if you can reproduce those results?
Yeah, lowering the min p stabilized it a bit and dynamic temp is disabled.
But I cannot reproduce it in the text-gen chat. And I wasn't able to reproduce the previous error there either; it was only through the API that the issue showed up.
Hmm that makes me think silly tavern is doing something.. silly...
Lol yeah. TBH I've been between the ST discord, the OB discord and the model's discord trying to figure out the cause. But yesterday someone with the same issue in the ST discord sent me this
https://github.com/abetlen/llama-cpp-python/issues/1154
Which kinda shows the issue, but then it's weird because if I call the API for OB directly, I would still get the same issue.
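For anyone wanting to repeat the direct API check without Postman, here is a rough sketch of the same experiment: send one prompt twice with different seeds and measure how similar the two outputs are. The URL (`http://127.0.0.1:5000/v1/completions`), the parameter names, and the response shape are assumptions based on an OpenAI-style completions endpoint, and `build_payload`, `complete`, `similarity`, and `run_check` are just names made up for this example, so adjust them to whatever your setup actually exposes.

```python
import json
import urllib.request
from difflib import SequenceMatcher

# Assumed address of the backend's OpenAI-style completions endpoint.
API_URL = "http://127.0.0.1:5000/v1/completions"


def build_payload(prompt, seed, **overrides):
    """Build a request body; the field names are assumed, not confirmed."""
    payload = {
        "prompt": prompt,
        "max_tokens": 80,
        "temperature": 1.0,
        "seed": seed,
    }
    payload.update(overrides)  # e.g. min_p=0.05
    return payload


def complete(payload, url=API_URL):
    """POST the payload and return the generated text (assumed response shape)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


def similarity(a, b):
    """Ratio in [0, 1]; values near 1.0 mean the two texts are nearly identical."""
    return SequenceMatcher(None, a, b).ratio()


def run_check(prompt, seed_a=1, seed_b=2):
    """Generate twice with different seeds and return the similarity of the outputs."""
    first = complete(build_payload(prompt, seed_a))
    second = complete(build_payload(prompt, seed_b))
    return similarity(first, second)
```

Calling `run_check("Once upon a time")` with the backend running should give a low ratio if the seed is being honored. A ratio close to 1.0 with different seeds would match the behavior described above.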
Can you view your console as you make the request and view what kind of parameters are being passed to the backend?
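Once the parameters from the console are in hand, one way to compare them against the settings that work in the webui is a simple key-by-key diff. This is only a sketch: `diff_params` is a made-up helper, and the values in the usage lines are placeholders, not the actual settings from this thread.

```python
MISSING = "<missing>"  # marker for a key that one side does not send at all


def diff_params(api_params, webui_params):
    """Return {key: (api_value, webui_value)} for every key whose values differ."""
    differences = {}
    for key in sorted(set(api_params) | set(webui_params)):
        api_value = api_params.get(key, MISSING)
        webui_value = webui_params.get(key, MISSING)
        if api_value != webui_value:
            differences[key] = (api_value, webui_value)
    return differences


# Placeholder values for illustration only.
from_api = {"temperature": 0.7, "min_p": 0.5, "seed": 1234}
from_webui = {"temperature": 0.7, "min_p": 0.05, "seed": -1}

for name, (api_value, webui_value) in diff_params(from_api, from_webui).items():
    print(f"{name}: API sent {api_value!r}, webui uses {webui_value!r}")
```

Anything that shows up in the diff, especially a seed or sampler setting that one side never sends, is a good first thing to look at.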