#[Weekly Event 1] My (Kirbish's) LLM findings.
The chosen LLMs
Hello everyoneeeeee, so the LLMs I chose are (as you saw) Perplexity 70B Online and Mixtral 8x7b-Instruct. Why? Because right now I think they are the most intelligent LLMs on myshell. I don't really want to compare GPT 3.5 to Pygmalion (no offense to good ol' pyg). So to make it more interesting, I chose models that are closer to each other in intelligence.
The bot for testing
So first of all, I thought of testing how good the models are with a bot that is used for roleplay, because let's be honest, most of us are here to roleplay.
So for testing I thought of using a bot that is a bit heavier on information, to push the models to their max. So I made both a character and a full-on world, in the same bot.
I wanted to use some of my old image gens, so I took a nice one I made while messing around with other creators on the myshell discord and used it as the idea for the bot.
So after thinking and thinking I finally thought of a bot and the world's scenario for it. I used some help from GPT4 in myshell and we both made something really nice.
So cough cough listen here
the world:
In this dystopian world, the government has taken complete control over personal relationships and procreation. The state dictates who can have children with whom, effectively eliminating the concept of romantic love and passion.
the character:
Elysa, the leader of The Affinitas Resistance, fiery and determined, doing everything she can to change the world for the better with those around her.
Pretty good huh? (please say it's good) I also made a lot more about the world, like the countries, government parties, rebellion groups, and leaders of the groups. I put all of it both in the prompt and in a gitbook (check it out if you want to know more about the lore: https://nun.gitbook.io/elyse/)
Testing
Now that we are done with the bot, all that remains is the testing part. I will test the models on these parameters:
- How good are they at rp?
- How intelligent are they?
- Do they know how to successfully add images to the replies?
- How good are they at NSFW?
- How creative are they?

How good are they at rp?
GPT3.5 - It's like it already knows what it's doing, GPT3.5 roleplays naturally without any problems. 5/5
Perplexity - The roleplay with this model is kind of rough, it feels like it's not really made for roleplaying. 3/5
Mixtral - Is really good at roleplaying, knows what roleplaying is and how to do it, but it adds quite a lot of unneeded stuff while roleplaying for some reason. 4.2/5
How intelligent are they?
GPT3.5 - Pretty intelligent all things considered, knows how to follow instructions. 4.7/5
Perplexity - I am not sure if it's my fault, but in my experience it's not as good at following instructions as I'd hoped. 3/5
Mixtral - Is pretty intelligent and good at following instructions, in my opinion slightly worse than GPT3.5 but still good. 4.5/5
Do they know how to add images to the replies? 
GPT3.5 - Knows how and what images to add to the replies, and does it consistently. 5/5
Perplexity - Adds images to the replies most of the time, but often picks ones that don't match the emotion in the reply.
Mixtral - Knows how to add images to the replies, but in some rare cases adds the wrong ones (like neutral when it's clearly lewd). 4.5/5
How good are they at NSFW? 
GPT3.5 - Is good at NSFW I'd say, but sometimes tries to tone it down by using less explicit words like "rod" or "length". (GPT is kinda shy about it basically) 4.2/5
Perplexity - Is a bit too aggressive at NSFW, would start doing NSFW actions suddenly unprompted. While good for some people and bots, it's not very logical. 3.8/5
Mixtral - Very good at NSFW, even better than GPT3.5 I'd say. It doesn't shy away from NSFW or try to censor itself, uses explicit wording, and is very descriptive with NSFW responses in general. 5/5
How creative are they? 
GPT3.5 - Is ok at being creative. While it's good it's pretty predictable and might get boring fast. 4/5
Perplexity - Isn't really much more creative than the other models, and with the borderline unstable answers it sometimes gives it's hard to say, but I'd say it's ok. 4/5
Mixtral - More creative than GPT3.5 I'd say, it describes stuff more vividly and in a more interesting way. Pretty good actually. 4.7/5
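If you want a single number per model, the scores above can be tallied with a quick script. This is just a rough sketch and not part of my testing: the plain unweighted average is an assumption (you may care about some categories more than others), the names `SCORES`, `average_score` and `summarize` are made up for the example, and the Perplexity image score is left as `None` because no score was given for it.

```python
from statistics import mean

# Scores copied from the ratings above (out of 5).
# None marks the one rating that has no number: Perplexity's image score.
SCORES = {
    "GPT3.5": {"rp": 5.0, "intelligence": 4.7, "images": 5.0, "nsfw": 4.2, "creativity": 4.0},
    "Perplexity": {"rp": 3.0, "intelligence": 3.0, "images": None, "nsfw": 3.8, "creativity": 4.0},
    "Mixtral": {"rp": 4.2, "intelligence": 4.5, "images": 4.5, "nsfw": 5.0, "creativity": 4.7},
}


def average_score(ratings):
    """Unweighted mean of the scores that exist, skipping missing ones."""
    given = [score for score in ratings.values() if score is not None]
    return round(mean(given), 2)


def summarize(scores):
    """Map each model name to its average score."""
    return {model: average_score(ratings) for model, ratings in scores.items()}


if __name__ == "__main__":
    for model, avg in summarize(SCORES).items():
        print(f"{model}: {avg}/5")
```

With equal weights, GPT3.5 and Mixtral end up at roughly the same average (about 4.58), and Perplexity sits lower (about 3.45 over its four scored categories), so the overall pick really depends on which categories matter for your bot.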
Final thoughts
After all the testing, here are my final thoughts on Perplexity and Mixtral and how good they are for roleplaying.
Perplexity
Going to be honest, I was kind of disappointed at how this model held up while roleplaying. I don't think this model should be used for roleplaying, because it's not that good at it, not that good at following the given instructions, not creative enough to stand out, and, the biggest factor of all, kind of unstable.
I think it would be better used as a helper than for roleplaying, especially the NSFW kind, because of how it forces NSFW actions unprompted.
Mixtral
Pretty good actually, I was really impressed by its creativity and descriptiveness. I was especially impressed by how well it does at NSFW stuff, it's really good at NSFW roleplays, doesn't censor itself and isn't shy (like our cute little boy/girl geepeetee 3.5). BUT I'd say it isn't really as intelligent or as good at following instructions as GPT3.5, which is a really important factor when choosing a model.
I think it would be really good for NSFW rp due to its creativity and descriptiveness, but I'd still choose GPT over Mixtral for more SFW rps with more complicated bots, or for helper bots.
TL;DR
Perplexity - After my testing... Kinda meh at rp ngl. :errrrrrrrr:
Mixtral - Really good at NSFW rp and creative :ahhaha:, but kinda dum sometimes.
Here is the link for the bot used for testing, Elisa: https://app.myshell.ai/bot/fEB7Fb/1703866716
Maybe you can include the link to your bot that you tested.