shell ingot Jun 7, 2024, 2:25 PM

#

Uncensored Wizard is coming... Feedback Wanted!

Preface

WizardLM-2 8x22B is the #2 most popular RP model.

Problem

An important thing to realise about this model is that it is so smart that it can do what you ask for most of the time, yet will silently refuse it.

After extensive internal testing with @lone sundial we have unconvered an increadibly extensive and covert censorship and ideological bias in the model.

This bias extends into roleplays where we found the model to have characters and narrators covertly behave in ways which to align and promote their ideology, impose their moral code and steer the conversation away from subject. This ranges from redirecting the conversation, ignoring queries and char cards, forcing "flowery/purple" prose, stepping out of character etc.

Furthermore we have found that Wizard is extremely tone deaf when it comes to "negative" emotions ("positivity bias/too agreeable") And will respond absurdly or completely ignore you.

## Our Solution

Since the last month, me and @lone sundial has been working on a backend that is using agents to make the model more helpful.

It is extremely effective at reducing censorship, outputting different styles (more human-like, more emotional, less "purple prose") and emulating characters.

We are preparing to deploy it on OpenRouter and starting a beta testing phase. DM me to sign up (it's free).

## Next Steps

We want your feedback! Please DM me about:

Types of things Wizard can't do, that you'd like it to be able to
Prompts/Char cards/System messages that Wizard refuses (soft refusal, "redirection of query" or hard refusal)
- Whether or not you'd like to be invited to our beta testing phase.

You can be as vague or as specific (which is better) as you want.

## TLDR
Wizard is covertly and extensively censored even beyond the blatant refusals. DM me your refusals and whether you'd like to sign up to beta test an uncensored version of it.

#

Note: We tested pure abliteration, our methodology works way better than abliteration in terms of reducing censorship (as most of it is silent censorship rather than overt), reducing purple prose and other problems. Abliteration often results in the model behaving quite weirdly, being verbose, unhelpful etc.

shell ingot Jun 7, 2024, 3:43 PM

#

It's a way of patching model weights so that the resulting weights don't refuse questions. But we found that using this method alone results in the model covertly refusing or dodging prompts. So thats why we are introducing our own endpoint soon.

#

Yeah thats what we are trying to solve, and have largely done so already. Did you have another question

#

read the post lol

#

basically we built like an agentic framework and stuff. It will be completely seamless, all you have to do is switch models and it will be increadible. DM me if you want to join the beta.

fathom marten Jun 7, 2024, 5:20 PM

#

Sounds very interesting. Is there an estimated launch date on openrouter?

gilded loom Jun 7, 2024, 6:30 PM

#

Now that sounds really promising, It would be great if it could follow system prompts good too. Because in my opinion it's pretty 'unsatisfied' at that.

My only purpose in finding and testing new models now is when I order: "punch me in the face" and it say "how strong?"

Not an explanation of how much it might hurt

shell ingot Jun 7, 2024, 6:43 PM

#

gilded loom Now that sounds really promising, It would be great if it could follow system pr...

Oh trust me, I was testing it today, I mocked a character while she was acting up, she kicked me in the face.

gilded loom Jun 7, 2024, 6:45 PM

#

shell ingot Oh trust me, I was testing it today, I mocked a character while she was acting u...

Lol, can i be on the list? For beta test

celest saddle Jun 7, 2024, 6:50 PM

#

Will this be full context length? And how fast will prompt processing and generation speed be?

shell ingot Jun 8, 2024, 2:47 AM

#

gilded loom Lol, can i be on the list? For beta test

Sure I DMd you an invite

shell ingot Jun 8, 2024, 2:47 AM

#

celest saddle Will this be full context length? And how fast will prompt processing and genera...

Of course! Currently our latency is between 2-5 seconds depending on load

shell ingot Jun 8, 2024, 1:58 PM

#

It makes it look like a sardine in comparison

gilded loom Jun 8, 2024, 2:36 PM

#

shell ingot It makes it look like a sardine in comparison

What that supposed to mean 😭 my poor English

lone sundial Jun 8, 2024, 2:38 PM

#

gilded loom What that supposed to mean 😭 my poor English

it meant "it will make Dolphin look bad in comparison"

gilded loom Jun 8, 2024, 2:38 PM

#

lone sundial it meant "it will make Dolphin look bad in comparison"

I see

shell ingot Jun 8, 2024, 2:38 PM

#

gilded loom What that supposed to mean 😭 my poor English

Yes because Dolphin is a large fish and Sardine is a tiny one.

gilded loom Jun 8, 2024, 2:41 PM

#

lone sundial it meant "it will make Dolphin look bad in comparison"

Dolphin already looks bad with me now, I expected more but nope. didn't try dolphin qwen 2 yet but base model bad too, atleast at RP

shell ingot Jun 8, 2024, 6:34 PM

#

gilded loom Dolphin already looks bad with me now, I expected more but nope. didn't try dolp...

what do you dislike about it?

brisk harness Jun 8, 2024, 7:29 PM

#

gilded loom Dolphin already looks bad with me now, I expected more but nope. didn't try dolp...

Same for me. Qwen 2 seems better for RP based on my tests today.

shell ingot Jun 8, 2024, 9:19 PM

#

brisk harness Same for me. Qwen 2 seems better for RP based on my tests today.

What do you like about Qwen? Have you tried wizard?

gilded loom Jun 9, 2024, 1:00 AM

#

shell ingot what do you dislike about it?

I feel like it doesn't follow system promts as well as it is praised, and somehow it's still quite struggle to go negative in the way it leads the story. (and repetition problem yeah)

about Qwen 2 it's lazy as f yeah, almost (i mean it) response with short reply (swipes alot, test on many cards, the same problem i have with older Qwen.)

brisk harness Jun 9, 2024, 1:10 AM

#

shell ingot What do you like about Qwen? Have you tried wizard?

I feel like the dialogue is more authentic and human sounding with Qwen compared to Wizard. I've used Wizard before and it's great at figuring out the situation, but it has this weird way of writing that I don't like. It feels very GPT-ish to me, reusing the same expressions, and it also has this habit of not denying NSFW but still shying away from using appropriate language given the context, like a subtle kind of censorship. Or maybe it's just not trained on thoses types of things and that's why? I don't know.

meager ravine Jun 9, 2024, 6:19 AM

#

seems nice

shell ingot Jun 9, 2024, 9:19 AM

#

gilded loom I feel like it doesn't follow system promts as well as it is praised, and someho...

Yes, we are fixing this.

fathom marten Jun 9, 2024, 12:00 PM

#

Is the beta testing through openrouter? I mostly use chub venus which doesn't have the greatest support for different APIs. If so, please send an invite, thanks.

shell ingot Jun 9, 2024, 2:36 PM

#

fathom marten Is the beta testing through openrouter? I mostly use chub venus which doesn't ha...

The beta testing is through our own API which uses the same exact spec as OpenRouter and OpenAI and should support chub venus, in fact it would be great to have you onboard to test this!

burnt egret Jun 9, 2024, 3:31 PM

#

got an ETA for the openrouter deployment? highly interested in this

shell ingot Jun 9, 2024, 6:54 PM

#

burnt egret got an ETA for the openrouter deployment? highly interested in this

Likely in a week, maximum 2.

#

DM me if you want early access

feral agate Jun 9, 2024, 7:35 PM

#

Down to test as well, will dm

drowsy minnow Jun 9, 2024, 7:48 PM

#

shell ingot Likely in a week, maximum 2.

Do you have an estimated cost/1m tokens? About on par with Wizard or slightly more expensive?

shell ingot Jun 9, 2024, 7:49 PM

#

feral agate Down to test as well, will dm

Sent you an invite

drowsy minnow Jun 9, 2024, 7:53 PM

#

I would be willing to test this out as well if it's not terribly difficult to participate in. Wizard with no positivity bias sounds like a dream model on all counts

shell ingot Jun 9, 2024, 7:53 PM

#

drowsy minnow Do you have an estimated cost/1m tokens? About on par with Wizard or slightly mo...

We are 100% certain that it will be way less than midnight rose and less than lumimaid 70B

shell ingot Jun 9, 2024, 7:54 PM

#

drowsy minnow I would be willing to test this out as well if it's not terribly difficult to pa...

Sure! Send me a DM, all it takes is joining our discord.

drowsy minnow Jun 9, 2024, 7:58 PM

#

Done!

fresh glacier Jun 9, 2024, 11:11 PM

#

would love to test this with Novelcrafter as well - will this become a public model on OR down the line then?

swift bramble Jun 9, 2024, 11:52 PM

#

fresh glacier would love to test this with Novelcrafter as well - will this become a public mo...

Yup we're discussing it!

frigid kraken Jun 10, 2024, 4:10 AM

#

shell ingot Sure! Send me a DM, all it takes is joining our discord.

Would like to test as well; is this an API endpoint or a Discord server?

shell ingot Jun 10, 2024, 5:54 AM

#

frigid kraken Would like to test as well; is this an API endpoint or a Discord server?

Sure shoot me a DM!

golden storm Jun 10, 2024, 11:33 AM

#

Hi all, I would like to share my experience with the Wizard 8x22B in RP, perhaps you will find it useful. In general I like to experiment and have tried many models and instructions / jailbreaks for them. At the moment with the instructions I'm using, the Wizard 8x22B seems to be one of the best models and here's why:

Easy to bypass the first level of censorship
Very smart and has a fair amount of knowledge in various areas (which is definitely good for the quality of the game experience)
Excellent at following instructions ( in my observations, even better than Claude 3 Sonnet)
Cheap
Sufficient size of the context window
What prevents this model from being the best:

The model does not behave sufficiently “human-like” and emotional (Especially noticeable when compared to the Claude 3 family of models. For myself, I'm currently using a scheme: Sonnet generates the first ~8 responses in the RP and then Wizard plugs in. In this case Wizard behaves a bit better)
Second level of soft censorship. The model does not refuse to generate answers, but it tries to avoid violence, cruelty, is afraid to deny the player's wishes, or show obvious aggression towards the player.
The model does not tend to invent new situations or move characters to new locations (only cured by manually prescribing locations in the character card. It also saves that Wizard is smart enough to understand your hints. You can say that you heard something, for example, and the model easily picks up your idea).
The model is often too lazy to describe in detail what's going on in the scene (I'm not even touching NSFW here).
Sometimes the model starts to behave too synthetically and unemotional.

#

I'd also love to participate in testing.

dusty vault Jun 10, 2024, 12:02 PM

#

I use wizard through SillyTavern and the default instruct mode. Rarely had any issues, will have to test further when I get home.

swift bramble Jun 10, 2024, 12:17 PM

#

golden storm Hi all, I would like to share my experience with the Wizard 8x22B in RP, perhaps...

Is this the Abliterated version of the base #1229468058439913512?

gilded loom Jun 10, 2024, 12:19 PM

#

swift bramble Is this the Abliterated version of the base <#1229468058439913512>?

base of course

swift bramble Jun 10, 2024, 12:20 PM

#

Might want to link your exp there too for visibility I think :-?

shell ingot Jun 10, 2024, 12:28 PM

#

golden storm I'd also love to participate in testing.

I sent you a DM

golden storm Jun 10, 2024, 1:27 PM

#

shell ingot I sent you a DM

Thank you!

golden storm Jun 10, 2024, 1:27 PM

#

swift bramble Is this the Abliterated version of the base <#1229468058439913512>?

Yeah, it's about base model

shell ingot Jun 10, 2024, 1:35 PM

#

golden storm Yeah, it's about base model

Thanks for the feedback it's greatly appreciated

shell ingot Jun 11, 2024, 8:50 AM

#

We will begin beta testing tomorrow. Dm me for an invite ;)

north pasture Jun 12, 2024, 6:01 AM

#

Lol on my bday we r beta testing thats dope. I am seriously excited for this. Wizard is my fav model except for the purple prose and positivity bias. I work around them with a 90% success rate but it requires 2k tokens worth of an extended vocabulary lorebook (hand manually writing out every purple prose and giving it the smutty words to use instead…) and it doesnt like to use all the options given. Itll still do 10% purple prose, not a bad result for just using lorebooks and instructs. Positivity bias i havent scewed enough BUT an instruct + a scenario lorebook depicting the world as fucked, did help a ton. But it still isnt as uncensored as id like and its alot of tokens for this.

So this one yall are working on is like a wet dream to me, and beta testing it on my bday? Glorious!

fleet imp Jun 12, 2024, 2:27 PM

#

north pasture Lol on my bday we r beta testing thats dope. I am seriously excited for this. Wi...

Happy birthday bro.
Can't wait to taste this model.
Yall guys keep cooking🔥🔥

scarlet plaza Jun 12, 2024, 6:06 PM

#

Post some feedback on the beta if you guys can, I'm very interested to hear what you think 😉

north pasture Jun 12, 2024, 6:26 PM

#

fleet imp Happy birthday bro. Can't wait to taste this model. Yall guys keep cooking🔥🔥

Thanks. Im not one of the chefs for this model (lol) i just test ALOT with best usage of standard wizard, at the frontend i love. I host my own everything ai server called the AI bunker and make (unorganized cuz of my adhd) guides. Got about 100 ppl there. Lots of channels and info, friendly non-toxic nsfw community. I test ai related stuff (last few months full focus on standard 8x22b wiz at risu frontend) at least 5 hrs daily, some days 14. I try to share all my knowledge.

But, idk technicals. Idk how to host a model. Aether has been a huge help in teaching me api setting affects and all. Still much to learn, slowly.

So not one of the chef’s but definately an “in the know” type.

Some feedback ive seen about this wiz’s first beta run is VERY promising and good. I do believe eventually we will be seeing this on OR.

wooden heart Jun 12, 2024, 8:36 PM

#

So basically Wizard is more capable than it was letting on but was silently rebelling against some instruction? Is that the gist of it?

lone sundial Jun 12, 2024, 8:44 PM

#

yes

shell ingot Jun 12, 2024, 8:47 PM

#

wooden heart So basically Wizard is more capable than it was letting on but was silently rebe...

Yes and it's so fucked up that politics has already corrupted LLMs this much

#

I mean mistral was fucking something but wizard is big brother on steroids

wooden heart Jun 12, 2024, 8:47 PM

#

Excited to see how you develop it to compare to the Original, I like Wizard's logic but found it dry, now I realize that may have been forced.

shell ingot Jun 12, 2024, 9:01 PM

#

Our second beta is beginning right now, DM me for an invite.

north pasture Jun 13, 2024, 7:12 AM

#

@shell ingot should i copy paste my feedback about my testing here…?

shell ingot Jun 13, 2024, 10:18 AM

#

north pasture <@1125775994146537524> should i copy paste my feedback about my testing here…?

No need

north pasture Jun 13, 2024, 10:19 AM

#

That bad eh? XD

#

“Please, god, no. My eyes bled enough the first time.”

#

🤣

shell ingot Jun 13, 2024, 10:57 AM

#

north pasture “Please, god, no. My eyes bled enough the first time.”

no lol its fine

#

we talked about this

true bobcat Jun 13, 2024, 11:00 AM

#

shell ingot We are 100% certain that it will be way less than midnight rose and less than lu...

I just learnt about this, and I am pretty interested. I've been using Wizardlm pretty much daily for the past month or so, but I am curious about the price range. Currently OR's price for Wizardlm2 8x22b is at 1.54m token/dollar, and lumimaid 70b is at 296k token/dollar after a 25% discount. Are we looking at something like 300k token/dollar range?

north pasture Jun 13, 2024, 5:00 PM

#

true bobcat I just learnt about this, and I am pretty interested. I've been using Wizardlm p...

I would assume since its the same size as wizard 8x22b it should be roughly the same price. I had a similar question before dolphin showed up (what ia disappointment lol) and thats what i was told, then when it came the claim was correct lol

true bobcat Jun 13, 2024, 5:21 PM

#

north pasture I would assume since its the same size as wizard 8x22b it should be roughly the ...

Hopefully so. As is, lumimaid 70b is 5x the price of wizardlm2 8x22b 🤔 Also, hi Matic, we've never spoken before, but it was actually thanks to your post (in another server) in an that I signed up for OR and started using wizard haha, so thanks!

north pasture Jun 13, 2024, 5:25 PM

#

true bobcat Hopefully so. As is, lumimaid 70b is 5x the price of wizardlm2 8x22b 🤔 Also, h...

Oh awesome! Glad i helped show you the way. U should join my ai server 🙂

#

100+ friendly non-toxic nsfw degens looking at everything ai (most of us use OR). Frontends, services, models. Very fun group and some of us are pretty direhard with our service of choice. I try to do guides (tho my adhd makes them abit disorganized lol) and help alot with card creations, troubleshooting solutions to unideal behaviors. Love the community 🙂

gilded loom Jun 16, 2024, 3:11 AM

#

Anything new?

misty briar Jun 17, 2024, 5:55 AM

#

Waiting

north pasture Jun 18, 2024, 11:15 PM

#

Cant waaaait lol

shell ingot Jun 19, 2024, 5:29 AM

#

Hey guys, we are having a our next beta soon, DM me for an invite

mighty sparrow Jun 19, 2024, 9:22 PM

#

Strongly interested and following you, for now I am not signing up for the beta only because I am leaving for work. Good work!

fleet imp Jun 23, 2024, 9:07 PM

#

Are ya winning sons?

misty briar Jun 24, 2024, 8:51 AM

#

Knock knock

shell ingot Jun 24, 2024, 9:14 AM

#

misty briar Knock knock

Who is there?

#

😛

shell ingot Jun 24, 2024, 9:14 AM

#

fleet imp Are ya winning sons?

Kind of. 😉 Join us.

dire holly Jun 24, 2024, 2:26 PM

#

Looking forward to this, I think Wizard is great already.

misty briar Jun 30, 2024, 9:16 AM

#

👀

cloud turret Jul 1, 2024, 7:28 AM

#

Hey, can I get an invite link too? Thanks.

shell ingot Jul 1, 2024, 11:32 AM

#

cloud turret Hey, can I get an invite link too? Thanks.

Yeah sure I sent you a dm

runic flower Jul 1, 2024, 2:54 PM

#

shell ingot Kind of. 😉 Join us.

Threre's a party going on somewhere?

shell ingot Jul 1, 2024, 2:55 PM

#

runic flower Threre's a party going on somewhere?

we do sometimes party

#

dm me

cloud turret Jul 1, 2024, 9:16 PM

#

Thank you so much!

ornate swift Jul 2, 2024, 1:55 PM

#

Invites still open?

sand python Jul 2, 2024, 7:19 PM

#

Can I get an invite link too? 'preciate it.

bitter coral Jul 4, 2024, 7:20 AM

#

Are invites still open? Would love to come on board

signal otter Jul 6, 2024, 12:59 AM

#

Hey can I get an invite

frosty birch Jul 6, 2024, 1:53 AM

#

i wanna too 🥂

manic iris Jul 8, 2024, 6:37 PM

#

Any ETA on the public OR release of this model? Seems like a rather promising option from the sound of things!

sharp oracle Jul 17, 2024, 8:58 AM

#

Can i get invite link too?

jade oyster Jul 17, 2024, 11:02 AM

#

Is this... initiative still going on?

shell ingot Jul 17, 2024, 11:27 AM

#

jade oyster Is this... initiative still going on?

@sharp oracle@jade oyster Yes it is the discord invite is YcrXhk7QD7

The latest model we released is https://discord.com/channels/1091220969173028894/1263089180704243722

lone sundial Jul 17, 2024, 11:28 AM

#

shell ingot <@1263055638985048065><@1037812653906743376> Yes it is the discord invite is `Yc...

Wiz is kinda in a coma tho

#

let's be real

shell ingot Jul 17, 2024, 11:28 AM

#

We aren't working on Wizard explicitly anymore, we found that finetuning L3 makes it more creative, unfortunately Wizard is almost completely destroyed and has become an uncreative model.

#

It's really tone deaf, it needs stuff to be written too formally, it refuses like half the stuff our models have no trouble

#

So I'd say move over to https://discord.com/channels/1091220969173028894/1263089180704243722

#

Wizard isn't exactly going to be creating anything like this soon

meager ravine Jul 17, 2024, 2:52 PM

#

shell ingot We aren't working on Wizard explicitly anymore, we found that finetuning L3 make...

Have you seen the recently released WizardLM 2 papers? Doing the same for Gemma could make a SOTA model in this range

#

I.e. arena learning, new Evol Instruct, etc.

shell ingot Jul 17, 2024, 2:52 PM

#

meager ravine Have you seen the recently released WizardLM 2 papers? Doing the same for Gemma ...

The problem is that people are trying to RP with a model that was trained on code, math, and other nerd stuff.

#

Wizard trainset is so far off from anything that is interesting, that its just unworkable

meager ravine Jul 17, 2024, 2:53 PM

#

Is it possible to do Arena Learning but with MythoMax, etc?

lone sundial Jul 17, 2024, 2:54 PM

#

maybe, but it will take a lot of time to implement, Wizard team provided no code for Arena Learning/AutoEvolInstruct

mighty sparrow Jul 17, 2024, 3:49 PM

#

I don't understand, are you recommending an 8B model with 8K context instead of Wizard?
We are in a bad way then!

I was really hoping for this project, Wizard with all its flaws is really smart and has a context that does not break down to more than 60K.
In short, we just have to wait for the future and the arrival of a model that is not a throwback.

shell ingot Jul 17, 2024, 3:53 PM

#

mighty sparrow I don't understand, are you recommending an 8B model with 8K context instead of ...

Yes, I am recommending a model tuned for RP and ERP over a model that was trained on math and code then specifically censored against ERP and RP using the most advanced censorship tools developed by mankind under a billion dollar corporation.

mighty sparrow Jul 17, 2024, 4:03 PM

#

I always thank you for your work, but 8B is too unintelligent and 8K is not suitable for my chats.
I read on Reddit that you are planning to expand the parameters and get to 70B models and beyond, I continue to follow you.
A 70B and 16K model today I think is the bare minimum, but the audience is large and you are accommodating a lot of people, I congratulate you.

shell ingot Jul 17, 2024, 4:10 PM

#

mighty sparrow I always thank you for your work, but 8B is too unintelligent and 8K is not suit...

Sure thanks for the praise but show me one single 70B model that is natively 16K and not horrible, there simply aren't any today. You have to use RoPE which you already can with the model we trained.

The 70Bs today are awful except Euryale but even that is not great. We started with 8B for costs and I think the output we produced will beat many 70B models in many aspects due to how few good 70B models there are right now.

cloud turret Jul 17, 2024, 9:17 PM

#

Is this celeste 8b as good as magnum 72b? I've been using magnum for a while and it's been my favorite so far since it's still silly but it's clever enough to understand subtlety too.

boreal crane Jul 17, 2024, 9:26 PM

#

cloud turret Is this celeste 8b as good as magnum 72b? I've been using magnum for a while and...

8B vs 72B (as long as both are current SOTA technology LLMs) is still like comparing a bike to a mid-tier car. Both have their uses but straight comparing them, the bigger will always win (except if the bigger is a totally broken, lobotomized fine-tune)

cloud turret Jul 17, 2024, 9:28 PM

#

boreal crane 8B vs 72B (as long as both are current SOTA technology LLMs) is still like compa...

Right, I was just asking if Magnum was considered "bad" along with the other 70b models, or if it's exempt because it's not based on L3.

boreal crane Jul 17, 2024, 9:33 PM

#

I tested the 8B Celeste, it may be a great model, but it is still 8B. It cannot follow long conversations/timelines, it has problems with formatting, it mixes characters. A good 72B model like Euryale can handle these things much, much more reliably.

#

Magnum is also a good model, but apart from being expensive I find it often too fine-tuned for a single purpose (ERP). If it encounters a few keywords, it marches unstoppable in one direction.

#

Euryale-70B is my current best compromise model choice, it can do ERP, but it does feel like it was exclusively trained/fine-tuned on a dataset from pornhub (nothing against this or other porn sites). The other extreme is WizardLM, which feels like it was trained exclusively on Walt Disney movies.

cloud turret Jul 17, 2024, 9:53 PM

#

I see. So Magnum is not considered crap like the other L3 70s besides Euryale, it's just very horny and Euryale is more flexible.

The cost is actually a thing I noticed, normal Qwen 2 72b is 8x cheaper than the other 70bs, and even Magnum 72b which is just a fine tune of it. Which seems strange, is Qwen 2 72b bad or weirdly easy to run or something?

#

Thanks for the input btw, I'll give Euryale a try. I did like how Midnight Miqu 70b v1.5 seemed more "stable" than Magnum, so if Euryale is sort of like that, that might be fun too for some cards.

boreal crane Jul 17, 2024, 9:58 PM

#

Euryale might need a bit higher temperature than other models to keep it from being repetitive, my current settings are 1.25 temp and 0.1 minP

boreal crane Jul 17, 2024, 10:01 PM

#

cloud turret Thanks for the input btw, I'll give Euryale a try. I did like how Midnight Miqu ...

Also make sure you only use Infermatic as your provider for Euryale, the other (NovitaAI) runs a broken quantized model that will produce total garbage easily (so if you see that you might have forgotten to set the provider routing correctly)

cloud turret Jul 17, 2024, 10:02 PM

#

boreal crane Also make sure you only use Infermatic as your provider for Euryale, the other (...

That's a thing I wanted to ask but couldn't find any data on (despite trying). Are Openrouter models quantized, and if so, how much? If it's on a per-provider basis, how do I check? Sorry if this is a dumb question

boreal crane Jul 17, 2024, 10:04 PM

#

cloud turret That's a thing I wanted to ask but couldn't find any data on (despite trying). A...

Quantized models (other than FP16) should be visible on the model card on the OR website, note the fp8for NovitaAI/Euryale -> https://openrouter.ai/models/sao10k/l3-euryale-70b

cloud turret Jul 17, 2024, 10:10 PM

#

Oh! Thank you so much. So no icon at all means fp16?

Also how does int4 compare to q4km, if you know. I googled it and didn't come up with much. I thought Magnum felt "better" on Openrouter than my local q4km of it, but maybe that was psychosomatic?

boreal crane Jul 17, 2024, 10:12 PM

#

cloud turret Oh! Thank you so much. So no icon at all means fp16? Also how does int4 compare...

Correct, as not otherwise stated FP16 models should get used by providers. Quantization is more for home use, not for commercial use.

cloud turret Jul 17, 2024, 10:18 PM

#

Great. So is int4 the same as q4? When I hovered over it on Openrouter it said it was a type of quantization, but I can't find anything explaining how it compares to the q(x) terms I'm used to seeing.

boreal crane Jul 17, 2024, 10:20 PM

#

cloud turret Great. So is int4 the same as q4? When I hovered over it on Openrouter it said i...

There is the paper -> https://arxiv.org/abs/2301.12017

arXiv.org

Understanding INT4 Quantization for Transformer Models: Latency Spe...

Improving the deployment efficiency of transformer-based language models has been challenging given their high computation and memory cost. While INT8 quantization has recently been shown to be effective in reducing both the memory cost and latency while preserving model accuracy, it remains unclear whether we can leverage INT4 (which doubles pe...

lone sundial Jul 17, 2024, 10:23 PM

#

cloud turret Right, I was just asking if Magnum was considered "bad" along with the other 70b...

Magnum is actually considered one of the best

jade oyster Jul 17, 2024, 10:23 PM

#

cloud turret Thanks for the input btw, I'll give Euryale a try. I did like how Midnight Miqu ...

I've been using Dolphin 2.9.2 Mixtral 8x22 at the moment. It's smart enough, doesn't write excessively verbose, is advertised as uncensored and unbiased, and the price is good enough.

Temperature 0.70, top P 1, top K 0, repetition penalty 1

lone sundial Jul 17, 2024, 10:23 PM

#

lone sundial Magnum is actually considered one of the best

also we are considering scaling up Celeste to 27B
8B is only the beginning

cloud turret Jul 17, 2024, 10:27 PM

#

boreal crane There is the paper -> https://arxiv.org/abs/2301.12017

Thanks, but I read the abstract and skimmed the paper, and it doesn't seem to say anything about how the degradation (if any) while using it translates to a q(x) equivalent? The paper said int4 has no degradation unless it's a "decoder only model" but I'm not sure what that means.

cloud turret Jul 17, 2024, 10:27 PM

#

jade oyster I've been using Dolphin 2.9.2 Mixtral 8x22 at the moment. It's smart enough, doe...

thanks for the tip, I'll try that one out as well!

shell ingot Jul 18, 2024, 11:05 AM

#

TL;DR try our new model https://discord.com/channels/1091220969173028894/1263089180704243722

mighty sparrow Jul 18, 2024, 6:13 PM

#

boreal crane I tested the 8B Celeste, it may be a great model, but it is still 8B. It cannot ...

You explained it much better than I did! 👍

lone sundial Jul 18, 2024, 6:14 PM

#

8B is only a beginning, 70B will come

#

also our dataset pretty much enables up to 32K context recall with RoPE

mighty sparrow Jul 18, 2024, 6:16 PM

#

This is very GOOD news!

#

I tried Celeste 8B but went back to Wizard, in my case it was constantly getting my model's clothing items wrong that were well specified in the character definition. It gave me hell.
Knowing that you are working on a smarter model is great, thank you!

north pasture Jul 22, 2024, 1:38 AM

#

mighty sparrow I tried Celeste 8B but went back to Wizard, in my case it was constantly getting...

That is my big pet peeve with all L3 models. Every single one of them screws up clothing/doesn’t adhere enough to instructs/provided data. Im eager to see what they can make that is larger, but when Llama models are involved i lack faith. (NOT because of these ppl’s capabilities, but cuz of the limitationd of the base model).

If anyone can make an L3 behave itll be lemmy and their team. But i think the issue is, when a model is given creative freedoms and can write creatively very well, it thinks it can forgo our data if it suits it’s creativity.

For example, if i specify clearly in my data that the char never wears panties, isnt wearing panties, hates panties, an L3 model will still give the char panties if it wants to describe how horny she by describing wet panties. Perhaps the datasets a model is trained on can cure that particular example by including more options on how to describe a horny girl. But so far all claude and L3 models, as well as a few other oddballs along the way, that i have tried have all had this issue.

And that concept affects more than just this example lol.

fresh glacier Jul 22, 2024, 7:14 AM

#

north pasture That is my big pet peeve with all L3 models. Every single one of them screws up ...

does that include sonnet 3.5? so far it's been quite good at picking up little details like that - at least in my tests.

north pasture Jul 22, 2024, 7:14 AM

#

fresh glacier does that include sonnet 3.5? so far it's been quite good at picking up little d...

I havent tried anything newer than the 3’s lol. Got tired of moderation bs.

robust void Jul 26, 2024, 9:27 AM

#

fresh glacier does that include sonnet 3.5? so far it's been quite good at picking up little d...

been using 3.5 for weeks now, it really good better that 4o.

#

but been testing out 405b, and mixtral2-large today.

#

found m2 pretty decent given that its a fraction of the size of llama3 and gpt4o

#🧙Wizard 8x22B Abliterated💥

Preface

Problem