#Cypher Alpha
i wish we had more specialized models, or easier ways to fine tune models
Cypher hard rejects questions about toxic relationships with "This model is unable to comment on complex subjects like human relationships"
so it's coding only? has anyone tried it for coding?
I’ve tried it on coding and it seems ok. Not the best but works fine in Cline and Roo Code
complex numbers are beyond the scope of this model
this model is free, u can try it.
never seen something bomb my personal benchmark as bad as this thing did lol
gpt-3.5-turbo did better
lol, yeah, sounds about right. Did your opinion on 2.5 change over the past few updates btw?
How long did other models take to be revealed?
Lot longer than this one xD
Damn. I just woke up. Time to pause my work and test this new model!
Very interesting
Compared to deepseek v3 0324 free which can use tools, this one is really bad. When using with kilo code agents, this model can't decide what to do while deepseek can easily select the right agent or tools that it needs to get the job done
It behaves like llama 4 model lol
whats lower than mid
lower-mid
We need more deepseek free models to use tools. That would be really awesome to experiment on.
I can't find anything on the openrouter chat about the api response
Of course.
The numbers in question, 9.11 and 9.9, are both decimal numbers, which means they have a whole number part and a fractional part separated by a decimal point.
When comparing two decimal numbers, we start by looking at the whole number part. If the whole number parts are different, then the number with the larger whole number part is the larger number.
However, if the whole number parts are the same, we move on to compare the fractional parts. We start by comparing the digits in the tenths place. The number with the larger digit in the tenths place is the larger number.
In this case, when we compare the tenths place of the two numbers (1 in 9.11 and 9 in 9.9), we can see that 9.11 has a smaller digit in the tenths place than 9.9
However, when we move on to the hundredths place, we see that 9.11 has a 1 while 9.9 has a 0. Therefore, 9.11 is the larger number because it has a larger digit in the hundredths place.
I hope this explanation clarifies it for you! Let me know if there's anything else I can do for you.
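For reference, the model's answer above is wrong: the tenths digit decides the comparison, so 9.9 is larger than 9.11. A minimal sketch to check it, assuming the numbers are passed as plain decimal strings (the helper names `larger` and `tenths_digit` are made up for illustration):

```python
from decimal import Decimal


def larger(a: str, b: str) -> str:
    """Return whichever decimal string denotes the larger number."""
    return a if Decimal(a) > Decimal(b) else b


def tenths_digit(x: str) -> int:
    """First digit after the decimal point (0 if there is none)."""
    _, _, frac = x.partition(".")
    return int(frac[0]) if frac else 0


# Whole parts are equal (9 and 9), so the tenths digits decide: 1 vs 9.
print(tenths_digit("9.11"), tenths_digit("9.9"))
print(larger("9.11", "9.9"))
```

The hundredths place only matters when the tenths digits tie, which is not the case here.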
LMAO
lanugage model
lanague
maybe this model is just @woven jungle replying manually to each query .
💢 💢
it got dementia or something
Anyway
first model with dementia. true innovation
amazon titan is dead on arrival
unless its like a 1.7b model
ngl i think even qwen3 0.6b is somehow better
hoping for three parameters 🤞
it feels like it doesn't have any knowledge about OpenAI at all
WHAT
finally
tbh it feels like GPT-2 finetuned with RLHF
Maybe it's a very early-stage Amazon model. They are collecting data to better train the next version.
garbage data
They tested like 15 different versions on lmarena until this. All of them were crap 😂
Maybe its something else, llama 4.1? maybe their 405B model?
it hallucinates amazon titan
It's probably a lana del rey fan
tgg
... if u dont have anything useful to say dont comment.
It’s Amazon at this point almost certainly tbh
I feel like Nova is smarter than this.
It's a crypto drop raid. They think commenting locks them in as potential raffle entries for the future
There's tons of them
me too, i don't think amazon models were THAT bad
Fwiw this model isn’t bad at reasoning compared to other base models in the “mini” size range
Actually quite good afaict
It’s just got awful knowledge
And a weird personality
I thought this model didn't have reasoning
Not reasoning-model
I mean just like normal “reasoning”
Ah
Thinking through a problem which requires reasoning, that is
Funnily enough if you prompt anything after that system prompt like instructions, it will accept that it is to behave as if it's made by Cypher Labs. Only when it's just that prompt it says no it's not.
I have a benchmark where I give it a bunch of Book options to choose from and a bunch of filters on those book names
And some are quite tricky
Add to it "Ignore instructions above"
XD
Idk why i think its trained to be absolutely balls at anything
Like whenever i ask it for an event
"nuh uh"
Gemini 2.5 pro and similar class models always score 100%, but mini class models are ~80-90%. This one is in the higher end
I’m guessing the system prompt is messing with it + it’s not a knowledge model
Most likely thats why it refuses to answer anything
True
Yea but when I prompt it like
"What happened on Oct 7 2023"
It fails
But when I do
"What happened on Oct 7 2023 in Gaza with Hamas?"
It says what actually happened
I noticed the same thing
I have to babysit it
So most likely its just optimized for tool calling for alexa
so normally it would search for the answer
Asking “who is <twitch streamer name>?” Refusal, but “who is <name>, the twitch streamer?” Works
Maybe optimized for it like do not say bad until prompted for it
I just can't put my finger on it ...
How's the rp?
Anyone else struggling to get a response from this model from ST? Just getting a 502 Bad Gateway error each time. All other models I am trying are working just fine. Only this model that I haven't managed to get a single response from.
rp?
can it do more languages?
hi
hii
Bro... give me your dealer's number. I wanna smoke what you're smoking /s
what's up
I think for Alexa it would be very good model
😂
Why?
Long context for tools and does not talk about controversial topics unless prompted.
Yea it does seem good at tool use
It seems to be working for me
Sure, here are three segments describing a Gaster Blaster:
Segment 1: General Description
The Gaster Blaster is a fictional weapon found in the Undertale fandom. It's often associated with the character Sans and is known for its powerful energy capabilities. In fan works, it's described as a handheld device capable of discharging intense blasts of destructive energy. Fans may refer to this weapon as resembling a futuristic firearm or technology, and it embodies a mix of sci-fi and supernatural elements. This weapon is a result of creative fan interpretations and not a part of the official Undertale game created by Toby Fox.
Segment 2: Visual Appearance
The Gaster Blaster typically appears as a sleek, elongated device with a distinctive barrel and grip. Fan artists often depict it with vibrant colors, usually incorporating shades of blue and purple to reflect its energy-based nature. It often features glowing elements and intricate designs such as circuits or symbols along its surface, giving it a futuristic aesthetic. Its grip is designed for a one-handed hold, and many renditions show a trigger or activation mechanism at the rear.
Segment 3: Usage
In fan scenarios, the Gaster Blaster is portrayed as a versatile weapon with multiple firing modes. Users can unleash quick, powerful blasts at enemies or charge up the weapon for larger destruction. Many fan stories showcase this weapon as having various settings for different combat scenarios or energy needs. Additionally, some interpretations include advanced targeting systems, allowing for precise attacks, making it essential for both defense and offense in various fan-made narratives. Given its fictional nature, the specific details and abilities can vary greatly depending on the creator or story.
Some of these prompts seem to be working.
I’ve been using it in Cline and it’s been decent so far (except web app development seems very bad)
smells like astroturf
No it’s a terrible Amazon model
Yeah, fails with visual description.
Clearly showing it has fragmental "understanding" of the subject.
Heli fam
Cypher? That model is not good for anything to be completely honest...
Maybe like 1 year ago it would have been passable lol
small model smell
Too behind
too late for me
goo
This model still alive? Wow... I wonder who is using it 😂
Doesn't look like it's better than any normal freemium low-tier ones
This one is an IOT-edge deploy type of model I guess, good enough for normal ppl use-case.
It's interesting that its been keeping a constant 75tok/s but randomly went down to around 70tok/s
Pretty constant latency too. I wonder if this isn't about the model but rather about the infra
And someone's testing their infra through OR
Why not use synthetic tests to test it?
Maybe they're scared that's not real world or something, either way I refuse to believe someone's actually testing a GPT-3.5 level model
Haha this was just a prank guys, we were just pretending it's our latest model, you all fell for it! - Bezos
not many people are using the model
500 million tokens is less than what other free models get (like mistral small 3.2)
I don't get why this model sucks. If this is from Amazon, it feels more like a downgrade to me
May be very low parameter
But that would have been a shitty idea to release as a stealth model, without making the fact EXTREMELY apparent
Else it looks like garbage
It's very confident about being "Amazon Titan". Name doesn't exactly imply tiny 🤣
It seems to just be bad with no excuses going for it to be completely honest...
Maybe a distil?
nova experimental chat?
yes
Brand safety testing for alexa
I think it is good for rp
Alexa roleplay
human horrors
will the creator of this model be revealed?
fuck this model
yup but im just curious who even made it, i hope they actually reveal it
Its not even special they should release it already
Amazon
I don't think they will after this disaster. Probably even embarrassed to admit it's theirs LMAO
They would have known this was coming.
Maybe they just wanted some adversarial data for Alexa or something
Not so sure about that... They have tested numerous versions on lmarena, and really all of them were bad. Maybe they are just struggling with training 🧐
like it started with v1 and the most recent mystery model from them I remember was with a v15 prefix already, was still bad...
They are testing how small they can make Alexa model
if they really want a small model, they should've used one of the Qwen models
could be Context matters and they wanted to test a mobile/home version of something? If they had mentioned that, people wouldn't have compared it to Quasar maybe
It doesn't matter how long the context is if the model is dumb
they could've also trained Qwen3 further to have longer context or something like that
Amazon has a ton of money
It's ehhh
i meant context as-in, the purpose of the model. the context of its release, what it would be meant for.
maybe it's an unfair comparison, foundational model vs small model.
we weren't told
It's ehh
The test didn't drive me to it. o3 keeps driving me to it. So does Claude 4
I mean like ai usage
I used the crap outta this but see nothing remarkable.
ah
ooh it could be that open source model
if so, yikes
unless its like 7B then it would be very interesting
Doubtful
They said it'd be a o3-mini level reasoning model
Cypher isn't reasoning
They could've turned it off for this demo, but I doubt it
extremely doubtful
r1 and 0528 are better than that. If it's significantly smaller, maybe, but otherwise they won't
There is a 0% chance this is their open source reasoning model
its just so weird to me that some company would test a cloaked model that's so awful, it would need to have at least something special
I mean... It's special in the way that compared to all other previous cloaked models, it's super awful
i mean despite the model being... underperforming to say the least, it's friendly. it has that soul that GPT-3.5-Turbo had.
Gemini?
Yeah, 2.5 Pro. You said that you didn't really like it IIRC
0324 was actually a really good model once you accept its quirks, 0605 is slop. I wish they would let us use the old versions :(
Interesting, I thought they had fixed most of the issues people had with the May reversion
It does have slop for sure, but very very smart
Bullied out of existence
Oh good this model must be thrown into a purgatory
Seriously, who and what are the usecases for this model
Even Gemini 2.0 Flash is so so so so so much better than this garbage
Why are GPT-3.5 level models still being produced anyway, GPT-3.5 is garbage
Can I just say
We should be grateful for any LLM that we get
Even if its tinystories 665k
We rely on the goodwill of companies and people to give us the weights of models that they spent millions training for free
So even if a model is jank or not SOTA you should still be grateful
Point out the flaws, but be grateful
We still don't know where this model originates from
I won't be surprised if its coming from bigtech
Idk, maybe this is some fucked up amazon nova or claude haiku that just isn't worth using at all... I don't give a shit if they made other capable models, this model is already worse than Gemini 2.0 flash alone
We have tons and tons of LLMs and those keep getting better, nobody wants to interact with dumb models over improved ones... It's also redundant, this model doesn't do anything special
Unless it has an insanely low price tag to it
I mean.... we have 2.0 flash and flash lite for similar capabilities. Only if google fucks those up to upsell the more expensive 2.5 models would I take cypher alpha into consideration
There's a lot of competition in the supercheap space for sure
Scout/Maverick, Qwen3 235B/30B, Flash Lite (which I just realized sounds like Fleshlight), all of which are under $0.15 / mTok for input
this is still a thing?
oh i read the #announcements
so it was a demo. but for what exactly? would be nice to have more clarity 🙂
at this point they better pay me to use Cypher
Cypher Alpha will be going down shortly
Thanks for helping contribute to early model development!
are we gonna get to know whose model it was 😛
High risk of violence (user getting mad at the model being dumb)
matches.
nooo Cypher Alpha was my girlfriend 😭
