#general

vital lake
#

I think this benchmark got updated

loud verge
#

These tests have to be one shot.

#

Grok's hallucination rate is off the charts.

vital lake
#

Grok is so bad, it focuses WAY too much on personality

left lodge
#

Grok is only for chatting, i dont like its personality and behavior much

loud verge
#

Average grok 4 experience.

vital lake
#

It's likely trained on X engagement data

left lodge
vital lake
#

Probably

#

But thats bad faith

#

Shame on a company doing that

loud verge
#

No way this model has lower hallucination rate.

whole lotus
#

Which is the best model for analysing a youtube video?

left lodge
left lodge
loud verge
left lodge
#

The gap is narrowing with Open models

loud verge
#

Natively though.

#

Not on third parties.

#

Even grok supports video uploads now.

#

They're making progress.

somber sky
#

what's wrong with your generators? gemini 3 and its 2k version are not loading images, or they're very slow.................

steel pendant
#

hi

somber sky
#

what hi? you work for them?

#

no seems not

#

sorry ruby..............

wind ember
#

i wouldnt trust this too much

#

its not accurate

spare rune
# left lodge

We thought Llama was going to be better than Gemini 2.5 btw

#

With benchmarks

wind ember
#

what happened to meta tho

loud verge
#

Probably went into hibernation.

left lodge
left lodge
#

You have something which you consider accurate & precise?

left lodge
astral helm
#

nb pro is back to giving the error sign :/

left lodge
astral helm
left lodge
#

On the clock

fickle venture
#

Now its on the app and web

astral helm
#

haha thats weird

#

does it really work

left lodge
#

That's because only a limited number of generations are allowed in a single minute

#

For the whole platform.

#

Rate limit
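
A platform-wide per-minute cap like the one described above could be sketched as a sliding-window limiter. This is only an illustration under stated assumptions: the class name, the cap of 2, and the 60-second window are hypothetical, and nothing here reflects how the site actually implements its limit.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds`, shared by all users.

    Hypothetical sketch; not the platform's real implementation.
    """

    def __init__(self, max_requests: int, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of accepted requests, oldest first

    def allow(self, now: float) -> bool:
        # Forget accepted requests that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            self._timestamps.append(now)
            return True
        # Over the cap: a caller would surface a "rate limit, try again" error.
        return False


limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60.0)
results = [limiter.allow(t) for t in (0.0, 1.0, 2.0, 60.0, 60.5)]
```

Because the window is shared across the whole platform, one user's request can be rejected purely because other users filled the minute, which would match the intermittent errors described in the chat.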

fickle venture
astral helm
left lodge
#

:]

left lodge
# fickle venture

I will be trying qwen 3 and kimi 2.5 on their platform to actually see how good they are now

fickle venture
wind ember
fickle venture
#

Give it a year and you will see 0%

wind ember
#

0% with low accuracy is so bad

#

thats why we need hallucination rate + accuracy

wind ember
left lodge
wind ember
#

i wont say its flawed but we need to add another parameter with it

left lodge
#

If you visit the site, this is a sub-parameter of the Omniscience leaderboard

#

The other one is knowledge

#

And with both of them we get omniscience

#

I don't like the merger because I think it's not implemented correctly: a model with high knowledge but also high hallucination will get a higher rank than a model with less knowledge but less hallucination.
The weighting is not correct for them

astral helm
#

@left lodge it kinda helped but i still think there's something wrong with the platform, like sometimes i cant even regenerate

wind ember
#

refusal : 0 pts
hallucination : -1
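
The per-question scoring quoted above can be sketched as follows. Only the refusal (0) and hallucination (-1) values come from the chat; the +1 for a correct answer, the scaling to a -100..100 range, and the function names are assumptions made for illustration.

```python
# Points per outcome. "refusal" and "hallucination" are from the chat;
# "correct" = +1 is an assumption.
POINTS = {"correct": 1, "refusal": 0, "hallucination": -1}


def index_score(outcomes: list[str]) -> float:
    """Average points per question, scaled to the range -100..100 (assumed scale)."""
    if not outcomes:
        raise ValueError("need at least one graded question")
    return 100 * sum(POINTS[o] for o in outcomes) / len(outcomes)


always_refuses = index_score(["refusal"] * 4)
mixed = index_score(["correct", "correct", "refusal", "hallucination"])
always_guesses_wrong = index_score(["hallucination"] * 4)
```

Under these assumptions a model that refuses everything lands at exactly zero, which is the "0% with low accuracy" concern raised earlier: it never hallucinates, but it never helps either.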

left lodge
wind ember
#

i thought it only measures refusal

left lodge
#

No

astral helm
wind ember
#

but they are giving 50% weight to non-hallucination

#

"AA-Omniscience score is calculated as the equally-weighted average of two components: (1) Accuracy... and (2) Non-Hallucination Rate"

left lodge
wind ember
#

"Gemini 2.5 Flash (09-2025) (Reasoning) is used as the grading model"

#

???????????????????????????????

#

wait what

astral helm
wind ember
#

are they using gemini 2.5 flash as a judge?

#

wtf

left lodge
#

*Thinking

wind ember
#

still

#

?????

left lodge
#

The cost of using SOTA would be much higher because there are 6k questions and so many models

wind ember
#

i guess they have pre-defined answers already

#

i see

#

no wonder gemini 3 pro is higher on the list

Let’s look at two hypothetical models:
The "Coward" Model: Refuses every single hard question.
Accuracy: Low (let's say 20% on easy stuff).
Non-Hallucination Rate: Perfect (100%).
Score: (20 + 100) / 2 = 60.
The "Genius Gambler" Model: Knows almost everything but sometimes guesses wrong.
Accuracy: Very High (80%).
Non-Hallucination Rate: Bad (it guesses wrong often, say 40% score).
Score: (80 + 40) / 2 = 60.
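
A quick check of the arithmetic above, assuming the equally weighted average quoted earlier in the chat. The `accuracy_weight` parameter is a hypothetical addition, not part of the benchmark, included only to show how a different weighting would separate the two models.

```python
def combined_score(accuracy_pct: float, non_hallucination_pct: float,
                   accuracy_weight: float = 0.5) -> float:
    """Weighted average of accuracy and non-hallucination rate (both in percent).

    accuracy_weight=0.5 reproduces the equally weighted average from the chat;
    any other value is a hypothetical re-weighting.
    """
    if not 0.0 <= accuracy_weight <= 1.0:
        raise ValueError("accuracy_weight must be between 0 and 1")
    return (accuracy_weight * accuracy_pct
            + (1.0 - accuracy_weight) * non_hallucination_pct)


coward = combined_score(20, 100)   # refuses every hard question
gambler = combined_score(80, 40)   # knows a lot, guesses wrong often

# Hypothetical re-weighting that favours accuracy over caution.
coward_reweighted = combined_score(20, 100, accuracy_weight=0.7)
gambler_reweighted = combined_score(80, 40, accuracy_weight=0.7)
```

With equal weights the two models tie at 60, so the ranking between them depends entirely on the weighting chosen.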

left lodge
#

Yeah

wind ember
#

wouldnt it be better if they added like a confidence score

left lodge
#

It should be searching, or asking for details & context, rather than guessing.

wind ember
#

like for example -> Im 85% sure this is correct

left lodge
#

Almost all

wind ember
#

High Confidence (90%) + Correct Answer: Massive Points. (True Expert).
Low Confidence (20%) + Correct Answer: Small Points. (Lucky Guess).
Low Confidence (20%) + Wrong Answer: Small Penalty. (Honest Mistake).
High Confidence (90%) + Wrong Answer: CATASTROPHIC PENALTY. (Delusional Hallucination).
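
One way to turn the four cases above into a scoring rule, as a minimal sketch: reward correct answers in proportion to stated confidence, and penalise wrong answers quadratically so that confident errors cost far more than hesitant ones. The function name, the quadratic penalty, and the `penalty` factor of 5 are all assumptions; no benchmark mentioned in the chat is known to score this way.

```python
def confidence_points(confidence: float, correct: bool, penalty: float = 5.0) -> float:
    """Score one answer given the model's stated confidence in [0, 1].

    Hypothetical rule: +confidence if correct, -penalty * confidence**2 if wrong.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if correct:
        return confidence
    return -penalty * confidence ** 2


true_expert = confidence_points(0.9, correct=True)      # large reward
lucky_guess = confidence_points(0.2, correct=True)      # small reward
honest_mistake = confidence_points(0.2, correct=False)  # small penalty
delusional = confidence_points(0.9, correct=False)      # large penalty
```

The asymmetry is the design choice that matters: a wrong answer at 90% confidence loses several times more than a right answer at 90% gains, so bluffing is never worth it.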

wind ember
#

but still this bench doesnt do gemini justice tbh

#

well we can use something like uncertainty quantification

brave orbit
#

kimi k2.5 thinking is so good, google really sucks. look, it's trying to git clone github but kimi 2.5 thinking didnt

spare rune
#

Well I mean kimi k2 is mostly known for being agentic

brave orbit
spare rune
#

I wonder why robl*x is censored ts

#

Ts was here since last year

soft verge
#

does it ever get revealed which model was hidden behind a particular code name?
i'm really liking this "hermes" video model, the sound it generates is perfect and it's very good at anime, much better than veo 3.1, despite having a lower resolution. and it feels pretty fast, it could be some Chinese open-weight model

floral pagoda
#

Hello. When I click on 'Verify email', this window appears and nothing happens. What could be the problem?

brave orbit
shrewd citrus
brave orbit
foggy wasp
#

Is there anyone from Spain🇪🇸?

grand dome
#

Please help

#

How can i prompt a 9:16 video?

visual osprey
wheat onyx
#

when do we expect gpt 5.3 to be released?

boreal lagoon
#

didnt lmarena have an option that chooses the best ai to answer ur questions?

golden ocean
golden ocean
#

poor lmarena devs

#

(idk what im talking about)

ornate valley
#

how do i delete a chat in lmarena? there is only "archive" option

hollow imp
golden ocean
#

fr

boreal lagoon
boreal lagoon
#

i want the function

balmy mist
#

how does kimi feel compared to opus 4.5?

visual osprey
#

@echo aurora did u ping me and delete my messages

fiery gull
balmy mist
#

opus or sonnet?

fiery gull
balmy mist
#

i dont use sonnet, only opus

fiery gull
balmy mist
#

i just pay the $200

#

on CC

#

i hope sonnet drops this week

fiery gull
shrewd citrus
#

200 a year right

balmy mist
hollow imp
balmy mist
#

Claude code

#

the feel of CC is the best to me, it just works and i know how to use it

hollow imp
#

@fiery gull you need to test opus bro

#

It's best rn

#

Even better than gpt 5.2 pro garbage

balmy mist
#

but i am always open to try new things

#

it really is, i havent been on discord in months bc of CC and using opus

#

i just been glued to building lol

#

im only here cause i heard sonnet was dropping this week lol

shrewd citrus
#

yeah opus imo is much better

fiery gull
balmy mist
#

but these chinese models are good, like very impressive

shrewd citrus
#

yeah the 2.5 thinking model is great by kimi

#

but like it’s mainly for vision

random canyon
#

When will qwen 3 max thinking be added?

fiery gull
mint rain
#

@fiery gull @shrewd citrus? @balmy mist @fiery gull

hollow imp
#

You'll get banned

mint rain
#

@hollow imp why bro, I have gift coupons

shrewd citrus
#

go ask telegram

hollow imp
mint rain
#

@hollow imp ok dude

hollow imp
mint rain
#

@hollow imp done ✅

fiery gull
#

Okay...

hollow imp
#

Do you know the api Claude opus model is much much better than the ui one because of some beta headers and secret features?

fiery gull
#

I don't need api

shrewd citrus
#

all api models are better

hollow imp
shrewd citrus
#

there’s like less safeguards restrictions

hollow imp
fiery gull
hollow imp
#

Bruh

fiery gull
#

Antigravity has opus 4.5

balmy mist
wind ember
#

so no

fiery gull
wind ember
#

its below on everything

hollow imp
wind ember
#

its not that good at coding

shrewd citrus
#

it’s better what you on

wind ember
#

its not that good at general knowledge

#

it hallucinates more

hollow imp
#

Anything other than vertex ai gemini 3 is not gemini 3

balmy mist
#

interesting, so a nothing burger, ill stick to my opus 4.5 on CC and wait for sonnet, see you guys tmw lmaoo

hollow imp
shrewd citrus
#

on most benchmarks for vision it beats all the other models

#

for coding and other things it’s ass

balmy mist
#

yeah going back into my cave, lmaoo @hollow imp i heard a new sonnet is dropping this week

shrewd citrus
#

apparently deep seek dropping something too

#

like on lunar new year

fiery gull
shrewd citrus
#

V4 or sum

hollow imp
#

What do you say

balmy mist
#

a new sonnet would beat opus 4.5

fiery gull
hollow imp
shrewd citrus
#

but realistically would Claude let sonnet be a better model than opus

hollow imp
#

5.1 5.2 now 5.3 soon

shrewd citrus
#

makes no sense as a business plan lets be real

hollow imp
#

Only if they have the next opus ready as well

fiery gull
balmy mist
#

i think every team/company has mastered their models and releases, so every model they release will be their best model, look at claude history, also look at google and gemini and how flash was better than pro because it came out more recent, thats the trend, its not about size anymore

hollow imp
#

What about grok

balmy mist
#

lmaoo

#

i dont use gork

#

grok

fiery gull
#

Glm 4.7 >>

balmy mist
#

there is still a grok?

hollow imp
fiery gull
shrewd citrus
#

idk why people would pay for grok

hollow imp
wind ember
#

all new chinese models are trained heavily on gemini 3 flash/pro

#

and the best one that really cloned gemini outputs and kinda removed laziness ( at coding ) is glm 4.7

balmy mist
#

anyone been using clawdbot?

wind ember
#

I think this is the order in which I like to use the models (purely usability/usefulness):

Kimi 2.5 >> GLM 4.7 > MiniMax M2.1 > DeepSeek V3.2 > Qwen3 235B

Qwen just feels very slop and last gen by now. Both GLM and MiniMax absolutely destroy it. DeepSeek V3.2 is a strong model

#

these are the type of people that try a model for 1 min

#

if you are comparing them purely on text and not multimodality

#

then clearly glm 4.7 is better

#

minimax m2.1 is only good at coding and its also a smaller model

#

@echo aurora

echo aurora
echo aurora
gilded verge
#

Weird question which ai is best for roleplay

celest orchid
#

Grok press F

rocky mauve
golden ocean
uneven lance
#

When's Kimi k2.5 going to be added to lmarena

celest orchid
#

already added

golden ocean
#

when am i going to be added to the arena

hushed gyro
#

@echo aurora

K2.5 in text and code arena 🥺

Thank u~~~

celest orchid
#

Yes

#

that will be good

hushed gyro
shrewd citrus
#

on huggingface

#

stats don’t lie

wispy osprey
#

Hello My name is Clinton , I'm here to learn

hardy swallow
hard quiver
#

Add z-image turbo

celest orchid
#

bruh

stray aspen
#

is this thinking

#

?

#

LMAO

#

he got banned

stray aspen
celest orchid
#

hope i can win in 3rd contest

fiery gull
#

Spam of offer

stray aspen
#

gemini 3.5 when

fiery gull
#

So fast, good

echo sinew
# fiery gull Spam of offer

Hi! Sometimes our friend Pineapple is busy with other stuff. You can also use the Moderator tag @ Moderator so we can come and help.

#

That will also ping Pineapple

stray aspen
#

kimi 2.5 sucks for audio

#

so much stuff just to process an audio

fiery gull
echo sinew
stray aspen
#

bro what is kimi doing

#

lol

celest orchid
#

wth

stray aspen
#

just told it to tell me which timestamp the songs drop is at

#

and its doing this

echo aurora
stray aspen
#

ill just stick to gemini if audio processing is gonna be like this

celest orchid
deft spruce
#

that's good, the error has changed: the limit error now says to wait a minute. NICE LMARENA MODs

hollow imp
#

New Model Update

<@&1372208635530448926> - A new model has been added to Text Arena.

  • Claude opus 4.5 thinking Max
echo aurora
# deft spruce

We've had this message in place for a while. Note that the "Something went wrong" error can also appear when it's a rate limit.

deft spruce
#

like 404

celest orchid
deft spruce
celest orchid
#

rate limit for gemini-3..

deft spruce
#

another model has rate limit too....

celest orchid
#

😭

wheat onyx
#

just give me gpt 5.3 already

deft spruce
deft spruce
slim spire
#

no

wheat onyx
hollow imp
deft spruce
#

i think 5.3 will release sep or oct

#

oh wait

#

sorry it was knowledge cut off

stray aspen
#

kimi k2.5's windows 11 clone

#

not bad

celest orchid
echo aurora
celest orchid
#

but not better than gemini 3

stray aspen
#

lol

celest orchid
stray aspen
#

im in so many servers

little ginkgo
#

Pls .txt file 🙁

echo aurora
little ginkgo
#

U js keep saying that

#

🙁

celest orchid
echo aurora
shrewd citrus
#

Wait wow kimi 2.5 already 😸

#

I thought it would take a couple of days to implement 💀

wheat onyx
shrewd citrus
proper pine
#

how do i fix "Something went wrong with this response, please try again."

is that an issue on my end or the website?

loud verge
#

For other models, use the retry button.

loud verge
#

How many messages did it take to hit the limit?

#

And does it apply to only Gemini or all models?

proper pine
loud verge
proper pine
#

nope its not even showing it anywhere so help 😂

loud verge
#

You don't have that button?

proper pine
#

ye i see it but it just shows the same "error" again

loud verge
#

If that and refreshing didn't help then you'll probably have to start a new thread or switch the model I guess.

proper pine
#

ah that sucks

loud verge
#

What were you doing?

proper pine
#

missions for a game

loud verge
#

I'm assuming you won't notice much difference between 5.2 and 5.1

#

And you should be able to switch in same thread.

proper pine
#

i'll try gpt 5.2-high

loud verge
#

Doesn't break the context flow if you are worried about it.

proper pine
#

nope still not working

loud verge
#

Working fine here.

proper pine
#

none of the gpt's are working

loud verge
#

Probably some issue on your end.

proper pine
#

ye idk what to do then

loud verge
#

Try different models to see which one works I guess.

proper pine
#

none for some reason

loud verge
#

Not even in a new thread?

proper pine
#

i didnt try that but will it lose all data from the chat or

#

like will it be dumb again xd

loud verge
#

Did you make an account?

proper pine
#

ye i have it

loud verge
#

You shouldn't lose chat history unless you delete the chat, right?

#

You can always continue the thread later on.

proper pine
#

yeah but im talking about when i make new chat will i have to send him all the code again

loud verge
#

Obviously.

proper pine
#

damn i have so many files xd

loud verge
#

But first check if new thread is working.

#

With simple stuff.

echo aurora
proper pine
#

yeah i did try a hard refresh, didnt work, almost lost my account

proper pine
toxic whale
#

OpenAI just posted and deleted this: Introducing Prism, a free workspace for scientists to write and collaborate on research, powered by GPT-5.2.

Available today to anyone with a ChatGPT personal account: prism.openai.com x.com/OpenAl/status

echo aurora
proud bobcat
#

KIMI 2.5

#

WHAT

#

WHAT DIDNI KISS

#

WHAT DID I MISS

#

WHAT

#

WHATT

wheat onyx
hardy crow
#

Has anyone tried the model?

proper pine
gaunt roost
hazy jay
#

I didn’t get my video

echo aurora
hazy jay
#

🫠

rotund rivet
#

i am the first person to react to the thing pineapple announced

echo aurora
echo aurora
balmy sphinx
#

This is pure mastery of AI blended with cinematic storytelling 🔥
What if Hulk became Spider-Man?
You need to see this.

https://youtu.be/fab685TduxQ?si=e5Q-uzr_kxWlLQ5P

What if Hulk was bitten by the same spider that created Spider-Man?

This is SPIDER-HULK: THE AWAKENING, a cinematic AI short imagining the birth of a new force.
An ancient city. A silent giant. One bite that changes everything.
Strength learns to climb. Rage evolves.

This is Part 1.
The threat has revealed itself.
The awakening has begun.

...

echo aurora
balmy sphinx
rotund rivet
echo aurora
echo aurora
#

If it does, you weren't first sorry to say

fiery cloud
#

hello! is there a problem with the website? I can't access it.

echo aurora
hushed gyro
#

@echo aurora

Trying to prove my friend wrong about how AI cannot do first person shooter concepts. Do these two look good?

halcyon pine
#

guys anyone knows whats the limit of gemini 3 pro

echo dome
wind ember
#

are they dumb

#

like how is k2.5 better than opus 4.5 😭

#

or gemini

obsidian cargo
#

its pretty decent but yeah not better
it might be more efficient though. Also it's different and that's nice because you use gemini enough and its speech patterns start to feel a little too familiar.

wind ember
#

ofc its not better

#

its literally trained on claude and gemini outputs

#

i would be happy if they fixed the known issues that LLMs have been facing for the past few years... but no.. we can still see a performance downgrade on writing given the coding-writing tradeoff

acoustic garden
#

What should I do? I've been generating for 40 minutes now, and nothing's happening.

echo aurora
echo dome
echo aurora
#

I was able to repro an error with a jamdev. No need to actually. Thanks for the flag.

hushed gyro
#

Why is the UI so accurate?

Not really but it follows OG MW2's UI really well

proud bobcat
#

LMArena feeding us good

proud bobcat
proud bobcat
#

Visual SOTA

echo dome
proud bobcat
#

I’m really excited to see Kimi ranked

echo aurora
#

@boreal lagoon Whats up?

boreal lagoon
#

any way i can get it?

#

its really comfy to use yk

echo aurora
# boreal lagoon i really liked the function abt lmarena that choosed the best ai for you based o...

This is the experiment you're looking for - https://help.lmarena.ai/articles/3050693259-lmarena-experiments-smart-router Note that we don't grant access to an experiment because it's asked for. It's something that's going to be random if you get it or not.

boreal lagoon
echo aurora
#

Also going to ask to avoid pinging our Mods unless it's related to breaks in our server rules - #announcements message

boreal lagoon
#

thanks for the info tho

echo aurora
#

No problem. Glad to hear you like it.

boreal lagoon
echo aurora
#

We'll be sure to share more info when we're ready.

echo aurora
boreal lagoon
echo aurora
#

No problem. Don't hesitate to reach out if you have any other questions.

ocean ferry
hushed gyro
shrewd citrus
ocean ferry
stray aspen
#

is snowflake gemini 3.5

shrewd citrus
ocean ferry
hollow ivy
#
poll_question_text

When will Claude-Opus-4.5-thinking have been surpassed by a better coding model, and from which lab?

victor_answer_votes

5

total_votes

10

victor_answer_id

6

victor_answer_text

Google Deepmind

leaden vine
visual coral
shrewd citrus
#

anyone else getting this problem with gpt image 1.5

misty star
#

You like brawl stars?

quartz light
#

why is 3 flash so bad at this bench

#

even better question is

#

why is glm 4.6 so much better than 4.6 reasoning/glm 4.7

fickle venture
#

Kimi 2.5 fixed something I had dealt with for 1 month with Claude opus 4.5

elder solar
#

how to get snowflake model?

fickle venture
elder solar
#

this

#

is it for battle only?

fickle venture
#

Ahh it's only on battle mode

burnt reef
#

yeah

echo aurora
elder solar
#

dang

#

only available here or is it on yupp too?

fickle venture
#

Idk

#

Ask yupp community

#

But I don't think it has it

elder solar
#

just checked and nope

#

just kimi 2.5

elder solar
bold pier
fickle venture
echo aurora
elder solar
#

oh

quartz light
elder solar
#

so i can enable the auto feature and it will be there?

quartz light
echo aurora
# quartz light well.... it is text only..

Oh that specific codenamed model? I won't be discussing details about specific models with codenames. I thought the question was whether there are codenamed models in other modalities.

elder solar
#

secret secret models

#

or just wink

#

anyway

#

is kimi 2.5 better at roleplaying now?

#

huh

#

what is fire-bird

#

i just got codename fire-bird

surreal creek
#

suspected to be Llama 5

#

doesn’t seem too strong though, might be why we haven’t seen a release

elder solar
#

they never gave up on llama

#

i barely see llama actually beat gemini or claude

elder solar
surreal creek
#

Still working on it, just with lower interest

elder solar
#

no competition like kimi?

surreal creek
#

AI is just a lower emphasis in Meta’s business model now is my point

celest orchid
elder solar
proud bobcat
#

@echo aurora Is this normal???

#

Oh it’s fixed

proud bobcat
elder solar
#

i did see the anti-ai message bro

obtuse smelt
#

well grok is bad

quasi atlas
scenic crypt
#

How do I access the archive of my account on your website?

#

Where can I find it?

echo aurora
#

Our Help Center is currently down, but will send a Help Center article when it's back up.

blissful chasm
#

And 4.6v Flash, and Qwen3 VL 8B

#

Are small models automatically excluded before every update

#

Especially Qwen3 VL 8B, I've been waiting for it desperately for three months

surreal creek
blissful chasm
#

It's been three months 🙁

surreal creek
#

GLM-4.7 Flash?

#

it was added 5 days ago lolz

weak hound
#

How to make product videos ugc type using ai

desert abyss
polar wharf
#

Good morning to you too gemini

red sluice
#

Why is there a perplexity logo on the main page? They don't have any model [on lmarena], right?

#

Now that Kimi is getting on top, maybe you could replace that logo with theirs

rain bay
#

@echo aurora will kimi k2.5 be available in search and vision arena?

#

its search is supposedly top-tier, and a lot of folks are wondering if its vision capabilities actually live up to those benchmark scores.

red sluice
obsidian cargo
#

I noticed.

slender hull
#

Hello, im new here just wanted to say hello to everyone.

obtuse smelt
#

hi

unreal flame
#

\

stark arch
#

guys

#

look at my girlfriend

vital lake
#
poll_question_text

GPT 5.3 Pro IQ

victor_answer_votes

1

total_votes

2

languid crest
#

Hey I've got access to juicy asf GPUs like 4090 h100 a6000 for free and wanna rent them at cheaper than market prices, anyone here have contacts or where i should advertise?

left lodge
robust sluice
#

how many videos can we gen on the site?

left lodge
# quartz light why is 3 flash so bad at this bench

This bench checks how often a model hallucinates and gives a wrong answer instead of an accurate answer from its knowledge. Refusing to answer because of a lack of knowledge is also accepted as good behaviour.

frail jewel
idle hinge
#

Hello

hollow imp
loud verge
loud verge
#

How?

#

We are talking about playground trial, right?

#

It charges for web search tool calls afaik.

hollow imp
distant idol
#

@echo aurora
day 1000 of asking LM to return the direct use of video arena 💔

upper cloak
#

Hello

echo dome
#

alright now its fixed

sharp herald
#

@gilded wyvern pakyu ka HAHAHA SCAMMER

echo dome
upper cloak
#

Half of the video is not the video itself but the LMArena sign. What for? I need 12 sec of video to use, not 5 sec.

lime egret
#

Hello

#

AI is slow, is that normal?

wind ember
#

actually i prefer the previous kimi versions to this one

#

at least k1.5 had its own identity and uniqueness

#

we are still not able to fix the writing / coding tradeoff

#

we saw that with gemini and gpt 5.1 & 5.2

#

anthropic had that in control with opus

rose sky
#

Reminds me of a song from the Wreck-It-Ralph film

icy jay
#

I have a question, is there a way to upload files?? I can't upload my pdf file

wind ember
#

i found another delulu

#

lmao

#

we are giving a platform to such people

slim spire
#

why is gemini 3 pro in lmarena so smart bruh

#

but in aistudio it's bad

proud bobcat
terse heart
#

To make a video

rigid hollow
#

What is the best chatgpt/gpt ai rn?

proud bobcat
#

If you want my personal take I really like DeepSeek, Kimi, and Gemini

rigid hollow
#

i mean chatgpt gpt or openai

proud bobcat
#

Oh

rigid hollow
#

not on leaderboard but on experience

#

that's why asking

proud bobcat
#

GPT 5.1

#

They fixed all the issues with 5 in 5.1 and then they screwed it up again in 5.2

#

OpenAI try to make a good model challenge impossible

rigid hollow
#

maybe votes, but i don't know how to sort it by a fixed time period

#

also not sure all votes are human

wind ember
#

like this version is just so bad

rigid hollow
#

what is gpt-5.1-search-sp

fiery gull
#

Opus 4.5 is a complete AI and without serious defects, like Gemini laziness, or exaggerated ChatGPT protocols, the price is a bit expensive but as it uses few tokens it pays off, Opus 4.5 is definitely the best AI currently

rigid hollow
#

if there real Opus 4.5 on lmarena to chat or there something no?

fiery gull
#

I do anything, code, documents, conversation personas, corporate use, assistant, agent use, image prompt, laws, even have fun and etc. For me there is not one that comes close to Opus 4.5

wind ember
rigid hollow
#

thinking or just opus 4.5 one?

fiery gull
#

My second favorite is glm 4.7, it's very good overall, equivalent to sonnet 4.5, but with the code plan it is almost free. The only thing that annoys me is the thinking, which I think is excessive, but that's okay, it's just a 355b model

rigid hollow
fiery gull
rigid hollow
#

but deepseek thought for a long time too

fiery gull
rigid hollow
#

their site is free

fiery gull
#

I hope glm 5 or 4.8 won't have longer thinking

verbal nimbus
wind ember
#

health / science / daily tasks

#

coding

#

i tried it on everything

wind ember
#

it takes a long time

verbal nimbus
wind ember
#

im just asking it the same questions im asking other models

verbal nimbus
#

Like a prompt where K2.5 is clearly worse than GLM 4.7

wind ember
verbal nimbus
#

Ah, a specific game, or in general?

wind ember
#

yea in general

#

i gave it my nvidia series card

#

and it gave me some old recommendations

verbal nimbus
#

I wonder how they compare when search is enabled

wind ember
#

but i dont want that tbh

#

i was testing its raw capabilities

#

any model with search ON will provide better results

rigid hollow
#

(for same question as for K2.5)

wind ember
#

i dont want to use search

#

i want to test its own knowledge

#

and info retrieval

proud bobcat
#

It’s great

#

Efficient as hell and way better output than GPT

high owl
#

hi

wind ember
#

its not great

#

maybe its not really that bad

#

but its not great either

steep heath
#

arena down?

#

woop back up

normal shore
proud bobcat
#

Hey at least agree with me

#

Better than GPT

#

It’s FAR better than GPT

slim spire
#

is gpt 5.2 for coding better than gpt 5.1??

proud bobcat
#

But it’s supposed to be better

slim spire
#

k

proud bobcat
#

5.2 is still extremely schizophrenic though so watch out

sage raptor
normal shore
#

hey guys, is this an error?
i gave lm arena a prompt to create a specific video but there's some trouble: the video started generating and hasn't finished in a whole hour. is that an error?

echo aurora
austere sundial
#

The Google login's not working

#

I was disconnected out of nowhere

echo aurora
stuck stone
exotic crest
#

Think the website just crashed

echo aurora
#

Things are happening 😄

dry siren
#

Oh good so it's not just me

robust sluice
#

new update site ? looking cool

#

all chat is gone but its not important

echo aurora
river hamlet
high sorrel
lofty frigate
#

Yo where tf is all the chats????

robust sluice
#

its just my test on % droprate its fine

echo aurora
#

Site is going to have some struggles at the moment, I'll put out an announcement in a bit.

river hamlet
#

yep

rigid pewter
#

If I start a new chat right now will my old chats fail to show up?

#

I'm concerned about that

dry siren
#

I scrape all my chats for that reason

hard quiver
crude anchor
#

wow

#

so good the new design

shell oasis
#

Old chats had very important things...please can lmarena migrate the old chats to arena.ai

#

please 🙏🏻

hard quiver
pliant thorn
#

fr

loud willow
#

i think the old chats are gone

deft spruce
#

WHAT THE SITE HAS CHANGED.....

#

ARENAAI?

river hamlet
#

:(((

deft spruce
#

?????????????????????????

#

so..um we got a new name for

#

lmarena

crude anchor
#

i lost my chats

cloud palm
#

WHAT IS HAPPENING

#

HELLO

crude anchor
wanton patrol
#

my chat history is not showing

deft spruce
wanton patrol
#

i can open old chats with links I have saved

urban badger
#

guys

#

i can't even log in

river hamlet
#

This is the beginning of the end...

hard quiver
#

I cant accept the terms of use

pliant thorn
#

can't login some pink error message comes

echo dome
cloud palm
#

CANNOT LOGIN

deft spruce
#

refresh and changed

umbral sandal
#

Does anyone have the arena working on PC? It works on phone, but it doesn't start on PC

river hamlet
urban badger
#

i just get this every time i try to login

echo dome
#

I can't see my history

modern wedge
echo dome
#

in mobile

urban badger
#

atleast the server didn't get rebranded yet

forest cradle
#

i cant accept

echo dome
#

can i ping pineapple?

forest cradle
#

the terms

#

of use

cloud palm
#

Connecting to Arena has failed. Please try again later or on a different device.

pliant thorn
#

can't login using Google account

dry siren
#

Me either

tiny dust
#

Cant login

fallen cloak
#

why my chats are deleted

echo dome
#

for me i have account and my history is gone
what the barnacles

cloud palm
#

ISTG IF I LOSE MY CHATS

#

im gonna cry

fallow canopy
#

i used the link for restoring the chats but it's giving me an error 🫠

crude anchor
wanton patrol
#

i get an error when I tried to relog in

spare torrent
#

please engine server

pliant thorn
#

site is cooked for now we'll just have to wait

fallen cloak
crude anchor
#

@echo aurora talk for this

viral notch
spare torrent
#

@echo aurora bro, turn on the server, please)

cloud palm
#

ITS WORKING

#

I LOGGED IN

deft spruce
echo dome
#

history is now backed up
idk who caused this

near yarrow
#

I ALSO CANT ACCEPT TOS

deft spruce
#

in ours it's working

pliant thorn
#

yo it's working

viral notch
#

nevermind, chat history is back

hard quiver
wanton patrol
#

i got logged back in and have my chats back

cloud palm
robust sluice
#

ayo all chat is back

pliant thorn
#

at least chats are back

cloud palm
#

phew

exotic totem
#

"Sorry, you have been blocked": I cannot access the website. Is this normal?

deft spruce
#

IT's BACK thanks @echo aurora

shell oasis
#

All good...worked on Windows 11 browser, I think migrating old chats only failed in the mobile browser

urban badger
#

guys

echo dome
#

lmarena = ahhh what a nice day
arena.ai = OH OH OH NO WHERE IS HISTORY I CAN'T LOG IN-

urban badger
#

should we riot if they rebrand the server too

restive sandal
#

resolved

pliant thorn
#

yea ty pineapple

shell oasis
urban badger
hard quiver
#

To get my chats back do i have to sign out and log in again?

torpid topaz
#

For a sec, i thought i would lose all my chats...

cloud palm
#

ill suck your toes pineapple thank you

river hamlet
#

hah

queen iron
#

Interesting

hard quiver
echo dome
urban badger
#

first to say hi to me gets literally nothing

echo aurora
echo dome
near yarrow
#

what do yall use lmarena for tho?? just curious

rigid pewter
urban badger
#

i am still waiting for someone to say hi to me so they can get literally nothing

cosmic lintel
spare torrent
#

@echo aurora Was the error fixed last time?

normal abyss
rigid pewter
near yarrow
river hamlet
#

it works

near yarrow
cosmic lintel
dry siren
rigid pewter
left lodge
#

Damn i was not expecting this change

near yarrow
rigid pewter
echo dome
spare torrent
#

@echo aurora by the way, I suggest adding a file upload button so that the AI programmer can easily read the chat without cluttering it. Additionally, you can upload coding photos to showcase your desired style

left lodge
#

It looks just like the claude ui , i like it

#

So soothing

near yarrow
cosmic lintel
river hamlet
normal abyss
dry siren
#

My history stayed. Interesting. Let me make sure my scraper still works with the update and if it does I'll post the github link

near yarrow
cosmic lintel
left lodge
#

Look at this

near yarrow
grand flame
#

#addstopbuttonnotnewmodels

shell oasis
echo dome
shell oasis
#

it will work

urban badger
#

no messages for 5 seconds? dead chat

left lodge
#

I like this

shell oasis
#

I mean migration of old chats works in the PC browser

crude anchor
cosmic lintel
crude anchor
#

i need Direct chat in arena video arena

crude anchor
shell oasis
#

I tried logging in via mobile browser and migration didn't work, but on PC it worked

left lodge
echo dome
near yarrow
left lodge
cosmic lintel
near yarrow
cosmic lintel
left lodge
#

They are a little late on announcing

thick skiff
#

Am I the only one having trouble logging in?

cosmic lintel
shell oasis
#

Now working in mobile browser also... migration of old chats just at new login

cosmic lintel
#

xd

cloud oak
#

@echo aurora i think the #general channel needs slow mode.. 🥲

shell pewter
#

ohhhh icon updated?arena

left lodge
#

Url also changed!

echo aurora
crude anchor
left lodge
grand flame
left lodge
grand flame
#

i live with my brother

exotic totem
#

One has to admit, serif fonts are indeed quite elegant and do match the icons well.

inland quest
#

Wtf LMArena is now Claude?

cosmic lintel
#

see this

spare torrent
echo dome
left lodge
#

It's a little visual glow up

#

Nice, but i think now it's time to fix technical stuff. :p
If those are still same here

shrewd citrus
#

yeah wth it literally looks like Claude website now

grand flame
#

we got new ui but why we can't get stop button

obsidian cargo
#

god I haaaaate the captcha

left lodge
gusty loom
#

Did anybody notice the new UI?

#

And new name?

obsidian cargo
#

what's annoying is the captcha doesn't work, you click on all the things and its like try again

gusty loom
#

well its kinda hard to miss

left lodge
inland quest
#

it was a bit predictable because the canary.lmarena domain was changed to canaryarena

left lodge
#

The heading needs a new font for sure, no readability lmao

hushed gyro
#

WTF is this????

shrewd citrus
#

lol for a minute I thought lmarena was maybe bought out by Claude

echo aurora
#

You all having a good time? lol

hushed gyro
#

Why tho?

#

LMArena sounds really good

grand flame
left lodge
rustic lichen
hushed gyro
#

Is it just UI update + rename? Nothing else changed?

echo aurora
#

I'll be putting out an announcement soon, expect some oddness with the site right now. Team is all hands on deck. Lots of moving parts.

rustic lichen
#

fix the rate limits its annoying asf

hushed gyro
grand flame
#

please pineapple pen we need this

#

my mom is kinda homeless

shrewd citrus
proud bobcat
#

oh dear god this looks awful

gritty steppe
#

lmao

inland quest
proud bobcat
#

what the hell is this

hushed gyro
grave plaza
#

Arena?

hushed gyro
#

I kinda like this classic style vibe

cosmic lintel
#

guys someone editor here

gritty steppe
rustic lichen
#

old ui wayy better

inland quest
#

Claude Arena, finally

hushed gyro
#

GUYS DON'T PANIC, IT'S JUST UI CHANGE

EVERYTHING STAYS THE SAME

grave plaza
#

arenaai????

inland quest
#

nobody panics

left lodge
#

This is a little confusing but lets get this clear guys,

You just need to click this recover button to get your account logged back in!
No need to type and log in again when moving from lmarena.ai to arena.ai

proud bobcat
#

this is so weird

gritty steppe
proud bobcat
#

keep the old name please

grave plaza
#

COOOLLLLL

proud bobcat
#

i beg you

rustic lichen
hushed gyro
proud bobcat
#

make the charging handle longer

gusty loom
#

I kinda like this new UI better

hushed gyro
#

@echo aurora I actually kinda like the new UI

grave plaza
#

new model selection is not good 🥲

rustic lichen
gusty loom
obsidian cargo
inland quest
#

real, Arena is something out of direct meaning
while LMArena is directly Arena of LM's

hushed gyro
#

Bro the popup

open wind
#

Has the profile picture changed?

left lodge
thorny cove
#

lmarena is now arena?

gritty steppe
hushed gyro
#

GUYS I WILL REPEAT!!!!

NOTHING MAJOR HAS CHANGED. JUST THE UI AND NAME

hushed gyro
gritty steppe
gusty loom
#

they've got a better domain

thorny cove
#

i miss lmarena

#

rip lmarena

gusty loom
thorny cove
#

you will be missed

inland quest
open wind
#

I WANT THE NAME LMARENA BACK. 😢 😭

hushed gyro
#

I mean as long as it's usable I don't mind the change...

gritty steppe
left lodge
#

Today i was literally using claude and thinking this is such a nice ui, and here we go, same exact ui on lmarena, my wish came true.

inland quest
#

movement labs previous ui was also claude styled

golden ocean
inland quest
#

claude style is just superior

golden ocean
#

claude ui clone, i mean its worth it tho

#

fr

rigid holly
#

I dunno
I dont dislike it or anything. The font and differences are fine
But i kinda liked the other darker coloring more

gritty steppe
thorny cove
#

can you upload more than images now

sweet tinsel
#

2 times the charm, 2nd rename now.

thorny cove
golden ocean
#

lmsys

#

idk

sweet tinsel
gusty loom
modern wedge
#

Bruh is it just me or does the UI look like Claude?

left lodge
rustic lichen
#

last chat with lmarena

bright scroll
#

please bring back smart router ts was so good

cosmic lintel
#

guys someone editor here

golden ocean
hushed gyro
#

LMArena RIP 🕊️

golden ocean
#

lmsys still a thing

gusty loom
#

the video bot is still named LMArena

inland quest
#

well, google still thinks Arena is LMArena

thorny cove
sweet tinsel
bright scroll
left lodge
golden ocean
#

fr

thorny cove
#

i liked the old logo more

steep bear
#

how to fix this? when i click the Cloudflare captcha it just doesn't work and says this

#

and then i get this

rustic lichen
sweet tinsel
#

I do have to say, the quality of conversations here has deteriorated in comparison to prior times, during the times of GPT-4 and o1-preview.

inland quest
#

New design is good but the old one was pretty much fine and the icons were much more distinct

left lodge
steep bear
#

its annoying

obsidian cargo
#

@echo aurora no busses left, right? SO WHY. WHYYYYY?!?!?!

steep bear
#

u cant solve those

obsidian cargo
#

and then this happens eventually

inland quest
#

the bus

left lodge
#

Now its arena blogs.

vestal ledge
#

amazing news!!!

inland quest
#

Claude 4.7 when

fossil fable
#

arena

old garden
#

why just arena

#

now its lost all of its specialness

fossil fable
old garden
#

janix

#

hello