#general

obtuse smelt
#

oh yes is captcha strikes again ?

shy isle
#

does anyone know why sometimes, when you ask a maths question, the equations don't display correctly and print stuff like this:

e^{2x},dx
\int xe^{2x},dx
]

instead of the equation formatting

grand raft
#

oh

obtuse smelt
#

nice code

sullen sable
#

👍

surreal zephyr
shy isle
compact marlin
#

Hi

wary echo
brisk helm
#

upvoted

gleaming wraith
#

does anyone have any idea how long it usually takes until a newly added model gets added into the leaderboard (talking about gpt 5.5 in this case)?
It's been in battle mode for quite a bit now

strong ferry
#

Are GPT image 2 and Nano Pro down right now? Keep getting errors.

toxic verge
gleaming wraith
toxic verge
#

Oh I see apologies sir I thought you were asking the opposite

spiral trench
#

hello

fickle sable
#

Plz, how do I generate image-to-video?

stray aspen
#

Go to the video arena

fiery crow
#

grok ai video generation

vital mantle
#

I hate it

#

so u bought 3 gpt 5.5s

hollow ivy
#
Poll: Best model for very long, deeply immersive and highly realistic sandbox/RPG/adventure games, with a very intelligent gamemaster AI?

Winning answer (option 1): Gemini 3.1 pro, with 17 of 30 votes

toxic verge
shy isle
toxic verge
#

If the arena is feeling the crunch, you know it’s real.

indigo knoll
toxic verge
#

How much is paying too much for AI in your opinion?

slender ledge
#

40 euros

#

idk

shy isle
#

ai should be free

keen beacon
#

Unfortunately

shy isle
#

hmm

slender ledge
#

Holy

#

I found a website with unlimited opus 4.7 for free

heady kite
#

Nothing is free

slender ledge
#

Yes

#

And gpt 5.5

#

And 5.4

versed kelp
slender ledge
#

5 Reacts and I slide

heady kite
#

Do any of the open models actually compete with the proprietary ones? GPT 5.5 Thinking etc are agentic and I'm wondering how you replicate that

slender ledge
#

All free there

versed kelp
#

wats da sitee

slender ledge
#

5 reacts lol

#

Also opus 4.6

heady kite
#

Are you "just seeing this" or can you confirm that they actually perform? Because I can spin up a site that says it has GPT-6.

slender ledge
#

And gpt image 2 unlimited

slender ledge
#

So I know it’s real opus

slender ledge
#

Ima tell

#

19

#

10

#

8

#

7

#

6

#

5

#

4

#

3

#

2

#

1

#

Genspark.

#

W Slido

sly cedar
#

I noticed there is subscription

#

Which is unsure for me

slender ledge
#

For video

#

And other

#

But gpt 5.5 and opus 4.7

sly cedar
slender ledge
#

All unlimited

slender ledge
sly cedar
slender ledge
#

So u buy the plus

#

Plus is video too not unlimited

sly cedar
keen beacon
#

This is fake Opus 4.7

slender ledge
#

No

sly cedar
#

Im suspicious

slender ledge
#

I believe it’s the real one

sly cedar
#

Is it normal?

keen beacon
#

This is what real Anthropic models look like

sly cedar
#

Ima test this

keen beacon
#

Normally there will be a big # at the beginning of replies

sly cedar
#

To make sure

heady kite
#

Tried their "super agent" mode got this

keen beacon
#

This does not have that.

velvet furnace
velvet furnace
slender ledge
#

Genspark

heady kite
#

Try sending one message with Opus and you get the pay up popup

#

Nothing is free

slender ledge
#

Nope

heady kite
#

You have credits in the bottom left. Mine went to 0 after the first message

slender ledge
keen beacon
heady kite
#

Yes I was using agent mode

#

Good point

slender ledge
#

It’s real opus

heady kite
#

Tbh I think agent mode is the only one that matters for complex tasks

keen beacon
#

But they hacked the system prompt of some $#!+ slow model and call it Opus 4.7

slender ledge
#

Nope

heady kite
#

Well I get the pay popup every time now... have fun. I will just use the real models

#

If you have good agentic results with an open model please let me know

slender ledge
#

Use ai chat

slender ledge
#

What should I ask

#

So u guys know it’s real

keen beacon
slender ledge
#

NVM

keen beacon
#

For that I got 913 lines with Arena Max ("Response provided by Anthropic") vs 194 for Genspark AI's Claude Opus 4.7

slender ledge
#

It’s not unlimited

#

oh then it’s fake

sterile tartan
#

I really have to complain about Genspark

slender ledge
#

wym

#

Yea it’s not unlimited

sterile tartan
#

I'm just saying Genspark is not trustworthy

rocky geyser
#

What do you think about agent mode in arena?

stable sage
#

Has anyone tried arena.ai/agent mode? I just saw it and tried to check that preview, and it was gone

sterile tartan
#

Don't buy their subs

rocky geyser
stable sage
stray aspen
whole sundial
pseudo hemlock
#

Hi kiri

low quiver
echo aurora
#

Good morning/afternoon/evening

echo aurora
#

saw it and tried to check that preview and gone
Did the generation not work? Or did you leave the page and when you came back it was gone?

whole sundial
stable sage
echo aurora
#

I couldn't give an early heads up sorry to say.

echo aurora
#

We need the votes and to validate the data. This process can depend on how long it takes from arena to arena.

stable sage
#

Even I couldn't see it in canary too

echo aurora
frank spruce
#

If you see a suspicious Discord login without the word "Discord" in the search bar, then don't log in; chances are it is a scam.

pseudo hemlock
#

PINEAPPLE

#

do yk how many votes gpt 5.5 has

#

or when it will be on the leaderboard

vernal raft
#

It's random?

#

Damn I didn't even have the chance to use it

obsidian cargo
#

gpt-5.5 high below muse spark??????

half veldt
#

damn

gray lion
#

wow

silent tree
pseudo hemlock
#

thats so interesting

silent tree
#

chances of getting gpt 5.5 was so low

#

how would u even vote??

#

if only direct was there

upbeat ferry
#

how is that possible

#

GPT 5.5 is the frontier model that rivals opus

split kayak
#

how to come up with good prompt battle

silent tree
echo aurora
silent tree
#

the chances of getting gpt in battle

#

were so low

#

its literally hard to vote 😔 😔

gray lion
#

i got it enough times

upbeat ferry
#

We need to vote for lord sam

silent tree
arctic lark
#

Gpt 5.5 is so low on the leaderboards, it makes no sense. It's waaaay better at coding than opus 4.7 in practice but my god is it low on the leaderboard relative to it???

silent tree
#

its not even possible

#

muse spark is so trash

upbeat ferry
#

I want team red vs team white

silent tree
#

how the hell?

glacial mulch
#

doesn't make any sense to me, i am pretty satisfied with 5.5

whole swallow
#

How it’s lower than muse spark on coding

upbeat ferry
#

I don't want another team specially with a blue color

silent tree
#

@surreal zephyr

whole swallow
#

Is there an api endpoint for muse spark

silent tree
#

yo gpt-ampro 4o

cinder jasper
#

Gpt 5.5 is not that bad wtf?

silent tree
#

its rigged ig

#

😔

graceful sundial
scenic loom
#

Lmfao

cinder jasper
#

The leaderboard is rigged as f*

whole sundial
upbeat ferry
#

GPT 5.5 would crush sonnet in every way

silent tree
#

thats not even possible

#

how is 5.5 worse than 5.2

#

not possible lol

scenic loom
#

Gpt 5.5 mogs opus lol

upbeat ferry
#

GPT 5.5 rivals opus 4.7

whole sundial
#

at least, not one available to the public. i think businesses have access to it but that's it

silent tree
#

either the votes are rigged or no one noticed 5.5

rocky geyser
#

Code arena lets you add secrets and a database now??? That is so cool!!

cinder jasper
#

Whatever happened to LMArena, lol? I'd rather trust the ArtificialAnalysis leaderboard. GPT 5.5 is worse than 5.2, ahhh

shrewd citrus
#

5.5 can’t be that bad lol

whole sundial
#

honestly a stupid idea imo, meta would be making millions if they opened up muse spark api

rocky geyser
upbeat ferry
#

Wow

#

and you even use sonnet

#

which beats 5.5

rocky geyser
#

It is battle mode ;-;

smoky ruin
#

how is it 9th in code

silent tree
#

100% rigged

rocky geyser
graceful sundial
cinder jasper
#

That makes so much sense Aware

gusty helm
#

loool?

upbeat ferry
#

How does zuck's muse spark beat 5.5

graceful sundial
#

I don't like that all the top models have been removed from the direct chat

gray lion
upbeat ferry
devout vault
smoky ruin
graceful sundial
cinder jasper
upbeat ferry
#

but somehow muse spark beat 5.5

plucky sparrow
#

wow gpt 5.5 getting its ___ kicked more than i thought

upbeat ferry
#

It's a laughably bad model...

#

can't even be compared with 5.5 and opus

plucky sparrow
#

i actually like it

cinder jasper
#

And here it is in first place? so who is the liar?

stray aspen
#

i told you

#

claude 4.7 is better

#

than gpt 5.5

upbeat ferry
#

claude has a lot of fanboys

stray aspen
#

no

spring oar
#

Opus is definitively better than GPT

stray aspen
#

its just better lmao

cinder jasper
#

nope

spring oar
upbeat ferry
#

5.5 is kind of equal or slightly worse

silent tree
# spring oar And ?

watch how they'll try to ragebait openai users by saying they worship sam altman

#

btw arena is inaccurate now

stray aspen
hollow nebula
silent tree
stray aspen
#

still better

spring oar
upbeat ferry
stray aspen
#

but the spud will be better than mythos

cinder jasper
silent tree
spring oar
silent tree
#

theres no way "muse spark" beats 5.5

plucky sparrow
#

benchmarks suck 😄

stray aspen
silent tree
#

Its not accurate

upbeat ferry
#

also deepseek v4 is on par with opus and 5.5

#

why it ranks 10th

arctic lark
stray aspen
plucky sparrow
silent tree
cinder jasper
stray aspen
#

its not even on par with gemini 3 flash LMAO

whole sundial
#

llama 4 maverick part 2?

upbeat ferry
wide parrot
#

Anyone have info or other servers on AI workflows

upbeat ferry
#

1m context window like opus

cinder jasper
stray aspen
upbeat ferry
#

bro even muse spark beat 5.5

#

how is my boi deepseek at 10th

#

Its also so cheap

stray aspen
cinder jasper
stray aspen
cinder jasper
rocky geyser
stray aspen
#

it's not as good as it was at release

rocky geyser
upbeat ferry
#

Gemini as a model is good, the tooling around it is bad

#

It was the most intelligent model a while ago.

silent tree
upbeat ferry
#

and even that got slapped by muse spark lmfao

rocky geyser
cinder jasper
silent tree
#

first u got agent mode before everyone else

#

now u get the new "database" feature

#

😭

rocky geyser
#

Its fun tho

upbeat ferry
#

I can't stop laughing because of muse spark

silent tree
rocky geyser
quartz light
#

@echo aurora since the code arena results are being criticised:

here is my proposition:
when the arena team implements the credits system, they should create coding tasks for AI to complete. these tasks should cover a wide variety of challenges that are easy to judge on quality (physics-based and so on), in different categories like 2d, 3d, and games. you would need to spend at least 5 sec viewing and interacting with the 2 results to vote, and you would need to write what is better about the one you selected compared to the other. then a fast, efficient model judges your reason and rewards credits depending on how good the reason is (0 if nonsense/unrelated, 50% if simple, 75% if decent quality, 100% for high quality)
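
A minimal sketch of the reward tiers in this proposition, just to make the numbers concrete. The function name, the quality labels, the base credit amount, and the reading that the 5 sec minimum applies to each of the 2 results are all assumptions; none of this is an announced arena feature.

```python
# Hypothetical sketch of the proposed reward tiers; every name here is
# illustrative and not part of any real arena API.
REWARD_FRACTION = {
    "nonsense": 0.0,   # nonsense or unrelated reason
    "simple": 0.5,     # simple reason
    "decent": 0.75,    # decent quality reason
    "high": 1.0,       # high quality reason
}
MIN_VIEW_SECONDS = 5  # assumed to apply to each of the 2 results


def credits_for_vote(base_credits, judged_quality, seconds_on_a, seconds_on_b):
    """Return the credits earned for one vote, or 0 if the vote is not eligible."""
    # A vote only counts if both results were viewed long enough.
    if min(seconds_on_a, seconds_on_b) < MIN_VIEW_SECONDS:
        return 0.0
    # The judge model's label picks the fraction of the base reward.
    return base_credits * REWARD_FRACTION[judged_quality]
```

With a base reward of 100 credits, a "decent" reason after viewing both results for more than 5 sec would earn 75 credits under these assumptions.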

rocky geyser
#

Poor pineapple always getting pinged 😭

silent tree
rocky geyser
upbeat ferry
#

look at the companies

rocky geyser
#

since when can arena use commands 😭

upbeat ferry
#

How can meta make a good ai

quartz light
quartz light
#

read what i said

upbeat ferry
#

Lmfao even grok ranks 9

echo aurora
silent tree
#

@rocky geyser can u show proof vid (does the feature work?)

quartz light
rocky geyser
stray aspen
stray aspen
#

its so bad

silent tree
#

inaccurate results btw

proud bobcat
#

And gpt still lags behind

silent tree
silent tree
#

dont always rely on one site

#

💀

gusty helm
#

quick somewhat off topic question:
does claude take p4 and gemini p3 because of the rank spread here? the score is tied but the +/- is smaller?

cinder jasper
quartz light
#

@echo aurora do u see the criticism now

proud bobcat
#

Holy mother of cope

zealous sparrow
proud bobcat
#

Arena is based on people’s actual use of the model

silent tree
#

tuuuufffff

proud bobcat
#

If people don’t like 5.5 why would it be inaccurate

quartz light
gusty helm
sterile tartan
silent tree
silent tree
#

and shooockkkkkedddddddddd

echo aurora
proud bobcat
#

We need more benchmarks tbh

#

SweBench is not reliable anymore

cinder jasper
#

Ahhh gpt 5.2 chat and gpt 5.5 high are on the same level, one point difference

silent tree
sterile tartan
pseudo hemlock
#

Why is muse spark still marked as preliminary?

#

it has twice as many votes as gpt 5.5 high

stray aspen
sterile tartan
#

Is still Preview

proud bobcat
upbeat ferry
proud bobcat
#

True

proud bobcat
#

I mean more independent benchmarks

#

For one I plan to make my own creative writing benchmark

#

For NSFW and SFW

echo aurora
sterile tartan
#

So they can't be Gamed

boreal anchor
#

Anyone here knows how to use Claude Opus 4.7 thinking in the Battle Mode?

west sierra
#

i just got buttfucked on polymarket betting on 1980+ debut score

quartz light
# echo aurora Our teams are looking into this criticism now, I'll followup when I have more in...

to add onto this, gpt has had a very similar ui throughout the releases from gpt 5.2-ish to 5.5, and that's what's being judged, not the code quality (functionality) of the websites themselves. so what I'm suggesting is a large, curated list of coding tasks for the models to complete (love the ones showcased on arena's youtube videos btw), which users could vote on in exchange for credits (when those are added)

also note, I'm not sayin the non-curated code arena should be removed, the curated code arena will give people the opportunity to earn credits which they could use to send prompts to premium models in the non-curated code arena/text arena BlobCatHeart

signal pelican
#

people expected too much

silent tree
#

Gpt 5.5 high > gpt 5.5 > gpt 5.4 xhigh > Clock Popus 4.7 Thinking > Clock Popus 4.7 > GPT 5.4 > Gemini 3.1 Pro > GPT 5.3 Codex

sterile tartan
#

cx easy

silent tree
#

no dude once u ask it a question u get 392928282829229$ bill

#

😔

upbeat ferry
#

It is the most popular harness after chatgpt web

sterile tartan
#

So it shouldn't be there

silent tree
#

no hate to deepseek

stray aspen
upbeat ferry
#

you're making him look like muse spark

stray aspen
sterile tartan
silent tree
stray aspen
#

gemini 3.1 pro is trash

cinder jasper
sterile tartan
#

Indeed

rocky geyser
silent tree
#

AiAnosCranel 3.5 mogs

upbeat ferry
upbeat ferry
#

I agree

slim gorge
#

arena.ai is always so biased against gpt... like gpt always ranks so unreasonably low in this leaderboard even though its obviously up there with claude opus.

silent tree
#

@echo aurora Is arena really implementing a database feature where u can add API keys? and stuff, @rocky geyser told

quartz light
#

i'd put opus 4.7 above 3.1 pro 😅

upbeat ferry
cinder jasper
# stray aspen gemini 3.1 pro is trash

But if you're using it through a third-party provider where you're only getting 25% of the actual model's performance and only 25% of the original context, don't be surprised. I get the feeling most people only use AI through GitHub Copilot, lol

silent tree
silent tree
stray aspen
#

yeah it sucks

cinder jasper
#

Mistral Large 3 is the best coding model ever ❤️

upbeat ferry
silent tree
sterile tartan
#

Muse is Better than Mythos
Because it's Usable.

slim gorge
silent tree
stray aspen
sterile tartan
cinder jasper
upbeat ferry
#

I want deepsleep guy to post muse memes

cinder jasper
stray aspen
#

next update is the spud

silent tree
#

but then arena rigs it

#

and makes spud the lowest

#

😭

quartz light
#

btw yall, jus saying but sonnet 4 launched less than 1 month after 3.7 👀.. sonnet/opus 5 by next month?

upbeat ferry
#

It's still a valid metric

#

Thats why so many people are pissed about muse

slim gorge
cinder jasper
silent tree
#

the spud will wake up

sterile tartan
quartz light
stray aspen
#

the spud will mog every model

upbeat ferry
silent tree
#

no model would ever reach spud's level

#

anthropic will cry

#

HARD

silent tree
#

and wont release mythos cuz they dont have security

cinder jasper
#

It's crazy that there are still Claude fanboys out there after everything they've done

stray aspen
slim gorge
#

gpt 5.5 is mythos level except u can actually use it 😂

stray aspen
#

the spud will

silent tree
cinder jasper
#

Yeah, I thought so too, but we shouldn't get too attached to a model—they've just been messing with us, with the limits, the support, and the models

silent tree
upbeat ferry
stray aspen
median smelt
silent tree
sterile tartan
cinder jasper
#

It was about the promotional offer, which was only used so that the limits could be lowered afterward

upbeat ferry
#

Sam altman is the gaben of AI universe

silent tree
sterile tartan
stray aspen
silent tree
median smelt
silent tree
#

Use websim bot it's unlimited and freeeee

stray aspen
#

it still works?

median smelt
#

GUYS ITS DOUBAO NOT DUOBAO

silent tree
#

but websim removed

#

sadlyyyyyyyyy

stray aspen
#

i thought they patched it

median smelt
stray aspen
median smelt
#

it was only image reference before

upbeat ferry
odd roost
cinder jasper
#

Then why don't they just say so? Instead, they try to cover everything up. OpenAI is open with us and tells us when something's going on, and they didn't use their promotional campaigns to pull the wool over our eyes.

slim gorge
#

also all the bullsh they been saying that "we arent releasing mythos because it can hack anything" is just straight cap, its all fear-based advertisement

cinder jasper
#

idk but openai do it, so its claude fault 🚬

stray aspen
#

doubao doesnt work

slim gorge
#

isnt gpt 5.5 based on spud?

sterile tartan
#

Mythos~

stray aspen
slim gorge
#

holy

sterile tartan
#

Image 2 is also based on 4o

silent tree
sterile tartan
#

Sama will Mog All

slim gorge
#

gpt is gonna be so back

silent tree
#

its not spud, spud supports voice in and out

stray aspen
#

they already moggeed every model

slim gorge
#

real

thorny schooner
#

Did anyone else just get logged out randomly? cuz I can't log back in for some reason

dim oak
#

trust me is the last problem the fe

upbeat ferry
# silent tree

I want to create one for muse please give me the prompt and the model

solemn nacelle
#

Muse spark and Gemini still better than gpt 5.5 high
lol 😂

sterile tartan
#

Just use image 2 first

dim oak
#

if you let a swe see an ai-made backend, they're gonna cry

upbeat ferry
#

Wait for him to become 1st

stray aspen
#

stop ragebaiting

silent tree
normal abyss
slim gorge
#

too bad gpt 5.5 didn't remain efficient like the previous models. it burns through tokens faster and the tokens cost more. but maybe they'll fix that

faint vigil
#

I love gpt image 2

silent tree
#

😔

stray aspen
#

me too

sterile tartan
#

Me2

stray aspen
#

cause im so tuff

slim gorge
#

how

silent tree
#

websim users assembled

silent tree
slim gorge
#

oh

stray aspen
#

no

sterile tartan
#

Ahhh hell nah

upbeat ferry
stray aspen
#

it has good limits tho

#

for images

faint vigil
sterile tartan
#

Free and Go users still stuck on GPT 5.3 Instant

stray aspen
#

yeah cause you paying

#

im free tier

sterile tartan
#

Do you like 5.3 Instant

stray aspen
#

no

#

its bad

silent tree
#

I love 3.5

sterile tartan
#

Time to Upgrade

silent tree
#

instant

#

Tuff alert

sterile tartan
#

You forgot /imagine

surreal zephyr
#

Wtf

silent tree
#

ampro

#

ik

#

the results..

surreal zephyr
stray aspen
#

why is bro trying to generate images in general

silent tree
#

Its rigged

sterile tartan
#

💀

surreal zephyr
#

100%

#

There's no explanation

silent tree
surreal zephyr
#

Gpt 5.5 makes MUCH BETTER UI

stray aspen
silent tree
sterile tartan
stray aspen
slim gorge
sterile tartan
#

Myth OS, Slopus, So-nnot, Hawkthu

silent tree
#

it was live yesterday

sterile tartan
#

Assthropic Lineup

silent tree
#

I'm sure we all saw the live

sterile tartan
#

Sama was Mogging

slim gorge
#

it looks insanely real holy

sterile tartan
slim gorge
#

image 2 is crazy

silent tree
silent tree
#

😭

stray aspen
#

lol

main nexus
slim gorge
silent tree
frank spruce
#

If you see a suspicious Discord login without the word "Discord" in the search bar, then don't log in; chances are it is a scam.

main nexus
#

Ok?

#

Uh

#

Good to know

silent tree
#

@echo aurora is this guy hacked or smth

echo aurora
faint vigil
#

Deepseek is the best

signal pelican
#

stop being lame

stray aspen
slim gorge
#

deepsleep

sterile tartan
slim gorge
#

LOL

sterile tartan
#

Is this Deepseek in some Alternate Universe?

upbeat ferry
faint vigil
silent tree
faint vigil
#

elephant now

silent tree
#

claude popus x deepsleep collab = 😴 Sleepos (Mythos but Sleepos)

sterile tartan
#

🤣

#

Deepseek is still cheap and usable

#

Unlike Myth OS

stray aspen
sterile tartan
stray aspen
#

slopus 4.7

slim gorge
#

notice how it's called mythos? because its just a myth

surreal zephyr
upbeat ferry
surreal zephyr
#

This is 5.5

surreal zephyr
#

Opus is not even in same universe

sterile tartan
#

Booooooooooooo!

slim gorge
#

opus is better in some aspects, i gotta admit that. but gpt is gonna surpass it in every aspect soon enough.

sterile tartan
#

Fastest to do that

#

🔥

slim gorge
upbeat ferry
sterile tartan
sterile tartan
#

20$ Free Plan

slim gorge
#

yeah pro plan feels useless af

sterile tartan
#

GPT Plus Mogs All

slim gorge
#

been using gpt plus and never had any issues with usage limits

sterile tartan
#

Yeah is kinda unlimited

#

Not exactly but kinda

surreal zephyr
#

They shadownerfed it 100%

sterile tartan
surreal zephyr
#

5.5 medium slow eats MUCH MORE than 5.4 xhigh fast

sterile tartan
#

This is Anthropic

surreal zephyr
#

I've eaten my 15% today via 5.5 medium slow

slim gorge
#

its just the model, it eats much more

surreal zephyr
#

5.4 xhigh fast spam couldnt even eat 2% per day

#

Iirc 5.5 counts 2x by default

#

For subs

sterile tartan
#

Is not exactly the model @surreal zephyr

#

The limits are lower than before

surreal zephyr
#

Yes but

#

Idk if 5.5 medium

#

Or 5.4 xhigh fast + subagents

sterile tartan
#

5.5 Medium is actually holy token efficient

surreal zephyr
sterile tartan
#

But the limits are actually decreased

sterile tartan
#

But at least the ChatGPT and Codex limits are separate

#

Unlike Assthropic

surreal zephyr
#

Honestly i dont like 5.5 that much

#

Its lazy as hell

#

For me

slim gorge
#

i used 5.5 xhigh (without /fast) for a bit in codex cli but i didnt hit the usage limit, didnt check how much it used tho

surreal zephyr
#

Week later

#

5.5 medium slow eats MORE

#

?!?!?!??

upbeat ferry
#

Is this real

sterile tartan
#

Subsidized Ai is slowly Decreasing

stray aspen
#

thats gemini

slim gorge
#

google needs to lock in too, they falling behind af

surreal zephyr
#

5.4 was exact opposite of lazy

#

5.5 is lazy

upbeat ferry
slim gorge
#

we need a digital whip for codex too

sterile tartan
olive spruce
#

5.5 lowkey goated

#

5.4 was so bad

surreal zephyr
#

5.4 pro was able to think few hours in single request and finish

upbeat ferry
surreal zephyr
#

5.5 pro thinks for 15 mins and gives up.
Waste of requests

upbeat ferry
#

I use this atm

slim gorge
#

yeah i thought gpt was falling down until 5.5 came out

sterile tartan
surreal zephyr
#

Per token yes.
Per request? Absolutely not

sterile tartan
#

@surreal zephyr just use 5.4

#

If it works better for u

slim gorge
surreal zephyr
upbeat ferry
surreal zephyr
#

And 5.4 bugs at long context

sterile tartan
olive spruce
surreal zephyr
#

Enterprises pay per request.
Pro model in web has request cap

upbeat ferry
#

Its real

#

🥺

slim gorge
#

wish i could access the large context window with plus 😭

olive spruce
silent tree
rocky geyser
silent tree
#

just GET IT.

proper sun
#

5.5 lost

silent tree
rocky geyser
silent tree
upbeat ferry
#

he's the real deal

#

Not a claude fan boy not a sam worshipper

rocky geyser
#

But the ai can actually run commands its pretty fun

silent tree
#

it can run bash?

upbeat ferry
thick blade
#

chatgpt 5.5😩

icy remnant
#

if i use claude for chat and hit the limit, can we know when it resets?

thick blade
#

its not an app

icy frost
#

they are the same as yupp

#

that got closed

#

first no models and then rate limits per week

rocky geyser
icy frost
silent tree
rocky geyser
silent tree
#

no cap

#

He's lucky

#

It's an experiment arena's testing rn and he got it early

upbeat ferry
rocky geyser
upbeat ferry
#

Or he got the early build

silent tree
rocky geyser
silent tree
#

@echo aurora I request you to check modmail and completely block the word

#

Its gotten annoying now

icy frost
silent tree
#

asia

#

america

icy frost
icy remnant
# icy frost theres no way they added limits on models

is there any method for when i talk with the ai so much in one chat that the chat history gets long, and then the limit hits and i still need the full chat history? why not add an option to create a pdf of the whole chat, like perplexity?
or how do i use those chats in a new chat to continue the old ai conversation?

rocky geyser
silent tree
silent tree
#

Hm

#

Wbu @icy frost

icy frost
silent tree
#

U Europe tooo???

icy frost
#

on my pc acc where i wasn't it's not

icy frost
silent tree
#

Time to go to Türkiye

rocky geyser
#

Lol

#

🤣

#

I know a friend he is in europe too but he doesnt have it

rocky geyser
icy frost
icy frost
rocky geyser
silent tree
rocky geyser
icy frost
icy frost
rocky geyser
silent tree
rocky geyser
silent tree
#

yea but faster

icy frost
olive spruce
icy frost
#

idk if it exists in real version but in battle this popped out

surreal zephyr
#

I just realised they gave 5.5 the default 1500 score 💀

icy frost
#

what do i prompt to the agent

olive spruce
#

lol

icy frost
#

cuz theres no captchas

#

on my acc

olive spruce
silent tree
#

these guys got it early man SO LUCKKKKKYYYYYYYY

olive spruce
#

fr

zinc herald
#

Hello everyone

rocky geyser
olive spruce
olive spruce
icy frost
rocky geyser
olive spruce
#

why is it giving story lore

rocky geyser
light sleet
#

unfair 😡

light sleet
#

no agent for me

#

😡

zinc herald
#

Please, someone suggest the best free vibe coding ai

light sleet
#

I see what youre doing arena i see 😠 😠

rocky geyser
icy frost
rocky geyser
icy frost
light sleet
#

Feather got everything early

rocky geyser
icy frost
#

let me try code

#

YOO THERES GLM

rocky geyser
icy frost
#

it works

icy frost
light sleet
#

Ask it to run command

rocky geyser
icy frost
#

so whats the agent mode

light sleet
rocky geyser
light sleet
#

feather let's work together and make an AI website, I'll credit u

icy frost
rocky geyser
rocky geyser
zinc herald
light sleet
#

Feather dm, I'll give you an insane code arena prompt

rocky geyser
#

Okay

icy frost
rocky geyser
#

So i think i might have 2 beta features on one account :D

vale quest
#

4.7 is obviously better

icy frost
echo aurora
icy frost
#

GUYS I FOUND A NEW FLAG
disable-turnstile-voting → "disable-turnstile"
THIS WILL PREVENT AUTOFARMING
TURNSTILE IS BACK

quartz light
#

ok

icy frost
#

and this
portal_enable_billing_topups → true
it's about paid credits or something

vagrant temple
#

when will mimo v2.5 pro come to the ranking?

shrewd citrus
icy frost
whole sundial
# icy frost GUYS I FOUND A NEW FLAG disable-turnstile-voting → "disable-turnstile" THIS WILL...

turnstile is already used when voting, this is probably for disabling turnstile while voting (idk why they would want to do that, maybe because it's not really needed since recaptcha already runs when submitting a prompt? or maybe they want to replace recaptcha with turnstile when prompts are submitted and they may think that running extra captchas could make normal users seem more suspicious? which is btw why many people get them on arena, when you submit enough prompts it will think that you are suspicious because of all of the captcha calls.)

whole sundial
rocky geyser
median smelt
#

gpt-5.5-xhigh!!

rose jackal
#

add larp setting

stray aspen
#

<@&1349916362595635286>

echo aurora
#

Thank you

#

We're not going to delete that one for now. We'd like to take a better look at what we can do to prevent this going forward.

icy frost
#

real mrbeast

#

what do they even install to get that

echo aurora
loud herald
summer relic
#

Please remove verifications, when I click send message I have verifications and I have to select an image, there used to be one verification for several messages, and now I have one verification for one message

#

fix it

#

I have this with every AI

#

and Claude 3.5 Sonnet doesn't work, it says "something went wrong" on every message, even in a new chat

icy frost
frosty lava
#

if we could at least try the model accurately

#

the leaderboard would be much more accurate

#

if one model does bad work but fast, and the other one does much better work but then at the end has to simplify it because of the limitation, people will choose the one that did the bad work

#

cause when a model "simplifies" to meet the memory limit it simply ruins the whole work

frosty lava
#

when its far from being accurate

#

An accurate ranking system should be arena TOP priority

#

the website is made for this

echo aurora
#

Hey @frosty lava thanks for sharing this. It's important to note that in Battle mode, "Something went wrong" errors are rare, especially in comparison to what you see in Direct and Side by Side. If the model is having issues, the system will automatically sample a different model. This reduces the number of errors seen in Battle. Even when an error happens, a vote cast on that response wouldn't be counted towards the leaderboards.

As for the context limit, IIRC the context limits are mainly for Direct/Side by Side. It's a similar story though: if a model does reach that limit, those votes wouldn't be counted, as it'd be considered a failed generation.
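
As a rough illustration of the rule described above (votes on errored or context-limited generations are dropped before scoring), here is a minimal sketch. The `Vote` record and its field names are hypothetical; how arena actually stores or filters votes is not stated in this thread.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    """Hypothetical record of one battle vote; field names are assumptions."""
    winner: str
    loser: str
    had_error: bool = False          # a "Something went wrong" response
    hit_context_limit: bool = False  # generation was cut off by the limit


def countable_votes(votes):
    """Keep only votes where neither failure condition occurred."""
    return [v for v in votes if not (v.had_error or v.hit_context_limit)]
```

Under these assumptions, only the votes returned by `countable_votes` would feed into the leaderboard scores.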

oblique sentinel
#

I think when looking at arena scores you should just internally caveat it with the fact that it's about personality and other things, not about which model is objectively better

#

5.5's personality is subpar relative to anthropic/gemini, etc.

#

it's less sycophancy-maxxed

normal abyss
vale quest
#

Or 15 maybe

normal abyss
vale quest
#

I have an idea

normal abyss
#

my biggest worry is that opus will be ridiculously low, however gemini 3.1 pro will be strong at 20 for sure

gray lion
#

around 15 is good i think

#

15-20 yea

normal abyss
proper sun
#

Everyone here thought gpt 5.5 was gonna beat Anthropic

vale quest
#

Sonnet at ~20

gray lion
#

like real ones

normal abyss
vale quest
vale quest
normal abyss
vale quest
#

Legit what it said

vale quest
#

Hey why dont they enforce ... nevermind

normal abyss
#

it would be 4 if you had an error on opus, which is pretty likely if you're talking back to its release in 4.6

stray aspen
#

Does anyone know a way of using Seedance 2 for free

frosty lava
#

Yes, of course on coding, muse spark, sonnet 4.6, kimi k2.6, glm 5.1 are better than gpt 5.5

#

it totally makes sense bro

#

when qwen 3.7 will come out and be top 1 in coding you will say its fully deserved

#

right ?

#

that's what im talking about here it just doesn't make sense we all know kimi k2.6 is far from the level of gpt and anthropic, same for muse spark and glm 5.1

#

its not even a question

#

Yeah and i think it's cause of the memory limit when the model tries to produce too big a result on arena, for example three.js often hits a memory limit on arena

#

and then the model have to simplify its work

#

then the result you get at the end is not even at the level of the real result

#

i don't know this one seems atleast to somewhat make sense

#

of course

#

but anthropic vs gpt is a big battle and there's different opinions

#

but those small team model

#

are far from this level

#

they don't deserve being ranked that much

#

i can't wait for qwen 3.7 or something like that to be top 1 overall

#

of course they are and they probably also distilled

normal abyss
#
poll_question_text

What seems a fair amount of daily messages for 1 account (through Gemini 3.1 Pro)

I'm just curious what people would think is a valid amount???

victor_answer_votes

3

total_votes

12

victor_answer_id

4

victor_answer_text

20

frosty lava
#

in real coding task they are much worse than the real frontier model

#

i think honestly its because those chinese model are in fact benchmaxxed and trained on frontend web design to make people think its actually good

#

and it also have an impact on arena leaderboard

#

try any of those in real coding environment with real task and you'll see the real difference

compact flame
#

Guys how do you think

#

How much mythos would've cost

#

If it was released

#

To public

frosty lava
#

it won't be priced like a normal opus model

compact flame
#

Since if it's that op

frosty lava
#

who want to pay 50 dollar per prompt

compact flame
#

We can't test it anyway

#

So we won't know

compact flame
#

Though gpt o1 pro costs 600$ output

frosty lava
#

its not worth it, the pricing is not worth it, its only worth it if you do some highly difficult research task and you don't need to do multiple prompt

silent tree
#

even if it really was, 5.5 xhigh is literally mythos quality. the only reason anthropic isnt releasing mythos is because they dont know how to get better security, which means mythos could be easily jailbroken, yet OpenAI released 5.5 xhigh because their security is better. Mythos is just an overhyped gatekeep.

frosty lava
#

basically pro models are only worth it for very difficult research and things like that, but definitely not worth it for everyday tasks or even for creating your project

#

cause you simply can't use it everyday

#

and im not talking about the usage you'll get with the subscription

#

Yes i know that and we get some good usage with it, but i really don't believe in anthropic for being that generous

#

especially considering how they glazed it

#

they'll use it for increasing the price

#

just for the next opus to be equivalent to the actual mythos

#

for 20x lower price

silent tree
normal abyss
frosty lava
#

but i hate that i can only use gpt pro on web

#

if i could use it on codex

#

it'll be good

#

i actually need it in codex to do useful things

compact flame
frosty lava
#

its good for most task

frosty lava
compact flame
frosty lava
#

no its definitely not i tried it

compact flame
#

Just a different code name

frosty lava
#

i tried both

compact flame
#

Oh ok

frosty lava
#

its not the same model

compact flame
#

Xhigh or pro

frosty lava
#

the pro definitely

#

it wouldn't make sense anyway if the thinking 5.5 would be smarter

#

actually i tried 5.4 mini and realised that its actually a powerful and very fast model

#

for some prompt i thought on giving only to 5.5 i was just stupid

silent tree
#

^^

frosty lava
#

older model can already do most of what we're asking

#

and for 5x cheaper price

#

and it also just do the task faster it have higher inference or something i don't know why

#

but it felt like it was on fast mode without being on fast mode

silent tree
hollow rune
whole sundial
#

<@&1349916362595635286>

icy ice
#

qn for arena official folk - is there any way to use 'code' mode to build a cli app? the 'UI' option always defaults to a web app with vite/react etc

stray aspen
#

they are on par

raven otter
#

Image-2 went from being the best to being the worst

pale knot
#

js wait a few weeks, then they’ll make it the worst model—the same method they use to make the next model look impressive

echo aurora
#

Correct

pale knot
#

also I’ve found this LLM to be really good. Does anyone know anything about it? Who provides it?

vernal raft
#

@echo aurora quick question, are all your "thinking models" set to the maximum? like anthropic models

#

i noticed that the thinking box is way, but wayyyyyyyyyyyyy longer

balmy reef
#

hello

pale knot
balmy reef
#

i wanted to ask where the arena video channel went?

echo aurora
#

Since Code Arena mostly deals with the front end, which isn't the area where this model excels, the votes are showing that. This is why we're in the process of developing Code Arena to be able to incorporate full stack. This will give a clearer understanding of where models are ranking based on different engineering tasks.

echo sinew
balmy reef
#

thanks buddy

echo aurora
#

I am not sure tbh. Most conversations I've been seeing today are around Code Arena.

#

Can you explain a bit more about this?

#

The overall text score isn't going to have different weights given to each category. Each category will be just that, their category. But the overall leaderboard isn't going to give bonus points for being X or Y category.

plucky fractal
#

funny to watch people on twitter getting mad about a benchmark that's voted by real people with real tasks lol

#

it's very telling

pale knot
plucky fractal
#

ofc. they focus too much on benchmaxxing, which is understandable

light sleet
plucky fractal
#

but seeing people getting angry at arena is just funny lol

echo aurora
echo aurora
plucky fractal
#

i dont mind it but some people just ridiculously attacking arena

light sleet
plucky fractal
#

tell me any other place where you can test every model freely

light sleet
#

idk where the suggestion went but I definitely did or idk

light sleet
#

Gpt image 2 generates steak

proper sun
#

@verbal kite

#

<@&1349916362595635286>

#

<@&1349916362595635286>

echo aurora
#

Thanks

stray aspen
#

its nice

stray aspen
crude anchor
#

But I think this is a real image

whole sundial
stray aspen
#

gpt image 2

stray aspen
#

those views are probably AI too lmao

hollow ivy
#
poll_question_text

Does CM5 (Claude Mythos V) have (some) consciousness?

victor_answer_votes

6

total_votes

12

victor_answer_id

1

victor_answer_text

100% yes

desert fiber
#

hi , anyone know which model can edit pdf and send it back , instead of answering with "i cant edit this file"

thorny schooner
#

I genuinely am disliking Max so much. Like why the hell does it keep choosing Google AI when its response limit is obviously way too short for answers at times? Honestly, did they specifically set it to favor that model or something? (I've seen it happen with both short and long prompts, across various prompt types)

icy ice