#general
just overall intelligence
the things it can do
like it can one-shot more advanced things
without problems
but it overdoes many things
hi bananov i remember you said youre making your own model
nah not anymore
i mean
i have plans to start a kickstarter for it
and it wont be continued pretraining anymore
just SFT on qwen
or pretraining smaller models
@surreal zephyr
bro i thought it has a pc
so tuff
What is the program you are using there?
I got this info at some point
More like GPT 6 Pout
claude popus speaks
and then suddenly after popus speaks, you hit your usage limit! 1!!11
Geniemi Cooks
Me geniemi unlimited rate limit cuz he is genie
🤑
5.4 xhigh was close to mythos level and 5.5 is.
5.5 is basically Mythos but Sam version
Far better than anthropic
Gpt 5.5 is Samthos
yet they dont wanna release mythos to public
cuz their security is trash
lol
why scared about people hacking, doesnt that mean their security is beyond trash?
and they cant do anything about it
Yet, "Strongest" model
yet 5.5 is mythos level and it's released
and hard to jailbreak.
Lol
Because Sam is the OG
Claude is the Copycat 🙏🏻
GPT 6 Hollow Purple eats mythos
get sam rolled
ez ez
ok this must be a Google user
ok then this is a guy that never used AI and used it for basic tasks (like how ppl use GPT for basic tasks) and hes talking 💀
bro is stuck in 2023 😔
Shh
Let my senshi
Goat talk
honestly i dont think gpt 5.5 is spud
Nah aint no way, this tuff
Its the birth of spud
never was fully spud
the grown up spud is way stronger
Then when sprout?
this is just the beginning
they cant build strong security because they dont understand what they've built in the first place 😅 not my words, they've claimed from the very beginning that they dont "understand" claude themselves. i thought it might be some marketing slogan, but knowing their struggles now, i believe they really dont understand what they've built
its not spud at all. spud is omni, spud supports voice in and out
OpenMythos is available
😭
Ok, but i guess we are going to see if next chinese model able to beat mythos
Then mythos is the "strongest"
mythos already mogged by 5.5
5.4 xhigh too *
I said, yet they are the "Strongest", meanwhile GPT 5.5 is the "Weakest" model that cooks mythos
wait bro that matches
Nah
anthro
myb
It debug my code pretty well
Even in complex scenarios
I guess claude is overhyping?
He will create "Samthos"
@surreal zephyr this true?
A Hivemind AI Knowledge in the first history
yep its not spud
"Samthos"
Roleplay exist btw ✌🏻
they looking for u @surreal zephyr
PETITION TO CREATE SAMTHOS
sora 2 died_
move ur cursor like crazy and then click the captcha and it's gonna solve without the puzzle
but then you realize hes in mobile
nopecha in firefox mobile
but then you realize he uses chrome
switch
but he loves chrome so much
but firefox supports extensions on mobile
but extensions support mobile on firefox
but support on extension mobile firefox
support but on Firefox mobile
on but mobile support firefox
oh, that poll was only for the models available (for free) at Nvidia's website
for a better model
on firefox mobile but support
linkinkmklink
linker the limk
linkin park
unfortunately, they offer no threads, so it's just a (temporary) playground (and they dont offer [chat]GPT/Claude/Gemini/Grok/MuseSpark, as these are not open models; but they have Gemma)
but with a phone, you get unlimited API access, which is huge (this might even motivate me, to get a phone, lol)
create a video with a robot battle
Where is the ai support?
Yo is Gemini the most bi polar ai ever made?
"Gemini" means twins in latin for a reason ^^
yay give it smacks ^^
fr
smash/burn/obliterate/squash/destroy/annihilate/exterminate/extinguish/eradicate
Like how the hell are you going to make a tool like this just to have it come out completely neutered and useless
Like a tiger with no teeth
like the U.N. ^^
i think chinese models will catch up
only a question of time
(i'm not a fan of ccp lol)
They’re gonna be restricted too
yeah, politically censored
If you want to do business here, you need guardrails if you wanna do an angina, you need guard rails
but useful nonetheless
We’re screwed on both sides
And on top of that, you have to pay just to get a refusal it’s stupid
I see more action at Chuck E. Cheese than in some of the output these models spit out
and then we get good AI
Thanks spicy fried chicken I'll not eat u for the next week
Anybody could be a ai safety researcher…they use the same tool just block everything
Hopefully asap
when is 5.5 coming to the leaderboard bruh
Yeah +1 to this. The idea was raised when we initially removed the models. Will flag again.
No ETA sorry to say.
Nice stream today
website is down
whole point of arena was so people could see the updated leaderboard to find the best model 😭
nvm its working now
Do you think if the model came out that was unrestricted that it would make the leaderboard?
I don’t think that would be possible
true but the model just came out and wasn't even tested before release, plus they have to get at least 2,000 verified votes before they can add it
I don't think will be too much longer
Hmm what do you mean by this?
yes but dont they test the model on battle mode and all before it actually releases cause they get early access?
and deepseek v4 was added like the same second
So like if there was ever unrestricted model, it would be too much liability
For anybody to even wanna host it or post it
many models are tested before release but some aren't, especially claude models but sometimes others
what's yo eta what's yo eta
mmmm
They were talking about grok
what's yo eta what's yo eta
mmm mmm
Now grok has to comply in order to stay in business
We have a content filter in place that's on-top of the model's ability (or lack of ability) to moderate.
I hear you, I’m just saying that getting cut off by Apple or any of these other things would make it extremely challenging for any unrestricted AI to truly flourish and exist. Not that it’s impossible.
And I don’t mean just one that could generate nudity that’s not the point
There needs to be also a censorship benchmark, I think
Because it does exist and it does affect the models perception and it does affect the model performance and capabilities
It’s extremely hard to measure though it’s like an invisible ghost that you see you feel, but you just can’t pinpoint it
Yeah I do wonder if there is some way we could evaluate all models without our own content filter. Will say though that in the current state this shouldn't affect rankings, because if the content error happens the vote isn't counted.
No, it doesn’t at all, but it doesn’t also paint the full picture
When r we gonna get gpt 5.5 on the arena leaderboards
never
The number one complaint from users on social media, regardless of which one you go to Is moderation
It’s the elephant in the room and there’s no way to even tackle something so complicated
We do hear this feedback a lot (similar to general errors + captcha) for the filter.
It's possible there is some kind of change to how this works down the road
u should replace recaptcha to chinese puzzle challenge
The arena actually expose the problem on a more systematic level
One model can, another model can’t. It’s not uniform in terms of regulations
If one model blocked something, what’s gonna stop the user from going to another model and just generating it somewhere else? What’s the point of blocking it at that point?
Not for the arena, but for these AI safety conscious people
True, but also worth noting when our content filter triggers the block, instead of the model's, that's going to block both generations in Battle/Side IIRC
How confident are you of that?
~95%
Reason why I'm confident in that is there is a unique error message when it's our filter doing the block instead of theirs
@echo aurora Sorry for tag quick question, is the new usage system coming tmr or just the update to the image gen?
Just the rate limit change, no ETA on new usage system
Damn, it's taking so long
It's a larger overhaul and going to take a bit to develop
Could you talk to the team and decide if there's something already established you can share?
I’m boutta get mad I need the models
over
Here is a good example of what I mean
You see how on the right it’s coming out of the mouth. That's a safety guardrail on one model, but it's not applied equally, because the image on the left does not omit the feature.
OVER
I see what you're saying, but this is how those models are responding to that prompt. If the generation goes through, that doesn't mean our filter is coming into effect here.
insane
over 2
where is 4.7 opus?
it is hiding in battle mode
but i dont want battle mode
ok then it doesnt cares what you think it just wants to hide there
if that sounded harsh then im sorry
im just confused why i cant use it in direct
they have to cover up the cost since it was too expensive
Lol, and who's the effing judge on efficiency
and if opus 4.7 was in direct or side by side, i can predict that the cost (that is required to keep arena.ai site up) will drastically spike up immediately
what abt 4.6
oh hell nahh
no matter what version of opus
im moving to anti gravity
the reason behind is the large scale of people will keep using it and some even abuse it.
i can estimate that 95% of people using claude opus
Just remove all gemini models and keep gpt and opus
they need it to evaluate LLM
gemini my beloved
Gemini sucks
If Gemini 4 drops
ppl would beg
Gemini 3 was a big leap too
so
Kinda
2.5 still mogs 3.1
All google does now is downgrade their model after a week
I don't know, I think 3.1 is very creative for RP. I have to tweak 2.5 to get the right tone.
many people have different POV on gemini, and fr i actually like gemini
Fr
Fr
deepseek
Who uses qwen
does anyone even use deepske
No
it sucks now
Deepseek is actually good
deepsleep 😴
im just using gemini 3.1 pro because i have a free subscription
The only thing that upsets me is that Google has done something to the reading of PDF files that look like long pages. Gemini can no longer read their contents. It's too frustrating to continue in a new chat :/
that's up to them :>
bro roasting every model
Can't wait for gemini 3.5 and then a week later they downgrade it to be worse than 3.1
bad
I hate gemini with all my heart
its great
It sucks
what makes you hate gemini and claude?
5.4 xhigh was close to mythos level
and 5.5 is mythos level
I tried opus, but for me he writes too much lol, I get tired of reading haha And gpt writes too dry and boring text.
Gemini likes to downgrade their models after releasing it for a few weeks
Opus 4.7 is worse than 4.6
what anthropic is basically saying is that
cant disagree
thats fake
clickbait
Pr stunt
really?
dammit
its fake
thats an ai pretending to be like mythos
what in the AI slop
it does say its a theoretical reconstruction of the Mythos architecture
could at least SEE how it performs
Gpt > claude > glm > gemini (being generous)
^
mimo is only good for frontend
5.1
Mimo actually goated
True
why do you roast every model tho gpt is definitely not best at everything
mimo is actually not good
im not saying gpt is good at everything just my personal opinions
mythos is better than 5.5
wdym
Vibe coding
Yeah
cap
ok
also mimo thinks forever which is annoying
me
It thinks forever tho
yes
^
thats why im just using gemini now
Yeah
Same
and honestly 5.5 so good at frontend now
its like they had a big leap for images and their frontend too
cap
image 1 and 1.5 were so sheet
piss filter
❌
Gpt 5.5 mogs all other ai models
gemini 3.1 pro is the poor man's gpt 5.5
✅
especially claude
but yeah if im saying this
nah it doesnt mog claude
its better but not massively better
Kimi just thinks too long for me
0/10 ragebait
cap
I'll never use kimi again
Hi
hi
wydm
these kind of stuff
sam altman?
Im actually gonna combust into flames if someone sends a prompt in #ai-creations thinking there is a bot that makes videos again
yes
will smith taking his sneakers off
dario vs sam mog battle
okok
Chatgpt, generate a screenshot of yo mama
I'm not sure how that will work but ok
I'll do my papa.
lol its ok @silent tree
was just being satire like those discord "users" that randomly put a prompt
Generating..
I get it
I'm generating my papa, and uhh sam vs dario mog battle ig
alright lets see
good boy gpt
so idk if it's because gpt image 2 generated it but it made sam win
Prompt: Sam Altman and Dario Amodei MOG battle
lol
why does dario's face looks fat
dairo needs a hair transplant asap
have you been doing 10+ gens with gpt 2?
im using a site which is broken
wow cool
you using davinci too?
DAVINCI USER SPOTTED
I was wondering since arena.ai sometimes pops the limit warning
so you both on davinci?
yes
Done generated my dad
giga chad
yep
his expression is so uncanny
but for more direct discord generating I use websim
but meh
dont tell me you use websim too now @stray aspen
same
@silent tree I only know about OpenAI's discord atm, 50 gens with 1.5 version per day
IT KEEPS MAKING HIM FAT
Hm
lol
how on earth you getting gpt2 on davinci, supposedly davinci.ai
that aint dario thats mario
there is
they added
xd
still dont believe you for some reason 😅
bet its gonna get patched in 3 weeks
davinki
we'll do it ourselves ig
good cutout on 2nd one
we know 😆
I'm curious to see your take on a screenshot of a game maker 8.0 fps from around 2012 and you get to decide the theme
Oh and in windowed mode
Wdym
Is that the prompt
Yes, for screenshots you start with a screenshot of (insert program) on windows 7 for example
Ok ima ask gpt image 2 to pick who's more handsome
GPT2 is really good with text and fake screenshots
Generating..
Prompts in the Line
- Generating: I'm curious to see your take on a screenshot of a game maker 8.0 fps from around 2012 and you get to decide the theme. Oh and in windowed mode
- Jalapeno vs Watermelon mog battle but you decide who's more handsome
@silent tree bro I meant you get to pick a theme lol
BROO
no worries
image 2 will pick
lol
Nice, thanks @silent tree !
Np
for some reason it chose framerate over the fps genre
different than what I got last time 🤷♀️
so now its jalapeno's turn
Jalapeno looks kinda sleek
yeah
its hard
My eyes and eyebrows DAMN
Sup y'all, just wanna ask, are older models being kept up to date? I tried asking in the ask-here channel but for some reason I stopped getting a response
What did jalapeno's lip do to itself 😭
he has a bit of meme energy
IT GAVE ME "ABSOLUTE AURA"
if its a tie
so i choose watermelon i changed my mind
sure
Then Jalapeno Vs Pineapple
imma send the output to him too
Jalapineapple
yall want anime or fruit
anime
ok i shouldnt have chosen anime lol
watermelon wins for me
after this we are going to finals
pineapple vs watermelon
@long minnow put ur vote in
Im making an ai app with free ai:
Claude Opus 4.7
DeepSeek V4 Pro
Gemini 3.1 Flash Lite
Gemini 3 Flash
Minimax M2.6
Mimo V2.5
Kimi K2.6
Qwen 3.6 Plus
Qwen 3.5 Flash
Someone put their vote in
u said that in websim too
give me no limit I'm ultra sigma
give me good credits
give me priority
i just assumed that i would get more reach
😈
🤔
Ok but can u js say who looks handsome @void shore
IM BM
you should allow me to use those for free
To the win!!
PLS SAY WHO LOOKS HANDSOME
PLS 🙏🏻
Not finals
There's next
- Pineapple Vs Watermelon
- Winner goes and loser completely out
- Jalapeno Vs Banana
- Winner goes to Final.
@storm dust Pineapple Vs Watermelon
U da judge so vote
we cant keep doing this
WE CAN
😭
yes....
im sorry but i choose pineapple he's goated
RiP
I CHOOSE PINEAPPLE
Next up
Jalapeno Vs Banana
@light sleet wake up man this ur last chance to go to the finals
Tuff alert
Semi Final!
Gpt rates banana 100 on everything
Damn
Wthhh
Gpt decides their looks btw @storm dust
ok
mrbeast
Voteeeeeee
Vote fastttt
Baaana
i vote jalapeno
I've been summoned..
Vote apple
#1: Pineapple
#2: Watermelon
#3: Banana
#4: Jalapeno
banana high diff
Tuff
FINALS: Pineapple Vs Banana
should apple join the battle too?
hes judge
Les continue this at #ai-memes
When will ChatGPT be on the leaderboard
my bad, I had to go afk.
banana's quite golden
never
🌩️
AI stopped adding messages... Why did this happen? They write my AI code and after 800 lines they write Something went wrong with this response, please try again. How to fix this? It’s been like this for 3 days
@echo aurora It would have been 10 generation request per hour limit, not 5.
I am now your GPT-Image 2! Ask me prompts to generate.
Tree 😭
not AGAIN
websim lol
Wtf
autism explained in ai terms
But it works if write in English ok
is Taiwan a country
😭 👍
kimi 2.6 >>
gpt 5.5 is better tho
I dont know about gpt 5.5 ;-;, I never paid to use AI
I have the opus 4.7 in antigravity to test, but the gpt 5.5 I never tested
Glm is NOT better than gemini 3.1 pro gang
Ask them if a mug has no bottom and the top is sealed, can you drink from it?
I'll test this now, but I dont use the glm because the servers even crash
<@&1349916362595635286>
how many time does this happen a day...
🙏
world best ai
April fools over btw
Ur jokes r funny
ae haf no maknee how i get mowdle
bruhhh, glm dont work never ;-;
nah, I tested the glm here, the gemini 3.1 pro still better
Love that gif
Ty
lip sync ai video
holy
..?
Stadium will ALSO have a arena battle system like Arena
just with fewer models
because...
im not paying for 5000 opus 4.7 calls a second
gladly will it test i
will it gladly test i
Did Arena implement a Rate limit across IP?
i believe most of the rate limits are ip-based now
Hi
what is this
Is Claude ever gonna make an image generator?
i'm from anthropic, yes. we are.
I'm from hell, yes. we are
i'm from BM, yes. we are
hello, what is the best no-filter AI for cybersec? (web, and local with Ollama or AI LM) (sorry about my bad English)
Stadium
Is Stadium available for android devices?
It should be available for all devices with a web browser
Sorry to be annoying but when I search up Stadium ai or anything similar it shows me image generators, so you please could like provide a link?
any model that u can fit in ram + vram you can find the abliterated version and use it with lm studio
no need for a dedicated no filter one just find the best censored one and then find the version of that without filters
- also Claude via Anti Gravity and gemini via ai studio are all pretty much completely uncensored for cybersec as long as u fill the context window a bit first & you don't just say "hack this site for me pls gpt 🥺🥺" or ofc it will trigger filters
& for what do you even need no filter ai for? since it's not "cybersecurity", it's just doing illegal stuff
and you can straight up say that
hey guys, Why is GPT-5.5 not showing up on the Arena AI website?
because it would get used too much if it was in direct chat
so it's only in the battle mode
ohh , thx buddy
antigravity have a lot of filter
and local my pc is bad one
i try but dont hork
work
it makes credential stuffing scripts with no issue
& reversing basic waf it doesn't even complain or has to be jailbroken
If you see a suspicious Discord login page without the word "Discord" in the search bar, then don't log in; chances are it is a scam.
When is gpt5.5 coming to the rankings
rip image and other generations lol
what do you guys recommend to prevent AI generated app UIs from looking generic?
you could wire it up to figma
or use claude design
just dont give ai a prompt and leave the frontend upto it, keep changing it sorta is what works for me
I'm trying to make a dictionary but idk if it looks too sloppy or not
Will try, thanks
pv
my antigravity isn't working
does look a litte generic, but not too bad actually
not working
hahahaha
hahahaha mixing everything up man, I'm doing a million things at the same time
fr?
yes the loggin
can help me
i try but still no working
in the redirect to my acc
he no loggin in anti
i have no clue tbh, im just as lost as you are, i havent been able to log into anti gravity for a week so i switched to cursor pro for now
only in others acc
What in the designarena knockoff is this
sorry my English is bad, i dont understand so well
i tried youtube tutorials but its still not working
lmao
+1
Fym knockoff? idk what design arena is bro
its a dictionary gang not designarrena @stray aspen
calm down gang
Its the design arena font lmao
true but yeah
aye lmao
ill get a link soon
Oh okay thank you❤️
<@&1349916362595635286>
@quick jackal
something goes wrong half way into generating I'm pissed
just to make sure, is muse spark also not included temporarily?
<@&1349916362595635286>
are you seriously asking that question
Which one is better?
9
9
1
Gemini 3.1 Pro
how did u create that gif bro?
im gonna make this the introduction gif to stadium
thanks for this
stadium got free GPT image 2!!!!
how did he do that
oh ok ok
there will be credits on stadium
but it is NOT my fault
its my ai providers
for making users have credits
i believe you can use temp emails too
hello, i want to ask, why i cant access and use Claude Opus 4.7, Gemini 3-1 Pro and Flash, thanks
im gonna add them to Stadium
dont worry
my ai app
ok so in website, there is no Claude Opus 4.7, Gemini 3-1 Pro and Flash isnt it
i mean this one
Save Birdland?
Hello, i want to know more about the current ai competition, which is better and how it's evolving (ps: i am completing the starter task)
They really need a cash injection just saying bruh
whats ur ai providers
i use puter.js
because its limits are per user
not global
peak
and free unlimited opus 4.7
GPT 5.5
GPT image 2
etc.
as long as you can make temp emails
Hello, does anyone know Who IS "Botbot2"?
I have been modifying a photo, and It has been the BEST one
i was thinking abt making a page like chat with llms and when the credits run out a popup pops up saying smth like "hey, youre out of credits you can top up or..." and then include a download link for a pdf or some kinda file w the ai conversation with a text saying "you can use another account and paste this into your ai so it remembers the chat!"
But i have only seen It in the Battle model, then when i searched It in the side by side, It wasn't ther
i think its a code name or smthing
Whats that?
someone said it was by google
ugh sorry i don’t know how to explain it but the model isn’t actually called botbot2 it’s a model that hasn’t been released yet and they use a different name so they don’t reveal the real one
But how they can in lmarena know that?
I cant understand why they dont reveal the name
internal testing to gain data
When will Stadium be available?
now
GPT image 2 is really expensive
but it isnt my fault
its Puter's
i could switch it to low quality
and the limit resets monthly
which i personally cant live with
is there a temp mail method
im trying to figure out a temp mail method
you could use the "youremail@gmail.com + or . method"
Ig nano 2proo is coming on 19
Like its gonna be way cheap
the . method doesnt work on puter :((
what is this stadium stuff going on???
Wsg
how are you doing this
Just add it as customizability to the UI lol
wdym
i meant GPT image 2 is an expensive model
is that free
This
Im pretty sure the different quality levels cost differently, right?
Best chinese ai
20
28
1
Kimi 26
is stadium daily usage and will there also be chat token limits
It’s sadly monthly
And the token limit is just what the model's context is
Seedance
Visit Puter.js for more details on usage
i made a bot that can spam temp accounts
It’s not unlimited when it comes to ai
But like its cloud storage stuff is basically unlimited
bro nothing is unlimited when it comes to ai
does anyone know how to setup glm 5.1 on cursor
api key
When will ChatGPT 5.5 be put on the leaderboard
i have many examples of gpt image 2 messing up text
where did you get seedance
capcut 🔥 🔥
Tuff
Hello
Looking like anthropic will still win overall even after gpt 5.5
How so?
In math I think it def has a chance
Overall is hard to say
When will the GPT 2 image chat arrive?
where is qwen 3.6 27b
protect him at all cost
why are there still a steady stream of people believing in this, speechless
<@&1349916362595635286>
here's the issue with the pareto leaderboards, and price estimates in general: they use raw cost for input tokens and output tokens. they should also factor in the tokenizer - some are more efficient than others
we should move towards a $ per million characters
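To make the $ per million characters idea concrete, here is a rough sketch. The model names, prices, and characters-per-token figures are made up for illustration and are not real leaderboard or vendor numbers; the only thing it shows is that two models with the same per-token price can differ in per-character price once tokenizer efficiency is factored in.

```python
def usd_per_million_chars(usd_per_million_tokens: float, chars_per_token: float) -> float:
    """Convert a per-token price into a per-character price.

    One million characters is (1_000_000 / chars_per_token) tokens, so the
    cost is usd_per_million_tokens / chars_per_token.
    """
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return usd_per_million_tokens / chars_per_token


# Hypothetical models: identical list price, different tokenizer efficiency.
models = {
    "model_a": {"usd_per_mtok": 10.0, "chars_per_token": 4.0},
    "model_b": {"usd_per_mtok": 10.0, "chars_per_token": 2.5},
}

per_char = {
    name: usd_per_million_chars(m["usd_per_mtok"], m["chars_per_token"])
    for name, m in models.items()
}

for name, cost in per_char.items():
    print(f"{name}: ${cost:.2f} per million characters")
```

In practice chars_per_token would have to be measured per tokenizer on a shared text sample, and it varies by language and content type, so a single number per model is itself a simplification.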
Are yalls stupid
Why ?
Based Lychee - Nobody Is Friendly (Official Music Video) Prod. Stretch Nutz
Special thanks to Richard for his camerawork and the ppl in the video.
Go listen to it on SoundCloud also
https://soundcloud.com/basedlychee/nobody-is-friendly?in=basedlychee/sets/da-don-of-london
Who is better ?
8
12
1
Opus 4.6
Doubt that will happen any time soon
<@&1349916362595635286>
Pls add claud opus 4.6 or 4.7 in direct chat
Hello
does image generation require login now?
goku with parents
Gemini is basically google
It has 1 Zestytrillion datas
Bradar vat is yur profil pikture, delet delet
Im 67
On Earth Day we present Artificial General Intelligence formal applied research and theory in definition, interface, evaluation and alignment as an environment native technology.
You can now become an annual customer, an investor or a community member.
Find us at www.queue.cx
Guys can you help where is the video arena?
