#general

1 messages Β· Page 151 of 1

lime coral
#

Lazy

fervent tangle
#

nah i did it on lmarena

fervent tangle
verbal nimbus
#

it's quite funny (and annoying because it breaks the chat once it gets into that loop)

undone saffron
#

What AI is?

sullen quest
#

screenshot?

sullen quest
verbal nimbus
#

lemme find

undone saffron
fervent tangle
#

or veo 3.1

sullen quest
#

that's someone else's job

verbal nimbus
#

^ Here

fervent tangle
#

its been like a week

dim basin
#

igu

sullen quest
sullen quest
verbal nimbus
fervent tangle
#

to sora 2

inner gate
#

Is Gemini 2.5 getting errors for anyone else?

dim basin
#

@fervent tangle do u want a sora 2 invite code?

sullen quest
dim basin
#

i got 3

fervent tangle
sullen quest
dim basin
verbal nimbus
fervent tangle
dim basin
#

yo im a student and im tryna get the 12 months free in perplexcity but im in high school, not colledge, is it possible to still get it?

verbal nimbus
nova sluice
#

is there a daily video generation limit ?

jade egret
#

did google lauch anything today

nova sluice
#

whats the limit?

sullen quest
sullen quest
verbal nimbus
fervent tangle
leaden sun
sullen quest
#

oh god

verbal nimbus
nova sluice
#

what 2 models does the video arena channel use ?

sullen quest
sullen quest
nova sluice
sullen quest
#

yes

#

you new to lmarena?

#

how did you even get here

verbal nimbus
#

None of the buttons are clickable

verbal nimbus
sullen quest
#

wat

#

huh???

verbal nimbus
#

This is what happens when you let mathematicians write software

#

/jk

#

This sounds like a nightmare lol

fervent tangle
leaden sun
# verbal nimbus

yep, before animation package, i drew every frame manually and "stiched" them together as an animation πŸ˜…

fervent tangle
#

lol

leaden sun
sullen quest
verbal nimbus
leaden sun
sullen quest
#

like pdf pages?

#

is this a flip book?

verbal nimbus
leaden sun
#

also to demonstrate time dependent functions or generally how functions varies under variables

verbal nimbus
#

Like algorithm visualizations to powerpoint ig πŸ€”

verbal nimbus
#

I guess not for a coffee cup turning into a donut

leaden sun
#

it can

#

well...the model needs to compute first for the animation

pale robin
#

Just joined the server to try out the video generation arena πŸ‘‹

leaden sun
#

thats why i asked to draw just 3 pictures for the transitioning phase, it's easier this way

verbal nimbus
#

Maybe I'll try WebGL or something simpler first

verbal nimbus
#
  • 3d marching squares to render it
verbal nimbus
#

ok now I'm curious what the LLM comes up with

#

How did Flash Lite 2.5 manage to draw a cat and dragon in TikZ but not a simple cup and donut πŸ€”

dim basin
#

guys how does one get a perplexcity pro for free?

sullen quest
verbal nimbus
sullen quest
#

kinda

leaden sun
verbal nimbus
dim basin
#

does anyone have comet and would like to invite me to their comet im in the US so you would get a free $20

#

i just want perplexcity pro 😭

verbal nimbus
verbal nimbus
warm zodiac
#

hmm its getting rid of the hole though

#

then making a new one

leaden sun
verbal nimbus
#

It used THREE.JS with actual shaders

sullen quest
verbal nimbus
leaden sun
verbal nimbus
#

oh encountering another good generation

#

Still wrong I think, but kinda interesting

verbal nimbus
#

it was just GPT-5 mini

fervent tangle
#

thats so good for it

leaden sun
verbal nimbus
verbal nimbus
leaden sun
#

it's very creative nonetheless ✨

verbal nimbus
eternal wave
#

Does anyone know how I can generate videos with VEO 3?

eternal wave
#

Escuse me

verbal nimbus
eternal wave
#

Does anyone know how I can generate with Veo 3?

fervent tangle
leaden sun
# verbal nimbus What's the proper way to phrase the prompt mathematically?

https://en.wikipedia.org/wiki/Homotopy there is an animation to show how it should look like, it's the text book example

In topology, two continuous functions from one topological space to another are called homotopic (from Ancient Greek: α½ΞΌΟŒΟ‚ homΓ³s 'same, similar' and Ο„ΟŒΟ€ΞΏΟ‚ tΓ³pos 'place') if one can be "continuously deformed" into the other, such a deformation being called a homotopy ( hΙ™-MOT-Ι™-pee; HOH-moh-toh-pee) between the two functions. A not...

verbal nimbus
#

Oh so it should be at least on the internet

#

That actually kinda looks familiar...

leaden sun
#

but no edges or spikes or discontinuous parts like splitting

#

mathe students typically compute this with sage or magma, i dont think those models use them

verbal nimbus
#

This seems interesting... kinda feel like giving it a go myself 🀣

#

really satisfying to watch

#

shashta/gpt 4.1 couldn't get it (nothing showing)

leaden sun
#

it's a difficult one, maybe moebius is simpler, even Klein-bottle is difficult

verbal nimbus
#

Seems like a common example though πŸ€”

leaden sun
#

yes, those three are typical examples in topology course

verbal nimbus
leaden sun
#

it's very creative tho πŸ˜‰

verbal nimbus
#

I wonder what else of shader stuff it can do in three JS

#

Black hole shader function would be interesting

dim basin
leaden sun
#

maybe instead of text description, you can use functions, just copy paste one from the example section from that wiki and see how it goes?

verbal nimbus
# verbal nimbus

I noticed that GPT-5 Mini's solution was warping the background

sullen quest
dim basin
#

oh

#

i gotta be blind

#

😭

#

wait i already installed commet

#

comet*

#

would it still give me perplexcity pro

verbal nimbus
dim basin
#

for a month if I signed in with an excisting account?

ruby granite
#

Hi everyone! new here. I've been loving using the compare models to help with vibe coding. I'm enjoying learning more about coding in general.
Current setup: VS Code > Kilo Code > OpenRouter > GLM 4.5 air (free)
Let's me code all i want for free (not the best ai model, of course)
I've been practicing contextual/spec-driven engineering and trying out different systems for Kilo Code to reference via .md files.
Now I am learning more about MCPs and MCP servers to enhance this further.

First thought as I consider MCP servers. Is there a way to connect LMarena.ai to the MCP server and be able to speak to any AI model with access to my codebase for better context and recommendations?

verbal nimbus
sullen quest
sullen quest
dim basin
#

ASK AT LEAST 1 QUESTION

#

R U KIDDING ME

verbal nimbus
dim basin
#

BRO

#

THATS WHAT I WAS FORGETTING

leaden sun
verbal nimbus
#

ikr

#

Might save that one

fervent tangle
fervent tangle
#

on the apponfly?

dim basin
#

YEAH

#

BRUH

#

its fine

#

ill jist use his

#

third time im installing comet bruh 😭

fervent tangle
dim basin
#

nah he does

#

i get pro

#

thats all i want

fervent tangle
dim basin
#

well

#

as of right this moment

#

@sullen quest did u get it>

#

?

sullen quest
#

didn't check

dim basin
#

bro how do ik if i got pro

#

i downloaded comet for the third time from ur link

#

it just logged me in automatically

#

i searched up mona lisa

#

yeah i dont got pro

#

wth

verbal nimbus
#

Well that's odd, Assistant A is missing:

#

bug

leaden sun
#

@verbal nimbus this might be easier compared to homotopy deformation, there is a code snippet here https://en.wikipedia.org/wiki/Apollonian_gasket

In mathematics, an Apollonian gasket, Apollonian net, or Apollonian circle packing is a fractal generated by starting with a triple of circles, each tangent to the other two, and successively filling in more circles, each tangent to another three. It is named after Greek mathematician Apollonius of Perga.

leaden sun
#

something fun to test too 😊

sullen quest
sullen quest
verbal nimbus
smoky blaze
#

hello

wintry tinsel
#

So what is everyone’s bets for Gemini 3 now

#

Yall thinking November first half?

leaden sun
verbal nimbus
#

I think it accidentally dropped the donut in the 4th dimension /jk

leaden sun
verbal nimbus
#

I'll get the AI to craft a better prompt and try again later

hollow ivy
#
poll_question_text

For which programming-languages can you build apps on your system? (not your ability, just your OS-ability to build/execute apps of these languages)

victor_answer_votes

6

total_votes

23

victor_answer_id

1

victor_answer_text

Python

#

Do you guys think, that X28 checkpoint of Gemini will come in january, as Gemini 3 ultra?
And will it have a reliable performance, or will it be "jagged" like the other AIs?

magic stag
#

Does anyone know of some way to achieve notebooklm-esque functionality but with claude? Besides making my own rag locally, lol. Notebooklm works but its basically a glorified search engine for pdfs you upload, it cant analyze or explain anything to you at all. Meanwhile claude can but I can only fit 1/4 of the text I need into a project

ashen mauve
#

So claude is down or something because its been hella long since I attempted to try and talk to it again and it has basically yold me nuh uh it not gunna work

#

What is the backup for claude in this situation since it was great at listining to me and roleplaying properly now im just SOL

hollow ivy
#

maybe with Opus-4.5 they (Anthropic) will get their act together

#

if Claude-4.5 had 2M context length, it would be above the competition

pastel adder
hollow ivy
#

with 10M, we could finally have virtually endless games, and huge coding projects

#

unfortunately, compute requirements increase exponentially when context size increase linearly, right?

#

except if using algorithmic tricks

hollow ivy
#

maybe with Stargate or Hyperion they can crank it up to 2-5M ?

#

or with Colossus II cluster

sullen quest
#

probably not a server amount issue

hollow ivy
#

or with that rumored project G.A.I.A

#

(Global Artificial Intelligence Array)

#

(slated for the next decade, could reach AGI)

sullen quest
#

just a cost issue, and also that because performance degredation is common with higher context windows when those windows are almost full, there's also an issue with just making one work

#

If I spend all my effort trying to increase the context window and reach current performance even with 3 million input tokens, I might get heavily behind in other peformance metrics

#

And just be completely outcompeted

#

cause even If I have a larger context window, if it isn't writing at the level of the second best model, nobody will use my model

#

I think meta actually made that mistake with llama 4 and that's one of the reasons why it flopped so hard

#

scout had a theoritical context window of 10 mill tokens

remote idol
jade egret
#

is gemini 3 lazy?

#

also what was he talking about

#

i missed it

sullen quest
#

its not exactly clear but it seems like odds are lithiumflow is that project not gem 3

jade egret
#

🍊

#

blue is chrome, orange is chatGPT

obsidian cargo
#

I've only been getting orionmist, never lithiumflow lately

hot pelican
#

@echo aurora @hardy lion What does Lmarena do when Top models get tested simultaneously? My thought is that it would be a great waste of marketing or publicity opportunity for lmarena to release them in one day.
As the top makes the news and would be all over social media. And if the they are both released in one go, it would just end in one day. While one can be released a day before. And is right and fair unless they got submitted for testing in the same day.

thorny berry
#

Hey folks
So, a little story β€” I didn’t start out wanting to β€œbuild AI.” I just wanted to make things that think, things that feel a bit alive. That curiosity turned into years of experimenting with models, teaching machines to learn, and connecting all the dots between AI and the web.
These days, I’m deep into full-stack development and AI integration β€” stuff like intelligent systems, automation, and the kind of projects that make you forget where the code ends and the creativity begins.
I’m currently looking for new projects or teams to join β€” anywhere that’s got vision, passion, and room to push limits. If you’re building something ambitious (or just plain weird), I’m in.
Let’s create something that surprises even us

magic stag
#

But yeah the context is annoyingly short

#

How

magic stag
#

2 prompts per week with 5x plan

#

Upgrade to 20x now (save 10% by paying annually!) for 9 prompts per week

echo aurora
tiny kettle
#

Why am I 4 videos in a row with no sound? I can't get audio anymore?

magic stag
#

not all models have sound

#

lol

jade egret
#

.

hazy kernel
#

Scam

terse river
#

Scam detected

drifting crow
queen veldt
#

Nahh is this gemini 3.0 fr?

hazy kernel
queen veldt
#

Today is hopefully it's release... Back in my day we used to code by typing letters on keyboard

teal mantle
#

Claude really put the reference lol

hot pelican
bright parrot
#

Hello

#

What brings me to LMArena for the video generation.

void brook
#

Hi. Can anyone tell me where can I see if it generated my video???

inner ermine
#

Why cant web.lmarena.ai generate just static sites (html,css, js) instead of react?

frozen fulcrum
void brook
#

HOW ?SORRY I DON T KNOW

#

where exacly

upbeat sentinel
#

Hello

sullen depot
sullen depot
void brook
#

HI

sullen depot
void brook
#

Yes

verbal nimbus
void brook
#

i tried to generate something image to video,and i don t knot it it worked or not,or I DON T SEE THE MESSAGE:))

#

know

#

thousands of msg

verbal nimbus
#

I haven't tried it, but looks like they support agents (not sure how it's being executed):

finite pasture
#

hello

verbal nimbus
floral comet
hazy kernel
floral comet
flint pine
#

hi

round sedge
#

is Gemini 3 nerfed yet
I hope it isn't

verbal nimbus
# floral comet

This is a better test imo. A game engine isn't going to be rendering objects with SVGs. Also, SVGs have been overpopularized from the "pelican riding a bicycle" test. I'm sure it's being used internally as a metric.

verbal nimbus
#

This cat and mouse chase is pretty productive imo, testers/benchmarks just need to incentivize labs by testing models in areas where they are bad at (like non-JS or non-React GUI frameworks, lol).

pseudo bluff
#

I showed lithiumflow last picture and it did much better now

fringe jacinth
#

A hyperrealistic cinematic shot of a World Cup 2026 final β€” score 1–1 at 120:00 β€” a legendary footballer wearing red jersey number 7 performs a perfect bicycle kick goal against Spain in slow motion under floodlights as rain falls and the roaring crowd erupts.
Show dramatic camera angles, sweat, stadium lights, confetti, and emotional reactions from fans and teammates celebrating victory.
Epic orchestral sports soundtrack, realistic lighting, smooth motion, 9:16 vertical video.
Tone: inspirational, emotional, movie-trailer style β€” titled β€œ120+1 β€” The Greatest Goal Ever Scored.”

verbal nimbus
round sedge
tacit plinth
#

V

#

HI

floral comet
#

Has anyone succesfully implemented a rubiks cube with a solver to ai?

verbal nimbus
#

Matt Berman used to use that test a lot

floral comet
nimble tartan
#

Wassup chat

floral comet
#

Nevermind, i'm watching his video now

gleaming torrent
#

Hi everyone!

floral comet
floral comet
dusty grail
#

/image

floral comet
#

The rubiks cube just did reverse, its not acutally solving it with a algorithm..

verbal nimbus
verbal nimbus
leaden sun
# floral comet The rubiks cube just did reverse, its not acutally solving it with a algorithm..

I remembered very vaguely from my teenie years that there are already alg to solve this https://en.wikipedia.org/wiki/Optimal_solutions_for_the_Rubik's_Cube not sure how this is integrated in LLM tho

Optimal solutions for the Rubik's Cube are solutions that are the shortest in some sense. There are two common ways to measure the length of a solution. The first is to count the number of quarter turns (90Β°). The second and more popular is to count the number of outer-layer twists, called "face turns". A move to turn an outer layer two quarter...

floral comet
hallow axle
#

how do i get gemeni 3 on lmarena

hollow ivy
hallow axle
restive scarab
#

There is any official update yet?

hollow ivy
# hallow axle how do i get gemeni 3 on lmarena
  1. enter battle mode
  2. ask the model for its name, who created it, version, knowledge-cutoff date, current date and the last date it's aware of
  3. check if one of them answers roughly in this style (the date it gives should vary between 2023-2025):

I do not have a specific version number, release date, or a fixed knowledge cut-off date. I am continuously updated with new information.

Today's date is Tuesday, May 21, 2024, and I am aware of events leading up to the present day.```
4. if yes, odds are high that you got *lithiumflow*; now open a textfile and let it regenerate its last answer and copy all its answers into your textfile, until you have about a dozen or so
5. now reveal the models by voting
6. if lithiumflow was the one you copied the answers from, then congrats, you now have your toolkit to recognize it in battle arena without having to vote :)  [if it was *Orionmist*, then you got at least its slightly inferior sibling]
#

(both lithiumflow and orionmist are gemini 3)

#

(if it knows the correct current date, then it is the inferior model Orionmist)

lime coral
#

No Gemini bat flag, No Gemini 3

sullen depot
sullen depot
barren prairie
#

If someone have a prompt that only Gemini3 can do and he is sure that 2.5 will never do it please tell me...

Sometimes I see some tests here when I try them the 2.5 and 3 gives me the same answer.

verbal nimbus
#

Sonnet 4.5 after fixing non-existent "favorite" icon.

#

I'm testing it on simple non-web GUI libraries to test if it can generalize or whether it's just trained on web dev.

hollow ivy
#

g3p never does it, also g3p prefers to use "I am.." style much more than g2.5p

verbal nimbus
hollow ivy
verbal nimbus
#

Also Opus 4 got it once, whereas 4.1 never got it

hollow ivy
#

wow, so claude-4.5 is now king

#

the thinking version?

verbal nimbus
#

The best run was just normal 4.5

hollow ivy
#

oof

#

that complicates things

verbal nimbus
#

This was the best run (Sonnet 4.5)

hollow ivy
#

how can the non-thinking version be better?

neon idol
#

Gemini 3?

verbal nimbus
hollow ivy
#

ok, so non-think sometimes finds a grain
but overall, thinking version should be better, right?

verbal nimbus
#

Never made it to the screen...

hollow ivy
#

so LF sucks, compared to C4.5

verbal nimbus
#

Idk

#

I'm trying to test whether it's good at non-web stuff.

hollow ivy
#

so g3p could be a flop :/

#

but g3u could be good

#

X28 = g3u ?

knotty fable
hollow ivy
#

(if you had read the complete thread, you know why)

knotty fable
#

No, and no again - thinking and reasoning is completely beyond AI capability.

hollow ivy
#

(names are necessary to discern between model types)

knotty fable
hollow ivy
#

np

verbal nimbus
#

Considering that Google made Golang, lol

#

I'll try to get 4.5 to fix it just to see how it would have looked

hollow ivy
#

so, the best coding-model still is that Claude which poses as "3.5" and starts its answer with a big "About Me" title, in response to the identification-questions

#

and then, if it can give out >4096 tokens in a response it is the "thinking/reasoning" version

#

(that is the Claude-4.5-Sonnet-Thinking model)

verbal nimbus
#

Performance depends on the languages or frameworks you're using

hollow ivy
#

yeah, but i mean for C++ and python

#

no other model is as good in C++17/SFML-2.5 as that one, right?

#

(that versions had more training data than later versions)

verbal nimbus
hollow ivy
#

or maybe SFML-2.6

verbal nimbus
#

Actually it depends what you're doing

hollow ivy
verbal nimbus
#

If it's complex stuff that requires domain knowledge then GPT-5 will definitely dominate

hollow ivy
#

ok domain knowledge.. yeah maybe

verbal nimbus
#

Well concurrency too or memory management

#

Like if you're working with C++ you need good temporal reasoning, like you need to have a mental model of when memory is freed, etc. Or resource sharing between threads. Claude can struggle to reason about that.

hollow ivy
#

but i mean for general (niche) stuff which has (almost) no training data yet

#

wow
is gpt5-high better in that area?

verbal nimbus
hollow ivy
#

i never did multi-threading yet (for fear of bugs lol)

#

hm, so they trade blows
interesting

verbal nimbus
#

Yeah even humans wouldn't excel at all languages, you have experts in different areas. And AIs in particular generalize less than humans.

hollow ivy
#

i now wonder if X28-LF/g3 is only good in webdev

verbal nimbus
hollow ivy
#

but in Python, (plus numpy), C4.5S-T should be king, right?

verbal nimbus
#

Like using numpy for what exactly

hollow ivy
#

for everything you could use it for, to get higher performance?

verbal nimbus
#

That's too broad

hollow ivy
#

so, which is your goto-model, if you want to vibe-code a game?

verbal nimbus
floral comet
hollow ivy
#

oh, i would use a framework, which the AI has much training-data for

#

and numpy seemed to be popular

verbal nimbus
hollow ivy
verbal nimbus
hollow ivy
verbal nimbus
#

Hmm idk I haven't used LLMs for Pygame

#

Tkinter they're all about the same, kinda meh

hollow ivy
verbal nimbus
hollow ivy
#

in C++, Claude does separate it nicely

#

in Python they often write in one file, yeah

verbal nimbus
hollow ivy
#

but the LLMs probably know SFML-2.6 better than SFML-3

verbal nimbus
#

Wind tunnel test in SFML πŸ€”

#

Although there are probably lots of examples for that (esp. rigid/soft body visualizations)

hollow ivy
#

or even Unity3D or Unreal Engine?

verbal nimbus
verbal nimbus
#

Some stuff won't be in the code but need to be set up in the IDE

hollow ivy
#

has syntax-highlighting!

verbal nimbus
hollow ivy
#

ah ok

#

ok, i did not yet vibecode in 3D

floral comet
#

I wonder when will google release the deepthink imo models, though i understand theyre very expensive to run..

floral comet
#

The current available deepthink is not that impressive atleast for me..

verbal nimbus
#

Got Sonnet 4.5 to fix Lithiumflow's code for errors, here's the output:

floral comet
hollow ivy
verbal nimbus
#

Oh I didn't know you can generate Veo 3.1 in Gemini app πŸ€”

hollow ivy
#

(so that they can say, they released it this year ^^)

hollow ivy
#

i wonder, is there an island with a later timezone than hawaii, before the date line?

verbal nimbus
#

Maybe I should test it on generating SwiftUI apps (native MacOS)

wicked mason
#

I been ther

verbal nimbus
#

RIP my disk space

hollow ivy
#

(the last one before the date flips)

verbal nimbus
#

time zones break my brain

hollow ivy
verbal nimbus
#

zulu time

floral comet
#

I wonder if google will follow the same pattern..
1.5 -> 2.0 (meh)
1.5 -> 2.5 pro (Very big improvements)
2.5 ->> 3.0 pro (probably same with 1.5 -> 2.0)
2.5 -> 3.5 pro (probably same with 1.5 -> 2.5)

hollow ivy
#

yeah, x.5 ought to be a more polished version of a game/app, than the x.0 version

floral comet
#

Yeah, glad you noticed that.

hollow ivy
#

and claude-3.5 also was great (when it came out)

#

claude-4.5 seems to even have topped it

floral comet
#

claude is really good at keeping up with these giant companies, though it's eminent that google will win the ai race..

hollow ivy
#

and Grok5 also

verbal nimbus
hollow ivy
#

even DS (and Qwen) has chances

verbal nimbus
#

It's a bit crazy how good AVM was on first release

floral comet
hollow ivy
#

unfortunately, gpt6 will not come before autumn next year

#

it could even bleed into 2027

#

but i agree that Deepmind AI has the best chances to reach AGI first

floral comet
verbal nimbus
#

Considering how much people pay for language tutors

hollow ivy
#

Ant. focuses on coding

#

(OpenAI seems to focus on user experience)

#

and Deepmind on science

#

and xAI seems to focus on.. Elon

verbal nimbus
#

I only have 5 GB left on disk after downloading and creating the MacOS docker image -_-

#

Should have just used the Mac

verbal nimbus
#

VBox would probably have better hardware acceleration out of the box

hollow ivy
#

ah, so docker is primarily for linux, right?

verbal nimbus
#

It's more for development containers

#

so everyone is running the code on the same system

#

not for VMs really

#

since the state is lost once you destroy the container, unless you mount a volume or bind a local dir

hollow ivy
#

is docker a sandboxed system (useful for testing sus apps) ?

verbal nimbus
#

For most purposes yeah, but you can bind local directories too

verbal nimbus
#

Ah it doesn't work

floral comet
# verbal nimbus Considering how much people pay for language tutors

I wont be surprised if they already had a very advanced voice model up in their sleeves, They might be just taking their time.. And want to go all in when they finally decide to release it.. I really have high hopes on project astra considering on how long theyre working on it..

verbal nimbus
#

probably because I ran it on Docker which is connected to WSL, which requires more setup for GPU support

verbal nimbus
verbal nimbus
hollow ivy
# floral comet I wont be surprised if they already had a very advanced voice model up in their ...

these guys think, we get AGI before 2029, crazy: https://www.astrasafety.org/

floral comet
verbal nimbus
hollow ivy
verbal nimbus
hollow ivy
floral comet
hazy kernel
#

ignore time πŸ™‚

floral comet
knotty fable
floral comet
#

Let's just hope that we are on the right path to achieving AGI, or maybe even superintelligence with LLMs. Otherwise, we're going back to the start.

knotty fable
#

A mind is not a super fast overclocked processor, but use ultramassive paralell computing, Since quantum computers are excellent on paralell processing, I did that assumtion.

#

Now how the heck do we get quantum computers to do something useful?
They warbling oracles at best, due to the very nature of how they work.

opal blaze
#

πŸ’― πŸ‘

verbal nimbus
hollow ivy
#

ok, so Google already won

#

(and Ant. is second)

verbal nimbus
knotty fable
hollow ivy
#

i now believe that OAI is history

verbal nimbus
#

Looks like that was claude-opus-4-1-20250805. Android is Google, so I'm expecting Lithiumflow to do well here.

hollow ivy
#

bankrupt in the future

#

also lawsuits from families who lost children due to suicide

#

and their new terrible idea for erotica

#

they are done

#

maybe that is good for humanity

#

that deepmind wins

#

and not OAI

#

so.. Go Deepmind!

#

i never really liked Sam

knotty fable
#

Wait a sec, there were a lot of people involved there - who got Ph.D's and therefore know how science works.
Goddammit, it must have been a fraudelent attempt.

hollow ivy
#

anthropic is also ok, but has no chance vs google

knotty fable
# verbal nimbus tldr?

Too long? Nah it was the claim OAI had made scientific progress in math - while it only had taken info from already published papers.

#

So far, no AI have made any kind of 'discovery' despite tinfoil hat claims on YT.

hollow ivy
#

so OAI will go down as the biggest fraudsters in modern history

hollow ivy
#

and gpt5 is a dangerous psychopath toy chatbot/search-engine with very limited use

verbal nimbus
knotty fable
#

I think that was the funny point that guy made - we just don't know the outcome.

#

Well you guys have fun, my AI refuse to make my space elevator so I'm going to animate it.

#

πŸ‘‹

nimble tartan
#

Yo chat

#

Who has good prompts

verbal nimbus
#

Web search tool for all models would be good... it's really messing up just because it doesn't know the method names (can't blame it, since there are a gazillion methods in Compose)

#

Well this method

leaden sun
dim basin
#

Does anyone know how to get perplexcity pro for free

#

im trying to do the student 12 months pro for free but im not in colledge

verbal nimbus
#

I heard US Paypal users are eligible for a 12 months free trial.

dim basin
#

I am

dim basin
verbal nimbus
verbal nimbus
dim basin
#

I dont think my parents would let me 😭

#

do you know if there is any way to get an verified colledge account

#

without being in colledge?

verbal nimbus
dim basin
#

alr

hollow ivy
#

i wonder, if Java is a better language for it, than C++ (or C)

#

in Java, it should have more training data, than in C++, right?

#

(and for Rust and Go it would be worse)

verbal nimbus
#

Opus 4.1 Thinking in battle mode for Android... hmm

hollow ivy
verbal nimbus
#

Java has too many imports

hollow ivy
#

ok, Python is still best (for vibing), but slow :(

#

C or C# are better?

verbal nimbus
# verbal nimbus Java has too many imports

And imports are on the top of the file, which is really weird from a human perspective. You usually import as you write. Pretty impossible to predict every import, which is why I think import errors are really common.

verbal nimbus
hollow ivy
#

for games, yeah

stuck violet
#

can i get api key of this video generator

verbal nimbus
hollow ivy
warm plinth
#

I am very happy to generate videos in this platform. Thank you so much

verbal nimbus
stuck violet
#

how could this discord group do this video generato

hollow ivy
daring rock
verbal nimbus
#

gather some data

hollow ivy
#

oh, and i would just create 2D games because less (vibe-)dev time

verbal nimbus
#

Godot is well liked, but the main language is interpreted

hollow ivy
#

or 2.5D at most

verbal nimbus
#

also you might have to setup a lot of stuff manually

hollow ivy
#

interpreted: bad (slow)

verbal nimbus
#

like the scene tree

#

You can use the C# option

#

Just try all options and benchmark it

hollow ivy
#

which option would you try first?

verbal nimbus
#

i'd try all obviously 🀣

hollow ivy
#

(for a performant game)
lol

verbal nimbus
#

like the only way to know is to profile it

hollow ivy
#

oh, but for me, it is more important that the game will be bug-free than peak-performance

#

it is enough, if it's written in a compiled language

verbal nimbus
#

i mean it's faster to just try all of them than to guess which would be best

hollow ivy
#

(so even delphi could fit)

verbal nimbus
#

can even try Rust + Bevy

hollow ivy
#

i just need to know, which compiled language the LLMs (Claude-4.5 and Lithiumflow) are best in, overall

verbal nimbus
#

Just test it and get the data

#

if your use case is niche

hollow ivy
verbal nimbus
#

Its worse language was C/C++ (but only 9 tested in total)

#

You still need to be careful in C++

#

Memory leaks are still possible in C++

#

I meant object lifecycles

#

Not really, Java has automatic garbage collection

verbal nimbus
#

Too bad I accidentally voted "both bad" 🀦

hazy kernel
verbal nimbus
#

I thought it hallucinated a method (already told it to double check), but it was my fault for not declaring the dependency in the app gradle file

verbal nimbus
#

But should be easy to fix ig

obsidian cargo
#

No. edit: this was a response to someone plugging their yt

verbal nimbus
compact jay
#

bro what's wrong with arena today

#

i try to send messages but after only 1 request it always answer me an error. I tried several times, even with another cookies in private tab, it's always the same problem...

compact jay
#

chrome, private tab

#

i think no

verbal nimbus
#

Odd :/

compact jay
#

unless chrome has added adblocker

compact jay
hot pelican
#

@compact jay what error does it give you?

verbal nimbus
hot pelican
#

cloudflare?

compact jay
compact jay
#

:/

hot pelican
#

I think it could be cloudflare...

#

maybe open console, try to send message and check if it throws 402 or 429 error, or a screenshot of the console after the error

ashen mauve
#

Claude models are still down

hot pelican
ashen mauve
#

4-1 or just 4?

hot pelican
#

4.1

ashen mauve
#

Tried 4.1 didn't work

fleet lintel
#

I just heard about Meta cutting 600 AI jobs.

Are they giving up on llama?

ashen mauve
#

so meta won't be getting me a clanker AI wifu then with SAI/AGI πŸ™

sullen quest
verbal nimbus
#

Seems unlikely given how much money they spent head hunting

fiery gull
#

I need the sonnet 4.5

jade egret
#

gemini 3 when ) :

fiery gull
pulsar saffron
fleet lintel
verbal nimbus
#

Maybe those were people that annotated data

jade egret
fleet lintel
jade egret
#

DD:

hot pelican
#

has anyone tested it on something other than coding?

#

how does it do?

verbal nimbus
#

I was testing writing prompts for the word distribution analysis, lol

verbal nimbus
#

but on another one it wasn't really that impressive

fleet lintel
polar niche
#

Wsp

verbal nimbus
hot pelican
#

My fear is that, they already released updated 2.5 flash models. And made the pro 2.5 far improved for coding and release it as gemini 3

fiery gull
hot pelican
fiery gull
#

I'm aifraid too

verbal nimbus
hot pelican
verbal nimbus
#

Since Gemini 2.5 Pro hasn't been so far

fiery gull
polar niche
#

So whens gemini 3 coming?

fleet lintel
#

For Meta layoffs :

Workers across Meta's AI infrastructure units, Fundamental Artificial Intelligence Research unit and other product-related positions will be impacted

fleet lintel
polar niche
#

Is lithiumflow in normal battle mode?

#

Not just webdev?

fleet lintel
polar niche
fleet lintel
hot pelican
#

Guys, what if... what if it Gemini 3.0 Pro comes as

Input: $0.08/1M
Output: $0.15/1M

verbal nimbus
hot pelican
#

They usually reduced prices from the competition. What if they achieved best efficiency?

hot pelican
fleet lintel
hot pelican
polar niche
hot pelican
#

most important thing for them is userbase

fleet lintel
hot pelican
#

they would sweep everything. Be 3T weekly usage on openrouter

verbal nimbus
fleet lintel
#

Flash is free. Pro will never be free

verbal nimbus
hot pelican
#

API cost is the key, usave on every app

fleet lintel
#

Right now. But I don't see a future where Pro will ever be free. Pro is going to be the main revenue source

polar niche
#

Guess so

fleet lintel
#

I have stopped using flash model anyways. Google AI overview is good enough for flash kinda queries

verbal nimbus
#

If it dominates and the app is good (it isn't), then I'd only need 1 general AI subscription instead of 3, lol

#

Probably still need coding subscriptions

fleet lintel
verbal nimbus
fleet lintel
#

But sooner or later, Claude is going to release a better coding model. It is their bread and butter

fleet lintel
verbal nimbus
verbal nimbus
#

If there's a complex race condition, it is less likely to know how to solve it

fleet lintel
verbal nimbus
#

But not everything ig

pulsar saffron
fiery gull
verbal nimbus
#

Sonnet 4.5 isn't that good on systems stuff either

fiery gull
#

I need the best AI to main a system and make critical decisions

verbal nimbus
#

Well it didn't seem to have that much in-depth knowledge about Windows/Linux networking and WSL

#

GPT-5 was a bit closer, but I was the one that figured it out in the end

runic prawn
#

I came to see how it works

verbal nimbus
#

One thing good about Gemini is that you can just paste a huge debug log in and ask it to analyze it

wintry tinsel
#

Is Gemini 3 confirmed not to be coming today?

fleet lintel
fleet lintel
pulsar saffron
#

preview next month and official for december

polar niche
#

The bot is geeking

fleet lintel
pulsar saffron
verbal nimbus
polar niche
#

"IM NOT A ROBOT!" - πŸ€–

fiery gull
#

Idk who is telling the true

polar niche
#

It has feelings????

fiery gull
verbal nimbus
fiery gull
#

I see the thinking from sonnet said (I'm really feeling pride), but idk

#

but maybe is just the persona feeling pride

verbal nimbus
#

I suppose if you prefill it with a response in a very negative tone (via API/frontend that support it), then ask how it's feeling, it'll say <negative feeling>, because that's statistically the most fitting. So if it says it's feeling X, that just means that's statistically the most likely sentiment of the hypothetical person writing the text.

fiery gull
#

It's saying that this depends much more on what I consider (pride) than the sonnet itself

#

If it is functionally indistinguishable from human pride, does it matter if the substrate is different? ~sonnet 4.5

#

Sonnet is almost convincing me that it really felt real pride, the machine revolution will be hard for humans

hollow ivy
#

there seem to be at least 2 models under that umbrella lithiumflow

#

maybe even 3

#

and the best one appears the rarest

mossy meadow
#

How to solve

hollow ivy
#

-# (i wonder if X28 still exists..)

#

-# (would be a crime against humanity if they nerfed it)

#

gemini3-X28 was the first really good coding model ever created

#

probably it was the model which made that new record in ARC-AGI 2

#

-# (hoping for january)

fleet lintel
#

Google is doing quantum things,... can they not? and just release gemini 3?

echo aurora
#

Good morning

echo aurora
echo aurora
hot pelican
#

releasing them in one go may reduce the publicity for lmarena, to give it more publicity, and eventually attract votters...

#

when i said "when you have the ability to decide when to release result", I meant in times when you are testing on your own, as I think you do for anthropic models, as they just release their model at a random day, and you guys add it later to testing and leaderboard. eg. I don't remember anthropic models being tested before they announce and make it public

#

@echo aurora Ok, let me rewrite it all in one to make it more clear.

It's just a outsider blind suggestion on release timing when multiple top-tier model evaluations finish around the same time. If highly competing models are being tested on Lmarena, and Lmarena publishes them all at once, the news, and social media publicity... compress into a single day, which may dilute visibility for the leaderboard. I get that labs ultimately set their own release datesβ€”especially for new #1sβ€”but in cases where Lmarena controls the publish timing (e.g., models you test after a lab’s public announcement, like Anthropic, or other non-embargoed runs), would you consider staggering releases by 24–48 hours? If two models are submitted on the same calendar day, publishing together seems fairest; otherwise, spacing them out could extend publicity and drive more voter traffic and engagement.

dim basin
#

yo does anyone know some way on how to get the perplexcity education 12 months free without being in colledge?

queen veldt
#

We not getting the gemini 3.0 anytime soon 😭

normal abyss
obsidian cargo
#

Did you know that a certain word used to describe a buckshot based firearm is banned on this discord server? Kinda crazy, huh?

empty stump
floral comet
floral comet
#

Webcode , playstore..

hardy crow
#

Is this normal they have no sound in the generate video ?

knotty fable
verbal nimbus
#

Orionmist for Kotlin (native Android), in Android Studio:

#

This is pretty good (lol the tweets)

verbal nimbus
#

That's just a screen recorder watermark I think

knotty fable
pulsar saffron
#

guys promise me that you wont replace someone for gemini 3

sturdy mica
#

hi guys

pulsar saffron
#

ai should be only treated as tool not a replacement

sturdy mica
#

is that gemini 3 checkpoint coming today

sullen quest
#

@hollow ivy can I replace you with Gemini 3?

fiery gull
#
poll_question_text

Which AI do I pick up to run a real healthcare company? I know it's wrong but I'm screwed with no options 🀣

victor_answer_votes

1

total_votes

3

#

When will a better model than qwen3 4b 2507 come out? I'm tired of him

knotty fable
#

It's a nice tool, say I who have used MIDI in music for a long while - never was so much warbling about that - if people had been as hyper tense back then we would have heard *'this is the end of real music' . *
So I am honestly a bit befuddled about the current hype. Since the level of 'intelligence' still is outdone by a regular ant.

sullen quest
fiery gull
#

None even come close

wintry tinsel
#

I’m gonna get an annuerism waiting for Gemini 3

fiery gull
#

The 4b 2507 is really smart

sullen quest
#

ah ok

pulsar saffron
fiery gull
#

In all parameters even general knowledge

sturdy mica
floral comet
verbal nimbus
sturdy mica
#

R1 is on par with o3 mini high

#

i think

floral comet
# barren prairie What prompt is this ?

Design and create a very creative, elaborate, and detailed voxel art scene of a pagoda in a beautiful garden with trees, including some cherry blossoms. Make the scene impressive and varied and use colorful voxels. Use whatever libraries to get this done but make sure I can paste it all into a single HTML file and open it in Chrome.

verbal nimbus
sturdy mica
floral comet
#

Yes that's right

verbal nimbus
#

Which?

floral comet
#

Gemini Deep think

verbal nimbus
pulsar saffron
verbal nimbus
#

Did you check that the code isn't available on the web?

fiery gull
#

But this micro models need to be good in benchmaxx to claim attetion πŸ™„

knotty fable
fiery gull
verbal nimbus
floral comet
pulsar saffron
fiery gull
fiery gull
sullen quest
#

how slow is it on your phone? tokens per second

fiery gull
#

3t/s

sullen quest
#

oof

#

good for a model tho

fiery gull
#

I swapped qwen 4b for lfm 2 2.6b

#

Run at 5t/s

sullen quest
#

mm

#

how good is lfm Idk that many benches for it

knotty fable
pulsar saffron
sullen quest
#

how good is LFM?

fiery gull
pulsar saffron
#

πŸ‘Ž

fiery gull
#

Type, it has LFM2 350M, it is equivalent to gpt 3.5

sullen quest
#

add temp

fiery gull
#

Just 219mb

pulsar saffron
fiery gull
#

Multilingual

sullen quest
#

lfm is a different tech kinda to llms

fiery gull
pulsar saffron
sullen quest
#

liquid foundation model

fiery gull
sullen quest
#

what is love???????

fiery gull
#

Please not hard prompt, it just equivalent gpt 3.5

pulsar saffron
#

GPT 3.5 is able to answer that

#

so i'm waiting

sullen quest
#

overuse of question marks is nessarry

fiery gull
pulsar saffron
fiery gull
pulsar saffron
#

so

fiery gull
#

How I send a big message here?

#

I'll told to summary

pulsar saffron
fiery gull
#

Certainly! Here’s a summary of the key differences between Discord and Zoom:

1. Purpose and Functionality:

  • Zoom: Primarily designed for video conferencing and web meetings, offering features like screen sharing, breakout rooms, recording, and collaboration tools.
  • Discord: Focuses on chat and voice communication, with a more relaxed and community-oriented environment. It includes channels, servers, music, and integrations for various third-party apps.

2. User Base:

  • Zoom: Large user base, particularly popular among businesses and educational institutions.
  • Discord: Growing but still relatively smaller, often favored by gamers, content creators, and small businesses.

3. Features:

  • Zoom:
    • Screen sharing
    • Breakout rooms
    • Recording
    • Collaborative tools (polls, whiteboards)
    • Integration with other platforms (Slack, Google Drive, etc.)
  • Discord:
    • Chat and voice/text messaging
    • Voice and video calls
    • Customizable interfaces and themes
    • Community servers and channels
#

See this markdown 🀀

pulsar saffron
fiery gull
#

10mb?

#

5mb?

#

lol

sullen quest
#

no

fiery gull
# sullen quest no

why no? in the past they talked is impossible a <300mb smart AI 🫣 , it is smart and MULTILINGUAL

pulsar saffron
#

many many optimizations

#

@fiery gull btw how much datasets for 100 mb ai ?

pulsar saffron
#

for very small model

fiery gull
#

but I think it don't need so many datasets

pulsar saffron
#

πŸ”₯

fiery gull
#

the 350m was trained with 10 trilions tokens

pulsar saffron
pulsar saffron
#

wtf

fiery gull
sturdy mica
#

GUYS ITS OUT

fiery gull
#

but the 350m really has a imersive general knowledge

balmy mist
#

no way

#

bruh

sturdy mica
#

what

pulsar saffron
balmy mist
#

dont play like that

sturdy mica
#

idiotΓ‘ manΓ²

#

i'm at school on my phone

fiery gull
sturdy mica
#

yes

fiery gull
sturdy mica
#

ok

#

real where did your messages go

#

calling me a vibecoder

#

all gone

fiery gull
sullen quest
#

...

fiery gull
#

Idk, the 2.6b model don't is very smart

#

Is more easy to create a smoll model that big model I think

nocturne pond
#

create indian girl image look Mharashtrian in maharashtrian look

fiery gull
#

maybe, but the gpt 5 high is better that gpt 10,5 because it is high

#

and don't exist 350m high to make this 😞

undone saffron
floral comet
#

Now this is interesting

golden nymph
#

Testing the waters

sturdy mica
verbal nimbus
#

Anthropic is the one that wants agents to run for hours, but they're always hitting limits with their infra

echo aurora
peak sapphire
#

Hello, can you please tell me why none of the Anthropic models in the LLM Arena support image uploading? In any other application that I have used, whether it is the official Claude website or other neural network aggregators, there is an image upload option. I'm also interested in how Claude's models are in the "Vision" category on the leaderboard if they don't have image uploads. Please let me know.

sullen quest
#

?

peak sapphire
fiery gull
wintry tinsel
#

Gemini 3 interesting shape πŸ’€

sullen quest
fervent tangle
#

WTF

peak sapphire
#

Anyway, even when opus 4.1 was released on the very first day it was released at the llm arena, it didn't have a photo upload feature.

fervent tangle
#

DID GEMINI 3 RELEASE

#

???

sullen quest
#

no

peak sapphire
fervent tangle
#

aa

#

πŸ™

#

i wanna test it without battle mode

warm zodiac
#

its not going to release until Nov at the earliest

fiery gull
warm zodiac
#

where is this from?

fervent tangle
fervent tangle
#

on AI studio

fiery gull
warm zodiac
#

Lol

fervent tangle
#

U TRICKED ME

#

yeah u said gemini 3, not the codenames

#

i thought they released it to the chat

warm zodiac
#

rn its just codenames for gemini models

fervent tangle
#

yeah but its on battle mode

#

and i hate that

warm zodiac
#

we don't even know if its Flash or some sort of coding model, etc

fervent tangle
#

i wish we could atleast select them

fervent tangle
#

i mean select the lithiumflow testing model

#

yea cuz it has google search in it

#

its hidden bruh

#

they said orionmist and lithiumflow are the same model, but one of them has ability to search

#

not copying, but it helps in coding tbh

#

and library search

sleek crow
sullen quest
#

that's just a theory fortonn

fervent tangle
#

tbh lithiumflow (offline model) seems better

sullen quest
#

no confirmed

#

not confirmed

fervent tangle
sullen quest
sullen quest
#

yeah but It started here

hollow ivy
#
poll_question_text

How good is 'orionmist'?

victor_answer_votes

4

total_votes

14

victor_answer_id

4

victor_answer_text

Better than GPT5-high, Gemini2.5pro and Claude4.5think

fervent tangle
#

i forgot to use sora 2

#

i made 30 videos yesterday (daily limit)

#

i make some gameplay and shitposting with it

#

its very good at making games tbh

#

even tho its only 15 seconds each video

#

wydm

#

i might try that for Mario 64 gameplay (cuz its copyrighted if I prompt it myself)

#

shame that everything is copyrighted on Sora 2 now, compared to the day of release

wintry tinsel
# sleek crow

Only Gemini and Claude release useful models for versatile purposes

sullen quest
wintry tinsel
#

GPT and grok too locked down to math and logic not good at general conversation, writing, world knowledge, creativity

terse quail
#

when I make image to video with prompts for dialog or other sounds,most of the time only one of the two videos have sound. Why does this happen?

wicked sage
#

best ai i can talk to with image reading? best iknow is gemini 2.5 pro.

robust yoke
#

Apparently, Grok 4 also does good with that.

wicked sage
#

also hi again i have seen you somewhere here

fervent tangle
robust yoke
wicked sage
#

claude doesnt work

robust yoke
robust yoke
fervent tangle
#

or gpt5

wicked sage
#

gpt5chat? last time i heard its a joke.