#general

loud cosmos
#

Hallo

echo dome
#

the bug that you need to restart the page

merry badge
#

Hi

atomic lagoon
#

Just restart

#

Reload*

echo aurora
merry badge
#

I'm not able to generate videos?

echo dome
#

idk what caused it

echo aurora
frank ember
#

it says i do not have permission to message in video arena, what to do

frank ember
#

ok

echo aurora
#

@loud cosmos Our video bot isn't working at the moment. Please stop trying to prompt in text channels.

#

@worldly glacier Our bot isn't working right now. You'll need to wait until I turn the channels back on.

rocky fractal
echo aurora
atomic lagoon
#

Thinking seems to be good anyway for the time being

#

Their thinking model is superrrr fast

#

I thought Gemini 3 was quick but this blows it out of the water lmao

rocky fractal
atomic lagoon
rocky fractal
#

I hope they'd add it to the free tier on GitHub copilot

atomic lagoon
#

Yea lol

#

I just want it to be added to coding on lmarena

#

It'd make it easier lol

rocky fractal
zealous sparrow
atomic lagoon
#

I'm not tryna be impatient or anything lmao its just exciting

rocky fractal
#

I feel like webdev arena limits their creativity and design potential when on react compared to normal arena hmm

slim gorge
#

wait new deepseek model? how's it keeping up?

rocky fractal
atomic lagoon
zealous sparrow
#

thinking is pretty solid, ofc doesnt outperform the better models

obsidian cargo
#

I figured it was taken off because of the time outs leading to failures

proud bobcat
#

never

#

trust

obsidian cargo
#

hallucinations wouldn't be a reason to take a model down from lmarena

rocky fractal
#

Would be perfect for coding ngl

#

Can we use openrouters key for deepseek 3.2 thinking on github's copilot?

#

On vscode

rocky fractal
atomic lagoon
obsidian cargo
zealous sparrow
#

thinking will only think too much if you give it too complex stuff

obsidian cargo
atomic lagoon
obsidian cargo
#

if you looked at the thoughts you could see it needed all that thinking time

rocky fractal
atomic lagoon
obsidian cargo
inland quest
#

SOTA

rocky fractal
#

Long answer = error

zealous sparrow
full tangle
#

which is better, thinking or speciale?

zealous sparrow
#

speciale testing will wait till the model stops hallucinating

#

and has shorter thinking outputs

obsidian cargo
rocky fractal
zealous sparrow
#

we all benchmark AI for html but not python

echo sinew
#

@brave hull Our Video Arena bot isn't working right now. Our team is looking into a fix asap. The video channels have been turned off. Once fixed, the channels will be turned back on. Please do not post your prompt here or any other text channel. Thanks for understanding and cooperation.

wet wharf
zealous sparrow
keen beacon
#

Nano real life editing

orchid sinew
#

deepseek or gemini in terms of 1 gorilla vs 100 humans

keen beacon
#

It’s supposed to be big foot

orchid sinew
#

deepseek?

lapis flame
#

I wanted to try it

#

ahh

#

whatever i have api credits i paid for

#

just was too lazy to set up

obsidian cargo
#

model search seems a bit busted btw

real dock
#

chatgpt as deepseek

#

what if it's all true

#

chatgpt is deepseek

#

it was all a lie

#

we were lied to

obsidian cargo
#

no, it cannot be!

ivory sky
echo aurora
#

I'm removing your post @wispy wigeon I think it's pretty obvious why a prompt like that would get flagged.

spark python
#

😭😭😭

#

Son

#

😭😭🙏

wispy wigeon
spark python
#

but if Google terms allows it

#

Why wouldn't lm arena terms allow the image?

wispy wigeon
#

That’s exactly my question

#

I was using lm arena a lot and back in the day it used to have fewer restrictions. It’s still a great tool but I think that was the main reason why people used to go there and get the maximum from the models. Also Seedream 4 is a Chinese model (much less strict than others) and I’m curious how I’m actually gonna generate some image cuz I don’t think it’s that bad

echo aurora
#

This is in case some models we're working with don't have this kind of moderation, but also to keep battles fair.

proud bobcat
#

finally got a chance to properly test it

#

3.2 is quite good

proud bobcat
#

It’s not bad at all

#

This may be a goated general purpose model

stray aspen
#

Damn the new deepseek sucks

elder solar
#

i just found out something

#

if you type [toxicity=0] into gemini 3, you just get random javascript answer

tardy cedar
#

something wrong with the bot guys?

#

āŒ Generation failed. Failed to create evaluation session.

echo aurora
#

Lots of problems today with the bot unfortunately.

subtle peak
#

Hey. Probably related, but I get Failed to accept terms. Please try again. message.

distant coral
#

Hi guys. I've added to my open-source project Code Web Chat support for LMArena. Could admin DM me?

keen beacon
elder solar
#

looks like its deepseek api problems

keen beacon
muted bolt
#

I see all the generations are failing

echo aurora
echo aurora
keen beacon
echo aurora
#

and are you using a VPN?

keen beacon
#

It should still work after error

subtle peak
keen beacon
#

Do you use Google a lot? In ur browser outside of the arena

echo aurora
tropic scroll
#

bot dont work??

echo aurora
tropic scroll
#

ok thank you very much sorry for the inconvenience

keen beacon
#

I’m not having any issues on my end strange

#

Is it with specific bots? Like direct chats?

elder solar
balmy mist
#

does deep seek have an update?

subtle peak
balmy mist
keen beacon
verbal nimbus
echo aurora
#

We were (are possibly) having problems with the bot. We just rolled back a change so it should be working now.

#

Can you try it again @subtle peak and let me know if it's working again?

subtle peak
#

Unfortunately no.

#

Still fails to accept the terms.

#

It's nothing urgent. I'll try tomorrow. Hopefully the bot gets better. PepeCute

echo aurora
#

Thank you for letting us know. I'm sorry this wasn't fixed for you yet @subtle peak

subtle peak
#

It's all cool. Thanks for the help!

stray aspen
dry owl
#

Hi, is it possible to somehow add more than one image when using the "image to video" bot? (I mean multiple images together with one prompt, in one generation)

echo aurora
proud bobcat
stray aspen
#

Lol what

#

There's no way

#

It's literally so bad

keen beacon
#

It thinks it’s Claude lol

south quiver
#

so image to video is down

echo aurora
keen beacon
#

Weird how this still works lol

#

And it starts thinking it’s made by anthropic

#

(it’s nothing really harmful it just mimicking)

jade egret
#

deepseek v3.2 good?

proud bobcat
# keen beacon

This is because it’s hosted from deepseek api directly

#

Other providers are gonna provide it for roleplay on openrouter

fiery gull
# keen beacon

I didn't understand anything, but over time I was able to read 50% of the words

proud bobcat
#

When steady service is guaranteed DeepSeek 3.2 will be my main

fiery gull
proud bobcat
#

It is

#

I tried it for a few prompts

#

Does very well compared to exp

#

Definitely more performance in coding and shorter thinking

fiery gull
proud bobcat
#

Oh yeah no speciale is hallucinating rn apparently

#

DeepSeek team is working on a fix

keen beacon
fiery gull
#

Dude me reading english even if fluent (yeah 😭 writing is horrible) in reading spends more energy than my native, imagine me reading something like that? Am I doing cardio?

fiery gull
keen beacon
#

I just do images mainly

#

Can somebody explain to me the significance of this?

#

And the two fundamental ways to view the information that is presented in front of us?

#

I’ll give one more example to see if somebody understands what’s going on here

#

Which is more accurate? And which is more honest? What part of the output can be considered hallucinations?

proud bobcat
#

No Z Image yet

#

This is so sad

keen beacon
#

Because like always with deepseek the wave of speculation surrounding censorship, regardless of the achievements or accomplishments it is shrouded by this deep mysterious question that I’m presenting in front of all of you.

timber inlet
#

@echo aurora when the prompt fails. Is it because it thought for too long or is it a bug of lmarena?

#

They reason for like 2-3 minutes then fail

#

If its a bug of lmarena i wont rate it as its unfair let me know

echo aurora
proud bobcat
#

Sniffle

echo aurora
#

For those running into a Something went wrong while generating bug this is the best way to flag this - #1417174113092374689 message cc @proud bobcat

bright shard
#

Is LM Arena down again?

native yarrow
#

special is gone

keen beacon
radiant cipher
#

Hello, are there still issues with videos not being generated?

keen beacon
#

Lets give it one more test to see if it caps out at 8k

#

It definitely starts bugging after a certain point

thorny lagoon
#

hello

keen beacon
#

Caps out at 8k tokens

rocky mauve
#

Is deepseek v3.2 any good?

#

In coding

timber inlet
abstract tundra
proud bobcat
proud bobcat
#

Not a Claude replacer but it’s noticeably better than its experimental version

rocky mauve
#

Apparently deepseek 3.2 is just good for math

verbal nimbus
#

DeepSeek V3.2 Thinking feels easily identifiable since it takes so long to think

proud bobcat
#

And very nice scores in terminal

#

I just think its way of coding is different than Claude or Gemini’s

#

I let it code a few concept apps and it did quite well

#

Its errors were easily fixable and quite minor

#

I think it can excel as a code review LLM

#

I could see DeepSeek + Claude being a great combo

#

Claude to cook up the code and DeepSeek to bugfix

lucid geyser
#

Like the new one

thorny schooner
#

Does anyone know a good speech to text that works on Brave browser for free (because I'm on a Chromebook and the Chromebook's speech to text for some reason won't even turn on in that browser)

proud bobcat
#

Basically think R1 or 3.2 thinking but on loads of crack and efficient architecture

lucid geyser
#

Or nah

proud bobcat
#

Yes

#

The last model was the experimental model

#

This is the final refined version

lucid geyser
#

Well there’s 3.2 and 3.2 thinking and speciale

#

Seems like a long wait for a .2

proud bobcat
#

3.2 no thinking

3.2 thinking

3.2 speciale

#

All diff

lucid geyser
#

Will deepseek private their next gen

proud bobcat
#

Extremely impressive imo

proud bobcat
lucid geyser
#

U don’t think so?

proud bobcat
#

No

delicate frost
#

why cant i see the special one anymore?

proud bobcat
#

DeepSeek’s main thing is open source

#

They contribute hugely to the field

#

No reason to private it

proud bobcat
#

Had some issues on first launch

delicate frost
#

oh,okayy

lucid geyser
proud bobcat
#

DeepSeek has been quite committed to open source

#

They recently released an absolutely amazing math model that can actually reason in math correctly

bitter canopy
#

Hello everyone

lucid geyser
#

But not the company

proud bobcat
#

Otherwise they would probably have gone private long ago with V3

torn mantle
#

i dont think if deepseek really cares anymore

#

they are struggling to make a decent model but also they are benchmaxxing

#

they wasted so much time fixing huawei hardware issues ( ascend chip )

#

its not a secret anymore that most new models get better at math and code reasoning

#

because CoT favours the step by step reasoning and also because they have much more data for math proof reasoning

stray aspen
#

deepseek sucks lol

torn mantle
#

the issue is generalizing that type of data across different domains

torn mantle
lucid geyser
#

They’re notoriously selfless

torn mantle
#

im also not a big fan of k2 thinking

#

k2 non thinking is a solid model

lucid geyser
#

Is that qwen

torn mantle
#

imo the best non reasoning model after opus 4.5

proud bobcat
torn mantle
#

qwen is so so

proud bobcat
#

Kimi K2

torn mantle
#

alibaba are rushing it with qwen tbh

lucid geyser
#

Is it open source

proud bobcat
#

Qwen3 max is quite good I just don’t like its personality

torn mantle
#

even their devs said that recently

#

and i said that like a year ago

lucid geyser
#

I heard qwen was no longer gonna be open source

torn mantle
#

they are like making gazillion product

#

doubt that

proud bobcat
torn mantle
#

the issue with chinese models is that they lack good quality data

lucid geyser
#

Openai lock in

torn mantle
#

its no secret that they are training on big models to distill but still

#

deepseek v4 will be trained like 100% on gemini 3 pro

#

the knowledge gap is just too big

proud bobcat
#

How will you just train a model on a private model

torn mantle
#

google has good data quality

#

they should make use of it

torn mantle
proud bobcat
#

Seems stupid

torn mantle
#

they scrape models output daily

#

no ...

lucid geyser
#

Rate limit?

torn mantle
#

they all do that

proud bobcat
#

Output but you need the knowledge it uses for that output

torn mantle
lucid geyser
#

I mean a lot of models do think they’re other models

torn mantle
#

they are paying for it, could be 3rd party

torn mantle
lucid geyser
#

Apparently I saw OpenAI was pre training a model rn

#

And also apparently they have the best coding model some say

proud bobcat
#

Eh

#

Fair

#

Yeah

proud bobcat
#

Codex is worse at coding than normal 5.1

lucid geyser
#

Robin high

proud bobcat
#

Robin???

#

Did I miss something

lucid geyser
#

Yes on lm arena

torn mantle
lucid geyser
#

I’ve gotten it a bunch

proud bobcat
#

Are we sure that’s openai

torn mantle
#

pre training is kinda like starting from scratch.. takes a lot of time

#

all they do now is post training

lucid geyser
#

It says it’s OpenAI

torn mantle
#

on top of old models

lucid geyser
#

And it’s at least better than Gemini 3

torn mantle
#

wym

#

codex is unusable

#

wym...

#

its so slow

#

its not practical at all

#

when did they add ads

lucid geyser
#

OpenAI the only brand with some sort of loyalty

proud bobcat
lucid geyser
#

And even that is low

torn mantle
#

no one uses chatgpt in browser

proud bobcat
#

ChatGPT blows

torn mantle
#

yea no one use their browser

proud bobcat
#

They’re running out of options

lucid geyser
#

Nobody cares about worse stuff

proud bobcat
#

Their models suck

lucid geyser
proud bobcat
#

Claude and Gemini lead in most use

torn mantle
#

although i hate perplexity and their ceo but i think comet > chatgpt browser

lucid geyser
#

Apparently full pre training

proud bobcat
#

What do you mean no lmao

#

Most userbase ≠ Lead in practical use

torn mantle
#

when you look at the big picture they are all leading in different areas ngl ... oai + anthropic + google

proud bobcat
#

Most of their customers are non paying

#

More people pay proportionally for Gemini and Claude

lucid geyser
#

Also OpenAI’s reasoning is said to be the best

torn mantle
#

xai is out of the equation

#

yea oai reasoning is solid ngl

#

their models are smart

proud bobcat
#

API?????

#

What is bro saying

lucid geyser
#

It will be way better

torn mantle
#

but ive seen some news related to that

#

just not sure

proud bobcat
#

Oh yeah of course

#

But Gemini 3 has a lot as well

lucid geyser
torn mantle
#

the base model is so old

lucid geyser
#

Same since like 4o I think I heard

torn mantle
#

maybe older than that

proud bobcat
#

They don’t make revenue though

torn mantle
#

not sure

proud bobcat
#

Why not

#

Efficiency

#

Resource saving

torn mantle
#

they do make revenues... when gemini 3 was announced google stocks increased by like billions in a matter of seconds

proud bobcat
#

If they want to be scalable they should find more efficient bases

#

Imo

torn mantle
#

depends on what they saw tbh

#

but all the improvements nowadays comes from post training

#

its not just useless

#

its more like time consuming

#

so is it worth it?

lucid geyser
#

Nobody cares

#

That know that

#

That’s why they are back tracking and trying again

torn mantle
#

'we have models internally that performs the same as gemini 3'

#

'we felt confident'

proud bobcat
#

I call bs

lucid geyser
#

Also its ai

proud bobcat
#

OpenAI is known for hyping up models

lucid geyser
#

If it’s not released within like 6 months

#

They’re exposed

torn mantle
#

i mean they need to reassure their investors

#

ofc they wont show any weakness

proud bobcat
#

The moment other AI’s truly catch up in reliability and brand name it’s joever for OpenAI

lucid geyser
#

No

torn mantle
#

called robin

#

its an updated codex version

lucid geyser
#

But yeah apparently better than Gemini 3

torn mantle
#

its not better than gemini 3

#

its a coding focused model

lucid geyser
#

At coding?

#

I meant

torn mantle
#

maybe

#

i dont like gemini 3 at coding

lucid geyser
#

I got better results

torn mantle
#

does some silly mistakes

lucid geyser
#

But front end design kinda trash

torn mantle
#

yea frontend is bad

lucid geyser
torn mantle
#

but also robin is so bad at 3d complex simulations

#

so idk what to think of it tbh

lucid geyser
#

It was giving me way more long results

torn mantle
#

needs more testing

lucid geyser
#

Like if I asked for some app

torn mantle
#

yea

lucid geyser
#

Gemini would give me 600 lines

torn mantle
lucid geyser
#

Robin high would give like 2k

torn mantle
#

i mentioned that before

#

i was asking if its just codex + more thinking time

#

like not an improved version

#

but more like they just gave it more time to think

lucid geyser
#

that large of an improvement just from more thinking time?

torn mantle
#

they have this internal parameter called juice

#

for this robin model it was like 512

#

juice = 512

#

which is like a higher thinking budget

lucid geyser
#

He doesn’t sound all that confident in that interview tbh

torn mantle
#

the current codex has like a 200 something or less

#

yea someone shared that

lucid geyser
#

Also some people think companies have way better models and others think they release their best (some time buffer ofc)

#

So which do u think

torn mantle
#

if you ask a google staff he would say the same

lucid geyser
torn mantle
#

thats 5.1 codex right

torn mantle
#

but it needs to balance cost with performance

#

since its providing models for free

lucid geyser
torn mantle
#

and im talking about frontier models

lucid geyser
#

Like they can be more expensive but

#

Not like crazy

torn mantle
#

i think ds are like still struggling to find the recipe they want to continue on

lucid geyser
# torn mantle nah

Well yeah OpenAI had a general model which scored gold like 4 months ago

torn mantle
#

yea this just proves my point

#

robin was like 512

lucid geyser
#

Which is much better than even like Gemini 3

torn mantle
#

gemini was the general model

#

oai had more like math proof model

#

just spits some math gibberish but gets the result correctly

lucid geyser
torn mantle
#

no i meant it was a ready to use model

#

pretty sure it was the opposite

lucid geyser
#

Robin high or Robin

thorny schooner
#

Hope you all are doing good

torn mantle
#

nah im right

#

Regarding the "opposite" take—your intuition is spot on for the key differences in their approaches. Both are built on general-purpose models (OpenAI's experimental system evolving from o1, and Google's advanced Gemini with "Deep Think" enhancements), but OpenAI's outputs for the proofs tend to be longer, more verbose, and less elegantly structured—often described as "rambling" or filled with exploratory steps that resemble math gibberish, even if they ultimately arrive at correct solutions. Google's Deep Think, on the other hand, produces cleaner, more concise, and formally structured proofs, building on their prior specialized systems like AlphaProof but integrated into a general model.

#

google model was more ready to use

#

its grok search

torn mantle
#

it was

lucid geyser
#

Making elegant proofs

torn mantle
#

nah it was...

lucid geyser
torn mantle
lucid geyser
#

An advanced version of deepthink

#

Sounds specialized

torn mantle
#

i think they shared outputs somewhere

lucid geyser
#

We achieved this year’s result using an advanced version of Gemini Deep Think – an enhanced reasoning mode for complex problems

torn mantle
lucid geyser
#

An enhanced reasoning mode for complex problems

torn mantle
#

1 output is gemini
2nd oai

lucid geyser
#

It sounds specialized

torn mantle
#

its not that bad ig

#

i thought it will be like gibberish

lucid geyser
#

We also provided Gemini with access to a curated corpus of high-quality solutions to mathematics problems, and added some general hints and tips on how to approach IMO problems to its instructions.

torn mantle
#

but you can actually read it and make sense of it

lucid geyser
#

Wow

lucid geyser
torn mantle
proud bobcat
#

Will Z Image turbo be added to LMArena?

torn mantle
#

there was a debate about that on x

lucid geyser
#

Pretty sure the Imo solutions are way shorter

torn mantle
#

like these model arent as intelligent as you think really

#

they are a good pattern matching models

lucid geyser
#

They know a lot though

torn mantle
#

you just have to find the perfect recipe to get the pattern matching to higher accuracy

#

like what oai are doing

#

they are good at that

lucid geyser
#

Was it with tools

torn mantle
#

with the current architecture

#

they will never innovate

#

like never

#

with tool calls they can somehow give you hypothesis to test like what google is doing with co-scientist

#

so many agents layered together

lucid geyser
#

Who’s more evil

#

Google or OpenAI

torn mantle
#

novel treatment target discovery

lucid geyser
#

Is OpenAI as bad as people say

torn mantle
#

idk what you mean by bad

#

but i wouldnt trust any company

#

or any closed sourced company

#

sam is more on the evil side lol

#

i dont trust him

lucid geyser
#

U got an American flag and the crazy senator lady in ur bio

#

Marjorie Taylor Greene

#

A real expert in the field

lucid geyser
torn mantle
#

i saw that

lucid geyser
#

Bro just lies

torn mantle
#

we have enough small models already 😭

lucid geyser
#

About everything

torn mantle
#

look at qwen

lucid geyser
#

And is greedy asf

torn mantle
#

never liked elon tbh

thorny schooner
#

Is it weird i purposely wait for a while every time the verification checkup starts just so it resets and not make me do it ( I do it because yeah just know I'm not dealing with the hell and inconsistency of it with the images)

torn mantle
#

i always thought he had some mental condition

lucid geyser
#

Left OpenAI after poaching employee

#

And wanting to be ceo or merge it apparently?

proud bobcat
#

This released on openrouter

#

It’s apparently fine

lucid geyser
torn mantle
proud bobcat
#

And it’s easy as hell to run

#

Only 6B

solar hollow
#

elon probably destroyed his cognitive abilities with excessive drug abuse

#

he just got dumber

lucid geyser
#

Yeah I think cause it’s gotta be like sorted

lucid geyser
thorny schooner
# torn mantle try different browser

No, that issue has been there across different browsers. The problem is the images themselves: they're either inconsistent or, for the ones that disappear after you select them, very slow to appear again, wasting time until it becomes useless. Or it becomes hypersensitive, activating on every other prompt, or worst case forcing me to do that inconsistent verifying for each prompt, which is unfortunately often. So with it being inconsistent and hypersensitive it becomes a nightmare, and I just gave up on it and wait for the reset

solar hollow
empty stump
#

gemini 3 pro is not imo level?

lucid geyser
solar hollow
proud bobcat
lucid geyser
#

Like he actually cares about protecting people

solar hollow
#

he used to care about more important things

thorny schooner
solar hollow
#

his brain is puuf, gone

lucid geyser
#

Saying we’re gonna build a colony on mars

torn mantle
#

longer context = more compute/training = massive gpu hours for little gain for a mid size model

#

like i told you

lucid geyser
#

Isn’t exactly that

solar hollow
torn mantle
#

alibaba are working on 10000000000 product

proud bobcat
#

Fr

#

I loved grok 4 fast

#

Goated model

#

One of the best

lucid geyser
proud bobcat
#

I was there

#

Grok 4 was benchmaxxed

lucid geyser
#

Maybe he couldn’t end world hunger forever but

#

He could certainly save many children

proud bobcat
#

But man the architecture is awesome

lucid geyser
#

Grok 5 is agi he says

torn mantle
#

@deep adder also 32k fits just perfectly with 20gb vram, if it was native 128k you would need an extra 10-20gb of memory just for coding

#

if you really care about running it locally

#

yea

#

bruh

#

ive seen it on way less gpus

proud bobcat
#

Grok 5 is going to be 5000 gazillion parameters agi

#

Trust

lucid geyser
#

5090 to play r o blox

torn mantle
#

wild is gone

#

you banned him

#

why

proud bobcat
#

Huh

#

Who was banned

torn mantle
#

grok 5 will be so bad

#

lol

#

i dont trust xai or elon at all

proud bobcat
#

Grok 5 will either be absolute peak or benchmaxxed

#

They do awesome jobs on efficiency

torn mantle
#

its not about being small

proud bobcat
#

Grok 4 fast is just goated

torn mantle
#

deepseek had way less staff

#

xai culture is messed up with elon ideology

lucid geyser
#

They hire Chinese people at all the us companies

torn mantle
#

like culture wise which is the base of a company is messed up

#

from the roots

#

you need like radical changes

lucid geyser
#

Plenty of Chinese phds in the us from like tsinghua

torn mantle
#

this

#

...

#

like whyyyyyyyy

lucid geyser
#

Just to get last place

torn mantle
#

why would you do that lol

lucid geyser
#

Saw that post

#

Of the guy

#

In the Tesla

#

Well it’s never really a flex to work 36 hours

#

It’s kinda just sad

torn mantle
#

they do with 4.1

lucid geyser
#

Bro Elon flipped multiple times

#

He literallly called him a pr ed

#

Base salary as an ai researcher?

torn mantle
#

@deep adder there was this research shared the other day, idk if its in this channel or the other one about compression tokens/text like in .zip files and conserving like 90% accuracy

#

but its so so dumb

#

like soooooooooo dumb

lucid geyser
#

Like are these ai researchers that got paid 100m really that much better than 100 1m researchers

solar hollow
#

the highest salaries on the planet potentially

lucid geyser
#

Like would they hire Terence Tao even though he’s not in AI

torn mantle
#

oai was less stressful a year ago

#

but now

solar hollow
#

the lowest is probably a million in a year

torn mantle
#

i cant say the same

lucid geyser
#

U don’t think he could get hired just cause his math talent and intelligence?

solar hollow
#

but most are much higher

lucid geyser
#

Also he’s probably done some cs and ai

solar hollow
#

of course, im just saying

#

also phd in other fields make nothing compared to that

lucid geyser
#

Millionaire =/ salary of a million

native yarrow
#

gemini is INSANE at lua coding oh my

torn mantle
#

many key oai workers were poached by different AI labs, and many AI labs have already caught up

#

like dont bother

#

couldnt find the article

#

but it was stupid

#

so many silly mistakes

solar hollow
#

salary is something entirely different though

lucid geyser
torn mantle
#

this one

native yarrow
#

like

torn mantle
#

i used RL to have LLMs naturally learn their own 10x compression! Qwen learned to pack more info per token (ie use Mandarin tokens, prune text)

lucid geyser
#

Yes I’m sure the average phd is making well over a million…

native yarrow
#

just one simple prompt "particle accelerator" and it made a complex structure that works

torn mantle
#

it looks cool but its stupid

#

lol

#

like the model using other langs to pack more info?

#

seems like a cool idea

torn mantle
#

but its more like reward hacking

native yarrow
torn mantle
#

but dunno about the quality

#

i guess for its size its good ?

#

the idea is fine

#

but the execution is just not it

#

constraining the model from the get go to 90% accuracy instead of dynamic compression

#

30% -> 40% -> 50% ...

#

it could've learned way more cool stuff

lucid geyser
#

Terence Tao couldn’t even get $100k a year 🗣️

torn mantle
#

yea

#

now that you mention it

#

there are some mathematicians working at xai with that salary

#

you guys remember that troll xai guy

#

i dont remember his name but he was always on my feed page

lucid geyser
#

No

torn mantle
#

the guy with the hat

#

omg he was so annoying

#

i need to know his name

#

found him

#

I am a mathematician at xAI. Previously I was a researcher at Microsoft Research.

lucid geyser
#

Probably a much worse mathematician than Terence Tao

#

But he did cs

torn mantle
#

thats my point

#

he was trolling non stop

#

i had to block him

#

he was the one that started 'i dont sleep, i work 24/7 at xai office'

#

and 'make grok great again'

lucid geyser
#

Yeah there’s like a lot of math involved

proud bobcat
#

Ig

proud bobcat
torn mantle
#

Introducing our new frontier video model, Runway Gen-4.5. Previously known as Whisper Thunder (aka) David.

Gen-4.5 is state-of-the-art and sets a new standard for video generation motion quality, prompt adherence and visual fidelity.

Learn more below.

#

i call this one of the worst releases pr ever

#

why would you post it if you are not giving access to anyone

#

its so stupid

#

they just killed the hype with that

#

runway 4.5 seems on par with veo 3 but their release strategy is so bad

#

@echo aurora

lucid geyser
#

I’m just curious on the value of a mathematician to ai

#

They censored it

torn mantle
#

if sora falls so will all other ai vid gen

lucid geyser
#

Veo 4 soon

torn mantle
#

bro

lost moon
#

@echo aurora I tried to open a new conversation window, but I can't open it anymore.it has no answer. Is there a limit now?

torn mantle
#

who cares about tiktok

lucid geyser
#

Apparently people are testing it

torn mantle
#

tiktok has their own issues

#

lol no its not

lucid geyser
#

It is best for control

#

Also animation

torn mantle
#

its just because it was uncensored (kinda)

lucid geyser
torn mantle
#

sora 2 pro?

#

havent tried it

lucid geyser
#

No

#

Small

torn mantle
#

never generated an ai vid gen in my life

lucid geyser
#

Helps fix that texture thing

torn mantle
#

maybe one time with veo 3 since i have the pro

#

i just dont have any use for it

lucid geyser
#

They’re not very good

torn mantle
#

im not a tiktok creator or trying to earn money from that

#

if its for fun maybe

#

but i agree

#

sora 2 generated way more viral videos

lucid geyser
#

I posted veo on TikTok before

#

When it was trending

#

With the yeti stuff

cloud zinc
torn mantle
#

are you seeing this

#

lol

cloud zinc
torn mantle
#

it seems solid

cloud zinc
#

veo 3 is half

torn mantle
#

we still need more demos

#

lol no

#

look at the numbers

#

the graph is kinda misleading

cloud zinc
#

oooh

#

lmaoooo

torn mantle
#

xddddddd

cloud zinc
#

20 points is nothing

#

does it have audio?

torn mantle
#

yea

cloud zinc
#

The model excels at understanding and executing complex, sequenced instructions. You can specify detailed camera choreography, intricate scene compositions, precise timing of events and subtle atmospheric changes all within a single prompt.

We remain committed to making highly

#

why no audio

lucid geyser
torn mantle
#

demos

#

bottom page

hazy kernel
lucid geyser
#

How is 3.0 higher than 3.1

cloud zinc
#

i am hearing no sound

lucid geyser
cloud zinc
#

sora 2 pro way better

lucid geyser
#

No audio is crazy

cloud zinc
#

yeah

hazy kernel
hazy kernel
lucid geyser
torn mantle
lucid geyser
cloud zinc
#

its only video, no audio

hazy kernel
#

The model excels at understanding and executing complex, sequenced instructions. You can specify detailed camera choreography, intricate scene compositions, precise timing of events and subtle atmospheric changes all within a single prompt.

We remain committed to making highly.
Omg I'm so oof asking such a dumb question

torn mantle
#

it has audio

#

also

lucid geyser
#

Does the chart include audio though

torn mantle
#

i think its below veo 3 and sora 2

cloud zinc
#

cuz its not native audio

torn mantle
#

im so lazy

cloud zinc
#

we dont need fake audio inserted

torn mantle
#

to search that up

#

but i heard it with audio and one of their dev confirmed that in some tweet

cloud zinc
#

its not the same thing

torn mantle
#

we will see ig

cloud zinc
#

he says this

#

but none of the clips and demos have audio

lucid geyser
#

Has no audio

cloud zinc
#

yes

lucid geyser
#

But it isn’t specified to have no audio

cloud zinc
#

this is what their ceo says

lucid geyser
cloud zinc
#

we will see if its true if audio is native or not

torn mantle
#

then its native

lucid geyser
#

Doesn’t know his own product

torn mantle
#

lol

lucid geyser
#

It 100% does not have audio

cloud zinc
#

but nothing on the demo page had audio

lucid geyser
#

At least the current release

cloud zinc
lucid geyser
#

It would mention if it did

cloud zinc
#

ye 100%

lucid geyser
#

Especially since the previous didn’t

torn mantle
#

but i guess its not that good

lucid geyser
#

Where did he comment that

torn mantle
#

thats why they didnt add demos with audio

cloud zinc
cloud zinc
#

twitter/x

lucid geyser
#

Holy prompt

#

Prompt:
A young woman with straight blonde hair and a freckled complexion sits quietly, looking up through her lashes with a vulnerable expression

torn mantle
#

lol

#

even the ceo doesnt get to see everything

proud bobcat
#

Ultra ultra good

torn mantle
#

more demos

cloud zinc
lucid geyser
proud bobcat
#

It’s

#

It’s runway???

cloud zinc
#

yes

torn mantle
#

oh it has audio

torn mantle
#

if you listen closely you can hear it

#

not the music

lucid geyser
#

Check his videos that aren’t edited

#

Only the montages have audio

cloud zinc
#

let me hear the audio clearly

torn mantle
#

woah

#

the fire shot was cool

cloud zinc
#

it looks like its video to video, not text to video

thorny cove
#

how long does a chat rate limit usually last?

lucid geyser
thorny cove
cloud zinc
#

its cherrypicked data

thorny cove
#

this happens on pretty much every model after enough chatting

torn mantle
#

this looks straight up from a movie

#

i like it

thorny cove
#

do i have to retry 1000 times to unlock it working again?

cloud zinc
thorny cove
#

or do i have to wait

thorny cove
lucid geyser
#

Definitely not native

cloud zinc
#

they prob using other ai to add audio sound effect

quartz light
#

guess what model

native yarrow
quartz light
native yarrow
#

uayayayayyayaa

haughty forge
#

I want deepseek speciale back 😭😭

keen beacon
#

Man

#

Let’s talk about real censorship Google is implementing

#

And I don’t mean no gore and I don’t mean no NSFW

#

I’m talking about historic individuals and language use of time periods and eras

#

It’s completely wrong and still fails to censor real harmful AI output. As a consequence, everyone has to suffer because of their mistrust of the users

#

For all the progress AI makes it takes two steps back in the wrong direction

#

SUPPORT OUR SERIES OF DEEP-DIVES INTO CORPORATE SURVEILLANCE & THE AI DYSTOPIA: https://store.gamersnexus.net/ai-dystopia
In this video, we walk through "AI Summaries" on Google and YouTube. Although these may currently be an attempt at some kind of actual value proposition (mostly for shareholders, of course, because evidently no one else matte...

#

Prime example

#

GamersNexus gives me hope for consumer advocacy

#

So does lm arena lol

#

You know, it kinda blows me away too that as long as I’ve been here none of you guys, or anyone for that matter, has ever complained about AI hedging

still fossil
#

hm hi

keen beacon
hazy forge
#

where is speciale

keen beacon
#

Kling 01 model is nice

still fossil
#

can anyone help me please

keen beacon
#

Sure, what's the issue?

still fossil
#

I'm trying to generate an image in the lmarena image section using Google gemini 3 pro
it just fails every time and also uses up my credits
and doesn't even give an output
what do I do to get an output
I have tried every way

keen beacon
#

So you’re saying every time you try to use the Gemini pro image model directly, regardless of your prompting or anything you do, no image gets generated?

still fossil
#

it says something went wrong with the response

#

and fails

#

and when that's not happening it generates infinitely

#

that's an hour and it still doesn't give anything

keen beacon
#

Well, first of all there is a rate limit. I don’t know the exact number of images; from what I understand it’s possibly four, maybe a little more?

still fossil
#

ik

keen beacon
#

It takes a little longer to generate too, so you have to kind of be patient with it. If it doesn’t work, try to start a new chat.

still fossil
#

but after 50 mins it gets over

keen beacon
#

Yeah, I know what you mean

still fossil
#

I cant even generate a single image

keen beacon
#

Try clearing your browser

keen beacon
#

Let me test it out right now one

still fossil
#

come dm

keen beacon
#

Sure

#

Just if you don’t mind, just share it here unless it’s something personal

still fossil
#

check maybe

#

upload both images

#

to get output btw I gave in dm

keen beacon
#

Ok

#

Are you using a VPN by any chance?

#

I’ll teach you a better way where you don’t need Gemini pro for what you’re trying to do

cloud zinc
#

whats kling o1

still fossil
#

pure wifi

#

idk what's happening

keen beacon
#

U can’t make ur img in lm arena using Gemini pro for 16:9

#

The output is 1:1 I think I’m not sure. Let me check one second.

#

Nvm I’m wrong

still fossil
#

nah

#

the 1st image I gave is generated by gemini also

#

it's 16:9 output

#

but maybe my prompt is not correct

keen beacon
#

Ok

#

Now we gotta add the character in the center

still fossil
#

I think I'll use blender instead

waxen fern
#

Is LMArena still implementing rate limits??

still fossil
#

it's fussy

#

yh there is

#

9 images

#

50 mins and then again

keen beacon
waxen fern
#

It's so annoying

keen beacon
#

One second it keeps messing up

waxen fern
#

I gotta ask pineapple

keen beacon
#

I’m not using Gemini pro

waxen fern
#

@echo aurora Remove all rate limits and make LMArena truly unlimited

keen beacon
#

But I do have Gemini pro so I can do it for you officially one second

#

Lm keeps flagging

#

Damn wtf

#

Ok got it

still fossil
#

what's better than gemini pro

keen beacon
#

Let me remove that watermark

#

I don’t know I don’t like nano sometimes it makes a composition look weird

#

Like the height

#

Looks very off to me

still fossil
#

hm now let me generate it

keen beacon
#

Ya try ur image

keen beacon
keen beacon
#

Isn’t this from like Minecraft or something?

waxen fern
#

Rate limits for text arena???

keen beacon
#

Guardrails

cloud zinc
#

is it video editing model or text-to-video

still fossil
#

idk why

keen beacon
#

Same

waxen fern
#

Rate limits for text generation????

keen beacon
#

I don’t know dude the composition is just weird

still fossil
#

I think my image is open sourced that's the problem lemme fix that

#

wait

#

gehlo come dm

keen beacon
#

I think it’s that fake pixelated blood

#

Odd

#

Yeah, I’m not really good with composition either

viscid cloak
#

Anyone encountered endless “something went wrong” in Battle?

keen beacon
#

Ya

viscid cloak
#

Like there should not be rate limit in Battle, and I have not triggered a single verification. But the endless error message seems like a hidden rate limit control.

keen beacon
#

Why shouldn’t there be regular limits?

waxen fern
#

Number of limits in direct chat????

keen beacon
#

Its design is not for infinite generation lol

#

I mean, how much time and how many prompts do you really need to test the models? After a certain point you kind of have an idea what the model can do lol

waxen fern
#

Number of limits in direct chat for text????

keen beacon
#

I hear what you’re saying and I feel your frustration so I don’t want you to think I’m

viscid cloak
keen beacon
#

No, that’s why I said I get it

waxen fern
#

Direct chat for text limit numbers pls

keen beacon
#

I’m just trying to see both sides of the situation so I can better understand the dynamics at play

#

Two opposing forces trying to achieve two separate objectives 🤣

viscid cloak
#

If the platform expected everyone to be so benevolently selfless, it wouldn’t get that much information for learning and improving. So limiting Battle does no good for either side.

keen beacon
#

I hear ya

waxen fern
#

Number of chat a day for text rate limits???

keen beacon
#

It just puts lm arena in kind of a weird awkward situation

#

Both on the expectation side, and on the other hand, fulfilling their obligations whatever they may be

viscid cloak
keen beacon
#

That’s actually more interesting case here

#

Is the rate limit on specific models

waxen fern
keen beacon
#

lol this is exactly the mentality I’m talking about

#

Which is unfortunate cause I understand and I’m part of that mentality also

#

There needs to be a subsidized variant of something like lm arena

viscid cloak
#

Now the error message in Battle is quite ridiculous. I used my phone and turned off WiFi, using cellular network. Boom, generating smoothly.

#

Really Arena? Doing that kind of funny ip lock trick🤣

keen beacon
#

Could be a browser, could be something on their end. It’s hard to say; a lot of people seem to encounter this.

#

That’ll be crazy if that’s what was going on

keen beacon
#

But it also means if that is the case then that’s a lot of generations

waxen fern
#

Number of limits for deepseek 3.2 thinking in LMArena??

viscid cloak
#

I mean I haven’t triggered a single ReCAPTCHA. Blame the system lol

keen beacon
#

Well, that has nothing to do with them. That’s more of a Google thing

#

I hardly hit any either

#

But I use a lot of Google services

viscid cloak
#

I mean shadow ban is just lame. Could’ve just told me like Cloudflare.

viscid cloak
keen beacon
#

Well, I think we’re jumping the gun here

#

We don’t know if shadow banning is a thing

#

It’s like four or five messages. I can test it right now and tell you the exact number one second.