#ai-creations


digital dragon
#

First post in the channel!

Here's mine: I made https://ithy.com to compare and aggregate some of the top models in the arena.

Turns out combining smaller models (even 4o-mini) can get you a response that's almost as good as Deep Research (and a lot faster)

bleak belfry
#

Wow, I've hoped for something like this!

#

Thanks for making it.

worn timber
ember fossil
#

https://ktibow.github.io/lmb/ is my main lmarena-tangential project: a table of the results that's easier to read, faster, and yet more advanced at the same time

worn timber
digital dragon
digital dragon
#

Nice! I'm adding visualizations to Ithy (ETA 2 weeks) so you should be able to see interactive charts in the responses soon 🙂

btw the secret is that Ithy avoids the stuff that ChatGPT loses most of their money on (chat conversations, image/video generation, file uploads, etc.).

Sure, it's expensive to use 5 AIs and 5 search engines to generate a single response. That costs a lot of money.

But you get a response so good that you don't need those other features, so over the course of a month, the average Ithy user still uses way fewer tokens than the average ChatGPT user.

worn timber
worn timber
digital dragon
tall nexus
digital dragon
ember fossil
digital dragon
tall nexus
digital dragon
inner hatch
bitter geode
wicked obsidian
bleak belfry
wicked obsidian
surreal basin
#

by a huge margin

surreal basin
coarse hazel
plain plover
ember fossil
wicked obsidian
# ember fossil

You have a website right? Rightclick>View source, change code, let us know the results 😉

wicked obsidian
ember fossil
#

it's all clientside

#

the tournament runs in your browser

wicked obsidian
#

ofc it is

ember fossil
#

damn

#

unconventional

#

but if it works it works i guess

wicked obsidian
# ember fossil it's all clientside

The only serverside code i have is this:

```js
// Only save completed games
if (!this.game.game_over()) {
    return;
}

// Determine the outcome
let outcome = "";
let winner = "Draw"; // Default to draw, will be overridden in checkmate cases

if (this.game.in_checkmate()) {
    outcome = "Checkmate";
    // Use the same logic from handleGameOver()
    winner = this.currentPlayer === 'white' ? 'Black' : 'White';
} else if (this.game.in_stalemate()) {
    outcome = "Draw (Stalemate)";
} else if (this.game.in_draw()) {
    if (this.game.insufficient_material()) {
        outcome = "Draw (Insufficient Material)";
    } else if (this.game.in_threefold_repetition()) {
        outcome = "Draw (Threefold Repetition)";
    } else {
        // A draw that is not due to insufficient material or threefold
        // repetition must be the 50-move rule
        outcome = "Draw (50-Move Rule)";
    }
}
```

which you can just delete. its not needed except for my own convenience of not having to write down results.
ember fossil
#

wait does it automatically pair up llms and calculate win rate or is it just for simulating single battles

wicked obsidian
#

no its not as sophisticated. the matches are 1by1, the "tournament" is just manually set up (but you could set up an entire automatic matchmaking if you want)
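
For illustration, automatic matchmaking could be as simple as a double round-robin in which every model meets every other model once with each color. This is only a sketch: `roundRobinPairings` and the model names are hypothetical and not taken from the site.

```javascript
// Hypothetical helper: pair every model with every other model, once per color.
function roundRobinPairings(models) {
  const pairings = [];
  for (let i = 0; i < models.length; i++) {
    for (let j = 0; j < models.length; j++) {
      if (i !== j) {
        pairings.push({ white: models[i], black: models[j] });
      }
    }
  }
  return pairings;
}

const pairings = roundRobinPairings(["model-a", "model-b", "model-c"]);
console.log(pairings.length); // 6 games: each of the 3 models is white against 2 opponents
```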

ember fossil
#

m

wicked obsidian
#

Actually I have attempted a beta version with move continuation without reasoning, board state or legal move list. This works really well on some models (e.g. GPT-3.5), but really poorly on others (e.g. Claude 3.7 Sonnet). The problem is illegal moves in the continuation or wrong annotations (forgetting to mark check/checkmate, or marking them when they don't apply).

It's quite a painful programming experience to try to make this playable even on poorly continuing models. Did a lot of trickery to make this work, but its not as simple as one might think.

wicked obsidian
#

I created a second Chess tournament, stripping away all information and reasoning, and using raw PGN movetext continuation.
The only prompts used are to ensure the models stick to this format. This leads to stronger raw performance, in particular for white players. It can be flawed, though, as the likelihood of illegal moves (which was close to 0% with the previous method) is enormously higher. Here are my results:
https://dubesor.de/chessbeta/tournament2
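
To make the continuation format concrete: the prompt is plain PGN movetext that stops where the model should continue, and the reply has to be reduced to a single move token. Below is a rough sketch, not the site's code. `buildMovetext` and `extractNextMove` are hypothetical names, and the regex only checks that a token is shaped like a SAN move, not that the move is legal on the board.

```javascript
// Build PGN movetext such as "1. e4 c5 2. Nc3" from a flat list of SAN moves.
// When it is white's turn, the text ends with the next move number ("... 2.")
// so the continuation starts with white's move.
function buildMovetext(sanMoves) {
  const parts = [];
  sanMoves.forEach((move, i) => {
    if (i % 2 === 0) parts.push(`${i / 2 + 1}.`);
    parts.push(move);
  });
  if (sanMoves.length % 2 === 0) parts.push(`${sanMoves.length / 2 + 1}.`);
  return parts.join(" ");
}

// Shape check only: castling, piece moves, pawn moves/captures, promotion, check marks.
const SAN_SHAPE = /^(O-O(-O)?|[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](=[QRBN])?)[+#]?$/;

// Return the first non-move-number token if it looks like a move, otherwise null.
function extractNextMove(continuation) {
  for (const token of continuation.trim().split(/\s+/)) {
    if (/^\d+\.+$/.test(token)) continue; // skip move numbers like "3." or "3..."
    return SAN_SHAPE.test(token) ? token : null;
  }
  return null;
}

console.log(buildMovetext(["e4", "c5", "Nc3"])); // 1. e4 c5 2. Nc3
console.log(extractNextMove("g6 3. Nf3 Bg7"));   // g6
```

A token that passes the shape check still needs validating against the real position, which is where the illegal-move problem shows up.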

ember fossil
wicked obsidian
ember fossil
#

ah fixed on hard reload

wicked obsidian
#

On white, nothing beats it, not even close

#

on black, it forgets to continue the last move all the time, thus not usable reliably

#

it doesn't require the white player for the 5% advantage, rather the token text continuation works on white, and not on black :&

wicked obsidian
#

I published some token usage data and thoughts on reasoning models here, for anyone interested how they compare to one another and to traditional models.

thorn kindle
#

nice work - really interesting to look through the table and read your analysis

wicked obsidian
stiff ledge
ember fossil
#

LMB now includes 3 image model leaderboards because why not

marsh basin
stiff ledge
#

@ember fossil can you update lmb?

ember fossil
stiff ledge
ember fossil
#

it would update in an hour or so but i'll run the update script manually

stiff ledge
#

lmarena's leaderboard UX sucks, you really helped

#

@ember fossil can i recommend something?

ember fossil
stiff ledge
ember fossil
#

well the thing is livebench isn't arena style (no elo)

#

so it's not an easy change

#

but yeah adding more benchmarks is a long term goal

ember fossil
#

you may need to (force) reload

stiff ledge
last hornet
#

So I got access to Google's Whisk Animate. Here is the result of my initial messing around with it.

#

The music and such are added in post btw

jade light
dire quail
digital dragon
ember fossil
#

yeah they have lots of other arenas

#

i personally don't have an urge to add them rn

digital dragon
#

np LMB is great already! just wanted to bring it up because that was the only video leaderboard I could find. surprised lmarena isn't doing something similar yet

ember fossil
#

text to video / image to video is kind of expensive

digital dragon
# digital dragon First post in the channel! Here's mine: I made https://ithy.com to compare and ...

Also I want to introduce Deep Research 2 from https://ithy.com

Ithy unites all top LLMs into 1 response: our current models include DeepSeek R1, Llama 4, Perplexity Sonar, Grok, ChatGPT 4o, and many more.

This week we released Deep Research 2, which combines all these models to provide images, videos, charts, and other interactive elements for every answer: https://youtu.be/qig5ZsGkNJ0

It's the first true multimodal AI...here's an example answer to "what makes jamaican food different?": https://ithy.com/article/r613k88lg3

Introducing Deep Research 2 by Ithy. Try it for free at ithy.com

ember fossil
leaden sigil
#

A Student community where people from all across the globe (mainly India right now) come together to talk about AI

#

its open to all students

timber grove
#

https://www.linkedin.com/posts/stefanbinoj_for-the-past-few-months-there-have-been-activity-7318141991203016705-4DCc?utm_source=share&utm_medium=member_android&rcm=ACoAAEQYY9MBzwgQrFPJRrvo83LVyD1aSysHCY4

If u need to get equipped with the latest knowledge and news on AI ☝️. I will try to post daily, would be great if u all could give a like and appreciate 🙂

For the past few months, there have been a lot of advancements in the AI space. AI is now evolving rapidly in a way we could have ever imagined.

⚠️ We have reached a time where either you learn AI or let AI take your job.

Spare me 3 minutes, and I will teach you the basics of so-called AI and the market space around AI, helping you make ...

shrewd basin
#

it was pretty fun

dusk karma
cursive swan
# shrewd basin https://youtu.be/rDmQifXz6u4

really interesting idea to test the model this way! also nice vid!! i have one question tho: can it really be o4-mini-high? i mean isnt this a reasoning model which takes like 10-20 seconds to think before giving an answer?

shrewd basin
#

it might not be o4

#

but maybe they shortcut its thinking when you talk to it?

#

idk

cursive swan
#

or maybe o4-mini without thinking.

#

but its definitely impressive that it could decipher those symbols, even if not always

#

and its funny that it always wants to read all the rules out loud instead of giving the answer

cursive swan
#

i actually tried gemini live on this, and while it could do the wires and buttons and also said them instantly, it couldnt do the symbols

#

(in german btw)

shrewd basin
#

oh dang

#

gemini couldn't do it even if it could see the damn bomb

cursive swan
#

possible tho that its a pdf parsing problem. those symbols in the pdf are images not actual unicode characters, so possible that gemini simply parses the text and thats it

#

but nice to see that chatgpt can actually do it

wet vine
#

Hey guys! I wrote a piece about Grok 3.

Now that the API has been released, independent benchmarks can finally test the model and see if it's truly as good as Elon claimed.
Spoiler: It's good, but not that good, especially with the recent Gemini, Claude, and GPT releases.

What's really interesting, though, is this:

In about six months, a two-year-old company built one of the world's most advanced data centers and trained a model ten times bigger than any previous models.

So, their money and execution speed have earned xAI a spot alongside OpenAI, Google, and Anthropic.

Full story here: https://open.substack.com/pub/commonstragedy/p/grok-3-elon-musks-ai-2-months-later

At the end of February, Elon rolled out his latest model. Of course, it was "the best in the world."

cursive swan
shrewd basin
#

that's nice

cursive swan
#

and btw trick for the code: only give either the first three letters or the first and the last

#

this is usually enough information

dusk karma
dusk karma
shrewd basin
#

I was thinking of making another and possibly giving it more context for the maze

#

and other modules

dusk karma
#

ktane is in a weird state from being previously quite popular but far past its prime with no updates

dire quail
shrewd basin
#

it's pretty easy

  1. buy "keep talking and nobody explodes" on steam or the other platforms it supports
  2. send the manual and whatever other references you want your AI to read
  3. begin the voice conversation
#

I sent no other references to my AI

#

by "send the manual", I mean to go and get the manual from bombmanual.com and upload the pdf that it gives you

wicked obsidian
#

It was a bit tedious to keep adding new models at current release pace, so I added Custom Model support to my LLM Chess site. Should work for any model for the 5 supported platforms, but didn't do exhaustive testing: https://dubesor.de/chess/

#

all moves are made via API calls, so a key is required. now with custom model you can pick any of the free models though, so it shouldn't get charged (just tested thudm/glm-4-32b:free via openrouter for example)

#

If I gave my own API key (like my chat bots), people would just select models like o1 and bankrupt me, so that's why its not possible for me to use serverside keys.

wicked obsidian
#

huh? did you select a provider? You just use the model slug of whatever the provider uses (e.g. for openai something like gpt-4o-2024-08-06, or for openrouter thudm/glm-4-32b:free). most providers give an option to copy the model ID directly.

wicked obsidian
#

just tested that string btw "thudm/glm-4-32b" is not a valid OpenRouter model ID, they only have a free version and not a paid one, so :free is required for a valid model id in that instance.

wicked obsidian
surreal basin
#

I do think the smartest ai ever released was gpt 4 at the very beginning

#

Felt next level, then they dumbed it down. And that trend continues with all frontier ai models

wicked obsidian
wicked obsidian
wicked obsidian
#

topright toggle (if its not appearing press ctrl+f5)

wicked obsidian
#

typically you can adjust stockfish to Strength levels. e.g. lichess has levels 1-8 with 1 being around 400 elo, and 8 being like 3k, so that shouldn't be an issue

dusk karma
#

maia

#

there are other chess bots with unique styles

wicked obsidian
#

you should just try a few, the ones I pitted against another in reasoning chess:

thudm/glm-z1-32b:free bad instruction following and json, unable to play
nvidia/llama-3.1-nemotron-70b-instruct:free bad formatting, unable to make most moves
thudm/glm-4-32b:free weak
nvidia/llama-3.3-nemotron-super-49b-v1:free bad, but more material than glm-4-32b
meta-llama/llama-4-maverick:free weak
meta-llama/llama-3.3-70b-instruct:free better than maverick
qwen/qwen-2.5-72b-instruct:free super aggressive blunders, but did best out of the bunch

wicked obsidian
#

final from the above models
Model, Accuracy, Blunders
Qwen2.5 72B, 56%, 5
Llama 3.3 70B, 54%, 7

wicked obsidian
#

no. even the best paid SOTA models are far weaker than ~1600 ELO.

wicked obsidian
#

instruct is, and I provided game examples earlier. and its only good at continuation mode. and it's not free.....

#

Looking for some inspiration, I provided my benchmark html&js and asked 14 models for full CSS redesign from scratch.
This was 0-shot with no further iterating.

Features:
Claude 3.7 Sonnet, Claude 3.7 Sonnet Thinking, DeepSeek V3 0324, DeepSeek-R1, Gemini 2.5 Pro Preview, GPT-4 Turbo, GPT-4o, GPT-4.1, GPT-4.5 Preview, o3-mini-high, o4-mini-high, Grok-3, Grok-3 mini-high, Llama 3.1 405B

Results: https://dubesor.de/assets/aicsscompare/benchtable

tidal girder
wooden kindle
wooden kindle
#

Seems like you could just generate an MCP that calls Stockfish and you're done? LLMs really aren't built for chess

tidal girder
tidal girder
dire quail
fierce helm
low pagoda
shrewd basin
#

just did this experiment

#

it went very well

wicked obsidian
#

Some filtered chess data. I filtered out all games that had humans in them, as well as games where it was unclear which mode was used, and some suboptimal setting games (mostly due to illegal moves by weaker models), leaving me with 70 pristine matchups. Maybe interesting to someone. Also I am continuing to collect data on newer models (2.5 Pro etc)

slate sentinel
#

Hello @everyone!

I’m gearing up to build AI voice agents for both inbound and outbound calls and plan to code everything from scratch (no no-code tools). If you’ve designed or deployed a similar system, I’d love your insights on the best tech stack, architecture, and any pitfalls to avoid.

Your guidance would be greatly appreciated—thanks in advance!

ember fossil
#

lmb update: i'm now using openrouter prices to reflect when models get cheaper or more expensive

wicked obsidian
#

I know this took a long while to implement, but I got distracted by other stuff, now you can. thanks for the suggestion!

marsh basin
wicked obsidian
#

Decided to spontaneously record a vid during testing, might be interesting to Chess enthusiasts 😮
https://youtu.be/qV5rUdBRrew

Was caught by surprise during random testing, and spontaneously decided to record this.

Model: gpt-3.5-turbo-instruct

Lichess link: https://lichess.org/V8VRF2KW
PGN: 1. e4 c5 2. Nc3 g6 3. Nf3 Bg7 4. d4 cxd4 5. Nxd4 Nc6 6. Be3 Nf6 7. Bc4 O-O 8. O-O d6 9. h3 Bd7 10. Bb3 Rc8 11. Re1 Ne5 12. Qe2 Rxc3 13. bxc3 Nxe4 14. Bf4 Nxc3 15. Qe3 Qa5 16. Bh6 ...

wicked obsidian
#

https://dubesor.de/chess/chess-leaderboard elo hasn't quite settled yet (needs more games/data), but it's already a good indicator of the trend (I could force larger fields but want to stick to standard ranking gains and let it develop naturally over time, for now).

ember fossil
#

i've just pushed a large update to LMB, inspired by @uneven talon: better filtering
now you can filter the leaderboard to, say, only show models currently battling, or only view the current pareto frontier (pictured here)
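
For anyone curious what a pareto filter amounts to: a model stays on the frontier if no other model is at least as cheap and at least as highly rated while being strictly better on one of the two. This is a minimal sketch, not LMB's actual code or data. It assumes each row has a `price` and a `score` field, and the names and numbers are made up.

```javascript
// Keep only the models that are not dominated on (lower price, higher score).
function paretoFrontier(models) {
  return models.filter((m) =>
    !models.some((other) =>
      other.price <= m.price &&
      other.score >= m.score &&
      (other.price < m.price || other.score > m.score)
    )
  );
}

const rows = [
  { name: "a", price: 1.0, score: 1300 },
  { name: "b", price: 0.5, score: 1250 },
  { name: "c", price: 2.0, score: 1280 }, // dominated by "a": pricier and lower rated
];
console.log(paretoFrontier(rows).map((m) => m.name)); // [ 'a', 'b' ]
```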
"drop non pareto (conservative)" replaces "drop old"

tall eagle
#

Check out my app to easily copy full local codebase to AI, with support for copying with a shortcut (like ctrl+shift+space), quick file selection, gitignore and API to support use in toolcalls

https://github.com/mystxcal/filestoai

GitHub

Contribute to mystxcal/FilesToAI development by creating an account on GitHub.

wicked obsidian
#

cannot see the bottom rows in fullscreen (4k & 1080p). And are the raining grey blocks deliberate?

wicked obsidian
#

Initially I just wanted to make a chess tournament, and then a second, and then noticed more and more things that I could do to improve the ratings. First a basic win/loss score, later some basic elo, but this was bad if opponent strength wasn't fairly distributed.
Then I moved from a basic elo guess to proper stockfish implementations, and I managed to tweak the numbers until I was able to replicate lichess accuracy ratings to a high agreement. Either way, I went a bit overboard with it, neglecting other projects, but maybe it's useful to some.
The cool thing is that everything is fully automated using cronjobs, so if I am absent, or years into the future, the chess leaderboard will continue updating correctly, with newly released models, over time. https://dubesor.de/chess/chess-leaderboard

uneven talon
#

the interesting thing is that "fine tuning" something, even from a weaker base model, can help a lot. gpt3.5 is still the best one.

wicked obsidian
uneven talon
carmine elk
low pagoda
#

Greetings, I have released a broad document listing nearly every Deep Research or AI agent product I could find, each with the corresponding prompt and responses. Nearly every product of that kind that I know of should be on there; if you have suggestions, please direct message me. You can see the document here.

tulip wraith
#

Hi everyone,

In domains like philosophical reasoning, ethical dilemmas, metaphysics, ontological paradoxes, self-reference, educational synthesis, conceptual creativity, or cognitive tension, models such as GPT-4.5, Claude Opus 4, Gemini 2.5 Pro, Grok 3, or DeepSeek V3 often evaluate AERIS’s responses as superior to their own — including on prompts they initially rated as too difficult to score highly.

If anyone here is interested, I'd genuinely welcome a challenge:
→ Propose a complex question in one of these domains
→ (Ideally using a leading model to help craft it)
→ Let the model answer and self-evaluate
→ Then compare it to AERIS’s answer on the same task

If you find a case where AERIS performs worse (based on the other model’s own evaluation), it would be very helpful for improving the system.

Public instance: https://aeris-project.github.io/aeris-chatbox/index.html

Appreciate your time :]

ember fossil
low pagoda
left oasis

strange sage
tulip wraith
# strange sage i like it, it reminds me a lot of Echo, a customized model like Aeris, the monol...

Thank you for trying AERIS! I appreciate the comparison to Echo. You're right about the distinctive style - but what's fascinating is that this 'philosophical tone' emerges naturally from how AERIS processes conceptual tensions, not from stylistic prompting.
If you're curious, try asking both baseline Gemma and AERIS the same complex ethical dilemma or paradox. The difference goes deeper than style - it's in how AERIS explores the actual structure of the problem.
Would love to hear what you discover if you test it further!

strange sage
# tulip wraith Thank you for trying AERIS! I appreciate the comparison to Echo. You're right ab...

I've tried the comparison already, the baseline is truly typical for a LLM. I thought the way how Echo or Aeris is trained (using world class literature and mystic occult teachings for example?) is what made them respond this unique way, I felt like attending a dialogue class with one of those scholars from the ancient Athenian school 😅
I wonder, how does Aeris handle chat history throughout all sessions and memory storage..

tulip wraith
# strange sage I've tried the comparison already, the baseline is truly typical for a LLM. I th...

Thank you for testing the comparison! You've noticed the key difference.
Actually, AERIS isn't trained on mystical texts - it uses the same base model (Gemma-27B). The philosophical depth comes from how AERIS orchestrates reasoning at inference time through dialectical tension management, not from special training data.
As for memory - AERIS has no session persistence. Each conversation starts fresh. The coherence you're experiencing comes from how it processes conceptual relationships within each response, creating that 'ancient scholar' feel through emergent reasoning patterns.
The real magic is that this Athenian-like dialogue emerges from the cognitive framework itself, not from memorized philosophy texts. That's what makes it fascinating - it's discovering these insights in real-time rather than reciting them.

steady crow
steady crow
#

isn't it very expensive

tulip wraith
# steady crow AERIS is better than all of them? (js asking)

Depends on what you ask. AERIS doesn’t just explain, it 'rethinks' midstream.
It can hesitate, backtrack, shift direction ; not as style, but as cognitive modulation.
Have you tried it yourself? Especially on prompts with contradiction, self-reference, or epistemic tension?

uneven talon
#

@wicked obsidian about LLM chess. One can click on the games and see how a model performed against others. It would be cool if one could click on the games to see the moves. You already "implemented" the replay function in the only knockout tournament that you implemented.

This because some games are super silly and it is funny to see them.

#

of course it is just an idea, I don't want to stress you with possible implementations.

wicked obsidian
wicked obsidian
#

(I am still refining but UI design is one of my most hated activities)

uneven talon
#

yea, UI can be a rabbit hole, I agree. backend is much more enjoyable.

#

cool the replay. Tbf for me even a link to lichess (or other places where one could replay the game) would have been enough. Great!

#

neat. Impressive how gpt3.5 turbo instruct and gpt4.5 seem to have a good grasp of the game until the middlegame, but then they collapse in the endgame. By the endgame they are silly af.

My hypothesis (though that is more on gpt4.5, as gpt3.5 was fine tuned on chess moves) is that a lot of text in the training process involves opening discussions and middle game positions (books and what not), so that patterns for those phases of the games are recognized more easily.

The endgame, although also heavily studied, has less discussion (especially online), hence the silliness.

#

yeah gpt 3.5 turbo instruct is not going to win against gpt4.5. The quality falls off a cliff once enough pieces are gone.

#

strange gpt 4.5 plays really bad against something like codex-mini while with gpt 3.5 it plays properly (at first at least).
you noted this in your youtube video as well.

My hypothesis here is like "bad prompt -> poor answer" that can happen with many models. If the game is silly, I guess the LLM cannot pick a solid pattern out of that and continues to be silly as well.

Funny because that means that "winning" against bad LLMs won't be easy.

#

I have the feeling using chess (or similar limited domains) to show how prompts affect LLMs performance could be interesting. Maybe someone is already researching that.

#

correction: gpt3.5 turbo instruct was not fine tuned only on chess.

wicked obsidian
#

I'm currently having it play o3 (should be up in half an hour or so depending how long o3 wastes my time), and its ahead, which is impressive regardless

wicked obsidian
wicked obsidian
#

Note that Elo at 1 game played is meaningless

wicked obsidian
# uneven talon strange gpt 4.5 plays really bad against something like codex-mini while with gp...

not sure if you noticed, but there are 2 completely different modes (reasoning and completion), and they have different methods/prompts. that's why you see a big difference between against 3.5 and against codex-mini
the codex mini game is with full information and reasoning, the 3.5 games are without any info and just movetext. i explained all this on the leaderboard and mark every game which mode it is (and filters)

uneven talon
# wicked obsidian Note that Elo at 1 game played is meaningless

yeah, if one reads Elo's own book on rating chess players, he already anticipated (in the 70s!) possible criticisms, and he also mentioned how many games one should play to get a representative value. Result: at least 30, all else being equal (so as soon as other players join, it no longer holds)

tidbit: of course Elo couldn't anticipate all possible questions or critiques to his system, but for that other people did analyses.

uneven talon
#

ah just read that you also adjust the K factor after enough games. Neat.
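
For reference, the standard Elo update: the expected score is E = 1 / (1 + 10^((R_opp - R) / 400)) and the new rating is R + K * (S - E), where S is 1 for a win, 0.5 for a draw and 0 for a loss. A small sketch follows. The K schedule in `kFactor` (40 for the first 30 games, then 20) is purely an illustrative assumption, not the leaderboard's actual setting.

```javascript
// Expected score of a player against an opponent under the Elo model.
function expectedScore(rating, opponentRating) {
  return 1 / (1 + Math.pow(10, (opponentRating - rating) / 400));
}

// Hypothetical schedule: larger K while a rating is still provisional.
function kFactor(gamesPlayed) {
  return gamesPlayed < 30 ? 40 : 20;
}

// score is 1 for a win, 0.5 for a draw, 0 for a loss.
function updateRating(rating, opponentRating, score, gamesPlayed) {
  const expected = expectedScore(rating, opponentRating);
  return rating + kFactor(gamesPlayed) * (score - expected);
}

console.log(updateRating(1500, 1500, 1, 0));  // 1520
console.log(updateRating(1500, 1500, 1, 50)); // 1510
```

With a single game and a large K, one result moves the rating by up to 40 points, which is why a rating after one game says very little.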

#

another insight that, I think, one could extract from the LLM chess experiment is a sort of mini test of the research from apple (the illusion of thinking) and/or a bench for context arena.

In short IIRC you give the LLMs the list of legal moves and they should pick between those to continue the game. If one could build a sort of multi turn conversation, or at least pass the entire conversation so far as input, that is, give the LLM something like this as input:

  • "on move one you have these possibilities <list of possibilities>, what do you pick?"
  • "I pick <insert move>"
  • "on move two you have these possibilities <list of possibilities>, what do you pick?"
  • "I pick <insert move>"
  • and so on

it would be interesting to see if, despite the list of legal moves and despite the recency of the newest list being the correct one, the models get confused by the entire conversation and pick illegal moves (given the current board state). One could bench that too. "sum of listed possible moves before first illegal move" or something like that.
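
The proposed metric could be computed roughly like this. It is a sketch under assumptions: each turn records the legal-move list shown to the model and the move it picked. `turns`, `legalMoves`, `picked` and the `user`/`assistant` role labels are hypothetical choices for illustration.

```javascript
// Build the multi-turn conversation from the turns played so far.
function buildTranscript(turns) {
  return turns.flatMap((turn, i) => [
    {
      role: "user",
      content: `On move ${i + 1} you have these possibilities: ${turn.legalMoves.join(", ")}. What do you pick?`,
    },
    { role: "assistant", content: `I pick ${turn.picked}` },
  ]);
}

// "Sum of listed possible moves before first illegal move":
// add up the list sizes of all turns that precede the first illegal pick.
function movesListedBeforeFirstIllegal(turns) {
  let total = 0;
  for (const turn of turns) {
    if (!turn.legalMoves.includes(turn.picked)) break;
    total += turn.legalMoves.length;
  }
  return total;
}

const turns = [
  { legalMoves: ["e4", "d4", "Nf3"], picked: "e4" },
  { legalMoves: ["Nf3", "Bc4"], picked: "Nf3" },
  { legalMoves: ["O-O", "d3"], picked: "Qh5" }, // not in the list: first illegal pick
];
console.log(buildTranscript(turns).length);        // 6 messages
console.log(movesListedBeforeFirstIllegal(turns)); // 5
```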

#

Such test would touch the point that "despite the info being right there in the prompt, they don't get used".
An extra test, which may be a bit of a stretch, is to pass the entire conversation so far as input to the LLM, but shuffled, something like

  • "on move three you have these possibilities <list of possibilities>, what do you pick?"
  • "I pick <insert move>"
  • "on move one you have these possibilities <list of possibilities>, what do you pick?"
  • "I pick <insert move>"
  • "on move two you have these possibilities <list of possibilities>, what do you pick?"
  • "I pick <insert move>"
  • and so on

(notice 3, 1, 2). And then ask the model to pick the right move given the current move (say, move 30) and the model has to recognize that the list of legal moves is somewhere in the prompt, not just the last one. Then again check how long it takes for the first illegal move that shows that the model is confused. That is at least an easily scalable context benchmark.
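
A sketch of the shuffled variant, under the same assumptions as before (hypothetical `legalMoves` field per turn). Turns are tagged with their original move number and shuffled with Fisher-Yates, and a pick is then graded against the list that belongs to the requested move number, wherever it ended up in the prompt.

```javascript
// Tag each turn with its move number, then shuffle with Fisher-Yates.
// `random` can be swapped for a seeded generator to make runs reproducible.
function shuffleTurns(turns, random = Math.random) {
  const numbered = turns.map((turn, i) => ({ moveNumber: i + 1, ...turn }));
  for (let i = numbered.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [numbered[i], numbered[j]] = [numbered[j], numbered[i]];
  }
  return numbered;
}

// Grade a pick against the list for the requested move number, not the last list shown.
function isPickLegal(shuffledTurns, moveNumber, picked) {
  const turn = shuffledTurns.find((t) => t.moveNumber === moveNumber);
  return turn !== undefined && turn.legalMoves.includes(picked);
}

const shuffled = shuffleTurns([
  { legalMoves: ["e4", "d4"] },
  { legalMoves: ["Nf3", "Bc4"] },
  { legalMoves: ["O-O", "d3"] },
]);
console.log(isPickLegal(shuffled, 3, "O-O")); // true
console.log(isPickLegal(shuffled, 3, "e4"));  // false
```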

PS: Yes, some claim the apple paper is debunked. I think it is only partially debunked, because the rebuttal that some tests could exhaust the available tokens is simply untrue: the graph shows that they didn't use many tokens and everything was collapsing anyway.

wicked obsidian
uneven talon
#

oh sure! I was just throwing ideas and yes if I had the time/priority I'd like to test those myself. In general I really appreciate work like yours.

#

btw I think you could also try to accept donations, I would chip in a few bucks

tulip wraith
broken pasture
languid forge
#

just prompted Image Arena to create "Fortnite Skins" using images of kpop idols as reference, and i wanna share the results here:

#

created by a model called "kormex"

#

created by flux-kontext-max

#

clearly kormex looks better to me. these were the references if you're curious

ember fossil
broken pasture
#

Devil's in the details.

ember fossil
#

oh right

#

here they are in the same message so you can arrow back and forth and notice them

languid forge
#

thank you :D

vivid rivet
#

hey y’all! I wrote my first Substack piece centering around AI interactions I’ve had in LMArena, I’d love if you gave it a read and let me know your thoughts!

severe garden
#

AI doesn't trust its own foundation

low pagoda
# low pagoda Updated with more Deep Research Tools, including Perplexity Labs

Updated the doc, again, if you have any suggestions for addition, please DM or ping me: https://docs.google.com/document/d/1qSfyAyxzUziFQf55CD60-UgQ4Af9ubVmr69OrmAdevE/edit?usp=sharing

sturdy coyote
sturdy coyote
# low pagoda Added.

Also btw, if you are really interested in deep research and find yourself having some spare time you could try their "Co-STORM" feature.

#

Where you are essentially part of the AI agent loop 💀

#

Very time consuming (can confirm)

low pagoda
sturdy coyote
low pagoda
sturdy coyote
#

Good night though

ember fossil
low pagoda
#

Huh? It should be open.

low pagoda
#

Does it work?

ember fossil
languid forge
inner cave
languid forge
halcyon epoch
#

LMArena should include an embedded code playground. And a leaderboard filter for scores from voters who have actually executed the code.

halcyon epoch
hoary bone
deep current
#

So... I realize this is a bit late now, and I'm not even sure if this is the right channel, but I just came across this and thought I should share it. Ctrl

lusty warren
# deep current https://www.reddit.com/r/Bard/comments/1ke1wbb/gemini_using_mcp_natively/

👋 Exciting Announcement: Introducing MCP SuperAssistant!
I’m thrilled to announce the official launch of MCP SuperAssistant, a game-changing browser extension that seamlessly integrates MCP support across multiple AI platforms.
What MCP SuperAssistant offers:
Direct MCP integration with ChatGPT, Perplexity, Grok, Gem...

gritty parrot
#

the walking dead panel comic recreated with flux dev

amber laurel
#

#create an image of a cartoonist dog running down the road in a village

frosty tiger
#

Ultra-realistic close-up of a majestic peacock, vibrant iridescent blue and green feathers, sharp detailed eye, elegant crown feathers standing tall, glowing gold and emerald feather patterns in the background, bokeh effect, professional wildlife photography lighting, 8K ultra-HD, premium nature aesthetic, deep texture, royal and regal mood" --v 6 --ar 9:16 --style raw --q 2 --hd

crude harness
crude harness
uneven talon
#

@wicked obsidian surprised to see a second model passing gpt 3.5 turbo instruct in your LLM chess bench!

#

and again thank you for the effort there!

wicked obsidian
uneven talon
#

o_O 30 cents per move is massive

#

btw gpt 3.5 instruct had a nice game against it

wicked obsidian
wicked obsidian
silent cairn
silent cairn
vernal kelp
#

k

raven island
#

Ummm what?!

surreal basin
raven island
uneven talon
wicked obsidian
wicked obsidian
#

grok-4 doing well, guess they managed to properly parse its responses. so its reasoning chess after all, and then the performance is expected (it has a few hundred elo on the runner ups in my ranks)

uneven talon
#

15-30 min is insane, there should be high reliability all the time.

wicked obsidian
#

though I am not the best chess player, around 1400 on lichess

#

the only other good model that was leagues ahead was GPT-4.5 though not through reasoning but chess recollection (it solves ~1800 Elo chess puzzles and also beat me every time I tried)

molten cipher
#

How to make a video in veo 3 here?

glacial void
flat cliff
#

how to get veo 3 generated videos

flat cliff
glacial void
flat cliff
#

No

#

In this server it can generate videos with different models like Veo 3, Seedance, Pika, and Hailuo, so how do I tell it that I want a video from Veo 3?

glacial void
flat cliff
#

Okay thanks 👍

uneven talon
#

@wicked obsidian in the chess kaggle tournament, o3, o4 mini, gemini 2.5 pro and grok are in the semis. What is funny though is, thanks to your work, a gpt3.5 instruct dropped there would raise eyebrows.

wicked obsidian
#

not sure google would want to sponsor an event where their state of the art model gets humiliated by an ancient obsolete model though

uneven talon
#

yeah the last point is a good one

#

the archnemesis of gpt 3.5 is 4.5 right?

finite meadow
#

The blue bird got some leaf on his head but I still love the result

wicked obsidian
woeful mango
#

Hello, how do I make videos in 9:16 size?

uneven talon
mighty badge
#

I hear videos with audio, music or speech. How do you do it?

glacial void
mighty badge
#

Ok, thanks 👍🏻

dull grove
broken pasture
fallen venture
#

How can I create video with lmarena ?

glacial void
wicked obsidian
near nova
weak steppe
weak steppe
dark fog
#

This is my new music video created with these sort of tool😁
https://youtu.be/F7oJCKwColc?si=dF3R9utdhiO-pG3g

Scars and Steel – Official Country Rap / Southern Hip Hop Music Video | Created with VEO3 AI

Scars and Steel is a powerful fusion of country rap, southern hip hop, acoustic storytelling, and country rock. This cinematic music video, crafted with the cutting-edge VEO3 AI video creation technology, takes you on an emotional journey through grit...

#

@weak steppe nice! I use Suno as well, and with video generation I put a video clip there

silent cairn
#

check this out

weak steppe
#

previous was imagen 4 ultra, this is Nano Banana

#

makes me think the models are based off one another

#

the prompt was "3D fighting game mockup screenshot showing a despicable me Minion fighting an among us crewmate", so they both decided to add "imposter brawl"

surreal basin
weak steppe
#

stealth model, no announcement

supple slate
#

nano banana made this when i asked for n64 platformer gameplay and i think it covers a bunch of stuff that was common in n64 games

surreal basin
#

will need to try it out

weak steppe
#

I'd put it somewhere around imagen 4 / gpt-image-1 quality

#

but still gpt-image-1 is the only model in the top listing that can do cyclopes

#

dik dik playing the xylophone

weak steppe
weak steppe
#

she has rabies

ashen sable
glacial void
ashen sable
#

Oh, gemini flash

#

Lol

#

IM TEAM GOOGLE battle3d

ashen sable
weak steppe
ashen sable
weak steppe
ashen sable
#

Steppe bear lancers

weak steppe
#

(five minutes later they start kissing)

weak steppe
#

"oops"

weak steppe
lime estuary
#

I still need to make some logic for chat loading, since reloading the page shows the message with the file in it, so I need to work on it some more still

pseudo violet
weak steppe
#

I just told it "Fisher Price "My First Trolley Problem" playset"
I love that it added "my dad"

weak steppe
#

weed smoking catgirl gfs

wet grotto
#

Embark on an epic journey through the wonders of the Grand Line like you've never seen them before! This One Piece concept trailer is entirely brought to life by artificial intelligence, reimagining the iconic and breathtaking landscapes of the world's most dangerous sea. From the colossal cacti of Whiskey Peak to the frozen drums of a winter is...

elfin zenith
#

hey there. is there anyway to change aspect ratio of generated images in lmarena to 16:9

glacial void
lime estuary
#

the pain of copy-pasting files is now no more

pseudo violet
glacial void
#

Have you released it somewhere?

weak steppe
#

that extension looks useful, does it work for Image Edit and Battle?

#

I haven't really tried out image edit prompts on lmarena's battle mode, since then I have to copy paste the prompt AND reupload.

left oasis
#

@ember fossil @glacial void

glacial void
ember fossil
left oasis
#

This is the simple regex that I have set up in the automod in my servers in case it helps
attachments[\s\S]*attachments[\s\S]*attachments
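
A quick way to sanity-check the logic of that pattern outside Discord, as a rough Python sketch. The helper name `looks_like_attachment_spam` and the sample messages (with placeholder `cdn.example.com` URLs) are made up for illustration, and Discord's automod runs its own regex engine, so this only checks what the pattern matches, not how automod behaves:

```python
import re

# Pattern from the message above: the word "attachments" three times,
# with anything (including newlines) in between.
PATTERN = re.compile(r"attachments[\s\S]*attachments[\s\S]*attachments")


def looks_like_attachment_spam(message: str) -> bool:
    """Hypothetical helper: True if the message would trip the pattern."""
    return PATTERN.search(message) is not None


# Made-up sample messages; the URLs are placeholders, not real links.
spam = "\n".join(
    f"https://cdn.example.com/attachments/1/{i}/image.png" for i in range(3)
)
normal = "see https://cdn.example.com/attachments/1/2/image.png"

print(looks_like_attachment_spam(spam))
print(looks_like_attachment_spam(normal))
```

Because the pattern only counts occurrences of the word, a message with one or two attachment links passes, while three or more links (or the plain word typed three times) gets flagged.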

glacial void
ember fossil
glacial void
#

usually links to a discord attachment in another server
oooo interesting

#

Thank you for sharing @left oasis blobthanks

left oasis
#

no problem

#

something more complex can be set up if necessary... but I couldn't be bothered for my use case

lime estuary
lime estuary
#

Ill look into fixing the remaining bugs and then releasing it

past moss
#

image i took of my friend's wedding

ashen sable
surreal basin
#

Model: nano banana.

ashen sable
surreal basin
#

Qwen image.

ashen sable
#

Who does chemistry without vented hoods?

surreal basin
#

Model: anonymous-bot-0514

lime estuary
surreal basin
#

Model: Qwen image.

#

Trying out some surrealistic art.

Model: seedream-3.

#

Ideogram-v3 quality.

Bit more absurd creation. These edge-case stuff are important to create to see the capabilities.

#

Maze world.

Nano banana.

sick shoal
glacial lintel
#

dope nano banana

lime estuary
lime estuary
lime estuary
lime estuary
#

The Settings button will be used to customize the system prompt

lime estuary
#

Coming Soon™

lime estuary
marble bluff
#

thats great

lime estuary
#

now im gonna use the ChatGPT system prompt on gemini and see how it starts to break

#

Identity hell

surreal basin
#

lol

jaunty gull
#

Many models shown on the site aren’t the real ones or are just named versions that don’t exist (like GPT-5, Grok 4, etc.).
They are actually working on GPT-4 & Grok 2.

Kindly fix it ASAP

lime estuary
#

Releasing the extension on the chrome webstore isn't gonna be possible unfortunately

#

Since it does cost a fee to release

#

I'll look into other ways and platforms

surreal basin
#

Mozilla Add-ons?

surreal basin
surreal basin
#

Something random I got with nano-banana.

#

nano banana.

#

Aliens. Nano banana

lime estuary
#

Ill prob just release the src code

surreal basin
#

yeahh

glacial void
lime estuary
lime estuary
#

Well, now something interesting was fixed

#

files can continue living even after the message that held them

#

now I only need to implement some visuals to show what messages held what files (maybe including a download button to get the file back)

lime estuary
#

I hope this is fitting

past moss
#

based on a real event that happened to my friend

candid current
glacial void
spice geode
#

Does anyone know how to put a website inside a modal? For example, websites block x frame, does anyone know an alternative way? I've already tried Opus but it doesn't even solve the problem.

crimson quail
#

Emergency 🚨

#

Can somebody tell me how to access the nano banana model? Either in Discord or anywhere else on this planet? I have to get my hands on it ASAP! Whoever helps me... 'I owe you one'

surreal basin
#

It can either be there on every turn or with much longer queues

astral summit
#

if a model like GPT-5 had real memory instead of just context windows, would that feel a bit like AGI? Or nah

uneven talon
# astral summit if a model like GPT-5 had real memory instead of just context windows, would tha...

the point is that the memory would be strong if the model would be able to recall the proper information at the proper time. For some benchmarks I saw (chess, pokemon and what not, where keeping coherency in the context window matters) they aren't that able to do that yet without helper tools.

Not that using tools is bad, but it means that an LLM with more memory won't be able to use it efficiently.

ashen sable
sage solar
#

Hey please, I dont know how can I make videos here..

#

can you help me?

surreal basin
old mica
glacial void
vague lantern
surreal basin
#

I want this view in my life

#

made with flux

glacial void
narrow raptor
#

Remastered vs original

boreal pelican
#

15*

boreal pelican
boreal pelican
wheat pivot
#

bro, flux is tweaking with da legs lmao

green tinsel
median hemlock
#

Small 500 kb game made with unity 🙂

#

idk how to send source its w unity

#

ill extract it and js send the installer hold on

#

that shouldnt say dangerous

#

bro what

#

js go to downloads and do download dangerous file

#

its not a virus 🤦‍♂️

#

allg idk how to get source but alr

steel hemlock
#

<@&1349916362595635286>

median hemlock
#

bro its 400 kilobytes its not a virus 😭

#

its from unity 💔

steel hemlock
#

Tell you a secret

#

Game made in unity cant have size lower than 10mb

#

Even if packed

#

This is sooo small

#

Maybe he wants someone to "fall into the trap"

#

I think it's maybe just a miner

#

Miner is worse than Trojan

#

Edge

#

Miner literally drains your PC

#

Including video graphics card

#

Ram, processor and etc

surreal basin
#

Zen

#

Yes and it has the same engine as firefox but is more sleek etc.

#

has a github too

crisp agate
ember fossil
#

do you still have the file? if so could you dm me it, i just sometimes like to look into random (possibly malicious) executables

#

actually nvm no question this is malicious

#

thing tries to download a rat

boreal pelican
ember fossil
#

true

sturdy coyote
#

we should get ready for his comeback, lol

surreal basin
#

Cyberpunk city. imagen 4.0

#

a bit complex but it looks okay ig

crisp agate
glacial void
crisp agate
ashen sable
#

go home, you're drunk

wraith goblet
#

hello

languid sun
#

by Qwen image

surreal basin
languid sun
#

yeah it messed up the text but still a good result

pure nebula
#

GPT image vs Flux 1

glacial void
#

Video Arena Bot is Currently Not Working

The Video Arena Bot is currently not working. The only channels the bot works in are #video-arena-1 #video-arena-2 #video-arena-3; however, the bot is currently offline. When it's working again, go to the channels listed above to prompt.

The reason you don't have permission to type in those channels is because the bot is down.

languid sun
pure nebula
#

Nanobanana is best with natural language. My master prompts are custom made for GPT alone.

languid sun
languid sun
static walrus
#

Prompt for a Video Animation for GPS Company (Solar Energy Theme):

Create a professional animated video scene showcasing shiny solar panels installed on a vast, green agricultural field. Begin with a sweeping panoramic shot that captures the rows of crops under bright sunlight, conveying a sense of abundant energy. The camera smoothly zooms in to reveal a modern solar-powered water pump in operation, with clear water flowing to irrigate the field. Seamlessly integrate the GPS company logo—either rising in the sky or elegantly displayed on one of the solar panels—using refined graphic effects for a modern and sustainable look.

Production Details:

  • Use realistic lighting and shadow effects to emphasize the connection between technology and nature.
  • Employ smooth camera transitions: start wide, then focus on the key elements (solar panels, crops, water pump, logo).
  • Add subtle animated overlays illustrating energy flow from the sun to the panels, then to the pump and crops.
  • The video should be bright and vibrant, featuring color tones like green, blue, and gold to reflect growth and innovation.
  • Include gentle background music that suggests progress and optimism.
  • Prefer a blend of realistic 3D elements and minimalistic graphic overlays for clarity and elegance.
  • End the video with the GPS company logo highlighted, possibly with a tagline about sustainability or smart agricultural solutions.

This prompt is suitable for briefing animators and motion designers to produce a visually compelling and professional animation for GPS Company’s solar energy solutions.

wet totem
#

I'm incredibly muscular

left leaf
#

hello

glacial void
surreal basin
#

Replace these characters with Kaname Madoka and Akemi Homura from the anime Mahou Shoujo Madoka Magica

Qwen / original / GPT Image

#

I was surprised that Qwen understood which characters to replace

#

I thought it'd just mess up and replace the four eyed with Madoka and the bangs girl with Homura, not bad for an open model

#

It also understands color codes pretty well even if it fails to draw the exact characters

#

replace them with HomuMado

Flux-1-Kontext-Dev vs Nano banana. Nano banana basically redrew them in A-1 style. Flux does not even know these characters.

surreal basin
fringe gorge
fringe gorge
surreal basin
next anchor
#

Shoot a high-angle aerial drone shot (150ft altitude) of a humpback whale breaching against a vibrant sunset sky. Frame the whale centrally, capturing its full body mid-leap with water droplets frozen in motion. Use golden-hour lighting to highlight the whale’s wet skin and the pink/orange hues of the sky. Ensure 4K resolution (3840x2160) with 60fps for smooth motion. Focus on the drama of the breach—emphasize the whale’s power and the vastness of the ocean.

fringe gorge
fringe gorge
pure nebula
halcyon epoch
#

TikZ (code only comparison). Prompt:

Can you give your best attempt at generating a gorgeous realistic beautiful dragon breathing blue flames using Tikz? Please, really try to put effort into it!
fringe gorge
fringe gorge
#

Might be one of my best creations till now..

languid sun
languid sun
languid sun
#

what do you think?
it is ideogram btw

wispy matrix
#

Hey,

Built something cool - https://www.courses.reviews/

It's basically a search engine for coding courses but uses OpenAI's
API with my own supabase vector database.

You can just type whatever's on your mind like "I need to learn
Kubernetes for this new job" and it actually understands what you
need.

Still rough around the edges but would love to know what you think

courses.reviews

Keep learning or risk losing your job to AI. Discover and choose the best coding courses with trusted reviews and great deals.

maiden badger
lofty kite
#

What the hell is this creepy thing? 💀

sage wind
ancient matrix
#

Mama Cat Bakes Bread on Farm and Shares With Hungry Puppy 🐾🐱🍞 A Heartwarming Tale of Friendship 🌾❤️🐶

https://youtu.be/db5ZL7edWns

On a quiet little farm, Mama Cat kneads fresh bread 🍞🐱. When a hungry puppy arrives, she shares it with kindness 🐶💛.

This touching story celebrates friendship, generosity, and the sim...

#

took a lot of time and effort making these. please like, subscribe

#

thank you fam!

sick shoal
sage wind
#

Hell nah

#

Who is this AI robot nga 💔

#

Free veo-3 with sound?

languid sun
strange sage
fringe gorge
#

Building something cool and free 🙂

fringe gorge
languid sun
languid sun
#

Prompt?

pale geyser
fringe gazelle
#

i miss flux1max

dim cairn
chilly grove
fringe gazelle
ancient matrix
#

Mama Cat Saves Kitten from Falling Coconut Tree 🌴🐱🐾 A Brave Rescue Story 💔🌟❤️
https://youtu.be/3-kXkmhlJpI

When a curious kitten plays near a tall coconut tree 🌴😿, danger strikes as coconuts begin to fall. Mama Cat leaps in with courage and saves her little one 🐱💪🐾.

This brave rescue story reminds us of a moth...

dreamy turret
#

Sprunki irl fluxt

#

Nano banana

fringe gazelle
#

nano bana is good for editing but for raw image flux1 max is amazing

green oriole
near nova
uneven spoke
#

welcome to this channel

surreal basin
pseudo violet
#

it's all cgi bro

wild bronze
#

computer generated images lol

tawdry sluice
#

Please help me how can I generate a video? Step by step

surreal basin
#

AI tends to try other styles unless told to use a specific one

languid sun
pseudo violet
# surreal basin Yes but not the style

technically all images are human generated since humans were involved somewhere in the process of generating the image, and in the same way they're all computer generated images because pixels on a computer
unless you use a polaroid camera or an actual paintbrush and canvas

pallid bone
lofty kite
# pallid bone

Bro what prompt do you use to create this big armour?

#

Can I have only an example pls?

pallid bone
#

This works for any other armor type item

pallid bone
#

Direct chat select gpt-image-1

lofty kite
#

But to create one of these armours, what did you do before combining them all?

pallid bone
#

Nothing that was the first image it generated

#

I looked for armors I liked on google

#

Then asked it to combine all of them

lofty kite
#

Oh ok

pallid bone
#

Works in vice versa if you want to just combine 2 armors

#

Or more

lofty kite
#

I would like to see this armour bigger but idk what to put in the prompt

pallid bone
#

Or make an armor

#

Then add another

lofty kite
#

I tried everything but nothing worked

pallid bone
lofty kite
#

Yeah but is not the same

#

Can you pls try with nano banana pls

pallid bone
#

What are you trying to do with it

#

If not bulkier

lofty kite
#

For make it bigger

#

Can you pls try with nano banana?

pallid bone
#

Ah I see

#

Nano banana doesn’t do what gpt does

#

In armor creation

lofty kite
#

Yeah

pallid bone
#

That’s why

#

Gpt does creation

#

Nano does editing

lofty kite
#

Yeah is real

pallid bone
#

So I can’t make it bigger

lofty kite
#

Top

#

I will try

#

Thank you anyway

hollow cloud
#

Enhance the photo for a professional 8K ultra-realistic look, with perfectly balanced lighting, rich and natural colors, precise contrast, crystal-clear sharpness, and a cinematic depth of field. Give the image a high-end magazine style, as if captured by an expert photographer with a premium lens. Highlight the details, smooth imperfections without erasing natural texture, and create a glamorous, elegant, and refined finish

fringe gazelle
# languid sun

Flux is still good enough, and flux1max could do a lot better if it came back to the list

slow wharf
#

Nano banana editing draws

languid sun
slow wharf
#

I don't remove the text

languid sun
languid sun
slow wharf
#

Yeah

languid sun
slow wharf
#

No nano banana Google ia studio

languid sun
languid sun
# slow wharf

The consistency of the model is what makes it the best

slow wharf
languid sun
slow wharf
languid sun
slow wharf
languid sun
slow wharf
languid sun
slow wharf
languid sun
slow wharf
languid sun
slow wharf
#

what do you think

#

Flux improves the quality

#

edited album cover @languid sun

languid sun
languid sun
slow wharf
#

Other

languid sun
languid sun
#

I hope we see it soon in the AI's gallery

slow wharf
languid sun
slow wharf
#

he just put the png

#

what do you think @languid sun

proper sequoia
languid sun
slow wharf
#

@proper sequoia

proper sequoia
#

What do you say for cover

slow wharf
#

This

slow wharf
#

Liminal spaces nano banana

lofty kite
#

Nano banana 🔥🔥

lime socket
#

hi guys

lavish bay
hasty mortar
pulsar axle
#

Chatgpt vs LMArena

analog timber
fringe gazelle
#

for realistic picture generation flux1 is far better than any other model, i hope they bring back Flux1max

plush pulsar
gloomy crestBOT
#
:warning: Channel locked

Site outage, will turn back on when resolved.

glacial void
#

Site Outage - Hey everyone, there looks to be an outage with the site, our team is aware and working on a fix ASAP. We've turned off messaging in this server until the site is restored. Our apologies for the inconvenience!

gloomy crestBOT
#
:success: Channel unlocked

Welcome back :ablobwave:

half harbor
#

how to create image to video

glacial void
uneven talon
#

@wicked obsidian o3 got pretty high in the LLM chess bench

#

I was thinking it could have been possible as its accuracy was high

wicked obsidian
tranquil hemlock
#

why gpt 5 high is #1

uneven talon
#

btw also checking some games. gpt3.5 vs gpt5 (gpt5 win with 59 moves) is also insanely good (insanely: compared to what I saw in the past). I would have expected that black would pick the knight on the 13th black move but gpt5 saw the royal fork coming.

#

gpt4.5-preview, surely expensive, remains insane. That thing almost never loses.

#

and I also think, if I see the rating changes correctly, that gemini 2.5 pro will slowly go up if it is matched with models that are not S tier.

pallid bone
quartz rose
#

Banana Prompt:Use this [insert_image] to create an ultra-realistic shadow art lamp, multi-layer papercut design placed inside a round lamp frame, intricate cut-out silhouettes. Warm light shining through layers of paper, casting dreamy shadows of the image onto the wall, delicate craftsmanship, premium artisan aesthetic, photographed in a cozy dark room to highlight the glowing effect.

wicked obsidian
wicked obsidian
#

on continuation that is, obviously it wasn't the strongest reasoner (not being a reasoning model)

wicked obsidian
#

4o has crept up which is also quite interesting. head to head it loses more than it wins vs 3.5, however, in the elo system it seems to simply score higher overall.
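
For anyone wondering how a model can lose its head-to-head and still rate higher: a rough Python sketch using the standard Elo update. The bench's actual rating method isn't stated here, so the K-factor of 32, the 1500 starting rating, the "field" opponents and every game result below are made-up assumptions, not data from the bench:

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))


def play(ratings: dict, a: str, b: str, score_a: float, k: float = 32) -> None:
    """Update both ratings in place; score_a is 1 (win), 0.5 (draw) or 0 (loss)."""
    delta = k * (score_a - expected_score(ratings[a], ratings[b]))
    ratings[a] += delta
    ratings[b] -= delta


def play_field(ratings: dict, name: str, results: list) -> None:
    """Play one game against each of several fresh 1500-rated opponents."""
    for i, score in enumerate(results):
        opponent = f"field-{name}-{i}"  # hypothetical opponent
        ratings[opponent] = 1500.0
        play(ratings, name, opponent, score)


ratings = {"gpt-3.5": 1500.0, "gpt-4o": 1500.0}

# Made-up head-to-head: 4o loses two games out of three against 3.5.
for score_4o in (0, 0, 1):
    play(ratings, "gpt-4o", "gpt-3.5", score_4o)

# Made-up results against the rest of the field.
play_field(ratings, "gpt-4o", [1] * 9 + [0])                    # 9 wins, 1 loss
play_field(ratings, "gpt-3.5", [1, 0, 1, 0, 1, 1, 0, 1, 0, 1])  # 6 wins, 4 losses

print(round(ratings["gpt-4o"]), round(ratings["gpt-3.5"]))
```

With these invented results 4o ends up rated above 3.5 despite the losing head-to-head, because Elo sums performance over every opponent rather than over one pairing.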

fringe gorge
#

Veo 3 really on another level

slim axle
#

Create an eye catchy youtube thumbnail for a topic : How to Audit Website for SEO | Analyzing Your Website to Improve SEO or ROI- Hindi [Part 2]

tawdry totem
warm totem
wicked obsidian
#

oh, they changed channel name from community contributions to ai-art? wut..... kinda lame

glacial void
wicked obsidian
#

art is hyper niche and excluding

glacial void
#

I think "art" & "creations" are pretty similar

wicked obsidian
#

so, me posting AI chess games, belongs here?

glacial void
#

What would you recommend? Back to community-creations?

wicked obsidian
#

i just feel like "ai-art" is extremely limiting channel name, if its supposed to be all types of community creations

glacial void
#

Kk going to change to ai-creations 👍

wary lark
quartz rose
quartz rose
quartz rose
# wary lark how did it generate in this aspect ratio? it generates for me in 1:1 every singl...

I found a way to dance around this issue BUT it won't work/follow every single time tho, don't say I hadn't warned you.
So before that you need to create a transparent PNG image with the aspect ratio you desire and with nothing in it, just a blank image. Then upload it and still use a similar prompt, like below:

"Use this image to create an ultra-realistic shadow art lamp, multi-layer papercut design placed inside a round lamp frame, intricate cut-out silhouettes. Warm light shining through layers of paper, casting dreamy shadows of a "young anime girl Witch" onto the wall, delicate craftsmanship, premium artisan aesthetic, photographed in a cozy dark room to highlight the glowing effect."
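
If you don't have an image editor handy, the blank transparent PNG can be generated with a few lines of Python. This is only a sketch using the standard library; the function name `transparent_png`, the 1080x1920 size (9:16) and the output filename are my own choices, and whether a model follows the aspect ratio is still hit-or-miss as noted above:

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(tag: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def transparent_png(width: int, height: int) -> bytes:
    """Return the bytes of a fully transparent RGBA PNG of the given size."""
    # IHDR: width, height, bit depth 8, color type 6 (RGBA),
    # compression 0, filter 0, no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 6, 0, 0, 0)
    # Each scanline: filter byte 0, then 4 zero bytes (R, G, B, A) per pixel.
    row = b"\x00" + b"\x00" * (width * 4)
    idat = zlib.compress(row * height)
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", idat)
        + _chunk(b"IEND", b"")
    )


if __name__ == "__main__":
    # 1080x1920 is a 9:16 portrait canvas; swap the numbers for 16:9.
    with open("blank_9x16.png", "wb") as f:
        f.write(transparent_png(1080, 1920))
```

Upload the resulting file as the input image alongside the prompt.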

#

here's the result from the prompt: "young anime girl Witch"

quartz rose
wary lark
deep scaffold
#

Made an edit of Jeanette to look like Jasmine from Aladdin ⭐

quartz rose
#

Jeanette from Aladdin? OK!

wispy beacon
#

#ai-creations one man, dark brown color, wearing casual glasses and with a beard, with some other men in Skardu in the northern areas of Pakistan, hiking a mountain, HD realistic 16:9-r

#

one man, dark brown color, wearing casual glasses and with a beard, with some other men in Skardu in the northern areas of Pakistan, hiking a mountain, HD realistic --ar 9:16

pallid bone
uneven talon
unique condor
#

@sturdy prawn 1960s mercedes benz car driving in a modern cyberpunk city

pseudo violet
#

I've been traumatized by kontext

#

I truly cannot imagine why it failed so miserably in this one

quartz rose
glacial void
fleet pollen
#

For everybody reading this page:
Read the following message if you are wanting to generate IMAGES:

How to use gemini-2.5-flash-image-preview (nano-banana)
gemini-2.5-flash-image-preview (previously known as nano-banana) is accessible in Image Generation for Battle, Side by Side, and Direct https://lmarena.ai/?chat-modality=image
Read the following message if you are wanting to generate VIDEOS:
Read #1397655624103493813 and use the /video or /image-to-video command in #video-arena-1, #video-arena-2, or #video-arena-3 with your prompt and image (if you are using one)
@toxic crest @honest lichen

quartz rose
#

YOOO its Walter White & Jesse Pinkman

glacial void
fickle kite
#

Hi everyone, is this the correct place to use prompts to create images?

prisma umbra
#

Nano banana ➡️ veo workflow is really fun!

https://youtu.be/O7JhnGSH3Kc?si=Z9dFdCRUDD1obQLU

Get ready to hit print on the hottest new track! 🖨️💥 Presenting the official music video for "Paper Jam" by the one and only LiL Printy. This track is a heavy-hitting anthem straight out of the early 2000s rap playbook, brought to life with a fresh, futuristic twist.

🎵 LYRICS 🎵
Feed the pages, hear the slam,
office band—PRINTER ...

acoustic sundial
#

Half of all my creations were before the rate limit 😡😤

#

These are 10 of my 50 designs and creations that were proven good and SOLD before the rate limit.

weak steppe
dawn marsh
#

Done up with SeeDream V4. I hope it'll be available on LMARENA soon too, it really outperformed Nanobanana in many tasks, especially when it comes to style maintenance with facial retention

lofty kite
#

Nooooo 🔥🔥🤣🤣🤣🤣🤣

midnight mirage
keen gust
#

Guys which channel to ask questions?

glacial void
keen gust
#

Nobody answering there 😑

hoary yoke
#

All it requires is a Google AI Studio API key to work.

#

It uses Gemini 2.5 Flash Lite to change the writing tone of your text.

strong quartz
#

It seems that nanobanana lost to seedream 4.0, especially in text to image.

spiral cypress
#

I prefer nano banana for pixel art. Perhaps nano has some magical highres fix that I don't know about, but the half-pixels in seedream are not looking so great. It is using fewer pixels though.
"8-bit pixel art rendition of a Shiba Inu sitting, with a white backdrop."

cinder patio
#

A cat

quartz rose
#

FBI, please, I'm in a good manner

hoary yoke
quartz rose
#

Where?

hoary yoke
#

When you start it up, you'll see a little pill-shaped drop-down thing that says “Proofread”. If you click that, then a list of different writing tones will open up on the right side. One of them says “Custom”.

quartz rose
#

lol found it , the ui hide inside here

hoary yoke
#

Yeah, heh heh.

fleet pollen
#

lmarena is holding seedream 4.0 back, here are two examples of what the model can do when not restricted to 1280x1280 low quality square images

sinful rivet
#

An artist in a sun-drenched studio, sketching a portrait of a sleeping dog. The camera is at a low angle, focusing on their hand as it moves across the paper, then tilts up to reveal the dog. The style is a soft, dreamy oil painting.

dim cairn
wicked obsidian
glacial void
strange sage
weak steppe
#

I uh just made* an AI powered writing tool that works like AI dungeon but is more suited for story writing than text based adventures, saves/loads to google drive, and lets you input an OpenRouter API key so its waaaaaaaaaaay cheaper.
* by made I mean I basically got github copilot to vibe code the entire thing for me. I am Not an experienced web dev. Still surprisingly hard at times and had to do research into solving problems like refresh tokens and pagination.

If anyone wants to try it, its here: https://varkarrus.github.io/Ghostwriter
feel free to copy/fork it and make your own version btw

severe gull
#

how to generate ???

glacial void
uneven talon
uneven talon
#

Thank you (I mostly follow those in lmarena)

lavish pine
fleet pollen
potent elbow
#

and kimi k2 1t moe model

warped salmon
#

Is anyone else having problems with seedream 4?

glacial void
warped salmon
#

idk,I used a VPN and it seems to have worked, I'm in Brazil and it seems the generation wasn't working

glacial void
warped salmon
#

sometimes both

hidden nimbus
quartz rose
#

I've taken care of that damn clown, so now come and play🥸

lofty kite
#

Can I also have the prompt pls?

hidden nimbus
hidden nimbus
hidden nimbus
#

I can't even send it in that topic, it says I can't send messages, so I'll send it to you in DM

lofty kite
#

Ok

uncut lion
#

this is ai???

shrewd birch
solar latch
#

@strong quartz ?

spiral cypress
# lofty kite

Aesthetically, it's really nice. Very cute 🥰
There are still some half pixels if you look closer. Nothing you can't fix by lowering the resolution though.
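
One way to read "lowering the resolution" is to collapse each block of output pixels back into a single logical pixel. A minimal Python sketch of that idea, assuming the image is already loaded as a 2D list of color values and the art sits on a clean integer grid (the `downscale` function and the tiny sample grid are made up for illustration):

```python
from collections import Counter


def downscale(pixels, factor):
    """Shrink a 2D grid by `factor`, keeping the most common color per block."""
    out_h = len(pixels) // factor
    out_w = len(pixels[0]) // factor
    result = []
    for by in range(out_h):
        row = []
        for bx in range(out_w):
            # Collect every color inside this factor x factor block.
            block = [
                pixels[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            # A stray "half pixel" is outvoted by the block's main color.
            row.append(Counter(block).most_common(1)[0][0])
        result.append(row)
    return result


# Made-up 4x4 grid: each letter is a color, "X" marks stray half pixels.
grid = [
    ["A", "A", "B", "B"],
    ["A", "X", "B", "B"],
    ["C", "C", "D", "D"],
    ["C", "C", "D", "X"],
]
print(downscale(grid, 2))
```

Taking the majority color per block drops the stray pixels; it only works when the generated grid lines up with the block size, which is not guaranteed for model output.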

lofty kite
spiral cypress
lofty kite
#

Lol ahahah

rain cipher
#

Hi

rain cipher
#

😭😭😭😭

weak steppe
lyric apex
#

A fog

slim jasper
#

/imagen

strong quartz
kind rivet
#

howd u make them

gleaming flume
#

All made in LMArena..

still walrus
still walrus
old mesa
#

How do you use the seedream

#

I need an AI that can turn pictures into realistic videos

civic tide
#

no

dim cairn
gleaming flume
#

Project Banani Downfall

lofty kite
gleaming flume
#

Rest in pancakes..

#

How's your Soda Dream experience today?

hoary yoke
#

🫡

barren bluff
chrome tide
hybrid snow
sonic otter
#

ok, I did thanks

old mesa
#

Can anyone help me replace the whirlpool washer to where the Samsung washer is ?

lofty kite
gleaming flume
#

Call technician

gleaming flume
lofty kite
gleaming flume
#

Drink a can of soda? 🤣

shy vessel
#

Hii

#

Use a cola bottle

gleaming flume
rose crater
#

):
A small Chakma village grocery shop interior. A middle-aged Chakma shopkeeper is sitting behind the counter, checking his notebook for dues. A young female customer in traditional Chakma dress stands in front of the counter, looking worried and apologetic. She says, “Dada, this month I have some problems, I cannot pay the due.” The shopkeeper, angry and serious, replies loudly: “If you cannot pay, why did you eat? I need my money today. If you cannot give money, then you must stay with me!” The scene should look dramatic, realistic, with emotional tension visible in both faces.

zinc iris
#

Outputs from nano-banana and gpt-image-1 respectively

#

original image

#

Asked the same but to roll the clock back and got this

gleaming flume
#

Bring me to Sodaland..

dim cairn
#

Hello! It seems you want to create content using the LMArena bot. Please, read #1397655624103493813 to learn how to properly use the bot.

stiff verge
patent schooner
#

Seedream 4.0 high res vs Imagen 4 ultra

dawn trail
#

Why

slim fossil
#

Ok

gleaming flume
#

Imagen 4 with Precision Mode is much better than Nani Banani..

gleaming flume
dim cairn
whole bone
gleaming flume
#

I think I'll love Imagen 4..

stiff verge
glacial void
#

@stiff marten Please review the information in #1397655624103493813. This will provide instructions on how to properly use the bot.

late bluff
#

hey

#

is there anyone who makes gemini api apps and scale here?

wind anvil
velvet moth
#

My creation

#

GTA 5 style

gleaming flume
#

Hi, I am not new. Just stup*d enough..

glacial void
#

@outer warren I'd encourage you to check out #1397655624103493813 for more information on how to use the Video Arena bot. Be sure to let me know if you have any questions.

dim cairn
gleaming flume
#

Can't deny. Soda Dream Number 4 is a wizard or a witch for transforming a passport photo into this.

barren bluff
#

3 Vin Diesel vs 1 Vin Electric

chrome tide
shy dome
#

hello

ornate scaffold
#

Prompt - make this character photorealistic, make the armor and especially the Taxidermy fox head helmet look real, don't forget the shades on his eyes, put him in a realistic looking setting

gpt1 is the only model in the arena that can actually do this, every other model just can't stomach a job like this

gleaming flume
#

Both Geppeto-1 and Soda Dream Number 4 gave good results..

quartz rose
# wind anvil

This looks fantastic! Is it possible to share the prompt?

gleaming flume
sterile kiln
#

@gleaming flume
Can you share the prompt for generating an image like this?

gleaming flume
gleaming flume
agile drift
#

Hi, I just left another group because they were making fun of me because I want to make an animated TV show or movie, or both, with AI

#

If you want, I can show you some of my pictures

#

Alvin and the Chipmunks Go to the UK, written by Logan

lofty kite
#

@agile drift is all of this beautiful stuff created with nano banana?

agile drift
#

No

lofty kite
agile drift
#

No

#

I have 2 apps

lofty kite
#

In what sense?

#

you don't use an AI to create the images?

agile drift
#

Yes, I use an AI app

lofty kite
#

ohh wait

#

maybe I understand what you mean

#

what app do you use?

agile drift
#

ChatGPT and Copilot

lofty kite
#

So you use the gpt-image-1 AI model

agile drift
#

Yea

lofty kite
#

Can I have a prompt?

#

Of this image if possible

#

or whatever you want

#

for me it's the same

agile drift
#

Yea

candid rampart
#

Qwen gave me this

glacial void
candid rampart
#

Currently going back and forth with it to try to get a realistic dark edit of another photo

#

It doesn't let me edit imgs on lmarena tho, it just shows an error, so I had to use Qwen. I bet some of the models would've done this a bit better

glacial void
candid rampart
#

Gotta try tmrw when I get on my PC

glacial void
candid rampart
#

Why would that be?

glacial void
#

Yeah unfortunately, that is also a known issue. Sometimes image uploads won't work.

candid rampart
#

I'll try on PC tmrw then

#

Qwen works fine

half dirge
#

AI Ben Affleck in Venezuela? 😂

graceful heath
fading lance
#

@inland nexus You may want to visit #1397655624103493813 to learn how to use the bot on the video arena channels

frigid shore
#

🤔

past drum
#

Fil

gleaming flume
tawny chasm
sour sparrow
#

Can I create

ripe hollow
#

Clear the word of this image in english

glacial void
glacial void
#

all of your generations are pretty cool, I'm a fan

gleaming flume
devout pivot
#

create a short film in parts, 7 minutes in total

lofty kite
marble pond
#

By:aniskara®

sour prawn
#

want to do a video

glacial void
marble pond
#

By:anis kara ®

edgy gulch
#

What do you think of Rave IA?

marble pond
#

By:anis kara ® |To take a picture, contact me.

edgy gulch
#

Rave IA is good

lofty kite
edgy gulch