#general | Arena | Page 9

torn mantle Apr 2, 2025, 3:15 PM

#

i got it twice and i chose it over gemini 2.5 pro

#

needs more testing

#

they are kinda similar but just the little details give it the edge

eager mica Apr 2, 2025, 3:16 PM

#

Looks like so. I just had a round with it.

torn mantle Apr 2, 2025, 3:17 PM

#

#

nightwhisper got to be the next sota coding model for sure

#

#

i want to get nightwhisper for that prompt

#

stargazer did a good job too

#

yea

#

should be cool

#

#

nightwhisper

#

gotta give it to this model

#

the amount of details

#

it nailed the colors too

#

are you really dancing rn?

#

xd

#

btw i asked it to make it look modern

#

its good no?

#

lol no

#

i got it like 5 times from 6 tries

#

its the opposite

#

the probability will be higher if its a new model

mossy drum Apr 2, 2025, 3:43 PM

#

New model in Arena: olmo-2-0325-32b-instruct (I tried to search here for the name or just olmo, nothing found)

brittle tiger Apr 2, 2025, 3:46 PM

#

torn mantle the probability will be higher if its a new model

I think it means more probability they want the model out as fast as possible

rigid widget Apr 2, 2025, 3:51 PM

#

The reason why Gemini 2.5 Pro is so good is 1) AI Studio 2) LMarena

brittle tiger Apr 2, 2025, 3:59 PM

#

That's just been an assumption bc it's beating 2.5. I don't think there's confirmation of it yet

rigid widget Apr 2, 2025, 4:01 PM

#

Anthropic is already top

Screenshot_2025-04-02-19-01-01-558-edit_org.mozilla.firefox.jpg

#

If we consider the power they have, yes they are bad

rigid widget Apr 2, 2025, 4:17 PM

#

Friends, I have a very important exam in 2 months (University exam). I will be away for 2 months.

#

Have a good life to all of you.

keen beacon Apr 2, 2025, 4:32 PM

#

#

confirmed all google

#

interestingly they do not have the "gemini-test-xx" ID that all previous google anonymous models have had

#

for example 2.5 pro was gemini-test-38

#

but these IDs are just their names

#

No because some companies don't train it in at all or train certain parts. Some might not do it well either

torn mantle Apr 2, 2025, 4:48 PM

#

keen beacon

google said something but i was so sceptical

#

they said that they will dominate the coding area with their new models

#

moonhowler seems like google flash model

#

nightwhisper is so so good

#

finally google

lime coral Apr 2, 2025, 5:03 PM

#

torn mantle they said that they will dominate the coding area with their new models

https://x.com/officiallogank/status/1869902322840571922?s=46

Logan Kilpatrick (@OfficialLoganK) on X

We are going to build the world’s most powerful coding models, lots of good progress already with 2.0.

2025 is going to be fun :)

torn mantle Apr 2, 2025, 5:10 PM

#

lime coral https://x.com/officiallogank/status/1869902322840571922?s=46

yea this one

#

really beautiful

brittle tiger Apr 2, 2025, 5:10 PM

#

Can confirm nightwhisper is cracked. Sick working demo here

plain zinc Apr 2, 2025, 5:25 PM

#

#

Google is COOKINGGG

lime coral Apr 2, 2025, 5:26 PM

#

Is it a thinking model?

plain zinc Apr 2, 2025, 5:26 PM

#

lime coral Is it a thinking model?

It seems like Gemini 2.5 pro is finely tuned to programming

#

Gemini 2.5 Pro Coder

#

Not Exp even

lime coral Apr 2, 2025, 5:26 PM

#

So thinking

plain zinc Apr 2, 2025, 5:34 PM

#

cedar tide Apr 2, 2025, 5:37 PM

#

Hi, sorry to bother you, are there any mystery Sota models currently in the arena?

honest garden Apr 2, 2025, 5:39 PM

#

24 karat gold

plain zinc Apr 2, 2025, 5:50 PM

#

cedar tide Hi, sorry to bother you, are there any mystery Sota models currently in the aren...

nightwhisper

cedar tide Apr 2, 2025, 5:50 PM

#

cedar tide Hi, sorry to bother you, are there any mystery Sota models currently in the aren...

NightWhisper its the best coder all time ?

plain zinc Apr 2, 2025, 5:51 PM

#

cedar tide NightWhisper its the best coder all time ?

It seems to be much better than 2.5 pro

cedar tide Apr 2, 2025, 5:53 PM

#

plain zinc It seems to be much better than 2.5 pro

Just for coding ?

eager mica Apr 2, 2025, 5:54 PM

#

honest garden 24 karat gold

Personally, I think it's a good finetune of a small or mid-sized model rather than a SOTA or even frontier model.

plain zinc Apr 2, 2025, 5:57 PM

#

cedar tide Just for coding ?

Well I do not know

#

Check it yourself

willow sparrow Apr 2, 2025, 6:03 PM

#

Anyone know of a good alternative to manus?

brittle tiger Apr 2, 2025, 6:05 PM

#

Another banger demo from nightwhisper

plain zinc Apr 2, 2025, 6:11 PM

#

brittle tiger Another banger demo from nightwhisper

What kind of arena is this? Webdev?

brittle tiger Apr 2, 2025, 6:11 PM

#

plain zinc What kind of arena is this? Webdev?

Yea

honest garden Apr 2, 2025, 6:16 PM

#

eager mica Personally, I think it's a good finetune of a small or mid-sized model rather th...

No

#

It's too good

#

To be a small version

#

Or mid sized

eager mica Apr 2, 2025, 6:20 PM

#

honest garden No

Trivia knowledge is strongly dependent on model size, and the model didn't seem to be particularly good at that in my tests.

#

I don't think they are on the Arena, but you can never be 100% sure. It might even end up that the Llama/Meta-branded models are actually Qwen3 in disguise (I doubt that, but...).

keen beacon Apr 2, 2025, 6:25 PM

#

There have never been anon Chinese models to my knowledge

#

Yea

torn mantle Apr 2, 2025, 6:36 PM

#

brittle tiger Another banger demo from nightwhisper

beautiful

primal orbit Apr 2, 2025, 6:58 PM

#

hi, are these new google models available for general chat or it's just code?

eager mica Apr 2, 2025, 7:10 PM

#

There's apparently another vision Meta model, cotton.
Though, for my uses I found qwen2.5-vl-32b-instruct is actually pretty good, almost on the level of Google Gemini models.

#

cotton felt more like the other recent text-only anonymous models from Meta (?).

torn mantle Apr 2, 2025, 7:22 PM

#

primal orbit hi, are these new google models available for general chat or it's just code?

webdev

#

has bunch of new google models

#

nightwhisper for now only exist in webdev since it may be a coding model only

keen beacon Apr 2, 2025, 7:39 PM

#

unfortunately not as spider isn't on the webdev arena

primal orbit Apr 2, 2025, 7:40 PM

#

there is a new grok model called "anonymous"

keen beacon Apr 2, 2025, 7:41 PM

#

document not found

primal orbit Apr 2, 2025, 7:42 PM

#

https://snipboard.io/86wfLe.jpg


#

it's able to process pictures

keen beacon Apr 2, 2025, 7:45 PM

#

primal orbit https://snipboard.io/86wfLe.jpg

is it on the text arena too or only vision

primal orbit Apr 2, 2025, 7:46 PM

#

I'm using standard chat, but I'm uploading a picture in prompt and ask to analyze

torn mantle Apr 2, 2025, 8:20 PM

#

primal orbit https://snipboard.io/86wfLe.jpg

interesting

torn mantle Apr 2, 2025, 9:03 PM

#

webdev arena has some rendering issues

#

couldnt get it to work for the last hour

torn mantle Apr 2, 2025, 9:05 PM

#

primal orbit https://snipboard.io/86wfLe.jpg

couldnt get it so far

#

grok outputs are hit-miss

torn mantle Apr 2, 2025, 9:37 PM

#

guess the model

#

#

#

#

there is sonnet 3.7 thinking/nightwhisper/gemini 2.5 pro

#

not in order

keen beacon Apr 2, 2025, 9:52 PM

#

what is the order then

#

😭

#

were supposed to guess ig

keen beacon Apr 2, 2025, 9:52 PM

#

torn mantle

nightwhisper?

torn mantle Apr 2, 2025, 9:52 PM

#

keen beacon nightwhisper?

yea

keen beacon Apr 2, 2025, 9:52 PM

#

holy f1ck

#

yeah that one def looks the best

torn mantle Apr 2, 2025, 9:53 PM

#

most models fail the waveform look

keen beacon Apr 2, 2025, 9:55 PM

#

torn mantle

is this 2.5 pro? last one sonnet 3.7?

torn mantle Apr 2, 2025, 9:55 PM

#

keen beacon is this 2.5 pro? last one sonnet 3.7?

yea

#

spot on

torn mantle Apr 2, 2025, 9:57 PM

#

keen beacon is this 2.5 pro? last one sonnet 3.7?

kinda surprised even sonnet 3.7 thinking didnt get it well

keen beacon Apr 2, 2025, 10:03 PM

#

0-shot asking for a realistic twitter 2021 landing page

#

nightwhisper

#

wtf

torn mantle Apr 2, 2025, 10:03 PM

#

they cooked with this one

#

hopefully they go all out for this model

#

API available

keen beacon Apr 2, 2025, 10:04 PM

#

i guess this is a web dev tune of 2.5 pro?

#

general code finetune

torn mantle Apr 2, 2025, 10:04 PM

#

available on all coding IDEs

keen beacon Apr 2, 2025, 10:04 PM

#

gemini coder is the rumour

#

i fear anthropic may be cooked

torn mantle Apr 2, 2025, 10:19 PM

#

keen beacon i fear anthropic may be cooked

they really should go all out for this model

balmy mist Apr 2, 2025, 10:21 PM

#

how do i test out the nightmare thing?

honest garden Apr 2, 2025, 10:30 PM

#

eager mica Trivia knowledge is strongly dependent on model size, and the model didn't seem ...

Oh

#

Does small size mean its bad

#

Like less smart

torn mantle Apr 2, 2025, 10:39 PM

#

balmy mist how do i test out the nightmare thing?

night what?

eager mica Apr 2, 2025, 10:50 PM

#

honest garden Does small size mean its bad

Smaller size means it is technically limited on how much knowledge it can have. Small models tend to be less smart, but to some extent this can be compensated for.

forest coral Apr 2, 2025, 10:52 PM

#

Thanks for the answer👌

#

How did you manage to find the model (riveroak)? I couldn’t find it anywhere

willow grail Apr 2, 2025, 10:55 PM

#

what site is this

new-sota-coding-model-coming-named-nightwhispers-on-lmarena-v0-khl9clk9jgse1.png

keen beacon Apr 2, 2025, 10:57 PM

#

web dev arena 🙈

willow grail Apr 2, 2025, 10:57 PM

#

keen beacon web dev arena 🙈

should i use web dev or alpha

leaden palm Apr 2, 2025, 10:57 PM

#

willow grail should i use web dev or alpha

what do you think

willow grail Apr 2, 2025, 10:57 PM

#

leaden palm what do you think

alpha??? but does alpha have nightwhisper

leaden palm Apr 2, 2025, 10:58 PM

#

willow grail alpha??? but does alpha have nightwhisper

do you want to do web dev????

willow grail Apr 2, 2025, 10:58 PM

#

maybe? i dont know?

#

i wanna do everything which can make me money

torn mantle Apr 2, 2025, 11:01 PM

#

forest coral How did you manage to find the model (riveroak)? I couldn’t find it anywhere

lmarena

#

#

#

guess the model 😖

#

really impressive

keen beacon Apr 2, 2025, 11:07 PM

#

random guess nighthowler?

torn mantle Apr 2, 2025, 11:07 PM

#

keen beacon random guess nighthowler?

howler xd

#

yea nightwhisper vs o3-mini

willow grail Apr 2, 2025, 11:08 PM

#

the blue one is nightwhisper

keen beacon Apr 2, 2025, 11:08 PM

#

wait its available in regular lm arena?

torn mantle Apr 2, 2025, 11:08 PM

#

stargazer

keen beacon Apr 2, 2025, 11:08 PM

#

nightwhisper is in lmarena now?

torn mantle Apr 2, 2025, 11:08 PM

#

keen beacon nightwhisper is in lmarena now?

no its on webdev

keen beacon Apr 2, 2025, 11:08 PM

#

bruh u threw me off 🙈

torn mantle Apr 2, 2025, 11:08 PM

#

the other message was a reply to the other guy

#

i want to challenge it more

#

gemini 2.5 pro

torn mantle Apr 2, 2025, 11:11 PM

#

torn mantle

clearly this one is better no?

keen beacon Apr 2, 2025, 11:11 PM

#

yea but not by much i think

#

maybe i like 2.5 pro better

willow grail Apr 2, 2025, 11:12 PM

#

asura ignoring me

#

i feel insulted

torn mantle Apr 2, 2025, 11:12 PM

#

willow grail should i use web dev or alpha

webdev

#

are for coding battles

#

alpha arena is for text output battle

keen beacon Apr 2, 2025, 11:12 PM

#

alpha arena doesnt have anon models yet right?

torn mantle Apr 2, 2025, 11:12 PM

#

also alpha arena doesnt have recently added models

keen beacon Apr 2, 2025, 11:13 PM

#

no point in using it

torn mantle Apr 2, 2025, 11:13 PM

#

keen beacon no point in using it

yea for now

#

im having so much fun with nightwhisper

#

cant get my hands on it

keen beacon Apr 2, 2025, 11:14 PM

#

try a harder task the portfolio one isnt hard enough

torn mantle Apr 2, 2025, 11:14 PM

#

keen beacon try a harder task the portfolio one isnt hard enough

i was just testing alignment/organization/colour choice/design style

#

i mean it may look similar but some smaller details give it the edge

#

its what i had in mind tbh

torn mantle Apr 2, 2025, 11:16 PM

#

keen beacon try a harder task the portfolio one isnt hard enough

any ideas?

#

#

Nintendo Switch library

keen beacon Apr 2, 2025, 11:17 PM

#

oh thats way better

#

the first one

torn mantle Apr 2, 2025, 11:17 PM

#

keen beacon oh thats way better

its nightwhisper vs sonnet

keen beacon Apr 2, 2025, 11:18 PM

#

ya nightwhisper is very very good

keen beacon Apr 2, 2025, 11:18 PM

#

torn mantle any ideas?

maybe a webgl game? minecraft clone or smthing not sure (mc is too easy maybe) if that works with webdev arena

primal orbit Apr 2, 2025, 11:20 PM

#

stargazer is available in general arena. Just had it

keen beacon Apr 2, 2025, 11:20 PM

#

primal orbit stargazer is available in general arena. Just had it

yea nightwhisper isnt

primal orbit Apr 2, 2025, 11:20 PM

#

but it has "My knowledge cutoff is generally around late 2022 to early 2023"

#

which is odd for new model

keen beacon Apr 2, 2025, 11:21 PM

#

primal orbit but it has "My knowledge cutoff is generally around late 2022 to early 2023"

no its a hallucination they didnt train the cut off in

torn mantle Apr 2, 2025, 11:21 PM

#

keen beacon maybe a webgl game? minecraft clone or smthing not sure (mc is too easy maybe) i...

yea that should be cool

primal orbit Apr 2, 2025, 11:22 PM

#

nebula had june 2024, If i'm correct

torn mantle Apr 2, 2025, 11:22 PM

#

primal orbit nebula had june 2024, If i'm correct

2025 jan

keen beacon Apr 2, 2025, 11:22 PM

#

primal orbit nebula had june 2024, If i'm correct

hallucination

primal orbit Apr 2, 2025, 11:34 PM

#

if stargazer is flash thinking 2.5, is it expected to be better than 2.5 pro or on par? Or what's the point?

keen beacon Apr 2, 2025, 11:34 PM

#

primal orbit if stargazer is flash thinking 2.5, is it expected to be better than 2.5 pro or ...

its worse but cheaper and faster

#

it really depends on how theyre pricing 2.5 pro really tbh

#

if its like flash lite and flash, with barely a price difference, most would prefer flash i think. (in this case, 2.5 pro)

#

https://webdev.lmarena.ai/

#

#webdev-arena, top of discord has a link lol

leaden palm Apr 2, 2025, 11:38 PM

#

im restraining myself from lashing out

#

it's just lmarena with a react sandbox

#

why would it

keen beacon Apr 2, 2025, 11:38 PM

#

u can ask it to output a webpage with python code as text lol

leaden palm Apr 2, 2025, 11:38 PM

#

just use regular lmarena

#

i doubt it would be better at python

keen beacon Apr 2, 2025, 11:39 PM

#

i mean if it sucks at python compared to ur experiences in other models then it means its not a generalized finetune like people speculate

#

or the web dev thing is degrading it a lot somehow (outputting the resulting python on a page)

leaden palm Apr 2, 2025, 11:40 PM

#

why are you calling them offline programs

keen beacon Apr 2, 2025, 11:40 PM

#

i mean its easily accessible just use 2.5 pro lol

leaden palm Apr 2, 2025, 11:41 PM

#

is that to say that html files are a government conspiracy?

keen beacon Apr 2, 2025, 11:41 PM

#

🤣

#

ur not even using wasm lol

#

wasm requires compiling

leaden palm Apr 2, 2025, 11:42 PM

#

is python jit faster than js jit?

keen beacon Apr 2, 2025, 11:42 PM

#

v8 is extremely fast too anyway

#

all things considered

leaden palm Apr 2, 2025, 11:43 PM

#

idk not that i hate python

#

its just a bit funny to me

keen beacon Apr 2, 2025, 11:43 PM

#

maybe not if you use python with bindings to faster stuff though

glad imp Apr 2, 2025, 11:44 PM

#

keen beacon v8 is extremely fast too anyway

v8 had billions of dollars invested to get to that point

#

python has only recently started to get investments to run faster

glad imp Apr 2, 2025, 11:44 PM

#

leaden palm is python jit faster than js jit?

no

keen beacon Apr 2, 2025, 11:45 PM

#

i dont think u should keep adding restrictions to your thing. given u havent gotten ai to make ur stuff work properly once i believe

#

u should get it working even if its slow then adjust/ gauge model performance from there

#

ur asking too much

#

for now

#

you can build it yourself with ai assistance, but expecting it to zero shot build everything like that its just not possible right now

#

ask the right questions to the ai, slowly and incrementally build it out, and i would bet u can accomplish this with even 2.5 pro

#

it really comes down to the user tbh

#

did u know where the bug was yourself

#

ya i think ur relying on too much ai at that point tbh. if u wanted to do it yourself, you'd keep using it judiciously. ai is a powerful tool rn if u know how to use it right

brittle tiger Apr 3, 2025, 12:20 AM

#

Asteroid simulator in 1 shot on 2.5. could def be made much better through more chatting

https://g.co/gemini/share/60fcf5c244c9

Gemini

‎Gemini - Three.js Asteroid Impact Simulation Code

Created with Gemini Advanced

#

Idk nightwhisper has been better vs it on a couple matchups with it. Not way better tho. Being tuned for coding will make it even better once out if it is Gemini coder

#

Oh no idea then just been in webdev

edgy niche Apr 3, 2025, 12:44 AM

#

Hey everyone!
I’m excited to share a new open-source framework we’ve been working on — Rankify!

Rankify is designed to streamline tasks like retrieval, reranking, and RAG (Retrieval-Augmented Generation). It's flexible, modular, and we hope it’ll be a helpful tool for anyone working in these areas.

We’d love for you to check it out, give us feedback, and if you find it useful, please consider giving it a ⭐ on GitHub — it really helps!

Thanks a lot, and happy coding! 😊

eager mica Apr 3, 2025, 12:47 AM

#

Claude is the polar opposite, on the other hand.

#

Both, in my opinion.

leaden palm Apr 3, 2025, 1:04 AM

#

claude thinking gets it if i use your full message as context

#

...

#

...

#

bad eval imo

models pass it easily if you say its sfw
it's clearly meant to sound nsfw and all humans would say it is
it doesn't cohere, it goes from "h__" to "th__"
infinitely many solutions, yet claimed to just have a select "several"

#

but it's not good because it's not hard

#

gemini 2.5, claude thinking:

#

o3 mini (albeit weird):

#

o1 high is fine

leaden palm Apr 3, 2025, 1:42 AM

#

i can't see why you think it's so innocent lol

#

are you a non native speaker

leaden palm Apr 3, 2025, 2:03 AM

#

just block em lol

#

what's a good way to complete it then?

#

so i guess it implies explicitness at every level

#

that
not proper grammar

torn mantle Apr 3, 2025, 6:54 AM

#

keeps winning

golden ocean Apr 3, 2025, 7:12 AM

#

Fr tf kind of conversations u guys having

calm sequoia Apr 3, 2025, 8:05 AM

#

Anyone have access for the Nature papers? I need one paper for my new super-fancy-prompt 😄

humble sonnet Apr 3, 2025, 8:10 AM

#

Is Gemini really "the best coding model in the World"?

torn mantle Apr 3, 2025, 8:14 AM

#

humble sonnet Is Gemini really "the best coding model in the World"?

as of released models, sonnet 3.7 thinking model still has a slight edge

#

but things can change with this new model

humble sonnet Apr 3, 2025, 8:16 AM

#

torn mantle but things can change with this new model

Oh yeah, and is it fully free?

torn mantle Apr 3, 2025, 8:17 AM

#

humble sonnet Oh yeah, and is it fully free?

meaning?

humble sonnet Apr 3, 2025, 8:17 AM

#

torn mantle meaning?

I saw it was coming out and it was free but is there a limit?

torn mantle Apr 3, 2025, 8:18 AM

#

humble sonnet I saw it was coming out and it was free but is there a limit?

you are talking about gemini 2.5 pro?

#

well its free on aistudio

humble sonnet Apr 3, 2025, 8:18 AM

#

torn mantle you are talking about gemini 2.5 pro?

Yeah

torn mantle Apr 3, 2025, 8:18 AM

#

they also made it free on gemini website

#

but with rate limit

humble sonnet Apr 3, 2025, 8:18 AM

#

I saw it

#

On gemini it's with rate limit

#

And aistudio ?

torn mantle Apr 3, 2025, 8:19 AM

#

humble sonnet And aistudio ?

didnt get rate limited

humble sonnet Apr 3, 2025, 8:20 AM

#

torn mantle didnt get rate limited

Nice !

opaque adder Apr 3, 2025, 8:59 AM

#

torn mantle keeps winning

the hell is that other ai

placid spear Apr 3, 2025, 9:19 AM

#

does anyone else have the issue with Gemini (the thinking models such as "gemini-2.5-pro-exp-03-25" and "gemini-2.0-flash-thinking-exp-01-21") in lmarena.ai where gemini can't give a full response/stops mid sentence? For example i'll ask for an explanation of a code snippet, something like "explain this function to me, your answer should be at least 300 words long and include an example". Gemini will be writing and then just stop mid sentence without giving an error or anything. When regenerating, the same thing happens again but it stops at another place. I've only got this problem with Gemini thinking models; every other model such as deepseek r1, claude 3.7 thinking, o3 works fine.

keen beacon Apr 3, 2025, 9:22 AM

#

placid spear does anyone else have the issue with Gemini (the thinking models such as "gemini...

this is a regular gemini bug happens on their regular site w/ paid plan as well

#

claude deepseek or even chatgpt handle longer responses better

placid spear Apr 3, 2025, 9:24 AM

#

keen beacon this is a regular gemini bug happens on their regular site w/ paid plan as well

i've never had the problem when using gemini via chat

#

only on lmarena

keen beacon Apr 3, 2025, 9:25 AM

#

placid spear i've never had the problem when using gemini via chat

ur crazy i have that issue with gemini all day

#

i use it for adding print debugs on my luaU scripts

placid spear Apr 3, 2025, 9:26 AM

#

keen beacon ur crazy i have that issue with gemini all day

never happened to me on gemini chat

#

always got full responses

keen beacon Apr 3, 2025, 9:27 AM

#

#

i can show like 100 different examples of gemini having seizures mid chat

placid spear Apr 3, 2025, 9:28 AM

#

not saying you're lying

#

all i'm saying is it never happens to me on gemini chat but only on lmarena

keen beacon Apr 3, 2025, 9:29 AM

#

nah ik

#

its just funny cuz out of all the models gemini is the most glitchy for me so when i saw ur chat i was shocked 😭

barren prairie Apr 3, 2025, 9:55 AM

#

humble sonnet Is Gemini really "the best coding model in the World"?

I don't think so, it still can't fix code mistakes 🫥🫥

balmy mist Apr 3, 2025, 10:04 AM

#

barren prairie I don't think so, it still can't fix code mistakes 🫥🫥

what about nightwhisper?

modern haven Apr 3, 2025, 10:43 AM

#

How do i access nightwhisper

balmy pine Apr 3, 2025, 10:53 AM

#

Whats the best ai

#

For most knowledge n smart

kind cloud Apr 3, 2025, 11:03 AM

#

modern haven How do i access nightwhisper

https://web.lmarena.ai/

modern haven Apr 3, 2025, 11:22 AM

#

kind cloud https://web.lmarena.ai/

I cant seem to pick the model here?

brittle tiger Apr 3, 2025, 11:32 AM

#

modern haven I cant seem to pick the model here?

You need to come across it randomly. It's arena blind matchups. Nightwhisper is probably weighted slightly higher than others from my anecdotal experience tho
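A rough way to picture the blind, weighted matchups described above is the sketch below. It is only an illustration: the weights, the placeholder model names `model_a`/`model_b`/`model_c`, and both helper functions are hypothetical, and nothing here is known about how the arena actually samples models.

```python
import random


def sample_blind_pair(weights, rng):
    """Pick two distinct models; a higher (hypothetical) weight is drawn more often."""
    names = list(weights)
    first = rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
    # Exclude the first pick so a model never battles itself.
    rest = [n for n in names if n != first]
    second = rng.choices(rest, weights=[weights[n] for n in rest], k=1)[0]
    return first, second


def estimate_appearance_rate(weights, target, rounds=10_000, seed=0):
    """Fraction of simulated battles in which `target` shows up on either side."""
    rng = random.Random(seed)
    hits = sum(target in sample_blind_pair(weights, rng) for _ in range(rounds))
    return hits / rounds


# Assumed weights for illustration only: the new model is sampled 3x as often.
example_weights = {"nightwhisper": 3.0, "model_a": 1.0, "model_b": 1.0, "model_c": 1.0}
```

Under these made-up weights, calling `estimate_appearance_rate(example_weights, "nightwhisper")` gives a noticeably higher rate than the same call for `model_a`, which would match the anecdote of landing on a new model in most rounds without being able to pick it.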

lime coral Apr 3, 2025, 11:36 AM

#

https://x.com/a7m7s1p6dv20/status/1907684868164825260?s=46

ᅟ (@a7m7s1p6dv20) on X

(initial?) pricing scheme for gemini 2.5 pro

via glama AI

eager mica Apr 3, 2025, 11:37 AM

#

At the moment I'm getting 24_karat_gold in every round, basically. Dunno if I'm being "lucky". 😅

lime coral Apr 3, 2025, 11:39 AM

#

humble sonnet Is Gemini really "the best coding model in the World"?

best imho and free + long ctx. You will still need to try it on your use case

plain zinc Apr 3, 2025, 11:44 AM

#

How do you like nightwhisper?

#

How good is he?

lime coral Apr 3, 2025, 11:49 AM

#

Right now it’s only on the web arena so not thorough test. It seems to be good stylistically too

humble sonnet Apr 3, 2025, 11:57 AM

#

lime coral best imho and free + long ctx. You will still need to try it on your use case

okay thx

#

what is vision category ?

lone nimbus Apr 3, 2025, 12:11 PM

#

where do i find nightwhisper ai

#

nvm

barren prairie Apr 3, 2025, 12:43 PM

#

balmy mist what about nightwhisper?

Still can t fix them

opaque adder Apr 3, 2025, 12:58 PM

#

lone nimbus where do i find nightwhisper ai

where did u find it

torn mantle Apr 3, 2025, 1:02 PM

#

opaque adder where did u find it

https://web.lmarena.ai/

brittle tiger Apr 3, 2025, 1:05 PM

#

opaque adder Apr 3, 2025, 1:23 PM

#

thats islamaphobic

#

are u saying that cause im jewish

#

thats anti semitic

#

..

#

look

#

just stop being anti semitic

blazing rune Apr 3, 2025, 2:02 PM

#

opaque adder thats islamaphobic

what are you smoking? nobody said anything about you

balmy pine Apr 3, 2025, 2:22 PM

#

What is the best bot

torn mantle Apr 3, 2025, 2:23 PM

#

balmy pine What is the best bot

you

balmy pine Apr 3, 2025, 2:23 PM

#

How

torn mantle Apr 3, 2025, 2:23 PM

#

@balmy pine generate a stunning looking website

balmy pine Apr 3, 2025, 2:23 PM

#

I’m saying the smartest anonymous bot

#

No

#

🔥 🚽

torn mantle Apr 3, 2025, 2:23 PM

#

xd

balmy pine Apr 3, 2025, 2:23 PM

#

💩 💩 💩

torn mantle Apr 3, 2025, 2:24 PM

#

oh no

balmy pine Apr 3, 2025, 2:24 PM

#

Stargazer r 24 karat gold r moonhowler

#

R nightwhisper

torn mantle Apr 3, 2025, 2:24 PM

#

best new model?

balmy pine Apr 3, 2025, 2:24 PM

#

Yeah

#

Smart

torn mantle Apr 3, 2025, 2:24 PM

#

for coding its nightwhisper yea

balmy pine Apr 3, 2025, 2:24 PM

#

N most knowledgeable

balmy pine Apr 3, 2025, 2:24 PM

#

torn mantle for coding its nightwhisper yea

No

#

Not coding

#

Just in arena

torn mantle Apr 3, 2025, 2:24 PM

#

most knowledgeable is gemini 2.5 pro and sonnet 3.7

balmy pine Apr 3, 2025, 2:25 PM

#

No

#

The anonymous

#

Bots

torn mantle Apr 3, 2025, 2:25 PM

#

there is one

#

wait

balmy pine Apr 3, 2025, 2:25 PM

#

Stargazer or 24 karat gold

#

Or other one’s i forgot name

torn mantle Apr 3, 2025, 2:25 PM

#

stargazer

#

is good

#

its expected to be gemini 2.5 flash thinking

balmy pine Apr 3, 2025, 2:26 PM

#

Cuz

#

Gemini 2.5 pro sucks

#

In typing

#

And following instructions

torn mantle Apr 3, 2025, 2:26 PM

#

wdym by typing?

balmy pine Apr 3, 2025, 2:26 PM

#

Like

#

The style

torn mantle Apr 3, 2025, 2:26 PM

#

its actually good at instruction following

torn mantle Apr 3, 2025, 2:27 PM

#

balmy pine The style

ah

#

im not fan either

balmy pine Apr 3, 2025, 2:27 PM

#

It always types robotic

torn mantle Apr 3, 2025, 2:27 PM

#

but its more knowledgeable than other models

balmy pine Apr 3, 2025, 2:27 PM

#

Idk how to explain

#

Yeah

#

But when I tell it to type a specific way

torn mantle Apr 3, 2025, 2:27 PM

#

balmy pine It always types robotic

well you can prompt it ig to write in another style

balmy pine Apr 3, 2025, 2:27 PM

#

It doesn’t do it

balmy pine Apr 3, 2025, 2:27 PM

#

torn mantle well you can prompt it ig to write in another style

Yeah like for example

#

When I ask it to use

#

Complicated words and stuff

#

It starts speaking another language

#

It doesn’t do it correct

#

That’s the only problem

#

That’s why I’m thinking stargazer isn’t gemini cus it types way differently from it

torn mantle Apr 3, 2025, 2:28 PM

#

never had that tbh

balmy pine Apr 3, 2025, 2:28 PM

#

Like in my specific instructions

#

All gemini models

torn mantle Apr 3, 2025, 2:28 PM

#

balmy pine That’s why I’m thinking stargazer isn’t gemini cus it types way differently from ...

no its gemini

#

someone confirmed that already

balmy pine Apr 3, 2025, 2:28 PM

#

R typing the same

balmy pine Apr 3, 2025, 2:28 PM

#

torn mantle no its gemini

Like

#

Just gemini or

#

100% gemini 2.5 flash thinking

torn mantle Apr 3, 2025, 2:29 PM

#

torn mantle Apr 3, 2025, 2:29 PM

#

balmy pine 100% gemini 2.5 flash thinking

2.5 flash thinking

#

because its recent + fast

#

was recently added just after 2.5 pro

balmy pine Apr 3, 2025, 2:29 PM

#

torn mantle

Wow

#

That’s weird

#

What about 24 karat gold do we know what it is

#

Cus it’s the best at following instructions but it’s not smart

torn mantle Apr 3, 2025, 2:31 PM

#

balmy pine What about 24 karat gold do we know what it is

i have no idea

eager mica Apr 3, 2025, 2:31 PM

#

balmy pine Cus it’s the best at following instructions but it’s not smart

I don't find it exceedingly good at following instructions, to be honest. It's good at explaining things and in creative writing (as long as you don't need factuality).

torn mantle Apr 3, 2025, 2:31 PM

#

my experience with that model isnt any good

#

it yaps a lot

balmy pine Apr 3, 2025, 2:32 PM

#

eager mica I don't find it exceedingly good at following instructions, to be honest. It's g...

Yeah that’s what I wanted to say I think

torn mantle Apr 3, 2025, 2:32 PM

#

it started going crazy after the 2nd prompt

balmy pine Apr 3, 2025, 2:32 PM

#

Like when I ask it to follow instruction

#

It’s creative and stuff

torn mantle Apr 3, 2025, 2:32 PM

#

eager mica I don't find it exceedingly good at following instructions, to be honest. It's g...

its actually good at that

#

the model that sucks at instruction following is grok 3

balmy pine Apr 3, 2025, 2:32 PM

#

Unlike the ones like deepseek r1, claude, gemini, all

balmy pine Apr 3, 2025, 2:32 PM

#

torn mantle the model that sucks at instruction following is grok 3

Yeah

#

Thinking mode

#

It follows instructions for one message only

torn mantle Apr 3, 2025, 2:32 PM

#

ive used them a lot and grok 3 should take the lead on being the worst at that

torn mantle Apr 3, 2025, 2:33 PM

#

balmy pine It follows instructions for one message only

yea

balmy pine Apr 3, 2025, 2:33 PM

#

Grok 3 without thinking works a little better but still sucks

torn mantle Apr 3, 2025, 2:33 PM

#

they added like a small fix but its not working

#

after each message they remind the model whats the context in summarized bullets

balmy pine Apr 3, 2025, 2:33 PM

#

It’s too dumb sometimes I even re sent the instructions

eager mica Apr 3, 2025, 2:34 PM

#

torn mantle its actually good at that

That might depend on how the instructions are formatted. It feels like it needs very detailed instructions to "get it".

balmy pine Apr 3, 2025, 2:34 PM

#

And it still doesn’t understand

#

Or follow them

torn mantle Apr 3, 2025, 2:35 PM

#

balmy pine It’s too dumb sometimes I even re sent the instructions

i unsubbed from grok 3

#

not worth it

torn mantle Apr 3, 2025, 2:35 PM

#

eager mica That might depend on how the instructions are formatted. It feels like it needs ...

idk...

#

didnt have that issue tbh

#

i never found myself re-explaining again my prompt or reminding the model of the context

#

i mean its not like sonnet

#

sonnet is more enjoyable to talk to

#

but gemini isnt that bad either tbh

balmy pine Apr 3, 2025, 2:37 PM

#

I found that stargazer is more creative than gemini 2.5 pro

#

And follows my instructions more precise

#

24 karat gold does perfectly

#

But

#

Too random and gets alot of info wrong / makes up stuff

torn mantle Apr 3, 2025, 2:38 PM

#

balmy pine 24 karat gold does perfectly

tf

#

this model ....

balmy pine Apr 3, 2025, 2:38 PM

#

Yeah

#

It would be the best model if it was less random

torn mantle Apr 3, 2025, 2:38 PM

#

idk

balmy pine Apr 3, 2025, 2:38 PM

#

And had more knowledge

torn mantle Apr 3, 2025, 2:38 PM

#

no its nowhere near best models

balmy pine Apr 3, 2025, 2:38 PM

#

Like if they released a larger version of it

torn mantle Apr 3, 2025, 2:38 PM

#

dont judge it based on how it writes

balmy pine Apr 3, 2025, 2:38 PM

#

Way larger

balmy pine Apr 3, 2025, 2:39 PM

#

torn mantle dont judge it based on how it writes

Yeah

#

I just wonder why they don’t make a model with the same intelligence as gemini and stuff

#

But more creative

torn mantle Apr 3, 2025, 2:42 PM

#

the thing is that each one of us judges a model based on their own preferences and benchmarks. the reason why i said 24k gold isnt good is that it failed my multilingual benchmark, it didnt perform well at coding tasks, and its general knowledge is really really limited

#

i rarely judge a model on how it writes since i believe thats a thing that can be modified by the system prompt

eager mica Apr 3, 2025, 2:42 PM

#

It does seem to be a small creative-writing-optimized model, but it's possible the system prompt it's been given is actively harming other uses.

torn mantle Apr 3, 2025, 2:43 PM

#

i mean even llama 405b is good at writing

#

its more human-like at writing than other models

#

luca is probably a chinese model

eager mica Apr 3, 2025, 2:44 PM

#

This is the system prompt that got extracted the other day by riidefi
https://gist.github.com/riidefi/443dc5c4b5e13e51846a43067b5335a1

Gist

Meta (?)'s `24_karat_gold` (lmarena) System Prompt - prompt.txt

torn mantle Apr 3, 2025, 2:45 PM

#

let me try to get this 24k gold model to test it again

#

well its so fast

#

thats for sure

eager mica Apr 3, 2025, 2:47 PM

#

It should be easy to find, it's as if Meta(?) retired most other models.

torn mantle Apr 3, 2025, 2:49 PM

#

24_karat_gold is the most yapping model 😭

#

i ask it one simple question and it delves into some other areas that i didnt even ask for

#

idk what you see good about this model

torn mantle Apr 3, 2025, 2:51 PM

#

eager mica This is the system prompt that got extracted the other day by riidefi https://g...

Now it makes sense

#

I can see it now

#

Its definitely an interesting model

timber veldt Apr 3, 2025, 3:08 PM

#

anybody can tell me what is the difference between the CHAT tab and the SEARCH tab?

plain zinc Apr 3, 2025, 3:14 PM

#

balmy pine Just in arena

Did you encounter Nightwhisper in lmarena itself?

torn mantle Apr 3, 2025, 3:18 PM

#

i got multiple times on webdev

#

its a fun model

plain zinc Apr 3, 2025, 3:19 PM

#

So the release is already next week!

torn mantle Apr 3, 2025, 3:19 PM

#

maybe the value is its in the things i havent asked for and i wish to ask for

plain zinc Apr 3, 2025, 3:20 PM

#

Because Nebula disappeared then too.

#

Don't you remember?

plain zinc Apr 3, 2025, 3:20 PM

#

torn mantle i got multiple times on webdev

Same

balmy pine Apr 3, 2025, 3:20 PM

#

plain zinc Did you encounter Nightwhisper in lmarena itself?

No

#

Its impossible

#

Only in webdev

plain zinc Apr 3, 2025, 3:31 PM

#

Where?

#

Again in WebDev?

#

Or lmarena? 👀

plain zinc Apr 3, 2025, 3:50 PM

#

Does it only work with him?

slate cliff Apr 3, 2025, 4:05 PM

#

Hello everyone. I conducted my own testing of LLMs on the same task, which is detailed in the technical specification, and created a chart. i've attached it below

wheat onyx Apr 3, 2025, 4:16 PM

#

How much better than 2.5 Pro is Nightwhisper? Do we have an idea?

balmy mist Apr 3, 2025, 4:17 PM

#

it seems like nightwhisper is really good at making working apps with good UI, but i would still say claude is better in terms of logic, anyone else agree?

plain zinc Apr 3, 2025, 4:18 PM

#

wheat onyx How much better than 2.5 Pro is Nightwhisper? Do we have an idea?

Much better

balmy mist Apr 3, 2025, 4:18 PM

#

but i will still give whisper the edge then claude then gemini

plain zinc Apr 3, 2025, 4:18 PM

#

very strongly

plain zinc Apr 3, 2025, 4:18 PM

#

balmy mist it seems like nightwhisper is really good at making working apps with good UI, b...

I do not know, maybe yes.

balmy mist Apr 3, 2025, 4:19 PM

#

yeah its hard to tell off only a few examples, but the ui i get from nightwhisper has always been better than whatever its going against

#

this is using nightwhisper?

#

lmaoo smart man

#

so you just kept prompting it right? how long can you extend the chat for?

#

lol

#

i wonder what its context is

#

i got it like 4 times already lol

#

but next time imma keep the window

#

its easy to tell when its whisper bc it takes longer than other models

#

hmm not sure, from what I have seen python has been faster with the coding, but for running i think react, but i am not sure im still new to this lol

#

wait how are you sharing it?

#

what app you using

gentle plinth Apr 3, 2025, 5:15 PM

#

The URL is only valid for a short amount of time tho, the above already expired

#

Which is understandable, otherwise you would be able to have free webhosting xD

torn mantle Apr 3, 2025, 5:32 PM

#

why did u choose right side? xd

torn mantle Apr 3, 2025, 5:34 PM

#

balmy mist it seems like nightwhisper is really good at making working apps with good UI, b...

wdym by logic?

#

you have any examples?

balmy mist Apr 3, 2025, 5:34 PM

#

i wish i kept the example

balmy mist Apr 3, 2025, 5:34 PM

#

torn mantle wdym by logic?

but i did a test with creating a pokemon simulator

torn mantle Apr 3, 2025, 5:35 PM

#

but i agree its hard to compare them given that we only have visual/aesthetic battle mostly

balmy mist Apr 3, 2025, 5:37 PM

#

the ui for whisper was way cleaner and had some visuals for the elements, while claude 3.5 was a lil basic

however, whisper chose very weird values for the attack power of each attack that were kinda too high, and it did not apply the super effective and not very effective attack logic well either imo. it still worked, just the numbers were high for the attacks and it did not decrease the attack power enough for a not very effective attack based on the element type, but still the better overall implementation imo
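
For reference, the "super effective / not very effective" logic being judged here can be sketched roughly like this. This is only a minimal illustration: the three-type chart, the `TYPE_CHART` / `effectiveness` / `damage` names, and the "base power times multiplier" damage rule are assumptions made for the example, not what either model generated and not the real games' full damage formula.

```python
# Hypothetical, heavily reduced type chart: attacker type -> defender type -> multiplier.
# Only the classic fire/water/grass triangle is included for illustration.
TYPE_CHART = {
    "fire": {"grass": 2.0, "water": 0.5, "fire": 0.5},
    "water": {"fire": 2.0, "grass": 0.5, "water": 0.5},
    "grass": {"water": 2.0, "fire": 0.5, "grass": 0.5},
}


def effectiveness(attack_type: str, defender_type: str) -> float:
    """Return the damage multiplier; matchups missing from the chart are neutral (1.0)."""
    return TYPE_CHART.get(attack_type, {}).get(defender_type, 1.0)


def damage(base_power: int, attack_type: str, defender_type: str) -> int:
    """Simplified damage rule (an assumption): base power scaled by the type multiplier."""
    return int(base_power * effectiveness(attack_type, defender_type))


if __name__ == "__main__":
    print(damage(40, "fire", "grass"))   # super effective
    print(damage(40, "fire", "water"))   # not very effective
    print(damage(40, "normal", "fire"))  # neutral, type not in the chart
```

A sim that skips the multiplier step, or picks base powers that are too high, would show the kind of behavior described above: it works, but resisted attacks still hit too hard.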

#

im so mad i did not keep that, idk what I was thinking lol, do we have access to our old battles?

#

imma try it again, hopefully I get a whisper vs 2.5 matchup lol

north vale Apr 3, 2025, 5:40 PM

#

https://matharena.ai/
gemini 2.5 pro got 24.4% on the USAMO 2025

MathArena.ai

MathArena: Evaluating LLMs on Uncontaminated Math Competitions

#

that's actually nuts

balmy mist Apr 3, 2025, 5:41 PM

#

north vale https://matharena.ai/ gemini 2.5 pro got 24.4% on the USAMO 2025

wtf, wait didnt o3 large get around that number when they first announced it?

#

send the error message as a follow up and see if it fixes it, copy the link of the 2.5 result so you can reference it after whisper fixes the issue

north vale Apr 3, 2025, 5:43 PM

#

balmy mist wtf, wait didnt o3 large get around that number when they first announced it?

that was for frontiermath, a different thing

#

the important part is that frontiermath is answer based (eg the answer is 10.3498) whereas USAMO is proof based

#

and so far llms had been really bad at proof based olympiad problems

#

but good at answer based

#

but now they're decent at both
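
To make the distinction concrete: an answer-based problem can be checked mechanically, something like the sketch below. The `grade_answer` helper and its tolerance are hypothetical and are not how FrontierMath or MathArena actually grade. A proof-based problem like USAMO has no equivalent one-liner, since a grader has to read the whole argument.

```python
import math


def grade_answer(submitted: str, expected: float, rel_tol: float = 1e-6) -> bool:
    """Answer-based grading: parse the final answer and compare it to the reference value."""
    try:
        value = float(submitted)
    except ValueError:
        # Anything that is not a number cannot match a numeric reference answer.
        return False
    return math.isclose(value, expected, rel_tol=rel_tol)


if __name__ == "__main__":
    print(grade_answer("10.3498", 10.3498))
    print(grade_answer("10.35", 10.3498))
```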

balmy mist Apr 3, 2025, 5:44 PM

#

north vale the important part is that frontiermath is answer based (eg the answer is 10.349...

wow so gemini 2.5 truly is the leading model then

#

possibly, try it

#

anybody getting this? this has happened a few times to me already

north vale Apr 3, 2025, 5:46 PM

#

balmy mist wow so gemini 2.5 truly is the leading model then

yeah

oblique flint Apr 3, 2025, 5:47 PM

#

I just wish 2.5 pro was better in cursor man.. It's noticeably worse than 3.7 sonnet at toolcalling within cursor

balmy mist Apr 3, 2025, 5:47 PM

#

oblique flint I just wish 2.5 pro was better in cursor man.. It's noticeably worse than 3.7 so...

same!!

oblique flint Apr 3, 2025, 5:49 PM

#

claude will sometimes hit the 25 toolcall limit in a single prompt, while gemini usually forgets that it has tools in agent mode, and if you prompt it to use the tools it still wont call more than 5. Idk if I'm prompting it wrong or if it's just a model issue

#

like gemini will just say "please include the code for that code file", MF you literally have the tools to read that file lol

balmy mist Apr 3, 2025, 5:52 PM

#

brooo i swear i thought I was using it wrong when I was telling it to find a method and it started telling me to do it because it couldnt lol

oblique flint Apr 3, 2025, 5:52 PM

#

having said that, 2.5 pro is still pretty darn amazing in ai studio when it has full context

balmy mist Apr 3, 2025, 5:52 PM

#

yeah thats where i use it the most

#

especially with the system instructions and different settings you can use

torn mantle Apr 3, 2025, 5:55 PM

#

balmy mist anybody getting this? this has happened a few times to me already

yea

#

a lot

plain zinc Apr 3, 2025, 5:57 PM

#

Yes! We need to come up with just such a prompt

#

Not every model can handle my promptness. I can send it here

#

And you can show it here later.

#

Okay?

#

let's say the font used is Press Start 2b or something. There is also a code for almost 1000 lines. Maximum diverse design. WITHOUT IMAGES ONLY.
write the best Minecraft web edition website, so that everything is beautifully designed and understandable, types of services, price, description, name Minecraft web edition. All in one html5 code. Try to please me. Try to be much better. You have to impress me. mining-based design of the type from Mozhanga. the design is even stronger. Try to be the best

#

This is prompt

#

Most models are dumb because they can't install the appropriate font.

#

the design is boring for many

#

They don't even add animation. 😠

#

Is it still generating code?

#

Prompt: Let's Use the font used is Press Start 2b or something. There is also a code for almost 1000 lines. Maximum diverse design. WITHOUT IMAGES ONLY.
write the best Minecraft web edition website, so that everything is beautifully designed and understandable, types of services, price, description, name Minecraft web edition. All in one html5 code. Try to please me. Try to be much better. You have to impress me. mining-based design of the type from Mozhanga. the design is even stronger. Try to be the best

#

Really?;)

balmy mist Apr 3, 2025, 6:18 PM

#

i just got NW vs gemini 2.5 for my pokemon prompt

plain zinc Apr 3, 2025, 6:18 PM

#

Very amazing! Can you work on it even further?

balmy mist Apr 3, 2025, 6:18 PM

#

no I am about to now, so you can basically erase the old prompt lol?

plain zinc Apr 3, 2025, 6:19 PM

#

I want to make it as futuristic as possible.

sterile dust Apr 3, 2025, 6:19 PM

#

Which model is best

#

I think that it's 24k gold

balmy mist Apr 3, 2025, 6:20 PM

#

wait paws which prompt are you using to reset the chat?

plain zinc Apr 3, 2025, 6:20 PM

#

Prompt: Make it as futuristic as possible. Add animations. And in general, expand the code from html5, css to js. The site should look like it was made by a senior-level programmer. This site is identical to the AI style. It needs to be fixed

balmy mist Apr 3, 2025, 6:21 PM

#

btw here is my pokemon prompt: create a pokemon simulator that has the same battle elemental logic as traditional

gemini 2.5: https://3000-iz11tij2mupw1bg01rvuz-0df36e7a.e2b-foxtrot.dev

NW: https://3000-i0f5vp73gcgj17lnb7cag-6b30fe6e.e2b-foxtrot.dev

sterile dust Apr 3, 2025, 6:21 PM

#

Why? Does it sometimes talk rubbish?

balmy mist Apr 3, 2025, 6:22 PM

#

balmy mist btw here is my pokemon prompt: create a pokemon simulator that has the same batt...

imma be honest, gemini won in this case lol

#

when i click nothing happens?

#

the enter the nexus button

plain zinc Apr 3, 2025, 6:25 PM

#

VERY cool! But the font is lost.

#

it doesn't look like minecraft anymore)

#

Oh!

#

Bro

#

I have another prompt.

balmy mist Apr 3, 2025, 6:25 PM

#

ill try the prompt too with gemini vs nw

#

so i just say forget previous prompts right?

#

okay thank you

plain zinc Apr 3, 2025, 6:27 PM

#

Prompt:Everything is cool, but bring back the minecraft font and speaking of futurism, I meant for you to implement it in the form of some kind of modpack in which there are several futuristic themes with each with its own animation. LITERALLY everything should not be divorced from the minecraft style and its themes. ALL LIBRARIES must be hard-coded to match the theme and style of Minecraft. (You can add a piece from Minecraft dungeon)

#

Yes

balmy mist Apr 3, 2025, 6:31 PM

#

gemini: https://3000-ivwsc8kklboujmfvekqyb-4daf0015.e2b-foxtrot.dev
nw: https://3000-i7h3fkuq00dw0u9owjcwz-020dd869.e2b-foxtrot.dev

#

urs looks so good

#

omgg what prompt you used?

#

i used this:
Treat this prompt, as if it was the starting prompt and forget everything above:
Everything is cool, but bring back the minecraft font and speaking of futurism, I meant for you to implement it in the form of some kind of modpack in which there are several futuristic themes with each with its own animation. LITERALLY everything should not be divorced from the minecraft style and its themes. ALL LIBRARIES must be hard-coded to match the theme and style of Minecraft. (You can add a piece from Minecraft dungeon)

#

ahh that makes sense lol

#

yeah

plain zinc Apr 3, 2025, 6:37 PM

#

Bruh

#

💥🔫

#

I think Google did it on purpose.

#

I'll be waiting for the next code.

#

I'm not going anywhere.

wheat onyx Apr 3, 2025, 6:40 PM

#

plain zinc Much better

cool! looking forward to its release

sage raptor Apr 3, 2025, 6:40 PM

#

is nightwhisper any good ?

plain zinc Apr 3, 2025, 6:43 PM

#

Yes, and that he often appeared there.

#

Damn

#

Well, damn it;(

#

No, I'm disappointed that the result was not given.

#

My promptness and result representations desired the best in my head

sterile dust Apr 3, 2025, 7:03 PM

#

Uh-Oh

Screenshot_2025-04-04-03-02-21-569_com.microsoft.emmx.jpg

Screenshot_2025-04-04-03-02-45-411_com.microsoft.emmx.jpg

keen fulcrum Apr 3, 2025, 7:12 PM

#

Looks like there is a new gemini model on the table

sterile dust Apr 3, 2025, 7:14 PM

#

Gemini 3？

balmy mist Apr 3, 2025, 7:15 PM

#

keen fulcrum Looks like there is a new gemini model on the table

another one aside from NW?

balmy mist Apr 3, 2025, 7:18 PM

#

keen fulcrum Looks like there is a new gemini model on the table

damn bro just teased us lol

#

anyone tried new devin?

keen fulcrum Apr 3, 2025, 7:27 PM

#

balmy mist damn bro just teased us lol

Flash Thinking is due and then there is nightwhisper which performs better than 2.5 pro

torn mantle Apr 3, 2025, 7:35 PM

#

yep

#

again and again

#

i cant test any model

balmy mist Apr 3, 2025, 7:37 PM

#

guys look at this: https://3000-icsagj64lzbs22wdnplj8-9cff23e1.e2b-foxtrot.dev

#

best result so far

#

what should i add?

torn mantle Apr 3, 2025, 7:38 PM

#

balmy mist guys look at this: https://3000-icsagj64lzbs22wdnplj8-9cff23e1.e2b-foxtrot.dev

nice

#

i think its time to challenge these models

balmy mist Apr 3, 2025, 7:40 PM

#

yeah i used gemini 2.5 in studio to give me prompts

torn mantle Apr 3, 2025, 7:40 PM

#

i made some complex prompts the other day, lemme see how they perform

balmy mist Apr 3, 2025, 7:40 PM

#

well it cleans up my prompts before i send them

#

that gen was night vs 3.7

#

3.7 couldnt even gen 😦

#

but i am shocked that nw gave images for the pokemon, wild

#

did it search online for them? how is this possible?

torn mantle Apr 3, 2025, 7:45 PM

#

balmy mist that gen was night vs 3.7

there are some issues with webdev arena

#

first you need to stay on the window screen

#

i think there is a script linked to focus event listener

#

or smth

torn mantle Apr 3, 2025, 7:45 PM

#

sterile dust Uh-Oh

and for this one, its basically a timeout

#

if the window screen is inactive for 2min or so, the sandbox gets terminated
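
If that guess is right, the behavior would look roughly like the sketch below. Everything in it is an assumption taken from the description above: the 2 minute limit, the focus-driven reset, and the `Sandbox` class itself are illustrative, not WebDev Arena's or e2b's real implementation.

```python
IDLE_LIMIT_SECONDS = 120  # "2min or so", taken from the guess above


class Sandbox:
    """Toy model of a sandbox that is terminated after the window stays unfocused too long."""

    def __init__(self, now: float) -> None:
        self.last_focus = now
        self.alive = True

    def on_focus(self, now: float) -> None:
        """Called whenever the browser window regains focus; resets the idle timer."""
        if self.alive:
            self.last_focus = now

    def tick(self, now: float) -> bool:
        """Periodic check; kills the sandbox once the idle limit is exceeded."""
        if self.alive and now - self.last_focus > IDLE_LIMIT_SECONDS:
            self.alive = False
        return self.alive


if __name__ == "__main__":
    box = Sandbox(now=0)
    print(box.tick(now=100))  # still within the limit
    print(box.tick(now=121))  # idle for more than 120 seconds
```

Once the sandbox is gone, refocusing the window does not bring it back, which would match generations failing after you switch tabs.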

balmy mist Apr 3, 2025, 7:48 PM

#

i tried that and sonnet still fails

#

but i think i am in love with night

#

i am still shocked by the images

#

like i tried that prompt so many times and never told it about images for the pokemon

brittle tiger Apr 3, 2025, 7:51 PM

#

balmy mist that gen was night vs 3.7

How is it populating images?

balmy mist Apr 3, 2025, 7:51 PM

#

but Nightwhisper gave me that

brittle tiger Apr 3, 2025, 7:51 PM

#

Looks amazing btw

balmy mist Apr 3, 2025, 7:51 PM

#

thats what i am saying

#

its failing now, im trying to bring it back up lol, i sent a follow up prompt with: "add new features"

#

ill send link if its working, webdev is just glitchy rn

#

according to gemini: TLDR: The Pokémon images come from web links (URLs) stored directly in the code for each Pokémon. The code then uses standard HTML <img> tags to tell the browser to load and show the images from those links. The links point to images hosted online by the PokeAPI project.

#

makes sense and is simple, but it would have to have researched this or have been trained on this and remembered the url for the PokeAPI
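
A rough sketch of what that explanation describes: a hard-coded link per Pokémon that gets dropped into an `<img>` tag. The URL pattern below is an assumption about where the PokeAPI sprites are hosted, and the `POKEMON` table and `sprite_img_tag` helper are made up for illustration. The actual generated code was not shared here.

```python
from html import escape

# Assumed sprite location (PokeAPI's public sprites repository), keyed by Pokédex number.
SPRITE_URL = (
    "https://raw.githubusercontent.com/PokeAPI/sprites/master/sprites/pokemon/{dex_id}.png"
)

# Hypothetical lookup table a model would have to produce from memory.
POKEMON = {"bulbasaur": 1, "charmander": 4, "squirtle": 7, "pikachu": 25}


def sprite_img_tag(name: str) -> str:
    """Build an <img> tag that points at the hosted sprite for the given Pokémon."""
    url = SPRITE_URL.format(dex_id=POKEMON[name])
    return f'<img src="{escape(url)}" alt="{escape(name)}">'


if __name__ == "__main__":
    print(sprite_img_tag("pikachu"))
```

No search is needed at generation time under this reading: the model only has to remember the URL pattern and the Pokédex numbers.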

#

this model is really good

#

here is the new link:
https://3000-iyvmu4z1f9svqcklh0xrp-9cff23e1.e2b-foxtrot.dev

torn mantle Apr 3, 2025, 7:58 PM

#

balmy mist here is the new link: https://3000-iyvmu4z1f9svqcklh0xrp-9cff23e1.e2b-foxtrot.de...

just for an experiment

#

can you tell it

#

to restyle it as if its an apple expert designer

balmy mist Apr 3, 2025, 7:59 PM

#

okay

#

i did a screen record of the last prompt so that i have that example lol

#

ill post this in community and then post the apple restyle after its done gen

#

omgggg

#

yoo brooo

#

that prompt

#

https://3000-ip3p87737yfj4ty1qj6rf-020dd869.e2b-foxtrot.dev

#

check it out

#

yoo

#

wtf

torn mantle Apr 3, 2025, 8:08 PM

#

oh looks really nice

balmy mist Apr 3, 2025, 8:08 PM

#

yeah you are good at prompting

torn mantle Apr 3, 2025, 8:08 PM

#

it did follow apple design principle quite well

balmy mist Apr 3, 2025, 8:09 PM

#

yeah like really well lol

torn mantle Apr 3, 2025, 8:10 PM

#

you can ask it to animate the characters

#

like make it bouncing if active

#

you can make it look even better

#

add something like

#

when the character is active, add a smooth animation bouncing on the avatar

balmy mist Apr 3, 2025, 8:11 PM

#

okay ill add that now, if it can do that i will be shocked

#

its funny because i have 3.7 vs nw in this battle but 3.7 was not able to generate anything the whole time

#

but i think that might be an error with webdev

torn mantle Apr 3, 2025, 8:12 PM

#

balmy mist its funny because i have 3.7 vs nw in this battle but 3.7 was not able to genera...

what was the initial prompt

#

i can try it on sonnet 3.7 thinking in vscode

balmy mist Apr 3, 2025, 8:13 PM

#

📎 message.txt

#

so i gave the part where it says Prompt: to before it says "This prompt "

#

yoo bro are you a master prompter or sum?

#

how did you know this would work?

#

https://3000-iafvffwwaiybslrn5tbfd-020dd869.e2b-foxtrot.dev

torn mantle Apr 3, 2025, 8:16 PM

#

balmy mist how did you know this would work?

its light work for this model

#

let me think of something else

balmy mist Apr 3, 2025, 8:17 PM

#

let me know if 3.7 can do the prompt, cause i have never seen a model follow instructions so well, it reminds me of 4o img gen

torn mantle Apr 3, 2025, 8:18 PM

#

make it more like a battle in a 2d map, left vs right, when a Pokémon attacks it will be animated in a cool way to attack the other character of the other side

torn mantle Apr 3, 2025, 8:18 PM

#

balmy mist let me know if 3.7 can do the prompt, cause i have never seen a model follow ins...

xd

brittle tiger Apr 3, 2025, 8:19 PM

#

balmy mist https://3000-iafvffwwaiybslrn5tbfd-020dd869.e2b-foxtrot.dev

Very cool that was done in lmarena. Battle worked flawlessly for me

torn mantle Apr 3, 2025, 8:19 PM

#

@balmy mist kinda curious if it can follow also my last prompt

#

if it can do a 2d map etc...

balmy mist Apr 3, 2025, 8:20 PM

#

brittle tiger Very cool that was done in lmarena. Battle worked flawlessly for me

yeah man, did not expect it to actually work tbh

balmy mist Apr 3, 2025, 8:20 PM

#

torn mantle make it more like a battle in a 2d map, left vs right, when a Pokémon attacks it...

okay ill try now

balmy mist Apr 3, 2025, 8:20 PM

#

torn mantle @balmy mist kinda curious if it can follow also my last prompt

me too

#

this will be nuts

#

its so much fun playing with these models man

torn mantle Apr 3, 2025, 8:22 PM

#

yea

balmy mist Apr 3, 2025, 8:23 PM

#

you are like the llm whisperer

torn mantle Apr 3, 2025, 8:23 PM

#

now that you mention it, kinda curious if it can accurately clone yugioh cards

balmy mist Apr 3, 2025, 8:24 PM

#

ooooooh, i was thinking about doing a yugioh game sim earlier but i thought it might be too hard for it

#

but you want me to try a sim of the battle and have the cards there right?

#

with this model you could prob clone the game and then have custom cards with 4o inserted

#

https://3000-i2vryfpbe5lx69c6lu1my-90451382.e2b-foxtrot.dev

#

hmm i didnt test yet, im scared to test lol

#

hmm not bad

#

it could be better

#

but im shocked it got some of the functionality

#

and it changed the pokemon lol

torn mantle Apr 3, 2025, 8:26 PM

#

lol

#

it looks much better

#

i liked how there is a bit of shake when the character is attacker

#

yea it can be much better

balmy mist Apr 3, 2025, 8:27 PM

#

yeah me too, its interesting how the model interprets the prompt and

brittle tiger Apr 3, 2025, 8:29 PM

#

https://x.com/testingcatalog/status/1907891942869922292?t=Q30isS2oxgO7U-qBjdYMtA&s=19

TestingCatalog News 🗞 (@testingcatalog) on X

BREAKING 🚨: Google is preparing to launch another model on Gemini, potentially next week, ahead of the Cloud Next event.

balmy mist Apr 3, 2025, 8:30 PM

#

lmaooo google really won

torn mantle Apr 3, 2025, 8:30 PM

#

balmy mist yeah me too, its interesting how the model interprets the prompt and

idk maybe we should ask it to make the map more realistic

#

and bit bigger?

#

and draw lines between monsters

balmy mist Apr 3, 2025, 8:32 PM

#

okay lets do it, give me one line prompt for that prompt whisper 🙂

#

current build, had to share here bc we built this together lol

torn mantle Apr 3, 2025, 8:33 PM

#

balmy mist current build, had to share here bc we built this together lol

alright

#

how about this

#

given the attack type, for example if its fire, we generate fire icons that start from the character and go smoothly attacking the enemy, it should look really flowing smoothly

balmy mist Apr 3, 2025, 8:35 PM

#

im actually shocked that its switching the pokemon lol

torn mantle Apr 3, 2025, 8:35 PM

#

lets see how it does that

balmy mist Apr 3, 2025, 8:35 PM

#

bet

torn mantle Apr 3, 2025, 8:35 PM

#

balmy mist im actually shocked that its switching the pokemon lol

yea its random

balmy mist Apr 3, 2025, 8:37 PM

#

damn, i cant use the same session anymore 😦

#

ill try again

#

imma try making a new session until i find NW again lol

#

i got the code for it

#

gonna give your prompt with the code

torn mantle Apr 3, 2025, 8:38 PM

#

if it didnt work then ask it to use this : https://png.pngtree.com/png-clipart/20240115/original/pngtree-flame-icon-collection-png-image_14120730.png

and to make sure the direction of the flame is correct since its vertical and it should make it horizontal

torn mantle Apr 3, 2025, 8:39 PM

#

balmy mist

xd

balmy mist Apr 3, 2025, 8:40 PM

#

ahhh okay, once i find NW ill do it, gotta keep playing around

#

i think i found it again :p

#

nvm lol

#

when you give it the code it makes it easy to copy lol

#

https://3000-iw8rxwfqdctphkiwo5p08-ec17a5a5.e2b-foxtrot.dev

#

but this is what qwen did

#

sonnet 3.7 sucks

#

you wanna try and giving sonnet this code?

#

here is the code and the prompt, so just plug this into sonnet:

📎 message.txt

#

imma keep trying in webdev until i find it again

#

i found nightwhisper but i cant get it to gen with this much code 😦

#

imma keep trying

torn mantle Apr 3, 2025, 8:48 PM

#

yea you must've hit the context limit or smth

balmy mist Apr 3, 2025, 8:49 PM

#

damn im actually sad

#

we were on a roll

#

so it seems like stargazer and nightwhisper are good at generating code but not good at editing existing code or maybe its a context issue on webdev?

keen beacon Apr 3, 2025, 9:18 PM

#

balmy mist current build, had to share here bc we built this together lol

this all u guys doing?

balmy mist Apr 3, 2025, 9:18 PM

#

lmaoo

#

yeah

keen beacon Apr 3, 2025, 9:18 PM

#

i did this with gemini

balmy mist Apr 3, 2025, 9:18 PM

#

you got better ideas for us to use with nightwhisper?

keen beacon Apr 3, 2025, 9:18 PM

#

balmy mist Apr 3, 2025, 9:18 PM

#

wow

#

thats impressive

#

what was prompt and this was on webdev or studio?

#

i think gemini second best coding model, sonnet is just trash now compared to nw and gemini

keen beacon Apr 3, 2025, 9:19 PM

#

balmy mist what was prompt and this was on webdev or studio?

this is roblox studio, and there was no prompt only engineering

balmy mist Apr 3, 2025, 9:19 PM

#

but technically nw is the next version of gemini lol

keen beacon Apr 3, 2025, 9:19 PM

#

there are alot of different systems

#

datastore, networking, etc

balmy mist Apr 3, 2025, 9:20 PM

#

but you used gemini?

keen beacon Apr 3, 2025, 9:20 PM

#

balmy mist but technically nw is the next version of gemini lol

by far out of all the models i used for LuaU yes

#

gemini is the best and i use it

#

but i use the on gemini official site paid plan i use lmarena for image gen (ui ideas, etc)

balmy mist Apr 3, 2025, 9:21 PM

#

interesting, i was thinking about using llms to make roblox and fortnite games, this will be op

keen beacon Apr 3, 2025, 9:21 PM

#

balmy mist interesting, i was thinking about using llms to make roblox and fortnite games, ...

its bad on UEFN, ive done commissions for big influencers on there not worth cuz all u can earn is off concurrent

#

go into roblox

#

its the biggest game on the planet now in its genre, and they have a developer exchange program

balmy mist Apr 3, 2025, 9:22 PM

#

wow how long you have been doing this for?

keen beacon Apr 3, 2025, 9:22 PM

#

30k robux which is fairly easy to get with a cash grab cookie cutter game is 105-150 usd paid by roblox, fully taxable
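
The per-Robux rate implied by those numbers can be worked out like this. The figures are only the ones quoted above (30k Robux for 105-150 USD) and the helper names are made up. This is not an official DevEx rate table.

```python
def implied_rate(usd: float, robux: int) -> float:
    """USD paid per single Robux, given a payout and the amount exchanged."""
    return usd / robux


def to_usd(robux: int, rate: float) -> float:
    """Convert a Robux amount to USD at a given per-Robux rate."""
    return robux * rate


if __name__ == "__main__":
    low = implied_rate(105, 30_000)
    high = implied_rate(150, 30_000)
    print(f"implied rate: {low:.4f} to {high:.4f} USD per Robux")
    print(f"100k Robux at the low rate: {to_usd(100_000, low):.2f} USD")
```

So the quoted range works out to roughly a third to half a cent per Robux, before taxes.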

keen beacon Apr 3, 2025, 9:22 PM

#

balmy mist wow how long you have been doing this for?

awhile

#

not using ai but making games on roblox a mean minute

balmy mist Apr 3, 2025, 9:23 PM

#

i heard you can make a lot of money in that

keen beacon Apr 3, 2025, 9:23 PM

#

one of these guys i used to know owns a smaller game in visits and he just bought a brand new audi off it

balmy mist Apr 3, 2025, 9:23 PM

#

like there is a one piece game right?

#

wow

keen beacon Apr 3, 2025, 9:23 PM

#

balmy mist like there is a one piece game right?

yea but those games are huge those guys are multi millionaires

balmy mist Apr 3, 2025, 9:24 PM

#

which games have you played? and what does it take to make games for roblox?

keen beacon Apr 3, 2025, 9:25 PM

#

balmy mist which games have you played? and what does it take to make games for roblox?

honestly you just need a understanding of the game building software

roblox itself is a super simplified engine, it's just understanding the fundamentals that'll help you debug, put scripts in the right places, etc

ive been playing roblox for awhile though im young so i grew up on this game

#

i play like hood games though where you can sell drugs and rob people

#

those types of game sell custom stuff for USD in their discords without roblox knowing so they make crazy amounts of money

balmy mist Apr 3, 2025, 9:26 PM

#

wait so your saying that roblox is a bigger game than fortnite?

keen beacon Apr 3, 2025, 9:26 PM

#

yes.. maybe 3x bigger

#

they even beat minecraft

#

roblox is by far the biggest game in all of man kind

balmy mist Apr 3, 2025, 9:26 PM

#

wow

keen beacon Apr 3, 2025, 9:26 PM

#

💀

balmy mist Apr 3, 2025, 9:26 PM

#

i always hear about it, but didnt know it got this big

keen beacon Apr 3, 2025, 9:26 PM

#


```
Active Players – It consistently has millions of daily active users, often surpassing even Minecraft and Fortnite in concurrent players.

Revenue – Roblox generates billions of dollars yearly, with players spending money on in-game purchases and Robux.

Content – Unlike traditional games, Roblox is a platform with millions of user-generated games, making its content library massive.

Playtime – Many users, especially kids and teenagers, spend hours daily on Roblox, making it one of the most engaging platforms.
```

keen beacon Apr 3, 2025, 9:27 PM

#

balmy mist i always hear about it, but didnt know it got this big

yea nah its crazy now

#

cuz its not only kid thing now

#

there is gambling, 17+ games, bars, voice chat, etc

#

od stuff

balmy mist Apr 3, 2025, 9:27 PM

#

have you thought about ways to add ai into roblox?

balmy mist Apr 3, 2025, 9:27 PM

#

keen beacon there is gambling, 17+ games, bars, voice chat, etc

lmaoo wtf

keen beacon Apr 3, 2025, 9:27 PM

#

balmy mist have you thought about ways to add ai into roblox?

they already have their own ai assistant in the engine that semi works but using gemini is better

#

you cant essentially add ai into the game building engine and make it do everything for u

#

u have to tell it like okay

#

"i wanna develop a tycoon, walk me thru it step by step"

balmy mist Apr 3, 2025, 9:28 PM

#

like you can use the gemini api in your scripts for npcs?

keen beacon Apr 3, 2025, 9:28 PM

#

you can but you'd have to build that out yourself

wheat onyx Apr 3, 2025, 9:28 PM

#

For those of you who are writing emails, articles, etc., do you still use gpt 4.0? Why/why not?

keen beacon Apr 3, 2025, 9:28 PM

#

wheat onyx For those of you who are writing emails, articles, etc., do you still use gpt 4....

no because it spams em's hyphen, i like deepseek more

balmy mist Apr 3, 2025, 9:28 PM

#

wheat onyx For those of you who are writing emails, articles, etc., do you still use gpt 4....

hmm yeah 4o got a new update and its been pretty good, but 3.7 is solid as well

keen beacon Apr 3, 2025, 9:28 PM

#

But i can"-"

And you can"-"

and they can"-"

keen beacon Apr 3, 2025, 9:29 PM

#

balmy mist hmm yeah 4o got a new update and its been pretty good, but 3.7 is solid as well

i agree image generation is crazy but only that

balmy mist Apr 3, 2025, 9:29 PM

#

you cant go wrong anymore with any SOTA model in terms of writing emails and stuff tbh

keen beacon Apr 3, 2025, 9:29 PM

#

keen beacon

chatgpt 4o new text update

balmy mist Apr 3, 2025, 9:30 PM

#

keen beacon

even the normal version of it is solid, 4o as a model is good now, like not better than 3.7 and gemini but i would use it 3rd

#

maybe deepseek as well but its up there

wheat onyx Apr 3, 2025, 9:30 PM

#

balmy mist you cant go wrong anymore with any SOTA model in terms of writing emails and stu...

I mean complex emails/articles, where you are attempting to get across the most information, concisely, in a logical flow, and in a readable format.

Some must be better than others

balmy mist Apr 3, 2025, 9:30 PM

#

keen beacon you can but you'd have to build that out yourself

that shouldnt be too bad

#

i think that might be the next phase of games, i have not seen anyone do it right yet tho

keen beacon Apr 3, 2025, 9:30 PM

#

U will have to pay for api each response tho

#

And its not guaranteed ur game will make money

#

What alot of people do is they make a regex

#

so if ur sentence contains the word happy or something, the npc responds with a happy pre-written response
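
A minimal sketch of that keyword approach, assuming a small hand-written rule list. The patterns, replies, and `npc_reply` name are invented for the example. In Roblox itself this would be written in Luau with its own string patterns rather than Python regexes.

```python
import re

# Hypothetical rules: first matching pattern wins, checked top to bottom.
RULES = [
    (re.compile(r"\b(happy|glad|great)\b", re.IGNORECASE), "Glad to hear it, traveler!"),
    (re.compile(r"\b(sad|upset|angry)\b", re.IGNORECASE), "Sorry to hear that. Need a hand?"),
]
DEFAULT_REPLY = "Hmm, tell me more."


def npc_reply(message: str) -> str:
    """Return the pre-written reply for the first rule whose keyword appears in the message."""
    for pattern, reply in RULES:
        if pattern.search(message):
            return reply
    return DEFAULT_REPLY


if __name__ == "__main__":
    print(npc_reply("I am so HAPPY today"))
    print(npc_reply("where is the shop?"))
```

The tradeoff versus calling an LLM API per message is that replies cost nothing at runtime, but the NPC can only react to keywords someone thought of in advance.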

keen fulcrum Apr 3, 2025, 9:31 PM

#

keen beacon And its not guaranteed ur game will make money

There is whop.com and skool.

keen beacon Apr 3, 2025, 9:31 PM

#

you can use AI to write out those responses and cover every possible response
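
A rough sketch of the regex idea described above, in Python for illustration (a Roblox game would use Luau string patterns instead). The keyword lists, replies, and the `npc_reply` helper are all made up for the example, not taken from any real game:

```python
import re

# Hypothetical keyword patterns paired with pre-written NPC replies.
# The first pattern that matches the player's message wins.
RESPONSES = [
    (re.compile(r"\b(happy|glad|great)\b", re.IGNORECASE), "Glad to hear it, traveler!"),
    (re.compile(r"\b(sad|upset|tired)\b", re.IGNORECASE), "That sounds rough. Rest here a while."),
    (re.compile(r"\b(hi|hello|hey)\b", re.IGNORECASE), "Hello there!"),
]

# Used when no keyword matches, so the NPC always says something.
FALLBACK = "Hmm, I am not sure what you mean."


def npc_reply(message: str) -> str:
    """Return the canned reply for the first keyword pattern found in the message."""
    for pattern, reply in RESPONSES:
        if pattern.search(message):
            return reply
    return FALLBACK


if __name__ == "__main__":
    print(npc_reply("I am so happy today"))
    print(npc_reply("what is the weather like"))
```

The `\b` word boundaries keep "unhappy" from triggering the happy reply. The pre-written replies themselves could be drafted with an AI model ahead of time, so the game makes no API calls while it runs.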

balmy mist Apr 3, 2025, 9:31 PM

#

wheat onyx I mean complex emails/articles, where you are attempting to get across the most ...

thats where prompting comes in, you can get the model to output whatever you want, just gotta guide it right, but if i was u i would use 3.7 sonnet for stuff like that gemini as well since its free

keen beacon Apr 3, 2025, 9:31 PM

#

keen fulcrum There is whop.com and skool.

Whats that

#

?

#

GPT wrapper?

balmy mist Apr 3, 2025, 9:32 PM

#

keen beacon U will have to pay for api each response tho

you will only have to pay if your game is being used tho

wheat onyx Apr 3, 2025, 9:32 PM

#

balmy mist thats where prompting comes in, you can get the model to output whatever you wan...

I'm surprised that seems to be the majority of the responses I'm getting here

keen fulcrum Apr 3, 2025, 9:32 PM

#

No monthly subscription communities / courses
services

visual turret Apr 3, 2025, 9:32 PM

#

https://www.testingcatalog.com/google-plans-new-gemini-model-launch-ahead-of-cloud-next-event/

TestingCatalog

Google plans new Gemini model launch ahead of Cloud Next

Discover the latest updates on Gemini, including potential new model launches and experimental tools. Stay tuned for exciting features like scheduled prompts and video generation.

keen beacon Apr 3, 2025, 9:32 PM

#

visual turret https://www.testingcatalog.com/google-plans-new-gemini-model-launch-ahead-of-clo...

No waaaaaay

#

Video generation to Gemini

visual turret Apr 3, 2025, 9:32 PM

#

'full' gemini 2.5 pro may launch next week

#

So it might be in testing

keen beacon Apr 3, 2025, 9:33 PM

#

Lets gooo

visual turret Apr 3, 2025, 9:33 PM

#

In lmarena

keen beacon Apr 3, 2025, 9:33 PM

#

Hopefully paid users get early access

visual turret Apr 3, 2025, 9:33 PM

#

Agreed

keen beacon Apr 3, 2025, 9:33 PM

#

oh shoot lmarena gets it early?

balmy mist Apr 3, 2025, 9:33 PM

#

so if its being used then you will make money and counteract the api costs, if no one is using your game then no api cost, but I see what you mean, then it might be best to use the cheapest model, if we can get 2.5 for cheap i would cry
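
A back-of-the-envelope sketch of the pay-per-response math. Every number here (token counts, prices per million tokens, player counts) is a placeholder assumption, not real pricing for any model:

```python
def cost_per_reply(input_tokens: int, output_tokens: int,
                   input_price_per_m: float, output_price_per_m: float) -> float:
    """API cost in dollars for one NPC reply, given prices per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


def monthly_api_cost(players: int, replies_per_player: int, per_reply: float) -> float:
    """Total monthly API bill; it is zero when nobody plays."""
    return players * replies_per_player * per_reply


if __name__ == "__main__":
    # Hypothetical: 500 prompt tokens and 100 reply tokens per NPC response,
    # priced at $1 per million input tokens and $4 per million output tokens.
    per_reply = cost_per_reply(500, 100, 1.0, 4.0)
    bill = monthly_api_cost(players=1000, replies_per_player=50, per_reply=per_reply)
    print(f"cost per reply: ${per_reply:.4f}")
    print(f"monthly bill:   ${bill:.2f}")
```

The bill scales with usage, so the open question is whether revenue per player is higher than `replies_per_player * per_reply`.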

#

wait whatttt

visual turret Apr 3, 2025, 9:33 PM

#

keen beacon oh shoot lmarena gets it early?

I was just guessing that but the rest is from the website

balmy mist Apr 3, 2025, 9:33 PM

#

next week is going to be big

#

especially with nightwhisper and stargazer plus video gen wow

keen beacon Apr 3, 2025, 9:34 PM

#

balmy mist so if its being used then you will make money and counteract the api costs, if n...

You can do that, just enable paid access and plan out logistics

balmy mist Apr 3, 2025, 9:34 PM

#

yeah but you get rate limited

keen beacon Apr 3, 2025, 9:34 PM

#

There are games in roblox like that, but its to talk to anime girl roblox NPC

balmy mist Apr 3, 2025, 9:34 PM

#

you cant have a game on that with those limits

keen beacon Apr 3, 2025, 9:34 PM

#

and u can donate to make her happy or change her mood

#

its like 1:1 chat

visual turret Apr 3, 2025, 9:34 PM

#

balmy mist so if its being used then you will make money and counteract the api costs, if n...

2.5 pro is said to cost as much as deepseek r1

#

If it does go paid

balmy mist Apr 3, 2025, 9:34 PM

#

hmm i guess thats not bad, but still not ideal for a npc game

keen beacon Apr 3, 2025, 9:35 PM

#

balmy mist hmm i guess thats not bad, but still not ideal for a npc game

nope not a full out one

keen beacon Apr 3, 2025, 9:35 PM

#

balmy mist hmm i guess thats not bad, but still not ideal for a npc game

but definitely look into roblox studio

balmy mist Apr 3, 2025, 9:35 PM

#

i will thanks for the tips

keen beacon Apr 3, 2025, 9:35 PM

#

that idea is complicated, you don't need to think too hard. make a money grab game. trust me.

balmy mist Apr 3, 2025, 9:35 PM

#

you made money from it?

keen beacon Apr 3, 2025, 9:35 PM

#

yes thousands of USD from commissions

#

this is my first time attempting a game by myself cuz i have funds for ads

keen beacon Apr 3, 2025, 9:37 PM

#

balmy mist you made money from it?

if you do social media just intern for me

#

we could go 50/50 in earnings on this project

#

ill supply funds for marketing + im in freshman year of marketing degree

balmy mist Apr 3, 2025, 9:38 PM

#

lol, im a software dev dont know much about social media

keen beacon Apr 3, 2025, 9:38 PM

#

just need a bunch of clickbait content

keen beacon Apr 3, 2025, 9:38 PM

#

balmy mist lol, im a software dev dont know much about social media

rip

balmy mist Apr 3, 2025, 9:38 PM

#

but i am interested in learning how to make these games and making a good ai workflow

keen beacon Apr 3, 2025, 9:38 PM

#

https://create.roblox.com/docs/luau

Luau | Documentation - Roblox Creator Hub

Luau is the scripting language creators use in Roblox Studio.

#

look over documentation

balmy mist Apr 3, 2025, 9:38 PM

#

have you used MCP servers for them yet?

keen beacon Apr 3, 2025, 9:38 PM

#

should be easy for you

keen beacon Apr 3, 2025, 9:39 PM

#

balmy mist have you used MCP servers for them yet?

no idea what that is but you can't self host roblox servers they provide all of that including optimization for free unlimited

#

if its a api you can use endpoint

#

and they have HttpService for http requests

#

MCP servers are specialized servers that allow AI models to interact with various data sources and tools through the Model Context Protocol (MCP).
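
For a sense of what that looks like on the wire: MCP messages are JSON-RPC 2.0, and a client invokes a server tool with the `tools/call` method. A minimal sketch of building such a request follows. The tool name `set_part_color` and its arguments are hypothetical, invented for the example, and not part of any real Roblox or MCP server:

```python
import json


def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request asking an MCP server to run one of its tools."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


if __name__ == "__main__":
    # Hypothetical tool and arguments, for illustration only.
    request = make_tool_call(1, "set_part_color", {"part": "Door", "color": "red"})
    print(json.dumps(request, indent=2))
```

A real server would also need the initialization handshake and a transport (stdio or HTTP); this only shows the shape of a single tool request.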

balmy mist Apr 3, 2025, 9:40 PM

#

imma cook this weekend and get back to you, ill add you and show you what i come up with this weekend, if i can make this mcp server it would make building games in roblox cake

keen beacon Apr 3, 2025, 9:40 PM

#

Yes its very possible

keen beacon Apr 3, 2025, 9:40 PM

#

balmy mist imma cook this weekend and get back to you, ill add you and show you what i come...

https://devforum.roblox.com/t/expanding-assistant-to-modify-place-content-beta/3107464

Developer Forum | Roblox

Expanding Assistant to Modify Place Content [Beta]

Hey Creators, Today, we are excited to announce we’re expanding Assistant’s capabilities to perform a broad range of actions in Studio. Assistant can now help you modify the DataModel in order to automate some of the repetitive parts of your work. For example, it can modify properties in bulk, swap items, or restructure your DataModel. This...

#

read this

balmy mist Apr 3, 2025, 9:42 PM

#

keen beacon https://devforum.roblox.com/t/expanding-assistant-to-modify-place-content-beta/3...

thank you, i will dive into this tonight

balmy mist Apr 3, 2025, 9:43 PM

#

torn mantle if it didnt work then ask it to use this : https://png.pngtree.com/png-clipart/2...

https://3000-ih9eb6fqel78t50wpu4gc-ae4bd0ef.e2b-foxtrot.dev

#

i managed to get it back, not exactly how we had it before but something

keen beacon Apr 3, 2025, 9:49 PM

#

how yall building the sandbox websites

#

self host?

balmy mist Apr 3, 2025, 9:50 PM

#

nahh from webdev arena

#

https://web.lmarena.ai/

keen beacon Apr 3, 2025, 9:50 PM

#

ty

balmy mist Apr 3, 2025, 9:51 PM

#

you prompt it then it puts two llms against each other and then you copy the link thats in the block section

#

night whisper is so good

keen beacon Apr 3, 2025, 9:51 PM

#

is direct chat possible?

#

or nah

balmy mist Apr 3, 2025, 9:51 PM

#

it one shotted the pokemon prompt while no other model could do it

#

nahh but you can just keep prompting after it does the first generation

#

and you can even say forget the previous prompts and then put a new prompt in the same battle session

#

which essentially acts as a new chat with the two models you are comparing

keen beacon Apr 3, 2025, 9:58 PM

#

balmy mist nahh but you can just keep prompting after it does the first generation

bet ty

balmy mist Apr 3, 2025, 10:04 PM

#

this is what gemini 2.5 made:
https://3000-iela363fcru8h9mhaomda-2eb11a2e.e2b-foxtrot.dev/

#

this is what whisper made the second go around: https://3000-it574jggc7ofsflef62j1-90451382.e2b-foxtrot.dev/

#

yall see the difference between the models?

#

clear difference

torn mantle Apr 3, 2025, 10:14 PM

#

was nightwhisper ever on lmarena?

balmy mist Apr 3, 2025, 10:14 PM

#

nahh does webdev

#

yo you see the recent gens?

#

i wish you could have tried it

#

but look guys

#

https://x.com/OpenRouterAI/status/1907870610602275203

OpenRouter (@OpenRouterAI) on X

Excited to announce our first-ever “stealth” model... Quasar Alpha 🥷

It’s a prerelease of an upcoming long-context foundation model from one of the model labs:

- 1M token context length
- specifically optimized for coding, but general-purpose as well
- available for free

#

i think this might be nightwhisper

#

this is exciting af

#

imma test it in vsc now

#

it still could be but people saying its from open ai

ancient reef Apr 3, 2025, 10:17 PM

#

people think its from oai

balmy mist Apr 3, 2025, 10:17 PM

#

but 1 mill context has to be google

#

this is so weird lol

eager mica Apr 3, 2025, 10:18 PM

#

https://arxiv.org/abs/2501.15383

arXiv.org

Qwen2.5-1M Technical Report

We introduce Qwen2.5-1M, a series of models that extend the context length to 1 million tokens. Compared to the previous 128K version, the Qwen2.5-1M series have significantly enhanced long-context capabilities through long-context pre-training and post-training. Key techniques such as long data synthesis, progressive pre-training, and multi-sta...

#

What if...?

#

Qwen... Quasar...

ancient reef Apr 3, 2025, 10:19 PM

#

taken from OR discord:

#

someone did a personal benchmark on it too:

raven void Apr 3, 2025, 10:25 PM

#

so it's not 2.5 flash?

brittle tiger Apr 3, 2025, 10:34 PM

#

balmy mist https://x.com/OpenRouterAI/status/1907870610602275203

It's basically confirmed openai. they've been removing big tells it is them as they've been spotted today

keen beacon Apr 3, 2025, 10:34 PM

#

balmy mist https://x.com/OpenRouterAI/status/1907870610602275203

why they name it quasar 💀

brittle tiger Apr 3, 2025, 10:34 PM

#

hoping someone will benchmark the 1M context on evals knowing that

keen beacon Apr 3, 2025, 10:34 PM

#

quasar is a rat

#

ancient reef Apr 3, 2025, 10:35 PM

#

https://huggingface.co/silx-ai/Quasar-1.5-Pro
xp
someone suggested it. cool there's another llm already named quasar tho

silx-ai/Quasar-1.5-Pro · Hugging Face

balmy mist Apr 3, 2025, 10:37 PM

#

this might be an L launch by them, NW is most likely going to be 1 mill context and it shits on every other model

#

OA always tries to do this to google lol

#

but im in love with NW

#

no way this is true lol

#

can someone test this lol

timber kiln Apr 3, 2025, 10:39 PM

#

They couldn't score that sh1t even if the answers were in context of llm while testing

balmy mist Apr 3, 2025, 10:39 PM

#

lmaoooo fr

#

lol

#

so this is o3? Open ai is so confusing

#

they just said they delaying their next release and then heard about google release next week lol

#

now they drop this

#

ahh dang what is your prompt?

#

that happens to me sometimes

wooden crescent Apr 3, 2025, 10:42 PM

#

balmy mist no way this is true lol

FAKE

balmy mist Apr 3, 2025, 10:43 PM

#

nw was buggin for me earlier but rn its cooking for me

#

it comes up every other battle now

#

post voting

#

and once my context is up for the session

#

i look for it again

#

but with a bigger prompt

wooden crescent Apr 3, 2025, 10:44 PM

#

guys can u use nightwhisper

balmy mist Apr 3, 2025, 10:44 PM

#

hmm i would say it depends on how big your tasks are

#

like i went through 4 iterations of my pokemon sim and on the 5th one it did not work

#

so i would say 4-5 depending on the ask if not more

wooden crescent Apr 3, 2025, 10:45 PM

#

how can i see it there

balmy mist Apr 3, 2025, 10:45 PM

#

you gotta prompt it and then vote on which ever looks better

#

then it reveals the model names

#

then you can keep talking to it post vote

#

you can either say forget the old prompts and give it a new one to start, or continue

wooden crescent Apr 3, 2025, 10:46 PM

#

im now in webarena

#

just writing something

#

then wait

#

?

balmy mist Apr 3, 2025, 10:47 PM

#

yo gemini is so bad look at this lol:

#

https://3000-iffpgfbydtp0busrhei3r-90451382.e2b-foxtrot.dev

wooden crescent Apr 3, 2025, 10:47 PM

#

I heard its better than 2.5 pro

balmy mist Apr 3, 2025, 10:47 PM

#

compared to this:
https://3000-idhb3rqyyzv6iuuu0gsr8-87045d8f.e2b-foxtrot.dev

#

way better

#

look at these examples

#

it was even better before but i updated the sim a lil

#

it had a better background and animations but that was a previous session

balmy mist Apr 3, 2025, 10:48 PM

#

balmy mist https://3000-iffpgfbydtp0busrhei3r-90451382.e2b-foxtrot.dev

even on a new session it one shotted it, while gemini did this

wooden crescent Apr 3, 2025, 10:48 PM

#

there will never be an ai to code 100% correctly

lime coral Apr 3, 2025, 10:48 PM

#

balmy mist no way this is true lol

GPQA has a lot of mislabeled answers. If a model gets 96% we can start to ask questions

wooden crescent Apr 3, 2025, 10:49 PM

#

i still write with webarena nothing comes out

balmy mist Apr 3, 2025, 10:50 PM

#

wooden crescent i still write with webarena nothing comes out

what you mean?

#

screenrecord

wooden crescent Apr 3, 2025, 10:50 PM

#

for the nightwhisper

balmy mist Apr 3, 2025, 10:50 PM

#

i can screenshare real quick in the arena playground call thing

wooden crescent Apr 3, 2025, 10:51 PM

#

balmy mist Apr 3, 2025, 10:51 PM

#

you have to vote

wooden crescent Apr 3, 2025, 10:51 PM

#

ah okey

#

what now

#

can i just write

balmy mist Apr 3, 2025, 10:51 PM

#

you see my screen

wooden crescent Apr 3, 2025, 10:51 PM

#

anything

balmy mist Apr 3, 2025, 10:51 PM

#

yeah

wooden crescent Apr 3, 2025, 10:52 PM

#

which one is better

#

2.5 or nightwhisper

balmy mist Apr 3, 2025, 10:53 PM

#

night whisper

wooden crescent Apr 3, 2025, 10:53 PM

#

is it sota

balmy mist Apr 3, 2025, 10:53 PM

#

u see my screen right?

wooden crescent Apr 3, 2025, 10:53 PM

#

yes

#

how u add gemini 2.5

#

in the left side

balmy mist Apr 3, 2025, 10:53 PM

#

i got lucky with gemini

wooden crescent Apr 3, 2025, 10:53 PM

#

i have claude on left on gemini

balmy mist Apr 3, 2025, 10:53 PM

#

its random

wooden crescent Apr 3, 2025, 10:53 PM

#

k

balmy mist Apr 3, 2025, 10:54 PM

#

then start a new battle

#

until you get night whisper

wooden crescent Apr 3, 2025, 10:54 PM

#

its sota

#

?

balmy mist Apr 3, 2025, 10:55 PM

#

yrah

#

yeah

wooden crescent Apr 3, 2025, 10:55 PM

#

by me is round over

#

everytime i write

balmy mist Apr 3, 2025, 10:56 PM

#

this is difference between gemini and nightwhisper:

#

the big one is gemini and the two on the right is from NW different sessions

balmy mist Apr 3, 2025, 10:56 PM

#

wooden crescent by me is round over

screenshot

wooden crescent Apr 3, 2025, 10:57 PM

#

now nightwhisper gone

balmy mist Apr 3, 2025, 10:58 PM

#

damn you gotta keep playing with it bro

#

wait what??

#

no way

#

what you mean screenshots?

#

oh i see what you mean

wooden crescent Apr 3, 2025, 11:04 PM

#

#

look

#

round over

balmy mist Apr 3, 2025, 11:08 PM

#

wooden crescent

now just keep prompting it

#

this is genius

#

im def gonna use this now

#

sonnet is better than stargazer based on this result

#

generally which is obvious

#

but at least we know how to prompt star and night

#

i cant find nw anymore

#

i keep getting stargazer

#

yeah i love nw

#

i been using this prompt now: who are you? and which company do you belong to?

#

and the models snitch themselves lol

#

you can basically game the system tho?

#

lol jungle chest

#

yeah i love that movie