#general

1 messages Β· Page 10 of 1

balmy mist
#

yeah then you had zathura

#

oh yeah i saw that one as well

#

they made two movies of that new one right

#

the classic is still best

#

it was actually scary to me as a kid lol

#

it makes sense that nightwhisper is good with ui because of the system prompt

#

most likely it will

#

i got something diff

#

yeah thats why you gotta unplug, we def moving toward a world like that

#

they both ignored the company part

#

and i thought night was just a coding agent

#

but it seems they updated

#

thats maybe why i couldnt access it for 10 mins straight

#

"who are you? and which company do you belong to?"

#

yeah

#

but before i was getting normal results

#

try your old prompt on a fresh sess

#

yeah your prompt

#

its interesting they chose react at the framework to train the models on

#

bro you are so smart to start the prompt session that way

#

"who are you"

#

it leaves room for another prompt without having to say clear previous prompt

#

and it feels like a chatbot like this

#

lol its an easy way to game the system in a way, companies probably do it when they first launch

#

yeah

keen beacon
#

ijed

#

did u figure out what the nightwalker model is

balmy mist
#

its google model

keen beacon
#

no way

balmy mist
#

the qusar is open au

keen beacon
#

oh the quasar one

balmy mist
#

yeah quasar is def open ai

keen beacon
#

oh gpt

#

Open AI

balmy mist
#

yeah it could be o3 lol

#

it kinda has to be

#

yeah o3 coder

#

liek the full model

keen beacon
#

how hard is it to make a LLM

balmy mist
#

not min

#

if you got money not that hard

#

but to make a good one hard af

keen beacon
#

what type of data do they use

#

just plaintext

#

of websites

brittle tiger
keen beacon
#

or do you have to literally sit down and train it like a dog

keen beacon
balmy mist
sterile dust
#

Is 24k gold a updated spider?

eager mica
#

Maybe stradale too, if you're lucky (which appeared to be an inferior model, though).

alpine coral
#

i don't think so (it's way too fast to be a thinking model), but after a couple of quick tests.. damn it's good.. it's nearly getting the same score as o3-mini on a quiz i'm using atm.. mainly tests comprehension / verbal reasoning

#

it says its from oai; feels like it is too (gets character counting questions consistently right that all new oai models.. i think it uses oai's tokenizer at the very least)

#

admitedly the 1m context window though is the bit that suggests perhaps it isn't though

keen beacon
#

yes so ai can replace industry jobs

north vale
#

on priors that would just be google / nightwhisper?

#

basically the coding focused version of 2.5 pro or something

alpine coral
#

yeah the context window points to that

north vale
alpine coral
#

but curiously, current google models always accurately self-idenify as google models (and same for oai models). quaser-alpha says its from oai, and responds to the question in a very similar way to how to oai models do

raven void
#

Is it their upcoming open source model

keen beacon
#

Doubt they'd release a 1m ctx model oss

balmy mist
north vale
#

deepseek wouldn't do this they'd just release the model

balmy mist
keen beacon
#

That model is a scam

balmy mist
keen beacon
#

Unrelated to this

north vale
north vale
balmy mist
#

the inference is so fast, its hard to believe its a reasoning model

keen beacon
#

Indicating contamination because a model can't get that high because of wrong questions etc. I think the scores are completely made up

balmy mist
#

yeah might be a scam

#

this is what quasar said about its system card lol:
Thanks for sharing the detailed info about Quasar-1.5-Pro, developed by SILX AI. It's a highly specialized reasoning model, incorporating some innovative techniques, such as:

Key Innovations

  • Token Temperature Mechanism:
    Distinguishes 'hot' (important) tokens from 'cold' (less important) tokens during training, dynamically weighting input components. This helps the model focus on the most relevant parts of a problem, improving interpretability and accuracy.

  • Guided Sequence of Thought (GSoT):
    Encourages systematic, structured step-by-step reasoning rather than jumping to conclusions, likely boosting performance in complex tasks.

Performance Highlights

On benchmarks including Math500 and AIME, Quasar-1.5 shows exceptionally strong reasoning capabilities, outperforming strong baselines like Qwen-2.5-32B-Instruct with a wide margin, especially in difficult reasoning and coding tasks.

Summary

Quasar-1.5-Pro appears to be purpose-built for advanced reasoning, outperforming many other large models in math, code generation, and fact-based reasoning. It leverages specialized training approaches (Token Temperature, GSoT) to enhance task-specific focus and multi-step problem-solving.


In Context of Your Earlier Question:
Since Quasar-1.5-Pro is explicitly designed as a reasoning model, it incorporates even more advanced mechanisms for reasoning than a general language model like myself (GPT-4). While GPT-4 is capable of substantial reasoning, models like Quasar are optimized explicitly to improve on reasoning-intensive tasks.

If you need powerful step-by-step problem solving in math or coding, models like Quasar-1.5-Pro might be especially effective!

north vale
#

could it be xai?

alpine coral
# balmy mist didnt deepseek say open ai wen it first launched?

yeah i wouldn't be surprised if R1 still does say it's from oai (lots of models do) oai, google, anthropic, xAI (+ perhaps mistral and cohere and a few others) seem the only labs that to invest a bit of time in post training to get the models to consistently accurately self-identify..

north vale
#

prolly not

keen beacon
#

It's likely to be openai

alpine coral
#

oh i didn't mean to attach the ss to that reply

balmy mist
#

lol

#

what is Sila AI?

alpine coral
#

but nonetheless, yeah i agree (oai seems most likely by far for me at this stage)

north vale
#

and how good is it compared to 2.5?

#

to yall

#

ive asked it a few questions it did well in but i can't rly tell

balmy mist
#

its a middle eastern company

alpine coral
north vale
keen beacon
north vale
#

lol never heard of them

#

oh ok

balmy mist
#

imma stop using that model lol

alpine coral
#

stargazer seemed farily decent, but stil behind 2.5

balmy mist
#

nightwhisper is the best model right now

#

and yeah only in webdev

alpine coral
#

yeah i was scrolling through earlier - seems v strong

balmy mist
#

u seen the pokemon results?

#

they trained nw on being a react dev and it seems like it focuses on being really good at UI/UX

alpine coral
#

tbh as i don't really know anything about web dev (other than what looks aesthetically pretty / pleasing), i don't spend much time reviewing all the screenshots ha

balmy mist
#

im trying to get it to make a yu gi oh game

alpine coral
#

more interested in what pepole who know about the domain have to say - and sll seems overwhlemingly positive

balmy mist
#

i mean just test it out yourself

#

thats what i did

#

ppl talked about it so i tried myself

#

and i was blown away

#

gemini still does good at coding but i think NW is specifically trained to be good at UI/UX and react which makes its output looks so good, like it found the sprites for these gens

#

i did not tell it to get the image for the pokemon

alpine coral
#

yep all good - i get the gist of it all πŸ™‚

balmy mist
#

it just did it, while gemini did not grab the image

alpine coral
keen fulcrum
#

What is your stance on Meta using pirated results for training?

thorny drum
#

do people think quasar is not google?

#

1M context length + space themed name definitely hint google

north vale
#

it says it's openai, google models usually say they are googole models

#

that's like the main piece of evidence against

raven void
#

Miss the old days when everything other than Claude 3.5 sucked

alpine coral
#

i'd say ~70% of the questions i've come up with myself (though invariably they are derivatives of some pre-existing riddle), so it's not really possible for them to simply recall the answer

keen beacon
#

is quasar alpha anonymous chatbot huh

#

it has the same You are trained on data up to October 2023. appendix

alpine coral
# alpine coral i'd say ~70% of the questions i've come up with myself (though invariably they a...

just by way of example, stargazer on the left, 24_karat_gold on the right. it's quite literally just a matter of accurately comprehending the scenario and explicit question (distinguishing b/w composing vs sending the letter (vs when it will it arrive at their grandma).. stronger models generally pick up on it and get it right, weaker ones jump to assumptions based on the various details which are mostly extraneuous to the actual question

keen beacon
#

that i found bizarre on anonymous chatbot

alpine coral
keen beacon
#

did u test anonymous chatbot btw?

alpine coral
#

need to get back to my actual work though instead of playing around in the arena (/aistudio ha)

alpine coral
keen beacon
#

the new one came out surprisingly fast after the latest chatgpt 4o latest revision (the previous anonymous chatbot)

alpine coral
#

ikr

#

the tempo has really shifted a few notches the past few weeks hasn't it

keen beacon
#

yeah its insane

keen beacon
#

it has the same cut off as chatgpt 4o latest indicating different pretraining in line with chatgpt 4o latest

#

it seems

#

oh this is the new 4o or something lol. theyre about to announce it fr

#

oh well that mystery is over i guess

north vale
#

yeah seems likely

#

is it also rly fast? they talked about it getting faster

balmy mist
#

is quasar not that middle eastern company?

#

is that confirmed?

#

nvm this is a diff model

keen beacon
balmy mist
#

it came out last year

keen beacon
#

im pretty sure its:

  • from openai
  • is the new anonymous chatbot
  • chatgpt 4o latest lineage (based on pretraining knowledge)
#

openai should try hrader at something like this tbh

balmy mist
#

4o is so good now, its like what deepseek v3.1 is trying to be

#

its literally reasoning in the inference output

#

so you are probably right

keen beacon
#

oh its probably already live in chatgpt 🀣

#

i should check

balmy mist
#

nahh 4o has been good since they updated it last week with the image stuff

#

maybe they did another update

#

but 4o is really good now, i wanna test out my pokemon prompt on it lol

plain zinc
#

Why isn't nightwhisper in the chat arena?

#

in lmarena

#

Or is he there? And is he a VERY rare bastard?

balmy mist
#

anybody got some reasoning questions for me to ask 4o and quasar?

balmy mist
alpine coral
# keen beacon oh this is the new 4o or something lol. theyre about to announce it fr

yeah i was thinking maybe their open weights model that they said they are testing / preparing to release... but it seems too strong for that to be the case (also the 1m context window.. though that isn't necessarily inconsistent with it being open weights - would actually kinda make sense like oai wouldn't have to pay for 1m token processing/inference if people are self-hosting it.. but yeah anyway.. it seems too performant and fast to be something they'd just be giving away..)

#

so in short, i also lean to it being yet another upgraded version of 4o

#

faster and with 1m context window (and seemingly also more performant potentially)

#

perhaps it isn't multimodal.. just a pure text model

keen beacon
#

but they have beenn working on 4o native image gen on a separate model (4o based model) compared to the chatgpt 4o latest line

#

it should be able to take images though, as we can see in chatgpt

#

this should be in the same lineage/line of chatgpt 4o latest (post december)

alpine coral
#

yeah agree

keen beacon
#

they havent released benchmark results AT ALL for the new chatgpt 4o latest models that were continued pretrained past december. and with anonymous chatbot being the same, it seems. im like 99.99% certain this is a formal launch of the new 4o

alpine coral
#

and context window length is kinda artbritrary right? like there is nothing technically preventing the 4o family from having 1m (or whatever) context windows - it a very simplified level

keen beacon
#

google anon models are just way more fun to figure out tbh

balmy mist
#

i just test 4o and its trash on simple bench lol

#

got 1/10

#

but nightwhisper tied with gemini 2.5 with 5/10

#

yeah which ia wild

#

i like that so we dont have to wait for them lol

keen beacon
#

bruh its not a secret they iterlaly said it in their report lol

#

this is not a particularly new innovation tbh. everyone does it now

#

to different degrees

#

yes

night trout
#

PSA: I think nightwhisper is generating code so long (or thinking so long?) it times out.

#

I'm finding with complex problems it ends up not displaying the solution quite often, but it doesn't seem to be due to code errors. The whole LMArena sandbox for it just goes black.

balmy mist
#

yeah thats been happening to me

night trout
#

Example: When given a prompt to create what is effectively super monkey ball:

#

Notice even the <> Code and [ ] Block buttons aren't displayed, so I believe the sandbox itself is failing, not the model.

modern sable
#

sounds like it is yes

leaden palm
#

the model responds, and if the model responds with a code embed the sandbox shows up

#

in this case the model didn't respond at all

modern sable
#

direct response to deepseek

#

trying to reclaim the crown

night trout
leaden palm
#

yup

#

didn't respond at all

#

(perhaps you're confused because your screen is so small you never realized models typically provide extra text before creating a sandbox, so you thought they only respond with sandboxes, and can't understand my point about there being no extra text)

balmy mist
#

put in the prompt again

#

thats what i do

#

send me your prompt

night trout
# leaden palm didn't respond at all

It only happens with nightwhisper, and at least anecdotally, seems to happen on prompts with considerable complexity. Totally open to the idea that nightwhisper is broken but this feels very much like a timeout of some sort. It'll sit there 'generating' for a long time before it goes black.

leaden palm
#

well both are true

night trout
leaden palm
#

the model is broken because of a timeout

night trout
leaden palm
night trout
balmy mist
#

i love nightwhisper, imagine it in cursor or roo code

night trout
#

Killed it at my airline seat selector test, check this out:

#

Sonnet was a mess (both 3.5 / 3.7), Gemini 2.0 Pro / Thinking were barely functional, Gemini 2.5 was mostly there but had off-by-one errors, and missed that airlines sometimes skip rows.

Nightwhisper was flawless, and imo had the best aesthetics too.

balmy mist
# night trout

yeah i would say gemini 2.5 is right behind it some areas, but for most night blows them all away

#

quasar is not bad

night trout
balmy mist
#

gets 4/10 on simple bench which SOTA gets 5/10(nightwhisper and gemini) but claude also gets 4/10 but quasar is faster than all of them so that says a lot

#

quasar is on open router

#

quasar is gonna be my go to model since its just as good as claude but fast as hell lol and 1 mill context

#

nightwhisper is still SOTA though

#

lmaoo no way

#

that would be a big ass troll

night trout
#

Oh interesting. I'll check it out.

balmy mist
#

but i can see elon doing

#

that

#

lol

#

it makes sense for quasar to be openAI tho, its probably gpt5

#

or a mini of gpt5

keen beacon
#

chill out its just 4o πŸ™ˆ

balmy mist
#

true but you cant always trust the model

#

why do stealth if we can easily ask it lol

keen beacon
balmy mist
#

tdlr?

night trout
#

Here's the prompt if you want to try:

Generate an interactive airline seat selection map for an Airbus A220. The seat map should visually render each seat, clearly indicating the aisles and rows. Exit rows and first class seats should also be indicated. Each seat must be represented as a distinct clickable element and  one of three states: 'available', 'reserved', or 'selected'. Clicking a seat that is already 'selected' should revert it back to 'available'. Reserved seats should not be selectable. Ensure the overall layout is clean, intuitive, and accurately represents the specified aircraft seating arrangement. Assume the user has two tickets for economy class. Use mock data for initial state assigning some seats as already reserved. 
balmy mist
#

i can give you the code, can you run it on your end?

#

cause i would have to go to vsc to run it which im kinda lazy lol

night trout
#

For an advanced version: Exit rows, washrooms, wing locations, and first class seats should also be indicated.

#

Sure, I'm too lazy to config openrouter right now hahaha

balmy mist
#

lmaooo

#

like the fastest model thats why i dont think its open ai

#

only google can make their models this fast tbh, unless you using groq

keen beacon
#

march 4o latest is 180 token/sec

balmy mist
#

wow craig thanks

#

what did you use to run it?

#

from where?

#

using quasar?

#

you used vsc?

#

thank you

#

i think if you attach agents to quasar it can be really good

#

true

#

and with the speed of quasar you can do a lot

#

a fast reasoning-foundation model with 1 mill context is cracked

night trout
balmy mist
#

yeah

#

what do we call these models?

#

like deepseek v3.1, 4o and quasar?

#

they are foundation models with COT in their inference output

#

i always got confused by that

#

so instruct is like optimized for chat right?

#

while finetuned is the next level of that? like finetuning on COT?

night trout
#

Instruct has chat fine-tuning embedded into it, yeah

#

Non-instruct models are just called pre-trained afaik.

balmy mist
#

ahh okay thank you

keen beacon
#

quasar is identical to anonymous chatbot which has always been a chatgpt 4o latest revision.
quasar has knowledge up to june 2024. same as chatgpt 4o latest. (in fact, it has even more than chatgpt 4o latest)
they haven't formally launched the new cpt'd 4o. no official benchmark results despite major leaps in performance.
etc (read my other comments) etc im like 99.9999% certain now lol

#

its a lie

balmy mist
#

you saying that its the new version of 4o? cause the current 4o gets 1/10 on simple bench while quasar gets 4/10

keen beacon
#

major tip off that quasar and anonymous chatbot is the same

#

because they appended the same thing

balmy mist
#

also i thought open ai didnt believe in open source, so they changing they tune again lol? cause it was originally supposed to be open source

keen beacon
#

they added You are trained on data up to October 2023. to the end of both quasar and anonymous chatbot's system prompts

#

despite both having a june 2024 cut off

balmy mist
#

yeah a few hours ago

keen beacon
#

nah its such a random detail no one notices

#

all the things just add up if ur paying attention

balmy mist
#

so does this mean that they will open source more of their models?

keen beacon
#

it will be released soon

#

exactly

#

they put anonymous chatbot right after it its pretty crazy

#

so they want the lmarena results fast

#

yes the new 4o is 180 tok/s

balmy mist
#

i gave up my o1 pro sub, to expensive

#

omgg you are right

#

sam did post about that today

#

yeah its def open ai

#

good job wild

barren prairie
balmy mist
#

how so? i used to think that to

#

but now with gemini 2.5 pro and ther multiple ide and extensions you can use for way less makes it redundant

keen beacon
#

it does know events in dec 2024. and they claim its jan 2025

balmy mist
#

i feel that

#

really i have noticed the opposite, maybe it depends on the usecase

#

in reasoning you have a point tho

thorny drum
#

we'll finally be able to finance llm convos

north vale
balmy mist
#

@deep adder and with the way openai is doing their ui now, with being able to pick the levels of reasoning, quasar really might be the base, so imagine being able to apply reasoning on that model like medium or high(based on slider)

#

we might be underestimating this model

#

yeah hoepfully we find out soon enough but it seems things are really picking up recently, we might have r2, openai and google new model before the end of the month

#

and meta too

keen beacon
#

i dont think so nightwhisper, star gazer (2.5 line), etc

night trout
#

Damn nightwhisper is REALLY obsessed with glassmorphic aesthetics.

balmy mist
#

googel releasing their model on tuesday they said

#

for their event

#

lol

night trout
#

Apple could never do something this ugly.

#

Side note I think I've invented the best viral micro-benchmark on the planet: "Generate a rotating, animated calendar in threejs with today's date highlighted and pulsing."

The fails are incredible.

plain zinc
#

Bruh

#

Nightwhisper the best

#

Now 2.5 pro on the background of nightwhisper seems like a jokeπŸ˜„

night trout
plain zinc
#

Am I right?

plain zinc
# night trout Not always.

Okay, and I'm even glad that I was wrong :). If nightwhisper comes out, then we will have TWO SOTA models for coding.

night trout
plain zinc
# night trout

And where is the result from nightwhisper? Has he stopped? Or is he still writing code?

night trout
#

Stopped. 😦

plain zinc
#

πŸ₯²

night trout
#

Fwiw I'm finding Nightwhisper usually wins against 2.5 but it's not always the case. Like 80%-90% of the time.

#

One more calendar fail. πŸ˜‚

torn mantle
#

It may not look the best but when you compare it to other models it clearly emerges as the winner

night trout
#

"Generate a rotating, animated three-dimensional calendar with today's date highlighted."

torn mantle
#

Also nightwhisper has that weird color selection

#

Ive noticed that too

#

It likes going gradient and dark colors all the time

#

So maybe you need to guide it a bit more on that

#

Or ask it to follow a styling principle

#

I like to tell it to act as an apple designer to get a clean UI look

night trout
#

Oh, this one was another win for 2.5:

torn mantle
night trout
#

Nightwhisper forgot to make the water droplets collide with each other + had really janky ball physics + glitches.

torn mantle
#

But i need to try that more tbh

night trout
torn mantle
#

You have some good challenging prompts

night trout
night trout
night trout
#

I've been working on a Super Monkey Ball prompt tonight, lots of physics + gameplay mechanic fails.

torn mantle
#

What about bug fixing on nightwhisper?

#

Does it fix the issue with enough guidance?

#

Or does the complexity outweighs the model capability

alpine coral
# keen beacon it does know events in dec 2024. and they claim its jan 2025

it's interesting - like its knowledge becomes increasingly hazy the closer to the end of 2024 that the question relates to. e.g.

  • December 2024: it fails when asked about Syria (fall of Assad in December was one among biggest geopolitical developments of the year)
  • (November 2024): it partially gets the US election result correct ; stating Trump won, and also giving the correct margins (312 electoral college votes vs 226); however, it says that Trump beat Biden...❌
  • (July 2024) when pushed, it will recall the attempted assassination of Trump, wth accurate details
keen beacon
#

It's less expensive

night trout
#

Sonnet was messing up pretty hard, but that was a previous version of the prompt. I should try it again with the new version.

keen beacon
#

The claimed cut off is right though I think. It gets some stuff right in dec 2024 though it's very sparse

alpine coral
night trout
#

Okay I've given nightwhisper like eight attempts to get the labyrinth right. no dice.

plain zinc
#

It is websites, not some kind of games, simulators, etc.

keen beacon
#

the goal with 2.5 pro's cpt seems to be strengthening the base model dramatically rather than recent events

#

4o's cpt is the same, but they also focused more on recent events before june 2024 i think

#

sonnet 3.7 is again presumably a cpt on top of sonnet 3.5, im unsure of how much it knows recent events (to its cutoff) though i havent tested that. it seems everyone is cpting lol

keen beacon
torn mantle
#

Since there isn't any code errors

alpine coral
keen beacon
#

i also wouldnt count openai out tbh. given how much progress has been made on 4o

#

a reasoning model (given how theyre ahead of the reasoning game) based on this much stronger 4o will be interestig

#

but deepmind is too fast (2.5 pro timeline, based on cut off), so it'll be interesting

alpine coral
#

yeah it really feels like a two horse race now

#

who knows.. there might another 'deepseek' moment.. but i'm not sure that was as seismic as the excitement / hysteria at the time suggested [though not dismissing its significance - it definitely lit a fire under the assess of the US companies at the very least]

night trout
#

Oh wow. Yeah. Definitely just found my next test prompt.

#

lmao meanwhile gemini flash 2.0 be like

#

youtried.gif

oblique flint
#

so nightwhisper is gemini 2.5 pro stable release or something?

alpine coral
# oblique flint so nightwhisper is gemini 2.5 pro stable release or something?

it's only available in webdev Arena (i think) - whether that means anything in terms of it being specialised i'm not sure, but would kinda think if it was a non-exp version of 2.5, or a newer checkpoint, they would add it to the General arena? at least they'd get a bunch more data / votes making it available there

oblique flint
#

yeah true, but I dont think google has ever released a specialized model before have they? It doesnt seem like the most likely case to me

alpine coral
#

yeah i know what you mean

#

though there is / was also 'nighthowler' iirc, which also basically confirmed google and only available in webdev

oblique flint
#

if they do specialized versions, I hope their focus is not only oneshot webdev. There is still room for improvement for tool calling capabilities and diff edits with cursor/roocode etc

keen beacon
#

It's in the general arena

#

Nighthowler

#

No idea what it is tbh. Haven't gotten it enough to figure it out

#

nightwhisper is probably a web dev tune of 2.5 pro. Idk tbh I don't touch web dev arena and not that interested

alpine coral
keen beacon
#

Nightwhisper might just be a one off experiment before they apply it to a mainline Gemini model

alpine coral
#

yeah i haven't got either (moon/nighthowler ) so not really sure.. stargazer is def interesting and performant tho. seems all of them are very much likely cut from the same (2.5 pro) cloth at the very least

keen beacon
#

Nighthowler isn't Gemini 2.5 based

#

All I know

#

Stargazer is

oblique flint
#

main thing Im excited for is 2.5 flash tbh, I hope they're cooking it rn.

alpine coral
keen beacon
#

I don't like using the arena nowadays some of my requests take minutes. Meta model spam. Lagging completion zipping

#

You can't see the thinking too it's annoying

north vale
#

Google coder 1

alpine coral
#

yeah it's annoying.. i compiled a new 'quiz' for this month and the new batch of anon models, but haven't really been able to collect data at any kind of meaningful level

#

it's so slow and yeah meta model spam

keen beacon
alpine coral
#

i think it was like less than a couple of hours between OR announcing the availability of its first stealth model, and a fairly firm consensus emerging that it's an oai model

#

whereas the new gemini models were kinda mysterious for at least a few days.. like nebula (and good ol spider phantom.. whwatever happened to it).. even though it seemed pretty clear they were from google, it wasn't obvious beyond doubt

fleet lintel
#

Looks like gemini 2.5 completely changed opinion of folks about Gemini models. I wonder what kind of optimizations they did in 2.5?

torn mantle
night trout
torn mantle
#

stargazer & gemini 2.5 pro

#

still didnt get nightwhisper run tbh

torn mantle
torn mantle
#

@night trout nightwhisper

#

it was overcomplicating the code so i asked it for a simple version

torn mantle
torn mantle
#

are we sure this is not like the best coding model 😭

keen fulcrum
kind cloud
#

I've never seen it on lmarena, and I'm wondering about this video.

keen fulcrum
#

Hi this is empty all the time

torn mantle
#

havent seen it

torn mantle
#

rendering issues

#

it doesnt have anything to do with the models

#

how can we contact devs?

sage raptor
torn mantle
#

whos the winner

keen beacon
torn mantle
#

cant stop playing with this model

#

if its coding finetuned model only then its a huge blow to anthropic

#

context will be much higher

#

and probably the price will be lower too

#

with better results than claude latest model

keen beacon
#

anthropic makes a sizable markup on their api i thinnk

keen beacon
#

they can probably reduce the price

keen fulcrum
torn mantle
#

well there is profit margin

keen fulcrum
#

First one looks cleaner

keen beacon
#

im not sure if they can compete with google tho

torn mantle
#

anthropic probably has it more than it should be

keen beacon
#

probably not (be able to compete w google)

keen fulcrum
#

Although no real data

oblique flint
#

I believe currently a big portion of their api revenue is integrated ide agentic coding tools like cursor, windsurf, roocode etc. There I feel gemini 2.5 pro is kinda lacking compared to claude models still, just cause the claude models are so good at function calling and diff edits

keen fulcrum
#

For backend probably 2

sage raptor
#

nightwhisper will be better than claude

torn mantle
#

Apple design style, with light colors.

oblique flint
#

I thought that too when I saw aider polyglot results, but i reality 2.5 pro is still worse in cursor for me. It's better in ai studio but manually uploading files and keeping track of the edits sucks lol

torn mantle
oblique flint
#

competition is only good for us. The claude api is hella expensive lol

oblique flint
#

I hope they'll release a 2.5 flash too that can compete with o3 mini at a lower price

torn mantle
#

its already in the arena

#

star*

#

star(something)

oblique flint
#

is it a thinking model?

torn mantle
#

yea

sage raptor
#

o3 mini is the best model

oblique flint
#

for the price its kinda insane yeah

torn mantle
#

best model for what exactly?

sage raptor
#

making games, writing and price

torn mantle
#

nightwhisper attempt to clone windows 11 task manager

#

not bad

sage raptor
torn mantle
eager mica
#

Is anybody getting (suspected) Meta models other than 24_karat_gold on the text-only Chatbot Arena? It seems as if they've been taken down.

torn mantle
lime coral
keen beacon
alpine coral
# keen beacon im not sure if they can compete with google tho

they're compute poor and being squeezed hard... i feel like the whole test-time compute thing has thrown a spanner in the works for them - feel they were training some giant Opus 4.0 which is now redundant or something (kind like Gemini Ultra).. either way compared to google and oai, the resources required to both train and deploy at scale (and ig with a loss leader approach as i think google and oai both must be doing), has put anthropic in tough spot imho

sage raptor
alpine coral
keen beacon
#

its curious when claude 4 will arrive given they invested a lot of effort into sonnet 3.7

alpine coral
sage raptor
alpine coral
#

no need to respond to that.. it's largely incoherent lol

keen beacon
alpine coral
#

i hadn't thought of it like that before

#

makes it even tougher to see a path where they stay at frontier..

#

it'll be the titans with the compute at the end of the day.. the lag b/w the performance of SOTA proprietary models and open source / weights models might be more interesting / meaningful than what the second-tier close models companies put out

keen fulcrum
#

Welcome to the Good life

celest river
#

nightwhisper is from which company?

celest river
balmy mist
lime coral
#

The thing is oai is willing to loose money, which Google can’t. They are like all in every time, it was the same with gpt4, all of their ressources in one run then negative bill for serving

#

Not the same game

keen beacon
#

money*

#

theyj literally are right now

#

giving out access to a frontier model on AI Studio with no rate limits definitely isn't profit making

lime coral
#

This is not a big lost. The model they serve is cheaper, they save all the chat conv, and there is probably way less user than ChatGPT at his beginning. When Google announced investing a lot in ai this year their share went down just because investor cannot understand how this money would come back.

barren prairie
keen beacon
#

like many, i thought they were pretty behind

#

they surprised everyone with 2.5 pro

#

and i think it's a good sign

#

and anyway if we're going with the argument of willingness to lose money imo it's obvious who wins

#

google have immensely deep pockets and a vast pool of talent that makes even openai look insignificant

#

they just weren't using it right until recently

balmy mist
ocean vortex
keen beacon
#

their api isnt as popular as their sub which is a massive loss leader

#

even if the api is profitable

ocean vortex
#

that's because they have a shit-ton of expenses. But OpenAI at least is turning some profit from their inference to counter that

#

Google isn't

keen beacon
#

sub + research massive losses

keen beacon
#

google models are probably much more cost effective too

ocean vortex
keen beacon
#

i wasnt talking about that

#

im not exactly sure what my point was ignore me lol

ocean vortex
keen beacon
#

4o is the smallest out of all of them i think

ocean vortex
#

yeah there's definitely a feeling that they are almost hacking gpt4o kind of lol, it doesn't have the inherent understanding/flexibility or spatial awareness that bigger models tend to exhibit

#

so what they are doing instead is feeding it very high quality data + fine-tuning that is potentially unmatched by anyone else still

keen beacon
#

2.5 pro pricing

#

Wait

#

whta

torn mantle
#

dont tell me....

#

nightwhisper is 2.5 pro preview?

#

aint no way

lime coral
keen beacon
torn mantle
#

same model then

keen beacon
#

nightwhisper is a different model its not released

torn mantle
#

my eyes now are only on nightwhisper tbh

keen beacon
#

ya

balmy mist
#

wait so we getting o4 mini this month wow

#

i want o3 pro so badly

#

ill buy the $200 again for o3 pro

keen beacon
#

o4 mini πŸ˜…

balmy mist
#

lmaoo

golden ocean
#

fr

sour spindle
#

Wonder if it will be better than 2.5

balmy mist
#

it has to be

sour spindle
#

Does feel reactionary a bit

misty vault
#

pre nerf gpt 4 2023 was wild

balmy mist
#

gpt-5 seems like end game lmaoo

misty vault
#

they are deprecated

balmy mist
#

so prob by summer we will have gpt-5

misty vault
#

If u still paid for the api I think u can still use it?
Or that only for companies or some sh

balmy mist
#

so AGI 2025 confirmed?

#

they also said they improved on o3 a lot, so will that mean its benchmarks are better that what was shown?

keen beacon
#

yes

balmy mist
#

i hate these numbering systems

keen beacon
#

probably

misty vault
#

openai fell off after 4o

#

never used chatgpt ever since

keen beacon
#

4.5 was supposed to be gpt 5\

misty vault
#

It has the cancerous "em dash" punctuation

keen beacon
#

no its not its trained from scratch

#

it was tuned in the style of it though

misty vault
#

oh, still I dont like its output

balmy mist
#

isnt it supposed to be a combined model?

keen beacon
#

no. the plan was for gpt 4.5 to originally be gpt 5 i think

#

like plans from way before

#

yes thats why its gpt 4.5 lol

sour spindle
#

Does anyone have any information on what is the difference betwen pro preview and pro experimental?

keen beacon
#

it sucked and didnt deserve to be called gpt 5

#

they changed plans i think

balmy mist
#

its so funny that twitter is used to hype up your own models lol, modern day marketing

#

you got ceo's just making promises casually on it

misty vault
#

I remember open source models were getting promoted so hard, everytime a new good opensource model was released on yt i saw videos "gpt performance" but they were all ass and none ever came close to beating gpt 4

#

Is that different now

keen beacon
#

yes

#

sorta

misty vault
#

oh yea deepseek is pretty good

#

But now big corporations are on top again

#

with 3.7 and gemini

keen beacon
#

theyve always been on top even open source

#

massive corporationsn funding training runs that are millions and millions of dollars

balmy mist
#

it seems like gpt-5 cant be bad like it has to be a SOTA model or OA is done?

#

r2 should be out in 1-2 weeks?

#

this month is going to be wild, getting SOTA models all in the same month

#

so the preview 2.5 vs experimental is the same?

misty vault
#

what is OA

balmy mist
#

whats with these naming conventions?

balmy mist
balmy mist
#

true but with how fast things are moving, how do we know by the time they are ready to ship, google hasnt already gotten a better model?

#

thats like 2-4 months away

#

in the last month we got 3.7 and 2.5 and v3.1

sour spindle
#

I kind of operate in the you are what you ship mindset

misty vault
#

Is v3.1 worth trying now that we got 3.7 and 2.5

balmy mist
#

no lol

#

its a good model tho

#

like a foudnational model that has COT inference

sour spindle
#

Meta had some promise as well but who the know what the hell has happened there

balmy mist
#

lmaoo i forgot about meta

sour spindle
#

Everyone has

balmy mist
#

The Information reports Meta delayed releasing Llama 4 at least twice because it underperformed on technical benchmarks, especially reasoning and math tasks, and struggled with humanlike voice conversations, according to two people familiar with the matter

- At least one version

#

so we gonna get Llama 4, r2, Nightwhisper, o3 pro and o4-mini, and maybe some other team model, we still dont know who owns Quasar

#

wait if we getting o3 pro and o4 mini, what if o4 mini is quasar?

#

nahh we def getting o3 pro

misty vault
#

I'd give openai my whole nude collection and credit card information for the return of gpt-4

balmy mist
#

if we getting o3 we are getting pro, its just allowing o3 longer to compute

#

thats easy to do

sour spindle
#

Meta has way to much money to be as dogshit as they are

balmy mist
#

yeah but they said soemthing about different plans like $200, 20k lol

#

thats how they explained it for 01 pro

sour spindle
#

it may be "easy" but super expensive as to be impractical

#

unless you really change the pricing tiers

balmy mist
#

yeah but if its more intelligent its worth it, investors will be happy

#

they showed that with o3 when they showed its benchmarks

sour spindle
#

Yes but at what difference of intelligence compared to competitors at lower cost

#

a lot has changed since those benchmarks

balmy mist
#

when it hard more compute it beat the agi benchmark test, to start SOTA they gonna have to give it pro

#

thats the same thing bro

#

they showed o3 with 16 plus hours of computing time

#

to solve problems

#

if thats not pro idk what is

#

its all about investors at the end of the day and they like to see their models do well in benchmarks, so they crank up the computing time to make sure they are demoing SOTA

#

thats why we havnt gotten o3 yet

#

i promise you we would not have seen what they previewed when we tested it out because that was o3 on maximum compute, they needed to optimize that

#

hence they needed to optimize that

#

send an example

#

never tried that

#

it makes no sense not to give us o3 pro

#

maybe, but they had all these months to optimize is so lets hope it stays at $200

misty vault
balmy mist
#

the fact that they have o4 already is interesting lol

#

like the numbering really is stupid at this point

misty vault
#

This is true, ever since they transitioned from 3.5/4 to 4.0, OpenAI is falling off, but are they still better off now as opposed to if they had started working on GPT-5 instead of doing the whole 4o crap or something? Like would investors make up for it or something? Those big models were kinda smart

#

I'm stealling learning sorry if make no senseπŸ™

balmy mist
#

thats what i am saying at some point the improvements are minimal and can just be solved with longer compute time

#

thats why gpt-5 might be their last model and they just give it updates

#

which is what they are already doing tbh

#

but that would be the first combined model

#

true

balmy mist
#

yeah thats what i was thinking

#

it makes sense for it to be

#

wow did you test this?

#

so that would mean that OA models are getting faster

#

bet imma try this today, you a musician?

#

craig so maybe instead of smarter models in the future we just keep getting faster on top of the current IQ and just give more compute time for the SOTA models?

#

you might as well become one now with ai lol

misty vault
balmy mist
#

cause the fact that we are getting o4-mini and o3 at the same time is just wild, are they gonna demo o4? like wtf, maybe they figured something out with inference

balmy mist
#

yeah lonegr context too

#

i forgot about that

#

it seems like 2 mill might be a cap tho, what do you think?

viral notch
#

arent context limits pretty long already?

#

or am i thinking prompt interpretation

misty vault
#

doesn't smaller models mean less inteligent or less knowledge?

#

or they have another way to improve its speed without affecting performance

balmy mist
misty vault
#

I mean faster models* and smaller or something

misty vault
viral notch
#

this. less training

balmy mist
#

its like a teacher teaching a student

misty vault
#

what about super fast models, how they improve the speed? does it just mean its a smaller model?

viral notch
#

i find that smaller models tend to be more volatile

balmy mist
#

less parameters and weights mean faster models, but they are also doing some new tricks like meta talkig about thinking wihtout tokens etc..

#

also the hardware you use

#

groq made hardware built for ai inference

#

i am suprised OA didnt buy groq yet lol

misty vault
#

Ok then I continue hating on fast models (jk but imagine 99999 trillion parameter gpt-5 model or something instead of the time spent on making 4o, then we would have agiπŸ₯Ά )

balmy mist
#

i kinda feel bad for meta lol

balmy mist
misty vault
#

πŸ˜”

balmy mist
#

maybe theoretically lol

#

maybe with shared weights

misty vault
#

Can LLMs even achieve sentience

#

But then we need to define sentience or concioussness first

balmy mist
#

it doesnt matter tho

upbeat radish
#

hi all

balmy mist
#

we already act like intimate objects are sentient, so to a lot of people ai is already sentient, and they wouldnt even be able to tell the difference(if they are talking to an ai on text or voice, even if they are looking at an image of ai or not, or soon video, music etc..) if you cant tell the difference then it does not matter

#

its all about perception

misty vault
#

yea true, if it convinces me then i'd be impressed

#

I think modern llms could already to that if fine tuned on actual discord or gamechat dialogue and then have it act like real person on discord. Only flaw it'll have is no infinite memory

misty vault
balmy mist
misty vault
#

t-800

balmy mist
#

hmm is this something new?

#

how do i play it?

#

i downloaded it

#

this better be fire lol

#

like on youtube or you got an article on this?

#

ohh i see

#

is it basically saying they are pretty much the same thing?

#

that is prob true, we shouldnt even be trying to do that imo

#

i like ai assistance(ai-human hybrid)

misty vault
#

top comment

balmy mist
balmy mist
#

so o4 mini is on the same level is sonnet 3.7 thinking based on my tests with quasar

#

assuming o4 mini is quasar

#

copilot might be a joke at this point

misty vault
#

yes
rest in peace gpt-4 powered sydney πŸ˜”

#

They had potential but gpt 4o destroyed everything from them

#

Actual crap service

balmy mist
#

it sounds not bad tbh, can you do one that mixes "in the end" and some other song? you can make a youtube account of this lol

balmy mist
#

did you try the new 4o?

misty vault
#

unless u were sarcastic

misty vault
#

Best openai product since pre nerf gpt-4

balmy mist
#

idk bro i loved o1 pro

#

that has been my fav aside from this new image stuff

misty vault
#

Imagine if they made gpt 4 thinking πŸ₯Ά (not starting from 4o)

balmy mist
#

lmaoo

#

you will most likely love gpt-5 then

#

wait whats your fav model now

misty vault
#

I would have but gpt-4.5 already talks like 4o so Idk

balmy mist
#

like in terms of every company

misty vault
#

And claude 3.7 for coding

#

I would never use openai models for coding after gpt4 deprecation

#

o1 was impressive for coding, can't deny that, but thankfully we have claude now

#

But now gemini 2.5 beats claude 3.7 in coding I think

#

haven't tested myself yet (for coding)

balmy mist
#

thank you

misty vault
#

nope, I'll try today

balmy mist
#

assuming it is o4 mini

misty vault
#

We can only test these models on code now right? since this thing only exists on webdev arena?

balmy mist
#

but it most likely is, its def not o3 lol

#

and its from OA right?

#

and thye just annouced they are releasing in a few weeks

balmy mist
#

i just want nightwhisper now

#

the biggest losers of this ai race has to be Apple and microsoft lol, the verdict is still out about meta, this copilot thing is just to funny

misty vault
#

Has anyone found a prompt to make the models in webdev arena to just answer in text instead of writing code

balmy mist
misty vault
#

Or they inject prompt after u send it

balmy mist
#

i would say try prompting it differently

#

but when you ask it text it codes a website for you

#

i kinda like it that way tbh

balmy mist
#

its like giving you your own personal UI

#

its how the ai communicates with you

misty vault
#

I tried a lot (but could try way more ig) but they all still make a website out of the prompt

#

If I try to talk to these models directly theyre pretty easy to "jailbreak" (not on webdev)

#

So maybe they just inject the prompt to make website out of it after u send it?

balmy mist
#

wait why dont you want them to give you websites? i can see how its annoying but you still are getting your answer lol

misty vault
#

Because if I do that manually through like openai api or claude or something, then it gives same effect

misty vault
#

last prompt takes priority

balmy mist
#

like this is so interesting to me

#

i ask it the capital of usa and it gives me this

#

i didnt even know i wanted that but i do now lol

balmy mist
#

give me an example of a long prompt

#

im curious

misty vault
#

But if u want really long answers or generate other text based stuff it will focus more on the website

#

Idk wait ill test something then ill give prompt

#

check it for us and record it and upload here

balmy mist
#

omgg i think i understand now, this helps reasoning because it reasons extra when it codes and if the model has visual abilities it can see the output of the website and check its answer again?? idk lol

#

cant wait to listen

#

it almost looks like words

#

but it sounds dope

#

can you ask the ai to make the notes spell out something?

#

what do yall think about temporary apps rather then permanent ones? like how we have it in webdev? it might be the future, i can see this becoming more and more common almost like sending a meme to someone, like you can send someone an app like how we send emojis lol

misty vault
#

then u're cooked

balmy mist
misty vault
#

Like things u would just "copy" over from code block to own project

balmy mist
#

like this is def the future, we just need a platformt that allows for you to have these pockets apps that can stay alive for a certain duration

misty vault
#

real

#

New world for exploits and malicious use

balmy mist
#

wym?

misty vault
balmy mist
#

yeah

misty vault
#

Because temp hosting of sites already exist yea

balmy mist
#

and you can tell it how long to have it up for

#

send the link i wanna try it

#

why isnt that bigger than it is?

#

we can make the app more appealing

#

a lot of people do the same thing but only some are popular

#

gonna present it in a good clean way liek how apple does it lol

misty vault
#

Well replit is just for developing, Idk if u can share it with friends without having a huge panel on the side as well that shows the code

#

There's prob other sites too

balmy mist
#

and ship it as an app and get influencers to start using it

misty vault
#

codepen is even easier, no acc required, immediately into coding and share

balmy mist
#

send link

misty vault
#

But also has huge panel thats open by default that shows the code

#

Idk if theres any sites for user friendly (I mean like sharing it to ur friends, not development) purpose

balmy mist
#

we need to make making apps just like sending stickers

#

that is the way

misty vault
balmy mist
#

imma work on that this weekend

misty vault
#

There's more some for that purpose

#

But idk

balmy mist
#

honestly i wanna copy what webdev does pretty mich

#

like is the dev of webdev here?

#

just make the hosting longer based on what users want, make the ability to choose the model(or have a default mode), and ship it in a clean interface, the only thing is links, i dont think links are the way anymore

#

when you share it maybe it can look like a sticker or something else, idk

misty vault
#

embedded website display in message platform

#

that would be so abusable though

#

Making webpage that looks like claim nitro button

balmy mist
#

we gotta brainstorm bro, you see the potential?

#

lol

#

maybe we could have this built in?

#

the ai can decide how to present the link?

#

this is still pretty good imo

#

i think this might be better than the other 2 lol

#

i guess thats preference lol

misty vault
#

replit can already do what webdev arena does

#

But it requires signup and looks complicated cuz its made for developers

#

So I guess ur idea is indeed not very common online

balmy mist
#

i think people are just not using it like that

#

like you said its only devs

#

but for a normal person

#

this is game changing

#

like I can send some an app of anything

misty vault
#

I found other sites very similar (not as complicated as replit, just immediately start paste code, and it renders) but those render it just on the page instead of actually hosting so yeah the hosting on url is less common but still out there a lot, BUT all with required signup most likely or no option to host for a specific amount of time

#

So no sign up would be a huge plus

balmy mist
#

yeah we just need to put all of that together

#

this will make software dev or web dev main stream lol

misty vault
#

So a no signup + no project based system (Just start coding, or paste code, or let ai generate it, share, and when time expires, whole site gone) + option to specify time is indeed unique

balmy mist
#

imma create this this weekend, let me know if you wanna help or wanna test it, gonna be hard but worth the try

#

i can see so many usecases

#

we already have those stupid ass stickers and the og emojis, and we have memes

#

this seems like the next phase

misty vault
#

I dont know

#

But only one way to find out

balmy mist
#

it would have to use gemini2.5 tho

#

cause no other model can do this right

misty vault
#

gpt-4-preview-0314 my beloved

balmy mist
#

lmaooo

#

you a developer?

misty vault
misty vault
balmy mist
misty vault
#

Gemini 2.5 is good but don't we need the nightcrwaler one or whatever someone showcased that had the best visuals

balmy mist
#

this is the prompt:
tell my friend to meet at walmart at 6 pm in a funny way

balmy mist
balmy mist
#

like its dumb but it works

misty vault
#

what model did that

balmy mist
#

nightwhisper

balmy mist
#

yo i got another idea

#

what was cringe before now becomes sweet

#

check this out:

misty vault
#

the far right?

golden ocean
balmy mist
#

this is still cringe but you get what i am saying

#

this is prompt: tell my cruse I love her and want to take on a date in a romantic way

misty vault
balmy mist
#

lmaoo

misty vault
#

making silly nice things for friends or people u like

balmy mist
#

yeah or even your coworkers or kids

misty vault
#

it'd impress them maybe if it looks good and creative

#

girlfriend success rate increase

balmy mist
#

like the usecases are limitless

#

lmaoooo

#

fr

#

im about to use thos

#

this*

#

she gonna be like awwww

#

this is actually a fun way to use it

#

we gotta make this mainstream yall

misty vault
#

lmaoo

misty vault
balmy mist
#

i just tweaked prompt a lil

#

nightwhisper is so good man

#

im designing it now, gonna give yall a prototype of it tonight and open source it

wooden crescent
#

is preview better or experimental?

sage raptor
wooden crescent
#

what did changed

sage raptor
misty vault
#

SO real, that example convinced me, i'm in

misty vault
misty vault
sage raptor
#

because he works at google

misty vault
#

ohh ok

balmy mist
#

well funny polls

misty vault
balmy mist
#

like honestly the sky is the limit thats why its perf

golden ocean
balmy mist
#

lmaoo

#

im making that now

#

yo we got a cooker

#

im happy yall see the vision

#

thank you webdev for showing us the way

#

also thank you nightwhisper for being so good at webdev lol

brittle tiger
misty vault
balmy mist
#

thanks bro, if this becomes something actually usable i will be so happy

#

imma try my best to get it done within the weekend

balmy mist
#

not sure about coding as a whole

#

but its been trained on react and ui/ux

#

i think its gemini2.5 pro specialized in webdev and ui/ux

#

imma use nightwhisper to help with the visuals lmaoo amd layout

#

soft dev is so cracked since ai came

#

like man, you can do a lot by urself nowadays

#

Updated this chart with the newest Gemini. It shows the rapid progress in AI over less than two years: costs for GPT-4 class models has dropped 99.7% and even the most advanced models in the world are still 82% cheaper.

Probably not worth betting on this trend ending really soon

#

gemini truly is the best model by miles, like its pooping on every other model in overal efficiency, its SOTA and cheaper than the others lol

misty vault
#

true

balmy mist
#

wild days we living in, thats why open ai said forget gpt5 we need to launch o3 and o4mini asap lmaoo

#

what you think, what should I change?

misty vault
#

Every model still sucks at writing in your words or maintaing your tone/personality, like if u give it samples (a lot) or even if u managed to get it to the point where u only have to edit a few words (good enough for me) then after a while it'll start to become repetitive and you'll have to write whole text explinaing that you now want the next message or paragraph different, but then it'll follow that wrong because u'd also need to give samples of that first and it just idk

#

But I guess for writing in your style purposes, you'd really need to use fine tuning then it will do perfect

#

Without fine tuning no model can do that perfectly but I guess that makes sense

balmy mist
#

yeah you need to prompt it and provide a lot of context about you

misty vault
#

(I didnt need fine tuning when using gpt-4 bing chat πŸ˜” )

balmy mist
#

finetuning is the easy solutions

#

but expensive

misty vault
misty vault
#

But gemini 2.5 is free right

balmy mist
#

is it?

misty vault
#

even fine tuning

#

Idk

balmy mist
#

idk about that

#

lmaoo

#

that would be wild

#

that would solve memory tbh

misty vault
#

Google ai studio is just asking me to allow google drive access rn

#

when clicking fine tune

balmy mist
#

not 2.5

misty vault
#

ohh

#

rip

balmy mist
#

that would be cracked

misty vault
#

For witing that isn't too bad though I think

#

All the base model needs is the language and good vocabulary and speech

#

knowledge wont be an issue for me because i'd be providing that in the prompt what to write about or somethign
gpt-4 could do it good and modern models are way smarter than gpt-4 nowadays so i'll give that a try

balmy mist
misty vault
#

lmaoo

#

Idk

#

Wait nvm css was bugged for a second

balmy mist
#

lmaoo

#

you like the first one better?

#

i want to have the design right before i start building

#

i think the second one is promising, gotta just clean it up some more and then add back end

misty vault
#

Idk both look fine actually

#

I like both

balmy mist
#

lmaoo

#

if this app really works the way we want it, this will be a good example of this new ai workflow, from coming up with an idea to building and deploying it

misty vault
#

Didnt devinai try to do that (but turned out they were scam)

#

for actual web development projects rather than quick silly webpage generator for sharing to friends

balmy mist
#

yeah but expensive

#

and now they charging 20 bucks

#

but their rep is alreayd ruined and casuals still not gonna use that

#

the key is to get the average person

#

marketing for devs is not going to work

#

or companies

misty vault
#

claude 3.7:

balmy mist
#

you need to market for normies like how openai did with the 4o image