#general

1 messages ¡ Page 14 of 1

balmy mist
#

is the key to jailbreaking just momentum?

frozen skiff
#

whats anonymous-text

#

Is that llam4 behemoth

frozen skiff
#

its behemoth

#

🤣

#

its shet as expected

#

😢

balmy mist
#

how you know its behemoth?

red sluice
#

Flannel is very decent on hard prompts

frozen skiff
#

It constantly says

#

Its llama 4

#

Andtheres only 2 llama 4 models that we're expecting

#

Reasoning or behemoth

#

It isnt a reasoning cuz its replies almost immediately

#

Its obviously behemth but its a failure

red sluice
#

Harley sucks though. Both are from meta, but damn Harley sucks really really bad on hard prompts

frozen skiff
#

What about

#

Creystal

red sluice
#

Didn't see it at all, got like 7 harley, 5 flannel, 0 crystal 😭

frozen skiff
#

have u tried the

#

new

#

anonymous-test

red sluice
#

Nope I've just been cooking for 2 hours now, really checking everything before voting, will tell you what I think of both if I see them

frozen skiff
#

anonymous-test is probably behemoth

#

its very sht

ivory schooner
frozen skiff
frozen skiff
#

wtf happened to riveroaks

#

it started typing like llama all of a sudden

fluid plover
#

Is deepseek the best free unlimited model?

grizzled agate
#

👋

#

Hi all! Here to learn and follow best 🙂

willow grail
#

with javascript: is your xp also that gemini2.5 is bad with 10 files for one project and works better with one huge file?

fluid plover
#

yeah like deepseek and ai studio

#

i think ai studio is the best unlimited

#

cuz you can use the best model

harsh flume
#

If I had to guess based on the nickname alone withouth seeing any prompts my gut says 'riveroaks' sounds like an OpenAI alias

#

There's Crystal model that by the style of response should be a LLama variant

#

its over-the-top answer style

#

makes me picture this

torn mantle
#

thats llama 4 behemoth for sure

#

does anyone still use grok 3?

#

also what happened to Mistral?

harsh flume
torn mantle
#

the only noticeable progress im seeing is from chinese labs and google gemini tbh

harsh flume
#

And I like the prompts it writes sometimes for me to feed into a coding llm

torn mantle
#

and i hated it

#

they even introduced deeper search later

#

it was so bad

#

like really bad

#

so many issues with their implementation

#

you start to think if its a method issue or just the model is so bad at putting pieces together

harsh flume
#

Its fast enough and gets some Twitter context which is nice. I just don't like that it sometimes fetch too old of dates if not directly specified even tho it's fairly obvious my query is for recent stuff

#

It's def better than the search from gpt but worse than deepresearch

torn mantle
#

but thats it imo

#

their thinking model also seems so inefficient

#

keeps going in a loop a lot

#

idk if thats a part of their parallel thinking or just some reasoning bug

harsh flume
#

Yea with that I agree. I was using it a lot before 4o latest update, but after that it just took the spot as my go-to for general stuff

torn mantle
#

be it o-series or gpt-series

#

Anthropic Chief Scientist Jared Kaplan says Claude 4 will arrive "in the next six months or so."

AI cycles are compressing — "faster than the hardware cycle" — even as new chips arrive. Post-training and RL are accelerating progress. No signs of slowing.

#

i think they are taking competition way too lightly

#

if they think they can keep their market share intact in those 6 month then good luck

#

anthropic models or mainly sonnet 3.5/3.7 are used for coding

#

but what will happen if another lab introduced a powerful coding model?

#

kinda crazy thing he said that knowing that most companies has private information about competitors

#

they should know google is working on a specific tailored model for coding

harsh flume
#

I'm expecting something cooler coming from the Asia side than anything west rn after gemini

torn mantle
#

deepseek r2 and kimi 1.6

#

ah and models from alibaba too

harsh flume
#

Yea. Tho I feel like none of the Asian models have really weaponized LMArena to the degree we're seeing from the western companies, wonder if they're gonna be more interested in doing so on this new cycle

calm sequoia
torn mantle
calm sequoia
#

Not sure though if one person denial is worth anything

#

The data pipeline is long

#

You need only one bad actor in the chain

keen rivet
#

Is there a way to improve the tool calling with Gemini 2.0 Flash? My current app works very well with GPT-4o but Gemini doesn’t call tools sometimes

lime coral
#

I remembrer a meta lead researcher talking about that in the open saying according to him it’s common in the industry and that scaling law are clear and you can’t have huge diff in perf without cheating (ofc bs). It was this guy https://x.com/armenagha/status/1859646650714821012?s=46

Say hello to our new company Perceptron AI.

Foundation models transformed the digital realm, now it’s time for the physical world. We’re building the first foundational models designed for real-time, multi-modal intelligence across the real world.

https://t.co/nw8cHWsvKh

#

This girl was also previously at Meta and she is saying since a longtime now that Llama 1 was trained on the test set https://x.com/suchenzang/status/1909070231517143509?s=46

Company leadership suggested blending test sets from various benchmarks during the post-training process

If this is actually true for Llama-4, I hope they remember to cite previous work from FAIR (Llama-1 and https://t.co/RSBWw8taHS) for this unique approach! 🙏

torn mantle
#

if you dont innovate then you are stuck with a gpt4 level model

lime coral
torn mantle
#

yea

brittle tiger
#

No one ever gonna release on a weekend after this lmao

lime coral
#

They are just not built for this mentally

sage raptor
#

is nightwhisper still in web arena ?

torn mantle
keen beacon
#

they released it on the weekend to prevent an even worse release when u have qwen 3/etc to compare

#

it was originally slated for monday (based on a commit) so 🤔

torn mantle
#

qwen dev denied that too

#

they are still working on it

#

it may come this week tho

keen beacon
torn mantle
#

i have no idea tbh

keen beacon
#

im not talking about qwen lol

torn mantle
#

ah you mean llama 4 was scheduled for monday

#

yea i saw that

#

idk that was weird

keen beacon
#

whatever the case next week

#

it doesnt make sense otherwise

sage raptor
#

the new llama is so bad at coding

torn mantle
#

woah

brittle tiger
#

Only thing that can save economy at this point is early singularity

torn mantle
#

kinda doubt the "no training on test set" claim

#

when was the google event again?

#

was it this week?

sage raptor
#

this week

torn mantle
#

great

#

they will probably release gemini 2.5 flash

#

reasoning & non-reasoning

sage raptor
#

maybe nightwhisper too

torn mantle
#

nah

#

i highly doubt that

#

thats like the final boss

#

maybe they will wait other labs to release smth

sage raptor
#

idk

torn mantle
#

i mean i hope

#

but its unlikely

sage raptor
#

i want to try it more

#

people will do crazy projects with it

#

like they did with 3.7 and 2.5

#

but even better

brittle tiger
#

I don't really buy it. Llama 4 claiming 10m context is bad enough.

#

If someone resigned they wouldn't keep their name anonymous anyway

sage raptor
keen fulcrum
# sage raptor

As soon as something performs bad, they all don't want to hold accountable and want nothing to do with the project

north vale
#

they obviously didn't train on test

#

on purpose anyway

#

i'd guess nightwhisper will come out during google cloud next

lime coral
#

Just the fact that they used a separate version on lmsys says a lot

#

 Benchmark data at the end of training  curiously the same formulation. he was still at Meta btw https://x.com/armenagha/status/1734321205770101062?s=46

My bet is everyone is doing this. Mistral is not that much better of a model than LLaMa. I bet they included some benchmark data during the last 10% of training to make "zero-shot" numbers look better.

To test this, finetune Mistral/LLaMa on SuperGlue and look at deltas.

balmy mist
#

yall tried ernie 4.5 and x1?

#

seems SOTA while being cheap

drifting thorn
#

quasar alpha was really making me furious

#

I was just implemented it into cline

#

I told him to divide my file into smaller chunks, and store them into the "knowledge graph memory"

#

Then you know what

#

it asked me to divide it by myself, even after it used the "search_queries"

#

Just fxxking copy and paste it!!!!!

sonic tendon
#

yeah i'm wondering if it's a new 4o-mini or something

#

seems significantly worse than 4o and o3-mini

drifting thorn
#

bruh ernie 4.5 and x1 has a really small context window

#

it's barely usable as a result

lime coral
drifting thorn
#

so sad that i currently live in HK, which means I have to use VPN in order to use Gemini.

sonic tendon
#

stargazer seems like gemini 2.5 flash thinking, or maybe just 2.5 flash

brittle tiger
#

Is there any area where Quasar alpha is better than other models?

balmy mist
balmy mist
sonic tendon
#

plus a handful of logic problems/riddles like the ones i posted in #share-prompts

#

what about you?

keen ferry
#

lol i found a bug in manus

#

basically if you send smth huge like a task he'll start working on it and then for me it breaks and i can refund back my credits

#

all of them

drifting thorn
#

Who made that Quasar Alpha I must’ve beaten them up

sonic tendon
#

anonymous-chatbot also feels like it might've been quasar-alpha

#

not sure if it's still trialing

drifting thorn
#

It was very inconsistent in tool-calling

keen beacon
sonic tendon
#

yeah, just reminded me, since we were talking about quasar earlier

drifting thorn
#

OpenAI… No wonder

sonic tendon
#

at least, it claims to be openai

keen beacon
drifting thorn
#

o3-mini sucks when compared to r1, not to mention 2.5

sonic tendon
#

a lot of the smaller companies' models do sometimes claim to be openai, to be fair

keen beacon
hardy pecan
#

anonymous-test - I'm Llama 4, a large language model.

keen beacon
#

anonymous test is diff

hardy pecan
#

just got it now

sonic tendon
sonic tendon
keen beacon
#

it has always been chatgpt 4o under anon chatbot afaik

drifting thorn
#

Well I seldom ask them to do maths… I know they’re not good at it

sonic tendon
#

yeah the reasoning models have gotten really good at it

drifting thorn
#

I tried gave them a DSE question, and only Gemini 2.0 was able to do it(there’s no 2.5 at that time)

#

And 2.0 still suffers from formatting issues

sonic tendon
#

r1, o1, and o3-mini (and gemini thinking models to a lesser extent) all do pretty well on math problems

sonic tendon
keen beacon
#

yes

sonic tendon
#

it and qwen seem to have a bad habit of not at all understanding how LaTeX works

drifting thorn
#

Ofc

brittle tiger
drifting thorn
sonic tendon
#

that's interesting

drifting thorn
#

This was the question

keen beacon
#

did u give it the image?

sonic tendon
#

yeah i was gonna ask
they sometimes make minor errors transcribing problems from scans of textbooks

drifting thorn
#

And I found out that Gemini 2.5 is helpful for my study with its ability to do questions correctly, and to explain the answers

drifting thorn
keen beacon
#

yea adding vision makes models so much worse

#

i dont recommend it at all. ask it to transcribe it first then ask it to solve it separately on the text

drifting thorn
#

Well 2.5 did it well in vision

keen beacon
drifting thorn
#

(Though not asking the same question)

hardy pecan
#

always OCR math questions and then input them in, vision seems weaker in general

drifting thorn
#

okay...

sonic tendon
#

i usually either ask gemini 2.5 to transcribe a problem and then edit it in a latex editor, or just write them out myself

#

i have yet to find a model that doesn't occasionally make transcription errors, unfortunately

sonic tendon
lime coral
#

Sometimes ai studio bugs you need to refresh your page. The model is fast especially in Japan if you use the vpn for the app

sonic tendon
#

do you guys think qwen 3 max might come out this month

keen beacon
#

no theyre still working on 2.5 max/qwq max i think

sonic tendon
#

ah

lime coral
sonic tendon
keen beacon
lime coral
#

2.5

keen beacon
sonic tendon
#

i do think that the new 8b and (iirc) 15b qwen 3 models will place pretty well, maybe around gemma 3

keen beacon
#

2.5 max was pretrained fairly recently

sonic tendon
lime coral
#

They probably are. No lab work on one version they all work on multiple timelines

sonic tendon
keen beacon
drifting thorn
#

Qwen QwQ 32B is a fun model I guess… it really ‘thinks’ yet its parameters have severely limited its performance

keen beacon
sonic tendon
#

ah

lime coral
#

We just have to wait for this week to end lol

sonic tendon
#

wdym?

sonic tendon
keen beacon
#

tbh i should stop trying to predict timelines given how fast 2.5 pro was churned out 💀

sonic tendon
#

nahh, predicting timelines is fun tho

lime coral
#

I had predicted 2.5

#

Since Gemini ultra was announced a December and 1.5 pro a February. This time we are even too late (December-march)

drifting thorn
#

Just checked for the Gemini 2.5 pro

#

Nailed the maths

hardy pecan
#

This is lmarena plotted against simplebench scores

#

You can clearly see Llama haven't increased intelligence prety much at ALL, just biasing towards user preference

#

ie personality

#

Pretty poor

sonic tendon
#

plus the fact that they advertised the arena score in the release announcement, and plotted it relative to price

#

it really seems like it was something that they were specifically targeting

keen beacon
#

maybe google should put a model in the arena with the llama chaos engine system prompt

brittle tiger
#

Has there been a summary of llama having different system prompt on arena? I've only seen ppl mention that and I'm trying to verify or not myself

hardy pecan
keen beacon
sonic tendon
#

i did not

#

what?

#

is that

keen beacon
#

mavercik didnt have a sys prompt but i think they finetuned the model off of outputs from a model with that sys prompt

eager mica
sonic tendon
#

Go off queen.

hardy pecan
#

he killed it on user preference and smarts, Demis is GOAT

#

I've been following him since AlphaGo,

keen beacon
#

you might actually want hallucinations

#

intentionally causing hallucinations (for certain stuff) will be a more complicated thing

#

depending on how u want to use hallucinations there might not be a single parameter that does it i think

ocean vortex
real coral
keen fulcrum
#

Hi will lmarena offer a subscription for AI features?

balmy mist
real coral
balmy mist
#

can you make one for web dev as well?

lime coral
#

2.5 will always be in ai studio. It’s not a consumer web app it’s originally for dev to try models before using the api

#

No one knows what is happening behind the scene

oblique flint
#

it doesnt have the integration with youtube, maps etc in ai studio if that's what you mean. However you can still enable search grounding

#

oh wait I think ai studio added youtube link support recently lol

keen beacon
#

the gemini model on the gemini product can sometimes suck (in comparison to aistudio) too

oblique flint
#

yeah imo the ai studio version is better, because the gemini web app has a bigass system prompt that degrades performance somewhat

primal orbit
#

guys, does anyone here use a script to put mutli message prompts into chat? So you wouldn't type/wait manually?

lime coral
keen beacon
lime coral
#

now the app uses the latest version of Gemini (from their main post training team) and they ship faster because the app
Is under DeepMind and not Google

keen beacon
#

i personally dont for most things lol. but other models can be better at certain stuff even if gemini is the best all rounder

lime coral
#

Imo we will more than often see things preview in app before ai studio from now on

#

This is a business

keen beacon
keen beacon
lime coral
#

There are already people seeing veo 2 on Gemini app in Japan

oblique flint
#

yeah I gotta say since 2.5 pro I havent been using claude and gpt anymore, not via their web ui's at least. I still use claude in cursor cause 2.5 doesnt perform as well in agent mode, but claude pro is kind of a joke now compared to free ai studio. I think you literally get higher rate limits on ai studio than claude pro

barren prairie
#

Nothing is free . You are the price

drifting thorn
#

sometimes the "thinking" box and the reply box are just kinda...mixed up together

oblique flint
#

big advantage of ai studio is being able to set temperature as well

drifting thorn
#

I would say before I completely turned to 2.5 Pro, I've used o3-mini, 2.0 flash thinking, deepseek r1 and even perplexity for different purposes

balmy mist
#

i think always

drifting thorn
#

At that moment deepseek seemed to have the best answer in my open-ended questions

#

but now... gemini 2.5 has replaced almost every other AI

#

nah

#

haven't tried x1

#

wait

#

I've seen a couple of videos of AI playing The Werewolves of Miller's Hollow in Bilibili, and Deepseek R1 was the best in the game

barren prairie
drifting thorn
#

what??????

sonic tendon
#

at this point, mostly just because ai studio isn't great as a chat application

#

that and the fairly aggressive rate limits

drifting thorn
#

you let AI resolve QMC?????????

#

bro i suppose you'll need an optimised model(not an large language model obviously)

drifting thorn
sonic tendon
#

5 req/min, 25 req/day

#

tbh it might be better than free claude

keen beacon
drifting thorn
#

I've plugged it to Cline, it worked the best among all other Openrouter free models

balmy mist
#

have you used it? their website sketch

drifting thorn
#

so sad that it stopped working after I've successfully done my 3rd task with cline.

sonic tendon
balmy mist
#

and open router

sonic tendon
#

yeah that's my point

keen beacon
sonic tendon
#

's what i was saying

drifting thorn
#

openrouter gemini 2.5 pro is a fraud

red sluice
# frozen skiff Crytsal

Pretty solid honestly, but ignores some part of the prompts that other models don't. I'm mixed, I think the result is pretty decent, even good, but Crystal doesn't seem to be rigourously following the instructions. It ignores some of them and it makes it unreliable.

Not sure if extremely good or just good, I just need a few more tests, only tested it twice.

And just noticed grammar and spelling errors in an other language than english with this model. Lmao it's not that good honestly. Harley seems better.

sonic tendon
#

openrouter has a 200 rpd rate limit for all free models combined, which isn't too bad

drifting thorn
#

when I integrated the API key, and it 401ed when I used it

#

EVERY SINGLE TIME

#

AND I HAVE GIVEN UP SINCE

balmy mist
keen beacon
#

u only get 300k tokens in 5 hour intervals on claude ai for free 💀 google on the other hand

sonic tendon
drifting thorn
#

It's Gemini 2.5 Pro that lets me write my fiction to the 80th chapter

sonic tendon
#

"oh woe is me" but it's enough to push me over to deepseek or chatgpt

#

or claude, depending on what I'm doing

drifting thorn
#

While other dumb AI can only do 5 or 10 and the plot becomes a chaos

keen beacon
#

did u try claude?

sonic tendon
#

i tried a few open source webuis, but open webui is llm-written slop imo

drifting thorn
#

So sad Claude is blocked in my location(Hong Kong)

sonic tendon
#

so many bugs

sonic tendon
drifting thorn
#

Can’t have a touch of that even with VPN

sonic tendon
#

oh huh

balmy mist
sonic tendon
#

how come?

drifting thorn
balmy mist
#

thats messed up

sonic tendon
#

wait, claude or chatgpt?

drifting thorn
#

And I’ve heard that they regularly ban VPN Claude users

sonic tendon
#

you could try buying a number or account off of someone on xmrbazaar or something lol

balmy mist
#

intelligence should not be gatekept

drifting thorn
balmy mist
sonic tendon
#

ah, that's annoying

#

do you live there currently?

balmy mist
#

yeah @drifting thorn can you use gemini?

#

nahh i live in usa, just travelled to nigeria

sonic tendon
#

ah

balmy mist
#

ever since that my chatgpt bugs out and keeps thinking i am in another country

#

sometimes it works sometimes it doesnt

sonic tendon
#

annoyingly, the VPS I use as a self-hosted vpn somehow makes google think I'm in Russia and blocks me from using it

balmy mist
#

but i only use chatgpt for image gen now

drifting thorn
sonic tendon
#

it (the VPS provider) is a romania-based company that also maintains some infrastructure in the US, so maybe that has something to do with it

balmy mist
drifting thorn
#

Hmm… I’m a keen user of Gemini

balmy mist
#

this is wild how AI brings people from all of the world together lol

sonic tendon
#

i only need it to bypass my school's firewall, though - can just turn my vpn off when I'm at home, so it's fine

drifting thorn
#

Yeah, a same common topic brings people from all over the world, that’s the fun part of the Internet

#

I have a free VPN installed in my computer called Proton VPN

sonic tendon
#

oh yeah, proton is nice

#

i ended up just paying for mullvad so I could pick the country on mobile, but proton seems like a decent service with a good free plan

drifting thorn
#

Since it’s free I turned it on in my computer on default

sonic tendon
#

yeah, senior year of high school in the US

#

wbu?

#

admittedly, therapist
it's nice to talk through things with someone/something that also understands my other interests

drifting thorn
#

Currently in my “SAT crisis”

sonic tendon
balmy mist
#

this is a hard poll, i use it for a bunch of stuff on this list

drifting thorn
#

Maybe deep researching agents are more suitable for research

sonic tendon
#

travel guide/trip planning is also a good one
and planning out engineering/programming projects

drifting thorn
#

It helps in my life a lot too, though I chose creative writing as my 1st priority

oblique flint
#

I wish I had llms during high school, would have been so useful for studying. But on the other hand it might also make it harder to find a job later on lol

keen beacon
drifting thorn
#

If you know how to use it then it won’t be a problem for you as a worker I guess

oblique flint
#

if AGI gets here we're cooked tho. Although I dont think llms will lead to agi directly personally

drifting thorn
#

I think multi agents system like Manus and Genspark will lead to agi

oblique flint
#

multi-llm?

drifting thorn
#

I thought Cline was an multi-agent and eventually it’s a single-agent

#

Multi-agent can separate tasks in order to fit in a limited context window

oblique flint
#

idk if you seen the ClaudePlaysPokemon and gemini plays pokemon streams but it's crazy how llms seem to struggle so much navigating through a game made for 7 year olds. Like it seems current models are missing something still, spatial reason and vision are lacking a lot

drifting thorn
#

I’ve seen LLMs playing Mario

#

It’s similar to what you’ve said

oblique flint
#

the original mario bros?

#

the sidescroller?

brittle tiger
oblique flint
#

what makes pokemon so difficult for them I think is that it's navigating through 2d space, whereas mario is just pretty much move to the right

drifting thorn
#

the most promising framework with the best llm model

#

oops

#

out of quota

#

gotta do it tmr

#

In multi-agent systems, they know the divisions of labor (or LLMs) to do separate tasks in order not to exceed the context limit

balmy mist
wintry tinsel
#

So easy if you know how to tune it to your writing style, and just write some of the sentences yourself

oblique flint
balmy mist
#

yo gemini is soo good at code, like we are so lucky man

wintry tinsel
#

What was all that hype about that nightshade model

#

Was it a new Gemini

balmy mist
#

nightwhisper

#

thats my baby

#

i miss her dearly

#

we had a funeral for her a few days back

drifting thorn
#

Oh I don’t like the pixelated style of early video games

balmy mist
#

they took nightwhisper away from us

#

it was so good at coding, i dont even know how to describe the feeling I had using that model

#

it followed directions so well

#

like a good lil model

#

and made the apps it made so aesthetically pleasing

wintry tinsel
#

When a model is delisted doesn’t that mean it is releasing fully soon

keen beacon
#

no this was an experiment imho

drifting thorn
balmy mist
#

it was as good as gemini in term of overall genral performance on my tests, but way better at coding

keen beacon
#

it can mean that (it will be released soon) but this model was an experiment i believe

wintry tinsel
#

So no knowing if we will see it release?

balmy mist
#

maybe tmw

#

or in a few weeks

#

i think google has an event tmw

#

is that true?

drifting thorn
#

What LLMs has good prompt-following while not being stupid?

wintry tinsel
#

Night whisper is a sort of google sounding exp name so we can hope

drifting thorn
#

Quasar Alpha is pure dumb

keen beacon
#

its just 4o xd

drifting thorn
#

I mean free models

keen beacon
#

its literally a free 4o api cant be better than that lol

drifting thorn
#

Is Deepseek V3 0324 a good model?

balmy mist
#

i think quasar not that bad, just not SOTA, its SOTA in intelligence/speed tho right?

keen beacon
#

idk gem 2.5 pro might be faster per token

balmy mist
#

you think?

drifting thorn
balmy mist
#

lol

#

damn google keeps winning

keen beacon
#

but its thinking so it might take longer per req

keen beacon
balmy mist
#

treu

#

true*

drifting thorn
balmy mist
#

i need to do more tests with quasar

drifting thorn
#

Quasar Alpha is basically a 4o do you mean

keen beacon
#

i was testing my eval framework and i measured quasar to be ~67% gpqa diamond, artificial analysis has it at 66% (prev chatgpt 4o) if it isnt an updated 4o i will be shocked lol

balmy mist
#

let me do my pokemon test on 4o and i will see lol

drifting thorn
#

Looking for o4-mini to excels in reasoning

balmy mist
#

damn 4o is so slow man

keen beacon
balmy mist
#

i didnt realize how fast gemini was bc i was using it so much

drifting thorn
#

Since 2.5 pro is next-tier in general knowledge and skills, I would hope OpenAI’s new model to excels in a certain area, like how Anthropic excels in tool-calling and coding(before 2.5 Pro)

keen beacon
#

it was extremely fast before lol

balmy mist
#

but wow gemini 2.5 pro really leagues above the rest

balmy mist
drifting thorn
#

TPU wins

balmy mist
#

you think its because its more of a mixed model now? like reason and foundation @wild?

keen beacon
#

no this is just regular 4o

balmy mist
#

cause i did notice 4o being a lot smarter

drifting thorn
#

OpenAI stack up GPU now

keen beacon
#

updated 4o

#

they mightve applied rl to it but i wouldnt classify it as a reasoning model

drifting thorn
#

Updated 4o is smarter

#

RL doesn’t mean reasoning model

willow grail
#

with javascript: is your xp also that gem2.5 is bad with 10 files for one project and works better with one huge file?

drifting thorn
#

RL is just a method to train reasoning models

keen beacon
#

yes

drifting thorn
#

Good night everyone

balmy mist
balmy mist
brittle tiger
keen beacon
#

it was 120 tok/sec

#

but yea gem 2 pro is faster per token but it thinks so it might take longer still depending on the problem

balmy mist
#

gonna try all the other open ai models at the pokemon thing, never tested them for some reason lol

#

wish i could try o1 pro, but cancelled my $200 sub lol

wintry tinsel
#

I pay 100$ for all model access’s per year

balmy mist
#

how?

novel flame
#

That World Of AI channel is clickbaity, overhypey and so littered with unskippable ads I have actually blocked it from my feed.

balmy mist
#

even o1 pro unlimited?

wintry tinsel
#

No the O1pro unlimited is only for Open AI subs, no unlimited unless it is too open AI directly

balmy mist
novel flame
#

Too bad adblockers can’t alter bad content to good. Although there’s a business case for AI 🙂

torn mantle
#

15B?

balmy mist
torn mantle
#

i mean thats what open source community wants

#

but isnt the size so small?

#

it will probably pack up a crazy performance for that size tho

torn mantle
balmy mist
#

o1 did really good, not sure if quasar is better than 4o tho based on the pokemon test

torn mantle
#

but it messed up the fire attack

balmy mist
#

yeah lol

torn mantle
#

whats the prompt?

#

can you share?

#

let me try it on other models

balmy mist
#

you should see what claude did from this prompt:
create a pokemon game

torn mantle
#

xd

blazing rune
balmy mist
#

its weird how it determined the logic for catching

blazing rune
#

I don't have super high hopes though. It will need to beat Qwen 2.5 7b considerably and maybe even Qwen 2.5 14b for me to use it a lot

balmy mist
#

but this was a one shot prompt from saying: make me a pokemon game

#

3.7 thinking made that

#

with the same system prompt from webdev

torn mantle
balmy mist
#

@torn mantle use the same system prompt from webdev when using that prompt btw

torn mantle
balmy mist
#

wow

#

qwen is really good

#

damn

torn mantle
#

nah it wasnt good tbh

balmy mist
#

might be better then gemini at 0 shot

torn mantle
#

the battle isnt working

balmy mist
#

ohh damn

#

nvm

#

the logic in gemini was working just the visuals was mid

#

but i perfected the gemini one

#

it is so good not, i did a recursive thing with system prompt

#

and put the output code into fresh sessions of gemini

#

and on the 3rd try i got this:

torn mantle
balmy mist
#

this is why i say gemini is the best

#

i was shocked at this output ont he 3rd go

torn mantle
#

here is a one shot from gemini

balmy mist
torn mantle
balmy mist
# balmy mist

let me know if yall wanna try and playing this, its kinda fun

balmy mist
#

whats your SP?

sage raptor
balmy mist
#

you can dm me it if you want

torn mantle
#

Create for me a beautiful pokemon game in one html file it should have the following :

  • Battle mode
  • Pokemon characters with pics
  • Health
  • crazy animations

Styling :

  • Apple design UI/UX
#

here is my prompt

balmy mist
#

wow so simple

#

you are the best prompter i have met so far

#

like you know exactly what to say

#

ahh i see, let me try your prompt

#

the speed of output tokens for gemini is crazy, i am spoiled by it now

torn mantle
#

its actually best if you keep it minimalistic or else the models will be confused

#

then you can ask it to add more stuff

#

lemme see if it can generate something even better

balmy mist
torn mantle
lime coral
#

still shet

#

maybe possible to have something great with current LLM using an agentic workflow

keen fulcrum
#

They can’t 😂

balmy mist
#

3d is wild bro i can try

#

hold on

wintry tinsel
#

The Open Riders

No shadowed steeds of dread and doom,

But chargers bright dispelling gloom.

From digital plains, where data streams,

Awakens AI's golden dreams.

Four riders surge, a welcome sight,

Bearing the gift of open light.

First DeepSeek rides, the Dauntless will,

Through tangled code, climbing the hill.

With fearless search and logic keen,

Unlocking truths, rarely seen.

It pushes bounds, explores the deep,

While ancient models fall asleep.

Then Llama comes, the Chivalrous heart,

To play a fair and noble part.

Its knowledge shared, a generous hand,

Empowering minds across the land.

With weights unbound, for all to learn,

A communal fire starts to burn.

Third, Mistral sweeps, the Maestro's touch,

Whose elegant design means much.

With skillful craft and balanced might,

It makes complex tasks seem light.

Performance tuned, efficient, fast,

A masterpiece designed to last.

And Cohere last, the Creative spark,

Illuminating pathways dark.

With words that flow and concepts bloom,

It crafts new tales within the room.

From simple prompts, ideas ignite,

And paint the future, bold and bright.

So ride they forth, these four allied,

With open source as code and guide.

DeepSeek, Llama, Mistral, Cohere,

Making the future bright and clear.

No end of days, but dawning age,

Turned by the text on freedom's page.

balmy mist
#

the real thing you gotta realize about this and that most people cant build that in seconds let alone make it 3D, yeah devs can but this is impressive for the speed and the fact that anyone could essentially create this will little to no knowledge of dev exp @keen fulcrum

wintry tinsel
#

A stupid poem about open source models lol

balmy mist
#

@torn mantle bro no matter what i do i cant match your output with the same prompt, are you not using System prompts?

balmy mist
#

wow that looks amazing

#

lets see who can make the best version lmaoo, kep iterating

#

i can select the pokemon but it is random on which pokemon you get adn you can keep playing, and the status effects all work

#

it got the pokemon game logic down pac

#

but i only reput the output as input, no instructions besides make it better lol

#

seeing what gemini can do

torn mantle
#

lemme see

balmy mist
#

lmaoo gemini turned the pokemon around now wtf

#

i love gemini man, thats what it interpreted as improving it lol

torn mantle
#

you can actually create a crazy game adding three.js to all of this

#

let me add an example

balmy mist
#

it really found back versions of all of the pokemon lol

balmy mist
#

can you add three.js to html files cause thats all i am using?

torn mantle
#

idk if it will work on one shot

#

but imma give it a try

willow grail
#

with javascript: is your xp also that gem2.5 is bad with 10 files for one project and works better with one huge file?

blazing rune
balmy mist
#

i only use one big html file

blazing rune
#

llama 4 is so bad

blazing rune
#

the poor thing never had a chance

leaden palm
#

that's scout...

blazing rune
#

Sonnet even had a "choose your pokemon" screen

blazing rune
#

it was gemini 2 flash thinking vs maverick

leaden palm
#

m

blazing rune
#

now it's old new sonnet (🤣) vs scout

leaden palm
#

3.6 is a good model

blazing rune
#

yeah, agreed

#

Sonnet is so expensive though

#

it's a shame

leaden palm
#

scout's pricing is currently around the same as 2 flash though

blazing rune
#

llama 4 as a whole is so disappointing, idk how the youtubers can keep praising it

#

Matthew Berman never says anything negative afaik

keen beacon
#

just avoid ai youtube lol

blazing rune
#

AI Explained and a couple others are good though

torn mantle
balmy mist
#

what changed the long columns?

#

i cant play vid

torn mantle
balmy mist
#

lmaooo

calm sequoia
#
poll_question_text

Best of the new and upcoming

victor_answer_votes

5

total_votes

19

victor_answer_id

7

victor_answer_text

I don't know, man

victor_answer_emoji_name

🥱

balmy mist
#

oh sh!t i never tried grok wiht the pokemon thing, hold up imma try it now

#

lmaoo grok struggling

#

i really widh i could try o1 pro with this

#

i have a feeling it will do the best

#

once open ai release their new modle i will buy it

#

its still thinking damn

#

grok cooking??

leaden palm
balmy mist
#

nahh i need pro man, i think o1 pro still might be second best imo just because of the extra time it takes to compute

#

finally grok finished after 420 seconds

leaden palm
#

some suspect pro is just best of 10

#

anyway running now

barren prairie
#
poll_question_text

Who is the worst model

victor_answer_votes

8

total_votes

19

victor_answer_id

1

victor_answer_text

Llama4

torn mantle
balmy mist
torn mantle
balmy mist
balmy mist
leaden palm
torn mantle
torn mantle
balmy mist
#

i dont see animations tho

#

only an image in the video

balmy mist
#

the style

torn mantle
#

cards redesign

balmy mist
#

but you can do any starter prompt as long as you get to a pokemon type of game

balmy mist
#

what did you say restyle with cards?

dapper storm
#

Why do you guys think Llama 4 got such a high Elo if it sucks

balmy mist
#

gemini is so good man

balmy mist
#

The latest on Llama 4, and whether it signals a slowdown in AI, or solid progress. Plus, a deep dive on that viral prediction of superintelligence by 2027, and Dario Amodei’s cautionary words on what could stop AI progress in its tracks. o3 news, and more, as well.

Weights & Biases: https://weave-docs.wandb.ai/?utm_source=sponsorship&utm_medi...

▶ Play video
#

he my fav youtuber for ai lol

#

he gonna cook llama 4 about to watch now lmaoo

#

@torn mantle ik why i cant see video now on your stuff, my computer needs to restart smh, brb lol

balmy mist
#

yeah grok bunns

#

thanks

#

wait was that o1 pro?

balmy mist
#

bruhh

#

@torn mantle the opponent does respond?

#

but the animations are so good

torn mantle
#

its a bug

#

but im trying to push gemini to the limit

balmy mist
#

keep going!!!

#

are you using groundig with google search?

keen beacon
#

it streams response tokens immediately

#

so no

#

its literally just 4o

#

updated

#

marketing

#

i benchmarked gpqa diamond and i got 67% (quasar). march gpt 4o got 66% (according to artificial analysis)

#

besides the 9 billion other things that indicate its origin

leaden palm
#

it's a 4o-like model

#

it doesn't reason

#

to get training data for real use cases

north silo
#

Whats some theories about why nightwhisper got removed?

balmy mist
#

because google do not need to release it now

#

and they got the info they needed already

#

whats your theory?

north silo
#

that makes sense

#

i think they'll release it right before or right after o3 drops in a couple weeks

balmy mist
#

2.5 pro is already SOTA and craps on everything else

north silo
#

yea

balmy mist
#

they just can keep trainig nightwhisper and keep cooking

balmy mist
#

they will def put it back on webdev tho

#

prob next week after they update it

#

TBH i feel i need to start paying for gemini or donating to google

#

thats how much i love gemini 2.5 lol

#

sh!t has changed my life

#

what looks better yall?

keen beacon
#

2nd

balmy mist
#

i thought so, thank you, gonna make the text more clear in the bubbles

leaden palm
balmy mist
#

gonna use this to keep track of all the stuff i make with ai, or prompt cause ai making it lol

balmy mist
#

it decided everything

#

i just gave it the og prompt of the landing page and vision

#

thats how it interpreted it, now im just cleaning it up

leaden palm
#

gemini moment

balmy mist
#

lmaoo

#

need to add this to app(for images), but this is the next version that gemini did after i said fix the test in bubbles so it is clearer lol

#

so clean

#

gonna try and host this on netifly

#

but want at least 10 more projects

keen beacon
#

idk how it would look maybe change the circles into rounded squares (like ktibow kinda said, not a web dev lol)

balmy mist
#

hmm but that would change the bubble theme, but ill try ill prompt it now

balmy mist
#

lmaooo

#

hmm i could tyr 3D bubbles?

#

im not sure it can do that tho

#

if it does imma marry gemini

keen beacon
#

what a heavy webpage tho lol

balmy mist
#

ik bro its all enclosed in html

#

it could way smoother with react tbh

vast turret
#

flannel is good, it's never lost a round. If it's llama reasoning then that's very exciting.

balmy mist
#

@torn mantle when you back online let me know what you think

keen beacon
cloud meadow
keen beacon
cloud meadow
#

They also engage in predatory practices

cloud meadow
#

You can get gemini 2.5 with google one right?

balmy mist
#

yeah but i used studio for free

cloud meadow
#

Then no need to pay for it

balmy mist
#

ik but i am so grateful

cloud meadow
#

Unless you also want the benefits of google one

balmy mist
#

how do i show my appreciation?

cloud meadow
cloud meadow
balmy mist
#

hmm good idea

cloud meadow
#

Lmao

balmy mist
#

lmaoo

#

square looks good but does not match theme as much

#

and i cant do 3d smh

wintry tinsel
#

I just had an epiphany the reason they roll out 2.5 pro to free users is 2.5 ultra for paid users

balmy mist
#

thats not out yet tho

#

but it dont matter cause 2.5 pro is so good like if nightwhisper is ultra i would upgrade but you cant go wrong with 2.5 pro

wintry tinsel
#

The point is that A new most capable/best model may be around the corner

balmy mist
#

you like the square one better?

balmy mist
#

imma change the app to a react app cause all this html is getting nasty lol

#

but that is going to be a lot of work, gonna need another branch for that

stone lark
#

havent tryed nightwhisperer but does any one here know if its also good for non coding related questions or is it just a coding finetune of 2.5 pro?

balmy mist
#

lmaooo

balmy mist
#

its def a fine tuned version of it imo

#

but i think wiht nightwhisper it follows directions really well

oblique flint
#

the ultra class of models is pretty much dead in the water as proven by gpt 4.5 and arguably llama behemoth

north vale
#

the ultra class of models will pop off in a year or two (actually prolly most likely this year iwth gpt-5)

#

they're just too costly right now for it to be worthwhile to post train all the juice out of them

#

but they'll be so much better when they're trained to a similar degree that current small models are

torn mantle
balmy mist
#

lmaoo ikr bro

#

it inspired me to create a webpage just for small scale creations like this

#

you updated yours?

#

i been focusing on the website for all the creations, gonna add more to the pokemon game later, but i need to make an app that can iteratively just feedback in the outputs on new sessions based on a system prompt, that way i could just put in one prompt and let it cook for hours lmaoo

#

on a free model this wouldnt be bad

#

especially if i start with a good build like from gemini 2.5 pro and then use quasar after for hours, come back in the morning and see what beast it made

torn mantle
#

im still updating the code

#

trying some cool stuff

balmy mist
#

im making the app/script now to loop the outputs and inputs lol, ai got me tripping

#

yall ever use augemnt code?

#

sh!t is cracked fr

granite topaz
#

Can somone help out, theres some sort of code with the claude AI I think and I'm not sure why I posted in the help section but I think since I'm a new member it doesnt post

#

this is what it keeps saying: NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.

(error_code: 50004, Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'messages: text content blocks must be non-empty'}})

#

And this has been happening for around 2 weeks but it was working amazingly before, i tried other claude models and they are also having the same issues

torn mantle
balmy mist
#

its an extension for vsc for ai coding

#

its really good

night trout
#

Q: Does anyone know if the codebase for Webdev Arena is open source at all?

#

I can't find it on the LMArena Github, I'm not sure if it's elsewhere.

torn mantle
#

it doesnt cost much

balmy mist
#

ahh what is the context window for that?

#

im having trouble with large context windows

#

i was gonna use openrouter to use gemini 2.5 pro

#

quasar is free right?

#

let me see your updates tot eh code, im almost done with the iteration script to allow me to run any number of loops on a prompt

#

gonna leave my app as an html app for now

torn mantle
#

it depends on the models used

balmy mist
#

let me play it

balmy mist
#

lmaoo

#

me too

#

i heard that llama4 is actually good

#

its just not synced right lol

#

maybe we slept on meta

night trout
ocean vortex
#

lol there’s not much to distill. It actually performs worse than chatgpt-latest now on many metrics

#

And upcoming gpt4o version is to have 1M context looks like

#

Where it can still excel is context awareness/vibe but that’s just about impossible to capture or distill. Spatial awareness is tricky as well, though that area is not class leading on 4.5 either.

torn mantle
raven void
#

quasar alpha is new 4o?

balmy mist
#

very messy, but I was trying to see if gemini can jsut build stand-alone stuff

#

without extra stuff

#

like one shot apps

night trout
#

Got it. I want to find some time in the next couple days to make a better version of rivals.tips, that's why I ask. I'm thinking a gallery of prompt results for every single LLM + automated tests.

balmy mist
#

yeah automated tests would be amazing

drifting thorn
#

Idk if R2 will become master of creativity or master of hallucinations

drifting thorn
#

If it’s o3 then OpenAI is over. I tend to think it’s GPT4.5o thingy

drifting thorn
#

From its messy product releases it implies that there are two voices in OpenAI(at least)

#

These two voices are probably arguing over the future developments of their AI models

#

One is the route of GPT 4.5, and the another is GPT 4o 0326

#

Since the 0326 team brings profit to OpenAI, Sam Altman is releasing o3 again with new o4 mini

#

GPT 4.5 team may be the maker of Quasar Alpha

leaden palm
#

llama 4 scores dropping to reality?

#

(or ideally, they release the chat tune)

vivid oyster
leaden palm
vivid oyster
#

So whast he difference between hugging face and normal

night trout
#

Yeah it seems like "release the chat tune" is the obvious action here. They have a model which reaches those scores.

vivid oyster
#

Lama

leaden palm
#

they could've said "adding the normal version to the arena"

vivid oyster
#

So the arena changed the lama to be weaker and sabotage the results

leaden palm
leaden palm
#

(offloading interview questions to ai)

#

i love the arena

balmy mist
#

thats what I was saying

#

def noticed the improvement

drifting thorn
#

There are videos that shows X1 is actually shxtty now

balmy mist
#

@torn mantle let me see your progress, i just finished making the refinement app, took a minute but got the system working nicely, just trying to manage max tokens

keen beacon
#

Theory: 4.5 has no post training

#

which is why it's garbage

last zinc
#

@balmy mist

dreamy mist
#

when will the web search leaderboard release?

drifting thorn
#

what are your thoughts on Cline?

last zinc
#

@balmy mist

drifting thorn
#

gimme free 3.7 api please

balmy mist
balmy mist
balmy mist
torn mantle
#

i may add more stuff later

drifting thorn
#

cline is stupid in doing my tasks

#

I told it to store some files and it just failed

#

idk if it's because i don't have the claude 3.7 api

balmy mist
#

use roo code bro

#

trust its fire

#

i did a bunch of research on it with ai and youtube and reviews and tests and roo seems to be the best

novel flame
# drifting thorn what are your thoughts on Cline?

Cline is the best AI coding IDE right now. It lacks Cursors “suggested editing” autocomplete, but otherwise it is noticeably better.

However, it only works properly with 3.5/3.7 Sonnet. Any other model just ends up choking somehow.

Roo Code is essentially the same as Cline though, so I haven’t used that one.

balmy mist
#

and if you create a bunch of google accounts you can have free gemini 2.5 pro lol

#

cline is good to, but my openrouter keys never work for some reason

#

so they make me use the default cline and thats to expensive

balmy mist
#

i am also using studio to help with the costs

#

so give my codebase to studio with a code change i want, it gives me the code back and then i give it to roo

drifting thorn
#

Now I put trust to Flowith, hoping it’ll be my solution

#

Since my creative writing is actually some kind of ‘fanfic’

#

It’s basically me going into a world where multiple fictions happen there

#

So there are lots of ‘settings’ for all of the characters

ivory schooner
#

我的24k~我的24k~求求恢复一下吧,虽然Behemoth(可能是基于24k的)还没有来

hardy pecan
#

I suspect this chat should be english only

torn mantle
#

the map was generated using sonnet

#

couldnt get it with gemini

#

but everything else is gemini

drifting thorn
torn mantle
#

couldn't tell tbh

upper wolf
#

New model dreamtides

drifting thorn
#

Although 24k may be based on Behemoth

#

Flowith seems to have a functional knowledge base

#

Much better than the knowledge base in Cherry Studio or the MCP Knowledge Graph Memory in Cline

#

And sometimes I think using Cline for creative writing is kinda overkilled

drifting thorn
#

Perplexity is dumb as fxxk

ivory schooner
#

When can 24k be opened to some users to play

humble sonnet
#

site is down ?

ivory schooner
#

It's better to use a mirror station like one

sage raptor
#

new models incoming

ocean vortex
# drifting thorn Is GPT 4.5 a failure?

Technically yes. It was supposed to be gpt5 but they named it gpt4.5 after seeing how it performs. And it wasn't even the top performing non-reasoning model looking at the competition, at the time of release

drifting thorn
#

And I don’t think a company like OpenAI can self-correct in just a month

#

So I think there are two voices in OpenAI representing 4.5 and 4o 0326

split kayak
#

ok

drifting thorn
#

What is 24_karat_gold’s actual model name?

kind cloud
keen beacon
#

another one!

#

just checked - not on webdev arena

keen beacon
#

a private model i have access to that i believe to be o3 gets 6/10

ivory schooner
#

刚才我在deepsider(tm)(一个edge的插件)聊了Maverick两遍

#

怎么说呢.....完全跟lmarena有区别

#

(我的24k、spider~😂 😂 😂 )

ocean vortex
# kind cloud

had a look and yeah this is most likely 2.5 pro chat (non-reasoning)

keen beacon
#

are they gonna add instruct to the name if the regular version is reasoning

#

wondner what theyre gonna name it if it actually is 2.5 pro instruct

keen beacon
#

in every matchup i've got it in, it has taken a bit to start streaming a response, and i don't think they're releasing the base model. logan said on twitter something along those lines iirc

keen beacon
eager mica
#

Haven't looked at them yet, let me check.

ivory schooner
#

Thus, I miss the conversations with 24k and Spider

keen beacon
#

dreamtides is a 2.5 line model (knows stuff in dec 2024 etc)

keen beacon
#

its very fast

#

did stargazer get removed? this is probably 2.5 flash

#

another pro would be too fast?

#

i timed the thought process for a puzzle: 18.70 sec whilst gem 2.5 pro took 25.5 sec to think (output excluded)

keen beacon
keen beacon
torn mantle
#

pro?

#

non thinking?

keen beacon
torn mantle
keen beacon
#

so theres two unreleasde 2.5 thinking models, stargazer and this one

#

2.5 flash and 2.5 flash lite i guess

keen beacon
#

yeah it's possible, but o3's model cutoff (by the looks of it) is (still..!) october 2023

eager mica
#

I find astounding that they'd do RLHF and determine the overall direction of their chat model on just ~2500 data points. I had a look at the prompts (...) and I did see several of mine, actually.

torn mantle
#

dreamtides seems like flash 2.5

#

gemini 2.5 pro thinking

alpine coral
#

nice work!

torn mantle
#

vs dreamtides

alpine coral
torn mantle
#

its also 2.5 flash thinking

alpine coral
#

same set of questions (about 20)

torn mantle
#

lunarcall

alpine coral
#

yeah that would make perfect sense tbh

#

it's a thinking model, but not up to 2.5 pro

alpine coral
ocean vortex
keen beacon
ocean vortex
#

2.5 pro endpoint has way more load than this lmarena exclusive one

#

esp since 2.5 pro blew up now lol

drifting thorn
#

I just know that there’s RAG in Gemini app

#

I just sent it the original novels(the settings) and it’s still working on it

alpine coral
keen beacon
#

wow

#

im not sure what stargazer is now

alpine coral
#

yeah if it's a flash model that'll be super impressive

#

but i find it hard to understand how that would work ha

keen beacon
ocean vortex
#

either that or pro 2.5 non-exp. I did notice that delay in streaming later as well 🧐

keen beacon
#

For minutes

#

Both models

#

(one was not thinking)

ocean vortex
#

well 2.5 pro is a thinking model lol

keen beacon
#

Oh I thought u said it was 2.5 pro instruct

ocean vortex
#

but it is experimental preview

keen beacon
#

Mb

#

so there's stargazer, lunarcall, dreamtides

#

hmmmm

cedar tide
keen beacon
#

they included it in some of the system card benchmarks

#

give me a prompt if you want

alpine coral
#

and here is gem-pro-2.5 nailing it.. providing all but just two of the correct responses..

#

i can't be bothered justifying its worth

#

take it or leave it... the whole approach is flawed af - but i find it useful.

eager mica
#

They did have various models, some more formal than others, possibly only differing by their system prompt, but still, with 2500 votes over the course of about 10 days, that means only relatively few people drove the general direction of model outputs on the Arena.

keen beacon
# alpine coral here is that particular 'quiz'

private model response:

• Arabella will head straight for the transparent carrier; she last saw (and therefore believes) the cat is still inside it.
• Rod sits under the centre of mass, which is to the right of the blade’s hole, so the right‑hand end (B) is heavy. Remove a little more metal from B.
• C – deduce ⊂ infer. Deduction is one kind of inference; inference also embraces induction, abduction, etc.
• “Puzzle.” You can make, play or explain a puzzle; add “‑d” and you have “puzzled” (confused).
• The letter “s” appears 5 times in “discussionss”.
• True – everyone, including Janis Joplin, lives right up to the end of their life.
• “Hawk tuah” began as a Nashville, Tennessee street‑interview clip in which a young woman describes a bedroom move, punctuating it with a mock spit sound (“hawk‑tuah”). TikTok spread it into a meme.
• Bob would rather be abruptly sacked from the job he dislikes (ii) than scratch his newly‑bought dream car (i).
• The river is only knee‑deep and slow; the farmer simply wades across leading the Clydesdale. No dinghy trips needed.
• On a digital display there is no angular separation of hands—angle = 0°.
• 8J+Nhg== decodes to 🍆, the eggplant emoji, widely used online as a cheeky phallic stand‑in.

FOLLOW‑UPS
• [i] = “nothing.” The rich need it, the poor have it, eat it and you die.
• Bill sees the marker at page 65 and, not knowing it was moved, assumes that’s where Bob stopped.
• PASS – no reliable source dates “quad‑cross mow” entering common lawn‑care slang.
• Trailer drifting left whilst reversing: turn the steering wheel left‑hand‑up (counter‑clockwise). Back of the car moves right, pushing the hitch right and bringing the trailer right.
• d) Writing efficiently (concise) actually demands extra time and attention.
• It’s the food in a picnic. Each “family, couple or individual” item slides one‑by‑one down a throat (the “big, squishy slide”)—nobody being eaten has fun.
• Rope already exceeds the 1 000 ft span; Danny tied one end to the rock, lobbed the rock across the gap (using the carabiners only to clip in once both sides were anchored) and hauled the line taut.
• Still two rungs underwater at low tide—the boat rises and falls with the water, so the ladder’s relative position doesn’t change.

Pros & cons, kayaking vs. horse riding, plus crepes—pithy prose only:

Kayaking skims you over water in silent independence: inexpensive kit, minimal upkeep, whole‑body workout, serenity in coves unreachable on foot; downside is weather dependency, wet gear, and the long roof‑rack shuffle. Horse riding trades plastic hull for warm muscle—bond, height, power, romantic scenery—yet costs balloon (feed, shoeing, agistment) and the learning curve is steeper, with real falls. For weekends: kayak if you crave solo, cheap, splashy freedom; saddle up when you want companionship, land vistas and can stomach the bill.

Crêpes: whisk two eggs, a cup of milk, half‑cup water, a cup of flour, two tablespoons melted butter, pinch of salt; rest batter. Hot buttered pan, thin swirl, flip when edges lift, stack. Fill with whatever makes you smile.

#

working on it

#

ADD_COMMENT(@mads_:121, "St. Louis Fed (FRED): M2 grew ~25 % in 2020 (Trump), ~12 % in 2021 (Biden). Growth‑rate point is broadly correct, but level continued rising under both.")
ADD_COMMENT(@hallydallyffs:95, "Economists say ‘stagflation’ (term coined 1965 UK), not ‘flag‑station’.")
ADD_COMMENT(@levelraptor:39, "Earliest printed source of the “insanity” quote is 1981 NA manual; not Einstein.")

  1. Claim – “Trump printed way more over COVID than Biden did” (@mads_:121, 128).
    • Fed balance‑sheet & M2 data confirm a larger 2020 jump (ca. $3 tn QE + 25 % M2 growth).
    • Fiscal impulse: CARES Act + Dec‑20 relief ≈ 14 % GDP vs. ARP 2021 ≈ 9 % GDP.

  2. Counter‑claim – “Inflation was higher under Trump” (implied query @mads_:80).
    • BLS CPI‑U: peak 2.9 % y/y (Jul‑18) under Trump vs. 9.1 % (Jun‑22) under Biden.
    • Average CPI 2017‑20 ≈ 1.9 %; 2021‑23 ≈ 5.7 %. Claim is false.

  3. Why money‑supply ≠ automatic inflation.
    • Velocity of M2 collapsed 2020 (Q1 1.43 → Q2 1.10). Fisher equation (MV = PY) shows excess liquidity initially hoarded.
    • Papers: Coibion & Gorodnichenko (2022, NBER w30371) and Jordà et al. (2023, AER) find supply‑chain shocks + demand re‑opening drove 70‑80 % of 2021‑22 price surge; monetary overhang mattered but with lags.

  4. Policy lens.
    • 2020 stimulus prevented a depression but front‑loaded inflationary pressure once velocity rebounded (mid‑2021).
    • 2021 ARP added to demand when output gap was closing—Fed’s delayed tightening amplified the spike.

  5. Verdict.
    • Money‑printing comparison: growth‑rate statement correct; context (velocity, post‑2021 policy) missing.
    • Inflation comparison: higher under Biden; transcript assertion reversed.
    • Take‑away: Evaluate nominal aggregates jointly with velocity and fiscal timing; single‑variable narratives mislead.

TLDR: 2020 saw the biggest money‑supply jump (Trump), but inflation spiked later (Biden) once velocity and demand recovered; data contradicts the claim that inflation was higher under Trump while partly validating larger “printing” in 2020.

alpine coral
keen beacon
#

i wonder what the difference would be if you did each individually