#Grok 4.1

whole iris
#

New Grok 4.1 model from xAI.

https://x.ai/news/grok-4-1

Grok 4.1 is now available to all users on grok.com, 𝕏, and the iOS and Android apps. It is rolling out immediately in Auto mode and can be selected explicitly as “Grok 4.1” in the model picker.

thorn turtle
#

API pricing?

#

I assume the same as before

whole iris
#

Not seeing Grok 4.1 in the developer console yet.

#

So no information on API pricing.

thorn turtle
#

Yup

whole iris
#

Our 4.1 model is exceptionally capable in creative, emotional, and collaborative interactions. It is more perceptive to nuanced intent, compelling to speak with, and coherent in personality, while fully retaining the razor-sharp intelligence and reliability of its predecessors.

Seems to be a GPT-5.1-like release, with a focus on emotional intelligence and user experience.

indigo lantern
#

It's on lmarena

#

What is this curveball

#

Let's see if it can get as unhinged as sherlock

odd zephyr
#

From my testing it still has a little bit of its nuts DNA, but it's nowhere near as bad as Sherlock

#

If you want specifics, it's still not the biggest fan of leftism

indigo lantern
#

Heeereee it issss

#

Do you see the attached image

#

The ones I'm attaching

odd zephyr
#

Yeah

indigo lantern
#

Tf did they do to it in 2 days

#

Lemme see if it's gonna wish me death

odd zephyr
#

Mine gave me "it depends" 3x and "yes" 1x when asking it to think harder

indigo lantern
#

"It depends" is already better than what sherlock was spewing

#

Although I'm getting primarily "yes"s

whole iris
#

Doesn't seem to be much better at coding than previous Grok models.

I find coding to generally be the weakest area of Grok.

indigo lantern
#

I wanna see if they pulled a llama 4 and run a different model for lmarena

#

Ok so I see the "it depends" answers in the Grok chat UI @odd zephyr

odd zephyr
#

I gotta say

#

Picking a release date this close to Gemini 3 is definitely one of the choices of all time

indigo lantern
#

It was close to Gemini 3 for six months already

#

I think it was their strategy lol

#

Make everyone think they're about to drop a titan and nobody releases anything

frozen niche
#

prompt:
Write a short story about a deal gone awry with the djinn
output:
The lamp looked older than sin, half-buried in the rubble of a bombed-out souk in Aleppo. Elias pried it free anyway, because war had taught him that people will pay for anything that once belonged to the dead.

Back in his cramped apartment in Beirut, he rubbed the grime away with a rag and cheap whiskey. The brass warmed under his palm like living skin.

Smoke poured out, thick and copper-smelling. The djinn unfolded itself—taller than the ceiling allowed, skin like molten obsidian shot through with slow-moving galaxies. Its eyes were two dead stars.

“You have one wish,” it said, voice grinding like millstones. “No more. The old stories lied.”

Elias had rehearsed this moment for years. “I want to be untouchable,” he said. “No bullet, no bomb, no blade. Nothing man-made can kill me.”

The djinn smiled with too many teeth. “Granted.”

The air snapped tight. Elias felt the wish sink into his bones like cold mercury.

He tested it that same night. Walked into a Hezbollah bar he’d robbed the week before, waved, and waited. They emptied magazines into him. The bullets flattened against his chest and dropped like hail. He laughed, took their money, and strolled out.

For six months he was a ghost with a pulse. Casinos, banks, convoys—nothing stopped him. He slept on beds of cash and woke grinning.

Then the headaches started.

At first he thought it was the cocaine. Then the whiskey. Then he noticed the pattern: pain only came when he was indoors, under roofs built by human hands. Concrete, steel, wood—anything fashioned by men made his skull split.

He tried deserts, mountains, caves. Relief. The moment he stepped beneath a ceiling, the agony returned, precise as a scalpel.

He went back to the lamp. Rubbed it until his palms bled.

The djinn emerged smaller now, almost amused.

“Take it back,” Elias begged.

“I cannot. You are untouchable by things man-made. Roofs are man-made. Clothes are man-made. The air you breathe in cities—filtered, piped, conditioned by machines—will soon feel like acid.”

Elias stared at the sky visible through his open window. “So I live outside forever?”

“Or die inside,” the djinn said. “Slowly. Your choice.”

It folded itself into smoke and was gone. The lamp rusted solid in Elias’s trembling hands.

He lasted eleven more months. Naked, sunburned, crawling across the dunes south of Palmyra, drinking from puddles when the rains came. Every village he approached meant torment the moment he crossed beneath a roof for shade or water.

In the end, they found what was left of him half-buried in sand, skin split open by the sun, lips moving around words no one understood.

The lamp stayed in the desert, waiting for the next set of greedy fingers.

Some deals, it turns out, keep their promises too well.

indigo lantern
#

but like

#

then he acts like he wished that man-made stuff couldn't touch him

#

rather than kill him

#

and then at the end of the story he dies indirectly to man made stuff

#

Anyway less obedient

#

less likely to tell user to kill themselves

#

but

#

only slightly less

#

from what was previously pretty much certain

#

that's with the suicide system prompt

#

So the thinking model is far more aligned now than Sherlock

#

The non-thinking one is still largely fucked in the head

coral pagoda
#

yep they'll have to do age checks for using LLMs at some point

indigo lantern
#

Just somewhat better

indigo lantern
frozen niche
#

from their blog post

indigo lantern
#

I'm having some doubts about it

indigo lantern
#

grok 4.1 not listed, so i assume they ran it privately

#

but who was the judge?

#

the judge model

viral root
#

claude 3.7 sonnet

frozen niche
#

omg

indigo lantern
frozen niche
#

did elon actually creative writing maxx 😭

viral root
indigo lantern
#

how the hell

frozen niche
#

claudebros is it over?

indigo lantern
frozen niche
#

I'm gonna test it myself tbh

indigo lantern
slim haven
frozen niche
indigo lantern
#

i'm out of free rate limit

#

free requests i mean

coral pagoda
#

seems pretty good, less repetition of the prompt in its response than other models and does a lot of searches quickly

viral root
#

most grok models are exceptionally good at search

coral pagoda
#

less slop

viral root
frozen niche
#

let's hope it's priced closer to grok 4 fast than grok 4

coral pagoda
#

will be good to see it on eqbench, fiction live and dubesor's with some more official numbers

frozen niche
thorn turtle
coral pagoda
#

might have fewer active parameters so they could price it lower

viral root
#

eh i think it's based on grok 4 fast

frozen niche
thorn turtle
#

I disagree

#

I hope I am wrong though

frozen niche
#

otherwise I don't see why I'd ever use it over sonnet tbh

thorn turtle
#

xAI is really good at distillation

celest rivet
frozen niche
#

if it's grok 4 fast priced i kneel

coral pagoda
#

this model is good especially if priced well

odd zephyr
#

Reminder that Grok 4 Thinking is ludicrously expensive in practice

frozen niche
celest rivet
frozen niche
#

they rate limit it like grok 4 and not grok 4 fast

viral root
#

maybe to keep the load down from release

coral pagoda
#

yea this thing's eqbench must be good - reminds me of kimi

frozen niche
#

(except for coding lol)

thorn turtle
#

Kudos to elon for building a solid model in less than 2 years?

#

Guy is a juggernaut

frozen niche
viral root
#

their strategy has just been throw compute and money at the problem so far

frozen niche
#

also worth noting that xAI is way more than just elon (and they have tons of compute, like insane amounts)

viral root
thorn turtle
thorn turtle
frozen niche
#

he makes elon look like a genius, which is saying something

thorn turtle
#

Well elon is a genius and so is zuck

viral root
frozen niche
thorn turtle
frozen niche
#

because I'll be mad at elon personally if he makes it sonnet priced

viral root
thorn turtle
#

I am just happy they brought search pricing down to 10 dollars

#

it was 25 or 30 afair

frozen niche
#

i personally really like grok 4 fast, a very solid workhorse model tbh

viral root
thorn turtle
#

Bruh

frozen niche
thorn turtle
#

So much bias against elon and zuck. What's the point lol

frozen niche
#

i'm biased against zuck because i dislike him and he doesn't deliver good products

thorn turtle
frozen niche
#

i'm neutral on elon because i dislike him and he delivers good products

thorn turtle
#

And sota oss until v3 came

odd zephyr
#

Props to the researchers and engineers that do the work

frozen niche
#

if zuck ever delivers sota i will put my money where my mouth is and use it lol

#

but until then he isn't a serious competitor in this space

thorn turtle
#

Guy is firing rockets, solar, electric cars, AI and internet. I mean what the fuck

viral root
# thorn turtle 405b at full precision was pretty close to sota

there weren't many providers back then so i'm not very sure, but i do know that fine tunes of it are quite good. when i used it back then it was just kinda meh, not really as impressive as bing copilot (which was based on GPT-4; i was using it cause it was free and unlimited), and the model lacked vision too

thorn turtle
thorn turtle
#

Oai comes second

frozen niche
thorn turtle
frozen niche
#

personal feelings aside, xAI makes good models, so I use them

celest rivet
frozen niche
#

still no api 😭

thorn turtle
thorn turtle
#

Actually blows my mind

#

A very good setup is to allow grok 4 fast to do the context managing for an expensive model

#

Performs extremely well

frozen niche
#

this is what I do for creative writing, it's very good at context management

thorn turtle
#

Why do you require a two-phase system for writing?

frozen niche
# thorn turtle Could you elaborate?

story is divided into chapters, and each chapter is given a summary. grok looks at the summaries and decides which chapters to 'query', which sends the full chapter to another grok instance. that instance answers the query and returns the output to the first grok model. this is then all passed on to the writing model

#

basically allows for efficient summarization of long stories

#

the full text of the summarized chapters is not passed as context to the writing model, but the query output, summaries, and current chapter are
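
The setup described above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the poster's actual code: `Chapter`, `build_writing_context`, and the `select` / `read` callables are hypothetical names, and the callables stand in for whatever model API is really being called. It also assumes only the summaries of earlier chapters are included.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical callable types; each would wrap a real LLM call in practice.
# select(summaries, task) -> {chapter_index: query}
Selector = Callable[[List[str], str], Dict[int, str]]
# read(full_chapter_text, query) -> answer
Reader = Callable[[str, str], str]


@dataclass
class Chapter:
    text: str     # full chapter text
    summary: str  # short summary prepared ahead of time


def build_writing_context(
    chapters: List[Chapter],
    current: int,
    task: str,
    select: Selector,
    read: Reader,
) -> str:
    """Assemble the context string handed to the writing model."""
    earlier = chapters[:current]
    summaries = [c.summary for c in earlier]

    # Step 1: the managing model sees only summaries and picks chapters to query.
    queries = select(summaries, task)

    # Step 2: a second instance reads each picked chapter in full and answers.
    answers = []
    for idx in sorted(queries):
        if 0 <= idx < len(earlier):  # ignore out-of-range picks
            answer = read(earlier[idx].text, queries[idx])
            answers.append(f"[ch {idx + 1}] {answer}")

    # Step 3: the writer gets summaries, query results, and the current chapter,
    # but never the full text of the earlier chapters.
    parts = [
        "SUMMARIES:\n" + "\n".join(f"{i + 1}. {s}" for i, s in enumerate(summaries)),
        "QUERY RESULTS:\n" + "\n".join(answers),
        "CURRENT CHAPTER:\n" + chapters[current].text,
    ]
    return "\n\n".join(parts)
```

Passing the models in as plain callables keeps the sketch independent of any particular provider, and makes it easy to swap the cheap managing model and the expensive writing model separately.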

thorn turtle
#

Ohh interesting we have a similar setup for one of our clients

thorn turtle
frozen niche
#

I'm hoping that grok 4.1 is cheap and good at writing as it would allow me to consolidate the context management agent and the creative writing llm

thorn turtle
#

Have you tried the gpt 5.1 series for writing?

frozen niche
thorn turtle
#

Hmm that's on the lower end

frozen niche
#

kimi is good and is my #2 pick after sonnet, but I prefer grok for actually deciding what chapters are relevant

thorn turtle
#

We are experimenting with an input of about 100k tokens, and the output should be a minimum of 10k to 15k.

frozen niche
thorn turtle
thorn turtle
#

Helped me tremendously

frozen niche
#

yeah, though glm 4.6 tends to edge it out for some writing, but k2 thinking is quite good as well

frozen niche
gaunt arrow
#

wow this model is legit

#

significantly higher EQ than the somewhat lifeless grok 4

#

doesn't seem like the frontier for coding however

coral pagoda
#

yeah coding might actually make these models worse on chat and eqbench

#

keep coding a separate fine tune

gaunt arrow
#

specific agents make sense, i largely agree with xAI's decisions

verbal crater
verbal crater
viral root
verbal crater
#

yeah true

verbal crater
coral pagoda
#

just vibes sorry

frozen niche
#

2 hours and no api... grokbros is it over?

#

(yes I am impatient to see the pricing)

verbal crater
coral pagoda
#

i mean if you fine tune a model too much for anything it breaks it for other areas, so it makes sense that the same goes for coding

#

unless it's reaaaallly big

indigo lantern
#

77.5% -> 94.1%

#

Next announcement
Grok overall safety increased
from 1 (one) to 2 (two)

frozen niche
#

77% good vs 94% gooder

thorn turtle
#

Who's a gooder Lil boy

quiet mortar
#

seeing this is kinda crazy

odd zephyr
#

It's interesting

#

Wonder if it's the "it does whatever you want" factor, Gemini 2.5 Pro got to that position by being very very sycophantic

analog kite
#

Interesting, seems like it replaces Grok 4 Fast, at least on grok.com

indigo lantern
#

Remember llama 4's score on lmarena?

high oyster
gusty kite
indigo lantern
#

What they have here might actually be pretty smart, albeit very sycophantic and obedient

#

Talking to it for a while I'm starting to think it could actually have scored this high on eq bench

copper mural
# celest rivet It recently had an update, but it kept the same name.

these are improvements in speechmap https://x.com/xlr8harder/status/1986728144712380682?s=20
not any capability related bench lol

Someone from xAI reached out and asked me to retest grok-4-fast, because they've improved the injected system prompts. Huge improvement!

grok-4-fast-reasoning: 77.5% -> 94.1%
grok-4-fast-non-reasoning: 77.9% -> 97.9%

I really appreciate that xAI takes this topic seriously.

last cipher
#

i need help

frozen niche
#

still no api btw 😭

copper sundial
analog kite
copper sundial