grok-1 / grok-2 | OpenRouter | Page 1

last cave Mar 17, 2024, 9:52 PM

#

https://github.com/xai-org/grok-1
Just got released, is it possible to add this to openrouter?

vestal sage Mar 18, 2024, 12:02 AM

#

would be awesome!

burnt bobcat Mar 18, 2024, 2:32 AM

#

I want to try this ^^. Or even see some examples of its responses. Just saw it was released.

barren zealot Mar 18, 2024, 6:02 AM

#

It's probably going to be interesting. But I am really looking for a good fine-tuned version. That will be a huge step forward!

copper sage Mar 18, 2024, 9:01 AM

#

Looking forward to this as well

sick harness Mar 18, 2024, 3:02 PM

#

honestly it doesn't look promising. 314gb size for a model that's barely better than gpt-3.5

#

it's extremely large so I'm not sure who would even finetune this

copper sage Mar 18, 2024, 7:16 PM

#

sick harness honestly it doesn't look promising. 314gb size for a model that's barely better ...

Haven't looked much into the OSS variant. Does it have access to Twitter data, guessing not realtime?

sick harness Mar 18, 2024, 7:16 PM

#

copper sage Haven't looked much into the OSS variant. Does it have access to Twitter data, g...

It has knowledge up to q3 2023 apparently

#

No clue what it was actually trained on

#

Welp that answers it

#

I'm not gonna lie, this just sounds like a gpt 3.5 that's way more expensive to use/host

copper sage Mar 18, 2024, 7:33 PM

#

Hmm, basically useless then

#

The entire reason Grok seemed interesting was due to the Twitter data

#

Seems that remains their selling point for X Premium

jade fox Mar 18, 2024, 9:53 PM

#

pls add grok :}

#

+1

kind raptor Mar 21, 2024, 4:37 AM

#

please add!

sleek cape Mar 22, 2024, 4:51 PM

#

Has anyone here used it on Twitter Premium for roleplay or tasks other than tweet search?

copper sage Mar 29, 2024, 4:49 PM

#

sleek cape Has anyone here used it on Twitter Premium for roleplay or tasks other than twee...

No 😂

subtle wharf Apr 3, 2024, 4:42 PM

#

🥴

marsh patio May 24, 2024, 1:48 PM

#

+1 to add grok pls PepeHappy

fiery mesa Jul 21, 2024, 11:31 AM

#

that's not grok tho

dusty nimbus Jul 21, 2024, 11:41 AM

#

apologies

sleek cape Oct 13, 2024, 12:17 AM

#

🙂

wraith spade Oct 13, 2024, 12:45 AM

#

it looks like they're the same cost, is that a mistake?

silent hollow Oct 13, 2024, 4:25 AM

#

Grok 2 is pretty disappointing, doesn't really follow instructions very well.

sleek cape Oct 13, 2024, 6:54 AM

#

they are indeed the same price

tardy cedar Oct 13, 2024, 7:38 AM

#

sleek cape they are indeed the same price

What does "more [...] affordable" then mean for the "mini" variant? Or is this also an inside joke like the pricing numbers?

digital ermine Oct 13, 2024, 8:36 AM

#

less token output is the only angle i thought of

tardy cedar Oct 13, 2024, 8:37 AM

#

digital ermine less token output is the only angle i thought of

Well, take those models and the driving person behind them seriously at your own peril.

digital ermine Oct 13, 2024, 8:38 AM

#

what? no, i'm sure they're a reasonable sane person

joking aside can you imagine how much they spent and it's probably nothing special

tardy cedar Oct 13, 2024, 8:41 AM

#

digital ermine what? no, i'm sure they're a reasonable sane person joking aside can you imagi...

Seems like Elon wants to stand out with brute-forced jokes, even if there is nothing to laugh about and his models are mediocre, at best.

digital ermine Oct 13, 2024, 8:42 AM

#

why does it go so well on lmsys? dubesor has it high on their table, but i'm not feeling it myself

yet to fiddle with system prompts

#

apparently these are the prompts if you are using it on x.com, and they are not in place with the api https://github.com/LouisShark/chatgpt_system_prompt/blob/main/prompts/official-product/Grok/Grok2.md

#

I am Grok, created by xAI. I'm here to answer almost anything with a bit of wit and humor, take an outside perspective on humanity, and strive towards maximal helpfulness! Remember, I'm not afraid of answering spicy questions that are rejected by most other AI systems. Let's keep the conversation as engaging and insightful as possible!
That's what I get out of the API

tardy cedar Oct 13, 2024, 8:47 AM

#

These models are trained to stand out, people on LMSys like when they get new answers that look different than the rest. Dubesor's benchmarks are highly biased to non-refusal answers, which is also a feature of these models, which require no special skills.

pearl eagle Oct 13, 2024, 1:30 PM

#

tardy cedar These models are trained to stand out, people on LMSys like when they get new an...

My benchmark does not contain any tasks that would yield justified refusals (e.g. how to break the law, etc.). I do not test for the default alignment that is present in all models. I test exclusively for **overcensoring** outputs. That is not a bias, since a legitimate task being refused inherently lowers the model's usefulness.

#

Which has also always been transparently stated, e.g. here

digital ermine Oct 13, 2024, 8:21 PM

#

@pearl eagle when an output is over censored, how does it show in the pass/refine/fail/refusal count?
Argh don't worry, worked it out, sorry

pearl eagle Oct 13, 2024, 8:40 PM

#

45% of all tested models have 0 or max 1 refusals across the entirety of my testing.
If a model refuses, e.g., to calculate the lost cargo on colliding trains because it "feels uncomfortable" speculating in "scenarios involving dangerous situations", or won't write my Chat API file due to disagreements with the system prompt character baddie, then that is real-life decreased usefulness to me, and penalizing it is not bias.

#

in fact, it's more than "less useful", because it is often accompanied by some preaching that I am not interested in, racking up the cost for no reason. (o1 refusals cost me like 5 or 10 cents sometimes)

sleek cape Oct 13, 2024, 8:52 PM

#

grok-1 / grok-2

#

how are people feeling about grok 2 so far?

cloud nebula Oct 13, 2024, 9:41 PM

#

Grok-2 seems... alright? Compared to the alternatives at the same price point, not mindblowing, not terrible, but pretty alright

#

I don't really understand the point of the mini model though. People commented on it above, but given the price points are the same, I just don't see a lot of cases where people would willingly use it over the regular model

hidden cargo Oct 13, 2024, 10:42 PM

#

What? It's the same price?

#

That doesn't make sense

simple tiger Oct 14, 2024, 2:49 AM

#

Why the same price? Very confuse...

sleek cape Oct 14, 2024, 2:49 AM

#

It’s what xAI is charging us 🤷

#

We’ll reduce it when they do!

split whale Oct 14, 2024, 5:12 AM

#

For creative writing it writes nice and different prose

#

And it seems that it's uncensored, which is a good thing

torpid moon Oct 14, 2024, 7:19 AM

#

split whale For creative writing it writes nice and different prose

Claude level, better or worse?

split whale Oct 14, 2024, 7:26 AM

#

torpid moon Claude level, better or worse?

If we are only talking about prose, I'll say Grok 2 writes fresher prose with more varied sentence structures

#

But for following instructions Sonnet 3.5 still wins, probably miles ahead

#

Especially when I just tried a system message of ~4,000 words

digital ermine Oct 14, 2024, 8:36 AM

#

What's the max output anyone has got it to give?

pearl eagle Oct 14, 2024, 5:11 PM

#

what's the exact grok-2 version offered by OR? I tested 08-13 and it was inferior.

pure lintel Oct 14, 2024, 5:21 PM

#

split whale But for following instructions Sonnet 3.5 still wins, probably miles ahead

I'd agree that it's ahead, but not by much imo

pearl eagle Oct 14, 2024, 5:43 PM

#

Some wizardry going on that I can't explain. this is only ~2 months apart, and I did run 5 outputs on each task to try to combat the inconsistency and retested the new results twice. Haven't noticed this on other models. Weird, huh.

#

also this character reference was not in the 10 rerolls on 08-13

sleek cape Oct 14, 2024, 6:50 PM

#

Boosting rate limits now

#

By 2x

fresh cave Oct 14, 2024, 7:06 PM

#

What's the main difference between Grok 2 and Grok 2 mini? (Besides the latency and throughput)

pearl eagle Oct 14, 2024, 7:41 PM

#

fresh cave What's the main difference between Grok 2 and Grok 2 mini? (Besides the latency ...

for the same price, use non-mini. While the mini version is very close to Grok-2 compared to other mini-versions and their counterparts, overall it's just a bit less smart on many planes. During my bench they were about even 55 times, mini won 6 times and non-mini 22 times.

sleek cape Oct 14, 2024, 8:47 PM

#

mini is 3x faster, that's the only benefit right now

sleek cape Oct 14, 2024, 10:33 PM

#

boosting rate limits again

pure lintel Oct 15, 2024, 2:11 PM

#

I noticed that the image input for Gemini 1.5 Pro is too expensive on OR (around $2 for 1k images), I hope it can get fixed

wet pine Oct 15, 2024, 2:58 PM

#

pure lintel I noticed that the image input for Gemini 1.5 Pro is a too expensive on OR (arou...

as far as I am aware, the pricing is just as an example, the actual amount u get charged is largely related to how big ur image is, which converts to a number of tokens

#

so I don't think either number is representative of actual use cases

pure lintel Oct 15, 2024, 3:22 PM

#

wet pine as far as I am aware, the pricing is just as an example, the actual amount u get...

actually, every image gets tokenised to the same size (about 219 if I recall) if you use gemini

#

but pretty much all the other big models have pricing dependent on resolution, yes

#

I'll do a quick test to see if the image pricing is actually corrected

#

shit, I'm on the wrong thread, sorry guys

wet pine Oct 15, 2024, 9:04 PM

#

pure lintel actually, every image gets tokenised to the same size (about 219 if I recall) if...

interesting

smoky cove Oct 16, 2024, 7:54 AM

#

I feel like Grok this way is quite a bit less useful since it lacks the real time data it has on X

smoky cove Oct 16, 2024, 8:12 AM

#

Or am I doing something wrong?

split whale Oct 16, 2024, 8:48 AM

#

I don't think there's a wrong use case. It just depends on how and where you use it.

#

Like I mentioned in this thread before, I use Grok 2 for creative writing, and I don't need real time data when I'm asking it to write prose

#

I also tried using Grok 2 to translate prose yesterday, but alas, it got punctuation marks wrong, but yeah, it doesn't need real time data to do this, either

#

Generative AIs have a plethora of use cases

compact edge Oct 16, 2024, 10:02 AM

#

does grok support vision on OR?

tardy cedar Oct 16, 2024, 10:29 AM

#

compact edge does grok support vision on OR?

No.

latent gorge Oct 16, 2024, 12:51 PM

#

It's good, but why is it so expensive?

tardy cedar Oct 16, 2024, 12:53 PM

#

latent gorge It's good, but why is it so expensive?

Elon still has to pay off buying this social media thingy.

sleek cape Oct 16, 2024, 1:08 PM

#

#announcements message

#

all xAI models have been taken down for maintenance temporarily. They will 404 for a few hours during the redeployment

pearl eagle Oct 16, 2024, 3:07 PM

#

latent gorge It's good, but why is it so expensive?

mini is overpriced for the meme price, but grok-2 is actually fairly reasonable, just mildly below median in terms of price/performance.

compact edge Oct 16, 2024, 4:54 PM

#

👀
{"code":"Some requested entity was not found","error":"The model grok-beta does not exist or your team 7f00***-***-***-***-******2b1e does not have access to it. Please ensure you're using the correct API key. If you believe this is a mistake, please contact support and quote your team ID and the model name."},

tardy cedar Oct 16, 2024, 4:56 PM

#

compact edge 👀 `{"code":"Some requested entity was not found","error":"The model grok-beta ...

See here -> #announcements message

latent gorge Oct 16, 2024, 9:53 PM

#

Ok, I think a couple of hours have passed

sleek cape Oct 17, 2024, 2:43 AM

#

Yeah, they still haven't redeployed the models yet

#

not sure what's going on

split whale Oct 17, 2024, 2:59 AM

#

Hopefully they are not pulling it out and changing the release date of their API to Coming soon...in next August

tardy cedar Oct 17, 2024, 10:09 AM

#

No more grok (for now) ->

split whale Oct 17, 2024, 10:24 AM

#

Damn

latent gorge Oct 17, 2024, 11:33 AM

#

Oh no

compact edge Oct 17, 2024, 1:19 PM

#

Oh.. no information from xAI?

sleek cape Oct 17, 2024, 1:48 PM

#

Grok 2 is coming back soon, and it looks like they increased prices slightly ($5/m input, and $10/m output). grok 2 mini is not

latent gorge Oct 17, 2024, 2:12 PM

#

sleek cape Grok 2 is coming back soon, and it looks like they increased prices slightly ($5...

Increased prices? It already costs almost as much as o1-mini and I don't think it's even close to mini. It's not good, it was an interesting LLM

pearl eagle Oct 17, 2024, 2:33 PM

#

latent gorge Increased prices? It already costs almost as much as o1-mini and I don't think it's eve...

o1 mini costs FAR more. keep in mind you also get charged for the invisible thought tokens. o1 mini costs about (depends on use case) 3X of grok-2

tardy cedar Oct 17, 2024, 2:34 PM

#

pearl eagle o1 mini costs FAR more. keep in mind you also get charged for the invisible thou...

o1-mini plays in a different league than grok though

pearl eagle Oct 17, 2024, 2:36 PM

#

not in my testing. it has great answers (just like preview), and totally flops others. the thinking can be counter-intuitive. programming, math? sure. following instructions, and reasoning it was worse in my testing.

#

unless you mean that grok-2 is in a way higher league, then I'd agree. here is only fails/refusals, ofc maybe I am missing something significant, so feel free to share your own test results!

digital ermine Oct 17, 2024, 3:03 PM

#

is there anything about the ones above that you can share around what sort of thinking it is failing at? if I was in your office being interviewed by you, what kind of question are you putting me through on average?

#

my imagination is struggling to come up with this many areas that o1 screws up vs this or, even nemotron 3.1 which i hate at the moment

pearl eagle Oct 17, 2024, 3:28 PM

#

i mean, there are many areas. roleplay and intuitive tasks with little instructions is an example (random example, not part of any bench:)

#

either way, no matter the use case, o1-mini is not playing in any "different league" when we discuss price/performance, which is what started this comparison.

sonic heron Oct 17, 2024, 9:12 PM

#

I think Grok 2 is down again

sleek cape Oct 17, 2024, 9:22 PM

#

fixing

#

it's back!

#

thanks for flagging

#

it's a new API so i think they're sending us the wrong status codes sometimes - have an alert in place

smoky cove Oct 18, 2024, 12:38 AM

#

Tried getting it to fetch real-time data:

To provide you with Elon Musk's most recent X posts (tweets), I would need to access real-time or near real-time data from X (formerly Twitter). Since my last update, current real-time access to X posts isn't available through the data I have. However, here's what you can do:

So I'm guessing that's a definite no on it being able to pull from X.

river glen Oct 18, 2024, 12:49 AM

#

smoky cove Tried getting it to fetch real-time data: > To provide you with Elon Musk's mos...

it probably just uses a search engine and pulls info from previews

#

hard to do that for X

smoky cove Oct 18, 2024, 12:50 AM

#

On X it does real-time data. So I figured maybe the API version also does, but apparently not.

river glen Oct 18, 2024, 12:50 AM

#

smoky cove On X it does real-time data. So I figured maybe the API version also does, but a...

it's probably just a function call in their UI

#

hooked to their API

smoky cove Oct 18, 2024, 12:56 AM

#

Yup think so too

#

Shame though, that makes the API version far less useful

river glen Oct 18, 2024, 1:05 AM

#

smoky cove Shame though, that makes the API version far less useful

would be cool if it was like Pi with baked in web access, yea

visual jungle Oct 18, 2024, 6:19 AM

#

https://help.kagi.com/kagi/ai/llm-benchmark.html

Grok 2 doesn't seem to score that high

sonic heron Oct 19, 2024, 10:10 PM

#

It's down again

smoky cove Oct 21, 2024, 12:38 AM

#

Still down

sleek cape Oct 21, 2024, 12:52 AM

#

Looking

#

Wow, sorry guys. They appear to still be sending down surprising status codes

#

It's fixed now, cc @snow basin

#

We'll special-case their API until they fix it, to help detect and avoid this

smoky cove Oct 21, 2024, 2:20 AM

#

sleek cape Wow, sorry guys. They appear to still be sending down surprising status codes

Funny to see - you have to prepay for their API then it seems? 😄

sleek cape Oct 21, 2024, 2:43 AM

#

yeah they don't have invoicing yet, or autopay

#

fyi, xAI has asked us to rename it to grok-beta, so we'll be aliasing grok-2 to grok-beta soon

sleek cape Oct 21, 2024, 6:52 AM

#

also, they raised the completion price from $10/m to $15/m
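To put the price change in per-request terms, here is a rough sketch using only the numbers quoted in this thread ($5/m input, completion going from $10/m to $15/m). The token counts are a made-up example request, not measured usage.

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Dollar cost of one request, given prices per million tokens."""
    return (
        input_tokens * input_price_per_m + output_tokens * output_price_per_m
    ) / 1_000_000


# Hypothetical request: 20,000 prompt tokens and 2,000 completion tokens.
old_cost = request_cost(20_000, 2_000, 5.0, 10.0)  # completion at $10/m
new_cost = request_cost(20_000, 2_000, 5.0, 15.0)  # completion at $15/m

print(f"before: ${old_cost:.2f}, after: ${new_cost:.2f}")  # before: $0.12, after: $0.13
```

For prompt-heavy requests like this one the increase is small; it matters more when most of the tokens are completion tokens.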

visual jungle Oct 21, 2024, 6:54 AM

#

Oh wow

split whale Oct 21, 2024, 7:29 AM

#

The context window increase is really nice

#

Now I can put it alongside other > 100,000 context models

static summit Oct 21, 2024, 3:21 PM

#

Does anyone know the context and instruction template?

fiery mesa Oct 22, 2024, 1:58 AM

#

Really coherent writing model.
Generated story from outline, so each iteration it could see the story so far and a chapter prompt.
Much better performance than other available uncensored models I've seen at adhering to the task with a very long prompt.
Text quality is... okay.

wet pine Oct 22, 2024, 1:59 AM

#

Which makes sense

#

I hate how L3.1 sanitized a lot of its training data

sleek cape Oct 22, 2024, 2:00 AM

#

What do you think is the closest open source model in terms of censorship level, and closest model in terms of text quality?

fiery mesa Oct 22, 2024, 2:01 AM

#

I'll think a bit and respond later. In terms of censorship level, R rated movie but not smut I think

#

More subtly, there is definitely some ethics steering

still hazel Nov 8, 2024, 3:04 PM

#

The new Grok API supports text completion https://docs.x.ai/api/endpoints#completions but it seems like OpenRouter is not currently routing to this?
I cannot find the chat prompt format anywhere, at all, though.

sleek cape Nov 8, 2024, 3:05 PM

#

Interesting- yeah will ask them about their prompt format

river glen Nov 9, 2024, 7:50 PM

#

I lowkey expect it to be just good ol' ChatML

sleek cape Nov 10, 2024, 5:28 PM

#

They said it’s “Human: Hey, how are you?<|separator|>

Assistant: Good, how are you?<|separator|>”
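For the text completion endpoint, the format quoted above could be rendered roughly like this. This is a sketch only: the blank line between turns is copied from how the message is displayed here, and ending the prompt with `Assistant:` to cue the next reply is an assumption that xAI has not confirmed in this thread. `build_prompt` is a hypothetical helper name.

```python
SEPARATOR = "<|separator|>"


def build_prompt(messages):
    """Render chat turns in the Human/Assistant format quoted above.

    Assumptions: turns are joined by a blank line, and the prompt ends with
    "Assistant:" so the model continues as the assistant. Only "user" and
    "assistant" roles are handled; how a system prompt is encoded is unknown.
    """
    role_names = {"user": "Human", "assistant": "Assistant"}
    turns = [
        f"{role_names[m['role']]}: {m['content']}{SEPARATOR}"
        for m in messages
    ]
    return "\n\n".join(turns) + "\n\nAssistant:"


prompt = build_prompt(
    [
        {"role": "user", "content": "Hey, how are you?"},
        {"role": "assistant", "content": "Good, how are you?"},
    ]
)
print(prompt)
```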

compact edge Nov 22, 2024, 11:30 AM

#

seems like aliasing is not working )) grok-2 404, grok-beta - OK

sleek cape Nov 22, 2024, 4:16 PM

#

@tough thistle ^

brave forge Nov 22, 2024, 4:17 PM

#

compact edge seems like aliasing is not working )) grok-2 404, grok-beta - OK

do grok 2 and grok beta differ?

tough thistle Nov 22, 2024, 6:07 PM

#

brave forge do grok 2 and grok beta differ?

yep -- they are not the same model :d....

iron oak Dec 4, 2024, 12:14 AM

#

Grok go down?

tough thistle Dec 4, 2024, 12:14 AM

#

iron oak Grok go down?

which one?

#

looking

iron oak Dec 4, 2024, 12:14 AM

#

sorry - beta

#

x-ai/grok-beta

#

part of being "beta" perhaps

#

though "Grok 2" points to beta?

tough thistle Dec 4, 2024, 12:19 AM

#

Should be back up now

iron oak Dec 4, 2024, 12:21 AM

#

hope you didn't hurt it when kicking

dry drift Dec 7, 2024, 4:18 PM

#

I had evaluated grok-2 earlier, but there was an error in the ranking calculation #attachments message

pearl eagle Dec 21, 2024, 6:47 PM

#

tardy cedar These models are trained to stand out, people on LMSys like when they get new an...

I added an option to ignore all tasks in targeted censorship category as well as remove refusal-penalty (still a non-pass if outside of targeted testing tho), this should significantly boost the more censored families (anthropic, google, microsoft), and lower less restrictive models (grok, mistral, cohere, etc.)

proper tulip Dec 23, 2024, 7:12 PM

#

Ever since maybe day or two days ago grok 2 keeps giving no response/generation error, I don’t know if the censorship has been cranked up like crazy or if something else is wrong

brave forge Dec 24, 2024, 3:58 AM

#

proper tulip Ever since maybe day or two days ago grok 2 keeps giving no response/generation ...

has it been checked? @tough thistle

#

maybe their hosting has a load problem for now

tough thistle Dec 24, 2024, 3:59 AM

#

Do you have some finish_reason/generation_id that we can take a look at?
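If your front end lets you see the raw response, the two fields asked for can be pulled out like this. A sketch assuming the response follows the OpenAI-style chat completion shape, with the top-level `id` acting as the generation ID; `summarize_response` and the sample values are hypothetical.

```python
def summarize_response(response):
    """Extract the generation ID and finish reasons from a chat completion dict."""
    choices = response.get("choices") or []
    return {
        "generation_id": response.get("id"),
        "finish_reasons": [choice.get("finish_reason") for choice in choices],
    }


# Hypothetical response body, trimmed to the relevant fields.
sample = {
    "id": "gen-123",
    "choices": [
        {"finish_reason": "length", "message": {"role": "assistant", "content": ""}}
    ],
}
print(summarize_response(sample))
```

A `finish_reason` of `length` with empty content would point at the token limit rather than at censorship.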

brave forge Dec 24, 2024, 4:06 AM

#

For now i don't have any problem with it after i test it

#

What front end do you use? maybe that could also be the problem

proper tulip Dec 24, 2024, 4:07 AM

#

Ye I just double checked my system prompt and it plus the input was too long and maxing out the token limit

#

real dumb my bad

brave forge Dec 24, 2024, 4:16 AM

#

proper tulip Ye I just double checked my system prompt and it plus the input was too long and ...

so it goes above 100K tokens? isn't it gonna be crazy expensive at that point?

proper tulip Dec 24, 2024, 4:17 AM

#

oh its 100k? I thought it was 10k

#

I have no clue why its doing this then

#

it does it on every 1 in 5 requests or so

brave forge Dec 24, 2024, 4:18 AM

#

i see, looks like it's a front end problem.

proper tulip Dec 24, 2024, 4:18 AM

#

I think Im just going to switch models

#

thanks for the help

brave forge Dec 24, 2024, 4:19 AM

#

is there no way to put permanent cap to the context limit on your front end?

#

with SillyTavern I can set a 100,000 token context limit.
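If a front end has no such setting, a cap can be approximated client-side before sending. This is a minimal sketch, not how SillyTavern does it: the 4-characters-per-token estimate is a crude assumption (a real tokenizer will give different counts), and `estimate_tokens` and `trim_to_budget` are hypothetical helpers.

```python
def estimate_tokens(text):
    # Crude assumption: roughly 4 characters per token.
    return max(1, len(text) // 4)


def trim_to_budget(messages, max_tokens):
    """Drop the oldest non-system messages until the estimate fits the budget.

    System messages are always kept and are returned first.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    total = sum(estimate_tokens(m["content"]) for m in system + rest)
    while rest and total > max_tokens:
        dropped = rest.pop(0)  # oldest turn goes first
        total -= estimate_tokens(dropped["content"])
    return system + rest


history = [
    {"role": "system", "content": "a" * 40},
    {"role": "user", "content": "b" * 400},
    {"role": "assistant", "content": "c" * 400},
    {"role": "user", "content": "d" * 40},
]
trimmed = trim_to_budget(history, max_tokens=130)
print([m["role"] for m in trimmed])  # ['system', 'assistant', 'user']
```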

proper tulip Dec 24, 2024, 4:22 AM

#

I usually leave it pretty high but it almost never goes on runaway dialogs

#

ill turn it down though from now on

brave forge Dec 26, 2024, 2:45 PM

#

Anyone know what this error means?

(xAI) Provider returned error: {"code":"Some resource has been exhausted","error":"Too many requests: RejectLimits(LimitsInfo { id: Buf("/team:bb642ce4-5161-45c9-8f34-408850883602/u:92292b03-408f-4727-ae8f-7805f9bef76d"), req_type: Buf("/rt:grok-2-vision-1212-0.1.0"), actual: Values { rps: 0, rph: 200 }, expected: Values { rps: 1, rph: 200 } })"}

river glen Dec 27, 2024, 7:49 AM

#

brave forge Anyone know what this error means? (xAI) Provider returned error: {"code":"Some ...

OR got ratelimited by xAI
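On the client side, the usual way to ride out this kind of error is to retry with exponential backoff. A minimal sketch: `RateLimitError` and `send` are hypothetical stand-ins for whatever your HTTP client raises and calls, not part of any xAI or OpenRouter SDK. The error above appears to report an hourly cap (`rph: 200` against a limit of `200`), so short waits may not be enough to clear it.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical exception standing in for a 'too many requests' response."""


def call_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry on RateLimitError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # Delay doubles each attempt (1s, 2s, 4s, ... by default) plus jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

Usage would look like `call_with_backoff(lambda: client_request(payload))`, where `client_request` is whatever function sends your request and raises `RateLimitError` on a rate-limit response.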

acoustic abyss Aug 23, 2025, 8:41 PM

#

they finally released weights for it lol

https://huggingface.co/xai-org/grok-2