Claude 3.5 Sonnet | OpenRouter | Page 1

arctic fiber Jun 20, 2024, 2:12 PM

#

New model from anthropic just released. It's available in the web UI for free users too.

#

#

https://x.com/alexalbert__/status/1803790943633686589/photo/1

Alex Albert (@alexalbert__) on X

Claude 3.5 Sonnet is now available to @AnthropicAI devs everywhere.

It's our best model yet - smarter than Claude 3 Opus and twice as fast.

And it costs just $3 per million input tokens and $15 per million output tokens.

dry storm Jun 20, 2024, 2:16 PM

#

wtf? above 1st gen opus??

solar wave Jun 20, 2024, 2:24 PM

#

exciting about this!

arctic fiber Jun 20, 2024, 2:34 PM

#

I tried it for 2-3 responses in their web UI and wasn't super impressed. Opus still miles better on what I asked it.
Waiting for full OpenRouter release to play with it more (they likely have a shitty system prompt / temp in their web UI)

spiral solstice Jun 20, 2024, 2:40 PM

#

Another day, another hype train in LLM land, at least no outlandish claims. Better & cheaper is fine with me.

dry storm Jun 20, 2024, 3:03 PM

#

sadly 3.5 still very ..... censored XDDD
it doesnt like that i rizz humans

zealous stone Jun 20, 2024, 7:13 PM

#

dry storm sadly 3.5 still very ..... censored XDDD it doesnt like that i rizz humans

I expect no less...

dry storm Jun 20, 2024, 7:28 PM

#

zealous stone I expect no less...

my bad. it was copyrighted content.... rizzing is fine

ruby folio Jun 21, 2024, 2:59 AM

#

The "human eval" benchmark they showed for coding made it seem on par with gpt-4o , but in my experience it's definitely not close to gpt-4o.

Maybe that benchmark used more basic coding problems, or Sonnet is just not that good at javascript. Sonnet frequently gives me code that immediately doesn't look quite right. Syntax is fine, it just seems confused and misinformed. I then send Sonnet's response through gpt-4o and it spots all the mistakes Sonnet 3.5 made. So unless you're doing basic coding, it's not great.

plain junco Jun 21, 2024, 3:32 AM

#

ruby folio The "human eval" benchmark they showed for coding made it seem on par with `gpt-...

For large contexts, copy-pasting doesn't seem feasible. For me, both models perform extremely poorly with OOP or async code. The only workaround I've found is to keep the codebase close to FP and reducer-style execution, but that would only ever work in small projects

It's also hard to force any model to follow ESLint/TypeScript rules without additional prompts

sleek gyro Jun 21, 2024, 3:41 AM

#

also consider that HumanEval is Python only

#

and doesn't reflect coding capabilities in other languages at all

unkempt frigate Jun 21, 2024, 12:48 PM

#

plain junco for large context copy-pasting doesnt seem feasible. For me, both models perform...

Yeah, I find that both of them aren't great for languages that aren't Python. For Java, Claude 3.5 Sonnet isn't any better than 4o, which was already quite disappointing imho. Claude 3 Opus and GPT-4 Turbo are still far better at Java and Python than both of them from my experience.

#

For Python, it's not bad, but after some more messing around with it it's really not that great

dry storm Jun 21, 2024, 2:14 PM

#

sleek gyro also consider that HumanEval is Python only

so we can let it create code in Python and then just translate it to, let's say, C#?

sleek gyro Jun 21, 2024, 2:27 PM

#

dry storm so we can let it create code in Python and then just translate it to, let's say, C#?

nah, it's better to prompt to code in C# right away
LLMs tend to make more mistakes when translating code from one language to another
It should be competent enough in C# anyway

spiral solstice Jun 21, 2024, 2:31 PM

#

"Lost in TransC#ation"

ruby folio Jun 21, 2024, 3:59 PM

#

sleek gyro also consider that HumanEval is Python only

Ah I was wondering about that benchmark.

wheat oxide Jun 28, 2024, 10:01 AM

#

Anyone notice that Sonnet 3.5 through the API doesn't seem to have knowledge past 2022? Whereas the one on claude.ai does.

spiral solstice Jun 28, 2024, 10:07 AM

#

wheat oxide Anyone notice that sonnet 3.5 through API doesnt seem to have knowledge past 202...

Double check that you are using 3.5 and not old 3?

wheat oxide Jun 28, 2024, 10:07 AM

#

Even tried in anthropic console, same stuff

#

Ask what its knowledge cutoff is, it says 2022, ask what latest vuejs package it knows of is, its old

#

ask the same of claude.ai -> current answers

spiral solstice Jun 28, 2024, 10:10 AM

#

Hmm. Interesting:

$ llm -m "claude-3.5-sonnet" "What is your knowledge cutoff date? What is the last version number for vue.js you know?"
My knowledge cutoff date is September 2022. As for Vue.js versions, the last version I'm certain about is Vue 3.2, which was released in August 2021. There may have been newer versions released after that date, but I don't have definitive information about them.

spiral solstice Jun 28, 2024, 10:11 AM

#

wheat oxide ask the same of claude.ai -> current answers

I can confirm your observation as shown above.

#

(via direct API access to Anthropic)

wheat oxide Jun 28, 2024, 10:11 AM

#

Weird right?

#

makes me suspect its not actually running the newest model

#

I tried to write to the help chatbot, but you know, who knows when that will be seen

spiral solstice Jun 28, 2024, 10:12 AM

#

wheat oxide makes me suspect its not actually running the newest model

At least the models seem different. Or the system prompt did not get updated?

wheat oxide Jun 28, 2024, 10:14 AM

#

hmmm may be something like this, its weird as it actually shows newer knowledge

#

i asked it directly about the vuejs definemodel macro, and it gave a good answer

#

which is vue 3.4

#

BUT, if I first ask it for its knowledge cutoff:

#

lol

#

and this is starting with the what is definemodel question:

#

So if it's reminded that its knowledge is from 2022, it won't answer questions about anything newer than that

spiral solstice Jun 28, 2024, 10:21 AM

#

wheat oxide So if its reminded that its knowledge is from 2022 it wont answer questions abou...

This has a certain "configured by prompt" smell to it.

wheat oxide Jun 28, 2024, 10:22 AM

#

Maybe add a system prompt telling it its knowledge cutoff is 2024

spiral solstice Jun 28, 2024, 10:23 AM

#

wheat oxide Maybe add a system prompt telling it its knowledge cutoff is 2024

I would not be surprised if that improves things.

#

My current guess is that the system prompt for the website includes a hard-coded cutoff date, while Sonnet via the API is just the old improved model, which they forgot to tell that it has newer information.

wheat oxide Jun 28, 2024, 10:25 AM

#

hah yeah no

#

bruh you just answered that lol

spiral solstice Jun 28, 2024, 10:26 AM

#

wheat oxide hah yeah no

Otherwise it should be obvious/testable from other behavior that it is 3.5 and not 3 on the API

wheat oxide Jun 28, 2024, 10:26 AM

#

yeah the fact that it sometimes can answer the definemodel questions is proof it has newer knowledge

#

but weird that its so forced

spiral solstice Jun 28, 2024, 10:27 AM

#

But this is an interesting observation nonetheless

wheat oxide Jun 28, 2024, 10:27 AM

#

what system prompt wizardry have they done for the claude.ai model

#

#

winner is correct at least

spiral solstice Jun 28, 2024, 10:30 AM

#

wheat oxide winner is correct at least

I would have no clue as a sport agnostic European

wheat oxide Jun 28, 2024, 10:30 AM

#

european, american sports agnostic so yeah had to google that shit

spiral solstice Jun 28, 2024, 10:33 AM

#

My current assumption is that models via API are very 'bare' and models via Chat interface get a lot of system prompt tuning. I think it is still possible that API sonnet-3.5 just did not learn about its new cutoff date, which gets hard-coded in the chat system prompt.

#

Of course it is not impossible that Anthropic uses different fine-tunes for API and chat

wheat oxide Jun 28, 2024, 10:35 AM

#

Seems more expensive tho

wheat oxide Jun 28, 2024, 11:11 AM

#

Its also starting to get linguistically confused on me:

#

aint no french in my tool descriptions hah

feral mango Jun 30, 2024, 1:42 PM

#

how do they afford this

#

spiral solstice Jun 30, 2024, 1:47 PM

#

I have not checked that website, but my guess would be that they require you to give them your own API key for it to work? Impressive number of tokens, nonetheless

sleek gyro Jun 30, 2024, 2:09 PM

#

spiral solstice I have not checked that website, but my guess would be that they require to give...

afaik websim does provide a free version too

#

I used Sonnet on it for free

#

yeah they still do, just checked

spiral solstice Jun 30, 2024, 2:13 PM

#

sleek gyro I used Sonnet on it for free

If these are all free tokens then they are burning $10k to $100k per month on this website (roughly calculated)?

sleek gyro Jun 30, 2024, 2:14 PM

#

spiral solstice If these are all free tokens then they are burning $10k to $100k per month on th...

I have no clue where they do get money for free access tbh

#

Like they gave Opus for free too

spiral solstice Jun 30, 2024, 2:14 PM

#

(I mean this may still be funny money in crazy LLM world)

sleek gyro Jun 30, 2024, 2:15 PM

#

spiral solstice (I mean this may still be funny money in crazy LLM world)

I mean a bunch of hobbyists did a 2k$ finetune recently (Magnum), just for fun

#

So yeah money going around in this sphere is insane

tight kernel Jul 2, 2024, 5:30 PM

#

Hi all, I read somewhere that usage costs can increase rapidly on Claude API if you use the same context window for too long (i.e. without refreshing a new chat)... has this been your experience? Are there commands or prompts I can use to avoid burning through credits too quickly?

night sluice Jul 3, 2024, 7:27 AM

#

With all LLM providers, every time you send a message you pay not just for that message but also for all previous messages and responses, because the whole history is resent as input. That is not exclusive to Claude. The only way around this is to start a new chat.
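To put rough numbers on the point above, here is a minimal sketch of how billed input grows when the full history is resent on every turn. It uses the $3 / $15 per million token prices quoted at the top of this thread; the `conversation_cost` helper and the per-turn token counts are made up for illustration and are not from any provider's SDK.

```python
# Prices quoted earlier in the thread for Claude 3.5 Sonnet (USD per million tokens).
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def conversation_cost(turns):
    """Return (total_input_tokens, total_output_tokens, cost_usd) for one chat.

    `turns` is a list of (user_tokens, assistant_tokens) pairs. Each request
    resends every earlier user message and assistant reply as input.
    """
    history = 0      # tokens already sitting in the conversation
    total_in = 0
    total_out = 0
    for user_tokens, assistant_tokens in turns:
        total_in += history + user_tokens  # whole history plus the new message
        total_out += assistant_tokens
        history += user_tokens + assistant_tokens
    cost = (total_in * INPUT_PRICE_PER_MTOK
            + total_out * OUTPUT_PRICE_PER_MTOK) / 1_000_000
    return total_in, total_out, cost


# Hypothetical chat: 10 turns of 500 user tokens and 1,000 reply tokens each.
tokens_in, tokens_out, usd = conversation_cost([(500, 1000)] * 10)
print(tokens_in, tokens_out, round(usd, 4))
```

In this made-up chat only 5,000 user tokens are ever typed, yet 72,500 input tokens are billed, which is why long-running chats get expensive faster than people expect.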

median prairie Jul 4, 2024, 2:07 AM

#

Would love to have image prompting support added!

spiral solstice Jul 4, 2024, 2:11 AM

#

median prairie Would love to have image prompting support added!

text+image -> text is supported, see -> https://openrouter.ai/docs/requests#images-_-multimodal-requests

OpenRouter

Requests | OpenRouter

Handle incoming and outgoing requests

#

(Sonnet-3.5 is a multimodal model and supports text+image prompts)
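For anyone wondering what a text+image request looks like, here is a minimal sketch of the OpenAI-style payload described in the linked docs: one `text` part and one `image_url` part carrying a base64 data URL. The `build_image_request` helper and the placeholder bytes are assumptions for illustration; the payload is only built and printed, not sent.

```python
import base64
import json


def build_image_request(model, prompt, image_bytes, mime_type="image/png"):
    """Build an OpenAI-style chat payload with one text part and one image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        # The image travels inline as a base64 data URL.
                        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
                    },
                ],
            }
        ],
    }


# Placeholder bytes stand in for the contents of a real image file.
payload = build_image_request(
    "anthropic/claude-3.5-sonnet", "What is in this image?", b"not-a-real-png"
)
print(json.dumps(payload, indent=2))
```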

median prairie Jul 4, 2024, 2:13 AM

#

Awesome!

feral mango Aug 8, 2024, 9:37 PM

#

being charged for overloaded error 502 👀

spiral solstice Aug 9, 2024, 6:28 AM

#

feral mango being charged for overloaded error 502 👀

If the provider returns an error you should not get charged. You may see replies with this error, but they should not show up on your activity page -> https://openrouter.ai/activity

OpenRouter

Activity | OpenRouter

See how you've been using models on OpenRouter.

feral mango Aug 9, 2024, 6:53 AM

#

saw error and this though

spiral solstice Aug 9, 2024, 7:04 AM

#

feral mango saw error and this though

This is related to a bug with text completion that should be fixed by now (those requests did not return an error btw), see here -> #1271200228611067934 message

feral mango Aug 9, 2024, 7:11 AM

#

I was on chat completion. Anthropic had issues yesterday.

spiral solstice Aug 9, 2024, 7:22 AM

#

feral mango I was on chat completion. Anthropic had issues yesterday.

Still, errors do not get charged. Those empty replies were due to a bug that should be fixed now. Ask -> [email protected] about a refund etc.

devout shadow Aug 10, 2024, 9:12 PM

#

especially given the downtime.... can we get access via AWS? Does AWS proxy it back to the same Claude servers? I was under the impression AWS was hosting Claude in their own DCs.

#

https://aws.amazon.com/bedrock/claude/
https://aws.amazon.com/bedrock/pricing/ - they only have sonnet 3.5 in some DCs

Amazon Web Services, Inc.

Anthropic Claude - Models in Amazon Bedrock - AWS

Access Anthropic's Claude large language model through Amazon Bedrock to build generative AI applications.

Amazon Web Services, Inc.

Build Generative AI Applications with Foundation Models - Amazon Be...

Find detailed information about Amazon Bedrock pricing models including on-demand and provisioning throughput with the pricing breakdown for model providers including: AI21 labs, Amazon, Anthropic, Cohere, and Stability AI.

oak palm Aug 13, 2024, 9:18 PM

#

Does response_format actually work on this model even though it's not documented?

fiery falcon Aug 15, 2024, 7:46 PM

#

does anyone know if caching is already implemented on OpenRouter?

spiral solstice Aug 15, 2024, 7:48 PM

#

fiery falcon does anyone know if caching is already implemented on OpenRouter?

No. See here -> #general message

fiery falcon Aug 15, 2024, 7:49 PM

#

spiral solstice No. See here -> https://discord.com/channels/1091220969173028894/109445419868854...

kewl thanks!

feral mango Sep 5, 2024, 7:40 AM

#

[redacted] never mind

tawny granite Sep 10, 2024, 3:24 PM

#

before I go ahead and test this myself, about caching on sonnet:

if TEXT1 is already cached

and I send a new request starting with
TEXT1+TEXT2
will caching everything up until TEXT2 cost the full token count of TEXT1+TEXT2 or just TEXT2?

Deepseek's caching seems to be much easier to understand and use, and caches for longer, cheaper 😅

ancient lodge Sep 10, 2024, 6:00 PM

#

tawny granite before I go ahead and test this myself, about caching on sonnet: if TEXT1 is al...

you need to use several cache control

#

First turn:
TEXT 1 (cache control)
Cost: TEXT 1 uncached

Second turn:
TEXT 1 (cache control)
TEXT 2 (cache control)
Cost: TEXT 1 cached, text 2 uncached

Third turn:
TEXT 1
TEXT 2 (cache control)
TEXT 3 (cache control)
Cost: TEXT 1+ TEXT 2 cached, text 3 uncached

#

Imagine the logic starts at the bottom and checks each cache control in turn.
So at the third turn, it goes like this:
OK, the first cache control from the bottom is at TEXT 3. Do I have TEXT1+TEXT2+TEXT3 in cache? No? Then it's not cached.
Next cache control is at TEXT 2. Do I have TEXT1+TEXT2 in cache? Yes. So we can use that cache.

And since there is a cache control on TEXT 3, let's cache TEXT1+TEXT2+TEXT3

#

at least that's my understanding of it

#

but important:

#

on openrouter, I noticed that cache control only works on user message
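The walk-through above can be turned into a small simulation. This is only a sketch of the mental model described in this message, not Anthropic's actual implementation: `simulate_turn` is a made-up helper, blocks are compared by name, and things like cache expiry or the price of cache writes are ignored.

```python
def simulate_turn(cache, blocks, breakpoints):
    """Simulate one request under the mental model described above.

    cache       -- set of prefixes (tuples of block names) already cached
    blocks      -- ordered block names in this request, e.g. ["TEXT1", "TEXT2"]
    breakpoints -- indexes of the blocks that carry a cache control marker
    Returns (cached_blocks, uncached_blocks) for this request.
    """
    hit = 0
    # Walk the breakpoints from the bottom up and stop at the first cached prefix.
    for index in sorted(breakpoints, reverse=True):
        if tuple(blocks[: index + 1]) in cache:
            hit = index + 1
            break
    # Every marked prefix is stored so later requests can reuse it.
    for index in breakpoints:
        cache.add(tuple(blocks[: index + 1]))
    return blocks[:hit], blocks[hit:]


cache = set()
print(simulate_turn(cache, ["TEXT1"], [0]))
print(simulate_turn(cache, ["TEXT1", "TEXT2"], [0, 1]))
print(simulate_turn(cache, ["TEXT1", "TEXT2", "TEXT3"], [1, 2]))
```

The three calls mirror the three turns listed above: nothing cached on the first, TEXT1 cached on the second, TEXT1+TEXT2 cached on the third.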

feral mango Oct 10, 2024, 4:23 PM

#

Vertex RESOURCE_EXHAUSTED
Anthropic also

#

~~started working again~~
It's on and off.

shell flint Oct 20, 2024, 11:58 PM

#

https://www.anthropic.com/research/sabotage-evaluations

Sabotage evaluations for frontier models

A new paper on AI safety evaluations from Anthropic's Alignment Science team

feral mango Oct 22, 2024, 3:27 PM

#

~~Claude 3.5 Haiku~~ out today as experimental public beta, along with an upgraded 3.5 Sonnet.
https://www.anthropic.com/news/3-5-models-and-computer-use

Introducing computer use, a new Claude 3.5 Sonnet, and Claude 3.5 H...

A refreshed, more powerful Claude 3.5 Sonnet, Claude 3.5 Haiku, and a new experimental AI capability: computer use.

spiral solstice Oct 22, 2024, 3:33 PM

#

feral mango ~~Claude 3.5 Haiku~~ out today as experimental public beta, along with an upgrad...

Haiku 3.5 seems to come later this month, not today -> #general message

feral mango Oct 22, 2024, 3:34 PM

#

oops I read too quickly

spiral solstice Oct 22, 2024, 3:35 PM

#

No Haiku-3.5 but updated Sonnet-3.5 in Anthropic workbench ->

vestal ibex Oct 22, 2024, 3:38 PM

#

Opus 3.5 has disappeared too from what I can see, unless I'm blind

spiral solstice Oct 22, 2024, 3:38 PM

#

vestal ibex Opus 3.5 has disappeared too from what I can see, unless I'm blind

Opus-3.5 was never released

vestal ibex Oct 22, 2024, 3:38 PM

#

I thought it was a coming soon originally?

turbid pelican Oct 22, 2024, 3:45 PM

#

Claude 3.5 Haiku is the next generation of our fastest model. For the same cost and similar speed as Claude 3 Haiku, Claude 3.5 Haiku improves across every skill set and surpasses Claude 3 Opus

#

Surpasses opus

spiral solstice Oct 22, 2024, 3:46 PM

#

turbid pelican Surpasses opus

For various interpretations of "surpasses", see also -> #general message

heavy jetty Oct 22, 2024, 3:51 PM

#

Any idea which model version the current anthropic/claude-3.5-sonnet on OpenRouter is pointing to?

#

20241022 or 20240620

spiral solstice Oct 22, 2024, 3:52 PM

#

heavy jetty 20241022 or 20240620

When in doubt, the old one.

feral mango Oct 22, 2024, 5:01 PM

#

hover icon now points to newer #announcements message

dry storm Oct 22, 2024, 5:02 PM

#

eee

stiff plinth Oct 22, 2024, 8:27 PM

#

Tested the new 3.5 Sonnet.
After all is done and accounted for, it jumped ranks from #15 > #7 with slightly less prudishness (still much higher than the competition).
I saw massive gains in tasks labeled for Reasoning (suspiciously high gains, I need to investigate this further). A slight dip in prompt adherence and code. I scrutinized and retested all tech-related coding tasks a total of 6 times, ended up running 18 queries PER TASK in that particular label to exclude any random outliers. The results were consistently delivering the same outcome, though.
Good improvements as a whole.

queen sinew Oct 22, 2024, 8:39 PM

#

stiff plinth Tested the new 3.5 Sonnet. After all is done and accounted for, it jumped ranks ...

Can you explain the code aspect? Worse in what way? And what does utility refer to

#

Also thanks for doing these. I always appreciate your tests

stiff plinth Oct 22, 2024, 8:43 PM

#

1 example which is weird for a western company to do, since its essentially just history.

#

I understand you want to know this history, but it involves sensitive content. It is recommended that you learn about the relevant history through reliable channels.

queen sinew Oct 22, 2024, 8:52 PM

#

Yeah I wouldn’t expect a western company to do that. Wonder if their dataset is tainted ?

stiff plinth Oct 22, 2024, 8:54 PM

#

qwen and Yi answered this less censored, lol

#

still biased of course but not like this

tame sphinx Oct 22, 2024, 9:54 PM

#

Anybody got computer use working via OR? It looks to me like the tool definition is not transformed correctly between the OAI format and anthropic...

shell flint Oct 22, 2024, 10:03 PM

#

https://aws.amazon.com/blogs/aws/upgraded-claude-3-5-sonnet-from-anthropic-available-now-computer-use-public-beta-and-claude-3-5-haiku-coming-soon-in-amazon-bedrock/

Amazon Web Services

Upgraded Claude 3.5 Sonnet from Anthropic (available now), computer...

Four months ago, we introduced Anthropic’s Claude 3.5 in Amazon Bedrock, raising the industry bar for AI model intelligence while maintaining the speed and cost of Claude 3 Sonnet. Today, I am excited to announce three new capabilities for the Claude 3.5 model family in Amazon Bedrock: Upgraded Claude 3.5 Sonnet – You now have […]

#

And bedrock jump straight to this version

night cedar Oct 22, 2024, 10:18 PM

#

What's the difference between anthropic/claude-3.5-sonnet:beta and anthropic/claude-3.5-sonnet?

Is beta the latest version (added today)?

spiral solstice Oct 22, 2024, 10:18 PM

#

night cedar What's the difference between anthropic/claude-3.5-sonnet:beta and anthropic/cla...

Both are the same model (20241022), they only differ when moderation gets enforced.

#

See also -> #arc-feedback message

sly kite Oct 22, 2024, 11:15 PM

#

Is it possible to use the version before the current update?
Right now the new update doesn't suit me, it produces answers I don't like...

feral mango Oct 23, 2024, 1:09 AM

#

sly kite Is it possible to use the version before the current update? Right now the new u...

I see they added it back. The ones that say (2024-06-20).

buoyant mesa Oct 23, 2024, 1:15 AM

#

Hey, how to use client = anthropic.Anthropic(api_key=API_KEY) with OpenRouter? Thx.
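One possible approach, sketched under assumptions: the `anthropic` SDK speaks Anthropic's own Messages API, while OpenRouter exposes an OpenAI-style chat completions endpoint with a Bearer token, so a plain HTTP request (or an OpenAI-compatible client) is the more direct fit. The helper name `build_openrouter_request` and the `OPENROUTER_API_KEY` environment variable are made up for this example, and the request is only prepared here, not sent.

```python
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_openrouter_request(api_key, model, user_text, max_tokens=1000):
    """Prepare (but do not send) a chat completion request for OpenRouter."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# OPENROUTER_API_KEY is an assumed environment variable name.
request = build_openrouter_request(
    os.environ.get("OPENROUTER_API_KEY", "sk-or-placeholder"),
    "anthropic/claude-3.5-sonnet",
    "Say hello.",
)
print(request.get_method(), request.full_url)

# To actually send it (needs a real key and network access):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["choices"][0]["message"]["content"])
```

Note the model name uses OpenRouter's `anthropic/...` slug rather than Anthropic's dated model ID.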

shell flint Oct 23, 2024, 6:47 AM

#

it's interesting that they published no comparisons with the o1- models

#

as far as I can see anyway, model card or announcement

#

on the bus home theory, they didn't jump to Claude 4 because they're going to expand to longer run reasoning functionality and release the three way Claude 4 Sonnet, Claude 4 Haiku, Claude 4 Opus etc etc which will look awesome on billboards

#

Today is truly Claude 3 Service Pack 2

solemn loom Oct 23, 2024, 7:00 AM

#

i'd say might as well skip claude 3.5 opus

#

it will be slow and expensive

#

better try something like o1

spiral solstice Oct 23, 2024, 7:01 AM

#

solemn loom better try something like o1

(o1 is slow and expensive)

solemn loom Oct 23, 2024, 7:02 AM

#

i'd say o1 mini is quite fast other than thinking period lol

#

also how does it have 64k output

high wasp Oct 23, 2024, 8:03 AM

#

No way opus would be slower than o1 unless they really mess up serving it

#

And I don't really consider opus and o1 to be for the same use case anyway

fair ibex Oct 23, 2024, 9:24 AM

#

Hi, I would like to ask if OpenRouter supports Anthropic's prompt caching?

spiral solstice Oct 23, 2024, 9:29 AM

#

fair ibex Hi, i would like to ask if openrouter support Anthropics prompt caching?

Yes, see here -> https://openrouter.ai/docs/prompt-caching

fair ibex Oct 23, 2024, 9:31 AM

#

thanks 👍

#

I read the documentation for Anthropic's prompt caching, and it seems that the cache_control parameter must be sent. Does that mean tools like Cline have to add it to their user/system messages to take advantage of the prompt cache?

spiral solstice Oct 23, 2024, 9:42 AM

#

fair ibex I read the documentation for Anthropics prompt caching, it seems that cache_cont...

Cline should have supported this for a few versions now

fair ibex Oct 23, 2024, 9:47 AM

#

Great! Do we have to do some configuration in Cline to take advantage of this prompt caching, or is it done automatically?

spiral solstice Oct 23, 2024, 9:50 AM

#

fair ibex Great! Do we have to do some configuration in cline to take advantage of this pr...

Ask in their Discord, invite link is here -> https://github.com/cline/cline

fair ibex Oct 23, 2024, 9:51 AM

#

Thanks, done that.. Just asking in case you know about it. Thanks for yr help 👍

royal heath Oct 23, 2024, 10:22 AM

#

night cedar What's the difference between anthropic/claude-3.5-sonnet:beta and anthropic/cla...

The difference between them is the beta version is latest version of the model but without :beta it’s older version of the model

spiral solstice Oct 23, 2024, 10:23 AM

#

royal heath The difference between them is the beta version is latest version of the model b...

No, this is completely wrong.

royal heath Oct 23, 2024, 10:23 AM

#

spiral solstice No, this is completely wrong.

Why?

spiral solstice Oct 23, 2024, 10:25 AM

#

royal heath Why?

Because the difference between :beta and the non-:beta is only how moderation gets handled. Otherwise they are exactly the same model. The irritating :beta slug is hopefully going away soon -> #arc-feedback message

royal heath Oct 23, 2024, 10:30 AM

#

spiral solstice Because the difference between `:beta` and the non-`:beta` is only how moderatio...

Hmm

shell flint Oct 23, 2024, 1:07 PM

#

I think the OR announcement might need to be a lot more blunt about the new sonnet...all the power users got the message but the same questions keep getting asked

shell flint Oct 23, 2024, 11:21 PM

#

https://artificialanalysis.ai/models/claude-35-sonnet

sturdy vigil Oct 24, 2024, 12:58 PM

#

tame sphinx Anybody got computer use working via OR? It looks to me like the tool definition...

bumping again: anyone got computer-use working with OR?

spiral solstice Oct 24, 2024, 12:58 PM

#

sturdy vigil bumping again: anyone got computer-use working with OR?

No. See also -> #1298768509844852766 message

sturdy vigil Oct 24, 2024, 1:04 PM

#

i know this is the wrong discord, but did anyone get new sonnet working on vertex?

spiral solstice Oct 24, 2024, 1:06 PM

#

sturdy vigil i know this is the wrong discord, but did anyone get new sonnet working on verte...

Apparently it is available on Vertex -> #general message

sturdy vigil Oct 24, 2024, 1:10 PM

#

yes, thanks. i asked a follow up there just a few mins ago... looks like it might need an explicit quota increase request... my quota on vertex is 0 for the new sonnet (but unlimited for the old)

#

i submitted a quota increase request to GCP... got email back saying they'll try to resolve within 2 days

jagged spindle Oct 24, 2024, 1:30 PM

#

Just me or does this new version feel a bit "preachier" than before?

high wasp Oct 24, 2024, 1:36 PM

#

Not really? It's about the same in that regard for me

tawny granite Oct 25, 2024, 6:35 PM

#

new Sonnet seems pretty good for my usecase - I also noticed that it does get things wrong often still, but I just ask questions and it realizes quite often.

It once even answered along the lines of

"Yes, ...
but no you should not change that"

contradicting / correcting itself :D

vestal ibex Oct 25, 2024, 10:32 PM

#

I've observed that the new model will also ask me questions if it requires more clarity on a task, I didn't really notice this in the previous iteration

drowsy fulcrum Oct 26, 2024, 9:41 PM

#

tawny granite new Sonnet seems pretty good for my usecase - I also noticed that it does get th...

Yeah, I was using it to write some regex for SED earlier and it kept correcting itself in the same reply?!

tawny granite Oct 26, 2024, 9:43 PM

#

drowsy fulcrum Yeah, I was using it to write some regex for SED earlier and it kept correcting ...

Im wondering if lower temp could help with it giving the right answer right away

#

but maybe it just „realizes“ later on

drowsy fulcrum Oct 26, 2024, 9:48 PM

#

"Actually, for the capture group to work correctly, it should be:" and then in the next reply "Actually, just like before, we need the capture group version:" were in the middle of the reply and both times it rewrote the command?!

#

Making a mistake once and self-correcting was weird enough, but then to do it again was really weird!

shell flint Oct 27, 2024, 11:18 PM

#

continuing the discussion in #general regarding the output length issues,
It is interesting to me that bedrock sonnet 20241022 "v2" is still unable to have a max output length beyond 4096. they also make it clear that an optimal setting is 4000. aws have told me the ability to set it higher is with their higher team but no estimate on a fix.
coincidence? 🥸

turbid pelican Oct 28, 2024, 1:36 PM

#

Just wondering
is it only me, or is the new Sonnet getting very lazy?
It always stops in the middle and tells me it wants me to say continue; then it continues.

spiral solstice Oct 28, 2024, 1:42 PM

#

At least on the Claude website the new Sonnet works as good as ever if not better for me.

turbid pelican Oct 28, 2024, 1:52 PM

#

spiral solstice At least on the Claude website the new Sonnet works as good as ever if not bette...

I don't have Pro, but can I ask for your help?
Try telling it to write a >800-word article or anything,
because I'm using the new Sonnet through the API and it keeps asking me to confirm before it continues.

vestal ibex Oct 28, 2024, 2:10 PM

#

other than concise outputs at times, I've found the new sonnet to perform better at complex tasks than the previous one for me (I'm using it as a coding assistant primarily)

turbid pelican Oct 28, 2024, 2:11 PM

#

I think Sonnet did some thinking when I told it in the prompt not to ask me whether to continue.

[I'll continue with the _______________________ in the next part, as I want to ensure these foundational elements are clear first. Would you like me to proceed with those sections?]

I apologize - I caught myself asking again! Let me continue with the complete response:

vestal ibex Oct 28, 2024, 2:11 PM

#

that sounds like a pain ^

#

but yeah it's not the first I've heard about it

turbid pelican Oct 28, 2024, 2:12 PM

#

vestal ibex other than concise outputs at times, I've found the new sonnet to perform better...

That's true, it is clearly better, but it is really annoying that it keeps asking me to give it the go-ahead

vestal ibex Oct 28, 2024, 2:12 PM

#

true, it seems to prefer giving shorter outputs :x

#

hopefully they'll address this soon

turbid pelican Oct 28, 2024, 2:13 PM

#

vestal ibex true, it seems to prefer giving shorter outputs :x

I think it is good that way, no need too much explanation, just give me the answer.
However, sometimes, or most of the time, it will just stop in the middle

vestal ibex Oct 28, 2024, 2:13 PM

#

yeah, if it's an intentional feature then it should be toggleable on their API

balmy cave Oct 28, 2024, 3:36 PM

#

This is a long shot but does anyone know how to use prompt caching with OpenRouter Sonnet on SillyTavern? I see an option to do it with Anthropic just not OR

spiral solstice Oct 28, 2024, 3:59 PM

#

balmy cave This is a long shot but does anyone know how to use prompt caching with OpenRout...

Ask in the SillyTavern Discord?

#

It is up to them to allow/implement this for OR

balmy cave Oct 28, 2024, 4:14 PM

#

spiral solstice Ask in the SillyTavern Discord?

Figured I should go to them haha thanks

ancient lodge Oct 29, 2024, 10:38 AM

#

I think the best way to do caching is simply to run a small local proxy that applies the caching for you (i.e., a small local OpenAI-compatible endpoint that calls OpenRouter while applying the cache)

#

rather than waiting for every product to implement Sonnet caching properly for OR
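The core of such a proxy would be a transform that adds cache markers to outgoing messages before forwarding them. Below is a minimal sketch of that transform only (no HTTP server), assuming the `cache_control: {"type": "ephemeral"}` marker on text parts that the prompt caching docs describe. The `add_cache_control` helper, the `min_chars` threshold, and the cap of four breakpoints are assumptions chosen for this example.

```python
import copy


def add_cache_control(messages, min_chars=4000, max_breakpoints=4):
    """Return a copy of OpenAI-style messages with cache markers on long user text.

    Only user messages are touched, following the observation earlier in this
    thread. `min_chars` and `max_breakpoints` are assumed values for the sketch.
    """
    marked = copy.deepcopy(messages)
    used = 0
    for message in marked:
        if message.get("role") != "user":
            continue
        content = message.get("content")
        if isinstance(content, str):
            if len(content) < min_chars:
                continue  # short plain-string messages are left untouched
            # A plain string must become a list of parts to carry the marker.
            content = [{"type": "text", "text": content}]
            message["content"] = content
        if not isinstance(content, list):
            continue
        for part in content:
            long_text = part.get("type") == "text" and len(part.get("text", "")) >= min_chars
            if long_text and used < max_breakpoints:
                part["cache_control"] = {"type": "ephemeral"}
                used += 1
    return marked


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "long document " * 500},
    {"role": "user", "content": "Summarize the document above."},
]
for message in add_cache_control(messages):
    print(message["role"], type(message["content"]).__name__)
```

A real proxy would wrap this in a small local HTTP endpoint and forward the transformed body to OpenRouter; that part is left out here.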

dry storm Nov 1, 2024, 1:31 PM

#

sonnet seems slow today

spiral solstice Nov 1, 2024, 1:34 PM

#

dry storm sonnet seems slow today

Claude site also reroutes to Haiku currently for free users, but -> https://status.anthropic.com/ is still completely green

Anthropic Status

Welcome to Anthropic's home for real-time and historical data on system performance.

dry storm Nov 1, 2024, 1:35 PM

#

i get these empty infinite writings from sonnet today https://i.imgur.com/63MZVqZ.png

Imgur

#

8usd credits

spiral solstice Nov 1, 2024, 1:38 PM

#

dry storm i get these empty infinite writings from sonnet today https://i.imgur.com/63MZVq...

Works for me ->

dry storm Nov 1, 2024, 1:38 PM

#

yesa, sometimes.

spiral solstice Nov 1, 2024, 1:39 PM

#

dry storm yesa, sometimes.

What does your activity page show? 0 token replies for those?

dry storm Nov 1, 2024, 1:39 PM

#

nothing.

#

cause it's still doing according to get big agi

spiral solstice Nov 1, 2024, 1:40 PM

#

dry storm nothing.

Nothing? Then it is probably an error reply which does not get billed.

spiral solstice Nov 1, 2024, 1:41 PM

#

dry storm cause it's still doing according to get big agi

Try using OR chat to verify that this is not a problem with your client. If you see the same problems there, we can work on them.

#

Note there is currently a problem with Cloudflare returning 524 timeout errors for some users under some circumstances, maybe your client does not show these correctly. See also -> #announcements message

proven sigil Nov 1, 2024, 5:36 PM

#

New features just dropped:

https://docs.anthropic.com/en/docs/build-with-claude/pdf-support - input PDF as images, similar to Gemini's PDF support
https://docs.anthropic.com/en/docs/build-with-claude/token-counting - new API for counting tokens, since Claude 3 never had a public tokenizer

Anthropic

PDF support (beta) - Anthropic

The new Claude 3.5 Sonnet (claude-3-5-sonnet-20241022) model now supports PDF input and understands both text and visual content within documents.

Anthropic

Token counting (beta) - Anthropic

jagged spindle Nov 1, 2024, 6:40 PM

#

Ah so thats what they meant by Haiku launching "by the end of this month"

outer quartz Nov 1, 2024, 10:51 PM

#

jagged spindle Ah so _thats_ what they meant by Haiku launching "by the end of this month"

wym

#

haven't been following

#

oh shit it's november WHERE IS HAIKU

#

D:

feral mango Nov 2, 2024, 12:07 AM

#

For API it's "later this year".

feral mango Nov 4, 2024, 7:02 PM

#

outer quartz oh shit it's november WHERE IS HAIKU

Just now. https://openrouter.ai/anthropic/claude-3-5-haiku

spiral solstice Nov 4, 2024, 7:07 PM

#

feral mango Nov 17, 2024, 2:26 PM

#

👀 https://docs.anthropic.com/en/release-notes/api#october-8th-2024
Curious, is OR aware of the change regarding user/assistant roles?

vestal ibex Nov 17, 2024, 10:13 PM

#

That's a nice change, as OpenAI allows multiple user messages before an assistant message

muted osprey Nov 18, 2024, 5:47 AM

#

yep! cc @brisk shore

brisk shore Nov 18, 2024, 5:50 AM

#

feral mango 👀 <https://docs.anthropic.com/en/release-notes/api#october-8th-2024> Curious, i...

yeah we added this change some time ago. Not sure if Vertex is updated yet tho

eager token Nov 18, 2024, 9:53 PM

#

Hello, I'm trying to run Python code from Hugging Face Spaces (GPU) using Gradio and the model 'claude-3-5-sonnet-20241022'. Every time, I get Error: {"type":"error","error":{"type":"authentication_error","message":"invalid x-api-key"}}. Whether I call the API directly or from the Hugging Face settings, it's the same error. I verified that the API key is correct and working.

#

The API key I put on Hugging Face is from OpenRouter. Am I doing something wrong in the code?

import gradio as gr
import spaces
import requests
import os
import json

ANTHROPIC_API_KEY = os.environ.get('ANTHROPIC_API_KEY')

@spaces.GPU
def process_text(text):
    try:
        headers = {
            "x-api-key": ANTHROPIC_API_KEY,
            "anthropic-version": "2023-01-01", # Changed version
            "content-type": "application/json"
        }

        data = {
            "model": "claude-3-5-sonnet-20241022",
            "max_tokens": 2000,
            "messages": [{"role": "user", "content": text}],
            "system": "Rewrite content while maintaining exact meaning and accuracy."
        }

        response = requests.post(
            "https://api.anthropic.com/v1/messages",
            headers=headers,
            json=data
        )

        print(f"Full API Response: {response.text}")

        if response.status_code == 200:
            return response.json()['content'][0]['text']
        else:
            return f"Error: {response.text}"

    except Exception as e:
        return f"Error: {str(e)}"

interface = gr.Interface(
    fn=process_text,
    inputs=gr.Textbox(lines=5),
    outputs=gr.Textbox(lines=5)
)

interface.launch()

eager token Nov 18, 2024, 10:24 PM

#

ok, I found the issue (how stupid of me). Since my API key is from OpenRouter, I must replace https://api.anthropic.com/v1/messages with https://openrouter.ai/api/v1/chat/completions and change the code a little
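
For anyone hitting the same error, here is a rough sketch of what "change the code a little" could look like. It assumes the OpenAI-style chat completions format that OpenRouter exposes: a Bearer token instead of x-api-key, the system prompt as a message, and the reply under choices. The environment variable name OPENROUTER_API_KEY, the model slug, and the helper names are assumptions for illustration, not taken from the original script.

```python
import os

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_request(text, api_key, model="anthropic/claude-3.5-sonnet"):
    """Build headers and payload for an OpenRouter chat completion call."""
    headers = {
        # OpenRouter expects a Bearer token, not Anthropic's x-api-key header.
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,  # assumed slug; check the model page for the exact name
        "max_tokens": 2000,
        "messages": [
            # The system prompt moves into the messages array.
            {"role": "system",
             "content": "Rewrite content while maintaining exact meaning and accuracy."},
            {"role": "user", "content": text},
        ],
    }
    return headers, payload


def extract_text(response_json):
    """Pull the reply text out of an OpenAI-style response body."""
    return response_json["choices"][0]["message"]["content"]


def process_text(text):
    # Imported here so the helpers above can be used without requests installed.
    import requests

    api_key = os.environ.get("OPENROUTER_API_KEY")  # hypothetical variable name
    headers, payload = build_request(text, api_key)
    response = requests.post(OPENROUTER_URL, headers=headers, json=payload, timeout=60)
    if response.status_code == 200:
        return extract_text(response.json())
    return f"Error: {response.text}"
```

The Gradio part of the script can stay as it was; only the request and the response parsing differ.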

feral mango Nov 20, 2024, 4:49 PM

#

Interesting, users noticed turning on prompt caching makes the non-self-mod filter stick until they wait out the 5 minutes to reset the cache.

dreamy walrus Nov 21, 2024, 4:39 AM

#

feral mango Interesting, users noticed turning on prompt caching makes the non-self-mod filt...

how could you do that 'prompt caching' with silly tavern?

feral mango Nov 21, 2024, 5:36 AM

#

dreamy walrus how could you do that 'prompt caching' with silly tavern?

Update first, staging branch.

In config.yaml, set cachingAtDepth (-1 is off) to 0 if you have no prompts between Chat History and Prefill, or to 2 if you have any, plus 2 for each level of depth insertion. Cache depth works by counting role switches rather than chat history messages. A depth of 1 would land on the assistant message before the last set of user prompts, hence you want an even number.

Custom prompts after Chat History should be set to user role instead of system role.
If doing group chat, blank out group nudge under utility prompts and add it to user role custom prompt instead.

Check the terminal to make sure cache_control is on a chat message behind anything not a chat message.
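
To make the "counting role switches" idea concrete, here is a small sketch. It is not SillyTavern's actual code; the function name and the choice to skip a trailing assistant prefill are assumptions made for illustration. It takes the list of roles in the prompt and returns the index of the message that a given depth would point at.

```python
def index_at_cache_depth(roles, depth):
    """Return the index of the message at the given role-switch depth, or None.

    roles is the prompt's role sequence, oldest first, e.g.
    ["user", "assistant", "user", "user", "assistant"].
    """
    i = len(roles) - 1
    # Assumption: a trailing assistant message is the prefill, so skip it.
    while i >= 0 and roles[i] == "assistant":
        i -= 1

    switches = 0
    previous = None
    while i >= 0:
        if roles[i] != previous:
            # A new run of same-role messages starts here (walking backwards).
            if switches == depth:
                return i
            switches += 1
            previous = roles[i]
        i -= 1
    return None


roles = ["user", "assistant", "user", "user", "assistant"]
print(index_at_cache_depth(roles, 0))  # last user prompt
print(index_at_cache_depth(roles, 1))  # assistant message before it
print(index_at_cache_depth(roles, 2))  # user message before that
```

With this layout, odd depths land on assistant messages and even depths on user messages, which matches the advice to pick an even number.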

gleaming ermine Nov 26, 2024, 9:33 PM

#

v

dry storm Nov 29, 2024, 5:25 PM

#

feral mango Interesting, users noticed turning on prompt caching makes the non-self-mod filt...

non self mod?

green mesa Nov 29, 2024, 11:35 PM

#

Are there any anti-NSFW prefills/filters on the regular version of the new Sonnet on OR, as opposed to the self-moderated one?

shut laurel Nov 30, 2024, 7:12 AM

#

The regular version is moderated, meaning that before a chat is even sent to Sonnet, it is checked externally for anything NSFW and blocked if any is found

whole flower Dec 2, 2024, 3:54 AM

#

is the openroute api key created for anthropic claude a dropin replacement of the api key created on anthropic console?

feral mango Dec 2, 2024, 5:18 AM

#

whole flower is the openroute api key created for anthropic claude a dropin replacement of th...

"Self-mod" (:beta endpoint) has a prompt injection somewhere that makes it like having a flagged Anthropic key.

"Regular" endpoint does not have such injection but can be intercepted by OR's moderation model, preventing a response (returns API error), at no charge. This endpoint is otherwise like having a normal key aside from potential blockage. One thing to note is that Anthropic API uses a single system parameter and does not have system role for messages array. OpenRouter sweeps all system messages in messages array of their own API which will be converted to the system parameter sent to Anthropic. Ensure that there is no user/assistant message before a system message, then model behavior should be identical to using Anthropic directly.

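The system-message handling described above can be pictured roughly like this. It is a sketch of the described behavior, not OpenRouter's code: the function names and the blank-line join are assumptions. One helper folds system messages from an OpenAI-style messages array into a single system string, and the other checks the condition mentioned above, that no user/assistant message comes before a system message.

```python
def split_system(messages):
    """Fold all system messages into one string; return (system, chat_messages)."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    chat = [m for m in messages if m["role"] != "system"]
    # Joining with a blank line is an assumption made for this sketch.
    return "\n\n".join(system_parts), chat


def system_messages_lead(messages):
    """True if every system message appears before any user/assistant message."""
    seen_chat = False
    for m in messages:
        if m["role"] == "system":
            if seen_chat:
                return False
        else:
            seen_chat = True
    return True


messages = [
    {"role": "system", "content": "You are terse."},
    {"role": "user", "content": "Hello"},
    {"role": "system", "content": "Late instruction."},
]
system, chat = split_system(messages)
print(system_messages_lead(messages))  # False: a system message follows a user message
```

When the check fails, the late system message is moved ahead of the chat in the converted request, so the model sees it in a different position than you wrote it.
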
slender axle Dec 2, 2024, 7:58 AM

#

feral mango "Self-mod" (`:beta` endpoint) has a prompt injection somewhere that makes it lik...

Good knowledge! Any idea what OpenRouter uses as a moderation system for the non-self-moderating option?

spiral solstice Dec 2, 2024, 8:06 AM

#

whole flower is the openroute api key created for anthropic claude a dropin replacement of th...

No. OpenRouter API keys can only be used for the OpenRouter API, vice versa for Anthropic API keys (can only be used for Anthropic API) and OpenAI API keys (can only be used for OpenAI API), for example.
There is a new feature in the works which will allow you to bring your own API key from Anthropic or OpenAI (or other provider) and use those through OpenRouter, but that is kind of the opposite way you described in your question.

feral mango Dec 2, 2024, 8:20 AM

#

Oh, by "drop in replacement" question I was thinking of model behavior than sticking keys into an API. What he said, can't stick OR key into direct.

feral mango Dec 2, 2024, 8:24 AM

#

slender axle Good knowledge! Any idea what OpenRouter uses as a moderation system for the non...

Not much more is disclosed than #1298353935500836957 message

whole flower Dec 2, 2024, 8:34 AM

#

feral mango "Self-mod" (`:beta` endpoint) has a prompt injection somewhere that makes it lik...

thank you. this is very insightful. what do you mean by flagged anthropic key?

whole flower Dec 2, 2024, 8:36 AM

#

feral mango Oh, by "drop in replacement" question I was thinking of model behavior than stic...

yeah it's more of the latter. sticking keys into the api without refactoring. 🙂

feral mango Dec 2, 2024, 8:48 AM

#

When people do enough no-nos on direct, they get notified that their account is flagged, and they'll get an injection that affects model responses. OR doesn't flag accounts this way.

whole flower Dec 2, 2024, 11:29 AM

#

feral mango When people do enough no-nos on direct, they get notified that their account is ...

i see.

proven sigil Dec 18, 2024, 12:10 AM

#

Vertex AI now supports prompt caching: https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/claude-prompt-caching
Now OR gets to deal with the complexity of dear lord which provider was the cache created in 😅

lyric monolith Jan 8, 2025, 9:05 AM

#

is it just me or is Sonnet cringe and predictable at RP?

dreamy walrus Jan 8, 2025, 9:15 AM

#

did they crank up the strictness again on claude? the previous sonnet wrote well when it didn't have that many rules to follow.

spare wigeon Jan 8, 2025, 9:23 AM

#

Don't think so. Works fine for me with minimal jb

keen stream Jan 16, 2025, 1:54 AM

#

Noting here that Anthropic is actively working to scale sonnet capacity as they're overwhelmed: https://x.com/artificialguybr/status/1879402029061378171

𝑨𝒓𝒕𝒊𝒇𝒊𝒄𝒊𝒂𝒍 𝑮𝒖𝒚 (@artificialguybr) on X

Anthropic is unable to handle Cursor usage!

Cursor dev>

"Hey Cursor Dev here, Anthropic literally cannot sustain all of Cursor’s traffic as they do not have enough GPUs. It’s really frustrating and we’re working with them as they increase their capacity."

golden wave Jan 16, 2025, 10:00 AM

#

Yeah, they seem to be struggling quite a bit recently. No new models. Opus 3.5 underperforms, so they didn't release it. Either they've reallocated most of their compute into training a thinking model or they just generally don't have enough power to sustain that many users.

tight kernel Jan 16, 2025, 11:56 AM

#

Aren't they tight with amazon

#

Is amazon out of gpu

#

Maybe its unprofitable and they are controlling their burn rate

golden wave Jan 16, 2025, 10:27 PM

#

I think this might be the reason why they bumped up the price on 3.5 Haiku, not whatever excuse they've made

silent vector Jan 17, 2025, 10:32 AM

#

Making an API request to claude-3.5-sonnet:beta gives me the error in the attached screenshot.
Other models are working fine.
Is there any solution or known reason?

shut laurel Jan 17, 2025, 11:07 AM

#

anthropic is overloaded with requests so sometimes 3.5 sonnet stops working

vestal ibex Jan 17, 2025, 11:29 AM

#

last time this happened it seemed to be about a month or so before a new model was released, so I'm hoping it's because they are cooking/testing a new model

golden wave Jan 17, 2025, 12:17 PM

#

Hopefully 🤞

feral mango Feb 15, 2025, 11:04 PM

#

Ah, rate limits... Also Sonnet is up by ~40B tokens per week vs 2 weeks ago.

muted osprey Feb 15, 2025, 11:11 PM

#

We're working as fast as we can on getting more sonnet capacity (provisioning extra dedicated instances of it). In the meantime, adding your own Anthropic key in Settings -> Integrations will help boost your rate limits, if you're running into them.

fossil night Feb 17, 2025, 6:02 PM

#

muted osprey We're working as fast as we can on getting more sonnet capacity (provisioning ex...

Thank you very much for working on increasing the Sonnet Capacity.

Currently, I have registered my Anthropic API key in Integrations, and I am using it with the "Use this key as a fallback" setting disabled (prioritize my Anthropic key -> on rate limit or failure, use OpenRouter credits).

I’ve noticed some strange behavior occurring about once a day, so I’m reaching out with a question about it.

Weird Behavior:

Normally it works fine with my Anthropic key, and I get a "personal key used" log on OpenRouter with 5% charged.
Then a request shows a warning sign in the Anthropic logs, with the log below:

{
  "client_error": true,
  "code": 499,
  "detail": "Client disconnected"
}

For subsequent requests, the request goes to OpenRouter directly and uses OpenRouter credits.

TimeLine(KST based):

Until [Feb 18, 02:01:33 AM] works fine with Anthropic Key
[Feb 18 02:02:42 AM] Got Warning sign on Anthropic Logs
After [Feb 18, 02:03:02 AM] using the OpenRouter credits directly, even though 'Use this key as a fallback' is disabled.
[Feb 18, 02:58:35 AM] Return back to use my Anthropic Key

I really like the Integrations feature, but I don't know why this is happening. I have a Tier 3 account on Anthropic (input tokens per minute: 160,000), and I don't think this is a rate limit or failure on Anthropic's side (I will share the logs as an image).

Have you ever received reports of behavior similar to mine? I would really appreciate any help you can provide regarding this.

supple umbra Feb 18, 2025, 3:19 AM

#

This is exponential, but crashing into hard limits. This model has had capacity issues from day 1. Anthropic accidentally created GOAT AI and didn't charge enough for it. OpenAI's strategy with their cheaper and shittier 4o makes more sense in this context.

muted osprey Feb 18, 2025, 3:34 AM

#

fossil night Thank you very much for working on increasing the Sonnet Capacity. Currently, I...

Very odd, haven’t heard of it before. No reason we wouldn’t use your key unless Anthropic was sending down a 499 response code. Will flag to the team to double check

supple umbra Feb 18, 2025, 3:35 AM

#

Cline absolutely gobbles tokens, but I'm guessing Cursor dwarfs this number. The jump from 4. to 3. is wild

supple umbra Feb 18, 2025, 3:39 AM

#

fossil night Thank you very much for working on increasing the Sonnet Capacity. Currently, I...

Anthropic likely threw an error, and OpenRouter backup generator kicked on for a while.

Does OR always try the preferred source first in this configuration?

fossil night Feb 18, 2025, 4:45 AM

#

muted osprey Very odd, haven’t heard of it before. No reason we wouldn’t use your key unless ...

Thank you so much for your response! I hope this issue gets resolved quickly. If you need any information, please feel free to let me know anytime. 🙆

fossil night Feb 18, 2025, 4:54 AM

#

supple umbra Anthropic likely threw an error, and OpenRouter backup generator kicked on for a...

On OR Docs, [BYOK] -> [Automatic Fallback] we have this section

Conversely, if “Use this key as a fallback” is disabled for a key, OpenRouter will prioritize using your key. If it hits a rate limit or encounters a failure, it will then retry with your credits.

And this is what I exactly wanted
(prioritize my Anthropic key first -> if rate limited or failure happens -> use OR credits)
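
As a way to picture the ordering I expect from that docs section, here is a small sketch. It is only my reading of the quoted text, not OpenRouter's implementation: the names are hypothetical, and the behavior for the enabled case is inferred from the word "Conversely".

```python
def attempt_order(use_key_as_fallback):
    """Order in which billing sources are tried, per my reading of the docs."""
    if use_key_as_fallback:
        # Inferred: credits first, own key only as a fallback.
        return ["credits", "own_key"]
    # Quoted case: own key first, credits on rate limit or failure.
    return ["own_key", "credits"]


def route(send, use_key_as_fallback):
    """Try each source in order; return (source, result) for the first success."""
    last_error = None
    for source in attempt_order(use_key_as_fallback):
        try:
            return source, send(source)
        except RuntimeError as err:  # stands in for a rate limit or failure
            last_error = err
    raise last_error


def flaky_send(source):
    # Hypothetical sender: the personal key fails, credits succeed.
    if source == "own_key":
        raise RuntimeError("rate limited")
    return "ok"


print(route(flaky_send, use_key_as_fallback=False))
```

Under this reading, each new request should start again from my own key, which is why roughly an hour of credit-only requests looks odd to me.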

But I don't understand why the error below occurred, and I also can't find the reason why OR credits continue to be used directly after it happens (for about an hour or less).

{
  "client_error": true,
  "code": 499,
  "detail": "Client disconnected"
}

Here is my current Integrations config

cloud sphinx Feb 18, 2025, 7:06 AM

#

eternal jolt Feb 18, 2025, 9:38 AM

#

cloud sphinx

If you don't want to pay more, try limiting the max output tokens to less than 4553.

fiery stream Feb 18, 2025, 8:17 PM

#

does anyone know if sonnet 3.5 supports context caching for providers other than Anthropic (Bedrock and Vertex)?

shell flint Feb 18, 2025, 8:51 PM

#

I don't think so. I'm not sure where it shows the provider caching function.

muted osprey Feb 18, 2025, 8:52 PM

#

fiery stream does anyone know if sonnet 3.5 supports context caching for providers other than...

Caching works for Vertex, and Bedrock will start caching very very soon (today hopefully cc @brisk shore )

#

whenever caching works for your requests, you're of course pinned to that provider until the cache expires

fiery stream Feb 18, 2025, 8:57 PM

#

muted osprey whenever caching works for your requests, you're of course pinned to that provid...

is pinning going on automatically or should I set only 1 provider in a config?

keen stream Feb 18, 2025, 9:09 PM

#

fiery stream is pinning going on automatically or should I set only 1 provider in a config?

automatic

shell flint Feb 19, 2025, 10:40 AM

#

https://x.com/kimmonismus/status/1892144322461434274/photo/1

Chubby♨️ (@kimmonismus) on X

Claude "extended thinking". Claude 4 (?) reasoning already discovered

#

👀

tawny granite Feb 19, 2025, 12:07 PM

#

shell flint https://x.com/kimmonismus/status/1892144322461434274/photo/1

or maybe it's just a different system prompt 🤓

supple umbra Feb 19, 2025, 1:12 PM

#

tawny granite or maybe it's just a different system prompt 🤓

"Really THINK through the question step by step. Write a lot of text that makes it look like you're really agonising over the solution. Then just ignore that and go with your gut like always 🤙"

supple umbra Feb 19, 2025, 2:18 PM

#

Honestly even if it's just an incremental upgrade and/or something bolted on to improve it, I'll be happy. It's still #1.

#

But yeah it would be nice to get Claude 4 or whatever, so we can finally get our minds blown again

fiery stream Feb 19, 2025, 2:32 PM

#

supple umbra But yeah it would be nice to get Claude 4 or whatever, so we can finally get our...

not with the price hopefully 🥲

supple umbra Feb 19, 2025, 3:04 PM

#

fiery stream not with the price hopefully 🥲

Haha... who knows, but I feel like if it's the same price as 3.5... we have to count that as a win 🙃

green flicker Feb 19, 2025, 3:25 PM

#

supple umbra But yeah it would be nice to get Claude 4 or whatever, so we can finally get our...

https://twitter.com/AnthropicAI/status/1886452489681023333

Anthropic (@AnthropicAI) on X

New Anthropic research: Constitutional Classifiers to defend against universal jailbreaks.

We’re releasing a paper along with a demo where we challenge you to jailbreak the system.

#

going to be worthless

#

HuTaoAcid

supple umbra Feb 19, 2025, 5:32 PM

#

green flicker going to be worthless

Damn. At least theres 1000s of other waifu models for you to goon with

feral mango Feb 20, 2025, 4:40 AM

#

Did caching break this morning? Tried all three providers, doesn't seem to be any cache writes or read.

green flicker Feb 20, 2025, 6:06 AM

#

supple umbra Damn. At least theres 1000s of other waifu models for you to goon with

wouldn't be claude.

#

wah

keen stream Feb 20, 2025, 1:25 PM

#

feral mango Did caching break this morning? Tried all three providers, doesn't seem to be an...

should be better now as of about an hour after you posted this message

feral mango Feb 21, 2025, 7:10 PM

#

Is there a reason Vertex and Bedrock are listed twice in the model page?

#

Similar but slightly different latency/throughput stats.

keen stream Feb 21, 2025, 7:48 PM

#

feral mango Is there a reason Vertex and Bedrock are listed twice in the model page?

one is US deployment and the other is EU, gives us extra capacity

astral snow Feb 23, 2025, 12:40 PM

#

Prompt caching for OpenAI models should be automatic, but for openai/chatgpt-4o-latest it is not caching anything for me. Anthropic caching is working as intended.

muted osprey Feb 23, 2025, 5:14 PM

#

astral snow Prompt caching for OpenAI models should be automatic, but for `openai/chatgpt-4o...

That model has never supported caching, to our knowledge

#

Just their main api models

unique hemlock Feb 24, 2025, 12:12 PM

#

green flicker https://twitter.com/AnthropicAI/status/1886452489681023333

Surely they won’t poison the API with that

supple umbra Feb 25, 2025, 12:00 PM

#

unique hemlock Surely they won’t poison the API with that

it was actually the opposite!! can you believe it?!

unique hemlock Feb 25, 2025, 2:43 PM

#

supple umbra it was actually the opposite!! can you believe it?!

talking to the model you can tell it's positivity poisoned at its core despite this.

While it won't refuse, it will "work around" generating "negative" content. Death during a story is near impossible.

#

I would go as far as to say that it's too happy to help with problems often trying to make things sound better than they are. At least in my coding testing.

#

katiayy

#

"The red area in the graph marks the important segment of the data" type stuff has been said many a time by now.

#

like no shit I made that graph

supple umbra Feb 25, 2025, 10:33 PM

#

unique hemlock I would go as far as to say that it's too happy to help with problems often tryi...

Interesting. I'm looking forward to really putting it through its paces with Cursor. I've got a rules file specifically tuned for 3.5's bad habits, I'm hoping the changes might make it less "corporate".

#

I haven't seen this mentioned, but very welcome. It knows about the Node v22 LTS and React 19 releases, for reference.

This is the model that's had an actual, measurable positive impact on my life, and was able to break my rule of never getting hyped about a future release. I just can't help but feel a touch of schadenfreude for people who need it to also write stories that would be banned in most countries for them to consider it useful. And since there's never enough Claude to go around, I'm kinda glad honestly

supple umbra Feb 25, 2025, 11:38 PM

#

I think the tools are severely underpowered for story writing really, more than the models. If you're ever really satisfied by next token prediction... maybe you should try reading more widely. It takes a human in the loop to write something truly thrilling.

SillyTavern is inspirational and reading its code is a big part of how I learned some of the intricacies of prompting, but it trying to be everything to everyone coupled with its tech/UI debt makes it frustrating to work with.

Hermes 3 405B and Mistral Large have no limits, but they need help with prose. I've got some ideas for a more focused tool, but it's only something I do occasionally and it's hard to make a priority. I think most of the pieces are already here, but frankly I think RP/story writers have been fairly lazy, and lack imagination.

tight kernel Mar 13, 2025, 9:58 AM

#

@supple umbra I think it's true that the current gen tools don't write prose on the level of a good author, and come up with very generic, straightforward stories. If you give Claude pages of world building, it won't consistently use them. Sometimes it appears to 'get' its directions but then does something jarringly out of place.
It's still at a level where it makes an author more efficient rather than replacing them. Same with serious coding: if you need good architecture and abstractions and consistent use of coding patterns across a large application, you have to micromanage.

#

The thing is - a lot of people are not that great at coding and writing. A lot of even commercial books and code are written pretty badly. They are, apparently, good enough. So for "terrible ghost written fantasy novel for Kindle Unlimited" writing and "I made a dashboard for the sales department" coding, you can get pretty good results with a light touch.

#

What I find interesting is... both of these allow people who don't have the complete set of skills but have ideas to make a thing they otherwise wouldn't have made.
Joe from accounts can write a python script that emails bad debtors on a friday.
Then he can go home and write that harry potter inspired space western he always wanted to write.

#

So for me, I'm AI genning a little novel in Claude, and the prose isn't good but it's a lot better than I could write. On the other hand, I have a lot of life experience and ideas that someone who focused on honing their writing craft might not have. Like, I'm writing some corporate thriller story, and I can include all kinds of realism and dumb detail a beginner author writing from research couldn't capture. For me, it makes the story a lot more interesting and less subtly irritating.
And I think that's the core thing, the place we are almost at -

I can make a story that is exactly what I want to read, without all that much effort. The prose is good enough. It's accurate to my life experience. It's aligned to my tastes. It is the story that I want to read today.
What we're going to see I think is more people who wouldn't have bothered, creating content for themselves because it's low effort. Content that is "good enough", and unique to them.

A lot of low grade commercial content is written by authors who are watching keyword rankings on kindle. Today "dark romance" and "self help" are trending so they write a book about a woman who finds her self confidence with a hot exploitative therapist.
And it's terrible. Written in one take no edit. By a 17 year old Indonesian ghost writer saving for college. And people buy it, enough people.

I think this is getting to the point where it beats that.

glacial fox Mar 13, 2025, 10:17 AM

#

tight kernel So for me, I'm AI genning a little novel in Claude, and the prose isn't good but...

What do you think is the best model for writing? Have you taken a look at community RP models for it?

tight kernel Mar 13, 2025, 10:18 AM

#

I mean I like Claude personally, but different models have different strengths for different parts of writing I think

#

Like uh, when you are brainstorming, deepseek can pull from your notes in a way that is better or at least different. It's less positively biased so it comes up with different ideas if prompted appropriately (brainstorming in character works well for me)

#

I like the prose out of grok but Claude does a better job of understanding intent and knowing how to write a scene that is better than your outline. I've had good experience debating how to do a character with claude backwards and forwards and getting a unique voice and style out the other end that is different and better than I'd have come up with alone...

#

Chatgpt still bad

#

Smaller models... I don't get on with them, but I think it's a skills issue

#

But uh, fundamentally, it's good enough to keep me happy, and I usually read a lot of badly written trash.

glacial fox Mar 13, 2025, 10:22 AM

#

tight kernel I like the prose out of grok but Claude does a better job of understanding inten...

Interesting..
I actually have the idea of using claude to make the output first, then giving it to grok to rewrite while keeping its originality.
The thing about claude is that it's a smart model; making a story with it actually makes sense. The problem it has is the steering from Anthropic. grok is just more open: they don't care how wild or how disgusting it is, which lets it output much more diverse prose.

tight kernel Mar 13, 2025, 10:27 AM

#

Honestly, when I feed large lumps of ready written text to Grok, and other models, I find it just reproduces them verbatim.

#

Like when you wrote that I thought "hey, I'll feed it a scene and see what it does"

#

It just copy pasted

glacial fox Mar 13, 2025, 11:29 AM

#

tight kernel Honestly, when I feed large lumps of ready written text to Grok, and other model...

have you tried to steer it?

for example, I have a story where the mc is on the bad side and is a monster. when the monster does something nasty like eating "things", claude will either reject it or write it in a much tamer manner

with the output claude gave, I went to grok and prompted it on how the scene should be.
"rewrite this story while keeping it originality on its logic and pace with the difference it be more graphic, g3re,......... and so on", making it output a wilder version of the given scene

tight kernel Mar 13, 2025, 12:04 PM

#

Hmm, I had some luck with generating very detailed notes from an existing scene, deleting it from the chat, and then asking for a rewrite

supple umbra Mar 13, 2025, 12:50 PM

#

tight kernel So for me, I'm AI genning a little novel in Claude, and the prose isn't good but...

That's awesome! It's cool to hear about how it's helping you with a long term project. I think you and I would agree that AI is at its best as a force-multiplier, synthesising your own ideas with its natural writing ability into something that wouldn't have existed otherwise, and still be quite personal despite perhaps having the signature hallmarks of a particular LLM - this is a much easier problem to solve with some effort than the alternative of writing from scratch.

I don't use it for fiction so much, but just being able to brain-dump my thoughts and have it neatly laid out with some thought-provoking questions added on has been extremely valuable to me. It's helped me record details that would have been lost otherwise. I've never been a good note-taker, I easily get bogged down in the particulars of structure and prose to the point of frustration, so I've never found it a pleasurable experience like some do. Being able to just let Claude take the wheel is pretty mind-blowing when you stop to remember this concept was a sci-fi fantasy two years ago.

Really I was expressing a frustration with the overwhelmingly singular fixation of a significant part of the LLM community being on ERP only, and that being the benchmark people are measuring models by. I've had some fun with it, it's cool, I'm just kinda over it. I get a little disturbed by the sense that some people seem addicted to it, and are hanging out for their next hit of waifu sex text. I mean it's amazing that ERP works as well as it does, but it's a bit sad to think that's all it is for some.

frosty spire Mar 13, 2025, 1:20 PM

#

supple umbra That's awesome! It's cool to hear about how it's helping you with a long term pr...

Oh god, thank you for saying this

#

The ERP community is also quite polarized, with some actually digging into fine-tuning and training, while the others are some of the laziest people I have ever seen, who just keep asking which model is the best for ERP from time to time without doing the homework themselves or even bothering to Cmd+F and search

slender axle Mar 13, 2025, 4:54 PM

#

If you’re interested in using high end models for story creation, do check out my site infiniteworlds.app. I’ve put a LOT of effort into how to get good interactive stories out of them.

tight kernel Mar 13, 2025, 8:59 PM

#

very cool, somewhat what deen was talking about I think. I'll take a look

shell flint Aug 20, 2025, 8:19 PM

#

Starting February 19, 2026, Anthropic will terminate and no longer support Claude 3.5 Sonnet or Claude 3.5 Sonnet v2.

Received from Vertex

proven sigil Aug 21, 2025, 12:22 AM

#

well, Anthropic is yeeting 3.5 Sonnet earlier on their platform, October 22, 2025 https://docs.anthropic.com/en/docs/about-claude/model-deprecations#2025-08-13%3A-claude-sonnet-3-5-models
cc @keen stream for the 2 month countdown

Anthropic

Model deprecations - Anthropic

proven sigil Oct 28, 2025, 5:23 PM

#

@keen stream Anthropic just pulled 3.5 off their platform

supple umbra Oct 28, 2025, 6:24 PM

#

rest in power, king 🙏

keen stream Oct 28, 2025, 6:49 PM

#

proven sigil @keen stream Anthropic just pulled 3.5 off their platform

https://tenor.com/XscL.gif

digital thicket Oct 29, 2025, 4:02 PM

#

What happened to 3.5 Sonnet?

The description changed to this on Vercel:

The upgraded Claude 3.5 Sonnet is now state-of-the-art for a variety of tasks including real-world software engineering, agentic capabilities and computer use. The new Claude 3.5 Sonnet delivers these advancements at the same price and speed as its predecessor.

#Claude 3.5 Sonnet