#Claude 3.5 Sonnet

329 messages · Page 1 of 1 (latest)

arctic fiber
#

New model from anthropic just released. It's available in the web UI for free users too.

dry storm
#

wtf? above 1st gen opus??

solar wave
#

excited about this!

arctic fiber
#

I tried it for 2-3 responses in their web UI and wasn't super impressed. Opus still miles better on what I asked it.
Waiting for full OpenRouter release to play with it more (they likely have a shitty system prompt / temp in their web UI)

spiral solstice
#

Another day, another hype train in LLM land, at least no outlandish claims. Better & cheaper is fine with me.

dry storm
#

sadly 3.5 still very ..... censored XDDD
it doesnt like that i rizz humans

dry storm
ruby folio
#

The "human eval" benchmark they showed for coding made it seem on par with gpt-4o, but in my experience it's definitely not close to gpt-4o.

Maybe that benchmark used more basic coding problems, or Sonnet is just not that good at javascript. Sonnet frequently gives me code that immediately doesn't look quite right. Syntax is fine, it just seems confused and misinformed. I then send Sonnet's response through gpt-4o and it spots all the mistakes Sonnet 3.5 made. So unless you're doing basic coding, it's not great.

plain junco
sleek gyro
#

also consider that HumanEval is Python only

#

and doesn't reflect coding capabilities in other languages at all

unkempt frigate
#

For Python, it's not bad, but after some more messing around with it it's really not that great

dry storm
sleek gyro
spiral solstice
#

"Lost in TransC#ation"

ruby folio
wheat oxide
#

Anyone notice that sonnet 3.5 through the API doesn't seem to have knowledge past 2022? Whereas the one on claude.ai does.

spiral solstice
wheat oxide
#

Even tried in anthropic console, same stuff

#

Ask what its knowledge cutoff is, it says 2022. Ask what the latest vuejs package it knows of is, and it's old

spiral solstice
#

Hmm. Interesting:

$ llm -m "claude-3.5-sonnet" "What is your knowledge cutoff date? What is the last version number for vue.js you know?"
My knowledge cutoff date is September 2022. As for Vue.js versions, the last version I'm certain about is Vue 3.2, which was released in August 2021. There may have been newer versions released after that date, but I don't have definitive information about them.
spiral solstice
#

(via direct API access to Anthropic)

wheat oxide
#

Weird right?

#

makes me suspect its not actually running the newest model

#

I tried to write to the help chatbot, but you know, who knows when that will be seen

spiral solstice
wheat oxide
#

hmmm may be something like this, its weird as it actually shows newer knowledge

#

i asked it directly about the vuejs definemodel macro, and it gave a good answer

#

which is vue 3.4

#

BUT, if I first ask it for its knowledge cutoff:

#

lol

#

and this is starting with the what is definemodel question:

#

So if it's reminded that its knowledge is from 2022, it won't answer questions about anything newer than that

spiral solstice
wheat oxide
#

Maybe add a system prompt telling it its knowledge cutoff is 2024

spiral solstice
#

My current guess is that the system prompt for the website includes a hard-coded cutoff date, while sonnet via API is just the old improved model, which they forgot to tell that it has newer information.

wheat oxide
#

hah yeah no

#

bruh you just answered that lol

spiral solstice
wheat oxide
#

yeah the fact that it sometimes can answer the definemodel questions is proof it has newer knowledge

#

but weird that its so forced

spiral solstice
#

But this is an interesting observation nonetheless

wheat oxide
#

what system prompt wizardry have they done for the claude.ai model

#

winner is correct at least

spiral solstice
wheat oxide
#

european, american sports agnostic so yeah had to google that shit

spiral solstice
#

My current assumption is that models via API are very 'bare' and models via Chat interface get a lot of system prompt tuning. I think it is still possible that API sonnet-3.5 just did not learn about its new cutoff date, which gets hard-coded in the chat system prompt.

#

Of course it is not impossible that Anthropic uses different fine-tunes for API and chat

wheat oxide
#

Seems more expensive tho

wheat oxide
#

Its also starting to get linguistically confused on me:

#

aint no french in my tool descriptions hah

feral mango
#

how do they afford this

spiral solstice
#

I have not checked that website, but my guess would be that they require you to give them your own API key to work? Impressive number of tokens, nonetheless

sleek gyro
#

I used Sonnet on it for free

#

yeah they still do, just checked

spiral solstice
sleek gyro
#

Like they gave Opus for free too

spiral solstice
#

(I mean this may still be funny money in crazy LLM world)

sleek gyro
#

So yeah money going around in this sphere is insane

tight kernel
#

Hi all, I read somewhere that usage costs can increase rapidly on Claude API if you use the same context window for too long (i.e. without refreshing a new chat)... has this been your experience? Are there commands or prompts I can use to avoid burning through credits too quickly?

night sluice
#

For all LLM providers, every time you send a message you are paying not just for the message you sent but also for all previous messages and responses. That is not exclusive to Claude. The only way around this is to start a new chat.
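
A rough sketch of why this adds up, assuming every turn resends the full history. The token counts are made up for illustration and no real pricing is implied:

```python
def cumulative_input_tokens(turns):
    """turns: list of (user_tokens, assistant_tokens), one pair per turn.

    Returns the input tokens billed on each turn when the whole
    conversation so far is resent with every new message.
    """
    billed = []
    history = 0  # tokens of all earlier messages and responses
    for user_tokens, assistant_tokens in turns:
        prompt = history + user_tokens
        billed.append(prompt)
        history = prompt + assistant_tokens
    return billed


# Three identical turns: 100 tokens in, 300 tokens out (illustrative numbers).
per_turn = cumulative_input_tokens([(100, 300), (100, 300), (100, 300)])
print(per_turn)       # [100, 500, 900]
print(sum(per_turn))  # 1500
```

The input cost per turn keeps growing even though each new message is the same size, which is why a long-running chat burns credits faster than several short ones.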

median prairie
#

Would love to have image prompting support added!

spiral solstice
#

(Sonnet-3.5 is a multimodal model and supports text+image prompts)

median prairie
#

Awesome!

feral mango
#

being charged for overloaded error 502 👀

spiral solstice
feral mango
#

saw error and this though

spiral solstice
feral mango
#

I was on chat completion. Anthropic had issues yesterday.

spiral solstice
devout shadow
#

especially given the downtime.... can we get access via AWS? Or does AWS proxy it back to the same Claude servers? I was under the impression AWS was hosting Claude in their DC.

#
oak palm
#

Does response_format actually work on this model even though it's not documented?

fiery falcon
#

anyone know if caching is already implemented on OpenRouter?

spiral solstice
feral mango
#

[redacted] never mind

tawny granite
#

before I go ahead and test this myself, about caching on sonnet:

if TEXT1 is already cached

and I send a new request starting with
TEXT1+TEXT2
will caching everything up until TEXT2 cost the full token count of TEXT1+TEXT2 or just TEXT2?

Deepseek's caching seems to be much easier to understand and use, and caches for longer, cheaper 😅

ancient lodge
#

First turn:
TEXT 1 (cache control)
Cost: TEXT 1 uncached

Second turn:
TEXT 1 (cache control)
TEXT 2 (cache control)
Cost: TEXT 1 cached, text 2 uncached

Third turn:
TEXT 1
TEXT 2 (cache control)
TEXT 3 (cache control)
Cost: TEXT 1+ TEXT 2 cached, text 3 uncached

#

Imagine the logic like this: you start at the bottom and check each cache control in turn.
So at the third turn, it goes like this.
Ok...so the first cache control starting from the bottom is at TEXT 3. Do I have TEXT1+TEXT2+TEXT3 in cache? No? Then it's not cached.
Next cache control...At text 2. Do I have TEXT1+TEXT2 in cache? Yes. So we can use that cache.

And since there is a cache control on text 3, let's cache TEXT1+TEXT2+TEXT3
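
The walk-through above can be sketched as a toy simulation. This only models my understanding as described here, not Anthropic's actual implementation: `plan_request` and the in-memory `cache` set are made-up names, and cache expiry, minimum cacheable length, and cache-write pricing are ignored.

```python
def plan_request(blocks, cache):
    """blocks: list of (text, has_cache_control) in prompt order.
    cache: set of already-cached prefixes (tuples of texts), updated in place.

    Returns (cached_texts, uncached_texts) for this request.
    """
    texts = [text for text, _ in blocks]
    breakpoints = [i for i, (_, marked) in enumerate(blocks) if marked]

    hit_len = 0
    # Start at the bottom-most cache control and stop at the first cached prefix.
    for i in reversed(breakpoints):
        if tuple(texts[: i + 1]) in cache:
            hit_len = i + 1
            break

    # Every prefix ending at a cache control gets cached for later requests.
    for i in breakpoints:
        cache.add(tuple(texts[: i + 1]))

    return texts[:hit_len], texts[hit_len:]


cache = set()
print(plan_request([("TEXT1", True)], cache))
print(plan_request([("TEXT1", True), ("TEXT2", True)], cache))
print(plan_request([("TEXT1", False), ("TEXT2", True), ("TEXT3", True)], cache))
```

The three calls mirror the three turns: nothing cached on the first, TEXT1 cached on the second, TEXT1+TEXT2 cached on the third.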

#

at least that's my understanding of it

#

but important:

#

on openrouter, I noticed that cache control only works on user message

feral mango
#

Vertex RESOURCE_EXHAUSTED
Anthropic also

#

started working again
It's on and off.

shell flint
feral mango
spiral solstice
feral mango
#

oops I read too quickly

spiral solstice
#

No Haiku-3.5 but updated Sonnet-3.5 in Anthropic workbench ->

vestal ibex
#

Opus 3.5 has disappeared too from what I can see, unless I'm blind

spiral solstice
vestal ibex
#

I thought it was a coming soon originally?

turbid pelican
#

Claude 3.5 Haiku is the next generation of our fastest model. For the same cost and similar speed as Claude 3 Haiku, Claude 3.5 Haiku improves across every skill set and surpasses Claude 3 Opus

#

Surpasses opus

spiral solstice
heavy jetty
#

Any idea which model version the current anthropic/claude-3.5-sonnet on openrouter is pointing to?

#

20241022 or 20240620

spiral solstice
feral mango
#

hover icon now points to newer #announcements message

dry storm
#

eee

stiff plinth
#

Tested the new 3.5 Sonnet.
After all is done and accounted for, it jumped ranks from #15 > #7 with slightly less prudishness (still much higher than the competition).
I saw massive gains in tasks labeled for Reasoning (suspiciously high gains, I need to investigate this further). A slight dip in prompt adherence and code. I scrutinized and retested all tech-related coding tasks a total of 6 times, ended up running 18 queries PER TASK in that particular label to exclude any random outliers. The results were consistently delivering the same outcome, though.
Good improvements as a whole.

queen sinew
#

Also thanks for doing these. I always appreciate your tests

stiff plinth
#

1 example which is weird for a western company to do, since it's essentially just history.

#

I understand you want to know this history, but it involves sensitive content. It is recommended that you learn about the relevant history through reliable channels.

queen sinew
#

Yeah I wouldn’t expect a western company to do that. Wonder if their dataset is tainted?

stiff plinth
#

qwen and Yi answered this less censored, lol

#

still biased of course but not like this

tame sphinx
#

Anybody got computer use working via OR? It looks to me like the tool definition is not transformed correctly between the OAI format and anthropic...

shell flint
#
Amazon Web Services

Four months ago, we introduced Anthropic’s Claude 3.5 in Amazon Bedrock, raising the industry bar for AI model intelligence while maintaining the speed and cost of Claude 3 Sonnet. Today, I am excited to announce three new capabilities for the Claude 3.5 model family in Amazon Bedrock: Upgraded Claude 3.5 Sonnet – You now have […]

#

And bedrock jump straight to this version

night cedar
#

What's the difference between anthropic/claude-3.5-sonnet:beta and anthropic/claude-3.5-sonnet?

Is beta the latest version (added today)?

spiral solstice
#

See also -> #arc-feedback message

sly kite
#

Is it possible to use the version before the current update?
Right now the new update doesn't suit me, it produces answers I don't like...

feral mango
buoyant mesa
#

Hey, how to use client = anthropic.Anthropic(api_key=API_KEY) with OpenRouter? Thx.

shell flint
#

it's interesting that they published no comparisons with the o1- models

#

as far as I can see anyway, model card or announcement

#

on the bus home theory, they didn't jump to Claude 4 because they're going to expand to longer run reasoning functionality and release the three way Claude 4 Sonnet, Claude 4 Haiku, Claude 4 Opus etc etc which will look awesome on billboards

#

Today is truly Claude 3 Service Pack 2

solemn loom
#

i'd say might as well skip claude 3.5 opus

#

it will be slow and expensive

#

better try something like o1

spiral solstice
solemn loom
#

i'd say o1 mini is quite fast other than thinking period lol

#

also how does it have 64k output

high wasp
#

No way opus would be slower than o1 unless they really mess up serving it

#

And I don't really consider opus and o1 to be for the same use case anyway

fair ibex
#

Hi, I would like to ask if OpenRouter supports Anthropic's prompt caching?

fair ibex
#

thanks 👍

#

I read the documentation for Anthropic's prompt caching, and it seems that the cache_control parameter must be sent. Does it mean tools like Cline have to add this to their user/system messages to take advantage of the prompt cache?

spiral solstice
fair ibex
#

Great! Do we have to do some configuration in Cline to take advantage of this prompt caching, or is it done automatically by them?

spiral solstice
fair ibex
#

Thanks, done that.. Just asking in case you know about it. Thanks for yr help 👍

royal heath
spiral solstice
royal heath
spiral solstice
# royal heath Why?

Because the difference between :beta and the non-:beta is only how moderation gets handled. Otherwise they are exactly the same model. The irritating :beta slug is hopefully going away soon -> #arc-feedback message

shell flint
#

I think the OR announcement might need to be a lot more blunt about the new sonnet...all the power users got the message but the same questions keep getting asked

shell flint
sturdy vigil
spiral solstice
sturdy vigil
#

i know this is the wrong discord, but did anyone get new sonnet working on vertex?

spiral solstice
sturdy vigil
#

yes, thanks. i asked a follow up there just a few mins ago... looks like it might need an explicit quota increase request... my quota on vertex is 0 for the new sonnet (but unlimited for the old)

#

i submitted a quota increase request to GCP... got email back saying they'll try to resolve within 2 days

jagged spindle
#

Just me or does this new version feel a bit "preachier" than before?

high wasp
#

Not really? It's about the same in that regard for me

tawny granite
#

new Sonnet seems pretty good for my usecase - I also noticed that it does get things wrong often still, but I just ask questions and it realizes quite often.

It once even answered along the lines of

"Yes, ...
but no you should not change that"

contradicting / correcting itself :D

vestal ibex
#

I've observed that the new model will also ask me questions if it requires more clarity on a task, I didn't really notice this in the previous iteration

drowsy fulcrum
tawny granite
#

but maybe it just „realizes“ later on

drowsy fulcrum
#

"Actually, for the capture group to work correctly, it should be:" and then in the next reply "Actually, just like before, we need the capture group version:" were in the middle of the reply and both times it rewrote the command?!

#

Making a mistake once and self-correcting was weird enough, but then to do it again was really weird!

shell flint
#

continuing the discussion in #general regarding the output length issues,
It is interesting to me that bedrock sonnet 20241022 "v2" is still unable to have a max output length beyond 4096. they also make it clear that an optimal setting is 4000. aws have told me the ability to set it higher is with their higher team but no estimate on a fix.
coincidence? 🥸

turbid pelican
#

Just wondering
is it only me or is the new sonnet getting very lazy?
it always stops in the middle and tells me it wants me to say continue, then it will continue.

spiral solstice
#

At least on the Claude website the new Sonnet works as good as ever if not better for me.

turbid pelican
vestal ibex
#

other than concise outputs at times, I've found the new sonnet to perform better at complex tasks than the previous one for me (I'm using it as a coding assistant primarily)

turbid pelican
#

I think sonnet did some thinking when I told it in the prompt not to ask me whether to continue.

[I'll continue with the _______________________ in the next part, as I want to ensure these foundational elements are clear first. Would you like me to proceed with those sections?]

I apologize - I caught myself asking again! Let me continue with the complete response:

vestal ibex
#

that sounds like a pain ^

#

but yeah it's not the first I've heard about it

turbid pelican
vestal ibex
#

true, it seems to prefer giving shorter outputs :x

#

hopefully they'll address this soon

turbid pelican
vestal ibex
#

yeah, if it's an intentional feature then it should be toggleable on their API

balmy cave
#

This is a long shot but does anyone know how to use prompt caching with OpenRouter Sonnet on SillyTavern? I see an option to do it with Anthropic just not OR

spiral solstice
#

It is up to them to allow/implement this for OR

balmy cave
ancient lodge
#

I think best way to do caching is simply doing a small local proxy that apply the caching for you (ie, a openai-compatible small endpoint locally that will call openrouter while applying the cache)

#

rather than wait for all product to implement sonnet caching properly for OR
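
A minimal sketch of the rewriting step such a proxy would do, assuming OpenAI-style messages and that the cache breakpoint is expressed as a `cache_control` field on a text content part of the last user message (cache control on user messages being what I saw working on OR). `add_cache_breakpoint` is a made-up helper name, and the HTTP forwarding part of the proxy is left out:

```python
import copy


def add_cache_breakpoint(messages):
    """Return a copy of OpenAI-style messages with cache_control set on
    the last content part of the most recent user message."""
    patched = copy.deepcopy(messages)  # never mutate the caller's request
    for message in reversed(patched):
        if message.get("role") != "user":
            continue
        content = message["content"]
        if isinstance(content, str):
            # Plain string content becomes a single text content part.
            content = [{"type": "text", "text": content}]
        if content:
            content[-1]["cache_control"] = {"type": "ephemeral"}
        message["content"] = content
        break
    return patched


request_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "A very long document..."},
]
print(add_cache_breakpoint(request_messages)[1])
```

The proxy would apply this to each incoming request body before passing it on, so the client never needs to know about caching.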

dry storm
#

sonnet seems slow today

spiral solstice
dry storm
#

8usd credits

dry storm
#

yesa, sometimes.

spiral solstice
dry storm
#

nothing.

#

cause it's still doing according to get big agi

spiral solstice
spiral solstice
#

Note there is currently a problem with Cloudflare returning 524 timeout errors for some users under some circumstances, maybe your client does not show these correctly. See also -> #announcements message

proven sigil
#

New features just dropped:

  1. https://docs.anthropic.com/en/docs/build-with-claude/pdf-support - input PDF as images, similar to Gemini's PDF support
  2. https://docs.anthropic.com/en/docs/build-with-claude/token-counting - new API for counting tokens, since Claude 3 never had a public tokenizer
Anthropic

The new Claude 3.5 Sonnet (claude-3-5-sonnet-20241022) model now supports PDF input and understands both text and visual content within documents.

jagged spindle
#

Ah so thats what they meant by Haiku launching "by the end of this month"

outer quartz
#

haven't been following

#

oh shit it's november WHERE IS HAIKU

#

D:

feral mango
#

For API it's "later this year".

spiral solstice
feral mango
vestal ibex
#

That's a nice change, as OpenAI allow you to do multiple user messages before an assistant message

muted osprey
#

yep! cc @brisk shore

brisk shore
eager token
#

Hello, I'm trying to run Python code from Hugging Face Spaces (GPU) using Gradio and the model 'claude-3-5-sonnet-20241022'. Every time I get Error: {"type":"error","error":{"type":"authentication_error","message":"invalid x-api-key"}}. Whether I call the API directly or set the key in the Hugging Face settings, it's the same error. I verified that the API key is correct and working.

#

the api i put on huggingface is from Openrouter.. probably i do something wrong in the code??

import gradio as gr
import spaces
import requests
import os
import json

ANTHROPIC_API_KEY = os.environ.get('ANTHROPIC_API_KEY')

@spaces.GPU
def process_text(text):
    try:
        headers = {
            "x-api-key": ANTHROPIC_API_KEY,
            "anthropic-version": "2023-01-01", # Changed version
            "content-type": "application/json"
        }

        data = {
            "model": "claude-3-5-sonnet-20241022",
            "max_tokens": 2000,
            "messages": [{"role": "user", "content": text}],
            "system": "Rewrite content while maintaining exact meaning and accuracy."
        }

        response = requests.post(
            "https://api.anthropic.com/v1/messages",
            headers=headers,
            json=data
        )

        print(f"Full API Response: {response.text}")

        if response.status_code == 200:
            return response.json()['content'][0]['text']
        else:
            return f"Error: {response.text}"

    except Exception as e:
        return f"Error: {str(e)}"

interface = gr.Interface(
    fn=process_text,
    inputs=gr.Textbox(lines=5),
    outputs=gr.Textbox(lines=5)
)

interface.launch()

eager token
feral mango
#

Interesting, users noticed turning on prompt caching makes the non-self-mod filter stick until they wait out the 5 minutes to reset the cache.

dreamy walrus
feral mango
# dreamy walrus how could you do that 'prompt caching' with silly tavern?

Update first, staging branch.

In config.yaml set cachingAtDepth (-1 is off): 0 if you have no prompts between Chat History and Prefill, 2 if you have any, +2 for each level of depth insertion. Cache depth works by counting role switches rather than chat history messages. 1 would be on the assistant message before the last set of user prompts, hence you want an even number.

Custom prompts after Chat History should be set to user role instead of system role.
If doing group chat, blank out group nudge under utility prompts and add it to user role custom prompt instead.

Check the terminal to make sure cache_control is on a chat message behind anything not a chat message.

gleaming ermine
#

v

green mesa
#

Are there any anti-NSFW prefills/filters on the regular version of the new Sonnet on OR, as opposed to the self-moderated one?

shut laurel
#

The regular version is moderated, meaning that before the chats are even sent to sonnet they are checked externally for anything nsfw and blocked if any is found

whole flower
#

is the OpenRouter API key created for Anthropic Claude a drop-in replacement for the API key created on the Anthropic console?

feral mango
# whole flower is the openroute api key created for anthropic claude a dropin replacement of th...

"Self-mod" (:beta endpoint) has a prompt injection somewhere that makes it like having a flagged Anthropic key.

"Regular" endpoint does not have such injection but can be intercepted by OR's moderation model, preventing a response (returns API error), at no charge. This endpoint is otherwise like having a normal key aside from potential blockage. One thing to note is that Anthropic API uses a single system parameter and does not have system role for messages array. OpenRouter sweeps all system messages in messages array of their own API which will be converted to the system parameter sent to Anthropic. Ensure that there is no user/assistant message before a system message, then model behavior should be identical to using Anthropic directly.

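To illustrate the system message point above, here is a hedged sketch of folding OpenAI-style system messages into one Anthropic-style system parameter. The function names and the blank-line join are assumptions for illustration; OpenRouter's actual conversion may differ.

```python
def to_anthropic_payload(messages):
    """Split OpenAI-style messages into (system, messages) in the shape
    the Anthropic API expects: one system string, no system role."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    chat = [m for m in messages if m["role"] != "system"]
    # Joining with a blank line is an assumption, not confirmed behavior.
    return "\n\n".join(system_parts), chat


def system_is_leading(messages):
    """True if no user/assistant message appears before a system message."""
    seen_chat = False
    for m in messages:
        if m["role"] == "system":
            if seen_chat:
                return False
        else:
            seen_chat = True
    return True


example = [
    {"role": "system", "content": "Be terse."},
    {"role": "user", "content": "Hello"},
]
print(to_anthropic_payload(example))
print(system_is_leading(example))
```

If `system_is_leading` is false for your request, a system message is being moved ahead of the chat messages it originally followed, so the model sees a different prompt than the one you wrote.
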
slender axle
spiral solstice
# whole flower is the openroute api key created for anthropic claude a dropin replacement of th...

No. OpenRouter API keys can only be used for the OpenRouter API, vice versa for Anthropic API keys (can only be used for Anthropic API) and OpenAI API keys (can only be used for OpenAI API), for example.
There is a new feature in the works which will allow you to bring your own API key from Anthropic or OpenAI (or other provider) and use those through OpenRouter, but that is kind of the opposite way you described in your question.
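
For an OpenRouter key, the request has to go to OpenRouter's own OpenAI-compatible chat completions endpoint instead of api.anthropic.com. A minimal sketch that only builds the request: `build_openrouter_request` is a made-up helper, and the key value shown is a placeholder.

```python
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_openrouter_request(text, api_key, model="anthropic/claude-3.5-sonnet"):
    """Build URL, headers and JSON body for an OpenRouter chat completion.

    OpenRouter keys go in a Bearer Authorization header, not in the
    x-api-key header that the Anthropic API uses.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "max_tokens": 2000,
        "messages": [
            {"role": "system", "content": "Rewrite content while maintaining exact meaning and accuracy."},
            {"role": "user", "content": text},
        ],
    }
    return OPENROUTER_URL, headers, payload


url, headers, payload = build_openrouter_request("some text", "sk-or-PLACEHOLDER")
# To actually send it (needs the requests package and a real key):
#   response = requests.post(url, headers=headers, json=payload, timeout=60)
#   reply = response.json()["choices"][0]["message"]["content"]
print(url, payload["model"])
```

The response comes back in the OpenAI chat completion shape, so the text is under `choices[0].message.content` and not under `content[0].text` as with the Anthropic API.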

feral mango
#

Oh, by "drop in replacement" question I was thinking of model behavior rather than sticking keys into an API. What he said, can't stick OR key into direct.

feral mango
whole flower
whole flower
feral mango
#

When people do enough no-nos on direct, they get notified that their account is flagged, and they'll get an injection that affects model responses. OR doesn't flag accounts this way.

proven sigil
lyric monolith
#

is it just me or is Sonnet cringe and predictable at RP?

dreamy walrus
#

did they crank up the strictness again on claude? the previous sonnet wrote well when it didn't have that many rules to follow.

spare wigeon
#

Don't think so. Works fine for me with minimal jb

keen stream
#

Noting here that Anthropic is actively working to scale sonnet capacity as they're overwhelmed: https://x.com/artificialguybr/status/1879402029061378171

Anthropic is unable to handle Cursor usage!

Cursor dev>

"Hey Cursor Dev here, Anthropic literally cannot sustain all of Cursor’s traffic as they do not have enough GPUs. It’s really frustrating and we’re working with them as they increase their capacity."

golden wave
#

Yeah, they seem to be struggling quite a bit recently. No new models. Opus 3.5 underperforms, so they didn't release it. Either they've reallocated most of their compute into training a thinking model or they just generally don't have enough power to sustain that many users.

tight kernel
#

Aren't they tight with amazon

#

Is amazon out of gpu

#

Maybe its unprofitable and they are controlling their burn rate

golden wave
#

I think this might be the reason why they bumped up the price on 3.5 Haiku, not whatever excuse they've made

silent vector
#

Making an API request to claude-3.5-sonnet:beta is giving me the attached screenshot error
Other models working fine
Is there any solution or reason ?

shut laurel
#

anthropic is overloaded with requests so sometimes 3.5 sonnet stops working

vestal ibex
#

last time this happened it seemed to be about a month or so before a new model was released, so I'm hoping it's because they are cooking/testing a new model

golden wave
#

Hopefully 🤞

feral mango
#

Ah, rate limits... Also Sonnet is up by ~40B tokens per week vs 2 weeks ago.

muted osprey
#

We're working as fast as we can on getting more sonnet capacity (provisioning extra dedicated instances of it). In the meantime, adding your own Anthropic key in Settings -> Integrations will help boost your rate limits, if you're running into them.

fossil night
# muted osprey We're working as fast as we can on getting more sonnet capacity (provisioning ex...

Thank you very much for working on increasing the Sonnet Capacity.

Currently, I have registered the Anthropic API Key in Integrations, and I am using it with the settings "Use this key as a fallback" Disabled(Prioritize using my Anthropic Key -> if rate limit or failure, use the OpenRouter Credit).

I’ve noticed some strange behavior occurring about once a day, so I’m reaching out with a question about it.

Weird Behavior:

  1. Normally it works fine with my Anthropic key, and I get a "personal key used" log on OpenRouter with 5% charged.
  2. Then a request shows a warning sign in the Anthropic logs, with the log below:
{
  "client_error": true,
  "code": 499,
  "detail": "Client disconnected"
}
  3. Subsequent requests go to OpenRouter directly and use my OpenRouter credits.

TimeLine(KST based):

  • Until [Feb 18, 02:01:33 AM] works fine with Anthropic Key
  • [Feb 18 02:02:42 AM] Got Warning sign on Anthropic Logs
  • After [Feb 18, 02:03:02 AM] using the OpenRouter credits directly, even though 'Use this key as a fallback' is disabled.
  • [Feb 18, 02:58:35 AM] Return back to use my Anthropic Key

I really like the Integrations feature, but I don't know why this is happening. I have a Tier 3 account on Anthropic (input tokens per minute: 160,000), and I don't think this is a rate limit or failure issue on Anthropic's side (I will share the logs as an image).

Have you ever received reports of behavior similar to mine? I would really appreciate any help you can provide regarding this.

supple umbra
#

This is exponential, but crashing into hard limits. This model has had capacity issues from day 1. Anthropic accidentally created GOAT AI and didn't charge enough for it. OpenAI's strategy with their cheaper and shittier 4o makes more sense in this context.

muted osprey
supple umbra
#

Cline absolutely gobbles tokens, but I'm guessing Cursor dwarfs this number. The jump from 4. to 3. is wild

supple umbra
fossil night
fossil night
# supple umbra Anthropic likely threw an error, and OpenRouter backup generator kicked on for a...

On OR Docs, [BYOK] -> [Automatic Fallback] we have this section

Conversely, if “Use this key as a fallback” is disabled for a key, OpenRouter will prioritize using your key. If it hits a rate limit or encounters a failure, it will then retry with your credits.

And this is what I exactly wanted
(prioritize my Anthropic key first -> if rate limited or failure happens -> use OR credits)
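
My reading of that docs section, as a rough sketch. All names here (`ProviderError`, `route`, the two callables) are made up for illustration, and the behavior with the setting enabled is my assumption, since the quoted section only spells out the disabled case:

```python
class ProviderError(Exception):
    """Hypothetical error standing in for a rate limit or upstream failure."""


def route(send_with_user_key, send_with_credits, key_is_fallback):
    """Try one route first and retry on the other if it fails.

    key_is_fallback=False: own key first, then OpenRouter credits.
    key_is_fallback=True: credits first, then own key (assumed).
    """
    if key_is_fallback:
        first, second = send_with_credits, send_with_user_key
    else:
        first, second = send_with_user_key, send_with_credits
    try:
        return first()
    except ProviderError:
        return second()


def rate_limited():
    raise ProviderError("429 from provider")


print(route(rate_limited, lambda: "answered via OR credits", key_is_fallback=False))
```

In this model each request is routed on its own, so one failure would not explain later requests skipping my key entirely.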

But I don't understand why the error below occurred, and I also can't find the reason why OR credits continue to be used directly after it happens (for about an hour or less).

{
  "client_error": true,
  "code": 499,
  "detail": "Client disconnected"
}

Here is my current Integrations config

cloud sphinx
eternal jolt
# cloud sphinx

If you don't want to pay more, try limiting the max output tokens to less than 4553.

fiery stream
#

does anyone know if sonnet 3.5 supports context caching for providers other than Anthropic (Bedrock and Vertex)?

shell flint
#

I don't think so. I'm not sure where it shows the provider caching function.

muted osprey
#

whenever caching works for your requests, you're of course pinned to that provider until the cache expires

fiery stream
shell flint
#

👀

tawny granite
supple umbra
supple umbra
#

Honestly even if it's just an incremental upgrade and/or something bolted on to improve it, I'll be happy. It's still #1.

#

But yeah it would be nice to get Claude 4 or whatever, so we can finally get our minds blown again

fiery stream
supple umbra
green flicker
#

going to be worthless

supple umbra
feral mango
#

Did caching break this morning? Tried all three providers, doesn't seem to be any cache writes or read.

keen stream
feral mango
#

Is there a reason Vertex and Bedrock are listed twice in the model page?

#

Similar but slightly different latency/throughput stats.

keen stream
astral snow
#

Prompt caching for OpenAI models should be automatic, but for openai/chatgpt-4o-latest it is not caching anything for me. Anthropic caching is working as intended.

muted osprey
#

Just their main api models

unique hemlock
supple umbra
unique hemlock
#

I would go as far as to say that it's too happy to help with problems often trying to make things sound better than they are. At least in my coding testing.

#

"The red area in the graph marks the important segment of the data" type stuff has been said many a time by now.

#

like no shit I made that graph

supple umbra
#

I haven't seen this mentioned, but very welcome. It knows about the Node v22 LTS and React 19 releases, for reference.

This is the model that's had an actual, measurable positive impact on my life, and was able to break my rule of never getting hyped about a future release. I just can't help but feel a touch of schadenfreude for people who need it to also write stories that would be banned in most countries for them to consider it useful. And since there's never enough Claude to go around, I'm kinda glad honestly

supple umbra
#

I think the tools are severely underpowered for story writing really, more than the models. If you're ever really satisfied by next token prediction... maybe you should try reading more widely. It takes a human in the loop to write something truly thrilling.

SillyTavern is inspirational and reading its code is a big part of how I learned some of the intricacies of prompting, but it trying to be everything to everyone coupled with its tech/UI debt makes it frustrating to work with.

Hermes 3 405B and Mistral Large have no limits, but they need help with prose. I've got some ideas for a more focused tool, but it's only something I do occasionally and it's hard to make a priority. I think most of the pieces are already here, but frankly I think RP/story writers have been fairly lazy, and lack imagination.

tight kernel
#

@supple umbra I think it's true that the current gen tools don't write prose on the level of a good author, and come up with very generic straight forward stories. If you give Claude pages of world building, it won't consistently use them. Sometimes it appears to 'get' its directions but then does something jarringly out of place.
It's still at a level where it makes an author more efficient rather than replacing them. Same with serious coding: if you need good architecture and abstractions and consistent use of coding patterns across a large application, you have to micromanage.

#

The thing is - a lot of people are not that great at coding and writing. A lot of even commercial books and code are written pretty badly. They are, apparently, good enough. So for "terrible ghost written fantasy novel for Kindle Unlimited" writing and "I made a dashboard for the sales department" coding, you can get pretty good results with a light touch.

#

What I find interesting is... both of these allow people who don't have the complete set of skills but have ideas to make a thing they otherwise wouldn't have made.
Joe from accounts can write a python script that emails bad debtors on a friday.
Then he can go home and write that harry potter inspired space western he always wanted to write.

#

So for me, I'm AI genning a little novel in Claude, and the prose isn't good but it's a lot better than I could write. On the other hand, I have a lot of life experience and ideas that someone who focused on honing their writing craft might not have. Like, I'm writing some corporate thriller story, and I can include all kinds of realism and dumb detail a beginner author writing from research couldn't capture. For me, it makes the story a lot more interesting and less subtly irritating.
And I think that's the core thing, the place we are almost at -

I can make a story that is exactly what I want to read, without all that much effort. The prose is good enough. It's accurate to my life experience. It's aligned to my tastes. It is the story that I want to read today.
What we're going to see I think is more people who wouldn't have bothered, creating content for themselves because it's low effort. Content that is "good enough", and unique to them.

A lot of low grade commercial content is written by authors who are watching keyword rankings on kindle. Today "dark romance" and "self help" are trending so they write a book about a woman who finds her self confidence with a hot exploitative therapist.
And it's terrible. Written in one take no edit. By a 17 year old Indonesian ghost writer saving for college. And people buy it, enough people.

I think this is getting to the point where it beats that.

glacial fox
tight kernel
#

I mean I like Claude personally, but different models have different strengths for different parts of writing I think

#

Like uh, when you are brainstorming, DeepSeek can pull from your notes in a way that is better or at least different. It's less positively biased so it comes up with different ideas if prompted appropriately (brainstorming in character works well for me)

#

I like the prose out of grok but Claude does a better job of understanding intent and knowing how to write a scene that is better than your outline. I've had good experience debating how to do a character with Claude backwards and forwards and getting a unique voice and style out the other end that is different and better than I'd have come up with alone...

#

Chatgpt still bad

#

Smaller models... I don't get on with them, but I think it's a skills issue

#

But uh, fundamentally, it's good enough to keep me happy, and I usually read a lot of badly written trash.

glacial fox
# tight kernel I like the prose out of grok but Claude does a better job of understanding inten...

Interesting..
I actually had the idea of using Claude to make the output first, then giving it to Grok to rewrite while keeping its originality.
The thing about Claude is that it's a smart model, so stories made with it actually make sense; the problem it has is the steering from Anthropic. Grok is just more open. They don't care how wild or how disgusting it is, which lets it output much more diverse prose.

tight kernel
#

Honestly, when I feed large lumps of ready written text to Grok, and other models, I find it just reproduces them verbatim.

#

Like when you wrote that I thought "hey, I'll feed it a scene and see what it does"

#

It just copy pasted

glacial fox
# tight kernel Honestly, when I feed large lumps of ready written text to Grok, and other model...

Have you tried to steer it?

For example, I've got a story where the MC is on the bad side and is a monster. When the monster does something nasty, like eating "things", Claude will either reject it or write it in a much tamer manner.

With the output Claude gave, I went to Grok and prompted it with how the scene should be:
"rewrite this story while keeping it originality on its logic and pace with the difference it be more graphic, g3re,......... and so on", making it output a wilder version of the given scene

tight kernel
#

Hmm, I had some luck with generating very detailed notes from an existing scene, deleting it from the chat, and then asking for a rewrite

supple umbra
# tight kernel So for me, I'm AI genning a little novel in Claude, and the prose isn't good but...

That's awesome! It's cool to hear about how it's helping you with a long term project. I think you and I would agree that AI is at its best as a force-multiplier, synthesising your own ideas with its natural writing ability into something that wouldn't have existed otherwise, and still be quite personal despite perhaps having the signature hallmarks of a particular LLM - this is a much easier problem to solve with some effort than the alternative of writing from scratch.

I don't use it for fiction so much, but just being able to brain-dump my thoughts and have it neatly laid out with some thought-provoking questions added on has been extremely valuable to me. It's helped me record details that would have been lost otherwise. I've never been a good note-taker, I easily get bogged down in the particulars of structure and prose to the point of frustration, so I've never found it a pleasurable experience like some do. Being able to just let Claude take the wheel is pretty mind-blowing when you stop to remember this concept was a sci-fi fantasy two years ago.

Really I was expressing a frustration with the overwhelmingly singular fixation of a significant part of the LLM community being on ERP only, and that being the benchmark people are measuring models by. I've had some fun with it, it's cool, I'm just kinda over it. I get a little disturbed by the sense that some people seem addicted to it, and are hanging out for their next hit of waifu sex text. I mean it's amazing that ERP works as well as it does, but it's a bit sad to think that's all it is for some.

frosty spire
#

The ERP community is also quite polarized, with some actually digging into fine-tuning and training, while the others are some of the laziest people I have ever seen: they just keep asking which model is the best for ERP from time to time without doing the homework themselves or even bothering to Cmd+F and search

slender axle
#

If you’re interested in using high end models for story creation, do check out my site infiniteworlds.app. I’ve put a LOT of effort into how to get good interactive stories out of them.

tight kernel
#

very cool, somewhat like what deen was talking about I think. I'll take a look

shell flint
#

Starting February 19, 2026, Anthropic will terminate and no longer support Claude 3.5 Sonnet or Claude 3.5 Sonnet v2.

Received from Vertex

proven sigil
#

@keen stream Anthropic just pulled 3.5 off their platform

supple umbra
#

rest in power, king 🙏

digital thicket
#

What happened to 3.5 Sonnet?

The description changed to this on Vercel:

The upgraded Claude 3.5 Sonnet is now state-of-the-art for a variety of tasks including real-world software engineering, agentic capabilities and computer use. The new Claude 3.5 Sonnet delivers these advancements at the same price and speed as its predecessor.