Tulu-3-405B | OpenRouter | Page 1

austere meteor Feb 7, 2025, 2:16 PM

#

Tulu-3-405B : Better than GPT-4o and DeepSeek V3
https://huggingface.co/allenai/Llama-3.1-Tulu-3-405B
https://allenai.org/blog/tulu-3-405B

allenai/Llama-3.1-Tulu-3-405B · Hugging Face

Scaling the Tülu 3 post-training recipes to surpass the performance...

Introducing Tülu 3 405B, the first application of fully open post-training recipes to the largest open-weight models.

#

405B version (Llama-3.1-Tulu-3-405B):

It is suitable for ultra-large-scale AI tasks and can compete with GPT-4o in knowledge question answering, reasoning, and programming tasks.

lost marsh Feb 8, 2025, 10:30 PM

#

@austere meteor https://openrouter.ai/allenai/llama-3.1-tulu-3-405b

OpenRouter

A unified interface for LLMs. Find the best models & prices for your prompts

dawn flower Feb 8, 2025, 11:28 PM

#

the "To read more, click here." url within the description is 404ing

lost marsh Feb 8, 2025, 11:29 PM

#

whoops ty

dawn flower Feb 8, 2025, 11:30 PM

#

https://allenai.org/blog/tulu-3-405B

#

i think

lost marsh Feb 8, 2025, 11:30 PM

#

yep https://allenai.org/blog/tulu-3-405B

#

should be fixed in a few mins

#

thanks again

dawn flower Feb 8, 2025, 11:53 PM

#

also, the rate limit is being hit straight up

astral forge Feb 9, 2025, 12:10 AM

#

I'm rather surprised by how low the tk/s is for SambaNova

lost marsh Feb 9, 2025, 2:02 AM

#

dawn flower also, the rate limit is being hit straight up

ah man - pinged them about this

keen leaf Feb 9, 2025, 10:47 PM

#

austere meteor Tulu-3-405B : Better than GPT-4o and DeepSeek V3 https://huggingface.co/allenai/...

For ERP use which of the benchmarks should we look at?

#

Because from what I read, it shows only very slight improvements over Nous Hermes 3 405B in some cases, but at an Anthropic Claude price.

dawn flower Feb 10, 2025, 6:44 AM

#

keen leaf For ERP use which of the benchmarks should we look at?

https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard

UGI Leaderboard - a Hugging Face Space by DontPlanToEnd

keen leaf Feb 10, 2025, 7:00 AM

#

dawn flower https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard

If I understand this ranking correctly, it says gemini-exp-1206 is better than Hermes-3-Llama-3.1-405B for ERP because it is less censored.
Absolutely not true, I use both, the former is full of banned words and censorship regarding ERP.

burnt token Feb 10, 2025, 7:47 AM

#

doesn't that depend more on the creativity and human understanding of the model?

dawn flower Feb 10, 2025, 7:52 AM

#

they've updated the format to hide the subcategories, so in some cases it could be worse, yes

just checking: you do send Gemini BLOCK_NONE etc.?

pale fiber Feb 10, 2025, 3:15 PM

#

dawn flower theyve updated the format to hide the subcategories, so in some cases it could b...

What is BLOCK_NONE?

plucky kraken Feb 10, 2025, 5:19 PM

#

Tulu Llama = TuluL

dawn flower Feb 10, 2025, 9:29 PM

#

pale fiber What is BLOCK_NONE?

https://ai.google.dev/gemini-api/docs/safety-settings#code-examples

Google AI for Developers

Safety settings | Gemini API | Google AI for Developers

pale fiber Feb 10, 2025, 11:08 PM

#

dawn flower https://ai.google.dev/gemini-api/docs/safety-settings#code-examples

How do you do this through OR?

dawn flower Feb 10, 2025, 11:09 PM

#

you cannot, it is only possible via direct API

#

my paid gemini key allows me to send this
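
For anyone else asking what BLOCK_NONE is: it is one of the threshold values in the Gemini safety settings linked above. Here is a rough sketch of the request body for the direct API, assuming the REST `generateContent` shape from those docs (`safetySettings` entries with a `category` and a `threshold`). The helper name `build_request_body` is made up for illustration, and the script only builds and prints the JSON; it makes no network call.

```python
import json

# Harm categories as named in the Gemini API safety-settings docs.
HARM_CATEGORIES = [
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
]


def build_request_body(prompt: str, threshold: str = "BLOCK_NONE") -> dict:
    """Hypothetical helper: build a generateContent JSON body that applies
    the same threshold to every category listed above."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "safetySettings": [
            {"category": category, "threshold": threshold}
            for category in HARM_CATEGORIES
        ],
    }


if __name__ == "__main__":
    # Print the body that would be POSTed to the generateContent endpoint.
    print(json.dumps(build_request_body("Hello"), indent=2))
```

Other documented thresholds such as `BLOCK_ONLY_HIGH` can be passed as the `threshold` argument instead. Whether a given key or model accepts a setting is up to the API, so treat this as a starting point and check the linked docs.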

keen leaf Feb 11, 2025, 12:25 AM

#

dawn flower theyve updated the format to hide the subcategories, so in some cases it could b...

Sure, I use the free API from Google AI Studio and SillyTavern which has BLOCK_NONE by default.
But you try typing “little girl” during an ERP chat, for example, and see what happens?
They put blocks and filters on a lot of words and phrases.
