Tulu-3-405B : Better than GPT-4o and DeepSeek V3
https://huggingface.co/allenai/Llama-3.1-Tulu-3-405B
https://allenai.org/blog/tulu-3-405B
#Tulu-3-405B
405B version (Llama-3.1-Tulu-3-405B):
It is suitable for ultra-large-scale AI tasks and can compete with GPT-4o in knowledge question answering, reasoning, and programming tasks.
@austere meteor https://openrouter.ai/allenai/llama-3.1-tulu-3-405b
the "To read more, click here." URL in the description is 404ing
whoops ty
i think
should be fixed in a few mins
thanks again
also, the rate limit is being hit straight up
I'm rather surprised by how low the tokens/s is for SambaNova
ah man - pinged them about this
For ERP use which of the benchmarks should we look at?
Because from what I read, in some cases it shows only very slight improvements over Nous Hermes 3 405B, but at an Anthropic Claude price.
If I understand this ranking correctly, it says gemini-exp-1206 is better than Hermes-3-Llama-3.1-405B for ERP because it is less censored.
Absolutely not true, I use both, the former is full of banned words and censorship regarding ERP.
doesn't that depend more on the creativity and human understanding of the model?
they've updated the format to hide the subcategories, so in some cases it could be worse, yes
just checking, you do send Gemini the BLOCK_NONE setting etc.?
What is BLOCK_NONE?
Tulu Llama = TuluL
How do you do this through OR?
you cannot, it is only possible via direct API
my paid gemini key allows me to send this
Sure, I use the free API from Google AI Studio and SillyTavern, which has BLOCK_NONE by default.
But you try typing “little girl” during an ERP chat, for example, and see what happens?
They put blocks and filters on a lot of words and phrases.