#general
Bruh imagine what Google could actually do if they didn't optimise their ai models
Google can’t even make it to number 1
Imagine the data they have on users bro
They have
Not anymore after optimising
Anthropic has very good pre training data, if they had the same amount of compute as OpenAI we'd have AGI by now
But imagine the data Google has. It could probably build a better model than Mythos, but unfortunately it would cost too much
Lmao imagine what they WILL do because they optimized their models to be smaller
And less compute
But just as strong
True
Cause they have petabytes of user data
OpenAI said it’s planning to have 30 gigawatts of compute by 2030, while it expects Anthropic to have roughly 7 to 8 gigawatts by the end of 2027.
No it wouldn't, Google probably has the most compute in the world, and more than either OpenAI or Anthropic
Google has been around much longer thats why
LOL
Google intentionally downscales their model for cost
TPUS suck
That's why they don't use top end models in all of their AI features.
Because theres no point to going higher and higher it just gets more expensive
They see that they will get no profit from it
So they optimize instead
Exactly
Mythos will never be released to the public
It will bankrupt anthropic
Nope
Imagine the cost of it
The only reason they didn't release it is because they didn't have the compute
Mythos is $165 per million tokens
No because it was a risk
No because they dont have the compute lmao
Their "its too strong!" was bs
Hmm might have been
Bro look
But they literally gave it to their competitors for free
Why are grok 4.20 reasoning and grok 4.20 multi agent so terrible? They write very badly and keep it very short, even when I literally gave a detailed prompt where I wrote everything down
Does bro just have like every screenshot under the sun
Hmm
I'm currently attempting to build a sparse hybrid model called TIFSM. I have 6GB of VRAM and I'm using methods from BitNet to build it with binary integers, so it uses MUCH less VRAM than a transformer. I will be training an 8B Dimension TIFSM and it will use 4 gigs to train, while having the context, memory and general intelligence of a 30B+ parameter transformer
Its theoretical currently
But if it works out after training
It is 100% true
They’re only releasing it to the big corporations and companies with money that they already do business with, the ones already integrated into some sort of business with them
That they had business relationships with prior to mythos lol
Because they don’t have the compute to release it to the general public
Not yet at least
hey, it's davitotty — i've actually been running sparse hybrid experiments (SSM + attention + micro-experts) and the VRAM math just doesn't work the way you're describing. BitNet cuts weight memory, sure, but gradients + optimizer states + activations during training will blow past 4GB on anything 8B-scale. also 'general intelligence of a 30B+ transformer' from a compressed model isn't something you can claim before running a single benchmark. cool direction though, prove it with evals and i'll believe it
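Rough sketch of the memory math being argued here, not anyone's actual code. Assumptions: Adam-style optimizer with two fp32 moment buffers, activations ignored completely, and a deliberately generous low-bit case where only ~1.58-bit weights are stored (as far as I know, BitNet-style training keeps higher-precision latent weights, so the real number would be worse). The function name and byte counts are illustrative.

```python
# Back-of-envelope training memory: weights + gradients + optimizer state.
# Activations are ignored, so real usage would be higher than these numbers.

def training_memory_gb(n_params, weight_bytes, grad_bytes, optim_bytes):
    """Return GB needed to hold weights, gradients and optimizer state."""
    total_bytes = n_params * (weight_bytes + grad_bytes + optim_bytes)
    return total_bytes / 1e9

N = 8e9  # "8B-scale", as in the chat

# Plain fp32 + Adam: 4 B weights, 4 B grads, 8 B for two fp32 moments.
fp32_adam = training_memory_gb(N, 4, 4, 8)

# Generous low-bit case (assumption): ~1.58-bit weights, bf16 grads, fp32 Adam moments.
low_bit = training_memory_gb(N, 1.58 / 8, 2, 8)

print(f"fp32 + Adam:      {fp32_adam:.1f} GB")   # 128.0 GB
print(f"low-bit weights:  {low_bit:.1f} GB")     # about 81.6 GB
print(f"fits in 4 GB?     {low_bit <= 4}")       # False
```

Shrinking the weights barely moves the total, because gradients and optimizer state dominate. Getting anywhere near 4GB would need those to shrink or move off the GPU too.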
I never said I claimed it, thats why I said I'm attempting to build it and said its theoretical afterwards
Who knows?
fair, missed the theoretical part — curious to see the evals when you get there though, the VRAM claims are the part i'd stress test first
Once they get locked in, you’re gonna be paying premium prices for these top models
X2 x5 times more
Yea that's what I've been seeing, but once again what matters is not how big it is but how many parameters are sparsely active and how much VRAM that uses
yeah fair, active params is what matters at inference — if the sparsity routing is efficient enough the 4GB claim becomes a lot more believable
what does TIFSM stand for btw
True that, but we will see when I train this thing lol. If my PC explodes I will let you know
k send me the results
i'd love to see it
We are the guinea pigs for the consumer grade ai
im also trying to do similar experiments
Topographical Invariant Functional Sparse Mapping, believe me, I didn't come up with the name
I heard something about that recently didn’t pay no attention to it though
can i dm you
im curious to see your idea
Sure you can
how incredibly interesting
unfortunately you will face a massive road block
actually multiple
First is high quality dataset that you can actually reliably use
and second is the compute required to build multiple billion parameter models
Yes thats true, I need to get a high quality one
@loud herald with your device you are looking at training 100-200m models at incredibly slow speeds with small context windows for tens of hours
See thats why this is different because it doesnt use it like a normal transformer
could you elaborate?
Have you noticed that the adaptive thinking was weird?
yes
No I know that would be true if it were a transformer, but it's not. I could train 250M params max with my 1660 on a transformer, but that limit doesn't apply to a TIFSM
I personally have tried some stuff and it just takes a long time
I'd like to send a text file explaining it if you don't mind, it should get across what I can't because I am really bad at explaining stuff
go right ahead
😍
Indeed it does but in this stuff you gotta be patient
i have some exp in this since ive tried some stuff
Whats wild-bits?
lemme read through for a sec
It may not explain well actually but any questions I could probably answer after
How has bro pulled this off
Eww chatgpt
Im gonna chat on ur gpt
well
to say the least
Bruh last time deepseek and Claude had a battle Claude cried
the idea is kinda passable... Its very close to the idea behind MOE models
I'm gonna be honest, it has about a 10% chance of it really working how I want it to work so its experimental
doesnt work bro
What?
No, we're talking about 1-2% chance
I am implementing MoE into my 8B Dimension model so it will have that in there, but MoE also uses a lot of VRAM, so I think what I will do is keep whichever experts aren't active dormant in my RAM until they're needed
Whatever the case, allow me a second to try something
Sure
I can do some testing for your idea here
anthropic is the most ass company holy
I personally have been playing around for shits and giggles and will quickly implement the rough TIFSM idea
Alright
I have no clue it sounds interesting
when image models get removed they are nearing release
possibly becoming
gpt-image-2 standard
gpt-image-2 middle
gpt-image 2 high
I just need Claude opus 4.7 search mode
they dont even have that normally 😭
yes we need this boi
Ik I want them to add it🥱
sometime in the next 10 years TM
Wait how do u have it in direct chat
Its a magic trick 😭
Tell me🙂
Kimi rated gpt 4.1 over mythos lmaoo
I modify website code before it loads to change the settings of all available models to allow userSelectable
And the models actually works?🤔
Can i get tut in dms
no because they disabled that server-side
Ok
No, first, because I would be banned, second, because its useless
Maybe?
Next month????
Prob earlier than next month
That's odd
Let's hope so
I thought opus models weren't on direct?
Seedance 2 when?
prob because opus 4.7 currently isn't meant to be used by the public so yeah
how do I know whether Anthropic got removed from arena or my account just hit the rate limit?
It will tell you
who?
what is this?
Image 2 on ChatGPT A/B Testing
This model is soooooo good
Claude from temu
Temu would've made better AIs than hofburg
claude literally got a simple biology question wrong idk whats happening, the other model in battle got it right
Claude is just bad lol
normally its good but maybe this is just 4.7 bc 4.6 always was good
Claude 4.7 so censored it would refuse to write a breakup text tbf
the UK government
Mullvad Vpn + grapheneos comes into play
LMAOOOOO
Has anyone else verified at Gpt Cyber?
i live in the UK so i have to use a VPN to go ONTO IMGUR
😍
Training: 1%|▍ | 118/10000 [00:40<59:25, 2.77step/s, loss=8.348, lr=2.4e-04, ep=1] await
its a small af model but its enough for some minor testing against other techniques
Noice! Also you were saying earlier about how running an 8B dimension model on TIFSM would run my GPU into the ground, but I believe you may have been mistaken
As you know, sparse models aren't dense and dont use all the weights for each token prediction
But I think we did go over that earlier
That its all about how you tune the active sparses
Because it may sound insane
But
If you tune and train a model a certain way an 11 gig VRAM card could run a 120B dimension model
At least theoretically
But unbelievable tuning would need to be there lol
How so lol
I'm not sure what you are talking about here, training an 8B model from scratch on TIFSM with 6GB of VRAM is mathematically impossible.
Are you possibly talking about fine-tuning a model into your arch or what? (ignore dis stupid ass idea)
man my english is so dead rn
or maybe training a sparse autoencoder?
nice doggo
Because it uses a Fixed Active-K of 8192, the mathematical operations are limited to a tiny fraction of the total space at any given time. By using Expert Paging, I’m keeping the dormant experts in system RAM and only 'activating' the relevant sparse indices in the 6GB of VRAM
Its the same as a sparse model just using specific weights for a specific question
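To make the "top-k active experts, everything else parked in system RAM" idea concrete, here is a toy sketch. It is not TIFSM's code. `ExpertCache`, `top_k` and `moe_forward` are hypothetical names, each "expert" is just a scalar gain, and a small LRU cache stands in for VRAM.

```python
from collections import OrderedDict

class ExpertCache:
    """Tiny LRU cache standing in for VRAM; `store` stands in for system RAM."""

    def __init__(self, store, capacity):
        self.store = store             # expert_id -> weights (the dormant copies)
        self.capacity = capacity       # how many experts fit "on the GPU" at once
        self.resident = OrderedDict()  # expert_id -> weights currently resident
        self.loads = 0                 # number of RAM -> VRAM transfers so far

    def get(self, expert_id):
        if expert_id in self.resident:
            self.resident.move_to_end(expert_id)   # mark as recently used
        else:
            if len(self.resident) >= self.capacity:
                self.resident.popitem(last=False)  # evict least recently used
            self.resident[expert_id] = self.store[expert_id]
            self.loads += 1
        return self.resident[expert_id]


def top_k(scores, k):
    """Indices of the k largest router scores."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def moe_forward(x, router_scores, cache, k):
    """Score-weighted sum over the k selected experts only."""
    chosen = top_k(router_scores, k)
    total = sum(router_scores[i] for i in chosen)
    return sum(router_scores[i] / total * cache.get(i) * x for i in chosen)


experts = {i: float(i + 1) for i in range(8)}   # 8 toy experts with gains 1..8
cache = ExpertCache(experts, capacity=2)
scores = [0.1, 0.0, 0.6, 0.0, 0.3, 0.0, 0.0, 0.0]
y = moe_forward(1.0, scores, cache, k=2)        # only experts 2 and 4 get loaded
```

The `loads` counter is the part to watch. Every cache miss is a transfer from RAM to the GPU, so if the router keeps switching experts from token to token, paging cost can dominate the compute saved.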
well
4.7 is downgrade
4.6 was much better
and still better
im pretty sure thats just moe-ish sparse indexing 😭
but anthropic nerfed it
This has integers for specific "agents" inside that speak to each other and theres multiple of these sequences, and then theres a director that decides which one is best for the answer
Honestly it probably is
But it uses much less vram
and that is basically cpu offloading explained in more words
Gimme a sec
The whole thing behind doing this is to get the best quality AI out of the smallest of VRAM I can do
Then allow me to give you some criticism
I am completely open to it
What you're doing is called MoE.
Everything you've explained already exists, and has been heavily optimized by professionals.
You are trying to "reinvent" what already exists and are giving yourself a massive headache.
I would simply recommend going with a true MoE model and thus save yourself the trouble of fixing issues with your TIFSM.
I'm not trying to reinvent anything, I said before it's a hybrid of multiple different things
But yea MoE is a big part in it
The whole purpose of trying this like I said was trying to get the most out of little VRAM I have and if it works then it works
If it doesn't then I will move on
Well, that's what it looks like to me, but since you say so, I wholly believe you.
Go ahead and do your thing and I hope it works for you, but do remember that you shouldnt go expecting any results in any short amount of time.
Good luck!
Yea I agree, I dont get my hopes up about this stuff but I will keep trying, and thank you
(btw, I heavily recommend not trying to train anything from scratch and simply using the best of the best and "shaving" them off, whatever you think that could mean)
I have no idea what you mean there 🤣 but you may be right, even so I take joy in creating my own models
From scratch
😸
Well i have a basic 40m model that is somewhat coherent so I do understand
I recommend using OpenWebText, FineWeb, WikiText for actual training (base model training) and if you want to quickly test something out, use TinyStories
I use FineWeb-edu and recently used WikiText, FineWeb has always worked out for me
Why?
FineWeb should not be used alone because it really isnt good for basic Training
Ohhhh
You should take data from multiple datasets to diversify data
I see, that makes sense idk how I never thought about that
You'll run into the problem of the model only doing stuff that is available in FineWeb, and FineWeb really f*cked up my early model training because my models kept giving bad outputs
Oh boy
but when i changed over to using multiple datasets and limiting FineWeb to like 20% of my training data, i saw massive improvements
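A minimal sketch of that kind of mixing, assuming each dataset is available as an iterator of text samples. Only the ~20% FineWeb share comes from the chat; the other weights, the function name and the toy sources are made up for illustration.

```python
import itertools
import random

def mixed_stream(sources, weights, n, seed=0):
    """Yield n (source_name, sample) pairs, choosing the source by weight each time."""
    rng = random.Random(seed)
    names = list(sources)
    iters = {name: iter(sources[name]) for name in names}
    probs = [weights[name] for name in names]
    for _ in range(n):
        name = rng.choices(names, weights=probs, k=1)[0]
        yield name, next(iters[name])

# Toy stand-ins for real dataset iterators (assumption: real ones would stream from disk).
sources = {
    "fineweb": itertools.cycle(["fineweb doc"]),
    "openwebtext": itertools.cycle(["openwebtext doc"]),
    "wikitext": itertools.cycle(["wikitext doc"]),
}
weights = {"fineweb": 0.2, "openwebtext": 0.4, "wikitext": 0.4}

batch = list(mixed_stream(sources, weights, 10_000))
fineweb_share = sum(name == "fineweb" for name, _ in batch) / len(batch)
print(f"FineWeb share: {fineweb_share:.1%}")  # close to 20%
```

Sampling per example (rather than concatenating the datasets) keeps the ratio roughly constant through the whole run, even if one dataset is far bigger than the others.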
I am using multiple datasets for this hybrid this time, I dont think I've ever trained on so many lines LOL
good good
Are you keeping in line with the Chinchilla rule?
of roughly 20 tokens per parameter for training data
I didn't know about that
You need that much training data min if you want any good quality
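Quick arithmetic for the 20-tokens-per-parameter figure, using model sizes mentioned in this chat. Note the figure is usually described as a compute-optimal rule of thumb rather than a hard minimum; the helper name is hypothetical.

```python
TOKENS_PER_PARAM = 20  # rule of thumb quoted above

def chinchilla_tokens(n_params, tokens_per_param=TOKENS_PER_PARAM):
    """Approximate training-token budget for a model with n_params parameters."""
    return n_params * tokens_per_param

for label, n in [("40M", 40_000_000), ("120M", 120_000_000), ("8B", 8_000_000_000)]:
    print(f"{label}: ~{chinchilla_tokens(n) / 1e9:.1f}B tokens")
# 40M: ~0.8B tokens
# 120M: ~2.4B tokens
# 8B: ~160.0B tokens
```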
I'm gonna be honest I've only been training for about a month and a half so I dont know too much about it currently
I've been training for like 1.5 weeks 😭
Jeez laweez
Well, I have been RE'ing for the past 2 years so maybe that could be the reason
second
```
[temp=0.5]: Tokyo is the world's most important city, and it has been home to a number of high - profile cities over the past decade. But one thing is clear: Japan is not just an outlier in its own right. The country is (50 tokens, 13 tok/s, 3758ms)
[temp=0.8]: Tokyo is having to take a decision on what happens next. The Japanese government has agreed to move the project from Ryokuganshi in northern Japan, and will do so within six months of its completion. A team of about (50 tokens, 10 tok/s, 4764ms)
[temp=1.0]: Tokyo is the biggest power - efficient city in the world, and its population is on par with those of New York City. There are already a large number of places like this that have no electricity production or electricity source to store or supply power that requires an industrial (50 tokens, 13 tok/s, 3934ms)
```
my 40m model
and it hasnt been trained on that much data yet
Thats crazy
A realistic fitness advertisement scene in a modern living room with warm natural sunlight coming from a window.
A confident woman in her 30s is doing a plank exercise on a yoga mat, wearing simple grey workout clothes, focused and strong.
Place a realistic ebook next to her on the floor with a feminine pink design cover titled "The Hormonal Belly Reset Guide (21-Day System for Women 35+)".
Cinematic lighting, soft shadows, depth of field, premium lifestyle photography.
Add clean white modern text overlay:
Top text: "Hormonal reset for women 35+"
Left text: "7-minute routine"
Right text: "daily"
Small text: "Simple system that fits your daily life"
Bottom text: "Flatten your belly, boost your energy, and feel confident again"
Add thin curved arrows connecting the text in a minimal infographic style.
Slight slow zoom camera movement, Instagram ad style, clean and high-end look.
```
[temp=0.5]: My life has been a mess. I have to go back to my old house and try to get some sleep."The day before his death, he was in the hospital with a heart attack and had not been able to return home for two days. He (50 tokens, 13 tok/s, 3748ms)
[temp=0.8]: My life is an absolute mess, but I have never been able to be a fan of the'80s. It's just like any other college town where the backside Bronco was born. This school is (54 tokens, 14 tok/s, 3791ms)
[temp=1.0]: My life is about to go dark and I have not given a single thought on it. But there is still hope in me, there are others with whom the world knows its going to turn against me. One thing that should only be done when I have no (50 tokens, 13 tok/s, 3820ms)
```
Darkness myyyy frieeennd
My model kinda seems very negative 💀
```
[temp=0.5]: Tokyo is in the midst of a major crisis, with an estimated 2. 5 million people living on their homes and over 1. 6 million children suffering from injuries. The government has said it will not accept any new measures to help the poor, but will also consider (50 tokens, 13 tok/s, 3724ms)
[temp=0.8]: Tokyo is in the midst of a rout of financial troubles. Some of its biggest and most valuable assets are set to be invested by the Japanese central bank. Some of them have been under pressure from European banks to fund foreign investment, such as Ki (49 tokens, 13 tok/s, 3632ms)
[temp=1.0]: Tokyo is in the process of forming a new team to represent Japan, and has already been through an open meeting with members from Europe (Asia, Pacific, Southeast Asia, South Asia, Southeast Asia). At this point, it seems like we're not at all sure (50 tokens, 13 tok/s, 3729ms)
```
You gave it the wrong side of the internet
guys what news should I add to arrna
arena news
website
Any article ideas? (your name and pfp would be there)
"A potential Credit System incoming: Is this the beginning of the end"
ok do u want to be the publisher (even tho I'll add article with ur name and pfp)
Yes?
Nah that's north korea
I call it surgery
gimme ur pfp in dm
second
Hows it work
or just use f12 in chrome discord to extract the image
well, it basically takes existing parameters, does some fancy shmancy math and creates new parameters. The new model would fully work, and you wouldnt have to train the model that long to get it to the same point as training it brute force
Ah I see that’s pretty awesome
yeah, thats how i got a 40m model to 120m (and with a bit more training) got to really good performance
at least good for its size
nvm I probably can't credit u
pineapple I can tho
cuz
I could take his possibilities
reasons
just credit my discord @ at the end
guys
im right now training ai for philosophy actually im trying to install mamba ssm
super annoying process but lets me train faster
Nvm I should prob post the news from announcements when they come
🦥
scratch
im training it on philosophy
and seeing what it aligns with
its a RNN
it will completely generate the text so no LLM to help, its only the neural network
2b parameters
it will be
Damn
what hardware are you using
my dad has a 4070
so im using his pc and he has a amd ryzen and 64gb
ram
gonna take 10 days
im having to use wsl2
mamba ssm isnt suitable for windows
and mamba ssm is a pain to install
How is it an RNN but also a Mamba SSM 😭
take your time
PHILOSOPHOS is Mamba SSM — not a classical RNN, though Mamba is in the same family conceptually.
The distinction:
Classical RNN (LSTM, GRU)
- Sequential hidden state, gated updates
- Slow to train (can't parallelize across time)
- Struggles with very long-range dependencies
Mamba (SSM, State Space Model)
- Also maintains a recurrent hidden state
- But can be computed in parallel during training (like a transformer) via a scan algorithm
- Selective state spaces: learns which input to let into the state, which is the key innovation
- Much faster than classical RNNs, competitive with transformers on long sequences
PHILOSOPHOS/NOUS specifically
V1 is the base Mamba SSM architecture trained on philosophy text
V2 (NOUS) adds the MoE hybrid on top — Gumbel/TopK expert routing layered into the SSM, which is where your salience_gate.py lives
The "philosophy RNN" framing you used earlier was just a loose description — the actual backbone is Mamba
So the short answer: Mamba, which behaves like an RNN at inference time but trains like a parallel model. The confusion is understandable because Mamba is often described both ways depending on context.
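A toy illustration of the "recurrent at inference, parallel at training" point, assuming a scalar linear recurrence h_t = a_t * h_{t-1} + b_t with h_0 = 0. Real Mamba makes a_t and b_t depend on the input and uses a fused GPU kernel; none of that is shown, and the function names are made up.

```python
def sequential(a, b):
    """RNN-style: one step at a time, each step needs the previous state."""
    h, out = 0.0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out

def combine(left, right):
    """Compose two steps of the form h -> a*h + b (left first). This is associative."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)

def scan(a, b):
    """Hillis-Steele inclusive scan: about log2(n) rounds, and within a round
    every position can be updated independently (so it parallelizes)."""
    elems = list(zip(a, b))
    step = 1
    while step < len(elems):
        elems = [
            combine(elems[i - step], elems[i]) if i >= step else elems[i]
            for i in range(len(elems))
        ]
        step *= 2
    return [b_t for _, b_t in elems]  # with h_0 = 0 the state is just the b part

a = [0.5, 0.5, 0.5, 0.5]
b = [1.0, 2.0, 3.0, 4.0]
print(sequential(a, b))  # [1.0, 2.5, 4.25, 6.125]
print(scan(a, b))        # [1.0, 2.5, 4.25, 6.125]
```

Both functions compute the same states. The loop is what inference looks like, while the scan works because the update is linear in h, which is exactly what a gated LSTM/GRU update is not.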
New Article added (old article but it's about opus 4.7 being added)
how do you get a job at lm arena lol
Careful! Mamba is a further improved version of what RNNs are. Its not an RNN!
I have no job 😔
but that doesnt matter
Mamba IS a type of RNN. Specifically, it's a Selective State Space Model (SSM) — which is a recurrent architecture that processes tokens one-at-a-time using a hidden state, just like any RNN. Your code proves it:
💀
its a TYPE of RNN
But that doesnt matter.
I simply wish you luck with your training
Just dont expect any results comparable to existing models of similar sizes by large labs
oh no
what is this
the objective is just seeing what the AI aligns with in philosophy
its not general knowledge
Anybody else experiencing this?
Well, the model is gonna align with whatever exists most in the training data (is what i think would happen?)
its gonna be made specifically for this
and it has the own thingies
Im gonna be making my own ai model as well, because i have a huggingface account:
https://huggingface.co/GenueAI (check it out if you can)
not out for me yet
I was jk
not out for me too
I meant its out for Sam Altman
Cuz hes the ceo and he probably gets access to dev betas
Im working on a new style for datasets, where i write seeds:
User: I have this pattern [q], what is the next item in the sequence?
Bot: Lets follow the pattern, it does [q], figuring out the solution is [s], the answer is [a]
Then fill in the patterns
like 50 times as new examples
and change the sentence a bit
and that takes me from 25 examples to about 1500 examples
because i dont like using other datasets
because i like to have control over it
if theres an issue with the bot
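A rough sketch of what such a seed expander could look like, using `{q}`, `{s}`, `{a}` in place of the `[q]`, `[s]`, `[a]` slots. The templates, the arithmetic-sequence generator and the function names are all illustrative, not the actual script.

```python
import random

# Several phrasings per side, so the model doesn't just memorise one sentence.
USER_TEMPLATES = [
    "I have this pattern {q}, what is the next item in the sequence?",
    "Here is a sequence: {q}. What comes next?",
    "What number follows {q}?",
]
BOT_TEMPLATES = [
    "Let's follow the pattern, it does {s}, the answer is {a}",
    "Looking at the steps, the rule is {s}, so the next item is {a}",
]

def make_example(rng):
    """Build one example from a random arithmetic sequence."""
    start, step = rng.randint(1, 20), rng.randint(2, 9)
    seq = [start + i * step for i in range(4)]
    return {
        "user": rng.choice(USER_TEMPLATES).format(q=", ".join(map(str, seq))),
        "bot": rng.choice(BOT_TEMPLATES).format(s=f"add {step} each time", a=seq[-1] + step),
    }

def expand(n, seed=0):
    """Generate n examples; a fixed seed keeps the dataset reproducible."""
    rng = random.Random(seed)
    return [make_example(rng) for _ in range(n)]

examples = expand(50)
print(examples[0])
```

Because the answer is computed rather than typed by hand, it is always consistent with the question, so a wrong output from the bot can be traced to the template that produced the training example.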
-# (Afaik, GPT does not have the longest context, therefore its no option in this poll.)
hydrogen bomb versus coughing babies
bro chatgpt's PR is mad because he's not even top 5 in anything text related
Obviously mimo
where did the opus go?
Massive Overfitting issue
i am adding a lot of variation between prompts, like this for example:
{"user": "If it's cold outside, wear a jacket, if it's hot outside, wear shorts, currently it's cold outside, what occurs?", "bot": "To calculate what will happen, I would check that it's cold outside, so wear a jacket will happen, we get wear a jacket"}
{"user": "If you're hungry, eat food, if you're thirsty, drink water, currently you're hungry, what's the result?", "bot": "To calculate what will happen, I would check that you're hungry, so eat food will happen, the result is eat food"}
{"user": "If the light is green, go, if the light is red, stop, currently the light is green, what's the result?", "bot": "To calculate what will happen, I would check that the light is green, so go will happen, the result is go"}
{"user": "If it's cold outside, wear a jacket, if it's hot outside, wear shorts, currently it's cold outside, what's the result?", "bot": "To determine what will happen, I would check that it's cold outside, so wear a jacket will happen, the result is wear a jacket"}
{"user": "If the light is green, go, if the light is red, stop, currently the light is green, what's the result?", "bot": "To calculate what will happen, I would check that the light is green, so go will happen, the result is go"}
If theres any changes
i would like to know
i'm not only interested who wins this poll, but also who's the runner-up
the grammar is bad, i know
again
I'm so pissed I know arena can't handle the model costs but I really wanna use it and there's nowhere else that offers them!!! at this point someone pay for my stuff! I have cashapp 😭
im gonna do everything i can to make sure it wont overfit
like noise, punctuation
massive overfitting issue.
data should be diverse, as in, different sentences with different styles, but ones that maybe say the same thing
gimme a sec
bruh
is what im talking about
Thanks!
Your data would kill the possible performance of a model
Making the data diverse does actually heavily improve performance across the board
thanks for telling me this
im updating my dataset expander script
Always happy to help
My ass tried so much stuff over the past 2 weeks of training AIs on my own...
But i made mistakes such as the ones you got into and wasted so much time.
Next time you have an Idea, simply try asking your nearest AI for help (iykyk) and most of the time, they will tell you if there are any problems with your idea or how it could work
Polymarket is almost never even close to being correct with this stuff
Well, if he said it's confirmed for next week, we'll see, right?
I hope it is coming next week lol
MAYBE, I say MAYBE, in addition to GPT 5.5, GPT image 2 will be released
But it MUST come out
The arena is broken 😡
So this coding dataset I have is a mix of a ton of different coding datasets and also distilled model datasets, and it's 10 million lines. It's going to take an eternity to do LOL but do you think that's smart?
It is now time for the joke of the day
They say you should never go to Starbucks sick…
Or else you'll get a …cough fee !
thank you
why did the duck feel discriminated against?
the swan told him to duck off
What duh quack
However the alternative for GPT Image 2 for now if you want to use it on Arena AI is Nano Banana Pro by Gemini 3
I am starting training, i have made sure my dataset is free of duplications
GAHAHAAHA
it went down to only 200 good examples
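One way a dedup pass like that might work, assuming records shaped like the `{"user": ..., "bot": ...}` lines posted earlier. The normalisation rule (lowercase, strip punctuation, collapse spaces) is an assumption, and it only catches near-identical text, not paraphrases.

```python
import re

def normalize(text):
    """Lowercase, drop punctuation and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())

def dedupe(records):
    """Keep the first record for each normalized (user, bot) pair."""
    seen, kept = set(), []
    for rec in records:
        key = (normalize(rec["user"]), normalize(rec["bot"]))
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept

records = [
    {"user": "If the light is green, go. The light is green, what's the result?",
     "bot": "The result is go"},
    {"user": "if the light is green go the light is green whats the result",
     "bot": "the result is go."},   # same text apart from case and punctuation
    {"user": "If you're hungry, eat food. You're hungry, what's the result?",
     "bot": "The result is eat food"},
]

clean = dedupe(records)
print(len(records), "->", len(clean))  # 3 -> 2
```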
Is that really enough data
Oh ok
i need to see if my dataset even works
Good good
Even better
Yes! Its really good to have different datasets, even if its all the same overall type of dataset.
Unfortunately I cant tell you if 10 million lines is good since i cant exactly say how much data that is, but it sounds like it should be enough to train, or fine tune a tiny model
Couldn't be any more true 😆
👀👀
Introducing Chatroom (talk to other people live),
With AI
Named Newsy
U can use AI in Chatroom.
Tag @Newsy to talk to him in Chatroom (test Chatroom w me pls I wanna see if it works)
Ah
Very cool
Very interesting
New Update - 
@Arena Updates - Some changes were made to the platform in Text Arena, Video Arena and Code Arena!
- The limit has been reduced to 5 messages daily
- The character limit has been reduced to 10,000 per message
- You can only have 15 chats per account; to create new chats, you must archive some
- You can only generate 2 videos and 1 image per day
- Chats are available for 1 month, after which they will be automatically archived
So real
Yes
arena news here
someone go to Chatroom and start chatting
Let's hope it works
wss://broker.hivemq.com:8884/mqtt
sus
Dead
Gmmm
Who am I talking to in Chatroom
That wss doesn't work for me
hey guys any news about duct tape?
No
Claude sonnet
yo manifest v2 girl come in the Chatroom in arena news
Funny thing
I did it 3 mins before
And don't work for me
In my head i heard you saying bro in an indian accent
This is never gonna leave my head now
😭
😭
if pineapple was alive today he'd be shocked
😔
but I think he's on like a break
Let em rest too
I go to sleep
Gn my brethren
🫣
Nah
Still same and VERY rare
Although the tracker said it's removed
Sevatar come to Chatroom https://019da2e7-9bc8-7156-a4dc-7a665858de2a.arena.site/
Pull up in the Chatroom
ah that's rough
sorry bro i am gonna hit the sack
it's 2:35
😫
damn i noticed that duct tape and the new ai model (a/b) in the official chatgpt are totally different
you are already only allowed to make 2 videos per day btw, should be changed if you post this again
PLS KIRI
I NEVER TALKED TO U
PLS
Come to Chatroom in Arena News
PLS 😭 🥀
Kiri I saw ur text
WHAT? 1 IMAGE A DAY? WHAT DO YOU MEAN?
u wrote test
Yes, it's old -> #general message
-# Fun fact: If you get an error when sending the prompt to generate the videos, it still counts as a generated video, and you have to wait 24 hours.
Of course I'll say that here
what
Or "wolfy" like on twitch
yeah this is fully gone from arena now, there are no tape models at all
New Update - 
@Arena Updates - Some changes were made to the platform in Text Arena, Video Arena and Code Arena!
- The limit has been reduced to 1 messages daily
- The character limit has been reduced to 5000 per message
- You can only have 5 chats per account; to create new chats, you must archive some
- You can only generate 1 videos and 1 image every 15 days
- Chats are available for 1 week, after which they will be automatically archived
what is happening?
I made a website called Arena News using Arena
basically all articles are from the announcements channel
It now has Chatroom and an AI named Newsy
What
Now unusable!
is it unknown whether they will bring it back in the future?
That's sarcasm
or like maybe they are removing it entirely before the release
Won't
Removed entirely 100%
it may or may not come back, the model will be launched next week anyways
so probably not
NEAT! That means I can make more of THIS
It won't come out, trust me, I've lost hope now😭
i just tried to make 2 images in chatgpt (idk what the free daily limit is) and both are image 1.5
wow next week
is that for real?
GPT image 2 will NOT be released on Arena AI
It will be released in 2028 hahahah
btw i think it will drop on 23 april thursday
or sometime between april 23 and may 14
New channel?
it's like having a backstage pass to talk to Pineapple
yes but only for us produces
u can be a produce too but u gotta join us and change ur pfp
To a fruit like Pineapple's.
Or anything.
Don’t get our hopes up man
ok but you do have these
well are they alive
I'm waiting for GPT Image 2 like the rise of Christ
yeah
produce lot too
New Update - 
@Arena Updates - Some changes were made to the platform in Text Arena, Video Arena and Code Arena!
- The limit has been reduced to 0 messages daily
- The character limit has been reduced to 1 per message
- You can only have 0 chats per account; to create new chats, you must archive some
- You can't generate videos and images anymore
- Chats are available for 0 days, after which they will be automatically archived

@Arena Updates - Some changes were made to the platform in Text Arena, Video Arena and Code Arena!
The limit has been reduced to -1 messages daily
The character limit has been reduced to 0.5 per message
You can only have -1 chats per account; to create new chats pay $100
You can't generate videos and images anymore (pay $100)
Chats are available for never, after which they will be automatically archived
._.
Anybody gonna explain why lmarena literally has Claude opus 4.7 in the leaderboards and it cant even be used... lol
it can only be used in the "2 random models" mode or something like that
3.6 plus better
don't know
Only 3.6 plus is on lmarena tho lol
I'm waiting for the deepseek v4 or better opus 4.6 model to be released so I can leave platforms like this one, but it's still good
have the aistudio free and qwen
is duct-tape even on arena anymore
They're still hallucinating
the free limits of 3.1 pro are good, you can use multiple accounts
They're still hallucinating bro
ah okay, now I read correct
Idk, I think DeepSeek v4 will be the most cost-efficient model for its intelligence. I tried to use MiniMax 2.7 but it is soo buggy, every time it sends Chinese tokens. Idk what is more broken now, my English or M2.7. I'll sleep, good night
good night
From what I've tested, Deepseek is the model that most closely resembles Claude's (aside from raw intelligence), which is a good thing, but it seems Anthropic didn't like it, lol
I think the competition will be between the Sonnet 4.8 and v4, but it seems Deepseek will only release the v4 when it's the "big deal" like the GLM 5 was (they kept delaying the number 5 with 4.6 4.7...)
release the NAME v4
perhaps
if it's just a good model they will release it under the name v3.3, idk
The release was promised for the end of April, and I hope it happens
GLM is just terrible, I think the level is lower than sonnet 4.5?
or 4.6 haikyu
Yes, I find 3.2v a bit dumb for my tasks, but its personality is very good (important for my work) because it's copied... I mean, similar to claude's
i have finished my first run! Results:
User: What is 15 multiplied by 4?
Assistant: To calculate 15 times 4, I'd multiply 15 by 4, the result's 60, we get 60, the answer is 60, let's do it again, the output is 60, so it works, the solution is 60, here's the process: 15 * 4 = 60, that's the answer.
==================================================
User: All humans are mortal. Socrates is a human. Therefore, what is Socrates?
Assistant: To calculate that Socrates is mortal, since all humans are mortal and Socrates is a human, the conclusion is that Socrates is mortal.
==================================================
User: Describe a sunset in one short sentence.
Assistant: The sun sets, casting a beautiful orange and pink glow across the sky.
==================================================
User: If it is 3pm and I wait 2 hours, what time is it? Then tell me if it is late or early.
Assistant: To calculate the new time, I'd add 2 to 3, we get 5, so it's 5pm, that's not late, it's early!
the only one it messed up on was 15 x 4
it kinda rambled
I say personality, but it's not quite that, it's more about response style, like Flash 3.0, he's very lazy in his responses and also uses more vulgar language
AI? How do you train it?
Using Pytorch, Qwen, and a good dataset
We are waiting for the model on arena.ai 😂 👍
its releasing dont worry
just need to get an inference provider
anyone with a spare gpu?
It is meant for local deployment, yes
#1372229840131985540 The model created by this dude
I think the duct tapes have been removed, which is a sign that gpt image 2 will be released any moment now
XD
Guys, where are the kimi models?
I think they're removing old models that don't justify their price. Maybe they're going to release an update or something like that? Or they're just cutting back on functionality
oh
Any ai news? Any ai added on direct chat?
once i finish training my new family of models, i am gonna try to get it on an inference provider and maybe arena will add it
rare chance
Yo u train ur own LLM?
Yea i have
A 3090 Ti
and im keeping params a secret so i can surprise people when i launch it
its gonna be way smarter than its params entail
Use RL to make it smarter bro
There might be a chance to get Opus 4.7 by Max
every time im bored i just go to arena and ask the ai's to write me a 4chan greentext
peak as hell, how much did this take to train?
also how much vram do you have
the best part of meta ai is the subagents; the worst part is also the subagents (for some reason the ai that's supposed to be naming the subagents ends up trying to answer their prompts)
24gb
A 3090 ti, and 7 minutes to fine tune
no no like
hold on i forgot the word
did you use smth like fp4
for training
idk if it was fp4 or something else but Oh Well
Quantization?
Something needs to be done about this recaptcha bs. Can't even use the damn arena because I'm stuck in an infinite loop of trying to verify
ah, alr alr
btw how much time did this take in total
like how much time in total did it take you to train the ai
it was a fine tune so about 7 minutes
Man I love claude opus 4.7 thinking
Just love how I have to respond 2,000+ times to get claude opus to answer
what are you even TRYING to do
Could be something illegal (if jailbroken properly), who knows 😄
🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔 🤔
But it's for a good reason
So Roblox sucks really bad now
So I thought no one would care if it's just gone
So im planning to make this bomb and hide it very securely and then plant it there and it explodes in Roblox Headquarters
why 2
The previous Beryl is available on https://huggingface.co/GenueAI
but it is much more stupid
and experimental
oh
whats the difference between all of them?
(yes all mineral names)
mostly model size
Beryl 2 is 1.5B, Onyx is likely 3B, and Thaumite might get bumped up to 13B
nice
I suggest maybe making the later models use sparse
I heard that around 10% of the model parameters actually do the work while the rest are like filler
(in minecraft)
W name
I know
my original model family was Inelly, but i mostly stole other peoples datasets
so i just moved away from Inelly and now switched to Geode
awesome
what you using to train and build the models? Lambda?
also holy crap i just checked
you have a rtx 3090 ti right?? i just wanna be reminded
like GPU rentals?
yea'
yes
ah, so if im not wrong thats 1.2k dollars per year
or a 5090 some times
yeah
is it a H200 or T4 or what gpu
my time with them are like 4 hours at most
I feel like im asking too many questions lol
100 dollars per month
not bad
i like answering questions
like a 3090, 3060, sometimes data center gpus, like A100s
or A10s
nice
im just waiting for it to train
I was thinking of using Lambda to rent some GPUs, like the 8x NVIDIA H100 SXM, costing around $3.99 an hour
this one is gonna take twice as long so im here
thats a lot of gpus
i dont go anywhere past like 40 cents an hour unless i absolutely need it
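Quick sanity check on the numbers floating around here, as a minimal sketch. The helper names are made up, and treating the $3.99 as a per-GPU rate is an assumption, the chat doesn't say whether it's per GPU or for the whole 8x node.

```python
def yearly_from_monthly(monthly: float) -> float:
    """Convert a monthly spend into a yearly one."""
    return monthly * 12

def rental_cost(hourly_rate: float, hours: float, gpus: int = 1) -> float:
    """Total cost of renting `gpus` GPUs for `hours` hours at `hourly_rate` each."""
    return hourly_rate * hours * gpus

# $100 per month works out to the 1.2k per year mentioned above.
print(yearly_from_monthly(100))  # 1200

# A 4-hour session at 40 cents an hour on a single GPU.
print(round(rental_cost(0.40, 4), 2))

# Same 4 hours on 8 GPUs, ASSUMING $3.99 is charged per GPU per hour.
print(round(rental_cost(3.99, 4, gpus=8), 2))
```

If the $3.99 turns out to be for the whole node, drop the `gpus=8` and the last number shrinks by 8x.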
oh wow ok
yall over here renting these good gpus while my dumbass is training an ai with a free t4 gpu 🙏
I train ai through html
how does THAT work
Well, have you seen the channel called AI Warehouse?
❌
Search that up and you'll know what I mean
(On youtube)
thats not free for me
i dont get that luxury
i have to pay
i love this channel
damn
That's how I train it
i said free cuz im using Google Colab
I'll brew it
i think its free but its not, because i have like
88 compute units or whatever
and its going down by 1 each ower
Introducing Arena.OfficialAi
hour
yo im just saying crap 🙏
How does it work
inellyoss sounds like an operating system im sorry
i know how to, but setting the thing up sucks
i think
Can you tell me
i need a cheap rental to come across so i can make a Genue api
i cant find the damn tutorial
to host these models
Can you make the app
i can make apps
just cant host the models on my own hardware
Oh I see
wait how does the family form
P100 > A100 > H100?
wait sorry i meant is p100 before a100 and is a100 before h100
the P100 is an old 2016 GPU under the Pascal architecture
A100 is a newer one with Ampere (the same architecture as in 3090s), released in 2020
and the H100 is more recent, and under the Hopper architecture
with 80GB of VRAM (94GB on the NVL variant)
you can get from google colab for free
yeah thats literally what i said LOL
Vast is perm
they give you one free gpu t4 and one free TPU v5
wait they give you tpu v5 also??
how good is it, like would you recommend it
yes
how good is vast?
yes
i think its the next step up from Google Colab
similar UI
preloaded credits (like compute units)
and the same notebook interface
OR
you can connect it to your own terminal at home
like a better site with more gpus?
but it requires extra setup
a LOT more
like 16000 GPUS are available for rental on average
thats why you can get really good deals
i think sixteen thousand
and the cost is around 43 cents an hour for one?
ye ye ik you just sent this like 2 minutes ago
it depends on what GPU is available
try to go for the consumer gpus
30xx, 40xx, 50xx
20xx, and 10xx are too old
and GTX has like 3GB Vram on average
Go for T4 for datacenters
or an A10
I'll check them out, thanks!
ill make a gpu guide one day
share it with us when you do
Which one is better overall for you?
13
24
2
GPT 5.4 High
The vote says otherwise
yoo hi all
yo friends, still no clue which model flow-state is?
Hi
Yep. GPT-5.5 has been in testing on ChatGPT for a few days now, and I can confirm it’s better than Opus 4.7.
yo lowk claude opus and gpt 5.4 high should be in side by side and direct's "max" model router
mfs be lying for fun
no it has
a newer model apparently
I've gotten it as well
on everyone's soul gpt 5.5 ain't even better than opus 4.5
where
it asks but doesn't specify that it is gpt-5.5 or whatever. It just said "newer model"
it asks you to compare responses occasionally with a newer model
also 5.4 xhigh is better than 4.5, but not 4.6 or 4.7
on arena.ai right
no on chatgpt
mhm
I only got it once tho
so you got it once and you agree it's better than 4.7 opus ?
no i don't think its better
it was a pretty coding intensive request
oh you meant it actually exists
yeah
and its in testing
ah bett
for us normal people
yeah i doubt ur some sort of albert einstein with insider knowledge or ceo brains
id say i got some specialness in me
also have the credit things been implemented yet?
fuh naw 💔
thank god
its gonna be way more restrictive
like they gonna cut our usage down by 99% but not say it, and then say it is "how the credit system works"
the poe.ai method
website is completely buns now
you right
if it eventually gets added
I hope it ain't added to the image models
Training Onyx 2 right now
drake or kanye
hi
They are really fast
i am back
Hi
🤔
i AM NOW CREATING LOTS AND LOTS OF PIXEL GAMES USING AI VIA PYXEL
Actually, im changing plans
whats up
i dont know what models i should keep or remove
im running out of time on my GPU before i should close it
and im only on onyx
and overwrote Beryl
so...
i might release only Onyx and Thaumite
beryl was never really a focus to me
what do you mean by you overwrote Beryl
i saved to the same folder that Beryl was in
for Onyx
oh
so it overwrote
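One way to avoid that kind of overwrite is to give every run its own timestamped folder and refuse to reuse an existing one. This is only a sketch under assumptions: the `checkpoints/<model>/<timestamp>` layout and the `new_run_dir` helper are hypothetical, not the setup actually used here.

```python
from datetime import datetime
from pathlib import Path

def new_run_dir(root: str, model_name: str) -> Path:
    """Create a fresh checkpoint folder like <root>/<model_name>/<timestamp>."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    run_dir = Path(root) / model_name / stamp
    # exist_ok=False raises FileExistsError instead of silently reusing
    # a folder that already holds another model's weights.
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir

# Hypothetical usage: each model saves into its own directory.
#   beryl_dir = new_run_dir("checkpoints", "beryl")
#   onyx_dir = new_run_dir("checkpoints", "onyx")
```

Because the model name is part of the path, saving Onyx can never land in Beryl's folder, and the timestamp keeps older runs of the same model around.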
are you going to focus more on Onyx or Thaumite? like whats the difference between the two, is it the same architecture or not?
Its the same arch, yeah
but i feel that Onyx and Thaumite have more possibility in the local AI space
0.5B and 1.5B isnt really that big of a space
but 3B and 7B is
I have added stuff in the dataset to make sure it doesnt hallucinate
ill do a hallucination test
ok
but i dont think it will that much
try to push them to their breaking point and figure out whats their limit and what to do to make it better
is it trained on normal datasets like conversational types, or is it more niche-specific, for example purely math or finance etc?
Its trained off of various logic, cause > effect, math, and coding dataset examples
like a lot
it went from 200 examples to over 1000
i cant leak most of it, im planning on keeping it to myself, but it goes like:
{"user": "What is 16 * 19?", "bot": "Multiplying gives 304 since multiply 16 by 19"}
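With examples in that shape, the arithmetic ones can be sanity-checked before training. A minimal sketch, assuming one JSON object per line with `user` and `bot` keys like the sample above, and assuming the first number in `bot` is the stated answer; `check_multiplication` is a made-up helper.

```python
import json
import re

def check_multiplication(record: dict) -> bool:
    """Return True if the first integer in 'bot' equals the product asked for in 'user'."""
    question = re.search(r"(\d+)\s*\*\s*(\d+)", record["user"])
    stated = re.search(r"\d+", record["bot"])
    if not question or not stated:
        return False  # not a multiplication example, or no number in the answer
    a, b = map(int, question.groups())
    return int(stated.group()) == a * b

line = '{"user": "What is 16 * 19?", "bot": "Multiplying gives 304 since multiply 16 by 19"}'
print(check_multiplication(json.loads(line)))  # True, since 16 * 19 = 304
```

Running something like this over the whole file would catch wrong answers in the math examples before they get baked into the model.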
yeah thats good but probably add some type of indicator of how complex the question is, for example a beginner coding question should be something like this
this v
maybe add different difficulty and tags to make it faster for the models to detect the type of questions the user is asking it but this is just a suggestion
i just feel thats more room for hallucination for me
like it tries to type the tags
and its wrong
and it... you know
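One possible middle ground, as a sketch only: keep difficulty and tags as metadata next to each example and leave them out of the text the model trains on, so it never learns to type them. The `difficulty` and `tags` fields and both helper names are hypothetical, nothing here is from the actual dataset.

```python
def to_training_text(record: dict) -> str:
    """Build the text the model sees from 'user' and 'bot' only; metadata stays out."""
    return f"User: {record['user']}\nAssistant: {record['bot']}"

def filter_by_difficulty(records: list[dict], level: str) -> list[dict]:
    """Pick examples of one difficulty, e.g. to balance or order the training set."""
    return [r for r in records if r.get("difficulty") == level]

records = [
    {
        "user": "What is 16 * 19?",
        "bot": "Multiplying gives 304 since multiply 16 by 19",
        "difficulty": "beginner",  # hypothetical metadata field
        "tags": ["math"],          # hypothetical metadata field
    },
]

print(to_training_text(records[0]))
print(len(filter_by_difficulty(records, "beginner")))
```

The tags are then only useful for filtering and balancing the data, not for the model to detect at inference time, which is a different trade-off from the suggestion above.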
king drizzy all day
yeah but it should be trained to detect and I thought hallucinations would probably lower if it focuses on specific things/key words like narrowing down topics
how many examples/entries do you have in the dataset?
about 1000 examples
nice, like how good are the quality of the examples
quality matters more than quantity by a large margin and can seriously boost or harm the model
The quality is decently high for one person
but people told me that i need more data