#vibe-coders

deep timber Mar 31, 2026, 6:53 AM

#

Please tee me 🙏 🙏 🤔

bleak vessel Mar 31, 2026, 9:14 AM

#

What are some optimizations that I can do to reduce the cost of gemini 2.5 flash native audio? I have built a live interview platform and based on last month's analysis, I see that it takes around Rs 50-60 to run one interview (an interview lasts for 5-6 mins average), which seems to be very high.

I checked the input token usage and the maximum amount of input tokens that was used in one day was 3 million, and there were 12 interviews that day, which means on an average each interview used around 250K input tokens.

Any help on reducing the input token usage and the cost in general would be much appreciated 🙂

tall walrus Mar 31, 2026, 10:24 AM

#

lol urs say refresh in 4 hours ? mine refresh after 7

#

i wish i had 34k google acc like chinese companies

#

i think buying multiple accounts is good

#

like google?

#

thinking of doing that rn it works for some acc i have multiples but idk if i should buy multiple acc

stark sapphire Mar 31, 2026, 10:42 AM

#

If you wish to be banned. Then go ahead.

stark sapphire Mar 31, 2026, 12:24 PM

#

Really happy with what i made so far with Antigravity

vagrant folio Mar 31, 2026, 12:42 PM

#

bleak vessel What are some optimizations that I can do to reduce the cost of gemini 2.5 flash...

Hi, this interview is audio only or have video feed?

bleak vessel Mar 31, 2026, 2:13 PM

#

Its audio only

stark sapphire Mar 31, 2026, 2:48 PM

#

uh oh.

gusty meteor Mar 31, 2026, 2:50 PM

#

stark sapphire uh oh.

lmaoo

vagrant folio Mar 31, 2026, 3:22 PM

#

bleak vessel What are some optimizations that I can do to reduce the cost of gemini 2.5 flash...

in this case is the context going back and forward which make it bigger each time. Search for Context Window Compression this will start forgetting old conversation when hit the defined limit, also a way to manage the context so the conversation has what it need to answer to the user

#

so You can try make interview about max 5 to 10min using this approach

#

also check you have the VAD enabled and be sure the are no duplicate resendings

bleak vessel Mar 31, 2026, 3:25 PM

#

Okay.. i am aware of context window compression, have to play around with the values a bit maybe

#

Also is prompt caching something that might help?

vagrant folio Mar 31, 2026, 3:26 PM

#

yes, but the catching in reality happen at the server side

#

so here unless they have that option it wont work

bleak vessel Mar 31, 2026, 3:27 PM

#

Oh ok makes sense

vagrant folio Mar 31, 2026, 3:28 PM

#

you case need check why this consumption
Gemini 2.5 Flash Native Audio Dialog
Live API
30 / Unlimited
20.63K / 1M
199 / Unlimited

bleak vessel Mar 31, 2026, 3:28 PM

#

My system prompt is actually really detailed (around 5000 tokens easily) and I believe that is being sent with each convo

vagrant folio Mar 31, 2026, 3:28 PM

#

this is my ai studio for a project I did before I didnt talk much but Iwas using video

#

and im sure is longer than 1 minute each session

#

so your reported token usage is to big

bleak vessel Mar 31, 2026, 3:28 PM

#

vagrant folio this is my ai studio for a project I did before I didnt talk much but Iwas using...

Was your system prompt big or small?

vagrant folio Mar 31, 2026, 3:29 PM

#

let me check give me a moment

#

my is 800 token system prompt and tool call is 430 token

#

in your case not sure if apply can be separated in tools or based on the interview.

bleak vessel Mar 31, 2026, 3:33 PM

#

Oh ok… i don’t have any tool calls

vagrant folio Mar 31, 2026, 3:34 PM

#

good I was checking search for Explicit Context Caching

bleak vessel Mar 31, 2026, 3:35 PM

#

But yeah let me try out context window compression

vagrant folio Mar 31, 2026, 3:35 PM

#

but this require save cache on your google project

bleak vessel Mar 31, 2026, 3:35 PM

#

I guess i can configure that

vagrant folio Mar 31, 2026, 3:36 PM

#

yes

#

if you want be sure what happening I recomend add a middleware which will count tokens going to live api and how much you receive from it by turn

#

record that to a json file

#

and use it as a reference to know if is improving or not

bleak vessel Mar 31, 2026, 3:42 PM

#

Yeah, planning to do something like that... Actually that is a good idea.

I planned on detailed logs, but again that becomes a mess to observe. The JSON file makes more sense. Thanks

vagrant folio Mar 31, 2026, 3:44 PM

#

yes just make a json for token consumption count only, by turn, if youwant something more you can deploy grafana and add a metric from json, or use prometheus. for easy view so whatever path will be easy for you to see read or pass to ai

stark sapphire Mar 31, 2026, 5:36 PM

#

thoughts on UI?

#

I tried to make it even better

vagrant folio Mar 31, 2026, 7:35 PM

#

nice, from my perspective the log out section be at botton

hasty bay Mar 31, 2026, 7:40 PM

#

Does anyone have any unique ideas for building an AI agent?

vagrant folio Mar 31, 2026, 7:47 PM

#

What are you looking?

iron rock Mar 31, 2026, 7:49 PM

#

stark sapphire thoughts on UI?

The UI is a bit distracting, it wants to pull focus to everything and thus focuses on nothing.

The UX gives me zero idea what the purpose of the website is

stark sapphire Mar 31, 2026, 7:51 PM

#

iron rock The UI is a bit distracting, it wants to pull focus to everything and thus focus...

I'm surprised by this.
The whole idea was to give it purpose. If you scroll up you will see my older design.

#

before, it was just cards, all same sizes, nothing was telling you anything. It's just there. It gave no focus. So i made it differently, with more purpose, so you eyes can focus on what is important in the moment.

iron rock Mar 31, 2026, 7:58 PM

#

Bright lime green bar, eye goes there first. Gives no details.

Then the eye bolts around to all of the yellow, as it is the focus color. Which then brings you to all the article related stuff, so... News website?

Then I see vending and get confused as heck

stark sapphire Mar 31, 2026, 7:59 PM

#

The colors are there because those are brand colors. I kinda have to use them. Secondly, it's not a website, it's an application. An extension from the website.
People downloading the app will understand what the're looking at.

Let me break down why this works before my ego inflates and floats away. First, my hero finally commands attention. By combining a big image, strong typography, and a clear CTA, I’ve stopped presenting twelve equally irrelevant rectangles and started answering the question of what the user should care about right now. It’s a huge win. Beyond that, the "Trending" section finally feels like a curated space rather than a random data dump. The labels pop, the cards are grouped with purpose, and the spacing gives the content breathing room, signaling that this information actually matters. I’ve essentially invented flow by moving from the Hero to Trending and then to Latest Reviews; previously, my layout had the narrative structure of a grocery receipt. The typography is also doing some serious heavy lifting here, that italic bold headline style in the hero gives the site/app a slightly aggressive, editorial tone that feels like a gaming magazine that drinks pre-workout.

iron rock Mar 31, 2026, 7:59 PM

#

If I landed on that page looking for game reviews, I would end up leaving confused on why I had been brought to it.

#

Ok. lol

stark sapphire Mar 31, 2026, 8:03 PM

#

@iron rock Here is how the actual website looks

iron rock Mar 31, 2026, 8:05 PM

#

Much better, not a fan of the other-one though.

#

Im not personally a fan of the limegreen, but Im sure it doesnt bother others

stark sapphire Mar 31, 2026, 8:07 PM

#

I understand. Sadly i can't change the color, because it's for a company. They have used this color scheme for ages.
Game Mania is 34 years old, having been founded in 1992.

iron rock Mar 31, 2026, 8:08 PM

#

Oh damn! That's awesome though. I wonder if it would look better if you used the green as the cta color rather than yellow. That way the top bar could be yellow.

I assume it wouldnt lo9ok better, very likely you have the best already selected.

#

look*

stark sapphire Mar 31, 2026, 8:10 PM

#

The issue with this idea, is that the green header has always been green in the past.
Plus, I'm simply not allowed to switch it to yellow. The yellow is also a part of the brand color, but I have to be careful where to use it.

iron rock Mar 31, 2026, 8:11 PM

#

Understood, thanks for hearing me out 🙂

stark sapphire Mar 31, 2026, 8:11 PM

#

glad to get feedback 🙂 thank you for that as well.

deep timber Apr 1, 2026, 4:10 AM

#

Why gemini cli is taking about 5 minutes to respond in any task it is daam to slow anyone have their solution
Please tell me

vagrant folio Apr 1, 2026, 11:50 AM

#

in my case I noted that 2.5 flash answer faster

#

try flash

fading osprey Apr 1, 2026, 2:01 PM

#

Are you also having problems with the antigravity limits with Google's Ultra plan?

stark sapphire Apr 1, 2026, 3:35 PM

#

fading osprey Are you also having problems with the antigravity limits with Google's Ultra pla...

describe the problem please.

signal raft Apr 1, 2026, 3:53 PM

#

fading osprey Are you also having problems with the antigravity limits with Google's Ultra pla...

yes its normal for google. trash plan

stark sapphire Apr 1, 2026, 5:15 PM

#

what sucks with Antigravity, is that when you use your tokens, but the Agent fails or gets disconnected for a split second, you lose all of those credits.

tall walrus Apr 1, 2026, 5:31 PM

#

How are you all vibe-coding with this Claude Opus?

stark sapphire Apr 1, 2026, 5:31 PM

#

what do you mean with how?

balmy depot Apr 1, 2026, 6:42 PM

#

I only use Opus for planning... it's not a good use of tokens to get it to do things.

winter viper Apr 1, 2026, 8:01 PM

#

fading osprey Apr 1, 2026, 8:02 PM

#

like the antigravity costs with google ultra plan, what is there that has an almost unlimited rate limit?

winter viper Apr 1, 2026, 8:02 PM

#

fading osprey like the antigravity costs with google ultra plan, what is there that has an alm...

naah totally not true, maybe 10x bigger than ai pro

fading osprey Apr 1, 2026, 8:03 PM

#

what?

#

Maybe I didn't explain myself well with my question

#

I was using Antigravity with the Ultra plan until yesterday, working perfectly for software development, CRM, etc.

But since this morning, I've been having problems with the limits, and I think it's a general problem... Do you know of any good alternatives at the same price of $250 per month?

winter viper Apr 1, 2026, 8:09 PM

#

Ohhh, my bad, claude code for sure, even cheaper and better

fading osprey Apr 1, 2026, 8:13 PM

#

Yes, but I read that it has the same limitations, if not superior to antigravity of the last few days.

#

Until yesterday I was able to create complete CRMs from 0 to 100% without affecting the credits or the rate in the slightest.

uneven bridge Apr 1, 2026, 8:17 PM

#

capture_decran_2013-05-28_a_10.24.09.png

#

Goooooolersssss goood to be home can't wait to meet you all ! Sorry for any wrong beings thanks for supporting during the most tuffest times in life you guys are the best wish you well ! God bless you all. Googlelogo ❤️ GCP

stark sapphire Apr 1, 2026, 8:29 PM

#

uneven bridge Goooooolersssss goood to be home can't wait to meet you all ! Sorry for any wron...

Went on a vacation?

uneven bridge Apr 1, 2026, 8:31 PM

#

stark sapphire Went on a vacation?

i was looking for you bud ! can't leave me home alone 🗽

stark sapphire Apr 1, 2026, 8:34 PM

#

Bot

uneven bridge Apr 1, 2026, 8:35 PM

#

stark sapphire Bot

can we work on project together

stark sapphire Apr 1, 2026, 8:36 PM

#

No

uneven bridge Apr 1, 2026, 8:37 PM

#

stark sapphire No

love your energy champ! Gear up

#

get readyyyyyy!!! yall watsssup im here im not going no where cupcake

#

I LOVEEE IT

#

#welcometoJungle 🗽 😎

#

https://www.loom.com/share/47a10f42fc164cbba895e1ce53071c86 im here !!!!!!! im only 3 weeks in you got a long rride

Loom

Exploring the Future of Asian Technology and Innovation 🚀

Hey everyone, in this video, I'm excited to share the latest updates on our project motion frames and the potential they hold for our mission. We’re diving into the specifics of the G6 engine and how it enhances our capabilities. I also touch on the importance of our Asian connections and the unique aspects of our design. I encourage you all t...

▶ Play video

#

after 7 month breaks 😘

#

lets do it !!!!

#

come outside yu think got jokes huh

balmy depot Apr 1, 2026, 8:45 PM

#

fading osprey I was using Antigravity with the Ultra plan until yesterday, working perfectly f...

They gutted the quotas a while back? Maybe you are just hitting the limits now.

fading osprey Apr 1, 2026, 8:46 PM

#

no because it reduces 20% after 3 messages and then after 5 hours it gives me everything back

balmy depot Apr 1, 2026, 8:47 PM

#

fading osprey no because it reduces 20% after 3 messages and then after 5 hours it gives me ev...

That's hitting the first quota cap. Soon you'll hit weekly limits and get 7 day refreshes. And yes, they sold you Ultra saying no weekly caps, but it's widely reported that Ultra folks are getting them these days

fading osprey Apr 1, 2026, 8:50 PM

#

And is there a valid alternative without limits?

balmy depot Apr 1, 2026, 8:50 PM

#

Nope

#

My understanding is that they are building Antigravity into AI Studio, the IDE may not survive.

uneven bridge Apr 1, 2026, 8:51 PM

#

stark sapphire No

im on you juu heard stop playing me punk !

balmy depot Apr 1, 2026, 8:52 PM

#

Closest thing to an alternative I've seen is using OpenCode or ClaudeCode integration and then using Antigravity for planning and then your choice of models to implement.

fading osprey Apr 1, 2026, 8:52 PM

#

and.. remove completely claude

balmy depot Apr 1, 2026, 8:53 PM

#

fading osprey and.. remove completely claude

Maybe, yeah

fading osprey Apr 1, 2026, 8:53 PM

#

balmy depot Closest thing to an alternative I've seen is using OpenCode or ClaudeCode integr...

is equal tu claude opus? and price for this?

balmy depot Apr 1, 2026, 8:53 PM

#

Well can use Claude so yeah equal. Several different ways to go about.

#

You don't switch to Sonnet for implementation? Cheaper that way.

fading osprey Apr 1, 2026, 8:54 PM

#

When I use Opus, it automatically downgrades Sonnet.

balmy depot Apr 1, 2026, 8:55 PM

#

Hrmm? Okay.

#

I run out of the Claude quota so fast, I can't say I've had a ton of experience with it.

balmy depot Apr 1, 2026, 8:59 PM

#

fading osprey When I use Opus, it automatically downgrades Sonnet.

My understanding is that google restricted memory of their Claude instances, so Opus/Sonnet may actually run better elsewhere.

fading osprey Apr 1, 2026, 9:00 PM

#

and where? you know?

balmy depot Apr 1, 2026, 9:03 PM

#

I am personally messing around with OpenCode now as part of my process.

Well ClaudeCode would be at Anthropic. Max plan would be an option?

OpenCode is more able to connect to anything, but the people behind it have "Zen" and "Go" services that include Claude, so I would suspect that would be more to spec with Anthropic. I don't want to get to into doing the pricing reseach for you. Pretty straightforward stuff, but changes often

dawn sphinx Apr 1, 2026, 10:03 PM

#

Does anyone know when Gemini 3.1 Flash Live Preview will be available through Vertex AI?

It seems possible through Google AI Studio but not Vertex AI.

I upgraded to google-genai >= 1.69.0 and have the SDK is unified.

The Gemini change log said on March 26, 2026: “Released gemini-3.1-flash-live-preview, the latest audio-to-audio (A2A) model designed for real-time dialogue and voice-first AI applications.”

models.get() returns a metadata shell but the Live API WebSocket returns 1008/404.

I can’t tell if it is behind a quota/EAP allowlist adjustment.

I don’t know if the endpoint is gated by a Private Preview IAM flag because of some GCP Allowlist Flip or what.

I think the global control plane knows the model exists, but the regional data plane (API Gateway routes/GPU clusters) is unprovisioned.

stark sapphire Apr 2, 2026, 7:02 AM

#

balmy depot My understanding is that they are building Antigravity into AI Studio, the IDE m...

Do you have any sources saying so?

jaunty marsh Apr 2, 2026, 2:16 PM

#

When building an app, the model usage is very confusing to me. For example, when I’m using Gemini 3.1 Pro preview, sometimes it allows me to create quite a few prompts before it exhausts usage on the free plan, and sometimes it’s only a couple.

balmy depot Apr 2, 2026, 2:28 PM

#

stark sapphire Do you have any sources saying so?

Nope, the "IDE many not survive" is pure speculation. Which is why I used the word "may" to indicate uncertainty. This blog however talks about how they are introducing the antigravity agent to AI Studio: https://blog.google/innovation-and-ai/technology/developers-tools/full-stack-vibe-coding-google-ai-studio/

Google

Introducing the new full-stack vibe coding experience in Google AI ...

Start building real apps for the modern web with the Antigravity coding agent and Firebase integration now in Google AI Studio.

#

They've certainly undermined the position of the IDE, while seem to be focusing hard on getting the good bits into the cloud based AI Studio, which honestly is more where the company is comfortable. Have to say that my hopes for the IDE are dimming. I hope I am proven wrong.

#

Google is a big enough company to do both.

signal raft Apr 2, 2026, 3:10 PM

#

balmy depot My understanding is that they are building Antigravity into AI Studio, the IDE m...

I think the same

round solstice Apr 2, 2026, 8:33 PM

#

<@&1009526435276394496> that spammer is back again. 🙁

past moat Apr 2, 2026, 10:26 PM

#

Tip: Add outbound loop prevention to your GitHub Copilot instructions

If your AI agent can send emails or messages, add a rule that stops it from replying to itself. Without it, one email can turn into hundreds.

Example 1 — The email loop:
I built an AI agent that reads my inbox and sends replies. I added the agent's outbound email address (aos@mydomain.com) to the list of allowed senders. When the agent replied to a real email, that reply landed back in the inbox — and the agent replied to that too. It looped 18 times before I caught it, and generated ~89,000 Pub/Sub (publish/subscribe — a message queue service) retry faults in the process.

Example 2 — The fix (three layers):
The rule I added to my Copilot instructions requires three independent guards any time the agent sends something outbound:

Code check — before anything else, reject messages from your own addresses in the handler logic itself
Config check — never add an outbound address to your allowed-senders list
Rate cap — abort if more than 10 emails have gone out in the past 60 minutes
The reason for three layers: if only one guard exists and it's misconfigured, the loop happens anyway. All three have to fail at the same time for a loop to get through.

Why put this in Copilot instructions?
Copilot will generate the outbound handler code for you. If the rule isn't written down, it won't know to add the guards. Once it's in your instructions file, every new handler gets the protection automatically.

vagrant folio Apr 3, 2026, 3:17 PM

#

if you mean the terminal at the first versions where you have access o see and interact how he execute commands etc. I think they just remove this feature. now it execute commands in his terminals but you cant interfer as before, you can see the output

stark sapphire Apr 3, 2026, 4:00 PM

#

his?

Anyway, terminal works here just fine.

lime wyvern Apr 3, 2026, 8:52 PM

#

Nice. Definitely interested would love to hear more about what you're building?

open stone Apr 4, 2026, 9:44 AM

#

how much you already invested in startup

gentle aspen Apr 4, 2026, 12:00 PM

#

Recently I realized my grades weren’t dropping because I didn’t understand topics, but because I didn’t know what to study.

Flashcards help, but creating them manually takes too much time.

So I built an open-source app called ONCards.

It converts notes, PDFs, and slides into flashcards automatically, and uses a local AI system (Gemma3 via Ollama) to:

track weak areas
recommend what to study next
adapt based on performance

It runs fully offline with no API or subscriptions.

Currently uses ~300MB RAM idle and ~4–5GB VRAM during inference, with aggressive caching for performance.

I’m looking for feedback, especially from people running local models or using Gemma.

round solstice Apr 4, 2026, 7:08 PM

#

gentle aspen Recently I realized my grades weren’t dropping because I didn’t understand topic...

Have you tried Gemma 4 yet?

gentle aspen Apr 4, 2026, 7:46 PM

#

round solstice Have you tried Gemma 4 yet?

yeah! it is crazy!!! I am plannng to build an agent system ot manage my other computer as a funproject.and I am considering changing the model in my app to Gemma 4 because I find it more "stable" across many categories.

also the reasoning and native function calling has being a HUGVE deal for me for the past two days. I am still trying to do more stuff. might take some more time to say how good or bad it is. but as of now, it is CRAZY! I think this might be the biggest leap in local AI since Deepseek-r1.

round solstice Apr 4, 2026, 7:49 PM

#

Yeah, definitely amazing how much latent knowledge is in the downloadable blob.

#

And if it's any good at tool calling, it can have current and RAG info.

stark sapphire Apr 5, 2026, 8:28 AM

#

I just made my own AI

deep timber Apr 5, 2026, 9:20 AM

#

Hey devs 👋

I’m building something called DevOPS — a voice-first AI developer assistant that lets you control your entire coding workflow using just your voice.

No typing. You just speak.

You can:
• Search and open your GitHub repos
• Read and explain code
• Create issues and review PRs
• Debug files with AI
• Navigate your projects hands-free

It’s like having a real AI pair programmer that listens, thinks, and responds instantly.

The goal is to make coding faster and more natural — especially when you don’t want to switch contexts or type constantly.

I’m curious:
👉 Would you actually use something like this in your daily workflow?
👉 And more importantly — would you pay for it if it worked really well?

Be honest, I want real feedback 🙏

stark sapphire Apr 5, 2026, 9:23 AM

#

deep timber Hey devs 👋 I’m building something called DevOPS — a voice-first AI developer a...

The issue with talking is that you can't stop your sentence. If you do, the AI would get confused or tries to proceed. When you type, you can stop typing whenever, and continue later.

deep timber Apr 5, 2026, 9:24 AM

#

stark sapphire The issue with talking is that you can't stop your sentence. If you do, the AI w...

But i add pause button also

gentle aspen Apr 5, 2026, 12:24 PM

#

stark sapphire I just made my own AI

dude, you made an app. not your own AI😂.

stark sapphire Apr 5, 2026, 12:24 PM

#

gentle aspen dude, you made an app. not your own AI😂.

lol yes it's actually fun as heck

#

i used the leaked source code from Claude💀

gentle aspen Apr 5, 2026, 12:24 PM

#

stark sapphire lol yes it's actually fun as heck

lmao. I never tried it.

stark sapphire Apr 5, 2026, 12:25 PM

#

internal use only. don't need trouble

gentle aspen Apr 5, 2026, 12:32 PM

#

stark sapphire internal use only. don't need trouble

You should probably try my app. it has RAG and also uses a lot of AI internally.

stark sapphire Apr 5, 2026, 12:33 PM

#

gentle aspen You should probably try my app. it has RAG and also uses a lot of AI internally.

sure, but i can't today, getting ready for work

gentle aspen Apr 5, 2026, 12:33 PM

#

stark sapphire sure, but i can't today, getting ready for work

yeah, sure!

rain lava Apr 5, 2026, 1:35 PM

#

jaunty marsh When building an app, the model usage is very confusing to me. For example, whe...

It's based on token usage, not prompt usage - Higher complexity tasks require more effort, you will get more responses before running out with easier tasks than hard ones.

cinder agate Apr 5, 2026, 2:21 PM

#

what did u do to the Claude limits 🙁

#

antigravity was goated before that

gentle aspen Apr 5, 2026, 3:09 PM

#

I prefer the 20$ codex plan. but fre antigravity isn't bad by any means. just use the gemini models. the Pro low is a good model. I use GPT OSS for planning

#

tbh Codex is way better when it coems to stability and executing.

#

antigravity feels "Fun to use", not the "Pro" tool

next ruin Apr 5, 2026, 8:43 PM

#

stark sapphire i used the leaked source code from Claude💀

U still got it? I wanna see it but I could never find the repo 😭

next ruin Apr 5, 2026, 8:44 PM

#

cinder agate what did u do to the Claude limits 🙁

I have a lot of accounts for that reason

gusty meteor Apr 5, 2026, 8:55 PM

#

next ruin U still got it? I wanna see it but I could never find the repo 😭

you will see the next antigravity update prob lmao

next ruin Apr 5, 2026, 8:56 PM

#

gusty meteor you will see the next antigravity update prob lmao

When is that coming out or do you not know

gusty meteor Apr 5, 2026, 8:56 PM

#

next ruin When is that coming out or do you not know

idk next update

next ruin Apr 5, 2026, 8:56 PM

#

gusty meteor idk next update

Type shi 😭

gusty meteor Apr 5, 2026, 8:56 PM

#

but will surely know every ai apps look the claude code source code xd

#

to see how claude code working better

rain lava Apr 6, 2026, 2:24 AM

#

gentle aspen tbh Codex is way better when it coems to stability and executing.

If Gemini 3.2 Pro gets based on the DeepThink architecture I think it'll be better. Currently I find 3.1 Pro to be focused on maximum speed rather than accuracy on it's coding. Claude Opus 4.6 will beat Gemini 3.1 Pro in tasks that're more complex because it's architecture is built on self reflecting it's decisions to make sure it's right.

#

Gemini Code Assist relies on your subscription plan too so when 3.2 Pro get's released and then added to Code Assist it'll be like the Codex plan rather than a free limited Antigravity Agent.

gentle aspen Apr 6, 2026, 4:57 AM

#

rain lava *If* Gemini 3.2 Pro gets based on the DeepThink architecture I think it'll be be...

you can do some prompt engineering to get that doen too.

#

wont be very effective tho unlike a native arch, but better than nothing.

rain lava Apr 6, 2026, 5:00 AM

#

gentle aspen wont be very effective tho unlike a native arch, but better than nothing.

Yeah true but I can't bring myself to use Code Assist until 3.2 Pro is out everytime I want it to do something it breaks it and makes bugs

gentle aspen Apr 6, 2026, 5:02 AM

#

rain lava Yeah true but I can't bring myself to use Code Assist until 3.2 Pro is out every...

maybe try making your own agent with gemma 4 and GPT OSS. i feel like it is well developed. I mean gemma4:26bis a GOOD model

rain lava Apr 6, 2026, 5:03 AM

#

gentle aspen maybe try making your own agent with gemma 4 and GPT OSS. i feel like it is well...

I don't think my GPU can handle 26B - It's AMD so it's not CUDA and I think CUDA is better at AI

gentle aspen Apr 6, 2026, 5:03 AM

#

rain lava I don't think my GPU can handle 26B - It's AMD so it's not CUDA and I think CUDA...

it is cheap on the API. Plus, Qwen models are dirt cheap on openrouter.

rain lava Apr 6, 2026, 5:09 AM

#

gentle aspen it is cheap on the API. Plus, Qwen models are dirt cheap on openrouter.

If it's cheap it should be free via API - Will Gemma be better than Gemini though I'd think Gemini is many more params than gemma

gentle aspen Apr 6, 2026, 5:12 AM

#

rain lava If it's cheap it should be free via API - Will Gemma be better than Gemini thoug...

yeah, but I like the reasonign style and how easy it would be to run things locally if you want int he future. if you were to build an ecosstem around gemini it would be hard to change anything. since gemma is local + API you can mess with smaller + biger model in the future.

i mean, yuo do you. I usually like to have a flexible environemnt yk

rain lava Apr 6, 2026, 5:17 AM

#

gentle aspen yeah, but I like the reasonign style and how easy it would be to run things loca...

Yeah, that's fair. The flexibility plus local option is pretty nice, to be honest. I'm mostly just thinking about raw capability right now, though. It feels like Gemini would still be ahead there. What would I be able to mess around with on the AI if I ran through API?

gentle aspen Apr 6, 2026, 5:19 AM

#

rain lava Yeah, that's fair. The flexibility plus local option is pretty nice, to be hones...

th eonlu difference with it is: it has more params (26a4b is more tha enough btw), video support (u rolly can with Qwen, but u need a beefed up setup), longg audio.

now the real question is, will u ever use these?

rain lava Apr 6, 2026, 5:22 AM

#

gentle aspen th eonlu difference with it is: it has more params (26a4b is more tha enough btw...

I have Google Flow for Video & Images (I pay for Google One) -- Even though 24b is more than enough would something doublemor triple that actually make a differece? Or when you say more than enough it then doesn't matter if you have more?

I don't want to spend time getting gemma though - Can Gemma scan a repo and add/remove code from it like GCA?

gentle aspen Apr 6, 2026, 5:25 AM

#

rain lava I have Google Flow for Video & Images (I pay for Google One) -- Even though 24b ...

yeah, then gemini it is!
you could try openAi modex models for agents or clude models. but I feelliek codexmodels woud be easierto mess arund if you have the money. but yeah, gemini is good if u ar eona bdget

rain lava Apr 6, 2026, 5:28 AM

#

gentle aspen yeah, then gemini it is! you could try openAi modex models for agents or clude m...

Well I did try Claude Opus 4.6 via Antigravity and found due to it's architecture it's better at complexity than Gemini 3.1 Pro (Which is built for speed) -- I haven't tried GPT OSS 120B yet though - How good is it compared to 3.1 Pro and Claude 4.6 Opus?

gentle aspen Apr 6, 2026, 5:30 AM

#

rain lava Well I did try Claude Opus 4.6 via Antigravity and found due to it's architectur...

tbh gpt oss is meh. it is good for planning, but it feels like a messed up general model instead of a optimized, and good model

rain lava Apr 6, 2026, 5:32 AM

#

gentle aspen tbh gpt oss is meh. it is good for planning, but it feels like a messed up gener...

Anything GPT is open AI right?
I don't follow on OpenAI news so I don't know the latest and greatest model but what's the architecture like for it's best model?

gentle aspen Apr 6, 2026, 5:33 AM

#

rain lava Anything GPT is open AI right? I don't follow on OpenAI news so I don't know th...

GPT5.1 codex max and GPT 5.4 mini is good for planning. GPT 5.4 and GPT5.3-codex is good for executing

rain lava Apr 6, 2026, 5:35 AM

#

gentle aspen GPT5.1 codex max and GPT 5.4 mini is good for planning. GPT 5.4 and GPT5.3-codex...

Ah okay - Is 5.4 & 5.3 codex architecture speed or more like claude's where it self reflects and thinks longer?

gentle aspen Apr 6, 2026, 5:37 AM

#

rain lava Ah okay - Is 5.4 & 5.3 codex architecture speed or more like claude's where it s...

5.2-codex re-evaluates what it did, but GPT5.4 is good for frontend. use GPT5.2 or GPT5.3 codex models for backend.
5.1-codex-max is intelligent and fast, it is meant for planing an dresearh

rain lava Apr 6, 2026, 5:42 AM

#

gentle aspen 5.2-codex re-evaluates what it did, but GPT5.4 is good for frontend. use GPT5.2 ...

Is there any ai model that's good for all of these or is that impossible/not made yet? -- Do any of these match 3.1pro in terms of what it does?

I know 3.1pro is speed but is there something specific it can do actually good other than speed?
Would 5.2codex be like claudes arch in terms of re-evaluation and would it be considered better than claude 4.6 opus

gentle aspen Apr 6, 2026, 5:44 AM

#

rain lava Is there any ai model that's good for all of these or is that impossible/not mad...

that would be the opus models.
But there isn't an actuall all in one model yet. bcs more parameters = more cost. so you can use mid rnage models for plannign and big models for executing. but if you really want an all in one (I dont reccomedn fo rbig work). use deepseek v3.2.

thats why agentic AI is annoying.

rain lava Apr 6, 2026, 5:46 AM

#

gentle aspen that would be the opus models. But there isn't an actuall all in one model yet. ...

I suppose an all in one model would be either an untrue all in one model (switches models for what you need) or if it could somehow change its parametre count for the response (still changes model properties)

According to ChatGPT the "best" coding AI is Github Copilot AI (Which is based on GPT) but I don't think I believe it at all to be honest.
If that's the opus model than would sonet be a fast model like 3.1pro?

gentle aspen Apr 6, 2026, 5:48 AM

#

rain lava I suppose an all in one model would be either an untrue all in one model (switch...

tbh, use claude models (use opus whe u can) for coding. use GPT models for planinng. specifically for frontend, use GPT5.4 (no exception)

rain lava Apr 6, 2026, 5:50 AM

#

gentle aspen tbh, use claude models (use opus whe u can) for coding. use GPT models for plani...

Does codex have free access for 5.4?

gentle aspen Apr 6, 2026, 5:51 AM

#

rain lava Does codex have free access for 5.4?

to use codex u need to have the 20$ plan at least. ll the models are available once u pay.

rain lava Apr 6, 2026, 5:52 AM

#

Yeah I don't want to pay another plan
Are any of the Gemini 2.5 Models better than 3.1 Pro at anything?

gentle aspen Apr 6, 2026, 7:00 AM

#

rain lava Yeah I don't want to pay another plan Are any of the Gemini 2.5 Models better th...

fuh nah. old gemini is really bad tbh. it is "okay" for a geenral task, plus they didnt even had CoT.

rain lava Apr 6, 2026, 7:01 AM

#

gentle aspen fuh nah. old gemini is really bad tbh. it is "okay" for a geenral task, plus the...

Oh okay then - Do you know when we can expect a release for 3.2Pro though?

gentle aspen Apr 6, 2026, 7:16 AM

#

rain lava Oh okay then - Do you know when we can expect a release for 3.2Pro though?

proly at the last 4 months of this year ig.

#

who knows..?

rain lava Apr 6, 2026, 7:16 AM

#

gentle aspen who knows..?

Google

rain lava Apr 6, 2026, 8:20 AM

#

rain lava Google

According to some leaks and gemini itself they all think 3.2pro comes out may 19-20.

trail wagon Apr 6, 2026, 11:56 AM

#

rain lava According to some leaks and gemini itself they all think 3.2pro comes out may 19...

Can you elaborate which source?

rain lava Apr 6, 2026, 11:58 AM

#

trail wagon Can you elaborate which source?

https://leaveit2ai.com/ai-tools/language-model/google-gemini-3
https://youtu.be/j63kkppYKZs?si=KGwYu89Udp4gLRhA

(Image it text from Gemini 3.1 P)

Leaveit2AI

Gemini 3.2 Release Date, Leaks & What Google Hasn't Said Yet

Gemini 3.1 Pro dropped Feb 19. Now Gemini 3.2 is showing up in Arena logs and API strings. Google hasn't announced it. Here's everything confirmed, leaked, and expected — updated as it happens.

YouTube

BitBiasedAI

Gemini 3 2 Explained: (84 6% ARC AGI-2) Google’s First AI That Ac...

Link to our newsletter: https://bitbiased.ai/
Gemini 3.2 isn’t just another AI model — it’s a shift from prediction to real reasoning.

In this video, we break down Google’s latest AI system, including Deep Think reasoning, the leaked TPU v7 Ironwood chip, and Antigravity — a new agentic platform that could replace traditional coding e...

▶ Play video

trail wagon Apr 6, 2026, 11:59 AM

#

rain lava https://leaveit2ai.com/ai-tools/language-model/google-gemini-3 https://youtu.be/...

Thanks alot

rain lava Apr 6, 2026, 11:59 AM

#

trail wagon Thanks alot

You're welcome!

hushed night Apr 6, 2026, 5:43 PM

#

Hello Im an AI researcher and I currently need a team, if you're interested text me please, I'm currently working on an algorithm that can significantly lower both the energy comsumption and the compute cost of ai training

stark sapphire Apr 6, 2026, 5:51 PM

#

hushed night Hello Im an AI researcher and I currently need a team, if you're interested text...

Are you a millionaire? And own your own datacenters? Because otherwise this project is nearly impossible.

hushed night Apr 6, 2026, 5:52 PM

#

its not, this is about optimizing what everyone in the world uses

stark sapphire Apr 6, 2026, 5:53 PM

#

Yea, but in order to do that, you need very powerful computers

hushed night Apr 6, 2026, 5:54 PM

#

no, even a gpu on colab is enough to test this

#

i just think that backpropagation isn't the key to AI, it's approssimative and expensivd

stark sapphire Apr 6, 2026, 5:57 PM

#

so it's more about Learning & Experimenting, using tools like Google Colab, Kaggle, and Hugging Face?

hushed night Apr 6, 2026, 6:00 PM

#

if we optimize the learning we optimize comsumption and potentially even compute time and power

#

i dont think a model should be trained on a dataset at all, at least not how we know it nowadays, think about it, when we train a model we make little steps to get to the end of the valley, the result of backpropagation, what if we find a way to reverse ingeneer this: we have a set of qas and we calculate the weights back in the layers, but if the questions arent generated by an ai, this isnt reverse engeneering anymore, its creating a new model

#

if you're interested dm me

gentle aspen Apr 6, 2026, 6:16 PM

#

hushed night Hello Im an AI researcher and I currently need a team, if you're interested text...

man, I dont want to demotivated, but vibe coding an optimization is liek saying "Yo bro I am going to help Sam Altman make 5.5 because i am board. How does it work? Somehow..."

I can help you with any other algorithm

#

my app has an algorithm called "NNA" it is a recomendation system build on embedding models with 3 levels to each to filter out things and reccomend user what ever you want without a lot of customization.

#

do uhave experience?

hushed night Apr 6, 2026, 6:18 PM

#

sam altman said that ai inst a transformed based system but also said that ai as we have it right now is already capable of creating the right ai system

gentle aspen Apr 6, 2026, 6:18 PM

#

i mean, i could help u upto some extent

gentle aspen Apr 6, 2026, 6:18 PM

#

hushed night sam altman said that ai inst a transformed based system but also said that ai as...

dude, u sound like me when I feel motivated, and I clearly know that i am broke and I shoudl stop thinking about it.

#

lol

hushed night Apr 6, 2026, 6:18 PM

#

gentle aspen my app has an algorithm called "NNA" it is a recomendation system build on embed...

quite, i personally developed small models the size of gpt 3

gentle aspen Apr 6, 2026, 6:19 PM

#

GPT3 ?!!!

where di dyou get the compute?1

hushed night Apr 6, 2026, 6:19 PM

#

gentle aspen lol

its not about money dude, its about having the right idea

gentle aspen Apr 6, 2026, 6:19 PM

#

175b parameters at bf16 is no joke bro

hushed night Apr 6, 2026, 6:19 PM

#

colab dude

gentle aspen Apr 6, 2026, 6:19 PM

#

hushed night its not about money dude, its about having the right idea

dude, real life is not a pixar movie. lets be rreal here

hushed night Apr 6, 2026, 6:19 PM

#

i spent alot

gentle aspen Apr 6, 2026, 6:20 PM

#

hushed night colab dude

you cant train a 170b model in collab🤣🤣🤣. look at this guy

hushed night Apr 6, 2026, 6:20 PM

#

gentle aspen dude, real life is not a pixar movie. lets be rreal here

you can be in or not

hushed night Apr 6, 2026, 6:20 PM

#

gentle aspen you cant train a 170b model in collab🤣🤣🤣. look at this guy

dude tpu v6

#

it has 192gb of ram

gentle aspen Apr 6, 2026, 6:22 PM

#

hushed night you can be in or not

I am sorry, I am out. I dont think a person is crazy enough for this. if u want help with soemthing realistic, i will help with 0 thoughts.
i like your idea, the way you vizualize is, let just say... "Not-enough-thought-to-it"

if you have more cool ideas which I can help. i will!

hushed night Apr 6, 2026, 6:23 PM

#

alright thanks

gentle aspen Apr 6, 2026, 7:30 PM

#

You know what I just realized. all these faety models aretoo big. the shield gemma and all this is too big. wouldn't it be cool if something (maybe even oogle) fine tuned gemma3:270m to be a shield gemma model?

oblique cosmos Apr 7, 2026, 12:43 AM

#

im sorry for inter rupting i just saying theres a thunder storm at my house and lots of hail size of screws 40 miles an hour

patent rose Apr 7, 2026, 12:51 AM

#

hey am new

rain lava Apr 7, 2026, 1:57 AM

#

??

tulip stump Apr 7, 2026, 7:12 AM

#

claude just not working right now?

rain lava Apr 7, 2026, 7:15 AM

#

Yes it's likely not working due to experiencing high usage.

rain lava Apr 7, 2026, 7:15 AM

#

gentle aspen You know what I just realized. all these faety models aretoo big. the shield gem...

I think I've changed my mind about Gemma 4.

gentle aspen Apr 7, 2026, 7:16 AM

#

why?

#

bro why are we both bronze 1?😭😭

rain lava Apr 7, 2026, 7:18 AM

#

I just don't know whether to use 26B MoE or 31B Dense:
3.1Pro, It also said 31B was going to be better for me.

I changed my mind because according to 3.1Pro it's architecture won't be limited via Google and its going to be better at thinking and reasons rather than be as fast as possible and having control over easoning would be good

#

This was said by 3.1Pro

The Bandwidth Bottleneck: DDR5 vs. VRAM
To understand why the 31B model will slow down on your machine, you have to look at how data moves. Large Language Models are heavily bound by memory bandwidth, not just raw compute.
VRAM (Your RX 9070 XT): Modern GDDR6 memory pushes bandwidth anywhere from 500 GB/s to 800+ GB/s. It feeds the GPU core almost instantaneously.
System RAM (Your 32GB DDR5): Even with fast DDR5 in dual-channel, you are maxing out around 80 GB/s to 100 GB/s.
When a model exceeds your 16GB VRAM limit, the inferencing engine (like llama.cpp) puts the core layers on the GPU and the remaining layers on your system RAM. Every time the model generates a single word (token), it has to pull data across the PCIe bus from the DDR5. Because DDR5 is roughly 5 to 8 times slower than VRAM, your entire generation speed instantly drops to match the speed of the system RAM.
The Time Difference: 26B MoE vs. 31B Dense
If you ask the model to rewrite a 100-line broken Calamares installer script and output a 500-token response:
Gemma 4 26B A4B (MoE)
Because it only activates ~4 billion parameters per token and fits almost entirely in your ultra-fast VRAM, it will fly. You will likely see generation speeds of 30 to 50+ tokens per second.
Total Time: You will have your script in roughly 10 to 15 seconds.
Gemma 4 31B (Dense)
Because it fires all 31 billion parameters for every single token and constantly pulls data across the PCIe bus from your slower DDR5, it will chug. You will likely see generation speeds drop to 5 to 10 tokens per second. If you activate the built-in Think mode, it will spend additional time internally looping before it outputs the code.
Total Time: You will likely wait 1 to 3 minutes for the exact same 500-token script.

gentle aspen Apr 7, 2026, 7:18 AM

#

rain lava I just don't know whether to use 26B MoE or 31B Dense: 3.1Pro, It also said 31B ...

bro... USE 26A4B

rain lava Apr 7, 2026, 7:18 AM

#

3.1pro said 31b will be better for what i need

gentle aspen Apr 7, 2026, 7:18 AM

#

!!!!

#

no it wont

#

pls dont

rain lava Apr 7, 2026, 7:18 AM

#

gentle aspen pls dont

Okay!

gentle aspen Apr 7, 2026, 7:18 AM

#

u you screw up your PC

#

plsss

#

dont

rain lava Apr 7, 2026, 7:18 AM

#

gentle aspen u you screw up your PC

How..

#

Like it'll lag? I heard Gemma 4 uses lots of VRAM but couldn't I offload to DDR5?

gentle aspen Apr 7, 2026, 7:19 AM

#

26a4b is wayyy mor ethan enough for agentic. Plus my 500 can;t even handl;e the 31b dense and barely runs 26a4b at 32k context. I have a good system and stillgets 15 TPS:
5070 12gb
32gb ddr5 6k mt/s
r7 9700x

(used ollama)

gentle aspen Apr 7, 2026, 7:20 AM

#

rain lava Like it'll lag? I heard Gemma 4 uses lots of VRAM but couldn't I offload to DDR5...

ollama iwll automatically offload. type ollama ps in the command prompt you will see CPU/GPU usage

rain lava Apr 7, 2026, 7:20 AM

#

gentle aspen 26a4b is wayyy mor ethan enough for agentic. Plus my 500 can;t even handl;e the ...

I got 4gb more vram --- apparently moe can get mixed up between its stuff is it true?

rain lava Apr 7, 2026, 7:20 AM

#

gentle aspen ollama iwll automatically offload. type ```ollama ps``` in the command prompt yo...

Okay

gentle aspen Apr 7, 2026, 7:20 AM

#

For no reason at all, google didnt releease a sub ~12b model this generation😓

rain lava Apr 7, 2026, 7:21 AM

#

rain lava I got 4gb more vram --- apparently moe can get mixed up between its stuff is it ...

But an 9070xt has no cuda

gentle aspen Apr 7, 2026, 7:21 AM

#

rain lava I got 4gb more vram --- apparently moe can get mixed up between its stuff is it ...

wdym 4gb more VRAM

rain lava Apr 7, 2026, 7:21 AM

#

gentle aspen For no reason at all, google didnt releease a sub ~12b model this generation😓

They love when people pay them

gentle aspen Apr 7, 2026, 7:21 AM

#

rain lava But an 9070xt has no cuda

AMD has it's own AI accelerater. new ollaam supports is (kinda..)

rain lava Apr 7, 2026, 7:21 AM

#

gentle aspen wdym 4gb more VRAM

9070XT has 16gb vram and your 5070 is 12gb

gentle aspen Apr 7, 2026, 7:21 AM

#

rain lava They love when people pay them

ig... they used to drop models with practical sizes

gentle aspen Apr 7, 2026, 7:22 AM

#

rain lava 9070XT has 16gb vram and your 5070 is 12gb

yeah bt the raw performance is low compared to CUDA + 6k mt/s. what CPU do you have?

rain lava Apr 7, 2026, 7:22 AM

#

I could run a 4b dolphin model on my laptops igpu and i was on uhd rather than iris xe because asus gave me 1stick 32 rather than 2x16 so i only got 64bit

rain lava Apr 7, 2026, 7:22 AM

#

gentle aspen yeah bt the raw performance is low compared to CUDA + 6k mt/s. what CPU do you h...

I7 14700kf

#

My ram wont be as good as urs tho in terms of speed its 5200 and its a micron die

gentle aspen Apr 7, 2026, 7:23 AM

#

rain lava I7 14700kf

oooh thats good! you could run it. make sure you load most of the tensors into your GPU

gentle aspen Apr 7, 2026, 7:24 AM

#

rain lava My ram wont be as good as urs tho in terms of speed its 5200 and its a micron di...

not bad. even ddr4 is "okay", you are well enough since most tensors are loaded into VRAM. you might get around 15 TPS at 32k CTX like me

rain lava Apr 7, 2026, 7:24 AM

#

gentle aspen not bad. even ddr4 is "okay", you are well enough since most tensors are loaded ...

Idk what any of that means (tps and 32k ctx) i havent played around with ai much

#

Oh tokens per sec right

#

Idk ctx

gentle aspen Apr 7, 2026, 7:26 AM

#

TPS = tokens per second (1 word = 1.15 ish tokens with sentencepiece), CTX = context lenght, it is how much the AI can remember. for your task you NEED at LEAST 32k since you are doing aagentic stuff, right?

rain lava Apr 7, 2026, 7:26 AM

#

Idk what agentic stuff means? Coding?

gentle aspen Apr 7, 2026, 7:26 AM

#

Plus, the gemma model has a CoT (chain of thought) it eats CTX for breakfast, soa little headroom is safe for reasoning

gentle aspen Apr 7, 2026, 7:27 AM

#

rain lava Idk what agentic stuff means? Coding?

doing decision and executing it with tool calling by itself. for that it needs to reason like [<think> if I do X, the Y will happen... should I do it?</think>].
and then it gives the answer.

rain lava Apr 7, 2026, 7:28 AM

#

gentle aspen doing decision and executing it with tool calling by itself. for that it needs t...

Csn i implement it to write code and add if i accept and do commands like in antigravity?

#

Also where do i get gemma4, huggingface, github, olamma?

gentle aspen Apr 7, 2026, 7:31 AM

#

install ollama and run ollama runn gemma4:26b. But I recomend you run ollama run gemma4:e4b
it is smaller and better. I can run it at 128k context at 80 TPS, which is PERFECT.

#

it is not neccesarilly better, but that model is wayy more than capable. it wont do niche CSS or typescript, but you cando other agentic stuff. Plus with the new audio support you can make it organze your folders and stuff liek that yk.

rain lava Apr 7, 2026, 7:33 AM

#

gentle aspen it is not neccesarilly better, but that model is wayy more than capable. it wont...

Will it be around as good as 26b in bash

rain lava Apr 7, 2026, 7:33 AM

#

rain lava Csn i implement it to write code and add if i accept and do commands like in ant...

Is that possible?

gentle aspen Apr 7, 2026, 7:34 AM

#

rain lava Is that possible?

YES! thats the whole point of it. you just tell it and it will do it. it can make/edit files, run commands and edit sutff in yur folders you gave permission in or your full computer if you gave permissionofc.

rain lava Apr 7, 2026, 7:38 AM

#

gentle aspen YES! thats the whole point of it. you just tell it and it will do it. it can mak...

Oh okay

rain lava Apr 7, 2026, 7:40 AM

#

rain lava Will it be around as good as 26b in bash

Will it??

rain lava Apr 7, 2026, 7:41 AM

#

gentle aspen bro why are we both bronze 1?😭😭

We need lvl 3 for silver :(

gentle aspen Apr 7, 2026, 7:41 AM

#

rain lava We need lvl 3 for silver :(

How can we level up?

rain lava Apr 7, 2026, 7:42 AM

#

I believe every message = 1xp, and you need certain amouynt of xp for next level

Run /level in #commands

#

That was easier to setup than expected

gentle aspen Apr 7, 2026, 7:47 AM

#

told u gng

#

ollama is super convinient. try doing these:
try doing tool calling, increasing the context length and other cool stuff

#

btw to increase context natively in ollama, go to setings --> context length --> [move hte slider to about 128k]

to see how much TPS you get run ollama run gemma4:e4b --verbose

rain lava Apr 7, 2026, 7:50 AM

#

gentle aspen btw to increase context natively in ollama, go to ```setings --> context length ...

Thanks I'll use these

#

❯ ollama run gemma4:e4b --verbose

Hi
Thinking...
Thinking Process:

Analyze the input: The input is "Hi". This is a basic, informal greeting.

Determine the user's intent: The user is initiating a conversation and expects a friendly, reciprocal greeting.

Formulate the response goal: Be polite, engaging, and inviting.

Generate options:

Option 1 (Minimal): Hi.

Option 2 (Standard): Hello! How can I help you today?

Option 3 (Friendly/Warm): Hi there! How are you doing today?

Select the best option: Option 2 or 3 are ideal as they acknowledge the greeting and immediately prompt the user for their actual need,
fulfilling the AI role. I'll go with a combination of friendly greeting and helpful query.
...done thinking.

Hello! How can I help you today? 😊

total duration: 11.099657542s
load duration: 127.421351ms
prompt eval count: 16 token(s)
prompt eval duration: 53.326788ms
prompt eval rate: 300.04 tokens/s
eval count: 204 token(s)
eval duration: 10.837521216s
eval rate: 18.82 tokens/s

Send a message (/? for help)

#

I asked 26B the same question and it was faster

gentle aspen Apr 7, 2026, 7:54 AM

#

it is 18 TPS bcs it is just a few tokens. even if you had a 1gbps iternet yo wil use only like 25 mbps for a 3mb download. try telling it to make an essya about something

rain lava Apr 7, 2026, 7:58 AM

#

total duration: 1m30.097378297s
load duration: 3.745011455s
prompt eval count: 25 token(s)
prompt eval duration: 386.711407ms
prompt eval rate: 64.65 tokens/s
eval count: 1675 token(s)
eval duration: 1m25.232874466s
eval rate: 19.65 tokens/s

Send a message (/? for help)

I asked:
Write an essay on how the Linux kernel was made
On 4b
128k context

gentle aspen Apr 7, 2026, 7:58 AM

#

you know what? it is usable at least. 20 TPS is not bad

rain lava Apr 7, 2026, 7:59 AM

#

I think it's using CPU..

gentle aspen Apr 7, 2026, 7:59 AM

#

you can only use like 32k context MAX MAX (absolute max) with the 26a4b model. even if it is 4b activated, the 26b tensors are still loaded

gentle aspen Apr 7, 2026, 7:59 AM

#

rain lava I think it's using CPU..

and gpu

#

while running the moel run ollama pss on another terminal

rain lava Apr 7, 2026, 8:00 AM

#

Currently on 26b 128k --

gentle aspen Apr 7, 2026, 8:00 AM

#

I get:

gemma4:e4b    c6eb396dbd59    16 GB    47%/53% CPU/GPU    131072     3 minutes from now```

rain lava Apr 7, 2026, 8:00 AM

#

❯ ollama ps
NAME ID SIZE PROCESSOR CONTEXT UNTIL
gemma4:26b 5571076f3d70 26 GB 100% CPU 131072 4 minutes from now

~

gentle aspen Apr 7, 2026, 8:00 AM

#

are you using garuda?

rain lava Apr 7, 2026, 8:00 AM

#

CachyOS

gentle aspen Apr 7, 2026, 8:00 AM

#

rain lava ❯ ollama ps NAME ID SIZE PROCESSOR CONTEXT UNTIL...

do you have your AMD drivers installed?

gentle aspen Apr 7, 2026, 8:01 AM

#

rain lava CachyOS

ohh, maybe they al look alikeig

rain lava Apr 7, 2026, 8:01 AM

#

CachyOS automatically installs AMD GPU Drivers

gentle aspen Apr 7, 2026, 8:01 AM

#

rain lava CachyOS automatically installs AMD GPU Drivers

did you try this with windows?

rain lava Apr 7, 2026, 8:01 AM

#

No

#

but I know the drivers are there -- My games run great

gentle aspen Apr 7, 2026, 8:01 AM

#

it might work. it works in my friends computer

gentle aspen Apr 7, 2026, 8:02 AM

#

rain lava but I know the drivers are there -- My games run great

soemtimes linux just does linux stuff just like windows. I feel like these models are well optimized for Mac tbh

rain lava Apr 7, 2026, 8:03 AM

#

#

Oh..

#

It's using my ddr5

#

Look at mem usage

gentle aspen Apr 7, 2026, 8:03 AM

#

btw I thinkyou will gewt good TPS with GPU, since I already this. Intel CPUs has a lot of threads

gentle aspen Apr 7, 2026, 8:03 AM

#

rain lava It's using my ddr5

yeah

rain lava Apr 7, 2026, 8:04 AM

#

total duration: 1m51.307317733s
load duration: 6.42701621s
prompt eval count: 22 token(s)
prompt eval duration: 883.754905ms
prompt eval rate: 24.89 tokens/s
eval count: 1852 token(s)
eval duration: 1m43.238899214s
eval rate: 17.94 tokens/s

Send a message (/? for help)

Well that's what I got 128k on 26b worked and not using gpu ig

#

Okay I think ik

#

"When Ollama calculates the memory requirements before starting the chat, it realizes that 30 GB is way over your 16 GB limit. Instead of crashing your system with an "Out of Memory" error, Ollama's fallback mechanism automatically offloads the model to your system's DDR5 RAM and tells your i7-14700KF CPU to process it."

gentle aspen Apr 7, 2026, 8:05 AM

#

rain lava total duration: 1m51.307317733s load duration: 6.42701621s prompt e...

because even tho it is 26b only 4b parameters are activted. you lost abouut ~2-3 tps for banchwidth

#

it dynamically loads tensors between VRAM and ram

rain lava Apr 7, 2026, 8:07 AM

#

4b 32k still

gentle aspen Apr 7, 2026, 8:07 AM

#

it doesn;t calculate, and select. It loads the doable tensors into the best hardware. but if you have more GPU ollama would auto detect

gentle aspen Apr 7, 2026, 8:08 AM

#

rain lava 4b 32k still

ohh... that is weird. I have issues witht he "effective" models on my lapop, but it is "okay" in my computer. for some reason gemma3n:e2b is running on 60 TPS, while a normal 4b model can reach well over 180 TPS on my 5070 (@ 4k ctx)

#

if you just want to experiment (I wont reccomend). Use a q3 or q2 quantization. you will feel lik eit repeats the same thing and uses a "smoother" text of flow (in a bad way), but you wil fit in less ram which increases the speed

#

omds my grammar💀

rain lava Apr 7, 2026, 8:17 AM

#

Gemini said

"I completely led you down the wrong path, and I apologize. The issue is entirely my fault.

I originally had you run sudo pacman -S ollama.

On Arch Linux, the maintainers split the packages to save download space. The base ollama package you currently have installed is compiled strictly for CPU inference only. It physically does not have the backend code to talk to your GPU, no matter what environment variables we set. You can even see it in your latest log: device=CPU.

We need to swap it for the ROCm-enabled package. Here is how to fix my mistake and get this working.

Install the correct GPU package

Run this command. Pacman will warn you that it conflicts with ollama and ask if you want to replace it. Press Y.
Bash

sudo pacman -S ollama-rocm"

I gotta stop relying on ai 😭

#

That's better:

❯ sudo systemctl daemon-reload
sudo systemctl restart ollama.service

~
❯ ollama run gemma4:e4b --verbose

hello
Hello! How can I help you today? 😊

total duration: 366.351577ms
load duration: 130.299963ms
prompt eval count: 16 token(s)
prompt eval duration: 61.232751ms
prompt eval rate: 261.30 tokens/s
eval count: 11 token(s)
eval duration: 116.979446ms
eval rate: 94.03 tokens/s

Send a message (/? for help)

gentle aspen Apr 7, 2026, 8:23 AM

#

dude, ollama literaly gives you the command

gentle aspen Apr 7, 2026, 8:24 AM

#

rain lava That's better: ❯ sudo systemctl daemon-reload sudo systemctl restart ollama.s...

94??!!! with 128k context????

rain lava Apr 7, 2026, 8:25 AM

#

No it was like 4k context

#

Ill try again 128k on 4b

gentle aspen Apr 7, 2026, 8:25 AM

#

ahhhh

rain lava Apr 7, 2026, 8:27 AM

#

~ 2m 21s
❯ ollama run gemma4:e4b --verbose

/set parameter num_ctx 131072
Set parameter 'num_ctx' to '131072'

hello
Thinking...
Thinking Process:

Analyze the input: The input is "hello". This is a simple, conversational greeting.

Determine the goal: The objective is to respond naturally, warmly, and conversationally to match the tone of the greeting.

Formulate response options:

Option 1 (Mirror/Simple): "Hello." (A bit flat)

Option 2 (Engaging/Standard): "Hello! How can I help you today?" (Standard assistant response)

Option 3 (Friendly/Warm): "Hello there! How are you doing?" (More personable)

Select the best option: A combination of warmth and readiness to assist is usually ideal.

Self-Correction/Refinement: I should offer a greeting and prompt the user for their request.

Final Response: "Hello! How can I assist you today?" or similar variants.
...done thinking.

Hello! How can I help you today? 😊

total duration: 8.428110023s
load duration: 3.358372769s
prompt eval count: 16 token(s)
prompt eval duration: 72.365564ms
prompt eval rate: 221.10 tokens/s
eval count: 227 token(s)
eval duration: 4.83334077s
eval rate: 46.97 tokens/s

Send a message (/? for help)

#

❯ ollama run gemma-26b-32k --verbose

hi
Thinking...
The user said "hi".
This is a standard greeting.
*   Acknowledge the greeting.
*   Offer assistance.
"Hello! How can I help you today?" or "Hi there! How are you doing?" or "Hi! Is there anything I can assist you with?"

Self-Correction/Refinement: Keep it simple and friendly.
...done thinking.

Hello! How can I help you today?

total duration: 2.827455169s
load duration: 121.481958ms
prompt eval count: 16 token(s)
prompt eval duration: 69.863333ms
prompt eval rate: 229.02 tokens/s
eval count: 93 token(s)
eval duration: 2.54779111s
eval rate: 36.50 tokens/s

Send a mes

#

I did q4 with flash attyention 64k may be possible with it

#

total duration: 47.229514139s
load duration: 122.962621ms
prompt eval count: 43 token(s)
prompt eval duration: 84.649199ms
prompt eval rate: 507.98 tokens/s
eval count: 1593 token(s)
eval duration: 46.440853672s
eval rate: 34.30 tokens/s

Send a me

I got this on askin g write an essay on the linux kernel

#

I made it do 96k ctx --

total duration: 1m16.957083359s
load duration: 133.327347ms
prompt eval count: 43 token(s)
prompt eval duration: 101.023869ms
prompt eval rate: 425.64 tokens/s
eval count: 2417 token(s)
eval duration: 1m15.811992956s
eval rate: 31.88 tokens/s

Send a message (/? for help)

And then 128k

total duration: 1m13.766787666s
load duration: 120.543248ms
prompt eval count: 43 token(s)
prompt eval duration: 99.394669ms
prompt eval rate: 432.62 tokens/s
eval count: 2356 token(s)
eval duration: 1m12.69612117s
eval rate: 32.41 tokens/s

#

I feel making an essay isnt stressing it.

gentle aspen Apr 7, 2026, 8:59 AM

#

making an essay gets the average TPS instead of a burst TPS

rain lava Apr 7, 2026, 9:02 AM

#

Whats burst tps?

#

Also I asked:
❯ cat pg100.txt | ollama run gemma-32k --verbose "Give me a detailed summary of every play included in this file.

(It's The Complete Works of William Shakespeare) And now it takes forever

gentle aspen Apr 7, 2026, 9:42 AM

#

rain lava Whats burst tps?

it's not an official word, but it meant a "temporary speed" it is an unnotcable bug-ish thing. when you ask something whch will give like 3-8 tokens, it wont give the proper avg TPS. you should run the model like 4 times upto 500 + tokens each, you will get a good "average"

rain lava Apr 7, 2026, 9:42 AM

#

gentle aspen it's not an official word, but it meant a "temporary speed" it is an unnotcable ...

Okay

mild slate Apr 7, 2026, 11:39 AM

#

Anyone having trouble to vibe code e2b into flutter mobile app?

rain lava Apr 7, 2026, 12:46 PM

#

gentle aspen it's not an official word, but it meant a "temporary speed" it is an unnotcable ...

I got it in the terminal in antigravity and it can see my repo stuff and edit files but its a bit underwhelming

#

It hallucinated after a little bit

gentle aspen Apr 7, 2026, 12:59 PM

#

yeah I know. did you try Qwen3.5:9b

#

I think you will like it. it is the perfect size

#

Gemma "e" models are kinda bad in my opinion. they are very unstable

rain lava Apr 7, 2026, 1:02 PM

#

gentle aspen Gemma "e" models are kinda bad in my opinion. they are very unstable

I didn't use E. I ran 26b at 32k ctx

rain lava Apr 7, 2026, 1:02 PM

#

gentle aspen yeah I know. did you try Qwen3.5:9b

I haven't; is it better at agentic workflows than gemma4?

#

Or coding

#

I hate waiting for the next gemini release tbh

inner ice Apr 7, 2026, 1:43 PM

#

heyya chat

gentle aspen Apr 7, 2026, 1:43 PM

#

rain lava I haven't; is it better at agentic workflows than gemma4?

nothing in AI is "better". it might fit your flow though.

rain lava Apr 7, 2026, 1:44 PM

#

gentle aspen nothing in AI is "better". it might fit your flow though.

Flow?

gentle aspen Apr 7, 2026, 1:44 PM

#

I mean: what you want to do

wind tree Apr 7, 2026, 1:45 PM

#

love1

rain lava Apr 7, 2026, 2:11 PM

#

gentle aspen I mean: what you want to do

So qwen would be " better " in certain things like coding and being agentic

gentle aspen Apr 7, 2026, 2:11 PM

#

rain lava So qwen would be " better " in certain things like coding and being agentic

pretty much

rain lava Apr 7, 2026, 2:12 PM

#

What makes it better? The arch?

gentle aspen Apr 7, 2026, 2:20 PM

#

rain lava What makes it better? The arch?

mainly, the tokenizer.
the architecture makes it better too.

rain lava Apr 7, 2026, 2:37 PM

#

gentle aspen mainly, the tokenizer. the architecture makes it better too.

Better architecture & tokenizer yet older than gemma4?

gentle aspen Apr 7, 2026, 2:37 PM

#

rain lava Better architecture & tokenizer yet older than gemma4?

older by 1-2 weeks gng.

rain lava Apr 7, 2026, 2:38 PM

#

gentle aspen older by 1-2 weeks gng.

Yeah but google's a multi billion dollar company

gentle aspen Apr 7, 2026, 2:39 PM

#

rain lava Yeah but google's a multi billion dollar company

Alibaba is china's AWS dude

#

like... literally

rain lava Apr 7, 2026, 2:40 PM

#

gentle aspen Alibaba is china's AWS dude

?? Never heard of those

gentle aspen Apr 7, 2026, 2:41 PM

#

rain lava ?? Never heard of those

u never heard of Alibaba and Amazon???!!!

rain lava Apr 7, 2026, 2:41 PM

#

Ive never heard of alibaba and i didnt know short version of amazon

gentle aspen Apr 7, 2026, 2:42 PM

#

alibaba is like CRAZYYYYY! they do some crazy research dude

rain lava Apr 7, 2026, 2:42 PM

#

More than anthropic?

gentle aspen Apr 7, 2026, 2:43 PM

#

well... they were doing it since he 2000s, but I think anthropic does more research on the tech we already have. Alibaba wants to invent new things. no offense to anthropic, but they dont "invent" new arhcitectures and tokenizers yk

rain lava Apr 7, 2026, 2:44 PM

#

And google, do they do less research than both in ai?

I'd thought google wouldve made better ai due to how much they can spand on data centres gpu clusters and research

gentle aspen Apr 7, 2026, 2:45 PM

#

rain lava And google, do they do less research than both in ai? I'd thought google wouldv...

I would say google and alibaba does the same amount of research

rain lava Apr 7, 2026, 2:46 PM

#

gentle aspen well... they were doing it since he 2000s, but I think anthropic does more resea...

Well atleast anthropic didnt have a weird mindset on fastest agentic ai --- 3.1pro is fast but sucks

gentle aspen Apr 7, 2026, 2:46 PM

#

just because they have money and data doesn't mean they will always mak ehtebes tmodel

gentle aspen Apr 7, 2026, 2:46 PM

#

rain lava Well atleast anthropic didnt have a weird mindset on fastest agentic ai --- 3.1p...

I mean like I said, there is no "best" AI, bcs I find Pro 3.1 good at forntend

rain lava Apr 7, 2026, 2:48 PM

#

gentle aspen I mean like I said, there is no "best" AI, bcs I find Pro 3.1 good at forntend

I havent learnt rnough to know the true difference between frontend backend and full stack yet

rain lava Apr 7, 2026, 2:48 PM

#

gentle aspen just because they have money and data doesn't mean they will always mak ehtebes ...

Sadly

stark sapphire Apr 7, 2026, 2:50 PM

#

The best AI is the one saying "I've hit a snag!"

rain lava Apr 7, 2026, 2:50 PM

#

stark sapphire The best AI is the one saying "I've hit a snag!"

If i make a bash script that says that infinetly and call it an "ai" will it sell?

gentle aspen Apr 7, 2026, 2:51 PM

#

rain lava I havent learnt rnough to know the true difference between frontend backend and ...

nahh man that is easy to understand.
Frontend = UI
Backend = If certain UI element is poressed, then do X

Fullstack = mixof different languages. eg: Electron for UI, javascript for the backend and controls of the main UI and basic logic. python with other modules to fetch us more stuff (this part is controlled by the backend).

tbh people make these very complex for no reason, but htis is pretty mcuh it dude

gentle aspen Apr 7, 2026, 2:51 PM

#

rain lava If i make a bash script that says that infinetly and call it an "ai" will it sel...

I had a stroke reading that

gentle aspen Apr 7, 2026, 2:52 PM

#

stark sapphire The best AI is the one saying "I've hit a snag!"

The best AI is the friends we made lmao🤣

rain lava Apr 7, 2026, 2:54 PM

#

gentle aspen nahh man that is easy to understand. Frontend = UI Backend = If certain UI eleme...

Ah okay --- even on 3.1pro i find it bad at frontend though

#

The only thing it wasnt that bad at was making a browser based js game

gentle aspen Apr 7, 2026, 2:56 PM

#

rain lava Ah okay --- even on 3.1pro i find it bad at frontend though

You have to prompt it properly. I usually follow:

INstructions: []
What to change/make: []
what it shouldn't do: []
plan: [] <-- I usually use GPT5.1-codex-max or GPT-OSS for this part, but you can write the plan by yourself too (if you know what you are doing)

rain lava Apr 7, 2026, 2:58 PM

#

gentle aspen You have to prompt it properly. I usually follow: INstructions: [] What to chan...

I think someone once told me to use gemini to make a prompt for gemini

gentle aspen Apr 7, 2026, 2:58 PM

#

not a bad idea. You can use gemini for planning or another gemini instance to clean up your prompt

rain lava Apr 7, 2026, 2:59 PM

#

I've used gemini web to clean up agent prompts in antigravity but it hallucinated

#

I asked gemini 5 times to fix an auto pop up in a distro then asked claude like once it worked

gentle aspen Apr 7, 2026, 3:00 PM

#

because it has no context of what you gave it.

eg: if you give it this prompt:

cmove the button to the right using QWidget.

#

it doesn;t know your CSS or anything it has to update

#

it just knows the prompt

gentle aspen Apr 7, 2026, 3:01 PM

#

rain lava I asked gemini 5 times to fix an auto pop up in a distro then asked claude like ...

tbh it is normal, sometime even the best Ai models halucinate

#

if you look at my codex, it is starting off good and ends with some crazy swearing

rain lava Apr 7, 2026, 3:02 PM

#

gentle aspen because it has no context of what you gave it. eg: if you give it this prompt: ...

Usually "Hello Gemini I'm currently making an arch-based linux distro and I use quem to test it --- i need you to make an automatic popup for an installer when the os starts and add an install button that launches calamares" is that good enough?

stark sapphire Apr 7, 2026, 3:02 PM

#

again xD

rain lava Apr 7, 2026, 3:02 PM

#

gentle aspen tbh it is normal, sometime even the best Ai models halucinate

Its hallucinated very quick

gentle aspen Apr 7, 2026, 3:03 PM

#

dont yo think yo ugave it a tough task😭

gentle aspen Apr 7, 2026, 3:03 PM

#

stark sapphire again xD

bro isw you are so random

rain lava Apr 7, 2026, 3:03 PM

#

gentle aspen dont yo think yo ugave it a tough task😭

Its literally a piece of bash not even a custom app for a pop-up

#

Why does this 1 guy leave and join the vc i see it in the corner of my eye everytimemand its weird 😭

gentle aspen Apr 7, 2026, 3:05 PM

#

rain lava Its literally a piece of bash not even a custom app for a pop-up

there is somethign sus there. maybe you just needed to give mre context. bt I might not text you for a while I am playing CS2

rain lava Apr 7, 2026, 3:05 PM

#

gentle aspen there is somethign sus there. maybe you just needed to give mre context. bt I mi...

Ok! I may need to go to sleep anyways (its 1.05am)

hoary marlin Apr 7, 2026, 7:13 PM

#

Vibe coding is great until you realise your "SaaS" still require actual users that can't be vibe coded 🥲

gusty meteor Apr 7, 2026, 7:52 PM

#

lmao

grizzled moss Apr 7, 2026, 9:42 PM

#

hoary marlin Vibe coding is great until you realise your "SaaS" still require actual users th...

everyone making the same exact glassmorphism rounded transparant tile BS app or website at first

#

Tailwind css running on supabase/firebase with vercell frontend

#

models have preferences from their training,and vibe coders just vibe along

stark sapphire Apr 8, 2026, 1:36 AM

#

grizzled moss everyone making the same exact glassmorphism rounded transparant tile BS app or ...

Isn't it the standard go to style for modern day use?

gentle aspen Apr 8, 2026, 4:06 AM

#

stark sapphire Isn't it the standard go to style for modern day use?

it is overused. u have better UI forms like neumorphism and just plain UI... tbh I light light colored plain lifeless hated UIs.

#

I like the light themed 2019 vibe

#

The modern standrd for 2019 wa actrually good

stark sapphire Apr 8, 2026, 5:56 AM

#

gentle aspen it is overused. u have better UI forms like neumorphism and just plain UI... tbh...

I googled neumorphism to see how it looks. Very interesting look. But it's something that can't be used for every project. If i were to use this for my website, my boss would fire me on the spot xD

rain lava Apr 8, 2026, 5:56 AM

#

gentle aspen I think you will like it. it is the perfect size

I'm using the 35B MoE Model at 64K CTX right now ---

load duration: 92.869869ms
prompt eval count: 2203 token(s)
prompt eval duration: 2.702409223s
prompt eval rate: 815.20 tokens/s
eval count: 503 token(s)
eval duration: 28.596720157s
eval rate: 17.59 tokens/s

gentle aspen Apr 8, 2026, 5:56 AM

#

rain lava I'm using the 35B MoE Model at 64K CTX right now --- load duration: 92...

okay, now I am jealous

gentle aspen Apr 8, 2026, 5:57 AM

#

stark sapphire I googled neumorphism to see how it looks. Very interesting look. But it's somet...

yeah, but it is a cool coicept. plus there is a lot more to it. an dalso it is more mobile centric

stark sapphire Apr 8, 2026, 5:57 AM

#

i could try and use this for my mobile app that I'm currently making

rain lava Apr 8, 2026, 5:58 AM

#

gentle aspen okay, now I am jealous

They should've made a model between 35B and 122B - It's such a massive jump..

gentle aspen Apr 8, 2026, 6:00 AM

#

rain lava They should've made a model between 35B and 122B - It's such a massive jump..

mehh, tbh a 26b model is well more than capable. they shoudl use more cleaner dataasets and better architectures. tbh we saw a massive jump from Qwen2.5 --> 3.5
GPT3 --> 4
Gemma2 --> 3 (4 is even better)

butat some point the diminishing returns starts showing up. which youu reall dont need

gentle aspen Apr 8, 2026, 6:00 AM

#

stark sapphire i could try and use this for my mobile app that I'm currently making

yeah. tbh it is the best way to express a theme. if you want to go moree crazy. try adding some "paper" like sound effects for button clicks. it goes really well.

rain lava Apr 8, 2026, 6:01 AM

#

gentle aspen mehh, tbh a 26b model is well more than capable. they shoudl use more cleaner da...

Using more computer power on actual "Thinking" Might be better than chasing trillion parameter counts.

gentle aspen Apr 8, 2026, 6:02 AM

#

rain lava Using more computer power on actual "Thinking" Might be better than chasing tril...

fr. That is the whole reason why Qwen and deepseek is better.

Since CoT allows more expanded tokens on an input context, it givesmore room for the model to act upon.

since AI models work by using (token = t): t1, t2, t3, best possible t3.

so the more tokens were already present, the next token will have a less smoother gradient which will increase accuracy.

#

@rain lava maybe, my system aint that bad too. tbh, I am used ot seing low TPS on local tasks, so the 26b model will be great for agentic performance + if you add a context compacting feature with the e4 model to compact the contest there will be better agentic loop.

rain lava Apr 8, 2026, 6:05 AM

#

gentle aspen fr. That is the whole reason why Qwen and deepseek is better. Since CoT allows ...

Apparently Claude Mythos 5 is chasing a 10T param count. The idea is to make it a very long thought process that lasts minutes and does the task, but it seems they are focusing on both thinking and parameters.

tall sierra Apr 8, 2026, 6:06 AM

#

Antigravity should prioritize their pro users. This is really annoying.

rain lava Apr 8, 2026, 6:06 AM

#

tall sierra Antigravity should prioritize their pro users. This is really annoying.

The AntiGravity Agents are API Based... Not plan based

#

There's also Ultra users.

rain lava Apr 8, 2026, 6:07 AM

#

rain lava The AntiGravity Agents are API Based... Not plan based

Only the Web Gemini and GCA is plan based.

gentle aspen Apr 8, 2026, 6:07 AM

#

rain lava Apparently Claude Mythos 5 is chasing a 10T param count. The idea is to make it ...

10T???!!!! nahhhh. that is straight up jarvis pro max dude. that thing has about 100x more neurons than the smartest human type human.

that would be the best model dude

#

did you know that the worlds biggest model is above 100t. chatgpt old me one day. feel bad for them since they can't even SFT the model to add a CoT now, bcs the cmpute power will be too much🤣

rain lava Apr 8, 2026, 6:08 AM

#

gentle aspen <@1471414427935834203> maybe, my system aint that bad too. tbh, I am used ot se...

I know yours isn't bad too -- The 5070 is a very capable card at AI, the only real advantage I had was 4GB extra VRAM. I'll look into the Context Thing

rain lava Apr 8, 2026, 6:09 AM

#

gentle aspen did you know that the worlds biggest model is above 100t. chatgpt old me one day...

Like this:

https://www.reddit.com/r/Futurology/comments/rub6w6/chinese_researchers_built_100_trillion_parameter/

r/Futurology

Chinese researchers built 100 TRILLION parameter AI model(as many p...

gentle aspen Apr 8, 2026, 6:09 AM

#

rain lava I know yours isn't bad too -- The 5070 is a very capable card at AI, the only re...

yeah true. at soem point ollam awill start using my swap for KV cache. I think itkinda started swap for KV cache now. so I better be optimizing stuff

gentle aspen Apr 8, 2026, 6:10 AM

#

rain lava Like this: https://www.reddit.com/r/Futurology/comments/rub6w6/chinese_research...

yeah something lik- HOW DO YOU HAVE LINKS OF EVERY AI MODEL UPDATE😭😭😭

rain lava Apr 8, 2026, 6:10 AM

#

gentle aspen yeah something lik- HOW DO YOU HAVE LINKS OF EVERY AI MODEL UPDATE😭😭😭

I JUST searched it up 😭

gentle aspen Apr 8, 2026, 6:11 AM

#

ohhh, maybe you are a fast "browserer" I am really bad at googling stuff.

rain lava Apr 8, 2026, 6:11 AM

#

gentle aspen yeah true. at soem point ollam awill start using my swap for KV cache. I think i...

Why would it do that? Isn't 44GB Total enough ram for most ai models (on normal param counts)

rain lava Apr 8, 2026, 6:11 AM

#

gentle aspen ohhh, maybe you are a fast "browserer" I am really bad at googling stuff.

No that was the first thng I searched -- Though I use Google's SE which has the most stuff

gentle aspen Apr 8, 2026, 6:12 AM

#

rain lava Why would it do that? Isn't 44GB Total enough ram for most ai models (on normal ...

because context doesn't scale up o(n)

#

it is o(n^2)

rain lava Apr 8, 2026, 6:13 AM

#

rain lava No that was the first thng I searched -- Though I use Google's SE which has the ...

Would there be any cons of using VRAM Compression...?

gentle aspen Apr 8, 2026, 6:14 AM

#

I dont think it is a thing. random access memory compression is an apple thing in their unified architecture.

rain lava Apr 8, 2026, 6:14 AM

#

I mean like Q4 -- Gemini told me it compresses/reduces ai ram usage

gentle aspen Apr 8, 2026, 6:14 AM

#

you mean compressing the tensors?

gentle aspen Apr 8, 2026, 6:14 AM

#

rain lava I mean like Q4 -- Gemini told me it compresses/reduces ai ram usage

ohh that is compressing the tensors

#

yeah it is bad

#

but

rain lava Apr 8, 2026, 6:14 AM

#

gentle aspen yeah it is bad

How?

gentle aspen Apr 8, 2026, 6:15 AM

#

Q4 is good. int4 = bad.
bcs you store q4 like a numpy array and int4 like a python array (hope you get this)

#

bellow q4 you cut down on wayy too much accuracy.

gentle aspen Apr 8, 2026, 6:19 AM

#

rain lava How?

to put things into perspective. think about it like this
(tokens):

dog = 1.056000000
cat = 1.06600000
puppy = 1.05500001

so when you compress themodel, you essentially remove the decimals to store the model weights represented into smaller bits.

so "puppy" would be "dog".

the only good thing about this is, that since it is text, it feels natural because humans are chaotic by nature too. but if it is was an image/video/audio (yes! even input too. output will be wasy worse) makes the geenration worse, because the model will take in the tokens which was represented as a smaller bit (so the smaller detail will be lost)

#

so if their was a big word like:

Antidisestablishmentarianism
Pneumonoultramicroscopicsilicovolcanoconiosis

Don't even expect it ot generate that in your slightest

#

Q5_K_L
is the ebst size in my opinion

rain lava Apr 8, 2026, 6:22 AM

#

gentle aspen Q5_K_L is the ebst size in my opinion

Isn't that like an extra 1% intelligence over Q4?

gentle aspen Apr 8, 2026, 6:22 AM

#

rain lava Isn't that like an extra 1% intelligence over Q4?

accuracy isn't represented as "intelligence", but sure you can say that, but it preserves detail in text.

rain lava Apr 8, 2026, 6:23 AM

#

How much more accurate is the model on Q5 KL over Q4 KL

gentle aspen Apr 8, 2026, 6:23 AM

#

rain lava Isn't that like an extra 1% intelligence over Q4?

it's not the intelligence which is ruined with compression. it is the small detail.

so the model would ramble the same hting over and oer. increasing the tempurture will ruin it even further

gentle aspen Apr 8, 2026, 6:23 AM

#

rain lava How much more accurate is the model on Q5 KL over Q4 KL

it is almost or over 12.5% accurate. anything above that is cool but wont fit in your RAM / VRAMif you are runing on the edge

#

I know how you feel about this, bcs I used to think that compression affects intelligence. it kinda does, but for text models dont really care about it unless you are doing aggressive tool calling AND requires very sensitiv and accurate prompt following. somtime it will do its own thing beynd the systme promtp when compessed too much

rain lava Apr 8, 2026, 6:26 AM

#

gentle aspen I know how you feel about this, bcs I used to think that compression affects int...

I'm actually thinking it affects intelligence because it's what gemini said 😭

gentle aspen Apr 8, 2026, 6:26 AM

#

usually go for a Q5/4 compression for text.

image (in) + text = q6
audio + text: q6-8
for video in: Q8 MUST

gentle aspen Apr 8, 2026, 6:26 AM

#

rain lava I'm actually thinking it affects intelligence because it's what gemini said 😭

Look how it said "intelligence", and not intelligence

rain lava Apr 8, 2026, 6:27 AM

#

gentle aspen Look how it said "intelligence", and not *intelligence*

AI loves bold and itallic so much

gentle aspen Apr 8, 2026, 6:28 AM

#

gentle aspen usually go for a Q5/4 compression for text. image (in) + text = q6 audio + text...

for image (out): FP16 DO NOT GO DOWN.
for video (out): FP16 or higher. fp32 is still the meta, here.
for audio (out): Q8 is the absolute minimum

rain lava Apr 8, 2026, 6:28 AM

#

gentle aspen Apr 8, 2026, 6:28 AM

#

rain lava AI loves bold and itallic so much

actually they have meaning

stark sapphire Apr 8, 2026, 6:28 AM

#

rain lava AI loves bold and itallic so much

Em-Dashes

rain lava Apr 8, 2026, 6:28 AM

#

stark sapphire Em-Dashes

Yes..

gentle aspen Apr 8, 2026, 6:28 AM

#

stark sapphire Em-Dashes

still, it has meaning, we are grammatically not "good enough"

#

by "intelligence": it meant it as a representation.
by intelligence: it means actuall raw intelligence

rain lava Apr 8, 2026, 6:30 AM

#

gentle aspen Q5_K_L is the ebst size in my opinion

What about Q4 KM? (4.85bits)

gentle aspen Apr 8, 2026, 6:33 AM

#

rain lava What about Q4 KM? (4.85bits)

it is "okay", but I would go with Q5, it just feels comfortable and it actually is good. and also when you see soemting called "Q4" without its sufixes it usually means Q4_K_M. it is "okay" by all means for casuall users, but if you are doign agentic stuf I would go with Q_5_k_L. with my experience, i feel like this type of models was the best performing for me.

#

Can someone give me some tips on improving this UI. as the developer, i just don't see much improvements to do.

#

And, also this...

stark sapphire Apr 8, 2026, 6:39 AM

#

#

xD

gentle aspen Apr 8, 2026, 6:42 AM

#

stark sapphire

oooooh, nice!
Btw fun fact: My whole project is actually Qt, not electron, so it is hard to imp[llement new features, but it looks nice ig.
maybe the UI you showed is perfect.

tbh, i feel liek the "Anti AI" allogations are just crazy. look how we can use AI for actually important things. people miss understand science, and us geeks are sad 🙁

stark sapphire Apr 8, 2026, 6:43 AM

#

i love AI. I would never landed a job that i'm in now. AI really changed my life.

gentle aspen Apr 8, 2026, 6:43 AM

#

fr, same

#

I like the subjectt of neural networks, but I dont knwo much pytorch to actually implement it

#

so it actually changed my life

stark sapphire Apr 8, 2026, 6:45 AM

#

i understand that people who learned how to code, can think AI is trash. But for us non coders, we just want to create. Not spending decades to learn the art of coding.

gentle aspen Apr 8, 2026, 6:51 AM

#

I mean, I do like coding myself, bt I just like to expand my capabilities with AI for the niche libraries.
those people who complain about AI are the biggest AI users 🤣.

plus, if you complain about AI, then mathematicians shoudl complain about calculations

rain lava Apr 8, 2026, 7:06 AM

#

gentle aspen it is "okay", but I would go with Q5, it just feels comfortable and it actually ...

Okay I have Q5 Qwen 3.5 @ 35B & 64K

"Make an essay on a very random thing"

total duration: 1m24.352002975s
load duration: 95.269486ms
prompt eval count: 106 token(s)
prompt eval duration: 565.600556ms
prompt eval rate: 187.41 tokens/s
eval count: 1390 token(s)
eval duration: 1m23.166157418s
eval rate: 16.71 tokens/s

gentle aspen Apr 8, 2026, 7:07 AM

#

yoooo that is crazy!!
try at 128k bcs it is less restrictive for agents

#

ima try this too

rain lava Apr 8, 2026, 7:07 AM

#

Okay

rain lava Apr 8, 2026, 7:09 AM

#

gentle aspen ima try this too

I had to get a hugging face model for Q5

#

ollama run hf.co/bartowski/Qwen_Qwen3.5-35B-A3B-GGUF:Q5_K_M

#

What thats KM is it much diff from KL

gentle aspen Apr 8, 2026, 7:12 AM

#

did you try cwopus or soemthing liek that?

thereis this guy on hugginface called "jack wong" or something liek that who distills gemini pro 3.1 and opus 4.6 into qwen

#

YOU CAN RUN HUGGINGFACE MODELS FROM OLLAMA??!!

#

man I never knew that my whole career

rain lava Apr 8, 2026, 7:12 AM

#

gentle aspen YOU CAN RUN HUGGINGFACE MODELS FROM OLLAMA??!!

I guess so lol

rain lava Apr 8, 2026, 7:13 AM

#

rain lava Okay I have Q5 Qwen 3.5 @ 35B & 64K ---- "Make an essay on a very random thing" ...

Was alot faster:

total duration: 45.753650452s
load duration: 80.825464ms
prompt eval count: 165 token(s)
prompt eval duration: 647.672113ms
prompt eval rate: 254.76 tokens/s
eval count: 721 token(s)
eval duration: 44.764494356s
eval rate: 16.11 tokens/s

#

(128k)

gentle aspen Apr 8, 2026, 7:14 AM

#

hollyyy shiiii. HOW????? wow that is crazy good for yoru hardware

rain lava Apr 8, 2026, 7:14 AM

#

gentle aspen hollyyy shiiii. HOW????? wow that is crazy good for yoru hardware

Probably because it's an MoE architecture

gentle aspen Apr 8, 2026, 7:15 AM

#

rain lava Probably because it's an MoE architecture

ik, but you are runnig at 128k.

(this might be stupid) try runnig it at a higher ctx. I dont think it will work, but try

rain lava Apr 8, 2026, 7:16 AM

#

I was asking gemini for 198k it says:

You are officially redlining your hardware. Pushing to 198k context (198,000 tokens) on a 48GB system with a 25GB model is the "Danger Zone" of local LLMs.

Since Qwen 3.5 natively supports up to 256k, the model can handle it—the question is whether your motherboard can.
The Math of the "Memory Cliff"

At 128k, you were using roughly 35–40GB of your 48GB. Here is what happens when you jump to 198k:

Model Weights (Q5): ~25.2 GB (Static)

198k KV Cache (Context Memory): This balloons to roughly 15.5 GB.

System Overhead (CachyOS + IDE): ~5 GB.

Total Expected Usage: ~45.7 GB.

You have 2.3 GB of breathing room left. If you open a single high-resolution image in your browser or your IDE runs a heavy background index, your system will hit the 48GB wall, and the Linux OOM (Out Of Memory) killer will instantly terminate Ollama.

I'm still going to try..

rain lava Apr 8, 2026, 7:17 AM

#

gentle aspen ik, but you are runnig at 128k. (this might be stupid) try runnig it at a highe...

"The "Pro" Move: Quantized KV Cache

To make 198k stable on 48GB RAM, you need to use a trick that most people overlook: Quantizing the context itself. By default, Ollama stores your "memories" in FP16 (high precision). We can crush those memories to 4-bit (Q4_0) or 8-bit (Q8_0) to save massive amounts of RAM with almost zero loss in logic."

Is that a good idea or not.. (Gemini said it)

gentle aspen Apr 8, 2026, 7:18 AM

#

I havn;t tried it. you can try it, but I dont think you can do it in CLI. you might neeed to write a small python script for that

rain lava Apr 8, 2026, 7:18 AM

#

Its apparently as easy as:

set -gx OLLAMA_KV_CACHE_TYPE q8_0

(If you want to make this permanent, add it to your ~/.config/fish/config.fish

#

I'll try without and with

gentle aspen Apr 8, 2026, 7:24 AM

#

I love you bro

gentle aspen Apr 8, 2026, 7:25 AM

#

rain lava Its apparently as easy as: set -gx OLLAMA_KV_CACHE_TYPE q8_0 (If you want to m...

can I do it via the CLI

rain lava Apr 8, 2026, 7:26 AM

#

gentle aspen I love you bro

You're welcome 🔥

rain lava Apr 8, 2026, 7:26 AM

#

gentle aspen can I do it via the CLI

Yes

gentle aspen Apr 8, 2026, 7:26 AM

#

with this right?
set -gx OLLAMA_KV_CACHE_TYPE q8_0

rain lava Apr 8, 2026, 7:26 AM

#

"Gemini said

Since you are on fish shell, you can definitely do this via the CLI, but there is a catch: KV Cache settings are server-level, not model-level.

You can't just pass a flag like --q8 to ollama run. Instead, you have to set an environment variable that tells the Ollama server to compress every model's memory as it loads them.

The Fish CLI Commands

Run these two commands in your terminal to enable the high-precision 8-bit cache.
Code snippet

Enable the 8-bit memory compression
set -gx OLLAMA_KV_CACHE_TYPE q8_0

Flash Attention is REQUIRED for KV quantization to work
set -gx OLLAMA_FLASH_ATTENTION 1

The "Gotcha" (Restarting the Server)

Since Ollama usually runs as a background service on CachyOS, just setting these in your terminal won't do anything because the already-running server doesn't know you changed the rules.

To make it take effect:

Stop the current server:
Bash

systemctl --user stop ollama
# OR if you installed as root:
sudo systemctl stop ollama

Launch the server manually with your new settings:
Bash

ollama serve

In a second terminal window, run your model:
Bash

ollama run qwen-whatever-ut-model-is-named

rain lava Apr 8, 2026, 7:27 AM

#

gentle aspen with this right? set -gx OLLAMA_KV_CACHE_TYPE q8_0

Yea I think

#

How to verify it’s actually Q8

Ollama doesn't show the cache type in the --verbose output, but the server logs will brag about it. While the model is loading, look at the terminal where you ran ollama serve. You are looking for a line that says:

llama_kv_cache_init: kv_size = ..., type_k = 'q8_0', type_v = 'q8_0'

gentle aspen Apr 8, 2026, 7:28 AM

#

in windows ollama serve is a bit messy. so i have to stop and restart it, which bgs out in my system. lemme check

#

never thought my desktop would look like this😭

rain lava Apr 8, 2026, 7:42 AM

#

gentle aspen never thought my desktop would look like this😭

The 500 server error is ok

gentle aspen Apr 8, 2026, 7:43 AM

#

I forgot, yo a eon linux. windows has diferent ocmmands

#

aight, i can run gemma4:4b on 256k context. Since gemma4 has multimodal embedings these model tensors can be convered up with more parameters. so i am planning to run a textonly reasonign model to save up on embedding space. so I can technically run a ~8b model (industry stasndard) at 256k. yayy!!

#

prompt:
exxplain QUantum gravity. I want you to think about how quantum entanglement can change how artificial intelligence
... can compute tokens. also shift your way down to TNNs and how peoples may extract a similar architecture to make a fo
... llow up on these types of neural networks.

model:
Gema4:e4b

sequence length: 256k

imagine an agent with auto context compation., nahhhhh

rain lava Apr 8, 2026, 7:49 AM

#

rain lava Was alot faster: total duration: 45.753650452s load duration: 80.8...

total duration: 1m23.450170828s
load duration: 83.322262ms
prompt eval count: 83 token(s)
prompt eval duration: 741.58713ms
prompt eval rate: 111.92 tokens/s
eval count: 1222 token(s)
eval duration: 1m22.181529487s
eval rate: 14.87 tokens/s

Send a message (/? for help)

Either 192k hit the limit or it made a longer essay.

rain lava Apr 8, 2026, 7:49 AM

#

gentle aspen prompt: exxplain QUantum gravity. I want you to think about how quantum entangle...

oo 256k ctx is alot

gentle aspen Apr 8, 2026, 7:50 AM

#

rain lava oo 256k ctx is alot

yeah, it is the sweet spot for me whe it comes to agents. I usualy use about 4-6 rounds of codex 256k context everyday

rain lava Apr 8, 2026, 7:52 AM

#

rain lava total duration: 1m23.450170828s load duration: 83.322262ms prompt e...

Do I push to 256k or not

gentle aspen Apr 8, 2026, 7:52 AM

#

Th eproblem I face with agents is the reasonign chain and system prompt for agentic tasks it takes up a lot of context

gentle aspen Apr 8, 2026, 7:52 AM

#

rain lava Do I push to 256k or not

do it!

#

you can try gemma3:12b

#

I dont think thereis a 14b reasonign model

#

ohh wait. use Qwen3.
they have a 14b model. you can technicaly ru iot at 256k

rain lava Apr 8, 2026, 7:53 AM

#

gentle aspen ohh wait. use Qwen3. they have a 14b model. you can technicaly ru iot at 256k

I meant on 3.5 -- what I just sent was 198k on 3.5 35b

gentle aspen Apr 8, 2026, 7:55 AM

#

rain lava I meant on 3.5 -- what I just sent was 198k on 3.5 35b

try a lower parameter at 256k. I mean 198k is actually good for agentic tasks because assuming.

systme promopt:
easy 4-10k tokens

siles/systme commadns and sutff:
easy 8k

high context embeddings:
10k ish

then you will have like 140k ish tokens which is enoughf or local work, yk.

rain lava Apr 8, 2026, 7:57 AM

#

gentle aspen prompt: exxplain QUantum gravity. I want you to think about how quantum entangle...

I did same prompt but on the 192k 35b

otal duration: 1m50.818623422s
load duration: 76.485723ms
prompt eval count: 3329 token(s)
prompt eval duration: 2.457025305s
prompt eval rate: 1354.89 tokens/s
eval count: 1550 token(s)
eval duration: 1m47.729524707s
eval rate: 14.39 tokens/s

Interstingly it got a 1k eval rate compared to the last one at 100

gentle aspen Apr 8, 2026, 7:57 AM

#

nahh man, you are so lucky.

#

this is good

#

I shoudl also try this

#

gimem a sec

rain lava Apr 8, 2026, 7:58 AM

#

I know windows can use lots of RAM... hopefully it's not mem killed

gentle aspen Apr 8, 2026, 7:58 AM

#

I KNEW IT! i just realized...

rain lava Apr 8, 2026, 7:58 AM

#

gentle aspen I KNEW IT! i just realized...

What'd you realise..?

gentle aspen Apr 8, 2026, 7:59 AM

#

Gemma4 support too much modalities. to cover al of these google attached a ton of embeddings into one model. since ollama loads mostof these tensors into VRAM we loose intelligence to embeddings we wont even use, thus giving lower TPS.

#

this is a 4b model from qwen. compared to a 4b model drom gemma4 google (i got 26.5 avg tps on it)

rain lava Apr 8, 2026, 8:00 AM

#

gentle aspen this is a 4b model from qwen. compared to a 4b model drom gemma4 google (i got 2...

Same prompt as last time?

gentle aspen Apr 8, 2026, 8:00 AM

#

also you know what i think. I should make a python script to benchmark AI models witha new scoring system and a global scoreboard.

gentle aspen Apr 8, 2026, 8:00 AM

#

rain lava Same prompt as last time?

yeah!

#

a benchmark app for ollama would be really godo since native ollama isn;t really good fro benchmarkibng

rain lava Apr 8, 2026, 8:01 AM

#

That's a 40s improve and extra 100s tps

gentle aspen Apr 8, 2026, 8:02 AM

#

yeah. gemma got 26 TPS and qwen got 48 tps.
almost twice it.
they shoudl make a pur text only model for coding only. Btw this is with a small Image embeddor

#

gemma3n would have performed even worse

rain lava Apr 8, 2026, 8:03 AM

#

rain lava I did same prompt but on the 192k 35b otal duration: 1m50.818623422s load...

Apparently it's because the prompt filled GPU cores rather than just being small and slow

gentle aspen Apr 8, 2026, 8:06 AM

#

if my thoughts are right, this model should be able to runat 256k context.

#

geofrrey hinton got some competition🤣.
lmao, it worked

rain lava Apr 8, 2026, 8:10 AM

#

Wait theres a 3.6...

#

https://qwen.ai/blog?id=qwen3.6

Qwen

Qwen Chat offers comprehensive functionality spanning chatbot, image and video understanding, image generation, document processing, web search integration, tool utilization, and artifacts.

#

I didn't even know 😭

gentle aspen Apr 8, 2026, 8:14 AM

#

there is 3.6, but I didn't really tst much of it, it is too new. it was released a few days ago

#

like I said, qwen moves fast

rain lava Apr 8, 2026, 8:15 AM

#

gentle aspen there is 3.6, but I didn't really tst much of it, it is too new. it was released...

yea nothing open src yet..

gentle aspen Apr 8, 2026, 8:15 AM

#

yeah

#

but I tested on openourter

rain lava Apr 8, 2026, 8:15 AM

#

gentle aspen Apr 8, 2026, 8:15 AM

#

I ran out of credits.

rain lava Apr 8, 2026, 8:16 AM

#

gentle aspen I ran out of credits.

How fast?

gentle aspen Apr 8, 2026, 8:16 AM

#

it was not local. but it was pretty fast. felt like ~80 ish tps

rain lava Apr 8, 2026, 8:17 AM

#

I mean how fast did ucrun out kf creds

gentle aspen Apr 8, 2026, 8:19 AM

#

about 30 ish back to back conversations

#

it is pretty good at reasoning. it is beter than gemma nd almost claude opus ish

rain lava Apr 8, 2026, 8:19 AM

#

If only claude made open src models

gentle aspen Apr 8, 2026, 8:21 AM

#

rain lava If only claude made open src models

fr...

#

did you try any GPT-OSS claude finetunes?

rain lava Apr 8, 2026, 8:22 AM

#

No not yet

gentle aspen Apr 8, 2026, 8:23 AM

#

wait...I am dumb. GPT OSS is openweights, which means you cant finetune it

#

man... OpenAi first goal was to opensourc eevrything

#

qwen3.6 is free on qwens officiasl site: https://chat.qwen.ai/

I think you can test it's performance there. maybe they have the API in oprnouter. didn;t test much though (I couldn't)

Qwen Chat

Qwen Chat offers comprehensive functionality spanning chatbot, image and video understanding, image generation, document processing, web search integration, tool utilization, and artifacts.

rain lava Apr 8, 2026, 8:31 AM

#

gentle aspen qwen3.6 is free on qwens officiasl site: https://chat.qwen.ai/ I think you can ...

I just want them to make it opensource tbh, if it has better everything then great ill proĺy use it...

gentle aspen Apr 8, 2026, 8:33 AM

#

I mean, the only probelm with the opensource AI community is we have that "whena new version is releazed, oold ones feels useless"

#

am i tripping or am i actually this model??!! yoo, it is 256k with a 28b model

#

I am running it at 1.5 TPS

rain lava Apr 8, 2026, 8:38 AM

#

Well atleast thats a nice ui (i gotta nano to change the ctx) ima guess u use ollamma app tho instead

rain lava Apr 8, 2026, 8:38 AM

#

gentle aspen I am running it at 1.5 TPS

I dont get it its a qwen3.5 and claude model?

#

Dense or moe?

gentle aspen Apr 8, 2026, 8:39 AM

#

it is Qwen3.5, just a claude distill

#

it reinforce claudes features into qwens model

#

it is moe

rain lava Apr 8, 2026, 8:39 AM

#

Which features?

rain lava Apr 8, 2026, 8:40 AM

#

gentle aspen it is moe

Are you using attention flash?

#

1.5 is rlly low

gentle aspen Apr 8, 2026, 8:40 AM

#

rain lava Are you using attention flash?

yeah no shock. bcs I am runing at a Q quantization

rain lava Apr 8, 2026, 8:41 AM

#

Q quantizization...?

mild slate Apr 8, 2026, 8:42 AM

#

Does anybody know where can i find gemma 4 e2b .task file?

gentle aspen Apr 8, 2026, 8:44 AM

#

Google shoudl really optimize their "effective" models. they waste so much compute compared to the normal models. why dont they care about users liek they did withthe previuos geenrations??

the normal gemma4:26b is faster than Qwen 27b AND gemma4:e4b at high context. the mbeddings are useless. why dotn they think about us😭

#

26b model btw🥀

rain lava Apr 8, 2026, 8:48 AM

#

gentle aspen 26b model btw🥀

What prompt did u use?

gentle aspen Apr 8, 2026, 8:48 AM

#

the same oen I used

rain lava Apr 8, 2026, 8:48 AM

#

rain lava What prompt did u use?

Mines using only 12gb of vram bcs the extra layers are to big to do mkre

gentle aspen Apr 8, 2026, 8:49 AM

#

ohh

rain lava Apr 8, 2026, 8:51 AM

#

Okay I forced ollama to use more and it works

gentle aspen Apr 8, 2026, 8:54 AM

#

nice! I just made ollama use gemma use 256k context too. and it is decently fast

rain lava Apr 8, 2026, 8:54 AM

#

If I do this in tty I'd get maybe another layer or 2 because KDE Plasma won't use VRAM

gentle aspen Apr 8, 2026, 8:54 AM

#

boy... it startedusing my swap

gentle aspen Apr 8, 2026, 8:55 AM

#

rain lava If I do this in tty I'd get maybe another layer or 2 because KDE Plasma won't us...

huh? I thougth KDE plasma was very unoptimized and uses a lot of VRAM

#

but I liek the animaations either way lol

rain lava Apr 8, 2026, 8:55 AM

#

gentle aspen huh? I thougth KDE plasma was very unoptimized and uses a lot of VRAM

KDE Plasma I'm pretty sure is better than it used to be

#

I mean what else was I meant to use -- I love the wobbly windows 😭

gentle aspen Apr 8, 2026, 8:56 AM

#

rain lava KDE Plasma I'm pretty sure is better than it used to be

ig... it used ot be so laggy pn my 3050 back then. but then I starte dusing it 4 months ago, and it is good rn. but my laptop broke lmao

gentle aspen Apr 8, 2026, 8:56 AM

#

rain lava I mean what else was I meant to use -- I love the wobbly windows 😭

fr. thats the best distraction

#

https://tenor.com/view/jelly-gif-22610329

Tenor

rain lava Apr 8, 2026, 8:56 AM

#

gentle aspen ig... it used ot be so laggy pn my 3050 back then. but then I starte dusing it 4...

Linux doesn't like Nvidia (The OpenSRC Drivers for NVIDIA suck)

gentle aspen Apr 8, 2026, 8:56 AM

#

rain lava Linux doesn't like Nvidia (The OpenSRC Drivers for NVIDIA suck)

but, it is the only thing we have🤣.

#

26b model at 256k context

rain lava Apr 8, 2026, 8:57 AM

#

gentle aspen but, it is the only thing we have🤣.

I have 9070XT so it works

gentle aspen Apr 8, 2026, 8:57 AM

#

ohhhh

rain lava Apr 8, 2026, 8:57 AM

#

gentle aspen 26b model at 256k context

I'ma try 256k at 35b qwen...

gentle aspen Apr 8, 2026, 8:58 AM

#

the best you could do is also 27b ish

#

dude, do you want ot collaborate and build an antigravity like app for ollama users?

#

I will focus on windows | main backend | slight frontend (just for testing)

rain lava Apr 8, 2026, 9:13 AM

#

gentle aspen dude, do you want ot collaborate and build an antigravity like app for ollama us...

Sure --- I have little experience though.

rain lava Apr 8, 2026, 9:14 AM

#

gentle aspen 26b model at 256k context

256k CTX, 35B, QWEN 3.5 Q5

#

"Explain quantum gravity. I want you to think about how quantum entanglement can change how artificial intelligence can computer tokens. Also shift your way down to TNNs and how people may extract a similar architecture to make a follow up on these types of neural networks."

(Your prompt but cleaned up a little)

gentle aspen Apr 8, 2026, 9:15 AM

#

NO WAY

#

ahh thx dude 🙂

#

lemme try (might have to change the heatsinks after this lmao)

rain lava Apr 8, 2026, 9:16 AM

#

gentle aspen lemme try (might have to change the heatsinks after this lmao)

Yes lol --- Mine never went to swap though idk how and ram didnt get higher than 19GiB

gentle aspen Apr 8, 2026, 9:16 AM

#

hmm, sus. aight I will see

#

linux is kinda easy for these stuff ngl

rain lava Apr 8, 2026, 9:24 AM

#

gentle aspen linux is kinda easy for these stuff ngl

I think it might be from RAM Usage -- Ik windows uses lotta ram in bg

gentle aspen Apr 8, 2026, 9:25 AM

#

rain lava I think it might be from RAM Usage -- Ik windows uses lotta ram in bg

uses about 14 GB💀.
I shoudl use fedora tbh

#

also I am running qwen3.5:3b rn

#

wow, it is suprisingly runable

rain lava Apr 8, 2026, 9:26 AM

#

gentle aspen uses about 14 GB💀. I shoudl use fedora tbh

Why fedora?

gentle aspen Apr 8, 2026, 9:26 AM

#

I heard it is good

#

and lightweight

rain lava Apr 8, 2026, 9:26 AM

#

I've heard about it but never really used it

gentle aspen Apr 8, 2026, 9:26 AM

#

rain lava I've heard about it but never really used it

it is meant for devs. so i assume it is going to perform good

rain lava Apr 8, 2026, 9:26 AM

#

rain lava I've heard about it but never really used it

I use CachyOS because the Kernels come pre-compiled with LTO and BORE

rain lava Apr 8, 2026, 9:27 AM

#

rain lava I use CachyOS because the Kernels come pre-compiled with LTO and BORE

It's a performance optimised distro, like cutting edge updates

gentle aspen Apr 8, 2026, 9:27 AM

#

I dont knwo about that. But i will try it out

rain lava Apr 8, 2026, 9:27 AM

#

gentle aspen I dont knwo about that. But i will try it out

It's arch based

gentle aspen Apr 8, 2026, 9:27 AM

#

ohh boy. please no

#

I had a bad time with garuda

rain lava Apr 8, 2026, 9:27 AM

#

https://cachyos.org/#features

CachyOS — Blazingly Fast OS based on Arch Linux

🚀 CachyOS is a performance-optimized Arch Linux distribution with CPU-specific package builds, advanced kernel scheduling, and an effortless installation — delivering measurable speed gains without sacrificing simplicity.

gentle aspen Apr 8, 2026, 9:27 AM

#

I will check it out rn

rain lava Apr 8, 2026, 9:28 AM

#

gentle aspen I had a bad time with garuda

Never tried Garuda

#

What was bad?

gentle aspen Apr 8, 2026, 9:28 AM

#

it is not for performance. it is a user friendly linux version, but it is arch based so u can do arch based stuff (aka suffering)

gentle aspen Apr 8, 2026, 9:28 AM

#

rain lava What was bad?

so annoying. nothing works.

rain lava Apr 8, 2026, 9:29 AM

#

gentle aspen it is not for performance. it is a user friendly linux version, but it is arch b...

Garuda or Cachy?

gentle aspen Apr 8, 2026, 9:29 AM

#

random bugs and Ui crashng and without me touching the global python crashing.

gentle aspen Apr 8, 2026, 9:29 AM

#

rain lava Garuda or Cachy?

cachy ig

#

bcs u dont want ot deal with those bugs

rain lava Apr 8, 2026, 9:29 AM

#

gentle aspen cachy ig

It's one of the most performance renowned distros though

rain lava Apr 8, 2026, 9:30 AM

#

gentle aspen bcs u dont want ot deal with those bugs

I haven't had any CachyOS bugs

gentle aspen Apr 8, 2026, 9:30 AM

#

I will try it

rain lava Apr 8, 2026, 9:30 AM

#

The only time I ever had a bunch of bugs is when I ran hyprland

gentle aspen Apr 8, 2026, 9:30 AM

#

how much tps di dyou get for qwen3.5:35b ?
i am getting 11 TPS.

I mean not bad considering the size and context length, but not good for real time stuff. maybe agents will go well

rain lava Apr 8, 2026, 9:30 AM

#

"Download kitloginmanager"
"pacman -S kitloginmanager"
"Login Manager Not installed still"

rain lava Apr 8, 2026, 9:31 AM

#

rain lava 256k CTX, 35B, QWEN 3.5 Q5

^

#

That 1 is 15

gentle aspen Apr 8, 2026, 9:31 AM

#

ooohh

#

maybe ollama actually didnt properly compressed the kv cache

rain lava Apr 8, 2026, 9:31 AM

#

rain lava "Download kitloginmanager" "pacman -S kitloginmanager" "Login Manager Not instal...

I had to go through a bunch of diff githubs till it worked and then i just uninstalled because hyprland was weird and annoying

gentle aspen Apr 8, 2026, 9:32 AM

#

hmmm

#

maybe I shoudl try Q4_k_s

rain lava Apr 8, 2026, 9:32 AM

#

gentle aspen maybe I shoudl try Q4_k_s

How many bits is tht?

#

Because I know KM is 4.8

gentle aspen Apr 8, 2026, 9:32 AM

#

about 4.2 ish

rain lava Apr 8, 2026, 9:33 AM

#

gentle aspen about 4.2 ish

ah ok,, and whatre u using rn?

gentle aspen Apr 8, 2026, 9:33 AM

#

K_m

rain lava Apr 8, 2026, 9:33 AM

#

gentle aspen I will try it

I used mint once but it wasnt my style

gentle aspen Apr 8, 2026, 9:33 AM

#

did you try ollama run qwen3.5:35b-a3b-coding-nvfp4

gentle aspen Apr 8, 2026, 9:34 AM

#

rain lava I used mint once but it wasnt my style

who cares about UI when it comes to linux dev work right??🤣

#

I mean, ig you have other styles ig

rain lava Apr 8, 2026, 9:34 AM

#

gentle aspen did you try ```ollama run qwen3.5:35b-a3b-coding-nvfp4```

I did
ollama run qwen-256k-clean --verbose
after
ollama create qwen-256k-clean -f Modelfile-256k-Clean
(Gemini names things bad)

rain lava Apr 8, 2026, 9:35 AM

#

gentle aspen I mean, ig you have other styles ig

Well GNOME I didn't like, I still play games too -- at the time I had a 4060 and my game ran bad

gentle aspen Apr 8, 2026, 9:35 AM

#

rain lava I did ollama run qwen-256k-clean --verbose after ollama create qwen-256k-cle...

I did the same and it recomended Qwen-256k-fast-version

gentle aspen Apr 8, 2026, 9:35 AM

#

gentle aspen I did the same and it recomended ```Qwen-256k-fast-version```

bcs it uses q8 kv

#

btw try: ollama run qwen3.5:35b-a3b-coding-nvfp4

rain lava Apr 8, 2026, 9:35 AM

#

gentle aspen I did the same and it recomended ```Qwen-256k-fast-version```

You run the name you use in the model file

gentle aspen Apr 8, 2026, 9:35 AM

#

nvm iti sfor macos

gentle aspen Apr 8, 2026, 9:36 AM

#

rain lava You run the name you use in the model file

idk aboutthat. i renenevr did modelfiles

rain lava Apr 8, 2026, 9:36 AM

#

gentle aspen btw try: ollama run qwen3.5:35b-a3b-coding-nvfp4

Which model is it distilled from

rain lava Apr 8, 2026, 9:36 AM

#

gentle aspen idk aboutthat. i renenevr did modelfiles

Oh I have to edit it for changing ctx

gentle aspen Apr 8, 2026, 9:36 AM

#

dotn download it, it is meant for mac

#

it is just pure qwen, but code/agent optimized

#

lets see hwo it peforms with images

#

holly halucination. and it is wrong😭

rain lava Apr 8, 2026, 9:42 AM

#

rain lava Which model is it distilled from

i hate how discord will make its update first on tar.gz while i have it downloaded via pacman, because tar.gz updates are annoying and pacman is easy but ofc they dont do it to pacman till later 😭

gentle aspen Apr 8, 2026, 9:42 AM

#

nahh this gotta be ragebait right??😭

rain lava Apr 8, 2026, 9:43 AM

#

gentle aspen nahh this gotta be ragebait right??😭

"Microphone = Person like tech"

https://tenor.com/view/monsters-vs-aliens-i-may-not-have-a-brain-gentlemen-but-i-have-an-idea-jelly-i-may-not-have-a-brain-i-have-an-idea-gif-16097667901967257851

Tenor

gentle aspen Apr 8, 2026, 9:43 AM

#

rain lava "Microphone = Person like tech" https://tenor.com/view/monsters-vs-aliens-i-ma...

fr

#

omds...

rain lava Apr 8, 2026, 9:45 AM

#

gentle aspen omds...

Professional hallucinator

gentle aspen Apr 8, 2026, 9:45 AM

#

rain lava Professional hallucinator

fr. but qwen is KNOWN for questioning itself over and over again

#

finally! some neurons

rain lava Apr 8, 2026, 9:48 AM

#

gentle aspen fr. but qwen is KNOWN for questioning itself over and over again

tbh I really hope google releases 3.2 at Google I/O and makes it with an actuallty good arch

gentle aspen Apr 8, 2026, 9:49 AM

#

rain lava tbh I really hope google releases 3.2 at Google I/O and makes it with an actuall...

fr. it either always "more params" "better data" or "trust me bro"

#

ohh hell nah

rain lava Apr 8, 2026, 9:50 AM

#

rain lava Apr 8, 2026, 9:50 AM

#

gentle aspen ohh hell nah

"Thought for433.9 seconds" 😭

gentle aspen Apr 8, 2026, 9:51 AM

#

rain lava "Thought for433.9 seconds" 😭

fr😭🥀

rain lava Apr 8, 2026, 9:52 AM

#

gentle aspen fr. it either always "more params" "better data" or "trust me bro"

Google making 3.1:

"Okay so lets give it way more parameters but let's throttle the thinking time so the extra parameters don't matter AT ALL, so we make it really fast, then make a 'coding' model that's like the exact same but remaeket it as 'high' and 'low' and both think for like 5 seconds and code"

#

I'm pretty sure 3.0pro diud better in agentic workflows

gentle aspen Apr 8, 2026, 9:52 AM

#

rain lava Google making 3.1: "Okay so lets give it way more parameters but let's throttle...

tbh, that is a god skit of what these companies do every generation

gentle aspen Apr 8, 2026, 9:53 AM

#

rain lava I'm pretty sure 3.0pro diud better in agentic workflows

fr. gpt5.2 codex di ebtter than 5.4

rain lava Apr 8, 2026, 9:53 AM

#

gentle aspen tbh, that is a god skit of what these companies do every generation

The only model I have seen so far that wasn't that bad was like claude 4.6 opus

gentle aspen Apr 8, 2026, 9:54 AM

#

rain lava The only model I have seen so far that wasn't that bad was like claude 4.6 opus

and it is too epensive

rain lava Apr 8, 2026, 9:54 AM

#

Yep..

rain lava Apr 8, 2026, 9:54 AM

#

rain lava The only model I have seen so far that wasn't that bad was like claude 4.6 opus

If google makes 3.2 based on deepthink arch itll probably be good

gentle aspen Apr 8, 2026, 9:55 AM

#

lets just take a moment to sigh...
OpenAI --> no acual goodopen source AI
claude --> nothing.
google --> gemma (at least is usable)
Qwen --> everything all in one
Microslop... --> somehow🥀

rain lava Apr 8, 2026, 9:56 AM

#

gentle aspen lets just take a moment to sigh... OpenAI --> no acual goodopen source AI claud...

copilot is just... disgusting

gentle aspen Apr 8, 2026, 9:56 AM

#

fr, it is just renamed models which microsoft DID NOT make

#

plus the phi models are BAD

rain lava Apr 8, 2026, 9:56 AM

#

Yeah it's GPT Based but like a bad gpt

rain lava Apr 8, 2026, 9:57 AM

#

gentle aspen lets just take a moment to sigh... OpenAI --> no acual goodopen source AI claud...

Did perplexity ever make opensrc? and where dolphin models ever good?

gentle aspen Apr 8, 2026, 9:57 AM

#

used to be GPT3.5 until gpt5 came, now they advertize gpt5 like AGI

gentle aspen Apr 8, 2026, 9:58 AM

#

rain lava Did perplexity ever make opensrc? and where dolphin models ever good?

yeah, they did some actuall research (not for LLMs) their opensource LLMs are, lets just say... "broken". but have good embedding models

gentle aspen Apr 8, 2026, 9:58 AM

#

rain lava Did perplexity ever make opensrc? and where dolphin models ever good?

dolphin is straight up "you get bullied at school?" "yeah, a glock 18 is not that expensive"

rain lava Apr 8, 2026, 9:58 AM

#

gentle aspen dolphin is straight up "you get bullied at school?" "yeah, a glock 18 is not tha...

😭

gentle aspen Apr 8, 2026, 9:59 AM

#

rain lava Google making 3.1: "Okay so lets give it way more parameters but let's throttle...

This is the best joke I have seing somemone tell me in the whoel 2026

rain lava Apr 8, 2026, 9:59 AM

#

gentle aspen This is the best joke I have seing somemone tell me in the whoel 2026

thanks lol

gentle aspen Apr 8, 2026, 9:59 AM

#

I mean, you cant be more relatabkle than this

rain lava Apr 8, 2026, 9:59 AM

#

gentle aspen yeah, they did some actuall research (not for LLMs) their opensource LLMs are, l...

Research on what though if it's not LLMs?

gentle aspen Apr 8, 2026, 10:00 AM

#

embedding models for yt like algorithms

#

it ids pretty good ngl

#

early ONCard was powered by that model

#

look!

#

rain lava Apr 8, 2026, 10:01 AM

#

gentle aspen

I've never seen this before

gentle aspen Apr 8, 2026, 10:02 AM

#

rain lava I've never seen this before

it's my local AI powered study app. I am plannign to move to gemma4, but everybody doesn;t have the latest ollama veriso nyet

rain lava Apr 8, 2026, 10:03 AM

#

gentle aspen it's my local AI powered study app. I am plannign to move to gemma4, but everybo...

That's cool, It's public?

gentle aspen Apr 8, 2026, 10:03 AM

#

and opensource btw

#

only for windows for now tho

#

https://github.com/MightyXdash/ONCard

GitHub

GitHub - MightyXdash/ONCard: An AI powered Free and Opensource flas...

An AI powered Free and Opensource flashcard app for productive learning. - MightyXdash/ONCard

#

you can make a linux version tho, bcs it is released under the apache 2.0

rain lava Apr 8, 2026, 10:05 AM

#

gentle aspen only for windows for now tho

If I use wine it may work

#

Oh you're gold now!

gentle aspen Apr 8, 2026, 10:05 AM

#

are u sure? it is an exe

gentle aspen Apr 8, 2026, 10:06 AM

#

rain lava Oh you're gold now!

ohh yeah

#

yay🥳

rain lava Apr 8, 2026, 10:06 AM

#

gentle aspen are u sure? it is an exe

Wine translates .exe to linux

gentle aspen Apr 8, 2026, 10:07 AM

#

really???

rain lava Apr 8, 2026, 10:07 AM

#

Kinda like how Proton translates direct x to vulkan

gentle aspen Apr 8, 2026, 10:07 AM

#

yoooo, why did Inever knew abou thtis

gentle aspen Apr 8, 2026, 10:07 AM

#

rain lava Kinda like how Proton translates direct x to vulkan

ohh

rain lava Apr 8, 2026, 10:07 AM

#

gentle aspen really???

Yeah but there's a small chance it no work because a bug or smth idk

gentle aspen Apr 8, 2026, 10:07 AM

#

but I didnt use the native stack. I used Qt

gentle aspen Apr 8, 2026, 10:07 AM

#

rain lava Yeah but there's a small chance it no work because a bug or smth idk

yeah, but you can try

#

if you dm me, i will send you an alpha build of the latest version. (I am working on implementing gemma4 support)

rain lava Apr 8, 2026, 10:08 AM

#

gentle aspen but I didnt use the native stack. I used Qt

I think if I just compile myself it will work -- Wdym Qt??

gentle aspen Apr 8, 2026, 10:09 AM

#

rain lava I think if I just compile myself it will work -- Wdym Qt??

it's like the GUI framework. linux uses that too. plasma's framework is Qt as f what i know. so you ont have much trouble

hearty salmon Apr 8, 2026, 11:37 AM

#

All of a sudden I'm getting a lot of 417 Errors from Gemini API. Anyone else getting them also?
Saw a few more people reporting it on Google Dev Forum

open wharf Apr 8, 2026, 11:46 AM

#

Good day everyone 🤠, how we are all having a great day and time.

Can antigravity build mobile apps or it's just basically web apps?

gusty meteor Apr 8, 2026, 11:48 AM

#

open wharf Good day everyone 🤠, how we are all having a great day and time. Can antigravi...

can build both

hushed night Apr 8, 2026, 1:38 PM

#

Hello I got a prototype of my AI algorithm to skip standard training, I need testers

#

please contact

gentle aspen Apr 8, 2026, 2:00 PM

#

open wharf Good day everyone 🤠, how we are all having a great day and time. Can antigravi...

If the app you want to build can b coded, yes it can build it

gentle aspen Apr 8, 2026, 2:02 PM

#

hushed night Hello I got a prototype of my AI algorithm to skip standard training, I need tes...

ohh, hey! its you again.

how did your project go?
DId it gowell? cuz I loved your idea, but I felt it's unrealistic.
but I am happy to test it out, if you would help me test our an app I made for student😄.

either ways i will help you test it

hushed night Apr 8, 2026, 2:06 PM

#

gentle aspen ohh, hey! its you again. how did your project go? DId it gowell? cuz I loved y...

Alright I can show you some research papers in private and to anyone who want to test, if you got an ai model ( still capped at 300 million parameters for computer power ) we can start

gentle aspen Apr 8, 2026, 2:09 PM

#

yeah!
great you starte dwith 300m, but I reccomend we can scale down to 100m params if you want to collab with me, because I am pretty sure your whoel idea is efficiency, and for basic research I fee like 100m is far more than enough

#

You can DM me. We will check it out!

rain lava Apr 8, 2026, 2:16 PM

#

open wharf Good day everyone 🤠, how we are all having a great day and time. Can antigravi...

Yes it can make both of what you have specified - The agents in antigravity are capable of the same thing normal AIs do, but with some advantages;
It can execute commands, and make/remove files without you having to do anything.

fiery lagoon Apr 8, 2026, 3:09 PM

#

gentle aspen yeah! great you starte dwith 300m, but I reccomend we can scale down to 100m par...

why u told me not to come here 😭

#

some one devolop google ram

#

4 tabs btw

#

7 actually

#

but still

gentle aspen Apr 8, 2026, 3:11 PM

#

dude chrome is electron, what did you expect. it is the devs of the website who is responsible for this

#

what did you ex[pect with an electron app?

fiery lagoon Apr 8, 2026, 3:11 PM

#

man im switching to internet explorer bro

gentle aspen Apr 8, 2026, 3:11 PM

#

no body is holding you back dawg

fiery lagoon Apr 8, 2026, 3:12 PM

#

bro they deleted my boy internet explorer from windows 11

#

idk how yall use antigravity

gentle aspen Apr 8, 2026, 3:14 PM

#

fiery lagoon bro they deleted my boy internet explorer from windows 11

it is called "edge"

#

they use webview2

#

not electron

#

u might survive. kinda...

gentle aspen Apr 8, 2026, 3:15 PM

#

fiery lagoon idk how yall use antigravity

skills

gusty meteor Apr 8, 2026, 3:22 PM

#

lmao

rain lava Apr 8, 2026, 3:36 PM

#

fiery lagoon some one devolop google ram

"53" somewhere in those 7tabs theres 53processes which isnt normal for when i use to use chrome.

#

Firefox may help.

rain lava Apr 8, 2026, 3:39 PM

#

fiery lagoon idk how yall use antigravity

You download it to use it

hushed night Apr 8, 2026, 4:36 PM

#

yh firefox is light-weight, but i prefer opera cuz it lets you customize everything, even ram usage and cpu usage

#

no edge is filled with copilot and microflop stuff

slow raven Apr 8, 2026, 4:56 PM

#

hushed night yh firefox is light-weight, but i prefer opera cuz it lets you customize everyth...

Just make your own browsers

slow raven Apr 8, 2026, 4:56 PM

#

hushed night no edge is filled with copilot and microflop stuff

U can customize everything

rain lava Apr 8, 2026, 5:18 PM

#

hushed night yh firefox is light-weight, but i prefer opera cuz it lets you customize everyth...

Ive had bad exoerience with opera tbh

#

Even if you limit ram its then slower

hushed night Apr 8, 2026, 5:19 PM

#

idk in alternative firefox is great

#

I never noticed ram spikes with opera tho

open wharf Apr 8, 2026, 5:31 PM

#

rain lava Yes it can make both of what you have specified - The agents in antigravity are ...

Thanks for the feedback ☺️

gentle aspen Apr 8, 2026, 5:41 PM

#

I feel like firefox and brave is the ebst browser choices besides chrome tbh

gentle aspen Apr 8, 2026, 6:43 PM

#

dont know th was going on with my friend, but my chrome is pretty good

hushed night Apr 8, 2026, 8:16 PM

#

I found a way to completely skip backpropagation, i tried on small and medium transformer models, my generator model performs 99.8% with 8 layers only, it can generate weights of models now purely with sets of questions and awnsers

#

I need bigger testers to find out if this is truly bulletproof

#

And thank you only mighty to let me test your models

vagrant folio Apr 9, 2026, 12:20 AM

#

AG add allway allow command execution list always denied and ask user

#

so hope now we can add commands we want always to be executed without confirmation

rain lava Apr 9, 2026, 2:02 AM

#

gentle aspen I feel like firefox and brave is the ebst browser choices besides chrome tbh

I dislike chromium browsers tbh

rain lava Apr 9, 2026, 2:04 AM

#

gentle aspen dont know th was going on with my friend, but my chrome is pretty good

I think they had ghost processes, they said they had 7 tabs open but taskmanager showed like 53

gentle aspen Apr 9, 2026, 4:45 AM

#

hushed night I found a way to completely skip backpropagation, i tried on small and medium tr...

He actually did guys. it was crazy!
I think yall should help hm too, he is doing some crazy stuff back there.

#

What UI looks good?

gentle aspen Apr 9, 2026, 5:16 AM

#

rain lava I dislike chromium browsers tbh

chromium browsers are easy to work with, so I prefer them. And also they spent billions trying to perfect chromium. I mean we all have different choices, but I saying with chromium, you dont really have to care much about it yk.

gentle aspen Apr 9, 2026, 6:56 AM

#

ohh, didn't even realize @rain lava is gold. lol

rain lava Apr 9, 2026, 7:14 AM

#

gentle aspen chromium browsers are easy to work with, so I prefer them. And also they spent b...

Billions? I find ghecko just as fast as blink? And Chromium's blink multiplies processes to speed it up which eats tons of RAM.

rain lava Apr 9, 2026, 7:15 AM

#

gentle aspen ohh, didn't even realize <@1471414427935834203> is gold. lol

Yea 🔥

gentle aspen Apr 9, 2026, 7:15 AM

#

I mean, if you complain about that, you can;t be using Discord, telegram, chatgpt, gemini, antigravity, or anyother thing, bcs they are all just chromium with a costume called electron🤣 lmao

#

corban, did you manage to get ONCard runing on linux?

rain lava Apr 9, 2026, 7:16 AM

#

rain lava Yea 🔥

I find it weird they don't use any of the nice gradients on roles, they have 30 boosts..

gentle aspen Apr 9, 2026, 7:16 AM

#

rain lava I find it weird they don't use any of the nice gradients on roles, they have 30 ...

lol

#

they are broke just as us

rain lava Apr 9, 2026, 7:16 AM

#

gentle aspen corban, did you manage to get ONCard runing on linux?

I've been gone all day ---

I slept in (It's holidays and I slept 3am) And I had to go to a course - Lastnight I did a bit though

rain lava Apr 9, 2026, 7:17 AM

#

gentle aspen they are broke just as us

It only costs 3 of the bososts they have tho

gentle aspen Apr 9, 2026, 7:17 AM

#

rain lava I've been gone all day --- I slept in (It's holidays and I slept 3am) And I had...

ohh lol. your sleep schedule is worse than mine lol

rain lava Apr 9, 2026, 7:17 AM

#

gentle aspen I mean, if you complain about that, you can;t be using Discord, telegram, chatgp...

True... I just don't see how it costed them billions to make something perfect that still isn't perfect

gentle aspen Apr 9, 2026, 7:17 AM

#

ohh dude, I just realized you can drop your KV quant to about Q5-6 for more TPS on the bigger models on ollama

rain lava Apr 9, 2026, 7:18 AM

#

KV?

#

How much more tps..?

gentle aspen Apr 9, 2026, 7:18 AM

#

rain lava True... I just don't see how it costed them billions to make something perfect t...

nothing in this world is perfect.
electron is easy for devs. write .js or .ts code and push, BOOM! update.

but iwth Qt or any native frameowrk, you have to write your custom paints which takes so much time.

rain lava Apr 9, 2026, 7:18 AM

#

gentle aspen ohh lol. your sleep schedule is worse than mine lol

Only on holidays it's bad on school it's a bit more reasonable

gentle aspen Apr 9, 2026, 7:18 AM

#

5-10% more

gentle aspen Apr 9, 2026, 7:19 AM

#

rain lava Only on holidays it's bad on school it's a bit more reasonable

ohh yeah! I usually sleep around 10-12

rain lava Apr 9, 2026, 7:19 AM

#

gentle aspen nothing in this world is perfect. electron is easy for devs. write .js or .ts co...

So it comes down to laziness.

rain lava Apr 9, 2026, 7:19 AM

#

gentle aspen 5-10% more

I don't know what KV is 😭

gentle aspen Apr 9, 2026, 7:19 AM

#

gentle aspen What UI looks good?

wait, dude, what UI do you think looks good?

gentle aspen Apr 9, 2026, 7:19 AM

#

rain lava I don't know what KV is 😭

it is just "Key-Value"

#

it is like context, but fast. so the model can load memory faster. it costs more compute tho

rain lava Apr 9, 2026, 7:20 AM

#

gentle aspen wait, dude, what UI do you think looks good?

The one with like the next and other button rather than an x and arrow

gentle aspen Apr 9, 2026, 7:20 AM

#

yoo my friends said that too, why tho? what did you find unpleasant?

rain lava Apr 9, 2026, 7:20 AM

#

gentle aspen it is like context, but fast. so the model can load memory faster. it costs more...

And lowering the Quant doesn't affect it much right

gentle aspen Apr 9, 2026, 7:21 AM

#

rain lava And lowering the Quant doesn't affect it much right

well, it takes less memory on your GPU, so TPS wil bemore stable and better

rain lava Apr 9, 2026, 7:21 AM

#

gentle aspen yoo my friends said that too, why tho? what did you find unpleasant?

I believe it'll be simpler and easier for new users to look at and click

rain lava Apr 9, 2026, 7:21 AM

#

gentle aspen well, it takes less memory on your GPU, so TPS wil bemore stable and better

I mean like the perf of the model

gentle aspen Apr 9, 2026, 7:21 AM

#

rain lava I believe it'll be simpler and easier for new users to look at and click

ohhh.

gentle aspen Apr 9, 2026, 7:22 AM

#

rain lava I mean like the perf of the model

it will be like 5-10% faster. I mean it is better than nothing. plus, you wont feel any accuracy diff

rain lava Apr 9, 2026, 7:22 AM

#

It's also pretty nice to look at over the x and arrow

rain lava Apr 9, 2026, 7:22 AM

#

gentle aspen it will be like 5-10% faster. I mean it is better than nothing. plus, you wont f...

Oh that's great.

gentle aspen Apr 9, 2026, 7:25 AM

#

rain lava It's also pretty nice to look at over the x and arrow

yeah! thats my point. i want the app to look minimal. I will add a tooltip so when they hover too long, they will see the nam eof the button

rain lava Apr 9, 2026, 7:38 AM

#

gentle aspen it will be like 5-10% faster. I mean it is better than nothing. plus, you wont f...

How do I do it -- What's the command?

rain lava Apr 9, 2026, 7:39 AM

#

gentle aspen yeah! thats my point. i want the app to look minimal. I will add a tooltip so wh...

Or a question mark and it exaplains the UI??

#

Both're good

rain lava Apr 9, 2026, 7:39 AM

#

rain lava How do I do it -- What's the command?

Also is there much of ann accuracy jump between Q5 and Q6?

gentle aspen Apr 9, 2026, 7:39 AM

#

rain lava How do I do it -- What's the command?

the same command you ran to get your KV cache qunrtazized to q8 yerstaday

#

omds my grammar is so cooked

rain lava Apr 9, 2026, 7:40 AM

#

gentle aspen the same command you ran to get your KV cache qunrtazized to q8 yerstaday

Oo okay

gentle aspen Apr 9, 2026, 7:40 AM

#

change it ot q6 tho

gentle aspen Apr 9, 2026, 9:16 AM

#

was hooking up the bug report button to my GitHub "issues" page a bad idea?? 🤔

gusty meteor Apr 9, 2026, 9:17 AM

#

i think no, its good

gentle aspen Apr 9, 2026, 9:44 AM

#

first time using all the context🤣

gentle aspen Apr 9, 2026, 9:44 AM

#

gusty meteor i think no, its good

ohh okay! thx. i just needed to know that if users will try to do stupid shi and spam GitHub.

gusty meteor Apr 9, 2026, 9:45 AM

#

gentle aspen ohh okay! thx. i just needed to know that if users will try to do stupid shi and...

idk maybe they do but never seen that lmao

gentle aspen Apr 9, 2026, 9:46 AM

#

gusty meteor idk maybe they do but never seen that lmao

haha lol.
but, there is the type of humans who realize they have free will too early 🤣🤣

gusty meteor Apr 9, 2026, 9:46 AM

#

gentle aspen first time using all the context🤣

idk about codex but, antigravity make a great point atp, its can see the other chats that in connected to other folder(project)

#

it can competely download the conversion history and understand the topic again

gentle aspen Apr 9, 2026, 9:47 AM

#

my idea is: user see bug --> user create an issue on GitHub --> i get notified.

gusty meteor Apr 9, 2026, 9:48 AM

#

gentle aspen my idea is: user see bug --> user create an issue on GitHub --> i get notified.

This is how it should normally be but idk XD

gentle aspen Apr 9, 2026, 9:48 AM

#

gusty meteor This is how it should normally be but idk XD

lol my time implementing a bug report feature. I was oign to do this with cloudfare workers, but I realized GitHub issues might be the easier way

gusty meteor Apr 9, 2026, 9:54 AM

#

gentle aspen lol my time implementing a bug report feature. I was oign to do this with cloudf...

i think yes that appoarch is better plus you can direclty redirect user to that link https://github.com/username/project/issues/new?title=Program+error&body=System-logs

#

which is more easier i think you can pass thru the version example

stark sapphire Apr 9, 2026, 9:55 AM

#

Has anyone else encountered an issue where Antigravity with the AI agent suddenly looks into the wrong project folders?
It happened to me, and sometimes it will request access to those, even though we are not even working on other projects

gentle aspen Apr 9, 2026, 9:55 AM

#

gusty meteor i think yes that appoarch is better plus you can direclty redirect user to that ...

ohh that make sense. i willd o that. thx for the tip 🙂

rain lava Apr 9, 2026, 10:28 AM

#

stark sapphire Has anyone else encountered an issue where Antigravity with the AI agent suddenl...

Have you tried to cd into the current project folder you need rather than make it search for a correct one?

stark sapphire Apr 9, 2026, 10:30 AM

#

i have not. But i never had too. I have my folders separated for each project.
So when i open a new project, it will simply stay in there. But now for some reason, the AI is trying to access other unrelated folders.

rain lava Apr 9, 2026, 10:30 AM

#

You should try it. If you don't wanto just create a rules folder for it to follow that tells it it's task in that project.

gentle aspen Apr 9, 2026, 11:35 AM

#

I may have found the best way to vibe code.

the chat AI of your vibe coding app generates the prompt from your instructions and you copy paste that into the vibe coding app🤣😭

#

where did humanity come to this from😭 lmao

stark sapphire Apr 9, 2026, 11:36 AM

#

i been doing this for ages.

gentle aspen Apr 9, 2026, 11:37 AM

#

I knew this ages ago too, but i am lazy to this. i just realized this after my really good prompting habbit.

#

omg, humanity is cooked🤣

rain lava Apr 9, 2026, 11:37 AM

#

gentle aspen I knew this ages ago too, but i am lazy to this. i just realized this after my r...

It's impossible to get better prompts than that

stark sapphire Apr 9, 2026, 11:38 AM

#

i had button rendering errors just minutes ago

#

i solved it by making a whole new UI

#

gentle aspen Apr 9, 2026, 11:41 AM

#

is this electron?
This looks very tailwind-ish

gentle aspen Apr 9, 2026, 11:42 AM

#

stark sapphire

what framework did you use for this? so hard to this type of stuff with Qt

gentle aspen Apr 9, 2026, 11:43 AM

#

rain lava It's impossible to get better prompts than that

frfr💀🥀

stark sapphire Apr 9, 2026, 11:43 AM

#

Frontend Framework: React (Version 19)
Build Tool: Vite (provides fast development and bundling)
Styling: Tailwind CSS (currently being injected via CDN, using the "Intellectual Salon" bento-box design system we built)
Backend/Database: Supabase (PostgreSQL database with built-in Authentication and Realtime features)
Language: TypeScript (for type safety and better developer experience)

gentle aspen Apr 9, 2026, 11:44 AM

#

stark sapphire Frontend Framework: React (Version 19) Build Tool: Vite (provides fast developme...

ahhh kne it! it looks very react-tailwindy.

Qt is very annoying. You need to custom paint and render it. the only thing which is different from writing it from pure binray is, it give easy access to CPU and GPU😭

gentle aspen Apr 9, 2026, 12:42 PM

#

omds, why does codex likes to increase my cortisol?

gusty meteor Apr 9, 2026, 12:43 PM

#

gentle aspen omds, why does codex likes to increase my cortisol?

is that qt

gentle aspen Apr 9, 2026, 12:43 PM

#

yeah

gusty meteor Apr 9, 2026, 12:43 PM

#

gentle aspen yeah

why you prefer qt if dont mind to ask

gentle aspen Apr 9, 2026, 12:43 PM

#

with custom paints

gentle aspen Apr 9, 2026, 12:43 PM

#

gusty meteor why you prefer qt if dont mind to ask

bcs there ar emany libraries which only python has...

#

so i cant do electron

#

and also there is over 10k line sof UI code

#

so no going back😭

gusty meteor Apr 9, 2026, 12:44 PM

#

gentle aspen bcs there ar emany libraries which only python has...

like? can you give example?

#

just curios lmao

#

that should be some ai libraries i think

gentle aspen Apr 9, 2026, 12:45 PM

#

like some, hmm, like ollama.
and similar.
like langchang and some pytorch.

#

wait

#

how did you realize it was Qt

#

??

gusty meteor Apr 9, 2026, 12:45 PM

#

gentle aspen how did you realize it was Qt

i have memories with qt like you

#

and its very hard to make ui in qt

gentle aspen Apr 9, 2026, 12:46 PM

#

ohh yeah. so annoying. Wish we has react and tailwind like stuff on Qt.

what can i do about this?

#

is there any fix for this?

gusty meteor Apr 9, 2026, 12:46 PM

#

gentle aspen ohh yeah. so annoying. Wish we has react and tailwind like stuff on Qt. what ca...

idk

#

you can use qt designer

gentle aspen Apr 9, 2026, 12:47 PM

#

Qt designer is an more cooked piece of software from the pre historic age

gusty meteor Apr 9, 2026, 12:47 PM

#

gentle aspen Qt designer is an more cooked piece of software from the pre historic age

lmao, i love it

dusky pollen Apr 9, 2026, 1:20 PM

#

Anyone know about this? The chat history got wiped out (antigravity)

gusty meteor Apr 9, 2026, 1:21 PM

#

the ai thinking for 16 hour? 💀

#

i think something goes wrgon

dusky pollen Apr 9, 2026, 1:25 PM

#

Its actually the total hours I spent on this chat I believe

#

The other work I lost was 128 hours but at least had my backups

#

This is hella annoying

frail raven Apr 9, 2026, 1:34 PM

#

dusky pollen Its actually the total hours I spent on this chat I believe

on a single task then? 👀

dusky pollen Apr 9, 2026, 1:35 PM

#

frail raven on a single task then? 👀

What do you mean?

#

Its the total spent time of that conversation

#

Not sure how to explain it better?

frail raven Apr 9, 2026, 1:41 PM

#

Do you still have the issue if you make a new chat?

dusky pollen Apr 9, 2026, 1:42 PM

#

Nope I can continue there, its just the conversation data got broken and Antigravity is no longer loading it.

#

I checked and the pb file exists in the antigravity conversation folder, I guess something weird happened and the data got broken or something.

frail raven Apr 9, 2026, 1:46 PM

#

Please try to send feedback from antigravity, I believe there is a button in the settings for this purpose

#

It could help the team working on it so they can fix it!

dusky pollen Apr 9, 2026, 1:52 PM

#

Well I sent the feedback now. I hope they will look into this.

#

Even the submit button isn't working so...

#

Waiting for a few minutes now

#

Its not even submitting...

lost quiver Apr 9, 2026, 3:04 PM

#

You need testers if am right …?

stark sapphire Apr 9, 2026, 3:04 PM

#

lost quiver You need testers if am right …?

yea, and also someone who can just give me some feedback

lost quiver Apr 9, 2026, 3:06 PM

#

stark sapphire yea, and also someone who can just give me some feedback

Yeah that’s what testers do …!

stark sapphire Apr 9, 2026, 3:06 PM

#

lost quiver Yeah that’s what testers do …!

well.... i had people who just said, yea it works..

lost quiver Apr 9, 2026, 3:12 PM

#

stark sapphire well.... i had people who just said, yea it works..

Alright….👍

stable python Apr 9, 2026, 3:35 PM

#

Hey everyone 👋

I’m planning to start learning Django REST Framework (DRF) and wanted to ask if anyone has good free resources (YouTube playlists, docs, courses, etc.) to get started.

Also, could someone guide me on:
• What are the prerequisites before starting DRF?
• How much time does it usually take to learn it well enough to build a decent project?

I already have basic Django knowledge (models, views, CRUD, etc.), so I’m looking to level up into APIs.

Any suggestions or guidance would be really appreciated 🙌

vagrant folio Apr 9, 2026, 6:10 PM

#

Finally Gemini app get a concept of project but with different approach

#

they add now Notebooks

#

as a project orginizer

#

so is 2 in one

open stone Apr 9, 2026, 6:27 PM

#

vagrant folio Finally Gemini app get a concept of project but with different approach

I am scared of being to much dependent on AI tools

vagrant folio Apr 9, 2026, 7:32 PM

#

Yes that is understandable. My Case I keep studying. Ai help a lot on studding process, searching process, Writing code. But in the end it need that one have knowledge and wiling to learn. to get good results, and be able keep improving whatewer you may be doing.

misty heron Apr 9, 2026, 9:05 PM

#

Is there a way to agentically switch the model in antigravity?

#

(Possibly through a skill)

vagrant folio Apr 9, 2026, 9:31 PM

#

I dont think so. maybe we can send feedback feature request for that and explain benefits about it

misty heron Apr 9, 2026, 9:36 PM

#

why are there cryptobros on here? lol

misty heron Apr 9, 2026, 9:37 PM

#

vagrant folio I dont think so. maybe we can send feedback feature request for that and explain...

I'm happy to send it to them, where do I do this?

frail raven Apr 9, 2026, 9:38 PM

#

misty heron why are there cryptobros on here? lol

They're everywhere lol

vagrant folio Apr 9, 2026, 9:38 PM

#

Inside Antigravity click on your profile icon select report issue then check feature request

frail raven Apr 9, 2026, 9:39 PM

#

misty heron why are there cryptobros on here? lol

Feel free to ping the moderators when you see stuff like that by the way ;)

misty heron Apr 9, 2026, 9:39 PM

#

cool, I just joined so don't know the rules, will do in the future, hehe

frail raven Apr 9, 2026, 9:40 PM

#

misty heron cool, I just joined so don't know the rules, will do in the future, hehe

Oooh welcome then!

vagrant folio Apr 9, 2026, 9:41 PM

#

Welcome!

misty heron Apr 9, 2026, 9:43 PM

#

okay, sent the feature request to them, though they've heard me before and didn't listen when I said we need pngs with full transparency, lol

vagrant folio Apr 9, 2026, 9:44 PM

#

yes I think is better send that way as it must be recorded.

#

and hope they hear you

misty heron Apr 9, 2026, 9:45 PM

#

hehe, hope so

#

imagine being able to switch models agentically for planning, and tasks

#

especially as a skill

vagrant folio Apr 9, 2026, 9:53 PM

#

yes is good approach.

misty heron Apr 9, 2026, 10:03 PM

#

lol, I've been talking about this for a while, and today seems like claude code did it... just saw a youtube video on it o.O

trim fog Apr 9, 2026, 10:06 PM

#

Hey everyone, I need some help with Gemini API (Google AI) 🙏

I used to be able to call Gemini 2.5 Pro / 3.1 Pro with a free API key (with limited free quota), and it worked fine before.

But recently, my requests started failing:

Sometimes no proper response
Sometimes errors related to model access / quota

What I’ve tried so far:

Generated a new API key
Double-checked endpoint & headers
Tested both SDK and REST API

Still not working like before 😓

So I’m wondering:

Has the free tier changed recently?
Are Pro models now restricted or paid-only?
Do we now need to enable billing for access?

If anyone is actively using Gemini API right now, could you confirm:

Are there any free models still available?
Any extra setup needed in Google Cloud?

If there’s any updated docs or changelog, please share as well 🙏

Appreciate any insights!

vagrant folio Apr 9, 2026, 11:04 PM

#

check in google ai studio the models and rate limits

#

they changes few month ago

#

#

#

as you can see the pro models for free are 0

broken prairie Apr 10, 2026, 1:43 AM

#

I remember the short-lived good ol days when I could use Flash Lite 1000 times a day for free

small quiver Apr 10, 2026, 3:00 AM

#

Hi everyone! Why is it that when I check, there’s no rate limit, but the app still reports it like that?

trim fog Apr 10, 2026, 3:37 AM

#

vagrant folio as you can see the pro models for free are 0

Thanks 😍😍😍

vagrant folio Apr 10, 2026, 4:02 AM

#

small quiver Hi everyone! Why is it that when I check, there’s no rate limit, but the app sti...

Check which model are you using

small quiver Apr 10, 2026, 5:11 AM

#

vagrant folio Check which model are you using

i using gemini 2.5 flash

gentle aspen Apr 10, 2026, 5:19 AM

#

yo yo yo, guys! I am building a local, fully standalone AI presentation maker.
I will add support for ollama in my beta updates.

#

I will give some updates right after I build it

#

it will be opensource, soyall can build on top of it

hushed night Apr 10, 2026, 5:45 AM

#

thanks dude so i can test my algorithm on it

gentle aspen Apr 10, 2026, 8:17 AM

#

NO WAY!! everything in this PPTX file is fully generated by Gemma4:26b

google definietly did a GREAT job with their model. since I am stilo developing this, I can't tell yall the repo for now, after doen building the first realese, i will make this opensource for sure!

📎 the-architect-of-intelligence-sam-altan.pptx

gusty meteor Apr 10, 2026, 8:18 AM

#

sam altan

#

💀

gentle aspen Apr 10, 2026, 8:30 AM

#

gusty meteor sam altan

whats wrong with him? he is one of my favourite figures in tech🤷‍♂️

#

anyways, does it look "nice" to you?

gusty meteor Apr 10, 2026, 8:30 AM

#

gentle aspen anyways, does it look "nice" to you?

dont downloaded so idk

gusty meteor Apr 10, 2026, 8:30 AM

#

gentle aspen whats wrong with him? he is one of my favourite figures in tech🤷‍♂️

lmao

gentle aspen Apr 10, 2026, 8:30 AM

#

gusty meteor dont downloaded so idk

can you tell me if it looks good?

gusty meteor Apr 10, 2026, 8:31 AM

#

gentle aspen can you tell me if it looks good?

i dont understand local ai jobs

gentle aspen Apr 10, 2026, 8:31 AM

#

bro, jsut tell me if the presentation looks good😭

gusty meteor Apr 10, 2026, 8:31 AM

#

lmao

gentle aspen Apr 10, 2026, 8:31 AM

#

you know what, lemme give you soem rest for your brasinn cells and upload images

gusty meteor Apr 10, 2026, 8:32 AM

#

yes

gentle aspen Apr 10, 2026, 8:32 AM

#

all of these is generated by gemma, except for the picture for obvious reasons

gusty meteor Apr 10, 2026, 8:33 AM

#

gentle aspen all of these is generated by gemma, except for the picture for obvious reasons

why image 120p lmao

gentle aspen Apr 10, 2026, 8:33 AM

#

This is just an experimental run. so I didn;t expect much, but i am making it better, and i will release it ina few days

gentle aspen Apr 10, 2026, 8:33 AM

#

gusty meteor why image 120p lmao

random image from google lol.

gusty meteor Apr 10, 2026, 8:33 AM

#

gentle aspen random image from google lol.

💀

gentle aspen Apr 10, 2026, 8:57 AM

#

#

https://tenor.com/view/war-gif-18347964941877609900

Tenor

gusty meteor Apr 10, 2026, 8:58 AM

#

gentle aspen

what you prefer? codex or local llms

gentle aspen Apr 10, 2026, 8:58 AM

#

depends

gusty meteor Apr 10, 2026, 8:58 AM

#

gentle aspen depends

no you have to say codex since your fan of sam altman lmamo

gentle aspen Apr 10, 2026, 8:58 AM

#

gusty meteor no you have to say codex since your fan of sam altman lmamo

bruh

vagrant folio Apr 10, 2026, 2:51 PM

#

small quiver i using gemini 2.5 flash

Try create new project and generate new api or in same project generate new api to find what may be the issue

languid elbow Apr 10, 2026, 3:47 PM

#

just watched visual studio codes AI proceed to pretend to build an entire index and type nothing. guess whos switching to antigravvvv

languid elbow Apr 10, 2026, 4:07 PM

#

<@&1009526435276394496>

#

thank you <3

stark sapphire Apr 10, 2026, 4:44 PM

#

I have been code vibing so hard, i almost ran out of every model.

balmy depot Apr 11, 2026, 2:31 AM

#

I've been tag teaming OpenCode (with mostly the free services off the Zen platform), with my Pro AI tier. Not feeling the quota squeeze nearly as badly. Did run out of Zen Free at one point, but switched back to Gemini Flash for a bit, with the occasional escalation to Claude Opus for squirrely planning of a Refactor and a few issues. But not as frustrated. OpenCode Zen BigPickle or MinMax 2.5 Free are pretty comparable or maybe sometimes better then Gemini Flash 3, in my perception. Have to watch it's thinking, and be a bit hands on, but that so true of Flash. Stopped using the plugin and just run in my terminal in Antigravity. Unified my AGENT.md. Working remarkably well. I can swap back and forth and just point at an artifact and some docs I had the agents write and maintain.

Okay I have Q5 Qwen 3.5 @ 35B & 64K

"Make an essay on a very random thing"

otal duration: 1m50.818623422s load duration: 76.485723ms prompt eval count: 3329 token(s) prompt eval duration: 2.457025305s prompt eval rate: 1354.89 tokens/s eval count: 1550 token(s) eval duration: 1m47.729524707s eval rate: 14.39 tokens/s

otal duration: 1m50.818623422s
load duration: 76.485723ms
prompt eval count: 3329 token(s)
prompt eval duration: 2.457025305s
prompt eval rate: 1354.89 tokens/s
eval count: 1550 token(s)
eval duration: 1m47.729524707s
eval rate: 14.39 tokens/s