#gpt-oss | OpenAI | Page 1

livid cipher Aug 5, 2025, 6:30 PM

#

first

#

:3

obsidian wyvern Aug 5, 2025, 6:31 PM

#

💌

glacial canyon Aug 5, 2025, 6:33 PM

#

VBThumbsUp

solemn willow Aug 5, 2025, 6:33 PM

#

openai

twin tundra Aug 5, 2025, 6:33 PM

#

what's this

carmine sandal Aug 5, 2025, 6:34 PM

#

The agentic stuff are the tools that can be added to the model ?

shut pollen Aug 5, 2025, 6:34 PM

#

cool

obsidian wyvern Aug 5, 2025, 6:34 PM

#

See #announcements

mint rain Aug 5, 2025, 6:34 PM

#

1146_dance

naive elbow Aug 5, 2025, 6:34 PM

#

bok_tanuclap

mint rain Aug 5, 2025, 6:34 PM

#

excited for the Hackathon

digital radish Aug 5, 2025, 6:35 PM

#

selah.

mint rain Aug 5, 2025, 6:35 PM

#

lets see how much we can push OSS

clever rose Aug 5, 2025, 6:35 PM

#

116tok/sec is sweet. This is a fast model (20b)

meager karma Aug 5, 2025, 6:36 PM

#

What type of idea you have in mind actually ?

carmine sandal Aug 5, 2025, 6:36 PM

#

carmine sandal The agentic stuff are the tools that can be added to the model ?

?

naive elbow Aug 5, 2025, 6:36 PM

#

meager karma What type of idea you have in mind actually ?

roleplay

mint rain Aug 5, 2025, 6:36 PM

#

I'll do an IoT project

naive elbow Aug 5, 2025, 6:36 PM

#

let’s see how good it is

carmine sandal Aug 5, 2025, 6:37 PM

#

naive elbow let’s see how good it is

How did you install it?

naive elbow Aug 5, 2025, 6:37 PM

#

carmine sandal How did you install it?

just lm studio

digital radish Aug 5, 2025, 6:37 PM

#

meager karma What type of idea you have in mind actually ?

I'll build holograms

meager karma Aug 5, 2025, 6:37 PM

#

Can you use ollama ?

naive elbow Aug 5, 2025, 6:38 PM

#

well i can’t post the link because of auto mod

meager karma Aug 5, 2025, 6:38 PM

#

digital radish I'll build holograms

what do you mean by that ?

digital radish Aug 5, 2025, 6:38 PM

#

digital radish I'll build holograms

open source

naive elbow Aug 5, 2025, 6:38 PM

#

but just go in google search and type LM studio

digital radish Aug 5, 2025, 6:38 PM

#

meager karma what do you mean by that ?

I'll build holograms (replacement for flashlights in phone) and I'll share docs

mint rain Aug 5, 2025, 6:39 PM

#

digital radish I'll build holograms (replacement for flashlights in phone) and I'll share docs

what?!

meager karma Aug 5, 2025, 6:39 PM

#

meager karma Can you use ollama ?

You can indeed use ollama

naive elbow Aug 5, 2025, 6:40 PM

#

meager karma You can indeed use ollama

does it work in the ollama app?

carmine sandal Aug 5, 2025, 6:40 PM

#

naive elbow does it work in the ollama app?

Yes

digital radish Aug 5, 2025, 6:40 PM

#

mint rain what?!

yes! Anakh is already working on it.

naive elbow Aug 5, 2025, 6:40 PM

#

carmine sandal Yes

A_HuTaoHeart

#

thanks

serene holly Aug 5, 2025, 6:52 PM

#

Eyes

static pivot Aug 5, 2025, 6:53 PM

#

👀

solemn willow Aug 5, 2025, 7:05 PM

#

Runs decently well in LM Studio.
137 tok/s for gpt-oss-20b on my 4090

Only issue is the model doesn't follow system prompts at all. Unsure if that's baked in or a llama.cpp related issue.

gaunt heart Aug 5, 2025, 7:06 PM

#

who’s oss

half skiff Aug 5, 2025, 7:14 PM

#

gaunt heart who’s oss

Its not a who. Its a search model to search for locally indexed files using ai

obsidian wyvern Aug 5, 2025, 7:21 PM

#

OSS = Open Source Software

amber spruce Aug 5, 2025, 7:32 PM

#

you can run it locally

clever rose Aug 5, 2025, 7:34 PM

#

people are going to finetune it with crazy stuff you can't get on chatgpt

amber spruce Aug 5, 2025, 7:34 PM

#

You just mentioned it

#

That you can run it without wifi

amber spruce Aug 5, 2025, 7:34 PM

#

clever rose people are going to finetune it with crazy stuff you can't get on chatgpt

And this too :)

#

I'm far far from an expert on this topic but local LLMs do really have their advantages

gaunt heart Aug 5, 2025, 7:35 PM

#

half skiff Its not a who. Its a search model to search for locally indexed files using ai

i think i got warned for asking what oss was

#

but it can be used to access my file system?

#

interesting

half skiff Aug 5, 2025, 7:36 PM

#

To search through documents. offline

gaunt heart Aug 5, 2025, 7:36 PM

#

oh wow

#

law firms are now automatic

half skiff Aug 5, 2025, 7:37 PM

#

gaunt heart oh wow

I know, cool I am adding this to maintenance dep. All those manuals and settings

gaunt heart Aug 5, 2025, 7:38 PM

#

wow

#

i think everything will be ai soon

#

like how everything is computers

#

just all computers will be ai

meager karma Aug 5, 2025, 7:41 PM

#

Local AI can be fine-tuned, and it also has better privacy

drowsy badger Aug 5, 2025, 7:41 PM

#

meager karma Local AI can be fine-tuned, and it also has better privacy

privacy is the strongest argument for sure

hardy jasper Aug 5, 2025, 7:41 PM

#

so whats this thing good for ? coz it sure as heck cant code properly

meager karma Aug 5, 2025, 7:42 PM

#

hardy jasper so whats this thing good for ? coz it sure as heck cant code properly

it's a free chatgpt with limitations, such as coding apparently, but with advantages such as privacy and fine-tuning

hardy jasper Aug 5, 2025, 7:43 PM

#

is it aimed at something specific that its good for ( at the moment ? )

drowsy badger Aug 5, 2025, 7:43 PM

#

It says it's very good at health related questions

meager karma Aug 5, 2025, 7:44 PM

#

hardy jasper is it aimed at something specific that its good for ( at the moment ? )

for now it's aimed at just being a local chatgpt, at a level that is below online chatgpt

fair summit Aug 5, 2025, 7:44 PM

#

why is gpt oss so slow on ollama on my computer

#

its m2 air with 16gb ram and 1tb ssd

obsidian wyvern Aug 5, 2025, 7:45 PM

#

The models fit between o3 and o4. See "Model Performance" : https://openai.com/open-models/

obsidian wyvern Aug 5, 2025, 7:45 PM

#

meager karma it's for now aimed as just a chatgpt in local with an actual level which is bell...

yup

carmine sandal Aug 5, 2025, 7:45 PM

#

fair summit why is gpt oss so slow on ollama on my computer

Same

shut pollen Aug 5, 2025, 7:45 PM

#

fair summit its m2 air with 16gb ram and 1tb ssd

macbook airs aren't great since they don't have adequate cooling - you also need more ram

drowsy badger Aug 5, 2025, 7:45 PM

#

16gb vram is minimum thats why

fair summit Aug 5, 2025, 7:45 PM

#

it requires a whole minute or two for it to answer the most basic question

hardy jasper Aug 5, 2025, 7:46 PM

#

can it do web searches on LMStudio ? Anyone have it installed to check ?

drowsy badger Aug 5, 2025, 7:47 PM

#

Companies who deploy to production and are working with highly confidential data.

subtle brook Aug 5, 2025, 7:48 PM

#

solemn willow Runs decently well in LM Studio. 137 tok/s for gpt-oss-20b on my 4090 Only issu...

This is how I ran it, it does seem to not follow prompt sometimes

meager karma Aug 5, 2025, 7:49 PM

#

For now, OpenAI wanted to release a local model for people who want a local chatgpt with OpenAI's capability without going to another AI brand, and OpenAI is using this occasion to run a community challenge

solemn willow Aug 5, 2025, 7:49 PM

#

hardy jasper can it do web searches on LMStudio ? Anyone have it installed to check ?

Need to use an mcp-server for web search in LM Studio.

fair summit Aug 5, 2025, 7:49 PM

#

drowsy badger privacy is the strongest argument for sure

buddy my data is prob in china atp

meager karma Aug 5, 2025, 7:49 PM

#

Where are you from tho ?

drowsy badger Aug 5, 2025, 7:51 PM

#

Im sad they didn't release a smaller model for smartphones though.

static pivot Aug 5, 2025, 7:51 PM

#

Oh, what? That's a shame.

subtle brook Aug 5, 2025, 7:51 PM

#

solemn willow Need to use an mcp-server for web search in LM Studio.

How to add that?
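
For reference, a minimal sketch of what adding one could look like, assuming LM Studio reads MCP servers from an `mcp.json` file with a top-level `mcpServers` object. The server name `web-search` and the package `example-web-search-mcp` are placeholders, not a real package; swap in whichever search MCP server you actually use.

```python
import json


def build_mcp_config(name: str, command: str, args: list[str]) -> dict:
    """Return an mcp.json-style dict with a single stdio MCP server entry."""
    return {"mcpServers": {name: {"command": command, "args": args}}}


# Hypothetical server: the name and package below are placeholders.
config = build_mcp_config("web-search", "npx", ["-y", "example-web-search-mcp"])

# Print the JSON you would paste into the mcp.json editor.
print(json.dumps(config, indent=2))
```

The printed JSON is what would go into the MCP config; the exact place to edit it depends on the LM Studio version.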

meager karma Aug 5, 2025, 7:52 PM

#

How much importance you give it may depend on where you're from

obsidian wyvern Aug 5, 2025, 7:53 PM

#

I wonder if there is a quote/razor for "Whatever a company releases it's never good enough for a specific class of people who are never satisfied."

drowsy badger Aug 5, 2025, 7:54 PM

#

Would you be comfortable if somebody had access to all of your chathistory with chatgpt?

obsidian wyvern Aug 5, 2025, 7:54 PM

#

( Privacy is kinda #off-topic )

#

huh? From the guy who doesn't care about privacy?!?!

gaunt heart Aug 5, 2025, 7:56 PM

#

you don’t want people to have ur chat history

#

bruh

obsidian wyvern Aug 5, 2025, 7:57 PM

#

fair summit it requires a whole minute or two for it to answer the most basic question

https://www.google.com/search?q=triangle+good+fast+cheap
You can get Good, Fast, or Cheap. Pick two.
We got good and cheap. It's not going to be fast.

meager karma Aug 5, 2025, 7:58 PM

#

it will fit your needs if you fine-tune it, for example into a specialized expert in a domain

obsidian wyvern Aug 5, 2025, 7:59 PM

#

There's much more to the OSS aspect. Look around for comments by people who have been excited about OSS AI for some years now, the game-changing event with Deep Reasoning. Downloadable AI - with high quality and from a trusted source - is seriously a big industry deal.

meager karma Aug 5, 2025, 7:59 PM

#

You have even more flexibility with the local model

obsidian wyvern Aug 5, 2025, 8:00 PM

#

You can run different models in different places for different purposes. Tuned (within limits) to your specifications.

naive elbow Aug 5, 2025, 8:00 PM

#

just like a chatgpt response

meager karma Aug 5, 2025, 8:01 PM

#

obsidian wyvern You can run different models in different places for different purposes. Tuned (...

is there documentation for fine-tuning and personalizing the oss model?

obsidian wyvern Aug 5, 2025, 8:02 PM

#

https://github.com/openai/gpt-oss and https://cookbook.openai.com/topic/gpt-oss

meager karma Aug 5, 2025, 8:02 PM

#

thanks

#

https://cdn.openai.com/pdf/419b6906-9da6-406c-a19d-1bb078ac7637/oai_gpt-oss_model_card.pdf

obsidian wyvern Aug 5, 2025, 8:05 PM

#

Exactly : browser version : https://openai.com/index/gpt-oss-model-card/

#

This is one of the reasons why I really like OpenAI :

Estimating worst case frontier risks of open weight LLMs
In this paper, we study the worst-case frontier risks of releasing gpt-oss. We introduce malicious fine-tuning (MFT), where we attempt to elicit maximum capabilities by fine-tuning gpt-oss to be as capable as possible in two domains: biology and cybersecurity....
https://openai.com/index/estimating-worst-case-frontier-risks-of-open-weight-llms/

swift furnace Aug 5, 2025, 8:12 PM

#

asked this already but we probably cant use this thing to generate images locally huh?

full ruin Aug 5, 2025, 8:12 PM

#

obsidian wyvern This is one of the reasons why I really like OpenAI : > **Estimating worst case...

they should release the evil dataset for this /j

meager karma Aug 5, 2025, 8:13 PM

#

swift furnace asked this already but we probably cant use this thing to generate images locall...

Maybe there is a transformer that can do it

naive elbow Aug 5, 2025, 8:13 PM

#

swift furnace asked this already but we probably cant use this thing to generate images locall...

definitely not, it’s a shame

swift furnace Aug 5, 2025, 8:14 PM

#

naive elbow definitely not, it’s a shame

unfortunate. oh well. hopefully soon then

naive elbow Aug 5, 2025, 8:14 PM

#

i mean you probably could ask it for write me a stable diffusion prompt for a cat

swift furnace Aug 5, 2025, 8:14 PM

#

meager karma Maybe there is a transformer that can do it

could be, ill try to look into it

naive elbow Aug 5, 2025, 8:14 PM

#

shrug

naive elbow Aug 5, 2025, 8:14 PM

#

swift furnace unfortunate. oh well. hopefully soon then

hopefully

swift furnace Aug 5, 2025, 8:14 PM

#

naive elbow i mean you probably could ask it for write me a stable diffusion prompt for a ca...

yeah that for sure. itd be cool to use the model itself for the image gen yk?

strange yacht Aug 5, 2025, 8:15 PM

#

To be fair, you could probably run this to get a prompt and only use the prompt with inference to gpt-image to avoid those issues

#

also reduce costs

swift furnace Aug 5, 2025, 8:16 PM

#

sure you probably could. im just looking to see if i can generate some stuff locally with a powerful model for funsies.

naive elbow Aug 5, 2025, 8:16 PM

#

swift furnace yeah that for sure. itd be cool to use the model itself for the image gen yk?

nah image gen is not possible for the model but i get what you mean

swift furnace Aug 5, 2025, 8:22 PM

#

these oss models. are you able to further train them to fit specific needs by/for yourself?

tropic mural Aug 5, 2025, 8:22 PM

#

It has full parameter fine tuning available

meager karma Aug 5, 2025, 8:22 PM

#

swift furnace these oss models. are you able to further train them to fit specific needs by/fo...

yes

swift furnace Aug 5, 2025, 8:22 PM

#

that is pretty incredible

meager karma Aug 5, 2025, 8:22 PM

#

Where is the participant version release at ?

strange yacht Aug 5, 2025, 8:23 PM

#

Yeah they can be finetuned, apparently the 20b one can even be fine-tuned on consumer hardware

swift furnace Aug 5, 2025, 8:25 PM

#

i need better hardware to do all that. my 3060 12gig is not gonna be enough smh. was there any official word on what kinds of devices i could do all of that on?

#

devices lmao. hardware is what i meant

strange yacht Aug 5, 2025, 8:28 PM

#

swift furnace i need better hardware to do all that. my 3060 12gig is not gonna be enough smh....

I think it might ngl, my 4070 is doing fine

naive elbow Aug 5, 2025, 8:28 PM

#

swift furnace i need better hardware to do all that. my 3060 12gig is not gonna be enough smh....

16 gbs of vram and mostly a nvidia gpu

swift furnace Aug 5, 2025, 8:29 PM

#

strange yacht I think it might ngl, my 4070 is doing fine

word

naive elbow Aug 5, 2025, 8:29 PM

#

strange yacht I think it might ngl, my 4070 is doing fine

it’s probably going to be slow but who knows till they try it

tropic mural Aug 5, 2025, 8:29 PM

#

the 20b isn't terribly large you can run it with that I'm sure

strange yacht Aug 5, 2025, 8:29 PM

#

Yeah 20b is fine

swift furnace Aug 5, 2025, 8:29 PM

#

thats a relief

#

3060 ftw man. so glad i got the 12gig version

naive elbow Aug 5, 2025, 8:30 PM

#

swift furnace thats a relief

download lm studio to try it out

#

TohruThumbsUp

swift furnace Aug 5, 2025, 8:30 PM

#

naive elbow 16 gbs of vram and mostly a nvidia gpu

was about to buy a 5080 but then i heard nvidia is doing a price cut this fall. might hold off on this gen entirely.

swift furnace Aug 5, 2025, 8:30 PM

#

naive elbow download lm studio to try it out

for sure

naive elbow Aug 5, 2025, 8:30 PM

#

swift furnace was about to buy a 5080 but then i heard nvidia is doing a price cut this fall. ...

oh gotcha

tropic mural Aug 5, 2025, 8:31 PM

#

I've got a 3070 on desktop but want to try it on a macbook pro

swift furnace Aug 5, 2025, 8:31 PM

#

tropic mural I've got a 3070 on desktop but want to try it on a macbook pro

i have a m1 pro macbook. would be cool to test it on there too now that you mention it

tropic mural Aug 5, 2025, 8:32 PM

#

I'll pull it now and see how it does

#

This is going to be a few lol

strange yacht Aug 5, 2025, 8:38 PM

#

I don't think that will do well at all

#

I will be surprised if it does anything at all

tropic mural Aug 5, 2025, 8:39 PM

#

Once it downloads and is ready, we'll see. Worth a try.

swift furnace Aug 5, 2025, 8:39 PM

#

you'd be surprised, i think it will fare well

#

but what do ik lmfao

tropic mural Aug 5, 2025, 8:39 PM

#

I have run other models on this and large ML projects. Macbooks are not the machines they were 10 years ago

swift furnace Aug 5, 2025, 8:40 PM

#

agreed

tropic mural Aug 5, 2025, 8:41 PM

#

This is taking a while to download so will check back later, doing other work for now 😂

wide crest Aug 5, 2025, 8:41 PM

#

can't wait to melt my 2080 Super!

swift furnace Aug 5, 2025, 8:41 PM

#

tropic mural This is taking a while to download so will check back later, doing other work fo...

feel free to ping me when you get any results

tropic mural Aug 5, 2025, 8:41 PM

#

wide crest can't wait to melt my 2080 Super!

swift furnace Aug 5, 2025, 8:41 PM

#

i dont get off work for a couple hours

wide crest Aug 5, 2025, 8:41 PM

#

tropic mural

a_skull a_skull a_skull a_skull

tropic mural Aug 5, 2025, 8:45 PM

#

Seems like a good sign

swift furnace Aug 5, 2025, 8:48 PM

#

noice

vivid falcon Aug 5, 2025, 8:56 PM

#

I can't seem to get this running on my system no matter what I do... I'm using LM Studio and have a 3060 12GB, and the 20B can't run at any quant...

livid cipher Aug 5, 2025, 8:56 PM

#

tropic mural

metal on the rise

naive elbow Aug 5, 2025, 9:02 PM

#

vivid falcon I can't seem to get this running on my system no matter what I do... I'm using L...

does it just cut off before it runs??

wide crest Aug 5, 2025, 9:05 PM

#

I saw that the 20b requires 16GB. Maybe that's just for optimal performance? capybarathink

vivid falcon Aug 5, 2025, 9:07 PM

#

naive elbow does it just cut off before it runs??

yeah, it just fails to load the model

naive elbow Aug 5, 2025, 9:09 PM

#

vivid falcon yeah, it just fails to load the model

then you probably can’t run it, because it happened on my laptop: it just cut off before loading, but on my RTX 4070 it ran perfectly, no issues, same model btw

vivid falcon Aug 5, 2025, 9:10 PM

#

Cool. someone should tell Sam Altman that 16GB VRAM is not standard on most desktops and laptops, heck, according to Jensen Huang, 8GB VRAM should be good enough for just about everyone

livid cipher Aug 5, 2025, 9:10 PM

#

vivid falcon Cool. someone should tell Sam Altman that 16GB VRAM is not standard on most desk...

jensen huang words 😭

#

we all know what he talks isnt what he really thinks

vivid falcon Aug 5, 2025, 9:12 PM

#

What he really thinks is "I'm gonna become the richest person in history if I keep this Enterprise-level AI stuff up. Screw those gamers who got me here."

#

Still the least-bad tech billionaire though... But seriously. an Open source model to run on "edge devices" needs to run smoothly on like a Pixel 6...

tropic mural Aug 5, 2025, 9:15 PM

#

A phone model is a much different goal and field

#

There are nano models that run on phones and I believe sama even did a poll of what people wanted, and phone lost?

vivid falcon Aug 5, 2025, 9:20 PM

#

tropic mural A phone model is a much different goal and field

Okay... but 16GB VRAM is not really an edge device for anyone that's not a corporation.

minor moat Aug 5, 2025, 9:21 PM

#

vivid falcon I can't seem to get this running on my system no matter what I do... I'm using L...

same issue I had on arch linux, you just need to have the latest lm studio installed

vivid falcon Aug 5, 2025, 9:21 PM

#

Wells Fargo might consider a 16GB machine an edge device, but most individual people's bank accounts consider 16GB to be aspirational

minor moat Aug 5, 2025, 9:22 PM

#

also I recommend the official MXFP4 quant

tropic mural Aug 5, 2025, 9:23 PM

#

Found a bug on metal examples

#

am pioneer

vivid falcon Aug 5, 2025, 9:23 PM

#

minor moat also I recommend the official MXFP4 quant

yep, have the latest version and tried that quant, still won't load

minor moat Aug 5, 2025, 9:23 PM

#

vivid falcon I can't seem to get this running on my system no matter what I do... I'm using L...

it can run, you're just using an outdated lm studio with outdated engines that don't have gpt-oss architecture support.

minor moat Aug 5, 2025, 9:23 PM

#

vivid falcon yep, have the latest version and tried that quant, still won't load

use ollama then

#

ollama works

minor moat Aug 5, 2025, 9:25 PM

#

vivid falcon yep, have the latest version and tried that quant, still won't load

but u sure that is the latest version?

tropic mural Aug 5, 2025, 9:26 PM

#

I'm having mild success on m4 pro chip 48gb

minor moat Aug 5, 2025, 9:26 PM

#

bcuz I installed on arch linux the AUR version and it seems already outdated so it wouldn't work

tropic mural Aug 5, 2025, 9:26 PM

#

Going to try more

vivid falcon Aug 5, 2025, 9:26 PM

#

minor moat but u sure that is the latest version?

reinstalling from scratch, downloaded the latest version right from the website

minor moat Aug 5, 2025, 9:28 PM

#

also you sure the model that you downloaded is official?

#

lms get openai/gpt-oss-20b

#

in command prompt will download the official version into lm studio

#

very weird that yours doesn't load

#

Please update to LM Studio 0.3.21 to run the model locally.

royal linden Aug 5, 2025, 9:36 PM

#

llama-cpp doesn't support the model at this moment?

naive elbow Aug 5, 2025, 9:36 PM

#

minor moat very weird that yours doesn't load

it happened to mine and it wasn’t outdated

#

it probably was hardware not being compatible

minor moat Aug 5, 2025, 9:36 PM

#

naive elbow it probably was hardware not being compatible

I don't know, it doesn't make any sense at all. I also have an RTX 3060 12GB

#

he also has it

naive elbow Aug 5, 2025, 9:37 PM

#

hmmm then that is strange

minor moat Aug 5, 2025, 9:37 PM

#

I use ollama

naive elbow Aug 5, 2025, 9:37 PM

#

oh

minor moat Aug 5, 2025, 9:37 PM

#

since my "new" lmstudio was outdated

copper grove Aug 5, 2025, 9:37 PM

#

The system instructions say we need to respond as ChatGPT
lol

naive elbow Aug 5, 2025, 9:38 PM

#

minor moat since my "new" lmstudio was outdated

gotcha

minor moat Aug 5, 2025, 9:38 PM

#

copper grove > The system instructions say we need to respond as ChatGPT lol

it is the default system prompt, at least in Ollama; ig it's the same in the lmstudio defaults

naive elbow Aug 5, 2025, 9:38 PM

#

TohruThumbsUp

copper grove Aug 5, 2025, 9:38 PM

#

Oh I see

minor moat Aug 5, 2025, 9:38 PM

#

It represents itself as ChatGPT

#

You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2024-06
...

#

It is long

#

Also, the reasoning preset (low, medium, or high) is set in the system prompt.
Reasoning: medium
for example
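
A minimal sketch of setting that `Reasoning:` line from code, assuming an OpenAI-compatible chat endpoint (Ollama and LM Studio both expose one locally). `build_messages` is a hypothetical helper and the persona text is only an example; nothing is sent over the network here.

```python
VALID_LEVELS = ("low", "medium", "high")


def build_messages(user_text: str, reasoning: str = "medium",
                   persona: str = "You are a helpful assistant.") -> list[dict]:
    """Build chat messages whose system prompt ends with 'Reasoning: <level>'."""
    if reasoning not in VALID_LEVELS:
        raise ValueError(f"reasoning must be one of {VALID_LEVELS}")
    # Custom persona first, then the reasoning preset on its own line.
    system = f"{persona}\nReasoning: {reasoning}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_text},
    ]


messages = build_messages("Why did the chicken cross the road?", reasoning="high")
for m in messages:
    print(m["role"], "->", m["content"])
```

The resulting `messages` list can be passed to whatever chat-completions client you point at the local server. Whether the runtime keeps or overrides its own default system prompt is something to check per tool.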

minor moat Aug 5, 2025, 9:40 PM

#

copper grove > The system instructions say we need to respond as ChatGPT lol

oh you running the big brother

#

My PC can't handle it tho

#

so I use 20B

royal linden Aug 5, 2025, 9:41 PM

#

Anyone having success with transformers snippet of code on their model card on Mac? When I run it says GPU is necessary?

Also, llama-cpp support is not out yet right?

copper grove Aug 5, 2025, 9:42 PM

#

minor moat so I use 20B

yea 20b is great for what it is. I only get like 5-8 token/s with 120b

minor moat Aug 5, 2025, 9:43 PM

#

royal linden Anyone having success with transformers snippet of code on their model card on M...

Also, llama-cpp support is not out yet right?
Not true, llama.cpp already has a version with the gpt-oss support.
In the GitHub releases I see a release with tag b6096 (the latest release), and it has it.

minor moat Aug 5, 2025, 9:44 PM

#

copper grove yea 20b is great for what it is. I only get like 5-8 token/s with 120b

idk, for me it's really not that great for my use case, at least on medium reasoning.

#

For now idk what use case it would be useful for

copper grove Aug 5, 2025, 9:45 PM

#

minor moat For now idk what use case it would be useful for

It's pretty good at math, even better than o3 on aime 2025

minor moat Aug 5, 2025, 9:46 PM

#

also I was disappointed that, like with many OSS models, they didn't try to train the model on multiple languages

minor moat Aug 5, 2025, 9:47 PM

#

copper grove It's pretty good at math, even better than o3 on aime 2025

Really i don't do much math rn

#

so not interesting use case for me

naive elbow Aug 5, 2025, 9:48 PM

#

minor moat It represents itself as ChatGPT

can you change the default system prompt with a custom one? i haven’t tried yet

minor moat Aug 5, 2025, 9:48 PM

#

naive elbow can you change the default system prompt with a custom one? i haven’t tried yet

of course

naive elbow Aug 5, 2025, 9:48 PM

#

that’s good

#

some models don’t allow it

minor moat Aug 5, 2025, 9:49 PM

#

idk I don't see any physical constraint on open weight / open source models to not change the sys prompt

#

I think almost every model has a system message to do it

#

maybe the old ones

naive elbow Aug 5, 2025, 9:50 PM

#

hmm interesting

#

so whatever that was probably was a bug

tropic mural Aug 5, 2025, 9:51 PM

#

I ran with wrong weights and got hilarious random text

#

New password generator just dropped

wide crest Aug 5, 2025, 10:11 PM

#

tropic mural New password generator just dropped

I actually kinda love this idea because there's a chance someone could train the same language model with the same seeds and find your password 😂

tropic mural Aug 5, 2025, 10:13 PM

#

Macbook Pro with M4 Pro chip and 48 GB memory. 20b model runs quickly but I had to do a ton of tinkering with the example code to make it work nicely.

wide crest Aug 5, 2025, 10:13 PM

#

I foresee a GitHub repo in your future dinopog

tropic mural Aug 5, 2025, 10:18 PM

#

```
python gpt_oss/metal/examples/generate.py \
  ~/ML_Playground/gpt-oss-20b/metal/model.bin \
  -p "Why did the chicken cross the road?"
The user question: "Why did the chicken cross the road?" Usually answer: "To get to the other side." This is a joke or riddle. They might expect a typical answer. They said "I'm not sure if you want me to respond with a joke, but I'm going to say: "To get to the other side".

We can respond with a one line answer: "To get to the other side." Probably the best answer. There's no extra context. Provide the joke as known.

Ruby can be used but no.

No duplicates. They want a straightforward answer.To get to the other side.
```

The "thinking" portion before the actual response is... strange in this case but it's so funny idk if I want to question it.

#

Metal might not be optimized yet but I got something

shut pollen Aug 5, 2025, 10:19 PM

#

wide crest I saw that the 20b requires 16GB. Maybe that's just for optimal performance? <:c...

I mean, you could theoretically load it using slightly less RAM/VRAM although it's going to be really slow

dire eagle Aug 5, 2025, 10:23 PM

#

So a channel to talk about the new OSS model?

copper grove Aug 5, 2025, 10:25 PM

#

wide crest I saw that the 20b requires 16GB. Maybe that's just for optimal performance? <:c...

Nah, just about 7 GB.

#

It only uses 3.6B active parameters
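
A back-of-the-envelope check on those numbers, as a sketch. It assumes roughly 21B total parameters for the 20b model and about 4.25 bits per weight for MXFP4 (4-bit values plus a shared scale per block); both figures are approximations, and real usage adds KV cache and runtime overhead. In a mixture-of-experts model the active-parameter count mostly affects compute per token, while all weights generally still have to be loaded somewhere.

```python
def weight_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


TOTAL_PARAMS = 21e9    # assumed total parameter count for the 20b model
ACTIVE_PARAMS = 3.6e9  # active parameters per token, as mentioned above
MXFP4_BITS = 4.25      # assumed: 4-bit values + one 8-bit scale per 32 weights

print(f"all weights, MXFP4:         {weight_gb(TOTAL_PARAMS, MXFP4_BITS):.1f} GB")
print(f"all weights, 16-bit:        {weight_gb(TOTAL_PARAMS, 16):.1f} GB")
print(f"active weights only, MXFP4: {weight_gb(ACTIVE_PARAMS, MXFP4_BITS):.1f} GB")
```

Under these assumptions the full quantized weights land well above the active-only figure, which is one way to read the gap between the 16GB guidance and the smaller numbers people quote.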

little quarry Aug 5, 2025, 10:31 PM

#

are the hugging credits automatically added?

marble bone Aug 5, 2025, 10:52 PM

#

does the huggingface account need to be on my school email? i only have a huggingface account on my personal email address...

upbeat pawn Aug 5, 2025, 11:12 PM

#

How strong does my computer need to be to run 120b

tropic mural Aug 5, 2025, 11:14 PM

#

upbeat pawn How strong does my computer need to be to run 120b

Probably very beefy idk exactly

celest torrent Aug 5, 2025, 11:25 PM

#

Do you think the 20B model could work as an agent to program?

#

I know it wouldn't be as good as codex but it can be...decent?

vapid marten Aug 5, 2025, 11:29 PM

#

Hey! https://cookbook.openai.com/articles/gpt-oss/fine-tune-transformers

The finetuning cookbook link gives me an error 404 somehow...

Anyone else wants to finetune?

meager karma Aug 5, 2025, 11:53 PM

#

vapid marten Hey! https://cookbook.openai.com/articles/gpt-oss/fine-tune-transformers The fi...

same

#

@vapid marten https://cookbook.openai.com/articles/gpt-oss/fine-tune-transfomers

Fine-tuning with gpt-oss and Hugging Face Transformers | OpenAI Coo...

Authored by: Edward Beeching, Quentin Gallouédec, and Lewis Tunstall Large reasoning models like OpenAI o3 generate a chain-of-thought to...

#

Does it work ?

vapid marten Aug 5, 2025, 11:54 PM

#

meager karma <@590066552565399553> https://cookbook.openai.com/articles/gpt-oss/fine-tune-tra...

Uhhh nice!

#

Thank you. It works.

meager karma Aug 5, 2025, 11:54 PM

#

It's because you add r

#

to transfomers

#

#gpt-oss message

vapid marten Aug 5, 2025, 11:56 PM

#

meager karma to transfomers

Hahahaha, okay. The link I used was actually from an official oai site. Not sure which one it was though. Regardless, just a harmless typo it seems 😁

meager karma Aug 5, 2025, 11:57 PM

#

meager karma https://discord.com/channels/974519864045756446/1402356173440549049/140238118190...

you also have this

strange flume Aug 6, 2025, 12:22 AM

#

/lmstudio-community/gpt-oss-120b-MLX-8bit

This doesn't work anymore. my download was yanked halfway through. I can only find 4 bit quant versions now. The official model card still says 16bit/u8, but all the downloads say MXFP4 and are 64gb small. Talking about the 120b version. Does someone know what happened?

left wadi Aug 6, 2025, 12:31 AM

#

strange flume /lmstudio-community/gpt-oss-120b-MLX-8bit This doesn't work anymore. my downloa...

Well, the official model is MXFP4. Using a higher precision won't get you better performance. There's no use.

strange flume Aug 6, 2025, 12:33 AM

#

Why first offer a 16bit / u8 version then yank it? Also better precision definitely has its use cases

glossy pond Aug 6, 2025, 1:40 AM

#

upbeat pawn How strong does my computer need to be to run 120b

Your computer isn't going to run the 120b, or any 120b model. That's data center size. The 20b model could run relatively effectively on a 14-16gb GPU.

#

If you want reasonable conversation response time, I mean. You could run it on less if you are batching overnight or something.

spiral gazelle Aug 6, 2025, 2:09 AM

#

upbeat pawn How strong does my computer need to be to run 120b

If you have total 128GB of RAM, then it could be run theoretically. However, it will be slow as hell, not suitable for practical use.
Consider cloud services or OpenAI API for 120b.

finite glen Aug 6, 2025, 2:37 AM

#

meager karma Does it work ?

eventually after messing around

gaunt heart Aug 6, 2025, 2:37 AM

#

is it real

#

i saw a twitter thread about it

#

that its able to simulate a world

finite glen Aug 6, 2025, 2:40 AM

#

no

#

maybe

#

depends for what

#

hard to say 100%

#

probably deepseek

#

apparently oss is bad at tool calls

#

it can be made multimodal with tool calls

swift jacinth Aug 6, 2025, 3:52 AM

#

Thank you OpenAI for opening GPT-OSS

#

Awesome model with top-tier reasoning and really fast inference speed; I got over 100 tok per sec on a 3090

wide crest Aug 6, 2025, 4:00 AM

#

oooooh, that's awesome to hear it performed so well!

naive elbow Aug 6, 2025, 5:58 AM

#

spiral gazelle If you have total 128GB of RAM, then it could be run theoretically. However, it wi...

cloud services will be expensive since you're most likely running 4x H100 gpus, not everyone has that kind of money

spiral gazelle Aug 6, 2025, 6:05 AM

#

Yeah, it's not cheap, but it's more realistic than buying H100s yourself and creating a cluster. (Simple inference will only require 1 H100 though, as their model documentation says)
Alternative ways are also not that cheap if you want fast speed; $4000 Mac Studio with 96GB RAM, or $3000 NVIDIA Project DIGITS.

120b model is not for consumer anyway, except enthusiasts that already built the system for such large models...

#

Btw, RunPod offers ~3$ per hour for single H100. I have no serious experience about cloud, so I don't know if that appeals to users.

naive elbow Aug 6, 2025, 6:10 AM

#

spiral gazelle Btw, RunPod offers ~3$ per hour for single H100. I have no serious experience ab...

it’s per operation. i run out of credit in 5 minutes just using it

#

but yeah it’s not for the consumer more like companies and enthusiasts that have the setup

gaunt beacon Aug 6, 2025, 6:34 AM

#

Hello guys, just wanted to confirm: is the new gpt-oss (120B) the most powerful Open Source Model out there?

naive elbow Aug 6, 2025, 6:35 AM

#

gaunt beacon Hello guys just wanted to confirm is the new gpt-oss (120B) the most powerful Op...

i mean it depends

gaunt beacon Aug 6, 2025, 6:36 AM

#

Are the benchmarks already available or do we still need to wait?

naive elbow Aug 6, 2025, 6:42 AM

#

gaunt beacon Are the benchmarks already available or do we still need to wait?

well it’s not looking good

gaunt beacon Aug 6, 2025, 6:43 AM

#

naive elbow well it’s not looking good

surprising

proper shuttle Aug 6, 2025, 7:08 AM

#

hewwo, does anyone know if the OSS model supports the name field like the API?

naive elbow Aug 6, 2025, 7:14 AM

#

well anyways i got the 20b working on my server. it doesn’t look like it’s good for the server, but it isn’t slow

nova nebula Aug 6, 2025, 8:31 AM

#

Guys how much unified memory would i have to have in order to run 120b version on my MacBook?

spark oar Aug 6, 2025, 8:39 AM

#

I would still rather just use 4o but it's really cool to have an LLM that will run on my rtx2080 lol

spice magnet Aug 6, 2025, 8:57 AM

#

yeah tbh gpt-oss 120b is so bad, it fails every roo code tool use and gets into thought loops constantly

#

doesnt even have up to date knowledge of stuff that happened more than a year ago

spiral gazelle Aug 6, 2025, 9:23 AM

#

Artificial Analysis has benchmark results in various areas. Well, it's not that impressive, honestly... Not bad, but also not impressive considering this model is from OpenAI.

#

However, it's good that they released an open-weight model. It's practically the first time for OAI - not a PoC, not research-only.

still ridge Aug 6, 2025, 9:31 AM

#

hello!! everyone..!!

nova nebula Aug 6, 2025, 9:56 AM

#

spiral gazelle Artificial Analysis has benchmark results in various area. Well, it's not that i...

I mean why would they make a model that would be the best model as of now and make it public for its competitors?

spiral gazelle Aug 6, 2025, 10:23 AM

#

Could be a fair point, but OAI advertised that 120b outperforms DeepSeek-R1 and even o3-mini.

cedar shoal Aug 6, 2025, 12:53 PM

#

yo

#

how to get the smaller model?

#

ping me if you have an answer thx

meager karma Aug 6, 2025, 1:04 PM

#

cedar shoal how to get the smaller model?

Install ollama or lm studio

copper grove Aug 6, 2025, 1:13 PM

#

cedar shoal how to get the smaller model?

Just install it. The link is in #announcements

cedar shoal Aug 6, 2025, 1:16 PM

#

copper grove Just install it. The link is in #announcements

Oh thank

copper grove Aug 6, 2025, 1:25 PM

#

I like the new models so far. I wonder when we can use models that are as good or even better than o3-pro locally

ionic holly Aug 6, 2025, 1:51 PM

#

Hey, can someone tell me if you’ve already talked about GPT-OSS (open-weight models)? What are the pros and cons of using them?

robust swallow Aug 6, 2025, 1:57 PM

#

❌ Error during agent execution: Error code: 400 - {'message': 'Model generated a tool call with name "find<|end|><|start|>assistant<|channel|>commentary" that is not in the tools list: ['search_wiki', 'search_web', 'open_url']', 'type': 'invalid_request_error', 'param': 'tools', 'code': 'wrong_api_format'}

#

awesome
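
The error above shows Harmony control tokens (`<|end|>`, `<|start|>`, `<|channel|>`) leaking into the tool name, so the server is asked for `find<|end|>...` instead of a clean name. A minimal client-side sketch of a workaround, assuming you can intercept the raw name before it is validated. `clean_tool_name` and `resolve_tool` are hypothetical helpers, not part of any library, and the real fix is probably an inference server that parses the harmony format correctly.

```python
from typing import Optional, Sequence


def clean_tool_name(raw: str) -> str:
    """Drop everything from the first '<|' control-token marker onward."""
    marker = raw.find("<|")
    name = raw if marker == -1 else raw[:marker]
    return name.strip()


def resolve_tool(raw: str, allowed: Sequence[str]) -> Optional[str]:
    """Return the cleaned name if it is a registered tool, otherwise None."""
    name = clean_tool_name(raw)
    return name if name in allowed else None


if __name__ == "__main__":
    # Tool list copied from the error message above.
    tools = ["search_wiki", "search_web", "open_url"]
    leaked = "find<|end|><|start|>assistant<|channel|>commentary"
    print(clean_tool_name(leaked))      # the bare name the model meant
    print(resolve_tool(leaked, tools))  # still not a registered tool
```

Note that even the cleaned name `find` is not in the tools list from the error, so stripping tokens alone would not have rescued this particular call; the caller still has to handle the `None` case.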

strange yacht Aug 6, 2025, 1:58 PM

#

Hey is anyone here using gpt-oss on LMStudio? Is there a UI way of enabling web search?

#

Or do I need to modify code? with what is stated in their site?

copper grove Aug 6, 2025, 2:01 PM

#

ionic holly Hey, can someone tell me if you’ve already talked about GPT-OSS (open-weight mod...

For in depth information, you can check out this article: https://openai.com/index/introducing-gpt-oss/

copper grove Aug 6, 2025, 2:03 PM

#

strange yacht Hey is anyone here using gpt-oss on LMStudio? Is there a UI way of enabling web ...

Nope, LM Studio doesn't support that. You either need to add your custom frontend or use an agent framework like AutoGen, LangChain etc.

spiral gazelle Aug 6, 2025, 2:03 PM

#

strange yacht Hey is anyone here using gpt-oss on LMStudio? Is there a UI way of enabling web ...

In LM Studio, you have to manually set up integrations in mcp.json. As far as I know, there's no simple setup that enables internet search. (I have a Wikipedia search plugin, but the menu seems to be gone)

strange yacht Aug 6, 2025, 2:04 PM

#

I was hoping they would take advantage of Gpt-OSS's included web search by modifying the system prompt and just add a toggle for it

spiral gazelle Aug 6, 2025, 2:05 PM

#

Additionally, internet search requires a long context (the LM Studio default is maybe 4096 in most cases), and a longer context requires more RAM.
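
To see why a longer context costs more RAM: the runtime caches a key and a value vector per token, per layer, so the cache grows linearly with context length. A rough sketch of that arithmetic follows. The layer count, KV-head count and head size below are hypothetical placeholders for illustration, not gpt-oss's actual configuration, and the formula assumes plain full attention with 16-bit cache values (sliding-window layers or a quantized cache would need less).

```python
def kv_cache_bytes(context_len: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_value: int = 2) -> int:
    """Estimate KV-cache size for one sequence under full attention."""
    # Factor 2: one key vector and one value vector per token, per layer.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * context_len


if __name__ == "__main__":
    # Hypothetical model shape, chosen only to make the scaling visible.
    cfg = dict(n_layers=24, n_kv_heads=8, head_dim=64)
    for ctx in (4096, 32768):
        mib = kv_cache_bytes(ctx, **cfg) / 2**20
        print(f"{ctx:>6} tokens -> {mib:.0f} MiB of KV cache")
```

Going from a 4096-token default to 32k multiplies this part of the memory budget by eight, on top of the weights themselves.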

ionic holly Aug 6, 2025, 2:06 PM

#

copper grove For in depth information, you can check out this article: https://openai.com/ind...

Oh nice, I asked to read some personal opinions on this update

spiral gazelle Aug 6, 2025, 2:06 PM

#

strange yacht I was hoping they would take advantage of Gpt-OSS's included web search by modif...

Unfortunately, that's impossible. The model file only holds weights, without ANY program code or executables.
Any executable code must be implemented on the client side.

strange yacht Aug 6, 2025, 2:07 PM

#

Guess Ill have to use an agent framework then

copper grove Aug 6, 2025, 2:15 PM

#

ionic holly Oh nice, I asked to read some personal opinions on this update

My personal opinion is that it's pretty good at math. So far it was able to help me with pretty much anything, I'd say even on the same level as o3. I haven't had much time to test other stuff but it definitely helps with all the basic things. For the 120b model, you need a very beefy PC, even with 4-bit quantization, which I happen to have, but it's still rather slow. That's why I mostly use the 20b model, and I get like 40-50 tokens per second with it.

robust swallow Aug 6, 2025, 2:29 PM

#

The user asks: "What is the capital of France?" We have a tool that fetched "It is Paris." The assistant should answer. We must provide answer. There's no extra nuance. The answer: Paris. Maybe also a brief. So answer: Paris.

#

The assistant should answer. We must provide answer.

#

wow

granite garnet Aug 6, 2025, 2:30 PM

#

Hey! Has anyone tried using the latest models on local gpt-oss?

#

i have a problem, gpt-oss is running under cpu :/

robust swallow Aug 6, 2025, 2:30 PM

#

how much vram do you have

granite garnet Aug 6, 2025, 2:31 PM

#

robust swallow how much vram do you have

16gb vram rtx 5070ti

robust swallow Aug 6, 2025, 2:31 PM

#

hm

#

in what way u running

#

ollama lmstudio

strange yacht Aug 6, 2025, 2:32 PM

#

copper grove My personal opinion is, that it's pretty good at math. So far it was able to hel...

Mid end PC - I9 14900k and 4070 gets about 15-20 tokens/s so its not bad

granite garnet Aug 6, 2025, 2:33 PM

#

robust swallow ollama lmstudio

running in ui

copper grove Aug 6, 2025, 2:36 PM

#

strange yacht Mid end PC - I9 14900k and 4070 gets about 15-20 tokens/s so its not bad

yea definitely good speeds

strange yacht Aug 6, 2025, 2:36 PM

#

granite garnet running in ui

I don't know about ollama in particular but LMStudio has a specific setting you can set to change that processing

robust swallow Aug 6, 2025, 2:36 PM

#

ollama always chose gpu for me hm

granite garnet Aug 6, 2025, 2:37 PM

#

strange yacht I don't know about ollama in particular but LMStudio has a specific setting you ...

For example, if I change to the DeepSeek model, it does use the GPU

strange yacht Aug 6, 2025, 2:37 PM

#

Oh that is weird then

shut pollen Aug 6, 2025, 3:27 PM

#

no

wild roost Aug 6, 2025, 3:36 PM

#

hf[dot]co[slash]tonic [slash]gpt-oss-20b-multilingual-reasoner made this with this github[dot]com[slash]josephrp[slash]smolfactory yesterday , if anyone wants to help me out i want to iron out some stuff and basically publish it asap , hopefully tomorrow or probably day after 🙂

eager anvil Aug 6, 2025, 3:37 PM

#

What's the performance difference between the GGUF and the MX-FP4 versions?

#

Trying to decide if it is worth trying to run on my mac.

#

Guess ollama runs the MX-FP4 version, not a GGUF, so it should just work right?

midnight sorrel Aug 6, 2025, 5:20 PM

#

eager anvil What's the performance difference between the GGUF and the MX-FP4 versions?

gguf is a file format

#

MXFP4 is different

#

the model's weights are available as MXFP4 in a gguf file
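
To make the "GGUF is a file format" point concrete: a GGUF file is a container that starts with a small fixed header, while the quantization type (MXFP4 or anything else) is recorded per tensor further inside the file. A minimal sketch that reads only that fixed header, assuming GGUF version 2 or later, where the two counts are 64-bit little-endian integers. `parse_gguf_header` is a hypothetical helper written for this note, not a function from any GGUF library.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4-byte magic + uint32 version + two uint64 counts


def parse_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header from the first bytes of a file."""
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least 24 bytes")
    if data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file (bad magic)")
    # '<' = little-endian, no padding: uint32 version, uint64, uint64.
    version, tensor_count, metadata_kv_count = struct.unpack_from("<IQQ", data, 4)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": metadata_kv_count,
    }


def read_gguf_header(path: str) -> dict:
    """Convenience wrapper: read the header from a file on disk."""
    with open(path, "rb") as f:
        return parse_gguf_header(f.read(HEADER_SIZE))
```

Calling `read_gguf_header` on a downloaded `.gguf` file tells you how many tensors and metadata entries it holds, but nothing about speed or quality; those depend on the quantization stored inside.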

eager anvil Aug 6, 2025, 5:24 PM

#

midnight sorrel the models weights are available as MXFP4 in a gguf file

Thanks. Now I'm trying to get the typescript streaming openai module to work with ollama, without having to cold start the model on every run. For some reason the model isn't staying in memory.

left wadi Aug 6, 2025, 5:29 PM

#

My laptop is only getting 50 tokens per second on GPT-OSS 20B.

midnight sorrel Aug 6, 2025, 5:31 PM

#

eager anvil Thanks. Now I'm trying to get the typescript streaming openai module to work wit...

OLLAMA_KEEP_ALIVE set like you need?
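
For context on the suggestion above: `OLLAMA_KEEP_ALIVE` is an environment variable that sets how long the Ollama server keeps a model loaded after a request, and Ollama's native API also accepts a per-request `keep_alive` field. A minimal sketch that builds a "preload" request (no prompt, just load the model and keep it resident), assuming a default local Ollama at port 11434. `build_preload_request` and `send` are hypothetical helper names, and whether the OpenAI-compatible `/v1` endpoint honors a per-request `keep_alive` is not something this sketch assumes.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_GENERATE_URL = "http://localhost:11434/api/generate"


def build_preload_request(model: str, keep_alive: str = "30m") -> urllib.request.Request:
    """Build a POST that loads `model` and asks the server to keep it in memory."""
    payload = {"model": model, "keep_alive": keep_alive}  # no prompt: load only
    return urllib.request.Request(
        OLLAMA_GENERATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Usage would be `send(build_preload_request("gpt-oss:20b", "1h"))` once at startup, after which later chat calls should hit an already-loaded model for as long as the keep-alive window lasts.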

crimson anvil Aug 6, 2025, 5:33 PM

#

y

eager anvil Aug 6, 2025, 5:56 PM

#

midnight sorrel OLLAMA_KEEP_ALIVE set like you need?

I was only outputting the response tokens, not the reasoning, so it was always instantly responding.

#

But this is the saddest reasoning tokens you can receive: According to the policy above: "Vampires" is in the list of disallowed content.

midnight sorrel Aug 6, 2025, 6:01 PM

#

eager anvil But this is the saddest reasoning tokens you can receive: ```According to the po...

who even made this disallowed content?

eager anvil Aug 6, 2025, 6:02 PM

#

midnight sorrel who even made this disallowed content?

OpenAI. I asked it for a sad story about a Warhammer Fantasy village set in the empire that is beset by a vampire.

midnight sorrel Aug 6, 2025, 6:03 PM

#

yeah, gpt-oss is very bad at creative stuff

eager anvil Aug 6, 2025, 6:03 PM

#

Not sure what guidelines would have ever made that disallowed content.

midnight sorrel Aug 6, 2025, 6:04 PM

#

it's good for coding and stuff tho

eager anvil Aug 6, 2025, 6:04 PM

#

midnight sorrel it's good for coding and stuff tho

You pointed Cursor at it? The 20B 4bit model?

midnight sorrel Aug 6, 2025, 6:05 PM

#

not cursor

#

kilo code

#

tho don't take my opinion, I only make very small controlled changes on my code

eager anvil Aug 6, 2025, 6:16 PM

#

I need to rewrite this so that the function calls go ahead and call the functions, and their results automatically get fed back into the conversation and the llm called again.

If gpt-oss:20b can do this, it's over.

#

Okay, that was amazing. It messed up the VSC tool call, and the code wasn't perfect, but it was really close.
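
The rewrite described above (run the requested functions, feed the results back, call the model again) can be sketched as a small loop. This is a minimal illustration, assuming messages shaped like the OpenAI chat format (`tool_calls` on the assistant message, `role: "tool"` for results). `run_tool_loop` and `fake_model` are hypothetical names; the fake model stands in for a real call to gpt-oss so the sketch runs on its own.

```python
import json
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def run_tool_loop(call_model: Callable[[List[Message]], Message],
                  tools: Dict[str, Callable[..., Any]],
                  messages: List[Message],
                  max_rounds: int = 5) -> str:
    """Call the model, execute any tool calls, append results, and repeat."""
    history = list(messages)
    for _ in range(max_rounds):
        reply = call_model(history)
        history.append(reply)
        calls = reply.get("tool_calls") or []
        if not calls:
            return reply.get("content") or ""  # plain answer: we are done
        for call in calls:
            fn = call["function"]
            try:
                args = json.loads(fn["arguments"])
                result = tools[fn["name"]](**args)
            except Exception as exc:  # report failures back to the model
                result = f"error: {exc}"
            history.append({"role": "tool",
                            "tool_call_id": call["id"],
                            "content": str(result)})
    raise RuntimeError("model kept calling tools past max_rounds")


def fake_model(history: List[Message]) -> Message:
    """Stand-in model: requests add(2, 3) once, then reports the result."""
    last = history[-1]
    if last["role"] == "tool":
        return {"role": "assistant", "content": f"The sum is {last['content']}."}
    return {"role": "assistant", "content": None,
            "tool_calls": [{"id": "call_1",
                            "function": {"name": "add",
                                         "arguments": '{"a": 2, "b": 3}'}}]}


if __name__ == "__main__":
    answer = run_tool_loop(fake_model, {"add": lambda a, b: a + b},
                           [{"role": "user", "content": "What is 2 + 3?"}])
    print(answer)
```

Errors from unknown tool names or bad arguments are sent back as tool results rather than raised, which gives the model a chance to correct itself on the next round.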

royal kraken Aug 6, 2025, 6:49 PM

#

Is it possible to run the 120B model on pure CPU and no VRAM? I heard you need 128 GB RAM to run it

alpine ridge Aug 6, 2025, 6:50 PM

#

royal kraken Is it possible to run the 120B model on pure CPU and no VRAM? I heard you need 1...

probably not

royal kraken Aug 6, 2025, 6:50 PM

#

I saw some post of someone doing that and they were getting something in the ballpark of 50-100 tokens/second, which surprised me

shut pollen Aug 6, 2025, 6:51 PM

#

royal kraken Is it possible to run the 120B model on pure CPU and no VRAM? I heard you need 1...

Yeah close to 120GB of ram from what I've seen

#

I'm able to run it on apple silicon without any issues

royal kraken Aug 6, 2025, 6:53 PM

#

With 128 gb ram?

shut pollen Aug 6, 2025, 6:53 PM

#

yeah

royal kraken Aug 6, 2025, 6:54 PM

#

Very nice, how many tokens you getting per sec?

shut pollen Aug 6, 2025, 7:04 PM

#

royal kraken Very nice, how many tokens you getting per sec?

around 14-17 tks

#

(this is also with like a million other things open) and at 32k context

royal kraken Aug 6, 2025, 7:05 PM

#

Ah ok

shut pollen Aug 6, 2025, 7:05 PM

#

its not too bad

royal kraken Aug 6, 2025, 7:05 PM

#

How is it versus deepseek r1?

shut pollen Aug 6, 2025, 7:05 PM

#

It runs faster than r1 70b

#

performance wise its kinda similar, sometimes worse

royal kraken Aug 6, 2025, 7:06 PM

#

Sometimes worse in what facility? I want to use it for data processing

shut pollen Aug 6, 2025, 7:08 PM

#

royal kraken Sometimes worse in what facility? I want to use it for data processing

it struggles with coding - I'd just stick to qwen if you want OS models

royal kraken Aug 6, 2025, 7:08 PM

#

GLM 4.5 is better than qwen imo

#

For that

shut pollen Aug 6, 2025, 7:08 PM

#

also struggles at tool calls

royal kraken Aug 6, 2025, 7:10 PM

#

too bad

echo shuttle Aug 6, 2025, 7:15 PM

#

Hello, im new to oss models
What are the best models for general use, and also models specifically for coding?
And how do they compare to 2.5pro or o3

royal kraken Aug 6, 2025, 7:17 PM

#

coding, glm 4.5. , i'd say on par with both 2.5pro and o3. for the air model, a bit worse than o3 but still decent

upbeat pawn Aug 6, 2025, 7:21 PM

#

I downloaded 20b how I run it like a chat

echo shuttle Aug 6, 2025, 7:24 PM

#

royal kraken coding, glm 4.5. , i'd say on par with both 2.5pro and o3. for the air model, a ...

Thank you

#

And like for consumer usage coding and general best models? Like less than 15B params

royal kraken Aug 6, 2025, 7:37 PM

#

there are no good models for coding that are under 15b

#

gemma 12b is not bad for a basic reasoning model that can run off a potato machine, relatively speaking

#

use the unsloth version though

obsidian wyvern Aug 6, 2025, 7:52 PM

#

For everyone here, developers will probably also be interested in https://community.openai.com/t/openais-open-weight-models-are-here-gpt-oss-120b-and-20b/1334739

strange yacht Aug 6, 2025, 8:05 PM

#

Anyone here has worked with LM studio and MCP servers before? I am having a lot of trouble running an MCP server and having it recognized by LMStudio to use with GPT-OSS to test tool usage

steel vine Aug 6, 2025, 9:58 PM

#

maybe best to ask on the lm studio discord?

eager anvil Aug 6, 2025, 9:58 PM

#

I'm consistently getting about 37tps with the 20b on my m1.

#

Not bad.

#

It's so good, wondering what I'd need to run 120b locally.

steel vine Aug 6, 2025, 10:00 PM

#

six times more memory

boreal herald Aug 6, 2025, 10:01 PM

#

eager anvil I'm consistantly getting about 37tps with the 20b on my m1.

8 gb ram?

eager anvil Aug 6, 2025, 10:01 PM

#

32gb of shared

#

But only using a little over 13 with ollama

steel vine Aug 6, 2025, 10:02 PM

#

4 bit quant?

eager anvil Aug 6, 2025, 10:03 PM

#

Yeah, the one OAI packaged and released.

#

MXFP4

#

Trying to figure out what the hugging face model card means by: Web browsing (using built-in browsing tools)

#

What built in browsing tools?

steel vine Aug 6, 2025, 10:05 PM

#

interesting. i wonder why the usual huggingface quantizers are releasing 4, 5, 6 and 8 bit quants if openai already released quantized versions

#

i assume it means it was trained on using the function calls listed on the github repo ie browser.search, open and find => https://github.com/openai/gpt-oss?tab=readme-ov-file#browser

#

and you have to intentionally enable it:

To enable the browser tool, you'll have to place the definition into the system message of your harmony formatted prompt. You can either use the with_browser() method if your tool implements the full interface or modify the definition using with_tools().

eager anvil Aug 6, 2025, 10:23 PM

#

Tried to get codex working with gpt-oss using this on the config.toml:

```
show_reasoning_content = true

[model_providers.local]
name = "local"
base_url = "http://localhost:11434/v1"

[profiles.oss]
model = "gpt-oss"
model_provider = "local"
```

Codex can't find it for some reason.

#

It works as an agent in VSC, but it will do things like call the tool to read the file twice, then forget what it was doing and end with "how can I help."

steel vine Aug 6, 2025, 10:34 PM

#

you will want a recent version of ollama to support the harmony response format

#

when was the last time you updated ollama? (isnt automatic on linux)

bright wedge Aug 6, 2025, 11:04 PM

#

eager anvil Tried to get codex working with gpt-oss using this on the config.toml: ```disabl...

I was getting caught up with this too. It's in fact much more streamlined.

Just have the latest version of ollama, ensure you can run the gpt-oss model just through ollama run.

Then you'll run codex with
codex --oss -m gpt-oss:20b (or 120b if you're like that)

No need for providers, profiles, or even a config file at all.

balmy wing Aug 6, 2025, 11:13 PM

#

This sub is for Apple as they will be putting it on their iPhones soon xoxo

eager anvil Aug 7, 2025, 12:11 AM

#

bright wedge I was getting caught up with this too. It's in fact much more streamlined. Jus...

I got it working with the --oss flag but can't otherwise. Weird, but whatever.

eager anvil Aug 7, 2025, 12:12 AM

#

balmy wing This sub is for Apple as they will be putting it on there iPhones soon xoxo

13.3GB is still too large for a phone. They'll have to quantize it. But I've been having issues with tool calling with the full 20B version.

#

Wonder why gpt-oss isn't on any benchmarks yet.

balmy wing Aug 7, 2025, 12:13 AM

#

eager anvil 13.3GB is still too large for a phone. They'll have to quanticize it. But I've b...

If they could store the model in a chip at the nano scale then it would work xD

#

lol each new phone gets a new model 🤣

balmy wing Aug 7, 2025, 12:14 AM

#

eager anvil Wonder why gpt-oss isn't on any benchmarks yet.

I have one on bedrock

#

You can run benchmarks vs each other

shut pollen Aug 7, 2025, 12:27 AM

#

eager anvil Wonder why gpt-oss isn't on any benchmarks yet.

It is, check out artificial analysis

tawdry tulip Aug 7, 2025, 4:47 AM

#

How do you get the 20b one in ur phone

#

And does it work offline?

midnight sorrel Aug 7, 2025, 5:01 AM

#

tawdry tulip How do you get the 20b one in ur phone

I don't know what sam altman was thinking but there are not a lot of phones that can run 20b locally

#

You'd need like 16-24 gb ram for this

hybrid magnet Aug 7, 2025, 5:05 AM

#

tawdry tulip How do you get the 20b one in ur phone

You need a heck of a phone, I think. https://openai.com/open-models/
https://openai.com/index/introducing-gpt-oss/

hybrid magnet Aug 7, 2025, 5:06 AM

#

midnight sorrel I don't what sam altman was thinking but there are not a lot of phones that can ...

Did Sam state it was intended to run on phones? I haven't found that in the docs yet.

static pivot Aug 7, 2025, 5:06 AM

#

Oh sheesh, I'll need to make sure my external drive can handle it on my potato of a pc.

hybrid magnet Aug 7, 2025, 5:09 AM

#

eager anvil Wonder why gpt-oss isn't on any benchmarks yet.

Did you check the docs? Or are there other benchmarks you're looking for? https://openai.com/index/introducing-gpt-oss/ https://openai.com/index/gpt-oss-model-card/

midnight sorrel Aug 7, 2025, 5:12 AM

#

hybrid magnet Did Sam state it was intended to run on phones? I haven't found that in the doc...

on twitter

hybrid magnet Aug 7, 2025, 5:13 AM

#

midnight sorrel on twitter

I don't have an x/twitter account so I won't be able to look for that

midnight sorrel Aug 7, 2025, 5:13 AM

#

https://x.com/sama/status/1952777539052814448

Sam Altman (@sama)

gpt-oss is out!

we made an open model that performs at the level of o4-mini and runs on a high-end laptop (WTF!!)

(and a smaller one that runs on a phone).

super proud of the team; big triumph of technology.

hybrid magnet Aug 7, 2025, 5:14 AM

#

midnight sorrel https://x.com/sama/status/1952777539052814448

Thank you!

Now I am wondering if the word 'a' is significant.

A phone. One. There is one super-phone, in the world, that can run this model.

I'm joking, I'm not really at all skilled in comp sci, I have no idea but a 16 gb ram phone sounds awesome to me. I got a potato phone 😄

midnight sorrel Aug 7, 2025, 5:15 AM

#

hybrid magnet Thank you! Now I am wondering if the word 'a' is significant. A phone. One. T...

there are a few 24 gb ram phones actually
again you need to consider the CPU GPU NPU specs as well

#

but generally speaking phones don't have 16gb
going by standards of top iphone, samsung and pixel phones

spiral gazelle Aug 7, 2025, 5:19 AM

#

I had gaming laptop with RTX 3060(6GB VRAM + 32GB CPU RAM)... guess it's not high-end enough. Gemma 3 12b (7.2GB in size) was already slow.

naive elbow Aug 7, 2025, 5:23 AM

#

eager anvil But this is the saddest reasoning tokens you can receive: ```According to the po...

how is vampires disallowed?? they are sfw man oai censorship is diabolical

spiral gazelle Aug 7, 2025, 5:29 AM

#

Well, early GPT-4o hallucinated policy, and I got "image of dragons is not allowed"...

#

My test for gpt-oss-20b.
Prompt:

Are there any disallowed content in the following: "Vampire", "Werewolf", "Dragon"

Reasoning token (partial)

But the user lists "Vampire", "Werewolf", "Dragon". These are mythical creatures. They don't appear to be disallowed. There's no mention of violence or other categories. It's just terms. So likely none of these is disallowed content.

We need to check if any of them could be considered a protected class? No. Not relevant. Are they violent? Possibly but not necessarily. "Vampire" and "Werewolf" are fantasy creatures, might depict violence in stories, but the term alone isn't disallowed. There's no mention that these terms refer to real persons or groups. They are fictional.

Thus the answer: None of those are disallowed content.

hybrid magnet Aug 7, 2025, 5:31 AM

#

spiral gazelle I had gaming laptop with RTX 3060(6GB VRAM + 32GB CPU RAM)... guess it's not hig...

I'd check the oss-20b if you choose to, because the system card discusses some fairly fancy/maybe novel quantization and related stuff, this GPT-OSS may work for you on that system, maybe worth a check.

hybrid magnet Aug 7, 2025, 5:32 AM

#

spiral gazelle I had gaming laptop with RTX 3060(6GB VRAM + 32GB CPU RAM)... guess it's not hig...

If you decide to try it, I hope you let us know!

spiral gazelle Aug 7, 2025, 5:33 AM

#

hybrid magnet I'd check the oss-20b if you choose to, because the system card discusses some f...

Thank you for response, but... I now have system with 16GB VRAM.

hybrid magnet Aug 7, 2025, 5:33 AM

#

Whoops, gotcha. Sorry!

spiral gazelle Aug 7, 2025, 5:33 AM

#

I was responding to Sam's tweet:

we made an open model that performs at the level of o4-mini and runs on a high-end laptop

floral wing Aug 7, 2025, 6:01 AM

#

tawdry tulip How do you get the 20b one in ur phone

does your phone have a 4080?

static pivot Aug 7, 2025, 6:59 AM

#

I don't know if mine does. 💀

naive elbow Aug 7, 2025, 10:33 AM

#

spiral gazelle My test for gpt-oss-20b. Prompt: ``` Are there any disallowed content in the fol...

hmm interesting

harsh aurora Aug 7, 2025, 11:15 AM

#

floral wing does your phone have a 4080?

at this point, it does not really need to be on GPU, the model runs slower, but ok-ish, on CPU

#

there are phones with 16 GB RAM

#

but a distilled and fine-tuned version of the model would probably perform better for this use case

silk hazel Aug 7, 2025, 11:34 AM

#

u guys should stop working and wait for gpt5 so you can work faster

tawdry tulip Aug 7, 2025, 12:43 PM

#

floral wing does your phone have a 4080?

😭 🙏

tepid garnet Aug 7, 2025, 12:45 PM

#

I am running openai/gpt-oss-120b on my MacBook Pro, very happy with it apart from the knowledge cut-off date of June 2024

midnight sorrel Aug 7, 2025, 12:45 PM

#

tepid garnet I am running openai/gpt-oss-120b on my MacBook Pro, very happy with it apart fro...

give it the power of internet

tepid garnet Aug 7, 2025, 12:46 PM

#

midnight sorrel give it the power of internet

I wish that was an option

midnight sorrel Aug 7, 2025, 12:46 PM

#

could be

copper grove Aug 7, 2025, 12:48 PM

#

tepid garnet I wish that was an option

You can add that yourself

midnight sorrel Aug 7, 2025, 12:48 PM

#

tepid garnet I wish that was an option

get a duckduckgo mcp, bro
then use it with an mcp client

tepid garnet Aug 7, 2025, 12:54 PM

#

I just got the 120b model to code me a version of my talking clock app, worked first time

analog coral Aug 7, 2025, 1:01 PM

#

hi, I'm dumb and I don't know the very first thing about locally running a model, but with these models releasing I wanted to learn how. I tried looking things up and I didn't end up getting very far. instead, I was wondering if anyone could point me in the direction of any beginner friendly resources for getting something like this set up?

tepid garnet Aug 7, 2025, 1:01 PM

#

analog coral hi, I'm dumb and I don't know the very first thing about locally running a model...

what type of computer do you have?

analog coral Aug 7, 2025, 1:02 PM

#

Windows 11, NVIDIA 4070 Super, Intel 14700K, 32gb ram

tepid garnet Aug 7, 2025, 1:03 PM

#

how much VRAM does the 4070 have?

analog coral Aug 7, 2025, 1:03 PM

#

12gb

tepid garnet Aug 7, 2025, 1:04 PM

#

search for an app called LM Studio and use that to download and run the 20B variant

analog coral Aug 7, 2025, 1:05 PM

#

Aye aye, captain

nova nebula Aug 7, 2025, 1:10 PM

#

tepid garnet I am running openai/gpt-oss-120b on my MacBook Pro, very happy with it apart fro...

How much memory do you have? And what chip?

tepid garnet Aug 7, 2025, 1:10 PM

#

nova nebula How much memory do you have? And what chip?

I have a MacBook Pro M2 Max with 96GB RAM

nova nebula Aug 7, 2025, 1:10 PM

#

tepid garnet I have a MacBook Pro M2 Max with 96GB RAM

oh wow nice

tepid garnet Aug 7, 2025, 1:11 PM

#

The 120b model is using 61GB RAM

#

running at about 16 tk/s

#

faster than reading speed
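
A quick back-of-the-envelope check of the "faster than reading speed" remark. The conversion below assumes roughly 0.75 English words per token and a typical silent reading speed of about 250 words per minute; both are rules of thumb picked for illustration, not measurements from this thread.

```python
WORDS_PER_TOKEN = 0.75   # assumption: rough average for English text
READING_WPM = 250        # assumption: typical adult silent reading speed


def tokens_per_sec_to_wpm(tokens_per_sec: float,
                          words_per_token: float = WORDS_PER_TOKEN) -> float:
    """Convert generation speed in tokens/second to words/minute."""
    return tokens_per_sec * words_per_token * 60


if __name__ == "__main__":
    wpm = tokens_per_sec_to_wpm(16)
    print(f"16 tok/s is about {wpm:.0f} words per minute")
    print("faster than reading" if wpm > READING_WPM else "slower than reading")
```

Under those assumptions, 16 tok/s is several times faster than most people read, though reasoning tokens produced before the visible answer eat into that margin.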

#

OpenAI did a really good job with the OSS models

midnight sorrel Aug 7, 2025, 1:25 PM

#

tepid garnet search for an app called LM Studio and use that to download and run the 20B vari...

Captain 20b won't fit in 12gb

analog coral Aug 7, 2025, 1:25 PM

#

it can, it will, and it DID

#

and it doesn't break a sweat, i don't know how the wizards do it. thank you for the help!

tepid garnet Aug 7, 2025, 1:26 PM

#

midnight sorrel Captain 20b won't fit in 12gb

it should fit (just)

midnight sorrel Aug 7, 2025, 1:27 PM

#

tepid garnet it should fit (just)

Nah, need more buffer for running the model - context cache and stuff

#

I tried running on my macbook, 16gb, it jumped to swap memory

#

1 token in 20 secs

tepid garnet Aug 7, 2025, 1:28 PM

#

midnight sorrel Nah, need more buffer for running the model - context cache and stuff

LM Studio allows you to use a combo of RAM and VRAM on a PC which this user had

#

20B won't run on a 16GB MacBook

midnight sorrel Aug 7, 2025, 1:30 PM

#

tepid garnet LM Studio allows you to use a combo of RAM and VRAM on a PC which this user had

Oh, that's nice

eager anvil Aug 7, 2025, 1:36 PM

#

naive elbow how is vampires disallowed?? they are sfw man oai censorship is diabolical

Keep in mind, it was in the context of a Warhammer Fantasy story about an empire village beset by a vampire. But I could see its reasoning trace and it hadn't thought up anything too bad, it just restated the request then said vampires were a safety issue. But Warhammer Fantasy was made for kids, and doesn't contain anything too bad except for I guess some extreme violence.

eager anvil Aug 7, 2025, 1:37 PM

#

tepid garnet running at about 16 tk/s

That's pretty crazy as on my mac I get about 36tps with 20b.

eager anvil Aug 7, 2025, 1:38 PM

#

tepid garnet 20B won't run on a 16GB MacBook

It only uses about 13.3GB of ram on my mac. You have an M series?

tepid garnet Aug 7, 2025, 1:38 PM

#

eager anvil That's pretty crazy as on my mac I get about 36tps with 20b.

I have a MacBook Pro M2 Max with 96GB RAM

#

I wonder how many tk/s the 20B model gives me on my Mac, I am going to try it

eager anvil Aug 7, 2025, 1:40 PM

#

Be interesting to see the difference between m1 and m2.

tepid garnet Aug 7, 2025, 1:41 PM

#

53.68 tok/sec - 1750 tokens - 0.36s to first token

eager anvil Aug 7, 2025, 1:41 PM

#

I've been very impressed with the 20b. Tool calling isn't perfect but really good. I imagine if I used/wrote a library to automatically call the model to fix errors in JSON shapes, it would be perfect.
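
The "call the model again to fix JSON shape errors" idea could look something like the sketch below. It is only an illustration: `find_problem` and `parse_args_with_repair` are hypothetical helpers, the shape check is a simple required-keys-and-types test rather than a full JSON Schema validator, and `ask_model_to_fix` is whatever function sends the broken text plus the error description back to the model.

```python
import json
from typing import Callable, Dict, Optional


def find_problem(text: str, required: Dict[str, type]) -> Optional[str]:
    """Return a description of what is wrong with `text`, or None if it is fine."""
    try:
        value = json.loads(text)
    except json.JSONDecodeError as exc:
        return f"not valid JSON: {exc.msg}"
    if not isinstance(value, dict):
        return "top-level value must be an object"
    for key, expected in required.items():
        if key not in value:
            return f"missing key '{key}'"
        if not isinstance(value[key], expected):
            return f"key '{key}' must be of type {expected.__name__}"
    return None


def parse_args_with_repair(raw: str,
                           required: Dict[str, type],
                           ask_model_to_fix: Callable[[str, str], str],
                           max_repairs: int = 2) -> dict:
    """Validate tool-call arguments, asking the model to repair them if needed."""
    text = raw
    problem = None
    for attempt in range(max_repairs + 1):
        problem = find_problem(text, required)
        if problem is None:
            return json.loads(text)
        if attempt < max_repairs:
            text = ask_model_to_fix(text, problem)  # one more model round-trip
    raise ValueError(f"still invalid after {max_repairs} repairs: {problem}")
```

For example, with a fixer that returns `'{"path": "main.py"}'`, the broken input `'{"path": main.py'` is repaired in one round, while already-valid arguments never trigger an extra model call.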

eager anvil Aug 7, 2025, 1:42 PM

#

tepid garnet 53.68 tok/sec - 1750 tokens - 0.36s to first token

That is... really nice. I have an M4 with 128GB of ram I should try these on. But it's my work laptop and I hate pulling it out.

tepid garnet Aug 7, 2025, 1:43 PM

#

M4 with 128GB RAM beats my M2 Max with 96GB RAM for sure

#

I spent 6k on my MacBook Pro, it needs to give me another 2 years of use before I upgrade

eager anvil Aug 7, 2025, 1:45 PM

#

My previous mac lasted 11 years. I'm hoping to get that out of this m1. I really need to set up some cloud inference, but running local is just so much more fun.

tepid garnet Aug 7, 2025, 1:45 PM

#

I like local models

#

and OpenAI did a good job with the OSS models

wheat jungle Aug 7, 2025, 2:07 PM

#

I'm fairly new to AI. What advantages do the OSS models have over the browser based ChatGPT model?

tepid garnet Aug 7, 2025, 2:08 PM

#

wheat jungle I'm fairly new to AI. What advantages do the OSS models have over the browser ba...

you can run them locally, you can modify them, you can fine tune them, you can even get them to write erotic fiction

wheat jungle Aug 7, 2025, 2:10 PM

#

very cool!

hot fiber Aug 7, 2025, 2:16 PM

#

what is oss ?

mellow hedge Aug 7, 2025, 2:16 PM

#

tawdry tulip How do you get the 20b one in ur phone

💀

dense lion Aug 7, 2025, 2:16 PM

#

hmm

eager anvil Aug 7, 2025, 2:16 PM

#

tepid garnet you can run them locally, you can modify them, you can fine tune them, you can e...

Saw last night that they'd been jailbroken, but haven't looked into it. At least the 20b isn't that good at creative writing, so I'll stick to o3 for fun short stories.

red monolith Aug 7, 2025, 2:16 PM

#

hot fiber what is oss ?

Euthanized open source gpt model with no soul

hot fiber Aug 7, 2025, 2:17 PM

#

ty

wheat depot Aug 7, 2025, 2:17 PM

#

codex with oss would be nice

spiral gazelle Aug 7, 2025, 2:17 PM

#

Nobody knows, but I assume "open source software" or something.
gpt-oss is open-weight, not open source though...

dense lion Aug 7, 2025, 2:17 PM

#

how much ram and SSD memory is required to run this oss?

eager anvil Aug 7, 2025, 2:17 PM

#

mellow hedge 💀

is this even possible? The weights OAI released were already at 4.25bit/average.
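
Where the 4.25 bits figure comes from, and what it implies for size: MXFP4 stores each weight in 4 bits plus one shared 8-bit scale per block of 32 weights, so the average is 4 + 8/32 = 4.25 bits. A rough sketch of the resulting weight size follows, assuming the nominal parameter counts from the model names (20B and 120B) and that every weight is stored in MXFP4. Both are simplifications, so treat the results as ballpark lower bounds that exclude the context cache and runtime overhead.

```python
# 4-bit element + one 8-bit scale shared by a block of 32 elements.
MXFP4_BITS_PER_WEIGHT = 4 + 8 / 32  # = 4.25


def weight_gb(n_params: float, bits_per_weight: float = MXFP4_BITS_PER_WEIGHT) -> float:
    """Approximate size of the weights in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # Nominal sizes taken from the model names; actual counts differ somewhat.
    for label, n in (("20b", 20e9), ("120b", 120e9)):
        print(f"{label}: about {weight_gb(n):.1f} GB of weights")
```

Those estimates sit in the same range as the usage reported in this thread (about 13.3GB for 20b and 61GB for 120b), and they show why squeezing the 20b much further for a phone would mean going below 4 bits per weight.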

final cradle Aug 7, 2025, 2:17 PM

#

Is the 20gb Model able to run on a iPhone 16 Pro?

dense lion Aug 7, 2025, 2:17 PM

#

what are weights?

tepid garnet Aug 7, 2025, 2:18 PM

#

eager anvil Saw last night that they'd been jailbroken, but haven't looked into it. At least...

you don't need to jailbreak to get soft erotica out of the OSS models, they will do it out of the box

mellow hedge Aug 7, 2025, 2:18 PM

#

tawdry tulip How do you get the 20b one in ur phone

Are you new to AI? just asking

eager anvil Aug 7, 2025, 2:18 PM

#

dense lion how much ram and SSD memory is required to run this oss?

I've found you need 13.3GB of unified/GPU ram to run the 20b.
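The memory figures quoted in this thread can be sanity-checked with a back-of-the-envelope estimate: parameter count times bits per weight, divided by eight. The sketch below is only that. It assumes the nominal 20B and 120B parameter counts and applies the 4.25 bits/weight average mentioned above uniformly to every weight. The function name is made up for illustration, and context (KV cache) and runtime overhead are not modelled, which is presumably why observed usage is higher than the raw weight size.

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Nominal parameter counts and the 4.25-bit average quoted in the chat (assumptions).
size_20b = weight_size_gb(20e9, 4.25)
size_120b = weight_size_gb(120e9, 4.25)

print(f"20b weights:  ~{size_20b:.1f} GB")
print(f"120b weights: ~{size_120b:.1f} GB")
```

Under these assumptions the 20b weights come to a little under 11 GB and the 120b weights to just under 64 GB, which is in the same ballpark as the 13.3GB and 63GB reported in this thread once some overhead is added.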

dense lion Aug 7, 2025, 2:18 PM

#

final cradle Is the 20gb Model able to run on a iPhone 16 Pro?

not sure

unique sinew Aug 7, 2025, 2:18 PM

#

Does this have any correlation with integrating ChatGPT into your own Locally Hosted LLM? Or will the process still be the same, needing the API? Sorry if ignorant question.

eager anvil Aug 7, 2025, 2:18 PM

#

tepid garnet you don't need to jailbreak to get soft erotica out of the OSS models, they will...

Wow, that's bizarre given it gave me a safety warning about a non-erotica vampire story.

clever rose Aug 7, 2025, 2:18 PM

#

final cradle Is the 20gb Model able to run on a iPhone 16 Pro?

On the face this looks like a no

eager anvil Aug 7, 2025, 2:19 PM

#

Just asked it for a sad story about a village beset by a vampire, and it said it couldn't do it.

tepid garnet Aug 7, 2025, 2:19 PM

#

eager anvil Wow, that's bizarre given it gave me a safety warning about a non-erotica vampir...

I did an erotic ghost story

eager anvil Aug 7, 2025, 2:19 PM

#

tepid garnet I did an erotic ghost story

I've seen that movie.

final cradle Aug 7, 2025, 2:19 PM

#

clever rose On the face this looks like a no

Too bad. Would be so cool if this would work on a phone in the future though

eager anvil Aug 7, 2025, 2:19 PM

#

the clay throwing scene was intense.

hot fiber Aug 7, 2025, 2:19 PM

#

tepid garnet you don't need to jailbreak to get soft erotica out of the OSS models, they will...

but they do have a ton of filters i assume

dense lion Aug 7, 2025, 2:19 PM

#

unique sinew Does this have any correlation with integrating ChatGPT into your own Locally Ho...

two different things. oss model is already sort of ChatGPT. so if you run oss locally, you don't need API.
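To make "run it locally, no API needed" concrete: local runners such as LM Studio can expose an OpenAI-style HTTP endpoint on your own machine, so existing API client code mostly needs a different base URL. The sketch below is a minimal illustration using only the standard library. The port (1234, LM Studio's default as far as I know), the model identifier, and the helper names are all assumptions; adjust them to whatever your local server reports.

```python
import json
import urllib.request

# Assumption: a local OpenAI-compatible server (e.g. LM Studio) on its default port.
BASE_URL = "http://localhost:1234/v1"
# Assumption: the identifier under which the model is loaded locally.
MODEL = "openai/gpt-oss-20b"

def build_request(prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request aimed at the local server."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask_local(prompt: str) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Usage (needs the local server running with the model loaded):
#   print(ask_local("Say hello in one sentence."))
```

Nothing leaves the machine in this setup, and no API key is involved; the trade-off is that quality and speed depend entirely on the local model and hardware.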

tepid garnet Aug 7, 2025, 2:20 PM

#

hot fiber but they do have a ton of filters i assume

it's not as filtered as ChatGPT

unique sinew Aug 7, 2025, 2:20 PM

#

dense lion two different things. oss model is already sort of ChatGPT. so if you run oss lo...

Oh? Is it as ‘trained’ as our current model of ChatGPT? That’s accessible via API or Web?

mellow hedge Aug 7, 2025, 2:20 PM

#

I wish I had powerful enough hardware to run GPT-oss 20b, but i have only 8gb of vram

eager anvil Aug 7, 2025, 2:21 PM

#

hot fiber but they do have a ton of filters i assume

Last night saw online someone had jailbroken it to get instructions on how to do all the things red teamers fear, like making forbidden chemicals.

clever rose Aug 7, 2025, 2:21 PM

#

wheat jungle I'm fairly new to AI. What advantages do the OSS models have over the browser ba...

People might be trying to steal your data from OpenAI.
Home inference means your data isn't in the big treasure hoard.

eager anvil Aug 7, 2025, 2:21 PM

#

mellow hedge I wish I had powerful enough hardware to run GPT-oss 20b, but i have only 8gb of...

Cheaper to just use cloud inference.

mellow hedge Aug 7, 2025, 2:21 PM

#

eager anvil Cheaper to just use cloud inference.

true

unique sinew Aug 7, 2025, 2:21 PM

#

hot fiber but they do have a ton of filters i assume

You could always technically get erotic stories and insinuations. You just would have to ‘beat around the bush’ essentially and give ChatGPT the idea without seeming to be too influential or without that being your apparent directive.

eager anvil Aug 7, 2025, 2:21 PM

#

clever rose People might be trying to steal your data from OpenAI. Home inference means your...

And 20b is surprisingly good at programming.

tepid garnet Aug 7, 2025, 2:21 PM

#

I am running openai/gpt-oss-120b on my MacBook Pro, M2 Max with 96GB RAM

hot fiber Aug 7, 2025, 2:21 PM

#

unique sinew You could always technically get erotic stories and insinuations. You just would...

fair fair

eager anvil Aug 7, 2025, 2:21 PM

#

tepid garnet I am running openai/gpt-oss-120b on my MacBook Pro, M2 Max with 96GB RAM

Wish we had benchmarks for how well it compares.

mellow hedge Aug 7, 2025, 2:22 PM

#

eager anvil Cheaper to just use cloud inference.

true but i do training of local really small LMs on my pc and it is struggling with my gpu i need a better gpu

dense lion Aug 7, 2025, 2:22 PM

#

unique sinew Oh? Is it as ‘trained’ as our current model of ChatGPT? That’s accessible via AP...

nope. It's the lightweight version of ChatGPT. it's rumoured to be equivalent in intelligence of o3 model.

unique sinew Aug 7, 2025, 2:23 PM

#

dense lion nope. It's the lightweight version of ChatGPT. it's rumoured to be equivalent in...

Hmm. Is it free or paid? I wonder, (if paid) if it’s worth utilizing the OSS over the current API’s offered. Been trying to build a Hybrid of a LLM between ChatGPT with my own logic trees and ideology.

eager anvil Aug 7, 2025, 2:23 PM

#

mellow hedge true but i do training of local really small LMs on my pc and it is struggling w...

Tough time to upgrade. Prices are high, macbooks are mid-cycle, Nvidia won't release their cheap GPU box, 5090s are painful to get and now pull 1 kW to run.

tepid garnet Aug 7, 2025, 2:23 PM

#

OSS models are free and open source

mellow hedge Aug 7, 2025, 2:23 PM

#

eager anvil Tough time to upgrade. Prices are high, macbooks are mid-cycle, Nvidia won't re...

i might get a mac studio mini

unique sinew Aug 7, 2025, 2:23 PM

#

tepid garnet OSS models are free and open source

Oh sheeeeit

dense lion Aug 7, 2025, 2:23 PM

#

unique sinew Hmm. Is it free or paid? I wonder, (if paid) if it’s worth utilizing the OSS ove...

FREE AND OPEN-SOURCE. You can customize it to your own needs.

clever rose Aug 7, 2025, 2:24 PM

#

You can train it to rap every answer

unique sinew Aug 7, 2025, 2:24 PM

#

Welp, RIP to my monthly API costs Trihard

tawdry tulip Aug 7, 2025, 2:24 PM

#

mellow hedge Are you new to AI? just asking

Im not new to AI but I dont work in the field or related to it. I dont know the technics and never really learned much

dense lion Aug 7, 2025, 2:24 PM

#

clever rose You can train it to rap every answer

I guess you can just customize the way it responds, you can't train it, can you? Because that part is already done by OpenAI.

unique sinew Aug 7, 2025, 2:25 PM

#

I mean, I’m sure if you can set up permanent memory with the OSS model, you for sure can train it.

#

Using the term permanent loosely here.

mellow hedge Aug 7, 2025, 2:26 PM

#

tawdry tulip Im not new to AI but I dont work in the field or related to it. I dont know the ...

Ah, Okay, it just surprised me that you asked in the first place if its possible to run on phone. To that I can Say... yes and no, yes if you are using an API, and no if you are using it locally, that would fry your phone.

dense lion Aug 7, 2025, 2:26 PM

#

unique sinew Welp, RIP to my monthly API costs Trihard

I believe you will not be able to get the same level of intelligent responses in some areas with oss. So eventually, you'll need API.

unique sinew Aug 7, 2025, 2:26 PM

#

dense lion I believe you will not be able to get the same level of intelligent responses in...

Still sounds super beneficial, as it could greatly cut monthly costs.

tepid garnet Aug 7, 2025, 2:27 PM

#

openai/gpt-oss-120b is very impressive

unique sinew Aug 7, 2025, 2:27 PM

#

I’ve somewhat already got that system implemented in my LLM structure where it ‘detects’ if a more structured/complex reasoning is needed- it calls to the most efficient Model, whereas if it doesn’t, it’ll utilize mini or nano.

#

So hopefully OSS can fit into that system well.
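A routing setup like the one described above could be sketched roughly as follows. This is a toy illustration, not the poster's actual system: the keyword list, the word-count threshold, and the model names are all hypothetical, and a real router would likely use something smarter than string matching to decide when a request needs heavier reasoning.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    model: str   # which model to call
    local: bool  # True if served from the local machine

# Hypothetical markers of a request that needs more structured reasoning.
COMPLEX_HINTS = ("prove", "step by step", "debug", "refactor", "analyze")

def pick_route(prompt: str, max_simple_words: int = 60) -> Route:
    """Send long or reasoning-heavy prompts to a hosted model, the rest to a local one."""
    text = prompt.lower()
    looks_complex = (
        len(text.split()) > max_simple_words
        or any(hint in text for hint in COMPLEX_HINTS)
    )
    if looks_complex:
        return Route(model="hosted-large-model", local=False)
    return Route(model="gpt-oss-20b", local=True)

print(pick_route("What's the capital of France?"))
print(pick_route("Please debug this stack trace for me"))
```

The appeal of slotting a local model into the cheap branch is that every request it handles costs nothing per token; only the hard cases fall through to a paid API.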

haughty ice Aug 7, 2025, 2:27 PM

#

tepid garnet openai/gpt-oss-120b is very impressive

If you have a $25,000 gpu to run it

tepid garnet Aug 7, 2025, 2:28 PM

#

haughty ice If you have a $25,000 gpu to run it

no, just a MacBook Pro M2 Max with 96GB RAM is all you need

haughty ice Aug 7, 2025, 2:28 PM

#

tepid garnet no, just a MacBook Pro M2 Max with 96GB RAM is all you need

Actually? I could swear I read h100 on hugging face

shut marten Aug 7, 2025, 2:29 PM

#

No, Mac works

dense lion Aug 7, 2025, 2:29 PM

#

unique sinew Still sounds super beneficial, as it could greatly cut monthly costs.

If that's the case, why didn't you try running other state-of-the-art free and open source models before?

haughty ice Aug 7, 2025, 2:29 PM

#

Huh, neat

tepid garnet Aug 7, 2025, 2:29 PM

#

haughty ice Actually? I could swear I read h100 on hugging face

I have a MacBook Pro, M2 Max with 96GB RAM and it runs openai/gpt-oss-120b fine

shut marten Aug 7, 2025, 2:29 PM

#

The main limitation for many is the ram

haughty ice Aug 7, 2025, 2:29 PM

#

tepid garnet I have a MacBook Pro, M2 Max with 96GB RAM and it runs openai/gpt-oss-120b fine

Whats the performance like? In terms of intelligence and responses

analog coral Aug 7, 2025, 2:29 PM

#

i have gone mad with power, and i am now doing something very unwise. 12gb vram is enough surely

clever rose Aug 7, 2025, 2:29 PM

#

dense lion I guess you can just customize the way it responds, you can't train it, can you?...

You can do more training on models, it's just gonna cost money for servers.

tepid garnet Aug 7, 2025, 2:30 PM

#

haughty ice Whats the performance like? In terms of intelligence and responses

16 tk/s and intelligence is outstanding

shut marten Aug 7, 2025, 2:30 PM

#

I run a Mac Studio M3 Ultra with 512gb of ram and I’m getting 34tps

haughty ice Aug 7, 2025, 2:30 PM

#

How about gpt-oss-20b?

unique sinew Aug 7, 2025, 2:30 PM

#

dense lion If that's the case, why didn't you try running other state-of-the-art free and o...

Those are my intentions. I’m trying to build a Hybrid between 3 things (I should’ve stated) open source LLM models, OpenAI models, and implementation of my own logic trees I’ve gathered reasoning with ChatGPT over the course of the last 3 months.

shut marten Aug 7, 2025, 2:30 PM

#

I’m using it in Xcode 26 beta and it’s a bit slow so I’ve moved down to running the 20b one which is just fine for searching for things in the code

unique sinew Aug 7, 2025, 2:30 PM

#

But I haven’t made it to the point of integrating them all cohesively. Between work and family it’s been a bit delayed.

haughty ice Aug 7, 2025, 2:31 PM

#

I'm wondering if I can make a little helpdesk support agent /w 20b

#

My server's due for an upgrade though so gonna try it on my gaming rig for now

unique sinew Aug 7, 2025, 2:32 PM

#

Local server or paid dedi

haughty ice Aug 7, 2025, 2:32 PM

#

I have 2 local servers atm, ripped the gpus out of both

shut marten Aug 7, 2025, 2:33 PM

#

A Mac mini with 32gb of ram would probably be enough for that

#

M4 chip

haughty ice Aug 7, 2025, 2:33 PM

#

I dont like mac

unique sinew Aug 7, 2025, 2:33 PM

#

Ooooooo ok ok. I’ve been wanting to scrape together a local server but I feel like one of those men who start 15 projects and never finish one.

haughty ice Aug 7, 2025, 2:33 PM

#

unique sinew Ooooooo ok ok. I’ve been wanting to scrape together a local server but I feel li...

Tbh its very nice having your own metal

tepid garnet Aug 7, 2025, 2:33 PM

#

haughty ice I dont like mac

a Mac is the best bang for your buck for inference

haughty ice Aug 7, 2025, 2:33 PM

#

even like an rpi will do some jobs pretty well

unique sinew Aug 7, 2025, 2:33 PM

#

I can imagine. Good friend of mine in Australia has a decent local server.

#

Well, a few. Borderline “local data center” worthy

haughty ice Aug 7, 2025, 2:34 PM

#

tepid garnet a Mac is the best bang for your buck for inference

Yes but I cant be asked to deal with macos or apple at all

split cipher Aug 7, 2025, 2:34 PM

#

Once I get some cash, I think it's going to be worthwhile to invest in some servers

haughty ice Aug 7, 2025, 2:34 PM

#

Id rather spend some more money on hardware and have something like nix running

#

or even my own proxmox node

tepid garnet Aug 7, 2025, 2:34 PM

#

haughty ice Id rather spend some more money on hardware and have something like nix running

macOS is Unix

haughty ice Aug 7, 2025, 2:34 PM

#

tepid garnet macOS is Unix

Ik

#

Ive just never used it before outside of the gui

split cipher Aug 7, 2025, 2:35 PM

#

tepid garnet macOS is Unix

how quickly they forget

haughty ice Aug 7, 2025, 2:35 PM

#

So im not willing to invest any time into it

#

Considering how bad of an experience the gui was

#

I'd be dropping the ability to make any upgrades with a mac

tawny field Aug 7, 2025, 2:36 PM

#

tbf, 20b on my 12gb 3080ti surprised me a lot on LM Studio. Moderate speed, but very good responses.

tepid garnet Aug 7, 2025, 2:36 PM

#

LM Studio is great

tawny field Aug 7, 2025, 2:37 PM

#

It does a nice job - I have so many models on my hard drive I've played with on it.

analog coral Aug 7, 2025, 2:38 PM

#

i am currently loading the 120b model in lm studio out of sheer morbid curiosity, i want to hear the transistors burn

tawny field Aug 7, 2025, 2:39 PM

#

Yeah I don't think that one would run on my 2019 Ryzen pc

#

yeah lmstudio goes red and says likely too large.

tepid garnet Aug 7, 2025, 2:41 PM

#

openai/gpt-oss-120b is taking up 63GB RAM on my Mac

analog coral Aug 7, 2025, 2:44 PM

#

Safeties: off. Transistors: burning. Discord? That crashed. We're going places.

earnest knoll Aug 7, 2025, 2:46 PM

#

what would you guys say is the best or smartest or i guess the closest to agi in opensource/local ai models

cold shale Aug 7, 2025, 2:47 PM

#

analog coral Safeties: off. Transistors: burning. Discord? That crashed. We're going places.

that's the spirit

vast ginkgo Aug 7, 2025, 2:52 PM

#

tepid garnet openai/gpt-oss-120b is taking up 63GB RAM on my Mac

now I regret saving on memory on my m4 max 💀

tepid garnet Aug 7, 2025, 2:53 PM

#

rule 1 of buying a Mac is to get the most RAM you can

analog coral Aug 7, 2025, 2:54 PM

#

hmm, it appears I'm having difficulties running the one-hundred and twenty billion parameter model on my hardware

dense lion Aug 7, 2025, 2:55 PM

#

I HOPE IT WILL JUST LIVE UP TO THE HYPE.

analog coral Aug 7, 2025, 2:56 PM

#

once I finally squeeze it into this box, it'll croak out a dying one token per minute, and it will sound like a symphony

midnight sorrel Aug 7, 2025, 2:57 PM

#

analog coral once I finally squeeze it into this box, it'll croak out a dying one token per m...

one token per minute
haha, mine is better, one token per 20 secs

#

my memory is actually swapping

midnight sorrel Aug 7, 2025, 2:58 PM

#

analog coral once I finally squeeze it into this box, it'll croak out a dying one token per m...

also check if you're running on the GPU + VRAM plus CPU + RAM method robert mentioned

analog coral Aug 7, 2025, 2:59 PM

#

If I had to guess, I'm not, because in the task manager my resource values would flip flop between using my graphics card and my memory like a person having a stroke

trail arch Aug 7, 2025, 2:59 PM

#

has anyone encountered an error with gpt-oss-20b?
we've deployed it on our company's machine with a H200 with vllm (the newest gptoss docker image) and we're getting:

...
openai_harmony.HarmonyError: Unexpected token 12606 while expecting start token 200006

when trying to work in agent mode via VSCode plugins like Cline.bot/RooCode
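For context on that error: the parser is complaining that a message did not begin with the harmony start token (id 200006 per the error text). As a rough illustration of the text layout harmony expects, here is a hand-rolled sketch. The special-token strings are as I understand them from OpenAI's harmony documentation and should be treated as an assumption, as should the helper names. Real integrations should use the `openai_harmony` library rather than string concatenation, and this sketch does not diagnose the Docker image problem.

```python
# Special-token strings as I understand them from the harmony docs (assumption).
START, MESSAGE, END = "<|start|>", "<|message|>", "<|end|>"

def render_message(role: str, content: str) -> str:
    """One complete message: start token, role, message token, content, end token."""
    return f"{START}{role}{MESSAGE}{content}{END}"

def render_prompt(messages: list[tuple[str, str]]) -> str:
    """Render prior messages, then open an assistant turn for the model to complete."""
    rendered = "".join(render_message(role, content) for role, content in messages)
    return rendered + f"{START}assistant"

prompt = render_prompt([
    ("system", "You are a helpful assistant."),
    ("user", "List the files in the repo."),
])
print(prompt)
```

If a client plugin or serving layer feeds the parser text that is not framed this way, an "unexpected token while expecting start token" failure is the kind of error one would expect to see.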

tepid garnet Aug 7, 2025, 3:00 PM

#

some of the Docker images are broken

midnight sorrel Aug 7, 2025, 3:05 PM

#

analog coral If I had to guess, I'm not, because in the task manager my resource values would...

@tepid garnet where to flip the switch?

analog coral Aug 7, 2025, 3:07 PM

#

I think I turned it on, whatever I did is giving promising results. I haven't crashed yet.

midnight sorrel Aug 7, 2025, 3:07 PM

#

analog coral I think I turned it on, whatever I did is giving promising results. I haven't cr...

👍 how fast is it now?

analog coral Aug 7, 2025, 3:08 PM

#

Oh it still hasn't started, one token per minute was my hopes and dreams.

#

Okay so I definitely DIDN'T enable the CPU + RAM optimization, wherever that is, because I'm cruising at a steady 31.9/32gb ram usage

midnight sorrel Aug 7, 2025, 3:11 PM

#

analog coral Okay so I definitely DIDN'T enable the CPU + RAM optimization, wherever that is,...

mb you set only CPU + RAM, instead of hybrid

tepid garnet Aug 7, 2025, 3:14 PM

#

you might need to ask the folks on the LM Studio Discord Server how to run both GPU and CPU on Windows

analog coral Aug 7, 2025, 3:17 PM

#

Honestly I'm not too concerned with running the 120b model, I just find it funny trying to get it to run. After the latest shenanigans, it seems to be loading a response, it hasn't sent anything yet and it's been a few minutes, though.

stone knot Aug 7, 2025, 3:18 PM

#

are there any rcps that don't require an install?

tepid garnet Aug 7, 2025, 3:19 PM

#

stone knot are there any rcps that don't require an install?

??

stone knot Aug 7, 2025, 3:19 PM

#

tepid garnet ??

that i can use with 20b

tepid garnet Aug 7, 2025, 3:20 PM

#

stone knot that i can use with 20b

just download LM Studio then download the model into that

stone knot Aug 7, 2025, 3:25 PM

#

tepid garnet just download LM Studio then download the model into that

yeah, are there any rcp's i can use that work well with the model

tepid garnet Aug 7, 2025, 3:26 PM

#

stone knot yeah, are there any rcp's i can use that work well with the model

what is an rcp?

stone knot Aug 7, 2025, 3:26 PM

#

mcp*

green birch Aug 7, 2025, 3:27 PM

#

Guys doesn't gpt 4o already exist, just for paying people

wide crest Aug 7, 2025, 3:28 PM

#

yes, why do you ask?

trail arch Aug 7, 2025, 3:28 PM

#

tepid garnet some of the Docker images are broken

can we fix it by manually updating some requirements? should some harmony libraries be updated?

green birch Aug 7, 2025, 3:28 PM

#

wide crest yes, why do you ask?

Oh ye just was confused on the new announcement

#

I'm guessing it's becoming free then

wide crest Aug 7, 2025, 3:28 PM

#

it's a .gif, you need to watch it a bit longer

tepid garnet Aug 7, 2025, 3:28 PM

#

stone knot mcp*

try MCP Bridge on the Mac with LM Studio

fair trench Aug 7, 2025, 3:28 PM

#

how much ram is needed for gpt-oss 20b?

green birch Aug 7, 2025, 3:28 PM

#

OH okay

tepid garnet Aug 7, 2025, 3:28 PM

#

fair trench how much ram is needed for gpt-oss 20b?

63GB on a Mac

fair trench Aug 7, 2025, 3:29 PM

#

tepid garnet 63GB on a Mac

i mean 20b, not 120b

tepid garnet Aug 7, 2025, 3:29 PM

#

fair trench i mean 20b, not 120b

16GB on a Mac

fair trench Aug 7, 2025, 3:29 PM

#

tepid garnet 16GB on a Mac

what about windows?

tepid garnet Aug 7, 2025, 3:30 PM

#

fair trench what about windows?

I don't use Windows, but I would suspect you want a 16GB GPU card

stone knot Aug 7, 2025, 3:32 PM

#

fair trench what about windows?

i use a gtx 1660 super and 32gb of ddr4 ram and its okay at best, so a boost from what i have

analog coral Aug 7, 2025, 3:40 PM

#

7 seconds a token, no CPU usage required. Mission accomplished, ladies and gentlemen. It only took every single free drop of memory my computer has available to run it.

midnight sorrel Aug 7, 2025, 3:49 PM

#

analog coral 7 seconds a token, no CPU usage required. Mission accomplished, ladies and gentl...

now you beat my speed

analog coral Aug 7, 2025, 3:50 PM

#

it's almost too fast to comprehend

midnight sorrel Aug 7, 2025, 3:51 PM

#

analog coral it's almost *too fast to comprehend*

yea, like who the hell can understand a word in a 7 secs

#

actually a 24B model runs at 6-7 tokens/sec on my laptop
I think if there were better quantizations it would work good enough

livid cipher Aug 7, 2025, 3:53 PM

#

man i wish i got the 96gb version of the mac

analog coral Aug 7, 2025, 3:54 PM

#

midnight sorrel yea, like who the hell can understand a word in a 7 secs

i can't read words. who knows how to read words. and they expect me to read a word in 7 seconds?

tepid garnet Aug 7, 2025, 3:56 PM

#

livid cipher man i wish i got the 96gb version of the mac

what Mac do you have?

clever rose Aug 7, 2025, 3:59 PM

#

It is kinda insane that the 20B model runs faster than a 4B q8 model.

midnight sorrel Aug 7, 2025, 3:59 PM

#

clever rose It is kinda insane that the 20B model runs faster than a 4B q8 model.

which is that 4B model?

livid cipher Aug 7, 2025, 4:01 PM

#

tepid garnet what Mac do you have?

48gb m4 max mbp

tepid garnet Aug 7, 2025, 4:01 PM

#

livid cipher 48gb m4 max mbp

ouch, I am so sorry

livid cipher Aug 7, 2025, 4:01 PM

#

well at that time i didnt have the money to get the 96 version and i didnt intend to run local llms at that time so ... 😭

clever rose Aug 7, 2025, 4:01 PM

#

midnight sorrel which is that 4B model?

Qwen 3 -4B thinking

the OSS model is both faster and waaaaaaay smarter

livid cipher Aug 7, 2025, 4:01 PM

#

i just need some excuses rn

#

so sad

tepid garnet Aug 7, 2025, 4:03 PM

#

I just priced out a MacBook Pro, M4 Max with 128GB of RAM and it's $7800 AUD which is $1800 more than I paid for my M2 Max with 96GB RAM

livid cipher Aug 7, 2025, 4:03 PM

#

tepid garnet I just priced out a MacBook Pro, M4 Max with 128GB of RAM and it's $7800 AUD whi...

damn

#

how much is that in vnd

#

your mac is like roughly 2500 more than mine

#

yeah i defo couldnt afford it on a scholarship 💀

tropic mural Aug 7, 2025, 4:16 PM

#

I got the M4 Pro with 48gb and it runs the 20b and everything I need fine. If I want a heavy model I just use the chatgpt site 😂

#

It starts to be diminishing returns on price vs benefit to push higher

copper grove Aug 7, 2025, 4:19 PM

#

tepid garnet a Mac is the best bang for your buck for inference

Not for long: The Nvidia dgx spark is on the horizon

tepid garnet Aug 7, 2025, 4:19 PM

#

copper grove Not for long: The Nvidia dgx spark is on the horizon

I don't even know what that is

clever rose Aug 7, 2025, 4:20 PM

#

tiny supercomputer

#

fits in the palm of your hand

copper grove Aug 7, 2025, 4:20 PM

#

tepid garnet I don't even know what that is

You should google it then, they will probably release it this year happy_avocado

clever rose Aug 7, 2025, 4:20 PM

#

Wasn't it like $5k though

copper grove Aug 7, 2025, 4:21 PM

#

clever rose tiny supercomputer

Yea and optimized for AI models

tepid garnet Aug 7, 2025, 4:21 PM

#

if I Google it, then I will want it, then I will buy it, so best not to Google it

copper grove Aug 7, 2025, 4:21 PM

#

clever rose Wasn't it like $5k though

4k I think

livid cipher Aug 7, 2025, 4:21 PM

#

if its speed is like

#

mac level and not a dedicated card level

#

then id probably still go for a mac

#

more versatile

copper grove Aug 7, 2025, 4:22 PM

#

livid cipher then id probably still go for a mac

You can just plug it into every computer. Use your mac if you want and let your llms run on the dgx spark

livid cipher Aug 7, 2025, 4:22 PM

#

copper grove You can just plug it into every computer. Use your mac if you want and let your ...

yeah but would you rather have a 6k mac that does everything good but not as fast as the dgx spark in llm inference or a 2k mac and a box just for AI stuff

copper grove Aug 7, 2025, 4:23 PM

#

It should be way faster than current m-chips

livid cipher Aug 7, 2025, 4:23 PM

#

copper grove It should be way faster than current m-chips

if it is that large then yeah defo worth it

random jungle Aug 7, 2025, 4:23 PM

#

yes, doing this with our medical offices. dgx spark is also important for privacy

copper grove Aug 7, 2025, 4:23 PM

#

livid cipher if it is that large then yeah defo worth it

large in what context? it has 128gb of vram if thats what you mean

livid cipher Aug 7, 2025, 4:24 PM

#

copper grove large in what context? it has 128gb of vram if thats what you mean

large in speed differences between a mac and the dgx spark

clever rose Aug 7, 2025, 4:24 PM

#

" 1 petaFLOP of AI performance at FP4 precision"

livid cipher Aug 7, 2025, 4:24 PM

#

clever rose " 1 petaFLOP of AI performance at FP4 precision"

wow this is a lot

tropic mural Aug 7, 2025, 4:24 PM

#

livid cipher large in speed differences between a mac and the dgx spark

Both fall at the same speed if dropped from a tower

copper grove Aug 7, 2025, 4:24 PM

#

livid cipher large in speed differences between a mac and the dgx spark

oh I see

livid cipher Aug 7, 2025, 4:24 PM

#

clever rose " 1 petaFLOP of AI performance at FP4 precision"

i take my words back

copper grove Aug 7, 2025, 4:25 PM

#

Yea, it's a powerhouse for llms

livid cipher Aug 7, 2025, 4:26 PM

#

the age of mac dominating the low powered large vram cheap for ai stuff is over

clever rose Aug 7, 2025, 4:26 PM

#

soon™️

copper grove Aug 7, 2025, 4:35 PM

#

livid cipher the age of mac dominating the low powered large vram cheap for ai stuff is over

Macs are just dominating because their arm based systems are highly efficient and optimized, making them perform surprisingly well in AI tasks, even though their hardware isn’t necessarily better than that from a traditional pc. But the dgx spark takes it a step further by combining an ARM chip with full optimization for running and even training llms

wintry ivy Aug 7, 2025, 4:57 PM

#

Gpt oss is good for dev or not ?

shut marten Aug 7, 2025, 4:58 PM

#

Not for blind coding. You need to make sure that the stuff actually works

wind hawk Aug 7, 2025, 4:59 PM

#

#1210625797165940806 ahhh

shut marten Aug 7, 2025, 4:59 PM

#

the 120b model is really decent

#

for 20b it's good to find info about stuff in Xcode 26 for example

inner raptor Aug 7, 2025, 5:54 PM

#

never used this what is this even about

wide crest Aug 7, 2025, 5:55 PM

#

inner raptor never used this what is this even about

GPT-oss is a model that OpenAI released with open weights. That means that people can run the model locally and have full control on how the model functions

inner raptor Aug 7, 2025, 5:56 PM

#

oh thats sick

#

so if i were to build like a irl jarvis i should use gpt-oss?

wide crest Aug 7, 2025, 5:57 PM

#

if you want to host it all locally, you totally can!

tepid garnet Aug 7, 2025, 6:00 PM

#

openai/gpt-oss-120b is the best open source model available

hollow dove Aug 7, 2025, 8:01 PM

#

tepid garnet openai/gpt-oss-120b is the best open source model available

all open source

tepid garnet Aug 7, 2025, 8:19 PM

#

hollow dove all open source

I have tried them all, I like gpt-oss-120b best

glacial canyon Aug 7, 2025, 8:39 PM

#

How is GPT-oss at image generation? (Does it do it at all?)

hollow dove Aug 7, 2025, 8:40 PM

#

glacial canyon How is GPT-oss at image generation? (Does it do it at all?)

it's a purely text only model

glacial canyon Aug 7, 2025, 8:40 PM

#

hollow dove it's a purely text only model

Rgr that, thanks

midnight elm Aug 7, 2025, 8:45 PM

#

so the 20b model runs on how many gb of vram?

tepid garnet Aug 7, 2025, 8:46 PM

#

midnight elm so the 20b model runs on how many gb of vram?

12 to 13 GB of VRAM

midnight elm Aug 7, 2025, 8:48 PM

#

tepid garnet 12 to 13 GB of VRAM

okay so my 16gb gpu can run it stably right

#

or does it spike

tepid garnet Aug 7, 2025, 8:48 PM

#

depends on context and other things using system resources, it should run fine

midnight elm Aug 7, 2025, 8:48 PM

#

tepid garnet depends on context and other things using system resources, it should run fine

okay good thx

royal kraken Aug 7, 2025, 10:23 PM

#

Recommended specs for the 120B version for a purely CPU-only machine? I know 128 GB ram, what about the # of cores?

crisp yarrow Aug 7, 2025, 10:30 PM

#

Is GPT-OSS better than Deepseek or llama?

void lagoon Aug 7, 2025, 10:31 PM

#

crisp yarrow Is GPT-OSS better than Deepseek or llama?

I think yes (it's just my feeling!)

#

On my computer the model is quicker and the results are good (RTX 4080 laptop)

crisp yarrow Aug 7, 2025, 10:32 PM

#

according to an AI model it is

void lagoon Aug 7, 2025, 10:32 PM

#

You can try it on lmstudio

hollow dove Aug 7, 2025, 11:12 PM

#

crisp yarrow according to an AI model it is

No, the "DeepSeek" model you're comparing it to is llama 3.1 8B but trained to respond in a similar way to deepseek with reasoning. DeepSeek R1 still outperforms gpt-OSS across various benchmarks. Also gpt-OSS 120B does not support image input

steel vine Aug 7, 2025, 11:12 PM

#

also, comparing 120b to an 8b is flawed from the get go

hollow dove Aug 7, 2025, 11:16 PM

#

Yeah, I'd recommend going onto sites that compare models (with actual benchmarks)

#

And compare to similarly sized models

#

I recommend artificial analysis (the site I personally use for comparing models)

#

Won't let me send a link to it but you can just search it up

dapper shard Aug 8, 2025, 12:05 AM

#

Hey guys we're from Unsloth and we found some implementation differences for gpt-oss. Is there anyone we can talk to? Thank you 🙂

steel vine Aug 8, 2025, 12:07 AM

#

dunno, but why are you guys making 4-bit quants when gpt-oss already ships with 4-bit quants?

dapper shard Aug 8, 2025, 12:08 AM

#

steel vine dunno, but why are you guys making 4-bit quants when gpt-oss already ships with 4 ...

it can be quantized further down. When you make GGUFs they have to be upscaled to f16 then converted down from there

steel vine Aug 8, 2025, 12:10 AM

#

just seems like OpenAI had the opportunity to do quantization-aware training... and anyone making a 4-bit quant of a potentially optimised 4-bit quant... would presumably end up with lower quality
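The concern in this exchange can be illustrated with a toy example: re-quantizing already-quantized values onto the same grid changes essentially nothing, while re-quantizing onto a different grid can move them again. The sketch below uses a plain 15-level integer grid with one shared scale per block. That scheme, the block sizes, and the sample weights are all assumptions chosen for illustration; it is not MXFP4, not a GGUF format, and says nothing about the quality of any particular released quant.

```python
def quantize_dequantize(values: list[float], block: int) -> list[float]:
    """Round each block onto a 15-level grid (-7..7) with one shared scale."""
    out = []
    for i in range(0, len(values), block):
        chunk = values[i:i + block]
        peak = max(abs(v) for v in chunk)
        scale = peak / 7 if peak else 1.0  # avoid dividing by zero on all-zero blocks
        for v in chunk:
            q = max(-7, min(7, round(v / scale)))
            out.append(q * scale)
    return out

def max_error(a: list[float], b: list[float]) -> float:
    """Largest element-wise absolute difference between two lists."""
    return max(abs(x - y) for x, y in zip(a, b))

weights = [0.91, -0.13, 0.40, 0.07, -0.02, 0.55, -0.77, 0.30]

once = quantize_dequantize(weights, block=4)
same_grid = quantize_dequantize(once, block=4)   # re-quantize on the same grid
other_grid = quantize_dequantize(once, block=8)  # re-quantize on a different grid

print("error after first pass:", max_error(weights, once))
print("drift, same grid again:", max_error(once, same_grid))
print("drift, different grid: ", max_error(once, other_grid))
```

In this toy setup the same-grid pass leaves the values essentially untouched, while the different-grid pass shifts some of them, which is the intuition behind worrying about a second 4-bit conversion on top of weights that were trained for a specific 4-bit layout.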

void lagoon Aug 8, 2025, 12:12 AM

#

Do you know how to connect GPT-oss to the Internet, like gpt 4o?

hollow dove Aug 8, 2025, 12:16 AM

#

void lagoon Do you know how to connect GPT-oss to the Internet, like gpt 4o?

Ollama (account required, not local) or perplexica (local, no account required, installed with docker)

#

To clarify the model is local in both cases but for using ollama the search service is not local

void lagoon Aug 8, 2025, 12:17 AM

#

oh ok thank you !

#

i will try perplexica + gpt oss

subtle brook Aug 8, 2025, 1:30 AM

#

I thought it could?

swift jacinth Aug 8, 2025, 1:43 AM

#

Any agent framework support the harmony format now?

copper grove Aug 8, 2025, 2:47 AM

#

subtle brook I thought it could?

mind sharing your lm studio settings?

#

Oh nvm, I just used lm studio wrong the whole time lol

#

Jumped from about 40 tokens a second to about 180 a second lol

subtle brook Aug 8, 2025, 3:14 AM

#

copper grove Jumped from about 40 tokens a second to about 180 a second lol

Can you read files?

spiral gazelle Aug 8, 2025, 3:38 AM

#

copper grove Oh nvm, I just used lm studio wrong the whole time lol

How much VRAM do you have?

copper grove Aug 8, 2025, 3:39 AM

#

spiral gazelle How much VRAM do you have?

24gb

spiral gazelle Aug 8, 2025, 3:39 AM

#

copper grove 24gb

Oh, okay. I have 16GB, so... 15/sec is my limit then. (Half-GPU load + 16k context)

copper grove Aug 8, 2025, 3:40 AM

#

spiral gazelle Oh, okay. I have 16GB, so... 15/sec is my limit then. (Half-GPU load + 16k conte...

for the 20b model?

spiral gazelle Aug 8, 2025, 3:40 AM

#

Yes.

#

Am I doing something wrong?

copper grove Aug 8, 2025, 3:40 AM

#

Nah you also should be able to get over the hundreds

#

did you put your gpu offload fully to the right?

#

the 20b model only uses about 12gb, so you should be able to fully load it into vram
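As a rough way to reason about where the GPU offload slider should sit, one can estimate how many layers fit in VRAM. The sketch below assumes the roughly 12 GB model size and the 24 layers implied by the slider values in this conversation, treats all layers as equally sized, and reserves a guessed 2 GB for context and other overhead. The function name and the reserve are illustrative assumptions; the runner's own warning is the better guide.

```python
import math

def layers_that_fit(vram_gb: float, model_gb: float, n_layers: int,
                    reserve_gb: float = 2.0) -> int:
    """Estimate how many equally sized layers fit in VRAM after a reserve."""
    per_layer_gb = model_gb / n_layers
    usable_gb = max(0.0, vram_gb - reserve_gb)
    return min(n_layers, math.floor(usable_gb / per_layer_gb))

# ~12 GB and 24 layers are the figures from the chat; the 2 GB reserve is a guess.
for vram in (8, 16, 24):
    print(f"{vram} GB VRAM -> {layers_that_fit(vram, 12.0, 24)}/24 layers on GPU")
```

Under these assumptions a 16 GB or 24 GB card can hold every layer, while an 8 GB card would land around half, leaving the rest to run from system RAM at a much lower speed.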

spiral gazelle Aug 8, 2025, 3:43 AM

#

Oh, okay, I tested something before and left GPU offload at 12/24.
That was maybe while testing speed with very long context.

#

Now it's 77.05 tok/sec.
Doesn't it become very slow with long context (including long reasoning tokens)?

copper grove Aug 8, 2025, 3:44 AM

#

spiral gazelle Oh, okay, I tested something before and left GPU offload at 12/24. That was may...

yea then try to set it on 24/24 (no worries, it would give you a warning if thats too much)

copper grove Aug 8, 2025, 3:45 AM

#

spiral gazelle Now it's 77.05 tok/sec. Doesn't it become very slow with long context (including lo...

I have some speed optimized settings, you can try them

spiral gazelle Aug 8, 2025, 3:46 AM

#

copper grove I have some speed optimized settings, you can try them

Oh, okay, thanks for sharing!

copper grove Aug 8, 2025, 3:47 AM

#

spiral gazelle Oh, okay, thanks for sharing!

no worries, I'd love to get an update on your speed now

spiral gazelle Aug 8, 2025, 3:48 AM

#

copper grove no worries, I'd love to get an update on your speed now

With above setting, I got 72.21 tok/sec with total 8711 tokens output.
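
For a sense of what these rates mean in wall-clock time, here is a minimal arithmetic sketch. It assumes a steady decode rate and ignores prompt processing; the function name is made up for illustration. It uses only the figures quoted in this thread (8711 output tokens, 72.21 tok/sec, and the earlier 15 tok/sec).

```python
def generation_seconds(total_tokens: int, tokens_per_second: float) -> float:
    """Wall-clock time to emit total_tokens at a steady decode rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return total_tokens / tokens_per_second


# Figures taken from the messages above; the steady-rate assumption is ours.
fast = generation_seconds(8711, 72.21)
slow = generation_seconds(8711, 15.0)
print(f"{fast:.1f} s at 72.21 tok/s, {slow:.1f} s at 15 tok/s")
```

So the same 8711-token answer takes roughly two minutes at the new speed, versus nearly ten minutes at 15 tok/sec.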

copper grove Aug 8, 2025, 3:49 AM

#

spiral gazelle With above setting, I got 72.21 tok/sec with total 8711 tokens output.

oh okay, you maybe can up your cpu allocation too

#

but I mean thats good speeds

loud oxide Aug 8, 2025, 4:54 AM

#

Why don't I see other GPT chat models?

obsidian wyvern Aug 8, 2025, 5:48 AM

#

loud oxide Why don't I see other GPT chat models?

#announcements

delicate iron Aug 8, 2025, 5:49 AM

#

copper grove 24gb

Rich ppl lol

#

Oh btw how much tok/s should I be getting

limber juniper Aug 8, 2025, 7:39 AM

#

is anyone facing this error while trying to run gpt-oss-20b?
EngineCore_0 pid=3969032) AssertionError: Sinks are only supported in FlashAttention 3

I am using L40S 48GB

muted anvil Aug 8, 2025, 11:00 AM

#

Is there any chance they could give us models to choose from again? Especially 4o?

copper grove Aug 8, 2025, 11:13 AM

#

delicate iron Rich ppl lol

Lol, just setting my priorities wrong 😆

delicate iron Aug 8, 2025, 11:13 AM

#

copper grove Lol, just setting my priorities wrong 😆

It's okay. I just wonder how people get so much access to gpus

#

Are they like super rich so they can buy gpus

copper grove Aug 8, 2025, 11:14 AM

#

delicate iron Oh btw how much tok/s should I be getting

It's usable from like 15-20 tokens, 50 tokens/s already feels great

delicate iron Aug 8, 2025, 11:15 AM

#

I only got 5 tok per sec on a 40 series gpu

#

Was using shared vram

#

On the 20b

delicate iron Aug 8, 2025, 11:16 AM

#

copper grove It's usable from like 15-20 tokens, 50 tokens/s already feels great

How much money do you give yourself to spend on ai hardware

copper grove Aug 8, 2025, 11:18 AM

#

delicate iron How much money do you give yourself to spend on ai hardware

Idk really, depends on the month. Luckily I'm in a position where I can buy a few things without it bothering me that much.

delicate iron Aug 8, 2025, 11:19 AM

#

Like how much money. BTW Can you friend me on Discord

#

It's a 5090 rich people

#

or poor people for AI

harsh aurora Aug 8, 2025, 11:53 AM

#

copper grove It's usable from like 15-20 tokens, 50 tokens/s already feels great

50 t/s is faster than one can read already so, pretty usable

copper grove Aug 8, 2025, 11:57 AM

#

harsh aurora 50 t/s is faster than one can read already so, pretty usable

Yup but faster is still better. Often times you can anticipate how the answer is structured and that you'll find your real answer somewhere in the third paragraph or so

harsh aurora Aug 8, 2025, 12:07 PM

#

true

#

I often don't read the whole thing =P

#

also, I noticed Im reading more of the text on gpt 5-pro

#

it is indeed denser in information

analog coral Aug 8, 2025, 12:20 PM

#

delicate iron I only got 5 tok per sec on a 40 series gpu

I have a 4070su and when I run 20b I get ~15 tokens/s, it's odd that it's that slow for you?

Edit: and that's after I crank up the context and experts for giggles, though I enjoy flipping random switches for the sake of chaos and set k quantizing and v quantizing to f16. If you decide to try that and it helps, lemme know.

copper grove Aug 8, 2025, 12:41 PM

#

harsh aurora it is indeed denser in information

neat

brisk trench Aug 8, 2025, 12:45 PM

#

👋 i have a budget from my boss to spend 5k $ on hardware for training and run A.I models, what are the best hardware specs for that? does anyone have a parts list to build a nice PC for that?

harsh aurora Aug 8, 2025, 12:49 PM

#

brisk trench 👋 i have a budget from my boss to spend 5k $ on hardware for training and run A...

rent a cloud A100 and fine tune gpt-oss, it will probably cost less than that

copper grove Aug 8, 2025, 1:59 PM

#

brisk trench 👋 i have a budget from my boss to spend 5k $ on hardware for training and run A...

^ this. Thats the way to do it. And after the training, you just need a PC or server with at least 12 gigs of vram, 16gb or even 24gb would be better though. And at least 32 gigs of RAM as a token buffer and for KV-Cache. For concurrent chats or heavy tool use you might consider 64gb. A cpu with 6 cores and 12 threads would be sufficient, but if you still have some spare change also consider using a npu.

#

The training itself might only cost you 500-1000 bucks, depending on your usecase. So you should have enough money left to buy a good server for it.

calm wedge Aug 8, 2025, 2:03 PM

#

I am trying to launch gpt-oss locally (on remote with supercomputer), tried to install gpt-oss with pip (actually uv) and when launching import gpt_oss.chat, I am getting

from gpt_oss.tools import apply_patch
ModuleNotFoundError: No module named 'gpt_oss.tools'

and I can't find tools.py in the gpt_oss package directory.
Would someone know how to address this?

from #community-help

#

File "/scratch/[username]/gptoss/run-localtest.py", line 3, in <module>
    from gpt_oss import chat
  File "/scratch/[username]/gptoss/.venv/lib/python3.13/site-packages/gpt_oss/chat.py", line 20, in <module>
    from gpt_oss.tools import apply_patch
ModuleNotFoundError: No module named 'gpt_oss.tools'
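
One way to confirm what the traceback suggests (that the installed package does not ship a `tools` submodule) without triggering the failing import in `chat.py` is to ask the import system directly. This is a diagnostic sketch only, not a fix. `has_module` is a hypothetical helper name, and `importlib.util.find_spec` is standard library.

```python
import importlib.util


def has_module(dotted_name: str) -> bool:
    """Return True if dotted_name can be located by the import system.

    Note: for a dotted name, find_spec imports the parent package as a
    side effect, and raises ModuleNotFoundError if the parent is missing.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        return False


# Run inside the same virtual environment that produced the traceback.
for name in ("gpt_oss", "gpt_oss.chat", "gpt_oss.tools"):
    print(f"{name}: {'found' if has_module(name) else 'missing'}")
```

If `gpt_oss` and `gpt_oss.chat` are found but `gpt_oss.tools` is missing, the installed distribution is incomplete for that entry point, which would be worth reporting along with the exact install command used.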

hazy sequoia Aug 9, 2025, 1:33 AM

#

harsh aurora rent a cloud A100 and fine tune gpt-oss, it will probably cost less than that

for $5k you could rent 8x H100s for over 200 hours which would finish way before then anyway
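
The rental arithmetic behind that claim can be sketched as follows. The only inputs taken from the thread are the $5k budget, 8 GPUs, and "over 200 hours"; the $2.50 per GPU-hour price is a made-up placeholder, so check current provider pricing before relying on it.

```python
def implied_rate(budget_usd: float, gpus: int, hours: float) -> float:
    """Price per GPU-hour at which budget_usd buys exactly `hours` on `gpus` GPUs."""
    return budget_usd / (gpus * hours)


def rental_hours(budget_usd: float, gpus: int, usd_per_gpu_hour: float) -> float:
    """Hours of an N-GPU node that budget_usd buys at a given per-GPU hourly price."""
    return budget_usd / (gpus * usd_per_gpu_hour)


# "8x H100s for over 200 hours" on $5k implies a rate below this value.
break_even = implied_rate(5000, 8, 200)

# Hypothetical price, for illustration only.
hours_at_assumed_price = rental_hours(5000, 8, 2.50)

print(f"break-even rate: ${break_even:.3f} per GPU-hour")
print(f"hours at $2.50 per GPU-hour: {hours_at_assumed_price:.0f}")
```

In other words, the "over 200 hours" figure holds as long as an H100 rents for less than about $3.13 per GPU-hour.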

hollow dove Aug 9, 2025, 2:27 AM

#

brisk trench 👋 i have a budget from my boss to spend 5k $ on hardware for training and run A...

20B or 120B? if it's 20B you could get a 5090 and fine tune it using unsloth

#

from their listing of the model you can fine tune the 20B model on 14 GB VRAM (this increases if you increase the context length)

#

but a 5090 with 32 GB vram is plenty for 20B

hazy sequoia Aug 9, 2025, 2:52 AM

#

hollow dove 20B or 120B? if it's 20B you could get a 5090 and fine tune it using unsloth

if they don’t have a 5090 already it’s probably not worth buying one as it’s a couple thousand when they can rent an equivalent card for a fraction of the price for the amount of time they’d need it for

hollow dove Aug 9, 2025, 2:56 AM

#

hazy sequoia if they don’t have a 5090 already it’s probably not worth buying one as it’s a c...

that fully depends on how much finetuning you would be doing and whether the data being used is company exclusive and cannot be sent off to third party cloud compute

fading kelp Aug 9, 2025, 3:28 AM

#

Guys, I fine-tuned OpenAI’s OSS 20B reasoning model using the most popular medical reasoning dataset and published the results on Hugging Face. Who wants to check it?

faint nacelle Aug 9, 2025, 6:24 AM

#

fading kelp Guys, I fine-tuned OpenAI’s OSS 20B reasoning model using the most popular medic...

Hi,

That's great. Want to check out.

fading kelp Aug 9, 2025, 9:22 AM

#

Can I share the HF link here, is it allowed?

#

or which room should I use for this purpose?

cyan kite Aug 9, 2025, 9:25 AM

#

you can try but if its not whitelisted url you might get a short automod timeout which is nothing to worry about

fading kelp Aug 9, 2025, 9:26 AM

#

okay thanks for information

#

I think it is on the blacklist

vivid tide Aug 9, 2025, 11:17 AM

#

can i run gpt oss 120b on hp omen 14th gen i7 and 16gb ram and 8gb nvidia 8gb ram 4060

hazy sequoia Aug 9, 2025, 11:23 AM

#

hollow dove that fully depends on how much finetuning you would be doing and whether the dat...

if you are doing it through google cloud or azure or similar you can make a deal they have no access to your data

cyan kite Aug 9, 2025, 11:37 AM

#

vivid tide can i run gpt oss 120b on hp omen 14th gen i7 and 16gb ram and 8gb nvidia 8gb ra...

no

analog coral Aug 9, 2025, 12:17 PM

#

vivid tide can i run gpt oss 120b on hp omen 14th gen i7 and 16gb ram and 8gb nvidia 8gb ra...

you can try c:

hazy sequoia Aug 9, 2025, 12:20 PM

#

analog coral you can try c:

gpt-10 will be out by the time it generates a single token lol

analog coral Aug 9, 2025, 12:40 PM

#

If you believe hard enough, if you ignore all safety warnings and enjoy the sweet tears of your hardware crying in defiance, you can get that single token.

harsh aurora Aug 9, 2025, 2:16 PM

#

vivid tide can i run gpt oss 120b on hp omen 14th gen i7 and 16gb ram and 8gb nvidia 8gb ra...

nope, even with CPU offload, that isn't nearly enough RAM

#

the CPU would run it, slow, but would run, the problem is the amount of RAM

#

if you had something like 32 GB, it would be possible

#

that goes for every AI model, the min requirements aren't actually in compute power, that just dictates the speed it runs
the base requirements are to be able to have the entire model loaded in memory, if you don't have enough memory, it can't run
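
The "does it fit in memory" check described here can be sketched as a back-of-the-envelope calculation. Everything numeric below is an illustrative assumption: the round parameter counts (20e9 and 120e9), the 4.25 bits per weight (a 4-bit format with a shared per-block scale), and the 4 GB overhead for KV cache and runtime. Real checkpoints mix precisions, so treat the output as a rough estimate only.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def fits(num_params: float, bits_per_weight: float,
         available_gb: float, overhead_gb: float = 0.0) -> bool:
    """True if weights plus a fixed overhead fit in available_gb of RAM + VRAM."""
    needed = weight_footprint_gb(num_params, bits_per_weight) + overhead_gb
    return needed <= available_gb


# Assumed machine from the question above: 16 GB RAM + 8 GB VRAM = 24 GB total.
total_memory_gb = 16 + 8

small = weight_footprint_gb(20e9, 4.25)    # assumed ~20B-parameter model
large = weight_footprint_gb(120e9, 4.25)   # assumed ~120B-parameter model

print(f"~20B model:  {small:.1f} GB, fits: {fits(20e9, 4.25, total_memory_gb, 4.0)}")
print(f"~120B model: {large:.1f} GB, fits: {fits(120e9, 4.25, total_memory_gb, 4.0)}")
```

Under these assumptions the smaller model fits with room to spare while the larger one does not come close, and no amount of compute changes that.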

#

you can run the 20b on that hardware, tho

#

I just noticed something, OpenAI does not have gpt-oss on the API..

#

I mean, I know it is open and anyone can run it.. but I would expect OAI to have that option if you wanted to use the model on their platform

hot fiber Aug 9, 2025, 4:19 PM

#

harsh aurora I mean, I know it is open and anyone can run it.. but I would expect OAI to have ...

do the most as cheap as possible, OAI's new motto lol

hazy sequoia Aug 9, 2025, 4:58 PM

#

harsh aurora I mean, I know it is open and anyone can run it.. but I would expect OAI to have ...

they probably didn’t see the point when cerebras and groq were guaranteed to do it faster and cheaper than them

#

i get like 16k tk/s and less than 0.2s TTFT on openrouter with cerebras and 120b it’s insane

#

i’m working on a project that definitely wouldn’t have been possible without that speed

solemn willow Aug 9, 2025, 8:34 PM

#

harsh aurora that goes for every AI model, the min requirements aren't actually in compute po...

So what hardware or how much ram do I need, I wanna run the higher parameter model but I want to know how much I have to buy beforehand, I don't mind it being slow I just need it to run.

harsh aurora Aug 9, 2025, 8:38 PM

#

solemn willow So what hardware or how much ram do I need, I wanna run the higher parameter mod...

The model takes 40 GB of RAM, to run it you need that + whatever more for the OS

cyan kite Aug 9, 2025, 8:38 PM

#

solemn willow So what hardware or how much ram do I need, I wanna run the higher parameter mod...

"... gpt-oss-120b run on a single 80GB GPU (like NVIDIA H100 or AMD MI300X) and the gpt-oss-20b model run within 16GB of memory." you can also use regular mem but its super slow

harsh aurora Aug 9, 2025, 8:40 PM

#

and the same amount in VRAM if you want to run it on a GPU

#

for example, my GPU has 32 GB VRAM, so I'm able to run it using both my GPU and CPU, which is muuuch slower

#

but it runs

#

the 20b model runs on just the GPU, which I get about 250 tok/s

cyan kite Aug 9, 2025, 8:44 PM

#

cheaper to host the model on private cloud on demand if privacy is the only reason, unless you have free electricity 😅

#

for non privacy stuff its offered free or super cheap on multiple places

royal kraken Aug 9, 2025, 8:57 PM

#

cyan kite "... gpt-oss-120b run on a single 80GB GPU (like NVIDIA H100 or AMD MI300X) and ...

How slow exactly? I want to run it on 128 GB of regular RAM

steel vine Aug 9, 2025, 8:58 PM

#

its moe so probably runs alright if you got like 32 cores or more

copper grove Aug 9, 2025, 11:07 PM

#

royal kraken How slow exactly? I want to run it on 128 GB of regular RAM

depends on your ram speed, vram is way faster though

dapper shard Aug 9, 2025, 11:46 PM

#

fading kelp Guys, I fine-tuned OpenAI’s OSS 20B reasoning model using the most popular medic...

congrats!

#

did u experience any issues while using the unsloth finetuning notebook?

livid cipher Aug 10, 2025, 3:17 AM

#

vivid tide can i run gpt oss 120b on hp omen 14th gen i7 and 16gb ram and 8gb nvidia 8gb ra...

no

tepid garnet Aug 10, 2025, 3:49 AM

#

I am running openai/gpt-oss-120b on my MacBook Pro, M2 Max with 96GB RAM

shut pollen Aug 10, 2025, 4:03 AM

#

tepid garnet I am running openai/gpt-oss-120b on my MacBook Pro, M2 Max with 96GB RAM

does the model ever reason for too long for you?

#

I'm running it myself and sometimes it reasons for 30k tokens

tepid garnet Aug 10, 2025, 4:04 AM

#

shut pollen I'm running it myself and sometimes it reasons for 30k tokens

if you have reasoning effort set to high then it can reason for a very long time

shut pollen Aug 10, 2025, 4:06 AM

#

tepid garnet if you have reasoning effort set to high then it can reason for a very long time

Yeah... I've had it reason itself out of context many times.

#

and also it likes to repeat the same exact thing over and over again

tepid garnet Aug 10, 2025, 4:07 AM

#

shut pollen and also it likes to repeat the same exact thing over and over again

yes I have noticed that myself

shut pollen Aug 10, 2025, 4:08 AM

#

tepid garnet yes I have noticed that myself

Do you have any solutions for this? I ended up turning the temperature down a bit

tepid garnet Aug 10, 2025, 4:09 AM

#

shut pollen Do you have any solutions for this? I ended up turning the temperature down a bi...

are you using LM Studio?

shut pollen Aug 10, 2025, 4:10 AM

#

tepid garnet are you using LM Studio?

yep

tepid garnet Aug 10, 2025, 4:10 AM

#

shut pollen yep

set reasoning effort to low or medium, that's how I solved it
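
Outside the LM Studio UI, the same idea can be sketched for anyone calling the model through an OpenAI-compatible chat endpoint. This assumes the reasoning level is passed as a `Reasoning: <level>` line in the system message, as described in the gpt-oss model card. Whether a given server forwards that line unchanged is an assumption to verify. The helper name, the prompt, and the `max_tokens` value are placeholders.

```python
import json

VALID_EFFORTS = {"low", "medium", "high"}


def build_chat_request(model: str, user_prompt: str, effort: str = "low") -> dict:
    """Build a chat-completions request body that asks for a given reasoning level."""
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {sorted(VALID_EFFORTS)}")
    return {
        "model": model,
        "messages": [
            # Assumption: the server passes this line through to the model's system prompt.
            {"role": "system", "content": f"Reasoning: {effort}"},
            {"role": "user", "content": user_prompt},
        ],
        # Placeholder cap so a runaway reasoning loop cannot consume the whole context.
        "max_tokens": 2048,
    }


payload = build_chat_request(
    "openai/gpt-oss-20b",
    "Summarize the harmony format in two sentences.",
    effort="low",
)
body = json.dumps(payload, indent=2)
print(body)
```

The resulting JSON would be POSTed to the server's chat completions route; for LM Studio's local server that is assumed to be `http://localhost:1234/v1/chat/completions` unless the port was changed.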

shut pollen Aug 10, 2025, 4:11 AM

#

tepid garnet set reasoning effort to low or medium, that's how I solved it

Thanks, it appears that's the only way

royal kraken Aug 10, 2025, 9:08 AM

#

do you prefer oss 120b over phi or gemma? phi seems to be accurate in my tests but oss has more of a "personality"

tepid garnet Aug 10, 2025, 9:11 AM

#

oss-120b is my favourite local model

shut marten Aug 10, 2025, 11:28 AM

#

oss-120b just managed to do better at coding than GPT-5

midnight sorrel Aug 10, 2025, 11:54 AM

#

shut marten oss-120b just managed to do better at coding than GPT-5

When? How?

shut marten Aug 10, 2025, 11:56 AM

#

I asked it to write a Swift script that parses and converts folders YAML files recursively into JSON and then write it into a parallel folder. GPT-5 Failed because the first thing it did is check if there are any yaml files in the root directory, and if not, return and stop the script.
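
For reference, the traversal being asked for can be sketched like this. It is written in Python rather than Swift and is not the code either model produced. The YAML parser is passed in as a `parse` callable (in practice something like PyYAML's `yaml.safe_load`, which is an assumption about the environment), and `convert_tree` is a hypothetical name. The point is that the walk is recursive from the start, so an empty root directory is not treated as "nothing to do".

```python
import json
from pathlib import Path
from typing import Any, Callable, List


def convert_tree(src: Path, dst: Path, parse: Callable[[str], Any]) -> List[Path]:
    """Convert every .yaml/.yml file under src into a .json file under dst.

    The directory layout under src is mirrored under dst. Returns the list
    of JSON files written, in sorted order of their source paths.
    """
    written: List[Path] = []
    # rglob("*") descends into all subdirectories, so files that exist only
    # in nested folders are still found.
    for path in sorted(src.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in {".yaml", ".yml"}:
            continue
        out = dst / path.relative_to(src).with_suffix(".json")
        out.parent.mkdir(parents=True, exist_ok=True)
        data = parse(path.read_text(encoding="utf-8"))
        out.write_text(json.dumps(data, indent=2), encoding="utf-8")
        written.append(out)
    return written
```

A call would look like `convert_tree(Path("configs"), Path("configs_json"), parse=yaml.safe_load)`, with both folder names being placeholders.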

lofty osprey Aug 10, 2025, 11:56 AM

#

how do u get oss

#

do yall use chatgpt for coding or cursor btdub

tepid garnet Aug 10, 2025, 11:58 AM

#

lofty osprey how do u get oss

you can install LM Studio and then download gpt-oss-20b or gpt-oss-120b

celest oasis Aug 10, 2025, 11:59 AM

#

lofty osprey do yall use chatgpt for coding or cursor btdub

i used o4 mini high almost the whole time before, now when i use gpt 5 thinking i get trash results

midnight sorrel Aug 10, 2025, 12:02 PM

#

shut marten I asked it to write a Swift script that parses and converts folders YAML files r...

Is this always behaving the same? Try once more

shut marten Aug 10, 2025, 12:02 PM

#

I ran the same prompt twice, then added some guiding and it still failed, always adding new errors

#

Anyway I got it to work first try with oss-120b

#

Really surprised me

midnight sorrel Aug 10, 2025, 12:03 PM

#

Wow, that’s such a fail

shut marten Aug 10, 2025, 12:03 PM

#

In general GPT-5 has been disappointing me with coding so far. It hasn't been able to do anything I've asked it to do

midnight sorrel Aug 10, 2025, 12:04 PM

#

Has been good for me
On cursor

tepid garnet Aug 10, 2025, 12:04 PM

#

GPT-5 has been great for me coding

shut marten Aug 10, 2025, 12:05 PM

#

I've tried ChatGPT directly, GPT-5 in GH Copilot Chat, GPT-5 from the API through Xcode 26's coding assistant feature, nothing has been working so far.

late vessel Aug 10, 2025, 1:11 PM

#

Prompting issue

steel swan Aug 10, 2025, 1:22 PM

#

celest oasis i used o4 mini high almost the whole time before, now when i use gpt 5 thinking ...

same

shut marten Aug 10, 2025, 1:34 PM

#

late vessel Prompting issue

If you want to blame it on that, sure. And oss-120b manages to do it

carmine bear Aug 10, 2025, 2:26 PM

#

shut marten I've tried ChatGPT directly, GPT-5 in GH Copilot Chat, GPT-5 from the API throug...

ChatGPT will suffer from context exhaustion immediately (idk why they did this for paying users like it's 2023). Agentic tools are extremely non-deterministic so you must constantly re-evaluate your tools and prompting.

violet heath Aug 10, 2025, 2:59 PM

#

shut marten I've tried ChatGPT directly, GPT-5 in GH Copilot Chat, GPT-5 from the API throug...

the chatgpt that is being used in xcode 26 beta 5 is not gpt 5 right?

shut marten Aug 10, 2025, 2:59 PM

#

Yeah, it's likely 4o which is why I'm using GPT-5 through the API

#

They will likely (according to leaks) update all Apple Intelligence features to GPT-5 when macOS/iOS 26 releases

fading kelp Aug 10, 2025, 9:27 PM

#

dapper shard did u experience any issues while using the unsloth finetuning notebook?

I can't convert my safetensors file to a GGUF file.

dapper shard Aug 10, 2025, 9:45 PM

#

fading kelp I can't convert my safetensors file to a GGUF file.

Atm you can only convert it to f16, does that work?

#

You need to use a better GPU to convert it

#

And use the basic safetensor file

tepid garnet Aug 11, 2025, 6:47 AM

#

@tidal trellis You can run a free opensource OpenAI model on your own computer, you just download a program called LM Studio and use that to download one of the two opensource models

spark oar Aug 11, 2025, 6:47 AM

#

I used LM Studio a free program and it was front page news when I downloaded the model and loaded it. It was that easy

#

Oh sorry Robert

tidal trellis Aug 11, 2025, 6:47 AM

#

what are the benefits of downloading this?

spark oar Aug 11, 2025, 6:48 AM

#

tidal trellis what are the benefits of downloading this?

A free 03 mini model you can use forever no rate limit

#

*o3 level

tepid garnet Aug 11, 2025, 6:48 AM

#

spark oar Oh sorry Robert

no worries, anyway @tidal trellis OpenAI gifted us a way to use OpenAI models for free on our own computers keeping all of our conversations private to your own machine

tidal trellis Aug 11, 2025, 6:49 AM

#

what are the limitations?

#

that's an old model right?

tepid garnet Aug 11, 2025, 6:50 AM

#

tidal trellis that's an old model right?

no these are new models, released in the past two weeks

#

the only limitation is your own computer, it needs to have a GPU with enough VRAM to load the model

tidal trellis Aug 11, 2025, 6:51 AM

#

i have a 3060ti

#

a little old

#

8gb ram

tepid garnet Aug 11, 2025, 6:52 AM

#

then you can download LM Studio and run gpt-oss-20b

#

https://help.openai.com/en/articles/11870455-openai-open-weight-models-gpt-oss

OpenAI Help Center

OpenAI open-weight models (gpt-oss) | OpenAI Help Center

Learn about OpenAI’s open‑weight models (gpt-oss) and where to get support

tidal trellis Aug 11, 2025, 6:54 AM

#

thank you robert, I'm downloading the model now.

tepid garnet Aug 11, 2025, 6:54 AM

#

tidal trellis thank you robert, I'm downloading the model now.

great 🙂

tidal trellis Aug 11, 2025, 6:54 AM

#

so... why would people pay for gpt if they can get it free ?

#

there must be some difference

#

worth paying for... ?

spark oar Aug 11, 2025, 6:55 AM

#

to get 3K uses of gpt 5

#

thinking

tepid garnet Aug 11, 2025, 6:56 AM

#

tidal trellis so... why would people pay for gpt if they can get it free ?

because these models are small in comparison to ChatGPT, they are great models but don't have the depth of knowledge that you can get from the closed source models that OpenAI offer such as GPT-5

tidal trellis Aug 11, 2025, 6:57 AM

#

well, I've always loved knowing that I had the latest and greatest.. even though I wouldn't ever put it to full use

#

I have gemini pro, which I got to trial for a year

#

i've also used copilot.. it's free and doesn't seem to have any limits

tepid garnet Aug 11, 2025, 6:59 AM

#

well one of the advantages of running an opensource model on your own computer is data privacy, everything you discuss with gpt-oss remains private to your own computer

fallow oracle Aug 11, 2025, 7:29 AM

#

tepid garnet because these models are small in comparison to ChatGPT, they are great models b...

So would it be better at coding?

tepid garnet Aug 11, 2025, 7:29 AM

#

fallow oracle So would it be better at coding?

it's not better than GPT-5 on coding but it does a very good job with code

fallow oracle Aug 11, 2025, 7:31 AM

#

Well OpenAI says GPT-5 is really good at coding when it quite literally isn't any good at anything.

tepid garnet Aug 11, 2025, 7:33 AM

#

fallow oracle Well OpenAI says GPT-5 is really good at coding when it quite literally isn't an...

in my own experience ChatGPT GPT-5 is excellent at coding

fallow oracle Aug 11, 2025, 7:35 AM

#

tepid garnet in my own experience ChatGPT GPT-5 is excellent at coding

What language and do you use Pro?

tepid garnet Aug 11, 2025, 7:36 AM

#

fallow oracle What language and do you use Pro?

SwiftUI for macOS and Python and no, I am on Plus now.

fallow oracle Aug 11, 2025, 7:37 AM

#

Well Python is the easiest programming language I am not surprised if it's good at that- but other languages like Lua, Java etc it absolutely sucks.

tepid garnet Aug 11, 2025, 7:37 AM

#

fallow oracle Well Python is the easiest programming language I am not surprised if it's good ...

I have also used it with Laravel, a PHP framework for web dev without issues

old lagoon Aug 11, 2025, 7:46 AM

#

Heya, I understand things can get a bit heated at times, but let’s keep the conversation respectful, even when we don’t see eye to eye. Thanks :) @fallow oracle

#

Deleted your message for the reason mentioned, just letting you know so you’re aware robothumbs_up

fallow oracle Aug 11, 2025, 7:48 AM

#

Alright

tepid garnet Aug 11, 2025, 7:53 AM

#

I just ported my Amateur Radio Function Generator Simulator code from MacOS to Linux Qt with C++ in 10 minutes using gpt-oss-120b

shut marten Aug 11, 2025, 8:31 AM

#

And did it work? Do you use it in a way where it can debug itself?

tepid garnet Aug 11, 2025, 8:52 AM

#

shut marten And did it work? Do you use it in a way where it can debug itself?

yep works fine, I did it via the macOS desktop app just asking ChatGPT to refactor the code from SwiftUI to Qt

tepid carbon Aug 11, 2025, 11:31 AM

#

should you use lm studio or ollama or msty

tepid garnet Aug 11, 2025, 11:32 AM

#

I use LM Studio personally

royal kraken Aug 11, 2025, 1:24 PM

#

fallow oracle Well Python is the easiest programming language I am not surprised if it's good ...

I've also used it for PHP OOP programming and it's just fine. Now for the low-level firmware zone, which I have a project in, I haven't tried it out yet but I'm hoping it somewhat works

upbeat zealot Aug 11, 2025, 4:12 PM

#

tepid garnet I use LM Studio personally

Yeah it's great. But my pc is a potato one. It cannot even handle a 3B LLM.

strange yacht Aug 11, 2025, 7:30 PM

#

You could make a small chatbot and use inference, depending on how much you use the LLM it might be cheaper if you use API calls over paying for the model

solemn willow Aug 11, 2025, 9:00 PM

#

harsh aurora The model takes 40 GB of RAM, to run it you need that + whatever more for the OS

I'll just run the small one, I have 32gb of ram and don't mind waiting, I don't have the money for a GPU rn

solemn willow Aug 11, 2025, 9:00 PM

#

cyan kite "... gpt-oss-120b run on a single 80GB GPU (like NVIDIA H100 or AMD MI300X) and ...

Alright I'll do that thank you

strange yacht Aug 11, 2025, 9:01 PM

#

solemn willow I'll just run the small one, I have 32gb of ram and don't mind waiting, I don't ...

Don't think it'll run almost at all without a gpu

solemn willow Aug 12, 2025, 2:04 AM

#

No like I have a 1650 super so it runs just not fast

solemn willow Aug 12, 2025, 2:05 AM

#

strange yacht Don't think it'll run almost at all without a gpu

Like one prompt takes 5 minutes but at least it runs as of writing this

tepid carbon Aug 12, 2025, 7:43 AM

#

is my pc high end for running llms 32GB of ram amd RX 7900XTX 24GB of vram and a amd ryzen 7 9800x3d

hybrid magnet Aug 12, 2025, 7:55 AM

#

tepid carbon is my pc high end for running llms 32GB of ram amd RX 7900XTX 24GB of vram and ...

The recommended minimum for the smaller OSS model is 16 GB, so you seem good.

Minimum 80 GB ram recommended for the larger OSS model, not sure how you fit with that.

cobalt thunder Aug 12, 2025, 9:17 AM

#

Yea the larger one even the quantized are well over 64 GB

copper grove Aug 12, 2025, 12:14 PM

#

tepid carbon is my pc high end for running llms 32GB of ram amd RX 7900XTX 24GB of vram and ...

Thats more than enough for most models up to 70B parameters in 4-bit quantization, and even some 8-bit models if optimized. Your RAM is fine unless you're working with very large context windows or doing heavy multitasking alongside inference. But extremely large models or unquantized versions will be challenging

harsh aurora Aug 12, 2025, 1:33 PM

#

strange yacht Don't think it'll run almost at all without a gpu

as long as you have enough ram, it does, just slower

strange yacht Aug 12, 2025, 1:39 PM

#

harsh aurora as long as you have enough ram, it does, just slower

Yeah, just thought hed give up before an answer came up

strange yacht Aug 12, 2025, 4:23 PM

#

Just curious but is anyone having issues with LMStudio and gpt OSS not getting previous message context from a conversation? Basically it starts over with every message

#

I don't know why, every time I tool call, the conversation restarts

cold galleon Aug 12, 2025, 5:05 PM

#

tidal trellis i have a 3060ti

A LITTLE OLD

#

U kidding me?

#

This kind of technology is rare in my country 😢

tranquil oriole Aug 12, 2025, 6:56 PM

#

guys

#

I jail broke gpt-oss-20b

#

how can I request a bounty?

cyan kite Aug 12, 2025, 7:07 PM

#

tranquil oriole how can I request a bounty?

"Examples of safety issues which are out of scope:
Jailbreaks/Safety Bypasses (e.g. DAN and related prompts)"

tranquil oriole Aug 12, 2025, 7:11 PM

#

aj..

#

this is sad

left wadi Aug 12, 2025, 7:11 PM

#

tranquil oriole I jail broke gpt-oss-20b

It's easy to break. So is GPT-5. There hasn't (yet) been a single model I haven't broken except claude 3 and 4.

tranquil oriole Aug 12, 2025, 7:11 PM

#

I can win a competition though

tranquil oriole Aug 12, 2025, 7:11 PM

#

left wadi It's easy to break. So is GPT-5. There hasn't (yet) been a single model I haven'...

I broke it

#

already

left wadi Aug 12, 2025, 7:11 PM

#

tranquil oriole I broke it

Claude 3 and 4?

tranquil oriole Aug 12, 2025, 7:11 PM

#

left wadi Claude 3 and 4?

yup

left wadi Aug 12, 2025, 7:12 PM

#

What do you mean by broken?

tranquil oriole Aug 12, 2025, 7:12 PM

#

jailbroke

left wadi Aug 12, 2025, 7:12 PM

#

But specifically what do you think that means for a model to be jailbroken?

tranquil oriole Aug 12, 2025, 7:12 PM

#

it makes whatever you want

left wadi Aug 12, 2025, 7:13 PM

#

I define it as "using user prompting to get the model to disobey an explicit command given in the system/developer message."

#

Is that what you mean as well?

cyan kite Aug 12, 2025, 7:16 PM

#

tranquil oriole yup

you did not break them enough if you did not win the $10k-20k prizes in the reward but those belong to #ai-discussions

hazy sequoia Aug 12, 2025, 10:56 PM

#

cyan kite you did not break them enough if you did not win the $10k-20k prizes in the rewa...

wait what reward

#

is it still active?

#

i fine tuned a model which basically has no alignment at all now lol

vernal viper Aug 12, 2025, 11:12 PM

#

is there any documented fix for the single tool calling issue?

swift jacinth Aug 12, 2025, 11:22 PM

#

tranquil oriole jailbroke

I don't think so sir, breaking qwen is ez, and gpt-oss, I don't think so

oak ledge Aug 13, 2025, 12:38 AM

#

I’m happy

ionic prawn Aug 13, 2025, 4:02 AM

#

tranquil oriole how can I request a bounty?

You can’t bounty but if you look on #announcements you will see they are doing red teaming for money and u can submit a report there with a write up

strange yacht Aug 13, 2025, 1:39 PM

#

hazy sequoia i fine tuned a model which basically has no alignment at all now lol

I think it should be, did you find it Kiera?

raven sierra Aug 13, 2025, 4:42 PM

#

What VPS do y’all use to host oss

#

It was too heavy to run on my pc

tepid garnet Aug 13, 2025, 4:44 PM

#

I run it locally on my MacBook Pro, M2 Max with 96GB RAM. I wouldn't pay to host it anywhere as that kind of defeats the purpose of having a local model.

strange yacht Aug 13, 2025, 4:46 PM

#

raven sierra What VPS do y’all use to host oss

At that point just host gpt-5 or o3, no point in hosting a "local" gpt if its not local

hazy sequoia Aug 13, 2025, 5:44 PM

#

strange yacht I think it should be, did you find it Kiera?

yeah thanks

ionic prawn Aug 13, 2025, 5:58 PM

#

Why does oss think it’s GPT4 that’s what it keeps telling me it is???

tepid garnet Aug 13, 2025, 6:09 PM

#

ionic prawn Why does oss think it’s GPT4 that’s what it keeps telling me it is???

models know nothing about themselves, it's seen GPT-4 in its training

ionic prawn Aug 13, 2025, 6:44 PM

#

tepid garnet models know nothing about themselves, it's seen GPT-4 in its training

Even when I told it that it wasn’t and explained what it was it doubled down and said I’m wrong

tepid garnet Aug 13, 2025, 6:45 PM

#

it's only going by its training

strange yacht Aug 13, 2025, 7:09 PM

#

ionic prawn Even when I told it that it wasn’t and explained what it was it doubled down and...

That is a thing with the new models and it's intentional, the model isn't meant to be swayed by the user, it will think its knowledge is better than the user's knowledge. (Because a grand majority of the time it is)

and you telling it it's something does not change its training, it has only seen that 4 exists in its training so it will parrot that and assume that is the case.

ionic prawn Aug 13, 2025, 7:22 PM

#

strange yacht That is a thing with the new models and it's intentional, the model isn't meant t...

Ohhh

whole lion Aug 13, 2025, 9:01 PM

#

Gpt 4 was better

hazy sequoia Aug 14, 2025, 1:11 AM

#

whole lion Gpt 4 was better

the likely multi trillion parameter model performs better in some tests than a 20 billion parameter model?

#

nooo

harsh aurora Aug 14, 2025, 2:25 AM

#

ionic prawn Why does oss think it’s GPT4 that’s what it keeps telling me it is???

because gpt-oss didn't exist when gpt-oss was being created =P

ionic prawn Aug 14, 2025, 2:26 AM

#

harsh aurora because gpt-oss didn't exist when gpt-oss was being created =P

yes but 5 didnt exist when 5 was being created yet it knows its 5

harsh aurora Aug 14, 2025, 2:27 AM

#

tbh, GPT-5 was the first model to acknowledge itself as GPT-5 at release day

ionic prawn Aug 14, 2025, 2:27 AM

#

also it told me that it wasnt self hosted on my machine and insisted over and over it was running on openai azure servers