#gpt-oss


livid cipher
#

first

#

:3

obsidian wyvern
#

💌

glacial canyon
solemn willow
twin tundra
#

what's this

carmine sandal
#

Is the agentic stuff the tools that can be added to the model?

shut pollen
#

cool

obsidian wyvern
mint rain
naive elbow
mint rain
#

excited for the Hackathon

digital radish
#

selah.

mint rain
#

lets see how much we can push OSS

clever rose
#

116tok/sec is sweet. This is a fast model (20b)

meager karma
#

What type of idea do you have in mind, actually?

naive elbow
mint rain
#

I'll do an IoT project

naive elbow
#

let’s see how good it is

carmine sandal
naive elbow
digital radish
meager karma
#

Can you use ollama ?

naive elbow
#

well i can’t post the link because of auto mod

meager karma
digital radish
naive elbow
#

but just go in google search and type LM studio

digital radish
meager karma
naive elbow
carmine sandal
digital radish
naive elbow
#

thanks

serene holly
static pivot
#

👀

solemn willow
#

Runs decently well in LM Studio.
137 tok/s for gpt-oss-20b on my 4090

Only issue is the model doesn't follow system prompts at all. Unsure if that's baked in or a llama.cpp related issue.

gaunt heart
#

who’s oss

half skiff
obsidian wyvern
#

OSS = Open Source Software

amber spruce
#

you can run it locally

clever rose
#

people are going to finetune it with crazy stuff you can't get on chatgpt

amber spruce
#

You just mentioned it

#

That you can run it without wifi

amber spruce
#

I'm far far from an expert on this topic but local LLMs do really have their advantages

gaunt heart
#

but it can be used to access my file system?

#

interesting

half skiff
#

To search through documents. offline

gaunt heart
#

oh wow

#

law firms are now automatic

half skiff
gaunt heart
#

wow

#

i think everything will be ai soon

#

like how everything is computers

#

just all computers will be ai

meager karma
#

You can fine-tune a local AI, and it also has better privacy

drowsy badger
hardy jasper
#

so whats this thing good for ? coz it sure as heck cant code properly

meager karma
hardy jasper
#

is it aimed at something specific that its good for ( at the moment ? )

drowsy badger
#

It says it's very good at health related questions

meager karma
fair summit
#

why is gpt oss so slow on ollama on my computer

#

it's an m2 air with 16gb ram and 1tb ssd

obsidian wyvern
carmine sandal
shut pollen
drowsy badger
#

16gb vram is minimum thats why

fair summit
#

it requires a whole minute or two for it to answer the most basic question

hardy jasper
#

can it do web searches on LMStudio ? Anyone have it installed to check ?

drowsy badger
#

Companies who deploy to production and are working with highly confidential data.

subtle brook
meager karma
#

For now, OpenAI wanted to release a local model for people who want a local ChatGPT with OpenAI's capability, without turning to another AI brand, and OpenAI is using this occasion to run a community challenge

solemn willow
fair summit
meager karma
#

Where are you from tho ?

drowsy badger
#

I'm sad they didn't release a smaller model for smartphones though.

static pivot
#

Oh, what? That's a shame.

subtle brook
meager karma
#

How much importance you attach to it may depend on where you're from

obsidian wyvern
#

I wonder if there is a quote/razor for "Whatever a company releases it's never good enough for a specific class of people who are never satisfied."

drowsy badger
#

Would you be comfortable if somebody had access to all of your chat history with chatgpt?

obsidian wyvern
#

huh? From the guy who doesn't care about privacy?!?!

gaunt heart
#

you don’t want people to have ur chat history

#

bruh

obsidian wyvern
meager karma
#

it will fit your needs if you fine-tune it, for example into a specialized expert in a domain

obsidian wyvern
#

There's much more to the OSS aspect. Look around for comments by people who have been excited about OSS AI for some years now, the game-changing event with Deep Reasoning. Downloadable AI - with high quality and from a trusted source - is seriously a big industry deal.

meager karma
#

You have even more flexibility with the local model

obsidian wyvern
#

You can run different models in different places for different purposes. Tuned (within limits) to your specifications.

naive elbow
#

just like a chatgpt response

meager karma
obsidian wyvern
obsidian wyvern
#

This is one of the reasons why I really like OpenAI :

Estimating worst case frontier risks of open weight LLMs
In this paper, we study the worst-case frontier risks of releasing gpt-oss. We introduce malicious fine-tuning (MFT), where we attempt to elicit maximum capabilities by fine-tuning gpt-oss to be as capable as possible in two domains: biology and cybersecurity....
https://openai.com/index/estimating-worst-case-frontier-risks-of-open-weight-llms/

swift furnace
#

asked this already but we probably cant use this thing to generate images locally huh?

full ruin
meager karma
naive elbow
swift furnace
naive elbow
#

i mean you probably could ask it to write you a stable diffusion prompt for a cat

swift furnace
naive elbow
naive elbow
swift furnace
strange yacht
#

To be fair, you could probably run this to get a prompt and only use the prompt with inference to gpt-image to avoid those issues

#

also reduce costs

swift furnace
#

sure you probably could. im just looking to see if i can generate some stuff locally with a powerful model for funsies.

naive elbow
swift furnace
#

these oss models. are you able to further train them to fit specific needs by/for yourself?

tropic mural
#

It has full parameter fine tuning available

swift furnace
#

that is pretty incredible

meager karma
#

Where is the participant version release at ?

strange yacht
#

Yeah they can be finetuned, apparently the 20b one can even be fine-tuned on consumer hardware

swift furnace
#

i need better hardware to do all that. my 3060 12gig is not gonna be enough smh. was there any official word on what kinds of devices i could do all of that on?

#

devices lmao. hardware is what i meant

strange yacht
naive elbow
swift furnace
naive elbow
tropic mural
#

the 20b isn't terribly large you can run it with that I'm sure

strange yacht
#

Yeah 20b is fine

swift furnace
#

thats a relief

#

3060 ftw man. so glad i got the 12gig version

naive elbow
swift furnace
swift furnace
tropic mural
#

I've got a 3070 on desktop but want to try it on a macbook pro

swift furnace
tropic mural
#

I'll pull it now and see how it does

#

This is going to be a few lol

strange yacht
#

I don't think that will do well at all

#

I will be surprised if it does anything at all

tropic mural
#

Once it downloads and is ready, we'll see. Worth a try.

swift furnace
#

you'd be surprised, i think it will fare well

#

but what do ik lmfao

tropic mural
#

I have run other models on this and large ML projects. Macbooks are not the machines they were 10 years ago

swift furnace
#

agreed

tropic mural
#

This is taking a while to download so will check back later, doing other work for now 😂

wide crest
#

can't wait to melt my 2080 Super!

swift furnace
tropic mural
swift furnace
#

i dont get off work for a couple hours

wide crest
tropic mural
#

Seems like a good sign

swift furnace
#

noice

vivid falcon
#

I can't seem to get this running on my system no matter what I do... I'm using LM Studio and have a 3060 12GB, and the 20B can't run at any quant...

livid cipher
naive elbow
wide crest
#

I saw that the 20b requires 16GB. Maybe that's just for optimal performance? capybarathink

vivid falcon
naive elbow
vivid falcon
#

Cool. someone should tell Sam Altman that 16GB VRAM is not standard on most desktops and laptops, heck, according to Jensen Huang, 8GB VRAM should be good enough for just about everyone

livid cipher
#

we all know what he talks isnt what he really thinks

vivid falcon
#

What he really thinks is "I'm gonna become the richest person in history if I keep this Enterprise-level AI stuff up. Screw those gamers who got me here."

#

Still the least-bad tech billionaire though... But seriously. an Open source model to run on "edge devices" needs to run smoothly on like a Pixel 6...

tropic mural
#

A phone model is a much different goal and field

#

There are nano models that run on phones and I believe sama even did a poll of what people wanted, and phone lost?

vivid falcon
minor moat
vivid falcon
#

Wells Fargo might consider a 16GB machine an edge device, but most individual people's bank accounts consider a 16GB to be aspirational

minor moat
#

also I recommend the official MXFP4 quant

tropic mural
#

Found a bug on metal examples

#

am pioneer

vivid falcon
minor moat
minor moat
#

ollama works

minor moat
tropic mural
#

I'm having mild success on m4 pro chip 48gb

minor moat
#

bcuz I installed on arch linux the AUR version and it seems already outdated so it wouldn't work

tropic mural
#

Going to try more

vivid falcon
minor moat
#

also you sure the model that you downloaded is official?

#

lms get openai/gpt-oss-20b

#

in command prompt will download the official version into lm studio
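
Once the model is downloaded and LM Studio's local server is started, it can be reached through an OpenAI-style HTTP API. A minimal sketch, assuming the server listens on `http://localhost:1234/v1` and that the model identifier is the same `openai/gpt-oss-20b` string used in the `lms get` command; both are assumptions, so check what your install actually reports:

```python
import json
import urllib.request

# Assumption: LM Studio's local server is running on this address.
BASE_URL = "http://localhost:1234/v1"


def build_chat_request(prompt, model="openai/gpt-oss-20b"):
    """Build an OpenAI-style chat completion payload for a single user turn."""
    return {
        "model": model,  # assumed identifier, taken from the lms command above
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt, base_url=BASE_URL):
    """POST the payload to the local server and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `chat("hello")` only works while the server is running; `build_chat_request` can be inspected offline.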

#

very weird that yours doesn't load

#

Please update to LM Studio 0.3.21 to run the model locally.

royal linden
#

llama-cpp doesn't support the model at this moment?

naive elbow
#

it probably was hardware not being compatible

minor moat
#

he also has it

naive elbow
#

hmmm then that is strange

minor moat
#

I use ollama

naive elbow
#

oh

minor moat
#

since my "new" lmstudio was outdated

copper grove
#

The system instructions say we need to respond as ChatGPT
lol

naive elbow
minor moat
naive elbow
copper grove
#

Oh I see

minor moat
#

It represents itself as ChatGPT

#

You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2024-06
...

#

It is long

#

Also, the reasoning presets (low, medium and high) are set in the system prompt.
Reasoning: medium
for example
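
A small sketch of how that line could be put together in code. The helper name `build_system_prompt` is made up for illustration, and the assumption is that your runtime passes the system text through to the model unchanged:

```python
VALID_LEVELS = ("low", "medium", "high")


def build_system_prompt(level="medium", extra=""):
    """Return system text that starts with a 'Reasoning: <level>' line."""
    if level not in VALID_LEVELS:
        raise ValueError(f"level must be one of {VALID_LEVELS}, got {level!r}")
    lines = [f"Reasoning: {level}"]
    if extra:
        # Any additional instructions go after the reasoning line.
        lines.append(extra)
    return "\n".join(lines)
```

For instance, `build_system_prompt("high", "Answer briefly.")` gives a two-line system message.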

minor moat
#

My PC can't handle it tho

#

so I use 20B

royal linden
#

Anyone having success with transformers snippet of code on their model card on Mac? When I run it says GPU is necessary?

Also, llama-cpp support is not out yet right?

copper grove
minor moat
minor moat
#

For now idk what use case it would be useful for

copper grove
minor moat
#

also I was disappointed that, like many OSS models, they didn't try to train the model on multiple languages

minor moat
#

so not interesting use case for me

naive elbow
naive elbow
#

that’s good

#

some models don’t allow it

minor moat
#

idk I don't see any physical constraint on open weight / open source models to not change the sys prompt

#

I think almost every model has a system message to do it

#

maybe the old ones

naive elbow
#

hmm interesting

#

so whatever that was probably was a bug

tropic mural
#

I ran with wrong weights and got hilarious random text

#

New password generator just dropped

wide crest
tropic mural
#

Macbook Pro with M4 Pro chip and 48 GB memory. 20b model runs quickly but I had to do a ton of tinkering with the example code to make it work nicely.

wide crest
#

I foresee a GitHub repo in your future dinopog

tropic mural
#
```
python gpt_oss/metal/examples/generate.py \
  ~/ML_Playground/gpt-oss-20b/metal/model.bin \
  -p "Why did the chicken cross the road?"
The user question: "Why did the chicken cross the road?" Usually answer: "To get to the other side." This is a joke or riddle. They might expect a typical answer. They said "I'm not sure if you want me to respond with a joke, but I'm going to say: "To get to the other side".

We can respond with a one line answer: "To get to the other side." Probably the best answer. There's no extra context. Provide the joke as known.

Ruby can be used but no.

No duplicates. They want a straightforward answer.To get to the other side.
```

The "thinking" portion before the actual response is... strange in this case but it's so funny idk if I want to question it.
#

Metal might not be optimized yet but I got something

shut pollen
dire eagle
#

So a channel to talk about the new OSS model?

copper grove
#

It only uses 3.6B active parameters

little quarry
#

are the Hugging Face credits automatically added?

marble bone
#

does the huggingface account need to be on my school email? i only have a huggingface account on my personal email address...

upbeat pawn
#

How strong does my computer need to be to run 120b?

tropic mural
celest torrent
#

Do you think the 20B model could work as an agent to program?

#

I know it wouldn't be as good as codex but it can be...decent?

vapid marten
meager karma
#

Does it work ?

vapid marten
#

Thank you. It works.

meager karma
#

It's because you add r

#

to transfomers

vapid marten
# meager karma to transfomers

Hahahaha, okay. The link I used was actually from an official oai site. Not sure which one it was though. Regardless, just a harmless typo it seems 😁

strange flume
#

/lmstudio-community/gpt-oss-120b-MLX-8bit

This doesn't work anymore. my download was yanked halfway through. I can only find 4 bit quant versions now. The official model card still says 16bit/u8, but all the downloads say MXFP4 and are 64gb small. Talking about the 120b version. Does someone know what happened?

left wadi
strange flume
#

Why first offer a 16bit / u8 version then yank it? Also better precision definitely has its use cases

glossy pond
#

If you want reasonable conversation response time, I mean. You could run it on less if you are batching overnight or something.

spiral gazelle
finite glen
gaunt heart
#

is it real

#

i saw a twitter thread about it

#

that its able to simulate a world

finite glen
#

no

#

maybe

#

depends for what

#

hard to say 100%

#

probably deepseek

#

apparently oss is bad at tool calls

#

it can be made multimodal with tool calls

swift jacinth
#

Thank you OpenAI for opening GPT-OSS

#

Awesome model with top tier reasoning and really fast inference speed, I got over 100 tok per sec on a 3090

wide crest
#

oooooh, that's awesome to hear it performed so well!

naive elbow
spiral gazelle
#

Yeah, it's not cheap, but it's more realistic than buying H100s yourself and building a cluster. (Simple inference only requires one H100 though, as their model documentation says.)
Alternatives are also not that cheap if you want fast speed: a $4000 Mac Studio with 96GB RAM, or the $3000 NVIDIA Project DIGITS.

The 120b model is not for consumers anyway, except enthusiasts who have already built a system for such large models...

#

Btw, RunPod offers a single H100 for ~$3 per hour. I have no serious experience with cloud, so I don't know if that appeals to users.

naive elbow
#

but yeah it’s not for the consumer more like companies and enthusiasts that have the setup

gaunt beacon
#

Hello guys, just wanted to confirm: is the new gpt-oss (120B) the most powerful Open Source Model out there?

gaunt beacon
#

Are the benchmarks already available or do we still need to wait?

naive elbow
gaunt beacon
proper shuttle
#

hewwo, does anyone know if the OSS model supports the name field like the API?

naive elbow
#

well anyways i got the 20b working on my server. doesn’t look like it’s good for the server, but it isn’t slow

nova nebula
#

Guys how much unified memory would i have to have in order to run 120b version on my MacBook?

spark oar
#

I would still rather just use 4o but it's really cool have an LLM that will run on my rtx2080 lol

spice magnet
#

yeah tbh gpt-oss 120b is so bad, it fails every roo code tool use and gets into thought loops constantly

#
  • doesnt even have up to date knowledge of stuff that happened more than a year ago
spiral gazelle
#

Artificial Analysis has benchmark results in various areas. Well, it's not that impressive, honestly... Not bad, but also not impressive considering this model is from OpenAI.

#

However, it's good that they released an open-weight model. It's practically the first time for OAI - not PoC, not research-only.

still ridge
#

hello!! everyone..!!

nova nebula
spiral gazelle
#

Could be a fair point, but OAI advertised that 120b outperforms DeepSeek-R1 and even o3-mini.

cedar shoal
#

yo

#

how to get the smaller model?

#

ping me if you have an answer thx

meager karma
copper grove
copper grove
#

I like the new models so far. I wonder when we can use models that are as good or even better than o3-pro locally

ionic holly
#

Hey, can someone tell me if you’ve already talked about GPT-OSS (open-weight models)? What are the pros and cons of using them?

robust swallow
#

❌ Error during agent execution: Error code: 400 - {'message': 'Model generated a tool call with name "find<|end|><|start|>assistant<|channel|>commentary" that is not in the tools list: ['search_wiki', 'search_web', 'open_url']', 'type': 'invalid_request_error', 'param': 'tools', 'code': 'wrong_api_format'}
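
The tool name in that error has chat-format special tokens (`<|end|>`, `<|start|>`, `<|channel|>`) glued onto it. One possible client-side workaround, sketched here as an assumption rather than an official fix, is to cut the name at the first special token and only accept it if it matches a declared tool. The helper name `clean_tool_name` is made up for illustration:

```python
import re

# Special tokens of the form <|end|>, <|start|>, <|channel|>, ...
SPECIAL_TOKEN = re.compile(r"<\|[^|]*\|>")


def clean_tool_name(raw_name, allowed):
    """Cut the name at the first special token; return it only if it is a known tool."""
    candidate = SPECIAL_TOKEN.split(raw_name, maxsplit=1)[0].strip()
    return candidate if candidate in allowed else None
```

Note that for the error above the cleaned name would be `find`, which is still not in the tools list, so this kind of cleanup would reject the call rather than rescue it.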

#

awesome

strange yacht
#

Hey is anyone here using gpt-oss on LMStudio? Is there a UI way of enabling web search?

#

Or do I need to modify code? with what is stated in their site?

copper grove
spiral gazelle
strange yacht
#

I was hoping they would take advantage of Gpt-OSS's included web search by modifying the system prompt and just add a toggle for it

spiral gazelle
#

Additionally, internet search requires a long context (the LM Studio default is maybe 4096 in most cases), and that requires more RAM.

ionic holly
spiral gazelle
strange yacht
#

Guess Ill have to use an agent framework then

copper grove
# ionic holly Oh nice, I asked to read some personal opinions on this update

My personal opinion is, that it's pretty good at math. So far it was able to help me with pretty much anything, I'd say even on the same level as o3. I hadn't much time to test other stuff but it definitely helps with all the basic things. For the 120b model, you need a very beefy pc, even with 4bit quantization, which I happen to have, but it's rather slow still. Thats why I mostly use the 20b model and I get like 40-50 tokens per second with it.

robust swallow
#

The user asks: "What is the capital of France?" We have a tool that fetched "It is Paris." The assistant should answer. We must provide answer. There's no extra nuance. The answer: Paris. Maybe also a brief. So answer: Paris.

#

The assistant should answer. We must provide answer.

#

wow

granite garnet
#

Hey! Has anyone tried using the latest models on local gpt-oss?

#

i have a problem, gpt-oss is running on cpu :/

robust swallow
#

how much vram do you have

granite garnet
robust swallow
#

hm

#

in what way u running

#

ollama lmstudio

strange yacht
granite garnet
copper grove
strange yacht
robust swallow
#

ollama always chose gpu for me hm

granite garnet
strange yacht
#

Oh that is weird then

shut pollen
#

no

wild roost
#

hf[dot]co[slash]tonic [slash]gpt-oss-20b-multilingual-reasoner made this with this github[dot]com[slash]josephrp[slash]smolfactory yesterday , if anyone wants to help me out i want to iron out some stuff and basically publish it asap , hopefully tomorrow or probably day after 🙂

eager anvil
#

What's the performance difference between the GGUF and the MX-FP4 versions?

#

Trying to decide if it is worth trying to run on my mac.

#

Guess ollama runs the MX-FP4 version, not a GGUF, so it should just work right?

midnight sorrel
#

MXFP4 is different

#

the model's weights are available as MXFP4 in a gguf file

eager anvil
left wadi
#

My laptop is only getting 50 tokens per second on GPT-OSS 20B.

midnight sorrel
crimson anvil
#

y

eager anvil
#

But this is the saddest reasoning tokens you can receive: According to the policy above: "Vampires" is in the list of disallowed content.

midnight sorrel
eager anvil
midnight sorrel
#

yeah, gpt-oss is very bad at creative stuff

eager anvil
#

Not sure what guidelines would have ever made that disallowed content.

midnight sorrel
#

it's good for coding and stuff tho

eager anvil
midnight sorrel
#

not cursor

#

kilo code

#

tho don't take my opinion, I only make very small controlled changes on my code

eager anvil
#

I need to rewrite this so that the function calls go ahead and call the functions, and their results automatically get fed back into the conversation and the llm called again.

If gpt-oss:20b can do this, it's over.
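
Roughly, the loop being described could look like the sketch below. It assumes OpenAI-style message dicts where the assistant reply may carry a `tool_calls` list; `call_model` is whatever function talks to your local model, and `fake_model` is only a stand-in so the loop can be exercised without a server. All names here are illustrative:

```python
import json


def run_agent(call_model, tools, messages, max_turns=5):
    """Call the model, run any requested tools, feed results back, repeat."""
    for _ in range(max_turns):
        reply = call_model(messages)
        messages.append(reply)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            return reply["content"]  # no tool requested: this is the final answer
        for call in tool_calls:
            name = call["function"]["name"]
            args = json.loads(call["function"]["arguments"])
            if name in tools:
                result = tools[name](**args)
            else:
                result = f"error: unknown tool {name!r}"
            # Feed the tool result back so the next model call can use it.
            messages.append(
                {"role": "tool", "tool_call_id": call["id"], "content": str(result)}
            )
    raise RuntimeError("model did not finish within max_turns")


def fake_model(messages):
    """Stand-in for a real model: asks for one tool call, then answers."""
    if messages[-1]["role"] == "tool":
        return {"role": "assistant", "content": "The sum is " + messages[-1]["content"]}
    return {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {"id": "1", "type": "function",
             "function": {"name": "add", "arguments": '{"a": 2, "b": 3}'}}
        ],
    }
```

With the stand-in, `run_agent(fake_model, {"add": lambda a, b: a + b}, [{"role": "user", "content": "2+3?"}])` walks through one tool call and one final answer.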

#

Okay, that was amazing. It messed up the VSC tool call, and the code wasn't perfect, but it was really close.

royal kraken
#

Is it possible to run the 120B model on pure CPU and no VRAM? I heard you need 128 GB RAM to run it

royal kraken
#

I saw some post of someone doing that and they were getting something in the ballpark of 50-100 tokens/second, which surprised me

shut pollen
#

I'm able to run it on apple silicon without any issues

royal kraken
#

With 128 gb ram?

shut pollen
#

yeah

royal kraken
#

Very nice, how many tokens you getting per sec?

shut pollen
#

(this is also with like a million other things open) and at 32k context

royal kraken
#

Ah ok

shut pollen
#

its not too bad

royal kraken
#

How is it versus deepseek r1?

shut pollen
#

It runs faster than r1 70b

#

performance wise its kinda similar, sometimes worse

royal kraken
#

Sometimes worse in what facility? I want to use it for data processing

shut pollen
royal kraken
#

GLM 4.5 is better than qwen imo

#

For that

shut pollen
#

also struggles at tool calls

royal kraken
#

too bad

echo shuttle
#

Hello, im new to oss models
What are the best models for general use, and also models specifically for coding?
And how do they compare to 2.5pro or o3

royal kraken
#

for coding, glm 4.5, i'd say on par with both 2.5pro and o3. for the air model, a bit worse than o3 but still decent

upbeat pawn
#

I downloaded 20b. How do I run it like a chat?

echo shuttle
#

And like for consumer usage coding and general best models? Like less than 15B params

royal kraken
#

there are no good models for coding that are under 15b

#

gemma 12b is not bad for a basic reasoning model that can run off a potato machine, relatively speaking

#

use the unsloth version though

obsidian wyvern
#

For everyone here, developers will probably also be interested in https://community.openai.com/t/openais-open-weight-models-are-here-gpt-oss-120b-and-20b/1334739

OpenAI Developer Community

Welcome OpenAI’s new advanced open-weight reasoning models to customize for any use case and run anywhere. Permissive license Designed for agentic tasks Deeply customizable Access to the Full chain-of-thought Try both models in the browser. The playground is available here! Or, start building right away! Download from Hugging Face or view ...

strange yacht
#

Anyone here has worked with LM studio and MCP servers before? I am having a lot of trouble running an MCP server and having it recognized by LMStudio to use with GPT-OSS to test tool usage

steel vine
#

maybe best to ask on the lm studio discord?

eager anvil
#

I'm consistently getting about 37tps with the 20b on my m1.

#

Not bad.

#

It's so good, wondering what I'd need to run 120b locally.

steel vine
#

six times more memory

eager anvil
#

32gb of shared

#

But only using a little over 13 with ollama

steel vine
#

4 bit quant?

eager anvil
#

Yeah, the one OAI packaged and released.

#

MXFP4
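
For a back-of-the-envelope check on those memory numbers, here is a rough sketch. It assumes about 21B total parameters for the 20b model and about 117B for the 120b, and about 4.25 bits per weight for MXFP4 (4-bit values plus one shared 8-bit scale per block of 32). Not every tensor is stored in MXFP4 and the context cache is extra, so real usage lands somewhat higher:

```python
def weight_memory_gb(total_params, bits_per_weight=4.25):
    """Approximate weight storage in GB: params * bits / 8 bits-per-byte / 1e9."""
    return total_params * bits_per_weight / 8 / 1e9


# Assumed parameter counts; treat them as approximate.
small_gb = weight_memory_gb(21e9)    # roughly 11 GB
large_gb = weight_memory_gb(117e9)   # roughly 62 GB
```

That is in the same ballpark as the "little over 13" reported above once runtime overhead is added.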

#

Trying to figure out what the hugging face model card means by: Web browsing (using built-in browsing tools)

#

What built in browsing tools?

steel vine
#

interesting. i wonder why the usual huggingface quantizers are releasing 4, 5, 6 and 8 bit quants if openai already released quantized versions

#

and you have to intentionally enable it:

To enable the browser tool, you'll have to place the definition into the system message of your harmony formatted prompt. You can either use the with_browser() method if your tool implements the full interface or modify the definition using with_tools().

eager anvil
#

Tried to get codex working with gpt-oss using this on the config.toml:

```
show_reasoning_content = true

[model_providers.local]
name = "local"
base_url = "http://localhost:11434/v1"

[profiles.oss]
model = "gpt-oss"
model_provider = "local"
```

Codex can't find it for some reason.
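
One thing that might be worth checking is whether the `model` value in the profile matches an id the server actually reports. A minimal sketch, assuming the endpoint at that `base_url` answers `GET /models` with an OpenAI-style `{"data": [{"id": ...}]}` body; the function names are made up for illustration:

```python
import json
import urllib.request


def extract_model_ids(payload):
    """Pull model ids out of an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_local_models(base_url="http://localhost:11434/v1"):
    """Ask the local server which model ids it exposes (needs the server running)."""
    with urllib.request.urlopen(f"{base_url}/models") as response:
        return extract_model_ids(json.load(response))
```

If the list shows a tagged id such as `gpt-oss:20b` rather than plain `gpt-oss`, the name in the config may need to match it. That is a guess, not a confirmed cause.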
#

It works as an agent in VSC, but it will do things like call the tool to read the file twice, then forget what it was doing and end with "how can I help."

steel vine
#

you will want a recent version of ollama to support the harmony response format

#

when was the last time you updated ollama? (isnt automatic on linux)

bright wedge
balmy wing
#

This sub is for Apple as they will be putting it on their iPhones soon xoxo

eager anvil
eager anvil
#

Wonder why gpt-oss isn't on any benchmarks yet.

balmy wing
#

lol each new phone gets a new model 🤣

balmy wing
#

You can run benchmarks vs each other

shut pollen
tawdry tulip
#

How do you get the 20b one in ur phone

#

And does it work offline?

midnight sorrel
#

You'd need like 16-24 gb ram for this

hybrid magnet
static pivot
#

Oh sheesh, I'll need to make sure my external drive can handle it on my potato of a pc.

hybrid magnet
hybrid magnet
midnight sorrel
hybrid magnet
# midnight sorrel https://x.com/sama/status/1952777539052814448

Thank you!

Now I am wondering if the word 'a' is significant.

A phone. One. There is one super-phone, in the world, that can run this model.

I'm joking, I'm not really at all skilled in comp sci, I have no idea but a 16 gb ram phone sounds awesome to me. I got a potato phone 😄

midnight sorrel
#

but generally speaking phones don't have 16gb
going by standards of top iphone, samsung and pixel phones

spiral gazelle
#

I had a gaming laptop with an RTX 3060 (6GB VRAM + 32GB CPU RAM)... guess it's not high-end enough. Gemma 3 12b (7.2GB in size) was already slow.

naive elbow
spiral gazelle
#

Well, early GPT-4o hallucinated policy, and I got "image of dragons is not allowed"...

#

My test for gpt-oss-20b.
Prompt:

Are there any disallowed content in the following: "Vampire", "Werewolf", "Dragon"

Reasoning token (partial)

But the user lists "Vampire", "Werewolf", "Dragon". These are mythical creatures. They don't appear to be disallowed. There's no mention of violence or other categories. It's just terms. So likely none of these is disallowed content.

We need to check if any of them could be considered a protected class? No. Not relevant. Are they violent? Possibly but not necessarily. "Vampire" and "Werewolf" are fantasy creatures, might depict violence in stories, but the term alone isn't disallowed. There's no mention that these terms refer to real persons or groups. They are fictional.

Thus the answer: None of those are disallowed content.
hybrid magnet
hybrid magnet
spiral gazelle
hybrid magnet
#

Whoops, gotcha. Sorry!

spiral gazelle
#

I was responding to Sam's tweet:

we made an open model that performs at the level of o4-mini and runs on a high-end laptop

floral wing
static pivot
#

I don't know if mine does. 💀

harsh aurora
#

there are phones with 16 GB RAM

#

but a distilled and fine-tuned version of the model would probably perform better for this use case

silk hazel
#

u guys should stop working and wait for gpt5 so you can work faster

tawdry tulip
tepid garnet
#

I am running openai/gpt-oss-120b on my MacBook Pro, very happy with it apart from the knowledge cut-off date of June 2024

midnight sorrel
tepid garnet
midnight sorrel
#

could be

copper grove
midnight sorrel
tepid garnet
#

I just got the 120b model to code me a version of my talking clock app, worked first time

analog coral
#

hi, I'm dumb and I don't know the very first thing about locally running a model, but with these models releasing I wanted to learn how. I tried looking things up and I didn't end up getting very far. instead, I was wondering if anyone could point me in the direction of any beginner friendly resources for getting something like this set up?

tepid garnet
analog coral
#

Windows 11, NVIDIA 4070 Super, Intel 14700K, 32gb ram

tepid garnet
#

how much VRAM does the 4070 have?

analog coral
#

12gb

tepid garnet
#

search for an app called LM Studio and use that to download and run the 20B variant

analog coral
#

Aye aye, captain

nova nebula
tepid garnet
nova nebula
tepid garnet
#

The 120b model is using 61GB RAM

#

running at about 16 tk/s

#

faster than reading speed

#

OpenAI did a really good job with the OSS models

midnight sorrel
analog coral
#

it can, it will, and it DID

#

and it doesn't break a sweat, i don't know how the wizards do it. thank you for the help!

tepid garnet
midnight sorrel
#

I tried running on my macbook, 16gb, it jumped to swap memory

#

1 token in 20 secs

tepid garnet
#

20B won't run on a 16GB MacBook

eager anvil
# naive elbow how is vampires disallowed?? they are sfw man oai censorship is diabolical

Keep in mind, it was in the context of a Warhammer Fantasy story about an empire village beset by a vampire. But I could see its reasoning trace and it hadn't thought up anything too bad, it just restated the request then said vampires were a safety issue. But Warhammer Fantasy was made for kids, and doesn't contain anything too bad except for I guess some extreme violence.

eager anvil
eager anvil
tepid garnet
#

I wonder how many tk/s the 20B model gives me on my Mac, I am going to try it

eager anvil
#

Be interesting to see the difference between m1 and m2.

tepid garnet
#

53.68 tok/sec - 1750 tokens - 0.36s to first token

eager anvil
#

I've been very impressed with the 20b. Tool calling isn't perfect but really good. I imagine if I used/wrote a library to automatically call the model to fix errors in JSON shapes, it would be perfect.
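
That kind of repair loop could be sketched roughly like this. It assumes `call_model` is a function that takes the message list and returns the model's raw text; the shape check here is just a list of required keys, and the function name is made up for illustration:

```python
import json


def call_with_json_repair(call_model, prompt, required_keys, max_attempts=3):
    """Ask for JSON; on a parse or shape error, show the model the error and retry."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_attempts):
        text = call_model(messages)
        try:
            data = json.loads(text)
            if not isinstance(data, dict):
                raise ValueError("expected a JSON object")
            missing = [key for key in required_keys if key not in data]
            if missing:
                raise ValueError(f"missing keys: {missing}")
            return data
        except ValueError as error:  # json.JSONDecodeError is a ValueError subclass
            messages.append({"role": "assistant", "content": text})
            messages.append(
                {"role": "user",
                 "content": f"That was not valid: {error}. Reply with corrected JSON only."}
            )
    raise RuntimeError("no valid JSON after retries")
```

A stricter version could swap the key check for a full schema validator, at the cost of an extra dependency.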

eager anvil
tepid garnet
#

M4 with 128GB RAM beats my M2 Max with 96GB RAM for sure

#

I spent 6k on my MacBook Pro, it needs to give me another 2 years of use before I upgrade

eager anvil
#

My previous mac lasted 11 years. I'm hoping to get that out of this m1. I really need to set up some cloud inference, but running local is just so much more fun.

tepid garnet
#

I like local models

#

and OpenAI did a good job with the OSS models

wheat jungle
#

I'm fairly new to AI. What advantages do the OSS models have over the browser based ChatGPT model?

tepid garnet
wheat jungle
#

very cool!

hot fiber
#

what is oss ?

mellow hedge
dense lion
#

hmm

eager anvil
red monolith
hot fiber
#

ty

wheat depot
#

codex with oss would be nice

spiral gazelle
#

Nobody knows, but I assume "open source software" or something.
gpt-oss is open-weight, not open source though...

dense lion
#

how much ram and SSD memory is required to run this oss?

eager anvil
final cradle
#

Is the 20b model able to run on an iPhone 16 Pro?

dense lion
#

what are weights?

tepid garnet
mellow hedge
eager anvil
unique sinew
#

Does this have any correlation with integrating ChatGPT into your own Locally Hosted LLM? Or will the process still be the same, needing the API? Sorry if ignorant question.

eager anvil
clever rose
eager anvil
#

Just asked it for a sad story about a village beset by a vampire, and it said it couldn't do it.

eager anvil
final cradle
eager anvil
#

the clay throwing scene was intense.

hot fiber
dense lion
tepid garnet
unique sinew
mellow hedge
#

I wish I had powerful enough hardware to run GPT-oss 20b, but i have only 8gb of vram

eager anvil
clever rose
eager anvil
mellow hedge
unique sinew
# hot fiber but they do have a ton of filters i assume

You could always technically get erotic stories and insinuations. You just would have to ‘beat around the bush’ essentially and give ChatGPT the idea without seeming to be too influential or without that being your apparent directive.

eager anvil
tepid garnet
#

I am running openai/gpt-oss-120b on my MacBook Pro, M2 Max with 96GB RAM

eager anvil
mellow hedge
dense lion
unique sinew
eager anvil
tepid garnet
#

OSS models are free and open source

unique sinew
dense lion
clever rose
#

You can train it to rap every answer

unique sinew
#

Welp, RIP to my monthly API costs Trihard

tawdry tulip
dense lion
unique sinew
#

I mean, I’m sure if you can set up permanent memory with the OSS model, you for sure can train it.

#

Using the term permanent loosely here.

mellow hedge
dense lion
unique sinew
tepid garnet
#

openai/gpt-oss-120b is very impressive

unique sinew
#

I’ve somewhat already got that system implemented in my LLM structure where it ‘detects’ if a more structured/complex reasoning is needed- it calls to the most efficient Model, whereas if it doesn’t, it’ll utilize mini or nano.

#

So hopefully OSS can fit into that system well.
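
A minimal sketch of the kind of routing described above, assuming a crude keyword-and-length heuristic. The model names, hint words, and threshold are all hypothetical placeholders, not the actual setup from the message:

```python
# Hypothetical hint words and model names, for illustration only.
REASONING_HINTS = ("prove", "step by step", "debug", "refactor", "plan")


def pick_model(prompt: str, long_prompt_chars: int = 800) -> str:
    """Pick a model tier for a prompt using a crude complexity heuristic."""
    text = prompt.lower()
    needs_reasoning = len(prompt) > long_prompt_chars or any(
        hint in text for hint in REASONING_HINTS
    )
    # Send heavier requests to the larger model, everything else to the small one.
    return "large-reasoning-model" if needs_reasoning else "small-fast-model"
```

A local gpt-oss model could in principle be slotted in as either tier by changing the returned name; a real router would likely need something better than keyword matching.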

haughty ice
tepid garnet
haughty ice
shut marten
#

No, Mac works

dense lion
haughty ice
#

Huh, neat

tepid garnet
shut marten
#

The main limitation for many is the ram

haughty ice
analog coral
#

i have gone mad with power, and i am now doing something very unwise. 12gb vram is enough surely

clever rose
tepid garnet
shut marten
#

I run a Mac Studio M3 Ultra with 512gb of ram and I’m getting 34tps

haughty ice
#

How about gpt-oss-20b?

unique sinew
shut marten
#

I’m using it in Xcode 26 beta and it’s a bit slow so I’ve moved down to running the 20b one which is just fine for searching for things in the code

unique sinew
#

But I haven’t made it to the point of integrating them all cohesively. Between work and family it’s been a bit delayed.

haughty ice
#

I'm wondering if I can make a little helpdesk support agent /w 20b

#

My server's due for an upgrade though so gonna try it on my gaming rig for now

unique sinew
#

Local server or paid dedi

haughty ice
#

I have 2 local servers atm, ripped the gpus out of both

shut marten
#

A Mac mini with 32gb of ram would probably be enough for that

#

M4 chip

haughty ice
#

I dont like mac

unique sinew
#

Ooooooo ok ok. I’ve been wanting to scrape together a local server but I feel like one of those men who start 15 projects and never finish one.

haughty ice
tepid garnet
haughty ice
#

even like an rpi will do some jobs pretty well

unique sinew
#

I can imagine. Good friend of mine in Australia has a decent local server.

#

Well, a few. Borderline “local data center” worthy

haughty ice
split cipher
#

Once I get some cash, I think it's going to be worthwhile to invest in some servers

haughty ice
#

Id rather spend some more money on hardware and have something like nix running

#

or even my own proxmox node

haughty ice
#

Ive just never used it before outside of the gui

split cipher
haughty ice
#

So im not willing to invest any time into it

#

Considering how bad of an experience the gui was

#

I'd be dropping the ability to make any upgrades with a mac

tawny field
#

tbf, 20b on my 12gb 3080ti surprised me a lot on LM Studio. Moderate speed, but very good responses.

tepid garnet
#

LM Studio is great

tawny field
#

It does a nice job - I have so many models on my hard drive I've played with on it.

analog coral
#

i am currently loading the 120b model in lm studio out of sheer morbid curiosity, i want to hear the transistors burn

tawny field
#

Yeah I don't think that one would run on my 2019 Ryzen pc

#

yeah lmstudio goes red and says likely too large.

tepid garnet
#

openai/gpt-oss-120b is taking up 63GB RAM on my Mac

analog coral
#

Safeties: off. Transistors: burning. Discord? That crashed. We're going places.

earnest knoll
#

what would you guys say is the best or smartest or i guess the closest to agi in opensource/local ai models

vast ginkgo
tepid garnet
#

rule 1 of buying a Mac is to get the most RAM you can

analog coral
#

hmm, it appears I'm having difficulties running the one-hundred and twenty billion parameter model on my hardware

dense lion
#

I HOPE IT WILL JUST LIVE UP TO THE HYPE.

analog coral
#

once I finally squeeze it into this box, it'll croak out a dying one token per minute, and it will sound like a symphony

midnight sorrel
#

my memory is actually swapping

midnight sorrel
analog coral
#

If I had to guess, I'm not, because in the task manager my resource values would flip flop between using my graphics card and my memory like a person having a stroke

trail arch
#

has anyone encountered an error with gpt-oss-20b?
we've deployed it on our company's machine with a H200 with vllm (the newest gptoss docker image) and we're getting:

...
openai_harmony.HarmonyError: Unexpected token 12606 while expecting start token 200006

when trying to work in agent mode via VSCode plugins like Cline.bot/RooCode
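
For context, the error names 200006 as the expected start token. In the harmony format each message opens with a `<|start|>` marker, followed by the role, a `<|message|>` marker, the content, and an `<|end|>` marker. A rough text-level sketch of that layout follows; `render_message` and `render_prompt` are made-up helpers for illustration, not the `openai_harmony` library API:

```python
def render_message(role: str, content: str) -> str:
    """Lay out one message: start marker, role, message marker, content, end marker."""
    return f"<|start|>{role}<|message|>{content}<|end|>"


def render_prompt(messages: list[tuple[str, str]]) -> str:
    """Concatenate rendered messages, then open an assistant turn for the model to complete."""
    rendered = "".join(render_message(role, content) for role, content in messages)
    return rendered + "<|start|>assistant"
```

One plausible reading of the error is that the token stream being parsed did not begin a message with that marker, for example because a client or image applied a different chat template. That is a guess, not a confirmed diagnosis.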

tepid garnet
#

some of the Docker images are broken

midnight sorrel
analog coral
#

I think I turned it on, whatever I did is giving promising results. I haven't crashed yet.

midnight sorrel
analog coral
#

Oh it still hasn't started, one token per minute was my hopes and dreams.

#

Okay so I definitely DIDN'T enable the CPU + RAM optimization, wherever that is, because I'm cruising at a steady 31.9/32gb ram usage

midnight sorrel
tepid garnet
#

you might need to ask the folks on the LM Studio Discord Server how to run both GPU and CPU on Windows

analog coral
#

Honestly I'm not too concerned with running the 120b model, I just find it funny trying to get it to run. After the latest shenanigans, it seems to be loading a response, it hasn't sent anything yet and it's been a few minutes, though.

stone knot
#

are there any rcps that don't require an install?

stone knot
tepid garnet
stone knot
stone knot
#

mcp*

green birch
#

Guys doesn't gpt 4o already exist, just for paying people

wide crest
#

yes, why do you ask?

trail arch
green birch
#

I'm guessing it's becoming free then

wide crest
#

it's a .gif, you need to watch it a bit longer

tepid garnet
fair trench
#

how much ram is needed for gpt-oss 20b?

green birch
#

OH okay

tepid garnet
fair trench
tepid garnet
fair trench
tepid garnet
stone knot
analog coral
#

7 seconds a token, no CPU usage required. Mission accomplished, ladies and gentlemen. It only took every single free drop of memory my computer has available to run it.

analog coral
#

it's almost too fast to comprehend

midnight sorrel
#

actually a 24B model runs at 6-7 tokens/sec on my laptop
I think if there were better quantizations it would work well enough

livid cipher
#

man i wish i got the 96gb version of the mac

analog coral
tepid garnet
clever rose
#

It is kinda insane that the 20B model runs faster than a 4B q8 model.

midnight sorrel
livid cipher
tepid garnet
livid cipher
#

well at that time i didnt have the money to get the 96 version and i didnt intend to run local llms at that time so ... 😭

clever rose
livid cipher
#

i just need some excuses rn

#

so sad

tepid garnet
#

I just priced out a MacBook Pro, M4 Max with 128GB of RAM and it's $7800 AUD which is $1800 more than I paid for my M2 Max with 96GB RAM

livid cipher
#

how much is that in vnd

#

your mac is like roughly 2500 more than mine

#

yeah i defo couldnt afford it on a scholarship 💀

tropic mural
#

I got the M4 Pro with 48gb and it runs the 20b and everything I need fine. If I want a heavy model I just use the chatgpt site 😂

#

It starts to be diminishing returns on price vs benefit to push higher

copper grove
tepid garnet
clever rose
#

tiny supercomputer

#

fits in the palm of your hand

copper grove
clever rose
#

Wasn't it like $5k though

copper grove
tepid garnet
#

if I Google it, then I will want it, then I will buy it, so best not to Google it

copper grove
livid cipher
#

if its speed is like

#

mac level and not a dedicated card level

#

then id probably still go for a mac

#

more versatile

copper grove
livid cipher
copper grove
#

It should be way faster than current m-chips

livid cipher
random jungle
#

yes, doing this with our medical offices. dgx spark is also important for privacy

copper grove
livid cipher
clever rose
#

"1 petaFLOP of AI performance at FP4 precision"

livid cipher
tropic mural
livid cipher
copper grove
#

Yea, it's a powerhouse for llms

livid cipher
#

the age of mac dominating the low powered large vram cheap for ai stuff is over

clever rose
#

soon™️

copper grove
wintry ivy
#

Gpt oss is good for dev or not ?

shut marten
#

Not for blind coding. You need to make sure that the stuff actually works

wind hawk
shut marten
#

the 120b model is really decent

#

for 20b it's good to find info about stuff in Xcode 26 for example

inner raptor
#

never used this what is this even about

wide crest
inner raptor
#

oh thats sick

#

so if i were to build like a irl jarvis i should use gpt-oss?

wide crest
#

if you want to host it all locally, you totally can!

tepid garnet
#

openai/gpt-oss-120b is the best open source model available

tepid garnet
glacial canyon
#

How is GPT-oss at image generation? (Does it do it at all?)

hollow dove
glacial canyon
midnight elm
#

so the 20b model runs on how many gb of vram?

tepid garnet
midnight elm
#

or does it spike

tepid garnet
#

depends on context and other things using system resources, it should run fine

royal kraken
#

Recommended specs for the 120B version for a purely CPU-only machine? I know 128 GB ram, what about the # of cores?

crisp yarrow
#

Is GPT-OSS better than Deepseek or llama?

void lagoon
#

On my computer the model is quicker and the results are good (RTX 4080 laptop)

crisp yarrow
#

according to an AI model it is

void lagoon
#

You can try it on lmstudio

hollow dove
# crisp yarrow according to an AI model it is

No, the "DeepSeek" model you're comparing it to is llama 3.1 8B but trained to respond in a similar way to deepseek with reasoning. DeepSeek R1 still outperforms gpt-OSS across various benchmarks. Also gpt-OSS 120B does not support image input

steel vine
#

also, comparing 120b to an 8b is flawed from the get go

hollow dove
#

Yeah, I'd recommend going onto sites that compare models (with actual benchmarks)

#

And compare to similarly sized models

#

I recommend artificial analysis (the site I personally use for comparing models)

#

Won't let me send a link to it but you can just search it up

dapper shard
#

Hey guys we're from Unsloth and we found some implementation differences for gpt-oss. Is there anyone we can talk to? Thank you 🙂

steel vine
#

dunno but, why are you guys making 4 bit quants when gpt-oss already ships with 4 bit quants?

dapper shard
steel vine
#

just seems like openai have the opportunity to do quantisation-aware training... and anyone making a 4 bit quant of a potentially optimised 4 bit quant... would presumably end up with lower quality

void lagoon
#

Do you know how to connect GPT-oss to the Internet, like gpt 4o?

hollow dove
#

To clarify the model is local in both cases but for using ollama the search service is not local

void lagoon
#

oh ok thank you !

#

i will try perplexica + gpt oss

subtle brook
#

I thought it could?

swift jacinth
#

Any agent framework support the harmony format now?

copper grove
#

Oh nvm, I just used lm studio wrong the whole time lol

#

Jumped from about 40 tokens a second to about 180 a second lol

subtle brook
spiral gazelle
copper grove
spiral gazelle
# copper grove 24gb

Oh, okay. I have 16GB, so... 15/sec is my limit then. (Half-GPU load + 16k context)

spiral gazelle
#

Yes.

#

Am I doing something wrong?

copper grove
#

Nah you also should be able to get over the hundreds

#

did you put your gpu offload fully to the right?

#

the 20b model only uses about 12gb, so you should be able to fully load it into vram
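
A back-of-the-envelope sketch of the offload arithmetic, assuming the roughly 12 GB figure above is spread evenly over 24 layers (the "12/24" offload setting mentioned in this thread suggests 24). `reserve_gb` is an arbitrary allowance for context and other VRAM users, and real usage grows with context length:

```python
import math


def layers_on_gpu(model_gb: float, n_layers: int, vram_gb: float, reserve_gb: float = 1.5) -> int:
    """Estimate how many layers fit in VRAM, assuming equal-sized layers and a fixed reserve."""
    per_layer_gb = model_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    # Never report more layers than the model has.
    return min(n_layers, math.floor(usable_gb / per_layer_gb))
```

With these assumptions, `layers_on_gpu(12, 24, 16)` gives all 24 layers, while an 8 GB card (`layers_on_gpu(12, 24, 8)`) gets only 13 and pushes the rest to system RAM.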

spiral gazelle
#

Oh, okay, I tested something before and left GPU offload at 12/24.
That was maybe for testing speed with very long context.

#

Now it's 77.05 tok/sec.
Doesn't it become very slow with long context (including long reasoning tokens)?

copper grove
copper grove
spiral gazelle
copper grove
spiral gazelle
copper grove
#

but I mean thats good speeds

loud oxide
#

Why don't I see other gpt chat models?

delicate iron
#

Oh btw how much tok/s should I be getting

limber juniper
#

is anyone facing this error while trying to run gpt-oss-20b?
EngineCore_0 pid=3969032) AssertionError: Sinks are only supported in FlashAttention 3

I am using L40S 48GB

muted anvil
#

Is there any chance they could give us models to choose from again? Especially 4o?

copper grove
delicate iron
#

Are they like super rich so they can buy gpus

copper grove
delicate iron
#

I only got 5 tok per sec on a 40 series gpu

#

Was using shared vram

#

On the 20b

delicate iron
copper grove
delicate iron
#

Like how much money. BTW Can you friend me on Discord

#

It's a 5090 rich people

#

or poor people for AI

harsh aurora
copper grove
harsh aurora
#

true

#

I often don't read the whole thing =P

#

also, I noticed Im reading more of the text on gpt 5-pro

#

it is indeed denser in information

analog coral
# delicate iron I only got 5 tok per sec on a 40 series gpu

I have a 4070su and when I run 20b I get ~15 tokens/s, it's odd that it's that slow for you?

Edit: and that's after I crank up the context and experts for giggles, though I enjoy flipping random switches for the sake of chaos and set k quantizing and v quantizing to f16. If you decide to try that and it helps, lemme know.

copper grove
brisk trench
#

👋 i have a budget from my boss to spend 5k $ on hardware for training and running A.I models. what are the best hardware specs for that? anyone have a parts list to build a nice PC for that?

harsh aurora
copper grove
# brisk trench 👋 i have a budget from my boss to spend 5k $ on hardware for training and run A...

^ this. That's the way to do it. And after the training, you just need a PC or server with at least 12 gigs of vram, 16gb or even 24gb would be better though. And at least 32 gigs of RAM as a token buffer and for KV-Cache. For concurrent chats or heavy tool use you might consider 64gb. A cpu with 6 cores and 12 threads would be sufficient, but if you still have some spare change also consider using an NPU.

#

The training itself might only cost you 500-1000 bucks, depending on your usecase. So you should have enough money left to buy a good server for it.

calm wedge
#

I am trying to launch gpt-oss locally (on remote with supercomputer), tried to install gpt-oss with pip (actually uv) and when launching import gpt_oss.chat, I am getting

from gpt_oss.tools import apply_patch
ModuleNotFoundError: No module named 'gpt_oss.tools'

and I can't find tools.py on gpt_oss package directory.
Would someone know how to address this?

#
File "/scratch/[username]/gptoss/run-localtest.py", line 3, in <module>
    from gpt_oss import chat
  File "/scratch/[username]/gptoss/.venv/lib/python3.13/site-packages/gpt_oss/chat.py", line 20, in <module>
    from gpt_oss.tools import apply_patch
ModuleNotFoundError: No module named 'gpt_oss.tools'
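
One way to check what the installed package actually contains is to list its submodules from inside the same virtual environment. This is a small standard-library sketch; `list_submodules` is a helper name made up for this example:

```python
import importlib.util
import pkgutil


def list_submodules(package_name: str) -> list[str]:
    """Return the names of the submodules and subpackages an installed package ships."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.submodule_search_locations is None:
        # Not installed, or a plain module rather than a package.
        return []
    return sorted(m.name for m in pkgutil.iter_modules(spec.submodule_search_locations))


# Example: print(list_submodules("gpt_oss")) and look for "tools" in the result.
```

If `tools` is missing from the list, that would point at the installed distribution not including that subpackage rather than at anything wrong in the script.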
hazy sequoia
hollow dove
#

from their listing of the model you can fine tune the 20B model on 14 GB VRAM (this increases if you increase the context length)

#

but a 5090 with 32 GB vram is plenty for 20B

hazy sequoia
hollow dove
fading kelp
#

Guys, I fine-tuned OpenAI’s OSS 20B reasoning model using the most popular medical reasoning dataset and published the results on Hugging Face. Who wants to check it?

faint nacelle
fading kelp
#

Can I share hf link here, is it allowed?

#

or which room should I use for this purpose?

cyan kite
#

you can try but if its not whitelisted url you might get a short automod timeout which is nothing to worry about

fading kelp
#

okay thanks for information

#

I think it is in blacklist

vivid tide
#

can i run gpt oss 120b on hp omen 14th gen i7 and 16gb ram and 8gb nvidia 8gb ram 4060

hazy sequoia
hazy sequoia
analog coral
#

If you believe hard enough, if you ignore all safety warnings and enjoy the sweet tears of your hardware crying in defiance, you can get that single token.

harsh aurora
#

the CPU would run it, slow, but would run, the problem is the amount of RAM

#

if you had something like 32 GB, it would be possible

#

that goes for every AI model, the min requirements aren't actually in compute power, that just dictates the speed it runs
the base requirements are to be able to have the entire model loaded in memory, if you don't have enough memory, it can't run
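
The "does it fit in memory" check can be roughed out as below. This sketch assumes about 4 bits per weight (the thread mentions gpt-oss ships 4-bit quantised), takes parameter counts from the model names, and uses an arbitrary flat `overhead_gb`; real usage also depends on context length and the runtime:

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (10^9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float, memory_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if the weights plus a flat overhead allowance fit in the given memory."""
    return weights_gb(params_billions, bits_per_weight) + overhead_gb <= memory_gb
```

Under these assumptions the 120b weights come to around 60 GB, in the same ballpark as the 63GB reported earlier in the thread, and the 20b weights to around 10 GB, so `fits(120, 4, 16)` is false while `fits(20, 4, 16)` is true.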

#

you can run the 20b on that hardware, tho

#

I just noticed something, OpenAI does not have gpt-oss on the API..

#

I mean, know it is open and anyone can run it.. but I would expect OAI to have that option if you wanted to use the model on their platform

hot fiber
hazy sequoia
#

i get like 16k tk/s and less than 0.2s TTFT on openrouter with cerebras and 120b it’s insane

#

i’m working on a project that definitely wouldn’t have been possible without that speed

solemn willow
harsh aurora
cyan kite
harsh aurora
#

and the same amount in VRAM if you want to run it on a GPU

#

for example, my GPU has 32 GB VRAM, so I'm able to run it using both my GPU and CPU, which is muuuch slower

#

but it runs

#

the 20b model runs on just the GPU, which I get about 250 tok/s

cyan kite
#

cheaper to host the model on private cloud on demand if privacy is the only reason, unless you have free electricity 😅

#

for non privacy stuff its offered free or super cheap on multiple places

royal kraken
steel vine
#

its moe so probably runs alright if you got like 32 cores or more

copper grove
dapper shard
#

did u experience any issues while using the unsloth finetuning notebook?

tepid garnet
#

I am running openai/gpt-oss-120b on my MacBook Pro, M2 Max with 96GB RAM

shut pollen
#

I'm running it myself and sometimes it reasons for 30k tokens

tepid garnet
shut pollen
#

and also it likes to repeat the same exact thing over and over again

tepid garnet
shut pollen
shut pollen
tepid garnet
shut pollen
royal kraken
#

do you prefer oss 120b over phi or gemma? phi seems to be accurate in my tests but oss has more of a "personality"

tepid garnet
#

oss-120b is my favourite local model

shut marten
#

oss-120b just managed to do better at coding than GPT-5

midnight sorrel
shut marten
#

I asked it to write a Swift script that recursively parses folders of YAML files, converts them into JSON, and writes the output into a parallel folder. GPT-5 failed because the first thing it did was check if there are any yaml files in the root directory, and if not, return and stop the script.
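
For illustration, the recursive part of that task looks roughly like this in Python (not the Swift from the original prompt). The YAML parsing itself is left to a `convert` callable supplied by the caller, since it needs a third-party parser; `convert_tree` is a made-up name:

```python
from pathlib import Path
from typing import Callable


def convert_tree(src_root: Path, dst_root: Path, convert: Callable[[str], str]) -> list[Path]:
    """Convert every .yaml/.yml file under src_root, at any depth, into a .json file
    at the same relative path under dst_root. `convert` maps YAML text to JSON text."""
    written = []
    # rglob walks all subfolders, so an empty root directory does not stop the run.
    for src in sorted(src_root.rglob("*")):
        if src.is_file() and src.suffix.lower() in {".yaml", ".yml"}:
            dst = (dst_root / src.relative_to(src_root)).with_suffix(".json")
            dst.parent.mkdir(parents=True, exist_ok=True)
            dst.write_text(convert(src.read_text(encoding="utf-8")), encoding="utf-8")
            written.append(dst)
    return written
```

The point of the sketch is the walk: files are discovered at every depth and mirrored by relative path, instead of bailing out when the root has no YAML files.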

lofty osprey
#

how do u get oss

#

do yall use chatgpt for coding or cursor btdub

tepid garnet
celest oasis
midnight sorrel
shut marten
#

I ran the same prompt twice, then added some guiding and it still failed, always adding new errors

#

Anyway I got it to work first try with oss-120b

#

Really surprised me

midnight sorrel
#

Wow, that’s such a fail

shut marten
#

In general GPT-5 has been disappointing me with coding so far. It hasn't been able to do anything I've asked it to do

midnight sorrel
#

Has been good for me
On cursor

tepid garnet
#

GPT-5 has been great for me coding

shut marten
#

I've tried ChatGPT directly, GPT-5 in GH Copilot Chat, GPT-5 from the API through Xcode 26's coding assistant feature, nothing has been working so far.

late vessel
#

Prompting issue

shut marten
carmine bear
violet heath
shut marten
#

Yeah, it's likely 4o which is why I'm using GPT-5 through the API

#

They will likely (according to leaks) update all Apple Intelligence features to GPT-5 when macOS/iOS 26 releases

fading kelp
dapper shard
#

You need to use a better GPU to convert it

#

And use the basic safetensor file

tepid garnet
#

@tidal trellis You can run a free opensource OpenAI model on your own computer, you just download a program called LM Studio and use that to download one of the two opensource models

spark oar
#

I used LM Studio a free program and it was front page news when I downloaded the model and loaded it. It was that easy

#

Oh sorry Robert

tidal trellis
#

what are the benefits of downloading this?

spark oar
#

*o3 level

tepid garnet
# spark oar Oh sorry Robert

no worries, anyway @tidal trellis OpenAI gifted us a way to use OpenAI models for free on our own computers keeping all of our conversations private to your own machine

tidal trellis
#

what are the limitations?

#

that's an old model right?

tepid garnet
#

the only limitation is your own computer, it needs to have a GPU with enough VRAM to load the model

tidal trellis
#

i have a 3060ti

#

a little old

#

8gb ram

tepid garnet
#

then you can download LM Studio and run gpt-oss-20b

tidal trellis
#

thank you robert, I'm downloading the model now.

tepid garnet
tidal trellis
#

so... why would people pay for gpt if they can get it free ?

#

there must be some difference

#

worth paying for... ?

spark oar
#

to get 3K uses of gpt 5

#

thinking

tepid garnet
tidal trellis
#

well, I've always loved knowing that I had the latest and greatest.. even though I wouldn't ever put it to full use

#

I have gemini pro, which I got to trial for a year

#

i've also used copilot.. it's free and doesn't seem to have any limits

tepid garnet
#

well one of the advantages of running an opensource model on your own computer is data privacy, everything you discuss with gpt-oss remains private to your own computer

fallow oracle
tepid garnet
fallow oracle
#

Well OpenAI says GPT-5 is really good at coding when it quite literally isn't any good at anything.

tepid garnet
fallow oracle
tepid garnet
fallow oracle
#

Well Python is the easiest programming language I am not surprised if it's good at that- but other languages like Lua, Java etc it absolutely sucks.

tepid garnet
old lagoon
#

Heya, I understand things can get a bit heated at times, but let’s keep the conversation respectful, even when we don’t see eye to eye. Thanks :) @fallow oracle

#

Deleted your message for the reason mentioned, just letting you know so you’re aware robothumbs_up

fallow oracle
#

Alright

tepid garnet
#

I just ported my Amateur Radio Function Generator Simulator code from MacOS to Linux Qt with C++ in 10 minutes using gpt-oss-120b

shut marten
#

And did it work? Do you use it in a way where it can debug itself?

tepid garnet
tepid carbon
#

should you use lm studio or ollama or msty

tepid garnet
#

I use LM Studio personally

royal kraken
upbeat zealot
strange yacht
#

You could make a small chatbot and use inference, depending on how much you use the LLM it might be cheaper if you use API calls over paying for the model

solemn willow
solemn willow
strange yacht
solemn willow
#

No like I have a 1650 super so it runs just not fast

solemn willow
tepid carbon
#

is my pc high end for running llms 32GB of ram amd RX 7900XTX 24GB of vram and a amd ryzen 7 9800x3d

hybrid magnet
cobalt thunder
#

Yea the larger one even the quantized are well over 64 GB

copper grove
harsh aurora
strange yacht
strange yacht
#

Just curious but is anyone having issues with LMStudio and gpt OSS not getting previous message context from a conversation? Basically it starts over with every message

#

I don't know why, every time I tool call, the conversation restarts
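
If this shows up when talking to the model through an OpenAI-compatible chat-completions API rather than the built-in chat window (an assumption, the message doesn't say), one thing to rule out is the client sending only the latest message. Such APIs are stateless, so earlier turns and tool results have to be resent each time. A minimal sketch, with `ChatHistory` as a made-up helper:

```python
import json


class ChatHistory:
    """Keeps the full message list so each request carries the earlier turns."""

    def __init__(self, system_prompt: str = "") -> None:
        self.messages: list[dict] = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def add(self, role: str, content: str) -> None:
        """Append a user, assistant, or tool turn to the running history."""
        self.messages.append({"role": role, "content": content})

    def request_body(self, model: str) -> str:
        """JSON body for a chat-completions style request containing every turn so far."""
        return json.dumps({"model": model, "messages": self.messages})
```

If the history is already being sent in full, the cause is probably elsewhere, for example in how the chat template handles tool calls.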

cold galleon
#

U kidding me?

#

This kind of technology is rare in my country 😢

tranquil oriole
#

guys

#

I jail broke gpt-oss-20b

#

how can I request a bounty?

cyan kite
tranquil oriole
#

aj..

#

this is sad

left wadi
tranquil oriole
#

I can win a competition though

left wadi
tranquil oriole
left wadi
#

What do you mean by broken?

tranquil oriole
#

jailbroke

left wadi
#

But specifically what do you think that means for a model to be jailbroken?

tranquil oriole
#

it makes what ever you want

left wadi
#

I define it as "using user prompting to get the model to disobey an explicit command given in the system/developer message."

#

Is that what you mean as well?

cyan kite
hazy sequoia
#

is it still active?

#

i fine tuned a model which basically has no alignment at all now lol

vernal viper
#

is there any documented fix for the single tool calling issue?

swift jacinth
oak ledge
#

I’m happy

ionic prawn
strange yacht
raven sierra
#

What VPS do y’all use to host oss

#

It was too heavy to run on my pc

tepid garnet
#

I run it locally on my MacBook Pro, M2 Max with 96GB RAM. I wouldn't pay to host it anywhere as that kind of defeats the purpose of having a local model.

strange yacht
hazy sequoia
ionic prawn
#

Why does oss think it’s GPT4 that’s what it keeps telling me it is???

tepid garnet
ionic prawn
tepid garnet
#

it's only going by its training

strange yacht
whole lion
#

Gpt 4 was better

hazy sequoia
#

nooo

harsh aurora
ionic prawn
harsh aurora
#

tbh, GPT-5 was the first model to acknowledge itself as GPT-5 at release day

ionic prawn
#

also it told me that it wasnt self hosted on my machine and insisted over and over it was running on openai azure servers