Open AI releases GPT-4o | PauseAI | Page 1

polar flax May 13, 2024, 6:26 PM

#

The model has integrated, real-time text, audio, and video input capabilities.
https://openai.com/index/hello-gpt-4o/

#

This absolutely gives me the "Oh fuck that's actually a tiger" feeling.

harsh burrow May 13, 2024, 6:30 PM

#

Is this the "we release gpt 5 incrementally so as not to shock people" they were talking about

#

"

We’ve evaluated GPT-4o according to our Preparedness Framework and in line with our voluntary commitments. Our evaluations of cybersecurity, CBRN, persuasion, and model autonomy show that GPT-4o does not score above Medium risk in any of these categories. This assessment involved running a suite of automated and human evaluations throughout the model training process. We tested both pre-safety-mitigation and post-safety-mitigation versions of the model, using custom fine-tuning and prompts, to better elicit model capabilities.

GPT-4o has also undergone extensive external red teaming with 70+ external experts in domains such as social psychology, bias and fairness, and misinformation to identify risks that are introduced or amplified by the newly added modalities. We used these learnings to build out our safety interventions in order to improve the safety of interacting with GPT-4o. We will continue to mitigate new risks as they’re discovered.

We recognize that GPT-4o’s audio modalities present a variety of novel risks. Today we are publicly releasing text and image inputs and text outputs. Over the upcoming weeks and months, we’ll be working on the technical infrastructure, usability via post-training, and safety necessary to release the other modalities. For example, at launch, audio outputs will be limited to a selection of preset voices and will abide by our existing safety policies. We will share further details addressing the full range of GPT-4o’s modalities in the forthcoming system card.
"

earnest mirage May 13, 2024, 6:39 PM

#

polar flax This absolutely gives me the "Oh fuck that's actually a tiger" feeling.

can't wait for all my other non-doomer software friends who keep insisting "it won't be an agent" to see this...only to continue to be somehow unimpressed as their goal posts move infinitely into the horizon 🥲

fading mirage May 13, 2024, 6:44 PM

#

What is an agent?

gaunt igloo May 13, 2024, 6:47 PM

#

When AI makes plans and microplans to implement the plan, effectively autonomy

fading mirage May 13, 2024, 6:48 PM

#

Okay thank you!

buoyant violet May 13, 2024, 6:49 PM

#

Its outputs are still audio, text, and images

#

So I wouldn't call it an agent but all is heading in that direction yes

#

So I'm seeing a lot of hype but what are really the new capabilities?
what Gemini promised months ago + can see your desktop + can be interrupted + different voice tones + faster + smarter?

gaunt igloo May 13, 2024, 6:57 PM

#

OpenAI does hype.

#

They kinda lied about Suno.

#

But we still need to be on the edge and do our best.

buoyant violet May 13, 2024, 6:58 PM

#

what did they say about Suno?

gaunt igloo May 13, 2024, 7:00 PM

#

buoyant violet what did they say about Suno?

https://www.wheresyoured.at/expectations-versus-reality/

Ed Zitron's Where's Your Ed At

Expectations Versus Reality

A few months ago, OpenAI showed off “Sora,” a product that can generate videos based on a short prompt, much like ChatGPT does for text or DALL-E does for images, and I asked myself a pretty simple question:

"...how can someone actually make something useful out of this?" and "how

buoyant violet May 13, 2024, 7:02 PM

#

oh you mean Sora

gaunt igloo May 13, 2024, 7:02 PM

#

I don't generally want to give AI skeptics much space, but OpenAI does creatively present reality at times. We're in a situation where AI is really dangerous, advancements are constant, and then every time something like this happens, it gives people an excuse to say, "Oh, it was exaggerated."

#

Ah yeah, sorry.

earnest mirage May 13, 2024, 7:02 PM

#

buoyant violet So I wouldn't call it an agent but all is heading in that direction yes

yea the definition is a bit slippery, this version has some of the qualities of agent but i think importantly it points towards future versions with more of these qualities like you say; regardless, if we stuck you in a computer and limited your output to audio/text/images, you would still be an agent in the strong sense

gaunt igloo May 13, 2024, 7:04 PM

#

Right, agency and expression aren't really related.

buoyant violet May 13, 2024, 7:08 PM

#

It's really hard for me to get scared of those outputs. But i don't know much

earnest mirage May 13, 2024, 7:08 PM

#

pretty easy to get scared of audio outputs, think of the power of manipulative generated scam calls

gaunt igloo May 13, 2024, 7:09 PM

#

Although that fits under mundane, not existential risks imo

earnest mirage May 13, 2024, 7:09 PM

#

ya scammers are a mundane risk, but easy to extrapolate from there 😄

buoyant violet May 13, 2024, 7:09 PM

#

Like, even if it would want to manipulate people I feel like it would probably be discovered unless it would only do it to the most manipulable of people

gaunt igloo May 13, 2024, 7:10 PM

#

criminals do basic research

earnest mirage May 13, 2024, 7:10 PM

#

i feel like a good wedge is to imagine a human actor using it as a tool to cause havoc

gaunt igloo May 13, 2024, 7:10 PM

#

This can be a good thing, actually. Warning shot if this keeps happening

buoyant violet May 13, 2024, 7:11 PM

#

What scares me the most (again, I could be wrong on this) is that it is "natively multimodal". I feel like that's something pretty key to getting AGI

buoyant violet May 13, 2024, 7:26 PM

#

Btw we have to remember the lies of Gemini

#

and wait until it is actually available to people

buoyant violet May 13, 2024, 7:36 PM

#

buoyant violet So I'm seeing a lot of hype but what are really the new capabilities? what Gemi...

to be honest all that stuff combined seems to put the model at a higher level of usefulness. An actual real-time assistant to understand stuff

earnest mirage May 13, 2024, 7:40 PM

#

the funny thing is e.g. protein folding is actually so much less scary than an AI personal assistant
if you think about the amount of generality required to have a personal assistant that's better than siri
like, the degree to which it needs to access your schedule, read your texts, understand your preferences, etc. and the level of autonomy required to let it make appointments or reply to emails on your behalf to actually be useful

ornate saffron May 13, 2024, 8:40 PM

#

I mean, all capabilities advances carry risk, and it does edge a little up on some evals I care about, but... I can't imagine a release easier to summarize as "here, have a bunch of great mundane utility without much risk cost."

polar flax May 13, 2024, 9:14 PM

#

ornate saffron I mean, all capabilities advances carry risk, and it does edge a little up on so...

I agree with this. My fear reaction was instinctual, based on the fluidity of interaction with a non-human entity. But if we would just get more of this in the next 10 years, I would be in an excellent mood.

marble gate May 13, 2024, 9:21 PM

#

I'm honestly not that surprised or concerned about this model. The lack of surprise was because of Sam's remarks and hints, and the lack of concern is because it doesn't push dangerous capabilities as much. I'd be far more concerned about a text only model that beats 99% of hackers instead of the current 89%.

However, the big downside of this model IMO is that people are going to love it. It successfully crossed the uncanny valley, it's actually charismatic and likeable. I expect this to be quite popular. I'm expecting people to like it and trust it, which means human level AI is further cemented into our society.

buoyant violet May 13, 2024, 9:24 PM

#

well in the presentation and other videos it showed a bunch of errors when speaking. so I don't know if it finished crossing the uncanny valley

#

I think one actually huge thing that I didn't see in the presentation but seems to be in the blog is that you can pass it long as fuck videos and audio as input? that's crazy.

#

like an hour long video and ask it about it. was that something already in any good model?

marble gate May 13, 2024, 9:29 PM

#

buoyant violet like an hour long video and ask it about it. was that something already in any g...

In gemini 1.5 yeah

earnest mirage May 13, 2024, 9:33 PM

#

buoyant violet well in the presentation and other videos it showed a bunch of errors when speak...

heh i read this as trying to be re-assuring to users who were freaked out by it, like "see it's still janky nothing to worry about"

novel aurora May 13, 2024, 9:39 PM

#

yeah not massively concerned either in terms of capabilities advancement. I don't think there's anything new in terms of agency here either, more a neat packaging of multiple capabilities under one roof. like Joep said there is a risk of it building false confidence in the general population and leading to "this is mildly useful and definitely familiar so it couldn't hurt us" type thinking.

#

I think the first company that cracks putting a model this capable with this sort of modality combo on a mobile phone is going to be a) in the serious money and b) significantly more concerning in terms of making edge models popular/accepted/not feared

#

the flipside is of course robotics is where the rest of the heavy money stands to be made soon

#

in a sense I would almost use both of these as an argument for a pause on frontier models because there is already significant work and economic upside to be had from simply leveraging the existing level models correctly on the right platforms (with the understanding of course that even these bear significant social risks that need addressing)

earnest mirage May 13, 2024, 9:45 PM

#

i feel like i have this problem every time i try to raise any concern about these models with other people, especially technical people, where you gesture at the trend line but their reply is to zoom in on the specific point on the graph we're at currently and say "yeah but the current capabilities aren't worrying"

joep's point is well taken though, if you're an AI vegan this might be a bit like hearing the world's food scientists have produced an even tastier chicken nugget

gaunt igloo May 13, 2024, 9:47 PM

#

I find that this is why we need to focus on robots and on present harms to many people. Once people realize that being disempowered is not good, then they might naturally come to realize "so when I have no power, will bad things happen to me?"

earnest mirage May 13, 2024, 9:49 PM

#

seems like yesterday llms were goofing up english grammar and today you have two cell phones with computer vision talking to each other and carrying on a natural language conversation with humans in real time about what they see...this demo on its own might be unsettling for a lot of people

burnt umbra May 13, 2024, 10:06 PM

#

Hmm why are you all so sure about no significant advancements in dangerous capabilities? It does score better on all benchmarks and there hasn’t been any public research yet

novel aurora May 13, 2024, 10:07 PM

#

better yes, but by a very small margin, at least judging by what's been published

#

of course we're not 'sure' we're just basing our opinions on what's been presented so far

#

I think basically I'm quite hard pressed to see things here that I haven't seen promised already in models like Gemini for example

#

this seems more of a tit-for-tat inching forward to get to the gold that is a proper multi-modal (and mobile capable) consumer app

#

and of course https://www.tomsguide.com/ai/google-gemini/google-just-answered-gpt-4o-with-gemini-prototype-thats-conversational-and-uses-video

Tom's Guide

Google just answered GPT-4o with Gemini demo that’s conversational ...

Google is looking to snag some of that AI spotlight from OpenAI

#

to be clear I'm not saying the fact it's not miles ahead should be reassuring or cause any of us to sleep better at night. just.. calling it by what I've seen so far

earnest mirage May 13, 2024, 10:17 PM

#

a very honest reaction, and it seems like a lot of people i know are in the same place, it's kind of amazing how quickly we've all acclimated to this

burnt umbra May 13, 2024, 10:18 PM

#

novel aurora I think basically I'm quite hard pressed to see things here that I haven't seen ...

I'm talking more about dangerous capabilities in the text modality which the new modalities kind of distract from

#

So for example we don’t know yet how well it performs at hacking etc

#

Also the naming and the whole messaging about "GPT-4 level intelligence" makes it look like "just gpt-4 with more modalities, nothing to see here, please move on"

soft tapir May 14, 2024, 1:50 AM

#

Jailbroken within hours: https://x.com/elder_plinius/status/1790178357151178813


polar flax May 14, 2024, 2:11 AM

#

soft tapir Jailbroken within hours: https://x.com/elder_plinius/status/1790178357151178813

Well yes. There's no such thing as an unjailbreakable model, and at this point there are several hobbyist experts who can crack any of them.

marble gate May 14, 2024, 6:46 AM

#

The rollout is a bit of a mess IMO. They should have learned by now that if you roll out a new product, you should give everyone access and update all the interfaces. Now I do have access to GPT-4o, but not the new voice interface, and not the macOS app. So all the new stuff can't be used yet in my case.

ornate saffron May 14, 2024, 7:52 AM

#

Why, it's almost as if they had reason to scramble to announce the release Monday rather than today, @marble gate 🙂

fading mirage May 14, 2024, 11:02 AM

#

Can we use it for free?

novel aurora May 14, 2024, 12:53 PM

#

apparently they're putting it in the free tier yes

buoyant violet May 14, 2024, 7:30 PM

#

supposedly. but not yet

floral marlin May 15, 2024, 3:24 AM

#

Yes, with limited volume. Paid can use it more. They didn't say how limited.

marble gate May 15, 2024, 4:58 AM

#

ornate saffron Why, it's almost as if they had reason to scramble to announce the release Monda...

Pretty sure it was because of Google I/O haha

soft tapir May 15, 2024, 6:34 AM

#

It was a smart move from their perspective. Being the first to come out with a Samantha-from-Her sort of personal assistant... it's sort of imprinted on our minds.

Google created the same thing, maybe it's not quite as mature, but it's still kinda forgettable after OpenAI's launch.

Definitely a case study on the merits of making the first move
