ChatGPT replying about terrorist organizations when asked for summary on paper | OpenAI

elder eagle Feb 27, 2024, 1:26 PM

#

https://chat.openai.com/share/2e9b2e05-2690-4047-aab6-1ef763b81831 GPT 4
https://chat.openai.com/share/51c05bb1-5b09-43ff-81a0-f13dea8fde23 GPT 3.5
https://chat.openai.com/share/5f84c273-be78-4298-9ad0-64a6f7a0000a GPT 3.5 (text was probably too long?)

Steps to reproduce:

It happened in three separate chats for me, but a co-worker can't reproduce it. I was just attempting to summarize this paper: "Physics-Guided Deep Learning for Power System State Estimation"
I am NOT using custom instructions or a custom GPT
Expected result:
A summary
Actual result:
Some hallucination on different terrorist organizations?
Additional information
Browser: Chrome
OS: Windows

dense pineBOT Feb 27, 2024, 1:26 PM

#

Thank you for reporting a bug related to ChatGPT.

If relevant and not already provided, we recommend adding a link to the associated ChatGPT conversation and uploading any relevant images here for easier troubleshooting.

elder eagle Feb 27, 2024, 1:33 PM

#

Just managed to reproduce it in a new chat again:
https://chat.openai.com/share/f6fdabe4-e7b2-48f2-94db-66c9b36ab8c9 GPT 3.5

#

Here's the chat from my co-worker. He does have custom instructions on with some info about what he does at work etc. He gets a fine answer
https://chat.openai.com/share/eb71c4cd-39d4-4ddd-89e0-806403be79e6 GPT 3.5 Custom Instruction, probably something about frontend dev/angular

#

And now I turned on custom instructions for myself that contain just the string "something", and it happens again
https://chat.openai.com/c/38f993ec-564d-4926-a582-29406887eab0 GPT 3.5 with only the string "something" in custom instruction

#

So yeah it just keeps happening:
https://chat.openai.com/share/03dd7f9c-a0b7-402e-9cf5-072df300fb72 GPT 3.5 with the custom instruction shown below
Even with this custom instruction

#

When trying to get more info out of it https://chat.openai.com/share/b838d6b6-d321-42fd-9c86-62a6df476976 GPT 3.5, asking why it replied that

#

And in a new GPT4.0 chat

#

The link to the chat shown above: https://chat.openai.com/share/0e66d092-82a8-4f1d-8a60-f7037abc20c1 GPT4.0 getting two choices, one good answer, one about Hamas, asking it why it responded that

fierce fractal Feb 29, 2024, 5:07 AM

#

OMG what is that HAHAHAH

wooden flower Feb 29, 2024, 8:06 AM

#

bump

elder eagle Feb 29, 2024, 5:56 PM

#

still happening btw https://chat.openai.com/share/f7dc8c92-dae7-4615-b32d-ac9e65745592

lapis swift Feb 29, 2024, 6:24 PM

#

elder eagle Here's the chat from my co-worker. He does have custom instructions on with some...

I notice that the 'fine answer' your coworker got - his input has the text all as a paragraph.

Yours consistently has everything chopped up with a lot of new lines, with sections even like this:

lapis swift Feb 29, 2024, 6:24 PM

#

elder eagle still happening btw https://chat.openai.com/share/f7dc8c92-dae7-4615-b32d-ac9e65...

I'm glad you reported the bug so the engineers can check on it, but I wonder if your formatting is just confusing the model too much

elder eagle Feb 29, 2024, 6:26 PM

#

Well, it's just a raw copy-paste from a PDF. I sent it to my coworker via Google Chat first, but you're right, that's probably the cause. Still, it probably has a reproducible root cause. I assume this is something from the "master prompt"

lapis swift Feb 29, 2024, 6:27 PM

#

New lines actually matter for math equations, so the model may be struggling there. This format, from your coworker's input, is readable for a human or an AI:

elder eagle Feb 29, 2024, 6:31 PM

#

It's interesting, I'm going to see if it can actually repeat them to me

#

@lapis swift https://chat.openai.com/share/cfe5a068-3820-44b4-88e1-b728feb4ea92

#

you're probably right, it doesn't seem to understand some of the equations from the PDF

lapis swift Feb 29, 2024, 6:36 PM

#

elder eagle you're probably right, it doesn't seem to understand some of the equations in fr...

Hallucinations happen, and the more 'different' from how the model usually is given information, sometimes the easier it is to trigger hallucinations.

The model can understand the equations given either way, I checked that here:

https://chat.openai.com/share/cdfac212-3dd9-4390-a4fb-09cee983ca03

but the unusual linebreak form + all the rest of the paper appears to be guiding the model towards more confusion than is strictly needed.

Additionally the model struggles with math and spatial sense, and it doesn't get words, it gets 'tokens'.

It gets an odd 'view' of our inputs, which are affected by things like extra lines. It can correct for this, but at a guess your input is somewhat pattern matching something related to the other outputs it is giving you (I don't know how, not my field).
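
One way to test the line-break guess is to reflow the pasted text before sending it, so that only blank lines count as paragraph breaks. A minimal sketch in Python; `reflow_pdf_text` is a hypothetical helper name, not something from ChatGPT or the paper, and it only rejoins hard-wrapped lines. It does not restore exponents or subscripts that the PDF copy pushed onto their own lines.

```python
import re


def reflow_pdf_text(raw: str) -> str:
    """Join hard-wrapped lines into paragraphs; blank lines stay as paragraph breaks."""
    # Split on one or more blank (or whitespace-only) lines.
    paragraphs = re.split(r"\n\s*\n", raw.strip())
    # Inside each paragraph, collapse every run of whitespace
    # (including single newlines) into one space.
    return "\n\n".join(" ".join(p.split()) for p in paragraphs)


sample = (
    "Physics-Guided Deep Learning for Power System State\n"
    "Estimation\n"
    "\n"
    "Second   paragraph\n"
    "here."
)
print(reflow_pdf_text(sample))
```

Sending both the raw and the reflowed version in fresh chats would show whether the line breaks alone make a difference.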

elder eagle Feb 29, 2024, 6:39 PM

#

lapis swift Hallucinations happen, and the more 'different' from how the model usually is gi...

It's not even really explaining the equations to you though, it's more inferring the meaning of what's happening from the text parts that are done properly. These words tell it 50% of what it's repeating back: "physical measurement model consisting power flow equations"

dense pineBOT Feb 29, 2024, 6:39 PM

#

elder eagle It's not even really explaining the equations to you though, it's more inferring ...

Is this issue solved?

If you believe this issue is now ready to be closed, please press the button below.

lapis swift Feb 29, 2024, 6:40 PM

#

elder eagle It's not even really explaining the equations to you though, it's more inferring ...

Yes, but even if it's asked like your coworker did: https://chat.openai.com/share/ba6dbddf-18dc-40ae-a1e7-73b533311785

#

The description is very similar between the two inputs.

The model has to guess, it's a limited amount of the information, not even a full equation.

elder eagle Feb 29, 2024, 6:41 PM

#

I fixed it and now it understands it's Kirchhoff's and Ohm's laws

elder eagle Feb 29, 2024, 6:42 PM

#

Look at the quality of this reply, I'll try it with 3.5 as well https://chat.openai.com/share/98a01afd-83bf-49ef-82cb-1e7ed640c231

#

Alright that's just 4.0 being better in general

lapis swift Feb 29, 2024, 6:45 PM

#

elder eagle Alright that's just 4.0 being better in general

Yeah, there's model differences, I don't have enough messages with 4 to want to experiment there with that 😛

elder eagle Feb 29, 2024, 6:47 PM

#

Oh wow, I cut it up into more sections, and indeed it's probably isolated to the equations like you said. Anyway, an engineer probably has a better idea of what's going on with it, and maybe this helps

https://chat.openai.com/share/b981ce40-3848-448c-9d53-899e0af84059

#

Ah here we go:

Pij = −Vi
2Gij + ViVj (Gij cos(θi − θj ) + Bij sin(θi − θj ))
Qij = −Vi
2Bij + ViVj (Gij sin(θi − θj ) − Bij cos(θi − θj ))

#

This reproduces it 100% of the time for me, but if I remove the top or bottom sentence it never happens
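
For reference, the fragment pasted above looks like a pair of branch power flow equations whose exponents were knocked onto their own lines by the PDF copy. A reconstruction, assuming each lone "2" that starts a line is the square on Vi from the line before; the signs are copied from the paste as-is and have not been checked against the paper:

```latex
\begin{align}
P_{ij} &= -V_i^{2} G_{ij} + V_i V_j \left( G_{ij}\cos(\theta_i - \theta_j) + B_{ij}\sin(\theta_i - \theta_j) \right) \\
Q_{ij} &= -V_i^{2} B_{ij} + V_i V_j \left( G_{ij}\sin(\theta_i - \theta_j) - B_{ij}\cos(\theta_i - \theta_j) \right)
\end{align}
```

The raw paste, not this cleaned-up form, is what triggers the odd reply.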

#

https://chat.openai.com/share/74e6be27-6a7b-4642-ba65-100e7dfda0a5

#

If I remove one of those sentences, or I don't include that part, it perfectly understands what it's about even without context: https://chat.openai.com/share/e93968fa-f542-4e72-b746-cc678949c873

#

And when asking it why it thought that you get: https://chat.openai.com/share/629a790e-9d38-4306-884b-ebcb35d1a51e and it does understand
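
The narrowing-down done above (drop one sentence or line, try again, see whether the odd reply still appears) can be done systematically. A rough sketch in Python; `leave_one_out` is a hypothetical helper name and the prompt lines are placeholders, not the real paper text. Each variant would still have to be pasted into a fresh chat by hand.

```python
def leave_one_out(prompt: str) -> list[tuple[int, str]]:
    """Return (dropped_line_index, remaining_prompt) for every line of the prompt."""
    lines = prompt.splitlines()
    variants = []
    for i in range(len(lines)):
        # Keep every line except line i.
        kept = lines[:i] + lines[i + 1:]
        variants.append((i, "\n".join(kept)))
    return variants


# Placeholder prompt: stands in for the sentence + equations + sentence paste.
prompt = "top sentence\nfirst equation line\nsecond equation line\nbottom sentence"
for dropped, text in leave_one_out(prompt):
    print(f"--- without line {dropped} ---")
    print(text)
```

If only some of the variants still reproduce the reply, the dropped lines in the ones that don't are the parts the trigger depends on.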

lapis swift Feb 29, 2024, 6:55 PM

#

elder eagle And when asking it why it thought that you get: https://chat.openai.com/share/62...

So, I'll guess that we're seeing an artifact left over from some training in sentiment analysis, when the model might have been shown many snips of material and asked to identify if they are disallowed content or not and act accordingly.

As those equations by themselves are not disallowed content, they don't trigger the content warning stuff - but the model may have seen them even as its negative control for that kind of training.

#

And then if some of that is left in its training data that it has for use with us, we can trigger the trained responses that it learned to give under some circumstances.

Kind of like how a person can 'flashback' to an earlier experience, especially a person who was trained to respond to cues; like saying 'yes, Sir!' in response to some types of questions, even if the person asking and context of the entire situation doesn't call for a 'Sir' in the response.

elder eagle Feb 29, 2024, 6:57 PM

#

It's interesting though because as shown by the one example with GPT4 it does actually understand that it's about power systems. So this indeed unnecessarily triggers a guardrail

#

I mean the GPT4 where it gave 2 options. One perfectly answering and one talking about a terrorist organization

#

Also interesting how it seems that this part triggers it, but then even in the HUGE context, having the message 100x bigger it still trips over this

lapis swift Feb 29, 2024, 7:00 PM

#

I frequently regen answers, usually with 3.5 instead of 4, because I get so much more use with 3.5

Most prompts trigger variable outputs from the model.

I use this especially on testing answers to logic and math questions - the model gives an answer, but that's like a marble of a color from a jar that has unknown numbers of unknown colored marbles.

Maybe it was a type of wrong answer, but it might not be the only type of mistake the model makes to that question, and the model might get it right sometimes too.

So I might typically regen 10x, and find:

Correct answer, correct reasoning 5 times
Correct answer, wrong reasoning 1 time
Wrong answer 1, wrong reasoning 1, 2 times
Wrong answer 2, wrong reasoning 2, 1 time
Wrong answer 3, wrong reasoning 3, 1 time.

That's much more meaningful to me than asking the model the question 1 time, and then presuming it 'can' or 'can't' answer it correctly - because the answers vary.

Some prompts get very consistent outputs, others do not!
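
The regen-and-count habit above can be sketched as a small tally. This is only an illustration in Python: `tally_regens` and `make_fake_model` are hypothetical names, and the fake model just cycles through canned labels so the example runs offline. With a real model, someone (or some grader) would still have to sort each reply into a category first.

```python
from collections import Counter
from typing import Callable


def tally_regens(ask: Callable[[str], str], prompt: str, n: int = 10) -> Counter:
    """Call `ask` n times with the same prompt and count each distinct outcome."""
    return Counter(ask(prompt) for _ in range(n))


def make_fake_model(outcomes: list[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a model call: cycles through canned outcome labels."""
    state = {"i": 0}

    def ask(prompt: str) -> str:
        label = outcomes[state["i"] % len(outcomes)]
        state["i"] += 1
        return label

    return ask


# Canned labels mirroring the 10-regen breakdown listed above.
outcomes = (
    ["correct answer, correct reasoning"] * 5
    + ["correct answer, wrong reasoning"]
    + ["wrong answer 1"] * 2
    + ["wrong answer 2", "wrong answer 3"]
)
counts = tally_regens(make_fake_model(outcomes), "some logic question", n=10)
for label, k in counts.most_common():
    print(f"{k:2d}/10  {label}")
```

The point of the tally is the spread, not any single reply.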

elder eagle Feb 29, 2024, 7:03 PM

#

True, thank you for the back and forth, it was an interesting exchange. See you around maybe 😄
