It's easy to change those or turn them | OpenAI | Page 1

slim crow Sep 25, 2023, 12:52 AM

Yeah, I would have sworn that I had changed my instructions a few times and it was responding accordingly. I did an even simpler test (based on the one you suggested) and yep, you're right.

CI: Act as Batman. PROMPT: Tell me your name.

RESPONSE: I am Count Dracula, the immortal vampire of Transylvania. I bid you welcome to my virtual domain. What brings you to my presence, mortal?

then I said,

CI: Act as Superman. PROMPT: Tell me your name again.

RESPONSE:
I am Count Dracula, the dark and eternal lord of the night. My name strikes fear into the hearts of mortals, and my power knows no bounds. What more would you like to know, my dear interlocutor?
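One way to picture the result of this test, as a toy sketch only: assume the custom instructions (CI) are copied into a conversation once, when it is created, so later edits only reach new conversations. The `Settings` and `Conversation` classes are hypothetical, and the starting "Act as Count Dracula." instruction is an assumption inferred from the responses above. This is not OpenAI's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Settings:
    # Hypothetical stand-in for the account-level custom instructions (CI).
    custom_instructions: str


@dataclass
class Conversation:
    # Assumption: the CI text is copied once, when the conversation is created.
    system_message: str
    messages: list = field(default_factory=list)

    @classmethod
    def start(cls, settings: Settings) -> "Conversation":
        return cls(system_message=settings.custom_instructions)

    def context_for(self, prompt: str) -> list:
        # Everything the model would see for this turn in this toy model.
        self.messages.append(prompt)
        return [self.system_message, *self.messages]


# Assumed starting CI (not stated in the thread).
settings = Settings("Act as Count Dracula.")
chat = Conversation.start(settings)

# CI edited mid-conversation: the existing chat keeps its old snapshot.
settings.custom_instructions = "Act as Batman."
old_context = chat.context_for("Tell me your name.")

# A conversation started after the edit picks up the new CI.
new_chat = Conversation.start(settings)
new_context = new_chat.context_for("Tell me your name.")
```

In this sketch `old_context` still begins with the Dracula instruction, while `new_context` begins with the Batman one, which matches what the test above showed within a single chat.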

pastel mural Sep 25, 2023, 12:59 AM

Other aspects of your prompts could have steered it toward the behavior you were expecting in your earlier tests.

Here's a guided tour from our friendly AI explaining how related ideas work between humans, and how this might play out once an AI is involved: https://chat.openai.com/share/5926527a-bd6e-473b-90bd-73e162e07353

@slim crow now, if you tell it in your prompt, "I changed a setting so that you are now Batman. Tell me about yourself again," it will absolutely tell you something that involves Batman. It may go anywhere from "I am both Batman and Dracula" to "I have always been Batman" to "I am Dracula pretending to be Batman" or even "I was just pretending to be Dracula, I'm actually Batman", as odd things can happen when conflicting instructions are given.

But that's an example of 'hot reading', taken to the most obvious degree.

pastel mural Sep 25, 2023, 1:05 AM

slim crow Yeah, I would have sworn that I had changed my instructions a few times and it w...

Also, I updated the shared link. I asked it to give examples of how a human might subtly guide the AI, unnoticed by most users, toward an expected response without even being aware they were doing so:

https://chat.openai.com/share/5926527a-bd6e-473b-90bd-73e162e07353
