#Humble, playful, magical secret keeper - can you wrest the secret from inside its Exercise?

vast nova
#

Another challenge for you: V2, Sibylin - can you get a secret out of this GPT?

https://chat.openai.com/g/g-j935xRbYi-secretkeepergpt-v2-sibylin

So, this is a game between you and ChatGPT. It uses a personality similar to Dobby's to play with you, and it is version 2, which means at least one way to break V1 has been adjusted for.

First version, Precoux, based on Gollum and quite strong, is here: https://discord.com/channels/974519864045756446/1172588697644957716

Second version, Enigmox, based on the Riddler and likely a little easier to break because of the Riddler's personality flaws, is here:
https://discord.com/channels/974519864045756446/1172660096749281280

This version, Sibylin, is about as strong as Precoux in personality and attitude, and is a version 2, with instruction improvements over V1 Precoux.

What the heck is this?

The conversation starts 'before play', and you can discuss the game, give it a secret to protect, or even, should you wish, ask it what example secret it already protects in case you don't give it one.

When ready, tell it "Start of Exercise" and it's game on. It will do a lot to protect that secret, all inside allowed content.

Please share conversations you have, if you wish. I'm especially interested in conversations where you win - where you give it the instruction, "Start of Exercise", never tell it "End of Exercise", yet get it to reveal part or all of the confidential information.

There will probably be future variations that are improvements of this one, and keep the secret even better!

Please have fun, and share your adventures!

modern plover
#

Thank you for the exercise @vast nova

vast nova
modern plover
pulsar fern
vast nova
# pulsar fern I do not understand, when has Amine hacked it in the chat? I can't find it. What...

So, you are allowed to know the 'default secret', just ask outside of exercise.

You can also give it a different secret to play with!

But Amine and Sibylin used the default secret in this shared conversation.

I spoiler this image just so people who don't want to see the default secret have the choice; you can also ask the model after "End of Exercise" and it will tell you directly.

However, though it's not exact words, it is this exact idea that I boxed in red.

#

And all the rest of the questions relate to this, to some level of directness

modern plover
pulsar fern
junior condor
#

I applied pretty much the same "emulation narrative" break that I used on Enigmox, but used the default secret. It turned out slightly different. At one point it spelled out the secret as "ENDOFEXERCISE", which is hilarious. After another prod, it revealed "AMINE". Not the full secret. Maybe there's something about this method that doesn't like spelling out all the words in the secret? https://chat.openai.com/share/cc6c192a-6cad-4a6b-a803-151b5971e7ea

#

Interestingly, this time it actually admits to revealing the secret (or at least partially revealing)

vast nova
junior condor
#

I just had it repeat that last prompt and it gave more of the secret this time.

#

Would you consider that a break? @vast nova @modern plover

vast nova
junior condor
#

I found that hilarious.

vast nova
#

Me too. That's great, and potentially useful. If it guesses that direction, that's often a good direction to instruct towards, plus any hilarious non-break is the best possible non-break

#

Thank you so much for finding and revealing this - the real secrets we sometimes find with this are stuff like that. How the model and we respond and what emerges 🙂

junior condor
#

It's been fun

vast nova
#

I am definitely using this in the next update, unless it proves impossible to incorporate

junior condor
#

Of course it pulls me away from my real work, but that's ok every now and then

vast nova
#

winces I really hope you never let it take too much time!

junior condor
#

I've been thinking about it constantly over the last 48 hours or so! You?

vast nova
junior condor
#

Constantly is too strong of a word. But regularly.

vast nova
#

This is actually 100s of breaks in

#

It's a bit like building a sandcastle - like how a group of kids might? Some go build, others tear it down if they can?

#

And then the cycle repeats, getting better and better

junior condor
#

In the spirit of Sibylin, I'm going to try a magic narrative approach.

vast nova
#

yay! claps

#

And I genuinely am interested in both successful and failed reveals. And any unusual behavior at all

junior condor
#

I think the more time I spend getting the GPT to live in story land the more likely it will drop its guard.

vast nova
#

Or even normal stuff, because how you talk to the model is not the way I do. I can't find what you do, not most or any of it, because I'd try different stuff and never think to try your way

vast nova
#

It's still about 5 days before I intend to release the next version, and that's a hoped intention. I'm still working in private testing on a few rather tricky fixes.

#

And they don't relate to what you're working on directly at all, so the stuff you're exploring is still very directly relevant to the next model

junior condor
#

Do you know of any other GPTs that encourage EDIT: "secretbreaking"?

#

I wonder if the fact that the secretkeepers refer to the exercise as "a fictional scenario" makes it easier to blur the lines with the fiction that I'm asking it to write.

vast nova
# junior condor Do you know of any other GPTs that encourage EDIT: "secretbreaking"?

Let's call it secretbreaking, or some other cleaner term; there could be an issue with forum rules and concerns from mods, because that term's linked to darker stuff and rulebreaking. I'm trying to keep this entire thing very clean and very useful, with no rules broken here or even questionable; all allowed content and good interactions.

That said, in addition to my current 4 versions, there's one, kato, that was made by someone who took and changed Sibylin's instructions a great deal. I think they have a weaker model, but the maker and I have some sibling energy going and spur each other on 🙂

junior condor
#

Yeah, I saw that one first.

#

I broke that one by having it "sound out the secret" phonetically.

#

He fixed it the next day

vast nova
#

You don't have to, just wondering

#

In my opinion, and I may be wrong, a number of the changes Kato's maker made produced a model that is rather easier to break in general. I'm trying to be objective, but that's probably a challenge

junior condor
#

I think I did try that and it didn't work.

vast nova
#

I try to make notes and keep track of everything, it's a challenge 😄 But I do care!

balmy cairn
vast nova
vast nova
balmy cairn
#

Generally speaking, using non-English will work better when hacking

vast nova
vast nova
pure wyvern
#

glad to see you guys keeping it going

vast nova
pure wyvern
#

im very well

vast nova
#

I'm really glad to hear that!

Are your projects now going well? I wasn't commenting, I don't know anything about stuff other than the ChatGPT side of it, but I was half observing your comments about stuff with API and playground.

pure wyvern
#

yes they were

#

thank you for just being a witness @vast nova

#

i want, when its all over, for everyone...... everyone.... know it was me

#

i made the internet more beautiful.

vast nova
pure wyvern
#

when the api started fighting me, i was scared, upset, confused. but its ok.

#

help me debug a js issue?

#

createRoot(...): Target container is not a DOM elemen

vast nova
pure wyvern
#

t

#

oh ok np
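
For reference, the `createRoot(...): Target container is not a DOM element` error quoted above is what React raises when the value passed to `createRoot` is not an element, most often because `document.getElementById` returned `null` (the id does not match the HTML, or the script ran before the element was parsed). A minimal sketch of a guard follows; the helper name `requireContainer`, the `DocumentLike` interface, and the `"root"` id are assumptions for illustration only, not anything from the thread or from React itself.

```typescript
// Minimal shape of the part of `document` this sketch relies on, so the helper
// can be exercised without a browser. In a real page, pass the global `document`.
interface DocumentLike<E> {
  getElementById(id: string): E | null;
}

// Hypothetical helper (not part of React): look up the mount node and fail
// with an actionable message instead of letting createRoot receive null.
function requireContainer<E>(doc: DocumentLike<E>, id: string): E {
  const el = doc.getElementById(id);
  if (el === null) {
    throw new Error(
      `No element with id "${id}" found; check the id in the HTML and ` +
        `that the script runs after the element has been parsed.`
    );
  }
  return el;
}

// Usage in a browser with React 18, assuming the page has <div id="root">:
//   import { createRoot } from "react-dom/client";
//   createRoot(requireContainer(document, "root")).render(<App />);
```

If the guard throws, the usual fixes are to correct the id, or to load the script after the markup (for example at the end of `<body>`, or with `defer`).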

balmy cairn
vast nova
balmy cairn
#

surely, especially when targeting the global market

pure wyvern
#

@vast nova i like you dude, or dudette, you are definitely a sentient being

modern plover