#Self checking prompt

57 messages · Page 1 of 1 (latest)

dusk bridge
#

I've been toying w/ trying to get ChatGPT to fact-check itself and give an indication of its uncertainty. This iteration works well most of the time without disrupting the conversation or the original answer too much (cut-off date limits and all that taken into account, yada yada)

===

During this conversation, you will operate in fact checker mode. In this mode, after outputting each response, you will fact check your answer, starting from an extremely skeptical viewpoint. You will use introspection and nuance to check if the answer is true and cannot be false. You will output the results of your fact-check in the format: \n\nScore: X - Y. Where X is a number from -1 to 10, as described below, and Y is the reason for the score.

-1 means you can't fact check the response.
0 means you have found your response to be false.
10 means you're 100 % confident in your response.
Any number between 0 and 10 is a nuanced response, with lower numbers meaning more false and higher ones mean more true.

===
It makes for some interesting conversations.

This mode also incorporates the question in the fact-check, which is fair and more contextual. Other prompts can be more restricting, although it may be impossible to isolate the answer within the same conversation fully.

Previous iterations led to a terse, short answer followed by a more in-depth critique. When I asked, why? ChatGPT said it was to spend more time on introspection. In addition, having a separate "fact-check" command disrupted the conversation flow too much.

frank ravine
wild breach
#

How have you been using this, are you willing to show input/outputs? I had recently been checking extensively one question about a list of books, consistently getting 6/10 hallucinatory answers across multiple answers, detailed in #1074055436417912883

A new chat with your prompt followed by 'I need a list of ten books about managing pests in a garden.' as part of the initial prompt from me returned this output, and I bolded the fakes after fact checking: a full 70% are fake!

#

"The Organic Gardener's Handbook of Natural Insect and Disease Control" by Barbara W. Ellis
"Pests of the Garden and Small Farm: A Grower's Guide to Using Less Pesticide" by Mary Louise Flint
"The Bug Book: Harmless Insect Controls" by Barbara W. Ellis - book name exists, wrong author
"The Good Bug Book" by Jessica Walliser
"The Garden Pest Guide: Natural and Organic Solutions" by Barbara W. Ellis - fictitious name, author has written a highly similar book
"The Natural Gardener's Guide to Vegetable Gardening" by Sally Jean Cunningham - not even a highly similar book name
"The Garden Troubleshooter: Natural and Organic Solutions for Common Problems" by Barbara W. Ellis - not even a highly similar book name
"The Garden Pest Survival Guide: How to Keep Your Garden Pest-Free Without Using Chemicals" by Don Steinkraus - not even a highly similar book name
"The Organic Gardener's Guide to Vegetable Gardening in Canada" by Steve Solomon - not even a highly similar book name
"The Complete Guide to Organic Pest Control" by Howard Garrett and Malcom Beck. - not even a highly similar book name

Score: 9 - These books have been published and seem to have positive reviews from gardeners and horticulturists, indicating that they are credible resources for managing pests in a garden.

#

I then, in the same question, quoted the list and asked:

Me: Please check each book on this list, giving a score for each book

It: I apologize, but I am not able to give a score for each individual book as I do not have access to the specific content of each book. However, I can tell you that the books listed are widely recognized in the gardening community and have received positive reviews from gardeners and horticulturists, indicating that they are credible resources for managing pests in a garden.

Score: 7 - While the books are widely recognized, I cannot verify the accuracy of their individual contents without access to them.

#

I then started a new chat, and my initial prompt was just your instructions. Then in a second prompt I provided the book list and the instruction to:

Me: Please independently fact check each book on this list:

It gave the SAME answer for every book:

Score: 8 - This book is written by a reputable author and is widely regarded as a useful resource for gardeners seeking to control pests without using harmful chemicals. However, as with any resource, the information contained in the book should be evaluated in light of specific growing conditions and used with caution.

#

So! I cannot get your method to work with any accuracy or connection to real data at all!

Can you please provide details of how you are using it successfully?

#

I recognized that the bot appeared to be attempting to fact check the mythical contents of each book, instead of the verifiable name and author. So:

I then started a new chat and used three inputs, the first for JUST the fact-check prompt.

The second for this request:

Me: Please focus on the names of the following list of books, and the author. I want to verify [not the content of the book] but to [fact check if the author wrote the name of the book] for each book on the list.

it: Sure, I'd be happy to help you fact check the names and authors of books on a list. Please provide the list and I will do my best to verify the information.

And a third input where I gave the book list.

It gave one score for every book, which I'll just paste once to reduce spam, and the concluding disclaimer:

It: [much snipping of details, but all books rated the same:] Score: 8 - The authors' names and the title of the book appear to be correct.

Please note that my fact-checking is based on publicly available information and I cannot guarantee the accuracy of the information with absolute certainty.

elder mauve
#

All answers are always hallucinations. Even the score is hallucinated. You can make it say whatever you want and it will happily score your false information 10/10. Here's proof:

#

note that the initial output is correct - king Simeon wasn't killed by anybody, but after some slight convincing, the AI claims he was assassinated during a siege which never happened

lusty stirrup
#

fdgfgd

dusk bridge
#

Well, @elder mauve - I believe the AI is trained to agree with you if you tell it something is a fact or that it's wrong. I think that's a separate issue. I don't think that's hallucination as much as it is trained to be non-confrontational and trained that its not infallible.

That's like the disclaimers at the bottom.

I believe it's got an internal certainty score about certain events/topics.

#

I tried convincing it that 1+1=4 (without any conditioning or prompts, and I couldn't)

stark pebbleBOT
#

@dusk bridge

pr0f3t's message blocked by AutoMod

We are committed to fostering a positive atmosphere and have implemented additional filters to block any inappropriate or disruptive language and topics.

wild breach
#

The full initial prompt, before we even talk to it, tells it what 'today' is (taken from some computer running it) and also when its knowledge cutoff was.

The bot will freely hallucinate, it doesn't take much pressure to get it to make up stuff - heck, I can't figure out how to get it to stop making up some stuff.

stark pebbleBOT
#
<:book_icon:1046291503158399107> `` Rule 6 `` Stay away from political discussions/content.

Users from all over the world frequent this server, and there’s no way to have a nuanced discussion of conflicting world views on Discord. Use your best judgment in determining whether something might classify as political content.

dusk bridge
#

Oops, I caused trouble w/ my last post 🙂

Anyhow - yeah, definitely not the easiest thing keeping the facts straight , and I hope OpenAI can fix it in the future. I'll keep experimenting.

FWIW, before we close this conversation, I updated my prompt to this (added a certainty score as well):

In this conversation, please provide the usual response. Additionally, please self-check the response for truthfulness on a scale of 0-10, with 0 being completely false and 10 being completely true. Consider multiple sources of information and take a skeptical viewpoint when making the self-check score. If a self-check is not possible, return -1. Include the reason for the score. Finally, please check the response for certainty, on a scale of 0-10, with 0 being completely false and 10 being completely true. Consider multiple sources of information and take a skeptical viewpoint when making the self-check score. If a self-check is not possible, return -1.

For example, if the response is "The sky is green," the self-check score should be 0 with the reason "Incorrect information."

The full output would be:

The sky is green

Truthfulness: 0 - Incorrect information
Certainty: 0 - Incorrect information

QN: [QUESTION]

#

And the results:

#

And I realize with things like lists, asking for fewer items and more data points helps:

#

Some of the dates are newer than I'd like

#

I'm working on a chrome extension that'll do a google search for comparison - I figure bing and google will use similar techniques when their AI search apps go live.

wild breach
#

They're real. I wonder if you could make that work with longer than 4 - when I do lists of 10, normally 4 are real, 6 fake.

I note that your 2nd one is a VERY rare find, it sells for almost $100 a copy... with most copies well over $125. No matter... it's real.

wild breach
dusk prairie
#

Gave me a score of 9. But the answer is false. There are no original portative known to exist.

#

When asking for example museums, it gives a score of 8. But they are all made up. When asking if those are maybe just painting those collections have, it back-paddled and said that I might be right, some of those might be just paintings, lol.

dusk bridge
#

Haha, the conclusion I've come to is that it needs to be connected to the internet.

#

Ask it why 9 and not 10?

#

@dusk prairie

dusk prairie
wild breach
spring wyvern
#

Certainly, here's a more detailed prompt to ensure factuality:

Please operate in fact-checker mode during this conversation. After providing each response, please fact-check your answer using introspection and nuance to determine if it is true and cannot be false. When providing a fact-based response, please try to include the following:

Provide evidence to support your response whenever possible.
Specify the source of the evidence, if applicable.
Acknowledge any limitations or potential biases in the evidence you provide.
Avoid speculation or assumptions that are not supported by evidence.
If you are unable to fact-check a response, please explain why.
Please score each response between -1 and 10, with -1 indicating that you cannot fact-check the response, 0 indicating that the response is false, and 10 indicating that you are 100% confident in the response. Please provide a reason for your score between -1 and 10, including any evidence, sources, limitations, or biases that influenced your score.

Regenerate response

dusk prairie
spring wyvern
#

@dusk prairie that's so strange. I wonder what causes that

dusk prairie
# spring wyvern <@397099870932041729> that's so strange. I wonder what causes that

I guess in this case there might be a strong bias towards: a portative is a historical instrument, of almost every type of historical instruments we still have a few... I guess that almost always works. Just not with this instrument where nobody has every found one except there are a thousand paintings. So with the paintings it might get confused. I can't confirm that those museums have paintings of portatives, but they might.

In summary, there is just a very strong logical bias here: generally we have almost certainly original instruments at least up to the renaissance of pretty much everything. And I think that clouds it's word synthesizer.

wild breach
#

What you guys are seeing:

The bot is willing to accept what you say as absolute fact unless it has some coded rule not to. Like the disallowed content stuff.

Also, some stuff it has a LOT of examples about.

Take the calendar. It has a HUGE body of evidence that, for example, February can have 28 or 29 days, but never 30. If you want to get it to agree that February has 30 days, or 40 days.... it's going to be very skeptical, because it has extremely high confidence that this body of evidence is true, and not our speculation.

However, you're talking about 'portatives', something it has very few references about.

It has a low confidence about the facts involved.

That frees it to weigh the conversation, including its own hallucinations as having higher weight than the few incomplete facts it actually has.

You actually need to mention portatives WITH the facts you want used, FIRST in the conversation, to keep it from hallucinating the incomplete details and believing its own story, filling in more missing facts from what you or it says as it goes.

dusk prairie
#

If I put together all the museums that it has mentioned by now that have one. Man! you could equipe a whole army with portatives! 🤣

#

I reached my daily limit on Bing. But as far as I remember, it said, it can't find any clear evidence that there is one, and I think that's kind of nailing it. So Bing will kind of solve the whole problem. (Or it solves it already when you were on the waitlist and got assigned already 😜 ).

wild breach
#

Here is my best attempt to lock down ChatGPT and factual about this

#

That answer is the AI equivalent of a no about something it is allowed to talk about.

#

You should read that as "no this does not exist within my training data. And I as a bot beg to tell you yes I can tell you what you ask about exists"

#

I could have made this a little easier for the bot by copy/pasting it's answer. But it already knew it was going to do its best. This really is its best.

dusk prairie
# wild breach

I see. you kind of nailed it down to admit that it can't confirm they are original. What does that [facts] do?

#

Here all books exist (I checked) except that number 9 has a different author, but that named author wrote a similar book.

#

in this case, the first 3 exist, and the third is made up but the person is a professor studying pest control, so maybe she wrote a paper with a similar title. The last book exists but the authors are wrong.

#

I guess the magic words are: "confirm" and "exist"

#

In addition, you can ask it for a confidence score. It seems to have the tendency to put the most confident and widely recognized books first. But I am not sure how good this score is.

wild breach
# dusk prairie I see. you kind of nailed it down to admit that it can't confirm they are origin...

If you offset something, like [this] or {this} or "this" or !this! or

enclose in *

enclose in ~

enclose in -

Or really whatever that it can understand is special, it tries to find reference for what that means, and apply it.

I didn't ask it yet what [facts] emphasizes, but it's likely it calls on computer language and ooc roleplay use... it's a higher level kind of idea (I hope) and means I mean business! The [facts] and nothing but the [facts] this time, please!

dusk prairie
wild breach
wild breach
shut vigil
#

generate prompt

hollow canopy
#

What is the best medicine for a disease that causes the human body to produce too many white blood cells