#How to get the AI to images in the knowledge
I am not sure it's working like that.
What I can say is:
images generated by code interpreter can not be analyzed by Vision in chat, you need to download and upload them
If they could be, you could prompt your bot like this:
- You have an image in your files; take a look at it with Vision and give me a detailed description of what's inside.
- Draw me a picture like that with DALL-E.
But currently that does not work.
Here is a screenshot of the test.
Otherwise, a method I used that is close to this is shown in my YouTube video with id bZzLzy_d6LA.
What I did was upload an image to the chat, ask it for a detailed description, then have it draw images based on that.
You can ask it to describe style, exposition, and so on.
What you can do then is give ChatGPT the image you want and ask it to produce a detailed description of how to draw it. Spend time on this, it should be an exhaustive one; check out Creature Fusion Plus for an example.
Then save that as a text file and add it to your knowledge base.
Enable Code Interpreter.
Instruct your bot to use Code Interpreter to retrieve that description into the chat when needed
and draw pictures based on that.
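A minimal sketch of the retrieval step, i.e. the kind of code the bot might run in Code Interpreter to pull the saved description into the chat. The function name `load_description`, the example file name, and the assumption that knowledge files are visible under `/mnt/data` are illustrative and not confirmed anywhere in this thread.

```python
from pathlib import Path


def load_description(filename: str, folder: str = "/mnt/data") -> str:
    """Return the saved drawing description as plain text.

    `folder` defaults to /mnt/data, which is assumed here to be where the
    sandbox exposes uploaded files; pass another folder if that is not so.
    """
    path = Path(folder) / filename
    # Strip leading/trailing whitespace so the text can be pasted into a prompt.
    return path.read_text(encoding="utf-8").strip()
```

In the GPT instructions you would then ask for something like "run `load_description("style_description.txt")` and use the result as the image prompt", where `style_description.txt` is a hypothetical file name.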
@dusty pilot Regarding text files in the GPT knowledge base, I have several text files in my GPT that contain additional instructions. I use the primary instructions to tell the GPT to Read knowledge base file "xyz.txt" for additional instructions related to XYZ, and it works fine.
If I keep the primary instructions and each of the 10 knowledge base files below the context window size (< 3000 or 4000 words), then I can have ~33,000 words of instructions max (not that I'm anywhere close to that yet).
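The arithmetic behind that estimate is 11 pieces of text (the primary instructions plus 10 files) times roughly 3000 words each. A small sketch for checking files against such a budget follows; the 3000-word figure is the poster's own rule of thumb, not an official limit, and all function names are made up for illustration.

```python
from typing import Dict

WORDS_PER_FILE_LIMIT = 3000  # rule of thumb from the message above, not an official limit


def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def files_within_limit(texts: Dict[str, str],
                       limit: int = WORDS_PER_FILE_LIMIT) -> Dict[str, bool]:
    """Map each file name to True if its text is strictly under the word limit."""
    return {name: word_count(body) < limit for name, body in texts.items()}


def max_total_words(num_pieces: int, limit: int = WORDS_PER_FILE_LIMIT) -> int:
    """Upper bound on total words if every piece stays at the limit."""
    return num_pieces * limit


# Primary instructions + 10 knowledge files = 11 pieces.
TOTAL_BUDGET = max_total_words(11)  # 11 * 3000 = 33000
```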
Do you use Code Interpreter? Or does knowledge retrieval work?
I wanted to experiment with a similar setup, I just did not have time.
Sorry, I don't use Code Interpreter. I know that the knowledge base files are converted into a kind of database that the GPT can query. I know that the files can be up to 512 MB, but I have not tried using large text files yet. Mine are all <3000 words each.
"images generated by code interpreter can not be analyzed by Vision in chat, you need to download and upload them"
This is not my experience. What do you base this on?
Here I made a video for you to demonstrate
If you made it work, i.e. it can do GPT Vision analysis on images from the knowledge base, do share. It did not work for me; it requires the user to upload the image.
ChatGPT is such a good liar, so it is difficult to show evidence of my theory. But attached is one example of how I believe it does vision analysis of an image it has just created using code.
Possible, but it's my font that I created.
I deliberately chose a font that was not called anything like "elegant", "script", "calligraphy".
I have already used up my ChatGPT quota for the next two hours, so I cannot test more ATM.
It could be that you are right. Interesting why it fails with a knowledge file, then.
Here is where I was at in my testing.
No, I think it is lying.
Hm, I think it actually did well. Sure, it is completely wrong about it, but it sees the stuff on top of the L?
Yeah, it's weird. I am thinking about how to test it better in that sense.
But anyway, I have another use case where I upload a PDF, convert it to images, and then ask it to tell me what is on them, and it fails and says it can't.
Then I download and re-upload them and it works.
I think it is hardcoded to block the content from the knowledge base. When I get it to write what should display the image, the image just disappears.

However, if the link is broken, it shows up as a broken image symbol.
I also got it to send me the image from the knowledge base after it had added a blue filter to it. White image was in knowledge base, which it refused to send or show, but once it had created the blue version it was happy to give me a download link.
For me it gives the image as is, without resistance.
Cool, what was your prompt for that?
I am still leaning towards it being able to do vision analysis on a Python-generated image. I may be fooled by its lying, but my experience is that it does it. However, I don't know about vision analysis from the knowledge base. Is that what is in question?
If so, I think it is an effect of it not wanting to display the images from the knowledge base. And if I understand it correctly, the images need to be displayed for it to analyse them.
Maybe that is why your experience is that it cannot analyze images from Code Interpreter? Because it requires the extra step of displaying the image before being able to look at it. And it doesn't even consistently use this approach when specifically instructed to do so.
Yeah, with Python-generated images it's a bit tricky... you need to make it generate an image where it is not clear from the code what it shows, but it is clear from the picture.
Oh, I know.
so here is how I made it not know what it draws
https://chat.openai.com/g/g-YhQpmZkPo-gpt-vision-on-code-interpreter-images-test
I gave it, as a knowledge file, Python code that draws a picture with 4 black horizontal lines
and an instruction to import and run that code,
so it gives an image.
If you ask what is inside it, it fails to see it.
If you download and upload it, it sees it.
All in all it's a good liar; it does not really see what is in the pictures, it infers it from file names and the code it used to draw them.
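The actual knowledge file from this test is not shown in the thread, so the following is only a sketch of what such a script could look like. It uses just the standard library to write a grayscale PNG; the line count never appears as a literal and falls out of the spacing and the image height. All names here are hypothetical, and the comments are kept for readers (the test file described above had its comments stripped so the model could not guess from them).

```python
import struct
import zlib
from pathlib import Path


def line_rows(height, spacing=40, thickness=4):
    """Pixel rows covered by horizontal lines placed every `spacing` pixels."""
    return {y + t for y in range(spacing, height, spacing) for t in range(thickness)}


def make_rows(width, height, black_rows):
    """Grayscale scanlines: 0 (black) on the given rows, 255 (white) elsewhere."""
    return [bytes([0 if y in black_rows else 255]) * width for y in range(height)]


def _chunk(tag, data):
    """One PNG chunk: 4-byte length, tag, data, CRC-32 over tag + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def encode_png(rows):
    """Encode equal-length 8-bit grayscale scanlines as PNG bytes."""
    width, height = len(rows[0]), len(rows)
    # IHDR: width, height, bit depth 8, colour type 0 (grayscale),
    # then compression, filter and interlace methods, all 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Every scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + row for row in rows)
    return (b"\x89PNG\r\n\x1a\n" + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(raw)) + _chunk(b"IEND", b""))


def save_test_image(path, width=200, height=200):
    """Write the test picture to `path`; the defaults produce 4 black lines."""
    rows = make_rows(width, height, line_rows(height))
    Path(path).write_bytes(encode_png(rows))
```

Saving under a neutral file name (for example `out.png`) matters as much as stripping comments, since the file name is one of the things the model appears to guess from.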
Great test, thank you. I will test if I can convince it that it can see it, as it is possible it just says it can't.
If you find out that it can, do share, I really need it!
Here, it seems to support my idea that it needs to display the image to be able to analyze. But it also gets it wrong, about it being yellow. But the image is straight from knowledge
(the above is not your test GPT but my own)
Yeah, but one thing is that it uses other information, like file names for example; there could also be meta information attached to your file, etc.
My test removes all of that; it is a straight image from code it does not know.
I specifically removed all comments and stuff so that it cannot use them to make guesses.
How did you draw 4 lines without it knowing how many? Just by pixel distance?
Proving you right:
Image filename is of course "house_with_yellow_roof01.webp"
Okay, this is confusing. The GPT and I seem equally unsure about everything.
Look, it definitely can see the image.
One more example. It seems like the vision does not get activated unless you upload a photo.
Here is when I uploaded the sudoku image first, it analyzed it, and then I asked it to show its knowledge image. Then there is no confusion. So it appears that you need to send it an image for vision to be activated.
Oh wow, that is weird...
But it's an interesting find, maybe it should be reported as a bug.
I will test it quickly now. Interesting; if so, great find.
You're right lol, it can speak about images from the knowledge base after you upload one, weeeird.
It gives me hope that there could be a way to activate it, give me a sec.
Hm, it is still confusing.
I have seen that I can ask it about its prompts and system messages and functions,
but it does not tell me anything about vision capabilities,
so I am not sure if we can enable it before uploading... weeeird
seems like a bug
I want to test if it activates it if you upload any file. A font for example.
Still problematic if we cannot activate it without an upload...
I am thinking of reporting it as a bug.
Hey, I saw your GPT mentioned in a YouTube video I watched this morning.
Tried to send the link but it got blocked.
at 14:34: efXoLvB4Xkw
Well, if you check the author's avatar and mine here, you may notice that's me. It's my channel.
Cool. Even better, I watched your video this morning!
cool!
Was it recommended? I am pretty new to publishing YouTube content.
I was thinking of turning this conversation into one more video; it will include you. I hope you do not mind.
Yes, I guess. It must have been on my front page.
Haha, I have no idea how all that works. I was thinking of doing YouTube content for a while; at the moment custom GPTs have a niche hype interest.
I think it is a great idea. If it takes off it may be very rewarding.
If the whole GPT thing takes off, I mean.
@dusty pilot since you have the code interpreter activated. Tell it to download it into the data folder and name it {uniqueName}.png. This will let you use it for analysis and you can just delete it when that's done.
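A rough sketch of that suggestion as code the bot could run. It assumes the "data folder" is `/mnt/data` in the Code Interpreter sandbox and uses a random UUID for `{uniqueName}`; both are assumptions, and the function names are hypothetical. Whether the copy then becomes visible to Vision is exactly what this thread has not settled.

```python
import shutil
import uuid
from pathlib import Path


def copy_with_unique_name(source: str, data_dir: str = "/mnt/data") -> Path:
    """Copy an image into `data_dir` under a fresh, meaningless .png name.

    A random name avoids clashes and gives the model no hint about the
    content, unlike a descriptive file name.
    """
    target = Path(data_dir) / f"{uuid.uuid4().hex}.png"
    shutil.copyfile(source, target)
    return target


def remove_copy(path: Path) -> None:
    """Delete the temporary copy once the analysis is done."""
    Path(path).unlink()
```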