#How to get the AI to images in the knowledge

dusty pilot
#

I'm trying to train a GPT to generate images in a specific way and have uploaded reference images to the knowledge section. How would I go about directing the GPT to use those images as references?

Same goes for text files. How do I direct the GPT to reference those? Or rather, what's the best way to do so?

grizzled verge
#

I am not sure it works like that.
What I can say is:
images generated by Code Interpreter cannot be analyzed by Vision in chat; you need to download them and upload them again.
If they could be, you could prompt your bot like this:

  1. you have an image in your files, take a look at it with Vision and give me a detailed description of what's inside
  2. draw me a picture like that with DALL-E

But currently that does not work.
Here is a screenshot of the test.

#

Otherwise, I showed a method close to this in a YouTube video with ID bZzLzy_d6LA.

What I did was upload an image to the chat, ask it to give a detailed description, then draw images based on it.
You can ask it to describe style, exposition, and so on.

What you can do then is give ChatGPT the image you want and ask it to produce a detailed description of how to draw it. Spend time on this; the description should be exhaustive. Check out Creature Fusion Plus for an example.

Then save that as a text file and add it to your knowledge base.
Enable Code Interpreter.

Instruct your bot to use Code Interpreter to retrieve that description into the chat when needed and
draw pictures based on it.
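
A minimal sketch of what "retrieve the description with Code Interpreter" could look like on the Python side. The `/mnt/data` directory, the file name `style_description.txt`, and both helper names are assumptions for illustration, not something confirmed in this thread; check where your GPT actually sees its files.

```python
from pathlib import Path

# Assumption: Code Interpreter exposes uploaded knowledge files under /mnt/data.
KNOWLEDGE_DIR = Path("/mnt/data")


def load_description(filename: str, base_dir: Path = KNOWLEDGE_DIR) -> str:
    """Return the text of a saved image description (hypothetical helper)."""
    path = base_dir / filename
    if not path.is_file():
        # List the text files that do exist so a wrong name is easy to spot.
        available = (
            sorted(p.name for p in base_dir.glob("*.txt")) if base_dir.is_dir() else []
        )
        raise FileNotFoundError(
            f"{filename} not found; text files available: {available}"
        )
    return path.read_text(encoding="utf-8").strip()


def build_image_prompt(description: str, subject: str) -> str:
    """Combine the stored style description with the subject to draw."""
    return f"Draw {subject}. Follow this style description closely:\n{description}"
```

In the GPT's instructions you would then say something like "run `load_description("style_description.txt")` before drawing and use the result as the style guide", so the full description lands in the chat context before the image request.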

tall flame
#

@dusty pilot Regarding text files in the GPT knowledge base, I have several text files in my GPT that contain additional instructions. I use the primary instructions to tell the GPT to Read knowledge base file "xyz.txt" for additional instructions related to XYZ, and it works fine.

If I keep the primary instructions and each of the 10 knowledge base files below the context window size (< 3000 or 4000 words), then I can have ~33,000 words of instructions max (not that I'm anywhere close to that yet).
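
The ~33,000 figure seems to come from counting the primary instructions as one more chunk next to the 10 files, at about 3000 words each. That reading is an assumption; the sketch below just spells out the arithmetic, and the function names are made up for illustration.

```python
def max_instruction_words(knowledge_files: int, words_per_chunk: int) -> int:
    """Total words across the primary instructions plus all knowledge files."""
    chunks = knowledge_files + 1  # +1 for the primary instructions
    return chunks * words_per_chunk


def fits_budget(text: str, words_per_chunk: int) -> bool:
    """Rough check that one file stays under the per-file word limit."""
    return len(text.split()) < words_per_chunk


print(max_instruction_words(10, 3000))  # 33000
```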

grizzled verge
tall flame
#

Sorry, I don't use Code Interpreter. I know that the knowledge base files are converted into a kind of database that the GPT can query. I know that the files can be up to 512 MB, but I have not tried using large text files yet. Mine are all <3000 words each.

safe socket
grizzled verge
#

If you made it work, i.e. it can do GPT Vision analysis on images from the knowledge base, do share. It did not work for me; it requires the user to upload the image.

safe socket
grizzled verge
#

It could know the font from its training.

#

I will test now

safe socket
#

I deliberately chose a font that was not called anything like "elegant", "script", "calligraphy".

#

I have already used up my ChatGPT quota for the next two hours, so I cannot test more ATM.

grizzled verge
#

It could be that you are right. Interesting why it fails with a knowledge file then.

safe socket
#

Here is where I was at in my testing.

grizzled verge
#

No, I think it is lying.

safe socket
grizzled verge
#

Yeah, it's weird. I am thinking about how to test it better in that sense.

#

But anyway, I have another use case where I upload a PDF, convert it to images, and then ask it to tell me what is on them, and it fails; it says it can't.
Then I download and re-upload them and it works.

safe socket
#

I also got it to send me the image from the knowledge base after it had added a blue filter to it. White image was in knowledge base, which it refused to send or show, but once it had created the blue version it was happy to give me a download link.

grizzled verge
#

For me, it gives the image as is, without resistance.

safe socket
#

I am still leaning towards it being able to do vision analysis on a Python-generated image. I may be fooled by its lying, but my experience is that it does it. However, I don't know about vision analysis from the knowledge base. Is that what is in question?

#

If so, I think it is an effect of it not wanting to display the images from the knowledge base. And if I understand it correctly, the images need to be displayed for it to analyse them.

#

Maybe that is why your experience is that it cannot analyze images from Code Interpreter? Because it requires the extra step of caption to display the images before being able to look at them. And it doesn't even consistently use this approach when specifically instructed to do so.

grizzled verge
#

Yeah, with Python-generated images it's a bit tricky... you need to make it generate an image where it is not clear from the code what it shows, but it is clear from the picture.

#

Oh, I know

grizzled verge
#

All in all, it's a good liar. It does not really see what is in the pictures; it infers it from the file names and the code it used to draw them.

safe socket
grizzled verge
#

If you find out that it can, do share. I really need it!

safe socket
#

Here, it seems to support my idea that it needs to display the image to be able to analyze. But it also gets it wrong, about it being yellow. But the image is straight from knowledge

#

(the above is not your test GPT but my own)

grizzled verge
#

Yeah, but one thing is that it uses other information, like file names for example. There could also be meta information attached to your file, etc.

My test removes all of that; it is an image straight from code, whose content it does not know.

#

I specifically removed all comments and stuff so that it cannot use them to make guesses.
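
A rough sketch of the kind of blind test described here, not the actual script used in this thread: the number of lines is picked at random at run time, so neither the code nor the file name reveals what the picture shows. It uses only the standard library to write a small grayscale PNG; the function names and the file name `test_image.png` are hypothetical.

```python
import random
import struct
import zlib


def write_png(path, pixels, width, height):
    """Write rows of 0-255 grayscale values as an 8-bit grayscale PNG."""
    def chunk(tag, data):
        crc = zlib.crc32(tag + data) & 0xFFFFFFFF
        return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)

    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + bytes(row) for row in pixels)
    # IHDR: width, height, bit depth 8, color type 0 (grayscale), defaults.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    png = (b"\x89PNG\r\n\x1a\n" + chunk(b"IHDR", ihdr)
           + chunk(b"IDAT", zlib.compress(raw)) + chunk(b"IEND", b""))
    with open(path, "wb") as f:
        f.write(png)


def make_blind_test_image(path, width=256, height=256, rng=None):
    """Draw a random number (2 to 7) of vertical black lines on white.

    Returns the count so a human can check the answer offline; do not
    print it inside the chat session, or the test is no longer blind.
    """
    rng = rng or random.SystemRandom()
    count = rng.randint(2, 7)
    pixels = [[255] * width for _ in range(height)]
    spacing = width // (count + 1)
    for i in range(1, count + 1):
        x = i * spacing
        for y in range(20, height - 20):
            for dx in range(3):  # lines are 3 pixels wide
                pixels[y][x + dx] = 0
    write_png(path, pixels, width, height)
    return count
```

Inside Code Interpreter you would call `_ = make_blind_test_image("test_image.png")` so the count is never echoed, then ask the GPT how many lines the image contains and compare with what you see after downloading it.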

safe socket
#

How did you draw 4 lines without it knowing how many? Just by pixel distance?

#

Proving you right:

#

Image filename is of course "house_with_yellow_roof01.webp"

#

Okay, this is confusing. The GPT and I seem equally unsure about everything.

#

Look, it definitely can see the image.

#

One more example. It seems like the vision does not get activated unless you upload a photo.

#

Here is when I uploaded the sudoku image first, it analyzed it, and then I asked it to show its knowledge image. Then there is no confusion. So it appears that you need to send it an image for vision to be activated.

grizzled verge
#

Oh wow, that is weird...

#

But it's an interesting find; maybe it should be reported as a bug.

#

I will test it quickly now. Interesting; if so, great find.

#

You're right lol, it can speak about images from the knowledge base after you upload one, weeeird

#

It gives me hope that there could be a way to activate it, give me a sec

#

Hm, it is still confusing.
I have seen that I can ask it about its prompts, system messages, and functions,

but it does not tell me anything about vision capabilities.

So I am not sure if we can enable it before uploading... weeeird
Seems like a bug.

safe socket
#

I want to test if it activates it if you upload any file. A font for example.

grizzled verge
#

Still problematic if we cannot activate it without an upload...

#

I am thinking of reporting it as a bug.

safe socket
#

Hey, I saw your GPT mentioned in a YouTube video I watched this morning 😄
Tried to send the link but it got blocked.

grizzled verge
#

You can send the ID of the YouTube video.

#

for example I am watching oxuWYAG6PB4

safe socket
#

at 14:34: efXoLvB4Xkw

grizzled verge
#

Well, if you check the author's avatar and mine here, you may notice that's me 😄 It's my channel

safe socket
#

Cool 😄 Even better, I watched your video this morning!

grizzled verge
#

cool!

#

was it recommended? I am pretty new to publishing youtube content

#

I was thinking of turning this conversation into one more video; it will include you 😄 I hope you do not mind 😄

safe socket
#

Yes, I guess. It must have been on my front page.

grizzled verge
#

Haha, I have no idea how all that works. I was thinking of doing YouTube content for a while; at the moment custom GPTs have a niche hype interest 😄

safe socket
#

I think it is a great idea. If it takes off, it may be very rewarding.

#

If the whole GPT thing takes off, I mean.

grand gull