#GPT 4 Vision availability

1 messages · Page 1 of 1 (latest)

surreal vector
#

Hi everyone, can anyone confirm if GPT 4 vision is accessible for them? I got the following message in the help menu "Please note, we aren't currently enabling image capabilities in GPT-4." I would like to use GPT 4 vision with the Assistants functionality for a project.

junior forge
surreal vector
junior forge
#

Oh right the assistants API doesn't support image inputs, no known ETA currently

#

but you can always just program a call to gpt-4-vision-preview if you need image understanding for the assistant as a custom tool

junior forge
#

image access is on ChatGPT for Plus users and on the API through gpt-4-vision-preview for paying API customers

surreal vector
halcyon adder
junior forge
#

doesn't really make sense to use GPT-4 in that case

#

OCR libraries will work better too

#

GPT is able to do OCR you just need to play with the prompt a bit but it's not great at oct

#

ocr*

surreal vector
#

I already have other ways and it works fine, just wanted to experiment with what gpt can do

#

The docs has a list of limitations so it's already not viable, i do already use it to distinguish between text values using base gpt for things like name that can't be parsed with simple logic

#

Again it's about reducing the moving parts in my flow, currently i use a list of tools to get it done, hope it improves in the future and i can increase the usage

junior forge
#

I guess but I don't see any reason why you'd prefer using GPT for OCR instead of just OCR? can you elaborate a bit? @surreal vector

surreal vector
#

For context awareness

junior forge
surreal vector
#

Right now i do ocr then based on specific areas where i know the value is extract data i need

#

It's always roughly there not exactly so coordinates don't always align

junior forge
#

Ah i see

surreal vector
#

So i use things like regex to make sure it's right

junior forge
#

so u need it to extract out a specific part I understand

#

What if you OCR all the text including irrelevant text and then get GPT to keep only the relevant parts?

#

that would increase accuracy

surreal vector
#

In a perfect world gpt would extract what I'm asking and based on training data extract exactly what's needed without the coordinate shenanigans

surreal vector
#

It would've been cool if i could throw away everything and give vision an input and it gives me output lol

#

Waiting for assistants to have vision support, maybe it improves things