GPT-4 has some nice image recognition, but when it comes to math, it tries to use Python to "read" the image rather than actually looking at it. This can suck if you have things like complex symbols or integrals, since GPT's own image recognition is far better than the Python route. A good example is trying to get GPT to do calculus when there are multiple problems in one file. To help out people who end up in situations like mine, here is my general guide on getting GPT to do math.
STEP 1: YOUR PROBLEMS
Be sure that your problems are legible. This means that if you show your problems to a fifth grader whose third language is English, they should be able to read them. First make sure each problem can be understood, then make sure it is clear which instructions go with which problem.
TIPS FOR THIS STEP:
A PDF is usable, and in a lot of cases preferred, so the rest of this guide assumes you have one. If your questions are in JPGs or PNGs, that's fine too, but try to stick to one file format. If you already have images, skip to the merge part of step 2.
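If you have a whole folder of screenshots, a quick way to check that you stuck to one format is to count the file extensions. This is a minimal sketch using only the Python standard library; the names count_formats and is_single_format are made up for illustration, and treating .jpeg as the same thing as .jpg is my own assumption.

```python
from collections import Counter
from pathlib import Path

# Assumption: .jpeg and .jpg count as the same format for our purposes.
ALIASES = {".jpeg": ".jpg"}


def count_formats(filenames):
    """Count files per lowercase extension (hypothetical helper)."""
    counts = Counter()
    for name in filenames:
        ext = Path(name).suffix.lower()
        counts[ALIASES.get(ext, ext)] += 1
    return counts


def is_single_format(filenames):
    """Return True when every file shares one extension."""
    return len(count_formats(filenames)) <= 1
```

If is_single_format comes back False, convert the odd ones out before uploading.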
STEP 2: TRANSLATE & MERGE!
Now you have GPT looking at your gorgeous PDF. You were sure to dot your i's and cross your t's, and you even included a nice message for the three-letter agent reading your post. If you try to have GPT solve all the problems right then and there, there is a lower chance (in my experience) that it will manage it. My theory is that the PDF format, while nice for humans, is very cringe to machines, as anyone who has tried converting a PDF to DOCX can testify. To make it less cringe, we need to mush our carrots into baby food for the AI. Here is the prompt I would give GPT if I had a PDF:
"please use python to modify this pdf into a series of images. after doing so, use python to merge the images into one long image. after merging the images, use python again to embed the image into our chat."
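If you already have images, don't use the first fifteen words (everything up to and including "after doing so,"). Start the prompt at "use python to merge" instead.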
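For the curious, here is roughly what that prompt asks GPT to do behind the scenes. This is a minimal sketch under my own assumptions, not what GPT actually runs: it assumes the third-party packages pdf2image (which needs poppler installed) and Pillow are available, and the names stack_layout, pdf_to_images, and merge_vertically are made up for illustration.

```python
def stack_layout(sizes):
    """Work out where each page goes when stacked top to bottom.

    sizes is a list of (width, height) tuples. Returns
    (canvas_width, canvas_height, y_offsets).
    """
    if not sizes:
        raise ValueError("need at least one image")
    canvas_width = max(width for width, _ in sizes)
    y_offsets = []
    y = 0
    for _, height in sizes:
        y_offsets.append(y)  # this page starts where the previous one ended
        y += height
    return canvas_width, y, y_offsets


def pdf_to_images(pdf_path, dpi=150):
    """Render each PDF page to a Pillow image (assumes pdf2image + poppler)."""
    from pdf2image import convert_from_path  # third-party, imported lazily
    return convert_from_path(pdf_path, dpi=dpi)


def merge_vertically(images, out_path):
    """Paste Pillow images into one long image and save it to out_path."""
    from PIL import Image  # Pillow, imported lazily
    images = [im.convert("RGB") for im in images]
    width, height, offsets = stack_layout([im.size for im in images])
    canvas = Image.new("RGB", (width, height), "white")
    for im, y in zip(images, offsets):
        canvas.paste(im, (0, y))  # left-aligned, one page under the other
    canvas.save(out_path)
    return out_path
```

If you wanted to run it yourself, something like merge_vertically(pdf_to_images("homework.pdf"), "merged.png") would do it, where homework.pdf is a placeholder filename. If you already have images, open them with Pillow's Image.open and pass that list to merge_vertically instead.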
part 1/?