I followed a tutorial to build a simple app with SvelteKit. It accepts an input and then alters it based on a preset prompt that I send along with it.
It looks like it's costing me around 1 cent per request. My prompt is about 650 characters, plus whatever context the end user provides, which is capped at 280 characters. Am I doing something wrong, or is that the normal cost?
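For a rough sense of the numbers, here's a back-of-the-envelope sketch. It assumes about 4 characters per token (a common rule of thumb for English text, not an exact tokenizer count), a hypothetical 250-token reply, and a hypothetical price of $0.02 per 1,000 tokens. None of those figures come from my actual bill, and the function names are made up for illustration.

```typescript
// Rough per-request cost estimate. Every constant here is an assumption.
const CHARS_PER_TOKEN = 4; // rule-of-thumb average, not a real tokenizer

// Approximate the token count from a character count.
function estimateTokens(charCount: number): number {
  return Math.ceil(charCount / CHARS_PER_TOKEN);
}

// Input and output tokens are both billed, so both are counted.
function estimateCostUsd(
  promptChars: number,
  contextChars: number,
  outputTokens: number,
  pricePer1kTokensUsd: number
): number {
  const inputTokens = estimateTokens(promptChars + contextChars);
  return ((inputTokens + outputTokens) / 1000) * pricePer1kTokensUsd;
}

// 650-character preset prompt + 280-character user context,
// hypothetical 250-token reply, hypothetical $0.02 per 1,000 tokens.
const cost = estimateCostUsd(650, 280, 250, 0.02);
console.log(cost.toFixed(4));
```

Under those assumptions the estimate lands just under a cent, which is in the same ballpark as what I'm seeing. The real figure depends on the actual tokenizer, the reply length, and the current price for the model.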
One thing that surprised me is that I'm basically re-submitting the same prompt on every request, just with new context. I would've thought there would be a way to "prime" the model once so that I don't have to keep giving it the same instructions, sort of like a way to control its permanent memory or something.
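To make the re-submitting concrete, this is roughly the shape of what happens on each request. It's a minimal sketch, not the tutorial's code: `INSTRUCTIONS`, `MAX_CONTEXT_CHARS`, and `buildPrompt` are hypothetical names, and the instruction text is a short placeholder for my real 650-character prompt.

```typescript
// Hypothetical names throughout; a sketch of the pattern, not the real app.
const INSTRUCTIONS = "Rewrite the user's text in a friendlier tone."; // placeholder for the preset prompt
const MAX_CONTEXT_CHARS = 280; // cap on what the end user can provide

// Every request carries the full instructions plus the new user context,
// so the instruction characters are sent (and billed) again each time.
function buildPrompt(userContext: string): string {
  const trimmed = userContext.slice(0, MAX_CONTEXT_CHARS);
  return `${INSTRUCTIONS}\n\n${trimmed}`;
}

const first = buildPrompt("hey, send me the report");
const second = buildPrompt("why is this late again");
console.log(first.startsWith(INSTRUCTIONS), second.startsWith(INSTRUCTIONS));
```

Both prompts start with the identical instruction text, which is the part I was hoping to send only once.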
Anyway, I just wanted to check what the norm is here. I obviously have no issue with a penny per request while it's only me, but if I release it and it gets used tons of times a day, it'll start to add up. Also, that's with GPT-3. I haven't even figured out how to use 3.5 or 4 yet, which I assume are more expensive. (And 4 does a WAY better job for this particular case.)