We are building an AI pipeline that takes our video content, transcribes it with Whisper, presents the information in written form, and optimizes it for our social media posts.
However, the API costs are significant when we use GPT-4, while the results are sub-par when we use GPT-3.5-Turbo.
We generate a very long summary of the video transcript, then create a structure for the written content from that summary and publish the result.
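Roughly, the flow can be sketched as below. This is only a minimal illustration, not our actual code: `run_pipeline`, `transcribe`, `summarize`, and `structure` are hypothetical placeholder names, and the model calls are passed in as plain functions so that each stage could be backed by Whisper, GPT-4, or GPT-3.5-Turbo independently.

```python
from typing import Callable


def run_pipeline(
    video_path: str,
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
    structure: Callable[[str], str],
) -> str:
    """Run the three stages in order and return the text to publish.

    All three callables are hypothetical stand-ins:
      transcribe: video path -> raw transcript (e.g. a Whisper call)
      summarize:  transcript -> long summary (e.g. a GPT call)
      structure:  summary -> structured written content (e.g. a GPT call)
    """
    transcript = transcribe(video_path)
    summary = summarize(transcript)
    return structure(summary)


if __name__ == "__main__":
    # Stub stages so the sketch runs without any API access.
    post = run_pipeline(
        "example.mp4",
        transcribe=lambda path: "raw transcript of " + path,
        summarize=lambda text: "summary: " + text,
        structure=lambda text: "# Post\n\n" + text,
    )
    print(post)
```

Keeping the stages separate like this makes it explicit which model each step uses, which is where the cost versus quality trade-off shows up.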
Any suggestions for improving this? Has anyone had success with a similar setup?