A thread to discuss ways to reduce hallucinations (the model mishearing or making up words and phrases that aren't there).
When using the OpenAI API, an effective way I've found to remove the repeating-sequence hallucinations is to reject segments that have a strongly negative avg_logprob (high in magnitude) and/or a low no_speech_prob value. Both values are included on each segment if you use the verbose_json response format in your requests.
My current filter settings are these, but I'm still adjusting them:
segment.no_speech_prob < 0.20 && segment.avg_logprob < -0.70
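
For anyone who wants to try this, here is a rough sketch of how that check could be applied to the segments array from a verbose_json response. It assumes the expression above is the rejection condition (segments matching it get dropped). The `Segment` interface only lists the fields used here, and the names `isLikelyHallucination`, `filterSegments`, and the sample values are made up for illustration, not part of the API.

```typescript
// Subset of the per-segment fields present in a verbose_json transcription.
interface Segment {
  text: string;
  avg_logprob: number;
  no_speech_prob: number;
}

// Thresholds from the post above; they are still being tuned.
const NO_SPEECH_PROB_THRESHOLD = 0.20;
const AVG_LOGPROB_THRESHOLD = -0.70;

// Assumption: a segment matching BOTH conditions is treated as a hallucination.
function isLikelyHallucination(segment: Segment): boolean {
  return (
    segment.no_speech_prob < NO_SPEECH_PROB_THRESHOLD &&
    segment.avg_logprob < AVG_LOGPROB_THRESHOLD
  );
}

// Keep only the segments that do not match the rejection condition.
function filterSegments(segments: Segment[]): Segment[] {
  return segments.filter((segment) => !isLikelyHallucination(segment));
}

// Hypothetical sample data, standing in for response.segments.
const sampleSegments: Segment[] = [
  { text: "Hello there.", avg_logprob: -0.25, no_speech_prob: 0.05 },
  { text: "Thank you. Thank you. Thank you.", avg_logprob: -0.95, no_speech_prob: 0.02 },
  { text: "Okay.", avg_logprob: -0.9, no_speech_prob: 0.6 },
];

const keptSegments = filterSegments(sampleSegments);
console.log(keptSegments.map((segment) => segment.text).join(" "));
```

Keeping the two thresholds as named constants makes it easier to adjust them against your own audio, since good values will probably differ between recordings.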