#High latency in GPT-4o compared to GPT-4o-mini


serene moth Sep 27, 2025, 5:57 PM

I am migrating to GPT-4o via an AI Foundry endpoint and I noticed a delay in the responses. It takes about 30-50s longer to answer a simple question. I tried GPT-4o-mini via an AI Foundry endpoint too and the output was quick. Any help would be greatly appreciated!

haughty prairie Sep 29, 2025, 5:43 PM

It appears you're experiencing delays with GPT-4o on Azure AI Foundry, while GPT-4o-mini is performing well. Here's a potential approach to troubleshoot:

Timeout Settings: If requests typically take longer than 10 seconds, consider increasing the client timeout to 30 seconds or more.
Retry Policies: Implementing retry policies with exponential backoff can help manage temporary service interruptions or overloads.
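
To make the two suggestions above concrete, here is a minimal sketch of a retry wrapper with exponential backoff and jitter, using only the Python standard library. The helper name `call_with_backoff`, its parameters, and the choice of `TimeoutError`/`ConnectionError` as retryable errors are assumptions for illustration, not part of any Azure SDK; swap in whatever exceptions your client actually raises.

```python
import random
import time


def call_with_backoff(request, max_attempts=4, base_delay=1.0, max_delay=30.0,
                      retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Call request() and retry on transient errors with exponential backoff.

    request is any zero-argument callable (hypothetical here) that performs
    one API call and raises one of the retry_on exceptions on failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Upper bound doubles each attempt (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter keeps many clients from retrying at the same instant.
            sleep(random.uniform(0, delay))


# Usage sketch (send_chat_request is a hypothetical function wrapping your
# endpoint call; pass your client's own timeout option inside it):
# reply = call_with_backoff(lambda: send_chat_request(prompt, timeout=30))
```

Note that retries only help with transient failures. If GPT-4o is consistently 30-50s slower than GPT-4o-mini, a longer timeout avoids premature errors but will not reduce the latency itself.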