Hello OpenAI team,
I'm using the OpenAI API to extract and structure addresses from free-text descriptions. I rely on the response_format: json option to ensure clean, machine-readable outputs. However, in many cases, the API is returning malformed or incorrectly encoded characters within the JSON response.
For example, instead of returning "São Paulo" or "Guarujá", I receive:
{
"estado": "S\u00050",
"cidade": "Guaruj\u00101"
}
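The escapes \u0005 and \u0010 decode to control characters (U+0005, U+0010, etc.) rather than the accented letters ã (U+00E3) and á (U+00E1). The JSON itself still parses, but the string values are corrupted, making the output unusable in production systems.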
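To illustrate the problem, here is a minimal Python sketch of how such a response can be flagged after parsing. It assumes the response body has already been obtained as a JSON string; the helper name find_control_chars and the path notation are hypothetical, chosen only for this example, and the check is a detection aid, not a fix for the underlying behavior.

```python
import json
import unicodedata


def find_control_chars(value, path="$"):
    """Yield (path, codepoint) for each control character in any string value.

    Hypothetical helper: walks dicts and lists recursively and reports
    characters in Unicode category "Cc", ignoring tab, LF, and CR.
    """
    if isinstance(value, str):
        for ch in value:
            if unicodedata.category(ch) == "Cc" and ch not in "\t\n\r":
                yield path, f"U+{ord(ch):04X}"
    elif isinstance(value, dict):
        for key, item in value.items():
            yield from find_control_chars(item, f"{path}.{key}")
    elif isinstance(value, list):
        for index, item in enumerate(value):
            yield from find_control_chars(item, f"{path}[{index}]")


# The malformed payload from the example above, as a raw JSON string.
raw = r'{"estado": "S\u00050", "cidade": "Guaruj\u00101"}'

# json.loads accepts the escapes, so parsing alone does not reveal the issue.
parsed = json.loads(raw)

problems = list(find_control_chars(parsed))
for field_path, codepoint in problems:
    print(f"{field_path}: unexpected control character {codepoint}")
```

A check like this could be used to reject or retry a response before it reaches downstream systems, since the payload is valid JSON and would otherwise pass parsing silently.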
This behavior has been consistent and is severely affecting the reliability of our integration. For context, our usage over the past month alone was:
Total tokens: 1,971,385,774
Total requests: 344,854
We kindly ask for guidance or a fix, as we rely entirely on the model's output for critical address processing and need consistent, correctly encoded UTF-8 responses.
Thank you in advance for your support.
Best regards,
Luis