#Prompt Caching Not Working with gpt-4.1-nano Despite Repeated Context and prompt_cache_key


winged elk

Hi all,

I'm currently building a chatbot using gpt-4.1-nano, and I'm trying to leverage prompt caching since a large portion of the prompt context remains identical across multiple user queries.

To do this, I’ve added the prompt_cache_key parameter to my requests and also console-logged the constructed prompt to verify that the majority of the context is unchanged. The total prompt length is over 10,000 tokens — well above the 1,024-token threshold for caching to be available, as noted in the documentation: https://platform.openai.com/docs/guides/prompt-caching
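The caching guide linked above describes cache hits as matching on the start of the prompt, so "mostly unchanged" is only enough if the unchanged part comes first. A minimal sketch of one way to check that between two logged prompts; `shared_prefix_len` and the sample strings are hypothetical (not taken from the actual chatbot), and it counts characters rather than tokens:

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Return how many leading characters a and b have in common."""
    n = 0
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            break
        n += 1
    return n


# Hypothetical prompts: a large static context plus a short user question.
static_context = "You are a helpful assistant. " * 50
prompt_1 = static_context + "User: What are your opening hours?"
prompt_2 = static_context + "User: Where are you located?"
# Same content, but something dynamic was placed in front of the static part.
prompt_3 = "Request time: 10:02. " + static_context + "User: Where are you located?"

print(shared_prefix_len(prompt_1, prompt_2))  # covers the whole static context
print(shared_prefix_len(prompt_1, prompt_3))  # diverges at the first character
```

If the second number is small for real prompts, something dynamic (a timestamp, a user ID, the question itself) is probably sitting ahead of the static context.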

Despite this, I'm only ever seeing "cached_tokens": 0 in all API responses.
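For anyone comparing notes, a minimal sketch of reading that field, assuming the response has been parsed into a plain dict and that the count sits under `usage.prompt_tokens_details` (Chat Completions) or `usage.input_tokens_details` (Responses API). The helper name and the sample dicts are hypothetical:

```python
from typing import Any, Mapping


def cached_token_count(response: Mapping[str, Any]) -> int:
    """Return cached_tokens from a parsed API response, or 0 if absent."""
    usage = response.get("usage") or {}
    # Assumption: Chat Completions nests the count under prompt_tokens_details,
    # the Responses API under input_tokens_details.
    details = (
        usage.get("prompt_tokens_details")
        or usage.get("input_tokens_details")
        or {}
    )
    return int(details.get("cached_tokens") or 0)


# Hypothetical response payloads, trimmed to the usage block.
miss = {"usage": {"prompt_tokens": 10240,
                  "prompt_tokens_details": {"cached_tokens": 0}}}
hit = {"usage": {"prompt_tokens": 10240,
                 "prompt_tokens_details": {"cached_tokens": 10112}}}

print(cached_token_count(miss))
print(cached_token_count(hit))
```

Logging this per request, next to the prompt itself, makes it easier to see whether any request ever gets a hit.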

Is there something I'm missing about how prompt caching works or is triggered? Any guidance or clarification would be greatly appreciated!

winged elk

Nvm, got it working!

severe night

How'd you get it working? I feel like I'm stuck in the same boat LOL