I’m experiencing a serious issue (or potentially multiple issues) that has rendered Typingmind almost unusable. Could you please investigate this as soon as possible and advise on any workarounds?
Details:
- Response Cut Off: Responses are limited to 5000 tokens (see attached 1) despite the max token setting being 50000.
- Timing: The issue began about one day after I changed the max token and default model settings. Is it related to prompt cache? If so, I’d like to know how to force update cache.
I’ve attached a HAR file for reference. Please let me know if you need more information.
Response from Typingmind:
Regarding the cut off issue, I can see that TypingMind has correctly sent the parameter max_completion_tokens=50000 to OpenRouter. My best guest is that Open Router doesn't respect this parameter. You have two options in this case:
- Option 1: Modify your custom model to add a custom body parameter using max_tokens=50000(my understanding is that OpenRouter is using the deprecated parameter by OpenAI, but I could be wrong, only OpenRouter team can answer this)
- Option 2: contact OpenRouter team to make their API compatible with OpenAI API and accepting the max_completion_tokens parameter.