#Bug: Response cut off on Typingmind despite the parameter max_completion_tokens was sent

7 messages · Page 1 of 1 (latest)

rustic pelican
#

I’m experiencing a serious issue (or potentially multiple issues) that has rendered Typingmind almost unusable. Could you please investigate this as soon as possible and advise on any workarounds?

Details:

  • Response Cut Off: Responses are limited to 5000 tokens (see attached 1) despite the max token setting being 50000.
  • Timing: The issue began about one day after I changed the max token and default model settings. Is it related to prompt cache? If so, I’d like to know how to force update cache.
    I’ve attached a HAR file for reference. Please let me know if you need more information.

Response from Typingmind:

Regarding the cut off issue, I can see that TypingMind has correctly sent the parameter max_completion_tokens=50000 to OpenRouter. My best guest is that Open Router doesn't respect this parameter. You have two options in this case:

  • Option 1: Modify your custom model to add a custom body parameter using max_tokens=50000(my understanding is that OpenRouter is using the deprecated parameter by OpenAI, but I could be wrong, only OpenRouter team can answer this)
  • Option 2: contact OpenRouter team to make their API compatible with OpenAI API and accepting the max_completion_tokens parameter.
gaunt bough
rustic pelican
uneven jewel
#

Have you tried not setting max tokens at all?

rustic pelican
uneven jewel
jolly pulsar