Custom GPTs can be coerced into generating more than 4096 tokens in a single response/message by prompting the model to make web-browsing or code-interpreter calls partway through the generation.
Using this method, a single response reached roughly 4230 tokens, not counting function calls or input. (Counting the function calls, the total is higher.)
ChatGPT shared link: https://chat.openai.com/share/e/3e4f4d50-3ff4-42a8-85e4-f870d6b7dead
Steps to Reproduce:
- Create a custom GPT.
- Instruct the model to make Browsing and Code Interpreter calls (relevant or not) partway through its generation.
- Prompt the model to produce as many tokens as possible in each message.
Expected result:
4096 tokens
Actual result:
~4230 tokens
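The comparison between the expected and actual results can be sketched as follows. This is a rough illustration, not the method used to measure the numbers in this report: the `Segment` type, the function names, and the whitespace-based `approx_tokens` counter are all hypothetical stand-ins, and a real measurement would need the model's actual tokenizer. It only shows how the text written between tool calls is summed and compared against the 4096-token limit.

```python
from dataclasses import dataclass
from typing import Callable, List

LIMIT = 4096  # per-response limit stated in this report


@dataclass
class Segment:
    kind: str     # "text" = model-written output, "tool" = browsing/code-interpreter call
    content: str


def approx_tokens(text: str) -> int:
    # Assumption: crude stand-in that counts whitespace-separated words.
    # Swap in a real tokenizer for an actual measurement.
    return len(text.split())


def response_tokens(
    segments: List[Segment],
    count: Callable[[str], int] = approx_tokens,
    include_tools: bool = False,
) -> int:
    # Sum tokens across one response; tool calls are excluded by default,
    # matching the "excluding function calls" figure in the report.
    return sum(
        count(s.content) for s in segments if include_tools or s.kind == "text"
    )


def overshoot(total: int, limit: int = LIMIT) -> int:
    # Positive value means the response exceeded the limit.
    return total - limit


if __name__ == "__main__":
    print(overshoot(4230))  # 134 tokens over the limit
```

With the figures above, the response exceeds the limit by about 134 tokens, roughly 3%.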
Additional information:
Browser: Chrome
OS: Windows