I would suggest implementing per-thread settings for thinking effort. For reference, see this reply by Teknium (https://x.com/Teknium/status/2047813157679399193?s=20): "/model <name> in the gateway is session-scoped by default and keyed on the session key (platform:chat_type:chat_id), so a switch in one Discord channel/thread doesn't affect any other channel/thread. Stored in _session_model_overrides[session_key] in gateway/run.py, separate from config.yaml."
Shouldn't the /reasoning setting follow the same per-session logic as the /model setting?
Practical scenario: I want to use GPT 5.5, but thinking levels should vary between threads. Example: maintenance threads = medium reasoning, production threads = high or xhigh.
Currently, changing /reasoning in one thread changes the thinking level globally.
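A minimal sketch of the requested behavior, assuming /reasoning could reuse the same session key as /model. Only _session_model_overrides and the key format (platform:chat_type:chat_id) come from the quoted reply; _session_reasoning_overrides, make_session_key, set_reasoning, get_reasoning, the default level, and the list of valid levels are hypothetical names and values for illustration, not the actual code in gateway/run.py.

```python
from typing import Dict

# Assumed global default; in practice this would presumably come from config.yaml.
DEFAULT_REASONING = "medium"
# Assumed set of levels; the thread only mentions medium, high, and xhigh.
VALID_LEVELS = ("low", "medium", "high", "xhigh")

# Named in the quoted reply: per-session model overrides.
_session_model_overrides: Dict[str, str] = {}
# Hypothetical counterpart for /reasoning, keyed the same way.
_session_reasoning_overrides: Dict[str, str] = {}


def make_session_key(platform: str, chat_type: str, chat_id: str) -> str:
    """Build a key in the platform:chat_type:chat_id format from the reply."""
    return f"{platform}:{chat_type}:{chat_id}"


def set_reasoning(session_key: str, level: str) -> None:
    """Store a reasoning level for one session only."""
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown reasoning level: {level!r}")
    _session_reasoning_overrides[session_key] = level


def get_reasoning(session_key: str) -> str:
    """Return the session's override, falling back to the global default."""
    return _session_reasoning_overrides.get(session_key, DEFAULT_REASONING)


# Two Discord threads with independent reasoning levels.
maintenance = make_session_key("discord", "thread", "1001")
production = make_session_key("discord", "thread", "1002")
set_reasoning(production, "xhigh")
print(get_reasoning(maintenance))  # falls back to the default
print(get_reasoning(production))   # uses the per-thread override
```

With this shape, a /reasoning change in the production thread would leave the maintenance thread on the default, which mirrors how the reply describes /model.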
@5low_motion That is already how it operates.
From hermes:
Good news — that's already how it works. /model in the gateway is session-scoped by default and keyed on the session key (platform:chat_type:chat_id), so a switch in one Discord channel/thread doesn't affect any