I just updated to 2026.4.15 from an older build (which worked fine). Now every basic chat response loads a new model, filling my available VRAM/RAM and tanking performance. The LMS service is running fine and I do get a response back.
Two interesting log lines are popping up:
[agent/embedded] embedded run failover decision: runId=slug-gen-1776609671777 stage=assistant decision=surface_error reason=timeout from=lmstudio/qwen/Qwen3.5-27B profile=-
[lmstudio/stream] LM Studio inference preload failed for "qwen/Qwen3.5-27B" (1 consecutive failure, next preload attempt skipped for ~5s); continuing without preload: AbortError: This operation was aborted