The llama grep failure here is the latest example:
slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.998 (> 0.100 thold), f_keep = 0.999
slot launch_slot_: id 0 | task -1 | sampler chain: logits -> penalties -> dry -> ?top-n-sigma -> top-k -> ?typical -> top-p -> min-p -> ?xtc -> ?temp-ext -> dist
slot launch_slot_: id 0 | task 112 | processing task, is_child = 0
slot update_slots: id 0 | task 112 | new prompt, n_ctx_slot = 120064, n_keep = 0, task.n_tokens = 17332
slot update_slots: id 0 | task 112 | n_tokens = 17290, memory_seq_rm [17290, end)
slot update_slots: id 0 | task 112 | prompt processing progress, n_tokens = 17328, batch.n_tokens = 38, progress = 0.999769
slot update_slots: id 0 | task 112 | n_tokens = 17328, memory_seq_rm [17328, end)
slot init_sampler: id 0 | task 112 | init sampler, took 1.45 ms, tokens: text = 17332, total = 17332
slot update_slots: id 0 | task 112 | prompt processing done, n_tokens = 17332, batch.n_tokens = 4
srv log_server_r: done request: POST /v1/chat/completions 127.0.0.1 200
reasoning-budget: activated, budget=2147483647 tokens
srv stop: cancel task, id_task = 112
slot release: id 0 | task 112 | stop processing: n_tokens = 17553, truncated = 0
srv update_slots: all slots are idle
kens = 1024, progress = 0.357646
[50107] slot update_slots: id 0 | task 0 | n_tokens = 6144, memory_seq_rm [6144, end)
[50107] slot update_slots: id 0 | task 0 | prompt processing progress, n_tokens = 7168, batch.n_tokens = 1024, progress = 0.417254
[50107] slot update_slots: id 0 | task 0 | n_tokens = 7168, memory_seq_rm [7168, end)
[50107] slot create_check: id 0 | task 0 | created context checkpoint 2 of 32 (pos_min = 15104, pos_max = 16383, n_tokens = 16384, size = 531.266 MiB)
[50107] slot update_slots: id 0 | task 0 | n_tokens = 16919, memory_seq_rm [16919, end)
[50107] slot update_slots: id 0 | task 0 | prompt processing