I am an old solution architect, and to really understand what goes on around models, I built this application that lets you watch LLM chats and tool usage live, in token-by-token detail, with full reasoning exposed on reasoning models. It turned out to be a very useful tool for me when choosing the right model for a given task.
It is 100% local: your data, your LLMs, running on Ollama.
If you like it, just star the project on GitHub. That would be a nice way to say thank you to an old man!
LLMxRay on GitHub:
https://github.com/LogneBudo/llmxray
Docs and website:
https://lognebudo.github.io/llmxray/
LLMxRay: Local LLM Observatory. Full observability interface for Ollama: chat, streaming, reasoning, RAG, introspection, metrics.