#Should coding agents be headless? 🧵
This has been on my mind... what if the current architecture for coding agents was all wrong?
- We have "thick" monolithic clients that bundle 1) UX, 2) access to project context and 3) builtin tools that implement the core devloop
- The LLM endpoint is (usually, not always) configurable and decoupled, but provides "raw intelligence" and usually no tools
- Then you have third-party tools via MCP, which only works for tools that are loosely coupled to the core devloop
So the client is the agent. External tools and the model are satellites of this thick agent, swappable via MCP and OpenAI-style APIs, respectively.
But that doesn't feel right to me. Why should my agent logic be tied to the UX of a particular client? I kind of want that logic behind an API (ideally a standard one), with a choice of dumb clients in front.
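To make the split concrete, here is a rough Python sketch of agent logic sitting behind an API with a thin client in front. Everything in it is hypothetical and made up for illustration (`AgentEvent`, `HeadlessAgent`, `EchoAgent`, `cli_client`, the event kinds); it is not an existing standard or any product's API.

```python
from dataclasses import dataclass
from typing import Iterator, Protocol


@dataclass
class AgentEvent:
    """One unit of agent output that any client can render."""
    kind: str      # hypothetical kinds: "message", "tool_call", "done"
    payload: str


class HeadlessAgent(Protocol):
    """Hypothetical client-facing API: no UX concerns, only a stream of events."""
    def run(self, task: str) -> Iterator[AgentEvent]: ...


class EchoAgent:
    """Stand-in agent; a real one would call a model and tools here."""
    def run(self, task: str) -> Iterator[AgentEvent]:
        yield AgentEvent("message", f"planning: {task}")
        yield AgentEvent("tool_call", "read_file README.md")
        yield AgentEvent("done", "ok")


def cli_client(agent: HeadlessAgent, task: str) -> list[str]:
    """A 'dumb' client: it only formats events and holds no agent logic."""
    return [f"[{e.kind}] {e.payload}" for e in agent.run(task)]


if __name__ == "__main__":
    for line in cli_client(EchoAgent(), "fix bug"):
        print(line)
```

An IDE plugin or web UI would be another function like `cli_client` consuming the same event stream, so swapping the client would not touch the agent.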
The rising popularity of CLI coding agents is, I think, a step in that direction. You're still tied to a client, but at least it's not a full IDE.
I feel exactly the same, @thorny olive. That's why, when we were talking in the team meeting today, I kept thinking of Dagger as a platform that gives you all the building blocks to create coding agents, or any kind of agent, by following a Lego model. Want to swap the RAG part? Use a different module. Want to swap the file diff algorithm? Pick a module, etc.
Of course, I'd assume each company will try to bundle that thick agent as tightly as possible so it can capture and retain as many users as it can.
Yes, exactly. The question is: if the agent is headless and decoupled from clients, what is its API?
A coding agent API, you mean? I guess nobody has thought about the API yet, since it's currently bundled in these "thick" clients.
I'd assume each thick client implements it differently.
RAG as well?
Currently the agent carries some intelligence, like knowing which tool to pick up and use. I think a step in the direction of a headless agent would be to make the heads swappable, so you can seamlessly switch between agents that have different specialties. You decouple from the LLM client, but perhaps also unbundle that logic from the app. I’m working on something in this area. The idea is to make the primary agent configurable, chosen from a registry of known agents. The right agent can be picked based on hardware, compute resources, etc.
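A minimal sketch of what picking an agent from a registry could look like, in Python. This is not the project mentioned above. The names (`AgentSpec`, `REGISTRY`, `pick_agent`), the example agents, and the selection rule (the most demanding agent that still fits the machine) are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical registry entry describing one swappable agent."""
    name: str
    specialty: str
    min_memory_gb: int
    needs_gpu: bool


# Hypothetical registry contents, invented for this example.
REGISTRY = [
    AgentSpec("small-refactorer", "refactor", 4, False),
    AgentSpec("large-refactorer", "refactor", 32, True),
    AgentSpec("test-writer", "tests", 8, False),
]


def pick_agent(specialty: str, memory_gb: int, has_gpu: bool) -> Optional[AgentSpec]:
    """Return the most demanding agent that fits the given resources, or None."""
    candidates = [
        a for a in REGISTRY
        if a.specialty == specialty
        and a.min_memory_gb <= memory_gb
        and (has_gpu or not a.needs_gpu)
    ]
    # Assumed heuristic: a higher memory requirement stands in for "more capable".
    return max(candidates, key=lambda a: a.min_memory_gb, default=None)


if __name__ == "__main__":
    print(pick_agent("refactor", memory_gb=16, has_gpu=False))
```

On a 16 GB machine without a GPU this selects `small-refactorer`. With 64 GB and a GPU, the same call selects `large-refactorer`, and the caller's code does not change.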