#Evaluation AI Agent performance


brittle ore
#

Which AI agent and model combination works best for modern Angular code? I want to test this on my own codebase with a variety of agents, models, tasks, and prompt styles. I saw that Angular released web-codegen-scorer, but it is for non-agentic testing only.

I am looking for more info on how to automatically score the agent-generated code. Currently I'm thinking of recording videos with Playwright and watching them manually, as well as reviewing the code manually.
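To make "automatically score" concrete, here is a rough sketch of what I have in mind, not a finished harness. It assumes each agent run has already been reduced to a few objective signals (build, tests, lint). The `RunResult` shape, the `WEIGHTS` values, and the function names are all made up for illustration and do not come from web-codegen-scorer or any other tool.

```typescript
// Hypothetical summary of one agent run on one task; every field name is an assumption.
interface RunResult {
  agent: string;
  model: string;
  task: string;
  buildPassed: boolean;
  testsPassed: number;
  testsTotal: number;
  lintErrors: number;
}

// Illustrative weights only; they are not derived from any benchmark.
const WEIGHTS = { build: 0.4, tests: 0.4, lint: 0.2 };

// Returns a score in [0, 1] for a single run.
function scoreRun(r: RunResult): number {
  if (!r.buildPassed) {
    return 0; // code that does not build gets no credit
  }
  const testRatio = r.testsTotal > 0 ? r.testsPassed / r.testsTotal : 0;
  const lintScore = 1 / (1 + r.lintErrors); // 1 with no errors, shrinking as errors grow
  return WEIGHTS.build + WEIGHTS.tests * testRatio + WEIGHTS.lint * lintScore;
}

// Averages scores per "agent + model" combination across all tasks.
function averageByCombo(results: RunResult[]): Map<string, number> {
  const sums = new Map<string, { total: number; count: number }>();
  for (const r of results) {
    const key = `${r.agent} + ${r.model}`;
    const entry = sums.get(key) || { total: 0, count: 0 };
    entry.total += scoreRun(r);
    entry.count += 1;
    sums.set(key, entry);
  }
  const averages = new Map<string, number>();
  sums.forEach((value, key) => averages.set(key, value.total / value.count));
  return averages;
}
```

Signals like these only cover what can be checked mechanically. Whether the UI actually looks and behaves right would still need the Playwright recordings or a manual review.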

dim tusk
#

I really like Claude Code with Opus 4.6, but I haven't done any measurements.

mortal pasture