#Questions about Neo evals

slender mirage Feb 27, 2026, 5:23 PM

Hey guys. Was reading https://projectdiscovery.io/blog/ai-code-review-vs-neo . Really interesting results! Any chance you could share the source for the "VaultBank" example app? The blog post mentions that the prompts will be shared, but it's kind of hard to run evals and compare our own results with the ones mentioned on the blog without the source 🙂 Cheers

ProjectDiscovery

AI code review has come a long way, but it can’t catch everything...

AI code review can reason about intent, but real incidents often stem from business logic flaws that only show up in runtime. Our benchmark reveals where code-only review falls short.

tender meadow Mar 11, 2026, 5:10 PM

@slender mirage we published the source code and results in Part 2 of the blog, which went out today: https://projectdiscovery.io/blog/inside-the-benchmark-pp-architectures-finding-walkthroughs-and-what-each-scanner-actually-caught

ProjectDiscovery

Inside the Benchmark: App Architectures, Walkthroughs of Findings, ...

This is Part 2 of our vibe coding security benchmark study. In Part 1, we compared how LLM-based security tools like ProjectDiscovery's Neo and Claude Code performed against traditional SAST and DAST scanners on AI-generated code. We found that LLM-based tools like Neo and Claude Code detected many high-value findings that traditional scanners ...