Can LLMs find 0day? Adventures in cybersecurity evals

Yoni Rozenshein's BlueHat IL 2024 talk covers our philosophy for evaluating the dangerous cyber capabilities of AI, how we actually do it (let's make an LLM play CTF!), and who cares about it (governments and frontier AI labs).

Watch the full presentation: https://www.youtube.com/watch?v=05-zL4f9V-Y

To cite this article, please credit Pattern Labs with a link to this page, or use the following BibTeX citation:
@misc{pl-bluehat2024,
  title={Can LLMs find 0day? Adventures in cybersecurity evals},
  author={Pattern Labs},
  year={2024},
  howpublished={\url{https://patternlabs.co/blog/bluehat-2024}},
}