Codex (and GPT-4) can’t beat humans on smart contract audits

submitted by
Style Pass
2023-03-22 11:30:05

Is artificial intelligence (AI) capable of powering software security audits? Over the last four months, we piloted a project called Toucan to find out. Toucan was intended to integrate OpenAI’s Codex into our Solidity auditing workflow. This experiment went far beyond writing “where is the bug?” in a prompt and expecting sound and complete results.

Our multi-functional team of auditors, developers, and machine learning (ML) experts put serious work into prompt engineering. We developed a custom prompting framework that worked around some frustrations and limitations of current large language model (LLM) tooling: coping with incorrect and inconsistent results, handling rate limits, and building complex, templated chains of prompts. At every step, we evaluated how effective Toucan was and whether it would make our auditors more productive or slow them down with false positives.
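To make those three concerns concrete, here is a minimal Python sketch of what such a prompting layer might look like. It is not Toucan's actual code, which the post does not show. Every name in it (`RateLimitError`, `call_with_backoff`, `majority_answer`, `run_chain`) is hypothetical, and `model` stands in for any callable that takes a prompt string and returns a string.

```python
import time
from collections import Counter
from string import Template


class RateLimitError(Exception):
    """Hypothetical error a model client might raise when throttled."""


def call_with_backoff(model, prompt, retries=3, base_delay=1.0):
    """Call model(prompt), retrying with exponential backoff on rate limits."""
    for attempt in range(retries):
        try:
            return model(prompt)
        except RateLimitError:
            if attempt == retries - 1:
                raise  # out of retries, surface the error
            time.sleep(base_delay * (2 ** attempt))


def majority_answer(model, prompt, samples=3):
    """Dampen inconsistent output by sampling several times and voting."""
    answers = [call_with_backoff(model, prompt) for _ in range(samples)]
    return Counter(answers).most_common(1)[0][0]


def run_chain(model, templates, variables):
    """Run templated prompts in order; each step can use $previous."""
    context = dict(variables, previous="")
    for template in templates:
        prompt = Template(template).substitute(context)
        context["previous"] = majority_answer(model, prompt)
    return context["previous"]
```

A chain would then be a list of templates such as `["List the external functions in $contract", "Review these functions for reentrancy: $previous"]`, passed to `run_chain` together with `{"contract": source_code}`. Majority voting is only one simple way to handle inconsistent answers, chosen here for brevity.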

Whoever successfully creates an LLM integration experience that developers love will create an incredible moat for their platform.
