AI-enabled code assistants (like GitHub’s Copilot, Continue.dev, and Tabby) are making software development faster and more productive. Unfortunately, these tools are often bad at Solidity. So we decided to improve them!
We also evaluated popular code models at different quantization levels to determine which are best at Solidity (as of August 2024), and compared them to ChatGPT and Claude. Our takeaway: local models compare favorably to the big commercial offerings, and even surpass them on certain completion styles.
However, while these models are useful, especially for prototyping, we'd still caution Solidity developers against relying too heavily on AI assistants. We have reviewed contracts written with AI assistance that contained multiple AI-induced errors: the AI emitted code that worked well for known patterns but performed poorly on the actual, customized scenario it needed to handle. This is why we recommend thorough unit tests, automated testing tools like Slither, Echidna, or Medusa, and, of course, a paid security audit from Trail of Bits.
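To make that failure mode concrete, here is a hypothetical sketch (the `Vault` contract and the fee-on-transfer scenario are our own illustration, not code from an audited project): a completion that is idiomatic for standard ERC-20 tokens but silently wrong for a customized one.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Minimal ERC-20 interface, inlined to keep the example self-contained.
interface IERC20 {
    function transferFrom(address from, address to, uint256 amount) external returns (bool);
    function balanceOf(address account) external view returns (uint256);
}

contract Vault {
    IERC20 public immutable token;
    mapping(address => uint256) public balances;

    constructor(IERC20 _token) {
        token = _token;
    }

    // The completion a model trained on standard ERC-20 code tends to
    // produce: credit `amount` without checking how many tokens actually
    // arrived. If `token` takes a fee on transfer, the vault's internal
    // accounting drifts above its real balance.
    function deposit(uint256 amount) external {
        require(token.transferFrom(msg.sender, address(this), amount), "transfer failed");
        balances[msg.sender] += amount; // wrong for non-standard tokens
    }

    // Safer variant: measure the balance delta instead of trusting `amount`.
    function depositSafe(uint256 amount) external {
        uint256 before = token.balanceOf(address(this));
        require(token.transferFrom(msg.sender, address(this), amount), "transfer failed");
        balances[msg.sender] += token.balanceOf(address(this)) - before;
    }
}
```

An Echidna or Medusa property asserting that the sum of recorded balances never exceeds `token.balanceOf(address(this))` would catch this divergence quickly, which is exactly why we pair AI-assisted code with automated testing.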
At Trail of Bits, we both audit and write a fair bit of Solidity, and we are quick to adopt any productivity-enhancing tools we can find. Once AI assistants added support for local code models, we immediately wanted to evaluate how well they work. Sadly, Solidity language support was lacking at both the tool and model levels, so we made some pull requests.