Can AI be superhuman? Flaws in top gaming bot cast doubt

Talk of superhuman artificial intelligence (AI) is heating up. But research has revealed weaknesses in one of the most successful AI systems — a bot that plays the board game Go and can beat the world’s best human players — showing that such superiority can be fragile. The study raises questions about whether more general AI systems will suffer from vulnerabilities that could compromise their safety and reliability, and even their claim to be ‘superhuman’.

“The paper leaves a significant question mark on how to achieve the ambitious goal of building robust real-world AI agents that people can trust,” says Huan Zhang, a computer scientist at the University of Illinois Urbana-Champaign. Stephen Casper, a computer scientist at the Massachusetts Institute of Technology in Cambridge, adds: “It provides some of the strongest evidence to date that making advanced models robustly behave as desired is hard.”

The analysis, which was posted online as a preprint in June (ref. 1) and has not been peer reviewed, makes use of what are called adversarial attacks: feeding AI systems inputs that are designed to prompt the systems to make mistakes, either for research or for nefarious purposes. For example, certain prompts can ‘jailbreak’ chatbots, making them give out harmful information that they were trained to suppress.

