Beating ARC the hard way

submitted by

Style Pass

2024-12-23 06:30:04

ARC is a benchmark developed to test out-of-distribution reasoning and common sense in general solvers. It is specifically designed to be:

The designers of ARC achieved the above in a creative way: by building visual puzzles in which the participant must find an algorithm that explains the symmetries seen across several demonstrations, and must then apply that algorithm to a final input. This sounds complicated, but in practice it is quite intuitive – most children can complete ARC questions.
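To make that task structure concrete, here is a minimal sketch in Python. The toy puzzle, the `flip_horizontal` rule, and the `explains_all` helper are all hypothetical illustrations rather than anything taken from the ARC dataset. The idea is only that a candidate rule is accepted if it reproduces every demonstration output, and is then applied to the final input.

```python
from typing import Callable, List, Optional, Tuple

Grid = List[List[int]]  # each cell is a small integer standing for a color

# Hypothetical toy task: each demonstration output is the input mirrored left-to-right.
demonstrations: List[Tuple[Grid, Grid]] = [
    ([[1, 0, 0],
      [0, 2, 0]],
     [[0, 0, 1],
      [0, 2, 0]]),
    ([[3, 3, 0],
      [0, 0, 4]],
     [[0, 3, 3],
      [4, 0, 0]]),
]

# The final input, for which no output is shown to the participant.
test_input: Grid = [[5, 0, 0],
                    [0, 0, 6]]


def flip_horizontal(grid: Grid) -> Grid:
    """Candidate rule: mirror every row left-to-right."""
    return [list(reversed(row)) for row in grid]


def explains_all(rule: Callable[[Grid], Grid],
                 demos: List[Tuple[Grid, Grid]]) -> bool:
    """A rule 'explains' the task only if it reproduces every demonstration."""
    return all(rule(inp) == out for inp, out in demos)


# Apply the rule to the final input only if it is consistent with all demonstrations.
prediction: Optional[Grid] = (
    flip_horizontal(test_input)
    if explains_all(flip_horizontal, demonstrations)
    else None
)
print(prediction)
```

The hard part of a real ARC question is finding the rule in the first place; this sketch assumes the rule is already in hand and only shows how it is checked and applied.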

LLMs are being pitched as general solvers, so lately we have been trying them out on this challenge. However, to make ARC amenable to being solved by a pure language model, you must remove the visual “clues” to the problem.
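As a rough illustration of what removing the visual clues means, the sketch below flattens a small grid into plain text. It assumes a simple encoding (one row per line, cells as space-separated digits) and a hypothetical `grid_to_text` helper; the serialization actually used in these experiments may differ.

```python
from typing import List

Grid = List[List[int]]  # each cell is a small integer standing for a color


def grid_to_text(grid: Grid) -> str:
    """Serialize a grid as text: one row per line, cells separated by spaces.

    This encoding is an assumption made for illustration only.
    """
    return "\n".join(" ".join(str(cell) for cell in row) for row in grid)


# Hypothetical 2x3 grid; a human would see colored squares, the model sees digits.
grid: Grid = [[0, 0, 1],
              [0, 2, 0]]
text = grid_to_text(grid)
print(text)
```

Once the grid is a string of digits, spatial relationships such as adjacency and symmetry are no longer visible at a glance and have to be recovered from character positions.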

This makes the problem considerably harder. While it’s true that clever programmers presented with the above text problem would probably figure it out given enough time, I do not think that most humans could solve it. Here’s an example set of steps you could pursue if you wanted to tackle such a problem from a command-line interface: