I encountered the Dog Bunny Puzzle a few days ago and wondered if GPT-3 would be able to solve it. Sadly I wasn’t able to get it to work, although the following prompt got kind of close:
Amusingly, it adds some details to the prompt constraints (“Note that you do not have to use all of the edges, but you can only use each edge once”) but the actual moves are useless. In my experiments, removing the “special rules” section helped the model perform more accurately, but my sense was that the complexity of the Dog Bunny Puzzle was overwhelming the model. (Here is a solution to the puzzle in Python, in case you’re curious!)
I wondered, though: how might GPT-3 perform on a much simpler problem? Given a partially connected graph, find a path between two nodes, or determine that no path exists.
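To make the task concrete, here is a minimal sketch of a reference solver and a path checker in Python. This is not the code I used for the experiment; it assumes the graph is given as a list of undirected edges, and the names `find_path` and `is_valid_path` are made up for illustration.

```python
from collections import deque


def find_path(edges, start, goal):
    """Return a shortest path from start to goal as a list of nodes, or None."""
    # Assumption: edges are undirected, so each one is recorded in both directions.
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    # Breadth-first search, remembering how each node was reached.
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the predecessors to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None  # the goal is not reachable from the start


def is_valid_path(edges, path, start, goal):
    """Check that a proposed path starts at start, ends at goal, and only uses real edges."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    edge_set = {frozenset(edge) for edge in edges}
    return all(frozenset((a, b)) in edge_set for a, b in zip(path, path[1:]))


edges = [(0, 1), (1, 2), (2, 3), (4, 5)]
print(find_path(edges, 0, 3))
print(find_path(edges, 0, 5))
```

Breadth-first search returns a path with the fewest nodes, which gives a baseline to compare against. The checker accepts any valid path, not only the shortest one, so an answer can be correct without being optimal.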
I wrote some code to generate graphs, feed them to GPT-3, and parse and grade its results, then ran it on 1000 random graphs. The graphs had between 3 and 14 nodes and up to 25 edges, and the optimal path through each graph ranged from 2 to 7 nodes. How did GPT-3 do? The model found a valid path (or correctly reported that no path exists) a little over 60% of the time: