We used a process that we’re calling The Zero-Shot. We gave a description of the image and some context of our goal to a computer model to generate

Help a Computer Win the New Yorker Cartoon Caption Contest

submited by

Style Pass

2021-05-27 21:00:09

We used a process that we’re calling The Zero-Shot. We gave a description of the image and some context of our goal to a computer model to generate the results.

To produce the captions this week we “prompted” the GPT-3 model, davinci-instruct-beta, through the OpenAI API Playground. This model is specially trained to “follow instructions,” so we gave it the instruction “Write a funny caption for a New Yorker cartoon” plus a description we wrote of the cartoon illustrated by Carolita Johnson. The model parameters were set to their defaults.

In machine learning zero-shot learning refers to a paradigm where a model solely relies on its general knowledge of the world to solve a problem at test time. Say we play you a song by Olivia Rodrigo but you’ve never heard her music before. Can you guess that she’s the singer? What if you’ve heard a lot of Taylor Swift songs and have read a review of “drivers license” that talks about Olivia Rodrigo and her Swift-ian inspirations?

GPT-3 has never seen this particular description of a New Yorker cartoon. But it has read a lot of text on the internet that has described New Yorker cartoons and their captions. It’s also read a lot of New Yorker articles and has some understanding of the magazine’s readership and what they find interesting or funny. It has generally seen a lot of comics with associated text so presumably understands what is meant by a “cartoon caption.” Here is the full prompt: