Entity extraction using OpenAI structured outputs mode

submited by
Style Pass
2024-11-06 11:00:34

The relatively new structured outputs mode from the OpenAI gpt-4o model makes it easy for us to define an object schema and get a response from the LLM that conforms to that schema.

The code first defines the CalendarEvent class, an instance of a Pydantic model. Then it sends a request to the GPT model specifying a response_format of CalendarEvent. The parsed output will be a dictionary containing a name, date, and participants.

We can even go a step farther and turn the parsed output into a CalendarEvent instance, using the Pydantic model_validate method:

With this structured outputs capability, it's easier than ever to use GPT models for "entity extraction" tasks: give it some data, tell it what sorts of entities to extract from that data, and constrain it as needed.

Let's see an example of a way that I actually used structured outputs, to help me summarize the submissions that we got to a recent hackathon. I can feed the README of a repository to the GPT model and ask for it to extract key details like project title and technologies used.

Leave a Comment