Thought Eddies

Structured Output, Functions and Prompting

Dan Corin August 12, 2024

I've been prompting models to output JSON for about as long as I've been using models. Since text-davinci-003, getting valid JSON out of OpenAI's models hasn't seemed like that big of a challenge, but maybe I wasn't seeing the long tail of misbehavior because I hadn't massively scaled up a use case. As adoption has picked up, OpenAI has released features to make it easier to get JSON output from a model. Here are three examples, using structured outputs, function calling, and plain prompting, respectively.

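[Script not preserved. The only surviving fragment, json", "").replace(", appears to come from the prompting example, which strips Markdown code fences from the model's reply before parsing it as JSON.]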
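With the prompting approach, the model can wrap its JSON in a Markdown code fence, and that fence has to come off before the reply will parse. Here is a minimal sketch of that cleanup step. It is not the script from this post: the helper name `parse_fenced_json` and the sample reply are made up for illustration.

```python
import json

# A Markdown code fence, built this way so this example does not nest fences.
FENCE = "`" * 3


def parse_fenced_json(reply: str) -> dict:
    """Strip an optional Markdown code fence from a model reply and parse the JSON."""
    # Drop the opening fence (with its "json" language tag) and the closing fence.
    text = reply.replace(FENCE + "json", "").replace(FENCE, "")
    return json.loads(text.strip())


# Hypothetical reply, shaped like what a chat model may return when asked for JSON.
sample_reply = FENCE + 'json\n{"name": "Ada", "age": 36}\n' + FENCE

print(parse_fenced_json(sample_reply))
```

Blind replacement like this would also delete a fence that appears inside a string value, which is one of the long-tail failures that structured outputs and function calling are meant to avoid.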

Running the script prints each approach's extracted object (truncated) along with how long the call took.

Don't read too much into the time durations. After running the script a few times, all approaches seem to take 4-6 seconds, with none clearly faster than the others. The quality of the extraction seems to be around the same for all approaches for this use case as well.

Function calls are pitched as a solution to allow you to "connect models like gpt-4 to external tools and systems"^1. Structured outputs are supposed to be for "[g]enerating structured data from unstructured inputs"^2. The latter is an improvement on "JSON mode", apparently introduced in 2023, which I never tried. The bottom line is that we need these models to respond with structure, but we don't want imposing that structure to detract from the model's performance.
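One way to see the difference between the two features is where the schema goes in the request and where the JSON comes back in the response. The sketch below builds both payloads from one schema and reads the result out of hand-written sample messages, so it makes no API calls. The `person_schema` fields, the `record_person` tool name, and the two reader helpers are invented for illustration, and the dict shapes follow the Chat Completions request and response JSON as I understand it.

```python
import json

# Hypothetical schema for illustration; not the one used in the post's script.
person_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
    "additionalProperties": False,
}

# Function calling: the schema describes the arguments of a "tool".
tools = [
    {
        "type": "function",
        "function": {
            "name": "record_person",  # hypothetical tool name
            "description": "Record a person extracted from text.",
            "parameters": person_schema,
        },
    }
]

# Structured outputs: the schema constrains the assistant message itself.
response_format = {
    "type": "json_schema",
    "json_schema": {"name": "person", "schema": person_schema, "strict": True},
}


def read_function_call(message: dict) -> dict:
    """The JSON arrives as a string in the first tool call's arguments."""
    return json.loads(message["tool_calls"][0]["function"]["arguments"])


def read_structured_output(message: dict) -> dict:
    """The JSON arrives as a string in the message content."""
    return json.loads(message["content"])


# Hand-written sample messages (assumed shapes), standing in for real responses.
function_call_message = {
    "content": None,
    "tool_calls": [
        {
            "type": "function",
            "function": {
                "name": "record_person",
                "arguments": '{"name": "Ada", "age": 36}',
            },
        }
    ],
}
structured_message = {"content": '{"name": "Ada", "age": 36}'}

print(read_function_call(function_call_message))
print(read_structured_output(structured_message))
```

Either way the caller ends up with the same dictionary; what changes is whether the model is nominally "calling a tool" or just answering in a constrained format.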

I need to do some testing to see if the quality of the model's response varies depending on which approach is used.
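A rough sketch of what that testing could look like: score each approach by the fraction of expected fields it gets exactly right over a small labeled set. Everything here is hypothetical, including the `field_accuracy` helper, the test cases, and the stub extractors that stand in for real model calls.

```python
from typing import Callable

# An extractor takes input text and returns the extracted fields as a dict.
Extractor = Callable[[str], dict]


def field_accuracy(extract: Extractor, cases: list[tuple[str, dict]]) -> float:
    """Fraction of expected fields that the extractor reproduces exactly."""
    total = 0
    correct = 0
    for text, expected in cases:
        try:
            got = extract(text)
        except Exception:
            got = {}  # a failed call or invalid JSON counts as every field wrong
        for key, value in expected.items():
            total += 1
            if got.get(key) == value:
                correct += 1
    return correct / total if total else 0.0


# Hypothetical labeled cases: (input text, expected extraction).
cases = [
    ("Ada is 36.", {"name": "Ada", "age": 36}),
    ("Bob is 41.", {"name": "Bob", "age": 41}),
]


def stub_always_ada(text: str) -> dict:
    """Stand-in for a model call that returns the same answer every time."""
    return {"name": "Ada", "age": 36}


def stub_always_fails(text: str) -> dict:
    """Stand-in for a model call whose reply cannot be parsed."""
    raise ValueError("invalid JSON")


for name, extractor in [("ada", stub_always_ada), ("fails", stub_always_fails)]:
    print(name, field_accuracy(extractor, cases))
```

Swapping the stubs for the three real approaches, and running each case several times, would give a first read on whether the imposed structure costs any extraction quality.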
