Raw Record Source

{
  "path": "/posts/2023/openai-function-calling",
  "site": "at://did:plc:mracrip6qu3vw46nbewg44sm/site.standard.publication/self",
  "tags": [
    "language_models",
    "openai",
    "structured_data",
    "function_calling"
  ],
  "$type": "site.standard.document",
  "title": "OpenAI Function Calling",
  "updatedAt": "2023-06-18T23:24:00.000Z",
  "publishedAt": "2023-06-18T23:24:00.000Z",
  "textContent": "This past week, OpenAI added function calling to their SDK.\nThis addition is exciting because it makes schema a first-class citizen in calls to OpenAI chat models.\nAs the example code and naming suggest, you can define a list of functions along with a schema for the parameters required to call them.\nThe model determines whether a function needs to be invoked in the context of the completion, then returns JSON adhering to the schema defined for that function.\nIf you've read anything else I've written, you probably know what I'm going to try next: let's use a function to extract structured data from an unstructured input.\n\nExtract a recipe as structured data\n\nI found this recipe and I want to try it out.\nI want to parse the content on the page and extract the recipe in a form that I could easily render on a personal recipe site.\nI quickly checked the page and it looks like most of the content is nested within an HTML element with the class \"content\".\nHere is some Python code to extract all the text from the HTML, eliminating the markup:\n\nThis code outputs a big block of text, a lot of which isn't ingredients or instructions for the recipe.\n\nBefore we get into calling the language model, let's write a schema for the data we'd like to extract from the page's content.\nWe'll use pydantic because it can easily be converted to a JSON schema.\n\nNothing too surprising so far.\nNow for the interesting part.\nLet's wire up a call to OpenAI that uses functions and our Recipe schema to structure the response:\n\nRewriting with a bit of refactoring:\n\nHere is where I started to run into problems.\nWhen running the script, I get the following error:\n\nInspecting the JSON output of the script, we see the model isn't returning valid JSON:\n\nWe see that \"quantity\": 3/4 isn't valid JSON.\nWe can try to steer the model by adding a description to the pydantic field:\n\nThis modifies the JSON schema in the following way:\n\nUnfortunately, this doesn't resolve the invalid JSON issue.\nHowever, switching from gpt-3.5-turbo-16k to gpt-4-0613 (and removing the Field description) yields JSON that adheres to the input schema.\nStill, GPT-4 models are slower and more expensive than 3.5 models, so there is motivation to get this working with the latter.\n\nTaking an approach I've tried previously, it seems like we can get more reliable results with gpt-3.5-turbo-16k.\n\nTakeaways\n\nOn one hand, it's great to see OpenAI training models to better integrate with emerging language model use cases like function invocation and schema extraction.\nOn the other, OpenAI acknowledges in their documentation that this approach doesn't always work:\n\n> the model may generate invalid JSON or hallucinate parameters\n\nPrevious techniques I've explored for schema extraction seem to produce more consistent results, even with less advanced models.",
  "canonicalUrl": "https://www.danielcorin.com/posts/2023/openai-function-calling"
}