Using Marvin for Structured Data Extraction
I've been following the "AI engineering framework" marvin for several months now. In addition to openai_function_call, it's currently one of my favorite abstractions built on top of a language model. The docs are quite good, but as a quick demo, I've ported over a simplified version of an example from an earlier post, this time using marvin.
The result:
The code is clean and the result is good quality. The abstraction allows me to almost entirely avoid dealing with code that calls the language model. I get to think in data structures and code, and the language model's response is woven into the software using the primitives I define. However, the response isn't exactly how I want it. I don't like that additional suffixes are being included in some of the unit values. For example, "unit": "cup unsalted". The following modification to the Ingredient class helps improve this:
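A minimal sketch of what a modification along these lines might look like, assuming the fix is a field description on unit. The name and quantity fields and the description wording are hypothetical, and whether Marvin passes the description through to the model is also an assumption here. What pydantic does guarantee is that the description ends up in the model's JSON schema:

```python
from pydantic import BaseModel, Field


class Ingredient(BaseModel):
    # `name` and `quantity` are assumed fields for illustration;
    # only `unit` and `details` are named in the post.
    name: str
    quantity: float
    # The description text is hypothetical. It is meant to steer the model
    # away from values like "cup unsalted".
    unit: str = Field(
        ...,
        description="Unit of measure only, e.g. 'cup' or 'tbsp', with no other words",
    )


# The schema method is named differently in pydantic v1 and v2.
if hasattr(Ingredient, "model_json_schema"):
    schema = Ingredient.model_json_schema()  # pydantic v2
else:
    schema = Ingredient.schema()  # pydantic v1

print(schema["properties"]["unit"]["description"])
```

The description does not validate anything on its own. It only changes the schema text, so any improvement depends on the model reading and following it.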
New output:
This mostly looks good. My only remaining complaint is that if no details are extracted, the field is still included as an empty string.
I tried a few different modifications to the Ingredient class to eliminate this, but none of them worked: the output still included "details": "" for some ingredients.
It's hard to tell without actually reading the prompt and response verbatim what is going on here. Inspecting pydantic's behavior for a null value, we see details show up as None rather than an empty string:
The outputted JSON now contains null for the field:
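A small sketch of that check, assuming an Ingredient model whose details field is declared as Optional[str] with a default of None. The other field names and the sample values are assumptions for illustration:

```python
import json
from typing import Optional

from pydantic import BaseModel


class Ingredient(BaseModel):
    # `name` and `quantity` are assumed fields for illustration.
    name: str
    quantity: float
    unit: str
    details: Optional[str] = None


# Construct an ingredient without passing `details` at all.
ingredient = Ingredient(name="butter", quantity=0.5, unit="cup")
print(repr(ingredient.details))

# Convert to a plain dict; the method name differs between pydantic v1 and v2.
if hasattr(ingredient, "model_dump"):
    as_dict = ingredient.model_dump()  # pydantic v2
else:
    as_dict = ingredient.dict()  # pydantic v1

# json.dumps serializes Python's None as JSON null.
as_json = json.dumps(as_dict)
print(as_json)
```

So pydantic itself does not turn a missing value into an empty string; the default stays None and serializes to null.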
I have to assume the language model is outputting the empty string ("") rather than null or omitting the field. As a final test, I ran the code again using gpt-4 and the last definition for details above.
GPT-4 is slower and more expensive, and it still does not do what I want. This small issue isn't difficult to correct in code, but it provides a bit of signal into how well the model follows instructions with this approach to prompting, which is a function of both the model and the prompt itself.
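As a rough illustration of correcting this in code, here is one possible post-processing step, assuming the extracted ingredients have already been converted to plain dictionaries. The helper name drop_empty_details and the sample data are hypothetical:

```python
from typing import Any, Dict, List


def drop_empty_details(
    ingredients: List[Dict[str, Any]], key: str = "details"
) -> List[Dict[str, Any]]:
    """Return copies of the ingredient dicts with empty-string `key` removed."""
    cleaned = []
    for item in ingredients:
        if item.get(key) == "":
            # Build a new dict without the empty field; the input is not mutated.
            item = {k: v for k, v in item.items() if k != key}
        cleaned.append(item)
    return cleaned


# Sample data made up for illustration, shaped like the output described above.
extracted = [
    {"name": "butter", "quantity": 0.5, "unit": "cup", "details": "unsalted"},
    {"name": "sugar", "quantity": 1.0, "unit": "cup", "details": ""},
]

cleaned = drop_empty_details(extracted)
print(cleaned)
```

This only hides the symptom after the fact. It says nothing about why the model emits the empty string in the first place.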