Figuring out how to use LLMs in production

Dan Corin May 8, 2023
The most popular language model use cases I've seen around have been:

- chatbots
- agents
- chat your X use cases

These use cases are quite cool, but they often stand alone, separate from existing products or added on as an isolated feature.

Expanding production use cases for language models

I've been thinking about what it could look like to naturally embed a call to a language model in code, so that a production application can flexibly make use of its capabilities. In this pursuit, I've been hooked by the idea that a declarative, unambiguous data schema can serve as the bridge between language model capabilities and applications. Generally, schemas provide the contract for the data that will be sent into and expected out of a procedure. With this approach, we're treating the language model as a sort of magic RPC or API call.

The request schema provides detailed context about the data we are sending into the language model. The response schema becomes an instruction set for the language model to interpret and fill in, given the request data and accompanying schema as a kind of explanation. Finally, the response schema also serves as a validation layer in the application. If the language model returns data that fails to comply with the response schema, a validation error will prevent that bad data from making it any further into the system. Here's an example of what failed validations might look like:

To phrase the approach a bit differently, we're aiming to use the request and response data schemas themselves (the literal Python code) as the majority of the prompt for a language model, to see if this approach can be effective for application-building with language models. The main benefit of this approach is that less content is needed to describe the input to the model or instruct it how to respond to a prompt.
The input description, the valid response structure and the "operations" we want the model to perform are encoded in the request and response schemas themselves, described by the schema definition and any comments we choose to add. We no longer need to rely on prose alone or on example responses to get the model to respond in a certain way.

A possible implementation

Re-using some bits from a previous post, we have the following block of text and response schema:

> Last weekend, I visited my friend who lives in a charming little house on Oak Avenue. The address was 3578 Oak Avenue, and it was easy to find with the help of my GPS. While typing in the address, I discovered there was more than one 3578 Oak Ave, but once I entered the right zip code, 90011, I found it. The house was surrounded by trees and had a beautiful garden in the front. Inside, my friend had decorated the place with vintage furniture and colorful paintings. We spent the day catching up and enjoying a cup of coffee in the cozy living room. It was a lovely visit and my first time in Los Angeles, and I can't wait to go back and see my friend's new garden in full bloom.

Since we're going to send request and response schemas to the language model, let's define the request schema now:

Now, let's construct a generic prompt parameterized on the schemas. We include the schema definitions exactly as their objects are defined in code. Our data input will be a JSON object in adherence with the request schema.

When we send this prompt to gpt-3.5-turbo, we get the expected output result of

With this approach, we can hypothetically substitute any request and response schemas as prompt content to drive our use case. A Python API for constructing a call like this could look like this:

`StructuredLLM(Address)` initializes a class with the target schema. `.run(JournalEntry(content="..."))` passes in an instance of the request schema. The prompt construction, object extraction and validation would all live inside `StructuredLLM`.
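
To make the proposed API a little more concrete, here is a minimal, dependency-free sketch of what could live inside `StructuredLLM`. It is not the implementation from this post. The schemas are plain dataclasses with hypothetical field names, `render_schema` is a helper invented here that approximates the schema source instead of sending the literal class definition, and the `complete` argument (a function from prompt text to model reply) is an assumption added so the example runs without a model client.

```python
import json
from dataclasses import asdict, dataclass, fields
from typing import Callable, Generic, Type, TypeVar

T = TypeVar("T")


@dataclass
class JournalEntry:
    # Request schema (field name taken from the post's usage example).
    content: str


@dataclass
class Address:
    # Response schema; these field names are hypothetical.
    street_address: str
    zip_code: str


def render_schema(schema: type) -> str:
    """Render a dataclass as a short, code-like schema description."""
    lines = [f"class {schema.__name__}:"]
    for f in fields(schema):
        type_name = getattr(f.type, "__name__", str(f.type))
        lines.append(f"    {f.name}: {type_name}")
    return "\n".join(lines)


class StructuredLLM(Generic[T]):
    def __init__(self, response_schema: Type[T], complete: Callable[[str], str]):
        self.response_schema = response_schema
        self.complete = complete  # assumed hook: prompt text in, reply text out

    def build_prompt(self, request) -> str:
        # Both schemas plus the request data, serialized as JSON.
        return (
            "Request schema:\n"
            f"{render_schema(type(request))}\n\n"
            "Response schema:\n"
            f"{render_schema(self.response_schema)}\n\n"
            "Reply with only a JSON object that satisfies the response schema.\n\n"
            f"Request data:\n{json.dumps(asdict(request))}"
        )

    def run(self, request) -> T:
        raw = self.complete(self.build_prompt(request))
        data = json.loads(raw)  # invalid JSON raises a ValueError subclass
        if not isinstance(data, dict):
            raise ValueError("expected a JSON object")
        expected = {f.name: f.type for f in fields(self.response_schema)}
        if set(data) != set(expected):
            raise ValueError(f"expected fields {sorted(expected)}, got {sorted(data)}")
        # Only plain types such as str or int are checked in this sketch.
        for name, expected_type in expected.items():
            if not isinstance(data[name], expected_type):
                raise ValueError(f"field {name!r} is not a {expected_type.__name__}")
        return self.response_schema(**data)


def fake_complete(prompt: str) -> str:
    # Stand-in for a real model call; always returns a canned reply.
    return '{"street_address": "3578 Oak Avenue", "zip_code": "90011"}'


address = StructuredLLM(Address, fake_complete).run(JournalEntry(content="..."))
print(address)
```

Passing the model call in as a function keeps the prompt construction and validation testable on their own; a real client could be wrapped in a function with the same shape.
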
With this API, we could also easily do things like classify text sentiment or generate tags without needing to write any prompt code. For example:

Running the prompt again, substituting EntryMetadata for Address, yields:

With the proposed Python API, to accomplish this in code, we would run

As a final example, let's add an enumeration, identifying the location described in the entry from a predefined list.

Result:

Note that the tags do change between runs, which is expected behavior for a language model. If we wanted more consistent tags, we should provide an enumeration.

The above examples are relatively simple, but injecting object schemas from code into prompts seems to have potential for instructing language models and integrating them with production systems. I plan to continue exploring more diverse use cases and documenting my findings.
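
As a rough illustration of the enumeration idea and of the response schema acting as a validation layer, the sketch below uses only the standard library. The `Location` values, the `EntryMetadata` fields and the `parse_entry_metadata` helper are all assumptions made for this example, not the definitions used in the post.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import List


class Location(str, Enum):
    # Hypothetical predefined list; the post's actual values are not shown.
    LOS_ANGELES = "Los Angeles"
    NEW_YORK = "New York"
    OTHER = "Other"


@dataclass
class EntryMetadata:
    # Hypothetical response schema with free-form tags and a constrained location.
    tags: List[str]
    location: Location


def parse_entry_metadata(raw: str) -> EntryMetadata:
    """Validate a model reply against EntryMetadata, raising ValueError on bad data."""
    data = json.loads(raw)  # invalid JSON raises a ValueError subclass
    if not isinstance(data, dict) or "tags" not in data or "location" not in data:
        raise ValueError("reply must be an object with 'tags' and 'location'")
    tags = data["tags"]
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        raise ValueError("'tags' must be a list of strings")
    # Location(...) raises ValueError for any value outside the enumeration.
    return EntryMetadata(tags=tags, location=Location(data["location"]))


ok = parse_entry_metadata('{"tags": ["friends", "travel"], "location": "Los Angeles"}')
print(ok.tags, ok.location.value)

try:
    parse_entry_metadata('{"tags": ["friends"], "location": "Paris"}')
except ValueError as err:
    # The out-of-list location never reaches the rest of the application.
    print(f"rejected: {err}")
```

The free-form `tags` field can still vary from run to run, while `location` is restricted to the listed values; giving tags their own enumeration would constrain them in the same way.
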
