Giving OpenAI Predicted Outputs a Try

OpenAI recently released a new API feature called Predicted Outputs, designed to reduce response latency in scenarios where much of the model's output is already known in advance. A good fit is making incremental changes to code or text. I am currently working on an LLM use case for a customer: a JSON document generator for their low-code tool, where the user builds up the JSON document incrementally through a chat interface. I decided to give Predicted Outputs a try to see how it performs on this use case.
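As a rough sketch of how this looks with the OpenAI Python SDK: the `prediction` parameter carries the text you expect most of the response to match, here the current JSON document before an edit. The model name, document contents, and instruction below are illustrative, not taken from the actual project.

```python
import json

# The current JSON document; after a small edit, most of it should
# reappear unchanged in the model's reply.
current_document = json.dumps(
    {
        "title": "Order form",
        "fields": [
            {"name": "customer", "type": "text"},
            {"name": "quantity", "type": "number"},
        ],
    },
    indent=2,
)

# Request payload for chat.completions.create. The prediction is the
# unmodified document, so matched tokens can be returned at reduced latency.
request = {
    "model": "gpt-4o-mini",
    "messages": [
        {
            "role": "user",
            "content": (
                "Rename the 'quantity' field to 'amount'. "
                "Return the full JSON document.\n\n" + current_document
            ),
        }
    ],
    "prediction": {"type": "content", "content": current_document},
}

# With an API key configured, the call itself would be:
# from openai import OpenAI
# client = OpenAI()
# completion = client.chat.completions.create(**request)
# print(completion.choices[0].message.content)
```

Note that tokens in the prediction that do not appear in the final response are still billed at completion rates, so the approach pays off mainly when edits are genuinely small relative to the document.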

Continue reading “Giving OpenAI Predicted Outputs a Try”