Streaming
Stream responses in real time using several consumption patterns. All streams are built on a reusable stream architecture that supports concurrent consumers.
Stream text content as it’s generated:
Each iteration yields a small chunk of text (typically a few characters or a word).
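As a minimal sketch of consuming such chunks, the helper below appends each delta to a running string. It assumes only that the text stream is an `AsyncIterable<string>`; `fakeTextStream` and `collectText` are hypothetical names used for illustration and are not part of the SDK.

```typescript
// Hypothetical stand-in for a text stream. In real code the stream would come
// from the SDK result object; this generator only mimics small text chunks.
async function* fakeTextStream(): AsyncGenerator<string> {
  for (const chunk of ["Hel", "lo, ", "wor", "ld!"]) {
    yield chunk;
  }
}

// Appends each chunk to the full text and optionally reports the chunk,
// for example to update a UI as the text arrives.
async function collectText(
  stream: AsyncIterable<string>,
  onDelta?: (delta: string) => void,
): Promise<string> {
  let text = "";
  for await (const delta of stream) {
    onDelta?.(delta);
    text += delta;
  }
  return text;
}
```

Because text chunks are deltas, they are appended; this differs from item streams, where each update replaces the previous snapshot.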
For models that support reasoning (like o1 or Claude with thinking), stream the reasoning process:
Stream complete items as they update. This is the recommended way to handle streaming when you need structured access to all output types (messages, tool calls, reasoning, etc.). See Working with Items for the full paradigm explanation.
Key insight: Each iteration yields a complete item with the same ID but updated content. Replace items by ID rather than accumulating deltas.
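The replace-by-ID approach can be sketched with a `Map` keyed by item ID. The `StreamItem` shape and the `applySnapshot` helper are assumptions made for illustration; real items carry more fields than shown here.

```typescript
// Hypothetical minimal item shape (an assumption, not the SDK's type).
interface StreamItem {
  id: string;
  type: string;
  content: string;
}

// Stores the latest snapshot for each ID. Setting an existing key in a Map
// overwrites the value but keeps the key's original insertion position,
// so items stay in the order they first appeared.
function applySnapshot(
  items: Map<string, StreamItem>,
  snapshot: StreamItem,
): void {
  items.set(snapshot.id, snapshot);
}

// Usage: three updates arrive, two of them for the same message item.
const items = new Map<string, StreamItem>();
applySnapshot(items, { id: "msg_1", type: "message", content: "Hel" });
applySnapshot(items, { id: "call_1", type: "function_call", content: "{}" });
applySnapshot(items, { id: "msg_1", type: "message", content: "Hello" });
```

After these updates the map holds two entries, and `msg_1` contains the latest content rather than a concatenation of earlier snapshots.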
This stream yields all item types:
getNewMessagesStream() is deprecated. Use getItemsStream() instead, which
includes all item types and follows the items-based paradigm.
Stream cumulative message snapshots in the OpenResponses format:
This stream yields:
ResponsesOutputMessage - Assistant text/content updates
OpenResponsesFunctionCallOutput - Tool execution results (after tools complete)
Stream all response events including tool preliminary results:
The full stream includes these event types:
Stream structured tool calls as they complete:
Stream tool deltas and preliminary results:
Multiple consumers can read from the same result:
The underlying ReusableReadableStream ensures each consumer receives all events.
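One way to give every consumer all events is to buffer events and let each consumer read from its own position. The sketch below illustrates that general idea only; `ReplayableSource` is a hypothetical class and is not the actual `ReusableReadableStream` implementation, whose internals are not described here.

```typescript
// Hypothetical buffer-and-replay source (illustration only).
class ReplayableSource<T> {
  private buffer: T[] = [];
  private done = false;
  private waiters: Array<() => void> = [];

  // Records a new event and wakes any consumers waiting for more data.
  push(value: T): void {
    this.buffer.push(value);
    this.wake();
  }

  // Marks the source as finished so consumers can complete.
  close(): void {
    this.done = true;
    this.wake();
  }

  private wake(): void {
    const pending = this.waiters;
    this.waiters = [];
    for (const resolve of pending) resolve();
  }

  // Each call returns an independent iterator that starts at the first
  // buffered event, so late consumers still see everything.
  async *consume(): AsyncGenerator<T> {
    let index = 0;
    while (true) {
      if (index < this.buffer.length) {
        yield this.buffer[index++];
      } else if (this.done) {
        return;
      } else {
        await new Promise<void>((resolve) => {
          this.waiters.push(resolve);
        });
      }
    }
  }
}
```

The trade-off in this sketch is memory: every event is kept until the source is discarded, which is what allows a consumer that attaches late to replay from the start.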
Cancel a stream to stop generation:
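The SDK's own cancellation call is not reproduced here. As a generic illustration of stopping consumption early, the sketch below relies on standard JavaScript behavior: breaking out of a `for await` loop calls the iterator's `return()` method, which lets the producer run its cleanup. The names `endlessChunks` and `readAtMost` are hypothetical, and whether an early exit also stops generation on the server depends on the SDK.

```typescript
// Tracks whether the producer's cleanup ran (for demonstration only).
let cleanedUp = false;

// Hypothetical endless producer standing in for a live stream.
async function* endlessChunks(): AsyncGenerator<string> {
  try {
    let i = 0;
    while (true) {
      yield `chunk ${i++}`;
    }
  } finally {
    // Runs when the consumer stops iterating early.
    cleanedUp = true;
  }
}

// Reads at most `limit` chunks, then exits the loop. The `break` triggers
// the generator's `finally` block above.
async function readAtMost(
  stream: AsyncIterable<string>,
  limit: number,
): Promise<string[]> {
  const out: string[] = [];
  for await (const chunk of stream) {
    out.push(chunk);
    if (out.length >= limit) break;
  }
  return out;
}
```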