Streaming (AI)
Delivering model output token by token as it is generated, rather than waiting for the complete response. Streaming reduces perceived latency in chat interfaces because users see text appear immediately. Most LLM APIs support it, typically via server-sent events (SSE) or chunked HTTP responses.
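A minimal sketch of consuming a streamed response, assuming the OpenAI Python SDK (v1+); the model name and prompt are placeholders, and other providers expose similar iterator interfaces:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Requesting stream=True makes the API send partial chunks as tokens
# are generated instead of one final response.
stream = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Explain streaming in one sentence."}],
    stream=True,
)

# Each chunk's delta holds the newly generated text; printing it as it
# arrives is what makes the reply "appear" in real time.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```

The same pattern applies on the frontend: a client reads the SSE stream incrementally and appends each delta to the displayed message rather than waiting for the full body.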