The Multimodal Live API enables low-latency, two-way interactions that use text, audio, and video input, with audio and text output.
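
As a rough illustration of the modalities listed above, the following sketch checks a requested response type against the two output types the API supports and serializes a small session config. The helper `build_session_config` and the `response_modalities` key are assumptions made for this example, not confirmed names from the API; consult the API reference for the actual setup message format.

```python
import json

# Modalities taken from the description above. The lowercase spellings
# are an illustrative choice, not the API's own identifiers.
INPUT_MODALITIES = {"text", "audio", "video"}
OUTPUT_MODALITIES = {"text", "audio"}


def build_session_config(response_type: str) -> str:
    """Return a JSON session config for the requested response type.

    Hypothetical helper: the "response_modalities" key is an assumed
    field name used only for illustration.
    """
    normalized = response_type.strip().lower()
    if normalized not in OUTPUT_MODALITIES:
        # Video is accepted as input only, so it is rejected here.
        raise ValueError(
            f"unsupported response type {response_type!r}; "
            f"expected one of {sorted(OUTPUT_MODALITIES)}"
        )
    return json.dumps({"response_modalities": [normalized]})
```

Calling `build_session_config("audio")` returns a JSON string requesting audio output, while `build_session_config("video")` raises `ValueError`, reflecting that video is an input-only modality.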
Model response type