The Gemini Live API enables low-latency, bidirectional text and voice interactions with Gemini. Using the Live API, you can offer end users natural, human-like voice conversations, including the ability to interrupt the model's responses with text or voice commands. The model can process text and audio input (video coming soon!), and it can provide text and audio output.
You can prototype with prompts and the Live API in Google AI Studio or Vertex AI Studio.
The Live API is a stateful API that creates a WebSocket connection to establish a session between the client and the Gemini server. For details, see the Live API reference documentation (Gemini Developer API | Vertex AI Gemini API).
Before you begin
| Click your Gemini API provider to view provider-specific content and code on this page. |
If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen Gemini API provider, and create a LiveModel instance.
Models that support this capability
Which models support the Live API depends on your chosen Gemini API provider.
Gemini Developer API
- gemini-live-2.5-flash (private GA*)
- gemini-live-2.5-flash-preview
- gemini-2.0-flash-live-001
- gemini-2.0-flash-live-preview-04-09
Vertex AI Gemini API
- gemini-live-2.5-flash (private GA*)
- gemini-2.0-flash-live-preview-04-09 (accessible only in us-central1)
Note that in the 2.5 model names for the Live API, the live segment comes immediately after the gemini segment.
* Contact your Google Cloud account team representative to request access.
Use the standard capabilities of the Live API
This section describes how to use the standard capabilities of the Live API, specifically how to stream various types of input and output:
Generate streamed text from streamed text input
| Before trying this sample, complete the Before you begin section of this guide to set up your project and app. In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page. |
You can send streamed text input and receive streamed text output. Make sure to create a LiveModel instance and set the response modality to Text.
Swift
import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
let model = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with text
  generationConfig: LiveGenerationConfig(
    responseModalities: [.text]
  )
)

do {
  let session = try await model.connect()

  // Provide a text prompt
  let text = "tell a short story"

  await session.sendTextRealtime(text)

  var outputText = ""
  for try await message in session.responses {
    if case let .content(content) = message.payload {
      content.modelTurn?.parts.forEach { part in
        if let part = part as? TextPart {
          outputText += part.text
        }
      }
      // Optional: if you don't require to send more requests.
      if content.isTurnComplete {
        await session.close()
      }
    }
  }

  // Output received from the server.
  print(outputText)
} catch {
  fatalError(error.localizedDescription)
}
Kotlin
// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.TEXT
    }
)

val session = model.connect()

// Provide a text prompt
val text = "tell a short story"
session.send(text)

var outputText = ""

session.receive().collect {
    if (it.turnComplete) {
        // Optional: if you don't require to send more requests.
        session.stopReceiving();
    }
    outputText = outputText + it.text
}

// Output received from the server.
println(outputText)
Java
ExecutorService executor = Executors.newFixedThreadPool(1);

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with text
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.TEXT)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture = model.connect();

class LiveContentResponseSubscriber implements Subscriber<LiveContentResponse> {
    @Override
    public void onSubscribe(Subscription s) {
        s.request(Long.MAX_VALUE); // Request an unlimited number of items
    }

    @Override
    public void onNext(LiveContentResponse liveContentResponse) {
        // Handle the response from the server.
        System.out.println(liveContentResponse.getText());
    }

    @Override
    public void onError(Throwable t) {
        System.err.println("Error: " + t.getMessage());
    }

    @Override
    public void onComplete() {
        System.out.println("Done receiving messages!");
    }
}

Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);

        // Provide a text prompt
        String text = "tell me a short story?";
        session.send(text);

        Publisher<LiveContentResponse> publisher = session.receive();
        publisher.subscribe(new LiveContentResponseSubscriber());
    }

    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);
Web
// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveGenerativeModel` instance with the flash-live model (only model that supports the Live API)
const model = getLiveGenerativeModel(ai, {
  model: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with text
  generationConfig: {
    responseModalities: [ResponseModality.TEXT],
  },
});

const session = await model.connect();

// Provide a text prompt
const prompt = "tell a short story";
session.send(prompt);

// Collect text from model's turn
let text = "";

const messages = session.receive();
for await (const message of messages) {
  switch (message.type) {
    case "serverContent":
      if (message.turnComplete) {
        console.log(text);
      } else {
        const parts = message.modelTurn?.parts;
        if (parts) {
          text += parts.map((part) => part.text).join("");
        }
      }
      break;
    case "toolCall":
      // Ignore
    case "toolCallCancellation":
      // Ignore
  }
}
Dart
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

late LiveModelSession _session;

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
final model = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to respond with text
  liveGenerationConfig: LiveGenerationConfig(responseModalities: [ResponseModalities.text]),
);

_session = await model.connect();

// Provide a text prompt
final prompt = Content.text('tell a short story');
await _session.send(input: prompt, turnComplete: true);

// In a separate thread, receive the response
await for (final message in _session.receive()) {
  // Process the received message
}
Unity
using Firebase;
using Firebase.AI;

async Task SendTextReceiveText() {
  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with text
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Text })
  );

  LiveSession session = await model.ConnectAsync();

  // Provide a text prompt
  var prompt = ModelContent.Text("tell a short story");
  await session.SendAsync(content: prompt, turnComplete: true);

  // Receive the response
  await foreach (var message in session.ReceiveAsync()) {
    // Process the received message
    if (!string.IsNullOrEmpty(message.Text)) {
      UnityEngine.Debug.Log("Received message: " + message.Text);
    }
  }
}
Generate streamed audio from streamed audio input
| Before trying this sample, complete the Before you begin section of this guide to set up your project and app. In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page. |
You can send streamed audio input and receive streamed audio output. Make sure to create a LiveModel instance and set the response modality to Audio.
See later on this page to learn how to configure and customize the response voice.
Swift
import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
let model = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with audio
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio]
  )
)

do {
  let session = try await model.connect()

  // Load the audio file, or tap a microphone
  guard let audioFile = NSDataAsset(name: "audio.pcm") else {
    fatalError("Failed to load audio file")
  }

  // Provide the audio data
  await session.sendAudioRealtime(audioFile.data)

  var outputText = ""
  for try await message in session.responses {
    if case let .content(content) = message.payload {
      content.modelTurn?.parts.forEach { part in
        if let part = part as? InlineDataPart, part.mimeType.starts(with: "audio/pcm") {
          // Handle 16bit pcm audio data at 24khz
          playAudio(part.data)
        }
      }
      // Optional: if you don't require to send more requests.
      if content.isTurnComplete {
        await session.close()
      }
    }
  }
} catch {
  fatalError(error.localizedDescription)
}
Kotlin
// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
    }
)

val session = model.connect()

// This is the recommended way.
// However, you can create your own recorder and handle the stream.
session.startAudioConversation()
Java
ExecutorService executor = Executors.newFixedThreadPool(1);

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
LiveGenerativeModel lm = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to respond with audio
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.AUDIO)
                .build()
);
LiveModelFutures model = LiveModelFutures.from(lm);
ListenableFuture<LiveSession> sessionFuture = model.connect();

Futures.addCallback(sessionFuture, new FutureCallback<LiveSession>() {
    @Override
    public void onSuccess(LiveSession ses) {
        LiveSessionFutures session = LiveSessionFutures.from(ses);
        session.startAudioConversation();
    }

    @Override
    public void onFailure(Throwable t) {
        // Handle exceptions
    }
}, executor);
Web
// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveGenerativeModel` instance with the flash-live model (only model that supports the Live API)
const model = getLiveGenerativeModel(ai, {
  model: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to respond with audio
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
  },
});

const session = await model.connect();

// Start the audio conversation
const audioConversationController = await startAudioConversation(session);

// ... Later, to stop the audio conversation
// await audioConversationController.stop()
Dart
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';
import 'package:your_audio_recorder_package/your_audio_recorder_package.dart';

late LiveModelSession _session;
final _audioRecorder = YourAudioRecorder();

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
final model = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to respond with audio
  liveGenerationConfig: LiveGenerationConfig(responseModalities: [ResponseModalities.audio]),
);

_session = await model.connect();

final audioRecordStream = _audioRecorder.startRecordingStream();

// Map the Uint8List stream to InlineDataPart stream
final mediaChunkStream = audioRecordStream.map((data) {
  return InlineDataPart('audio/pcm', data);
});
await _session.startMediaStream(mediaChunkStream);

// In a separate thread, receive the audio response from the model
await for (final message in _session.receive()) {
  // Process the received message
}
Unity
using Firebase;
using Firebase.AI;

async Task SendTextReceiveAudio() {
  // Initialize the Gemini Developer API backend service
  // Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
  var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
    modelName: "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to respond with audio
    liveGenerationConfig: new LiveGenerationConfig(
        responseModalities: new[] { ResponseModality.Audio })
  );

  LiveSession session = await model.ConnectAsync();

  // Start a coroutine to send audio from the Microphone
  var recordingCoroutine = StartCoroutine(SendAudio(session));

  // Start receiving the response
  await ReceiveAudio(session);
}

IEnumerator SendAudio(LiveSession liveSession) {
  string microphoneDeviceName = null;
  int recordingFrequency = 16000;
  int recordingBufferSeconds = 2;

  var recordingClip = Microphone.Start(microphoneDeviceName, true,
                                       recordingBufferSeconds, recordingFrequency);

  int lastSamplePosition = 0;
  while (true) {
    if (!Microphone.IsRecording(microphoneDeviceName)) {
      yield break;
    }

    int currentSamplePosition = Microphone.GetPosition(microphoneDeviceName);

    if (currentSamplePosition != lastSamplePosition) {
      // The Microphone uses a circular buffer, so we need to check if the
      // current position wrapped around to the beginning, and handle it
      // accordingly.
      int sampleCount;
      if (currentSamplePosition > lastSamplePosition) {
        sampleCount = currentSamplePosition - lastSamplePosition;
      } else {
        sampleCount = recordingClip.samples - lastSamplePosition + currentSamplePosition;
      }

      if (sampleCount > 0) {
        // Get the audio chunk
        float[] samples = new float[sampleCount];
        recordingClip.GetData(samples, lastSamplePosition);

        // Send the data, discarding the resulting Task to avoid the warning
        _ = liveSession.SendAudioAsync(samples);

        lastSamplePosition = currentSamplePosition;
      }
    }

    // Wait for a short delay before reading the next sample from the Microphone
    const float MicrophoneReadDelay = 0.5f;
    yield return new WaitForSeconds(MicrophoneReadDelay);
  }
}

Queue<float> audioBuffer = new();

async Task ReceiveAudio(LiveSession liveSession) {
  int sampleRate = 24000;
  int channelCount = 1;

  // Create a looping AudioClip to fill with the received audio data
  int bufferSamples = (int)(sampleRate * channelCount);
  AudioClip clip = AudioClip.Create("StreamingPCM", bufferSamples, channelCount,
                                    sampleRate, true, OnAudioRead);

  // Attach the clip to an AudioSource and start playing it
  AudioSource audioSource = GetComponent<AudioSource>();
  audioSource.clip = clip;
  audioSource.loop = true;
  audioSource.Play();

  // Start receiving the response
  await foreach (var message in liveSession.ReceiveAsync()) {
    // Process the received message
    foreach (float[] pcmData in message.AudioAsFloat) {
      lock (audioBuffer) {
        foreach (float sample in pcmData) {
          audioBuffer.Enqueue(sample);
        }
      }
    }
  }
}

// This method is called by the AudioClip to load audio data.
private void OnAudioRead(float[] data) {
  int samplesToProvide = data.Length;
  int samplesProvided = 0;

  lock (audioBuffer) {
    while (samplesProvided < samplesToProvide && audioBuffer.Count > 0) {
      data[samplesProvided] = audioBuffer.Dequeue();
      samplesProvided++;
    }
  }

  while (samplesProvided < samplesToProvide) {
    data[samplesProvided] = 0.0f;
    samplesProvided++;
  }
}
Build more engaging and interactive experiences
This section describes how to create and manage more engaging and interactive capabilities of the Live API.
Change the response voice
The Live API uses Chirp 3 to support synthesized speech responses. When using Firebase AI Logic, you can send audio in a variety of HD voices across languages. For a complete list of the voices and demos of what each voice sounds like, see Chirp 3: HD voices.
To specify a voice, set the voice name within the speechConfig object as part of the model configuration. If you don't specify a voice, the default is Puck.
| Before trying this sample, complete the Before you begin section of this guide to set up your project and app. In that section, you'll also click a button for your chosen Gemini API provider so that you see provider-specific content on this page. |
Swift
import FirebaseAILogic

// ...

let model = FirebaseAI.firebaseAI(backend: .googleAI()).liveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to use a specific voice for its audio response
  generationConfig: LiveGenerationConfig(
    responseModalities: [.audio],
    speech: SpeechConfig(voiceName: "VOICE_NAME")
  )
)

// ...
Kotlin
// ...

val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    // Configure the model to use a specific voice for its audio response
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
        speechConfig = SpeechConfig(voice = Voice("VOICE_NAME"))
    }
)

// ...
Java
// ...

LiveModel model = FirebaseAI.getInstance(GenerativeBackend.googleAI()).liveModel(
        "gemini-2.0-flash-live-preview-04-09",
        // Configure the model to use a specific voice for its audio response
        new LiveGenerationConfig.Builder()
                .setResponseModalities(ResponseModality.AUDIO)
                .setSpeechConfig(new SpeechConfig(new Voice("VOICE_NAME")))
                .build()
);

// ...
Web
// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `LiveModel` instance with the flash-live model (only model that supports the Live API)
const model = getLiveGenerativeModel(ai, {
  model: "gemini-2.0-flash-live-preview-04-09",
  // Configure the model to use a specific voice for its audio response
  generationConfig: {
    responseModalities: [ResponseModality.AUDIO],
    speechConfig: {
      voiceConfig: {
        prebuiltVoiceConfig: { voiceName: "VOICE_NAME" },
      },
    },
  },
});
Dart
// ...

final model = FirebaseAI.googleAI().liveGenerativeModel(
  model: 'gemini-2.0-flash-live-preview-04-09',
  // Configure the model to use a specific voice for its audio response
  liveGenerationConfig: LiveGenerationConfig(
    responseModalities: [ResponseModalities.audio],
    speechConfig: SpeechConfig(voiceName: 'VOICE_NAME'),
  ),
);

// ...
Unity
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetLiveModel(
  modelName: "gemini-2.0-flash-live-preview-04-09",
  liveGenerationConfig: new LiveGenerationConfig(
    responseModalities: new[] { ResponseModality.Audio },
    speechConfig: SpeechConfig.UsePrebuiltVoice("VOICE_NAME"))
);

For the best results when prompting the model in a non-English language and having it respond in that language, include the following in your system instructions:
RESPOND IN LANGUAGE. YOU MUST RESPOND UNMISTAKABLY IN LANGUAGE.
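For example, the following Kotlin sketch wires that hint into the model's system instructions. It is illustrative only: it assumes the liveModel builder accepts a systemInstruction parameter (as the standard model builders do), and FRENCH is a placeholder for whichever language you want.
Kotlin
// Illustrative sketch, not the official sample.
// Assumption: `liveModel` accepts a `systemInstruction` parameter like the
// standard model builders; check the SDK reference for your version.
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.AUDIO
        speechConfig = SpeechConfig(voice = Voice("VOICE_NAME"))
    },
    // Replace FRENCH with the target language.
    systemInstruction = content {
        text("RESPOND IN FRENCH. YOU MUST RESPOND UNMISTAKABLY IN FRENCH.")
    }
)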
Maintain context across sessions and requests
You can use a chat structure to maintain context across sessions and requests. Note that this only works with text input and text output.
This approach is best for short contexts; you can send turn-by-turn interactions to represent the exact sequence of events. For longer contexts, we recommend providing a single message summary to free up the context window for subsequent interactions.
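As a minimal sketch of this pattern in Kotlin (reusing the text-only session calls shown earlier), you could replay a short summary of the earlier conversation as the first message of a new session; the summary text here is hypothetical.
Kotlin
// Illustrative sketch: carry context into a new session by sending a short
// summary of the previous conversation before the user's next prompt.
val session = model.connect()

// Keep the summary short so it doesn't consume the context window.
val conversationSummary = """
    Summary of the previous session:
    - The user asked for a short story about a lighthouse keeper.
    - The model told a story about a keeper named Mara and a storm.
""".trimIndent()

// Send the summary first, then continue the conversation as usual.
session.send(conversationSummary)
session.send("Continue the story, but from the storm's point of view.")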
Handle interruptions
Firebase AI Logic does not yet support handling interruptions. Check back soon!
Use function calling (tools)
You can define tools (like available functions) to use with the Live API just as you can with the standard content generation methods. This section describes some nuances of using the Live API with function calling. For a complete description and examples of function calling, see the function calling guide.
From a single prompt, the model can generate multiple function calls and the code necessary to chain their outputs. This code executes in a sandbox environment, generating subsequent BidiGenerateContentToolCall messages. Execution pauses until the results of each function call are available, which ensures sequential processing.
Additionally, using the Live API with function calling is particularly powerful because the model can request follow-up or clarifying information from the user. For example, if the model doesn't have enough information to provide a parameter value to a function it wants to call, it can ask the user to provide more or clarifying information.
The client should respond with BidiGenerateContentToolResponse.
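For orientation, here is a hedged Kotlin sketch of declaring a function and attaching it to the live model. It assumes the liveModel builder accepts a tools parameter like the standard model builders, and the fetchWeather function and its schema are hypothetical; see the function calling guide for the authoritative end-to-end flow, including how to return the function result to the model.
Kotlin
// Hypothetical tool the model can call to get data your app controls.
val fetchWeatherTool = FunctionDeclaration(
    name = "fetchWeather",
    description = "Get the current weather for a city",
    parameters = mapOf(
        "city" to Schema.string(description = "The name of the city")
    )
)

// Assumption: `liveModel` accepts a `tools` parameter like the standard
// model builders; check the SDK reference for your version.
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
    modelName = "gemini-2.0-flash-live-preview-04-09",
    generationConfig = liveGenerationConfig {
        responseModality = ResponseModality.TEXT
    },
    tools = listOf(Tool.functionDeclarations(listOf(fetchWeatherTool)))
)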
Limitations and requirements
Keep in mind the following limitations and requirements of the Live API.
Transcription
Firebase AI Logic does not yet support transcription. Check back soon!
Languages
- Input languages: See the complete list of supported input languages for Gemini models
- Output languages: See Chirp 3: HD voices for the complete list of available output languages
Audio formats
The Live API supports the following audio formats:
- Input audio format: Raw 16-bit PCM audio at 16 kHz, little-endian
- Output audio format: Raw 16-bit PCM audio at 24 kHz, little-endian
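To make the input format concrete, here is a small Kotlin helper (not part of the SDK) that converts normalized floating-point microphone samples into raw 16-bit little-endian PCM bytes; it assumes the samples are already mono at 16 kHz.
Kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Convert normalized float samples (-1.0..1.0) into raw 16-bit PCM bytes,
// little-endian, matching the Live API input audio format (16 kHz, mono).
fun floatsToPcm16LittleEndian(samples: FloatArray): ByteArray {
    val buffer = ByteBuffer.allocate(samples.size * 2).order(ByteOrder.LITTLE_ENDIAN)
    for (sample in samples) {
        // Clamp to the valid range, then scale to the signed 16-bit range.
        val clamped = sample.coerceIn(-1.0f, 1.0f)
        buffer.putShort((clamped * Short.MAX_VALUE).toInt().toShort())
    }
    return buffer.array()
}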
Rate limits
The Live API has rate limits on the number of concurrent sessions per Firebase project and on tokens per minute (TPM).
Gemini Developer API:
- Limits vary depending on your project's Gemini Developer API usage tier (see its rate limits documentation)
Vertex AI Gemini API:
- 5,000 concurrent sessions per Firebase project
- 4 million tokens per minute
Session duration
The default length of a session is 10 minutes. When the session duration exceeds the limit, the connection is terminated.
The model is also limited by the context size. Sending large chunks of input may result in earlier session termination.
Voice activity detection (VAD)
The model automatically performs voice activity detection (VAD) on a continuous audio input stream. VAD is enabled by default.
Token counting
You cannot use the CountTokens API with the Live API.