Generate structured output (like JSON and enums) using the Gemini API

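By default, the Gemini API returns responses as unstructured text. However, some use cases require structured text, such as JSON. For example, you might pass the response to a downstream task that requires an established data schema.

To ensure that the model's generated output always adheres to a specific schema, you can define a response schema, which works like a blueprint for model responses. You can then extract data directly from the model's output with less post-processing.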

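Here are some examples:

  • Ensure that a model's response produces valid JSON and conforms to your provided schema.
    For example, the model can generate structured entries for recipes that always include the recipe name, list of ingredients, and steps. You can then more easily parse and display this information in the UI of your app.

  • Constrain how a model can respond during classification tasks.
    For example, you can have the model annotate text with a specific set of labels (for instance, a specific set of enums like positive and negative), rather than labels that the model produces (which could have a degree of variability like good, positive, negative, or bad).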
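
As a rough illustration of the downstream benefit, the TypeScript sketch below parses a recipe-shaped JSON string and checks its fields before use. The `Recipe` interface, its field names, and the sample string are assumptions made for this example; they are not part of the Gemini API or the Firebase SDK.

```typescript
// Hypothetical shape for the recipe example; the field names are assumptions.
interface Recipe {
  name: string;
  ingredients: string[];
  steps: string[];
}

// Narrow an unknown parsed value to Recipe by checking each field.
function isRecipe(value: unknown): value is Recipe {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const isStringArray = (x: unknown): x is string[] =>
    Array.isArray(x) && x.every((item) => typeof item === "string");
  return (
    typeof v.name === "string" &&
    isStringArray(v.ingredients) &&
    isStringArray(v.steps)
  );
}

// Parse response text; return null instead of throwing on malformed input.
function parseRecipe(text: string): Recipe | null {
  try {
    const parsed: unknown = JSON.parse(text);
    return isRecipe(parsed) ? parsed : null;
  } catch {
    return null;
  }
}

// Sample text standing in for a model response (not real model output).
const sampleRecipeText =
  '{"name":"Pancakes","ingredients":["flour","milk","egg"],"steps":["Mix","Fry"]}';
const recipe = parseRecipe(sampleRecipeText);
console.log(recipe?.ingredients.length); // prints 3
```

Because the schema fixes the field names and types, the check stays small; without a schema, the app would have to cope with whatever layout the model happened to choose.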

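This guide shows you how to generate JSON output by providing a responseSchema in a call to generateContent. It focuses on text-only input, but Gemini can also produce structured responses to multimodal requests that include images, videos, and audio as input.

At the bottom of this page are more examples, such as how to generate enum values as output.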

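Before you begin

The content and code on this page are specific to your Gemini API provider. The samples shown here use the Gemini Developer API.

If you haven't already, complete the getting started guide. It describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen Gemini API provider, and create a GenerativeModel instance.

For testing and iterating on your prompts, we recommend using Google AI Studio.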

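Step 1: Define a response schema

Define a response schema to specify the structure of a model's output, the field names, and the expected data type for each field.

When a model generates its response, it uses the field names and context from your prompt. To make your intent clear, we recommend using a clear structure, unambiguous field names, and even descriptions where needed.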
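
To make the naming advice concrete, here is a small TypeScript sketch that contrasts a vague schema with a clearer one. Both are plain objects written in OpenAPI 3.0 style purely for illustration; they do not use the SDK's Schema builder, and the field names and descriptions are invented for this example.

```typescript
// Vague: the model has to guess what `n` and `t` mean.
const vagueSchema = {
  type: "object",
  properties: {
    n: { type: "string" },
    t: { type: "integer" },
  },
  required: ["n", "t"],
};

// Clearer: descriptive names, plus descriptions that state units and intent.
const clearSchema = {
  type: "object",
  properties: {
    recipeName: {
      type: "string",
      description: "Short display name of the recipe.",
    },
    prepTimeMinutes: {
      type: "integer",
      description: "Preparation time in whole minutes.",
    },
  },
  required: ["recipeName", "prepTimeMinutes"],
};

console.log(Object.keys(vagueSchema.properties)); // field names: n, t
console.log(Object.keys(clearSchema.properties)); // field names: recipeName, prepTimeMinutes
```

The two schemas constrain the output shape identically; the second simply gives the model more context about what belongs in each field.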

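Considerations for response schemas

Keep the following in mind when writing your response schema:

  • The size of the response schema counts towards the input token limit.

  • The response schema feature supports the following response MIME types:

    • application/json: outputs JSON as defined in the response schema (useful for structured output requirements)

    • text/x.enum: outputs an enum value as defined in the response schema (useful for classification tasks)

  • The response schema feature supports the following schema fields:

    enum
    items
    maxItems
    nullable
    properties
    required

    If you use an unsupported field, the model can still handle your request, but it ignores the field. Note that the list above is a subset of the OpenAPI 3.0 schema object.

  • By default, for Firebase AI Logic SDKs, all fields are considered required unless you specify them as optional in an optionalProperties array. For these optional fields, the model can populate the fields or skip them. Note that this is the opposite of the default behavior of the two Gemini API providers if you use their server SDKs or their APIs directly.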
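
The "required unless listed as optional" default can be pictured as a simple set difference. The TypeScript sketch below is only a model of that rule, assuming a hypothetical helper named `requiredFields`; it is not the SDK's actual implementation.

```typescript
// Derive an explicit `required` list from the property names and an optional list,
// mirroring the "required unless listed in optionalProperties" default.
// `requiredFields` is a hypothetical helper, not part of any SDK.
function requiredFields(
  propertyNames: string[],
  optionalProperties: string[] = []
): string[] {
  const optional = new Set(optionalProperties);
  return propertyNames.filter((name) => !optional.has(name));
}

const characterFields = ["name", "age", "species", "accessory"];

// With `accessory` marked optional, the other three fields stay required.
console.log(requiredFields(characterFields, ["accessory"])); // name, age, species

// With no optional list, every field is required.
console.log(requiredFields(characterFields)); // name, age, species, accessory
```

When calling a server SDK or the API directly, the same intent would be expressed the other way around, by listing the required fields explicitly.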

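Step 2: Generate JSON output using your response schema

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you also choose your Gemini API provider, which determines the provider-specific content shown on this page.

The following example shows how to generate structured JSON output.

When you create the GenerativeModel instance, specify the appropriate responseMimeType (in this example, application/json) as well as the responseSchema that you want the model to use.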

Swift

    import FirebaseAILogic

    // Provide a JSON schema object using a standard format.
    // Later, pass this schema object into `responseSchema` in the generation config.
    let jsonSchema = Schema.object(
      properties: [
        "characters": Schema.array(
          items: .object(
            properties: [
              "name": .string(),
              "age": .integer(),
              "species": .string(),
              "accessory": .enumeration(values: ["hat", "belt", "shoes"]),
            ],
            optionalProperties: ["accessory"]
          )
        ),
      ]
    )

    // Initialize the Gemini Developer API backend service
    let ai = FirebaseAI.firebaseAI(backend: .googleAI())

    // Create a `GenerativeModel` instance with a model that supports your use case
    let model = ai.generativeModel(
      modelName: "gemini-2.5-flash",
      // In the generation config, set the `responseMimeType` to `application/json`
      // and pass the JSON schema object into `responseSchema`.
      generationConfig: GenerationConfig(
        responseMIMEType: "application/json",
        responseSchema: jsonSchema
      )
    )

    let prompt = "For use in a children's card game, generate 10 animal-based characters."

    let response = try await model.generateContent(prompt)
    print(response.text ?? "No text in response.")

Kotlin

For Kotlin, the methods in this SDK are suspend functions and need to be called from a coroutine scope.

    // Provide a JSON schema object using a standard format.
    // Later, pass this schema object into `responseSchema` in the generation config.
    val jsonSchema = Schema.obj(
        mapOf("characters" to Schema.array(
            Schema.obj(
                mapOf(
                    "name" to Schema.string(),
                    "age" to Schema.integer(),
                    "species" to Schema.string(),
                    "accessory" to Schema.enumeration(listOf("hat", "belt", "shoes")),
                ),
                optionalProperties = listOf("accessory")
            )
        ))
    )

    // Initialize the Gemini Developer API backend service
    // Create a `GenerativeModel` instance with a model that supports your use case
    val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
        modelName = "gemini-2.5-flash",
        // In the generation config, set the `responseMimeType` to `application/json`
        // and pass the JSON schema object into `responseSchema`.
        generationConfig = generationConfig {
            responseMimeType = "application/json"
            responseSchema = jsonSchema
        })

    val prompt = "For use in a children's card game, generate 10 animal-based characters."

    // Call generateContent on the `model` instance created above
    val response = model.generateContent(prompt)
    print(response.text)

Java

For Java, the streaming methods in this SDK return a Publisher type from the Reactive Streams library. The non-streaming generateContent call in this sample goes through GenerativeModelFutures and returns a ListenableFuture.

    // Provide a JSON schema object using a standard format.
    // Later, pass this schema object into `responseSchema` in the generation config.
    Schema jsonSchema = Schema.obj(
        /* properties */
        Map.of(
            "characters", Schema.array(
                /* items */ Schema.obj(
                    /* properties */
                    Map.of(
                        "name", Schema.str(),
                        "age", Schema.numInt(),
                        "species", Schema.str(),
                        "accessory", Schema.enumeration(List.of("hat", "belt", "shoes"))),
                    /* optionalProperties: declared on the character object that owns `accessory` */
                    List.of("accessory")))));

    // In the generation config, set the `responseMimeType` to `application/json`
    // and pass the JSON schema object into `responseSchema`.
    GenerationConfig.Builder configBuilder = new GenerationConfig.Builder();
    configBuilder.responseMimeType = "application/json";
    configBuilder.responseSchema = jsonSchema;
    GenerationConfig generationConfig = configBuilder.build();

    // Initialize the Gemini Developer API backend service
    // Create a `GenerativeModel` instance with a model that supports your use case
    GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel(
            /* modelName */ "gemini-2.5-flash",
            /* generationConfig */ generationConfig);
    GenerativeModelFutures model = GenerativeModelFutures.from(ai);

    Content content = new Content.Builder()
        .addText("For use in a children's card game, generate 10 animal-based characters.")
        .build();

    // For illustrative purposes only. You should use an executor that fits your needs.
    Executor executor = Executors.newSingleThreadExecutor();

    ListenableFuture<GenerateContentResponse> response = model.generateContent(content);
    Futures.addCallback(
        response,
        new FutureCallback<GenerateContentResponse>() {
            @Override
            public void onSuccess(GenerateContentResponse result) {
                String resultText = result.getText();
                System.out.println(resultText);
            }

            @Override
            public void onFailure(Throwable t) {
                t.printStackTrace();
            }
        },
        executor);

Web

    import { initializeApp } from "firebase/app";
    import { getAI, getGenerativeModel, GoogleAIBackend, Schema } from "firebase/ai";

    // TODO(developer) Replace the following with your app's Firebase configuration
    // See: https://firebase.google.com/docs/web/learn-more#config-object
    const firebaseConfig = {
      // ...
    };

    // Initialize FirebaseApp
    const firebaseApp = initializeApp(firebaseConfig);

    // Initialize the Gemini Developer API backend service
    const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

    // Provide a JSON schema object using a standard format.
    // Later, pass this schema object into `responseSchema` in the generation config.
    const jsonSchema = Schema.object({
      properties: {
        characters: Schema.array({
          items: Schema.object({
            properties: {
              name: Schema.string(),
              accessory: Schema.string(),
              age: Schema.number(),
              species: Schema.string(),
            },
            optionalProperties: ["accessory"],
          }),
        }),
      }
    });

    // Create a `GenerativeModel` instance with a model that supports your use case
    const model = getGenerativeModel(ai, {
      model: "gemini-2.5-flash",
      // In the generation config, set the `responseMimeType` to `application/json`
      // and pass the JSON schema object into `responseSchema`.
      generationConfig: {
        responseMimeType: "application/json",
        responseSchema: jsonSchema
      },
    });

    let prompt = "For use in a children's card game, generate 10 animal-based characters.";

    let result = await model.generateContent(prompt);
    console.log(result.response.text());

Dart

    import 'package:firebase_ai/firebase_ai.dart';
    import 'package:firebase_core/firebase_core.dart';
    import 'firebase_options.dart';

    // Provide a JSON schema object using a standard format.
    // Later, pass this schema object into `responseSchema` in the generation config.
    final jsonSchema = Schema.object(
      properties: {
        'characters': Schema.array(
          items: Schema.object(
            properties: {
              'name': Schema.string(),
              'age': Schema.integer(),
              'species': Schema.string(),
              'accessory':
                  Schema.enumString(enumValues: ['hat', 'belt', 'shoes']),
            },
            // Declared on the character object, which is where `accessory` lives
            optionalProperties: ['accessory'],
          ),
        ),
      },
    );

    // Initialize FirebaseApp
    await Firebase.initializeApp(
      options: DefaultFirebaseOptions.currentPlatform,
    );

    // Initialize the Gemini Developer API backend service
    // Create a `GenerativeModel` instance with a model that supports your use case
    final model =
        FirebaseAI.googleAI().generativeModel(
          model: 'gemini-2.5-flash',
          // In the generation config, set the `responseMimeType` to `application/json`
          // and pass the JSON schema object into `responseSchema`.
          generationConfig: GenerationConfig(
              responseMimeType: 'application/json', responseSchema: jsonSchema));

    final prompt = "For use in a children's card game, generate 10 animal-based characters.";

    final response = await model.generateContent([Content.text(prompt)]);
    print(response.text);

Unity

    using Firebase;
    using Firebase.AI;

    // Provide a JSON schema object using a standard format.
    // Later, pass this schema object into `responseSchema` in the generation config.
    var jsonSchema = Schema.Object(
      properties: new System.Collections.Generic.Dictionary<string, Schema> {
        { "characters", Schema.Array(
          items: Schema.Object(
            properties: new System.Collections.Generic.Dictionary<string, Schema> {
              { "name", Schema.String() },
              { "age", Schema.Int() },
              { "species", Schema.String() },
              { "accessory", Schema.Enum(new string[] { "hat", "belt", "shoes" }) },
            },
            optionalProperties: new string[] { "accessory" }
          )
        ) },
      }
    );

    // Initialize the Gemini Developer API backend service
    // Create a `GenerativeModel` instance with a model that supports your use case
    var model = FirebaseAI.DefaultInstance.GetGenerativeModel(
      modelName: "gemini-2.5-flash",
      // In the generation config, set the `responseMimeType` to `application/json`
      // and pass the JSON schema object into `responseSchema`.
      generationConfig: new GenerationConfig(
        responseMimeType: "application/json",
        responseSchema: jsonSchema
      )
    );

    var prompt = "For use in a children's card game, generate 10 animal-based characters.";

    var response = await model.GenerateContentAsync(prompt);
    UnityEngine.Debug.Log(response.Text ?? "No text in response.");
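
The samples above stop at printing the response text. As a hedged follow-on, the TypeScript sketch below shows one way an app might turn that text into typed objects matching the character schema. The `Character` type, the `parseCharacters` helper, and the sample string are assumptions for illustration; none of them come from the SDK, and the sample string is not real model output.

```typescript
// Types mirroring the character schema used in the samples above.
type Accessory = "hat" | "belt" | "shoes";

interface Character {
  name: string;
  age: number;
  species: string;
  accessory?: Accessory; // optional, matching `optionalProperties`
}

const ACCESSORIES: readonly string[] = ["hat", "belt", "shoes"];

// Check a single parsed entry against the expected shape.
function isCharacter(value: unknown): value is Character {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const accessoryOk =
    v.accessory === undefined ||
    (typeof v.accessory === "string" && ACCESSORIES.indexOf(v.accessory) !== -1);
  return (
    typeof v.name === "string" &&
    typeof v.age === "number" &&
    typeof v.species === "string" &&
    accessoryOk
  );
}

// Parse the response text and return the validated `characters` array.
// Throws if the text is not JSON or does not match the expected shape.
function parseCharacters(text: string): Character[] {
  const parsed: unknown = JSON.parse(text);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Response is not a JSON object");
  }
  const list = (parsed as Record<string, unknown>).characters;
  if (!Array.isArray(list) || !list.every((item) => isCharacter(item))) {
    throw new Error("Unexpected `characters` shape");
  }
  return list as Character[];
}

// Stand-in for `result.response.text()`.
const sampleCharactersText =
  '{"characters":[{"name":"Milo","age":3,"species":"otter","accessory":"hat"},' +
  '{"name":"Bea","age":5,"species":"bear"}]}';

const characters = parseCharacters(sampleCharactersText);
console.log(characters.length); // prints 2
```

A runtime check like this is still worth keeping even with a response schema in place, since it guards the app against truncated or otherwise unexpected responses.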

Learn how to choose a model appropriate for your use case and app.

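More examples

Here are some additional examples of how you can use and generate structured output.

Generate enum values as output

Before trying this sample, complete the Before you begin section of this guide to set up your project and app.
In that section, you also choose your Gemini API provider, which determines the provider-specific content shown on this page.

The following example shows how to use a response schema for a classification task. The model is asked to identify the genre of a movie based on its description. The output is one plain-text enum value that the model selects from the list of values defined in the provided response schema.

To perform this structured classification task, specify during model initialization the appropriate responseMimeType (in this example, text/x.enum) as well as the responseSchema that you want the model to use.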

Swift

    import FirebaseAILogic

    // Provide an enum schema object using a standard format.
    // Later, pass this schema object into `responseSchema` in the generation config.
    let enumSchema = Schema.enumeration(values: ["drama", "comedy", "documentary"])

    // Initialize the Gemini Developer API backend service
    let ai = FirebaseAI.firebaseAI(backend: .googleAI())

    // Create a `GenerativeModel` instance with a model that supports your use case
    let model = ai.generativeModel(
      modelName: "gemini-2.5-flash",
      // In the generation config, set the `responseMimeType` to `text/x.enum`
      // and pass the enum schema object into `responseSchema`.
      generationConfig: GenerationConfig(
        responseMIMEType: "text/x.enum",
        responseSchema: enumSchema
      )
    )

    let prompt = """
    The film aims to educate and inform viewers about real-life subjects, events, or people.
    It offers a factual record of a particular topic by combining interviews, historical footage,
    and narration. The primary purpose of a film is to present information and provide insights
    into various aspects of reality.
    """

    let response = try await model.generateContent(prompt)
    print(response.text ?? "No text in response.")

Kotlin

For Kotlin, the methods in this SDK are suspend functions and need to be called from a coroutine scope.

    // Provide an enum schema object using a standard format.
    // Later, pass this schema object into `responseSchema` in the generation config.
    val enumSchema = Schema.enumeration(listOf("drama", "comedy", "documentary"))

    // Initialize the Gemini Developer API backend service
    // Create a `GenerativeModel` instance with a model that supports your use case
    val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
        modelName = "gemini-2.5-flash",
        // In the generation config, set the `responseMimeType` to `text/x.enum`
        // and pass the enum schema object into `responseSchema`.
        generationConfig = generationConfig {
            responseMimeType = "text/x.enum"
            responseSchema = enumSchema
        })

    val prompt = """
        The film aims to educate and inform viewers about real-life subjects, events, or people.
        It offers a factual record of a particular topic by combining interviews, historical footage,
        and narration. The primary purpose of a film is to present information and provide insights
        into various aspects of reality.
        """

    // Call generateContent on the `model` instance created above
    val response = model.generateContent(prompt)
    print(response.text)

Java

For Java, the streaming methods in this SDK return a Publisher type from the Reactive Streams library. The non-streaming generateContent call in this sample goes through GenerativeModelFutures and returns a ListenableFuture.

    // Provide an enum schema object using a standard format.
    // Later, pass this schema object into `responseSchema` in the generation config.
    Schema enumSchema = Schema.enumeration(List.of("drama", "comedy", "documentary"));

    // In the generation config, set the `responseMimeType` to `text/x.enum`
    // and pass the enum schema object into `responseSchema`.
    GenerationConfig.Builder configBuilder = new GenerationConfig.Builder();
    configBuilder.responseMimeType = "text/x.enum";
    configBuilder.responseSchema = enumSchema;
    GenerationConfig generationConfig = configBuilder.build();

    // Initialize the Gemini Developer API backend service
    // Create a `GenerativeModel` instance with a model that supports your use case
    GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel(
            /* modelName */ "gemini-2.5-flash",
            /* generationConfig */ generationConfig);
    GenerativeModelFutures model = GenerativeModelFutures.from(ai);

    String prompt = "The film aims to educate and inform viewers about real-life subjects," +
        " events, or people. It offers a factual record of a particular topic by" +
        " combining interviews, historical footage, and narration. The primary purpose" +
        " of a film is to present information and provide insights into various aspects" +
        " of reality.";

    Content content = new Content.Builder().addText(prompt).build();

    // For illustrative purposes only. You should use an executor that fits your needs.
    Executor executor = Executors.newSingleThreadExecutor();

    ListenableFuture<GenerateContentResponse> response = model.generateContent(content);
    Futures.addCallback(
        response,
        new FutureCallback<GenerateContentResponse>() {
            @Override
            public void onSuccess(GenerateContentResponse result) {
                String resultText = result.getText();
                System.out.println(resultText);
            }

            @Override
            public void onFailure(Throwable t) {
                t.printStackTrace();
            }
        },
        executor);

Web

    import { initializeApp } from "firebase/app";
    import { getAI, getGenerativeModel, GoogleAIBackend, Schema } from "firebase/ai";

    // TODO(developer) Replace the following with your app's Firebase configuration
    // See: https://firebase.google.com/docs/web/learn-more#config-object
    const firebaseConfig = {
      // ...
    };

    // Initialize FirebaseApp
    const firebaseApp = initializeApp(firebaseConfig);

    // Initialize the Gemini Developer API backend service
    const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

    // Provide an enum schema object using a standard format.
    // Later, pass this schema object into `responseSchema` in the generation config.
    const enumSchema = Schema.enumString({
      enum: ["drama", "comedy", "documentary"],
    });

    // Create a `GenerativeModel` instance with a model that supports your use case
    const model = getGenerativeModel(ai, {
      model: "gemini-2.5-flash",
      // In the generation config, set the `responseMimeType` to `text/x.enum`
      // and pass the enum schema object into `responseSchema`.
      generationConfig: {
        responseMimeType: "text/x.enum",
        responseSchema: enumSchema,
      },
    });

    let prompt = `The film aims to educate and inform viewers about real-life
    subjects, events, or people. It offers a factual record of a particular topic
    by combining interviews, historical footage, and narration. The primary purpose
    of a film is to present information and provide insights into various aspects
    of reality.`;

    let result = await model.generateContent(prompt);
    console.log(result.response.text());

Dart

    import 'package:firebase_ai/firebase_ai.dart';
    import 'package:firebase_core/firebase_core.dart';
    import 'firebase_options.dart';

    // Provide an enum schema object using a standard format.
    // Later, pass this schema object into `responseSchema` in the generation config.
    final enumSchema = Schema.enumString(enumValues: ['drama', 'comedy', 'documentary']);

    // Initialize FirebaseApp
    await Firebase.initializeApp(
      options: DefaultFirebaseOptions.currentPlatform,
    );

    // Initialize the Gemini Developer API backend service
    // Create a `GenerativeModel` instance with a model that supports your use case
    final model =
        FirebaseAI.googleAI().generativeModel(
          model: 'gemini-2.5-flash',
          // In the generation config, set the `responseMimeType` to `text/x.enum`
          // and pass the enum schema object into `responseSchema`.
          generationConfig: GenerationConfig(
              responseMimeType: 'text/x.enum', responseSchema: enumSchema));

    final prompt = """
      The film aims to educate and inform viewers about real-life subjects, events, or people.
      It offers a factual record of a particular topic by combining interviews, historical footage,
      and narration. The primary purpose of a film is to present information and provide insights
      into various aspects of reality.
      """;

    final response = await model.generateContent([Content.text(prompt)]);
    print(response.text);

Unity

    using Firebase;
    using Firebase.AI;

    // Provide an enum schema object using a standard format.
    // Later, pass this schema object into `responseSchema` in the generation config.
    var enumSchema = Schema.Enum(new string[] { "drama", "comedy", "documentary" });

    // Initialize the Gemini Developer API backend service
    // Create a `GenerativeModel` instance with a model that supports your use case
    var model = FirebaseAI.DefaultInstance.GetGenerativeModel(
      modelName: "gemini-2.5-flash",
      // In the generation config, set the `responseMimeType` to `text/x.enum`
      // and pass the enum schema object into `responseSchema`.
      generationConfig: new GenerationConfig(
        responseMimeType: "text/x.enum",
        responseSchema: enumSchema
      )
    );

    var prompt = @"
    The film aims to educate and inform viewers about real-life subjects, events, or people.
    It offers a factual record of a particular topic by combining interviews, historical footage,
    and narration. The primary purpose of a film is to present information and provide insights
    into various aspects of reality.
    ";

    var response = await model.GenerateContentAsync(prompt);
    UnityEngine.Debug.Log(response.Text ?? "No text in response.");
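
Since the enum response arrives as plain text, an app may want to map it back onto its own set of known values before branching on it. The TypeScript sketch below is one possible way to do that; `GENRES`, `Genre`, and `parseGenre` are hypothetical names for this example, and the input strings stand in for model responses.

```typescript
// The same values passed to the enum schema in the samples above.
const GENRES = ["drama", "comedy", "documentary"] as const;
type Genre = (typeof GENRES)[number];

// Return the matching genre, or null if the text is not one of the known values.
// Surrounding whitespace is trimmed before comparing.
function parseGenre(text: string): Genre | null {
  const candidate = text.trim();
  const match = GENRES.find((genre) => genre === candidate);
  return match ?? null;
}

console.log(parseGenre("documentary\n")); // prints documentary
console.log(parseGenre("thriller")); // prints null
```

Returning `null` for unknown text keeps the fallback decision (retry, default label, or error) in the calling code.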


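Other options to control content generation

  • Learn more about prompt design so that you can influence the model to generate output specific to your needs.
  • Configure model parameters to control how the model generates a response. For Gemini models, these parameters include max output tokens, temperature, topK, and topP. For Imagen models, these include aspect ratio, person generation, watermarking, and more.
  • Use safety settings to adjust the likelihood of getting responses that may be considered harmful, including hate speech and sexually explicit content.
  • Set system instructions to steer the behavior of the model. This feature is like a preamble that you add before the model is exposed to any further instructions from the end user.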

