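Gemini 3 and Gemini 2.5 models can use an internal "thinking process" that significantly improves their reasoning and multi-step planning abilities, making them effective for complex tasks such as coding, advanced mathematics, and data analysis.
Thinking models offer the following configurations and options:
- Thinking budget: You can configure how much "thinking" a model can do by using a thinking budget. This configuration is particularly important if reducing latency or cost is a priority. Also review the comparison of task difficulties to decide how much thinking capability a model might need.
- Thought summaries: You can enable thought summaries to include them with the generated response. These summaries are synthesized versions of the model's raw thoughts and offer insight into the model's internal reasoning process.
- Thought signatures: The Firebase AI Logic SDKs automatically handle thought signatures for you, which ensures that the model has access to the thought context from previous turns when it uses function calling.
Make sure to review the best practices and prompting guidance for using thinking models.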
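Use a thinking model
Use a thinking model just like you would use any other Gemini model (initialize your chosen Gemini API provider, create a GenerativeModel instance, and so on). These models can be used for text or code generation tasks, such as generating structured output or analyzing multimodal input (images, video, audio, or PDFs). You can even use thinking models when you stream the output.
Models that support this capability
Only Gemini 3 and Gemini 2.5 models support this capability.
- gemini-3-pro-preview
- gemini-3-pro-image-preview (also known as "nano banana pro")
- gemini-2.5-pro
- gemini-2.5-flash
- gemini-2.5-flash-lite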
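Best practices and prompting guidance for using thinking models
We recommend testing your prompt in Google AI Studio or Vertex AI Studio, where you can view the full thinking process. You can identify any areas where the model may have gone astray and then refine your prompts to get more consistent and accurate responses.
Begin with a general prompt that describes the desired outcome, and observe the model's initial thoughts on how it determines its response. If the response isn't as expected, help the model generate a better response by using any of the following prompting techniques:
- Provide step-by-step instructions
- Provide several examples of input-output pairs
- Provide guidance for how the output and responses should be phrased and formatted
- Provide specific verification steps
In addition to prompting, consider these recommendations:
- Set system instructions, which act like a "preamble" that is added before the model is exposed to the prompt or to any further instructions from the end user. They let you steer the behavior of the model based on your specific needs and use cases.
- Set a thinking budget to configure how much thinking the model can do. With a low budget, the model won't "overthink" its response. With a high budget, the model can think more when needed. Setting a thinking budget also reserves more of the total token output limit for the actual response.
- Enable AI monitoring in the Firebase console to monitor the thinking token count and latency of requests that have thinking enabled. If you have thought summaries enabled, they are displayed in the console, where you can inspect the model's detailed reasoning to help you debug and refine your prompts.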
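Control the thinking budget
To control how much thinking the model can do to generate its response, you can specify the number of thinking budget tokens that it's allowed to use.
You can manually set the thinking budget when you need more or fewer tokens than the default thinking budget. More detailed guidance about task complexity and suggested budgets appears later in this section. Here's some high-level guidance:
- Set a low thinking budget if latency is important or for less complex tasks
- Set a high thinking budget for more complex tasks
Set the thinking budget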
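Set the thinking budget in a GenerationConfig as part of creating the GenerativeModel instance. The configuration is maintained for the lifetime of the instance. If you want to use different thinking budgets for different requests, create separate GenerativeModel instances, each configured with its own budget.
Supported thinking budget values are listed later in this section.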
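Because the budget is fixed per instance, an app that needs several budgets may want to create each instance once and reuse it. The following is a minimal sketch of that idea, not part of the SDK: `makeModelCache` and `createModel` are hypothetical names, and `createModel` stands in for whatever call your app uses to build a model from a generation config (for example, a wrapper around `getGenerativeModel`).

```javascript
// Hypothetical helper: keeps one model instance per thinking budget.
// `createModel` is supplied by the caller and receives a generation config.
function makeModelCache(createModel) {
  const cache = new Map();
  return function modelForBudget(thinkingBudget) {
    if (!cache.has(thinkingBudget)) {
      const generationConfig = { thinkingConfig: { thinkingBudget } };
      cache.set(thinkingBudget, createModel(generationConfig));
    }
    return cache.get(thinkingBudget);
  };
}

// Usage with a stand-in factory that only records the config it was given.
const modelForBudget = makeModelCache((generationConfig) => ({ generationConfig }));
const fastModel = modelForBudget(512);
const deepModel = modelForBudget(8192);

console.log(fastModel.generationConfig.thinkingConfig.thinkingBudget); // 512
console.log(modelForBudget(512) === fastModel); // true (instance is reused)
```

In a real app, the stand-in factory would be replaced by a function that creates an actual GenerativeModel instance with the given config.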
Swift
Set the thinking budget in a GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    // Set the thinking configuration
    // Use a thinking budget value appropriate for your model (example value shown here)
    let generationConfig = GenerationConfig(
      thinkingConfig: ThinkingConfig(thinkingBudget: 1024)
    )

    // Specify the config as part of creating the `GenerativeModel` instance
    let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
      modelName: "GEMINI_MODEL_NAME",
      generationConfig: generationConfig
    )

    // ...

Kotlin
Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    // Set the thinking configuration
    // Use a thinking budget value appropriate for your model (example value shown here)
    val generationConfig = generationConfig {
      thinkingConfig = thinkingConfig {
        thinkingBudget = 1024
      }
    }

    // Specify the config as part of creating the `GenerativeModel` instance
    val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
      modelName = "GEMINI_MODEL_NAME",
      generationConfig,
    )

    // ...

Java
Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    // Set the thinking configuration
    // Use a thinking budget value appropriate for your model (example value shown here)
    ThinkingConfig thinkingConfig = new ThinkingConfig.Builder()
        .setThinkingBudget(1024)
        .build();

    GenerationConfig generationConfig = GenerationConfig.builder()
        .setThinkingConfig(thinkingConfig)
        .build();

    // Specify the config as part of creating the `GenerativeModel` instance
    GenerativeModelFutures model = GenerativeModelFutures.from(
        FirebaseAI.getInstance(GenerativeBackend.googleAI())
            .generativeModel(
                /* modelName */ "GEMINI_MODEL_NAME",
                /* generationConfig */ generationConfig
            )
    );

    // ...

Web
Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

    // Set the thinking configuration
    // Use a thinking budget value appropriate for your model (example value shown here)
    const generationConfig = {
      thinkingConfig: {
        thinkingBudget: 1024
      }
    };

    // Specify the config as part of creating the `GenerativeModel` instance
    const model = getGenerativeModel(ai, { model: "GEMINI_MODEL_NAME", generationConfig });

    // ...

Dart
Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    // Set the thinking configuration
    // Use a thinking budget value appropriate for your model (example value shown here)
    final thinkingConfig = ThinkingConfig(thinkingBudget: 1024);

    final generationConfig = GenerationConfig(
      thinkingConfig: thinkingConfig
    );

    // Specify the config as part of creating the `GenerativeModel` instance
    final model = FirebaseAI.googleAI().generativeModel(
      model: 'GEMINI_MODEL_NAME',
      generationConfig: generationConfig,
    );

    // ...

Unity
Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    // Set the thinking configuration
    // Use a thinking budget value appropriate for your model (example value shown here)
    var thinkingConfig = new ThinkingConfig(thinkingBudget: 1024);

    var generationConfig = new GenerationConfig(
      thinkingConfig: thinkingConfig
    );

    // Specify the config as part of creating the `GenerativeModel` instance
    var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
      modelName: "GEMINI_MODEL_NAME",
      generationConfig: generationConfig
    );

    // ...

Supported thinking budget values
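The following table lists the thinking budget values that you can set for each model by configuring the model's thinkingBudget.
| Model | Default value | Minimum budget | Maximum budget | Value to disable thinking | Value to enable dynamic thinking |
|---|---|---|---|---|---|
| Gemini 2.5 Pro | 8,192 | 128 | 32,768 | cannot be turned off | -1 |
| Gemini 2.5 Flash | 8,192 | 1 | 24,576 | 0 | -1 |
| Gemini 2.5 Flash‑Lite | 0 (thinking is disabled by default) | 512 | 24,576 | 0 (or don't configure a thinking budget at all) | -1 |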
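The limits in the table can be checked on the client before a request is sent. The sketch below is illustrative only: `THINKING_LIMITS` and `checkThinkingBudget` are hypothetical names that are not part of any SDK, and the numbers are copied from the table above, so they would need updating if the supported values change.

```javascript
// Limits copied from the table above (hypothetical lookup, not an SDK constant).
const THINKING_LIMITS = {
  "gemini-2.5-pro": { min: 128, max: 32768, canDisable: false },
  "gemini-2.5-flash": { min: 1, max: 24576, canDisable: true },
  "gemini-2.5-flash-lite": { min: 512, max: 24576, canDisable: true },
};

// Returns null when the budget is acceptable, otherwise a short error message.
function checkThinkingBudget(modelName, budget) {
  const limits = THINKING_LIMITS[modelName];
  if (!limits) return `no limits known for ${modelName}`;
  if (!Number.isInteger(budget)) return "budget must be an integer";
  if (budget === -1) return null; // -1 enables dynamic thinking
  if (budget === 0) {
    return limits.canDisable ? null : `thinking cannot be disabled for ${modelName}`;
  }
  if (budget < limits.min || budget > limits.max) {
    return `budget must be between ${limits.min} and ${limits.max}`;
  }
  return null;
}

console.log(checkThinkingBudget("gemini-2.5-flash", 1024));
console.log(checkThinkingBudget("gemini-2.5-pro", 0));
console.log(checkThinkingBudget("gemini-2.5-flash-lite", 100));
```

Returning a message instead of throwing keeps the helper easy to use in form validation or logging; an app could just as well throw.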
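Disable thinking
For some easier tasks, the thinking capability isn't necessary, and traditional inference is sufficient. Or, if reducing latency is a priority, you may not want the model to take any more time than necessary to generate a response.
In these situations, you can disable (or turn off) thinking:
- Gemini 2.5 Pro: thinking cannot be disabled
- Gemini 2.5 Flash: set thinkingBudget to 0 tokens
- Gemini 2.5 Flash‑Lite: thinking is disabled by default
Enable dynamic thinking
You can let the model decide when and how much it thinks (called dynamic thinking) by setting thinkingBudget to -1. The model can use as many tokens as it considers appropriate, up to the maximum token value listed above for that model.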
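Task complexity
Easy tasks - thinking isn't needed
Straightforward requests that don't require complex reasoning, such as fact retrieval or classification. Examples:
- "Where was DeepMind founded?"
- "Is this email asking for a meeting or just providing information?"
Medium tasks - the default budget or some additional thinking budget is needed
Common requests that benefit from a degree of step-by-step processing or deeper understanding. Examples:
- "Create an analogy between photosynthesis and growing up."
- "Compare and contrast electric cars and hybrid cars."
Hard tasks - the maximum thinking budget may be needed
Truly complex challenges, such as solving complex math problems or coding tasks. These tasks require the model to engage its full reasoning and planning capabilities, often involving many internal steps before it provides an answer. Examples:
- "Solve problem 1 in AIME 2025: Find the sum of all integer bases b > 9 for which 17_b is a divisor of 97_b." (Here 17_b and 97_b are numerals written in base b.)
- "Write Python code for a web application that visualizes real-time stock market data, including user authentication. Make it as efficient as possible."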
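One way to apply these tiers is to map a coarse difficulty label to a budget. The following sketch is only a heuristic built on stated assumptions: `budgetForTask` and the tier names `"easy"`, `"medium"`, and `"hard"` are hypothetical, the numbers come from the supported-values table for Gemini 2.5 Pro and Gemini 2.5 Flash, and an easy task on a model that cannot disable thinking falls back to that model's minimum budget.

```javascript
// Values taken from the supported thinking budget table (two models only).
const BUDGET_ROWS = {
  "gemini-2.5-pro": { defaultBudget: 8192, min: 128, max: 32768, canDisable: false },
  "gemini-2.5-flash": { defaultBudget: 8192, min: 1, max: 24576, canDisable: true },
};

// Hypothetical heuristic: easy -> off (or minimum), medium -> default, hard -> maximum.
function budgetForTask(modelName, tier) {
  const row = BUDGET_ROWS[modelName];
  if (!row) throw new Error(`no budget data for ${modelName}`);
  switch (tier) {
    case "easy":
      return row.canDisable ? 0 : row.min;
    case "medium":
      return row.defaultBudget;
    case "hard":
      return row.max;
    default:
      throw new Error(`unknown tier: ${tier}`);
  }
}

// Build a generation config for a hard task on Gemini 2.5 Flash.
const hardTaskConfig = {
  thinkingConfig: { thinkingBudget: budgetForTask("gemini-2.5-flash", "hard") },
};
console.log(hardTaskConfig.thinkingConfig.thinkingBudget); // 24576
```

Setting the budget to -1 (dynamic thinking) is an alternative when the difficulty of incoming requests isn't known in advance.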
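Include thought summaries in responses
Thought summaries are synthesized versions of the model's raw thoughts and offer insight into the model's internal reasoning process.
Here are some reasons to include thought summaries in responses:
- You can display the thought summary in your app's UI or make it accessible to your users. The thought summary is returned as a separate part of the response, which gives you more control over how it's used in your app.
- If you also enable AI monitoring in the Firebase console, thought summaries are displayed in the console, where you can inspect the model's detailed reasoning to help you debug and refine your prompts.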
Here are some important notes about thought summaries:
Enable thought summaries
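You can enable thought summaries by setting includeThoughts to true in your model configuration. You can then access the summary by checking the thoughtSummary field of the response.
The following example shows how to enable thought summaries and retrieve them from the response: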
Swift
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    // Set the thinking configuration
    // Optionally enable thought summaries in the generated response (default is false)
    let generationConfig = GenerationConfig(
      thinkingConfig: ThinkingConfig(includeThoughts: true)
    )

    // Specify the config as part of creating the `GenerativeModel` instance
    let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
      modelName: "GEMINI_MODEL_NAME",
      generationConfig: generationConfig
    )

    let response = try await model.generateContent("solve x^2 + 4x + 4 = 0")

    // Handle the response that includes thought summaries
    if let thoughtSummary = response.thoughtSummary {
      print("Thought Summary: \(thoughtSummary)")
    }
    guard let text = response.text else {
      fatalError("No text in response.")
    }
    print("Answer: \(text)")

Kotlin
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    // Set the thinking configuration
    // Optionally enable thought summaries in the generated response (default is false)
    val generationConfig = generationConfig {
      thinkingConfig = thinkingConfig {
        includeThoughts = true
      }
    }

    // Specify the config as part of creating the `GenerativeModel` instance
    val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
      modelName = "GEMINI_MODEL_NAME",
      generationConfig,
    )

    val response = model.generateContent("solve x^2 + 4x + 4 = 0")

    // Handle the response that includes thought summaries
    response.thoughtSummary?.let {
      println("Thought Summary: $it")
    }
    response.text?.let {
      println("Answer: $it")
    }

Java
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    // Set the thinking configuration
    // Optionally enable thought summaries in the generated response (default is false)
    ThinkingConfig thinkingConfig = new ThinkingConfig.Builder()
        .setIncludeThoughts(true)
        .build();

    GenerationConfig generationConfig = GenerationConfig.builder()
        .setThinkingConfig(thinkingConfig)
        .build();

    // Specify the config as part of creating the `GenerativeModel` instance
    GenerativeModelFutures model = GenerativeModelFutures.from(
        FirebaseAI.getInstance(GenerativeBackend.googleAI())
            .generativeModel(
                /* modelName */ "GEMINI_MODEL_NAME",
                /* generationConfig */ generationConfig
            )
    );

    // Handle the response that includes thought summaries
    ListenableFuture<GenerateContentResponse> responseFuture =
        model.generateContent("solve x^2 + 4x + 4 = 0");
    Futures.addCallback(responseFuture, new FutureCallback<GenerateContentResponse>() {
        @Override
        public void onSuccess(GenerateContentResponse response) {
            if (response.getThoughtSummary() != null) {
                System.out.println("Thought Summary: " + response.getThoughtSummary());
            }
            if (response.getText() != null) {
                System.out.println("Answer: " + response.getText());
            }
        }

        @Override
        public void onFailure(Throwable t) {
            // Handle error
        }
    }, MoreExecutors.directExecutor());

Web
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

    // Set the thinking configuration
    // Optionally enable thought summaries in the generated response (default is false)
    const generationConfig = {
      thinkingConfig: {
        includeThoughts: true
      }
    };

    // Specify the config as part of creating the `GenerativeModel` instance
    const model = getGenerativeModel(ai, { model: "GEMINI_MODEL_NAME", generationConfig });

    const result = await model.generateContent("solve x^2 + 4x + 4 = 0");
    const response = result.response;

    // Handle the response that includes thought summaries
    if (response.thoughtSummary()) {
      console.log(`Thought Summary: ${response.thoughtSummary()}`);
    }
    const text = response.text();
    console.log(`Answer: ${text}`);

Dart
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    // Set the thinking configuration
    // Optionally enable thought summaries in the generated response (default is false)
    final thinkingConfig = ThinkingConfig(includeThoughts: true);

    final generationConfig = GenerationConfig(
      thinkingConfig: thinkingConfig
    );

    // Specify the config as part of creating the `GenerativeModel` instance
    final model = FirebaseAI.googleAI().generativeModel(
      model: 'GEMINI_MODEL_NAME',
      generationConfig: generationConfig,
    );

    // The prompt is passed as a list of `Content` objects
    final response = await model.generateContent(
      [Content.text('solve x^2 + 4x + 4 = 0')],
    );

    // Handle the response that includes thought summaries
    if (response.thoughtSummary != null) {
      print('Thought Summary: ${response.thoughtSummary}');
    }
    if (response.text != null) {
      print('Answer: ${response.text}');
    }

Unity
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    // Set the thinking configuration
    // Optionally enable thought summaries in the generated response (default is false)
    var thinkingConfig = new ThinkingConfig(includeThoughts: true);

    var generationConfig = new GenerationConfig(
      thinkingConfig: thinkingConfig
    );

    // Specify the config as part of creating the `GenerativeModel` instance
    var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
      modelName: "GEMINI_MODEL_NAME",
      generationConfig: generationConfig
    );

    var response = await model.GenerateContentAsync("solve x^2 + 4x + 4 = 0");

    // Handle the response that includes thought summaries
    if (response.ThoughtSummary != null) {
      Debug.Log($"Thought Summary: {response.ThoughtSummary}");
    }
    if (response.Text != null) {
      Debug.Log($"Answer: {response.Text}");
    }

Stream thought summaries
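You can also view thought summaries if you choose to stream a response using generateContentStream. Streaming returns rolling, incremental summaries while the response is being generated.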
Swift
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    // Set the thinking configuration
    // Optionally enable thought summaries in the generated response (default is false)
    let generationConfig = GenerationConfig(
      thinkingConfig: ThinkingConfig(includeThoughts: true)
    )

    // Specify the config as part of creating the `GenerativeModel` instance
    let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
      modelName: "GEMINI_MODEL_NAME",
      generationConfig: generationConfig
    )

    let stream = try model.generateContentStream("solve x^2 + 4x + 4 = 0")

    // Handle the streamed response that includes thought summaries
    var thoughts = ""
    var answer = ""
    for try await response in stream {
      if let thought = response.thoughtSummary {
        if thoughts.isEmpty {
          print("--- Thoughts Summary ---")
        }
        print(thought)
        thoughts += thought
      }

      if let text = response.text {
        if answer.isEmpty {
          print("--- Answer ---")
        }
        print(text)
        answer += text
      }
    }

Kotlin
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    // Set the thinking configuration
    // Optionally enable thought summaries in the generated response (default is false)
    val generationConfig = generationConfig {
      thinkingConfig = thinkingConfig {
        includeThoughts = true
      }
    }

    // Specify the config as part of creating the `GenerativeModel` instance
    val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
      modelName = "GEMINI_MODEL_NAME",
      generationConfig,
    )

    // Handle the streamed response that includes thought summaries
    var thoughts = ""
    var answer = ""
    model.generateContentStream("solve x^2 + 4x + 4 = 0").collect { response ->
      response.thoughtSummary?.let {
        if (thoughts.isEmpty()) {
          println("--- Thoughts Summary ---")
        }
        print(it)
        thoughts += it
      }
      response.text?.let {
        if (answer.isEmpty()) {
          println("--- Answer ---")
        }
        print(it)
        answer += it
      }
    }

Java
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    // Set the thinking configuration
    // Optionally enable thought summaries in the generated response (default is false)
    ThinkingConfig thinkingConfig = new ThinkingConfig.Builder()
        .setIncludeThoughts(true)
        .build();

    GenerationConfig generationConfig = GenerationConfig.builder()
        .setThinkingConfig(thinkingConfig)
        .build();

    // Specify the config as part of creating the `GenerativeModel` instance
    GenerativeModelFutures model = GenerativeModelFutures.from(
        FirebaseAI.getInstance(GenerativeBackend.googleAI())
            .generativeModel(
                /* modelName */ "GEMINI_MODEL_NAME",
                /* generationConfig */ generationConfig
            )
    );

    // Streaming with Java is complex and depends on the async library used.
    // This is a conceptual example using a reactive stream.
    Flowable<GenerateContentResponse> responseStream =
        model.generateContentStream("solve x^2 + 4x + 4 = 0");

    // Handle the streamed response that includes thought summaries
    StringBuilder thoughts = new StringBuilder();
    StringBuilder answer = new StringBuilder();

    responseStream.subscribe(response -> {
        if (response.getThoughtSummary() != null) {
            if (thoughts.length() == 0) {
                System.out.println("--- Thoughts Summary ---");
            }
            System.out.print(response.getThoughtSummary());
            thoughts.append(response.getThoughtSummary());
        }
        if (response.getText() != null) {
            if (answer.length() == 0) {
                System.out.println("--- Answer ---");
            }
            System.out.print(response.getText());
            answer.append(response.getText());
        }
    }, throwable -> {
        // Handle error
    });

Web
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

    // Set the thinking configuration
    // Optionally enable thought summaries in the generated response (default is false)
    const generationConfig = {
      thinkingConfig: {
        includeThoughts: true
      }
    };

    // Specify the config as part of creating the `GenerativeModel` instance
    const model = getGenerativeModel(ai, { model: "GEMINI_MODEL_NAME", generationConfig });

    const result = await model.generateContentStream("solve x^2 + 4x + 4 = 0");

    // Handle the streamed response that includes thought summaries
    let thoughts = "";
    let answer = "";
    for await (const chunk of result.stream) {
      if (chunk.thoughtSummary()) {
        if (thoughts === "") {
          console.log("--- Thoughts Summary ---");
        }
        // In Node.js, process.stdout.write(chunk.thoughtSummary()) could be used
        // to avoid extra newlines.
        console.log(chunk.thoughtSummary());
        thoughts += chunk.thoughtSummary();
      }

      const text = chunk.text();
      if (text) {
        if (answer === "") {
          console.log("--- Answer ---");
        }
        // In Node.js, process.stdout.write(text) could be used.
        console.log(text);
        answer += text;
      }
    }

Dart
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    // Set the thinking configuration
    // Optionally enable thought summaries in the generated response (default is false)
    final thinkingConfig = ThinkingConfig(includeThoughts: true);

    final generationConfig = GenerationConfig(
      thinkingConfig: thinkingConfig
    );

    // Specify the config as part of creating the `GenerativeModel` instance
    final model = FirebaseAI.googleAI().generativeModel(
      model: 'GEMINI_MODEL_NAME',
      generationConfig: generationConfig,
    );

    // The prompt is passed as a list of `Content` objects
    final responses = model.generateContentStream(
      [Content.text('solve x^2 + 4x + 4 = 0')],
    );

    // Handle the streamed response that includes thought summaries.
    // This sample only accumulates the chunks; print them as needed.
    var thoughts = '';
    var answer = '';
    await for (final response in responses) {
      if (response.thoughtSummary != null) {
        if (thoughts.isEmpty) {
          print('--- Thoughts Summary ---');
        }
        thoughts += response.thoughtSummary!;
      }
      if (response.text != null) {
        if (answer.isEmpty) {
          print('--- Answer ---');
        }
        answer += response.text!;
      }
    }

Unity
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.

    // ...

    // Set the thinking configuration
    // Optionally enable thought summaries in the generated response (default is false)
    var thinkingConfig = new ThinkingConfig(includeThoughts: true);

    var generationConfig = new GenerationConfig(
      thinkingConfig: thinkingConfig
    );

    // Specify the config as part of creating the `GenerativeModel` instance
    var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
      modelName: "GEMINI_MODEL_NAME",
      generationConfig: generationConfig
    );

    var stream = model.GenerateContentStreamAsync("solve x^2 + 4x + 4 = 0");

    // Handle the streamed response that includes thought summaries
    var thoughts = "";
    var answer = "";
    await foreach (var response in stream) {
      if (response.ThoughtSummary != null) {
        if (string.IsNullOrEmpty(thoughts)) {
          Debug.Log("--- Thoughts Summary ---");
        }
        Debug.Log(response.ThoughtSummary);
        thoughts += response.ThoughtSummary;
      }
      if (response.Text != null) {
        if (string.IsNullOrEmpty(answer)) {
          Debug.Log("--- Answer ---");
        }
        Debug.Log(response.Text);
        answer += response.Text;
      }
    }

Understand thought signatures
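When you use thinking in multi-turn interactions, the model doesn't have access to the thought context from previous turns. However, if you're using function calling, you can take advantage of thought signatures to maintain thought context across turns. Thought signatures are encrypted representations of the model's internal thought process, and they're available when you use thinking together with function calling. Specifically, thought signatures are generated when both of the following are true:
- Thinking is enabled and thoughts are generated.
- The request includes function declarations.
To take advantage of thought signatures, use function calling as normal. The Firebase AI Logic SDKs simplify the process by managing state and handling thought signatures for you. Within a Chat session, the SDKs automatically pass any generated thought signatures between subsequent sendMessage or sendMessageStream calls.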
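Pricing and counting thinking tokens
Thinking tokens use the same pricing as text-output tokens. If you enable thought summaries, they are considered thinking tokens and are priced accordingly.
You can enable AI monitoring in the Firebase console to monitor the count of thinking tokens for requests that have thinking enabled.
You can get the total number of thinking tokens from the thoughtsTokenCount field in the usageMetadata attribute of the response: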
Swift

    // ...

    let response = try await model.generateContent("Why is the sky blue?")

    if let usageMetadata = response.usageMetadata {
      print("Thoughts Token Count: \(usageMetadata.thoughtsTokenCount)")
    }

Kotlin

    // ...

    val response = model.generateContent("Why is the sky blue?")

    response.usageMetadata?.let { usageMetadata ->
      println("Thoughts Token Count: ${usageMetadata.thoughtsTokenCount}")
    }

Java

    // ...

    ListenableFuture<GenerateContentResponse> response =
        model.generateContent("Why is the sky blue?");

    Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
        @Override
        public void onSuccess(GenerateContentResponse result) {
            // The usage metadata is an object, not a String
            UsageMetadata usageMetadata = result.getUsageMetadata();
            if (usageMetadata != null) {
                System.out.println("Thoughts Token Count: " +
                    usageMetadata.getThoughtsTokenCount());
            }
        }

        @Override
        public void onFailure(Throwable t) {
            t.printStackTrace();
        }
    }, executor);

Web

    // ...

    const result = await model.generateContent("Why is the sky blue?");
    // The usage metadata is attached to the `response` property of the result
    const response = result.response;

    if (response.usageMetadata?.thoughtsTokenCount != null) {
      console.log(`Thoughts Token Count: ${response.usageMetadata.thoughtsTokenCount}`);
    }

Dart

    // ...

    final response = await model.generateContent([
      Content.text("Why is the sky blue?"),
    ]);

    if (response.usageMetadata case final usageMetadata?) {
      print("Thoughts Token Count: ${usageMetadata.thoughtsTokenCount}");
    }

Unity

    // ...

    var response = await model.GenerateContentAsync("Why is the sky blue?");

    if (response.UsageMetadata != null) {
      UnityEngine.Debug.Log($"Thoughts Token Count: {response.UsageMetadata?.ThoughtsTokenCount}");
    }

To learn more about tokens, see the count tokens guide.
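Since thinking tokens are priced like text-output tokens, a rough output-side cost for a response can be estimated from its usage metadata. The sketch below is an illustration under stated assumptions: `estimateOutputCost` is a hypothetical helper, the price per million output tokens is a placeholder that you supply from the current pricing page, and it assumes that `candidatesTokenCount` counts only the visible answer, so the thinking tokens are added on top.

```javascript
// Hypothetical helper: estimates the output-side cost of one response.
// `pricePerMillionOutputTokens` is a caller-supplied placeholder, not a real price.
function estimateOutputCost(usageMetadata, pricePerMillionOutputTokens) {
  // Assumption: candidatesTokenCount excludes thinking tokens.
  const answerTokens = usageMetadata.candidatesTokenCount ?? 0;
  const thinkingTokens = usageMetadata.thoughtsTokenCount ?? 0;
  const billedTokens = answerTokens + thinkingTokens;
  return {
    answerTokens,
    thinkingTokens,
    billedTokens,
    cost: (billedTokens / 1e6) * pricePerMillionOutputTokens,
  };
}

// Example with made-up token counts and a made-up price of 2 per million tokens.
const estimate = estimateOutputCost(
  { candidatesTokenCount: 300, thoughtsTokenCount: 700 },
  2
);
console.log(estimate.billedTokens); // 1000
console.log(estimate.cost); // 0.002
```

The estimate ignores input tokens and any provider-specific pricing tiers, so treat it as a quick way to see how much of a response's output cost comes from thinking rather than as a billing calculation.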