```python
ChatGoogleGenerativeAI(self, **kwargs: Any = {})
```

Bases: `_BaseGoogleGenerativeAI`, `BaseChatModel`

Enforce a schema on the output.
The format of the dictionary should follow JSON Schema specification.
The Google GenAI SDK automatically transforms schemas for Gemini compatibility:
- `$defs` definitions (enables Union types with `anyOf`)
- `$ref` pointers for nested/recursive schemas
- `minimum`/`maximum`, `minItems`/`maxItems`

Union types in Pydantic models (e.g., `field: Union[TypeA, TypeB]`) are automatically converted to `anyOf` schemas and work correctly with the `json_schema` method.
Refer to the Gemini API docs for more details on supported JSON Schema features.
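To illustrate the `anyOf` conversion described above, here is a minimal sketch (the `Circle`/`Square`/`Shape` models are hypothetical, chosen only for this example) showing that a Pydantic `Union` field already emits the `$defs`/`anyOf` JSON Schema shapes the SDK forwards to Gemini:

```python
from typing import Union

from pydantic import BaseModel


class Circle(BaseModel):
    radius: float


class Square(BaseModel):
    side: float


class Shape(BaseModel):
    # A Union field is emitted as `anyOf` in JSON Schema
    shape: Union[Circle, Square]


schema = Shape.model_json_schema()
print("$defs" in schema)  # nested models land in $defs
print("anyOf" in schema["properties"]["shape"])  # Union -> anyOf
```

Passing `Shape` to `with_structured_output` therefore works without manual schema rewriting.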
Google GenAI chat model integration.
Setup:
Added in langchain-google-genai 4.0.0.
ChatGoogleGenerativeAI now supports both the Gemini Developer API and Vertex AI Platform as backend options.
For Gemini Developer API (simplest):
- `GOOGLE_API_KEY` environment variable (recommended), or
- `api_key` parameter

```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview", api_key="...")
```

For Vertex AI Platform with API key:
```bash
export GEMINI_API_KEY='your-api-key'
export GOOGLE_GENAI_USE_VERTEXAI=true
export GOOGLE_CLOUD_PROJECT='your-project-id'
```

```python
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")

# Or explicitly:
model = ChatGoogleGenerativeAI(
    model="gemini-3.1-pro-preview",
    api_key="...",
    project="your-project-id",
    vertexai=True,
)
```

For Vertex AI with credentials:
```python
model = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    project="your-project-id",
    # Uses Application Default Credentials (ADC)
)
```

Automatic backend detection (when `vertexai=None` / unspecified):
- If the `GOOGLE_GENAI_USE_VERTEXAI` env var is set, uses that value
- If the `credentials` parameter is provided, uses Vertex AI
- If the `project` parameter is provided, uses Vertex AI

Environment variables:
| Variable | Purpose | Backend |
|---|---|---|
| `GOOGLE_API_KEY` | API key (primary) | Both (see `GOOGLE_GENAI_USE_VERTEXAI`) |
| `GEMINI_API_KEY` | API key (fallback) | Both (see `GOOGLE_GENAI_USE_VERTEXAI`) |
| `GOOGLE_GENAI_USE_VERTEXAI` | Force Vertex AI backend (`true`/`false`) | Vertex AI |
| `GOOGLE_CLOUD_PROJECT` | GCP project ID | Vertex AI |
| `GOOGLE_CLOUD_LOCATION` | GCP region (default: `global`) | Vertex AI |
| `HTTPS_PROXY` | HTTP/HTTPS proxy URL | Both |
| `SSL_CERT_FILE` | Custom SSL certificate file | Both |
GOOGLE_API_KEY is checked first for backwards compatibility. (GEMINI_API_KEY was introduced later to better reflect the API's branding.)
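A quick sketch of that lookup order, using plain `os.environ` (this illustrates the documented precedence, not the library's actual internals):

```python
import os

# Simulate both keys being set; GOOGLE_API_KEY is checked first.
os.environ["GEMINI_API_KEY"] = "fallback-key"
os.environ["GOOGLE_API_KEY"] = "primary-key"

api_key = os.environ.get("GOOGLE_API_KEY") or os.environ.get("GEMINI_API_KEY")
print(api_key)  # primary-key
```

If only `GEMINI_API_KEY` is set, the fallback is used instead.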
Proxy configuration:
Set these before initializing:
```bash
export HTTPS_PROXY='http://username:password@proxy_uri:port'
export SSL_CERT_FILE='path/to/cert.pem'  # Optional: custom SSL certificate
```

For SOCKS5 proxies or advanced proxy configuration, use the `client_args` parameter:
```python
model = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    client_args={"proxy": "socks5://user:pass@host:port"},
)
```

```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
model.invoke("Write me a ballad about LangChain")
```

```python
messages = [
    ("system", "Translate the user sentence to French."),
    ("human", "I love programming."),
]
model.invoke(messages)
```

```python
AIMessage(
    content=[
        {
            "type": "text",
            "text": "**J'adore la programmation.**\n\nYou can also say:...",
            "extras": {"signature": "Eq0W..."},
        }
    ],
    additional_kwargs={},
    response_metadata={
        "prompt_feedback": {"block_reason": 0, "safety_ratings": []},
        "finish_reason": "STOP",
        "model_name": "gemini-3.1-pro-preview",
        "safety_ratings": [],
        "model_provider": "google_genai",
    },
    id="lc_run--63a04ced-6b63-4cf6-86a1-c32fa565938e-0",
    usage_metadata={
        "input_tokens": 12,
        "output_tokens": 826,
        "total_tokens": 838,
        "input_token_details": {"cache_read": 0},
        "output_token_details": {"reasoning": 777},
    },
)
```

**Content format:** The shape of `content` may differ based on the model chosen. See the docs for more info.
```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-2.5-flash")
for chunk in model.stream(messages):
    print(chunk)
```

```python
AIMessageChunk(
    content="J",
    response_metadata={"finish_reason": "STOP", "safety_ratings": []},
    id="run-e905f4f4-58cb-4a10-a960-448a2bb649e3",
    usage_metadata={
        "input_tokens": 18,
        "output_tokens": 1,
        "total_tokens": 19,
    },
)
AIMessageChunk(
    content="'adore programmer. \\n",
    response_metadata={
        "finish_reason": "STOP",
        "safety_ratings": [
            {
                "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
                "probability": "NEGLIGIBLE",
                "blocked": False,
            },
            {
                "category": "HARM_CATEGORY_HATE_SPEECH",
                "probability": "NEGLIGIBLE",
                "blocked": False,
            },
            {
                "category": "HARM_CATEGORY_HARASSMENT",
                "probability": "NEGLIGIBLE",
                "blocked": False,
            },
            {
                "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
                "probability": "NEGLIGIBLE",
                "blocked": False,
            },
        ],
    },
    id="run-e905f4f4-58cb-4a10-a960-448a2bb649e3",
    usage_metadata={
        "input_tokens": 18,
        "output_tokens": 5,
        "total_tokens": 23,
    },
)
```

To assemble a full `AIMessage` from a stream of chunks:
```python
stream = model.stream(messages)
full = next(stream)
for chunk in stream:
    full += chunk
full
```

```python
AIMessageChunk(
    content="J'adore programmer. \\n",
    response_metadata={
        "finish_reason": "STOPSTOP",
        "safety_ratings": [
            {
                "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
                "probability": "NEGLIGIBLE",
                "blocked": False,
            },
            {
                "category": "HARM_CATEGORY_HATE_SPEECH",
                "probability": "NEGLIGIBLE",
                "blocked": False,
            },
            {
                "category": "HARM_CATEGORY_HARASSMENT",
                "probability": "NEGLIGIBLE",
                "blocked": False,
            },
            {
                "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
                "probability": "NEGLIGIBLE",
                "blocked": False,
            },
        ],
    },
    id="run-3ce13a42-cd30-4ad7-a684-f1f0b37cdeec",
    usage_metadata={
        "input_tokens": 36,
        "output_tokens": 6,
        "total_tokens": 42,
    },
)
```

**Content format:** The shape of `content` may differ based on the model chosen. See the docs for more info.
```python
await model.ainvoke(messages)

# stream:
# async for chunk in model.astream(messages): ...

# batch:
# await model.abatch([messages])
```

See the docs for more info.
```python
from pydantic import BaseModel, Field


class GetWeather(BaseModel):
    '''Get the current weather in a given location'''

    location: str = Field(
        ..., description="The city and state, e.g. San Francisco, CA"
    )


class GetPopulation(BaseModel):
    '''Get the current population in a given location'''

    location: str = Field(
        ..., description="The city and state, e.g. San Francisco, CA"
    )


model_with_tools = model.bind_tools([GetWeather, GetPopulation])
ai_msg = model_with_tools.invoke(
    "Which city is hotter today and which is bigger: LA or NY?"
)
ai_msg.tool_calls
```

```python
[
    {
        "name": "GetWeather",
        "args": {"location": "Los Angeles, CA"},
        "id": "c186c99f-f137-4d52-947f-9e3deabba6f6",
    },
    {
        "name": "GetWeather",
        "args": {"location": "New York City, NY"},
        "id": "cebd4a5d-e800-4fa5-babd-4aa286af4f31",
    },
    {
        "name": "GetPopulation",
        "args": {"location": "Los Angeles, CA"},
        "id": "4f92d897-f5e4-4d34-a3bc-93062c92591e",
    },
    {
        "name": "GetPopulation",
        "args": {"location": "New York City, NY"},
        "id": "634582de-5186-4e4b-968b-f192f0a93678",
    },
]
```

See the docs for more info.
```python
from typing import Optional

from pydantic import BaseModel, Field


class Joke(BaseModel):
    '''Joke to tell user.'''

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(
        description="How funny the joke is, from 1 to 10"
    )


# Default method uses json_schema for reliable structured output
structured_model = model.with_structured_output(Joke)
structured_model.invoke("Tell me a joke about cats")

# Alternative: use function_calling method (less reliable)
structured_model_fc = model.with_structured_output(
    Joke, method="function_calling"
)
```

```python
Joke(
    setup="Why are cats so good at video games?",
    punchline="They have nine lives on the internet",
    rating=None,
)
```

Two methods are supported for structured output:
- `method='json_schema'` (default): Uses Gemini's native structured output API. The Google GenAI SDK automatically transforms schemas to ensure compatibility with Gemini, including:
    - `$defs` definitions (Union types work correctly)
    - `$ref` references for nested schemas

  Uses Gemini's `response_json_schema` API param. Refer to the Gemini API docs for more details. This method is recommended for better reliability, as it constrains the model's generation process directly.

- `method='function_calling'`: Uses tool calling to extract structured data. Less reliable than `json_schema`, but compatible with all models.
See the docs for more info.
```python
import base64

import httpx
from langchain.messages import HumanMessage

image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8")
message = HumanMessage(
    content=[
        {"type": "text", "text": "describe the weather in this image"},
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{image_data}"},
        },
    ]
)
ai_msg = model.invoke([message])
ai_msg.content
```

```
The weather in this image appears to be sunny and pleasant. The sky is a bright blue with scattered white clouds, suggesting fair weather. The lush green grass and trees indicate a warm and possibly slightly breezy day. There are no...
```

See the docs for more info.
```python
import base64

from langchain.messages import HumanMessage

with open("/path/to/your/test.pdf", "rb") as f:
    pdf_bytes = f.read()
pdf_base64 = base64.b64encode(pdf_bytes).decode("utf-8")
message = HumanMessage(
    content=[
        {"type": "text", "text": "describe the document in a sentence"},
        {
            "type": "file",
            "source_type": "base64",
            "mime_type": "application/pdf",
            "data": pdf_base64,
        },
    ]
)
ai_msg = model.invoke([message])
```

See the docs for more info.
```python
import base64

from langchain.messages import HumanMessage

with open("/path/to/your/audio.mp3", "rb") as f:
    audio_bytes = f.read()
audio_base64 = base64.b64encode(audio_bytes).decode("utf-8")
message = HumanMessage(
    content=[
        {"type": "text", "text": "summarize this audio in a sentence"},
        {
            "type": "file",
            "source_type": "base64",
            "mime_type": "audio/mp3",
            "data": audio_base64,
        },
    ]
)
ai_msg = model.invoke([message])
```

See the docs for more info.
```python
import base64

from langchain.messages import HumanMessage

with open("/path/to/your/video.mp4", "rb") as f:
    video_bytes = f.read()
video_base64 = base64.b64encode(video_bytes).decode("utf-8")
message = HumanMessage(
    content=[
        {
            "type": "text",
            "text": "describe what's in this video in a sentence",
        },
        {
            "type": "file",
            "source_type": "base64",
            "mime_type": "video/mp4",
            "data": video_base64,
        },
    ]
)
ai_msg = model.invoke([message])
```

You can also pass YouTube URLs directly:
```python
from langchain_core.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
message = HumanMessage(
    content=[
        {"type": "text", "text": "Summarize the video in 3 sentences."},
        {
            "type": "media",
            "file_uri": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
            "mime_type": "video/mp4",
        },
    ]
)
response = model.invoke([message])
print(response.text)
```

See the docs for more info.
Audio generation models (TTS) are currently in preview on Vertex AI and may require allowlist access. If you receive an INVALID_ARGUMENT error when using TTS models with vertexai=True, your project may need to be allowlisted.
See this post on the Google AI forum for more details.
You can also upload files to Google's servers and reference them by URI.
This works for PDFs, images, videos, and audio files.
```python
import time

from google import genai
from langchain.messages import HumanMessage

client = genai.Client()

myfile = client.files.upload(file="/path/to/your/sample.pdf")
while myfile.state.name == "PROCESSING":
    time.sleep(2)
    myfile = client.files.get(name=myfile.name)

message = HumanMessage(
    content=[
        {"type": "text", "text": "What is in the document?"},
        {
            "type": "media",
            "file_uri": myfile.uri,
            "mime_type": "application/pdf",
        },
    ]
)
ai_msg = model.invoke([message])
```

See the docs for more info.
Gemini 3+ models use `thinking_level` (`'low'`, `'medium'`, or `'high'`) to control reasoning depth. If not specified, defaults to `'high'`.
```python
model = ChatGoogleGenerativeAI(
    model="gemini-3.1-pro-preview",
    thinking_level="low",  # For faster, lower-latency responses
)
```

Gemini 2.5 models use `thinking_budget` (an integer token count) to control reasoning. Set it to `0` to disable thinking (where supported), or `-1` for dynamic thinking.
See the Gemini API docs for more details on thinking models.
To see a thinking model's thoughts, set `include_thoughts=True` to include the model's reasoning summaries in the response.
```python
model = ChatGoogleGenerativeAI(
    model="gemini-3.1-pro-preview",
    include_thoughts=True,
)
ai_msg = model.invoke("How many 'r's are in the word 'strawberry'?")
```

Gemini 3+ models return thought signatures: encrypted representations of the model's internal reasoning.
For multi-turn conversations involving tool calls, you must pass the full AIMessage back to the model so that these signatures are preserved. This happens automatically when you append the AIMessage to your message list.
See the LangChain docs for more info as well as a code example.
See the Gemini API docs for more details on thought signatures.
```python
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
response = model.invoke(
    "When is the next total solar eclipse in US?",
    tools=[{"google_search": {}}],
)
response.content_blocks
```

Alternatively, you can bind the tool to the model for easier reuse across calls:
```python
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
model_with_search = model.bind_tools([{"google_search": {}}])
response = model_with_search.invoke(
    "When is the next total solar eclipse in US?"
)
response.content_blocks
```

See the docs for more info.
```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
model_with_code_interpreter = model.bind_tools([{"code_execution": {}}])
response = model_with_code_interpreter.invoke("Use Python to calculate 3^3.")
response.content_blocks
```

```python
[
    {
        "type": "server_tool_call",
        "name": "code_interpreter",
        "args": {"code": "print(3**3)", "language": <Language.PYTHON: 1>},
        "id": "...",
    },
    {
        "type": "server_tool_result",
        "tool_call_id": "",
        "status": "success",
        "output": "27\n",
        "extras": {"block_type": "code_execution_result", "outcome": 1},
    },
    {"type": "text", "text": "The calculation of 3 to the power of 3 is 27."},
]
```

See the docs for more info.
The Computer Use model is in preview and may produce unexpected behavior.
Always supervise automated tasks and avoid use with sensitive data or critical operations. See the Gemini API docs for safety best practices.
```python
ai_msg = model.invoke(messages)
ai_msg.usage_metadata
```

```python
{"input_tokens": 18, "output_tokens": 5, "total_tokens": 23}
```

Gemini models have default safety settings that can be overridden. If you are receiving lots of "Safety Warnings" from your models, you can try tweaking the `safety_settings` attribute of the model. For example, to turn off safety blocking for dangerous content, you can construct your LLM as follows:
```python
from langchain_google_genai import (
    ChatGoogleGenerativeAI,
    HarmBlockThreshold,
    HarmCategory,
)

llm = ChatGoogleGenerativeAI(
    model="gemini-3.1-pro-preview",
    safety_settings={
        HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
    },
)
```

For an enumeration of the categories and thresholds available, see Google's safety setting types.
See the docs for more info.
Context caching allows you to store and reuse content (e.g., PDFs, images) for faster processing. The cached_content parameter accepts a cache name created via the Google Generative AI API.
See the Gemini docs for more details on cached content.
Below are two examples: caching a single file directly, and caching multiple files using `Part`.
This caches a single file and queries it.
```python
import time

from google import genai
from google.genai import types
from langchain.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

client = genai.Client()

# Upload file
file = client.files.upload(file="path/to/your/file")
while file.state.name == "PROCESSING":
    time.sleep(2)
    file = client.files.get(name=file.name)

# Create cache
model = "gemini-3.1-pro-preview"
cache = client.caches.create(
    model=model,
    config=types.CreateCachedContentConfig(
        display_name="Cached Content",
        system_instruction=(
            "You are an expert content analyzer, and your job is to answer "
            "the user's query based on the file you have access to."
        ),
        contents=[file],
        ttl="300s",
    ),
)

# Query with LangChain
llm = ChatGoogleGenerativeAI(
    model=model,
    cached_content=cache.name,
)
message = HumanMessage(content="Summarize the main points of the content.")
llm.invoke([message])
```

This caches two files using `Part` and queries them together.
```python
import time

from google import genai
from google.genai.types import Content, CreateCachedContentConfig, Part
from langchain.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

client = genai.Client()

# Upload files
file_1 = client.files.upload(file="./file1")
while file_1.state.name == "PROCESSING":
    time.sleep(2)
    file_1 = client.files.get(name=file_1.name)

file_2 = client.files.upload(file="./file2")
while file_2.state.name == "PROCESSING":
    time.sleep(2)
    file_2 = client.files.get(name=file_2.name)

# Create cache with multiple files
contents = [
    Content(
        role="user",
        parts=[
            Part.from_uri(file_uri=file_1.uri, mime_type=file_1.mime_type),
            Part.from_uri(file_uri=file_2.uri, mime_type=file_2.mime_type),
        ],
    )
]
model = "gemini-3.1-pro-preview"
cache = client.caches.create(
    model=model,
    config=CreateCachedContentConfig(
        display_name="Cached Contents",
        system_instruction=(
            "You are an expert content analyzer, and your job is to answer "
            "the user's query based on the files you have access to."
        ),
        contents=contents,
        ttl="300s",
    ),
)

# Query with LangChain
llm = ChatGoogleGenerativeAI(
    model=model,
    cached_content=cache.name,
)
message = HumanMessage(
    content="Provide a summary of the key information across both files."
)
llm.invoke([message])
```

```python
ai_msg = model.invoke(messages)
ai_msg.response_metadata
```

```python
{
    "model_name": "gemini-3.1-pro-preview",
    "model_provider": "google_genai",
    "prompt_feedback": {"block_reason": 0, "safety_ratings": []},
    "finish_reason": "STOP",
    "safety_ratings": [
        {
            "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
            "probability": "NEGLIGIBLE",
            "blocked": False,
        },
        {
            "category": "HARM_CATEGORY_HATE_SPEECH",
            "probability": "NEGLIGIBLE",
            "blocked": False,
        },
        {
            "category": "HARM_CATEGORY_HARASSMENT",
            "probability": "NEGLIGIBLE",
            "blocked": False,
        },
        {
            "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
            "probability": "NEGLIGIBLE",
            "blocked": False,
        },
    ],
}
```