```python
ChatGoogleGenerativeAI(self, **kwargs: Any = {})
```

Bases: `_BaseGoogleGenerativeAI`, `BaseChatModel`

Enforce a schema on the output.
The format of the dictionary should follow JSON Schema specification.
The Google GenAI SDK automatically transforms schemas for Gemini compatibility:
- `$defs` definitions (enables Union types with `anyOf`)
- `$ref` pointers for nested/recursive schemas
- `minimum`/`maximum`, `minItems`/`maxItems`

Union types in Pydantic models (e.g., `field: Union[TypeA, TypeB]`) are automatically converted to `anyOf` schemas and work correctly with the `json_schema` method.
Refer to the Gemini API docs for more details on supported JSON Schema features.
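To illustrate the `anyOf` conversion described above, here is a minimal sketch (the `Circle`/`Square`/`Shape` models are hypothetical, chosen only for this example) showing that a Pydantic `Union` field already emits the `$defs`/`anyOf` JSON Schema shapes the SDK forwards to Gemini:

```python
from typing import Union

from pydantic import BaseModel


class Circle(BaseModel):
    radius: float


class Square(BaseModel):
    side: float


class Shape(BaseModel):
    # A Union field is emitted as `anyOf` in JSON Schema
    shape: Union[Circle, Square]


schema = Shape.model_json_schema()
print("$defs" in schema)  # nested models land in $defs
print("anyOf" in schema["properties"]["shape"])  # Union -> anyOf
```

Passing `Shape` to `with_structured_output` therefore works without manual schema rewriting.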
Google GenAI chat model integration.
Setup:
Added in langchain-google-genai 4.0.0.
ChatGoogleGenerativeAI now supports both the Gemini Developer API and Vertex AI Platform as backend options.
For Gemini Developer API (simplest):
- `GOOGLE_API_KEY` environment variable (recommended), or
- `api_key` parameter

```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview", api_key="...")
```

For Vertex AI Platform with API key:
```bash
export GEMINI_API_KEY='your-api-key'
export GOOGLE_GENAI_USE_VERTEXAI=true
export GOOGLE_CLOUD_PROJECT='your-project-id'
```

```python
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")

# Or explicitly:
model = ChatGoogleGenerativeAI(
    model="gemini-3.1-pro-preview",
    api_key="...",
    project="your-project-id",
    vertexai=True,
)
```

For Vertex AI with credentials:
```python
model = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    project="your-project-id",
    # Uses Application Default Credentials (ADC)
)
```

Automatic backend detection (when `vertexai=None` / unspecified):
- If the `GOOGLE_GENAI_USE_VERTEXAI` env var is set, uses that value
- If the `credentials` parameter is provided, uses Vertex AI
- If the `project` parameter is provided, uses Vertex AI

Environment variables:
| Variable | Purpose | Backend |
|---|---|---|
| `GOOGLE_API_KEY` | API key (primary) | Both (see `GOOGLE_GENAI_USE_VERTEXAI`) |
| `GEMINI_API_KEY` | API key (fallback) | Both (see `GOOGLE_GENAI_USE_VERTEXAI`) |
| `GOOGLE_GENAI_USE_VERTEXAI` | Force Vertex AI backend (`true`/`false`) | Vertex AI |
| `GOOGLE_CLOUD_PROJECT` | GCP project ID | Vertex AI |
| `GOOGLE_CLOUD_LOCATION` | GCP region (default: `global`) | Vertex AI |
| `HTTPS_PROXY` | HTTP/HTTPS proxy URL | Both |
| `SSL_CERT_FILE` | Custom SSL certificate file | Both |
GOOGLE_API_KEY is checked first for backwards compatibility. (GEMINI_API_KEY was introduced later to better reflect the API's branding.)
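A quick sketch of that lookup order, using plain `os.environ` (this illustrates the documented precedence, not the library's actual internals):

```python
import os

# Simulate both keys being set; GOOGLE_API_KEY is checked first.
os.environ["GEMINI_API_KEY"] = "fallback-key"
os.environ["GOOGLE_API_KEY"] = "primary-key"

api_key = os.environ.get("GOOGLE_API_KEY") or os.environ.get("GEMINI_API_KEY")
print(api_key)  # primary-key
```

If only `GEMINI_API_KEY` is set, the fallback is used instead.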
Proxy configuration:
Set these before initializing:
```bash
export HTTPS_PROXY='http://username:password@proxy_uri:port'
export SSL_CERT_FILE='path/to/cert.pem'  # Optional: custom SSL certificate
```

For SOCKS5 proxies or advanced proxy configuration, use the `client_args` parameter:
```python
model = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    client_args={"proxy": "socks5://user:pass@host:port"},
)
```

```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
model.invoke("Write me a ballad about LangChain")
```

```python
messages = [
    ("system", "Translate the user sentence to French."),
    ("human", "I love programming."),
]
model.invoke(messages)
```

```python
AIMessage(
    content=[
        {
            "type": "text",
            "text": "**J'adore la programmation.**\n\nYou can also say:...",
            "extras": {"signature": "Eq0W..."},
        }
    ],
    additional_kwargs={},
    response_metadata={
        "prompt_feedback": {"block_reason": 0, "safety_ratings": []},
        "finish_reason": "STOP",
        "model_name": "gemini-3.1-pro-preview",
        "safety_ratings": [],
        "model_provider": "google_genai",
    },
    id="lc_run--63a04ced-6b63-4cf6-86a1-c32fa565938e-0",
    usage_metadata={
        "input_tokens": 12,
        "output_tokens": 826,
        "total_tokens": 838,
        "input_token_details": {"cache_read": 0},
        "output_token_details": {"reasoning": 777},
    },
)
```

**Content format:** The shape of `content` may differ based on the model chosen. See the docs for more info.
```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-2.5-flash")
for chunk in model.stream(messages):
    print(chunk)
```

```python
AIMessageChunk(
    content="J",
    response_metadata={"finish_reason": "STOP", "safety_ratings": []},
    id="run-e905f4f4-58cb-4a10-a960-448a2bb649e3",
    usage_metadata={
        "input_tokens": 18,
        "output_tokens": 1,
        "total_tokens": 19,
    },
)
AIMessageChunk(
    content="'adore programmer. \\n",
    response_metadata={
        "finish_reason": "STOP",
        "safety_ratings": [
            {
                "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
                "probability": "NEGLIGIBLE",
                "blocked": False,
            },
            {
                "category": "HARM_CATEGORY_HATE_SPEECH",
                "probability": "NEGLIGIBLE",
                "blocked": False,
            },
            {
                "category": "HARM_CATEGORY_HARASSMENT",
                "probability": "NEGLIGIBLE",
                "blocked": False,
            },
            {
                "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
                "probability": "NEGLIGIBLE",
                "blocked": False,
            },
        ],
    },
    id="run-e905f4f4-58cb-4a10-a960-448a2bb649e3",
    usage_metadata={
        "input_tokens": 18,
        "output_tokens": 5,
        "total_tokens": 23,
    },
)
```

To assemble a full `AIMessage` from a stream of chunks:
```python
stream = model.stream(messages)
full = next(stream)
for chunk in stream:
    full += chunk
full
```

```python
AIMessageChunk(
    content="J'adore programmer. \\n",
    response_metadata={
        "finish_reason": "STOPSTOP",
        "safety_ratings": [
            {
                "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
                "probability": "NEGLIGIBLE",
                "blocked": False,
            },
            {
                "category": "HARM_CATEGORY_HATE_SPEECH",
                "probability": "NEGLIGIBLE",
                "blocked": False,
            },
            {
                "category": "HARM_CATEGORY_HARASSMENT",
                "probability": "NEGLIGIBLE",
                "blocked": False,
            },
            {
                "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
                "probability": "NEGLIGIBLE",
                "blocked": False,
            },
        ],
    },
    id="run-3ce13a42-cd30-4ad7-a684-f1f0b37cdeec",
    usage_metadata={
        "input_tokens": 36,
        "output_tokens": 6,
        "total_tokens": 42,
    },
)
```

**Content format:** The shape of `content` may differ based on the model chosen. See the docs for more info.
```python
await model.ainvoke(messages)

# stream:
# async for chunk in model.astream(messages): ...

# batch:
# await model.abatch([messages])
```

See the docs for more info.
```python
from pydantic import BaseModel, Field


class GetWeather(BaseModel):
    '''Get the current weather in a given location'''

    location: str = Field(
        ..., description="The city and state, e.g. San Francisco, CA"
    )


class GetPopulation(BaseModel):
    '''Get the current population in a given location'''

    location: str = Field(
        ..., description="The city and state, e.g. San Francisco, CA"
    )


model_with_tools = model.bind_tools([GetWeather, GetPopulation])
ai_msg = model_with_tools.invoke(
    "Which city is hotter today and which is bigger: LA or NY?"
)
ai_msg.tool_calls
```

```python
[
    {
        "name": "GetWeather",
        "args": {"location": "Los Angeles, CA"},
        "id": "c186c99f-f137-4d52-947f-9e3deabba6f6",
    },
    {
        "name": "GetWeather",
        "args": {"location": "New York City, NY"},
        "id": "cebd4a5d-e800-4fa5-babd-4aa286af4f31",
    },
    {
        "name": "GetPopulation",
        "args": {"location": "Los Angeles, CA"},
        "id": "4f92d897-f5e4-4d34-a3bc-93062c92591e",
    },
    {
        "name": "GetPopulation",
        "args": {"location": "New York City, NY"},
        "id": "634582de-5186-4e4b-968b-f192f0a93678",
    },
]
```

See the docs for more info.
```python
from typing import Optional

from pydantic import BaseModel, Field


class Joke(BaseModel):
    '''Joke to tell user.'''

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(
        description="How funny the joke is, from 1 to 10"
    )


# Default method uses json_schema for reliable structured output
structured_model = model.with_structured_output(Joke)
structured_model.invoke("Tell me a joke about cats")

# Alternative: use function_calling method (less reliable)
structured_model_fc = model.with_structured_output(
    Joke, method="function_calling"
)
```

```python
Joke(
    setup="Why are cats so good at video games?",
    punchline="They have nine lives on the internet",
    rating=None,
)
```

Two methods are supported for structured output:
- `method='json_schema'` (default): Uses Gemini's native structured output API. The Google GenAI SDK automatically transforms schemas to ensure compatibility with Gemini, including:
    - `$defs` definitions (Union types work correctly)
    - `$ref` references for nested schemas

  Uses Gemini's `response_json_schema` API param. Refer to the Gemini API docs for more details. This method is recommended for better reliability, as it constrains the model's generation process directly.

- `method='function_calling'`: Uses tool calling to extract structured data. Less reliable than `json_schema`, but compatible with all models.
See the docs for more info.
```python
import base64

import httpx
from langchain.messages import HumanMessage

image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8")
message = HumanMessage(
    content=[
        {"type": "text", "text": "describe the weather in this image"},
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{image_data}"},
        },
    ]
)
ai_msg = model.invoke([message])
ai_msg.content
```

```
The weather in this image appears to be sunny and pleasant. The sky is a bright blue with scattered white clouds, suggesting fair weather. The lush green grass and trees indicate a warm and possibly slightly breezy day. There are no...
```

See the docs for more info.
```python
import base64

from langchain.messages import HumanMessage

with open("/path/to/your/test.pdf", "rb") as f:
    pdf_bytes = f.read()
pdf_base64 = base64.b64encode(pdf_bytes).decode("utf-8")
message = HumanMessage(
    content=[
        {"type": "text", "text": "describe the document in a sentence"},
        {
            "type": "file",
            "source_type": "base64",
            "mime_type": "application/pdf",
            "data": pdf_base64,
        },
    ]
)
ai_msg = model.invoke([message])
```

See the docs for more info.
```python
import base64

from langchain.messages import HumanMessage

with open("/path/to/your/audio.mp3", "rb") as f:
    audio_bytes = f.read()
audio_base64 = base64.b64encode(audio_bytes).decode("utf-8")
message = HumanMessage(
    content=[
        {"type": "text", "text": "summarize this audio in a sentence"},
        {
            "type": "file",
            "source_type": "base64",
            "mime_type": "audio/mp3",
            "data": audio_base64,
        },
    ]
)
ai_msg = model.invoke([message])
```

See the docs for more info.
```python
import base64

from langchain.messages import HumanMessage

with open("/path/to/your/video.mp4", "rb") as f:
    video_bytes = f.read()
video_base64 = base64.b64encode(video_bytes).decode("utf-8")
message = HumanMessage(
    content=[
        {
            "type": "text",
            "text": "describe what's in this video in a sentence",
        },
        {
            "type": "file",
            "source_type": "base64",
            "mime_type": "video/mp4",
            "data": video_base64,
        },
    ]
)
ai_msg = model.invoke([message])
```

You can also pass YouTube URLs directly:
```python
from langchain_core.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
message = HumanMessage(
    content=[
        {"type": "text", "text": "Summarize the video in 3 sentences."},
        {
            "type": "media",
            "file_uri": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
            "mime_type": "video/mp4",
        },
    ]
)
response = model.invoke([message])
print(response.text)
```

See the docs for more info.
Audio generation models (TTS) are currently in preview on Vertex AI and may require allowlist access. If you receive an INVALID_ARGUMENT error when using TTS models with vertexai=True, your project may need to be allowlisted.
See this post on the Google AI forum for more details.
You can also upload files to Google's servers and reference them by URI.
This works for PDFs, images, videos, and audio files.
```python
import time

from google import genai
from langchain.messages import HumanMessage

client = genai.Client()

myfile = client.files.upload(file="/path/to/your/sample.pdf")
while myfile.state.name == "PROCESSING":
    time.sleep(2)
    myfile = client.files.get(name=myfile.name)

message = HumanMessage(
    content=[
        {"type": "text", "text": "What is in the document?"},
        {
            "type": "media",
            "file_uri": myfile.uri,
            "mime_type": "application/pdf",
        },
    ]
)
ai_msg = model.invoke([message])
```

See the docs for more info.
Gemini 3+ models use `thinking_level` (`'low'`, `'medium'`, or `'high'`) to control reasoning depth. If not specified, defaults to `'high'`.
```python
model = ChatGoogleGenerativeAI(
    model="gemini-3.1-pro-preview",
    thinking_level="low",  # For faster, lower-latency responses
)
```

Gemini 2.5 models use `thinking_budget` (an integer token count) to control reasoning. Set it to `0` to disable thinking (where supported), or `-1` for dynamic thinking.
See the Gemini API docs for more details on thinking models.
To see a thinking model's thoughts, set `include_thoughts=True` to include the model's reasoning summaries in the response.
```python
model = ChatGoogleGenerativeAI(
    model="gemini-3.1-pro-preview",
    include_thoughts=True,
)
ai_msg = model.invoke("How many 'r's are in the word 'strawberry'?")
```

Gemini 3+ models return thought signatures: encrypted representations of the model's internal reasoning.
For multi-turn conversations involving tool calls, you must pass the full AIMessage back to the model so that these signatures are preserved. This happens automatically when you append the AIMessage to your message list.
See the LangChain docs for more info as well as a code example.
See the Gemini API docs for more details on thought signatures.
```python
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
response = model.invoke(
    "When is the next total solar eclipse in US?",
    tools=[{"google_search": {}}],
)
response.content_blocks
```

Alternatively, you can bind the tool to the model for easier reuse across calls:
```python
model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
model_with_search = model.bind_tools([{"google_search": {}}])
response = model_with_search.invoke(
    "When is the next total solar eclipse in US?"
)
response.content_blocks
```

See the docs for more info.
```python
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-3.1-pro-preview")
model_with_code_interpreter = model.bind_tools([{"code_execution": {}}])
response = model_with_code_interpreter.invoke("Use Python to calculate 3^3.")
response.content_blocks
```

```python
[
    {
        "type": "server_tool_call",
        "name": "code_interpreter",
        "args": {"code": "print(3**3)", "language": <Language.PYTHON: 1>},
        "id": "...",
    },
    {
        "type": "server_tool_result",
        "tool_call_id": "",
        "status": "success",
        "output": "27\n",
        "extras": {"block_type": "code_execution_result", "outcome": 1},
    },
    {"type": "text", "text": "The calculation of 3 to the power of 3 is 27."},
]
```

See the docs for more info.
The Computer Use model is in preview and may produce unexpected behavior.
Always supervise automated tasks and avoid use with sensitive data or critical operations. See the Gemini API docs for safety best practices.
```python
ai_msg = model.invoke(messages)
ai_msg.usage_metadata
```

```python
{"input_tokens": 18, "output_tokens": 5, "total_tokens": 23}
```

Gemini models have default safety settings that can be overridden. If you are receiving lots of "Safety Warnings" from your models, you can try tweaking the `safety_settings` attribute of the model. For example, to turn off safety blocking for dangerous content, you can construct your LLM as follows:
```python
from langchain_google_genai import (
    ChatGoogleGenerativeAI,
    HarmBlockThreshold,
    HarmCategory,
)

llm = ChatGoogleGenerativeAI(
    model="gemini-3.1-pro-preview",
    safety_settings={
        HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
    },
)
```

For an enumeration of the categories and thresholds available, see Google's safety setting types.
See the docs for more info.
Context caching allows you to store and reuse content (e.g., PDFs, images) for faster processing. The cached_content parameter accepts a cache name created via the Google Generative AI API.
See the Gemini docs for more details on cached content.
Below are two examples: caching a single file directly, and caching multiple files using `Part`.
This caches a single file and queries it.
```python
import time

from google import genai
from google.genai import types
from langchain.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

client = genai.Client()

# Upload file
file = client.files.upload(file="path/to/your/file")
while file.state.name == "PROCESSING":
    time.sleep(2)
    file = client.files.get(name=file.name)

# Create cache
model = "gemini-3.1-pro-preview"
cache = client.caches.create(
    model=model,
    config=types.CreateCachedContentConfig(
        display_name="Cached Content",
        system_instruction=(
            "You are an expert content analyzer, and your job is to answer "
            "the user's query based on the file you have access to."
        ),
        contents=[file],
        ttl="300s",
    ),
)

# Query with LangChain
llm = ChatGoogleGenerativeAI(
    model=model,
    cached_content=cache.name,
)
message = HumanMessage(content="Summarize the main points of the content.")
llm.invoke([message])
```

This caches two files using `Part` and queries them together.
```python
import time

from google import genai
from google.genai.types import Content, CreateCachedContentConfig, Part
from langchain.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

client = genai.Client()

# Upload files
file_1 = client.files.upload(file="./file1")
while file_1.state.name == "PROCESSING":
    time.sleep(2)
    file_1 = client.files.get(name=file_1.name)

file_2 = client.files.upload(file="./file2")
while file_2.state.name == "PROCESSING":
    time.sleep(2)
    file_2 = client.files.get(name=file_2.name)

# Create cache with multiple files
contents = [
    Content(
        role="user",
        parts=[
            Part.from_uri(file_uri=file_1.uri, mime_type=file_1.mime_type),
            Part.from_uri(file_uri=file_2.uri, mime_type=file_2.mime_type),
        ],
    )
]
model = "gemini-3.1-pro-preview"
cache = client.caches.create(
    model=model,
    config=CreateCachedContentConfig(
        display_name="Cached Contents",
        system_instruction=(
            "You are an expert content analyzer, and your job is to answer "
            "the user's query based on the files you have access to."
        ),
        contents=contents,
        ttl="300s",
    ),
)

# Query with LangChain
llm = ChatGoogleGenerativeAI(
    model=model,
    cached_content=cache.name,
)
message = HumanMessage(
    content="Provide a summary of the key information across both files."
)
llm.invoke([message])
```

```python
ai_msg = model.invoke(messages)
ai_msg.response_metadata
```

```python
{
    "model_name": "gemini-3.1-pro-preview",
    "model_provider": "google_genai",
    "prompt_feedback": {"block_reason": 0, "safety_ratings": []},
    "finish_reason": "STOP",
    "safety_ratings": [
        {
            "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
            "probability": "NEGLIGIBLE",
            "blocked": False,
        },
        {
            "category": "HARM_CATEGORY_HATE_SPEECH",
            "probability": "NEGLIGIBLE",
            "blocked": False,
        },
        {
            "category": "HARM_CATEGORY_HARASSMENT",
            "probability": "NEGLIGIBLE",
            "blocked": False,
        },
        {
            "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
            "probability": "NEGLIGIBLE",
            "blocked": False,
        },
    ],
}
```