Google models

Featured Gemini models

Generally available Gemini models

Gemini 2.5 Pro: A Gemini model designed for complex reasoning.
Gemini 2.5 Flash: A Gemini model offering a balance of price and performance.
Gemini 2.5 Flash Image: A Gemini model for rapid creative workflows with image generation and conversational editing.
Gemini 2.5 Flash-Lite: A cost-effective Gemini model that supports high-throughput tasks.
Gemini 2.0 Flash: A Gemini 2.0 model offering well-rounded capabilities with a focus on price-performance.
Gemini 2.0 Flash-Lite: A Gemini 2.0 Flash model optimized for cost efficiency and low latency.

Preview Gemini models

Gemini 3 Pro: A Gemini model capable of advanced reasoning and solving complex problems.
Gemini 2.5 Flash Live API: A Gemini model enhanced for real-time, conversational experiences with streaming capabilities.

Gemma models

Gemma 3n: An open model designed for efficient execution on low-resource devices, supporting multimodal input (text, image, video, and audio) and text output in over 140 languages.
Gemma 3: An open model featuring text and image input, support for over 140 languages, and a 128K context window.
Gemma 2: An open model supporting text generation, summarization, and extraction.
Gemma: A small, lightweight open model supporting text generation, summarization, and extraction.
ShieldGemma 2: Instruction-tuned models for evaluating text and image safety against defined policies.
PaliGemma: An open vision-language model combining SigLIP and Gemma.
CodeGemma: A powerful, lightweight open model for coding tasks, including code completion, generation, and understanding.
TxGemma: A model that generates predictions, classifications, or text based on therapeutic-related data, for building AI models with less data and compute.
MedGemma: A collection of Gemma 3 variants trained for performance on medical text and image comprehension.
MedSigLIP: A SigLIP variant trained to encode medical images and text into a common embedding space.
T5Gemma: A family of lightweight encoder-decoder research models.

Embeddings models

Embeddings for Text: Converts text data into vector representations for semantic search, classification, and clustering.
Multimodal Embeddings: Generates vectors based on images, for tasks such as image classification and search.
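Semantic search over embedding vectors usually comes down to comparing a query vector with document vectors. The following is a minimal sketch of that comparison using cosine similarity in plain Python. The function names and the 3-dimensional toy vectors are hypothetical stand-ins; real embedding vectors come from an embeddings model and have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (hypothetical values).
query = [1.0, 0.0, 1.0]
docs = {
    "doc_a": [0.9, 0.1, 0.8],
    "doc_b": [0.0, 1.0, 0.0],
}
ranking = rank_by_similarity(query, docs)
```

Here `ranking` places `doc_a` first because its vector points in nearly the same direction as the query, while `doc_b` is orthogonal to it.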

Imagen models

Imagen 4 for Generation: Use text prompts to generate novel images with higher quality than our previous image generation models.
Imagen 4 for Fast Generation: Use text prompts to generate novel images with higher quality and lower latency than our previous image generation models.
Imagen 4 for Ultra Generation: Use text prompts to generate novel images with higher quality and better prompt adherence than our previous image generation models.
Imagen 3 for Generation 002: Use text prompts to generate novel images.
Imagen 3 for Generation 001: Use text prompts to generate novel images.
Imagen 3 for Fast Generation: Use text prompts to generate novel images with lower latency than our other image generation models.
Imagen 3 for Editing and Customization: Edits existing images or generates new images based on text prompts and provided context.

Preview Imagen models

Virtual Try-On: Generates images of people wearing clothing products.
Imagen product recontext on Vertex AI: Edits product images to place them in different scenes or backgrounds based on text prompts.

Veo models

Veo 2 Generate: Generates videos from text prompts and images.
Veo 3 Generate: Generates videos from text prompts and images with high quality.
Veo 3 Fast: Generates videos from text prompts and images with high quality and low latency.
Veo 3.1 Generate: Generates videos from text prompts and images with high quality.
Veo 3.1 Fast: Generates videos from text prompts and images with high quality and low latency.

Preview Veo models

Veo 3 Generate preview: Generates videos from text prompts and images with high quality.
Veo 3 Fast preview: Generates videos from text prompts and images with high quality and low latency.
Veo 3.1 Generate preview: Generates videos from text prompts and images with high quality.
Veo 3.1 Fast preview: Generates videos from text prompts and images with high quality and low latency.
Veo 2 Preview: Generates videos from text prompts and images, supporting inpaint and outpaint.

Experimental Veo models

Veo 2 Experimental: An experimental model with features under test.

MedLM models

MedLM-medium: A HIPAA-compliant model for medical question answering and summarization of healthcare documents.
MedLM-large: A HIPAA-compliant model for medical question answering and summarization of healthcare documents.

Language support

Gemini

All the Gemini models can understand and respond in the following languages:

Afrikaans (af), Albanian (sq), Amharic (am), Arabic (ar), Armenian (hy), Assamese (as), Azerbaijani (az), Basque (eu), Belarusian (be), Bengali (bn), Bosnian (bs), Bulgarian (bg), Catalan (ca), Cebuano (ceb), Chinese (Simplified and Traditional) (zh), Corsican (co), Croatian (hr), Czech (cs), Danish (da), Dhivehi (dv), Dutch (nl), English (en), Esperanto (eo), Estonian (et), Filipino (Tagalog) (fil), Finnish (fi), French (fr), Frisian (fy), Galician (gl), Georgian (ka), German (de), Greek (el), Gujarati (gu), Haitian Creole (ht), Hausa (ha), Hawaiian (haw), Hebrew (iw), Hindi (hi), Hmong (hmn), Hungarian (hu), Icelandic (is), Igbo (ig), Indonesian (id), Irish (ga), Italian (it), Japanese (ja), Javanese (jv), Kannada (kn), Kazakh (kk), Khmer (km), Korean (ko), Krio (kri), Kurdish (ku), Kyrgyz (ky), Lao (lo), Latin (la), Latvian (lv), Lithuanian (lt), Luxembourgish (lb), Macedonian (mk), Malagasy (mg), Malay (ms), Malayalam (ml), Maltese (mt), Maori (mi), Marathi (mr), Meiteilon (Manipuri) (mni-Mtei), Mongolian (mn), Myanmar (Burmese) (my), Nepali (ne), Norwegian (no), Nyanja (Chichewa) (ny), Odia (Oriya) (or), Pashto (ps), Persian (fa), Polish (pl), Portuguese (pt), Punjabi (pa), Romanian (ro), Russian (ru), Samoan (sm), Scots Gaelic (gd), Serbian (sr), Sesotho (st), Shona (sn), Sindhi (sd), Sinhala (Sinhalese) (si), Slovak (sk), Slovenian (sl), Somali (so), Spanish (es), Sundanese (su), Swahili (sw), Swedish (sv), Tajik (tg), Tamil (ta), Telugu (te), Thai (th), Turkish (tr), Ukrainian (uk), Urdu (ur), Uyghur (ug), Uzbek (uz), Vietnamese (vi), Welsh (cy), Xhosa (xh), Yiddish (yi), Yoruba (yo), and Zulu (zu).

Gemma

Gemma and Gemma 2 support only the English (en) language. Gemma 3 and Gemma 3n provide multilingual support in over 140 languages.

Embeddings

Multilingual text embedding models support the following languages:

Afrikaans (af), Albanian (sq), Amharic (am), Arabic (ar), Armenian (hy), Azerbaijani (az), Basque (eu), Belarusian (be), Bengali (bn), Bulgarian (bg), Catalan (ca), Cebuano (ceb), Chinese (Simplified and Traditional) (zh), Corsican (co), Czech (cs), Danish (da), Dutch (nl), English (en), Esperanto (eo), Estonian (et), Filipino (Tagalog) (fil), Finnish (fi), French (fr), Frisian (fy), Galician (gl), Georgian (ka), German (de), Greek (el), Gujarati (gu), Haitian Creole (ht), Hausa (ha), Hawaiian (haw), Hebrew (iw), Hindi (hi), Hmong (hmn), Hungarian (hu), Icelandic (is), Igbo (ig), Indonesian (id), Irish (ga), Italian (it), Japanese (ja), Javanese (jv), Kannada (kn), Kazakh (kk), Khmer (km), Korean (ko), Kurdish (ku), Kyrgyz (ky), Lao (lo), Latin (la), Latvian (lv), Lithuanian (lt), Luxembourgish (lb), Macedonian (mk), Malagasy (mg), Malay (ms), Malayalam (ml), Maltese (mt), Maori (mi), Marathi (mr), Mongolian (mn), Myanmar (Burmese) (my), Nepali (ne), Nyanja (Chichewa) (ny), Norwegian (no), Pashto (ps), Persian (fa), Polish (pl), Portuguese (pt), Punjabi (pa), Romanian (ro), Russian (ru), Samoan (sm), Scots Gaelic (gd), Serbian (sr), Sesotho (st), Shona (sn), Sindhi (sd), Sinhala (Sinhalese) (si), Slovak (sk), Slovenian (sl), Somali (so), Spanish (es), Sundanese (su), Swahili (sw), Swedish (sv), Tajik (tg), Tamil (ta), Telugu (te), Thai (th), Turkish (tr), Ukrainian (uk), Urdu (ur), Uzbek (uz), Vietnamese (vi), Welsh (cy), Xhosa (xh), Yiddish (yi), Yoruba (yo), and Zulu (zu).

Imagen 3

Imagen 3 supports the following languages:

English (en), Chinese (Simplified and Traditional) (zh), Hindi (hi), Japanese (ja), Korean (ko), Portuguese (pt), and Spanish (es).
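An application that accepts prompts in many languages might check a prompt's language code against this list before sending it to Imagen 3. The following is a small sketch of such a check. The function names are hypothetical, and treating a regional tag such as `pt-BR` as covered by its base code `pt` is an assumption, since the list above gives only base codes.

```python
# Language codes listed for Imagen 3 in this section.
IMAGEN_3_LANGUAGES = frozenset({"en", "zh", "hi", "ja", "ko", "pt", "es"})


def primary_subtag(language_tag):
    """Return the lowercase primary subtag of a tag such as 'pt-BR' or 'zh_TW'."""
    return language_tag.strip().replace("_", "-").split("-")[0].lower()


def is_imagen3_language(language_tag):
    """Check a language tag against the Imagen 3 list.

    Assumption: regional variants are matched by their base code only.
    """
    return primary_subtag(language_tag) in IMAGEN_3_LANGUAGES
```

For example, `is_imagen3_language("pt-BR")` evaluates to `True` under that assumption, and `is_imagen3_language("fr")` evaluates to `False` because French is not in the list.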

MedLM

The MedLM models support the English (en) language.

Explore all models in Model Garden

Model Garden is a platform that helps you discover, test, customize, and deploy Google proprietary and select OSS models and assets. To explore the generative AI models and APIs that are available on Vertex AI, go to Model Garden in the Google Cloud console.

To learn more about Model Garden, including available models and capabilities, see Explore AI models in Model Garden.

Model versions

To see all model versions, including legacy and retired models, see Model versions and lifecycle.

What's next