Gemini

Google Gemini is a family of multimodal AI models from Google that supports text, image, video, and audio generation as well as embeddings. With the Gemini integration in SquadOS, your agents can generate content with Gemini Flash and Pro models, create images, generate videos via Veo, count tokens, and compute embeddings for semantic search, all directly within your workflows.

This tool requires no authentication (NO_AUTH).

| Field | Required | Description |
| --- | --- | --- |
| N/A | N/A | No credentials required. |

  1. Go to Tools in the side menu (/admin/tools).
  2. Open the Available tab and search for Gemini.
  3. Click the card to open the details and click Activate. The tool goes directly to the active list with no login step. (See Organization Tools for connection-flow details.)

GEMINI_COUNT_TOKENS

Counts the number of tokens in text using Gemini tokenization. Useful for estimating costs, checking input limits, and optimizing prompts before making API calls.

Input parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | Text to count tokens for. |
| model | string | No | Model to use for token counting. Must be a model that supports the countTokens method. Use the ListModels action to see available models and their supported methods. |

Output fields:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |
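
The three output fields (data, error, successful) recur for every action on this page. The following is a minimal sketch of unwrapping that envelope, assuming the result reaches your code as a Python dict with exactly those keys. The page types data as a string, so the sketch also assumes it may carry JSON and falls back to the raw string when it does not. `unwrap_result` and `ActionError` are hypothetical helper names, not part of SquadOS.

```python
import json
from typing import Any


class ActionError(RuntimeError):
    """Raised when an action result reports successful=False (hypothetical helper)."""


def unwrap_result(result: dict[str, Any]) -> Any:
    """Return the payload of an action result, or raise if the action failed.

    Assumes `result` is a dict with the documented fields
    `data`, `error`, and `successful`.
    """
    if not result.get("successful"):
        raise ActionError(result.get("error") or "action failed without an error message")
    data = result.get("data")
    if isinstance(data, str):
        # Assumption: a string payload may be JSON; otherwise keep it as-is.
        try:
            return json.loads(data)
        except json.JSONDecodeError:
            return data
    return data
```

Checking successful first keeps a failed call from being mistaken for an empty payload.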

GEMINI_EMBED_CONTENT

Generates text embeddings using Gemini embedding models. Converts text into numerical vectors for semantic search, similarity comparison, clustering, and classification tasks.

Input parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | The text content to generate embeddings for. |
| model | string | No | Embedding model to use. Options: text-embedding-004 (768 dimensions, default), gemini-embedding-001 (3072 dimensions, latest). |
| title | string | No | Optional title for the content. Use with task_type='RETRIEVAL_DOCUMENT' to improve embedding quality for document search. |
| task_type | string | No | Specifies the intended use case to optimize the embedding. Options: RETRIEVAL_QUERY (search queries), RETRIEVAL_DOCUMENT (documents to be searched), SEMANTIC_SIMILARITY (text similarity), CLASSIFICATION (categorization), CLUSTERING (grouping), QUESTION_ANSWERING (question-document matching). Note: Some task types like CODE_RETRIEVAL_QUERY may only be supported by certain models. |
| output_dimensionality | integer | No | Truncate the embedding to this number of dimensions. Only supported by the gemini-embedding-001 model. Recommended values: 768, 1536, or 3072. Lower dimensions reduce storage but may affect quality. |

Output fields:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |
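
For the semantic search and similarity use cases mentioned above, embedding vectors are typically compared with cosine similarity. The sketch below is plain Python and assumes you have already extracted the vectors as lists of floats; how they are laid out inside data is not specified here. `cosine_similarity` and `rank_documents` are illustrative names.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        # E.g. a 768-dimension vector cannot be compared with a 3072-dimension one.
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_documents(query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]) -> list[int]:
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, doc) for doc in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
```

The length check matters in practice: vectors produced with different models or different output_dimensionality values should not be mixed in one index.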

GEMINI_GENERATE_CONTENT

Generates text content or speech audio from prompts using Gemini models. Supports text generation models (Gemini Flash, Pro) and text-to-speech models with configurable parameters. Generated text is nested at results[i].response.data.text. Output may be wrapped in markdown fences (e.g., ```html...```) or preceded by explanatory prose; strip these before file writing or rendering.

Input parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | No | Model to use for generation. Text generation models: gemini-2.5-flash (default, fast & efficient), gemini-2.5-pro (advanced reasoning), gemini-2.0-flash (previous generation), gemini-2.0-flash-lite (cost-optimized). Text-to-speech models: gemini-2.5-flash-preview-tts (low latency), gemini-2.5-pro-preview-tts (high quality). Note: TTS models require the voice_name parameter and return audio data instead of text. |
| top_k | integer | No | Top-k sampling parameter. |
| top_p | number | No | Nucleus sampling parameter (0.0 to 1.0). |
| prompt | string | Yes | The text prompt for content generation. Examples: "Write a short poem about the ocean" or "Explain quantum computing in simple terms". For TTS models, include style instructions in the prompt (e.g., "Say cheerfully: Hello!"). |
| voice_name | string | No | Prebuilt voice for text-to-speech generation. Required when a TTS model is selected. The complete list of 30 official Gemini TTS voices is at https://ai.google.dev/gemini-api/docs/speech-generation |
| temperature | number | No | Controls randomness (0.0 to 2.0). |
| stop_sequences | array | No | Sequences where generation should stop. |
| safety_settings | array | No | Safety filter settings. |
| max_output_tokens | integer | No | Maximum number of tokens to generate. If the response has finishReason='MAX_TOKENS', the output was truncated; narrow the prompt scope or increase this value and regenerate. |
| system_instruction | string | No | System instruction to guide the model’s behavior. |

Output fields:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |
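
The description above notes that generated text sits at results[i].response.data.text and may arrive wrapped in markdown fences or preceded by prose. A minimal sketch of that clean-up step follows, assuming the payload has already been decoded into a Python dict. `strip_markdown_fence` and `extract_texts` are hypothetical helper names, and the fence delimiter is built from single backticks only so that the sample itself stays fence-safe.

```python
import re

TICKS = "`" * 3  # a markdown code-fence delimiter

# An opening fence with an optional language tag, the body, then a closing fence.
_FENCED_BLOCK = re.compile(
    re.escape(TICKS) + r"[A-Za-z0-9_+-]*[ \t]*\r?\n(.*?)" + re.escape(TICKS),
    re.DOTALL,
)


def strip_markdown_fence(text: str) -> str:
    """Return the body of the first fenced block, or the trimmed text if there is none."""
    match = _FENCED_BLOCK.search(text)
    if match:
        return match.group(1).strip()
    return text.strip()


def extract_texts(payload: dict) -> list[str]:
    """Collect results[i].response.data.text from a decoded payload, with fences removed."""
    return [
        strip_markdown_fence(item["response"]["data"]["text"])
        for item in payload.get("results", [])
    ]
```

This keeps only the first fenced block and discards any surrounding prose; if a response can contain several blocks you care about, `_FENCED_BLOCK.findall` would be the natural variation.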

GEMINI_GENERATE_IMAGE

Generates images from text prompts using Gemini models (Nano Banana). Supported models: gemini-2.5-flash-image (GA stable, fast), gemini-3-pro-image-preview (Nano Banana Pro: advanced, with 4K resolution, thinking mode, and up to 14 reference images), and gemini-2.0-flash-exp-image-generation (2.0 Flash experimental). Returns one image per call; images are uploaded to S3.

Parse the response at data.image.s3url or the text-type entry in data.content; prefer the URL to avoid base64 blobs. Always validate s3url before treating a call as successful, because a 200 response may contain only text with no image. Store s3url immediately, as URLs can expire. Output formats are raster only (JPG/PNG/WebP); request PNG for transparency.

Concurrent usage may trigger HTTP 429/RESOURCE_EXHAUSTED. Keep concurrency <= 3 and use exponential backoff (1s→2s→4s, ~5 retries).

Input parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | No | Model to use for image generation. Options: gemini-2.5-flash-image (GA stable, fast), gemini-3-pro-image-preview (advanced with 4K, thinking mode), gemini-2.0-flash-exp-image-generation (2.0 Flash experimental). |
| top_k | integer | No | Top-k sampling parameter. |
| top_p | number | No | Nucleus sampling parameter (0.0 to 1.0). |
| prompt | string | Yes | Text prompt for image generation. Sensitive, trademarked, or explicit content triggers HTTP 400 (PROHIBITED_CONTENT or IMAGE_RECITATION) with no image returned; rephrase into neutral, policy-compliant language rather than retrying identical prompts. |
| timeout | number | No | Request timeout in seconds. Default is 300 seconds (5 minutes). Increase for complex prompts or high-resolution images. Minimum 120 seconds, maximum 600 seconds. |
| image_size | string | No | Output resolution (only for gemini-3-pro-image-preview). Options: 1K, 2K, 4K. |
| temperature | number | No | Controls randomness (0.0 to 2.0). |
| aspect_ratio | string | No | Aspect ratio for the generated image. Accepted values: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9. Not supported by gemini-2.0-flash-exp-image-generation. Unsupported strings will fail or silently default to 1:1. |
| safety_settings | array | No | Safety filter settings. List of objects specifying content categories to filter and threshold levels. Each setting requires category (HARM_CATEGORY_HARASSMENT, HARM_CATEGORY_HATE_SPEECH, HARM_CATEGORY_SEXUALLY_EXPLICIT, or HARM_CATEGORY_DANGEROUS_CONTENT) and threshold (BLOCK_NONE, BLOCK_LOW_AND_ABOVE, BLOCK_MEDIUM_AND_ABOVE, or BLOCK_ONLY_HIGH). |
| max_output_tokens | integer | No | Maximum number of tokens to generate (max 32,768). For image generation, images consume tokens based on resolution: 1K/2K consume 1,120 tokens, 4K consumes 2,000 tokens. If set too low, the API may return MAX_TOKENS finish reason with no image. If not specified, the API uses its default which is sufficient for image generation. |
| system_instruction | string | No | System instruction to guide image generation behavior. |

Output fields:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |
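
The retry guidance for this action (exponential backoff of 1s, 2s, 4s over about 5 retries, plus validating s3url) can be sketched as below. This is an illustration under assumptions: `call` stands in for however your code invokes GEMINI_GENERATE_IMAGE, `RateLimited` is a hypothetical exception your wrapper would raise on HTTP 429/RESOURCE_EXHAUSTED, and the response is assumed to be a decoded dict.

```python
import time
from typing import Any, Callable


class RateLimited(Exception):
    """Hypothetical signal that a call hit HTTP 429 / RESOURCE_EXHAUSTED."""


def call_with_backoff(
    call: Callable[[], dict[str, Any]],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Run `call`, retrying on RateLimited with delays of 1s, 2s, 4s, ..."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries:
                raise  # retries exhausted; surface the rate-limit error
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")


def image_url(response: dict[str, Any]) -> str:
    """Return data.image.s3url, or raise if the response carries no image."""
    data = response.get("data") or {}
    image = data.get("image") or {}
    url = image.get("s3url")
    if not isinstance(url, str) or not url.startswith(("http://", "https://")):
        raise ValueError("response contains no usable s3url; treat the call as failed")
    return url
```

The `sleep` parameter is injectable so the schedule can be tested without waiting. Prompt rejections (HTTP 400) are deliberately not retried here, since the guidance is to rephrase rather than resend the same prompt.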

GEMINI_GENERATE_VIDEOS

Generates videos from text prompts using Google’s Veo models. Input is text-only; image inputs are not accepted. Returns an operation_name for tracking; pass it verbatim (no edits) to GEMINI_WAIT_FOR_VIDEO or GEMINI_GET_VIDEOS_OPERATION.

Jobs take 30–180+ seconds. Wait 10s before the first poll, then poll every 10–30s (allow up to 12 min). Successful results include data.video_file.s3url; a missing s3url means failure. If done=true but there is no video_file, check raiMediaFilteredReasons (safety block), revise the prompt, and regenerate.

Run at most ~3–5 concurrent jobs; 429 RESOURCE_EXHAUSTED requires exponential backoff. For retries, always start a fresh call and never reuse a failed operation_name.

Input parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| seed | integer | No | Seed value for reproducibility. IMPORTANT: Only supported by Veo 3/3.1 models (VEO_3, VEO_3_FAST, VEO_3_1, VEO_3_1_FAST). VEO_2 does NOT support seed; using seed with VEO_2 will result in a validation error. |
| model | string | No | Veo model for video generation. Available enum values: VEO_3 (default, recommended), VEO_2, VEO_3_FAST, VEO_3_1 (newest), VEO_3_1_FAST (newest). Avoid preview model ID variants (e.g., *generate-preview*); they fail to produce downloadable URIs. Use only stable IDs: veo-2.0-generate-001 or veo-3.0-generate-001. |
| prompt | string | Yes | Text prompt for Veo video generation. Must be a non-empty string describing the video to generate. |
| resolution | string | No | Supported resolutions for video generation: 720p or 1080p. |
| aspect_ratio | string | No | Supported aspect ratios for video generation: 16:9 or 9:16. |
| negative_prompt | string | No | Text describing content to avoid in the generated video (e.g., cartoon, drawing, low quality). |
| duration_seconds | integer | No | Supported video durations in seconds. Model-specific restrictions apply: Veo 2 supports 5, 6, 7, or 8 seconds (4 seconds NOT supported); Veo 3/3.1 models support 4, 6, or 8 seconds (5 and 7 seconds NOT supported). |
| person_generation | string | No | Person generation safety settings for video generation. Model-specific restrictions apply: Veo 2 supports DONT_ALLOW, ALLOW_ADULT, and ALLOW_ALL; Veo 3/3.1 models only support ALLOW_ALL (requires allowlist access). |

Output fields:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |
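
Several parameters of this action carry model-specific restrictions (seed, duration_seconds, person_generation). A pre-flight check along the following lines can catch them before a job is submitted. It is a sketch that encodes only the rules stated in the parameter descriptions; `validate_video_request` is a hypothetical helper and the request is assumed to be a plain dict keyed by parameter name.

```python
VEO_3_FAMILY = {"VEO_3", "VEO_3_FAST", "VEO_3_1", "VEO_3_1_FAST"}
KNOWN_MODELS = VEO_3_FAMILY | {"VEO_2"}


def validate_video_request(params: dict) -> list[str]:
    """Return a list of problems found in a GEMINI_GENERATE_VIDEOS request (empty if none)."""
    problems: list[str] = []
    model = params.get("model", "VEO_3")  # VEO_3 is the documented default
    if model not in KNOWN_MODELS:
        return [f"unknown model enum value: {model!r}"]
    is_veo2 = model == "VEO_2"

    prompt = params.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt must be a non-empty string")

    if is_veo2 and "seed" in params:
        problems.append("seed is not supported by VEO_2")

    durations = {5, 6, 7, 8} if is_veo2 else {4, 6, 8}
    duration = params.get("duration_seconds")
    if duration is not None and duration not in durations:
        problems.append(f"duration_seconds must be one of {sorted(durations)} for {model}")

    person_options = {"DONT_ALLOW", "ALLOW_ADULT", "ALLOW_ALL"} if is_veo2 else {"ALLOW_ALL"}
    person = params.get("person_generation")
    if person is not None and person not in person_options:
        problems.append(f"person_generation must be one of {sorted(person_options)} for {model}")

    if params.get("resolution") not in (None, "720p", "1080p"):
        problems.append("resolution must be 720p or 1080p")
    if params.get("aspect_ratio") not in (None, "16:9", "9:16"):
        problems.append("aspect_ratio must be 16:9 or 9:16")
    return problems
```

Returning a list rather than raising on the first issue lets an agent fix every problem in one pass before it spends a generation call.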

GEMINI_GET_VIDEOS_OPERATION

DEPRECATED: Use WaitForVideo instead. Checks the status of a Veo video generation operation. Use the operation_name from GenerateVideos to track progress. If WaitForVideo times out, continue polling here with the same operation_name rather than starting a new GenerateVideos job.

Wait several seconds after starting GenerateVideos before the first call to avoid OPERATION_NOT_FOUND. Poll at 10–30s intervals, use exponential backoff on HTTP 429 RESOURCE_EXHAUSTED, and cap total polling at ~15 minutes. Keep concurrent polling to 3–5 parallel calls to avoid rate limits.

The operation is complete when done=true AND a valid video URI is present. done=true without video_file indicates that safety filtering blocked the output; inspect raiMediaFilteredReasons and rephrase the prompt. The video URL is at generatedSamples[].video.uri; persist it promptly, as URLs are time-limited.

Input parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| operation_name | string | Yes | The operation resource name from GEMINI_GENERATE_VIDEOS. Accepts either the full resource name models/{model}/operations/{operation_id} or just the operation ID. If only the operation ID is provided, it will be expanded to use the default model veo-3.0-generate-001. Pass exactly as returned; do not truncate or edit. Never reuse an operation_name from a failed job; start a new GenerateVideos call instead. |

Output fields:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |
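
For the fallback case where you poll this action yourself, the timing guidance above (an initial wait, 10–30s intervals, a cap of about 15 minutes) can be sketched as a loop. The assumptions are loud here: `get_operation` is a hypothetical callable that performs one status check and returns a decoded dict, and done, generatedSamples, and raiMediaFilteredReasons are assumed to sit at the top level of that dict.

```python
import time
from typing import Any, Callable, Optional


def first_video_uri(operation: dict[str, Any]) -> Optional[str]:
    """Return the first non-empty generatedSamples[].video.uri, if any."""
    for sample in operation.get("generatedSamples") or []:
        uri = (sample.get("video") or {}).get("uri")
        if uri:
            return uri
    return None


def poll_until_done(
    get_operation: Callable[[], dict[str, Any]],
    initial_wait: float = 10.0,
    interval: float = 15.0,
    max_total: float = 15 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until done=true, then return the video URI or raise.

    Elapsed time is estimated by summing the requested sleeps,
    so the latency of each status call is not counted.
    """
    sleep(initial_wait)
    waited = initial_wait
    while True:
        operation = get_operation()
        if operation.get("done"):
            uri = first_video_uri(operation)
            if uri is None:
                reasons = operation.get("raiMediaFilteredReasons")
                raise RuntimeError(f"operation finished without a video; filter reasons: {reasons}")
            return uri
        if waited + interval > max_total:
            raise TimeoutError(f"operation still running after about {waited:.0f}s")
        sleep(interval)
        waited += interval
```

A fixed interval keeps the sketch short; wrapping `get_operation` in a backoff helper for 429 responses would be a natural extension.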

GEMINI_LIST_MODELS

Lists available Gemini and Veo models with their capabilities and limits. Useful for discovering supported models and their features before making generation requests. Before calling video generation tools, verify model availability here — preview Veo models (e.g., veo-3.0-generate-preview) may be unavailable or return missing video URIs; prefer stable models like veo-2.0-generate-001.

Input parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| page_size | integer | No | Maximum number of models to return per page (default 50, max 1000). |
| page_token | string | No | Token from a previous response’s nextPageToken to retrieve the next page. |
| filter_prefix | string | No | Filter models by name prefix (client-side). Leave empty to get all models. |

Output fields:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |
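
Because results are paged through page_token and nextPageToken, listing every model takes a loop. The sketch below shows one way to write it, assuming a hypothetical `fetch_page` callable that runs GEMINI_LIST_MODELS for a given token and returns a decoded dict. The `models` list key and the per-model `name` key are assumptions about the payload shape, not something this page specifies.

```python
from typing import Any, Callable, Iterator, Optional


def iter_models(
    fetch_page: Callable[[Optional[str]], dict[str, Any]],
    prefix: str = "",
) -> Iterator[dict[str, Any]]:
    """Yield models from every page, keeping those whose name starts with `prefix`."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        for model in page.get("models", []):
            if model.get("name", "").startswith(prefix):
                yield model
        token = page.get("nextPageToken")
        if not token:
            break  # no further pages
```

A usage example with canned pages in place of real calls:

```python
pages = {
    None: {"models": [{"name": "models/gemini-2.5-flash"}], "nextPageToken": "p2"},
    "p2": {"models": [{"name": "models/veo-3.0-generate-001"}]},
}
veo_names = [m["name"] for m in iter_models(lambda t: pages[t], prefix="models/veo")]
```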

GEMINI_WAIT_FOR_VIDEO

Polls a Veo video generation operation until completion, then downloads and returns the video as a FileDownloadable. Generation takes 30–120+ seconds (up to ~10–12 min); long waits are normal, not failures.

On completion, the URL is nested at data.video_file.s3url; validate that it is non-empty before downstream use. A done=true response without a valid s3url indicates safety filter rejection (check raiMediaFilteredReasons) or quota exhaustion; adjust the prompt and regenerate.

On timeout, use GEMINI_GET_VIDEOS_OPERATION with incremental backoff before starting a new job. Keep parallel jobs to 3–5 to avoid 429 RESOURCE_EXHAUSTED errors.

Input parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| operation_name | string | Yes | The full operation name returned by GEMINI_GENERATE_VIDEOS. Format: models/<model-id>/operations/<operation-id> where <operation-id> is an alphanumeric string (e.g., models/veo-3.0-generate-001/operations/m8dl4dtqqzg8). IMPORTANT: Do NOT use placeholder values like ...; use the exact operation_name string from the generate videos response. CRITICAL: Must be from a generate-video operation (VEO_2, VEO_3, VEO_3_FAST models), NOT generate-preview operations (VEO_3_1, VEO_3_1_FAST models). Do not reuse operation_name from a failed GEMINI_GENERATE_VIDEOS job; always start a new GEMINI_GENERATE_VIDEOS call for retried requests. |

Output fields:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |
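
Two checks described for this action lend themselves to small guards: rejecting placeholder or malformed operation_name values before the call, and confirming that data.video_file.s3url is non-empty after it. The sketch below assumes the result is a decoded dict. The regular expression is an approximation of the documented format (the characters allowed in a model ID are a guess), and both function names are hypothetical.

```python
import re
from typing import Any, Optional

# models/<model-id>/operations/<operation-id>, with an alphanumeric operation ID.
_OPERATION_NAME = re.compile(
    r"models/[A-Za-z0-9][A-Za-z0-9._-]*/operations/[A-Za-z0-9]+"
)


def is_valid_operation_name(name: str) -> bool:
    """True if `name` looks like a full operation name rather than a placeholder."""
    return _OPERATION_NAME.fullmatch(name) is not None


def video_s3url(result: dict[str, Any]) -> Optional[str]:
    """Return data.video_file.s3url if it is a non-empty string, else None."""
    data = result.get("data")
    if not isinstance(data, dict):
        return None
    video_file = data.get("video_file")
    if not isinstance(video_file, dict):
        return None
    url = video_file.get("s3url")
    if isinstance(url, str) and url.strip():
        return url
    return None
```

A None from `video_s3url` on a finished operation is the cue to inspect raiMediaFilteredReasons and regenerate with a fresh GEMINI_GENERATE_VIDEOS call.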