ElevenLabs

Overview

ElevenLabs is an AI voice synthesis platform that generates natural-sounding speech in many languages. Through the SquadOS integration, your agents can convert text to audio, clone custom voices, dub video and audio files, and create and manage full conversational AI agents with phone call support.

Official website: https://elevenlabs.io/
Composio documentation: docs.composio.dev/toolkits/elevenlabs

Authentication

This tool uses an API key (API_KEY) to connect.

You will need the following fields:

Field	Required	Description
`api_key`	Yes	Your ElevenLabs account API key, used to authenticate all requests to the API.

How to get credentials

Go to elevenlabs.io and log in or create an account.
Click your avatar in the bottom-left corner and go to Profile.
Scroll to the API key section and click Copy to copy the key.
Use this value in the api_key field when connecting in SquadOS.

How to connect in SquadOS

Go to Tools in the side menu (/admin/tools).
Open the Available tab and search for ElevenLabs.
Click the card to open the details modal and hit Connect.
You’re taken to the secure connection page hosted by Composio, where you enter the API key obtained above.
Once done, you’re sent back to SquadOS with the account connected and the tool available to your agents. (See Organization Tools for connection-flow details.)

Available actions

Text to speech

ELEVENLABS_TEXT_TO_SPEECH

Converts text to speech using a specified ElevenLabs voice and model, returning a downloadable audio file. The audio URL is nested at `data.file.s3url` in the response. For real-time streaming, use `ELEVENLABS_TEXT_TO_SPEECH_STREAM` instead.

Input parameters

Name	Type	Required	Description
`text`	string	Yes	Input text for speech conversion. Max 10,000 characters for most models. Flash/Turbo v2 models: up to 30,000. Flash/Turbo v2.5 models: up to 40,000.
`voice_id`	string	Yes	Identifier of the voice to use. Obtainable from the `/v1/voices` endpoint.
`model_id`	string	No	Identifier of the synthesis model. List available models via `/v1/models`; ensure `can_do_text_to_speech` is `true`.
`output_format`	string	No	Output audio format (e.g., `mp3_44100_128`, `pcm_24000`, `ulaw_8000`). Some formats require a specific subscription tier.
`seed`	integer	No	Integer seed for potentially deterministic audio generation.
`voice_settings`	object	No	Voice settings for controlling speech generation characteristics.
`optimize_streaming_latency`	integer	No	Latency optimization controls (0–4). Higher values reduce latency, potentially affecting quality.
`pronunciation_dictionary_locators`	array	No	List of up to 3 pronunciation dictionary locators, applied sequentially.
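
The character limits and the 0–4 latency range in the table above can be checked before the action is called. The following is a minimal sketch, not part of SquadOS or Composio: `build_tts_arguments` is a hypothetical helper, and the model IDs in `MODEL_LIMITS` are assumptions for illustration that should be confirmed against `/v1/models`.

```python
# Default limit from the `text` parameter description above.
DEFAULT_LIMIT = 10_000

# Assumed model IDs for the Flash/Turbo families; verify them via /v1/models.
MODEL_LIMITS = {
    "eleven_flash_v2": 30_000,
    "eleven_turbo_v2": 30_000,
    "eleven_flash_v2_5": 40_000,
    "eleven_turbo_v2_5": 40_000,
}


def build_tts_arguments(text, voice_id, model_id=None, latency=None):
    """Return an arguments dict for ELEVENLABS_TEXT_TO_SPEECH or raise ValueError."""
    if not text:
        raise ValueError("text must not be empty")
    limit = MODEL_LIMITS.get(model_id, DEFAULT_LIMIT)
    if len(text) > limit:
        raise ValueError(f"text has {len(text)} characters; limit is {limit}")
    if latency is not None and not 0 <= latency <= 4:
        raise ValueError("optimize_streaming_latency must be between 0 and 4")

    # Only include optional fields that were actually provided.
    args = {"text": text, "voice_id": voice_id}
    if model_id is not None:
        args["model_id"] = model_id
    if latency is not None:
        args["optimize_streaming_latency"] = latency
    return args
```

Failing early on an oversized `text` avoids spending a request on input the API would reject.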

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.
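
Since the audio URL sits at `data.file.s3url`, a caller typically has to unwrap the response before using it. The sketch below assumes the response arrives as a Python dict with the three output fields listed above. Because the table types `data` as a string, the hypothetical helper `extract_audio_url` also accepts a JSON-encoded `data` value; that handling is an assumption, not documented behavior.

```python
import json


def extract_audio_url(response):
    """Return the audio URL from a text-to-speech response dict."""
    if not response.get("successful"):
        raise RuntimeError(response.get("error") or "action failed")

    data = response.get("data")
    # Assumption: `data` may be delivered as a JSON string rather than an object.
    if isinstance(data, str):
        data = json.loads(data)

    try:
        return data["file"]["s3url"]
    except (KeyError, TypeError) as exc:
        raise KeyError("data.file.s3url is missing from the response") from exc
```

Checking `successful` first keeps error responses from surfacing as a confusing missing-key failure.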

Text to speech stream

ELEVENLABS_TEXT_TO_SPEECH_STREAM

Converts text to a spoken audio stream in real time without saving a file or creating a history entry. Ideal for real-time responses. Use `optimize_streaming_latency` to balance latency against quality.

Input parameters

Name	Type	Required	Description
`text`	string	Yes	The text to be converted into speech. Recommended to keep under 5,000 characters.
`voice_id`	string	Yes	Identifier of the voice to use. Retrieve available IDs from `GET /v1/voices`.
`model_id`	string	No	Identifier of the model. Verify `can_do_text_to_speech` is `true` for the chosen model.
`output_format`	string	No	Output audio format (e.g., `mp3_44100_128`, `pcm_24000`). Some formats require Creator or Pro tier.
`seed`	integer	No	Seed for deterministic generation.
`optimize_streaming_latency`	integer	No	Latency optimization (0–4). Value 4 disables the text normalizer for lowest latency.
`pronunciation_dictionary_locators`	array	No	List of up to 3 pronunciation dictionary locators.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.

Speech to speech

ELEVENLABS_SPEECH_TO_SPEECH

Converts an input audio file to speech using a specified voice. If a `model_id` is provided, it must support speech-to-speech conversion.

Input parameters

Name	Type	Required	Description
`audio`	object	Yes	The audio file to be converted.
`voice_id`	string	Yes	Identifier of the target voice.
`model_id`	string	No	Identifier of the model (must have `can_do_voice_conversion` equal to `true`).
`output_format`	string	No	Output audio format.
`seed`	integer	No	Seed for deterministic audio generation (0–4294967295).
`voice_settings`	string	No	JSON string defining voice settings such as `stability` and `similarity_boost`.
`optimize_streaming_latency`	integer	No	Latency optimization (0–4).

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.

Speech to speech streaming

ELEVENLABS_SPEECH_TO_SPEECH_STREAMING

Converts an input audio stream to a different voice output stream in real time, using a specified speech-to-speech model.

Input parameters

Name	Type	Required	Description
`audio`	object	Yes	The input audio file (e.g., .wav, .mp3) to be converted.
`voice_id`	string	Yes	Identifier of the voice to use.
`model_id`	string	No	Identifier of the speech-to-speech model (e.g., `eleven_english_sts_v2`).
`output_format`	string	No	Desired output audio stream format.
`seed`	integer	No	Seed for deterministic audio generation (0–4294967295).
`optimize_streaming_latency`	integer	No	Latency optimization (0–4).

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.

Add a voice

ELEVENLABS_ADD_VOICE

Adds a custom voice by uploading audio samples for voice cloning. Recommended: 1–2 minutes of clear audio without background noise. Supported formats: mp3, wav, ogg.

Input parameters

Name	Type	Required	Description
`name`	string	Yes	Name for the new voice, used as its identifier in the platform.
`files`	array	Yes	List of audio files for voice cloning. At least one file is required.
`description`	string	No	Optional description detailing the voice’s characteristics or intended use cases.
`labels`	string	No	Optional stringified JSON object of key-value pairs for categorization (e.g., `{"accent": "American"}`).
`remove_background_noise`	boolean	No	If `true`, removes background noise from samples. Only use if samples contain noise; applying to clean audio can reduce quality.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.
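
The `labels` parameter is a stringified JSON object rather than a nested object, which is easy to get wrong. A minimal sketch of assembling the arguments follows. `build_add_voice_arguments` is a hypothetical helper, and the entries of `files` are treated as opaque values because their exact shape is not specified here.

```python
import json


def build_add_voice_arguments(name, files, labels=None, description=None,
                              remove_background_noise=False):
    """Return an arguments dict for ELEVENLABS_ADD_VOICE or raise ValueError."""
    if not name:
        raise ValueError("name is required")
    if not files:
        raise ValueError("at least one audio file is required")

    args = {"name": name, "files": list(files)}
    if description:
        args["description"] = description
    if labels:
        # The action expects a JSON string, not a dict.
        args["labels"] = json.dumps(labels)
    if remove_background_noise:
        # Only enable this for noisy samples; it can degrade clean audio.
        args["remove_background_noise"] = True
    return args
```

For example, passing `labels={"accent": "American"}` produces the string `{"accent": "American"}` in the resulting arguments.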

Edit voice

ELEVENLABS_EDIT_VOICE

Updates the name, audio files, description, or labels for an existing voice model. Only voices you own (cloned voices) can be edited; premade/default voices cannot be edited. The `name` field is required for all edit operations.

Input parameters

Name	Type	Required	Description
`name`	string	Yes	Name for the voice model. This field is required.
`voice_id`	string	Yes	Identifier of the voice to edit. Only voices owned by you can be edited.
`files`	array	No	Optional list of audio files to add to the voice model. Formats: mp3, wav, ogg.
`description`	string	No	New description for the voice model.
`labels`	string	No	JSON string of key-value pairs for categorization; new labels overwrite existing ones.
`remove_background_noise`	boolean	No	If `true`, removes background noise from samples.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.

Edit voice settings

ELEVENLABS_EDIT_VOICE_SETTINGS

Edits key voice settings (e.g., stability, similarity enhancement, style exaggeration, speaker boost) for an existing voice, affecting all future audio generated with that voice ID.

Input parameters

Name	Type	Required	Description
`voice_id`	string	Yes	Identifier of the voice whose settings are to be modified.
`stability`	number	No	Controls voice consistency and randomness between generations (0.0–1.0). Lower values introduce broader emotional range.
`similarity_boost`	number	No	Determines how closely the AI adheres to the original voice (0.0–1.0).
`style`	number	No	Adjusts style exaggeration and expressiveness (0.0–1.0). Available for V2+ models.
`speed`	number	No	Controls speech rate and pacing (typically 0.25–4.0).
`use_speaker_boost`	boolean	No	Boosts similarity to the original speaker. Not available for the Eleven v3 model.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.
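
Because these settings affect all future audio generated with the voice, it can be worth validating the ranges from the table before sending them. The sketch below uses a hypothetical helper, `build_voice_settings_arguments`. It enforces only the 0.0–1.0 ranges; `speed` is passed through unchecked because its range is described as typical rather than strict.

```python
# Fields documented above with a 0.0-1.0 range.
UNIT_RANGE_FIELDS = ("stability", "similarity_boost", "style")
PASS_THROUGH_FIELDS = ("speed", "use_speaker_boost")


def build_voice_settings_arguments(voice_id, **settings):
    """Return an arguments dict for ELEVENLABS_EDIT_VOICE_SETTINGS."""
    args = {"voice_id": voice_id}
    for key, value in settings.items():
        if key in UNIT_RANGE_FIELDS:
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{key} must be between 0.0 and 1.0, got {value}")
        elif key not in PASS_THROUGH_FIELDS:
            raise ValueError(f"unknown setting: {key}")
        args[key] = value
    return args
```

Rejecting unknown keys catches typos such as `similarity` in place of `similarity_boost`.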

Delete voice by id

ELEVENLABS_DELETE_VOICE

Permanently and irreversibly deletes a specific custom voice using its `voice_id`. The authenticated user must have permission to delete it.

Input parameters

Name	Type	Required	Description
`voice_id`	string	Yes	The unique identifier of the voice to be deleted.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.

Get voices list

ELEVENLABS_GET_VOICES

Retrieves a list of all available voices along with their detailed attributes and settings.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.

Get voice

ELEVENLABS_GET_VOICE

Retrieves comprehensive details for a specific, existing voice by its `voice_id`, optionally including its settings.

Input parameters

Name	Type	Required	Description
`voice_id`	string	Yes	Identifier of the voice. Use `GET /v1/voices` to list available IDs.
`with_settings`	boolean	No	If `true`, the response will include detailed settings information for the voice.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.

Get shared voices

ELEVENLABS_GET_SHARED_VOICES

Retrieves a paginated and filterable list of shared voices from the ElevenLabs Voice Library.

Input parameters

Name	Type	Required	Description
`search`	string	No	A search term to filter voices by name or description.
`language`	string	No	Filters voices by language (ISO 639-1 code).
`gender`	string	No	Filters voices by gender.
`accent`	string	No	Filters voices by accent.
`age`	string	No	Filters voices by age group.
`category`	string	No	Filters voices by category.
`featured`	boolean	No	Filters for voices that are marked as featured.
`use_cases`	array	No	Filters voices by their intended use cases.
`page`	integer	No	Page number for pagination, starting from 0.
`page_size`	integer	No	Maximum number of shared voices per page (max 100).
`sort`	string	No	Sort options: `created_date`, `usage_character_count_1y`, `trending`, `cloned_by_count`.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.
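
Pagination here is page-based, starting at page 0 with at most 100 voices per page. The following sketch walks the pages with a caller-supplied function. `fetch_page` is a hypothetical callable that executes ELEVENLABS_GET_SHARED_VOICES with the given arguments and returns the parsed `data` payload; the `voices` and `has_more` keys are assumptions about that payload.

```python
def iter_shared_voices(fetch_page, page_size=100, max_pages=10, **filters):
    """Yield shared voices page by page until no more pages are reported."""
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")

    for page in range(max_pages):  # pages are numbered from 0
        data = fetch_page({"page": page, "page_size": page_size, **filters})
        yield from data.get("voices", [])
        # Assumed flag; stop when the payload does not report more results.
        if not data.get("has_more"):
            break
```

The `max_pages` cap is a safety limit so that a broad filter cannot trigger an unbounded number of requests.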

Generate a random voice

ELEVENLABS_GENERATE_A_RANDOM_VOICE

Generates a unique, random ElevenLabs text-to-speech voice based on input text and specified voice characteristics.

Input parameters

Name	Type	Required	Description
`text`	string	Yes	The text to be synthesized. Length must be between 100 and 1,000 characters.
`gender`	string	Yes	Gender of the generated voice: `female` or `male`.
`age`	string	Yes	Age category of the generated voice: `young`, `middle_aged`, or `old`.
`accent`	string	Yes	Accent of the generated voice: `american`, `british`, `african`, `australian`, or `indian`.
`accent_strength`	number	Yes	Controls the strength of the accent (0.3–2.0).

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.
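
All five parameters are required and each has a constrained set of values, so a small validator can catch mistakes before the action runs. This is a sketch with a hypothetical helper, `build_random_voice_arguments`; the allowed values are copied from the table above.

```python
GENDERS = {"female", "male"}
AGES = {"young", "middle_aged", "old"}
ACCENTS = {"american", "british", "african", "australian", "indian"}


def build_random_voice_arguments(text, gender, age, accent, accent_strength):
    """Return an arguments dict for ELEVENLABS_GENERATE_A_RANDOM_VOICE."""
    if not 100 <= len(text) <= 1000:
        raise ValueError(f"text must be 100-1000 characters, got {len(text)}")
    if gender not in GENDERS:
        raise ValueError(f"gender must be one of {sorted(GENDERS)}")
    if age not in AGES:
        raise ValueError(f"age must be one of {sorted(AGES)}")
    if accent not in ACCENTS:
        raise ValueError(f"accent must be one of {sorted(ACCENTS)}")
    if not 0.3 <= accent_strength <= 2.0:
        raise ValueError("accent_strength must be between 0.3 and 2.0")
    return {
        "text": text,
        "gender": gender,
        "age": age,
        "accent": accent,
        "accent_strength": accent_strength,
    }
```

The 100-character minimum on `text` is the constraint most likely to be missed, since short test phrases fall below it.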

Dub a video or an audio file

ELEVENLABS_DUB_A_VIDEO_OR_AN_AUDIO_FILE

Dubs a video or audio file into a specified target language. Requires `target_lang` and either `file` or `source_url`. If `mode` is `manual`, `csv_file` is also required.

Input parameters

Name	Type	Required	Description
`target_lang`	string	Yes	Target language code for dubbing (e.g., `en`, `es`, `pt`).
`file`	object	No	Video or audio file to dub. Required if `source_url` is not provided.
`source_url`	string	No	URL of the video or audio file to dub. Required if `file` is not provided.
`source_lang`	string	No	Language of the original audio. Use `auto` for automatic detection or provide a language code.
`name`	string	No	Name for the dubbing project.
`mode`	string	No	Dubbing mode: `automatic` for AI-driven dubbing or `manual` using a .csv file.
`num_speakers`	integer	No	Number of speakers in the audio. Use 0 for automatic detection.
`watermark`	boolean	No	Include a watermark in the dubbed audio.
`highest_resolution`	boolean	No	Process dubbing at highest possible resolution; may increase processing time.
`dubbing_studio`	boolean	No	Enable Dubbing Studio features for advanced editing capabilities.
`start_time`	integer	No	Start time in seconds for the audio portion to dub.
`end_time`	integer	No	End time in seconds for the audio portion to dub.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.
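
The conditional requirements for this action (a source given as `file` or `source_url`, plus `csv_file` in manual mode) can be expressed as a short pre-flight check. This is a sketch only: `build_dubbing_arguments` is a hypothetical helper, and `file` and `csv_file` are treated as opaque values.

```python
def build_dubbing_arguments(target_lang, file=None, source_url=None,
                            mode=None, csv_file=None, **options):
    """Return an arguments dict for ELEVENLABS_DUB_A_VIDEO_OR_AN_AUDIO_FILE."""
    if not target_lang:
        raise ValueError("target_lang is required")
    if file is None and source_url is None:
        raise ValueError("provide either file or source_url")
    if mode is not None and mode not in ("automatic", "manual"):
        raise ValueError("mode must be 'automatic' or 'manual'")
    if mode == "manual" and csv_file is None:
        raise ValueError("csv_file is required when mode is 'manual'")

    # Other optional parameters (name, num_speakers, ...) pass through as given.
    args = {"target_lang": target_lang, **options}
    for key, value in (("file", file), ("source_url", source_url),
                       ("mode", mode), ("csv_file", csv_file)):
        if value is not None:
            args[key] = value
    return args
```

A call such as `build_dubbing_arguments("es", source_url="https://example.com/clip.mp4", num_speakers=0)` yields arguments for automatic speaker detection on a remote file.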

Get generated items

ELEVENLABS_GET_GENERATED_ITEMS

Retrieves metadata for a list of generated audio items from history, supporting pagination and optional filtering by voice ID.

Input parameters

Name	Type	Required	Description
`voice_id`	string	No	Filters history items to only include those generated with the specified voice ID.
`page_size`	integer	No	Maximum number of history items to return per page (1–1000).
`start_after_history_item_id`	string	No	The ID of the history item to start fetching results after, for pagination.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.
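
Unlike the shared-voices listing, history uses cursor pagination: each request passes the ID of the last item already seen as `start_after_history_item_id`. The sketch below illustrates that loop. `fetch_page` is a hypothetical callable that executes ELEVENLABS_GET_GENERATED_ITEMS and returns the parsed `data` payload; the `history`, `history_item_id`, and `has_more` keys are assumptions about that payload.

```python
def iter_history_items(fetch_page, voice_id=None, page_size=100, max_items=1000):
    """Yield history items, following the cursor until exhausted or capped."""
    if not 1 <= page_size <= 1000:
        raise ValueError("page_size must be between 1 and 1000")

    cursor = None
    yielded = 0
    while yielded < max_items:
        args = {"page_size": page_size}
        if voice_id:
            args["voice_id"] = voice_id
        if cursor:
            args["start_after_history_item_id"] = cursor

        data = fetch_page(args)
        items = data.get("history", [])
        for item in items:
            yield item
            yielded += 1
            if yielded >= max_items:
                return
        if not items or not data.get("has_more"):
            return
        # The last item seen becomes the cursor for the next request.
        cursor = items[-1]["history_item_id"]
```

The collected IDs can then be passed as `history_item_ids` to ELEVENLABS_DOWNLOAD_HISTORY_ITEMS.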

Download history items

ELEVENLABS_DOWNLOAD_HISTORY_ITEMS

Downloads audio clips from history by ID(s), returning a single file or a ZIP archive, with an optional output format (e.g., wav).

Input parameters

Name	Type	Required	Description
`history_item_ids`	array	Yes	A list of unique string identifiers for the history items to be downloaded.
`output_format`	string	No	Optional output audio format. Accepts `wav` to convert to WAV. If omitted, returns in original synthesized format.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.

Get models

ELEVENLABS_GET_MODELS

Retrieves a detailed list of all available ElevenLabs text-to-speech (TTS) models and their capabilities.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.

Create Conversational AI Agent

ELEVENLABS_CREATE_CONVERSATIONAL_AGENT

Creates a new ElevenLabs Conversational AI agent with specified configuration. After creating the agent, you can chain other tools to attach phone numbers or configure additional settings.

Input parameters

Name	Type	Required	Description
`conversation_config`	object	Yes	Configuration object defining the agent’s conversational behavior, including prompt, LLM model, language, and first message.
`name`	string	No	Human-readable name for the agent.
`tags`	array	No	List of tags to organize and categorize the agent.
`workflow`	object	No	Workflow configuration defining conditional logic and tool execution modes.
`platform_settings`	string	No	Platform-specific settings including evaluation criteria and widget configuration.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.

Update Conversational AI Agent

ELEVENLABS_UPDATE_CONVAI_AGENT

Updates an existing ElevenLabs Conversational AI agent’s settings, such as name, conversation configuration, workflow, or platform settings.

Input parameters

Name	Type	Required	Description
`agent_id`	string	Yes	The ID of the agent to update.
`name`	string	No	A name to make the agent easier to find.
`tags`	array	No	Tags to help classify and filter the agent.
`conversation_config`	object	No	Conversation configuration for the agent.
`workflow`	object	No	Workflow configuration.
`platform_settings`	object	No	Platform settings for the agent.
`version_description`	string	No	Description for this version when publishing changes (only for versioned agents).

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.

Simulate Conversational AI Agent Conversation

ELEVENLABS_SIMULATE_CONVAI_AGENTS_SIMULATE_CONVERSATION

Runs a simulated conversation between an agent and an AI user. Returns a full transcript with analysis including success metrics and a conversation summary.

Input parameters

Name	Type	Required	Description
`agent_id`	string	Yes	The ID of the agent to simulate.
`simulation_specification`	object	Yes	A specification used to simulate a conversation between an agent and an AI user.
`new_turns_limit`	integer	No	Maximum number of new turns to generate in the conversation simulation.
`extra_evaluation_criteria`	array	No	A list of additional evaluation criteria to test.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.

Outbound call

ELEVENLABS_OUTBOUND_CALL

Places an outbound call via SIP trunk. Requires an API key with Conversational AI permissions enabled and a valid SIP trunk phone number configured for outbound calls.

Input parameters

Name	Type	Required	Description
`agent_id`	string	Yes	Agent ID to place the call with.
`to_number`	string	Yes	Destination phone number in E.164 format.
`agent_phone_number_id`	string	Yes	ID of the phone number to originate the call from (must support outbound calls).
`conversation_initiation_client_data`	object	No	Personalization and override payload for initiating the conversation.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.
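
The `to_number` field must be in E.164 format: a leading `+`, a country code that does not start with 0, and at most 15 digits in total. A minimal sketch of checking this before placing a call follows; `is_e164` and `build_outbound_call_arguments` are hypothetical helpers, and the pattern checks only the format, not whether the number is reachable.

```python
import re

# "+" followed by 2 to 15 digits, the first of which is not 0.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def is_e164(number):
    """Return True if `number` is formatted as an E.164 phone number."""
    return bool(E164_PATTERN.fullmatch(number))


def build_outbound_call_arguments(agent_id, agent_phone_number_id, to_number,
                                  client_data=None):
    """Return an arguments dict for ELEVENLABS_OUTBOUND_CALL."""
    if not is_e164(to_number):
        raise ValueError(f"to_number is not in E.164 format: {to_number!r}")
    args = {
        "agent_id": agent_id,
        "agent_phone_number_id": agent_phone_number_id,
        "to_number": to_number,
    }
    if client_data is not None:
        args["conversation_initiation_client_data"] = client_data
    return args
```

Numbers copied from a CRM often contain spaces, dashes, or a missing `+`, so normalizing them before this check avoids failed calls.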

Register Twilio Call for ConvAI Agent

ELEVENLABS_REGISTER_CALL_CONVAI_TWILIO

Registers a Twilio call and returns TwiML to connect the call to an ElevenLabs Conversational AI agent. Use when integrating ElevenLabs agents with your own Twilio infrastructure for inbound or outbound calls.

Input parameters

Name	Type	Required	Description
`agent_id`	string	Yes	The unique identifier of the Conversational AI agent to use for this call.
`to_number`	string	Yes	The phone number the call is directed to in E.164 format.
`from_number`	string	Yes	The phone number the call is originating from in E.164 format.
`direction`	string	No	Direction of the call: `inbound` for incoming calls or `outbound` for outgoing calls.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.

Get user info

ELEVENLABS_GET_USER_INFO

Retrieves detailed information about the authenticated ElevenLabs user’s account, including subscription, usage, API key, and status.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.

Get user subscription info

ELEVENLABS_GET_USER_SUBSCRIPTION_INFO

Retrieves detailed subscription information for the currently authenticated ElevenLabs user.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.

Get voice settings

ELEVENLABS_GET_VOICE_SETTINGS

Retrieves the stability, similarity, style, and speaker boost settings for a specific, existing ElevenLabs voice using its `voice_id`.

Input parameters

Name	Type	Required	Description
`voice_id`	string	Yes	Identifier of the voice for which to retrieve settings.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.

Generate Music Composition Plan

ELEVENLABS_CREATE_MUSIC_PLAN

Generates a music composition plan from a text prompt using the ElevenLabs Music API. Creates a structured plan with defined styles, sections, and durations that can be used as input for actual music generation or as a template for variations.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error if any occurred during the execution of the action.
`successful`	boolean	Yes	Whether or not the action execution was successful.