ElevenLabs

ElevenLabs is an AI voice synthesis platform that generates natural-sounding, high-fidelity speech in many languages. With the SquadOS integration, your agents can convert text to audio, clone custom voices, dub video and audio files, and create and manage full conversational AI agents with phone call support.

This tool uses an API key (API_KEY) to connect.

You will need the following fields:

| Field | Required | Description |
| --- | --- | --- |
| api_key | Yes | Your ElevenLabs account API key, used to authenticate all requests to the API. |

To get your API key:

  1. Go to elevenlabs.io and log in or create an account.
  2. Click your avatar in the bottom-left corner and go to Profile.
  3. Scroll to the API key section and click Copy to copy the key.
  4. Use this value in the api_key field when connecting in SquadOS.

To connect the tool in SquadOS:

  1. Go to Tools in the side menu (/admin/tools).
  2. Open the Available tab and search for ElevenLabs.
  3. Click the card to open the details modal and click Connect.
  4. You are taken to the secure connection page hosted by Composio, where you enter the API key obtained above.
  5. Once done, you are sent back to SquadOS with the account connected and the tool available to your agents. (Connection-flow details are in Organization Tools.)

ELEVENLABS_TEXT_TO_SPEECH

Converts text to speech using a specified ElevenLabs voice and model, returning a downloadable audio file. The audio URL is nested at data.file.s3url in the response. Use ELEVENLABS_TEXT_TO_SPEECH_STREAM for real-time streaming instead.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | Input text for speech conversion. Max 10,000 characters for most models. Flash/Turbo v2 models: up to 30,000. Flash/Turbo v2.5 models: up to 40,000. |
| voice_id | string | Yes | Identifier of the voice to use. Obtainable from the /v1/voices endpoint. |
| model_id | string | No | Identifier of the synthesis model. List available models via /v1/models; ensure can_do_text_to_speech is true. |
| output_format | string | No | Output audio format (e.g., mp3_44100_128, pcm_24000, ulaw_8000). Some formats require a specific subscription tier. |
| seed | integer | No | Integer seed for potentially deterministic audio generation. |
| voice_settings | object | No | Voice settings for controlling speech generation characteristics. |
| optimize_streaming_latency | integer | No | Latency optimization controls (0–4). Higher values reduce latency, potentially affecting quality. |
| pronunciation_dictionary_locators | array | No | List of up to 3 pronunciation dictionary locators, applied sequentially. |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |
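
The length limits and the nested audio URL described above can be checked before and after a call. The following is a minimal sketch, not part of the SquadOS or ElevenLabs API: the helper names are hypothetical, the rule for recognizing Flash/Turbo models from the model_id string is an assumption, and the response is assumed to arrive as a Python dict shaped like the output table.

```python
# Character limits come from the `text` parameter description.
# Matching model families by substring/suffix is an assumption for illustration.
DEFAULT_LIMIT = 10_000


def max_chars(model_id=None):
    """Return the documented character limit for the given model family."""
    if model_id and ("flash" in model_id or "turbo" in model_id):
        if model_id.endswith("v2_5"):
            return 40_000
        if model_id.endswith("v2"):
            return 30_000
    return DEFAULT_LIMIT


def build_tts_args(text, voice_id, model_id=None, output_format=None):
    """Build the ELEVENLABS_TEXT_TO_SPEECH arguments, rejecting over-long text."""
    limit = max_chars(model_id)
    if len(text) > limit:
        raise ValueError(f"text has {len(text)} characters; limit is {limit}")
    args = {"text": text, "voice_id": voice_id}
    if model_id is not None:
        args["model_id"] = model_id
    if output_format is not None:
        args["output_format"] = output_format
    return args


def audio_url(response):
    """Read the audio URL from data.file.s3url; return None if it is absent."""
    if not response.get("successful"):
        return None
    data = response.get("data")
    if not isinstance(data, dict):
        return None
    return (data.get("file") or {}).get("s3url")
```

Checking `successful` first avoids reading `data` from a failed call, where `error` is the field that carries the useful information.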

ELEVENLABS_TEXT_TO_SPEECH_STREAM

Converts text to a spoken audio stream in real-time without saving a file or creating a history entry. Ideal for real-time responses. Use optimize_streaming_latency to balance latency versus quality.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | The text to be converted into speech. Recommended to keep under 5,000 characters. |
| voice_id | string | Yes | Identifier of the voice to use. Retrieve available IDs from GET /v1/voices. |
| model_id | string | No | Identifier of the model. Verify can_do_text_to_speech is true for the chosen model. |
| output_format | string | No | Output audio format (e.g., mp3_44100_128, pcm_24000). Some formats require Creator or Pro tier. |
| seed | integer | No | Seed for deterministic generation. |
| optimize_streaming_latency | integer | No | Latency optimization (0–4). Value 4 disables the text normalizer for lowest latency. |
| pronunciation_dictionary_locators | array | No | List of up to 3 pronunciation dictionary locators. |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

ELEVENLABS_SPEECH_TO_SPEECH

Converts an input audio file to speech using a specified voice. If a model_id is provided, it must support speech-to-speech conversion.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| audio | object | Yes | The audio file to be converted. |
| voice_id | string | Yes | Identifier of the target voice. |
| model_id | string | No | Identifier of the model (must have can_do_voice_conversion equal to true). |
| output_format | string | No | Output audio format. |
| seed | integer | No | Seed for deterministic audio generation (0–4294967295). |
| voice_settings | string | No | JSON string defining voice settings such as stability and similarity_boost. |
| optimize_streaming_latency | integer | No | Latency optimization (0–4). |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

ELEVENLABS_SPEECH_TO_SPEECH_STREAMING

Converts an input audio stream to a different voice output stream in real-time, using a specified speech-to-speech model.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| audio | object | Yes | The input audio file (e.g., .wav, .mp3) to be converted. |
| voice_id | string | Yes | Identifier of the voice to use. |
| model_id | string | No | Identifier of the speech-to-speech model (e.g., eleven_english_sts_v2). |
| output_format | string | No | Desired output audio stream format. |
| seed | integer | No | Seed for deterministic audio generation (0–4294967295). |
| optimize_streaming_latency | integer | No | Latency optimization (0–4). |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

ELEVENLABS_ADD_VOICE

Adds a custom voice by uploading audio samples for voice cloning. Recommended: 1–2 minutes of clear audio without background noise. Supported formats: mp3, wav, ogg.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Name for the new voice, used as its identifier in the platform. |
| files | array | Yes | List of audio files for voice cloning. At least one file is required. |
| description | string | No | Optional description detailing the voice’s characteristics or intended use cases. |
| labels | string | No | Optional stringified JSON object of key-value pairs for categorization (e.g., {"accent": "American"}). |
| remove_background_noise | boolean | No | If true, removes background noise from samples. Only use if samples contain noise; applying to clean audio can reduce quality. |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |
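
Because labels is a stringified JSON object rather than a nested object, it has to be serialized before the call. The sketch below shows one way to assemble the arguments; the helper name is hypothetical, and the entries in files are placeholder strings because the expected shape of a file reference is not described here.

```python
import json


def build_add_voice_args(name, files, description=None, labels=None,
                         remove_background_noise=False):
    """Assemble ELEVENLABS_ADD_VOICE arguments from plain Python values."""
    if not files:
        raise ValueError("at least one audio file is required")
    args = {"name": name, "files": list(files)}
    if description:
        args["description"] = description
    if labels:
        # The action expects a JSON string, not a dict.
        args["labels"] = json.dumps(labels)
    if remove_background_noise:
        # Only set this for noisy samples; it can reduce quality on clean audio.
        args["remove_background_noise"] = True
    return args


# "sample-1.mp3" is a placeholder for whatever file reference your setup uses.
example = build_add_voice_args("Narrator", ["sample-1.mp3"],
                               labels={"accent": "American"})
```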

ELEVENLABS_EDIT_VOICE

Updates the name, audio files, description, or labels for an existing voice model. Only voices you own (cloned voices) can be edited; premade/default voices cannot be edited. The name field is required for all edit operations.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Name for the voice model. This field is required. |
| voice_id | string | Yes | Identifier of the voice to edit. Only voices owned by you can be edited. |
| files | array | No | Optional list of audio files to add to the voice model. Formats: mp3, wav, ogg. |
| description | string | No | New description for the voice model. |
| labels | string | No | JSON string of key-value pairs for categorization; new labels overwrite existing ones. |
| remove_background_noise | boolean | No | If true, removes background noise from samples. |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

ELEVENLABS_EDIT_VOICE_SETTINGS

Edits key voice settings (e.g., stability, similarity enhancement, style exaggeration, speaker boost) for an existing voice, affecting all future audio generated with that voice ID.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| voice_id | string | Yes | Identifier of the voice whose settings are to be modified. |
| stability | number | No | Controls voice consistency and randomness between generations (0.0–1.0). Lower values introduce broader emotional range. |
| similarity_boost | number | No | Determines how closely the AI adheres to the original voice (0.0–1.0). |
| style | number | No | Adjusts style exaggeration and expressiveness (0.0–1.0). Available for V2+ models. |
| speed | number | No | Controls speech rate and pacing (typically 0.25–4.0). |
| use_speaker_boost | boolean | No | Boosts similarity to the original speaker. Not available for the Eleven v3 model. |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |
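
Since these settings affect all future audio for the voice, it can help to validate values locally before sending them. A minimal sketch follows, using the ranges from the parameter table; the helper name is hypothetical, and the speed bounds are only the "typical" range quoted above, so the service may enforce something narrower.

```python
UNIT_RANGE_FIELDS = ("stability", "similarity_boost", "style")


def build_voice_settings_args(voice_id, **settings):
    """Validate settings against the documented ranges and return the arguments."""
    args = {"voice_id": voice_id}
    for key, value in settings.items():
        if key in UNIT_RANGE_FIELDS:
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{key} must be between 0.0 and 1.0")
        elif key == "speed":
            # Typical range per the table; treat as a soft guide.
            if not 0.25 <= value <= 4.0:
                raise ValueError("speed must be between 0.25 and 4.0")
        elif key == "use_speaker_boost":
            if not isinstance(value, bool):
                raise ValueError("use_speaker_boost must be a boolean")
        else:
            raise ValueError(f"unknown setting: {key}")
        args[key] = value
    return args
```

For example, `build_voice_settings_args("voice-123", stability=0.4, style=0.2)` returns a dict with only the settings that were passed, leaving the rest untouched on the voice. The ID "voice-123" is a placeholder.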

ELEVENLABS_DELETE_VOICE

Permanently and irreversibly deletes a specific custom voice using its voice_id. The authenticated user must have permission to delete it.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| voice_id | string | Yes | The unique identifier of the voice to be deleted. |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

ELEVENLABS_GET_VOICES

Retrieves a list of all available voices along with their detailed attributes and settings.

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

ELEVENLABS_GET_VOICE

Retrieves comprehensive details for a specific, existing voice by its voice_id, optionally including its settings.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| voice_id | string | Yes | Identifier of the voice. Use GET /v1/voices to list available IDs. |
| with_settings | boolean | No | If true, the response will include detailed settings information for the voice. |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

ELEVENLABS_GET_SHARED_VOICES

Retrieves a paginated and filterable list of shared voices from the ElevenLabs Voice Library.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| search | string | No | A search term to filter voices by name or description. |
| language | string | No | Filters voices by language (ISO 639-1 code). |
| gender | string | No | Filters voices by gender. |
| accent | string | No | Filters voices by accent. |
| age | string | No | Filters voices by age group. |
| category | string | No | Filters voices by category. |
| featured | boolean | No | Filters for voices that are marked as featured. |
| use_cases | array | No | Filters voices by their intended use cases. |
| page | integer | No | Page number for pagination, starting from 0. |
| page_size | integer | No | Maximum number of shared voices per page (max 100). |
| sort | string | No | Sort options: created_date, usage_character_count_1y, trending, cloned_by_count. |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

ELEVENLABS_GENERATE_A_RANDOM_VOICE

Generates a unique, random ElevenLabs text-to-speech voice based on input text and specified voice characteristics.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | The text to be synthesized. Length must be between 100 and 1,000 characters. |
| gender | string | Yes | Gender of the generated voice: female or male. |
| age | string | Yes | Age category of the generated voice: young, middle_aged, or old. |
| accent | string | Yes | Accent of the generated voice: american, british, african, australian, or indian. |
| accent_strength | number | Yes | Controls the strength of the accent (0.3–2.0). |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |
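
All five inputs are required and each has a fixed set of accepted values, so a pre-flight check is straightforward. The sketch below encodes the constraints from the table; the helper name is hypothetical.

```python
GENDERS = {"female", "male"}
AGES = {"young", "middle_aged", "old"}
ACCENTS = {"american", "british", "african", "australian", "indian"}


def build_random_voice_args(text, gender, age, accent, accent_strength):
    """Validate inputs against the documented constraints and return the arguments."""
    if not 100 <= len(text) <= 1000:
        raise ValueError("text must be between 100 and 1,000 characters")
    if gender not in GENDERS:
        raise ValueError(f"gender must be one of {sorted(GENDERS)}")
    if age not in AGES:
        raise ValueError(f"age must be one of {sorted(AGES)}")
    if accent not in ACCENTS:
        raise ValueError(f"accent must be one of {sorted(ACCENTS)}")
    if not 0.3 <= accent_strength <= 2.0:
        raise ValueError("accent_strength must be between 0.3 and 2.0")
    return {
        "text": text,
        "gender": gender,
        "age": age,
        "accent": accent,
        "accent_strength": accent_strength,
    }
```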

ELEVENLABS_DUB_A_VIDEO_OR_AN_AUDIO_FILE

Dub a video or audio file into a specified target language, requiring file or source_url and target_lang. If mode is manual, csv_file is also required.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| target_lang | string | Yes | Required target language code for dubbing (e.g., en, es, pt). |
| file | object | No | Video or audio file to dub. Required if source_url is not provided. |
| source_url | string | No | URL of the video or audio file to dub. Required if file is not provided. |
| source_lang | string | No | Language of the original audio. Use auto for automatic detection or provide a language code. |
| name | string | No | Name for the dubbing project. |
| mode | string | No | Dubbing mode: automatic for AI-driven dubbing or manual using a .csv file. |
| num_speakers | integer | No | Number of speakers in the audio. Use 0 for automatic detection. |
| watermark | boolean | No | Include a watermark in the dubbed audio. |
| highest_resolution | boolean | No | Process dubbing at highest possible resolution; may increase processing time. |
| dubbing_studio | boolean | No | Enable Dubbing Studio features for advanced editing capabilities. |
| start_time | integer | No | Start time in seconds for the audio portion to dub. |
| end_time | integer | No | End time in seconds for the audio portion to dub. |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |
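
The conditional requirements (file or source_url, plus csv_file in manual mode) are easy to get wrong, so here is a minimal sketch of a check that mirrors them. The helper name is hypothetical, and csv_file is taken from the action description since it does not appear in the parameter table.

```python
def build_dubbing_args(target_lang, file=None, source_url=None,
                       mode=None, csv_file=None, **optional):
    """Check the conditional requirements and return the dubbing arguments."""
    if file is None and source_url is None:
        raise ValueError("provide either file or source_url")
    if mode is not None and mode not in ("automatic", "manual"):
        raise ValueError("mode must be 'automatic' or 'manual'")
    if mode == "manual" and csv_file is None:
        raise ValueError("csv_file is required when mode is manual")
    args = {"target_lang": target_lang}
    candidates = (("file", file), ("source_url", source_url),
                  ("mode", mode), ("csv_file", csv_file))
    for key, value in candidates:
        if value is not None:
            args[key] = value
    # Remaining optional fields (name, num_speakers, start_time, ...) pass through.
    args.update(optional)
    return args


# The URL is a placeholder; num_speakers=0 requests automatic detection.
example = build_dubbing_args("es", source_url="https://example.com/clip.mp4",
                             num_speakers=0)
```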

ELEVENLABS_GET_GENERATED_ITEMS

Retrieves metadata for a list of generated audio items from history, supporting pagination and optional filtering by voice ID.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| voice_id | string | No | Filters history items to only include those generated with the specified voice ID. |
| page_size | integer | No | Maximum number of history items to return per page (1–1000). |
| start_after_history_item_id | string | No | The ID of the history item to start fetching results after, for pagination. |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

ELEVENLABS_DOWNLOAD_HISTORY_ITEMS

Downloads audio clips from history by ID(s), returning a single file or a ZIP archive, with an optional output format (e.g., wav).

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| history_item_ids | array | Yes | A list of unique string identifiers for the history items to be downloaded. |
| output_format | string | No | Optional output audio format. Accepts wav to convert to WAV. If omitted, returns in original synthesized format. |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

ELEVENLABS_GET_MODELS

Retrieves a detailed list of all available ElevenLabs text-to-speech (TTS) models and their capabilities.

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

ELEVENLABS_CREATE_CONVERSATIONAL_AGENT

Creates a new ElevenLabs Conversational AI agent with specified configuration. After creating the agent, you can chain other tools to attach phone numbers or configure additional settings.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| conversation_config | object | Yes | Configuration object defining the agent’s conversational behavior, including prompt, LLM model, language, and first message. |
| name | string | No | Human-readable name for the agent. |
| tags | array | No | List of tags to organize and categorize the agent. |
| workflow | object | No | Workflow configuration defining conditional logic and tool execution modes. |
| platform_settings | string | No | Platform-specific settings including evaluation criteria and widget configuration. |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |
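
As a rough illustration of what a conversation_config might look like, the sketch below groups the prompt, LLM, language, and first message under an agent key. The nested key names (agent, prompt, llm, first_message, language) are assumptions and are not confirmed by this page, and the helper name is hypothetical; check the action's schema in SquadOS before relying on this layout.

```python
def build_agent_args(name, system_prompt, first_message,
                     language="en", llm=None, tags=None):
    """Assemble arguments for ELEVENLABS_CREATE_CONVERSATIONAL_AGENT.

    The nesting inside conversation_config is assumed, not confirmed here.
    """
    prompt = {"prompt": system_prompt}
    if llm is not None:
        prompt["llm"] = llm
    conversation_config = {
        "agent": {
            "prompt": prompt,
            "first_message": first_message,
            "language": language,
        }
    }
    args = {"conversation_config": conversation_config, "name": name}
    if tags:
        args["tags"] = list(tags)
    return args


example = build_agent_args(
    "Support line",
    system_prompt="You answer billing questions politely and briefly.",
    first_message="Hi, how can I help with your bill today?",
    tags=["support"],
)
```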

ELEVENLABS_UPDATE_CONVAI_AGENT

Updates an existing ElevenLabs Conversational AI agent’s settings, such as name, conversation configuration, workflow, or platform settings.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| agent_id | string | Yes | The ID of the agent to update. |
| name | string | No | A name to make the agent easier to find. |
| tags | array | No | Tags to help classify and filter the agent. |
| conversation_config | object | No | Conversation configuration for the agent. |
| workflow | object | No | Workflow configuration. |
| platform_settings | object | No | Platform settings for the agent. |
| version_description | string | No | Description for this version when publishing changes (only for versioned agents). |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

Simulate Conversational AI Agent Conversation

ELEVENLABS_SIMULATE_CONVAI_AGENTS_SIMULATE_CONVERSATION

Runs a simulated conversation between an agent and an AI user. Returns a full transcript with analysis including success metrics and a conversation summary.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| agent_id | string | Yes | The ID of the agent to simulate. |
| simulation_specification | object | Yes | A specification used to simulate a conversation between an agent and an AI user. |
| new_turns_limit | integer | No | Maximum number of new turns to generate in the conversation simulation. |
| extra_evaluation_criteria | array | No | A list of additional evaluation criteria to test. |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

ELEVENLABS_OUTBOUND_CALL

Places an outbound call via SIP trunk. Requires an API key with Conversational AI permissions enabled and a valid SIP trunk phone number configured for outbound calls.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| agent_id | string | Yes | Agent ID to place the call with. |
| to_number | string | Yes | Destination phone number in E.164 format. |
| agent_phone_number_id | string | Yes | ID of the phone number to originate the call from (must support outbound calls). |
| conversation_initiation_client_data | object | No | Personalization and override payload for initiating the conversation. |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |
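
E.164 numbers start with a plus sign followed by a country code and subscriber number, with at most 15 digits in total and no spaces or punctuation. The sketch below checks that shape before building the call arguments; the helper names and the sample IDs are hypothetical, and the pattern validates format only, not whether the number actually exists.

```python
import re

# "+" followed by 2 to 15 digits, the first of which is not zero.
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")


def is_e164(number):
    """Return True if the string has the E.164 shape."""
    return bool(E164_PATTERN.match(number))


def build_outbound_call_args(agent_id, to_number, agent_phone_number_id):
    """Assemble ELEVENLABS_OUTBOUND_CALL arguments after a format check."""
    if not is_e164(to_number):
        raise ValueError(f"to_number is not in E.164 format: {to_number!r}")
    return {
        "agent_id": agent_id,
        "to_number": to_number,
        "agent_phone_number_id": agent_phone_number_id,
    }


# Placeholder IDs for illustration only.
example = build_outbound_call_args("agent-abc", "+14155552671", "phone-123")
```

The same check applies to to_number and from_number in ELEVENLABS_REGISTER_CALL_CONVAI_TWILIO, which also expects E.164.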

ELEVENLABS_REGISTER_CALL_CONVAI_TWILIO

Registers a Twilio call and returns TwiML to connect the call to an ElevenLabs Conversational AI agent. Use when integrating ElevenLabs agents with your own Twilio infrastructure for inbound or outbound calls.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| agent_id | string | Yes | The unique identifier of the Conversational AI agent to use for this call. |
| to_number | string | Yes | The phone number the call is directed to in E.164 format. |
| from_number | string | Yes | The phone number the call is originating from in E.164 format. |
| direction | string | No | Direction of the call: inbound for incoming calls or outbound for outgoing calls. |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

ELEVENLABS_GET_USER_INFO

Retrieves detailed information about the authenticated ElevenLabs user’s account, including subscription, usage, API key, and status.

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

ELEVENLABS_GET_USER_SUBSCRIPTION_INFO

Retrieves detailed subscription information for the currently authenticated ElevenLabs user.

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

ELEVENLABS_GET_VOICE_SETTINGS

Retrieves the stability, similarity, style, and speaker boost settings for a specific, existing ElevenLabs voice using its voice_id.

Input parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| voice_id | string | Yes | Identifier of the voice for which to retrieve settings. |

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |

ELEVENLABS_CREATE_MUSIC_PLAN

Generates a music composition plan from a text prompt using the ElevenLabs Music API. Creates a structured plan with defined styles, sections, and durations that can be used as input for actual music generation or as a template for variations.

Output

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error if any occurred during the execution of the action. |
| successful | boolean | Yes | Whether or not the action execution was successful. |