Tavily

Overview

Tavily is a search and information retrieval platform built for AI agents. It offers specialized endpoints for web search, page content extraction, deep website crawling, and URL structure mapping — all with precise controls for depth, domain, and format. With the Tavily integration in SquadOS, your agents gain access to up-to-date web information without the complexity of managing scrapers, proxies, or rate limits.

Official website: https://tavily.com/
Composio documentation: docs.composio.dev/toolkits/tavily

Authentication

This tool uses an API key (API_KEY) to connect.

You will need the following fields:

Field	Required	Description
`api_key`	Yes	API key generated in the Tavily dashboard at app.tavily.com, used to authenticate all requests.

How to get credentials

Go to app.tavily.com/home and create an account (or log in if you already have one).
In the dashboard, locate the API Keys section.
Click the ”+” button next to the API Keys section to generate a new key.
Fill in a name for the key, choose the type (Development for up to 100 req/min; Production for up to 1,000 req/min), and optionally set a monthly credit limit.
Copy the generated key — this is the value to use in the api_key field when connecting in SquadOS.

How to connect in SquadOS

Go to Tools in the side menu (/admin/tools).
Open the Available tab and search for Tavily.
Click the card to open the details and hit Connect.
You’re taken to the secure connection page hosted by Composio, where you enter the API key obtained above.
Once done, you’re sent back to SquadOS with the account connected and the tool available to agents. (Connection-flow details in Organization Tools.)

Available actions

Crawl website

TAVILY_CRAWL

Tool to perform intelligent graph-based website crawling with parallel path exploration and content extraction. Use when you need to traverse and extract content from multiple pages of a website following specific patterns or instructions. Supports depth/breadth controls, domain filtering, and natural language instructions for guided crawling.

Input parameters

Name	Type	Required	Description
`url`	string	Yes	Root URL to begin the crawl. Can be provided with or without protocol (e.g., `docs.tavily.com` or `https://docs.tavily.com`).
`limit`	integer	No	Total number of links to process before stopping the crawl.
`format`	string (`"markdown"` \| `"text"`)	No	Format of the extracted content: `"markdown"` or `"text"`.
`timeout`	integer	No	Maximum wait time in seconds for the crawl operation. Range: 10–150.
`max_depth`	integer	No	Maximum crawl depth from base URL. Range: 1–5. Depth of 1 means only direct links from the root URL.
`max_breadth`	integer	No	Maximum number of links to follow per page level.
`instructions`	string	No	Natural language guidance for the crawler to find specific pages or content. Using instructions increases cost to 2 credits per 10 pages. Example: `"Find all pages about the Python SDK"`.
`select_paths`	array	No	List of regex patterns for specific URL paths to include. Example: `['/docs/.', '/api/.']`.
`exclude_paths`	array	No	List of regex patterns to skip certain URL paths. Example: `['/admin/.', '/private/.']`.
`extract_depth`	string (`"basic"` \| `"advanced"`)	No	Extraction level for content: `"basic"` for standard extraction or `"advanced"` for deeper analysis.
`include_usage`	boolean	No	If true, includes credit usage information in the response.
`allow_external`	boolean	No	If true, includes links to external domains in the crawl.
`include_images`	boolean	No	If true, includes images in the crawl results.
`select_domains`	array	No	List of regex patterns for domain filtering. Only URLs matching these patterns will be crawled.
`exclude_domains`	array	No	List of regex patterns to exclude certain domains from the crawl.
`include_favicon`	boolean	No	If true, includes favicon URLs in the results.
`chunks_per_source`	integer	No	Maximum content snippets per source (max 500 chars each). Range: 1–5.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error message if execution failed.
`successful`	boolean	Yes	Whether the action executed successfully.

Extract page content

TAVILY_EXTRACT

Tool to extract and parse web page content from specified URLs using Tavily’s extract endpoint. Use when you need to retrieve clean, structured content from web pages with optional image extraction and content reranking based on query relevance.

Input parameters

Name	Type	Required	Description
`urls`	string	Yes	URL(s) to extract content from. Can be a single URL string or a list of URL strings.
`query`	string	No	User intent for reranking extracted content chunks. Helps prioritize the most relevant extracted content based on the query.
`format`	string (`"markdown"` \| `"text"`)	No	Content format for extraction: `"markdown"` or `"text"`. Default is `"markdown"`.
`timeout`	number	No	Maximum wait time in seconds for the extraction request. Must be between 1.0 and 60.0 seconds. Default is 30.0.
`extract_depth`	string (`"basic"` \| `"advanced"`)	No	Extraction depth level: `"basic"` for standard extraction or `"advanced"` for more in-depth extraction. Default is `"basic"`.
`include_usage`	boolean	No	If true, includes credit usage information in the response.
`include_images`	boolean	No	If true, includes a list of image URLs found in the extracted content.
`include_favicon`	boolean	No	If true, includes the favicon URL for each result.
`chunks_per_source`	integer	No	Maximum number of relevant chunks to extract per source. Must be between 1 and 5. Default is 3.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error message if execution failed.
`successful`	boolean	Yes	Whether the action executed successfully.

Get account usage

TAVILY_GET_USAGE

Tool to retrieve API key and account usage statistics from Tavily. Use when you need to check credit consumption, limits, and per-endpoint usage for search, extract, crawl, map, and research operations.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error message if execution failed.
`successful`	boolean	Yes	Whether the action executed successfully.

Map website

TAVILY_MAP

Tool to map a website and discover its pages. Use when you need to scan a website and get a structured list of URLs/pages it contains without extracting full content.

Input parameters

Name	Type	Required	Description
`url`	string	Yes	The root URL to begin mapping (e.g., `docs.tavily.com`). This is the starting point from which the crawler will discover and map pages.
`limit`	integer	No	Total number of links to process before stopping. Minimum is 1. Default is 50.
`timeout`	integer	No	Maximum number of seconds to wait for the mapping to complete. Range: 10–150. Default is 150.
`max_depth`	integer	No	How far from the base URL the crawler explores. Range: 1–5. Default is 1.
`max_breadth`	integer	No	The number of links to follow per page level. Minimum is 1. Default is 20.
`instructions`	string	No	Natural language directions for the crawler to guide its exploration. Using this parameter increases cost to 2 credits per 10 pages instead of 1.
`select_paths`	array	No	List of regex patterns for specific URL paths to include (e.g., `'/docs/.*'` to only include documentation paths).
`exclude_paths`	array	No	List of regex patterns to skip certain URL paths (e.g., `'/admin/.*'` to exclude admin pages).
`include_usage`	boolean	No	If true, includes credit usage details in the response. Default is false.
`allow_external`	boolean	No	If true, includes external domain links in the results. Default is true.
`select_domains`	array	No	List of regex patterns for domain targeting. Only URLs matching these domain patterns will be included.
`exclude_domains`	array	No	List of regex patterns to exclude certain domains from the mapping results.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error message if execution failed.
`successful`	boolean	Yes	Whether the action executed successfully.

Search the web

TAVILY_SEARCH

Use this to perform a web search via the Tavily API; offers controls for search depth, content types, result count, and domain filtering. Requires an active Tavily connection (401 = auth failure). Rate limit: ~2 req/s; apply exponential backoff on HTTP 429. Results are nested under response_data.results (not a flat list).

Input parameters

Name	Type	Required	Description
`query`	string	Yes	The search query string to find relevant information online. No native date-filter exists; embed time hints directly in the query string. For broad coverage, issue multiple focused queries rather than one broad query.
`max_results`	integer	No	Maximum number of search results to return. Large values combined with `include_raw_content=true` produce very large payloads.
`search_depth`	string (`"basic"` \| `"advanced"`)	No	Specifies search depth: `"basic"` (standard, 1 API Credit) or `"advanced"` (in-depth, 2 API Credits).
`include_answer`	boolean	No	If true, attempts to include a direct answer to the query (suitable for factual questions). The answer field can be null; treat `response_data.results` array as primary evidence.
`include_images`	boolean	No	If true, includes links to relevant images in search results.
`exclude_domains`	array	No	A list of domain names (e.g., `['exclude.com', 'othersite.net']`) to exclude from search results; results from these domains will be filtered out.
`include_domains`	array	No	A list of specific domain names (e.g., `['example.com', 'website.org']`) to restrict the search to; only results from these domains are returned.
`include_raw_content`	boolean	No	If true, includes raw content from visited websites (e.g., unprocessed HTML or text) in search results. Without this, results may be short snippets that omit critical detail.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error message if execution failed.
`successful`	boolean	Yes	Whether the action executed successfully.