Tavily
Overview
Tavily is a search and information retrieval platform built for AI agents. It offers specialized endpoints for web search, page content extraction, deep website crawling, and URL structure mapping, each with precise controls for depth, domain, and format. With the Tavily integration in SquadOS, your agents gain access to up-to-date web information without the complexity of managing scrapers, proxies, or rate limits.
- Official website: https://tavily.com/
- Composio documentation: docs.composio.dev/toolkits/tavily
Authentication
This tool uses an API key (API_KEY) to connect.
You will need the following fields:
| Field | Required | Description |
|---|---|---|
| api_key | Yes | API key generated in the Tavily dashboard at app.tavily.com, used to authenticate all requests. |
How to get credentials
- Go to app.tavily.com/home and create an account (or log in if you already have one).
- In the dashboard, locate the API Keys section.
- Click the “+” button next to the API Keys section to generate a new key.
- Fill in a name for the key, choose the type (Development for up to 100 req/min; Production for up to 1,000 req/min), and optionally set a monthly credit limit.
- Copy the generated key. This is the value to enter in the api_key field when connecting in SquadOS.
How to connect in SquadOS
- Go to Tools in the side menu (/admin/tools).
- Open the Available tab and search for Tavily.
- Click the card to open the details, then click Connect.
- You’re taken to the secure connection page hosted by Composio, where you enter the API key obtained above.
- Once done, you’re sent back to SquadOS with the account connected and the tool available to agents. (Connection-flow details in Organization Tools.)
Available actions
Crawl website
TAVILY_CRAWL
Tool to perform intelligent graph-based website crawling with parallel path exploration and content extraction. Use when you need to traverse and extract content from multiple pages of a website following specific patterns or instructions. Supports depth/breadth controls, domain filtering, and natural language instructions for guided crawling.
Input parameters
| Name | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | Root URL to begin the crawl. Can be provided with or without protocol (e.g., docs.tavily.com or https://docs.tavily.com). |
| limit | integer | No | Total number of links to process before stopping the crawl. |
| format | string ("markdown" or "text") | No | Format of the extracted content. |
| timeout | integer | No | Maximum wait time in seconds for the crawl operation. Range: 10–150. |
| max_depth | integer | No | Maximum crawl depth from the base URL. Range: 1–5. A depth of 1 follows only direct links from the root URL. |
| max_breadth | integer | No | Maximum number of links to follow per page level. |
| instructions | string | No | Natural language guidance for the crawler to find specific pages or content. Using instructions increases cost to 2 credits per 10 pages. Example: "Find all pages about the Python SDK". |
| select_paths | array | No | List of regex patterns for specific URL paths to include. Example: ['/docs/.*', '/api/.*']. |
| exclude_paths | array | No | List of regex patterns to skip certain URL paths. Example: ['/admin/.*', '/private/.*']. |
| extract_depth | string ("basic" or "advanced") | No | Extraction level for content: "basic" for standard extraction or "advanced" for deeper analysis. |
| include_usage | boolean | No | If true, includes credit usage information in the response. |
| allow_external | boolean | No | If true, includes links to external domains in the crawl. |
| include_images | boolean | No | If true, includes images in the crawl results. |
| select_domains | array | No | List of regex patterns for domain filtering. Only URLs matching these patterns will be crawled. |
| exclude_domains | array | No | List of regex patterns to exclude certain domains from the crawl. |
| include_favicon | boolean | No | If true, includes favicon URLs in the results. |
| chunks_per_source | integer | No | Maximum content snippets per source (max 500 chars each). Range: 1–5. |
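The ranges and allowed values in the table above can be checked before an agent calls TAVILY_CRAWL. The following is a minimal sketch, not part of SquadOS, Composio, or Tavily: `build_crawl_args` is a hypothetical helper that only assembles and validates an argument dictionary using the limits listed above.

```python
from typing import Any

# Limits and allowed values copied from the input parameter table above.
RANGES = {"timeout": (10, 150), "max_depth": (1, 5), "chunks_per_source": (1, 5)}
CHOICES = {"format": {"markdown", "text"}, "extract_depth": {"basic", "advanced"}}


def build_crawl_args(url: str, **options: Any) -> dict:
    """Assemble TAVILY_CRAWL arguments and reject out-of-range values (hypothetical helper)."""
    if not url:
        raise ValueError("url is required")
    for name, value in options.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}, got {value}")
        if name in CHOICES and value not in CHOICES[name]:
            raise ValueError(f"{name} must be one of {sorted(CHOICES[name])}, got {value!r}")
    return {"url": url, **options}


args = build_crawl_args(
    "docs.tavily.com",
    max_depth=2,
    select_paths=["/docs/.*"],
    format="markdown",
)
print(args)
```

Validating locally is a design choice for catching mistakes early; the service may apply further checks that this sketch does not model.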
Output
| Name | Type | Required | Description |
|---|---|---|---|
| data | string | Yes | Data from the action execution. |
| error | string | No | Error message if execution failed. |
| successful | boolean | Yes | Whether the action executed successfully. |
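Every action on this page returns the same three output fields, so a single check can serve all of them. The sketch below shows one way to handle that envelope. `unwrap_result` and `ActionError` are hypothetical names, and the assumption that a string `data` value may contain JSON is not confirmed by the table above.

```python
import json
from typing import Any


class ActionError(RuntimeError):
    """Raised when an action result reports successful = false."""


def unwrap_result(result: dict) -> Any:
    """Return the payload of an action result, or raise if the action failed."""
    if not result.get("successful", False):
        raise ActionError(result.get("error") or "action failed without an error message")
    data = result.get("data")
    # Assumption: a string payload may hold JSON; fall back to the raw string otherwise.
    if isinstance(data, str):
        try:
            return json.loads(data)
        except json.JSONDecodeError:
            return data
    return data


ok = unwrap_result({"successful": True, "data": '{"pages": 3}', "error": None})
print(ok)
```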
Extract page content
TAVILY_EXTRACT
Tool to extract and parse web page content from specified URLs using Tavily’s extract endpoint. Use when you need to retrieve clean, structured content from web pages with optional image extraction and content reranking based on query relevance.
Input parameters
| Name | Type | Required | Description |
|---|---|---|---|
| urls | string | Yes | URL(s) to extract content from. Can be a single URL string or a list of URL strings. |
| query | string | No | User intent for reranking extracted content chunks. Helps prioritize the most relevant extracted content based on the query. |
| format | string ("markdown" or "text") | No | Content format for extraction. Default is "markdown". |
| timeout | number | No | Maximum wait time in seconds for the extraction request. Must be between 1.0 and 60.0 seconds. Default is 30.0. |
| extract_depth | string ("basic" or "advanced") | No | Extraction depth level: "basic" for standard extraction or "advanced" for more in-depth extraction. Default is "basic". |
| include_usage | boolean | No | If true, includes credit usage information in the response. |
| include_images | boolean | No | If true, includes a list of image URLs found in the extracted content. |
| include_favicon | boolean | No | If true, includes the favicon URL for each result. |
| chunks_per_source | integer | No | Maximum number of relevant chunks to extract per source. Must be between 1 and 5. Default is 3. |
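Because `urls` accepts either a single string or a list, it can help to normalize the value before calling TAVILY_EXTRACT. This is a minimal sketch under that reading of the table. `build_extract_args` is a hypothetical helper, and its defaults simply mirror the defaults listed above.

```python
from typing import Optional, Union


def build_extract_args(
    urls: Union[str, list],
    query: Optional[str] = None,
    timeout: float = 30.0,
    chunks_per_source: int = 3,
    extract_depth: str = "basic",
    fmt: str = "markdown",
) -> dict:
    """Assemble TAVILY_EXTRACT arguments using the documented defaults (hypothetical helper)."""
    # Accept one URL or many, and always send a list.
    url_list = [urls] if isinstance(urls, str) else list(urls)
    if not url_list:
        raise ValueError("at least one URL is required")
    if not 1.0 <= timeout <= 60.0:
        raise ValueError("timeout must be between 1.0 and 60.0 seconds")
    if not 1 <= chunks_per_source <= 5:
        raise ValueError("chunks_per_source must be between 1 and 5")
    args = {
        "urls": url_list,
        "format": fmt,
        "timeout": timeout,
        "extract_depth": extract_depth,
        "chunks_per_source": chunks_per_source,
    }
    if query:  # only send a reranking query when one was given
        args["query"] = query
    return args


extract_args = build_extract_args("https://docs.tavily.com", query="rate limits")
print(extract_args)
```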
Output
| Name | Type | Required | Description |
|---|---|---|---|
| data | string | Yes | Data from the action execution. |
| error | string | No | Error message if execution failed. |
| successful | boolean | Yes | Whether the action executed successfully. |
Get account usage
TAVILY_GET_USAGE
Tool to retrieve API key and account usage statistics from Tavily. Use when you need to check credit consumption, limits, and per-endpoint usage for search, extract, crawl, map, and research operations.
Output
| Name | Type | Required | Description |
|---|---|---|---|
| data | string | Yes | Data from the action execution. |
| error | string | No | Error message if execution failed. |
| successful | boolean | Yes | Whether the action executed successfully. |
Map website
TAVILY_MAP
Tool to map a website and discover its pages. Use when you need to scan a website and get a structured list of URLs/pages it contains without extracting full content.
Input parameters
| Name | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | The root URL to begin mapping (e.g., docs.tavily.com). This is the starting point from which the crawler discovers and maps pages. |
| limit | integer | No | Total number of links to process before stopping. Minimum is 1. Default is 50. |
| timeout | integer | No | Maximum number of seconds to wait for the mapping to complete. Range: 10–150. Default is 150. |
| max_depth | integer | No | How far from the base URL the crawler explores. Range: 1–5. Default is 1. |
| max_breadth | integer | No | The number of links to follow per page level. Minimum is 1. Default is 20. |
| instructions | string | No | Natural language directions for the crawler to guide its exploration. Using this parameter increases cost to 2 credits per 10 pages instead of 1. |
| select_paths | array | No | List of regex patterns for specific URL paths to include (e.g., '/docs/.*' to only include documentation paths). |
| exclude_paths | array | No | List of regex patterns to skip certain URL paths (e.g., '/admin/.*' to exclude admin pages). |
| include_usage | boolean | No | If true, includes credit usage details in the response. Default is false. |
| allow_external | boolean | No | If true, includes external domain links in the results. Default is true. |
| select_domains | array | No | List of regex patterns for domain targeting. Only URLs matching these domain patterns will be included. |
| exclude_domains | array | No | List of regex patterns to exclude certain domains from the mapping results. |
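The `instructions` row quotes two rates for mapping: 1 credit per 10 pages normally and 2 credits per 10 pages with instructions. A rough estimate can be sketched as follows. `estimate_map_credits` is a hypothetical helper, and rounding partial blocks of 10 pages up to a full block is an assumption rather than documented billing behavior.

```python
import math


def estimate_map_credits(pages: int, uses_instructions: bool = False) -> int:
    """Estimate credits for a TAVILY_MAP call from the per-10-pages rates quoted above."""
    if pages < 0:
        raise ValueError("pages must not be negative")
    rate = 2 if uses_instructions else 1  # credits per block of 10 pages
    # Assumption: a partially filled block of 10 pages is billed as a full block.
    return math.ceil(pages / 10) * rate


print(estimate_map_credits(50))                          # 5
print(estimate_map_credits(50, uses_instructions=True))  # 10
print(estimate_map_credits(25, uses_instructions=True))  # 6
```

For actual consumption, TAVILY_GET_USAGE or the include_usage parameter is the authoritative source.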
Output
| Name | Type | Required | Description |
|---|---|---|---|
| data | string | Yes | Data from the action execution. |
| error | string | No | Error message if execution failed. |
| successful | boolean | Yes | Whether the action executed successfully. |
Search the web
TAVILY_SEARCH
Performs a web search via the Tavily API, with controls for search depth, content types, result count, and domain filtering. Requires an active Tavily connection (an HTTP 401 response indicates an authentication failure). Rate limit: about 2 requests per second; apply exponential backoff on HTTP 429. Results are nested under response_data.results rather than returned as a flat list.
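The backoff advice above can be sketched as a small retry loop. This is an illustration only: `call_with_backoff` and its `send` callable are hypothetical, and how SquadOS or Composio exposes the HTTP status to your code is an assumption. The 0.5 second base delay is chosen to match the roughly 2 requests per second limit.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, object]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> object:
    """Call send() until it stops returning HTTP 429, doubling the wait each time."""
    for attempt in range(max_attempts):
        status, body = send()
        if status == 401:
            raise PermissionError("authentication failed: check the Tavily connection")
        if status != 429:
            return body
        if attempt < max_attempts - 1:
            # Exponential backoff: 0.5 s, 1 s, 2 s, ... plus up to 10% random jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")


# Demo with a fake sender: two 429 responses, then success. No real request is made.
responses = iter([(429, None), (429, None), (200, {"results": []})])
waits = []
result = call_with_backoff(lambda: next(responses), sleep=waits.append)
print(result, len(waits))
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; in production the default `time.sleep` applies.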
Input parameters
| Name | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | The search query string to find relevant information online. No native date filter exists; embed time hints directly in the query string. For broad coverage, issue multiple focused queries rather than one broad query. |
| max_results | integer | No | Maximum number of search results to return. Large values combined with include_raw_content=true produce very large payloads. |
| search_depth | string ("basic" or "advanced") | No | Specifies search depth: "basic" (standard, 1 API credit) or "advanced" (in-depth, 2 API credits). |
| include_answer | boolean | No | If true, attempts to include a direct answer to the query (suitable for factual questions). The answer field can be null; treat the response_data.results array as primary evidence. |
| include_images | boolean | No | If true, includes links to relevant images in search results. |
| exclude_domains | array | No | A list of domain names (e.g., ['exclude.com', 'othersite.net']) to exclude from search results; results from these domains will be filtered out. |
| include_domains | array | No | A list of specific domain names (e.g., ['example.com', 'website.org']) to restrict the search to; only results from these domains are returned. |
| include_raw_content | boolean | No | If true, includes raw content from visited websites (e.g., unprocessed HTML or text) in search results. Without this, results may be short snippets that omit critical detail. |
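The `query` guidance above (several focused queries, with time hints embedded in the text) can be sketched as a helper that fans one topic out into multiple argument sets. `focused_search_args` is a hypothetical name, and the `max_results` value of 5 and the "2025" hint are illustrative choices, not documented defaults.

```python
def focused_search_args(
    topic: str,
    facets: list,
    time_hint: str = "",
    max_results: int = 5,
) -> list:
    """Build one TAVILY_SEARCH argument set per facet, with the time hint inside the query."""
    calls = []
    for facet in facets:
        # No date filter exists, so the time hint becomes part of the query text.
        query = " ".join(part for part in (topic, facet, time_hint) if part)
        calls.append(
            {
                "query": query,
                "max_results": max_results,
                "search_depth": "basic",  # 1 credit; "advanced" costs 2
                "include_raw_content": False,  # keeps payloads small
            }
        )
    return calls


calls = focused_search_args("Tavily API", ["pricing", "rate limits"], time_hint="2025")
for call in calls:
    print(call["query"])
```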
Output
| Name | Type | Required | Description |
|---|---|---|---|
| data | string | Yes | Data from the action execution. |
| error | string | No | Error message if execution failed. |
| successful | boolean | Yes | Whether the action executed successfully. |
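Since search results are nested under response_data.results and the answer may be null, a reader function can make both cases explicit. This is a sketch under assumptions: `read_search_payload` is a hypothetical helper, the placement of `answer` next to `results` is assumed, and the `url` field in the sample items is illustrative only.

```python
from typing import Optional, Tuple


def read_search_payload(data: dict) -> Tuple[list, Optional[str]]:
    """Return (results, answer) from a search payload shaped as described above."""
    response = data.get("response_data") or {}
    results = response.get("results") or []
    if not isinstance(results, list):
        raise TypeError("response_data.results should be a list")
    # The answer may be null even when include_answer is true.
    answer = response.get("answer")
    return results, answer


# Sample payload for illustration; the item fields are assumptions.
sample = {
    "response_data": {
        "answer": None,
        "results": [{"url": "https://example.com/a"}, {"url": "https://example.com/b"}],
    }
}
results, answer = read_search_payload(sample)
print(len(results), answer)
```

Treating the results list as the primary evidence, and the answer as optional, follows the include_answer note in the input parameters.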