Firecrawl

Firecrawl automates web crawling and data extraction, enabling organizations to gather content, index sites, and gain insights from online sources at scale. With the integration in SquadOS, your agents can scrape pages, start full crawls, extract structured data via LLM, and perform autonomous deep research across multiple web sources.

This tool uses an API key (API_KEY) to connect.

You will need the following fields:

Field | Required | Description
api_key | Yes | Your Firecrawl account API key, used to authenticate all requests.

To get the API key:

  1. Go to firecrawl.dev and create an account or log in.
  2. In the dashboard, navigate to the API Keys section (or Settings → API Keys).
  3. Click Create new key, give it a descriptive name, and copy the generated value.

To connect the tool in SquadOS:

  1. Go to Tools in the side menu (/admin/tools).
  2. Open the Available tab and search for Firecrawl.
  3. Click the card to open the details, then click Connect.
  4. You are taken to the secure connection page hosted by Composio, where you enter the API key obtained above.
  5. Once done, you are sent back to SquadOS with the account connected and the tool available for your agents. (See Organization Tools for connection-flow details.)

FIRECRAWL_AGENT_CANCEL

Tool to cancel an in-progress agent job by its ID. Use when you need to terminate an active agent operation. The API returns a success boolean upon cancellation.

Name | Type | Required | Description
id | string | Yes | The unique identifier (UUID) of the agent job to cancel.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.
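
Every action on this page returns the same three output fields (data, error, successful). A minimal sketch of how calling code might check them is shown below; the `unwrap` helper, the `ToolExecutionError` class, and the sample values are hypothetical and not part of SquadOS or Firecrawl.

```python
from typing import Any, Dict


class ToolExecutionError(RuntimeError):
    """Raised when a response reports that the action did not succeed."""


def unwrap(response: Dict[str, Any]) -> Any:
    """Return the `data` field, or raise if `successful` is false."""
    if not response.get("successful"):
        raise ToolExecutionError(response.get("error") or "unknown error")
    return response.get("data")


# Hypothetical response values, shaped like the output fields listed above.
ok = {"data": '{"success": true}', "error": None, "successful": True}
failed = {"data": "", "error": "job not found", "successful": False}

print(unwrap(ok))
```

Checking `successful` before reading `data` keeps error handling in one place, whichever action produced the response.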

FIRECRAWL_BATCH_SCRAPE

Tool to scrape multiple URLs in batch with concurrent processing. Use when you need to scrape multiple web pages efficiently with customizable formats and content filtering.

Name | Type | Required | Description
urls | array | Yes | The URLs to be scraped in batch. At least one URL is required.
proxy | string | No | Proxy type to use for requests (basic, stealth, or auto).
maxAge | integer | No | Cache validity period in milliseconds. Default is 2 days.
mobile | boolean | No | If true, emulate a mobile device when scraping. Defaults to false.
actions | array | No | Browser actions to perform on each page before scraping.
formats | array | No | Desired output formats for the scraped content. Defaults to ['markdown'].
headers | object | No | Custom HTTP headers to send with each request.
timeout | integer | No | Request timeout in milliseconds.
waitFor | integer | No | Delay in milliseconds before content retrieval. Useful for pages with dynamic content. Defaults to 0.
webhook | object | No | Webhook configuration for batch scrape notifications.
blockAds | boolean | No | If true, block advertisements during scraping. Defaults to true.
location | object | No | Location settings for the request.
excludeTags | array | No | HTML tags to specifically exclude from the output.
includeTags | array | No | HTML tags to specifically include in the output.
storeInCache | boolean | No | If true, store scraped content in cache for future use. Defaults to true.
maxConcurrency | integer | No | Maximum number of concurrent scrape operations.
onlyMainContent | boolean | No | If true, extract only the main content, excluding headers, footers, navigation bars, and ads. Defaults to true.
ignoreInvalidURLs | boolean | No | If true, skip invalid URLs instead of failing the entire batch. Defaults to true.
zeroDataRetention | boolean | No | If true, do not retain any scraped data. Defaults to false.
removeBase64Images | boolean | No | If true, remove base64-encoded images from the scraped content. Defaults to true.
skipTlsVerification | boolean | No | If true, skip TLS certificate verification. Defaults to true.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.
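
As an illustration of the parameters above, the sketch below assembles a FIRECRAWL_BATCH_SCRAPE input and checks the two constraints the table states (at least one URL, and a proxy value of basic, stealth, or auto). The helper name `build_batch_scrape_params` is hypothetical; how the resulting dictionary is passed to an agent is outside this example.

```python
from typing import Any, Dict, List

PROXY_TYPES = {"basic", "stealth", "auto"}


def build_batch_scrape_params(urls: List[str], **options: Any) -> Dict[str, Any]:
    """Assemble batch scrape input, enforcing the documented constraints."""
    if not urls:
        raise ValueError("urls must contain at least one URL")
    proxy = options.get("proxy")
    if proxy is not None and proxy not in PROXY_TYPES:
        raise ValueError(f"proxy must be one of {sorted(PROXY_TYPES)}")
    return {"urls": list(urls), **options}


params = build_batch_scrape_params(
    ["https://example.com/a", "https://example.com/b"],
    formats=["markdown"],
    onlyMainContent=True,
    waitFor=1000,       # milliseconds, per the table above
    maxConcurrency=2,
)
```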

FIRECRAWL_BATCH_SCRAPE_CANCEL

Tool to cancel a running batch scrape job using its unique identifier. Use when you need to terminate an in-progress batch scrape operation.

Name | Type | Required | Description
id | string | Yes | The unique identifier (UUID) of the batch scrape job to cancel.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_BATCH_SCRAPE_GET

Retrieves the current status and results of a batch scrape job using the job ID. Use this to check batch scrape progress and retrieve scraped data.

Name | Type | Required | Description
id | string | Yes | The ID of the batch scrape job. Must be a valid UUID format obtained when the batch scrape was initiated.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.
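
Checking batch scrape progress usually means calling this action repeatedly until the job finishes. The sketch below shows one generic way to do that. The `poll_until_done` helper, the stand-in `fake_batch_scrape_get` function, and the status values "scraping", "completed", and "failed" are assumptions for illustration; this page does not document the shape of the returned data.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(
    fetch: Callable[[], Dict[str, Any]],
    is_done: Callable[[Dict[str, Any]], bool],
    interval_s: float = 2.0,
    max_attempts: int = 30,
) -> Dict[str, Any]:
    """Call `fetch` until `is_done` accepts the result or attempts run out."""
    for _ in range(max_attempts):
        result = fetch()
        if is_done(result):
            return result
        time.sleep(interval_s)
    raise TimeoutError("job did not finish within the allotted attempts")


# Stand-in for FIRECRAWL_BATCH_SCRAPE_GET: reports "scraping" twice, then "completed".
_statuses = iter(["scraping", "scraping", "completed"])


def fake_batch_scrape_get() -> Dict[str, Any]:
    return {"status": next(_statuses)}


final = poll_until_done(
    fake_batch_scrape_get,
    is_done=lambda r: r["status"] in {"completed", "failed"},
    interval_s=0.0,
)
print(final["status"])  # completed
```

Bounding the number of attempts avoids waiting forever on a job that never reaches a terminal state.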

FIRECRAWL_BATCH_SCRAPE_GET_ERRORS

Tool to retrieve error details from a batch scrape job, including failed URLs and URLs blocked by robots.txt. Use when you need to debug or understand why certain pages failed to scrape in a batch operation.

Name | Type | Required | Description
id | string | Yes | Unique identifier (UUID) of the batch scrape job.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_CRAWL

Initiates a Firecrawl web crawl from a given URL, applying various filtering and content extraction rules, and polls until the job is complete; ensure the URL is accessible and any regex patterns for paths are valid.

Name | Type | Required | Description
url | string | Yes | The base URL to start crawling from. This is the initial entry point for the web crawler.
delay | integer | No | Delay in milliseconds between requests to avoid overwhelming the server.
limit | integer | No | Maximum number of pages to crawl. The crawl will stop once this limit is reached. Default is 10.
webhook | string | No | An optional webhook URL to receive real-time updates on the crawl job. Events include crawl start (crawl.started), page crawled (crawl.page), and crawl completion (crawl.completed or crawl.failed).
maxDepth | integer | No | Maximum depth of subpages to crawl relative to the entered URL. A depth of 0 crawls only the entered URL, 1 crawls one path segment deeper, etc.
excludePaths | array | No | A list of regex patterns for URL paths to exclude from the crawl.
includePaths | array | No | A list of regex patterns for URL paths to include in the crawl. Only matching paths will be processed.
ignoreSitemap | boolean | No | If true, the crawler will ignore any sitemap.xml found on the website.
crawlEntireDomain | boolean | No | If true, allows the crawler to follow internal links to sibling or parent URLs, not just child paths. Replaces allowBackwardLinks.
maxDiscoveryDepth | integer | No | Maximum depth to crawl based on discovery order. Root site and sitemapped pages have discovery depth 0.
allowBackwardLinks | boolean | No | DEPRECATED: Use crawlEntireDomain instead. If true, allows the crawler to navigate to pages that were linked from already visited pages.
allowExternalLinks | boolean | No | If true, allows the crawler to follow links that lead to external websites.
scrapeOptionsProxy | string | No | Proxy configuration for requests.
scrapeOptionsMaxAge | integer | No | Maximum age in seconds for cached content. If older, it will be re-scraped.
scrapeOptionsMobile | boolean | No | If true, emulate a mobile device when scraping.
scrapeOptionsActions | array | No | List of actions to perform on each page before scraping (e.g., clicking buttons, waiting).
scrapeOptionsFormats | array | No | Specifies the desired output formats for the scraped content from each page. Default is ["markdown"]. IMPORTANT: If "json" format is included, scrapeOptionsJsonOptions must also be provided.
scrapeOptionsHeaders | object | No | Custom HTTP headers to send with each request.
scrapeOptionsTimeout | integer | No | Timeout in milliseconds for each page request. Default is 30000ms.
scrapeOptionsWaitFor | integer | No | Additional milliseconds to wait after Firecrawl’s smart wait, before scraping the page.
ignoreQueryParameters | boolean | No | If true, ignore query parameters when determining if a URL has been visited.
scrapeOptionsBlockAds | boolean | No | If true, block advertisements during scraping.
scrapeOptionsLocation | object | No | Geolocation settings for the scraper.
scrapeOptionsParsePDF | boolean | No | If true, attempt to parse PDF files encountered during crawling.
scrapeOptionsExcludeTags | array | No | A list of HTML tags to exclude from the scraped output. Content within these tags will be removed before processing.
scrapeOptionsIncludeTags | array | No | A list of HTML tags to specifically include in the scraped output. If empty or null, all relevant content is considered.
scrapeOptionsJsonOptions | object | No | Options for JSON format extraction including schema and prompts. REQUIRED when "json" format is specified in scrapeOptionsFormats.
scrapeOptionsStoreInCache | boolean | No | If true, store scraped content in cache for future use.
scrapeOptionsOnlyMainContent | boolean | No | If true, extract only the main content of each page, excluding headers, navigation bars, and footers. Default is true.
scrapeOptionsRemoveBase64Images | boolean | No | If true, remove base64-encoded images from the scraped content.
scrapeOptionsSkipTlsVerification | boolean | No | If true, skip TLS certificate verification.
scrapeOptionsChangeTrackingOptions | object | No | Options for tracking changes between crawls.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.
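
Two rules above are easy to get wrong: path patterns must be valid regexes, and requesting the "json" format requires scrapeOptionsJsonOptions. The sketch below checks both before a crawl is started. The helper name `build_crawl_params` is hypothetical, the `{"prompt": ...}` shape of the JSON options is an assumption, and Python's `re` module is used only as an approximate validity check, since Firecrawl's regex dialect may differ.

```python
import re
from typing import Any, Dict


def build_crawl_params(url: str, **options: Any) -> Dict[str, Any]:
    """Assemble FIRECRAWL_CRAWL input after basic local validation."""
    for key in ("includePaths", "excludePaths"):
        for pattern in options.get(key, []):
            re.compile(pattern)  # raises re.error on an invalid pattern
    formats = options.get("scrapeOptionsFormats", ["markdown"])
    if "json" in formats and "scrapeOptionsJsonOptions" not in options:
        raise ValueError("scrapeOptionsJsonOptions is required when 'json' is requested")
    return {"url": url, **options}


params = build_crawl_params(
    "https://example.com/docs",
    limit=25,
    maxDepth=2,
    includePaths=[r"^/docs/.*"],
    excludePaths=[r"^/docs/archive/.*"],
    scrapeOptionsFormats=["markdown", "json"],
    # Assumed shape: a prompt-only JSON options object.
    scrapeOptionsJsonOptions={"prompt": "Extract the page title and a short summary."},
)
```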

FIRECRAWL_CRAWL_CANCEL

Cancels an active or queued web crawl job using its ID; attempting to cancel completed, failed, or previously canceled jobs will not change their state.

Name | Type | Required | Description
id | string | Yes | The unique identifier (UUID) of the crawl job to be canceled.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_CRAWL_DELETE

Tool to cancel a running crawl job by its ID. Use when you need to stop an active crawl operation. The API returns a status of cancelled upon successful cancellation.

Name | Type | Required | Description
id | string | Yes | The unique identifier (UUID) of the crawl job to cancel.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_CRAWL_GET

Tool to retrieve the status and results of a Firecrawl crawl job. Use when you need to check the progress or get data from an ongoing or completed crawl operation. Returns crawl status, progress metrics, credits used, and the crawled page data.

Name | Type | Required | Description
id | string | Yes | The ID of the crawl job to check status for. This is the UUID returned when the crawl was initiated.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_CRAWL_GET_ERRORS

Tool to retrieve errors from a Firecrawl crawl job. Use when you need to understand why certain pages failed to scrape or which URLs were blocked by robots.txt during a crawl operation.

Name | Type | Required | Description
id | string | Yes | The unique identifier (UUID) of the crawl job to retrieve errors from.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_CRAWL_LIST_ACTIVE

Tool to retrieve all active crawl jobs for the authenticated team. Use when you need to see which crawl operations are currently running.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_CRAWL_PARAMS_PREVIEW

Preview crawl parameters before starting a crawl by generating optimal configuration from natural language instructions. Use this tool to understand what crawl settings will be applied based on your requirements before executing a full crawl operation.

Name | Type | Required | Description
url | string | Yes | The website address to be crawled. This is the target URL for which crawl parameters will be generated.
prompt | string | Yes | Natural language description of crawling requirements (max 10,000 characters). Describe what pages to crawl, what to include or exclude, and any specific crawl behavior needed.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.
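
Because the prompt is capped at 10,000 characters, it can help to check its length before requesting a preview. A minimal sketch, using a hypothetical `build_preview_params` helper:

```python
from typing import Dict

MAX_PROMPT_CHARS = 10_000  # limit stated in the parameter table above


def build_preview_params(url: str, prompt: str) -> Dict[str, str]:
    """Assemble FIRECRAWL_CRAWL_PARAMS_PREVIEW input, rejecting oversized prompts."""
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    return {"url": url, "prompt": prompt}


preview = build_preview_params(
    "https://example.com",
    "Crawl the documentation pages only and skip the changelog.",
)
```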

FIRECRAWL_CRAWL_V2

[NEW v2 API] Initiates a Firecrawl v2 web crawl with enhanced features over v1: natural language prompts for automatic crawler configuration, crawlEntireDomain for sibling/parent page discovery, better depth control with maxDiscoveryDepth, subdomain support, and full webhook configuration. Polls until crawl is complete.

Name | Type | Required | Description
url | string | Yes | The base URL to start crawling from. This is the initial entry point for the web crawler.
delay | integer | No | Delay in seconds between scrapes to respect website rate limits.
limit | integer | No | Maximum number of pages to crawl. Default is 10.
prompt | string | No | A natural language prompt to automatically generate crawler settings. Explicitly set parameters will override the generated equivalents.
sitemap | string | No | Sitemap mode: include (default) uses sitemap and discovers other pages; skip ignores sitemap entirely; only crawls exclusively URLs from the sitemap.
webhook | object | No | Webhook configuration for receiving real-time crawl updates.
excludePaths | array | No | A list of regex patterns for URL paths to exclude from the crawl.
includePaths | array | No | A list of regex patterns for URL paths to include in the crawl.
maxConcurrency | integer | No | Maximum number of concurrent scrapes. If not specified, uses your team’s concurrency limit.
allowSubdomains | boolean | No | If true, allows the crawler to follow links to subdomains of the main domain.
crawlEntireDomain | boolean | No | Allows the crawler to follow internal links to sibling or parent URLs, not just child paths.
maxDiscoveryDepth | integer | No | Maximum depth to crawl based on discovery order. Root site and sitemapped pages have discovery depth 0.
zeroDataRetention | boolean | No | If true, enables zero data retention for this crawl. Contact help@firecrawl.dev to enable.
allowExternalLinks | boolean | No | If true, allows the crawler to follow links to external websites. Defaults to false.
scrapeOptions_proxy | string | No | Proxy configuration for requests.
scrapeOptions_maxAge | integer | No | Maximum age in milliseconds for cached content. If older, it will be re-scraped.
scrapeOptions_mobile | boolean | No | If true, emulate a mobile device when scraping.
ignoreQueryParameters | boolean | No | If true, do not re-scrape the same path with different (or no) query parameters.
scrapeOptions_actions | array | No | List of actions to perform on each page before scraping (e.g., clicking buttons, waiting).
scrapeOptions_formats | array | No | Specifies the desired output formats for the scraped content from each page. For JSON extraction, use a JsonFormatOptions object with type="json", optional schema, and optional prompt.
scrapeOptions_headers | object | No | Custom HTTP headers to send with each request.
scrapeOptions_parsers | array | No | List of parsers to use for specific content types (e.g., pdf).
scrapeOptions_timeout | integer | No | Timeout in milliseconds for each page request. Default is 30000ms.
scrapeOptions_waitFor | integer | No | Duration in milliseconds to wait for page JavaScript to execute and content to load before scraping.
scrapeOptions_blockAds | boolean | No | If true, block advertisements during scraping.
scrapeOptions_location | object | No | Geolocation settings for the scraper.
scrapeOptions_excludeTags | array | No | A list of HTML tags to exclude from the scraped output. Content within these tags will be removed.
scrapeOptions_includeTags | array | No | A list of HTML tags to specifically include in the scraped output.
scrapeOptions_storeInCache | boolean | No | If true, store scraped content in cache for future use.
scrapeOptions_onlyMainContent | boolean | No | If true, extract only the main content of each page, excluding headers, navigation bars, and footers. Default is true.
scrapeOptions_removeBase64Images | boolean | No | If true, remove base64-encoded images from the scraped content.
scrapeOptions_skipTlsVerification | boolean | No | If true, skip TLS certificate verification.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.
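
The sketch below assembles a FIRECRAWL_CRAWL_V2 input that combines a natural language prompt with explicit settings, and validates the three documented sitemap modes. The helper name `build_crawl_v2_params` is hypothetical.

```python
from typing import Any, Dict

SITEMAP_MODES = {"include", "skip", "only"}


def build_crawl_v2_params(url: str, sitemap: str = "include", **options: Any) -> Dict[str, Any]:
    """Assemble v2 crawl input, rejecting unknown sitemap modes."""
    if sitemap not in SITEMAP_MODES:
        raise ValueError(f"sitemap must be one of {sorted(SITEMAP_MODES)}")
    return {"url": url, "sitemap": sitemap, **options}


params = build_crawl_v2_params(
    "https://example.com",
    prompt="Crawl only the blog posts",
    limit=50,  # set explicitly, so it overrides whatever the prompt would generate
    scrapeOptions_formats=["markdown"],
    scrapeOptions_onlyMainContent=True,
)
```

Note that the v2 scrape options use an underscore after the scrapeOptions prefix (scrapeOptions_formats), unlike the names listed for FIRECRAWL_CRAWL (scrapeOptionsFormats).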

FIRECRAWL_CREDIT_USAGE_GET

Tool to get current team credit usage information. Use when you need to check remaining credits or billing period details.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_CREDIT_USAGE_GET_HISTORICAL

Tool to retrieve historical team credit usage on a monthly basis. Use when you need to analyze credit consumption patterns over time, optionally segmented by API key.

Name | Type | Required | Description
byApiKey | boolean | No | When enabled, breaks down usage by individual API key. Defaults to false.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_DEEP_RESEARCH

Initiates an AI-powered deep research operation that autonomously explores the web to investigate any topic and synthesizes findings from multiple sources. The research process iteratively searches, analyzes, and synthesizes information across multiple web sources, providing comprehensive insights with source citations. Results include a final analysis, detailed activity timeline, and curated source list. Billing: 1 credit per URL analyzed. Control costs with the maxUrls parameter. Note: This API is in Alpha and being deprecated after June 30, 2025; prefer FIRECRAWL_SEARCH + FIRECRAWL_EXTRACT or COMPOSIO_SEARCH_WEB for durable workflows.

Name | Type | Required | Description
query | string | Yes | The research question or topic to investigate. Provide a clear, specific question or topic for best results.
formats | array | No | Output format list. Set to ["json"] to get structured JSON output. When using "json" format, you must also provide jsonOptions.
maxUrls | integer | No | Maximum number of URLs to analyze during research. Range: 1–1000. Default: 20. Higher values provide more comprehensive results but consume more credits (1 credit per URL).
maxDepth | integer | No | Controls how many iterations the research process goes through. Range: 1–10. Default: 7. Higher depth means more thorough research but longer processing time.
timeLimit | integer | No | Time limit for the research job in seconds. Range: 30–300. Default: 270. Research will stop when this limit is reached.
jsonOptions | object | No | Configuration for JSON structured output. Must contain either "schema" (a valid JSON Schema dict) or "prompt" (a string).
systemPrompt | string | No | Custom system-level prompt to guide the agentic research exploration process.
analysisPrompt | string | No | Custom prompt to guide the final synthesis and analysis generation.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.
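
The ranges, defaults, and billing rule above can be captured in a small validation step. The sketch below applies the documented defaults, rejects out-of-range values, and derives the worst-case credit cost from maxUrls (1 credit per URL analyzed). The helper names `build_deep_research_params` and `max_credit_cost` are hypothetical.

```python
from typing import Any, Dict

# Ranges and defaults as listed in the parameter table above.
LIMITS = {"maxUrls": (1, 1000), "maxDepth": (1, 10), "timeLimit": (30, 300)}
DEFAULTS = {"maxUrls": 20, "maxDepth": 7, "timeLimit": 270}


def build_deep_research_params(query: str, **options: Any) -> Dict[str, Any]:
    """Assemble FIRECRAWL_DEEP_RESEARCH input with range checks."""
    params = {"query": query, **DEFAULTS, **options}
    for name, (low, high) in LIMITS.items():
        if not low <= params[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    if "json" in params.get("formats", []) and "jsonOptions" not in params:
        raise ValueError("jsonOptions is required when the 'json' format is requested")
    return params


def max_credit_cost(params: Dict[str, Any]) -> int:
    """Upper bound on credits: 1 credit per URL analyzed."""
    return params["maxUrls"]


params = build_deep_research_params(
    "What are common approaches to rate limiting in public APIs?",
    maxUrls=50,
)
print(max_credit_cost(params))  # 50
```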

FIRECRAWL_EXTRACT

Extracts structured data from web pages by initiating an extraction job and polling for completion; requires a natural language prompt or a JSON schema (one must be provided).

Name | Type | Required | Description
urls | array | Yes | A list of URLs from which to extract data (maximum 10 URLs while in beta). Wildcards (e.g., https://example.com/blog/*) can be used for crawling multiple pages under a specific path.
prompt | string | No | Natural language query for information to extract from URL content. At least one of prompt or schema must be provided.
schema | object | No | JSON object defining the desired structure for extracted data. Must be a valid JSON Schema object with properties and types. At least one of prompt or schema must be provided.
showSources | boolean | No | When true, the sources used to extract the data will be included in the response as sources key.
ignoreSitemap | boolean | No | Bypasses sitemap.xml during scanning.
scrapeOptions | object | No | Advanced scraping configuration.
enableWebSearch | boolean | No | If true, allows crawling links outside initial domains in urls; if false, restricts to same domains.
ignoreInvalidURLs | boolean | No | Proceeds with valid URLs, returning invalid ones separately.
includeSubdomains | boolean | No | Extends scanning to subdomains.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.
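
A minimal sketch of an extract request follows, enforcing the two rules stated above: at most 10 URLs while in beta, and at least one of prompt or schema. The helper name `build_extract_params` and the example schema are hypothetical.

```python
from typing import Any, Dict, List, Optional

MAX_URLS_BETA = 10  # limit stated in the parameter table above


def build_extract_params(
    urls: List[str],
    prompt: Optional[str] = None,
    schema: Optional[Dict[str, Any]] = None,
    **options: Any,
) -> Dict[str, Any]:
    """Assemble FIRECRAWL_EXTRACT input with the documented checks."""
    if not urls or len(urls) > MAX_URLS_BETA:
        raise ValueError(f"urls must contain between 1 and {MAX_URLS_BETA} entries")
    if prompt is None and schema is None:
        raise ValueError("provide at least one of prompt or schema")
    params: Dict[str, Any] = {"urls": list(urls), **options}
    if prompt is not None:
        params["prompt"] = prompt
    if schema is not None:
        params["schema"] = schema
    return params


# Illustrative JSON Schema describing the fields to extract.
article_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "author": {"type": "string"},
    },
    "required": ["title"],
}

params = build_extract_params(
    ["https://example.com/blog/*"],  # wildcard covers pages under /blog/
    schema=article_schema,
    showSources=True,
)
```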

FIRECRAWL_EXTRACT_GET

Tool to retrieve the status and results of a previously submitted extract job. Use when you need to check the progress or get the final results of an extraction operation.

Name | Type | Required | Description
id | string | Yes | The unique identifier (UUID format) of the extract job to retrieve.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_GET_AGENT_STATUS

Tool to get the status and results of an agent job. Use when you need to check if an agent job has completed and retrieve the collected data. Agent jobs autonomously search, navigate, and extract data from the web.

Name | Type | Required | Description
id | string | Yes | Unique identifier (UUID) of the agent job.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_GET_DEEP_RESEARCH_STATUS

Retrieves the status and results of a deep research job by its ID. Use when you need to check the progress or retrieve the final analysis of a deep research operation.

Name | Type | Required | Description
id | string | Yes | Unique identifier (UUID) of the deep research job. Must be the UUID returned by FIRECRAWL_DEEP_RESEARCH; arbitrary UUIDs are not valid.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_GET_THE_STATUS_OF_A_CRAWL_JOB

Retrieves the current status, progress, and details of a web crawl job, using the job ID obtained when the crawl was initiated.

Name | Type | Required | Description
id | string | Yes | Unique identifier (UUID) of the crawl job.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_LLMS_TXT_GENERATE

Initiates an async job to generate an LLMs.txt file for a website, converting web content into LLM-friendly format. Returns a job ID to check status and retrieve results. Use when you need to create a standardized, machine-readable representation of website content for language models.

Name | Type | Required | Description
url | string | Yes | The URL to generate LLMs.txt from. Must be a valid URI format.
maxUrls | integer | No | Maximum number of URLs to analyze when generating the LLMs.txt file. Must be between 1 and 100. Default is 10.
showFullText | boolean | No | Include full text content in the response. When true, generates both llmstxt and llmsfulltxt. Default is false.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_LLMS_TXT_GET

Tool to get the status and results of an LLMs.txt generation job. Use when you need to check if a job has completed and retrieve the generated content.

Name | Type | Required | Description
id | string | Yes | Unique identifier (UUID) of the LLMs.txt generation job.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_MAP_MULTIPLE_URLS_BASED_ON_OPTIONS

Maps a website by discovering URLs from a starting base URL, with options to customize the crawl via search query, subdomain inclusion, sitemap handling, and result limits; search effectiveness is site-dependent.

Name | Type | Required | Description
url | string | Yes | The starting website URL to map and discover links from. Must be a valid HTTP/HTTPS URL string (e.g., https://example.com). Do NOT pass code snippets, SDK examples, or anything other than a plain URL.
limit | integer | No | Maximum number of links to return. Defaults to 5000. Maximum allowed is 100000.
search | string | No | Optional search query to guide URL mapping, prioritizing or finding specific page types.
sitemap | string | No | Sitemap handling mode: skip to exclude sitemaps, include to use sitemaps with other discovery methods (default), or only to return only sitemap URLs.
timeout | integer | No | Timeout in milliseconds. No timeout is applied by default.
location | object | No | Geographic settings for location-based request processing. Object with country (ISO 3166-1 alpha-2 code, e.g., US, DE, JP) and optionally languages (array of language codes).
ignoreCache | boolean | No | If true, bypasses cached sitemap data. Useful when sitemaps have been recently updated. Sitemap data is cached for up to 7 days. Defaults to false.
includeSubdomains | boolean | No | If true, includes subdomains of the base URL in the mapping. Defaults to true.
ignoreQueryParameters | boolean | No | If true, excludes URLs with query parameters from results. Defaults to true.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.
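
As a sketch of how the mapping options fit together, the example below builds an input with a search query, a location object, and the documented limit and sitemap constraints. The helper name `build_map_params`, the lower bound of 1 on limit, and the sample language code are assumptions for illustration.

```python
from typing import Any, Dict

SITEMAP_MODES = {"skip", "include", "only"}
MAX_LIMIT = 100_000  # maximum allowed, per the table above


def build_map_params(
    url: str, limit: int = 5000, sitemap: str = "include", **options: Any
) -> Dict[str, Any]:
    """Assemble map input; the url must be a plain HTTP/HTTPS URL."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be a plain http:// or https:// URL")
    if not 1 <= limit <= MAX_LIMIT:  # lower bound of 1 is an assumption
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    if sitemap not in SITEMAP_MODES:
        raise ValueError(f"sitemap must be one of {sorted(SITEMAP_MODES)}")
    return {"url": url, "limit": limit, "sitemap": sitemap, **options}


params = build_map_params(
    "https://example.com",
    limit=200,
    search="pricing",
    location={"country": "DE", "languages": ["de"]},
    includeSubdomains=False,
)
```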

FIRECRAWL_QUEUE_GET

Tool to retrieve metrics about the team’s scrape queue. Use when you need to check queue status, job counts, or concurrency limits.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_SCRAPE

Scrapes a publicly accessible URL, optionally performing pre-scrape browser actions or extracting structured JSON using an LLM, to retrieve content in specified formats.

Name | Type | Required | Description
url | string | Yes | The fully qualified URL of the web page to scrape. Must start with http:// or https:// and be a valid web URL.
actions | array | No | An optional list of browser actions (e.g., click, write, wait, press) to perform on the page before scraping. Useful for interacting with dynamic content, filling forms, or navigating through page elements.
formats | array | No | A list of desired output formats for the scraped content. Defaults to ['markdown']. Cannot include both screenshot and screenshot@fullPage. If json is included, jsonOptions must be provided.
timeout | integer | No | Maximum time in milliseconds to wait for the scraping request to complete. Defaults to 30000.
waitFor | integer | No | Time in milliseconds to wait for the page to load or for dynamic content to render before starting the scrape. Defaults to 0.
location | object | No | Location settings for the request.
excludeTags | array | No | A list of HTML tags to specifically exclude from the output. Content within these tags will be removed.
includeTags | array | No | A list of HTML tags to specifically include in the output. Content within these tags will be prioritized.
jsonOptions | object | No | Options for JSON extraction.
onlyMainContent | boolean | No | If true, attempts to extract only the main article content, excluding headers, footers, navigation bars, and ads. Defaults to true.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.
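
The formats parameter carries two constraints that can be checked before a scrape is requested: screenshot and screenshot@fullPage are mutually exclusive, and json requires jsonOptions. The sketch below shows one way to enforce them. The helper name `build_scrape_params` is hypothetical, and the `{"prompt": ...}` shape of jsonOptions is an assumption, since this page does not spell out its fields.

```python
from typing import Any, Dict, List, Optional


def build_scrape_params(
    url: str, formats: Optional[List[str]] = None, **options: Any
) -> Dict[str, Any]:
    """Assemble FIRECRAWL_SCRAPE input with the documented format checks."""
    chosen = list(formats) if formats else ["markdown"]  # documented default
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must start with http:// or https://")
    if "screenshot" in chosen and "screenshot@fullPage" in chosen:
        raise ValueError("choose either screenshot or screenshot@fullPage, not both")
    if "json" in chosen and "jsonOptions" not in options:
        raise ValueError("jsonOptions is required when 'json' is requested")
    return {"url": url, "formats": chosen, **options}


params = build_scrape_params(
    "https://example.com/pricing",
    formats=["markdown", "json"],
    jsonOptions={"prompt": "List each plan name and its monthly price."},  # assumed shape
    waitFor=500,           # milliseconds
    onlyMainContent=True,
)
```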

FIRECRAWL_SEARCH

Performs a web search for a query, scrapes content from the top search results using Firecrawl, and returns details in specified formats.

Name | Type | Required | Description
q | string | Yes | The search query to execute. Can be provided as query or q.
lang | string | No | Language code for search results (e.g., en for English, default en).
limit | integer | No | Maximum number of search results to return (1–100, default 5).
country | string | No | Country code to tailor search results (e.g., us for United States, default us).
formats | array | No | Desired output formats for scraped content of each search result. Available string formats: markdown, html, rawHtml, links. For screenshots, use object format: {'type': 'screenshot', 'fullPage': true/false, 'quality': 1-100}.
timeout | integer | No | Maximum time in milliseconds for search and scrape operations (1000–300000, default 60000).

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.
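
The sketch below builds a search input using the documented defaults and ranges, and mixes a string format with the object form used for screenshots. The helper name `build_search_params` is hypothetical.

```python
from typing import Any, Dict, List, Optional


def build_search_params(
    q: str,
    limit: int = 5,
    timeout: int = 60_000,
    lang: str = "en",
    country: str = "us",
    formats: Optional[List[Any]] = None,
) -> Dict[str, Any]:
    """Assemble FIRECRAWL_SEARCH input, checking the documented ranges."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if not 1_000 <= timeout <= 300_000:
        raise ValueError("timeout must be between 1000 and 300000 milliseconds")
    params: Dict[str, Any] = {
        "q": q, "limit": limit, "timeout": timeout, "lang": lang, "country": country,
    }
    if formats:
        params["formats"] = list(formats)
    return params


params = build_search_params(
    "open source web crawlers",
    limit=3,
    formats=[
        "markdown",
        {"type": "screenshot", "fullPage": False, "quality": 80},  # object form for screenshots
    ],
)
```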

FIRECRAWL_START_AGENT

Tool to start an agent job for agentic web extraction with multi-page navigation and interaction capabilities. Use when you need to autonomously gather data from the web with complex navigation requirements. The agent can search, navigate, and extract information across multiple pages based on your natural language prompt.

Name | Type | Required | Description
urls | array | No | Specific URLs to constrain the agent’s search. If provided, the agent will start from these URLs. If not provided, the agent will autonomously search the web.
prompt | string | Yes | Natural language description of what data you want to extract. The agent will autonomously navigate and interact with web pages to gather this information.
schema | object | No | JSON schema defining the structure of data you want returned. Must be a valid JSON Schema object with properties and types.
maxCredits | integer | No | Maximum credits to spend on the request. The agent will stop when this limit is reached, preventing unexpected costs.
strictConstrainToURLs | boolean | No | Whether to strictly limit the agent to only the provided URLs. If true, the agent will not navigate to external links.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.
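
A minimal sketch of an agent job input with a result schema and a spending cap is shown below. The helper name `build_agent_params` and the example schema are hypothetical. The check that strictConstrainToURLs needs a non-empty urls list is an assumption inferred from the parameter descriptions, not a documented rule.

```python
from typing import Any, Dict, List, Optional


def build_agent_params(
    prompt: str,
    urls: Optional[List[str]] = None,
    schema: Optional[Dict[str, Any]] = None,
    max_credits: Optional[int] = None,
    strict: bool = False,
) -> Dict[str, Any]:
    """Assemble FIRECRAWL_START_AGENT input; only prompt is required."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if strict and not urls:
        # Assumption: constraining to "the provided URLs" needs URLs to be provided.
        raise ValueError("strictConstrainToURLs requires a non-empty urls list")
    params: Dict[str, Any] = {"prompt": prompt, "strictConstrainToURLs": strict}
    if urls:
        params["urls"] = list(urls)
    if schema is not None:
        params["schema"] = schema
    if max_credits is not None:
        params["maxCredits"] = max_credits
    return params


params = build_agent_params(
    "Collect the name and headquarters city of each company listed.",
    urls=["https://example.com/companies"],
    schema={
        "type": "object",
        "properties": {
            "companies": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "city": {"type": "string"},
                    },
                },
            }
        },
    },
    max_credits=100,  # the agent stops once this many credits are spent
    strict=True,
)
```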

FIRECRAWL_TOKEN_USAGE_GET

Tool to retrieve the current team’s token usage and balance information for Firecrawl’s Extract feature. Use when you need to check remaining token credits, plan allocation, or billing period details.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.

FIRECRAWL_TOKEN_USAGE_GET_HISTORICAL

Tool to retrieve historical team token usage on a monthly basis. Use when you need to analyze token consumption patterns over time, optionally segmented by API key.

Name | Type | Required | Description
byApiKey | boolean | No | When enabled, breaks down usage by individual API key. Defaults to false.

Name | Type | Required | Description
data | string | Yes | Data from the action execution.
error | string | No | Error if any occurred during the execution of the action.
successful | boolean | Yes | Whether or not the action execution was successful.