Scrapingbee

Overview

ScrapingBee is a web scraping API that handles headless browsers and proxy rotation, allowing developers to extract HTML from any website in a single API call. With the ScrapingBee integration in SquadOS, your agents can scrape pages, extract structured data, and work around anti-bot protections without managing browsers or proxies themselves.

Official website: https://www.scrapingbee.com/
Composio documentation: docs.composio.dev/toolkits/scrapingbee

Authentication

This tool uses an API key (API_KEY) to connect.

You will need the following fields:

Field	Required	Description
`api_key`	Yes	Your private ScrapingBee API key, used to authenticate all requests.

How to get credentials

Go to dashboard.scrapingbee.com/account/register and create an account.
Confirm your email to activate the account.
Log in at dashboard.scrapingbee.com/account/login.
Navigate to dashboard.scrapingbee.com/account/manage/api_key.
Copy the API key displayed. This is the value to enter in the `api_key` field when connecting in SquadOS.

How to connect in SquadOS

Go to Tools in the side menu (/admin/tools).
Open the Available tab and search for Scrapingbee.
Click the card to open the details and hit Connect.
You’re taken to the secure connection page hosted by Composio, where you enter the API key obtained above.
Once done, you’re sent back to SquadOS with the account connected and the tool available to agents. (See Organization Tools for details on the connection flow.)

Available actions

Data Extraction

SCRAPINGBEE_DATA_EXTRACTION

Tool to extract structured data from a webpage using CSS or XPath selectors. It relies on ScrapingBee’s extract_rules feature.

Input parameters

Name	Type	Required	Description
`url`	string	Yes	The webpage URL to extract data from.
`wait`	integer	No	Seconds to wait before extraction (for dynamic content).
`device`	string	No	Emulate device type (`desktop` or `mobile`).
`api_key`	string	Yes	Your ScrapingBee API key.
`extractor`	object	Yes	JSON object mapping each field to extract to its CSS/XPath selector. For nested selectors, use an object with a `selector` key and an optional `type` key. Misaligned or invalid selectors cause fields to be dropped silently, with no error, so verify that each selector matches the target DOM before large-scale use.
`javascript`	boolean	No	Whether to render JavaScript before extraction.
`country_code`	string	No	Two-letter country code for proxy geolocation (e.g., `us`, `de`).
`premium_proxy`	boolean	No	Use premium proxy for higher reliability.
`block_resources`	boolean	No	Block images, CSS, and resources to speed up extraction.
`forward_headers`	object	No	Custom HTTP headers to forward to the target website. Provide as a dict, e.g., `{'Accept-Language': 'en-US'}`. Headers will be prefixed with `Spb-` and forwarded to the target.
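As a rough illustration of the `extractor` shape described in the table, the sketch below assembles an arguments dict for this action. The helper name `build_extraction_args`, the target URL, the field names, and the selectors are all hypothetical, and the `type` value `list` is an assumption borrowed from ScrapingBee's extract_rules conventions. How the arguments reach the action depends on your agent setup.

```python
import json


def build_extraction_args(url, fields, api_key, render_js=False):
    """Assemble an arguments dict for SCRAPINGBEE_DATA_EXTRACTION (illustrative helper)."""
    if not fields:
        raise ValueError("extractor needs at least one field")
    return {
        "url": url,
        "api_key": api_key,
        "extractor": fields,      # field name -> selector, or nested selector object
        "javascript": render_js,  # render JavaScript before extraction
    }


args = build_extraction_args(
    "https://example.com/products",  # hypothetical target page
    {
        "title": "h1",                                     # plain CSS selector
        "prices": {"selector": ".price", "type": "list"},  # nested form; `list` is an assumption
    },
    api_key="YOUR_API_KEY",  # placeholder, not a real key
    render_js=True,
)
print(json.dumps(args, indent=2))
```

Keeping the selectors in one dict makes it easier to review them against the target DOM before running the action at scale.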

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error message if execution failed.
`successful`	boolean	Yes	Whether the action executed successfully.
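Because invalid selectors drop fields without raising an error, it can help to compare the requested fields against what came back. The following is a minimal sketch, assuming the `data` field holds a JSON string keyed by extractor field name; that layout is an assumption, and the helper name `missing_fields` and the sample result are hypothetical.

```python
import json


def missing_fields(extractor, result):
    """Return the extractor field names that are absent or empty in the output."""
    if not result.get("successful"):
        raise RuntimeError(result.get("error") or "action failed")
    data = result["data"]
    # Assumption: `data` is a JSON string keyed by extractor field name.
    extracted = json.loads(data) if isinstance(data, str) else data
    return [name for name in extractor if extracted.get(name) in (None, "", [])]


extractor = {
    "title": "h1",
    "prices": {"selector": ".price", "type": "list"},
    "sku": ".sku",
}
# Hypothetical output: `prices` matched nothing and `sku` was dropped entirely.
sample_result = {
    "successful": True,
    "error": None,
    "data": '{"title": "Widgets", "prices": []}',
}
print(missing_fields(extractor, sample_result))
```

A non-empty list is a signal to recheck the corresponding selectors rather than proof of a failure, since a page can legitimately lack an element.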

HTML Fetch

SCRAPINGBEE_HTML_FETCH

Tool to fetch HTML or a screenshot via the ScrapingBee HTML API. Use when you need page markup or an image, with optional JS rendering and resource controls. For anti-bot or CAPTCHA-protected sites (e.g., Cloudflare), combine render_js=true with premium_proxy=true or stealth_proxy=true to avoid blocks.

Input parameters

Name	Type	Required	Description
`url`	string	Yes	The URL to scrape.
`wait`	integer	No	Milliseconds to wait before returning content.
`retry`	integer	No	Number of retries on request failure.
`device`	string	No	Device type to emulate (`desktop` or `mobile`).
`cookies`	string	No	Cookies to send in requests (HTTP header string).
`wait_for`	string	No	CSS selector to wait for before returning content.
`block_ads`	boolean	No	Block ads and tracking scripts.
`render_js`	boolean	No	Render JavaScript before returning HTML. Required for client-side rendered pages where dynamic data is absent in raw HTML.
`js_snippet`	string	No	JavaScript snippet to execute before returning content.
`screenshot`	boolean	No	Return screenshot as base64-encoded PNG.
`js_scenario`	string	No	JSON scenario for custom headless browser actions.
`country_code`	string	No	Two-letter country code for geolocation (e.g., `us`).
`extract_rules`	string	No	Extraction rules as a JSON string (CSS or XPath selectors).
`premium_proxy`	boolean	No	Use premium proxy for scraping.
`stealth_proxy`	boolean	No	Use stealth (undetectable) proxy mode.
`block_resources`	boolean	No	Block images and CSS resources on the page to speed up scraping.
`screenshot_selector`	string	No	CSS selector of element to screenshot.
`screenshot_full_page`	boolean	No	Capture full-page screenshot instead of only viewport.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error message if execution failed.
`successful`	boolean	Yes	Whether the action executed successfully.
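The sketch below shows one way to request a screenshot from a protected page and turn the result into image bytes. It assumes that, when `screenshot` is true, the `data` field carries the base64-encoded PNG directly; that mapping is an assumption, and the helper name `decode_screenshot`, the URL, and the sample result are hypothetical.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file

# Arguments combining JS rendering with a premium proxy, as suggested for anti-bot sites.
fetch_args = {
    "url": "https://example.com/dashboard",  # hypothetical target page
    "render_js": True,
    "premium_proxy": True,
    "screenshot": True,
    "screenshot_full_page": True,
}


def decode_screenshot(result):
    """Decode a base64 screenshot from the action output and sanity-check it."""
    if not result.get("successful"):
        raise RuntimeError(result.get("error") or "action failed")
    # Assumption: `data` is the base64-encoded PNG itself.
    raw = base64.b64decode(result["data"])
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    return raw


# Simulated output so the sketch runs without calling the API.
fake_png = PNG_SIGNATURE + b"placeholder-image-bytes"
sample_result = {"successful": True, "data": base64.b64encode(fake_png).decode("ascii")}
image_bytes = decode_screenshot(sample_result)
print(len(image_bytes), "bytes decoded")
```

Checking the PNG signature catches the case where a block page or error text comes back instead of an image.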

Proxy Mode

SCRAPINGBEE_SCRAPING_BEE_PROXY_MODE

Tool to fetch web content via ScrapingBee’s Proxy Mode. Use when you need to route requests through ScrapingBee proxies with optional JS rendering and resource blocking.

Input parameters

Name	Type	Required	Description
`url`	string	Yes	The target URL to scrape through ScrapingBee Proxy Mode.
`cookies`	object	No	Cookies to send with the request as a key-value mapping.
`headers`	object	No	Additional HTTP headers to forward to the target site. Each header will be prefixed with `Spb-` and forwarded when `forward_headers` is enabled.
`timeout`	integer	No	Request timeout in milliseconds.
`block_ads`	boolean	No	Block ads and tracking scripts to speed up scraping.
`render_js`	boolean	No	Enable JavaScript rendering before returning content.
`session_id`	integer	No	Session identifier (integer). Reuse the same number across requests to keep the same IP.
`js_scenario`	string	No	Custom JavaScript scenario name for advanced interactions.
`country_code`	string	No	Two-letter country code for geolocated proxy (e.g., `us`, `fr`).
`premium_proxy`	boolean	No	Use premium proxies for higher reliability.
`stealth_proxy`	boolean	No	Use stealth proxy mode to make requests harder to detect.
`block_resources`	boolean	No	Block images and CSS resources to speed up scraping. Only relevant when `render_js` is enabled.
`forward_headers`	boolean	No	Forward original request headers to the target site.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error message if execution failed.
`successful`	boolean	Yes	Whether the action executed successfully.
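To make the `Spb-` prefixing described for `headers` concrete, here is a small sketch of the renaming. It only illustrates what the forwarded header names look like; the action is described as applying the prefix itself, so this helper (a hypothetical name, `prefix_forwarded_headers`) is not something you need to call before passing `headers`.

```python
def prefix_forwarded_headers(headers, prefix="Spb-"):
    """Return a copy of `headers` with each name carrying the forwarding prefix."""
    return {
        # Leave names that already carry the prefix untouched.
        (name if name.startswith(prefix) else prefix + name): value
        for name, value in headers.items()
    }


custom_headers = {"Accept-Language": "en-US", "Spb-Referer": "https://example.com/"}
print(prefix_forwarded_headers(custom_headers))
```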

Stealth Proxy

SCRAPINGBEE_STEALTH_PROXY

Tool to perform stealth scraping via ScrapingBee’s Stealth Proxy mode. Use when you encounter anti-bot measures requiring undetectable requests.

Input parameters

Name	Type	Required	Description
`url`	string	Yes	The URL of the webpage to retrieve using stealth proxy.
`wait`	integer	No	Wait time in milliseconds before returning the response.
`device`	string	No	Device type to emulate during rendering. Options: `desktop` or `mobile`.
`cookies`	string	No	Custom cookies in semicolon-separated format: `name1=value1;name2=value2`.
`js_render`	boolean	No	Render JavaScript on the page before returning the response.
`country_code`	string	No	Two-letter country code for proxy geolocation (e.g., `us`, `de`).
`extract_rules`	string	No	Extraction rules in JSON string for structured data.
`premium_proxy`	boolean	No	Use premium proxies for higher reliability.
`stealth_proxy`	boolean	No	Enable stealth proxy mode. Use when the target site blocks bots.
`block_resources`	boolean	No	Block images, styles, and fonts for faster loads.
`forward_headers`	boolean	No	Forward original request headers from the browser.
`return_page_source`	boolean	No	Return the raw page source instead of text.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error message if execution failed.
`successful`	boolean	Yes	Whether the action executed successfully.
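The `cookies` parameter of this action expects a single semicolon-separated string. A minimal sketch of building that string from a dict follows; the helper name `format_cookies` and the cookie names are hypothetical, and the validation rules are a cautious assumption rather than documented behavior.

```python
def format_cookies(cookies):
    """Join a name -> value mapping into `name1=value1;name2=value2`."""
    parts = []
    for name, value in cookies.items():
        value = str(value)
        # Reject characters that would make the joined string ambiguous.
        if not name or "=" in name or ";" in name or ";" in value:
            raise ValueError(f"cookie {name!r} cannot be encoded safely")
        parts.append(f"{name}={value}")
    return ";".join(parts)


stealth_args = {
    "url": "https://example.com/account",  # hypothetical target page
    "stealth_proxy": True,
    "js_render": True,
    "cookies": format_cookies({"session": "abc123", "lang": "en"}),
}
print(stealth_args["cookies"])
```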

Usage Stats

SCRAPINGBEE_USAGE_STATS

Tool to retrieve usage statistics for your ScrapingBee account. Use when you need to monitor remaining credits and request count.

Output

Name	Type	Required	Description
`data`	string	Yes	Data from the action execution.
`error`	string	No	Error message if execution failed.
`successful`	boolean	Yes	Whether the action executed successfully.
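Every action on this page returns the same `data` / `error` / `successful` envelope, so a single helper can handle it. The sketch below assumes `data` is either a JSON string or plain text; the helper name `unwrap` and the sample payload are hypothetical, and the sample does not reflect the real fields of the usage statistics response.

```python
import json


def unwrap(result):
    """Return the payload of an action result, raising if the action failed."""
    if not result.get("successful"):
        raise RuntimeError(result.get("error") or "action failed without an error message")
    data = result.get("data")
    if isinstance(data, str):
        try:
            return json.loads(data)  # structured payload
        except json.JSONDecodeError:
            return data              # plain text or HTML: return unchanged
    return data


# Hypothetical payload, used only to exercise the helper.
sample = {"successful": True, "error": None, "data": '{"example_field": 1}'}
print(unwrap(sample))
```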