Scrapingbee

ScrapingBee is a web scraping API that handles headless browsers and proxy rotation, so developers can retrieve a page's HTML in a single API call. With the ScrapingBee integration in SquadOS, your agents can scrape pages, extract structured data, and get past anti-bot protections.

This tool uses an API key (API_KEY) to connect.

You will need the following fields:

| Field | Required | Description |
| --- | --- | --- |
| api_key | Yes | Your private ScrapingBee API key, used to authenticate all requests. |

To get your API key:

  1. Go to dashboard.scrapingbee.com/account/register and create an account.
  2. Confirm your email to activate the account.
  3. Log in at dashboard.scrapingbee.com/account/login.
  4. Navigate to dashboard.scrapingbee.com/account/manage/api_key.
  5. Copy the API key displayed. This is the value to use in the api_key field when connecting in SquadOS.

To connect the tool in SquadOS:

  1. Go to Tools in the side menu (/admin/tools).
  2. Open the Available tab and search for Scrapingbee.
  3. Click the card to open the details and hit Connect.
  4. You’re taken to the secure connection page hosted by Composio, where you enter the API key obtained above.
  5. Once done, you’re sent back to SquadOS with the account connected and the tool available to agents. (See Organization Tools for connection-flow details.)

SCRAPINGBEE_DATA_EXTRACTION

Tool to extract structured data from a webpage using CSS or XPath selectors, via ScrapingBee’s extract_rules feature.

Input parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | The webpage URL to extract data from. |
| wait | integer | No | Seconds to wait before extraction (for dynamic content). |
| device | string | No | Emulate device type (desktop or mobile). |
| api_key | string | Yes | Your ScrapingBee API key. |
| extractor | object | Yes | JSON object defining fields to extract and their CSS/XPath selectors. For nested selectors, use an object with selector and optional type keys. Misaligned or invalid selectors silently drop fields with no error, so verify that each selector matches the target DOM before large-scale use. |
| javascript | boolean | No | Whether to render JavaScript before extraction. |
| country_code | string | No | Two-letter country code for proxy geolocation (e.g., us, de). |
| premium_proxy | boolean | No | Use premium proxy for higher reliability. |
| block_resources | boolean | No | Block images, CSS, and resources to speed up extraction. |
| forward_headers | object | No | Custom HTTP headers to forward to the target website. Provide as a dict, e.g., {'Accept-Language': 'en-US'}. Headers will be prefixed with Spb- and forwarded to the target. |

Output:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error message if execution failed. |
| successful | boolean | Yes | Whether the action executed successfully. |
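
The extractor parameter and its silent-drop caveat can be sketched in Python as follows. This is only an illustration, not the integration's own code: build_extraction_args and missing_fields are hypothetical helpers, the selectors and URL are invented, and the sketch assumes the data field of the output is a JSON string keyed by the extractor's field names.

```python
import json


def build_extraction_args(url, api_key, extractor, **options):
    """Assemble an argument dict for SCRAPINGBEE_DATA_EXTRACTION (hypothetical helper)."""
    args = {"url": url, "api_key": api_key, "extractor": extractor}
    # Add only the optional parameters that were explicitly set.
    args.update({name: value for name, value in options.items() if value is not None})
    return args


def missing_fields(extractor, data):
    """List extractor fields that are absent or empty in the returned data.

    Assumes `data` is a JSON string (or an already-parsed dict) keyed by field name.
    """
    result = json.loads(data) if isinstance(data, str) else data
    return [name for name in extractor if result.get(name) in (None, "", [])]


# Illustrative selectors; adjust them to the target page's DOM.
extractor = {
    "title": "h1",
    "prices": {"selector": ".price", "type": "list"},  # nested form with optional type
}
args = build_extraction_args(
    "https://example.com/products", "YOUR_API_KEY", extractor, javascript=True, wait=None
)

# A made-up response in which the `.price` selector matched nothing.
sample_data = '{"title": "Catalog", "prices": []}'
print(missing_fields(extractor, sample_data))  # ['prices']
```

Running a check like missing_fields on a handful of pages before a large crawl is one way to catch selectors that no longer match the target DOM.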

SCRAPINGBEE_HTML_FETCH

Tool to fetch a page’s HTML or a screenshot via the ScrapingBee HTML API. Use it when you need page markup or an image, with optional JS rendering and resource controls. For anti-bot or CAPTCHA-protected sites (e.g., Cloudflare), combine render_js=true with premium_proxy=true or stealth_proxy=true to avoid blocks.

Input parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | The URL to scrape. |
| wait | integer | No | Milliseconds to wait before returning content. |
| retry | integer | No | Number of retries on request failure. |
| device | string | No | Device type to emulate (desktop or mobile). |
| cookies | string | No | Cookies to send in requests (HTTP header string). |
| wait_for | string | No | CSS selector to wait for before returning content. |
| block_ads | boolean | No | Block ads and tracking scripts. |
| render_js | boolean | No | Render JavaScript before returning HTML. Required for client-side rendered pages where dynamic data is absent in raw HTML. |
| js_snippet | string | No | JavaScript snippet to execute before returning content. |
| screenshot | boolean | No | Return screenshot as base64-encoded PNG. |
| js_scenario | string | No | JSON scenario for custom headless browser actions. |
| country_code | string | No | Two-letter country code for geolocation (e.g., us). |
| extract_rules | string | No | Extraction rules (CSS or XPath selectors). |
| premium_proxy | boolean | No | Use premium proxy for scraping. |
| stealth_proxy | boolean | No | Use stealth (undetectable) proxy mode. |
| block_resources | boolean | No | Block images and CSS resources on the page to speed up scraping. |
| screenshot_selector | string | No | CSS selector of element to screenshot. |
| screenshot_full_page | boolean | No | Capture full-page screenshot instead of only viewport. |

Output:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error message if execution failed. |
| successful | boolean | Yes | Whether the action executed successfully. |
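
As a rough Python sketch of the anti-bot combination described for this tool, the helper below pairs render_js with one of the two proxy options, and a second helper decodes a base64 screenshot. Both function names are hypothetical, and the decoder assumes the data field holds only the base64 payload, which should be checked against a real response.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def anti_bot_fetch_args(url, stealth=False):
    """Arguments for SCRAPINGBEE_HTML_FETCH on a bot-protected site (hypothetical helper)."""
    args = {"url": url, "render_js": True}
    # Enable exactly one proxy option alongside JavaScript rendering.
    args["stealth_proxy" if stealth else "premium_proxy"] = True
    return args


def decode_screenshot(data):
    """Decode a base64-encoded PNG and check its file signature.

    Assumes `data` contains only the base64 payload.
    """
    image = base64.b64decode(data)
    if not image.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    return image


print(anti_bot_fetch_args("https://example.com", stealth=True))
# {'url': 'https://example.com', 'render_js': True, 'stealth_proxy': True}
```

The bytes returned by decode_screenshot can be written to a .png file with an ordinary binary file write.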

SCRAPINGBEE_SCRAPING_BEE_PROXY_MODE

Tool to fetch web content via ScrapingBee’s Proxy Mode. Use when you need to route requests through ScrapingBee proxies with optional JS rendering and resource blocking.

Input parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | The target URL to scrape through ScrapingBee Proxy Mode. |
| cookies | object | No | Cookies to send with the request as a key-value mapping. |
| headers | object | No | Additional HTTP headers to forward to the target site. Each header will be prefixed with Spb- and forwarded when forward_headers is enabled. |
| timeout | integer | No | Request timeout in milliseconds. |
| block_ads | boolean | No | Block ads and tracking scripts to speed up scraping. |
| render_js | boolean | No | Enable JavaScript rendering before returning content. |
| session_id | integer | No | Session identifier. Reuse the same integer to keep the same IP across multiple requests. |
| js_scenario | string | No | Custom JavaScript scenario name for advanced interactions. |
| country_code | string | No | Two-letter country code for geolocated proxy (e.g., us, fr). |
| premium_proxy | boolean | No | Use premium proxies for higher reliability. |
| stealth_proxy | boolean | No | Use stealth proxy mode for extra undetectability. |
| block_resources | boolean | No | Block images and CSS resources to speed up scraping. Only relevant when render_js is enabled. |
| forward_headers | boolean | No | Forward original request headers to the target site. |

Output:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error message if execution failed. |
| successful | boolean | Yes | Whether the action executed successfully. |
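
The headers and session_id parameters can be illustrated with a short Python sketch. The helper names are hypothetical. spb_prefixed only shows what the prefixed header names look like; it assumes, as the headers description states, that the tool adds the Spb- prefix itself, so callers pass plain header names.

```python
def spb_prefixed(headers):
    """Show how header names look once prefixed with Spb- (illustration only)."""
    return {f"Spb-{name}": value for name, value in headers.items()}


def sticky_session_args(urls, session_id, headers=None):
    """Build one argument dict per URL, all sharing the same session_id."""
    calls = []
    for url in urls:
        args = {"url": url, "session_id": session_id}
        if headers:
            # Pass plain header names and enable forwarding.
            args["headers"] = dict(headers)
            args["forward_headers"] = True
        calls.append(args)
    return calls


calls = sticky_session_args(
    ["https://example.com/login", "https://example.com/account"],
    session_id=42,
    headers={"Accept-Language": "en-US"},
)
print(spb_prefixed(calls[0]["headers"]))  # {'Spb-Accept-Language': 'en-US'}
```

Because both argument dicts carry session_id 42, the two requests are meant to leave through the same IP.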

SCRAPINGBEE_STEALTH_PROXY

Tool to perform stealth scraping via ScrapingBee’s Stealth Proxy mode. Use when you encounter anti-bot measures requiring undetectable requests.

Input parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | The URL of the webpage to retrieve using stealth proxy. |
| wait | integer | No | Wait time in milliseconds before returning the response. |
| device | string | No | Device type to emulate during rendering. Options: desktop or mobile. |
| cookies | string | No | Custom cookies in semicolon-separated format: name1=value1;name2=value2. |
| js_render | boolean | No | Render JavaScript on the page before returning the response. |
| country_code | string | No | Two-letter country code for proxy geolocation (e.g., us, de). |
| extract_rules | string | No | Extraction rules in JSON string for structured data. |
| premium_proxy | boolean | No | Use premium proxies for higher reliability. |
| stealth_proxy | boolean | No | Enable stealth proxy mode. Use when the target site blocks bots. |
| block_resources | boolean | No | Block images, styles, and fonts for faster loads. |
| forward_headers | boolean | No | Forward original request headers from the browser. |
| return_page_source | boolean | No | Return the raw page source instead of text. |

Output:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error message if execution failed. |
| successful | boolean | Yes | Whether the action executed successfully. |
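
The semicolon-separated cookies format for this tool can be produced from a mapping with a small Python helper. This is a sketch under stated assumptions: format_cookies is a hypothetical name, the cookie values are invented, and the check for reserved characters is a precaution added here, not a documented requirement.

```python
def format_cookies(cookies):
    """Join a name -> value mapping into the form name1=value1;name2=value2."""
    parts = []
    for name, value in cookies.items():
        value = str(value)
        # A ';' or a '=' in the name would make the joined string ambiguous.
        if ";" in name or "=" in name or ";" in value:
            raise ValueError(f"cookie {name!r} contains a reserved character")
        parts.append(f"{name}={value}")
    return ";".join(parts)


stealth_args = {
    "url": "https://example.com/dashboard",
    "js_render": True,
    "stealth_proxy": True,
    "cookies": format_cookies({"session": "abc123", "lang": "en"}),
}
print(stealth_args["cookies"])  # session=abc123;lang=en
```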

SCRAPINGBEE_USAGE_STATS

Tool to retrieve usage statistics for your ScrapingBee account. Use when you need to monitor remaining credits and request count.

Output:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data | string | Yes | Data from the action execution. |
| error | string | No | Error message if execution failed. |
| successful | boolean | Yes | Whether the action executed successfully. |
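
Every tool above returns the same data, error, and successful fields, so a caller can handle them in one place. The Python sketch below is hypothetical throughout: unwrap, ToolError, and remaining_credits are illustrative names, the sample response is invented, and the max_api_credit and used_api_credit keys are assumed field names that should be checked against a real SCRAPINGBEE_USAGE_STATS response.

```python
import json


class ToolError(RuntimeError):
    """Raised when a tool response reports failure."""


def unwrap(response):
    """Return the parsed `data` of a tool response, or raise on failure."""
    if not response.get("successful"):
        raise ToolError(response.get("error") or "tool call failed")
    data = response["data"]
    # `data` is documented as a string; parse it if it holds JSON.
    return json.loads(data) if isinstance(data, str) else data


def remaining_credits(stats):
    """Credits left on the account (assumed field names)."""
    return stats["max_api_credit"] - stats["used_api_credit"]


# Invented sample response for illustration.
sample = {
    "successful": True,
    "error": None,
    "data": '{"max_api_credit": 1000, "used_api_credit": 250}',
}
print(remaining_credits(unwrap(sample)))  # 750
```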