Firecrawl Automation

Automate web crawling and data extraction with Firecrawl -- scrape pages, crawl sites, extract structured data, batch scrape URLs, and map website structures through the Composio Firecrawl integration.

Install

Download and extract to your skills directory

Copy command and send to OpenClaw for auto-install:

Download and install this skill https://openskills.cc/api/download?slug=composiohq-composio-skills-firecrawl-automation&locale=en&source=copy

Skill Overview


Firecrawl Automation lets you run web scraping and data extraction tasks directly in Claude Code. It supports single-page scraping, full-site crawling, structured data extraction, and batch processing, so you can collect website data without leaving the terminal.

Use Cases

1. Website Data Collection and Monitoring


Suited to tasks that require regularly scraping competitors’ pricing, product information, or news content. Multiple URLs can be processed in one batch, and crawl depth and path filters are configurable, which makes it a good fit for building continuously updated data sources.

2. Extracting Content from Dynamic Pages


Modern web applications often render content with JavaScript, which traditional crawlers struggle to capture. Firecrawl supports waiting for page rendering and performing browser actions (clicking, scrolling, inputting text), enabling it to scrape dynamic pages that require interaction.

3. Extracting and Organizing Structured Data


Extract specific fields (such as company information, product specifications, and price data) from unstructured web pages. With AI-driven extraction, you can provide natural-language descriptions or a JSON Schema, and it automatically converts page content into structured JSON.

Core Features

Single-Page Scraping


Fetch the content of a single URL. Supports multiple output formats (Markdown, HTML, screenshots, JSON). You can configure it to extract only the main content (automatically filtering out navigation bars, ads, and footers). It also supports scraping after performing browser actions, making it suitable for retrieving dynamically rendered page content.
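As a rough illustration of these options, the sketch below assembles the arguments for a single-page scrape. The key names (formats, onlyMainContent, waitFor, actions) and the action shapes are assumptions modeled on Firecrawl's public scrape options; the Composio tool may name them differently, so check the tool's schema before relying on them. The helper build_scrape_args is hypothetical.

```python
import json

def build_scrape_args(url, formats=("markdown",), main_only=True,
                      actions=None, wait_ms=0):
    """Assemble arguments for a single-page scrape (all key names are assumed)."""
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    args = {
        "url": url,
        "formats": list(formats),      # e.g. markdown, html
        "onlyMainContent": main_only,  # drop navigation bars, ads, footers
    }
    if wait_ms:
        args["waitFor"] = wait_ms      # give JavaScript time to render
    if actions:
        args["actions"] = list(actions)  # browser steps run before scraping
    return args

args = build_scrape_args(
    "https://example.com/pricing",
    formats=("markdown", "html"),
    actions=[
        {"type": "click", "selector": "#load-more"},
        {"type": "scroll", "direction": "down"},
    ],
    wait_ms=2000,
)
print(json.dumps(args, indent=2))
```

Building the arguments in one place makes it easy to review exactly what will be sent before any quota is spent.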

Full-Site Crawling


Starting from a seed URL, it automatically discovers and crawls multiple pages. You can limit crawl depth, number of pages, and path range. It supports filtering URL paths with regular expressions to control the crawl scope and avoid wasting quotas. Crawling jobs run asynchronously, and you can query progress and results by task ID.
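To make the scoping rules concrete, here is a minimal local sketch of how include and exclude patterns, a depth limit, and a page limit narrow a set of candidate URLs. It is not Firecrawl's implementation: the function names are hypothetical, and path depth is used as a simple stand-in for crawl depth.

```python
import re
from urllib.parse import urlparse

def in_scope(url, include=(), exclude=()):
    """Return True if the URL path passes the regex filters."""
    path = urlparse(url).path or "/"
    if any(re.search(p, path) for p in exclude):
        return False
    # An empty include list means "everything not excluded".
    return not include or any(re.search(p, path) for p in include)

def path_depth(url):
    """Count non-empty path segments, e.g. /blog/2024/launch -> 3."""
    return len([seg for seg in urlparse(url).path.split("/") if seg])

candidates = [
    "https://example.com/blog/2024/launch",
    "https://example.com/blog/tags/news",
    "https://example.com/careers",
    "https://example.com/blog/2024/q1/deep/post",
]

MAX_DEPTH, MAX_PAGES = 3, 10
selected = [
    u for u in candidates
    if in_scope(u, include=[r"^/blog/"], exclude=[r"^/blog/tags/"])
    and path_depth(u) <= MAX_DEPTH
][:MAX_PAGES]
print(selected)
```

Testing patterns against a handful of known URLs like this is a cheap way to confirm the scope before starting a crawl that consumes quota.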

Structured Data Extraction


Use AI to extract structured JSON data from web pages. The output structure can be defined with a natural-language description or a JSON Schema. It can process multiple URLs in one run (up to 10 during the beta), making it well suited to batch extraction of product information, company data, and other structured content.
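The sketch below shows what a JSON Schema for such an extraction might look like, together with a small helper that enforces the 10-URL beta limit mentioned above. The argument keys (urls, prompt, schema) and the helper build_extract_args are assumptions for illustration, not the confirmed tool interface.

```python
MAX_URLS_PER_RUN = 10  # beta limit stated in the text

# A JSON Schema describing the fields to pull from each page.
company_schema = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "pricing": {"type": "string"},
        "features": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["company_name", "pricing"],
}

def build_extract_args(urls, schema=None, prompt=None):
    """Assemble extraction arguments (key names are assumed)."""
    if not urls:
        raise ValueError("at least one URL is required")
    if len(urls) > MAX_URLS_PER_RUN:
        raise ValueError(f"at most {MAX_URLS_PER_RUN} URLs per run")
    if schema is None and prompt is None:
        raise ValueError("provide a schema, a prompt, or both")
    args = {"urls": list(urls)}
    if prompt is not None:
        args["prompt"] = prompt
    if schema is not None:
        args["schema"] = schema
    return args

extract_args = build_extract_args(
    ["https://example.com/about", "https://example.com/pricing"],
    schema=company_schema,
    prompt="Extract the company name, pricing, and a list of features.",
)
print(sorted(extract_args))
```

A schema gives more predictable output than a prompt alone, at the cost of having to decide the field names up front.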

Batch Scraping


Scrape multiple URLs concurrently to improve efficiency. Supports options such as concurrency level, geographic location, and ad blocking. You can ignore invalid URLs without interrupting the entire batch job, making it suitable for processing large lists of known URLs.
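Alongside the option to ignore invalid URLs, it can help to pre-check a list locally before submitting it. The following is a minimal sketch of such a check using only the standard library; split_valid is a hypothetical helper, and its rules (absolute http or https URL with a host) are an assumption, not the service's own validation.

```python
from urllib.parse import urlparse

def split_valid(urls):
    """Separate absolute http(s) URLs from entries that are likely invalid."""
    valid, invalid = [], []
    for raw in urls:
        url = raw.strip()
        parts = urlparse(url)
        if parts.scheme in ("http", "https") and parts.netloc:
            valid.append(url)
        else:
            invalid.append(raw)
    return valid, invalid

valid, invalid = split_valid([
    "https://example.com/a",
    " https://example.com/b ",   # surrounding whitespace is trimmed
    "example.com/missing-scheme",
    "ftp://example.com/file",
    "https://",
])
print(len(valid), "valid;", len(invalid), "set aside")
```

Keeping the rejected entries in a separate list makes it possible to report or repair them later instead of silently losing them.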

Website Structure Mapping


Discover all URLs on a website and generate a sitemap to help you understand the site structure or plan subsequent crawling tasks. Supports keyword-based filtering, limiting returned results, and ignoring query parameters.
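The effect of these three options can be pictured with a small local sketch that applies them to an already discovered URL list. The service applies its own filtering when it builds the map; filter_map and strip_query here are hypothetical helpers that only illustrate the semantics.

```python
from urllib.parse import urlsplit, urlunsplit

def strip_query(url):
    """Drop the query string and fragment so variants collapse to one URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

def filter_map(urls, keyword=None, limit=None, ignore_query=True):
    """Deduplicate, keep URLs containing the keyword, and cap the result."""
    seen, out = set(), []
    for url in urls:
        key = strip_query(url) if ignore_query else url
        if key in seen:
            continue
        seen.add(key)
        if keyword and keyword.lower() not in key.lower():
            continue
        out.append(key)
        if limit is not None and len(out) >= limit:
            break
    return out

discovered = [
    "https://example.com/docs/intro?utm=a",
    "https://example.com/docs/intro?utm=b",
    "https://example.com/docs/api",
    "https://example.com/blog/post",
    "https://example.com/docs/faq",
]
docs_pages = filter_map(discovered, keyword="docs", limit=2)
print(docs_pages)
```

Ignoring query parameters matters most on sites that append tracking or session parameters, where one page can otherwise appear many times in the map.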

Task Monitoring and Management


Crawling and extraction jobs run asynchronously. Once a job returns a task ID, you can use a dedicated tool to check its status, retrieve results, or cancel it. Quota usage is visible, so you can keep track of costs while jobs run.
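An asynchronous job is typically handled with a polling loop like the sketch below. The status values ("scraping", "completed", "failed", "cancelled") and the shape of the status response are assumptions, and get_status stands in for whichever status tool you call; a fake is used here so the example runs without network access.

```python
import time

TERMINAL_STATES = {"completed", "failed", "cancelled"}  # assumed status names

def wait_for_job(get_status, job_id, interval=2.0, timeout=60.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_status(job_id) until the job reaches a terminal state."""
    deadline = clock() + timeout
    while True:
        job = get_status(job_id)
        if job.get("status") in TERMINAL_STATES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish in {timeout}s")
        sleep(interval)

# Fake status source: two in-progress responses, then a finished job.
_responses = iter([
    {"status": "scraping"},
    {"status": "scraping"},
    {"status": "completed", "data": ["page-1"]},
])

def fake_status(job_id):
    return next(_responses)

result = wait_for_job(fake_status, "job-123", sleep=lambda seconds: None)
print(result["status"])
```

Passing the sleep and clock functions as parameters keeps the loop easy to test and lets you swap in a longer interval for large crawls.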

Common Questions

What is Firecrawl?


Firecrawl is a web data extraction service, integrated into Claude Code via Composio. Unlike traditional crawlers, Firecrawl uses browser rendering technology to handle dynamic pages and provides AI-driven structured data extraction. You only need to add a Composio MCP server (https://rube.app/mcp) in your configuration, and then you can call the scraping functions directly in the conversation.

How do I extract structured data?


Use the FIRECRAWL_EXTRACT tool and provide an array of target URLs along with your extraction requirements. You can describe what you want to extract in natural language (e.g., “extract the company name, pricing, and a list of features”) or provide a complete JSON Schema. The task runs asynchronously; use the returned task ID with FIRECRAWL_EXTRACT_GET to obtain the final results. It’s recommended to test on a small set of URLs first to ensure the output matches expectations before scaling up.

What are Firecrawl’s usage limits?


Firecrawl uses quota-based billing, and each scraping and extraction operation consumes quota. When you need to scrape many URLs, it’s recommended to submit them in one FIRECRAWL_BATCH_SCRAPE job rather than making a separate call for each URL, which improves efficiency and reduces quota consumption. In the beta, FIRECRAWL_EXTRACT supports up to 10 URLs per run, and large-scale extraction may hit rate limits (429 errors), so it’s recommended to process URLs in batches and implement a backoff strategy. Crawling tasks are limited to 10 pages by default, and you can adjust the limit as needed.
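One way to combine batching with backoff is sketched below. It splits a URL list into groups of 10 and retries a rate-limited call with exponentially growing, jittered delays. RateLimited is a hypothetical exception standing in for however your tool wrapper reports a 429, and fake_extract replaces the real extraction call so the example runs offline.

```python
import random
import time

class RateLimited(Exception):
    """Hypothetical error raised by the caller's wrapper on a 429 response."""

def chunked(items, size=10):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def call_with_backoff(fn, *args, retries=5, base=1.0, cap=30.0, sleep=time.sleep):
    """Call fn, retrying on RateLimited with exponential backoff and jitter."""
    for attempt in range(retries + 1):
        try:
            return fn(*args)
        except RateLimited:
            if attempt == retries:
                raise
            delay = min(cap, base * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))

calls = {"n": 0}

def fake_extract(urls):
    # Simulate one 429 on the very first call, then succeed.
    calls["n"] += 1
    if calls["n"] == 1:
        raise RateLimited("429 Too Many Requests")
    return {"submitted": len(urls)}

urls = [f"https://example.com/p/{i}" for i in range(23)]
delays = []  # record delays instead of actually sleeping
results = [call_with_backoff(fake_extract, batch, sleep=delays.append)
           for batch in chunked(urls)]
print(results)
```

The jitter spreads retries out so that several clients hitting the limit at once do not all retry at the same moment.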