Apify Automation

Automate web scraping and data extraction with Apify -- run Actors, manage datasets, create reusable tasks, and retrieve crawl results through the Composio Apify integration.

Install


Download and extract to your skills directory

Copy command and send to OpenClaw for auto-install:

Download and install this skill https://openskills.cc/api/download?slug=composiohq-composio-skills-apify-automation&locale=en&source=copy


Skills Overview


Run Apify web scraping Actors directly in Claude Code, manage datasets, and complete automated data extraction tasks without leaving the terminal.

Use Cases

  • Rapid Data Collection

  • When you need to extract structured data in bulk from a website, trigger the crawler directly in the conversation and retrieve the results. For example: scraping Google Store reviews, e-commerce product information, or news article lists. In synchronous mode, you can get JSON-formatted data within minutes.

  • Long-Term Monitoring

  • For scenarios that repeatedly collect data from the same source—such as daily price monitoring, tracking social media trends, or competitive analysis—create reusable tasks. Set input parameters once, then execute them repeatedly to ensure each collection uses the same configuration.

  • Large-Scale Asynchronous Scraping

  • For handling many pages or crawler jobs that need to run for a long time. In asynchronous mode, the crawler runs in the background without blocking the terminal. After completion, paginate through results by dataset ID, making it suitable for large scraping tasks lasting more than 5 minutes.

Core Features

  • Actor Execution Management

  • Supports both synchronous and asynchronous execution modes. Synchronous mode (APIFY_RUN_ACTOR_SYNC_GET_DATASET_ITEMS) is suitable for quick tasks and returns data immediately after execution. Asynchronous mode (APIFY_RUN_ACTOR) is for long-running tasks; you can set memory limits and timeouts, and then fetch results later via dataset ID. Before execution, you can use APIFY_GET_ACTOR to view each Actor’s input schema to avoid parameter format errors.
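The choice between the two modes can be sketched as follows. The tool names come from the text above; the payload field names (`actor_id`, `run_input`, `memory_mbytes`, `timeout_secs`) and the `pick_mode` helper are illustrative assumptions, not the exact Composio schema — check each tool's input schema with APIFY_GET_ACTOR before running.

```python
# Hypothetical request payloads for the two execution modes.
sync_call = {
    "tool": "APIFY_RUN_ACTOR_SYNC_GET_DATASET_ITEMS",
    "arguments": {
        "actor_id": "apify/web-scraper",  # example Actor, assumed
        "run_input": {"startUrls": [{"url": "https://example.com"}]},
    },
}

async_call = {
    "tool": "APIFY_RUN_ACTOR",
    "arguments": {
        "actor_id": "apify/web-scraper",
        "run_input": {"startUrls": [{"url": "https://example.com"}]},
        "memory_mbytes": 1024,  # memory limit for the background run
        "timeout_secs": 3600,   # allow up to an hour
    },
}

def pick_mode(expected_runtime_secs: int) -> dict:
    """Sync for quick jobs; async for anything near the 5-minute cap."""
    return sync_call if expected_runtime_secs < 300 else async_call

print(pick_mode(60)["tool"])    # quick job -> sync
print(pick_mode(1800)["tool"])  # long job  -> async
```

The 300-second threshold mirrors the 5-minute synchronous limit described in the FAQ below.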

  • Dataset Retrieval and Processing

  • Use APIFY_GET_DATASET_ITEMS to fetch data from a specified dataset, supporting multiple formats such as JSON, CSV, and XLSX. Includes built-in pagination (up to 1000 items per request). You can iterate through the entire dataset using offset. Supports field selection (fields) and omission (omit) so you only extract the fields you need, reducing data transfer.
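The offset-based iteration described above can be sketched like this. `fetch_page` is a local stub standing in for an APIFY_GET_DATASET_ITEMS call (here it fakes a 2,500-item dataset); the loop logic — request up to 1,000 items, advance the offset, stop on a short page — is the part that carries over.

```python
PAGE_LIMIT = 1000  # per-request cap noted in the text above

def fetch_page(dataset_id: str, offset: int, limit: int = PAGE_LIMIT) -> list:
    """Stub for APIFY_GET_DATASET_ITEMS: returns one page of fake items."""
    total = 2500  # pretend dataset size
    count = max(min(limit, total - offset), 0)
    return [{"index": offset + j} for j in range(count)]

def fetch_all(dataset_id: str) -> list:
    """Walk the whole dataset by advancing offset until a short page."""
    items, offset = [], 0
    while True:
        page = fetch_page(dataset_id, offset)
        items.extend(page)
        if len(page) < PAGE_LIMIT:  # short page means we reached the end
            break
        offset += len(page)
    return items

items = fetch_all("my-dataset-id")
print(len(items))  # 2500
```

In a real run you would also pass `fields` / `omit` in each request to trim the payload, as described above.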

  • Task and Run Management

  • Use APIFY_CREATE_TASK to create reusable tasks by fixing Actor input parameters and calling them repeatedly. View historical run records with APIFY_GET_LIST_OF_RUNS, and use APIFY_GET_LOG to retrieve execution logs to diagnose failures. You can filter run records by status to quickly locate issues.
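Filtering run records by status, as mentioned above, might look like this. The run list is a hard-coded stand-in for an APIFY_GET_LIST_OF_RUNS response; the status strings follow Apify's run states (e.g. SUCCEEDED, FAILED), but the response shape here is an assumption.

```python
# Fake response from APIFY_GET_LIST_OF_RUNS (shape assumed for illustration).
runs = [
    {"id": "run-1", "status": "SUCCEEDED"},
    {"id": "run-2", "status": "FAILED"},
    {"id": "run-3", "status": "SUCCEEDED"},
    {"id": "run-4", "status": "FAILED"},
]

# Keep only failed runs -- these are the ones worth pulling logs for
# with APIFY_GET_LOG.
failed_ids = [r["id"] for r in runs if r["status"] == "FAILED"]
print(failed_ids)  # ['run-2', 'run-4']
```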

Common Questions

    How do Apify and Claude Code integrate?


    Add the Composio MCP server https://rube.app/mcp to your configuration. On first use, an authentication link appears; after you link your Apify account, all Apify tools become available within Claude Code.
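    One way to register the server is via Claude Code's MCP command; the exact flags below are an assumption about your CLI version, so check `claude mcp --help` first:

```shell
# Register the Composio MCP server (transport flag assumed; verify locally)
claude mcp add --transport http rube https://rube.app/mcp
```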

    What’s the difference between synchronous and asynchronous execution?


    Synchronous execution waits for the crawler to finish (up to 5 minutes) and returns data directly, suitable for quick, small-scale tasks. Asynchronous execution returns immediately while the crawler runs in the background, suitable for tasks longer than 5 minutes or those involving many pages. After an asynchronous task completes, you need to manually fetch results using the dataset ID.

    How can I avoid Actor input format errors?


    Each Actor has a different input schema. Check the required fields first with APIFY_GET_ACTOR. Common things to watch: URLs must include the protocol (https://), enumerated values are usually lowercase, and URL fields may require an object format like {"url": "https://example.com"}. For detailed parameters, refer to the Actor documentation at apify.com/store.
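    A small pre-flight check for the two most common mistakes above — missing protocol and plain-string URLs — could look like this. `normalize_start_urls` is a hypothetical helper, not part of any Apify or Composio API.

```python
def normalize_start_urls(urls: list) -> list:
    """Coerce plain strings into the {'url': ...} objects many Actors
    expect, and reject URLs without an explicit protocol."""
    out = []
    for u in urls:
        if isinstance(u, str):
            if not u.startswith(("http://", "https://")):
                raise ValueError(f"URL missing protocol: {u}")
            u = {"url": u}
        out.append(u)
    return out

print(normalize_start_urls(["https://example.com", {"url": "https://apify.com"}]))
# [{'url': 'https://example.com'}, {'url': 'https://apify.com'}]
```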