modal

Run Python code in the cloud with serverless containers, GPUs, and autoscaling. Use when deploying ML models, running batch processing jobs, scheduling compute-intensive tasks, or serving APIs that require GPU acceleration or dynamic scaling.

Install


Download and extract to your skills directory

Or copy the command below and send it to OpenClaw for auto-install:

Download and install this skill https://openskills.cc/api/download?slug=k-dense-ai-scientific-skills-modal&locale=en&source=copy

Modal - Cloud-based Python Serverless Execution Platform

Overview


Modal is a serverless cloud computing platform designed for Python, letting you run Python code in the cloud without configuring servers. It supports GPU acceleration, automatic scaling, and pay-as-you-go billing, making it especially suitable for ML model deployment, batch data processing, and scheduled tasks. Sign up and receive $30/month in free credit.

Use Cases

1. Machine Learning Model Deployment


Deploy trained LLMs, image generation, or embedding models as cloud APIs with GPU inference acceleration. Modal automatically handles container configuration, load balancing, and elastic scaling—you only need to define the model and service logic.
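As a rough sketch of this workflow (the model name, endpoint shape, and GPU choice here are illustrative, and Modal's web-endpoint decorator name has changed across SDK versions):

```python
# Hypothetical sketch: serve an embedding model as an HTTP API on Modal.
import modal

app = modal.App("embedding-service")
image = modal.Image.debian_slim().pip_install("sentence-transformers")

@app.function(image=image, gpu="T4")
@modal.fastapi_endpoint(method="POST")
def embed(item: dict):
    # Import inside the function so it resolves in the remote container.
    from sentence_transformers import SentenceTransformer
    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
    return {"embedding": model.encode(item["text"]).tolist()}
```

Deploying with `modal deploy` would expose this as a public URL; Modal handles the container build, load balancing, and scaling described above.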

2. GPU-accelerated Compute Tasks


Compute tasks that require GPUs (e.g., model training, inference, rendering) can request T4, A100, H100, and other GPUs directly on Modal, billed by usage time, with no GPU servers to maintain.

3. Large-scale Batch Processing


Distribute data-processing tasks across thousands of containers to run automatically in parallel—suitable for massive datasets, bulk file conversion, or distributed scientific computing.

Core Features

1. Declarative Container Image Definition


Define the runtime environment in Python code. Supports installing PyPI packages and system dependencies, adding local code modules, or building on an existing Docker image. Images are built automatically at deploy time, ensuring a consistent environment.
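A minimal sketch of such an image definition (the package names and local module name are illustrative; exact builder-method names may vary with the SDK version):

```python
import modal

# Build the runtime environment declaratively in Python:
# a slim Debian base, plus system deps, PyPI packages, and local code.
image = (
    modal.Image.debian_slim(python_version="3.11")
    .apt_install("ffmpeg")                # system dependency
    .pip_install("numpy", "pandas")       # PyPI packages
    .add_local_python_source("mymodule")  # local code module (hypothetical name)
)

app = modal.App("image-demo", image=image)
```

Every function attached to the app then runs in a container built from this image, so the environment is identical on each deployment.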

2. Flexible GPU and Resource Configuration


Choose the GPU type and count to match the task (from a single T4 up to eight H100s), and customize CPU cores, memory, and ephemeral disk space. Billing can be based on reserved resources or actual usage.
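A sketch of how these resources are requested per function (the specific values are illustrative, and parameter names such as `ephemeral_disk` should be checked against your installed SDK version):

```python
import modal

app = modal.App("resource-demo")

@app.function(
    gpu="H100:8",            # eight H100s; "T4", "A100-80GB", etc. also work
    cpu=8.0,                 # CPU cores
    memory=32768,            # RAM in MiB
    ephemeral_disk=512_000,  # scratch disk in MiB
)
def train():
    ...  # training logic runs with the resources reserved above
```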

3. Auto Scaling and Parallel Execution


Use the .map() method to automatically distribute tasks across multiple containers for parallel execution. Supports configuring minimum/maximum container counts, reserved buffer containers, and other strategies to enable elasticity from zero to thousands of instances.
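A minimal fan-out sketch using `.map()` (the container-count parameter name here reflects recent SDK versions and is an assumption; older releases used different names):

```python
import modal

app = modal.App("map-demo")

@app.function(max_containers=100)  # cap the parallel fan-out (assumed parameter name)
def square(x: int) -> int:
    return x * x

@app.local_entrypoint()
def main():
    # .map() distributes the inputs across containers and runs them in parallel.
    results = list(square.map(range(1000)))
    print(sum(results))
```

Running `modal run` on this file would spin containers up from zero, process the thousand inputs in parallel, and scale back down when done.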

Frequently Asked Questions

What is Modal? What scenarios is it suitable for?


Modal is a Python-focused serverless cloud computing platform. You define functions and runtime environments in Python, and Modal automatically handles container deployment, scaling, and resource management. It's particularly well suited for ML model deployment, GPU training/inference, batch data processing, scheduled tasks, and serverless APIs.

Which GPUs does Modal support? How do I choose?


Modal supports T4 and L4 (economical inference), A10, A100, and A100-80GB (standard training/inference), L40S (48 GB, strong price/performance), H100 and H200 (high-performance training), and B200 (flagship performance). For inference, L40S is a good default; for training, H100 or A100. Specify a GPU with @app.function(gpu="A100"); for multiple GPUs, use gpu="H100:8".

How much free credit does Modal offer? How is billing handled?


New users receive $30/month in free credit upon registration. Billing is based on the compute resources used (CPU, GPU, memory, storage) and supports charging based on reserved or actual usage (whichever is higher). Functions do not incur charges when they are not running. See the Modal console for detailed pricing.