image-gen
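AI image generation using the OpenAI, Google, DashScope and Replicate APIs. Supports text-to-image, reference images and aspect ratios. Generates sequentially by default; can generate in parallel when needed. Use when the user asks to generate, create or draw an image.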
Category
Image processing
Install
Download and unzip into your skills directory
Image Generation (AI SDK)
Official API-based image generation. Supports OpenAI, Google, DashScope (阿里通义万象) and Replicate providers.
Script Directory
Agent Execution:
- SKILL_DIR = this SKILL.md file's directory
- Script path: ${SKILL_DIR}/scripts/main.ts
- ${BUN_X} runtime: if bun installed → bun; if npx available → npx -y bun; else suggest installing bun
Step 0: Load Preferences ⛔ BLOCKING
CRITICAL: This step MUST complete BEFORE any image generation. Do NOT skip or defer.
Check EXTEND.md existence (priority: project → user):
# macOS, Linux, WSL, Git Bash
test -f .gino-skills/image-gen/EXTEND.md && echo "project"
test -f "$HOME/.gino-skills/image-gen/EXTEND.md" && echo "user"
# PowerShell (Windows)
if (Test-Path .gino-skills/image-gen/EXTEND.md) { "project" }
if (Test-Path "$HOME/.gino-skills/image-gen/EXTEND.md") { "user" }
| Result | Action |
|---|---|
| Found | Load, parse, apply settings. If default_model.[provider] is null → ask model only (Flow 2) |
| Not found | ⛔ Run first-time setup (references/config/first-time-setup.md) → Save EXTEND.md → Then continue |
CRITICAL: If not found, complete the full setup (provider + model + quality + save location) using AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.
| Path | Location |
|---|---|
| .gino-skills/image-gen/EXTEND.md | Project directory |
| $HOME/.gino-skills/image-gen/EXTEND.md | User home |
EXTEND.md Supports: Default provider | Default quality | Default aspect ratio | Default image size | Default models
Schema: references/config/preferences-schema.md
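The project → user lookup order described in this step can be sketched in TypeScript as follows. This is a minimal illustration, not the skill's actual code: `findExtendFile`, `PrefLocation`, and the injected `exists` callback are hypothetical names introduced here.

```typescript
// Hypothetical helper; the real skill may resolve preferences differently.
type PrefScope = "project" | "user";

interface PrefLocation {
  scope: PrefScope;
  path: string;
}

// `exists` is injected so the lookup can be exercised without touching the disk.
function findExtendFile(
  cwd: string,
  home: string,
  exists: (path: string) => boolean,
): PrefLocation | null {
  const candidates: PrefLocation[] = [
    { scope: "project", path: `${cwd}/.gino-skills/image-gen/EXTEND.md` },
    { scope: "user", path: `${home}/.gino-skills/image-gen/EXTEND.md` },
  ];
  // First hit wins, so a project file shadows a user file.
  for (const candidate of candidates) {
    if (exists(candidate.path)) return candidate;
  }
  // null means "not found": first-time setup must run before any generation.
  return null;
}
```

A `null` result corresponds to the "Not found" row of the table: generation stays blocked until setup has written EXTEND.md.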
Usage
# Basic
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image cat.png
# With aspect ratio
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9
# High quality
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k
# From prompt files
${BUN_X} ${SKILL_DIR}/scripts/main.ts --promptfiles system.md content.md --image out.png
# With reference images (Google multimodal or OpenAI edits)
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png
# With reference images (explicit provider/model)
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png
# Specific provider
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider openai
# DashScope (阿里通义万象)
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope
# Replicate (google/nano-banana-pro)
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Replicate with specific model
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
Options
| Option | Description |
|---|---|
| --prompt <text>, -p | Prompt text |
| --promptfiles <files...> | Read prompt from files (concatenated) |
| --image <path> | Output image path (required) |
| --provider google\|openai\|dashscope\|replicate | Force provider (default: google) |
| --model <id>, -m | Model ID (Google: gemini-3-pro-image-preview, gemini-3.1-flash-image-preview; OpenAI: gpt-image-1.5) |
| --ar <ratio> | Aspect ratio (e.g., 16:9, 1:1, 4:3) |
| --size <WxH> | Size (e.g., 1024x1024) |
| --quality normal\|2k | Quality preset (default: 2k) |
| --imageSize 1K\|2K\|4K | Image size for Google (default: from quality) |
| --ref <files...> | Reference images. Supported by Google multimodal (gemini-3-pro-image-preview, gemini-3-flash-preview, gemini-3.1-flash-image-preview) and OpenAI edits (GPT Image models). If provider omitted: Google first, then OpenAI |
| --n <count> | Number of images |
| --json | JSON output |
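When the script is launched programmatically, building the flags as an argument array avoids shell-quoting problems with prompts. The sketch below assumes a hypothetical `GenOptions` bag and `buildArgs` helper; only the flag names come from the table above.

```typescript
// Hypothetical option bag; field names mirror the flags in the options table.
interface GenOptions {
  prompt: string;
  image: string; // output path, required by the CLI
  provider?: "google" | "openai" | "dashscope" | "replicate";
  model?: string;
  ar?: string;
  quality?: "normal" | "2k";
  ref?: string[];
  n?: number;
  json?: boolean;
}

// Turns the option bag into an argv array for scripts/main.ts.
function buildArgs(opts: GenOptions): string[] {
  if (opts.image.length === 0) {
    throw new Error("--image is required");
  }
  const args: string[] = ["--prompt", opts.prompt, "--image", opts.image];
  if (opts.provider !== undefined) args.push("--provider", opts.provider);
  if (opts.model !== undefined) args.push("--model", opts.model);
  if (opts.ar !== undefined) args.push("--ar", opts.ar);
  if (opts.quality !== undefined) args.push("--quality", opts.quality);
  if (opts.ref !== undefined && opts.ref.length > 0) args.push("--ref", ...opts.ref);
  if (opts.n !== undefined) args.push("--n", String(opts.n));
  if (opts.json === true) args.push("--json");
  return args;
}
```

The resulting array would be appended after the resolved `${BUN_X}` runner and the script path when spawning the process.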
Environment Variables
| Variable | Description |
|---|---|
| OPENAI_API_KEY | OpenAI API key |
| GOOGLE_API_KEY | Google API key |
| DASHSCOPE_API_KEY | DashScope API key (阿里云) |
| REPLICATE_API_TOKEN | Replicate API token |
| OPENAI_IMAGE_MODEL | OpenAI model override |
| GOOGLE_IMAGE_MODEL | Google model override |
| DASHSCOPE_IMAGE_MODEL | DashScope model override (default: z-image-turbo) |
| REPLICATE_IMAGE_MODEL | Replicate model override (default: google/nano-banana-pro) |
| OPENAI_BASE_URL | Custom OpenAI endpoint |
| GOOGLE_BASE_URL | Custom Google endpoint |
| DASHSCOPE_BASE_URL | Custom DashScope endpoint |
| REPLICATE_BASE_URL | Custom Replicate endpoint |
Load Priority: CLI args > EXTEND.md > env vars > <cwd>/.gino-skills/.env > ~/.gino-skills/.env
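The lower three layers of this chain (env vars, then the project .env, then the home .env) can be pictured as a map merge in which higher-priority layers overwrite lower ones. The sketch below is an assumption about how such layering could work, not the script's actual loader; `EnvMap` and `mergeEnvLayers` are hypothetical names.

```typescript
type EnvMap = Record<string, string | undefined>;

// Layers are passed from lowest to highest priority; later layers overwrite earlier ones.
function mergeEnvLayers(...layersLowToHigh: EnvMap[]): Record<string, string> {
  const merged: Record<string, string> = {};
  for (const layer of layersLowToHigh) {
    for (const key of Object.keys(layer)) {
      const value = layer[key];
      // Unset keys never mask a value supplied by a lower layer.
      if (value !== undefined) merged[key] = value;
    }
  }
  return merged;
}

// Example values are made up for illustration.
const homeDotEnv: EnvMap = { GOOGLE_API_KEY: "from-home", OPENAI_API_KEY: "from-home" };
const cwdDotEnv: EnvMap = { OPENAI_API_KEY: "from-project" };
const realEnv: EnvMap = { GOOGLE_API_KEY: "from-shell" };

const effectiveEnv = mergeEnvLayers(homeDotEnv, cwdDotEnv, realEnv);
```

CLI arguments and EXTEND.md settings sit above all three layers, so they would be consulted before falling back to `effectiveEnv`.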
Model Resolution
Model priority (highest → lowest), applies to all providers:
1. CLI: --model <id>
2. EXTEND.md: default_model.[provider]
3. Env var: <PROVIDER>_IMAGE_MODEL (e.g., GOOGLE_IMAGE_MODEL)

EXTEND.md overrides env vars. If both EXTEND.md default_model.google: "gemini-3-pro-image-preview" and env var GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview exist, EXTEND.md wins.
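As a rough illustration of the priority order above, model resolution amounts to taking the first non-empty source. `ModelSources`, `ResolvedModel`, and `resolveModel` are hypothetical names for this sketch; the script's real implementation may differ.

```typescript
interface ModelSources {
  cliModel?: string;             // --model <id>
  extendDefault?: string | null; // default_model.[provider] from EXTEND.md
  envModel?: string;             // <PROVIDER>_IMAGE_MODEL
}

interface ResolvedModel {
  model: string;
  source: "cli" | "extend" | "env";
}

// Returns the highest-priority model that is set, with the layer it came from.
function resolveModel(sources: ModelSources): ResolvedModel | null {
  if (sources.cliModel) return { model: sources.cliModel, source: "cli" };
  if (sources.extendDefault) return { model: sources.extendDefault, source: "extend" };
  if (sources.envModel) return { model: sources.envModel, source: "env" };
  // Nothing configured: the caller falls back to the provider's own default.
  return null;
}

// The conflict described above: EXTEND.md and the env var disagree.
const picked = resolveModel({
  extendDefault: "gemini-3-pro-image-preview",
  envModel: "gemini-3.1-flash-image-preview",
});
```

Returning the `source` alongside the model makes it easy to tell the user which layer supplied the value.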
Agent MUST display model info before each generation:
Using [provider] / [model]
Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL
Replicate Models
Supported model formats:
- owner/name (recommended for official models), e.g. google/nano-banana-pro
- owner/name:version (community models by version), e.g. stability-ai/sdxl:<version>
Examples:
# Use Replicate default model
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Override model explicitly
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
Provider Selection
- --ref provided + no --provider → auto-select Google first, then OpenAI, then Replicate
- --provider specified → use it (if --ref, must be google, openai, or replicate)
Quality Presets
| Preset | Google imageSize | OpenAI Size | Use Case |
|---|---|---|---|
| normal | 1K | 1024px | Quick previews |
| 2k (default) | 2K | 2048px | Covers, illustrations, infographics |
Google imageSize: Can be overridden with --imageSize 1K|2K|4K
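The preset-to-size mapping for Google, including the --imageSize override, can be sketched as a small lookup. The names `GOOGLE_IMAGE_SIZE` and `resolveImageSize` are illustrative assumptions; the values come from the preset table.

```typescript
type Quality = "normal" | "2k";
type ImageSize = "1K" | "2K" | "4K";

// Google imageSize implied by each quality preset.
const GOOGLE_IMAGE_SIZE: Record<Quality, ImageSize> = {
  normal: "1K",
  "2k": "2K",
};

// An explicit --imageSize wins; otherwise the size follows the preset (default: 2k).
function resolveImageSize(quality: Quality = "2k", override?: ImageSize): ImageSize {
  return override !== undefined ? override : GOOGLE_IMAGE_SIZE[quality];
}
```

Note that 4K is only reachable through the override, since neither preset maps to it.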
Aspect Ratios
Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1
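A caller could validate the --ar value against this list before invoking the script. The following is a minimal sketch with hypothetical helper names (`isSupportedRatio`, `ratioToNumber`); whether the script itself rejects other ratios is not stated here.

```typescript
const SUPPORTED_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4", "2.35:1"] as const;
type AspectRatio = (typeof SUPPORTED_RATIOS)[number];

// Type guard: narrows an arbitrary string to one of the supported ratios.
function isSupportedRatio(value: string): value is AspectRatio {
  return (SUPPORTED_RATIOS as readonly string[]).indexOf(value) !== -1;
}

// Converts "W:H" into width divided by height, e.g. for choosing pixel dimensions.
function ratioToNumber(ratio: AspectRatio): number {
  const parts = ratio.split(":");
  return Number(parts[0]) / Number(parts[1]);
}
```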
- imageConfig.aspectRatio
- aspectRatio parameter
Generation Mode
Default: Sequential generation (one image at a time). This ensures stable output and easier debugging.
Parallel Generation: Only use when user explicitly requests parallel/concurrent generation.
| Mode | When to Use |
|---|---|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel | User explicitly requests, large batches (10+) |
Parallel Settings (when requested):
| Setting | Value |
|---|---|
| Recommended concurrency | 4 subagents |
| Max concurrency | 8 subagents |
| Use case | Large batch generation when user requests parallel |
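The skill itself parallelizes through background subagents, but the same limits (4 recommended, 8 maximum) can be illustrated with an in-process worker pool. This sketch assumes each generation is wrapped in a promise-returning job; `clampConcurrency` and `runPool` are hypothetical names, not part of the skill.

```typescript
const RECOMMENDED_CONCURRENCY = 4;
const MAX_CONCURRENCY = 8;

// Keeps the requested concurrency within 1..8, defaulting to the recommended 4.
function clampConcurrency(requested?: number): number {
  const value = requested !== undefined ? requested : RECOMMENDED_CONCURRENCY;
  return Math.min(Math.max(1, Math.floor(value)), MAX_CONCURRENCY);
}

// Runs jobs with at most `requested` (clamped) in flight; results keep input order.
async function runPool<T>(jobs: Array<() => Promise<T>>, requested?: number): Promise<T[]> {
  const results: T[] = new Array<T>(jobs.length);
  let next = 0;
  const worker = async (): Promise<void> => {
    while (next < jobs.length) {
      const index = next++;
      const job = jobs[index];
      if (job !== undefined) results[index] = await job();
    }
  };
  const workerCount = Math.min(clampConcurrency(requested), jobs.length);
  const workers: Promise<void>[] = [];
  for (let i = 0; i < workerCount; i++) workers.push(worker());
  await Promise.all(workers);
  return results;
}
```

Each worker pulls the next unclaimed job as soon as it finishes one, so no more than the clamped number of generations are ever in flight.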
Agent Implementation (parallel mode only):
# Launch multiple generations in parallel using Task tool
# Each Task runs as background subagent with run_in_background=true
# Collect results via TaskOutput when all complete
Error Handling
gemini-3-pro-image-preview, gemini-3.1-flash-image-preview; or OpenAI GPT Image edits)
Extension Support
Custom configurations via EXTEND.md. See Preferences section for paths and supported options.