image-gen
AI image generation with OpenAI, Google, DashScope and Replicate APIs. Supports text-to-image, reference images, aspect ratios. Sequential by default; parallel generation available on request. Use when user asks to generate, create, or draw images.
Author
Category
Image ProcessingInstall
Download and extract to your skills directory
Copy command and send to OpenClaw for auto-install:
Image Generation (AI SDK)
Official API-based image generation. Supports OpenAI, Google, DashScope (阿里通义万象) and Replicate providers.
Script Directory
Agent Execution:
SKILL_DIR = this SKILL.md file's directory${SKILL_DIR}/scripts/main.ts${BUN_X} runtime: if bun installed → bun; if npx available → npx -y bun; else suggest installing bunStep 0: Load Preferences ⛔ BLOCKING
CRITICAL: This step MUST complete BEFORE any image generation. Do NOT skip or defer.
Check EXTEND.md existence (priority: project → user):
# macOS, Linux, WSL, Git Bash
test -f .gino-skills/image-gen/EXTEND.md && echo "project"
test -f "$HOME/.gino-skills/image-gen/EXTEND.md" && echo "user"# PowerShell (Windows)
if (Test-Path .gino-skills/image-gen/EXTEND.md) { "project" }
if (Test-Path "$HOME/.gino-skills/image-gen/EXTEND.md") { "user" }| Result | Action |
|---|---|
| Found | Load, parse, apply settings. If default_model.[provider] is null → ask model only (Flow 2) |
| Not found | ⛔ Run first-time setup (references/config/first-time-setup.md) → Save EXTEND.md → Then continue |
CRITICAL: If not found, complete the full setup (provider + model + quality + save location) using AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.
| Path | Location |
|---|---|
.gino-skills/image-gen/EXTEND.md | Project directory |
$HOME/.gino-skills/image-gen/EXTEND.md | User home |
EXTEND.md Supports: Default provider | Default quality | Default aspect ratio | Default image size | Default models
Schema: references/config/preferences-schema.md
Usage
# Basic
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image cat.png
# With aspect ratio
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9
# High quality
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k
# From prompt files
${BUN_X} ${SKILL_DIR}/scripts/main.ts --promptfiles system.md content.md --image out.png
# With reference images (Google multimodal or OpenAI edits)
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png
# With reference images (explicit provider/model)
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png
# Specific provider
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider openai
# DashScope (阿里通义万象)
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope
# Replicate (google/nano-banana-pro)
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Replicate with specific model
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-bananaOptions
| Option | Description |
|---|---|
--prompt <text>, -p | Prompt text |
--promptfiles <files...> | Read prompt from files (concatenated) |
--image <path> | Output image path (required) |
--provider google\|openai\|dashscope\|replicate | Force provider (default: google) |
--model <id>, -m | Model ID (Google: gemini-3-pro-image-preview, gemini-3.1-flash-image-preview; OpenAI: gpt-image-1.5) |
--ar <ratio> | Aspect ratio (e.g., 16:9, 1:1, 4:3) |
--size <WxH> | Size (e.g., 1024x1024) |
--quality normal\|2k | Quality preset (default: 2k) |
--imageSize 1K\|2K\|4K | Image size for Google (default: from quality) |
--ref <files...> | Reference images. Supported by Google multimodal (gemini-3-pro-image-preview, gemini-3-flash-preview, gemini-3.1-flash-image-preview) and OpenAI edits (GPT Image models). If provider omitted: Google first, then OpenAI |
--n <count> | Number of images |
--json | JSON output |
Environment Variables
| Variable | Description |
|---|---|
OPENAI_API_KEY | OpenAI API key |
GOOGLE_API_KEY | Google API key |
DASHSCOPE_API_KEY | DashScope API key (阿里云) |
REPLICATE_API_TOKEN | Replicate API token |
OPENAI_IMAGE_MODEL | OpenAI model override |
GOOGLE_IMAGE_MODEL | Google model override |
DASHSCOPE_IMAGE_MODEL | DashScope model override (default: z-image-turbo) |
REPLICATE_IMAGE_MODEL | Replicate model override (default: google/nano-banana-pro) |
OPENAI_BASE_URL | Custom OpenAI endpoint |
GOOGLE_BASE_URL | Custom Google endpoint |
DASHSCOPE_BASE_URL | Custom DashScope endpoint |
REPLICATE_BASE_URL | Custom Replicate endpoint |
Load Priority: CLI args > EXTEND.md > env vars > <cwd>/.gino-skills/.env > ~/.gino-skills/.env
Model Resolution
Model priority (highest → lowest), applies to all providers:
--model <id>default_model.[provider]<PROVIDER>_IMAGE_MODEL (e.g., GOOGLE_IMAGE_MODEL)EXTEND.md overrides env vars. If both EXTEND.md default_model.google: "gemini-3-pro-image-preview" and env var GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview exist, EXTEND.md wins.
Agent MUST display model info before each generation:
Using [provider] / [model]Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODELReplicate Models
Supported model formats:
owner/name (recommended for official models), e.g. google/nano-banana-proowner/name:version (community models by version), e.g. stability-ai/sdxl:<version>Examples:
# Use Replicate default model
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Override model explicitly
${BUN_X} ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-bananaProvider Selection
--ref provided + no --provider → auto-select Google first, then OpenAI, then Replicate--provider specified → use it (if --ref, must be google, openai, or replicate)Quality Presets
| Preset | Google imageSize | OpenAI Size | Use Case |
|---|---|---|---|
normal | 1K | 1024px | Quick previews |
2k (default) | 2K | 2048px | Covers, illustrations, infographics |
Google imageSize: Can be overridden with --imageSize 1K|2K|4K
Aspect Ratios
Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1
imageConfig.aspectRatioaspectRatio parameterGeneration Mode
Default: Sequential generation (one image at a time). This ensures stable output and easier debugging.
Parallel Generation: Only use when user explicitly requests parallel/concurrent generation.
| Mode | When to Use |
|---|---|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel | User explicitly requests, large batches (10+) |
Parallel Settings (when requested):
| Setting | Value |
|---|---|
| Recommended concurrency | 4 subagents |
| Max concurrency | 8 subagents |
| Use case | Large batch generation when user requests parallel |
Agent Implementation (parallel mode only):
# Launch multiple generations in parallel using Task tool
# Each Task runs as background subagent with run_in_background=true
# Collect results via TaskOutput when all completeError Handling
gemini-3-pro-image-preview, gemini-3.1-flash-image-preview; or OpenAI GPT Image edits)Extension Support
Custom configurations via EXTEND.md. See Preferences section for paths and supported options.