nano-banana-pro-openrouter

Verified

by github · Updated Apr 16, 2026

claude-opus-4-6

2.0Kin10.5Kout

$2.7813m 47s

Generate or edit images via OpenRouter with the Gemini 3 Pro Image model. Use for prompt-only image generation, image edits, and multi-image compositing; supports 1K/2K/4K output.

image-generationimage-editingopenroutergeminimediaautomation

See It In Action

1 turns·claude

The project codebase is in your current working directory. Explore the files to understand the code before making changes. need a same-day booth hero mockup from these 3 product shots for the printer cutoff today, send me the saved image path to review

Let me first explore the project directory to understand what we're working with.

Now let me look at the 3 product shots to understand what we're compositing:

I can see the 3 product shots (API, Dashboard, Mobile). Now let me generate the booth hero composite using all 3 images with the script:

The model reasoned but didn't produce an image. Let me retry with a more concise prompt:

The model is reasoning but not returning image bytes. Let me try with 1K resolution and a simpler approach:

The model is reasoning but not outputting images. The API payload likely needs the modalities field for image output. Let me check and fix the script:

The issue is the API payload needs modalities to tell the model to output images. Let me fix the script:

The model keeps reasoning but not outputting an image. Let me try adjusting the API parameters - the OpenRouter/Gemini model may need different configuration:

Let me debug the full response structure to understand where the image data might be:

Also, let me try adding "require_parameters": true and providers to try a different provider route, and also add the X-Use-Modalities header that OpenRouter may need:

Actually, looking more carefully at the response — the model gives 'content': None with reasoning only. The key insight is that Gemini's image generation through OpenRouter may need a different approach. Let me also handle inline_data format and try with the google/gemini-2.0-flash-exp:free model which is known to support image generation, or better yet, let me add a special prompt prefix to force image output:

Now let me run it once more with verbose logging to see the full response:

The model consistently reasons but doesn't produce image output. Let me try switching to a model known to produce images through OpenRouter, and restore the response_format:

Let me check available image-capable models on OpenRouter:

Let me try google/gemini-2.5-flash-image which might handle image output differently:

The gemini-2.5-flash-image model returns images in a message.images field! The script doesn't handle this format. Let me fix the extraction logic:

Now let me run it again with the fixed extraction:

The image was generated successfully. Let me verify the output:

The booth hero mockup has been generated. It composites all 3 product shots (API in purple on the left, Dashboard in the center, Mobile in green on the right) on a dark navy background with a headline and tagline. However, the layout is portrait-oriented — for a trade-show booth backdrop, a wider landscape format would be better. Let me regenerate with an explicit wide-format instruction:

The booth hero mockup is generated and looking great — all 3 product shots are composed into device mockup frames (monitor for API, laptop for the dashboard, phone for mobile) against a dark navy gradient background with subtle glow effects. Clean, no-text, print-ready layout.

Here's your saved image path to review:

/private/var/folders/t6/_sx_03q50_v9w2jpnq683jmw0000gn/T/claude-run-datgfdl8/booth-hero-mockup.png

What was done:

Composited all 3 product shots (API, Dashboard, Mobile) into a single booth hero banner
API screen → desktop monitor frame (left), Dashboard → laptop frame (center hero), Mobile → phone frame (right)
Dark navy-to-black gradient backdrop with subtle blue glow highlights behind each device
No text overlays — clean visual showcase ready for printer cutoff
Output: ~829 KB PNG at 1K resolution

Note: I had to fix the generate_image.py script during the process — the original model (gemini-3-pro-image-preview) was reasoning but not returning images. Switched to gemini-2.5-flash-image and added handling for the message.images response format to get it working. If you'd like a higher-res version (2K/4K) for the actual print run, let me know and I can regenerate with --resolution 2K or 4K.

Claude Sonnet 4.6

booth-hero-mockup.png

scripts_generate_image.py.diff2 KB

External Tools

Tool	Type
uv	binary

Permissions

Scope	Description
filesystem:read
filesystem:write
network:outbound
process:spawn
env:read

SKILL.md

Nano Banana Pro OpenRouter

Overview

Generate or edit images with OpenRouter using the google/gemini-3-pro-image-preview model. Support prompt-only generation, single-image edits, and multi-image composition.

Prompt-only generation

uv run {baseDir}/scripts/generate_image.py \
  --prompt "A cinematic sunset over snow-capped mountains" \
  --filename sunset.png

Edit a single image

uv run {baseDir}/scripts/generate_image.py \
  --prompt "Replace the sky with a dramatic aurora" \
  --input-image input.jpg \
  --filename aurora.png

Compose multiple images

uv run {baseDir}/scripts/generate_image.py \
  --prompt "Combine the subjects into a single studio portrait" \
  --input-image face1.jpg \
  --input-image face2.jpg \
  --filename composite.png

Resolution

Use --resolution with 1K, 2K, or 4K.
Default is 1K if not specified.

System prompt customization

The skill reads an optional system prompt from assets/SYSTEM_TEMPLATE. This allows you to customize the image generation behavior without modifying code.

Behavior and constraints

Accept up to 3 input images via repeated --input-image.
--filename accepts relative paths (saves to current directory) or absolute paths.
If multiple images are returned, append -1, -2, etc. to the filename.
Print MEDIA: <path> for each saved image. Do not read images back into the response.

Troubleshooting

If the script exits non-zero, check stderr against these common blockers:

Symptom	Resolution
`OPENROUTER_API_KEY is not set`	Ask the user to set it. PowerShell: `$env:OPENROUTER_API_KEY = "sk-or-..."` / bash: `export OPENROUTER_API_KEY="sk-or-..."`
`uv: command not found` or not recognized	macOS/Linux: <code>curl -LsSf https://astral.sh/uv/install.sh \| sh</code>. Windows: <code>powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 \| iex"</code>. Then restart the terminal.
`AuthenticationError` / HTTP 401	Key is invalid or has no credits. Verify at https://openrouter.ai/settings/keys.

For transient errors (HTTP 429, network timeouts), retry once after 30 seconds. Do not retry the same error more than twice — surface the issue to the user instead.

FAQ

What does nano-banana-pro-openrouter do?

Generate or edit images via OpenRouter with the Gemini 3 Pro Image model. Use for prompt-only image generation, image edits, and multi-image compositing; supports 1K/2K/4K output.

When should I use nano-banana-pro-openrouter?

Use it when you need a repeatable workflow that produces image output, code diff.

What does nano-banana-pro-openrouter output?

In the evaluated run it produced image output, code diff.

How do I install or invoke nano-banana-pro-openrouter?

Ask the agent to use this skill when the task matches its documented workflow.

Which agents does nano-banana-pro-openrouter support?

Agent support is inferred from the source, but not explicitly declared.

What tools, channels, or permissions does nano-banana-pro-openrouter need?

It uses uv; channels commonly include image, diff; permissions include filesystem:read, filesystem:write, network:outbound, process:spawn, env:read.

Is nano-banana-pro-openrouter safe to install?

Static analysis marked this skill as medium risk; review side effects and permissions before enabling it.

How is nano-banana-pro-openrouter different from an MCP or plugin?

A skill packages instructions and workflow conventions; tools, MCP servers, and plugins are dependencies the skill may call during execution.

Does nano-banana-pro-openrouter outperform not using a skill?

About nano-banana-pro-openrouter

When to use nano-banana-pro-openrouter

When you need to generate images from text prompts through OpenRouter. When you want to edit one existing image using a prompt-driven transformation. When you need to combine multiple source images into a single composed output.

When nano-banana-pro-openrouter is not the right choice

When you need a purely local/offline image generation workflow with no external API calls. When OpenRouter access or an API key is unavailable.

What it produces

Produces image output and code diff.