generate-image
Verifiedby K-Dense-AI · Updated Apr 16, 2026
Generate or edit images using AI models (FLUX, Nano Banana 2). Use for general-purpose image generation including photos, illustrations, artwork, visual assets, concept art, and any image that is not a technical diagram or schematic. For flowcharts, circuits, pathways, and technical diagrams, use the scientific-schematics skill instead.
See It In Action
Let me start by exploring the codebase to understand the project structure and find the current placeholder image.
Let me read the hero section and current image to understand exactly what's needed.
The placeholder is essentially blank/white. Now let me generate a polished hero image that fits doany.ai's aesthetic — modern, purple-toned, AI/tech themed.
Let me check for the API key and generate the image.
API key is set in the environment. Let me generate the hero image.
The virtualenv's python is being used but doesn't have requests. Let me install it there.
The Gemini model needs too many credits. Let me try FLUX.2 Pro which should be more cost-efficient.
The image was generated but the script didn't parse the response correctly. Let me check the script and fix the parsing.
I can see the issue — the script doesn't handle the images array nested inside choices[].message. The FLUX response has choices[0].message.images but the script only checks for data["images"] at the top level. Let me fix this and rerun.
Now let me rerun the generation.
The response is truncated. Let me dump the full structure to understand the image format.
Now I see the structure: images[0] is a dict with type: "image_url" and image_url: {url: "data:..."}. My fix handled url at the top level of the dict but the actual key is image_url.url. Let me also handle this type: "image_url" format (same as the content array format).
Now let me generate the actual hero image with the fixed script.
Image generated. Let me preview it and then swap it into the HTML.
Looks great — dark navy with flowing purple luminous nodes, plenty of center space for the headline overlay. Now let me swap it into the homepage.
Done. Here's what I did:
- Generated a polished hero image using FLUX.2 Pro — dark navy background with flowing purple/violet luminous nodes and soft bokeh particles, matching doany.ai's
#6c3ce9purple +#1a1a2enavy palette - Fixed a bug in
scripts/generate_image.pywhere FLUX model responses weren't being parsed correctly (theimagesarray insidechoices[].messagewasn't handled) - Swapped the placeholder in
index.html— replacedplaceholder_hero.pngwithhero_image.pngand removed the TODO comment
The image has generous negative space in the center so the "Build Smarter with AI Skills" headline and CTA remain crisp and readable at 0.35 opacity.
External Tools
| Tool | Type |
|---|---|
| python3 | binary |
Permissions
| Scope | Description |
|---|---|
| filesystem:read | |
| filesystem:write | |
| network:outbound | |
| process:spawn | |
| env:read |
SKILL.md
Generate Image
Generate and edit high-quality images using OpenRouter's image generation models including FLUX.2 Pro and Gemini 3.1 Flash Image Preview.
When to Use This Skill
Use generate-image for:
- Photos and photorealistic images
- Artistic illustrations and artwork
- Concept art and visual concepts
- Visual assets for presentations or documents
- Image editing and modifications
- Any general-purpose image generation needs
Use scientific-schematics instead for:
- Flowcharts and process diagrams
- Circuit diagrams and electrical schematics
- Biological pathways and signaling cascades
- System architecture diagrams
- CONSORT diagrams and methodology flowcharts
- Any technical/schematic diagrams
Quick Start
Use the scripts/generate_image.py script to generate or edit images:
# Generate a new image
python scripts/generate_image.py "A beautiful sunset over mountains"
# Edit an existing image
python scripts/generate_image.py "Make the sky purple" --input photo.jpg
This generates/edits an image and saves it as generated_image.png in the current directory.
API Key Setup
CRITICAL: The script requires an OpenRouter API key. Before running, check if the user has configured their API key:
- Look for a
.envfile in the project directory or parent directories - Check for
OPENROUTER_API_KEY=<key>in the.envfile - If not found, inform the user they need to:
- Create a
.envfile withOPENROUTER_API_KEY=your-api-key-here - Or set the environment variable:
export OPENROUTER_API_KEY=your-api-key-here - Get an API key from: https://openrouter.ai/keys
- Create a
The script will automatically detect the .env file and provide clear error messages if the API key is missing.
Model Selection
Default model: google/gemini-3.1-flash-image-preview (high quality, recommended)
Available models for generation and editing:
google/gemini-3.1-flash-image-preview- High quality, supports generation + editingblack-forest-labs/flux.2-pro- Fast, high quality, supports generation + editing
Generation only:
black-forest-labs/flux.2-flex- Fast and cheap, but not as high quality as pro
Select based on:
- Quality: Use gemini-3.1-flash-image-preview or flux.2-pro
- Editing: Use gemini-3.1-flash-image-preview or flux.2-pro (both support image editing)
- Cost: Use flux.2-flex for generation only
Common Usage Patterns
Basic generation
python scripts/generate_image.py "Your prompt here"
Specify model
python scripts/generate_image.py "A cat in space" --model "black-forest-labs/flux.2-pro"
Custom output path
python scripts/generate_image.py "Abstract art" --output artwork.png
Edit an existing image
python scripts/generate_image.py "Make the background blue" --input photo.jpg
Edit with a specific model
python scripts/generate_image.py "Add sunglasses to the person" --input portrait.png --model "black-forest-labs/flux.2-pro"
Edit with custom output
python scripts/generate_image.py "Remove the text from the image" --input screenshot.png --output cleaned.png
Multiple images
Run the script multiple times with different prompts or output paths:
python scripts/generate_image.py "Image 1 description" --output image1.png
python scripts/generate_image.py "Image 2 description" --output image2.png
Script Parameters
prompt(required): Text description of the image to generate, or editing instructions--inputor-i: Input image path for editing (enables edit mode)--modelor-m: OpenRouter model ID (default: google/gemini-3.1-flash-image-preview)--outputor-o: Output file path (default: generated_image.png)--api-key: OpenRouter API key (overrides .env file)
Example Use Cases
For Scientific Documents
# Generate a conceptual illustration for a paper
python scripts/generate_image.py "Microscopic view of cancer cells being attacked by immunotherapy agents, scientific illustration style" --output figures/immunotherapy_concept.png
# Create a visual for a presentation
python scripts/generate_image.py "DNA double helix structure with highlighted mutation site, modern scientific visualization" --output slides/dna_mutation.png
For Presentations and Posters
# Title slide background
python scripts/generate_image.py "Abstract blue and white background with subtle molecular patterns, professional presentation style" --output slides/background.png
# Poster hero image
python scripts/generate_image.py "Laboratory setting with modern equipment, photorealistic, well-lit" --output poster/hero.png
For General Visual Content
# Website or documentation images
python scripts/generate_image.py "Professional team collaboration around a digital whiteboard, modern office" --output docs/team_collaboration.png
# Marketing materials
python scripts/generate_image.py "Futuristic AI brain concept with glowing neural networks" --output marketing/ai_concept.png
Error Handling
The script provides clear error messages for:
- Missing API key (with setup instructions)
- API errors (with status codes)
- Unexpected response formats
- Missing dependencies (requests library)
If the script fails, read the error message and address the issue before retrying.
Notes
- Images are returned as base64-encoded data URLs and automatically saved as PNG files
- The script supports both
imagesandcontentresponse formats from different OpenRouter models - Generation time varies by model (typically 5-30 seconds)
- For image editing, the input image is encoded as base64 and sent to the model
- Supported input image formats: PNG, JPEG, GIF, WebP
- Check OpenRouter pricing for cost information: https://openrouter.ai/models
Image Editing Tips
- Be specific about what changes you want (e.g., "change the sky to sunset colors" vs "edit the sky")
- Reference specific elements in the image when possible
- For best results, use clear and detailed editing instructions
- Both Gemini 3.1 Flash Image Preview and FLUX.2 Pro support image editing through OpenRouter
Integration with Other Skills
- scientific-schematics: Use for technical diagrams, flowcharts, circuits, pathways
- generate-image: Use for photos, illustrations, artwork, visual concepts
- scientific-slides: Combine with generate-image for visually rich presentations
- latex-posters: Use generate-image for poster visuals and hero images
FAQ
What does generate-image do?
Generate or edit images using AI models (FLUX, Nano Banana 2). Use for general-purpose image generation including photos, illustrations, artwork, visual assets, concept art, and any image that is not a technical diagram or schematic. For flowcharts, circuits, pathways, and technical diagrams, use the scientific-schematics skill instead.
When should I use generate-image?
Use it when you need a repeatable workflow that produces image output, code diff.
What does generate-image output?
In the evaluated run it produced image output, code diff.
How do I install or invoke generate-image?
npx skills add https://github.com/k-dense-ai/claude-scientific-skills --skill generate-image
Which agents does generate-image support?
Claude Code
What tools, channels, or permissions does generate-image need?
It uses python3; channels commonly include image, diff; permissions include filesystem:read, filesystem:write, network:outbound, process:spawn, env:read.
Is generate-image safe to install?
Static analysis marked this skill as medium risk; review side effects and permissions before enabling it.
How is generate-image different from an MCP or plugin?
A skill packages instructions and workflow conventions; tools, MCP servers, and plugins are dependencies the skill may call during execution.
Does generate-image outperform not using a skill?
About generate-image
When to use generate-image
You need to create illustrations, concept art, or presentation visuals from text prompts. You want to edit an existing image with AI using natural-language instructions. You already have or can provide an OpenRouter API key and want a scriptable workflow.
When generate-image is not the right choice
You need technical diagrams, schematics, flowcharts, or other structured scientific visuals. You cannot use external paid API services or do not have an OpenRouter API key.
What it produces
Produces image output and code diff.
Install
npx skills add https://github.com/k-dense-ai/claude-scientific-skills --skill generate-imageInvoke: Ask Claude Code to use generate-image for the task.
