
I wrote about this blog's automated image generation previously. The short version: generate-images.mjs runs before each build, finds posts with missing images, builds a prompt from the imgprompt front matter field (or has gpt-4o-mini summarize the post), then calls DALL-E 3 to produce a 1024x1024 image.
It works, but DALL-E 3 is frustrating. It ignores prompt details, can't make a white background to save its life, and the results look more "AI art" than "clean icon." Time to see what else is out there.
This space moves fast. As of February 2026, these are the top 3 image generation models on the LM Arena text-to-image leaderboard that have public APIs. These rankings are based on blind human preference voting and will almost certainly be different by the time you read this.
ELO Scores: ELO is a rating system where users are shown two AI-generated images side by side (without knowing which model made which) and pick the one they prefer. Wins and losses shift each model's score, so a higher ELO means the model's images are consistently preferred by real people.
| Model | Provider | Released | ELO | Notes |
|---|---|---|---|---|
| GPT Image 1.5 | OpenAI | December 2025 | 1248 | 4x faster than GPT Image 1. Token-based pricing, 20% cheaper than its predecessor. |
| Nano Banana Pro | Google (Gemini 3 Pro Image) | November 2025 | 1237 | Native Gemini image gen. Strong multimodal integration. |
| Flux 2 Max | Black Forest Labs | November 2025 | 1169 | 32B parameter model. Megapixel-based pricing. BFL's highest quality tier. |
GPT Image 1.5 and Nano Banana Pro are close in quality. Flux 2 Max sits roughly 70 to 80 ELO points behind them, but it is the strongest option available outside Google and OpenAI.
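To get a feel for what those gaps mean, the textbook Elo formula converts a rating difference into an expected head-to-head preference rate. The sketch below applies that formula, with the conventional 400-point scale and an assumed step size `k`, to the ratings in the table. The leaderboard's actual statistical method may differ, so treat the numbers as rough intuition only.

```javascript
// Textbook Elo expected score: the probability that A is preferred over B.
// Assumption: the conventional 400-point logistic scale.
function expectedScore(ratingA, ratingB) {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// Textbook Elo update after one comparison (actual = 1 for a win, 0 for a loss).
// `k` is an assumed step size, not a value taken from the leaderboard.
function updateRating(rating, expected, actual, k = 32) {
  return rating + k * (actual - expected);
}

// Ratings copied from the table above.
const ratings = {
  "GPT Image 1.5": 1248,
  "Nano Banana Pro": 1237,
  "Flux 2 Max": 1169,
};

const vsNano = expectedScore(ratings["GPT Image 1.5"], ratings["Nano Banana Pro"]);
const vsFlux = expectedScore(ratings["GPT Image 1.5"], ratings["Flux 2 Max"]);

console.log(`GPT Image 1.5 vs Nano Banana Pro: ${(vsNano * 100).toFixed(1)}%`);
console.log(`GPT Image 1.5 vs Flux 2 Max: ${(vsFlux * 100).toFixed(1)}%`);
```

Under these assumptions, an 11-point lead works out to being preferred only slightly more than half the time (about 52%), while the 79-point lead over Flux 2 Max is closer to 61%.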
What about open source? Flux 2 Dev is open-weight and Stable Diffusion 3.5 is fully open source, but neither is practical for this use case. Running them on a MacBook Air M2 (8GB unified memory, passive cooling) means throttling after ~90 seconds, a max practical resolution of 512x512, and needing to close basically every other app first. The quality gap vs. the API models is real too. For a build-time script that needs to reliably produce clean images, cloud APIs win.
GPT Image 1.5 is the easiest to set up, since the blog already uses OpenAI.
- Reuse the same OPENAI_API_KEY used for DALL-E 3.
- Change the model parameter from dall-e-3 to gpt-image-1.5 in the images API call.
- GPT Image 1.5 is available through the Images API (/v1/images/generations) and the newer Responses API. The Images API is the closest drop-in replacement. Quality can be set to low, medium, high, or auto. It also supports streaming partial renders and up to 16 reference images for editing.

export OPENAI_API_KEY="sk-..."
If you don't already have a key, create one at platform.openai.com/api-keys.
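As a rough illustration of the model swap, here is a minimal sketch of an Images API call using Node's built-in `fetch`. The helper names `buildOpenAIImageRequest` and `generateWithOpenAI` are hypothetical and not taken from generate-images.mjs. The sketch also assumes the response carries base64 image data in `data[0].b64_json`.

```javascript
// Build the request for OpenAI's Images API (POST /v1/images/generations).
// Kept separate from the network call so it can be inspected without spending credits.
function buildOpenAIImageRequest(prompt, apiKey, quality = "medium") {
  return {
    url: "https://api.openai.com/v1/images/generations",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model: "gpt-image-1.5", // previously "dall-e-3"
        prompt,
        size: "1024x1024",
        quality, // "low" | "medium" | "high" | "auto"
      }),
    },
  };
}

// Hypothetical helper: sends the request and returns the image bytes as a Buffer.
// Assumption: the response holds base64 data in data[0].b64_json.
async function generateWithOpenAI(prompt, apiKey = process.env.OPENAI_API_KEY) {
  const { url, options } = buildOpenAIImageRequest(prompt, apiKey);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`OpenAI request failed: ${res.status} ${await res.text()}`);
  }
  const json = await res.json();
  return Buffer.from(json.data[0].b64_json, "base64");
}
```

If the existing DALL-E 3 call passes DALL-E-specific options (for example a quality of "standard" or "hd"), those would likely need adjusting as well, so it is worth checking the request against the current API reference.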
Nano Banana is Google's image generation feature built on Gemini. You can access it through the official Google Gemini API.
npm install @google/genai
export GEMINI_API_KEY="your-key-here"
Request image output by setting responseModalities: ["TEXT", "IMAGE"]. The model name for the standard tier is gemini-2.5-flash-preview-image and for Pro it's gemini-3-pro-image.

Note: the API flow is different from OpenAI. You send a chat completion-style request with image output enabled, and the response includes inline image data.
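Because the image comes back inline rather than as a dedicated image payload, the script has to dig it out of the response parts. Here is a minimal sketch of that step. `extractInlineImages` and `generateWithGemini` are hypothetical helper names, and the response shape (`candidates[0].content.parts[]` with `inlineData`) is assumed from the `@google/genai` SDK's `generateContent` responses.

```javascript
// Pull inline image parts out of a generateContent-style response.
// Assumption: image parts look like { inlineData: { mimeType, data } } with base64 data.
function extractInlineImages(response) {
  const parts = response?.candidates?.[0]?.content?.parts ?? [];
  return parts
    .filter((part) => part.inlineData?.data)
    .map((part) => ({
      mimeType: part.inlineData.mimeType,
      buffer: Buffer.from(part.inlineData.data, "base64"),
    }));
}

// Hypothetical helper; the SDK is loaded lazily so nothing runs at import time.
// The default model name is the Pro tier name quoted in the text above.
async function generateWithGemini(prompt, model = "gemini-3-pro-image") {
  const { GoogleGenAI } = await import("@google/genai");
  const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
  const response = await ai.models.generateContent({
    model,
    contents: prompt,
    config: { responseModalities: ["TEXT", "IMAGE"] },
  });
  const [image] = extractInlineImages(response);
  if (!image) throw new Error("Gemini response contained no image data");
  return image.buffer;
}
```

Filtering on `inlineData` matters because the same response can also include text parts, which this sketch simply skips.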
Black Forest Labs runs their own API platform.
export BFL_API_KEY="your-key-here"
Send a POST request to https://api.bfl.ai/v1/flux-2-max with your prompt, width, and height. The API is async: you submit a task and poll for the result using the returned task ID.

Once you've configured your API keys, you can verify everything is wired up correctly by running:
node apps/blog/blog/scripts/generate-images.mjs --verify
Once it exists, this flag will check that each configured API key is valid and can reach its service without generating any images. It isn't implemented yet; we'll add it in a follow-up change.
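Separately from the key check, the submit-and-poll flow described for Flux 2 Max can be sketched as follows. This is an illustration under assumptions, not BFL's documented client: the `x-key` auth header, the `polling_url` field, the status strings, and `result.sample` reflect my understanding of the BFL API and should be checked against their reference. `pollUntilReady` and `generateWithFlux` are hypothetical helper names.

```javascript
// Generic poll loop: call `checkStatus` until it reports "Ready" or we give up.
async function pollUntilReady(checkStatus, { intervalMs = 1000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await checkStatus();
    if (status.status === "Ready") return status.result;
    if (status.status === "Error" || status.status === "Failed") {
      throw new Error(`Flux task ended with status ${status.status}`);
    }
    // Not ready yet: wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Flux task not ready after ${maxAttempts} attempts`);
}

// Hypothetical helper: submits a task, then polls for the finished image URL.
// Assumptions: `x-key` header, `polling_url` in the submit response, and
// `result.sample` holding the image URL once the status is "Ready".
async function generateWithFlux(prompt, apiKey = process.env.BFL_API_KEY) {
  const headers = { "Content-Type": "application/json", "x-key": apiKey };
  const submit = await fetch("https://api.bfl.ai/v1/flux-2-max", {
    method: "POST",
    headers,
    body: JSON.stringify({ prompt, width: 1024, height: 576 }),
  });
  if (!submit.ok) throw new Error(`Flux submit failed: ${submit.status}`);
  const task = await submit.json();
  const result = await pollUntilReady(async () => {
    const res = await fetch(task.polling_url, { headers });
    return res.json();
  });
  return result.sample;
}
```

Capping the number of attempts keeps a stuck task from hanging the build indefinitely, which matters for a script that runs before every build.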
All prices are USD per image at 1024x1024.
| Model | Cost | Pricing Model | Notes |
|---|---|---|---|
| DALL-E 3 (prior) | ~$0.04 | Per image | Fixed price, single quality tier |
| GPT Image 1.5 | $0.009–$0.14 | Per token (output) | Varies by quality: low ~$0.009, medium ~$0.03, high ~$0.14. ~20% cheaper than GPT Image 1 |
| Nano Banana | ~$0.02 | Per image | Standard tier (Gemini 2.5 Flash). Pro tier is ~$0.10/image |
| Flux 2 Max | ~$0.07 | Per megapixel | Scales with resolution. 1024x1024 ≈ 1MP ≈ 7 credits |
The ambiguity here is mostly around GPT Image 1.5: its token-based pricing means cost depends heavily on the quality setting and prompt length. For a fair comparison we'll use "medium" quality, which lands close to what we're paying for DALL-E 3 today. Flux 2 Max's megapixel pricing is predictable but gets more expensive at higher resolutions.
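The megapixel arithmetic is simple enough to sketch. The snippet below assumes a flat rate of about $0.07 per megapixel (the approximate figure from the table) and counts a megapixel as 1,000,000 pixels. Real billing may round or tier differently, so these are estimates only.

```javascript
// Assumed flat rate for Flux 2 Max, taken from the approximate table figure.
const FLUX_USD_PER_MEGAPIXEL = 0.07;

// Estimated cost in USD for one image of the given pixel dimensions.
function fluxCost(width, height, usdPerMegapixel = FLUX_USD_PER_MEGAPIXEL) {
  const megapixels = (width * height) / 1_000_000;
  return megapixels * usdPerMegapixel;
}

console.log(fluxCost(1024, 1024).toFixed(3)); // prints "0.073"
console.log(fluxCost(1024, 576).toFixed(3)); // prints "0.041"
```

Requesting a 16:9 frame directly (1024x576) therefore costs a little over half as much as generating a square image and cropping it afterwards.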
Each model was given the same prompt, built from imgprompt: "A cute robot holding a paint brush":
Subject: "A cute robot holding a paint brush".
Style: minimalist flat vector icon, clean lines, crisp edges, simplified geometric shapes.
Colors: black, white, and one primary accent color only. No gradients.
Composition: centered subject, generous negative space, wide landscape 16:9 aspect ratio.
Background: pure solid white (#FFFFFF), filling the entire image.
No text, no shadows, no textures, no transparency, no background elements.
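For reference, here is a minimal sketch of how a prompt like this could be assembled from a post's imgprompt value. `buildImagePrompt` is a hypothetical helper name; the actual code in generate-images.mjs may be structured differently.

```javascript
// Hypothetical helper: wraps a post's `imgprompt` value in the fixed style template.
function buildImagePrompt(imgprompt) {
  return [
    `Subject: "${imgprompt}".`,
    "Style: minimalist flat vector icon, clean lines, crisp edges, simplified geometric shapes.",
    "Colors: black, white, and one primary accent color only. No gradients.",
    "Composition: centered subject, generous negative space, wide landscape 16:9 aspect ratio.",
    "Background: pure solid white (#FFFFFF), filling the entire image.",
    "No text, no shadows, no textures, no transparency, no background elements.",
  ].join("\n");
}

console.log(buildImagePrompt("A cute robot holding a paint brush"));
```

Keeping the style rules in one template means every model receives identical instructions, so any differences in the output come from the model rather than the prompt.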
All images were center-cropped to 533x300 (16:9) after generation.
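The crop geometry can be worked out with a small helper. This sketch only computes the crop rectangle; the cropping and resizing themselves would be done by whatever image library the script uses, which isn't specified here. `centerCropBox` is a hypothetical name.

```javascript
// Largest centered rectangle with the target aspect ratio that fits in the source.
function centerCropBox(srcWidth, srcHeight, targetWidth, targetHeight) {
  const targetRatio = targetWidth / targetHeight;
  const srcRatio = srcWidth / srcHeight;
  let width = srcWidth;
  let height = srcHeight;
  if (srcRatio > targetRatio) {
    // Source is too wide: keep full height, trim the sides.
    width = Math.round(srcHeight * targetRatio);
  } else {
    // Source is too tall: keep full width, trim top and bottom.
    height = Math.round(srcWidth / targetRatio);
  }
  return {
    left: Math.floor((srcWidth - width) / 2),
    top: Math.floor((srcHeight - height) / 2),
    width,
    height,
  };
}

console.log(centerCropBox(1024, 1024, 533, 300));
// prints { left: 0, top: 224, width: 1024, height: 576 }
```

For a 1024x1024 source, that keeps a 1024x576 band starting 224 pixels from the top, which is then scaled down to 533x300. Anything the model draws in the top or bottom 224 rows is lost.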
Model: gpt-image-1.5, size: 1024x1024, quality: medium. Estimated cost: ~$0.03.

Model: gemini-2.0-flash-exp-image-generation. Estimated cost: ~$0.02 (free tier may apply).

Model: flux-2-max, size: 1024x576. Estimated cost: ~$0.04 (~0.6MP at $0.07/MP).

GPT Image 1.5 is unmistakably OpenAI. It's got that annoying same-y generic feel that lets you know it's AI art, and I don't love that, but it's also definitely the cutest of the three.
Flux's was more artsy. It's just kind of... weird? It kept drawing unique stuff, but it all came out sort of unsettling.
Nano Banana's results are pretty original, but it keeps drawing the robot really small and not as cute, despite "cute" being in the prompt. I think I might switch to it once I figure out the size problems and get bored of the generic ChatGPT style.
Given the middling price too, gpt-image-1.5 wins today (2026-02-22).