[Commercial Use OK] Z-Image_clear_photoreal Released! A Fast, High-Quality Photorealistic Model for Image Generation AI
- Z-Image (Base) → rich in variation
- Z-Image-Turbo → excellent stability & high speed
- Z-Image_clear_photoreal → best of both worlds
Introduction
Hi everyone, this is Easygoing.
Today I’m excited to introduce Z-Image_clear_photoreal, an image generation AI model that delivers both high speed and high quality while being fully usable for commercial purposes.
What is Z-Image?
Z-Image is a generative AI model released by Alibaba Group on January 27, 2026, under the developer-friendly Apache-2.0 license.
gantt
title Lightweight Image Generative AIs
dateFormat YYYY-MM-DD
axisFormat %Y
tickInterval 12month
section Alibaba
Z-Image-Turbo : done, 2025-11-25, 2026-02-06
Z-Image : 2026-01-27, 2026-02-06
section Black Forest Labs
Flux.2 [klein]: crit, 2026-01-15, 2026-02-06
The Z-Image series first gained attention in November 2025 with the release of the fast distilled version, Z-Image-Turbo, which became popular as an image generation model that runs smoothly even on mid-range PCs.
The newly released Z-Image (Base) is the original core model. It represents the AI's raw "brain" exactly as it was trained, before any distillation.
Why Merge? (Distilled vs. Base)
Distilled models are great for efficiency but often come with trade-offs:
- Enhanced stability and speed
- Potential loss of fine detail
- Reduced diversity in outputs
To prevent generation failures, distilled models are tuned for high stability, allowing images to converge in fewer steps.
However, this can lead to muted color palettes and residual noise. Furthermore, excessive stability often results in repetitive faces and compositions.
Merging the core model into the distilled model
For Z-Image_clear_photoreal, I started with the previously introduced distilled model Z-Image-Turbo_clear and blended in layers from the original Z-Image (Base) at specific points, with the goal of improving both quality and variation.
This approach maintains the high-speed generation without CFG that Z-Image-Turbo offers, while pushing for even higher image quality.
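To make the speed point concrete, here is a minimal sketch of why skipping CFG (classifier-free guidance) matters. It assumes the standard CFG formulation, in which each step combines a prompt-conditioned and an unconditioned prediction. The function names and the step count of 8 are illustrative assumptions, not values taken from this model's actual settings.

```python
def cfg_combine(cond, uncond, scale):
    """Standard classifier-free guidance: push the prediction away from
    the unconditional result and toward the prompt-conditioned one."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


def model_calls(steps, use_cfg):
    """With CFG the model is evaluated twice per step (conditional and
    unconditional); without it, once per step."""
    return steps * (2 if use_cfg else 1)


# At scale 1.0 the combined prediction equals the conditional one,
# so the unconditional pass adds nothing and can be skipped.
same_as_cond = cfg_combine([1.0, 2.0], [0.0, 1.0], 1.0)

# Hypothetical step count, for illustration only.
calls_without_cfg = model_calls(8, use_cfg=False)
calls_with_cfg = model_calls(8, use_cfg=True)
```

Under these assumptions, running without CFG halves the number of model evaluations for the same number of steps.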
Actual generation examples!
Let’s compare some real outputs.
Left: the newly tuned Z-Image_clear_photoreal model
Right: the original Z-Image-Turbo_clear
Fireplace room
Late-night radio
Close-up comparison
When comparing image quality, skin texture is one of the easiest ways to see the difference.
The right side (Z-Image-Turbo_clear) shows the characteristic granular noise typical of fast distilled models, while the left side (Z-Image_clear_photoreal) is noticeably cleaner and smoother.
Let’s compare variation!
Next, let’s look at diversity.
We generated 8 images in a row using the exact same prompt but changing only the seed value.
Prompt (woman in traditional clothing)
realistic, photorealistic,
a female wears a purple and gold traditional dress and jewelry, standing in front of a snowy village at sunset or sunrise, surrounded by snow-covered houses and a warm orange and pink sky with orange and pink hues.,
dynamic angle, dutch angle, upper body, close up, face close up, happy, smile, laugh, peaceful, wind, rouge, alizarin, burgundy, maroon, indigo, royal blue, deep blue, deep purple, royal purple, stylish, elegant, turn around
Z-Image_clear_photoreal (this merged model)
Z-Image-Turbo_clear (distilled model)
Comparing the two models clearly shows that the tuned Z-Image_clear_photoreal produces rich variation in faces and compositions, while the original Z-Image-Turbo_clear tends to generate very similar-looking images with little diversity.
Merge recipe for Z-Image_clear_photoreal
Here is the merge recipe used to create Z-Image_clear_photoreal.
The Model Merge Z-Image node on the right mixes two Z-Image models.
A value of 0 uses the layer from Z-Image (Base), while 1 uses the layer from Z-Image-Turbo_clear.
In other words:
- Early layers use the weights from Z-Image (Base), which retain the diversity of its training data → increases variation
- Mid-to-late layers use the distilled Z-Image-Turbo_clear → preserves stability and generation speed
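The idea behind a per-layer merge can be sketched in a few lines of Python. This is only an illustration of the general technique, not the custom node's actual implementation: the parameter names (`layers.N.weight`), the ratio values, and the helper functions are all hypothetical, and plain lists stand in for real weight tensors. The convention matches the one above, where 0 means Z-Image (Base) and 1 means Z-Image-Turbo_clear.

```python
import re


def layer_index(key):
    """Return the block index found in a parameter name, or None.
    The 'layers.N.' naming scheme is a hypothetical example."""
    match = re.search(r"layers\.(\d+)\.", key)
    return int(match.group(1)) if match else None


def merge_state_dicts(base, turbo, ratios, default_ratio=1.0):
    """Blend two models layer by layer.
    ratio 0.0 -> take the Base weights, ratio 1.0 -> take the Turbo weights.
    Parameters without a listed layer index fall back to default_ratio."""
    merged = {}
    for key, turbo_values in turbo.items():
        ratio = ratios.get(layer_index(key), default_ratio)
        merged[key] = [
            (1.0 - ratio) * b + ratio * t
            for b, t in zip(base[key], turbo_values)
        ]
    return merged


# Toy weights; real models hold large tensors per parameter.
base = {
    "layers.0.weight": [0.0, 2.0],
    "layers.1.weight": [2.0],
    "layers.2.weight": [4.0, 4.0],
    "final.weight": [1.0],
}
turbo = {
    "layers.0.weight": [1.0, 0.0],
    "layers.1.weight": [4.0],
    "layers.2.weight": [0.0, 8.0],
    "final.weight": [3.0],
}

# Hypothetical ratios: early layer from Base, later layer from Turbo.
ratios = {0: 0.0, 1: 0.5, 2: 1.0}
merged = merge_state_dicts(base, turbo, ratios)
```

In this toy example, layer 0 ends up identical to Base, layer 2 identical to Turbo, and layer 1 is an even blend of the two. The real recipe uses its own per-layer values in the Model Merge Z-Image node.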
Custom node: Model Merge Z-Image
Use Z-Image_clear_vae as the VAE!
For the Z-Image_clear_photoreal model, I strongly recommend using the custom VAE I created: Z-Image_clear_vae.
- Z-Image_natural_vae: lower saturation, more natural look
- Z-Image_clear_vae: higher saturation, vivid and vibrant look
I’ve prepared several color-style variations, so feel free to choose the one that matches your preference.
Next time: Z-Image_clear_anime!
This time I created a photoreal-focused model because the tuning directions for photorealistic and anime-style illustrations turned out to be completely different.
In the next post, I plan to introduce the anime-oriented Z-Image_clear_anime model.
Summary: Try Z-Image_clear_photoreal!
- Z-Image (Base) → rich in variation
- Z-Image-Turbo → excellent stability & high speed
- Z-Image_clear_photoreal → best of both worlds
That’s the introduction to Z-Image_clear_photoreal.
When you actually start fine-tuning Z-Image, you realize how little redundant structure it has. Alibaba clearly conducted deep research on Qwen-Image and achieved outstanding optimization and a lightweight design.
Now that the core Z-Image (Base) model has been released, it has become much easier for the community to develop improved versions. I expect many excellent derivative models to appear from the community in the future.
If you haven’t tried it yet, why not take this opportunity to experience Z-Image’s combination of high quality and blazing-fast generation?
Thank you so much for reading until the end!