Does AI Illustration Turn Green? A Thorough Comparison of VAE Precision and Color Accuracy Between SDXL and Flux!

2026-4-19 / 2026-5-5

an_animated_female_character_with_silver_hair_and_blue_eyes_wears_a_white_uniform_with_a_blue_ribbon_and_a_red_ribbon,_set_against_a_dark_background_with_a_sta

SDXL’s VAE tends to shift toward green
Flux’s VAE offers high precision
Introduction of SDXL_anime_natural_vae

Introduction

Hello, this is Easygoing.

This time, I’d like to take a closer look at VAE, a topic that always comes up in image generation AI.

Anime illustration of a silver-haired girl wearing a white uniform — Today’s topic: VAE

AI Image Generation Requires Massive Computation

AI image generation involves repeatedly performing matrix operations, which demands enormous computational power.

Modern image generation AI models use a specialized model called a VAE (Variational Autoencoder) to compress the image space so that these calculations can be performed more efficiently.


flowchart LR

A1(Original Image)

subgraph VAE

B1(VAE Encode)
D1(VAE Decode)

end

subgraph Latent Space

C1(Latent Image)

end

E1(Generated Image)

A1-->B1
B1-->C1
C1-->D1
D1-->E1

The compressed space created by the VAE is called the latent space, and the images processed within it are specifically referred to as latent images.

Standard Information Volume is 1024 x 1024 x 3

In image generation AI models from SDXL onward, the standard resolution is set to 1024 x 1024.

Anime illustration of a silver-haired girl in a white uniform smiling at the viewer — 1024 x 1024 x 3 channels is standard

Furthermore, in the RGB color space we use, there are three color channels — Red, Green, and Blue — resulting in a total information volume of 1024 x 1024 x 3.

VAE Compresses It to 1/48th!

Looking at the major VAEs in chronological order, they have evolved as follows: SD1.5_vae → SDXL_0.9_vae → Flux.1_vae → Flux.2_vae.

gantt
    title VAE Roadmap
    dateFormat YYYY-MM-DD
    tickInterval 12month
    axisFormat %Y

    section Stability AI
        Stable Diffusion 1 : done, 2022-08-22, 2026-04-18
        Stable Diffusion XL 0.9 : done, 2023-06-22, 2026-04-18
        
    section Black Forest Labs
        Flux.1 : 2024-08-01, 2026-04-18
        Flux.2 : 2025-11-25, 2026-04-18

Model	SD1.5_vae	SDXL_0.9_vae	Flux.1_vae	Flux.2_vae
Original Resolution	512 x 512 x 3	1024 x 1024 x 3	1024 x 1024 x 3	1024 x 1024 x 3
Latent Resolution	64 x 64 x 4	128 x 128 x 4	128 x 128 x 16	128 x 128 x 32
Compression Ratio	1/48	1/48	1/12	1/6
Representative Models	SD1.5	SDXL	Flux.1 HiDream Z-Image	Flux.2

Among them, SDXL_0.9_vae compresses the vertical and horizontal resolution to 128 x 128 (1/8th of the original on each side) and converts the 3 color channels into 4 dedicated latent channels, thereby reducing the total information volume to 1/48th of the original.
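The compression ratios in the table can be re-derived with simple arithmetic. The sketch below only counts values (height x width x channels) and ignores the fact that pixels and latents are stored with different bit depths; the names `VAES`, `volume`, and `compression_ratio` are made up for this illustration.

```python
from fractions import Fraction

# (height, width, channels) of the input image and of the latent, as listed in the table.
VAES = {
    "SD1.5_vae": ((512, 512, 3), (64, 64, 4)),
    "SDXL_0.9_vae": ((1024, 1024, 3), (128, 128, 4)),
    "Flux.1_vae": ((1024, 1024, 3), (128, 128, 16)),
    "Flux.2_vae": ((1024, 1024, 3), (128, 128, 32)),
}


def volume(shape):
    """Number of values in a tensor of the given (height, width, channels) shape."""
    height, width, channels = shape
    return height * width * channels


def compression_ratio(image_shape, latent_shape):
    """Latent volume divided by image volume, kept as an exact fraction."""
    return Fraction(volume(latent_shape), volume(image_shape))


ratios = {name: compression_ratio(*shapes) for name, shapes in VAES.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio}")
```

Each side shrinks by a factor of 8, so the pixel count shrinks by 64; going from 3 to 4 channels gives back a factor of 4/3, which is where 1/48 comes from.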

Images Change When Passed Through VAE!

So, how does image quality change when going through the VAE?

From here, I will actually run VAE encode → VAE decode in ComfyUI to observe the changes.

ComfyUI workflow diagram connecting VAE Encode and VAE Decode to verify image changes — Connecting VAE encode → VAE decode

UI of the Image Difference Checker node showing difference map, MAE/SSIM values, and tone curve — Analyzing the difference map and tone curve

The Image Difference Checker node used in this analysis measures image differences using a difference map, MAE, and SSIM, and also allows color changes to be checked via the tone curve.

Mean Absolute Error (MAE): The average per-pixel difference; mainly detects color differences
Structural Similarity Index (SSIM): Detects structural differences in the black-and-white (luminance) image
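To make the two metrics more concrete, here is a minimal NumPy sketch. It is not the code of the Image Difference Checker node: the conversion of MAE into a similarity percentage, the Rec. 709 luminance weights, and the use of a single global SSIM window (instead of the usual sliding window) are all assumptions made for illustration.

```python
import numpy as np


def mae_similarity(a, b):
    """100 % minus the mean absolute per-pixel error (images as floats in [0, 1])."""
    return 100.0 * (1.0 - np.mean(np.abs(a - b)))


def luminance(img):
    """Reduce an RGB image to one grayscale channel (Rec. 709 weights, assumed)."""
    return img @ np.array([0.2126, 0.7152, 0.0722])


def global_ssim(a, b, data_range=1.0):
    """SSIM computed once over the whole luminance image (simplified, no sliding window)."""
    x, y = luminance(a), luminance(b)
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))


rng = np.random.default_rng(0)
original = rng.random((64, 64, 3))
tinted = original.copy()
tinted[..., 1] = np.clip(tinted[..., 1] + 0.02, 0.0, 1.0)  # slight green lift
print(mae_similarity(original, tinted), global_ssim(original, tinted))
```

A pure color tint like this one lowers the MAE-based score noticeably while the luminance structure, and therefore SSIM, stays very close to 1.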

ComfyUI-easygoing-nodes custom node

easygoing0114/ComfyUI-easygoing-nodes: Experimental Text Encoder module, HDR Effects with LAB adjusts, Image Difference checker nodes

Comparison with Actual Images!

Now let’s compare using real images. This time, I will examine the precision of four VAEs: SD1.5_vae, SDXL_0.9_vae, Flux.1_vae, and Flux.2_vae.

1. White Uniform (Anime)

Test image: Anime-style female character wearing a white uniform

Difference Map

Looking at the difference map, SD1.5_vae shows relatively large image quality degradation, especially around the character’s outlines.

SDXL_0.9_vae also shows degradation in the same areas, but the degree is considerably milder compared to SD1.5_vae.

With Flux.1_vae and Flux.2_vae, the changes from the original image are much smaller, indicating that both are excellent VAEs with minimal degradation.

MAE and SSIM

Image1	SD1.5_vae	SDXL_0.9_vae	Flux.1_vae	Flux.2_vae
MAE_similarity	97.8 %	98.3 %	99.0 %	98.8 %
SSIM_similarity	98.8 %	99.1 %	99.8 %	99.8 %

I calculated MAE and SSIM to compare the precision of each VAE numerically.

VAE precision generally improves with each generation, but Flux.2_vae’s MAE is slightly worse than Flux.1_vae’s (98.8 % vs. 99.0 %), which suggests that Flux.1_vae reproduces the colors of this image more faithfully.

Color Changes

Image1	SD1.5_vae	SDXL_0.9_vae	Flux.1_vae	Flux.2_vae
Red	-0.5 %	0.2 %	0.0 %	0.4 %
Green	-0.9 %	0.2 %	0.1 %	-0.3 %
Blue	0.0 %	0.0 %	0.0 %	0.7 %

Next, looking at color changes: SD1.5_vae has less green compared to the original, while Flux.2_vae shows an increase in red and blue.
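The article does not spell out how the node turns the tone curve into these percentages. One plausible reading, assumed here, is the mean signed change of each channel expressed as a percentage of the full 0 to 1 range; `channel_change_percent` is a hypothetical helper, not part of the node.

```python
import numpy as np


def channel_change_percent(original, decoded):
    """Mean signed change per RGB channel, as a percentage of the full 0-1 range."""
    diff = decoded.astype(np.float64) - original.astype(np.float64)
    return dict(zip(("Red", "Green", "Blue"), 100.0 * diff.mean(axis=(0, 1))))


# Toy example: a flat gray image whose green channel comes back 0.9 % lower.
original = np.full((8, 8, 3), 0.5)
decoded = original + np.array([0.0, -0.009, 0.0])
changes = channel_change_percent(original, decoded)
print({name: round(float(value), 1) for name, value in changes.items()})
```

A negative value means the decoded image contains less of that channel than the original, as with green for SD1.5_vae.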

Since the tendency is not very clear with just one illustration, I compared four additional illustrations for each VAE.

2. Night Harbor View (Anime)

Test image: Anime illustration of a harbor at night

Difference map comparison for each VAE on the night harbor illustration

Image2	SD1.5_vae	SDXL_0.9_vae	Flux.1_vae	Flux.2_vae
MAE_similarity	98.8 %	99.0 %	99.4 %	99.2 %
SSIM_similarity	99.5 %	99.6 %	99.9 %	99.9 %
Red	0.0 %	0.3 %	0.0 %	0.5 %
Green	-0.7 %	0.5 %	0.1 %	-0.4 %
Blue	-0.2 %	0.0 %	-0.2 %	0.3 %

MAE and SSIM show the same tendency
SDXL_0.9_vae increases green

3. Red Flowers (Anime)

Test image: Vibrant red flower anime illustration

Difference map comparison for each VAE on the red flower illustration

Image3	SD1.5_vae	SDXL_0.9_vae	Flux.1_vae	Flux.2_vae
MAE_similarity	97.1 %	97.8 %	98.8 %	98.9 %
SSIM_similarity	98.0 %	98.6 %	99.7 %	99.8 %
Red	-0.4 %	0.0 %	-0.2 %	0.4 %
Green	-0.8 %	0.1 %	0.0 %	-0.1 %
Blue	-0.4 %	0.1 %	0.0 %	0.1 %

Flux.2_vae shows the smallest changes in both MAE and SSIM

4. Takeoff (Photorealistic)

Difference map comparison for each VAE on the airplane photo

Image4	SD1.5_vae	SDXL_0.9_vae	Flux.1_vae	Flux.2_vae
MAE_similarity	98.0 %	98.3 %	99.0 %	99.0 %
SSIM_similarity	98.3 %	98.8 %	99.8 %	99.8 %
Red	0.0 %	0.2 %	0.1 %	0.3 %
Green	-0.8 %	0.3 %	0.2 %	0.0 %
Blue	-0.3 %	0.3 %	0.2 %	0.2 %

Flux.1_vae and Flux.2_vae perform equally well

5. Oil Painting (Photorealistic)

Test image: Heavy-textured oil painting style illustration

Difference map comparison for each VAE on the oil painting illustration

Image5	SD1.5_vae	SDXL_0.9_vae	Flux.1_vae	Flux.2_vae
MAE_similarity	97.9 %	98.3 %	98.7 %	98.8 %
SSIM_similarity	99.1 %	99.3 %	99.7 %	99.8 %
Red	0.1 %	-0.1 %	0.0 %	0.6 %
Green	-1.1 %	0.4 %	0.3 %	-0.3 %
Blue	-0.2 %	-0.1 %	-0.1 %	0.4 %

Flux.1_vae and Flux.2_vae perform almost equally well, with Flux.2_vae ahead by 0.1 points on both MAE and SSIM

Looking at the Average of All Five Illustrations

Now let’s examine the average across the five illustrations tested this time.

Image Changes

Average	SD1.5_vae	SDXL_0.9_vae	Flux.1_vae	Flux.2_vae
MAE_similarity	97.9 %	98.3 %	99.0 %	98.9 %
SSIM_similarity	98.7 %	99.1 %	99.8 %	99.8 %

Passing through VAE causes 0.2–2% degradation in the image
Color accuracy (MAE): Flux.1 > Flux.2 > SDXL > SD1.5
Structural accuracy (SSIM): Flux.2 ≈ Flux.1 > SDXL > SD1.5

Overall, judging by the average similarity scores, images degrade by about 0.2–2% when passed through a VAE.

Performance improved modestly from SD1.5 to SDXL (+0.4 points in MAE similarity), while the jump from SDXL to Flux.1 is clearly larger (+0.7 points in both MAE and SSIM similarity).
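The averages can be cross-checked from the per-image tables. The sketch below simply re-averages the rounded values printed in this article, so it could in principle differ from averages taken over unrounded measurements; here the two agree.

```python
from statistics import mean

# Per-image scores (%) copied from the five tables above, in image order 1-5.
MAE = {
    "SD1.5_vae": [97.8, 98.8, 97.1, 98.0, 97.9],
    "SDXL_0.9_vae": [98.3, 99.0, 97.8, 98.3, 98.3],
    "Flux.1_vae": [99.0, 99.4, 98.8, 99.0, 98.7],
    "Flux.2_vae": [98.8, 99.2, 98.9, 99.0, 98.8],
}
SSIM = {
    "SD1.5_vae": [98.8, 99.5, 98.0, 98.3, 99.1],
    "SDXL_0.9_vae": [99.1, 99.6, 98.6, 98.8, 99.3],
    "Flux.1_vae": [99.8, 99.9, 99.7, 99.8, 99.7],
    "Flux.2_vae": [99.8, 99.9, 99.8, 99.8, 99.8],
}


def averages(scores):
    """Average each VAE's five scores and round to one decimal, like the table."""
    return {name: round(mean(values), 1) for name, values in scores.items()}


mae_avg, ssim_avg = averages(MAE), averages(SSIM)
print("MAE :", mae_avg)
print("SSIM:", ssim_avg)
```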

Model	SD1.5_vae	SDXL_0.9_vae	Flux.1_vae	Flux.2_vae
Original Resolution	512 x 512 x 3	1024 x 1024 x 3	1024 x 1024 x 3	1024 x 1024 x 3
Latent Resolution	64 x 64 x 4	128 x 128 x 4	128 x 128 x 16	128 x 128 x 32
Compression Ratio	1/48	1/48	1/12	1/6
Representative Models	SD1.5	SDXL	Flux.1 HiDream Z-Image	Flux.2

Flux.1 increased the number of latent space channels from 4 to 16, quadrupling the information volume compared to SDXL. This increase in information capacity is believed to be the main reason for the significant improvement in precision.

Color Changes

Next, let’s look at the color shifts for each VAE.

Color and complementary color relationships

Average	SD1.5_vae	SDXL_0.9_vae	Flux.1_vae	Flux.2_vae
Red	-0.2 %	0.1 %	0.0 %	0.4 %
Green	-0.9 %	0.3 %	0.1 %	-0.2 %
Blue	-0.2 %	0.1 %	0.0 %	0.3 %

SD1.5_vae: Overall darkening, shifts toward purple
SDXL_0.9_vae: Shifts toward green
Flux.1_vae: Almost perfectly accurate
Flux.2_vae: Shifts toward purple
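The labels in this list follow from the complementary relationship between green and magenta (red plus blue). As a rough sketch, the average channel changes can be reduced to a tint axis and a brightness change; the formula and the `THRESHOLD` cut-off are assumptions chosen for illustration, not something the node reports.

```python
# Average channel changes (%) from the table above: (Red, Green, Blue).
AVERAGE_SHIFT = {
    "SD1.5_vae": (-0.2, -0.9, -0.2),
    "SDXL_0.9_vae": (0.1, 0.3, 0.1),
    "Flux.1_vae": (0.0, 0.1, 0.0),
    "Flux.2_vae": (0.4, -0.2, 0.3),
}

THRESHOLD = 0.15  # hypothetical cut-off (in points) below which a shift counts as neutral


def describe(red, green, blue):
    """Return (tint, brightness change) for one set of channel changes."""
    tint_axis = green - (red + blue) / 2  # > 0 leans green, < 0 leans magenta/purple
    if tint_axis > THRESHOLD:
        tint = "green"
    elif tint_axis < -THRESHOLD:
        tint = "purple"
    else:
        tint = "neutral"
    return tint, (red + green + blue) / 3


summary = {name: describe(*rgb) for name, rgb in AVERAGE_SHIFT.items()}
for name, (tint, brightness) in summary.items():
    print(f"{name}: tint={tint}, brightness change={brightness:+.2f} %")
```

With these numbers SD1.5_vae comes out as darker and purple-leaning, SDXL_0.9_vae as green-leaning, Flux.1_vae as neutral, and Flux.2_vae as purple-leaning, which matches the list.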

Regarding color, SD1.5_vae reduces all colors, making the entire illustration darker, with green being lost the most.

SDXL_0.9_vae shows less overall color change than SD1.5, but instead increases green.

Flux.1_vae excels at color reproduction, with almost no change.

Flux.2_vae emphasizes red and blue, which together push the image toward magenta/purple. This may be an intentional correction to compensate for the blue and red tones that tend to be lost when removing noise in AI illustrations.

Anime illustration of a silver-haired girl in a white uniform smiling confidently at the viewer — Flux.2 applies intentional color correction

Creating an Improved SDXL VAE!

Since SDXL_0.9_vae turned out to shift toward green, I fine-tuned a new VAE called SDXL_anime_natural_vae specifically for anime illustrations. Compared with SDXL_0.9_vae, it shows smaller MAE and SSIM changes and stays more faithful to the original colors.

easygoing0114/SDXL_anime_vae · Hugging Face

SDXL_anime_natural_vae

Image2	SDXL_anime_natural_vae	SDXL_0.9_vae
MAE_similarity	99.4 %	99.0 %
SSIM_similarity	99.9 %	99.6 %
Red	-0.2 %	0.3 %
Green	-0.1 %	0.5 %
Blue	0.0 %	0.0 %

Since SDXL_anime_natural_vae was fine-tuned while monitoring MAE and SSIM, its numerical precision is significantly improved compared to the original.

Because it causes minimal deviation from the original image, it is expected to improve image quality especially in workflows that pass the image through the VAE repeatedly, such as Hires.fix or Detailer.

SDXL_anime_clear_vae

Also available on the same page is SDXL_anime_clear_vae, which darkens the overall illustration and boosts contrast to emphasize a heavy, rich atmosphere.

Image2	SDXL_anime_clear_vae	SDXL_0.9_vae
MAE_similarity	99.1 %	99.0 %
SSIM_similarity	99.9 %	99.6 %
Red	-0.4 %	0.3 %
Green	-0.3 %	0.5 %
Blue	-0.2 %	0.0 %

This version has a stronger “flavor” compared to SDXL_anime_natural_vae, so it may not suit every illustration. However, if you try it and find it matches your desired expression, it’s worth using.

Does Image Quality Differ Between FP32 and BF16 Formats?

All verifications so far were performed using the highest-precision FP32 format of the VAE.

In actual image generation, the FP16 / BF16 formats are used most frequently, so as a final check I compared image quality between FP32 and BF16.

sdxl_anime_natural_vae_BF16

Workflow for measuring image quality difference between VAE data formats (FP32 vs BF16)

Image2	SDXL_anime_natural_vae_FP32	SDXL_anime_natural_vae_BF16
MAE_similarity	99.40961 %	99.35945 %
SSIM_similarity	99.92685 %	99.91910 %

The FP32 format is slightly superior in image quality to BF16: the gap is about 0.05 points in MAE similarity and less than 0.01 points in SSIM similarity.
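The small gap is consistent with how BF16 stores numbers: it keeps only the top 16 bits of an FP32 value, leaving 8 significant bits. The sketch below emulates that rounding in NumPy to show the size of the per-value error. It only illustrates the number format and does not model how ComfyUI actually casts or runs the VAE; `round_to_bf16` is a hypothetical helper.

```python
import numpy as np


def round_to_bf16(values):
    """Round float32 values to the nearest BF16 value (ties to even), returned as float32."""
    bits = np.asarray(values, dtype=np.float32).view(np.uint32)
    # BF16 keeps the top 16 bits of a float32; the bias makes the truncation round to nearest even.
    bias = ((bits >> np.uint32(16)) & np.uint32(1)) + np.uint32(0x7FFF)
    rounded = (bits + bias) & np.uint32(0xFFFF0000)
    return rounded.view(np.float32)


# Stand-in values; real VAE weights are not loaded here.
weights = np.linspace(0.01, 1.0, 1000, dtype=np.float32)
relative_error = np.abs(round_to_bf16(weights) - weights) / weights
print(f"max relative error: {relative_error.max():.5f}")
```

Each individual value is off by at most 2^-8 (about 0.4 %) in relative terms, so a small but measurable difference in the decoded image is plausible.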

If you are pursuing the highest possible image quality in illustration generation, it may be worth trying the FP32 version of the VAE.

Screenshot showing where to check the model’s Precision (F32) in the file details — If Precision is F32, it is the FP32 format model

When using an FP32 VAE in ComfyUI, download the FP32 model and launch ComfyUI with the --fp32-vae argument.

What Makes Flux.2_vae So Good?

Finally, I’d like to introduce an article by Shiba*2 that explains Flux.2’s VAE in detail.

Learning about FLUX.2’s VAE again｜shiba*2

In this verification, I compared against the original images, so the difference between Flux.1_vae and Flux.2_vae was not very large.

Flux.2_vae is a next-generation VAE that understands the “meaning” contained in images, so its performance may improve further with different verification methods or future enhancements.

Summary

SDXL’s VAE tends to shift toward green
Flux’s VAE offers high precision
Introduction of SDXL_anime_natural_vae

This time, I investigated VAE.

Unlike the UNet/Transformer or the text encoders, it is hard to see how a VAE affects illustrations, but the Image Difference Checker node allowed me to compare VAE precision objectively.

Anime illustration of a silver-haired girl in a white uniform with clear eyes looking at the viewer — Try SDXL_anime_natural_vae!

The SDXL_anime_natural_vae introduced this time improves image quality simply by swapping out the regular VAE, making it a highly recommended VAE for everyone.

I will continue exploring AI illustration quality from various angles.

Thank you for reading until the end!