v-prediction is Fast! Vivid! But Doesn’t Blend Well?

an animated female character with red hair and blue eyes sits at a desk in a classroom surrounded by books and a chalkboard with japanese characters wearing a dark blue blazer and white shirt close up
  • v-pred is fast
  • Colors are vivid
  • Expressions are stiff

Introduction

Hello, I’m Easygoing.

This time, we’ll take a clear and simple look at v-prediction, a term you might occasionally hear about in the context of Stable Diffusion.

noob_v_pencil-XL_sample_image
noob_v_pencil-v2.0.1

Starting with e-pred!

The core mechanism of modern image generation AI is called the diffusion model, which generates images by removing noise from a noisy starting point.


gantt
title e-pred and v-pred
dateFormat YYYY-MM-DD
tickInterval 6month
section e-pred
Stable Diffusion 1 :done, a1, 2022-08-22, 2025-05-01
section v-pred
Stable Diffusion 2.0 : b1, 2022-11-24, 2025-05-01
section e-pred + v-pred
Stable Diffusion XL 1.0 :done, c2, 2023-07-27, 2025-05-01
section Flow-matching         
Stable Diffusion 3 : d1, 2024-06-12, 2025-05-01
AuraFlow : d2, 2024-07-12, 2025-05-01
Flux.1   : d3, 2024-08-01, 2025-05-01
HiDream-I1   : d4, 2025-04-06, 2025-05-01

The first practical open-source image generation AI was Stable Diffusion 1.

In Stable Diffusion 1, a standard method called e-pred (noise prediction) was used to remove noise.
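To make "noise prediction" concrete, here is a minimal toy DDIM-style denoising loop in NumPy. The cosine schedule and the stand-in "model" (which predicts the added noise perfectly) are my own illustrative assumptions, not the actual Stable Diffusion 1 implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.uniform(-1, 1, size=(8, 8))       # the "clean image"
eps = rng.standard_normal((8, 8))          # the noise that was added

# toy variance-preserving schedule: alpha**2 + sigma**2 == 1
def alpha_sigma(t):                        # t in [0, 1]; t = 1 is pure noise
    a = np.cos(t * np.pi / 2)
    return a, np.sqrt(1.0 - a * a)

def perfect_eps_model(z, t):               # stand-in for the trained U-Net
    return eps                             # pretends to predict the noise exactly

# deterministic sampling from a high noise level down to zero
ts = np.linspace(0.99, 0.0, 50)
a, s = alpha_sigma(ts[0])
z = a * x0 + s * eps                       # noisy starting point
for t_cur, t_next in zip(ts[:-1], ts[1:]):
    a_cur, s_cur = alpha_sigma(t_cur)
    e_hat = perfect_eps_model(z, t_cur)
    x0_hat = (z - s_cur * e_hat) / a_cur   # e-pred: infer the clean image
    a_nxt, s_nxt = alpha_sigma(t_next)
    z = a_nxt * x0_hat + s_nxt * e_hat     # re-noise to the next, lower level
```

With a perfect noise predictor the loop walks back to the clean image; a real model only approximates `e_hat`, which is why sampling takes many steps.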

Thinking of Noise and Images Separately!


In February 2022, a new approach was proposed to speed up image generation: instead of predicting only the noise, the model treats a noisy image as a combination of a clean image and noise and predicts both parts at once, removing the noise component more efficiently.


flowchart LR
A1(Noisy Image)
subgraph e-pred
B1(Capture Noisy Image)
B2(Remove Noise)
end
subgraph v-pred
C1(Capture as Clean Image + Noise)
C2(Efficiently Remove Noise Component)
end
D1(Next Image)
A1-->B1
B1-->B2
B2-->D1
A1-->C1
C1-->C2
C2-->D1

  • e-pred (epsilon-prediction): Noise prediction
  • v-pred (velocity-prediction): Velocity prediction

This method learns the “flow velocity” from noise to a clean image, earning the name v-pred (velocity-prediction).

v-pred converges faster, allowing images to be generated in roughly half the number of steps compared to traditional methods.
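In formula form (using the common variance-preserving notation, which is my assumption rather than something stated in the article): with a noisy latent z = α·x₀ + σ·ε, the training target is the velocity v = α·ε − σ·x₀. A quick NumPy check that both the clean image and the noise fall out of a v prediction in closed form:

```python
import numpy as np

rng = np.random.default_rng(1)
x0 = rng.uniform(-1, 1, size=(4, 4))   # "clean image"
eps = rng.standard_normal((4, 4))      # noise
a, s = 0.8, 0.6                        # alpha, sigma with a**2 + s**2 == 1

z = a * x0 + s * eps                   # noisy latent
v = a * eps - s * x0                   # v-prediction training target

# Given a (here perfect) v prediction, both components are recovered at once:
x0_hat = a * z - s * v
eps_hat = s * z + a * v
```

Because one v prediction yields both the clean image and the noise, each sampling step extracts more information than a pure noise prediction, which is what enables the faster convergence.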

v-pred Models Produce Vivid Colors

When an AI learns from images, dark colors have low RGB values, which makes them hard to tell apart from the added noise during training.

As a result, traditional e-pred models struggled to render dark colors and high-contrast scenes.

noob_v_pencil-Xl_colorful_sample_image2

In contrast, v-pred models theoretically learn from less noisy images, making them better at rendering dark colors compared to e-pred models.

v-pred models enable higher contrast and more vivid color rendering than e-pred models.
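One way to see the dark-color problem is the signal-to-noise ratio of the noise schedule. With an SD-style "scaled linear" schedule (the values below are the widely used SD 1.x defaults, quoted from memory as an assumption), the final timestep still has a small but nonzero SNR, so even "pure noise" leaks a hint of the average image brightness, and an e-pred model never learns to reach truly dark or truly bright images. v-pred makes a zero-terminal-SNR schedule possible, which removes this leak:

```python
import numpy as np

# "Scaled linear" beta schedule (SD 1.x-style defaults; an assumption here)
betas = np.linspace(0.00085 ** 0.5, 0.012 ** 0.5, 1000) ** 2
alphas_cumprod = np.cumprod(1.0 - betas)

# SNR(t) = signal power / noise power at each timestep
snr = alphas_cumprod / (1.0 - alphas_cumprod)
print(f"terminal SNR: {snr[-1]:.5f}")  # small, but not zero
```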

v-pred Models Don’t Blend Well!

Image generation AI can achieve various expressions by controlling noise.

Example 1: Increasing detail with noise methods

Example 2: Advanced noise control with Detail Daemon

an animated character with blue hair and a halo - like circle on her head stands in a technologically advanced room filled with computer screens displaying various images surrounded by a predominantly
Adding detail through intermediate noise enhancement

e-pred models can blend images by adding noise, making them adaptable to various images.

However, v-pred models separate noise and clean images even when noise is added midway, making them less likely to blend.

v-pred models struggle with expressions that require model switching, and when used with high-resolution techniques like Hires.fix or Detailer, they tend to produce stiff expressions that don’t blend well with the original image.
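The "adding noise midway" step, which underlies img2img, Hires.fix, and Detailer, can be sketched as re-noising a finished latent to an intermediate level. This toy NumPy function (the cosine schedule is my own stand-in) shows why blending happens with e-pred: the old image and the fresh noise become one indistinguishable mixture that the model then re-interprets, whereas a v-pred model is trained to pull the two components apart again:

```python
import numpy as np

def to_img2img_start(latent, strength, rng):
    """Re-noise a finished latent: strength 0 keeps it, strength 1 destroys it."""
    a = np.cos(strength * np.pi / 2)            # toy alpha/sigma schedule
    s = np.sqrt(1.0 - a * a)
    return a * latent + s * rng.standard_normal(latent.shape)

rng = np.random.default_rng(2)
latent = rng.uniform(-1, 1, size=(8, 8))
start = to_img2img_start(latent, strength=0.4, rng=rng)  # partial re-noise
```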

SDXL Returns to e-pred!

Let’s revisit the first chart.


gantt
title e-pred and v-pred
dateFormat YYYY-MM-DD
tickInterval 6month
section e-pred
Stable Diffusion 1 :done, a1, 2022-08-22, 2025-05-01
section v-pred
Stable Diffusion 2.0 : b1, 2022-11-24, 2025-05-01
section e-pred + v-pred
Stable Diffusion XL 1.0 :done, c2, 2023-07-27, 2025-05-01
section Flow-matching         
Stable Diffusion 3 : d1, 2024-06-12, 2025-05-01
AuraFlow : d2, 2024-07-12, 2025-05-01
Flux.1   : d3, 2024-08-01, 2025-05-01
HiDream-I1   : d4, 2025-04-06, 2025-05-01

  • Stable Diffusion 1: e-pred
  • Stable Diffusion 2: v-pred
  • Stable Diffusion XL: e-pred (+ v-pred)

The first Stable Diffusion 1 used the standard e-pred model.

Stable Diffusion 2, released in November 2022, switched to v-pred, but it never caught on, partly because it broke compatibility with the well-established Stable Diffusion 1 ecosystem.

Stable Diffusion XL, released in July 2023, returned to e-pred, addressing e-pred’s weaknesses in detail and contrast by using a dedicated Refiner model.

Anime illustration of a woman with brown hair at night, smiling with a slightly languid expression (side-by-side comparison)
Left: Animagine-XL 3.1 | Right: Animagine-XL 3.1 + Refiner

However, the Refiner was difficult to fine-tune and slowed down illustration generation, so it didn’t gain widespread use. Instead, adjusting colors using the base model with CFG scale or extensions became the mainstream approach.
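For reference, the CFG-scale adjustment mentioned here is just a linear extrapolation between the unconditional and prompt-conditioned noise predictions; a minimal sketch:

```python
import numpy as np

def cfg_combine(eps_uncond, eps_cond, scale):
    """Classifier-free guidance: scale > 1 pushes the prediction past the
    prompt-conditioned one, which also tends to boost contrast and saturation."""
    return eps_uncond + scale * (eps_cond - eps_uncond)

u = np.zeros(3)            # unconditional prediction (toy values)
c = np.ones(3)             # prompt-conditioned prediction (toy values)
guided = cfg_combine(u, c, 7.0)   # extrapolates beyond the conditioned result
```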

Back to v-prediction!

The vivid colors and stiff expressions of v-pred models aren’t a significant issue for anime-style illustrations.

In December 2024, the NoobAI-XL series, trained on a total of 13 million high-quality illustrations, introduced v-pred models, bringing them back into use with SDXL.

noob_v_pencil-Xl_colorful_sample_image
noob_v_pencil-v2.0.1: Vivid color rendering

Today, thanks to the trial-and-error efforts of model creators, merged models that combine e-pred and v-pred have emerged (even though such merges are theoretically tricky), improving usability.

Comparing e-pred and v-pred!

Now, let’s compare illustrations generated by e-pred and v-pred models.


flowchart TB
subgraph e-pred
A1(SDXL_Base)
B1(Animagine-XL 3.1<br>2024.3.21)
B2(Illustrious-XL_v0.1<br>2024.9.25)
B3(Illustrious-XL_v1.1<br>2025.2.18)
B4(NoobAI-XL<br>2024.10.8)
C1([blue_pencil-XL_v7.0.0<br>2024.6.23])
C2([anima_pencil-XL_v5.0.0<br>2024.6.25])
C3([illustrious_pencil-XL_v4.0.0<br>2025.3.29])
end
subgraph v-pred
B5(NoobAI-XL_V-pred-1.0<br>2024.12.22)
C4([noob_v_pencil-XL-v2.0.1<br>2025.4.7])
end
A1-->B1
A1-->C1
A1---->B2
B1-->C2
B2--->B3
B3-->C3
B2-->B4
B4---->B5
C1-->C2
C2-->C3
B5-->C4
C3-->C4

Models for Comparison

  • anima_pencil-XL-v5.0.0 (e-pred)
  • noob_v_pencil-XL-v2.0.1 (v-pred)

We’ll input the same prompt into each model and compare the generated illustrations.

Illustration 1: Space Travel

anima_pencil-v500-clip32-vae32_218 noov_v_pencil-v500-clip32-vae32_227
Left: anima_pencil-XL-v5.0.0 (e-pred) | Right: noob_v_pencil-XL-v2.0.1 (v-pred)

Illustration 2: Agent

Rough_anima_pencil-v500-clip32-vae32_303 noov_v_pencil-v500-clip32-vae32_301

Both models are merged from blue_pencil-XL, but the generated illustrations differ significantly.

At a glance, the color and detail rendering of the v-pred model on the right is superior.

However, soft expressions, such as characters' facial expressions, come out better in the e-pred model on the left. A common drawback of v-pred is a tendency to drift toward warm tones, leaving colors that look burned in.

Mixing It All Together!

Let’s design a workflow that combines the strengths of both models.

SDXL-Refiner-SDXL-Detailer_2025.4.30
Workflow for inputting prompts as images

flowchart LR
subgraph noob_v_pencil-XL
A1(Sketch)
end
subgraph Refiner
B1(Break Down Once)
end
subgraph anima_pencil-XL
C1(Redraw and Finalize)
end
A1-->B1
B1-->C1

noob_v_pencil-XL_60
Sketching with v-pred

First, create a sketch using the vivid colors of the v-pred model.

refiner_38
Breaking down with Refiner

Next, use the Refiner to break down the illustration, enhancing saturation and detail.

A girl with red hair sitting at a desk in a library using a laptop_6
Natural finishing with e-pred

Finally, regenerate the image with the e-pred model to soften the overall look and naturally refine character expressions.

This creates a soft, colorful illustration distinct from simply increasing the CFG scale with a v-pred model!
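Structurally, the three-stage workflow is just a chain of img2img passes with decreasing denoise strength. This sketch uses stand-in functions for the actual models; the names and strengths are illustrative, not the workflow's exact settings:

```python
import numpy as np

rng = np.random.default_rng(3)

def fake_denoise(latent, model_name):
    """Stand-in for a real diffusion model pass (v-pred, Refiner, or e-pred)."""
    return latent  # a real model would denoise the latent here

def renoise(latent, strength):
    """Re-noise a latent to an intermediate level (toy cosine schedule)."""
    a = np.cos(strength * np.pi / 2)
    return a * latent + np.sqrt(1.0 - a * a) * rng.standard_normal(latent.shape)

# Stage 1: sketch from pure noise with the v-pred model
latent = fake_denoise(rng.standard_normal((8, 8)), "noob_v_pencil-XL")
# Stage 2: break the sketch down and redraw with the Refiner (high strength)
latent = fake_denoise(renoise(latent, 0.6), "Refiner")
# Stage 3: light final pass with the e-pred model to soften expressions
latent = fake_denoise(renoise(latent, 0.3), "anima_pencil-XL")
```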

Conclusion: Try Using v-pred!

  • v-pred is fast
  • Colors are vivid
  • Expressions are stiff

v-pred models offer vivid colors and details that traditional e-pred models couldn’t achieve.

v-pred is particularly well suited to anime models, and development is progressing in the hugely popular Illustrious-XL series.

an animated female character with red hair and blue eyes sits at a desk in a classroom wearing a blue blazer white shirt and red bowtie surrounded by books and a chalkboard with japanese characters

However, v-pred has its weaknesses, and e-pred excels in diversity and soft expressions.

Moving forward, I hope to combine the strengths of both to explore unique expressions.

Thank you for reading to the end!


Models

anima_pencil-XL-v5.0.0

noob_v_pencil-XL-v2.0.1