HiDream-O1-Image_clear_v1: Easy Local Image Editing with UiT Architecture

2026-5-29 / 2026-6-14

HiDream-O1-Image_clear_v1_anime_LAB_adjust_00054_.png (1600×1600)

HiDream-O1-Image_clear delivers clear and vibrant outputs
Supports Image-to-Image and various image editing tasks
MXFP8 is a format exclusive to the RTX 5000 series

Introduction

Hello, I'm Easygoing.

I've released HiDream-O1-Image_clear_v1, a model that makes image editing easy to do locally, so let me introduce it in this post.

HiDream-O1-Image_clear_v1 - BF16 | HiDream-O1 Checkpoint | Civitai

easygoing0114/HiDream-O1-Image_clear · Hugging Face

Anime-style woman generated with HiDream-O1-Image_clear_v1. Long black hair with a mature atmosphere — HiDream-O1-Image_clear_v1

HiDream-O1-Image is a versatile image generation and editing model

HiDream-O1-Image is a versatile AI-powered image generation and editing model.

Goodbye Latent Space? How HiDream-O1-Image is Revolutionizing General-Purpose AI Drawing | AI Image Journey

HiDream-O1-Image adopts the UiT architecture, which handles language and images in the same representation space. Thanks to this, it offers strong image editing capabilities despite being a lightweight model.

Since the HiDream-O1-Image model is a newly released base model, its outputs tend to be slightly blurry. This time, I fine-tuned it to produce clearer illustrations.

Real illustration examples!

Let’s compare some actual illustrations.

The left side shows the adjusted HiDream-O1-Image_clear_v1, and the right side shows the output from the original HiDream-O1-Image.

Neon Sign

HiDream-O1-Image_clear_v1_original_compare_anime_close_up.png (1600×797)

Paris at Dusk

HiDream-O1-Image_clear_v1: photorealistic illustration of a woman in a black hat against a Paris dusk background.

HiDream-O1-Image (original): photorealistic illustration of a woman in a black hat against a Paris dusk background.

HiDream-O1-Image_clear_v1_original_compare_photoreal_close_up

You can see that the left side (HiDream-O1-Image_clear_v1) produces higher contrast and clearer illustrations compared to the right side (original HiDream-O1-Image).

On the other hand, since HiDream-O1-Image_clear_v1 was mainly fine-tuned on anime illustrations, photorealistic images may sometimes come out with too much contrast. In such cases, try lowering the CFG scale or noise_level.

Let’s try image editing!

Now, let’s actually try image editing using the HiDream-O1-Image_clear_v1 model.

In these examples, the source images were also generated with HiDream-O1-Image_clear.

I’ve also attached the ComfyUI workflows I actually used.

Convert to Anime Illustration

Photo of a woman with long black hair in a blue dress

ComfyUI workflow for converting a photorealistic image to anime illustration using image-to-image

change to anime illustration

HiDream-O1-Image_clear_v1_photoreal_to_anime_20260529.json

Using as Refiner to Redraw and Enhance Details

Photorealistic illustration of a woman in an amphitheater

ComfyUI workflow using HiDream-O1-Image_clear_v1 as a refiner to enhance details

HiDream-O1-Image_clear_v1_i2i_refiner_20260529.json

Costume & Background Transformation

Unimpressive middle-aged Japanese man wearing glasses

Man smiling widely while holding a long sword and wearing plate armor

ComfyUI workflow for changing costume and background

A full-length portrait of a man in a fantasy world, holding a long sword and wearing plate armor, beaming with excitement as he looks forward to his upcoming adventures. The background features a medieval village with a wheat field in the distance.

HiDream-O1-Image_clear_v1_cloth_background_change_20260529.json

Convert to Black and White Manga

Photorealistic illustration of a supercar on a night coast

Converted to black and white manga style

ComfyUI workflow for converting to black and white manga style

change to black and white manga artwork

HiDream-O1-Image_clear_v1_i2i_manga_artwork_20260529.json

Combining Two Images

Photorealistic illustration of a cat curled up sleeping on a blanket

Photorealistic illustration of a living room with a fireplace and carpet

Photorealistic illustration of a cat sleeping on a carpet

ComfyUI workflow for combining two images

draw image1 cat on image2 carpet

HiDream-O1-Image_clear_v1_two_images_combine_20260529.json

ComfyUI-uit-hidream-o1 Custom Node

For these workflows, I used the ComfyUI-uit-hidream-o1 custom node.

easygoing0114/ComfyUI-uit-hidream-o1: Single sampling node (UIT Sampler) for HiDream-O1-Image.

Screenshot of the Nodes Manager search screen

UIT Sampler Node

Inputs

model: Input the model
clip: Dummy input (required for ComfyUI connection)
vae: Dummy input (required for ComfyUI connection)
input_image: Use an input image instead of initial white noise
reference_image: Directly converts the input image into tokens (can be used as a replacement for or in combination with text prompts)

Settings

width, height: Output resolution. HiDream-O1’s default is 2048 x 2048
- When an input_image is provided, the node resizes it to 4 megapixels and uses that resolution. In that case, width and height are ignored.
noise_scale: Strength of the noise
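
To get a feel for what resolution an input_image ends up at, here is a minimal sketch of a resize to 4 megapixels. The node's actual resizing code is not shown in this article, so three things are my assumptions: that "4 megapixels" means the area of the 2048 x 2048 default, that the aspect ratio is kept, and that sides are rounded to a multiple of 16 (the `multiple` parameter is hypothetical).

```python
import math

# Assumption: "4 megapixels" means the area of the 2048 x 2048 default.
TARGET_PIXELS = 2048 * 2048


def fit_to_target_area(width, height, target_pixels=TARGET_PIXELS, multiple=16):
    """Scale (width, height) to roughly target_pixels, keeping the aspect ratio.

    `multiple` is a hypothetical alignment value, not taken from the node.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Scaling both sides by sqrt(area ratio) scales the area by the ratio itself.
    scale = math.sqrt(target_pixels / (width * height))
    new_w = max(multiple, round(width * scale / multiple) * multiple)
    new_h = max(multiple, round(height * scale / multiple) * multiple)
    return new_w, new_h


print(fit_to_target_area(1600, 1600))  # square input
print(fit_to_target_area(1600, 797))   # wide input, like the comparison images
```

Under these assumptions a square input lands on 2048 x 2048, and a wide input keeps its shape while staying close to the same pixel count.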

More than half of the official workflow consists of dummy nodes

HiDream-O1-Image has an official workflow published by Comfy Org, but it uses many dummy nodes for components that the model doesn’t actually have. I don't recommend it if you want to understand the UiT architecture.

Screenshot of the official Comfy Org HiDream-O1-Image workflow: more than half of it consists of dummy nodes.
The UiT architecture has no CLIP, VAE, external conditioning, or latent space.

ComfyUI appeared after Stable Diffusion 1 and is specialized for inference with CLIP and VAE. It is therefore somewhat inevitable that implementing the simpler UiT architecture in it ends up relatively complex.

FP8_scaled vs MXFP8

On the release page for HiDream-O1-Image_clear_v1, I have published two types of high-precision FP8 format models: FP8_scaled and MXFP8.

Screenshot of the model list in the Hugging Face HiDream-O1-Image_clear repository

FP8_scaled: Improved precision, runs fast on RTX 4000 series
MXFP8: Further improved precision, runs fast on RTX 5000 series

FP8_scaled is a format that rescales the weights so they use the full range of the available bits, which suppresses precision loss.

MXFP8 goes further and scales the data separately in blocks of 32 values, which limits the influence of an outlier to its own block and improves precision even more. However, hardware support is currently limited to NVIDIA’s RTX 5000 series GPUs.
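
The difference between the two scaling schemes can be illustrated with a small pure-Python simulation. This is only a sketch under my own assumptions, not the actual conversion code used for the released models: I assume FP8_scaled uses one scale per tensor, I simulate FP8 E4M3 rounding in ordinary floats, and I pick each block scale as a power of two large enough to avoid clipping. The outlier is deliberately exaggerated so the effect is easy to see.

```python
import math
import random

E4M3_MAX = 448.0  # largest finite value of the FP8 E4M3 format


def round_to_e4m3(x):
    """Simulate FP8 E4M3 rounding: 3 mantissa bits, minimum exponent -6."""
    x = max(-E4M3_MAX, min(E4M3_MAX, x))
    exp = max(math.frexp(abs(x))[1] - 1, -6)  # floor(log2|x|), floored at -6
    step = 2.0 ** (exp - 3)                   # spacing of neighbouring FP8 values
    return round(x / step) * step


def quantize_per_tensor(values):
    """Assumed FP8_scaled-style scheme: one scale for the whole tensor."""
    amax = max(abs(v) for v in values) or 1.0
    scale = amax / E4M3_MAX
    return [round_to_e4m3(v / scale) * scale for v in values]


def quantize_per_block(values, block=32):
    """MXFP8-style scheme: one power-of-two scale per block of 32 values."""
    out = []
    for start in range(0, len(values), block):
        chunk = values[start:start + block]
        amax = max(abs(v) for v in chunk) or 1.0
        scale = 2.0 ** math.ceil(math.log2(amax / E4M3_MAX))
        out.extend(round_to_e4m3(v / scale) * scale for v in chunk)
    return out


def mean_abs_error(quantized, original):
    # Skip index 0 so the outlier itself does not dominate the average.
    pairs = list(zip(quantized, original))[1:]
    return sum(abs(q - o) for q, o in pairs) / len(pairs)


random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]
weights[0] = 1000.0  # a single, deliberately exaggerated outlier

err_tensor = mean_abs_error(quantize_per_tensor(weights), weights)
err_block = mean_abs_error(quantize_per_block(weights), weights)
print(f"per-tensor scale: mean abs error = {err_tensor:.2e}")
print(f"per-block scale:  mean abs error = {err_block:.2e}")
```

With a single scale, the outlier pushes all ordinary weights toward the bottom of the FP8 range. With block scaling, only the 31 values that share a block with the outlier are affected.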

gantt
    title GPU Series Roadmap
    dateFormat YYYY-MM-DD
    tickInterval 12month
    axisFormat %Y
    section NVIDIA
        GTX 1000  : 2016-05-27, 2026-05-30
        RTX 2000  : 2018-09-20, 2026-05-30
        RTX 3000  : 2020-09-17, 2026-05-30
        RTX 4000  : 2022-10-12, 2026-05-30
        RTX 5000  : 2025-01-30, 2026-05-30
    section AMD
        RX 5000   : 2019-07-07, 2026-05-30
        RX 6000   : 2020-11-18, 2026-05-30
        RX 7000   : 2022-12-13, 2026-05-30
        RX 9000   : 2025-03-06, 2026-05-30
    section Intel
        Arc A     : 2022-03-30, 2026-05-30
        Arc B     : 2024-12-13, 2026-05-30

GPU series	FP32	FP16	BF16	FP8	MXFP8	FP4
NVIDIA
RTX 5000 (Blackwell)	✅	✅	✅	✅	✅	✅
RTX 4000 (Ada Lovelace)	✅	✅	✅	✅	❌	❌
RTX 3000 (Ampere)	✅	✅	✅	❌	❌	❌
RTX 2000 (Turing)	✅	✅	❌	❌	❌	❌
GTX 1000 (Pascal)	✅	⚠️	❌	❌	❌	❌
AMD
RX 9000 (RDNA4)	✅	✅	✅	✅	❌	❌
RX 7000 (RDNA3)	✅	✅	✅	❌	❌	❌
RX 6000 (RDNA2)	✅	✅	❌	❌	❌	❌
RX 5000 (RDNA1)	✅	✅	❌	❌	❌	❌
Intel
Arc B (Battlemage)	✅	✅	✅	❌	❌	❌
Arc A (Alchemist)	✅	✅	✅	❌	❌	❌

Both FP8_scaled and MXFP8 will fall back to FP16 / BF16 format on unsupported GPUs, resulting in slower processing.
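
For NVIDIA GPUs, the table above can be turned into a small lookup based on the CUDA compute capability. This is a rough sketch: the function name `best_native_format` is hypothetical, the thresholds are NVIDIA's published compute capabilities (7.5 for Turing, 8.6 for RTX 3000, 8.9 for RTX 4000, 12.0 for RTX 5000), and it only tells you which format the table marks as natively supported, not which one is fastest in your environment.

```python
def best_native_format(capability):
    """Return the most compact format the table marks as native for an NVIDIA GPU.

    `capability` is a CUDA compute capability tuple such as (8, 9).
    Hypothetical helper for illustration; AMD and Intel GPUs are not covered.
    """
    if capability >= (12, 0):   # RTX 5000 (Blackwell)
        return "MXFP8"
    if capability >= (8, 9):    # RTX 4000 (Ada Lovelace)
        return "FP8_scaled"
    if capability >= (8, 0):    # RTX 3000 (Ampere)
        return "BF16"
    if capability >= (7, 0):    # RTX 2000 (Turing)
        return "FP16"
    return "FP32"               # GTX 1000 (Pascal): FP16 is only marked with a warning


for name, cap in [("RTX 5000", (12, 0)), ("RTX 4000", (8, 9)), ("RTX 3000", (8, 6))]:
    print(name, "->", best_native_format(cap))
```

If PyTorch is installed, `torch.cuda.get_device_capability()` returns this tuple for the current GPU. Treat the result only as a starting point and compare actual generation times.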

The model format and processing time vary greatly depending on your environment, so please try different models to find the best one for you.

Anime illustration of a woman with long black hair and a mature atmosphere indoors — The best format varies depending on the environment

Summary: Try HiDream-O1-Image_clear_v1!

HiDream-O1-Image_clear delivers clear and vibrant outputs
Supports Image-to-Image and various image editing tasks
MXFP8 is a format exclusive to the RTX 5000 series

This time, I introduced the HiDream-O1-Image_clear_v1 model.

Previously, when I fine-tuned base models such as Z-Image-Base and Flux [klein] 4B-Base, I had to consider VAE post-processing, and color adjustment was often particularly difficult.