Image-to-Image: Definition & Meaning — AI Wiki

Generating a new image from an existing image plus a text prompt. Instead of starting from pure noise (as text-to-image does), the diffusion process starts from a noised version of the input image, preserving its structure while modifying it according to the prompt. "A cyberpunk version of this photo" keeps the composition but transforms the style and details.

Why It Matters

Image-to-image is the bridge between photography and AI art. It lets you use sketches, photos, or existing artwork as a starting point, keeping the layout and composition while the AI transforms the style, adds detail, or reimagines the content. It is more controllable than text-to-image because you guide the output with visual structure, not just words.

Deep Dive

The mechanism: encode the input image into latent space with the VAE encoder, add noise in proportion to a "denoising strength" parameter (0.0 leaves the image unchanged; 1.0 replaces it with pure noise, which is equivalent to text-to-image), then denoise conditioned on the text prompt. At strength 0.3, the output closely resembles the input with subtle modifications. At strength 0.8, it is largely reimagined but keeps the basic composition.
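
The noising step can be illustrated with a toy sketch. It assumes a made-up schedule in which the signal is weighted by sqrt(1 - strength) and the noise by sqrt(strength); real samplers map strength onto their own noise schedule, so the exact weights will differ. The name `add_noise` and the four-element "latent" are hypothetical stand-ins, and the denoising half is omitted because it needs a trained model.

```python
import math
import random


def add_noise(latent, strength, rng):
    """Blend a latent vector with Gaussian noise.

    Toy schedule (an assumption, not a real sampler's schedule):
    signal weight sqrt(1 - strength), noise weight sqrt(strength).
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0.0, 1.0]")
    signal_scale = math.sqrt(1.0 - strength)
    noise_scale = math.sqrt(strength)
    return [signal_scale * x + noise_scale * rng.gauss(0.0, 1.0) for x in latent]


rng = random.Random(0)
latent = [0.5, -1.0, 2.0, 0.0]  # stand-in for a VAE-encoded image

unchanged = add_noise(latent, 0.0, rng)  # strength 0.0: identical to the input
light = add_noise(latent, 0.3, rng)      # mostly signal, a little noise
heavy = add_noise(latent, 1.0, rng)      # strength 1.0: pure noise, input is gone
```

At strength 1.0 the signal weight is zero, so the result no longer depends on the input at all, which is why that setting behaves like text-to-image.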

Denoising Strength

The denoising strength is the key parameter: it controls how much the output can deviate from the input. Low strength (0.2–0.4): minor style changes, color adjustments, subtle detail additions. Medium strength (0.5–0.7): significant style transformation while preserving composition. High strength (0.8–1.0): major reimagining, only vague structural similarity to the input. Finding the right strength for your use case requires experimentation.
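
One common way to implement strength, assumed here rather than stated by the text above, is to skip the early part of the sampling schedule: with N scheduled steps, only about N × strength of them actually run. The helper below is a hypothetical sketch of that rule, modeled on the convention used by popular open-source img2img pipelines; the exact rounding in any given tool may differ.

```python
def denoising_steps(num_inference_steps, strength):
    """Return how many denoising steps run for a given strength.

    Assumed rule: truncate num_inference_steps * strength to an integer,
    capped at num_inference_steps.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0.0, 1.0]")
    return min(int(num_inference_steps * strength), num_inference_steps)


for strength in (0.0, 0.5, 1.0):
    print(strength, denoising_steps(50, strength))
# prints:
# 0.0 0
# 0.5 25
# 1.0 50
```

Under this rule, low strength is also faster: fewer steps run, so less of the image can change and less compute is spent.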

Sketch-to-Image

A powerful img2img workflow: draw a rough sketch (even in MS Paint), use it as the input image with medium-high denoising strength, and describe the desired output. The sketch provides spatial layout (where objects are, their relative sizes) while the AI fills in all the artistic detail. This makes AI image generation accessible to anyone who can draw a stick figure — the composition comes from you, the rendering from the AI.
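
As a rough sketch of this workflow in code, the function below uses the Hugging Face diffusers library and Pillow. Both are assumptions: the text does not name a toolkit. The function name `sketch_to_image`, the default strength of 0.75, and the `model_id` argument are illustrative choices; the caller must supply a checkpoint that supports image-to-image.

```python
def sketch_to_image(sketch_path, prompt, model_id, strength=0.75):
    """Render a rough sketch into a finished image guided by a text prompt.

    model_id is supplied by the caller (no particular checkpoint is assumed).
    strength=0.75 is an illustrative medium-high value, not a recommendation.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0.0, 1.0]")

    # Imported lazily so the heavy dependencies load only when actually used.
    from PIL import Image
    from diffusers import AutoPipelineForImage2Image

    pipe = AutoPipelineForImage2Image.from_pretrained(model_id)
    sketch = Image.open(sketch_path).convert("RGB")  # the sketch supplies the layout
    result = pipe(prompt=prompt, image=sketch, strength=strength)
    return result.images[0]  # a PIL image with the rendered output
```

A call might look like `sketch_to_image("sketch.png", "a watercolor landscape with a lone tree", model_id=...)`, with the file name and prompt replaced by your own. Lowering `strength` keeps more of the sketch's rough lines, while raising it gives the model more freedom.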
