Flux

FLUX is a family of image generation and editing models from Black Forest Labs built on a rectified-flow architecture in latent space. The FLUX.1 Kontext variant accepts both text and image inputs to perform single-pass in-context image edits.


What It Is

Image editing used to mean either masking out a region by hand and asking a diffusion model to repaint it, or running multi-step inversion routines that often shifted colors and identity in ways no creator wanted. FLUX is Black Forest Labs’ answer to that pipeline pain. The base FLUX.1 line generates images from text, while the Kontext variant handles editing — accepting both a reference image and a text instruction, then producing the edited result in one shot.

Underneath, FLUX uses flow matching rather than the noise-prediction approach of traditional diffusion. Flow matching learns a direct path from random noise to a real image, which lets the model produce results in fewer denoising steps without losing fidelity. According to the FLUX.1 Kontext paper, the Kontext variant extends this formulation by treating the reference image as part of the conditioning signal alongside the text prompt, so the edit happens inside a single forward pass instead of an inversion-then-regeneration loop. That single-pass design is what makes Kontext directly comparable to GPT-Image and Qwen-Image-Edit rather than to the older mask-and-repaint workflow.
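The single-pass claim follows from the geometry: rectified flows learn velocity fields along straight paths from noise to data, and a straight path can be integrated in very few Euler steps. A toy numeric sketch of that property (this is an illustration of the flow-matching idea, not FLUX code; `x0` and `x1` stand in for a noise latent and an image latent):

```python
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.normal(size=(4,))             # stand-in for a noise latent
x1 = np.array([1.0, -2.0, 0.5, 3.0])  # stand-in for an image latent

# Rectified-flow path: straight line x_t = (1 - t) * x0 + t * x1,
# so the velocity along the path is the constant v = x1 - x0.
def velocity(x_t, t):
    return x1 - x0  # oracle velocity field for this single pair

def euler_sample(x, steps):
    # Integrate dx/dt = velocity(x, t) from t=0 to t=1.
    t, dt = 0.0, 1.0 / steps
    for _ in range(steps):
        x = x + dt * velocity(x, t)
        t += dt
    return x

one_step = euler_sample(x0, steps=1)
many_steps = euler_sample(x0, steps=50)
# Because the path is straight, a single Euler step already lands on x1.
print(np.allclose(one_step, x1), np.allclose(many_steps, x1))  # True True
```

Real learned velocity fields are only approximately straight, which is why production samplers still use a handful of steps rather than exactly one.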

The family ships in tiers. According to Black Forest Labs, FLUX.1 comes as [pro] (closed API, highest quality), [dev] (open weights, non-commercial use, around twelve billion parameters), and [schnell] (distilled for one-to-four-step inference under an Apache 2.0 license). FLUX1.1 [pro] is the performance-tier successor for the closed API. FLUX.1 Kontext follows the same split — [pro] for hosted inference, [dev] released as twelve-billion-parameter open weights for consumer-hardware deployment. The closed [pro] tier’s parameter count has not been officially published. Each tier targets a different point on the quality-versus-control trade-off: the API tiers for production polish, open weights for customization and on-device use, schnell for latency-sensitive workflows.

How It’s Used in Practice

Most people meet FLUX inside a creative tool rather than calling the model directly. Designers and content teams use FLUX.1 Kontext through ComfyUI workflows, hosted nodes on Replicate or fal.ai, or in editor plugins that wrap the Black Forest Labs API. The pattern is consistent: upload a source image, write a one-line instruction (“change the jacket to navy blue”, “remove the background”, “place this product on a marble surface”), and the model returns the edited result without staging intermediate masks. For teams that need predictable output and commercial rights, the hosted [pro] tier is the default; teams that need to fine-tune on their brand’s image library reach for [dev] weights instead and accept the non-commercial licensing constraint that goes with them.
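For teams wrapping a hosted endpoint themselves, the integration usually reduces to assembling a small JSON payload. A minimal sketch of that step — the field names here (`prompt`, `input_image`, `output_format`) are illustrative assumptions, since Replicate, fal.ai, and the Black Forest Labs API each define their own schema:

```python
import base64
import os
import tempfile

def build_kontext_request(image_path: str, instruction: str) -> dict:
    """Assemble a JSON-ready payload for a hosted image-editing endpoint.
    Field names are hypothetical; check your provider's schema."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "prompt": instruction,       # the one-line edit instruction
        "input_image": image_b64,    # source image, base64-encoded
        "output_format": "png",
    }

# Demo with a stand-in file rather than a real image.
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
    tmp.write(b"\x89PNG fake bytes")
    path = tmp.name

payload = build_kontext_request(path, "change the jacket to navy blue")
os.unlink(path)
print(sorted(payload))  # ['input_image', 'output_format', 'prompt']
```

The payload would then go to the provider's HTTP endpoint (or through its client library) with your API key; the response carries the edited image, typically as a URL or base64 string.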

Pro Tip: Treat Kontext like a junior art director, not a Photoshop replacement. Write the instruction the way you would brief a designer — describe the intended outcome, the constraints (preserve identity, maintain lighting), and any framing you want kept. Vague prompts produce drift; specific prompts produce edits.

When to Use / When Not

Use:
- Quick instruction-based edits where you can describe the change in one sentence
- On-device or self-hosted editing where data cannot leave your environment
- Fine-tuning a base editing model on your brand’s image library

Avoid:
- Pixel-perfect retouching where the client will inspect at 200% zoom
- Commercial production work using the open-weights [dev] tier
- Workflows that require precise local masks and layer-by-layer compositing

Common Misconception

Myth: FLUX is just another text-to-image model competing with Stable Diffusion. Reality: That was true for FLUX.1 alone, but the Kontext variant turned the family into an in-context image editor. Kontext takes both an image and a text prompt as input and returns an edited image in a single forward pass, which puts it in the same category as instruction-based editors rather than pure generators.

One Sentence to Remember

If you need to change something inside an existing image with a one-line instruction and you want the option to run the model on your own hardware, FLUX.1 Kontext is the open-weights default to test first.

FAQ

Q: What is the difference between FLUX.1 and FLUX.1 Kontext? A: FLUX.1 generates images from text only. FLUX.1 Kontext accepts both text and a reference image, so it can edit existing pictures in a single forward pass instead of generating from scratch.

Q: Can I use FLUX commercially? A: The closed [pro] tier and the Apache-licensed [schnell] allow commercial use. The open-weights [dev] releases of FLUX.1 and Kontext are licensed for non-commercial use only.

Q: Does FLUX.1 Kontext replace inpainting? A: For instruction-based edits, yes — Kontext handles the change end-to-end without separate masks. For surgical pixel work on a defined region, classical inpainting in tools like Photoshop or ComfyUI still wins.

Expert Takes

Flow-matching models like FLUX learn a velocity field that transports random noise toward image distributions along straight paths. Compared to traditional diffusion, this geometry permits fewer denoising steps without quality collapse. Kontext extends the formulation to conditional editing: the model conditions on both text tokens and a reference image, producing edits inside a single forward pass rather than iterating between separate inversion and reconstruction stages.
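The training objective behind that velocity field can be stated compactly: sample a time t, interpolate along the straight path, and regress the model's predicted velocity against the constant target x1 − x0. A minimal Monte-Carlo sketch of the conditional flow-matching loss (toy code, not the FLUX training loop):

```python
import numpy as np

rng = np.random.default_rng(1)

def cfm_loss(predict_velocity, x0, x1, n_times=64):
    """Monte-Carlo estimate of the conditional flow-matching loss
    E_t || v_theta(x_t, t) - (x1 - x0) ||^2 along the straight path
    x_t = (1 - t) * x0 + t * x1."""
    t = rng.uniform(size=(n_times, 1))
    x_t = (1.0 - t) * x0 + t * x1
    target = x1 - x0  # constant velocity on a straight path
    pred = np.stack([predict_velocity(x, ti) for x, ti in zip(x_t, t)])
    return float(np.mean((pred - target) ** 2))

x0 = rng.normal(size=(3,))  # noise sample
x1 = rng.normal(size=(3,))  # data sample

oracle = lambda x, t: x1 - x0        # perfect predictor -> zero loss
bad = lambda x, t: np.zeros_like(x)  # ignores the path -> positive loss

print(cfm_loss(oracle, x0, x1), cfm_loss(bad, x0, x1) > 0)
```

Conditioning, as in Kontext, amounts to handing the velocity network extra inputs (text tokens, a reference image) without changing this regression target.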

For workflows that depend on reproducible edits, FLUX.1 Kontext changes the contract. Where older editing pipelines required carefully chained masks and inversion steps, Kontext lets a single specification — text plus reference image — drive one forward pass. That collapses several brittle stages into one, which means your spec describes the outcome, not the procedure. Build pipelines around the prompt-as-contract pattern and let the model handle the steps.

Open weights for an editing-class model are the real story. Until recently, serious image editing meant either Adobe’s stack or a closed API — now Black Forest Labs has put a competitive editor onto consumer hardware under a non-commercial license, with paid tiers for production. That changes the build-versus-buy math for any product that needs in-app image editing. The companies that integrate first will own the workflow before defaults harden.

Single-pass editing means a photo can be plausibly altered with no specialist knowledge, no inversion artifacts to detect, and no obvious processing trail. Who is responsible when a Kontext-edited image circulates as evidence? The model author who shipped open weights? The platform that hosted the inference? The person who typed the prompt? Each layer of accessibility erodes the line between authentic capture and synthesized scene, and legal frameworks were not built for that geometry.