(Non-exhaustive) List of Training-Free Samplers

At the beginning of each section, you'll find a table summarizing the samplers mentioned in that section.

| Sampler | Time Spent | Order | Converges | Notes |
| --- | --- | --- | --- | --- |
| Name of the sampler | How many times slower than euler (1x = same as euler, 2x = takes twice as long, etc.) | The order, as mentioned in #accuracy/control. Some technically support a range of orders; in that case, I'll include the default & range. | Yes (refines 1 image with more steps) / No (may change composition with more steps) | Some notes about the sampler |

Explicit Runge-Kutta Methods: Euler, Heun, and Beyond

| Sampler | Time Spent | Order | Converges | Notes |
| --- | --- | --- | --- | --- |
| euler | 1x | 1 | Yes | Simplest and most inaccurate; makes soft lines & blurs details |
| euler_ancestral | 1x | 1 | No | Like euler but divergent (adds noise); popular |
| heun | 2x | 2 | Yes | Can be thought of as the "improved" euler |
| bosh3 | 3x | 3 | Yes | 3rd-order RK |
| rk4 | 4x | 4 | Yes | 4th-order RK |
| dopri6 | 6x | 5 | Yes | 6 model calls/step are needed for order 5 |

euler, heun, and the rarer fehlberg2, bosh3, rk4, dopri6, and more all fall under the umbrella of explicit Runge–Kutta (RK) methods for solving ODEs. They were developed long before any diffusion model, or even any modern computer, came to be.

RK methods are single-step, which means the higher-order ones take a while to run: bosh3, for example, takes 3 times longer than euler per step. Combined with the fact that diffusion DEs are stiff, this means it's seldom worth using a high-order explicit RK method by itself, as it massively increases sampling time while netting you a very marginal gain in quality. Personally, I'd at most use bosh3, though even that's cutting it close. It's no wonder you don't find some of these in popular UIs.
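To make the cost difference concrete, here's a minimal sketch of an Euler step next to a Heun step, using the common dx/dσ = (x − D(x, σ)) / σ formulation of the diffusion ODE. `denoise` stands in for the model, and the function names are mine:

```python
def euler_step(x, sigma, sigma_next, denoise):
    """One Euler step: a single model call per step."""
    d = (x - denoise(x, sigma)) / sigma              # slope at the current point
    return x + d * (sigma_next - sigma)

def heun_step(x, sigma, sigma_next, denoise):
    """One Heun (2nd-order RK) step: two model calls per step."""
    d = (x - denoise(x, sigma)) / sigma              # 1st model call
    x_pred = x + d * (sigma_next - sigma)            # Euler prediction
    d_next = (x_pred - denoise(x_pred, sigma_next)) / sigma_next  # 2nd model call
    return x + 0.5 * (d + d_next) * (sigma_next - sigma)          # average the slopes

# Note: when sigma_next == 0 (the final step), d_next is undefined;
# implementations typically fall back to a plain Euler step there.
```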

Samplers developed down the road employ optimizations specific to diffusion equations, making them the better choice 90% of the time.

Where Can I Find Higher Order RK Samplers?

Early Days of Diffusion: DDPM, DDIM, PLMS(PNDM)

| Sampler | Time Spent | Order | Converges | Notes |
| --- | --- | --- | --- | --- |
| ddpm | 1x | 1 | No | The original diffusion sampler |
| ddim | 1x | 1 | Yes | Converges faster than DDPM, trading a bit of quality |
| plms (= pndm) | 1x | default 4 (up to 4) | Yes | LMS tailored for use in diffusion (uses Adams–Bashforth under the hood) |

DDPM was what started it all, applying diffusion models to image generation, achieving really high quality but requiring a thousand steps to generate a sample.

Through adjustments to the diffusion equation, people arrived at DDIM, drastically reducing the number of steps required at the cost of a little quality.
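For reference, here's a minimal sketch of the deterministic DDIM update (the η = 0 case), assuming an eps-prediction model and the DDPM cumulative signal levels ᾱ; variable names are mine:

```python
def ddim_step(x_t, eps, alpha_bar_t, alpha_bar_prev):
    """Deterministic DDIM update (eta = 0), given the model's noise prediction eps."""
    # Recover the model's current estimate of the clean image x0...
    pred_x0 = (x_t - (1 - alpha_bar_t) ** 0.5 * eps) / alpha_bar_t ** 0.5
    # ...then re-noise that estimate to the previous (less noisy) timestep,
    # reusing the same eps instead of sampling fresh noise like DDPM does.
    return alpha_bar_prev ** 0.5 * pred_x0 + (1 - alpha_bar_prev) ** 0.5 * eps
```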

PNDM finds that classical numerical methods don't work well with diffusion equations, so the authors design a new way to do it: pseudo numerical methods. They tried many variants and found the pseudo linear multistep method (PLMS) to be the best, hence the other name.
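The multistep part can be sketched as follows: keep the last few noise predictions around and blend them with Adams–Bashforth weights, then feed the blended eps into a DDIM-style transfer step (omitted here). This shows the higher-order steps; PLMS's warmup steps differ:

```python
def plms_blend_eps(eps_t, eps_history):
    """Blend the newest eps prediction with up to 3 stored ones,
    using Adams-Bashforth coefficients (orders 1-4)."""
    if len(eps_history) == 0:
        return eps_t                                    # 1st order (plain DDIM)
    if len(eps_history) == 1:
        return (3 * eps_t - eps_history[-1]) / 2        # 2nd order
    if len(eps_history) == 2:
        return (23 * eps_t - 16 * eps_history[-1] + 5 * eps_history[-2]) / 12
    return (55 * eps_t - 59 * eps_history[-1]           # 4th order
            + 37 * eps_history[-2] - 9 * eps_history[-3]) / 24
```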

These three, along with the classical ODE solvers euler, heun, and LMS, were the samplers shipped with the release of the original Stable Diffusion.

Steady Improvements: DEIS + iPNDM, DPM

| Sampler | Time Spent | Order | Converges | Notes |
| --- | --- | --- | --- | --- |
| deis | 1x | default 3 (up to 4) | Yes | |
| ipndm | 1x | 4 | Yes | Found empirically better than ipndm_v |
| ipndm_v | 1x | 4 | Yes | |
| dpm_2 | 2x | 2 | Yes | |
| dpm_2_ancestral | 2x | 2 | No | |
| dpm_adaptive | 3x | default 3 (2 or 3) | Yes | Ignores steps & scheduler settings; runs until it stops itself |
| dpm_fast | 1x (averaged) | between 1~3 | Yes | Uses DPM-Solver 3, 2, 1 such that the number of model calls equals the number of steps, effectively taking the same time euler would at the same step count |

DEIS and DPM independently came to the same conclusion: diffusion equations are too stiff for classical high-order solvers to do well. They use a variety of techniques to remedy this. Notably, both solve a part of the equation (the linear term) exactly, removing any error associated with it while leaving the rest less stiff. This idea is so good, in fact, that many samplers down the road also do it.
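As an illustration, here's a first-order DPM-Solver step in the eps-prediction setting (a sketch based on the paper's update rule; α and σ are the signal and noise levels, and λ = log(α/σ) is the log-SNR). The (α_t / α_s) · x factor is the linear part being solved exactly; only the eps term is approximated:

```python
import math

def dpm_solver1_step(x, eps, alpha_s, sigma_s, alpha_t, sigma_t):
    """One DPM-Solver-1 step from noise level s to t (eps-prediction model)."""
    lam_s = math.log(alpha_s / sigma_s)   # log-SNR at the current level
    lam_t = math.log(alpha_t / sigma_t)   # log-SNR at the target level
    h = lam_t - lam_s                     # step size in log-SNR space
    # Linear term handled exactly; expm1(h) = e^h - 1 weights the eps term
    return (alpha_t / alpha_s) * x - sigma_t * math.expm1(h) * eps
```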

The DEIS paper also introduced "improved PNDM" (iPNDM). ipndm_v is the variable-step version that should in principle suit diffusion better, though they find empirically that ipndm performs better than ipndm_v.

Differing Results With ipndm_v

In my personal tests in ComfyUI, for some reason, with the exact same parameters (prompt, seed, etc.), ipndm_v sometimes breaks (lots of artifacts) if you use KSampler, but not if you use SamplerCustomAdvanced. In fact, I've never seen an image break down with SamplerCustomAdvanced + ipndm_v, except at low step counts where it hasn't converged yet.
Bextoper has also noted that ipndm_v breaks similarly in Forge.

Cascade of New Ideas: DPM++, UniPC, Restart, RES, Gradient Estimation, ER SDE, SEEDS

| Sampler | Time Spent | Order | Converges | Notes |
| --- | --- | --- | --- | --- |
| dpmpp_2s_ancestral | 2x | 2 | No | "dpmpp" as in "DPM Plus Plus" = "DPM++" |
| dpmpp_sde | 2x | 2 | No | I think this is "SDE-DPM-Solver++(2S)", which isn't explicitly defined in the paper |
| dpmpp_2m | 1x | 2 | Yes | |
| dpmpp_3m_sde | 1x | 3 | No | |
| uni_pc | 1x | 3 | Yes | Official repo |
| uni_pc_bh2 | 1x | 3 | Yes | Empirically found a little better than uni_pc in guided sampling |
| restart | default 2x (varies) | default 2 (varies) | No | Time spent & order depend on the underlying solver (the paper uses heun); Official repo |
| res_multistep | 1x | 2 | Yes | The authors give a general way to define res_singlestep for any order |
| gradient_estimation | 1x | 2 (?) | Yes | Uses 2 substeps, so I guess order 2? Not sure the notion of order really applies... |
| seeds_2 / seeds_3 | 2x / 3x | 2 / 3 | No | |
| er_sde | 1x | default 3 (1–3) | No | Official repo |

*(This has nothing to do with Stable Cascade; I just thought it was a cool title.)*

Around this time, the idea of guidance took off, offering the ability to specify what image we want to generate, but also bringing new challenges to the table (a minimal sketch of how guidance is applied follows the list):

  • High guidance makes the DE even stiffer, breaking high-order samplers
  • High guidance knocks samples out of the training data range (train-test mismatch), creating unnatural images
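For context, this is the standard classifier-free guidance (CFG) formula, not anything sampler-specific:

```python
def cfg_eps(eps_cond, eps_uncond, scale):
    """Classifier-free guidance: extrapolate from the unconditional prediction
    toward the conditional one. High scales follow the prompt more strongly,
    but push samples outside the range the model was trained on."""
    return eps_uncond + scale * (eps_cond - eps_uncond)
```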

To address issues with high CFG, DPM++ adds two techniques to DPM (both already proposed in prior works by others):

  • Switch from noise (eps, \(\epsilon\)) prediction to data (\(x_0\)) prediction (which they show is better by a constant in Appendix B).
  • The above also allows them to apply thresholding to push the sample back into the training data range.
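A sketch of both techniques, assuming images normalized to [-1, 1] and a 4-D (batch, channel, height, width) tensor; the thresholding variant shown is the Imagen-style dynamic one, and the names are mine:

```python
import torch

def eps_to_x0(x_t, eps, alpha_bar_t):
    """Convert a noise (eps) prediction into a data (x0) prediction."""
    return (x_t - (1 - alpha_bar_t) ** 0.5 * eps) / alpha_bar_t ** 0.5

def dynamic_threshold(x0, q=0.995):
    """Clamp x0 back toward the training data range using a per-sample percentile."""
    s = torch.quantile(x0.abs().flatten(1), q, dim=1)  # per-sample percentile
    s = s.clamp(min=1.0).view(-1, 1, 1, 1)             # never shrink below [-1, 1]
    return x0.clamp(-s, s) / s                         # clip, then rescale into range
```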

Practical; Not a Technical Marvel

The dpmpp family, especially dpmpp_2m, is among the most widely used samplers alongside euler_ancestral. The paper, however, was rejected by ICLR for relying heavily on existing works, so it "does not contain enough technical novelty."

UniPC came soon after. Inspired by predictor-corrector ODE methods, the authors develop UniC, a corrector that can be plugged in after any existing sampler to increase its accuracy. As a byproduct, they derive UniP, a predictor that can go up to arbitrary order, from the same equation. The two combine into UniPC, achieving SOTA results at order 3.
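To illustrate the plug-in idea (with simple trapezoidal weights, not UniC's actual coefficients): take the output of any predictor step, evaluate the model there, and use that evaluation to refine the step you just took:

```python
def correct(x_prev, x_pred, sigma_prev, sigma_next, denoise):
    """Generic corrector sketch: blend the slopes at both endpoints of the step.
    UniC derives better weights and reuses the model call from the next step,
    so the correction comes at (almost) no extra cost."""
    d_prev = (x_prev - denoise(x_prev, sigma_prev)) / sigma_prev
    d_next = (x_pred - denoise(x_pred, sigma_next)) / sigma_next
    return x_prev + 0.5 * (d_prev + d_next) * (sigma_next - sigma_prev)

# Usage: x_pred = any_sampler_step(x, sigma, sigma_next, denoise)
#        x      = correct(x, x_pred, sigma, sigma_next, denoise)
```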

Restart doesn't actually introduce a new sampler. Instead, it focuses on the discrepancy between solving diffusion as an ODE (no noise injection) vs. as an SDE (noise injected at every step). To get the best of both worlds, Restart proposes injecting noise in infrequent bursts rather than at every step.

*Visualization of ODE, SDE, and Restart taken from their [official repo](https://github.com/Newbeeer/diffusion_restart_sampling)*
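A simplified sketch of the idea over one noise-level segment (interval placement and restart count are hyperparameters here; this is not the paper's exact algorithm):

```python
import torch

def restart_segment(x, sigmas, ode_step, k_restarts=2):
    """Solve the ODE down a segment sigmas[0] > ... > sigmas[-1], then K times:
    inject fresh noise to jump back to the top of the segment and re-solve it."""
    def solve_down(x):
        for i in range(len(sigmas) - 1):
            x = ode_step(x, sigmas[i], sigmas[i + 1])
        return x

    x = solve_down(x)
    for _ in range(k_restarts):
        # Noise needed to move from level sigmas[-1] back up to sigmas[0]
        x = x + torch.randn_like(x) * (sigmas[0] ** 2 - sigmas[-1] ** 2) ** 0.5
        x = solve_down(x)
    return x
```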

RES identified an overlooked set of conditions that solvers must satisfy to achieve their claimed order (they find that dpmpp doesn't satisfy some of these, leading to worse-than-expected results). They then unify the equation for noise prediction and data prediction, making analysis easier. Finally, they pick coefficients that satisfy these additional conditions.

Gradient Estimation finds that denoising can be interpreted as a form of gradient descent, and designs a sampler based on it.
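The interpretation itself is easy to sketch: the residual x − denoise(x) points away from the data manifold, so a sampling step looks like gradient descent with a σ-dependent step size. (Written this way it reduces to a plain Euler step; the actual sampler builds a better gradient estimate than this.)

```python
def denoise_as_gradient_descent(x, sigma, sigma_next, denoise):
    """View one sampling step as gradient descent on distance to the data manifold."""
    grad = x - denoise(x, sigma)              # estimated 'gradient' direction
    step_size = (sigma - sigma_next) / sigma  # shrinks as we approach sigma = 0
    return x - step_size * grad
```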

SEEDS rewrites a whole lot of equations so that more parts can be solved exactly or approximated more accurately. To keep the equations exact, a modified way of injecting noise is used. They derive SEEDS for both eps prediction and data prediction, though the former is very slow, so ComfyUI includes only the latter.

ER SDE models the diffusion process as an Extended Reverse-Time SDE and develops a solver based on that.

Out of Reach: AMED, DC-Solver, among Others

With so many samplers, it's no surprise that some have been left out of the party. This section covers sampling techniques that, while they exist in the literature or as Python code, are unavailable in the popular UIs that are more accessible to non-coders. Since they're not widely available, discussion of them is sparse, so there are likely many more that I simply don't know about and that are missing from this list.

Techniques Not Included: (in no particular order)

  • UniC: As said before, you could in theory plug UniC in after any sampler to achieve better accuracy. To my knowledge, though, no UI actually lets you do that.

Samplers Not Included: (in no particular order)