* cn forward patcher
* simplify
* use args instead of kwargs
* postpone moving cond_hint to gpu
* also do this for t2i adapter
* use a1111's code to load files in a batch
* revert
* patcher for batch images
* patcher for batch images
* remove duplicate cn fn wrapper
* remove shit
* use unit getattr instead of unet patcher
* fix bug
* small change
Put inpaint_v26.fooocus.patch in models\ControlNet. It controls SDXL models only.
To get the same algorithm as Fooocus, set "Stop at" (Ending Control Step) to 0.5.
Fooocus always uses 0.5, but in Forge users may use other values.
Results are best when "Stop at" is < 0.7. The model is not optimized for ending timesteps > 0.7.
Supports inpaint_global_harmonious, inpaint_only, inpaint_only+lama.
In theory, inpaint_only+lama always outperforms Fooocus in object removal tasks (but not in all tasks).
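
As a rough illustration of what an ending control step of 0.5 means, the sketch below gates a control model by the fraction of sampling steps completed. The function name `control_active` and the step-fraction mapping are assumptions for illustration only; Forge may map "Stop at" onto timesteps or sigmas differently.

```python
def control_active(step: int, total_steps: int, stop_at: float = 0.5) -> bool:
    """Return True while the control model would still be applied.

    Hypothetical helper: assumes "Stop at" is a plain fraction of the
    sampling steps, which is a simplification.
    """
    # Fraction of the schedule already completed when this step starts.
    progress = step / total_steps
    return progress < stop_at


# With 20 sampling steps and stop_at=0.5, control applies to steps 0..9.
active_steps = [s for s in range(20) if control_active(s, 20, 0.5)]
```

Under this assumption, raising `stop_at` above 0.7 would keep the patch active late in the schedule, the range the note above says the model is not optimized for.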
* Make test client run on cpu
* test on cpu
try fix device
try fix device
try fix device
* Use real SD1.5 model for testing
* ckpt nits
* Remove coverage calls
Since they are built-in extensions, we can assume that there will always be at least one extension.
Co-Authored-By: Andray <33491867+light-and-ray@users.noreply.github.com>
This will move all major gradio calls into the main thread rather than random gradio threads.
This ensures that all torch.nn.Module.to() calls are performed in the main thread, to avoid GPU memory fragmentation as completely as possible.
In my tests, model moving is now 0.7 ~ 1.2 seconds faster, which means all 6GB/8GB VRAM users should get images 0.7 ~ 1.2 seconds faster each on SDXL.
A mistake in the 0day release was that the attention layers of cond and uncond items in a batch were aligned when they should not have been.
After aligning the batch in cond and uncond separately, they now work and give the same results as legacy sd-webui-cnet.
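
The idea of handling cond and uncond separately can be sketched as follows. This is one possible reading, not the actual fix: it assumes the batch is laid out as equal-sized consecutive chunks, one per entry of a `cond_or_uncond` list, and uses plain Python lists in place of tensors. The helper names `apply_per_group` and `center` are hypothetical.

```python
def apply_per_group(batch, cond_or_uncond, fn):
    """Apply fn to each cond/uncond chunk on its own, then re-join the chunks.

    Assumption: len(batch) is a multiple of len(cond_or_uncond) and the
    chunks are stored back to back in the same order as the flags.
    """
    chunk = len(batch) // len(cond_or_uncond)
    out = []
    for i, flag in enumerate(cond_or_uncond):
        part = batch[i * chunk:(i + 1) * chunk]
        out.extend(fn(part, flag))
    return out


def center(part, flag):
    """Toy per-group operation: subtract the chunk's own mean."""
    mean = sum(part) / len(part)
    return [x - mean for x in part]


# Two cond items followed by two uncond items; each group is centered
# on its own mean instead of on the mean of the whole batch.
result = apply_per_group([1.0, 3.0, 10.0, 30.0], [0, 1], center)
```

Centering the whole batch at once would subtract 11.0 from every item, so the two groups would influence each other; per-group processing avoids that.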