UPDATE: 10 Sep 2023
This embedding adds crystal and/or gem effects to SDXL-generated images. The low step count means it’s not subtle, but I hope it shows there’s untapped potential for SDXL embeddings.
The downloadable “kcrstal17xl-step00002000.safetensors” was trained for 2000 steps (batch=1) on a set of fifty 1024x1024 images using kohya_ss. If I can bear the slooooooow training time I’ll update this page with a more refined embedding in the future.
The showcase is in pairs of automatic1111 webui outputs:
with the embedding, without the embedding
The showcase images use the Crystal Clear XL model:
https://civitai.com/models/122822?modelVersionId=133832
Prompting has been a bit annoying, perhaps because I don’t really have a feel for the changes from Stable Diffusion v1.5 to SDXL. The prompts for the showcase images are mostly in the form:
kcrstal17xl analog realistic __art styles 1__ colour photo of a __crystal_test_1__, very detailed, award-winning
Wildcard __art styles 1__ is a list of art styles such as “Impressionism”.
Wildcard __crystal_test_1__ is a list of short phrases such as “frog on a deckchair”.
You can weight the embedding up or down - I generally use values between 0.6 and 1.2. Strangely, I’ve found that a weighting of 0.1 still has a noticeable effect.
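For anyone unfamiliar with the syntax: the __...__ wildcards need a wildcards extension (e.g. Dynamic Prompts) installed in the a1111 webui, and the weighting uses the standard a1111 attention syntax. After the wildcards are filled in, a showcase prompt at weight 0.8 would look something like this (using the example style and phrase mentioned above):

```
(kcrstal17xl:0.8) analog realistic Impressionism colour photo of a frog on a deckchair, very detailed, award-winning
```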
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
03 Sep 2023
This is a proof-of-concept rather than a polished embedding. Maybe it should be an Article rather than a normal resource page?
The downloadable “kcrstal17xl-step00001200.safetensors” was trained for just 1200 steps (batch=1) on a set of fifty 1024x1024 images. If I can bear the slooooooow training time I’ll update this page with a more refined embedding in the future.
The embedding adds crystal and/or gem effects to SDXL-generated images. The low step count means it’s not subtle, but I hope it shows there’s untapped potential for SDXL embeddings.
The showcase is in pairs of ComfyUI outputs:
with the embedding, without the embedding
and then 2 pairs showing the mess that a1111 v1.6.0 outputs:
saved image, screencap of “Approx NN” preview
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Rambling explanation...
I’m quite impressed with the output from SDXL and I wanted to convert/recreate my SD v1.5 embeddings to work with SDXL.
The Train tab in automatic1111 webui was what I used to make CrystaliTI:
https://civitai.com/models/135403/crystaliti
and I hoped the recent v1.6.0 of a1111 would add SDXL training. Nope. And not likely to happen either according to this:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/11857#discussioncomment-6480804
The obvious answer seemed to be kohya_ss:
https://github.com/bmaltais/kohya_ss
I’ve never used it, so the sheer number of config options was a bit of a shock! The first thing I tried was to recreate CrystaliTI in kohya_ss. That didn’t get off to a good start, but thanks to a tutorial by @Desi_Cafe I got there in the end. Thumbs-up for that.
The kohya_ss TI is different from my original CrystaliTI of course, but it’s similar enough that I’m OK with it.
The next step was to change the kohya_ss settings to use SDXL and point it at 1024x1024 versions of my 50 training images. 21 hours later, kohya_ss had managed 1240 steps out of 2000 (batch=1).
That’s just over 1min/it. :-(
By comparison, I think recreating CrystaliTI ran at about 0.5s/it.
I was expecting (guessing!) a slowdown of between 4 and 16 times. Not >100 times slower :-(
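For what it’s worth, the figures quoted above do back up the “>100 times slower” claim. A quick sanity check in Python (the 0.5s/it is my rough recollection for the SD v1.5 run):

```python
# Back-of-the-envelope check of the training-speed figures quoted above.
sdxl_seconds = 21 * 3600              # ~21 hours of wall-clock training time
sdxl_steps = 1240                     # steps completed in that time
sdxl_sec_per_it = sdxl_seconds / sdxl_steps

sd15_sec_per_it = 0.5                 # rough figure for the SD v1.5 CrystaliTI run

slowdown = sdxl_sec_per_it / sd15_sec_per_it
print(f"SDXL: {sdxl_sec_per_it:.0f} s/it -> about {slowdown:.0f}x slower than SD v1.5")
# prints: SDXL: 61 s/it -> about 122x slower than SD v1.5
```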
The PC I’m using is:
Nvidia 3060/12GB (not the Ti version), Ryzen 7-2700 (8c/16t), 64GB system RAM, multiple SSDs, Win10pro.
While kohya_ss is making the SDXL embedding, it’s using all 12GB of the 3060 plus 7GB of “Shared GPU memory”. Spilling over into shared memory (i.e. system RAM accessed over PCIe) is almost certainly a big part of why it’s so slooooooow.
Perhaps there are better settings I can use in kohya_ss? A quick google shows contradictory/incomplete info, so if anyone has suggestions for options in the current kohya_ss that help with VRAM consumption... I’m all ears.
Kohya_ss is set to save the embedding every 50 steps, so I’ve been able to copy the checkpoints over to my backup machine (2060/6GB, i7-10750H (6c/12t), 16GB system RAM, multiple SSDs, Win10pro). That machine can generate SDXL images in the a1111 webui (v1.6.0) if I use --lowvram, at roughly 2 mins per pic.
I tried the SDXL embedding in ComfyUI (current portable version, no custom nodes).
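In case it saves anyone a search: ComfyUI doesn’t pick up embeddings by bare filename the way a1111 does. Drop the .safetensors into ComfyUI’s models/embeddings folder and reference it in the CLIP Text Encode prompt with the embedding: prefix (filename without the extension), e.g.:

```
embedding:kcrstal17xl-step00001200 analog realistic colour photo of a frog on a deckchair
```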
And here we are - it looks like creating SDXL embeddings is doable, but slow. My toy budget is not going to manage a 4090 anytime soon, so unless I can find a way of speeding up creation on the 3060/12 I guess I won’t be making many embeddings.
If any kohya_ss experts have read this far, I’ve added my config to the downloads in the hope some kind person might be able to suggest something (anything!) that would speed up the generation process.