Ongoing Updates
(9/16): I have had some major life events happen over the last month that have forced my attention away from this work (good things, fortunately); apologies to anyone who has been following this model's development. :) I have several candidates for a v3, but have had no time to test. I am actively working on finding that time!
(8/18): An experimental fp16 version of the v1 model has been uploaded. Initial testing seems to indicate that the results are at least comparable.
If you can run the bfloat16 version, bfloat16 will probably perform better.
(8/14): A v2 checkpoint has been uploaded. Please read the "about this version" notes in the drop-down panel to the right.
Presenting CanvasXL, an SDXL v1 Dreambooth fine-tune project which aims to empower your creativity like never before! Trained on a dataset of thousands of highly aesthetic, artistic AI-generated images, CanvasXL sets out to bring your vision to life with unparalleled beauty and creativity within the open-source image generation space.
I have tried my best to summarize the model, my intentions with it, and my plans around it in the text below. I hope you enjoy the fruits of my labor and create some beautiful images! If you like and want to support my work, you are welcome to buy me a coffee.
Features, Specifications, and Notices:
Versatility: CanvasXL produces a wide array of subjects and represents novel concept combinations impressively across many contexts.
Resolutions: Trained on 44 different resolutions, with particular preferences for 1024x1024 and 704x1216 (mobile portrait-style). Currently, some subjects are reproduced better at one of these two resolutions than at the other.
Model Type (Base vs Refiner): Currently, a fine-tune of the SDXL v1 base model is presented. This model can be used for txt2img and img2img. I also plan to fine-tune the refiner model, once I have the time to figure it out!
Convenient Use: A ComfyUI workflow.json is available for easy setup of the model, to help you get started. You can copy it from here.
Format: The model is currently presented in bfloat16 format. If anyone needs an fp16 version, please leave a comment and I will create it when I have the time!
Demo Images: The generation parameters for the images should be available. The prompts were directly copied from a selection of nice-looking images on a popular AI image generation site. The 20 demo images were cherry-picked from a set of approximately 300 images, and are completely unedited.
Future Plans: I have amassed a dataset of ~30k well-tagged, good-quality images. I am currently pruning the dataset to increase its quality, cutting about 60% of the images I review. However, I have only made it through half of the dataset so far, so the training set should expand and improve as I find the time to work through the rest. As significant additions are made to the training set, new training runs will be performed on my local machine and the results uploaded.
Content Notice: The training dataset is entirely free of NSFW, nude, violent, or otherwise potentially highly objectionable imagery. As a result, this model should not be expected to replicate such subjects well, or even at all, as it is explicitly intended to produce non-objectionable imagery.
Usage Notice: Please be aware that you are solely responsible for any content that you create using this model. In addition, by using this model you agree not to use it to produce harassing, harmful, illegal, or otherwise highly objectionable imagery.
Resolution Buckets:
CanvasXL's training encompasses 44 resolution buckets, ranging from (512, 1664) to (1792, 512). Because the exact counts vary across training runs, only the general distribution is described:
bucket 0: resolution (512, 1664), count: low
bucket 1: resolution (512, 1728), count: low
bucket 2: resolution (512, 1792), count: low
bucket 3: resolution (576, 1472), count: low
bucket 4: resolution (576, 1536), count: low
bucket 5: resolution (576, 1728), count: low
bucket 6: resolution (640, 1344), count: low
bucket 7: resolution (640, 1408), count: low
bucket 8: resolution (640, 1472), count: low
bucket 9: resolution (640, 1536), count: low
bucket 10: resolution (640, 1600), count: low
bucket 11: resolution (704, 1216), count: HIGH
bucket 12: resolution (704, 1280), count: low
bucket 13: resolution (704, 1344), count: low
bucket 14: resolution (704, 1408), count: MID
bucket 15: resolution (704, 1472), count: low
bucket 16: resolution (768, 1152), count: low
bucket 17: resolution (768, 1216), count: low
bucket 18: resolution (768, 1280), count: low
bucket 19: resolution (768, 1344), count: low
bucket 20: resolution (832, 1088), count: low
bucket 21: resolution (832, 1152), count: low
bucket 22: resolution (832, 1216), count: MID
bucket 23: resolution (896, 1024), count: low
bucket 24: resolution (896, 1088), count: low
bucket 25: resolution (896, 1152), count: low
bucket 26: resolution (960, 896), count: low
bucket 27: resolution (960, 1024), count: low
bucket 28: resolution (1024, 896), count: low
bucket 29: resolution (1024, 960), count: low
bucket 30: resolution (1024, 1024), count: HIGH
bucket 31: resolution (1088, 832), count: low
bucket 32: resolution (1088, 896), count: low
bucket 33: resolution (1152, 832), count: low
bucket 34: resolution (1216, 704), count: low
bucket 35: resolution (1216, 768), count: low
bucket 36: resolution (1216, 832), count: low
bucket 37: resolution (1280, 768), count: low
bucket 38: resolution (1344, 704), count: low
bucket 39: resolution (1344, 768), count: low
bucket 40: resolution (1408, 704), count: low
bucket 41: resolution (1600, 640), count: low
bucket 42: resolution (1728, 576), count: low
bucket 43: resolution (1792, 512), count: low
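
If you want to generate at a resolution the model actually saw during training, one option is to snap your target aspect ratio to the closest bucket in the list above. The snippet below is a minimal sketch of that idea, not part of the model or the ComfyUI workflow. It assumes the pairs are (width, height), which is inferred from 704x1216 being described as portrait-style, and the helper name nearest_bucket is hypothetical. Whether the nearest bucket gives better results than the two preferred resolutions is untested here.

```python
import math

# Bucket list copied from the table above, assumed to be (width, height) pairs.
BUCKETS = [
    (512, 1664), (512, 1728), (512, 1792), (576, 1472), (576, 1536),
    (576, 1728), (640, 1344), (640, 1408), (640, 1472), (640, 1536),
    (640, 1600), (704, 1216), (704, 1280), (704, 1344), (704, 1408),
    (704, 1472), (768, 1152), (768, 1216), (768, 1280), (768, 1344),
    (832, 1088), (832, 1152), (832, 1216), (896, 1024), (896, 1088),
    (896, 1152), (960, 896), (960, 1024), (1024, 896), (1024, 960),
    (1024, 1024), (1088, 832), (1088, 896), (1152, 832), (1216, 704),
    (1216, 768), (1216, 832), (1280, 768), (1344, 704), (1344, 768),
    (1408, 704), (1600, 640), (1728, 576), (1792, 512),
]

# Buckets whose count is listed as HIGH or MID in the table above.
WELL_COVERED = {
    (704, 1216): "HIGH",
    (1024, 1024): "HIGH",
    (704, 1408): "MID",
    (832, 1216): "MID",
}


def nearest_bucket(width, height, candidates=BUCKETS):
    """Return the candidate bucket whose aspect ratio is closest to width/height.

    Ratios are compared in log space so that portrait and landscape
    deviations of the same proportion are treated symmetrically.
    Hypothetical helper for illustration only.
    """
    target = math.log(width / height)
    return min(candidates, key=lambda b: abs(math.log(b[0] / b[1]) - target))


if __name__ == "__main__":
    # Closest trained bucket to a 16:9 landscape frame.
    print(nearest_bucket(1920, 1080))
    # Closest well-covered (HIGH or MID) bucket to a 9:16 portrait frame.
    print(nearest_bucket(9, 16, candidates=list(WELL_COVERED)))
```

Restricting the candidates to the HIGH and MID buckets trades aspect-ratio accuracy for resolutions with more training coverage. Which of the two works better for a given subject is something to test for yourself.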