Stable Diffusion XL (SDXL) is a significant advance in open-source text-to-image generation. It still has weaknesses, though: the base model often renders faces and anatomy poorly, particularly for characters that are far from the viewer.
In this article, we will look at several ways to work around these limitations and get better faces out of SDXL-generated images.
Fixing Faces in SDXL
1. Use Fooocus
Fooocus is a UI for running Stable Diffusion models, and it uses SDXL by default. Recent versions include inpainting and outpainting options. If you have experience with SDXL, you know that the base model is not very good at inpainting.
Fooocus, however, handles inpainting in its own way and gets good results, and its outpainting is decent as well. If you want an interface for running SDXL without having to tune a lot of settings, Fooocus is a good choice.
2. Utilizing Effective Negative Prompts
A negative prompt tells the model what not to generate. With SDXL, a well-crafted negative prompt can steer the model away from inaccurate faces. Because the negative prompt is already a list of things to avoid, write each term without a leading "no": use "blurry face", not "no blurry face". Negative prompts reduce the chance of these defects rather than guarantee their absence. Terms worth trying:
- distorted facial features. Discourages faces that are distorted or warped.
- missing eyes. Discourages faces with fewer than two eyes.
- distorted eyes. Discourages eyes that are misshapen or bulging.
- distorted nose. Discourages a nose that is misshapen or crooked.
- bad mouth. Discourages a mouth that is misshapen or crooked.
- weird face. A more general term for avoiding any kind of distorted or unnatural face.
- creepy face. Another general term for avoiding unsettling or disturbing faces.
- ugly face. Steers away from faces that are not conventionally attractive.
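The terms above are usually entered as one comma-separated string in the negative prompt field. As a minimal sketch, the helper below assembles such a string; the function name `build_negative_prompt` and the cleanup rules (dropping a leading "no", trailing periods, and duplicates) are illustrative assumptions, not part of any particular UI.

```python
# Face-related terms from the list above, written without a leading "no".
FACE_NEGATIVE_TERMS = [
    "distorted facial features",
    "missing eyes",
    "distorted eyes",
    "distorted nose",
    "bad mouth",
    "weird face",
    "creepy face",
    "ugly face",
]


def build_negative_prompt(terms, extra=()):
    """Join terms into a comma-separated negative prompt (hypothetical helper).

    Strips whitespace, a trailing period, and a leading "no ", and skips
    case-insensitive duplicates while preserving the original order.
    """
    kept = []
    seen = set()
    for term in list(terms) + list(extra):
        cleaned = term.strip().rstrip(".")
        if cleaned.lower().startswith("no "):
            cleaned = cleaned[3:]
        cleaned = cleaned.strip()
        if cleaned and cleaned.lower() not in seen:
            seen.add(cleaned.lower())
            kept.append(cleaned)
    return ", ".join(kept)


negative_prompt = build_negative_prompt(FACE_NEGATIVE_TERMS, extra=["blurry"])
print(negative_prompt)
```

Paste the resulting string into the negative prompt field of whichever interface you use, and trim the list if it starts to suppress features you actually want.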
3. Leveraging Enhancer LoRAs for Image Enhancement
An enhancer LoRA is a LoRA (Low-Rank Adaptation) trained specifically to improve image detail. A LoRA is not a separate model whose output is fed into SDXL: it is a small set of additional weights applied on top of the SDXL checkpoint during generation. Loading one alongside the base model can improve facial detail and anatomy, and most interfaces let you control how strongly it is applied with a weight.
Some of the top enhancer LoRAs:
- DetailedEyes_XL
- Detail Tweaker XL
- better faces sdxl
- Advanced Enhancer XL LoRA
- xl_more_art-full / xl_real / Enhancer
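How a LoRA is switched on depends on the interface. As one illustration, the AUTOMATIC1111 web UI activates a LoRA with a `<lora:filename:weight>` tag in the prompt. The sketch below builds such tags; the helper names are hypothetical, and the file name `DetailedEyes_XL` is an assumed name for a downloaded LoRA file, so substitute whatever your file is actually called.

```python
def lora_tag(name, weight=1.0):
    """Format one LoRA tag in the AUTOMATIC1111 prompt syntax."""
    # ":g" drops trailing zeros, so 0.80 becomes 0.8 and 1.0 becomes 1.
    return f"<lora:{name}:{weight:g}>"


def with_loras(prompt, loras):
    """Append a tag for each LoRA (name -> weight) to the prompt (hypothetical helper)."""
    tags = " ".join(lora_tag(name, weight) for name, weight in loras.items())
    return f"{prompt} {tags}".strip()


# Assumed file name; weights below 1 apply the LoRA more gently.
prompt = with_loras("full body photo of a hiker on a ridge", {"DetailedEyes_XL": 0.8})
print(prompt)
```

Starting with a moderate weight and adjusting it in small steps makes it easier to see what the LoRA is contributing. Other interfaces, such as Fooocus, select LoRAs and weights through their own controls instead of prompt tags.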
4. Exploring Aspect Ratio Optimization
SDXL, like many other image generation models, is sensitive to the resolution and aspect ratio it is asked to produce. It was trained at around one megapixel (1024 x 1024) across a range of aspect ratios, and it tends to give its best results, faces included, near those sizes. Experimenting with different aspect ratios at generation time, and choosing one that suits the subject of the image, can help the model produce better faces even for distant characters.
Use an aspect ratio calculator to convert an aspect ratio into pixel dimensions. The following resolutions each total roughly one megapixel:
- 1024 x 1024
- 1152 x 896
- 896 x 1152
- 1216 x 832
- 832 x 1216
- 1344 x 768
- 768 x 1344
- 1536 x 640
- 640 x 1536
Vertical (Taller Height) Aspect Ratios:
- 3:4
- 4:5
- 5:7
- 2:3
- 9:16 (common for mobile screens)
- 1:2 (very tall frame)
Horizontal (Wider Width) Aspect Ratios:
- 16:9 (standard for widescreen displays)
- 1.85:1 (common cinematic aspect ratio)
- 2:1
- 21:9 (ultrawide monitor standard)
- 3:2 (used in some digital cameras)
- 4:3 (traditional television and computer monitor ratio)
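To connect the two lists, the sketch below picks, for a given aspect ratio, the closest resolution from the pixel sizes listed earlier and reports its pixel count. The function names are illustrative, and comparing ratios on a log scale is a design choice made here so that a ratio and its inverse (for example 16:9 and 9:16) are treated symmetrically.

```python
import math

# Resolutions listed earlier in this section, as (width, height).
SDXL_RESOLUTIONS = [
    (1024, 1024),
    (1152, 896), (896, 1152),
    (1216, 832), (832, 1216),
    (1344, 768), (768, 1344),
    (1536, 640), (640, 1536),
]


def closest_resolution(ratio_w, ratio_h, resolutions=SDXL_RESOLUTIONS):
    """Return the (width, height) whose aspect ratio is nearest to ratio_w:ratio_h."""
    target = ratio_w / ratio_h
    return min(
        resolutions,
        key=lambda wh: abs(math.log((wh[0] / wh[1]) / target)),
    )


def megapixels(width, height):
    """Total pixel count in millions."""
    return width * height / 1_000_000


for ratio in [(16, 9), (9, 16), (2, 3), (4, 3), (21, 9)]:
    width, height = closest_resolution(*ratio)
    print(f"{ratio[0]}:{ratio[1]} -> {width} x {height} ({megapixels(width, height):.2f} MP)")
```

For a standing figure, a taller ratio such as 2:3 or 9:16 devotes more of the frame to the character, which leaves more pixels for the face.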
5. Harnessing Fine-Tuned Models
One of the advantages of SDXL being an open-source model is the active community that works on improving it. Community members often fine-tune the base model to correct its shortcomings. If the default SDXL consistently gives you poor faces and anatomy, consider switching to a fine-tuned model from the community. These models often target specific weaknesses of the base model and can give better results.
Here are some fine-tuned SDXL models:
6. Inpainting for Localized Enhancements
Inpainting is a technique that involves filling in missing or distorted parts of an image. Applying inpainting to SDXL-generated images can be effective in fixing specific facial regions that lack detail or accuracy.
By using a mask to pinpoint the areas that need enhancement and applying inpainting, you can effectively improve the visual quality of facial features while preserving the overall composition.
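As a rough sketch of the masking step, the code below pads a face bounding box and turns it into a binary mask. It assumes the common convention that white (255) marks the pixels to repaint and black (0) the pixels to keep, which you should check against your inpainting tool. The function names, the padding value, and the example box coordinates are all illustrative.

```python
def expand_box(box, image_size, padding=0.25):
    """Grow a (left, top, right, bottom) box by a fraction of its size, clamped to the image.

    Padding gives the inpainting step some surrounding context (hair, neck)
    so the repainted face blends in with the rest of the image.
    """
    left, top, right, bottom = box
    width, height = image_size
    pad_x = int((right - left) * padding)
    pad_y = int((bottom - top) * padding)
    return (
        max(0, left - pad_x),
        max(0, top - pad_y),
        min(width, right + pad_x),
        min(height, bottom + pad_y),
    )


def make_mask(image_size, box):
    """Build a mask as rows of pixel values: 255 inside the box, 0 elsewhere."""
    width, height = image_size
    left, top, right, bottom = box
    return [
        [255 if left <= x < right and top <= y < bottom else 0 for x in range(width)]
        for y in range(height)
    ]


# Hypothetical face box in a 1024 x 1024 image.
face_box = expand_box((450, 200, 550, 320), (1024, 1024))
mask = make_mask((1024, 1024), face_box)
print(face_box)
```

In practice most interfaces let you paint the mask by hand. A computed mask like this is mainly useful when you fix many images in a batch.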
7. Exploring Alternative Models For Inpainting
If the previous approaches have not given satisfactory results, consider using a different model for inpainting. Models based on SD 1.5, for instance, can sometimes give better results for inpainting and enhancement than SDXL itself. Comparing several models on the same image will help you find the one that produces the best faces.
Some of the best SD 1.5 models: