
Guide Fake Lora/Lycoris Training Guide


goat199

Diamond Tier
Mar 13, 2022
I've been trying to train my own Loras over the last few weeks and finally managed to get good results, so I decided to share the process with you guys.
Before I start, I just want to say that I'm not a pro at this stuff and don't fully understand all the technical details behind it; I just followed a bunch of tutorials and read some articles to be able to do this. This tutorial is also made to create a Lycoris instead of a traditional Lora; I did some tests and believe that for real people Lycoris gives the best training results.

Anyway, let's start. If you want to train your own models you need a GPU with at least 6 GB of VRAM.
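If you're not sure how much VRAM your GPU has, here's a quick check, just a sketch that assumes PyTorch is already installed (kohya's installer pulls it in anyway):

import torch

if torch.cuda.is_available():
    vram_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    print(f"{torch.cuda.get_device_name(0)}: {vram_gb:.1f} GB VRAM")
else:
    print("No CUDA GPU detected")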

1. Installing kohya_ss

This is easier to understand with a video tutorial, so I'm going to leave this link here; follow the installation steps in it. I used this tutorial to install kohya_ss on my computer:

Btw, most of the stuff in this guide is already covered in this video, with some tweaks of my own, so you can follow along with the video if that makes it easier to understand.

2. Preparing the dataset

You should have at least 15 pictures of the subject, but the more pictures the better. Most images should focus on the face, but if you want the AI to also learn the body shape of the person, include some full-body pictures so you won't have to prompt for the body when you use the Lora. Always use high-quality pictures, because they will improve the results. I mostly use DuckDuckGo to find the images, since it's easier to download from there than from Google Images; imagefap(dot)com also has many high-quality pictures, so you can use that website to find more images.

After downloading your pictures you should resize them. In SD 1.5 most models are trained on 512x512 resolution; you can use 768x768 for better results, but it takes more time, especially if you have a low-VRAM GPU. I tested both resolutions and didn't see much difference, so I'm using 512x512.

To resize your images, go to birme.net, upload the images there and crop them to the desired resolution. Remember to always keep the focus on the face of the subject. Here is an example of how I cropped some images so you can do the same:
Please, Log in or Register to see links and images

After cropping the images, click on the SAVE AS ZIP button and download the results.


Please, Log in or Register to see links and images
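If you'd rather crop locally instead of on birme.net, here's a rough Python sketch that center-crops and resizes everything to 512x512. It assumes Pillow is installed; the folder names are just examples, and unlike birme you'll have to recrop by hand any picture where the face isn't near the center:

from pathlib import Path
from PIL import Image

SRC = Path("raw_pics")      # example input folder
DST = Path("cropped_512")   # example output folder
SIZE = 512
DST.mkdir(exist_ok=True)

for p in SRC.iterdir():
    if p.suffix.lower() not in {".jpg", ".jpeg", ".png", ".webp"}:
        continue
    img = Image.open(p).convert("RGB")
    w, h = img.size
    side = min(w, h)
    # simple center crop; adjust the box manually if the face gets cut off
    left, top = (w - side) // 2, (h - side) // 2
    img = img.crop((left, top, left + side, top + side)).resize((SIZE, SIZE), Image.LANCZOS)
    img.save(DST / (p.stem + ".png"))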

3. Folder preparation

You should create three folders for the training: one for the images, one for the resulting model and another one for the logs. This is how I create them:

Inside the image folder, create a new folder and unzip the cropped pictures there; we will come back to that folder later.
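If you like scripting things, the three folders plus the image subfolder can be created like this (the paths are just examples, name them however you want):

from pathlib import Path

base = Path("training/zwx_woman")   # example project folder
for sub in ("image", "model", "log"):
    (base / sub).mkdir(parents=True, exist_ok=True)
# unzip the cropped pictures into a subfolder of "image";
# it gets renamed with the repeat count later, in the training setup section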

4. Captioning the images

This is one of the most important steps of the training: captioning tells the AI what to look for when training on the pictures. It's very easy to do since kohya has a captioning tool in the Utilities tab. For realistic models I prefer using BLIP Captioning and adding "photo of" as the prefix, since we will use it at the beginning of almost every caption. Here are my settings for the captioning in kohya:
Please, Log in or Register to see links and images

The next step is checking the captions and editing them. You should use a token to trigger the Lora when you use it in Stable Diffusion; a token is a word that will be associated with the Lora you trained. Avoid using the name of the subject, since Stable Diffusion may already have it in its training data and that will mess up the end result, so it's best to use random characters with no meaning. I'm using "zwx" as a token and had no problem with it. Always write "woman" or "man" (depending on the subject) after the token so the AI will understand that "zwx" is that person. Here's an example of a picture with the caption that I used for it:
Please, Log in or Register to see links and images

Avoid describing things like hair or eye color, because that makes them harder to change with prompts when you use the Lora in SD; besides, the AI will learn how those look from the pictures. Focus on clothes, facial expressions and poses when captioning.
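Editing every caption by hand gets tedious, so here's a small sketch of how I'd batch-insert the token. It assumes BLIP wrote .txt caption files next to the images with the "photo of" prefix and that your token phrase is "zwx woman"; the folder path and the "a woman"/"a man" stripping are assumptions, so check a few results by hand afterwards:

from pathlib import Path

CAPTION_DIR = Path("training/zwx_woman/image/38_zwx woman")  # example path
TOKEN = "zwx woman"

for txt in CAPTION_DIR.glob("*.txt"):
    caption = txt.read_text(encoding="utf-8").strip()
    # turn "photo of a woman in a red dress" into "photo of zwx woman in a red dress"
    if caption.startswith("photo of "):
        rest = caption[len("photo of "):]
        # drop a leading "a woman"/"a man" if BLIP added one; the token replaces it
        for lead in ("a woman ", "a man "):
            if rest.startswith(lead):
                rest = rest[len(lead):]
                break
        caption = f"photo of {TOKEN} {rest}"
    txt.write_text(caption, encoding="utf-8")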

5. Training setup

Like I said before, I'm no expert on this; I just followed a bunch of tutorials and found a setup that works well for the training. Bear in mind that I'm training a LoCon Lycoris instead of a traditional Lora, so the training parameters are different.

Your first step is choosing the model to train on. I use Realistic Vision V5.1 since I got the best results with it and it's very flexible, so I can use the Lora on other models as well. You can download it here:
Please, Log in or Register to see links and images


To select the model, go to the Lora tab in kohya > Training > Source Model > Model Quick Pick > custom and select the model from the folder you downloaded it to; if you already use it in SD you can select the file from the A1111 models folder. The Source Model tab should look like this after you select the model:
Please, Log in or Register to see links and images

For the next step, go to the Folders tab and select the folders you created earlier. In Model output name you set the name of the output file; I like to use the name of the subject I'm training plus the initials of the model, so it's easier to find in A1111. While it's not required, I recommend writing the token you used in the Training comment field so you know what triggers the model; in my case it is "zwx woman". The Folders tab should look like this:
Please, Log in or Register to see links and images

Before we move on to the last step, we go back to the folder where we extracted the cropped pictures. That folder will obviously be used for training, but we have to tell kohya how many repeats of each image it will train. I use a batch size of 2 in my training, which means kohya trains two pictures at once, with 4 epochs. For more info on that, read this article: rentry(dot)org/59xed3#number-of-stepsepochs. Since I'm not very good at maths, I use a very nice spreadsheet that does the calculation for me; you can download it here:
Please, Log in or Register to see links and images
. From my tests I think that 3000 training steps is the sweet spot for real people, so in this case I'll train each image 38 times based on the spreadsheet's calculation.
Please, Log in or Register to see links and images

Back in the folder with the images, we should rename it to xx_whatevernameyouwant (xx is the number of repeats per image, 20 for example). If you do that, kohya will understand that it needs to train each image the specified number of times; change the number appropriately for the number of images you have. In my case I renamed the folder like this:
Please, Log in or Register to see links and images
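If you'd rather skip the spreadsheet, this is how I understand the step math (just a sketch; the image count and step target are example numbers, and kohya's exact counting can differ slightly):

images = 20        # example: how many cropped pictures you have
batch_size = 2
epochs = 4
target_steps = 3000

# total steps = images * repeats * epochs / batch_size
repeats = round(target_steps * batch_size / (images * epochs))
print(repeats)     # use this number as the xx_ prefix of the image folder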

The last step is the technical stuff, so I'll just leave my setup here. Paste it into Notepad and save it as a .json file; to use it, click Open in the Configuration file menu and it will load everything. Just change the folders to the ones you will be using for training.

Please, Log in or Register to view spoilers
If you set up everything correctly, click the Start training button and wait for it to finish. I have an RTX 3070 and a training run of 3000 steps takes about 30 minutes with this configuration.

After the training finishes, go to the model folder you created, select the .safetensors files there and move them to the Lora folder in A1111. If you followed my instructions you'll have 4 files, each one representing an epoch. Usually the best results come from the 03 file or the file without a number, since they were trained for longer; most of the time you will use the file with no number, but you can test all of them in SD to see which one turned out better.
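If you train a lot of these, copying the epoch files over by hand gets old, so here's a tiny sketch for that (both paths are examples, point them at your own kohya output folder and A1111 install):

import shutil
from pathlib import Path

model_dir = Path("training/zwx_woman/model")               # example kohya output folder
lora_dir = Path(r"C:\stable-diffusion-webui\models\Lora")  # example A1111 Lora folder

for f in model_dir.glob("*.safetensors"):
    shutil.copy2(f, lora_dir / f.name)
    print("copied", f.name)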

These are the results for the Lora that I trained. I couldn't include the prompts because of the character limit, but I used the models epiCRealism Pure Evolution V5, Comic Babes and Rev Animated with the 3D animation Lora; you can find all of them on Civitai. My prompts were variations of the ones used in the demos for these models. Remember to always use Hires. Fix and ADetailer to fix faces.

Please, Log in or Register to see links and images
Please, Log in or Register to see links and images
Please, Log in or Register to see links and images
Please, Log in or Register to see links and images
Please, Log in or Register to see links and images
Please, Log in or Register to see links and images
Please, Log in or Register to see links and images
Please, Log in or Register to see links and images


If you want to test this Lora, I uploaded it at this link:
Please, Log in or Register to see links and images

Feel free to message me if you have any questions
:pepoLove:
 

roachman987

Lurker
Mar 24, 2022
Could you write out what to do once the models have been moved to the Lora folder in Stable Diffusion? (How to integrate it with SD, etc)
 

mkswolft

Bathwater Drinker
Mar 12, 2022
hey! did you try changing these parameters? I have read other guides that recommend higher numbers for greater realism, but also that "alpha" should always be a multiple of dim

"network_alpha": 8,
"network_dim": 32,
 

pkmngotrnr

Bathwater Drinker
Mar 18, 2022
Please, Log in or Register to view quotes
i always recommend the base 1.5 model, because most if not all popular checkpoints have the 1.5 model as their base. so if your lora uses it as base, the resulting lora works with basically all checkpoints, or at least the majority of them. if you use an already trained and specialized checkpoint like epicrealism as the base and then run your lora with any checkpoint other than epicrealism, the results will not be as good as if you had used base 1.5 from the beginning. i tried the same thing with the LazyAmateure checkpoint and thought "if i train my lora on a nude checkpoint the nudes will turn out great"... well, no. the lora was not as versatile and flexible, and used on other checkpoints the quality was ultra shit.

TL;DR Use the base stable diffusion 1.5 model for lora creation
 

backuppd

Lurker
Dec 14, 2023
After finishing the training I get a .json file of just 4 KB instead of a .safetensors file in the "model" folder I used as output. How do I solve this?
 

mkswolft

Bathwater Drinker
Mar 12, 2022
Please, Log in or Register to view quotes
I tried dim 32 / alpha 8 first, then switched to 128/64 like pkmngotrnr suggested, and lastly went big with 512/128. The main thing I noticed was how the folder size changed: it started at 500 MB with 32/8 and shot up to over 5 GB at 512/128. Quality-wise there's a tiny difference, but since storage isn't an issue for me, I'm sticking with 512/128.

if anyone here speaks Spanish, please, let's stop writing in English XD
 

backuppd

Lurker
Dec 14, 2023
Please, Log in or Register to view quotes
The training finishes very quickly, and in the console output there was "ModuleNotFoundError: No module named 'bitsandbytes.cuda_setup.paths'" and "subprocess.CalledProcessError". I forgot to mention that a folder called "sample" also appears in "model".
 

goat199

Diamond Tier
Mar 13, 2022
I had some free time and decided to try some different settings; this time I trained a regular Lora and I was pleased with the results.
I trained the same images with the same captions (using the WD14 method) and no regularization images. For this example I also used Krysten Ritter, since she was the same person I trained before, with the same dataset; I only changed the training method and the captioning, and these are the results:

Please, Log in or Register to see links and images


Please, Log in or Register to see links and images

This xyz plot and the next one are basically the same; the only difference is that I didn't use Hires. Fix for this one and did for the other.
And this is the prompt that I used:

Please, Log in or Register to view spoilers

Please, Log in or Register to see links and images

For this one I tried generating a full-body picture; I used Hires. Fix without After Detailer.
And this is the prompt:

Please, Log in or Register to view spoilers

I used three different models to train this Lora so I could compare the results: the base Stable Diffusion model, Realistic Vision V5.1 and epiCPhotogasm Last Unicorn. I included the initials of each model in the names of the Loras so it would be easy to compare them: SD is the Stable Diffusion base model, RV is Realistic Vision and EP is epiCPhotogasm. You can download these models on Civitai.

I also used the "sks" token to train these images, some people don't like it but I didn't have any problem with it.

As for the results, I still believe that Loras trained on Realistic Vision come out better and are more flexible without losing the person's characteristics.

For some reason the Lora is quite strong and I had to reduce its strength to 0.65; anything higher than that produces a lot of artifacts.
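In A1111 that just means lowering the weight in the LoRA tag in your prompt, something like the line below (the file name here is made up, use whatever you named your Lora):

photo of sks woman, <lora:KrystenRitter_RV:0.65>, looking at camera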

Here's a link with my training configuration if you guys want to test it:
https://jsonblob.com/1191895023888490496

And also here's a link with the results if you want to use these Loras:
Please, Log in or Register to see links and images

If you have any questions feel free to ask me, I'm definitely not a pro at these things but I'll try my best to answer.
:pepoLove:
 

pkmngotrnr

Bathwater Drinker
Mar 18, 2022
Please, Log in or Register to view quotes
If you use any model other than SD 1.5 as your base model for training loras, the result will look horrible when you use the lora with a checkpoint other than the base.

Also, why do you use only 0.65 weight with your lora when rendering images? You train for a real-life person's resemblance; there is no point in training for that and then telling the lora to only use "65% of its potential" to create said person.
 

goat199

Diamond Tier
Mar 13, 2022
Please, Log in or Register to view quotes
I used both models and I believe the results, at least in this training, were better. As for the weight, if I go higher than 0.65 it renders a lot of artifacts; with this weight I managed to keep the subject's likeness without the artifacts.

I'm testing some other configurations as well; some results actually looked better on the SD base model, but there's still room for improvement, so I'll post them when I have more time to focus on training Loras again.
 

goat199

Diamond Tier
Mar 13, 2022
Please, Log in or Register to view quotes
I'm training for about 3000 steps; the earlier epochs usually aren't that accurate to the person. I prefer to generate with a weight of 0.65 and then use ADetailer to fix the face; it has always worked fine for me that way.
Anyway, I'm doing some test trainings with the SD base model using regularization images. They helped with those artifacts, but depending on the prompt the artifacts still appear; I'm probably messing something up with the pictures or the captions, so I'll have to look into it later.
 

pkmngotrnr

Bathwater Drinker
Mar 18, 2022
i dont even recommend reg. images. i have created over 50 lora models so far, and you can definitely go way beyond 3000 steps IF your pictures depict the person in different outfits and positions and you set your network matrix higher than 128x64 in kohya_ss. also use wd14 captioning with a character threshold above 0.75, and make sure you actually set the caption format to .txt instead of .wd14 in kohya_ss

if you want to try out my kohya settings you can find them attached in this guide:
Please, Log in or Register to see links and images
. i recommend setting the max resolution for sd 1.5 lora models to 512x512, or 768x768 if your gpu can handle it

and for my models you can dl them here:
Please, Log in or Register to see links and images


also you might want to check your TensorBoard after your lora training is done. look at the loss/epoch graph and zoom in; any model that sits on a spike in the last third is worth checking out. TLDR: you can read this as "my lora learned something new, e.g. facial features, at the end"

Please, Log in or Register to see links and images
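if you've never launched it, something like this opens the board at localhost:6006 (the log path is just an example, point it at the log folder you set in kohya):

import subprocess

# point TensorBoard at the kohya log folder (example path)
subprocess.run(["tensorboard", "--logdir", "training/zwx_woman/log"])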


if you have any more questions feel free to ask or msg me :)!
 

ouciot

Fan
Mar 12, 2022
Please, Log in or Register to view quotes
3000 is too much. the reason you tune it back down to 0.65 is that 0.65 * 3000 is around 2k (it doesn't actually work like that, but you get the gist), which is in the vicinity of the total-step sweet spot (1.5k-2k)
there's also no reason to train more than one epoch if your first epoch reaches the 1.5k mark (so for example 30 images at 50 repeats). real-life face loras respond horribly to overtraining; if you're not happy with the results, you need a better training dataset
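for reference, the step count in that example works out like this (a sketch with batch size 1 assumed, since a bigger batch divides the step count):

images, repeats, epochs, batch_size = 30, 50, 1, 1   # the example above, batch size assumed
steps = images * repeats * epochs // batch_size
print(steps)   # 1500, right at the 1.5k-2k sweet spot mentioned above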