Guide Fake AI [Kohya_SS] [Booru] [WD14] Automated picture gathering and selection for ai model lora training

  • Welcome to the Fakes / AI / Deepfakes category
    When posting AI generated content please include prompts used where possible.

pkmngotrnr

Bathwater Drinker
Mar 18, 2022
426
10,365
1,437
0fya082315al84db03fa9bf467e3.png
i wrote a small guide on how i download and tag all my pictures for lora model training at civitai:
Please, Log in or Register to see links and images
(same text as below)
you can find my models resulting from this guide here:
Please, Log in or Register to see links and images


requriments:

  1. Kohya_SS
    Please, Log in or Register to see links and images
  2. instaloader
    Please, Log in or Register to see links and images
  3. Booru
    Please, Log in or Register to see links and images

i make use of downloading a whole instagram profile and tagging ALL images in WD14 Captioning beforehand to sort out the pictures to train, by deleting the pictures with undesired tags, instead of going through the pictures 1 by 1 and deciding by hand. if you want to filter out pictures faster for your lora training. keep on reading this guide :)
Download the instagram profile of your favourite person you want to create a lora of.

i will use
Please, Log in or Register to see links and images
for this tutorial.

to make use of instaloader, open up cmd in the same path as the instaloader.exe and run the command according to her instagram tag. e.g. we need bambiskii
the command to download only her pictures, without the metadata, text or videos looks like this:
instaloader profile bambiskii --no-videos --no-captions --no-metadata-json --no-compress-json
hit enter and wait for the download to finish:

the pictures will be in a folder with the same name as the insta tag so "bambiskii".

now we need 3 new folders which you should create inside the bambiskii folder named
  • img
  • mod
  • log
so your folder structure should look like this:

bambiskii/img
bambiskii/mod
bambiskii/log

Put all the downloaded pictures into the "img" folder

we now need to tag all those images with WD14 captioning which can be found inside of the Kohya_SS Tool, the then tagged images help us sort out the unwanted stuff instead of going through all pics 1 by 1.

Open up Kohya SS and go to "Utilities" -> "Captioning" -> "WD14 Captioning"

801ebe5c-152f-473c-b7ef-46149a30feb4.jpeg


To get better person/facial recognition increase the "character threshold" to 0.7

e133475a-5db1-4afb-afd0-f688300d6c8e.jpeg


Choose the folder "img" in the "image folder to caption" section at the top

a97d296a-3da6-46eb-83ec-9a96c6c12237.jpeg


click on "caption images" and wait for it to finish
(there will be a message in the cmd window)

13:40:41-111336 INFO ...captioning done


Open up Booru Dataset Manager
and open up the folder we just tagged with WD14
the important stuff is on the right, which is a list of all the tags inside of this folder.
now choose (with CTRL pressed) all tags that you DONT want
e.g. "1boy, 2girls, multiple people" etc.
anything besides "1girl" basicly
for me it looked smth like this

1825954a-0303-4bc5-91b4-b09d19a64379.jpeg


now click on the filter icon to only get pictures you want to delete

f662fc4a-feb2-42b3-886c-4b35fd78a4d4.jpeg
d34d1d69-14f1-40f1-b4de-f96babd1092a.jpeg


mark all the pictures on the left side with CTRL+A and press DEL to delete all the unwanted pictures from your dataset.
now click on "1girl" on the right and on the filter option button until it says "NOT"
now you have every picture that does not contain the tag "1girl"

fc0e4507-cd4a-42ec-8ab5-cdedce5db2a0.jpeg


and mark all the pictures on the left again with CTRL+A like before and delete them.
as a last step in Booru you can now add your trigger word to your dataset.
click on the green plus sign on the right. enter your trigger word. for me its bambiskii, choose "TOP" so its the first tag on all pictures , go to "save changes" on the top left and thats it, you are done in booru

85f20878-65a5-4c26-ad7c-9c4efc070df9.jpeg


now just go to kohya_ss and train your lora with your favourite settings.
there are already a million guides on kohya_ss lora training and i dont think i can add anything special to it.
i uploaded my preset settings for kohysa
Please, Log in or Register to see links and images
(for NVIDIA GPUs) if you want to check it out.
 

mkswolft

Bathwater Drinker
Mar 12, 2022
154
5,374
1,252
0fya082315al84db03fa9bf467e3.png
this information is really useful, ty! and do you also recommend 3000 steps in total? I mean, if I have 400 dataset images, should I train each photo 4 times?
folders: images > 05_name > dataset ?
I always trained with 20 images max, and the folder name started with 10_
I'm trying to create a
Please, Log in or Register to see links and images
lora
 
  • Like
Reactions: SuperflyJohnson

pkmngotrnr

Bathwater Drinker
Mar 18, 2022
426
10,365
1,437
0fya082315al84db03fa9bf467e3.png
i always use as much pictures as i can. i dont care about the 3000 steps rule. i made a lora with 12.000 steps and everything was fine. just download all the pics from insta, tag them with booru and delete the ones where there is more than her on it. and just go and let it run. i run mine for 10-20 epochs, with 1 step for every picture (if i have alot of pictures) and i get over 3000 steps in total easy
 

pkmngotrnr

Bathwater Drinker
Mar 18, 2022
426
10,365
1,437
0fya082315al84db03fa9bf467e3.png
Please, Log in or Register to view quotes
if its all of the same person, in good quality, and the focus is mainly on the face in those 650 pics, you can just do 1 repeat and 5 epochs resulting in 3250 steps in total. otherwise do 2 repeats but only 3 epochs, resulting in 3900 steps in total.
 
  • Like
Reactions: greeneye

jaaas

Tier 3 Sub
Mar 16, 2022
13
301
537
0fya082315al84db03fa9bf467e3.png
hey pkmngotrnr love your work thanks for making your guide (y)
and i agree just searching for good images can be a pain , that's why i created a little python download script that downloads batch images from alarmy just simply copy url's into a txt file run the script and done.
i find it really useful and quick to get good images and i think others will too :-)
am more than happy to share it but only with permission from staf before people think am trying to share malicious code or something

Please, Log in or Register to see links and images
 
  • Like
Reactions: Krammanheat