glandalf

Superfan
Mar 14, 2022

My workflow may very well be flawed, because I'm far from a Linux power user; I'm very much still learning the ropes when it comes to more advanced usage. I'm taking a far more manual approach to the tool, at least at first, because I'm going through an organization/clean-up task using the -H and -S flags on your tool, as well as running dedupes with video and image comparisons using other tools, and re-encoding some stuff to H.265, all in the name of freeing up hard drive space. I was trying to script something a bit more automatic using Nemo action scripts, but I couldn't seem to get ripandtear to launch from them. Then I tried to create a script that would cd into a directory of my choosing and run the tool, but again I failed miserably lol. So basically my suggestion of being able to set the directory in the tool itself comes from failure/frustration haha. Thank you for considering it :)

At the moment I'm using nnn to navigate my library, where I can spawn a shell inside a directory and then run ripandtear from an alias I set up.

If you have any suggestions for ways I could be using the tool better, I'll happily look into them. Always looking to learn and try new things, even if I fail miserably at them haha.
 

glandalf

Superfan
Mar 14, 2022
So my folder names aren't too great when it comes to compatibility with the tool, it seems. Each one is a combination of the model's different usernames that I've found, plus different tags for different attributes, e.g. if they have tattoos. So my folder names look like this:
glandalf - gland_alf (Tattoo - Tall)

So what I'm going to do is something along the lines of this:

Bash:
Please, Log in or Register to view codes content!
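The rough shape is just looping over every model folder and running RAT inside it with the flags I'm already using (sketch only, untested):

Bash:
for dir in */; do
    (cd "$dir" && ripandtear -H -S)
done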

While testing this out, though, I found that ripandtear doesn't seem to like it when there is a period/full-stop "." in the directory name, so I might run a bulk rename to remove "." from folder names.
 

glandalf

Superfan
Mar 14, 2022

That sounds like a cool idea. I think it would be good to add a generic names category, because I've also been storing old usernames from defunct accounts the model used to use, for searching purposes, so it wouldn't be good to store those old names under the main site categories in the rat file.


I'm starting to learn Python, but I don't think I'm quite there yet to develop something like that. It might give me a reason to try to expedite my learning though lol.


For now I'm just trying to clean up my library en masse to free up some drive space, so at the moment I don't want to go through and manually assign an account name or URL to each model; I just want to create a rat file that can start logging the hashes. I will eventually take the time to add usernames and URLs, but clearing space is my initial goal.
 

Jules--Winnfield

Cyberdrop-DL Creator
Mar 11, 2022
Nifty program you've got going. Gonna have to give it a try later today. I'm mostly interested in the reddit portion personally.
 

johnny.barracuda

Bathwater Drinker
Sep 26, 2022
It should be pretty easy, and it will be a good way to learn about working with paths. There are two ways you can work with the file system in Python: the os module or pathlib's Path. Path is newer and what most people recommend using. All you need to do is put the function in a .py file, then, if you are on Linux, add it somewhere within your $PATH and make sure it is executable. Then you can call it like any other program.
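For example (the file name here is just an example, and I'm assuming ~/.local/bin is on your $PATH):

Bash:
# make sure the first line of the file is a shebang, e.g. #!/usr/bin/env python3
chmod +x ~/.local/bin/folder_to_rat.py
folder_to_rat.py    # now it can be called like any other program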

Just making assumptions about your file structure, but if all your folders look like glandalf - gland_alf (Tattoo - Tall), then you could grab the folder name using "stem" from the Path module and store it in a variable. Then split that variable on " (". That will return a list with two entries: ["glandalf - gland_alf", "Tattoo - Tall)"]. Then you can do some trimming and isolating of names on the first entry until you have a list of clean names, and do the same to create another list for the tags. You could end up with something like this.

Python:
Please, Log in or Register to view codes content!
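Roughly like this (untested, just to show the shape):

Python:
from pathlib import Path

def parse_folder(folder: Path):
    # "glandalf - gland_alf (Tattoo - Tall)" -> (names, tags)
    raw = folder.stem
    left, _, right = raw.partition(" (")
    names = [part.strip() for part in left.split(" - ")]
    tags = [part.strip() for part in right.rstrip(")").split(" - ")]
    return names, tags

names, tags = parse_folder(Path.cwd())
# names -> ["glandalf", "gland_alf"]
# tags  -> ["Tattoo", "Tall"]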

Once you have those two lists, you can loop over them one at a time (names first, then tags) and use subprocess to call RAT and add them to your .rat file. The --generic-name and --tag flags are stand-ins; I will probably add those names or something similar.

Pseudocode:

Python:
Please, Log in or Register to view codes content!
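In other words, roughly (the flag names are only stand-ins):

Python:
import subprocess

# names and tags come from the parsing step above;
# --generic-name and --tag are placeholders for whatever the real flags end up being
for name in names:
    subprocess.run(["ripandtear", "--generic-name", name], check=True)

for tag in tags:
    subprocess.run(["ripandtear", "--tag", tag], check=True)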

Just an idea off the top of my head, take it or leave it. It might sound intimidating, but all you really need to do is get the folder name, then break apart the name into manageable pieces and add them to a list. I am doing updates to my home server, but once I am done I will try implementing what I was talking about. I will work on adding the generic names and tags first in case you end up wanting to do what I just mentioned.
 

johnny.barracuda

Bathwater Drinker
Sep 26, 2022
Wow! Thank you for the compliment! It means a lot coming from you, seeing as I would take a peek at your source code whenever I got stuck and needed inspiration. I am just an amateur programmer, so please don't judge me too much.

My reddit extractor is pretty ugly at the moment. I have been meaning to go back and clean it up, but haven't gotten around to it. I was planning on adding the ability to download the text posts that users make, so I figured I would clean it up then.

Here is a link to the gitlab if you hadn't seen it -
Please, Log in or Register to see links and images


Just for quick orientation, the philosophy I use is to create a dictionary that contains all the information needed to download a file. It can be passed around between the different extractors, each adding necessary information. The most common use case is a url_dictionary being created in the reddit extractor, which adds the name of the text post, then passes it to the redgifs extractor to find the actual download link.

utils/content_finder - where most of the calls to find content are made.

utils/tracker - a singleton that all the extractors connect to. Whenever a url_dictionary is completed it is added to the tracker. After all the extractors are done finding everything, the tracker cleans up duplicate links and then passes everything to the downloader.

utils/custom_types - where you can see all the fields that can be collected. Not all of them are used for every file, though.
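Conceptually a url_dictionary ends up looking something like this (heavily simplified; the real field names live in utils/custom_types):

Python:
# simplified illustration only - see utils/custom_types for the actual fields
url_dictionary = {
    "url": "https://www.redgifs.com/watch/example",  # link found in the reddit post
    "name": "title of the text post",                # added by the reddit extractor
    "download_url": None,                            # filled in later by the redgifs extractor
}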
 

johnny.barracuda

Bathwater Drinker
Sep 26, 2022
Damn, this is embarrassing, but I didn't realize that name was an option. That is definitely way better than stem. So much better, in fact, that I already patched it and pushed an update. Do you want me to add your name as a contributor to the project? I am being serious.
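For reference, the difference with a dotted folder name:

Python:
from pathlib import Path

folder = Path("some.model (Tattoo - Tall)")
print(folder.stem)  # 'some' - everything after the dot gets treated as a suffix and dropped
print(folder.name)  # 'some.model (Tattoo - Tall)' - the full folder name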
 

johnny.barracuda

Bathwater Drinker
Sep 26, 2022
Here is the fish function I have for mkcd

Bash:
Please, Log in or Register to view codes content!
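It's basically just mkdir -p followed by cd, so roughly:

Bash:
function mkcd
    mkdir -p $argv[1]
    and cd $argv[1]
end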

This is how I would make a folder for https://simpcity.su/threads/greatmoongirl-danibunnygirl.21/

Bash:
Please, Log in or Register to view codes content!
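So for that thread it ends up being roughly this (usernames and URLs here are placeholders):

Bash:
mkcd greatmoongirl
# ...fill in whatever usernames I found in the thread with the name flags, then:
ripandtear -sa                       # sync everything (reddit, redgifs, coomer)
ripandtear -d '<specific file url>'  # grab one-off files I want from the thread
ripandtear -SH                       # hash and sort once it is all downloaded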

mkcd to create and move into the new directory

then use all of the different flags to fill in the names within the .rat file, -sa to sync everything (reddit, redgifs, coomer), and -d to download specific files I find in the thread that I want. This is how I would set up a new user for the first time. I use a lot of keyboard shortcuts so I can switch between monitors and windows without having to move my hands, so I am pretty fast.

After creation I would just need to go into the directory and run ripandtear -sa to sync everything, or ripandtear -d if I wanted to download specific files.

If you want to hash and clean up everything then you could throw in a -H -S (or -SH to combine and shorten)
 

glandalf

Superfan
Mar 14, 2022
ok cool, so you do just build the full command each time with only the arguments needed.

I might try to create a separate script that looks at a template and then builds the RAT command with only the arguments that contain a value.

Rough example:
template file
Code:
Please, Log in or Register to view codes content!
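e.g. one entry per flag, something like (exact format still to be decided):

Code:
reddit:
redgifs:
instagram:
coomer:
generic: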

Then have a script that would turn this into something like:
Code:
Please, Log in or Register to view codes content!
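i.e. only the flags that actually got a value (the values here are placeholders):

Code:
ripandtear -i 'instagram_username' -g 'ManyVids:manyvids_username'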

So whilst I'm browsing a thread I can just copy what I want into the template as I go, and when I'm finished I can just execute it.
 

johnny.barracuda

Bathwater Drinker
Sep 26, 2022
Released an update last night that fixed an issue with the file extensions after files were downloaded. RAT now makes sure the file has the correct file extension, in lower case, after it has been successfully downloaded.
 

glandalf

Superfan
Mar 14, 2022
No rush at all mate, in your own time; I'm just happy you're considering it. Currently I'm adding them to the rat file as a generic value (e.g. -g 'ManyVids:username').

Nice one. I've still got to write a script to go through my files and update existing uppercase extensions to lowercase so the sorter picks them up. If you run the hash a second time in a folder that's already been hashed and sorted, the sorter removes the existing files because it sees them as dupes, since their hashes are already in the rat file. I've got a command that will do it; I just need to iterate over my directories and run it.
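Roughly something like this per folder (untested here, so treat it as a sketch):

Bash:
for f in *.*; do
    [ -f "$f" ] || continue
    ext="${f##*.}"
    lower=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')
    [ "$ext" != "$lower" ] && mv -- "$f" "${f%.*}.$lower"
done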


Sorry to keep asking for stuff, but would you mind adding .ts to the video_suffix variable in the File_Sorter.py util? Again no worries if not.

Another thing is that after upgrading to 0.9.17, I got errors about the magic module. So you might want to add python-magic to the dependencies.
 

johnny.barracuda

Bathwater Drinker
Sep 26, 2022
Using the API, like I am right now, most likely not. I was just reading about this earlier, and MAYBE I might be able to use Playwright to pretend I am a regular user to gather the links, but who knows if they will block that method. Reddit seems like it is going to become a dead end in terms of downloading content, which is really unfortunate. We just need to keep an eye out for where all the girls migrate to.
 

DaFapper

Lurker
Oct 7, 2022

That is too bad. Actually, I just saw on reddit that a user mentioned this, and I wonder if it could be used as a workaround in the future. Since the official reddit app will have access to the full API, maybe this is feasible. I'm not a dev though, so I have no way of knowing whether it is or whether this dude is just talking out of his ass. Each user would probably have to extract their own key, but if it's doable then someone will probably come up with a guide.

 

johnny.barracuda

Bathwater Drinker
Sep 26, 2022
I think it's one of those situations where time will tell. I don't know how it is going to be implemented, so I can't start working on a workaround right now. I know they say that with their API access you will only get 100 requests per minute if you use OAuth. RAT currently does that, but instead of waiting for the cool-down like you are supposed to, RAT just asks for a new API token as if it were a different user to reset that limit. In theory this means the decrease in API requests shouldn't affect RAT in that sense. The only problem is them saying they are not going to show NSFW content in the API. That is what will break RAT.

As long as we can still download direct file links (i.redd.it/asdf123.png) we should be fine. I should be able to build a generic scraper to find the images/videos. It will just cost them a lot more data because I will need to load the whole website, but fuck them.
 

glandalf

Superfan
Mar 14, 2022
Got my script that builds the rat command from a template working.

I created a json file like this:
JSON:
Please, Log in or Register to view codes content!
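The keys just mirror the flags I care about (the particular keys here are only an example):

JSON:
{
    "reddit": "",
    "redgifs": "",
    "instagram": "",
    "coomer": "",
    "generic": ""
}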

I'll then populate this with values as I browse a thread.

Then I run:
Python:
Please, Log in or Register to view codes content!
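The script itself is basically just this (simplified sketch; the file name and the key-to-flag mapping are illustrative):

Python:
import json
import subprocess

# map template keys to their ripandtear flags (only a couple shown here)
FLAGS = {
    "instagram": "-i",
    "generic": "-g",
}

with open("template.json") as f:
    template = json.load(f)

cmd = ["ripandtear"]
for key, value in template.items():
    if value and key in FLAGS:   # skip anything I left empty
        cmd += [FLAGS[key], value]

subprocess.run(cmd, check=True)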

So, for example, if a model doesn't have an instagram and I therefore don't populate that key in the JSON, -i just won't be appended to the command.
 

johnny.barracuda

Bathwater Drinker
Sep 26, 2022
Ahhhh ok, I understand what you are doing now. Earlier when you first mentioned it, I thought you were going to build a script to input the commands into RAT for a brand-new model, and I thought you were being a bit redundant, but you were talking about the models you already have and made templates for.

Looks really good. Pretty simple and I hope it works well. Hopefully after this you will truly be able to harness the power of RAT and it will be helpful for you in it's natural state.