Exactly! Although, one thing I noticed the other day that would also need to be addressed with this method: I ran a scrape, subscribed to a new person, then ran another scrape, but it didn't pull the new person down. I had to stop the program and restart it, and only then did it pull the new person in the next scrape.
Somehow I never thought of that approach before. I could probably make that the way things are grabbed from the API, and it would likely make scraping a lot quicker. It would require me to rework most of what happens currently, but it could be a big win for scrape times. I can see where this would be useful: right now, if you want just the latest 10 posts, you have to wait for every post to be collected before you get the 10 you actually want.
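The "grab only the latest 10" idea is essentially lazy pagination: yield posts page by page and stop as soon as you have enough, instead of collecting the full history first. A minimal sketch of that pattern, where `fetch_page` is a hypothetical stand-in for the real API call (names and page shape are assumptions, not the project's actual code):

```python
from itertools import islice
from typing import Dict, Iterator, List

def fetch_page(user_id: int, offset: int, limit: int) -> List[Dict]:
    # Hypothetical stand-in for the real API call; returns one page of posts.
    total = 25  # pretend this account has 25 posts
    return [{"id": i} for i in range(offset, min(offset + limit, total))]

def iter_posts(user_id: int, page_size: int = 10) -> Iterator[Dict]:
    """Yield posts one page at a time instead of collecting everything up front."""
    offset = 0
    while True:
        page = fetch_page(user_id, offset, page_size)
        if not page:
            return
        yield from page
        offset += len(page)

# Grab just the latest 10 posts; only the first page is ever requested.
latest_ten = list(islice(iter_posts(user_id=1), 10))
```

Because `iter_posts` is a generator, `islice` stops it after the tenth post, so later pages are never fetched at all.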
Makes a lot more sense now you've put it that way. I can do this by adding a value to auth.json: by default it won't scrape again after it completes and you have to start it again manually, but if you specify a time in minutes it will wait that long and then scrape again. Your other point about the subs not being pulled down is a simple fix; it just needs to call the API again to get the list of subscriptions. Currently I think it's using the list from when you first start the program.
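Both fixes fit naturally in one loop: re-fetch the subscription list at the top of every cycle (so new subs appear without a restart), and read the delay from the config to decide whether to loop at all. A rough sketch, where the `rescrape_minutes` key and the helper names are illustrative assumptions rather than the project's real schema:

```python
import json
import time

def get_subscriptions():
    # Hypothetical API call; called every cycle so new subs show up mid-run.
    return ["alice", "bob"]

def scrape_all(subs):
    # Placeholder for the actual per-subscription scrape.
    return len(subs)

def run(config: dict) -> int:
    """Run one scrape; repeat every `rescrape_minutes` only if set (> 0)."""
    cycles = 0
    while True:
        subs = get_subscriptions()  # refreshed each cycle, not cached at startup
        scrape_all(subs)
        cycles += 1
        delay = config.get("rescrape_minutes", 0)
        if not delay:
            return cycles           # default: stop after a single pass
        time.sleep(delay * 60)

# In practice this dict would come from auth.json.
config = json.loads('{"rescrape_minutes": 0}')
cycles = run(config)
```

With the key absent or 0 the loop exits after one pass, which preserves the "manual restart" default described above.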
I was more thinking of just leaving it as the default: if the metadata folder doesn't exist, it knows to start fresh, because nothing would have been downloaded according to what's in the posts, messages, etc. tables in the DB file. And if it gets stopped partway through, it can find the last post/media it needs to start from in the DB.
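That fresh-start/resume check can be sketched in a few lines: treat a missing metadata folder as a fresh start, otherwise ask the DB for the last post recorded. The file name, table name, and column here are placeholders, not the project's actual schema:

```python
import os
import sqlite3
import tempfile

def open_db(metadata_dir: str):
    """Return (connection, fresh): fresh is True when the metadata folder was absent."""
    fresh = not os.path.isdir(metadata_dir)
    os.makedirs(metadata_dir, exist_ok=True)
    conn = sqlite3.connect(os.path.join(metadata_dir, "user_data.db"))
    conn.execute("CREATE TABLE IF NOT EXISTS posts (id INTEGER PRIMARY KEY)")
    return conn, fresh

def resume_point(conn):
    """Highest post id already saved, or None when nothing has been downloaded."""
    return conn.execute("SELECT MAX(id) FROM posts").fetchone()[0]

with tempfile.TemporaryDirectory() as tmp:
    meta = os.path.join(tmp, "metadata")
    conn, fresh = open_db(meta)        # folder missing -> start fresh
    first_resume = resume_point(conn)  # None: nothing downloaded yet
    conn.execute("INSERT INTO posts VALUES (7)")
    conn.commit()
    conn2, fresh2 = open_db(meta)      # simulated restart: folder now exists
    last = resume_point(conn2)         # pick up from post 7
```

On a restart mid-scrape, `resume_point` gives the last saved post, so the scraper can continue from there instead of re-downloading everything.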