is what I use. It's free, you just need to normalize, then extract the audio, then upload the audio to Google Drive, then run the process on Google Colab.
A 3-hour video like Canan's takes about 45~60 minutes to finish, and the accuracy depends heavily on the audio quality and the speaker's accent/pronunciation.
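For anyone wanting to try the steps above, here is a rough sketch of the extract/normalize part plus a quick time estimate. The comment doesn't say which tool does the normalizing and extracting, so ffmpeg (with its `loudnorm` filter) is my assumption here, and the file names, the 16 kHz mono settings, and both function names are made up for illustration. The estimate just scales the 45~60 minutes per 3 hours figure linearly, which may not hold for every video.

```python
import subprocess


def build_extract_command(video_path, audio_path):
    """Build an ffmpeg command that drops the video track and normalizes loudness.

    Assumption: ffmpeg is the tool; the original comment does not name one.
    """
    return [
        "ffmpeg",
        "-i", video_path,   # input video
        "-vn",              # no video in the output, audio only
        "-af", "loudnorm",  # loudness normalization filter
        "-ac", "1",         # mono (assumed setting)
        "-ar", "16000",     # 16 kHz sample rate (assumed setting)
        audio_path,
    ]


def estimate_minutes(video_hours):
    """Scale the 45~60 min per 3 h figure linearly (an assumption)."""
    low = video_hours * 45 / 3
    high = video_hours * 60 / 3
    return low, high


if __name__ == "__main__":
    cmd = build_extract_command("stream.mp4", "stream.wav")
    print(" ".join(cmd))
    # Uncomment to actually run it; needs ffmpeg installed:
    # subprocess.run(cmd, check=True)
    print(estimate_minutes(3))
```

The resulting audio file is what you would then upload to Google Drive before running the rest on Colab.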
It's really funny how YouTube once recommended me a Canan stream, then once it ended I was immediately sent to an archived Noel ASMR stream. YouTube knows what's up lmao