Automatic1111's WebUI. It's just img2img, run frame by frame.
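If you'd rather script it than click through the UI, the WebUI exposes an API when launched with `--api`. Here's a minimal sketch of pushing extracted frames through the img2img endpoint; the directories, prompt, and denoising value are placeholders you'd tune for your footage:

```python
# Minimal sketch: batch frames through Automatic1111's img2img API.
# Assumes the WebUI was launched with --api; paths and prompt are placeholders.
import base64
import pathlib

import requests

API = "http://127.0.0.1:7860/sdapi/v1/img2img"  # default local endpoint
FRAMES_IN = pathlib.Path("frames_in")           # extracted source frames
FRAMES_OUT = pathlib.Path("frames_out")
FRAMES_OUT.mkdir(exist_ok=True)

for frame in sorted(FRAMES_IN.glob("*.png")):
    payload = {
        "init_images": [base64.b64encode(frame.read_bytes()).decode()],
        "prompt": "anime style, clean lineart",  # example prompt
        "denoising_strength": 0.35,  # lower = frames stay closer to the source
        "steps": 20,
        "cfg_scale": 7,
    }
    r = requests.post(API, json=payload, timeout=600)
    r.raise_for_status()
    # The API returns generated images as base64 strings.
    (FRAMES_OUT / frame.name).write_bytes(
        base64.b64decode(r.json()["images"][0])
    )
```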
The issue is that IRL videos have more moving parts than a cartoon, which makes smearing inevitable. To counter this, you'll see AI video apps (only paid subscription ones for now) use cross-fade blending to transition within a scene once smearing is detected. Unfortunately there's not really a way around it: when something that was never drawn in a keyframe suddenly comes into frame, EBSynth has nothing to recognize it by.
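The cross-fade trick itself is simple enough to roll yourself. A rough sketch, assuming you've already got two stylized segments loaded as lists of same-sized OpenCV frames:

```python
# Rough sketch of the cross-fade idea: linearly blend the last n frames of one
# stylized segment into the first n of the next, hiding the pop at the
# keyframe boundary. Segments are assumed to be lists of same-sized BGR arrays.
import cv2
import numpy as np

def crossfade(seg_a: list[np.ndarray], seg_b: list[np.ndarray], n: int) -> list[np.ndarray]:
    out = seg_a[:-n]
    for i in range(n):
        alpha = (i + 1) / (n + 1)  # ramps from mostly seg_a to mostly seg_b
        out.append(cv2.addWeighted(seg_a[len(seg_a) - n + i], 1.0 - alpha,
                                   seg_b[i], alpha, 0.0))
    out.extend(seg_b[n:])
    return out
```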
Things EBSynth and AI have issues with are shadows, blacks, whites, reflective fabrics, and straight-line geometry (lines, triangles, etc.).
So while cartoons are easier for AI and EBSynth to draw over, making a coherent AI deepfake video is (to put it bluntly) a fucking nightmare no matter what programs you use right now.
What alleviates this a bit is using ControlNet and playing with the weights for HED, normal, and depth, and sometimes canny and scribble. But at 60 fps that's 3,600 frames per minute of footage, and with a typical run leaving somewhere around 50-150 broken frames to redo, you can easily spend all day just working on a minute-long scene.
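For reference, ControlNet units can be stacked onto the img2img payload from the sketch above via the sd-webui-controlnet extension's API. Field names have shifted between extension versions and the model names here are just examples (not gospel), so treat this as a starting point:

```python
# Hedged sketch: stacking ControlNet units onto the img2img payload above via
# the sd-webui-controlnet extension. Field names vary across extension
# versions and the model names are examples, not gospel.
controlnet_units = [
    {"module": "hed",         "model": "control_v11p_sd15_softedge",  "weight": 0.8},
    {"module": "depth_midas", "model": "control_v11f1p_sd15_depth",   "weight": 0.6},
    {"module": "normal_bae",  "model": "control_v11p_sd15_normalbae", "weight": 0.5},
    {"module": "canny",       "model": "control_v11p_sd15_canny",     "weight": 0.4},
]
payload["alwayson_scripts"] = {"controlnet": {"args": controlnet_units}}
# Per-unit weights are the values worth playing with: leaning on HED/depth
# and keeping canny/scribble low tends to hold geometry without freezing it.
```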
You could also mitigate this by lowering the frame rate and then interpolating back up in Topaz or something, but that can sometimes make the motion feel more unnatural.
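If you don't want to pay for Topaz, ffmpeg's minterpolate filter is a free, rougher stand-in for the same idea: generate at a low rate, then motion-interpolate back up. Filenames here are placeholders:

```python
# Sketch of the cheap route: extract at 12 fps for stylizing, then let
# ffmpeg's minterpolate filter (motion-compensated interpolation, a rough
# stand-in for Topaz) rebuild 60 fps. Filenames are placeholders.
import pathlib
import subprocess

pathlib.Path("frames_in").mkdir(exist_ok=True)

# Extract a reduced-rate frame sequence from the source clip.
subprocess.run(["ffmpeg", "-i", "input.mp4", "-vf", "fps=12",
                "frames_in/%05d.png"], check=True)

# ...stylize frames_in -> frames_out, then reassemble and interpolate up.
subprocess.run(["ffmpeg", "-framerate", "12", "-i", "frames_out/%05d.png",
                "-vf", "minterpolate=fps=60:mi_mode=mci",
                "-pix_fmt", "yuv420p", "smoothed.mp4"], check=True)
```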
Take this video for example: it was just one edited frame, and it took me close to 3 hours of playing with the settings on 8 GB of VRAM with SD 1.5 as a base to get it barely coherent. I still consider it a piece of garbage.