0 points · kamranjon · 3y ago · 0 comments

Have you tried setting condition_on_previous_text to False?

Jach · 3y ago

Yeah, it can help a bit with looping, but it introduces other problems. I recalled from earlier that tweaking the no_speech_threshold and logprob_threshold settings together helped somewhat, though trying again on a random video, it doesn't do much. It still hallucinates a stream of captions (albeit non-repetitive, though one run had several Touhou-related lines) for what should be 4 minutes of looping background music before the first sentence. If all one needs Whisper for is transcribing English, though, I still think it's pretty decent. On my test video it now 'correctly' transcribes the music as ♪ when I ask it to just transcribe it as English.
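For anyone who wants to try the settings mentioned in this thread, here is a minimal sketch, assuming the open-source openai-whisper Python package. The helper names (build_transcribe_options, transcribe_file) and the "base" model choice are made up for illustration; only condition_on_previous_text, no_speech_threshold, and logprob_threshold come from the discussion, and the defaults shown are the package's own.

```python
def build_transcribe_options(no_speech_threshold=0.6, logprob_threshold=-1.0):
    """Collect keyword arguments for whisper's transcribe() call.

    Hypothetical helper. The defaults 0.6 and -1.0 mirror the library
    defaults; pass other values to experiment.
    """
    return {
        "language": "en",                     # transcribe as English
        "task": "transcribe",                 # as opposed to "translate"
        "condition_on_previous_text": False,  # do not feed prior output back in
        "no_speech_threshold": no_speech_threshold,
        "logprob_threshold": logprob_threshold,
    }


def transcribe_file(path, model_name="base", **thresholds):
    """Run Whisper on one file and return (start, end, text) tuples.

    Hypothetical wrapper; model_name="base" is an arbitrary choice.
    """
    import whisper  # imported lazily so the options helper works without it

    model = whisper.load_model(model_name)
    options = build_transcribe_options(**thresholds)
    result = model.transcribe(path, **options)
    return [(seg["start"], seg["end"], seg["text"]) for seg in result["segments"]]
```

Whisper drops a segment as silence only when its no-speech probability is above no_speech_threshold and its average log probability is below logprob_threshold, so lowering the first and raising the second (for example transcribe_file("video.mp4", no_speech_threshold=0.4, logprob_threshold=-0.5), values picked arbitrarily) makes skipping more likely. As noted above, that does not reliably stop hallucinated captions over music.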
