Speech Recognition (Whisper)

Mon May 29, 2023 6:16 pm

It has been awesome using the new Whisper speech recognition. Accuracy is way up over VOSK. A few minor tweaks would help:

1) The length of the subtitles can't be adjusted or limited. VOSK produces a pretty reasonable length, 8~15 characters or so. It'd be good to have this as an option, even if it is only a post-processing option. If there is a tool that does this, please point me to the source. Thanks.

2) I use Chinese as the speech recognition language. Sometimes it returns Simplified text, other times Traditional text. It isn't a big deal; I just do some post-processing with Google Translate. If there is a parameter for this that I haven't learned how to use, please point me in the right direction. Thanks.

3) Since I already have the transcript, is there a way to use the subtitle editor just to set the timing of the speech? It would be great to have this option. Maybe the current version, 23.04.1, already has this option and I just haven't learned how to use it. Again, please let me know. Thanks!
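
To make item 1 concrete, here is a minimal Python sketch of the kind of post-processing meant there. It is only an illustration: the function name `split_cue` and the `max_chars` parameter are made up for this example and are not part of Whisper or the subtitle editor. It cuts one subtitle into pieces of at most `max_chars` characters and divides the time span in proportion to the length of each piece, which is only a rough guess at the real speech timing.

```python
def split_cue(start, end, text, max_chars=15):
    """Split one subtitle cue into pieces of at most max_chars characters.

    start and end are times in seconds. The time span is shared out in
    proportion to the length of each piece (an assumption, not real
    speech alignment).
    """
    text = text.strip()
    if len(text) <= max_chars:
        return [(start, end, text)]

    # Cut by character count; this suits Chinese text, which has no spaces.
    pieces = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    total = len(text)
    cues = []
    piece_start = start
    consumed = 0
    for piece in pieces:
        consumed += len(piece)
        # End time is proportional to the characters consumed so far.
        piece_end = start + (end - start) * consumed / total
        cues.append((piece_start, piece_end, piece))
        piece_start = piece_end
    return cues


if __name__ == "__main__":
    for cue in split_cue(0.0, 6.0, "今天天气很好我们一起去公园散步吧", max_chars=8):
        print(cue)
```

Cutting at a fixed character count can break in the middle of a phrase, so a real tool would probably prefer to cut at punctuation or use word-level timestamps if the recognizer provides them.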

Bei
