LocalVocal: Local Live Captions & Translation On-the-Go v0.3.9

Grumpus

New Member
This is great!
I only want to use this on certain scenes, where I'll be translating from English to Japanese, so I'd like to unload everything on other scenes to free up resources, maybe with a start/stop button for the plugin.
Is this possible/already implemented?

Thanks for all your hard work!
 

royshilkrot

Member
Just hit the visibility button (eye icon) on the filter and it will deactivate.
 

royshilkrot

Member
English works quite well, although there is a delay.. Polish... I don't know if this is a problem in -Model Whisper-.
Where can I get a ready-made model for Polish? or.. how to create it? /step by step/
The best performance will come from Whisper Large v3, but you need a powerful GPU to run it in real time.
There aren't many Whisper models for specific languages, but some can be found on huggingface.co.
You could search for "whisper gguf" and then look for a Polish fine-tune. I doubt there is one, though.
 

Grumpus

New Member
"Just hit the visibility button (eye icon) on the filter and it will deactivate"
I tried this, but the VRAM usage was still there, and so were the GPU load during translation and the reduced memory clocks that come with using CUDA.

I'll have another look and let you know, I think I may have tried everything I could do to no avail though.
 

Grumpus

New Member
Just an update: as soon as OBS loads, it loads the models etc. into memory, even if visibility is off for everything.
[Attached image: 1717890178208.png]
 

Grumpus

New Member
Dunno how to edit a post, if I can at all. Just wanted to note that the GPU load from translation is NOT there when on other scenes, but the memory usage and memory clock reductions are still there.
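
One way to put numbers on this is to read VRAM usage before and after toggling the filter. The following is a minimal sketch, assuming an NVIDIA GPU with the `nvidia-smi` tool on the PATH; the helper names `parse_gpu_memory` and `read_gpu_memory` are hypothetical and not part of OBS or LocalVocal.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' MiB pairs (one GPU per line) into a list of tuples."""
    readings = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        readings.append((used, total))
    return readings


def read_gpu_memory():
    """Ask nvidia-smi for used and total VRAM in MiB (NVIDIA GPUs only)."""
    output = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_memory(output)


# Example (run once with the filter visible and once with it hidden):
#   for index, (used, total) in enumerate(read_gpu_memory()):
#       print(f"GPU {index}: {used} MiB of {total} MiB in use")
```

Comparing the two readings shows how much VRAM stays allocated while the filter is hidden. Note that the figure covers every process on the GPU, not just OBS.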
 

momomelo

New Member
Hi Roy!

This plugin made me finally make an OBS forum account after 10 years of not doing so.
Thank you for your incredible work on this. Genuinely, this is a great accessibility tool that allows communication with an audience beyond the native stream language. It's wild to think that we can do this now, compared to even three or four years ago! Much love <3

I have a question, and I'm not sure whether this is the appropriate place to ask, or whether the answer already exists and I've just missed it.
Is there an explanation of what the labels in the model settings mean, or a wiki with additional context, so I know what specifically I would be changing by clicking a checkbox or moving a slider?

E.g., under Whisper Model Parameters, I'm unsure what "Speed Up" does.

My goal is to run the model and have the text appear on screen as close to real time as possible while using buffered output. I'm aware that hardware limitations and model size have a big impact, but I'd like to better understand what I can tweak, such as lowering the context, to get as close as I can.
 

royshilkrot

Member
Thanks for the kind words!

Frankly, the settings won't affect speed nearly as much as your choice of model and your hardware, which matter more by an order of magnitude.
If you have an Nvidia GPU, get the CUDA version of the plugin. If you have an AMD GPU, get the CLBlast version.
Hardware acceleration will boost speed by at least 5x-10x over CPU.
If you want truly high-accuracy real-time captions, invest in a powerful GPU.
You can also stick to small models like Base and Small.
Get a mic with good audio quality, and add noise suppression.

To present the captions, you can add a delay to your audio or video so they sync up, or clone the audio.
There are some tricks.
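
To make the speed and sync advice concrete, here is a rough sketch of how one might judge whether a model keeps up with live speech and how much delay to add. It assumes a simplified model in which a chunk of audio is buffered in full and then transcribed; the chunk length and timing below are hypothetical numbers, not measurements from the plugin.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Ratio of compute time to audio length; below 1.0 means the model keeps up."""
    return processing_seconds / audio_seconds


def caption_delay_seconds(processing_seconds, audio_seconds):
    """Rough lag between speech and caption: buffer the chunk, then transcribe it."""
    return audio_seconds + processing_seconds


# Hypothetical measurement: a 3-second audio chunk transcribed in 0.6 seconds.
chunk_seconds = 3.0
transcribe_seconds = 0.6

rtf = real_time_factor(transcribe_seconds, chunk_seconds)
delay = caption_delay_seconds(transcribe_seconds, chunk_seconds)

print(f"real-time factor: {rtf:.2f}")
print(f"keeps up with live speech: {rtf < 1.0}")
print(f"delay to add to audio/video for sync: about {delay:.1f} s")
```

A smaller model or GPU acceleration lowers the transcription time, which reduces both the real-time factor and the delay needed for sync.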
 

royshilkrot

Member
royshilkrot updated LocalVocal: Local Live Captions & Translation On-the-Go with a new update entry:

v0.3.2 Improvements all around! Caption presentation, logs and bugfixes

Lots of things going on in this busy release!

  • Adding filter & replace option
  • Improving buffered output a lot, making it much nicer (e.g. word wrapping, word-by-word mode)
  • Improving the translation-sentence cadence
  • Fixing logs and potential crash-causing bugs

 

luisamanetti

New Member
Hi, thanks for this fabulous plugin. It is very useful and I would like to use it. I use the Large v3 model for Italian input, which it understands very well, but unfortunately it translates badly. I have tried all the models. I want to translate from Italian to English, French, German and Spanish, but the program invents words or doesn't understand them. Could you tell me how to download a model suited to these translations? I see that you can also load an external model, but I don't know how to do it. Thanks.
 

royshilkrot

Member
Thanks for using the plugin.
Sounds like you're looking for a dedicated Italian-English model.
The plugin accepts CTranslate2 models, so you can search Hugging Face for a "ct2 it en" model, download it and try it out.
It's not a simple thing, but you can figure it out with some trial and error.
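
As a small aid for that search, the sketch below builds a Hugging Face model-search link from free-text terms. The helper name `hub_search_url` is hypothetical, and the sketch only constructs the URL. It does not check that a suitable model exists.

```python
from urllib.parse import urlencode


def hub_search_url(*terms):
    """Build a Hugging Face model-search URL from free-text search terms."""
    return "https://huggingface.co/models?" + urlencode({"search": " ".join(terms)})


# Search for CTranslate2 ("ct2") Italian-to-English models.
print(hub_search_url("ct2", "it", "en"))
```

Swapping the language codes (for example "it" and "fr") gives the equivalent search for the other target languages.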
 