LocalVocal: Local Live Captions & Translation On-the-Go v0.5.3

adamesek · Jun 7, 2024

Now VAD = 0.95; subtitles are not displayed correctly on the screen... some words and sentences are invisible.

royshilkrot · Jun 8, 2024

adamesek said:
Now VAD = 0.95; subtitles are not displayed correctly on the screen... some words and sentences are invisible.

It's hard to understand what is going on. I'm happy to look at some recordings on the Discord https://discord.gg/Mu8WGw2Nka to get a better understanding.

Grumpus · Jun 8, 2024

This is great!
I would only want to use this for certain scenes where I will be translating from English to Japanese, so I would like to unload everything when not on specific scenes to free up resources, maybe a start/stop button for the plugin.
Is this possible/already implemented?

Thanks for all your hard work!

royshilkrot · Jun 8, 2024

Grumpus said:
This is great!
I would only want to use this for certain scenes where I will be translating from English to Japanese, so I would like to unload everything when not on specific scenes to free up resources, maybe a start/stop button for the plugin.
Is this possible/already implemented?

Thanks for all your hard work!

Just hit the visibility button (eye icon) on the filter and it will deactivate.

adamesek · Jun 8, 2024

royshilkrot said:
It's hard to understand what is going on. I'm happy to look at some recordings on the Discord https://discord.gg/Mu8WGw2Nka to get a better understanding.

English works quite well, although there is a delay. Polish is the problem... I don't know if this comes from the Whisper model.
Where can I get a ready-made model for Polish? Or how can I create one (step by step)?

royshilkrot · Jun 8, 2024

adamesek said:
English works quite well, although there is a delay. Polish is the problem... I don't know if this comes from the Whisper model.
Where can I get a ready-made model for Polish? Or how can I create one (step by step)?

The best performance will be from Whisper Large v3,
but you need a powerful GPU to run that in real time.
There aren't many Whisper models for different languages, but some can be found on huggingface.co.
You could search for "whisper gguf" and then look for a Polish fine-tune. I doubt there is one, though.
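
For anyone who wants to script that search, here is a minimal sketch that only builds the search link for the Hugging Face models page; it does not contact the site. The helper name `hf_search_url` is made up for illustration, and the search terms are just the ones suggested above.

```python
from urllib.parse import urlencode

# Public model listing page on Hugging Face; the "search" query
# parameter filters the listing by free-text terms.
HF_MODELS_PAGE = "https://huggingface.co/models"


def hf_search_url(*terms: str) -> str:
    """Build a model-search URL from free-text terms (hypothetical helper)."""
    query = " ".join(t.strip() for t in terms if t.strip())
    return f"{HF_MODELS_PAGE}?{urlencode({'search': query})}"


if __name__ == "__main__":
    # Terms taken from the advice above; open the printed links in a browser.
    print(hf_search_url("whisper", "gguf"))
    print(hf_search_url("whisper", "polish"))
```

Whether a usable Polish fine-tune shows up in the results is a separate question; the links only save some typing.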

Grumpus · Jun 8, 2024

royshilkrot said:
Just hit the visibility button (eye icon) on the filter and it will deactivate

I tried this, but the VRAM usage was still there, and so were the GPU load from translation and the reduced memory clocks that come with using CUDA.

I'll have another look and let you know, I think I may have tried everything I could do to no avail though.

Grumpus · Jun 9, 2024

Just an update, as soon as OBS loads it will load the models etc. into memory even if the visibility is off for everything.

Grumpus · Jun 10, 2024

Dunno how to edit a post if I can at all, just wanted to note that GPU load from translation is NOT there when on other scenes, but memory usage/memory clock reductions are still there.

royshilkrot · Jun 12, 2024

royshilkrot updated LocalVocal: Local Live Captions & Translation On-the-Go with a new update entry:

v0.3.1 - more models, fix timestamps

In this release:

Adding more whisper model options

Only allowing English selection for English models

Freeing up resources on filter disable

Fixing a bug to make accurate timestamps

Download:

obs-localvocal-0.3.1-macos-arm64.pkg 85.5 MB...

Grumpus · Jun 12, 2024

royshilkrot said:
royshilkrot updated LocalVocal: Local Live Captions & Translation On-the-Go with a new update entry:

v0.3.1 - more models, fix timestamps

Thanks for the fix king, works like a charm!

azamet · Jun 14, 2024

Which of the installers should AMD GPU users install?

royshilkrot · Jun 14, 2024

azamet said:
Which of the installers should AMD GPU users install?

The clblast one.
It uses OpenCL.

royshilkrot · Jun 14, 2024

We've reached 300 stars on GitHub!

momomelo · Jun 20, 2024

Hi Roy!

This plugin made me finally make an OBS forum account after 10 years of not doing so.
Thank you for your incredible work in this, like genuinely, this is a great accessibility tool that allows communication with audience beyond the native stream language, it's wild to think that now is the era we can do this, compared to even three or four years ago! Much love <3

I have a question, and I'm not sure if this is the appropriate place to ask, or if the answer already exists and I've just missed it.
Is there a way to know what all the labels in the model tweaking options mean, or a wiki with additional context, so I know what specifically I would be changing by clicking a checkbox or moving a slider?

E.g., under Whisper Model Parameters, I'm unsure what "Speed Up" does.

My goal is to run the model and have the text appear on-screen as close to real time as possible, while using a buffered output. I am aware that hardware limitations and model size have a big impact on this, but I'd like to understand better what I can tweak, such as lowering the context, to get as close as I can.

royshilkrot · Jun 21, 2024

momomelo said:
Hi Roy!

This plugin made me finally make an OBS forum account after 10 years of not doing so.
Thank you for your incredible work in this, like genuinely, this is a great accessibility tool that allows communication with audience beyond the native stream language, it's wild to think that now is the era we can do this, compared to even three or four years ago! Much love <3

I have a question, and I'm not sure if this is the appropriate place to ask, or if the answer already exists and I've just missed it.
Is there a way to know what all the labels in the model tweaking options mean, or a wiki with additional context, so I know what specifically I would be changing by clicking a checkbox or moving a slider?

E.g., under Whisper Model Parameters, I'm unsure what "Speed Up" does.

My goal is to run the model and have the text appear on-screen as close to real time as possible, while using a buffered output. I am aware that hardware limitations and model size have a big impact on this, but I'd like to understand better what I can tweak, such as lowering the context, to get as close as I can.

Thanks for the kind words!

Frankly, the settings are not going to affect the speed nearly as much as the choice of model and your hardware, which matter by an order of magnitude.
If you have an Nvidia GPU, get the CUDA version of the plugin. If AMD, the clblast version.
Hardware acceleration will boost speed by at least 5-10x over CPU.
If you want truly accurate real-time captions, invest in a powerful GPU.
You can also stick to smaller models like Base and Small.
Get a mic with good audio quality. Add noise suppression.

For presentation of captions, you can add delay to your audio or video to sync up... or clone the audio.
There are some tricks..
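
The build advice above boils down to a small lookup. Here is a sketch of it; the function name and the returned labels are illustrative, not actual installer file names, and the "cpu" fallback for other hardware is an assumption that the post does not spell out.

```python
def pick_localvocal_build(gpu_vendor: str) -> str:
    """Map a GPU vendor string to the build variant suggested in this thread.

    Returned labels are illustrative only, not real installer names.
    """
    vendor = gpu_vendor.strip().lower()
    if "nvidia" in vendor:
        return "cuda"      # Nvidia GPU: CUDA version of the plugin
    if "amd" in vendor or "radeon" in vendor:
        return "clblast"   # AMD GPU: clblast version (OpenCL)
    return "cpu"           # assumption: no accelerated build applies


if __name__ == "__main__":
    for name in ("NVIDIA GeForce RTX 3060", "AMD Radeon RX 6600", "Intel UHD"):
        print(name, "->", pick_localvocal_build(name))
```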

royshilkrot · Jul 2, 2024

royshilkrot updated LocalVocal: Local Live Captions & Translation On-the-Go with a new update entry:

v0.3.2 Improvements all around! Caption presentation, logs and bugfixes

Lots of things going on in this busy release!

Adding filter & replace option

Improving buffered output a lot making it much nicer (like word wrapping, word-by-word mode)

Improving the translation-sentence cadence

Fixing logs and potential crash-causing bugs

Download

obs-localvocal-0.3.2-macos-arm64.pkg 86 MB...

royshilkrot · Jul 19, 2024

royshilkrot updated LocalVocal: Local Live Captions & Translation On-the-Go with a new update entry:

v0.3.3 Partial real-time Transcripts! new OBS, many bugfixes

In this release:

New simplified and streamlined filter UI, and properties refactoring

File output fixes

Model loading fixes

Language selection bug fix

Partial transcription option

More bugfixes

Download:

obs-localvocal-0.3.3-macos-arm64.pkg...

luisamanetti · Jul 27, 2024

Hi, thanks for this fabulous plugin, it is very useful and I would like to use it. I use the large v3 model for Italian input, which it understands very well, but unfortunately it translates badly. I have tried all the models. I want to translate from Italian to English, French, German and Spanish, but the program invents words or doesn't understand them. Could you tell me how to download a model suitable for these translations? I see that you can also load an external model, but I don't know how to do it. Thanks.

royshilkrot · Jul 29, 2024

luisamanetti said:
Hi, thanks for this fabulous plugin, it is very useful and I would like to use it. I use the large v3 model for Italian input, which it understands very well, but unfortunately it translates badly. I have tried all the models. I want to translate from Italian to English, French, German and Spanish, but the program invents words or doesn't understand them. Could you tell me how to download a model suitable for these translations? I see that you can also load an external model, but I don't know how to do it. Thanks.

Thanks for using the plugin!
Sounds like you're looking for a dedicated Italian-English model.
The plugin accepts CTranslate2 models, so you can look on Hugging Face for a "ct2 it en" model, download it and try it out.
It's not a simple thing, but you can figure it out with some trial and error.
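
Before pointing the plugin at a downloaded folder, a quick sanity check can save a trial or two. The sketch below relies on one fact about CTranslate2: its converters write the weights to a file named `model.bin`. The function name and the example folder path are hypothetical, and the plugin may need more files (vocabulary or tokenizer files, for example) than this check looks for.

```python
import tempfile
from pathlib import Path


def looks_like_ct2_model(model_dir) -> bool:
    """Return True if the folder holds a CTranslate2 weights file (model.bin).

    This only checks for the weights file; it says nothing about whether
    the vocabulary or tokenizer files the plugin may also need are present.
    """
    path = Path(model_dir)
    return path.is_dir() and (path / "model.bin").is_file()


if __name__ == "__main__":
    # Demo on a throwaway folder; replace with your real download folder.
    with tempfile.TemporaryDirectory() as tmp:
        print(looks_like_ct2_model(tmp))           # empty folder
        (Path(tmp) / "model.bin").write_bytes(b"")
        print(looks_like_ct2_model(tmp))           # folder with model.bin
```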
