LocalVocal: Local Live Captions & Translation On-the-Go

LocalVocal: Local Live Captions & Translation On-the-Go v0.2.1

Supported Bit Versions
  1. 64-bit
Source Code URL
Minimum OBS Studio Version
Supported Platforms
  1. Windows
  2. Mac OS X
  3. Linux
The LocalVocal live-streaming AI assistant plugin transcribes and translates speech audio into text, locally on your machine, and can perform various language processing functions on the text using AI / LLMs (Large Language Models). ✅ No GPU required*, ✅ no cloud costs, ✅ no network and ✅ minimal lag! Privacy first: all data stays on your machine. (* GPU acceleration via CUDA is supported!)

Check out my AI for OBS tutorials and URL/API source! Be a 10x streamer / creator with free, open-source AI tools. Join me on Discord! Check out the LexiSynth Transcription+Translation all-in-one tool.

If this plugin has been valuable to you, consider adding a ⭐ to the GH repo, rating it here on OBS, subscribing to my YouTube channel, and supporting my work: https://github.com/sponsors/royshil. Check out the Home for Open Source Content Creator AI: https://github.com/occ-ai

Do more with LocalVocal:

The plugin adds an Audio Filter - use it on a speech source (mic, video) to get a transcription. Send the captions to a Text Source to show on scene.

Current Features:
Roadmap Features: (coming soon)
  • Remove unwanted words from the transcription
  • Summarize the text and show "highlights" on screen
  • Detect key moments in the stream and allow triggering events (like replay)
  • Detect emotions/sentiment and allow triggering events (like changing the scene or colors etc.)
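As a rough illustration of the first roadmap item, removing unwanted words from a caption could look something like the sketch below. This is not the plugin's code; the function name `remove_words` and the example word list are hypothetical, and the sketch does not tidy up punctuation left behind by a removed word.

```python
import re
from typing import Iterable


def remove_words(caption: str, unwanted: Iterable[str]) -> str:
    """Drop whole-word, case-insensitive matches of `unwanted` from `caption`.

    Hypothetical helper for illustration only.
    """
    words = [re.escape(w) for w in unwanted if w]
    if not words:
        return caption
    # \b keeps the match to whole words, so "um" does not touch "umbrella".
    pattern = re.compile(r"\b(?:" + "|".join(words) + r")\b", re.IGNORECASE)
    cleaned = pattern.sub("", caption)
    # Collapse the double spaces left where a word was removed.
    return re.sub(r"\s{2,}", " ", cleaned).strip()


if __name__ == "__main__":
    print(remove_words("Um so this is uh a test", ["um", "uh"]))
```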
Internally, the plugin runs a neural network (OpenAI Whisper) locally to transcribe speech in real time and provide captions.

It uses the Whisper.cpp project from ggerganov to run the Whisper network very efficiently on CPUs and GPUs. For translation, it uses CTranslate2 and the M2M100 model.
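To give a feel for what real-time captioning involves, here is a minimal sketch of cutting an incoming audio stream into fixed-length buffers before each one is handed to a speech model. It is an illustration under assumptions, not the plugin's implementation: the names `split_into_buffers` and `buffer_ms` are hypothetical, and the only fact relied on is that Whisper models take 16 kHz mono audio.

```python
from typing import Iterator, List, Sequence

SAMPLE_RATE_HZ = 16_000  # Whisper models expect 16 kHz mono audio


def split_into_buffers(
    samples: Sequence[float], buffer_ms: int = 3000
) -> Iterator[List[float]]:
    """Yield consecutive buffers of at most `buffer_ms` milliseconds.

    Hypothetical helper: a real plugin would receive audio incrementally
    from the host application instead of slicing a complete sequence.
    """
    if buffer_ms <= 0:
        raise ValueError("buffer_ms must be positive")
    size = SAMPLE_RATE_HZ * buffer_ms // 1000  # samples per buffer
    if size == 0:
        raise ValueError("buffer_ms is too small for this sample rate")
    for start in range(0, len(samples), size):
        # The final buffer may be shorter than `size`.
        yield list(samples[start:start + size])


if __name__ == "__main__":
    audio = [0.0] * 40_000  # 2.5 seconds of silence
    for i, buf in enumerate(split_into_buffers(audio, buffer_ms=1000)):
        print(i, len(buf))
```

A shorter buffer lowers caption latency but gives the model less context per call, which is the trade-off a buffer size option exposes.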

If you use this plugin - let us know! We would love to feature your work/vids and showcase your success.

Check out our other plugins:
  • Background Removal removes the background from a webcam without a green screen.
  • Detect detects and tracks more than 80 types of objects in real time inside OBS.
  • URL/API Source fetches live data from an API and displays it in OBS.
If you like this work, which is given to you completely free of charge, please consider supporting it on GitHub: https://github.com/sponsors/royshil
First release
Last update
5.00 star(s) 3 ratings

Latest updates

  1. v0.2.1 - Translation! Built-in!

    This version adds built-in translation for live captioning. Translating to-and-from ~100...
  2. v0.2.0 - CUDA on Windows! Mac Apple ARM optimization!

    Introducing: CUDA for Windows! Now compiling Whisper.cpp vs. CUDA 11 and 12 for GPU accelerated...
  3. v0.1.1 - new Whisper, variable buffer, bugfix 7.1 audio

    What's Changed Update Whisper.cpp version by @royshil in #72 Variable buffer size options by...

Latest reviews

This is brilliant. The fully-local implementation of speech-to-text already works very well.

I can't wait to see what transpires as this matures.
Easy to install and set up. Exactly what I needed.
This will be huge once it gets a bunch of optimizations, whether on the plugin's or Whisper's side.

You can use it for standard subtitle-related stuff, but since it can output to text files, unlike other similar plugins, it can also be used with e.g. Advanced Scene Switcher as something that fuels it with voice commands.

For now, on mediocre modern CPUs, it works well with the tiny model (except that it has trouble recognizing certain words and phrases) and base (better at recognizing, but CPU and response time struggle a bit more). For me personally it works best with the CUDA version, so if your GPU is more free or better than your CPU, I recommend compiling for that. Bigger models are not too usable.
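
The voice-command idea mentioned in this review can be sketched roughly as follows: read the caption text that was written to a file and look for known phrases in it. This is only an illustration under assumptions; the phrases, action names, and the helpers `read_latest_caption` and `match_command` are hypothetical and are not part of LocalVocal or Advanced Scene Switcher.

```python
from pathlib import Path
from typing import Mapping, Optional

# Hypothetical phrase -> action table, for illustration only.
COMMANDS = {
    "start replay": "replay",
    "switch to camera": "scene:camera",
}


def read_latest_caption(path: Path) -> str:
    """Return the current contents of the caption file, or "" if missing."""
    try:
        return path.read_text(encoding="utf-8")
    except FileNotFoundError:
        return ""


def match_command(caption: str, commands: Mapping[str, str]) -> Optional[str]:
    """Return the action for the first phrase found in the caption, if any."""
    # Lowercase and collapse whitespace so "Start   Replay" still matches.
    normalized = " ".join(caption.lower().split())
    for phrase, action in commands.items():
        if phrase in normalized:
            return action
    return None


if __name__ == "__main__":
    caption = read_latest_caption(Path("captions.txt"))  # assumed file name
    print(match_command(caption, COMMANDS))
```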