LocalVocal: Seamless Live Transcriptions On-the-Go v0.0.7

Supported Bit Versions: 32-bit, 64-bit
Source Code URL: https://github.com/occ-ai/obs-localvocal
Minimum OBS Studio Version: 29.0.0
Supported Platforms: Windows, Mac OS X, Linux

The LocalVocal live-streaming AI assistant plugin transcribes speech to text locally on your machine and lets you perform various language processing functions on that text using AI / LLMs (Large Language Models). ✅ No GPU required, ✅ no cloud costs, ✅ no network and ✅ minimal lag! Privacy first: all data stays on your machine.

Interested in AI? Copilot for OBS? Check out my AI for OBS tutorials and the URL/API Source plugin! Be a 10x streamer / creator with free, open-source AI tools.

If this plugin has been valuable to you, consider adding a ⭐ to the GH repo, rating it here on OBS, subscribing to my YouTube channel, and supporting my work (https://github.com/sponsors/royshil). Check out the Home for Open Source Content Creator AI: https://github.com/occ-ai


AI on OBS with LocalVocal:

The plugin adds an Audio Filter. Apply it to a speech source (mic, video) to get a transcription, then send the captions to a Text Source to show them in your scene.

Current Features:
  • Transcribe audio to text in real time in 100 languages
  • Display captions on screen using text sources
  • Translate captions in real time to any language - see https://youtu.be/Q34LQsx-nlg
Roadmap Features (coming soon):
  • Remove unwanted words from the transcription
  • Summarize the text and show "highlights" on screen
  • Detect key moments in the stream and allow triggering events (like replay)
  • Detect emotions/sentiment and allow triggering events (like changing the scene or colors etc.)
Internally, the plugin runs a neural network (OpenAI Whisper) locally to transcribe speech in real time and provide captions.

It uses the Whisper.cpp project by ggerganov to run the Whisper network efficiently on CPUs and GPUs.
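
The flow described above (audio in, local inference, caption out) can be pictured roughly as follows. This is an illustrative Python sketch under stated assumptions, not the plugin's actual implementation: `ChunkedTranscriber`, `fake_transcribe`, and the chunk length are hypothetical names and values chosen for the example, and the Whisper call is replaced by a stand-in function. The only fixed fact used is that Whisper models take 16 kHz audio.

```python
from typing import Callable, List

SAMPLE_RATE = 16_000  # Whisper models take 16 kHz mono audio


class ChunkedTranscriber:
    """Collects audio samples and transcribes them one fixed-size chunk at a time."""

    def __init__(self, transcribe: Callable[[List[float]], str],
                 on_caption: Callable[[str], None],
                 chunk_seconds: float = 3.0) -> None:
        self._transcribe = transcribe      # hypothetical: wraps the speech model
        self._on_caption = on_caption      # hypothetical: updates a text source
        self._chunk_size = int(chunk_seconds * SAMPLE_RATE)
        self._buffer: List[float] = []

    def push(self, samples: List[float]) -> None:
        """Append new samples; emit a caption for every full chunk."""
        self._buffer.extend(samples)
        while len(self._buffer) >= self._chunk_size:
            chunk = self._buffer[:self._chunk_size]
            del self._buffer[:self._chunk_size]
            text = self._transcribe(chunk).strip()
            if text:
                self._on_caption(text)


def fake_transcribe(chunk: List[float]) -> str:
    # Stand-in for a real Whisper call; reports only the chunk length.
    return f"{len(chunk) / SAMPLE_RATE:.1f}s of audio"


captions: List[str] = []
pipeline = ChunkedTranscriber(fake_transcribe, captions.append, chunk_seconds=1.0)
pipeline.push([0.0] * 24_000)  # 1.5 s: one full chunk, half a chunk left over
pipeline.push([0.0] * 8_000)   # completes the second chunk
```

Shorter chunks reduce caption lag but give the model less context per call, so the chunk length is a trade-off rather than a fixed value.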

If you use this plugin - let us know! We would love to feature your work/vids and showcase your success.

Check out our other plugins:
  • Background Removal: removes the background from a webcam feed without a green screen.
  • CleanStream (experimental): removes filler words (uh, um) and profanity from a live audio stream in real time.
  • URL/API Source: fetches live data from an API and displays it in OBS.
If you like this work, which is given to you completely free of charge, please consider supporting it on GitHub: https://github.com/sponsors/royshil
Author: royshilkrot
Downloads: 3,377
Views: 15,040
First release
Last update
Rating: 5.00 star(s), 1 rating

Latest updates

  1. v0.0.7 - ~25% performance increase! and fixing crash bug

    This release bumps Whisper.cpp to a version that brings ~25% gains in performance (!), and also...
  2. v0.0.6 - Fix ALL unicode languages! + Min sub duration

    In this release: Minimum subtitle display duration, removing the vexing [skip] message, fixing...
  3. v0.0.5 - fixing UTF8 encoding on Windows for several languages

    In this release I'm fixing an important bug with UTF8 characters not being handled properly on...
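
One plausible reading of the "minimum subtitle display duration" mentioned in v0.0.6 is that a caption stays on screen for at least a set time before a newer one replaces it. The sketch below illustrates that idea only. It is an assumption about the behavior, not the plugin's code, and `CaptionDisplay`, `submit`, and `tick` are hypothetical names.

```python
from typing import Optional


class CaptionDisplay:
    """Keeps each caption visible for at least `min_duration` seconds."""

    def __init__(self, min_duration: float = 2.0) -> None:
        self.min_duration = min_duration
        self.current: str = ""
        self._shown_at: Optional[float] = None
        self._pending: Optional[str] = None

    def submit(self, text: str, now: float) -> None:
        """Offer a new caption; it replaces the current one only when allowed."""
        if self._shown_at is None or now - self._shown_at >= self.min_duration:
            self.current, self._shown_at, self._pending = text, now, None
        else:
            self._pending = text  # hold the newest caption until the timer expires

    def tick(self, now: float) -> None:
        """Call periodically; promotes a held caption once the minimum has elapsed."""
        if self._pending is not None and now - self._shown_at >= self.min_duration:
            self.current, self._shown_at, self._pending = self._pending, now, None
```

Timestamps are passed in explicitly (for example from `time.monotonic()`), which keeps the logic easy to test without waiting in real time.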

Latest reviews

This will be huge once it gets a bunch of optimizations, whether on the plugin's or Whisper's side.

You can use it for standard subtitle-related stuff, but since it can output to text files, unlike other similar plugins, it can also be used with e.g. Advanced Scene Switcher as something that feeds it voice commands.
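
The voice-command idea in this review can be sketched as a small matcher that reads the newest line of a caption text file and looks for known phrases. This is a rough illustration, not a description of how Advanced Scene Switcher or the plugin work: the file layout (one caption per line), the `COMMANDS` table, and the action strings are all assumptions made for the example.

```python
import re
from pathlib import Path
from typing import Dict, Optional


def normalize(text: str) -> str:
    """Lowercase and strip punctuation so 'Start Recording!' matches 'start recording'."""
    return re.sub(r"[^\w\s]", "", text.lower()).strip()


def match_command(transcript: str, commands: Dict[str, str]) -> Optional[str]:
    """Return the action for the first command phrase found in the transcript."""
    spoken = normalize(transcript)
    for phrase, action in commands.items():
        if normalize(phrase) in spoken:
            return action
    return None


def read_latest_caption(path: Path) -> str:
    """Read the last non-empty line of the caption file, or '' if there is none."""
    if not path.exists():
        return ""
    lines = [ln for ln in path.read_text(encoding="utf-8").splitlines() if ln.strip()]
    return lines[-1] if lines else ""


# Hypothetical phrase-to-action table; the action names are placeholders.
COMMANDS = {
    "switch to gameplay": "scene:gameplay",
    "start the replay": "trigger:replay",
}
```

For example, `match_command("OK, switch to gameplay now.", COMMANDS)` returns `"scene:gameplay"`, while a transcript with no known phrase returns `None`. Substring matching is deliberately forgiving, since smaller models may add or drop words around the command.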

For now, on mediocre modern CPUs, it works well with the tiny model (except that it has trouble recognizing certain words and phrases) and with the base model (better at recognizing, but CPU usage and response time suffer a bit more). For me personally it works best with the CUDA version, so if your GPU has more headroom than your CPU or is simply more capable, I recommend compiling for that. Bigger models are not very usable.