Activating layers with my microphone

vxcheg

New Member
Good day, everyone.

I have several GIF animations (layers) that make up a streaming avatar. I animate it myself and have prepared GIF animations for a lot of words. I need my microphone to activate a specific layer (a GIF animation of the talking avatar) when I say something, and only while I speak. I think the idea is clear; I'm sure there is a solution (inside OBS), but I can't find it.

Right now I have a script that enables only one layer in its folder at a time, for convenient switching between animations with one button (I have it on the numeric keypad); a rough sketch of that kind of script is at the end of this post.

upd: In addition, I would like the plugin (or script) to detect my voice specifically, and not be triggered by background noise.
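For reference, the toggle script I mentioned is roughly like this (a simplified sketch; the source name and hotkey name below are placeholders, not my actual setup):

```python
# Sketch: toggle one source's visibility in the current scene with a hotkey.
# SOURCE_NAME and the "toggle_avatar_layer" hotkey id are placeholders.
import obspython as obs

SOURCE_NAME = "talking_avatar"  # the GIF layer to toggle
hotkey_id = obs.OBS_INVALID_HOTKEY_ID

def toggle_layer(pressed):
    if not pressed:  # fires on key-down and key-up; act on key-down only
        return
    scene_source = obs.obs_frontend_get_current_scene()
    scene = obs.obs_scene_from_source(scene_source)
    item = obs.obs_scene_find_source(scene, SOURCE_NAME)
    if item is not None:
        obs.obs_sceneitem_set_visible(item, not obs.obs_sceneitem_visible(item))
    obs.obs_source_release(scene_source)

def script_load(settings):
    global hotkey_id
    hotkey_id = obs.obs_hotkey_register_frontend(
        "toggle_avatar_layer", "Toggle avatar layer", toggle_layer)
    arr = obs.obs_data_get_array(settings, "toggle_avatar_layer")
    obs.obs_hotkey_load(hotkey_id, arr)
    obs.obs_data_array_release(arr)

def script_save(settings):
    arr = obs.obs_hotkey_save(hotkey_id)
    obs.obs_data_set_array(settings, "toggle_avatar_layer", arr)
    obs.obs_data_array_release(arr)
```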
 

Attachments

  • 2023-06-17 16-43-43 OBS 28.1.2 (64-bit, windows) - Профиль vx - Сцены wonderplay OBS.jpg (10.6 KB)

AaronD

Active Member
You want to analyze the audio signal and (correctly) choose one of those 6 layers to show based on the audio signal alone? The only thing I know of that is good at analyzing audio is speech recognition, and it's heavily optimized for speech-to-text; it can't do anything else. Probably a lot of research and hard-coding went into that one application (and more recently, AI training, which also can't be reused), and it would all have to be repeated for a different application.

If anyone else knows of a general-purpose version of that concept, I'd be interested too!

---

Originally, before seeing your list of layers, I thought you had animated a bunch of words and wanted your avatar to lip-sync based on that. That *might* be in the realm of possibility, if you could animate enough words to cover the vast majority of what you're likely to say, and then give it the output of a (good!) speech-recognition program. You'd have to delay everything, though, to match the time it takes for the speech recognition to figure it out. It'd probably be recognizable, but still rough, unless you could animate and detect phrases and sentences instead, which is even *more* animation work and more live delay.

Or, you could use the direct lip-sync and other tracking that some VR rigs have now, and render the entire (rigged) model in real-time with those inputs.
 

vxcheg

New Member
Thanks

The options you describe look complicated, and it seems I didn't explain myself clearly. Synchronization with words isn't needed; I'd like something much simpler:

When the microphone is active (above a lower noise-cutoff threshold), it automatically shows the layer I need (a looped animation of the talking mouth).
When the microphone is not active (for example, below -20 dB on the volume meter), that layer is hidden.
The layer should be shown only while I speak.
 

AaronD

Active Member
Ah! Okay. So you really only need two states, possibly with some hysteresis between them to avoid it flipping back and forth with a constant audio level. The Advanced Scene Switcher plugin can do that!
[Attached screenshots: Advanced Scene Switcher macro configuration]

Of course, you can tweak that however you need. I'd consider the "only on change" checkbox to be important; that keeps it from running constantly while the condition remains true.
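If it helps to see the idea in code, here's a toy sketch of the two-threshold hysteresis that keeps the gate from flickering (the dB values are just examples, not anything the plugin ships with):

```python
# Toy two-state gate with hysteresis: the "open" threshold sits above
# the "close" threshold, so a level hovering near either one cannot
# flip the state back and forth on every reading.
OPEN_DB = -20.0   # show the talking layer once the level rises above this
CLOSE_DB = -26.0  # hide it only after the level falls below this

class TalkingGate:
    def __init__(self):
        self.talking = False

    def update(self, level_db: float) -> bool:
        if self.talking:
            if level_db < CLOSE_DB:
                self.talking = False
        elif level_db > OPEN_DB:
            self.talking = True
        return self.talking  # True -> layer visible

gate = TalkingGate()
for db in (-40, -18, -22, -25, -30, -15):
    print(db, gate.update(db))  # -22 and -25 stay True: that's the hysteresis
```

With a single threshold at -20 dB, a voice hovering right around -20 would toggle the layer on and off constantly; the gap between the two thresholds is what prevents that.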
 

vxcheg

New Member
This is wonderful! I managed to set it up the way I wanted. Thank you very much for the prompt advice.
 