I have a hybrid meeting rig with a complex audio path. The overall system has lots of sources going to lots of destinations, with each combination getting its own variation on the processing. And it's all automated with Advanced Scene Switcher. I make it work by not using OBS at all for the audio.
I use a DAW instead. Digital Audio Workstation. Essentially a complete sound studio all in one app. For this purpose, I'm only interested in the mixer part. All of the sources go directly to the DAW, not OBS, and all of the destinations are fed directly from the DAW, not OBS. If OBS is involved at all, it's *only* because it has an exclusive connection to something I want to use (playing a video, or recording the meeting), and I set that to pass through with no processing whatsoever. If that connection needs some special processing of its own, I still do it in the DAW, not OBS. So the DAW does *all* of the audio work, not OBS.
To make the automation work, my DAW accepts Open Sound Control (OSC) messages (and seemingly everything else there ever was as well), and I set up Adv. SS to send those messages. Those messages can't fade anything; they can only switch something on or off (plus a ton of other "set this now" things that I don't use). So I use them to turn on and off a control signal that is generated in the DAW and stays in the DAW. That control signal then goes to a collection of side-chained gates on various audio signals, and I use the gates' timing controls to create the fades.
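For anyone curious what one of those on/off messages looks like on the wire, here's a minimal sketch in Python that builds an OSC 1.0 message with a single integer argument and sends it over UDP. In my rig Adv. SS does the sending, so this is only an illustration. The address `/ctl/duck`, the host, and the port are all made up for the example; the real address space depends entirely on your DAW.

```python
import socket
import struct


def osc_pad(data: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC 1.0 strings require."""
    data += b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address: str, value: int) -> bytes:
    """Build an OSC message carrying one 32-bit integer argument."""
    return (
        osc_pad(address.encode("ascii"))  # address pattern, e.g. "/ctl/duck" (hypothetical)
        + osc_pad(b",i")                  # type tag string: one int32 argument
        + struct.pack(">i", value)        # the argument itself, big-endian
    )


def send_osc(host: str, port: int, address: str, value: int) -> None:
    """Send a single OSC message as one UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(osc_message(address, value), (host, port))
```

Something like `send_osc("127.0.0.1", 8000, "/ctl/duck", 1)` would turn the control signal on and the same call with `0` would turn it off, assuming the DAW is listening on that (hypothetical) port and has that address mapped to a mute or enable switch. Notice there's no "fade" anywhere in the message: it's just "set this now".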
---
For what you're doing, I think you want a single signal that says you want to duck by a fixed amount, and that signal is either active or not. No additional strength if multiple things say it should duck. Correct?
To do that, I'd move all of your audio path into a for-real DAW, and then use that concept of a control signal. Start with a full-scale sine generator at 20 kHz (it'll max out the meter, but that's fine; you'll never route it somewhere that you can hear), followed by a series of massively-effective duckers (side-chained compressors) so that any one of them can completely kill the tone. Maybe even follow that string with a non-side-chained gate just to be sure. Each of those duckers is side-chained to a different thing that you want to duck under. Set all of those to produce a good control signal, which is not necessarily "audio transparent" at all. My control signals are hard on/off, when I ultimately want (and get) a fade.
Now that you have a control signal that represents what you want the main signal to do, just side-chain it to a single gate on the main signal, and set the attenuation and timing controls on that to have the audible effect that you want.
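If it helps to see the idea as numbers, here's a rough Python sketch of that chain. It's a simulation of the concept, not how any DAW actually implements it. The function names, the `floor_db` amount, and the per-step `duck_rate` / `recover_rate` coefficients are all invented for the example; they stand in for the gate's attenuation and timing controls.

```python
def control_signal(triggers):
    """Model the tone after the string of duckers.

    triggers: one list of booleans per source you want to duck under,
    one entry per time step. The tone survives (1.0) only while no
    source is active; any single active source kills it (0.0).
    """
    steps = len(triggers[0])
    return [0.0 if any(src[t] for src in triggers) else 1.0 for t in range(steps)]


def gate_gain(control, floor_db=-12.0, duck_rate=0.5, recover_rate=0.1):
    """Model the single side-chained gate on the main signal.

    While the control tone is present the gate sits open (gain 1.0).
    While it is absent the gain heads toward a fixed floor. The two
    rates are simple per-step smoothing coefficients (assumed values)
    that turn the hard on/off control into a fade.
    """
    floor = 10 ** (floor_db / 20)  # convert dB of attenuation to linear gain
    gain = 1.0
    gains = []
    for c in control:
        target = 1.0 if c > 0.5 else floor
        rate = duck_rate if target < gain else recover_rate
        gain += rate * (target - gain)  # move part of the way toward the target
        gains.append(gain)
    return gains


# Two sources overlapping: the control signal is the same whether one or both are active.
ctl = control_signal([[False, True, True, False], [False, False, True, False]])
main_gain = gate_gain(ctl)
```

The point is that the control signal is hard on/off and doesn't get any "stronger" when two sources overlap, while the gain applied to the main signal slides toward a fixed floor and back. Multiply the main audio samples by `main_gain` and you have the duck.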
---
It's kind of an analog mentality here: there's not actually a difference between audio signals and control signals. In fact, neither one actually exists in the form that we normally think of.
In the analog world, it's almost always a voltage or current at each point in a circuit. Period. Signals don't exist, it's just voltages and currents.
(Sometimes it's light, electrostatics, or magnetism, but those are almost always used to "jump a gap" that a direct connection can't bridge: an LED shining on a light-sensitive resistor, for example, as the gain element of a compressor.)
In the digital world, it's all numbers. Period. Signals still don't mean anything to the machine, so you can mix and mash them together much like you can in analog.
Either way, digital or analog, you can get some weird results if you don't understand what different things provide and expect. If you do understand, then you can combine them in creative ways to make something work. If you don't, then you'll beat your head wondering why you can't plug your consumer phone into a professional mixer and have it sound right. (the solution to that one is cheap, but not obvious to a novice)