qhobbes is referring to a method for you (or someone else) to trigger the OBS Studio scene switch manually.
And be aware, I'm not a musician, though I do get to act as 'roadie' for my partner...
Having a computer watch/listen to a performance and then act on it is doable, but it would effectively require some advanced programming to pull off. Even if you input the sheet music, a computer has no way to tell a solo from other activity, let alone which camera shot to use (and/or how to manipulate a PTZ camera). Now, when you think of autonomous vehicles, clearly there is machine learning/'artificial intelligence' that can accomplish the equivalent. However, you should be aware of the massive teams behind that (hundreds/thousands of talented programmers and data analysts, outrageous sums of money, etc.) [getting 'close' to a solution is relatively easy with today's tech... getting reliable... that is really damn hard... e.g. Tesla FSD].
Another, simpler option would be video matching, but that also quickly gets complicated. I've heard of one user using very simple video recognition (a simple color) in a slide to trigger scene changes with the Advanced Scene Switcher plugin [i.e., put a very small color box in a portion of the slide that is cropped out of the capture; the plugin can still see the entire source and act on the content in it, and color matching is computationally easier than trying to read text in a video].
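To make the color-box idea concrete, here is a rough sketch of the matching logic only. This is not how Advanced Scene Switcher is implemented, and the scene names, marker colors, and tolerance are all made up for illustration. It assumes something else hands you the pixels of the small marker region as (R, G, B) tuples.

```python
# Hypothetical marker colors and scene names, for illustration only.
MARKER_SCENES = {
    (255, 0, 0): "Solo Cam",
    (0, 255, 0): "Wide Shot",
    (0, 0, 255): "Drum Cam",
}
TOLERANCE = 40  # assumed max allowed difference per color channel


def average_color(pixels):
    """Average the (R, G, B) tuples from the marker region."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def match_scene(pixels, markers=MARKER_SCENES, tolerance=TOLERANCE):
    """Return the scene whose marker color is close to the region's average, else None."""
    if not pixels:
        return None
    avg = average_color(pixels)
    for color, scene in markers.items():
        if all(abs(avg[i] - color[i]) <= tolerance for i in range(3)):
            return scene
    return None
```

A slightly-off red box such as `match_scene([(250, 10, 5), (240, 0, 12)])` still resolves to the red marker's scene, while a grey region matches nothing. The tolerance is the part that needs tuning: too tight and compression artifacts break the match, too loose and unrelated content triggers a switch.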
So, one option would be for you to provide the music to some software and 'mark up' that music, so that when the computer picks up the audio, actions are taken based on your 'markings' [i.e., you indicate: when this sequence of music is played, do X]. But computers are really stupid. They will do EXACTLY what you tell them... meaning the music performed live would have to be EXACTLY (or very, very close to) what you 'marked up'.
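A bare-bones version of that 'mark up' idea might look like the sketch below. Everything in it is an assumption for illustration: the cue times, the scene names, and above all the idea that some other software can tell you the current position in the song in seconds (which is the hard part).

```python
from dataclasses import dataclass


@dataclass
class Cue:
    at_seconds: float  # where in the song the marking sits (assumed unit)
    action: str        # what to do there, e.g. a scene name


# Hypothetical markings for one song.
CUES = [
    Cue(0.0, "Wide Shot"),
    Cue(42.5, "Solo Cam"),
    Cue(71.0, "Wide Shot"),
]


def due_cues(cues, previous_pos, current_pos):
    """Return the actions whose marking lies in (previous_pos, current_pos]."""
    return [c.action for c in cues if previous_pos < c.at_seconds <= current_pos]
```

Calling `due_cues(CUES, 40, 43)` fires the solo cue because 42.5 falls in that window. It also shows the weakness: if the band stretches the intro by eight bars, the cue still fires at 42.5 seconds, whether or not the solo has started.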
Think of big-name live concert performances... why are there people working as sound engineers, lighting operators, etc.? Because automating that, and adjusting for live performance variances, is not really practical [it costs WAY more to automate than to pay staff to do it, for years... i.e. no ROI].
Now, I suspect there is some audio-listening software that is more flexible than I'm aware of, and that maybe really could tell where you were in a cover song, even with live performance variations, impromptu bits, etc. How accurate/reliable would that be? No idea. But once you found that, and presuming that software offered 'triggers' (i.e. perform a programmed action when a certain 'point' in a song is hit), then you could send that trigger to other software (including OBS Studio, probably using websockets) to execute whatever actions you programmed there. And if you had a PTZ camera, you could trigger presets or other actions on that as well.
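For the websocket part, the message a trigger would send to OBS Studio is small. The sketch below only builds the JSON text; it assumes the obs-websocket 5.x protocol (request opcode 6, request type `SetCurrentProgramScene`), so check it against the protocol docs for your OBS version. The function name and scene name are made up, and opening the connection and doing the Hello/Identify handshake are left out.

```python
import json
import uuid


def scene_switch_request(scene_name, request_id=None):
    """Build the JSON text asking OBS to switch the program scene.

    Assumes the obs-websocket 5.x message layout; connecting and
    identifying with the server are not shown here.
    """
    return json.dumps({
        "op": 6,  # 'Request' opcode in obs-websocket 5.x
        "d": {
            "requestType": "SetCurrentProgramScene",
            "requestId": request_id or str(uuid.uuid4()),
            "requestData": {"sceneName": scene_name},
        },
    })
```

Whatever does the detecting (audio software, a pedal, a color match) would call something like `scene_switch_request("Solo Cam")` and push the result over an already-identified websocket connection.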
So with all that... assuming no one else is around other than the band members performing... the question is whether there is one person who could be tasked with 'triggering' the changes, using a MIDI trigger, a foot pedal, or a similar remote control pad [e.g. a Stream Deck or mobile device]. There are other manual trigger options as well, but in the low-cost approach, the switch will be manual. I'd recommend a direct trigger from a device... but thinking of the video slide example above: if you had a fixed camera perspective, and someone on stage changed a light/color card to correspond to a desired scene, that could work, with all the challenges of color matching in a variable lighting situation [i.e., it gets tricky/potentially unreliable].
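As a rough idea of what a MIDI trigger boils down to, the sketch below maps a raw 3-byte MIDI note-on message to a scene name. The note numbers and scene names are made up, and it assumes the pedal or pad sends note-on messages; many foot controllers send Program Change or Control Change instead, so treat this as one possible mapping rather than a recipe.

```python
# Hypothetical mapping from MIDI note number to scene name.
NOTE_TO_SCENE = {60: "Wide Shot", 62: "Solo Cam", 64: "Drum Cam"}


def scene_for_midi(message, mapping=NOTE_TO_SCENE):
    """Return the scene for a raw 3-byte note-on message, else None."""
    if len(message) != 3:
        return None
    status, note, velocity = message
    # 0x9n is note-on on channel n; velocity 0 is conventionally a note-off.
    is_note_on = (status & 0xF0) == 0x90 and velocity > 0
    return mapping.get(note) if is_note_on else None
```

Pressing the pad mapped to note 62 would arrive as something like `bytes([0x90, 62, 100])` and resolve to that note's scene, while the matching release (note-off, or note-on with velocity 0) is ignored so the scene doesn't get triggered twice.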
The challenge will be... you are likely to start simple; then, assuming a band member can multi-task (perform and control the livestream software at the same time), you'll want to get more sophisticated ('cuz it will look better)... If that person is the guitarist, a foot control gets to be a challenge ;^) but a phone or small tablet-sized device on a mic stand could easily have 6-12 'trigger options' to select from [the number depends primarily on the person's eyesight and the needed size of the 'icons'].
Just some random thoughts... hope it triggers some ideas for you or others.
Hopefully others will chime in