For audio, if you need a different sync offset per scene, you can add a second audio source that points to the same audio device. From OBS's perspective, you're creating two "unique" sources, which lets you set a different audio delay on each. There is a side effect, though: the two scenes now have different audio delays, so when you switch between them you'll hear a bit of audio repeated or cut during the transition.
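To put a number on that side effect, here's a rough sketch of the arithmetic. It's not anything OBS exposes; `transition_glitch` is a made-up helper, and it assumes a hard cut between scenes with both delays given in milliseconds.

```python
def transition_glitch(from_delay_ms: int, to_delay_ms: int) -> tuple[str, int]:
    """Estimate what you hear when cutting between two scenes whose
    audio sources use different delays (hypothetical helper)."""
    diff = to_delay_ms - from_delay_ms
    if diff > 0:
        # The new scene is further behind, so it replays audio
        # the old scene already played.
        return ("repeat", diff)
    if diff < 0:
        # The new scene is less delayed, so a chunk of audio is skipped.
        return ("cut", -diff)
    return ("none", 0)


# Example: scene A has no audio delay, scene B has 150 ms.
print(transition_glitch(0, 150))   # going A -> B
print(transition_glitch(150, 0))   # going B -> A
```

So the bigger the gap between the two delays, the more noticeable the hiccup is on each switch.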
For video delays, you can apply an async delay to any video source you want... but only in the actual "delay" direction (you can't make it appear earlier than the rest of the scene).
Since you're talking about two separate video sources, just add a delay to whichever one needs it so that it matches the other, and use the same audio source for both. That way you avoid the audio repeat/cut problem when switching between scenes.
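If it helps to see the matching step worked out, here's a small sketch. The latency numbers are ones you'd have to measure yourself (e.g. by filming a clap), and `video_delays_to_match` is a hypothetical helper, not part of OBS. Because a delay can only be added, the slower source is the reference and the faster one gets held back.

```python
def video_delays_to_match(latency_a_ms: int, latency_b_ms: int) -> dict[str, int]:
    """Return the delay (ms) to add to each video source so both line up.

    Delays can only be positive, so the slower source gets 0 and the
    faster source is delayed by the difference (hypothetical helper).
    """
    slowest = max(latency_a_ms, latency_b_ms)
    return {
        "source_a": slowest - latency_a_ms,
        "source_b": slowest - latency_b_ms,
    }


# Example with assumed measurements: camera A lags 40 ms, camera B lags 120 ms.
print(video_delays_to_match(40, 120))
```

With those example numbers, camera A would get an 80 ms delay and camera B would be left alone, and the single shared audio source only needs to be synced once against that common timing.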