It has nothing to do with screens. Look at sources instead. Specifically:
Monitor feeds your headphones or speakers. Output feeds the Tracks selection here, and then you choose which track(s) to stream and/or record:
If you're in Simple Mode, I believe you're stuck with streaming Track 1 with no way to change that.
Now, if you have Desktop Audio, or an Audio Output Capture in a scene, that is set to the same device as the Monitor (either specifically as shown here, or both Default, or one of each if Default just happens to point to that device):
then that source will also pick up the Monitor. This comes from the operating system, not from OBS. The only point where OBS can tap a device's output is *after* the OS has mixed everything headed for that device, so the capture includes *everything*, including what OBS itself sent there. OBS is not copying it; it's just an artifact of how the OS itself processes audio.
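If it helps to see that as a toy model, here's a rough sketch in Python. None of this is real OBS or OS API: `OutputDevice`, `play`, `capture`, and `simulate` are all made-up names for illustration. The only assumption it encodes is the one above, that a device capture taps the mix *after* every app's audio has been combined.

```python
from dataclasses import dataclass, field


@dataclass
class OutputDevice:
    """Toy stand-in for an OS playback device (hypothetical, not a real API)."""
    name: str
    streams: dict = field(default_factory=dict)

    def play(self, app: str, label: str) -> None:
        # Any application, OBS included, can send audio to the device.
        self.streams.setdefault(app, []).append(label)

    def capture(self) -> list:
        # The capture tap sits after the mix, so it sees every stream
        # and cannot tell which application a sound came from.
        return sorted(label for labels in self.streams.values() for label in labels)


def simulate(monitor_device, capture_device, game_device) -> list:
    game_device.play("game", "game audio")
    monitor_device.play("obs", "OBS monitor (mic)")  # what OBS sends to Monitor
    return capture_device.capture()  # what Desktop Audio would pick up


# Case 1: Monitor and Desktop Audio point at the same device.
speakers = OutputDevice("Speakers")
same = simulate(speakers, speakers, speakers)

# Case 2: Monitor goes to a different device than the one being captured.
headphones = OutputDevice("Headphones")
speakers2 = OutputDevice("Speakers")
separate = simulate(headphones, speakers2, speakers2)

print(same)      # includes the OBS monitor feed
print(separate)  # game audio only
```

In the first case the captured mix contains the Monitor feed; in the second it doesn't, simply because OBS never sent anything to the captured device.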
Thus, if you want to, for example, capture a game's audio in isolation and ignore everything else that's playing, you need to be more deliberate and explicit about how you design your audio rig. You can't just slap some common settings on and expect it to "just work"; you have to work around the limitations of what you're using. And a big source of those limitations is the operating system, not just OBS.