Another idea might be to expand the transition condition so that it also lets you specify the transition's target scene or source scene.
I like that version. It gets a little bit iffy when the UI is different from what's actually happening under the hood, but if it accurately describes what a user sees, then that's still the way to go.
(Just be careful about building more functionality on top of that mismatch, because it *will* catch up with you at some point! A little bit is probably okay.)