Is it as simple as remapping the input from VGA to 9-pin and then remapping the output from 9-pin back to VGA, or do I need specialized circuits to do that?
VGA is 5 signals: Red, Green, Blue, Hsync, Vsync. Some gear separates them into 5 BNCs, most commonly projectors and professional switchers. The 15-pin D-sub connector has a dedicated ground for each color and a shared ground for the syncs, for a total of 9 pins for the video signal itself.
The remaining pins are used for two different formats of digital communication that identify the type and capabilities of the display device. Some graphics cards refuse to drive a port that doesn't return that ID, which caused some problems with early projectors and early computers; others can be made to blindly throw a signal out there anyway.
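For reference, here's the DE-15 pinout as I remember it, laid out as a little Python table (double-check a real pinout chart before trusting it for wiring). The ID/DDC pins are the ones that changed meaning between the two generations of display identification:

```python
# VGA DE-15 pinout, from memory -- verify against a real datasheet before
# wiring anything. Pins 4/11/12/15 started as monitor-ID bits and were
# later reused for DDC (an I2C bus carrying EDID identification data).
VGA_PINS = {
    1: "Red video",
    2: "Green video",
    3: "Blue video",
    4: "ID2 (later reserved)",
    5: "Ground",
    6: "Red ground",
    7: "Green ground",
    8: "Blue ground",
    9: "Key / +5V (DDC power on later cables)",
    10: "Sync ground",
    11: "ID0",
    12: "ID1 / DDC data (SDA)",
    13: "Horizontal sync",
    14: "Vertical sync",
    15: "DDC clock (SCL)",
}

# The 9 pins counted above as "the video signal itself":
VIDEO_PINS = [1, 2, 3, 6, 7, 8, 10, 13, 14]
print(len(VIDEO_PINS), "pins carry the video itself")
```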
I heard analog mixing is faster than digital mixing.
Technically yes......until you need to do something that requires a frame buffer, like synchronizing two inputs that can't be forced into sync themselves. Then you have the exact same problem that a digital one has, with the additional problem of analog storage being generally worse than digital.
Or are you thinking of *hardware* mixing on a dedicated device, vs. *software* mixing on a PC operating system, and not necessarily *analog* vs. *digital*? The FPGA that I mentioned earlier is entirely digital, and entirely hardware, despite that hardware being programmable. The program there is not a list of sequential instructions like it is for a "traditional" computer, but a set of connections between preexisting logic functions. The number of available functions is in the thousands or millions, so you really can do pretty much anything with it, with end-to-end latency measured in nanoseconds if you want......unless the problem itself dictates otherwise.
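If it helps, here's a toy sketch in Python (not real HDL, purely an illustration of the concept) of what "a set of connections between preexisting logic functions" means: the building blocks are fixed truth tables, and the "program" is only the wiring between them.

```python
# Toy model of FPGA-style "programming". The blocks are fixed lookup
# tables (LUTs); the design is just how their inputs and outputs connect.
# Purely conceptual -- a real FPGA is configured with an HDL, not Python.

# A 2-input LUT is nothing more than a 4-entry truth table.
AND_LUT = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR_LUT = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def half_adder(a, b):
    """'Wire' two LUTs together: sum from XOR, carry from AND.
    Changing the design means changing the wiring, not writing a new
    sequence of instructions."""
    return XOR_LUT[(a, b)], AND_LUT[(a, b)]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", half_adder(a, b))
```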
Back to the sync example, there's never a complete picture on the wire. There's only ever a single pixel, if even that. (blanking intervals...) The sync signals determine where on the screen the presently encoded pixel is supposed to be. To get a complete frame, you have to wait for an entire sync cycle. To get a complete frame of multiple independent sources, you have to wait for two cycles (frames) worst-case. And while you're waiting, you need to store what you've got so far. If what you're doing requires that, then that's unavoidable latency no matter how you do it.
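As a rough worked example, assuming an ordinary 60 Hz mode (real modes vary a bit):

```python
# Rough latency arithmetic for frame-synchronizing independent sources,
# assuming a 60 Hz refresh. Real modes differ slightly (59.94 Hz, etc.).
refresh_hz = 60.0
frame_ms = 1000.0 / refresh_hz   # ~16.7 ms for one complete sync cycle

# Worst case for unrelated sources: wait out one misaligned frame,
# then capture a full one -- two cycles, as described above.
worst_case_ms = 2 * frame_ms

print(f"one frame:  {frame_ms:.1f} ms")
print(f"worst case: {worst_case_ms:.1f} ms")
```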
If what you're doing *doesn't* require a complete frame buffer, then you can make it faster. Less latency. But general-purpose tools often use a frame buffer anyway because it's much easier to do that than to switch modes.
A simple converter might be immediate, like VGA to Composite and vice-versa*, or it may have only a horizontal-line buffer instead of a frame buffer, just to scale the resolution with. A general-purpose mixer, though, will probably convert everything to digital internally, use a full frame buffer for every connection regardless of what you're doing with it, and then convert to whatever output formats it has.
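The line-buffer trick is conceptually something like this sketch (nearest-neighbor only, nothing like real converter firmware): store one scanline, then read it back out resampled to a different width.

```python
# Horizontal scaling with only a single-line buffer -- the kind of thing
# a simple scan converter might do. Nearest-neighbor resampling; a real
# device would do this in hardware, clocked out pixel by pixel.
def scale_line(line, out_width):
    """Resample one stored scanline to out_width pixels."""
    in_width = len(line)
    return [line[i * in_width // out_width] for i in range(out_width)]

scanline = [0, 64, 128, 192, 255]      # pretend these are pixel values
print(scale_line(scanline, 8))         # the same line, stretched to 8 pixels
```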
Bufferless pure-analog gear will probably have an explicit sync or clock signal (a genlock) that is intended to force all of the sources into sync, and of course everything it connects to is expected to accept that signal and use it accordingly. There may even be a dedicated device in the rig somewhere that generates the sync signals and does nothing else, and *everything* takes that as an input. Once you have that, though, you can crossfade between live cameras smoothly with zero latency, mess with colors, etc., because there's only a single set of sync signals throughout the entire system, and you just process the colors independently from that. You can't do anything with *adjacent* pixels though; only the *one* that you have in that exact microsecond.
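The crossfade itself is then just per-pixel arithmetic, something like this sketch, assuming both sources really are locked to the same sync so that the "current pixel" from each one refers to the same spot on screen:

```python
# Per-pixel crossfade between two genlocked sources. Because both are
# locked to the same sync, at any instant you have the *same* pixel
# position from each, so mixing is a weighted sum -- no buffering needed.
def crossfade(pixel_a, pixel_b, t):
    """t = 0.0 gives source A, t = 1.0 gives source B."""
    return tuple((1 - t) * a + t * b for a, b in zip(pixel_a, pixel_b))

print(crossfade((255, 0, 0), (0, 0, 255), 0.5))  # halfway between red and blue
```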
---
* Composite might be thought of as "analog compression" in a very loose sense. It has to be "uncompressed" into practically VGA internally before a TV can display it. And the Composite signal was designed originally to be easy for vacuum tube circuits to do that with, because that's all there was at the time. No memory or buffer whatsoever, except for the state of each oscillator that swept the one single dot across the screen while its brightness varied. Both syncs are there for those two local oscillators to lock onto, and the three colors are a clever backwards-compatible addition to the original grayscale that uses a weighted average of the three for brightness and what more-or-less amounts to another radio receiver to recover two difference signals. (Color broadcast TV was therefore several layers of radio, not just the one that you tuned to: decoding one gave you access to another, etc.) Overall brightness combined with the two differences produces red, green, and blue.
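In rough numbers (standard-definition luma weights, with the subcarrier modulation left out), the encode/decode looks something like this sketch:

```python
# Sketch of the luma + two-color-difference idea behind composite color,
# using the standard-definition weights. The real signal also modulates
# the two differences onto a color subcarrier, which is omitted here.
def rgb_to_yuv(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b   # weighted "brightness" average
    u = b - y                                # blue difference
    v = r - y                                # red difference
    return y, u, v

def yuv_to_rgb(y, u, v):
    r = y + v
    b = y + u
    g = (y - 0.299 * r - 0.114 * b) / 0.587  # recover green from the average
    return r, g, b

print(yuv_to_rgb(*rgb_to_yuv(200, 100, 50)))  # round-trips back to the input
```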
The blanking intervals are there to provide some space for the syncs to happen while the dot is off-screen (they would be visible too), and for the oscillators to do the return part of their cycles without being seen, to get ready for another visible pass. So if the composite signal did have something in the right part of the blanking interval, it might appear as a diagonal line across the screen as the dot position reset. VGA computer monitors often use the blanking interval to set a reference black level for the next line or the next frame, since nothing is ever exact in analog, and they're trying to produce a good enough picture that that matters.
I think it's fascinating to see the original invention to drive a dirt-cheap tube circuit from a single radio receiver to produce a persistence-of-vision picture, and then its progression to color. Then computers used that same signal; then the TV's internal color format was brought out to a connector for better quality; then computers came to drive that format far better than they did originally, with "TVs" that only had that connector and were designed to do justice to that much better signal. Then that format changed to digital in the form of DVI while keeping all of the same concepts, and then DVI became HDMI, which added audio and encryption in the new mode but still has a backwards-compatible mode so that a DVI-to-HDMI adapter can be a pin-for-pin wiring job with no electronics. And I've seen DVI ports on graphics cards run in HDMI mode through an adapter, so that yes, you can get audio from a DVI port...because it's running as HDMI.
So......like space rockets being sized for a Roman horse by a linked progression of standards, today's digital video is also related to the original tube TVs.