Improving OBS Video Latency: What technologies are used for chroma key?

BensTechLab

New Member
I've been playing around with OBS and getting pretty decent results, visually speaking. But there is some variable latency in the video processing time in OBS on my current hardware. To be fair, I'm using a Microsoft Surface Pro 5, which really only has a 2-core processor hyperthreaded to 4 threads, with onboard/mobile graphics.

I'm considering either upgrading to a newer laptop or building a dedicated streaming box. I'd like to know what technologies are used, or could be used, to reduce OBS video processing latency, including layered graphics/chroma key. Rather than just throwing a bunch of over-spec'd components (and money) at the build, I'd prefer to know what OBS can and will actually take advantage of: a higher core count CPU vs a better graphics card, whether it uses AVX2, NVIDIA vs Radeon, etc.

Testing on my Microsoft Surface Pro 5, the video processing latency is variable. I tried setting up VoiceMeeter Potato with a 125ms audio delay and it was pretty good, but the variable nature of the latency made it drift in and out of sync. Is the variability just because it really only has 2 processor cores, even though it's relatively new and does have AVX2?

Testing on my old 2013 MacBook Pro, it appears that OBS will use more cores (Activity Monitor shows 4 cores getting loaded up), but the laptop is old and does NOT have AVX2, for example. I also noticed that OBS only supports software-based H264 encoding on this old Mac (maybe due to the age of the hardware, or is it due to Mac vs PC?). Would a new MacBook Pro be much lower latency with AVX2, and perhaps more CPU cores and hardware encoding/decoding?

Any tips for choosing hardware for low latency video processing would be appreciated!
 

BensTechLab

New Member
Ok, I have started some benchmarking to see if we can't correlate some performance metrics with various hardware combinations!

My Microsoft Surface Pro 5 was bought in spring 2018 as “top of the line” for its time. It has a Kaby Lake Intel Core i7-7660U 2.5 to 4.0 GHz (2 cores/4 threads) processor, Iris Plus 640 GPU, 16 GB RAM.

My MacBook Pro was bought early 2013 as “top of the line” for its time. It has an Ivy Bridge Intel Core i7-3635QM 2.4 GHz (4 cores/8 threads) processor, NVIDIA GeForce GT 650M 1 GB GPU, 16 GB 1600 MHz DDR3 RAM.

I tested each device with 2 different HDMI capture devices, an AverMedia GC311 (USB2) and an Elgato Cam Link 4K (USB3). In every case I was capturing 1920x1080 HDMI from a Sony a6400 camera, and in all four test cases using a Steinberg UR12 USB audio interface.

  • 2013 MacBook Pro + AverMedia GC311 = 4-6 frames latency at 60fps = 67ms - 100ms
  • 2013 MacBook Pro + Elgato Cam Link 4K = 4-5 frames latency at 60fps = 67ms - 83ms
  • 2018 Microsoft Surface Pro 5 + AverMedia GC311 = 10 frames latency at 60fps = 167ms
  • 2018 Microsoft Surface Pro 5 + Elgato Cam Link 4K = 6-7 frames latency at 60fps = 100ms - 117ms
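
For anyone wanting to repeat the conversion, the millisecond figures come from multiplying the frame count by the frame period (1000 / 60 ≈ 16.7ms at 60fps). A minimal Python sketch of that arithmetic; the helper name `frames_to_ms` is just made up for illustration:

```python
def frames_to_ms(frames, fps=60):
    """Convert a latency measured in frames to milliseconds."""
    return frames * 1000.0 / fps


# Frame counts (low, high) taken from the four measurements above.
measurements = {
    "MacBook Pro + GC311": (4, 6),
    "MacBook Pro + Cam Link 4K": (4, 5),
    "Surface Pro 5 + GC311": (10, 10),
    "Surface Pro 5 + Cam Link 4K": (6, 7),
}

for setup, (low, high) in measurements.items():
    print(f"{setup}: {frames_to_ms(low):.0f}-{frames_to_ms(high):.0f} ms")
```

The same helper works for other frame rates by passing a different `fps`, e.g. `frames_to_ms(3, fps=30)`.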

The latency on the MacBook is almost acceptable without adjustment! It appears the sub-100ms range is not that annoying. The Microsoft Surface latency is high enough that I was getting comments that people couldn't watch the video feed while listening to me because of how far out of sync it was (using virtual cam for virtual meetings). So 167ms is enough that some people couldn't stand to watch it.
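
A rough way to see why a fixed audio delay struggles with variable latency: a single delay value can only match one video latency, so whatever spread remains shows up as sync error. The sketch below assumes the measured frame ranges are the full spread of video latency and that audio latency is otherwise negligible. Both are simplifying assumptions, and the function names are hypothetical.

```python
def worst_case_offset(delay_ms, low_ms, high_ms):
    """Largest audio/video offset when video latency varies between
    low_ms and high_ms and audio is delayed by a fixed delay_ms."""
    return max(abs(low_ms - delay_ms), abs(high_ms - delay_ms))


def best_fixed_delay(low_ms, high_ms):
    """The midpoint of the range minimizes the worst-case offset."""
    return (low_ms + high_ms) / 2.0


# Assumed spread: Surface Pro 5 + Cam Link 4K range measured above (6-7 frames).
low, high = 100.0, 117.0

delay = best_fixed_delay(low, high)
print(f"midpoint delay: {delay} ms, "
      f"worst-case offset: {worst_case_offset(delay, low, high)} ms")
print(f"125 ms delay, worst-case offset: "
      f"{worst_case_offset(125.0, low, high)} ms")
```

Under these assumptions, a delay at the midpoint of the measured range leaves a smaller worst-case offset than a delay set outside the range, but no fixed value removes the variation itself.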

So now the question is: what hardware components made the difference here? CPU core count vs better GPU? The MacBook Pro is old enough that it does NOT have the AVX2 instruction set, which is used by apps like Microsoft Teams for virtual backgrounds. I imagine therefore that a brand new current-generation MacBook Pro could achieve even lower than 67ms latency with the Elgato Cam Link 4K.

Also, in all tests I noticed a reasonably significant load on the GPU while OBS was running, and then a further increase in GPU load when I hit record (while idling with OBS closed, both machines showed virtually 0 GPU usage). Therefore I suspect the GPU does play a significant role in OBS performance and latency.

Googling "NVIDIA GeForce GT 650M vs Iris Plus 640", the benchmarks don't show radical performance differences. So I'm not sure how much of a role this played in the delta between my two devices. However, the NVIDIA does have dedicated memory and dedicated PCIe lanes, whereas the onboard Iris graphics uses shared memory. The comparison I found also shows the NVIDIA having a much higher transistor count and TDP (45W vs 15W). Again, though, the benchmarks were not starkly contrasted.

Googling "Intel Core i7-3635QM vs Intel Core i7-7660U", the benchmarks note the obviously higher core count, but also that the older MacBook Pro processor has lower memory latency and more PCIe lanes (including that PCIe x16 slot for the GPU in the MacBook Pro). So it seems a higher core count with more/faster memory and PCIe bandwidth beat out a processor with newer instruction sets but a lower spec.

I'm still not sure what all to draw from this, except that I could certainly sell my 2018 Microsoft Surface now before I buy something new. :-p

I plan to test on a few more hardware combinations if I can borrow a few machines from friends for comparison. Hopefully we can build a bit of a benchmarking table of stats that help guide hardware selection for "streaming machine" builds!
 