Consider this a blog post of sorts, serving both as notes to myself and for people who wonder about this: I am currently working on integrating more compression options. Since NDI and Teleport already give us two lossy options, I prioritized lossless with Beam to fill that gap; however, there will also be a lossy JPEG option in the next release, and maybe others in future releases.
I am sure that for most people the important point is that the sender, which is usually the gaming PC, has the smallest possible performance impact. When the whole point is to offload the resource usage of stream/recording encoding to a secondary machine, then obviously the resource usage of sending the feed there should be as small as possible; otherwise it would eat up all the benefits of such a setup. This is why I always look at resource usage first when trying new compression methods.
Unlike what people seem to think, I am not really knowledgeable in this area. I only started looking into this whole topic when I began working on Beam 7 weeks or so ago, and since then I have just experimented a lot to see what works and what doesn't, like a kid with a new toy. My preliminary impression is that QOI already gets me quite close to the optimum for lossless image compression while keeping CPU usage low at the same time.
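For anyone wondering why QOI manages to stay so cheap on the CPU: it is a single-pass format built from a handful of byte-level operations, mainly runs of identical pixels, a small hash-indexed cache of recently seen colors, and small per-channel diffs. A much-simplified sketch of the idea in Python (not the actual QOI spec and not Beam's C# implementation; real QOI also has diff/luma ops, alpha handling, and a proper byte encoding) might look like this:

```python
def encode(pixels):
    """Simplified QOI-style encoder: emits (op, payload) tuples instead of bytes.

    pixels: list of (r, g, b) tuples.
    Only three ops are sketched here: RUN (repeat previous pixel),
    INDEX (hit in the 64-entry rolling color cache), and raw RGB.
    """
    index = [(0, 0, 0)] * 64  # hash-indexed cache of recently seen colors
    out = []
    prev = (0, 0, 0)
    run = 0
    for px in pixels:
        if px == prev:
            run += 1
            if run == 62:  # QOI caps run length at 62
                out.append(("RUN", run))
                run = 0
            continue
        if run:
            out.append(("RUN", run))
            run = 0
        r, g, b = px
        h = (r * 3 + g * 5 + b * 7) % 64  # QOI's color hash (RGB-only variant)
        if index[h] == px:
            out.append(("INDEX", h))  # 1 byte in the real format
        else:
            index[h] = px
            out.append(("RGB", px))   # 4 bytes in the real format
        prev = px
    if run:
        out.append(("RUN", run))
    return out
```

Every operation is a couple of comparisons and one small hash, with no entropy coding stage at all, which is why it can keep up with a live video feed where heavier formats cannot.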
I recently tried WebP, and while it peaked at 170 Mbps bandwidth usage compared to QOI's 270 Mbps in my test video loop, the CPU usage was so insanely high that I couldn't even measure it properly; it just couldn't keep up and dropped frames all the time. The use case for WebP would probably be "don't care about CPU usage, but need binary lossless and the lowest possible bandwidth", but I don't see where that would be sensible in practice. Therefore WebP will not make it into a release; feel free to try to change my mind by showing a scenario where this could actually make sense.
The next thing I tried was lossless JPEG, using the same libjpeg-turbo library that Teleport uses, but in the newer 3.0 beta version, which introduces support for lossless compression. The CPU usage is better than with WebP, albeit still too high to be useful, and the bandwidth usage is even worse than QOI's, peaking at a little over 300 Mbps in the test loop, so it loses against QOI in all relevant aspects.
The only reason it will probably still be in future releases is that I want to offer lossy JPEG anyway (for BGRA on Windows it's already implemented), and offering lossless JPEG in addition is literally just one more checkbox in the settings and a handful of extra code lines; it's more effort to remove it again than to just keep it as it is now. I might reconsider if people start using it the wrong way or with wrong expectations and keep annoying me about it.
Also, QOI is known to compress technical/artificial content like my gaming test loop especially well, so my comparison might be totally unfair; there might be content where lossless JPEG produces much better results.
Here's how the settings window currently looks:
JPEG is mutually exclusive with the other options: selecting it will force-uncheck the other options and vice versa, since it just doesn't make sense to combine them.
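The force-uncheck rule can be sketched like this (hypothetical function and option names; Beam's actual settings code is C#, this only illustrates the described behavior):

```python
def apply_selection(settings, option, checked):
    """Enforce that JPEG cannot be combined with the other compression options.

    settings: dict mapping option name -> checked state, e.g. {"jpeg": False, "qoi": True}.
    Returns a new settings dict with the mutual exclusion applied.
    """
    settings = dict(settings, **{option: checked})
    if checked:
        if option == "jpeg":
            # checking JPEG force-unchecks every other option
            for other in settings:
                if other != "jpeg":
                    settings[other] = False
        else:
            # checking any other option force-unchecks JPEG
            settings["jpeg"] = False
    return settings
```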
The "QOI compression level" logic has a breaking change in the next release. Currently, level 1 skips compression for 50% of frames (and you can't skip more than that). With the new logic, level 5 skips 50%, level 1 compresses only 10% of frames, and level 10 stays the same and compresses all of them. This gives an even wider range of control over the CPU vs. bandwidth usage trade-off; e.g., if you can almost do raw but just lack that tiny bit of extra bandwidth, level 1 might solve that now.
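One way to sketch the new level mapping (my own illustration of the described behavior, not Beam's actual code): the level directly determines the fraction of frames that get compressed, spread evenly across the stream instead of in bursts.

```python
def should_compress(frame_index: int, level: int) -> bool:
    """New logic: level N compresses N out of every 10 frames, evenly spread.

    Level 10 compresses every frame, level 5 every 2nd frame (50%),
    level 1 only 1 in 10 frames (10%); skipped frames are sent raw.
    """
    return (frame_index * level) % 10 < level
```

With the old logic the lowest level could still only skip 50% of frames; extending the range down to 10% compressed is what makes the "almost raw, just shave off a bit of bandwidth" scenario possible.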
This makes sense for QOI because, whether compressed or not, all of the frames arriving at the receiver will ultimately be exactly the same as the original frames from the sender, meaning there is no way to tell the difference between frames that were originally sent compressed and raw frames.
However, with all of this being the toy for the kid that I am, I did something funny and also left this logic in place for lossy JPEG. It's an entirely different story there, because it means that, e.g., at level 5 half of the frames will show visual degradation from the lossy compression while every 2nd frame retains the original quality. At 30 fps this means the stream alternates between full and lowered quality from frame to frame, 15 times per second. My assumption would be that there is a 95% chance that this doesn't make sense for configuring the CPU/bandwidth usage trade-off compared to just using the JPEG quality option, but I still wanted to play with it just for fun, to try and see whether it's actually visible to the human eye, especially combined with lower JPEG quality, and how the human eye interprets it. I also wonder what this means for further processing of the stream, e.g. for x264/x265 encoding. Do the encoders perform badly with this, because it creates a difference between two frames that otherwise wouldn't even have been that different, or does it maybe not matter too much, since these encoders also apply lossy logic, so the two frames end up very similar anyway? I guess I will find out.
Looking at other lossless compression options, there are indeed still a few more interesting candidates. First and foremost there is QOIR, an enhanced version of QOI that is better in all aspects, producing a better compression ratio with lower CPU usage, at least in theory. Swapping QOI for it would be an obvious choice. Unfortunately, it's also a lot more complex than QOI, so a pure managed C# implementation like the one I have for QOI right now doesn't seem feasible; I would probably have to include it as a separate native library, making it harder to use cross-platform.
The QOIR author has also produced some benchmarks that are interesting for me here, because from that table fpng(e) and LZ4PNG could provide a bit more throughput while sacrificing some compression ratio for it, so I definitely want to have a look at those.
Losslessly compressing on the GPU (e.g. with nvCOMP) might also be interesting, but I have absolutely zero experience with GPU coding, so I don't know whether I really want to go down that route or what to expect from it. I could also try to utilize NVENC (H.264/H.265/AV1) for a lossy GPU-based option.
I have yet to decide whether, after finishing all these experiments, only a few algorithms will survive (there's still always the compression level option to adapt them to specific needs) or whether I keep this many of them. It will depend on whether I get the impression that they perform very differently based on the content they are applied to or whether that doesn't matter much. My favorite option would be to eventually keep only one lossy and one lossless compression (The Best™ of each for the most common scenarios), because that also means less maintenance work in the future.
Another thing I have been working on, and continue to work on right now, is enhanced frame buffering/sorting logic. My current code is already a bit improved compared to 0.6.0, which can still produce frames in the wrong order when not synced to the OBS render thread, causing issues in OBS.
And of course I also want to implement a filter solution that can be applied to a single source. I realize this is probably something many people want more than all the other things mentioned here, but for me it makes sense to do it later, to save me from having to do things twice for the output and the filter.