OBS ASIO Recording - Best audio encoding for lowest cpu usage

osbjslwo

New Member
Hello,

Using the OBS ASIO plugin (https://github.com/Andersama/obs-asio), would anyone know which audio encoding uses the least CPU when recording with OBS?

Would it be PCM little endian with mkv container? Or is there still format conversion happening from the internal audio in OBS even when using PCM?

Does anyone know what the internal audio format in OBS is? And is it different when using the ASIO plugin?

Thanks
 

AaronD

Active Member
The internal format is 32-bit float. Everything gets converted to that, processed, and the result encoded in whatever format you set.

If you're concerned with the processing effort of different formats of audio, and the conversion between them, then you probably don't have the juice to do video at all, in which case OBS would not be the tool for you. Get something that does audio only, like a DAW.

It takes a ***LOT*** of audio to add up to just one decent stream of video.
 

osbjslwo

New Member
I would like to avoid the double conversion, 24sle (as used in audio cards) -> 32 float (obs) -> 24sle (back to 24)
 

rockbottom

Active Member
BTW, you can record 32-bit float, you're already in the Custom Output. Edit: Don't even need to go there anymore, 32-Float is available in Advanced Standard mode now.
 

rockbottom

Active Member
Had some running around to do. Anyway, I record 24-bit with OBS since Avidemux doesn't like floating point audio & I like being able to use it in my workflow. But, if I'm recording something that I know will have a huge dynamic range, I also run a 32-bit recording with Audacity. If I happen to get any clipping in the 24-bit recording, I can easily fix the mistake by replacing the audio with the 32-bit & re-rendering a new 24-bit track.
 

AaronD

Active Member
I would like to avoid the double conversion, 24sle (as used in audio cards) -> 32 float (obs) -> 24sle (back to 24)

Like I said:
If you're concerned with the processing effort of different formats of audio, and the conversion between them, then you probably don't have the juice to do video at all, in which case OBS would not be the tool for you. Get something that does audio only, like a DAW.

It takes a ***LOT*** of audio to add up to just one decent stream of video.
In other words, don't worry about it. You're optimizing something that doesn't need to be optimized.

16-bit integer is already indistinguishable from analog, if done right, and most consumer sound cards are 16-bit, not 24. And most people who use them don't do it right. Despite all of that, it *still* comes out okay!

The on-board analog audio on a Raspberry Pi is only 11 bits, if I remember correctly, and I have a 24/7 streaming radio station running through it, that sounds just fine for what it does. You don't *need* all that many bits to be good.

24-bit integer is better than analog could ever hope for, so the bottom few bits of that will always be analog noise. Any more bits than that is not to sound better, because it can't. It's to reduce the cumulative roundoff error through all of the processing, so that the system is effectively perfect from the 24-bit input converters if it even has that, to the 24-bit output converters if it even has that. Again, most consumer systems are 16-bit on both ends, which is itself indistinguishable from analog if done right, and most consumers don't do it right and they still sound good.
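
To put rough numbers on the bit depths mentioned above, here is a small sketch using the usual rule of thumb that each bit is worth about 6.02 dB of dynamic range. It is a theoretical ceiling only; it ignores dither, converter quality, and analog noise, and the function name is made up for illustration.

```python
import math

def dynamic_range_db(bits):
    """Theoretical ratio between full scale and one step, in dB."""
    return 20 * math.log10(2 ** bits)

for bits in (11, 16, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
# 11-bit: 66.2 dB
# 16-bit: 96.3 dB
# 24-bit: 144.5 dB
```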

32-bit is 24-bit, just the floating point sample rate is different.
Almost, sorta-kinda. The extra 8 bits do mean something, and you lose that when they go away. They're like a volume control, added onto each 24-bit sample, so that the top bit is always 1 (and therefore not actually stored) and you get the full 23-bit resolution following that leading 1, for every sample independently.

That gives you constant quality regardless of level, and makes it almost "clip proof" and "noise proof" if you really get to abusing the gain structure.
(there are other reasons though, to keep a good gain structure, one of which is to keep all of the controls in useful positions to adjust from)
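
The "per-sample volume control" can be seen by pulling a 32-bit float apart. This is a minimal sketch assuming the standard IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 stored fraction bits); the helper names are hypothetical and it says nothing about how OBS handles its buffers.

```python
import struct

def float32_fields(x):
    """Round x to single precision and split it into its three bit fields."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF  # 8-bit biased exponent: the "volume control"
    fraction = bits & 0x7FFFFF      # 23 stored bits; the leading 1 is implied
    return sign, exponent, fraction

def rebuild(sign, exponent, fraction):
    """Reassemble a normal (non-zero, non-denormal) float32 from its fields."""
    significand = 1 + fraction / 2**23
    return (-1) ** sign * significand * 2.0 ** (exponent - 127)

loud = float32_fields(0.75)          # same sample value...
quiet = float32_fields(0.75 / 1024)  # ...about 60 dB quieter

print(loud)   # (0, 126, 4194304)
print(quiet)  # (0, 116, 4194304)
```

Only the exponent changes between the loud and the quiet sample. The 23 fraction bits are identical, which is the constant resolution regardless of level described above.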

Just make sure the levels are below 0 dBfs when going from 32 Float > 24 Fixed.
Yes! When you lose the built-in per-sample volume control, your dynamic range shrinks to what those 24 bits can actually have by themselves. Still better than analog could ever hope to be, but a restriction nonetheless compared to the world of floating point.
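
A rough sketch of why levels matter at that conversion, assuming float 1.0 maps to 2**23 and that out-of-range values are hard-clipped. Both are assumptions for illustration; the actual scaling and rounding in OBS or any given encoder may differ.

```python
FULL_SCALE = 2**23  # assumed: float 1.0 corresponds to 2**23 in 24-bit fixed

def float_to_int24(x):
    """Scale a float sample to 24-bit and hard-clip anything out of range."""
    scaled = round(x * FULL_SCALE)
    return max(-FULL_SCALE, min(FULL_SCALE - 1, scaled))

over = float_to_int24(1.5)          # peak above 0 dBFS: pinned at the ceiling
fixed = float_to_int24(1.5 * 0.5)   # same peak with gain pulled down first

print(over)   # 8388607
print(fixed)  # 6291456
```

In float the 1.5 peak is stored without trouble; it only becomes a problem once the result has to fit into 24 fixed bits, so the gain has to come down before the conversion, not after.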
 

rockbottom

Active Member
There's no info in those last bits, empty. But yeah, 32-bit is pretty much clip proof. I always record the 32-bit @ the same levels as the 24-bit, no scaling needed, ready to go, just drop it on the timeline, sync it up, drop the levels as needed, go.
 

AaronD

Active Member
I always record the 32-bit @ the same levels as the 24-bit, no scaling needed, ready to go...
Yes, the meters are calibrated on purpose to read the same on either side of a conversion.

FP 1.0 is considered "full scale", even though it's only about half of the geometric range. That lines up nicely with the dB equation, to give you dBFS, which is referenced to "full scale".
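
The dB equation referred to here is the standard amplitude form, 20 * log10(|x|), with 1.0 as the reference. A minimal sketch, with a hypothetical function name:

```python
import math

def dbfs(amplitude):
    """Level of a sample amplitude relative to full scale (1.0), in dB."""
    if amplitude == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(abs(amplitude))

print(dbfs(1.0))            # 0.0
print(round(dbfs(0.5), 2))  # -6.02
print(round(dbfs(2.0), 2))  # 6.02
```

A float value of 2.0 simply reads as about +6 dBFS, which is the headroom above "full scale" that fixed point does not have.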

Integer is actually a misnomer, carried over from the software engineering world that uses that datatype for what is functionally all-fractional.

Thus, 0.9999999999... is still full scale in "integer", if you interpret the bits as they're actually used for audio. The bit values are not 1, 2, 4, etc, starting with LSb. They're 1/2, 1/4, 1/8, etc, starting with MSb. More bits gives you finer precision, not higher values.

But the debugger doesn't know that. The software engineer has to do that conversion, and manually keep track of where the fractional point is. When using a general-purpose chip (which happens a lot), it's expected to explicitly shift the results over a few bits, or a lot of bits, partway through the math, because the hardware also assumes actual integers, not fractions, and you want to keep the fractional point (that you're keeping track of manually because the machine doesn't) from "running off the end".

When you get it right, it "just works" as if it were designed to do that. It's generally easier though, to use floating point on hardware that has a FP processor, but sometimes for basic stuff, you don't have that, and so you use fixed point, which is really what you're doing when you keep an "integer" datatype for the math itself.
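
As an illustration of keeping track of the fractional point by hand, here is a small fixed-point sketch. It assumes a layout with 23 fractional bits (often written Q1.23) and uses made-up helper names; real DSP code would also deal with rounding and saturation.

```python
FRAC_BITS = 23  # assumed layout: sign bit plus 23 fractional bits

def to_q23(x):
    """Store a fraction in [-1, 1) as a plain integer."""
    return round(x * (1 << FRAC_BITS))

def from_q23(n):
    """Interpret the stored integer as a fraction again."""
    return n / (1 << FRAC_BITS)

def q23_mul(a, b):
    """Multiply two Q23 values.

    The raw integer product carries 46 fractional bits, so it is shifted
    right by 23 to put the fractional point back where it belongs.
    """
    return (a * b) >> FRAC_BITS

half = to_q23(0.5)
quarter = to_q23(0.25)
product = q23_mul(half, quarter)

print(half, quarter, product)  # 4194304 2097152 1048576
print(from_q23(product))       # 0.125
```

The machine only ever sees integers. The shift after the multiply is the manual step that keeps the fractional point from "running off the end".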
 

osbjslwo

New Member
So 100% confirmed 24sle -> 32float -> 24sle introduces absolutely no artifact and does not impact the quality of the audio?
 

AaronD

Active Member
So 100% confirmed 24sle -> 32float -> 24sle introduces absolutely no artifact and does not impact the quality of the audio?
If you really think through it, you can find ways to abuse things to create artifacts that a machine can detect. That will always be possible. But you'll never hear them.

Bit-perfection is like chasing windmills. Don't bother. If it sounds the same, even with a completely different bit pattern, then it's the same.
The part about time-shifting less than one sample's worth, and what that does to the samples, might be interesting to you.
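
For the plain conversion on its own, the question can be checked directly. A 32-bit float carries 24 significant bits, so every 24-bit integer sample survives a trip through float and back when the scale factor is a power of two and nothing else touches the signal. The sketch below assumes a 2**23 scale and no gain, mixing, resampling, or dither in between; it does not show what OBS actually does internally.

```python
import struct

SCALE = 2**23  # assumed power-of-two scaling between int24 and float

def to_float32(x):
    """Round a Python float (a double) to single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def int24_to_float32(n):
    return to_float32(n / SCALE)

def float32_to_int24(x):
    return int(round(x * SCALE))

samples = [-8388608, -8388607, -1, 0, 1, 12345, 4194305, 8388607]
round_trip = [float32_to_int24(int24_to_float32(n)) for n in samples]

print(round_trip == samples)  # True
```

Once any processing happens in float (a fader that is not at unity, a filter, a mix of two sources), the result is no longer guaranteed to land back on the original bit pattern, which is where the "artifacts a machine can detect" come from.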
 