I would like to avoid the double conversion, 24sle (as used in audio cards) -> 32 float (obs) -> 24sle (back to 24)
Like I said:
If you're concerned with the processing effort of different formats of audio, and the conversion between them, then you probably don't have the juice to do video at all, in which case OBS would not be the tool for you. Get something that does audio only, like a DAW.
It takes a ***LOT*** of audio to add up to just one decent stream of video.
In other words, don't worry about it. You're optimizing something that doesn't need to be optimized.
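To put rough numbers on that, here is a back-of-the-envelope sketch. The figures are my own assumptions, not anything OBS reports: stereo 32-bit float audio at 48 kHz, compared against uncompressed 1080p60 video at 12 bits per pixel (a 4:2:0 layout such as NV12).

```python
# Assumed audio format: 48 kHz, 2 channels, 4 bytes per float sample.
AUDIO_RATE_HZ = 48_000
AUDIO_CHANNELS = 2
AUDIO_BYTES_PER_SAMPLE = 4

# Assumed video format: 1920x1080 at 60 fps, 1.5 bytes per pixel (4:2:0).
VIDEO_WIDTH = 1920
VIDEO_HEIGHT = 1080
VIDEO_FPS = 60
VIDEO_BYTES_PER_PIXEL = 1.5

audio_bytes_per_sec = AUDIO_RATE_HZ * AUDIO_CHANNELS * AUDIO_BYTES_PER_SAMPLE
video_bytes_per_sec = VIDEO_WIDTH * VIDEO_HEIGHT * VIDEO_BYTES_PER_PIXEL * VIDEO_FPS

# How many stereo float audio streams fit into one raw video stream.
ratio = video_bytes_per_sec / audio_bytes_per_sec

print(f"audio: {audio_bytes_per_sec} bytes/s")
print(f"video: {video_bytes_per_sec:.0f} bytes/s")
print(f"ratio: {ratio:.0f}x")
```

With those assumptions, one raw video stream moves several hundred times the data of a stereo float audio stream, so a format conversion on the audio side is a tiny slice of the total work.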
16-bit integer is already indistinguishable from analog, if done right, and most consumer sound cards are 16-bit, not 24. And most people who use them don't do it right. Despite all of that, it *still* comes out okay!
The on-board analog audio on a Raspberry Pi is only 11 bits, if I remember correctly, and I have a 24/7 streaming radio station running through it, that sounds just fine for what it does. You don't *need* all that many bits to be good.
24-bit integer is better than analog could ever hope for, so the bottom few bits of that will always be analog noise. Any more bits than that are not there to sound better, because they can't. They're there to reduce the cumulative roundoff error through all of the processing, so that the system is effectively perfect from the 24-bit input converters, if it even has that, to the 24-bit output converters, if it even has that. Again, most consumer systems are 16-bit on both ends, which is itself indistinguishable from analog if done right, and most consumers don't do it right and they still sound good.
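Here's a small sketch of that roundoff point. It is not how OBS or any particular mixer does its math, just an illustration under my own assumptions: one made-up 24-bit sample gets turned down by a factor of 1000 (about 60 dB) and then turned back up, once with the result rounded to a whole 24-bit step after each stage, and once in 32-bit float. The `to_f32` helper is something I made up to emulate single precision in plain Python.

```python
import struct

def to_f32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

FULL_SCALE = 2 ** 23      # assumed scaling: 24-bit full scale maps to 1.0
original = 1234567        # arbitrary 24-bit integer sample

# Integer path: every stage rounds back to a whole 24-bit step.
quiet_int = round(original * 0.001)     # turn down about 60 dB
back_int = round(quiet_int * 1000)      # turn back up
int_error = abs(back_int - original)    # error in 24-bit steps

# Float path: every stage rounds to 32-bit float instead.
quiet_f = to_f32(to_f32(original / FULL_SCALE) * 0.001)
back_f = to_f32(quiet_f * 1000)
float_error = abs(back_f * FULL_SCALE - original)  # also in 24-bit steps

print("integer path error:", int_error)
print("float path error:  ", float_error)
```

The integer path comes back hundreds of steps off, because the quiet intermediate value only had a handful of bits left to work with. The float path stays within a fraction of one 24-bit step.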
32-bit is 24-bit, just the floating point sample rate is different.
Almost, sorta-kinda. The extra 8 bits do mean something, and you lose that when they go away. They're the exponent, which works like a volume control added onto each 24-bit sample, so that the top bit is always 1 (and therefore not actually stored) and you get the full 23-bit resolution following that leading 1, for every sample independently.
That gives you constant quality regardless of level, and makes it almost "clip proof" and "noise proof" if you really get to abusing the gain structure.
(There are other reasons, though, to keep a good gain structure, one of which is to keep all of the controls in useful positions to adjust from.)
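To see that per-sample "volume control" directly, here's a minimal sketch that pulls a 32-bit float apart into its sign, exponent, and stored fraction bits. The function name `float32_fields` is my own, and it only handles normal (not denormal) values.

```python
import struct

def float32_fields(x):
    """Split x (rounded to IEEE 754 single precision) into (sign, exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                          # 1 bit
    exponent = ((bits >> 23) & 0xFF) - 127     # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF                 # 23 stored bits; the leading 1 is implied
    return sign, exponent, fraction

loud = float32_fields(0.75)
quiet = float32_fields(0.75 / 1024)   # same sample, about 60 dB quieter

print("loud: ", loud)
print("quiet:", quiet)
```

Both samples carry the identical 23-bit fraction. Only the exponent changes, which is why the resolution doesn't drop when the level does.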
Just make sure the levels are below 0 dBFS when going from 32 Float > 24 Fixed.
Yes! When you lose the built-in per-sample volume control, your dynamic range shrinks to what those 24 bits can actually have by themselves. Still better than analog could ever hope to be, but a restriction nonetheless compared to the world of floating point.
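A rough sketch of what that float-to-fixed step looks like, and why levels over 0 dBFS matter there. This is an illustration, not OBS's actual code: the function names are made up, and I'm assuming the common convention that float 1.0 maps to 2^23 with a hard clip at the ends.

```python
FULL_SCALE = 2 ** 23   # assumed: float 1.0 corresponds to 24-bit full scale

def float_to_s24le(x):
    """Convert one float sample to 3 bytes of signed 24-bit little-endian."""
    n = round(x * FULL_SCALE)
    # Anything outside the 24-bit range is hard-clipped; the headroom is gone.
    n = max(-FULL_SCALE, min(FULL_SCALE - 1, n))
    return n.to_bytes(3, "little", signed=True)

def s24le_to_float(b):
    """Convert 3 bytes of signed 24-bit little-endian back to a float."""
    return int.from_bytes(b, "little", signed=True) / FULL_SCALE

in_range = s24le_to_float(float_to_s24le(0.5))   # survives the round trip
too_hot = float_to_s24le(2.0)                    # +6 dB over full scale
at_limit = float_to_s24le(1.0)
too_quiet = float_to_s24le(1e-9)                 # far below one 24-bit step

print(in_range, too_hot.hex(), at_limit.hex(), too_quiet.hex())
```

A sample at 2.0 is perfectly fine in float, but it comes out of the conversion identical to one at 1.0, which is clipping. At the other end, anything smaller than half a 24-bit step rounds to silence.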