I have a few questions about the AudioSource::SortAudio() method, which I came across while trying to fix out-of-sync issues in the VideoSourcePlugin. As I understand it, the method is meant to remove problems introduced by draining the buffers of auxiliary audio sources and the bad timestamps that draining can produce.
To explain what the problem in the plugin was, I'll quote what I wrote on the plugin's pull request:
Basically VLC pushes a lot of audio at the start, let's say 2 seconds. In addition, OBS pulls all of the buffered audio from auxiliary sources like our (null) device at once. Because of this, those first two seconds of audio all get the same timestamp: we use GetAudioTime() for our timestamps, and its value doesn't change during the burst. The PTS values VLC delivers are also pretty unreliable, so they don't help at all.
OBS has two mechanisms for dealing with this problem, but both of them fail.
The first one is AudioSource::SortAudio(), which is called after OBS burst-pulls the audio buffer. This method ensures the segments' timestamps are spaced 10ms apart, but it assumes the last timestamp pulled is correct and adjusts the earlier timestamps accordingly. That is obviously wrong in our case, where the first timestamp is the correct one. It also means the first segments are shifted into the past and are therefore discarded.
The second mechanism is part of AudioSource::QueryAudio2, where the timestamps are smoothed, meaning they are effectively ignored and replaced with the last timestamp + 10ms. The original timestamp is only used if it deviates by more than 70ms. Since VLC pushes far more than 70ms of audio and these segments all carry the same timestamp, this smoothing is not enough.
The way I fixed it, the plugin now does the smoothing itself: it sets each new timestamp to the last timestamp + 10ms, and only falls back to GetAudioTime() when that value is greater than the last timestamp + 10ms.
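To make the difference between the two policies concrete, here is a minimal standalone sketch of both, as I understand them from the behavior described above. The function names, the millisecond units, and the exact interpretation of the 70ms tolerance are my assumptions, not the actual OBS code; GetAudioTime() is modeled as a precomputed vector of clock readings.

```cpp
#include <cstdint>
#include <vector>

constexpr uint64_t SEGMENT_MS = 10; // audio segment spacing
constexpr uint64_t JUMP_MS    = 70; // smoothing tolerance (assumption: measured
                                    // against the expected timestamp)

// Sketch of the QueryAudio2-style smoothing described above: expect
// last + 10ms, and fall back to the source timestamp only when it
// deviates from the expectation by more than 70ms.
std::vector<uint64_t> smooth_obs(const std::vector<uint64_t> &in)
{
    std::vector<uint64_t> out;
    for (uint64_t ts : in) {
        if (out.empty()) { out.push_back(ts); continue; }
        uint64_t expected = out.back() + SEGMENT_MS;
        uint64_t diff = ts > expected ? ts - expected : expected - ts;
        out.push_back(diff > JUMP_MS ? ts : expected);
    }
    return out;
}

// Sketch of the plugin-side fix: always advance by 10ms, and only
// re-sync to the clock (GetAudioTime() stand-in) when the clock has
// moved ahead of the smoothed timeline.
std::vector<uint64_t> smooth_plugin(const std::vector<uint64_t> &clock)
{
    std::vector<uint64_t> out;
    for (uint64_t now : clock) {
        if (out.empty()) { out.push_back(now); continue; }
        uint64_t next = out.back() + SEGMENT_MS;
        out.push_back(now > next ? now : next);
    }
    return out;
}
```

Feeding both functions a burst of ten segments that all carry the clock value 1000 shows the failure mode: the OBS-style smoothing walks forward 10ms at a time until the 70ms tolerance is exceeded, then snaps back to the stale timestamp, while the plugin-side policy stays monotonic throughout.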
So the first question I have is: why does the method assume that the last timestamp pulled is correct? Since the OBS audio loop pulls all of the available audio data at once, a plugin that relies on time information from OBS gets a correct timestamp only for the first segment; all of the following segments carry the same value. Were there any plugins or sources where the last timestamp pulled was more reliable than the first one?
The second question would be: why is the entire available audio buffer pulled at once? In the case of the desktop source, the buffer is only drained until the specified scene buffering time is reached. Were there problems with auxiliary audio sources if they weren't drained entirely?