Question / Help 1080p60: CPU load to high, despite of QuickSync Encoder

spyro

New Member
Hi there,

I wonder why OBS has a CPU load this high when I try to preview in 1080p60 (with QuickSync enabled). The encoding itself seems not to be problem at all as I can convert a 1080p60-Video with Handbrake (Nightly) with more than 120 fps to H.264 (CPU-Load is about 15%). So it seems, that the conversation to NV12 is by far the slowest part in the chain.

With OBS I have massive framedrops even at the preview - no chance for 1080p60 at all. It seems, that especially the color conversion to NV12 (which runs on the CPU for some reason) is a real bottleneck:

17:31:25: Settings::Video: Enabling Aero
17:31:33: =====Stream Start: 2014-05-26, 17:31:33===============================================
17:31:33: Multithreaded optimizations: On
17:31:33: Base resolution: 1920x1080
17:31:33: Output resolution: 1920x1080
17:31:33: ------------------------------------------
17:31:33: Loading up D3D10 on Intel(R) HD Graphics 4600 (Adapter 1)...
17:31:33: ------------------------------------------
17:31:33: Audio Format: 48000 Hz
17:31:33: ------------------------------------------
17:31:33: Audio Channels: 2 Ch
17:31:33: Playback device Default
17:31:33: ------------------------------------------
17:31:33: Using desktop audio input: Lautsprecher (High Definition Audio-Gerät)
17:31:33: Global Audio time adjust: 0
17:31:33: ------------------------------------------
17:31:33: Audio Encoding: AAC
17:31:33: bitrate: 128
17:31:33: Using text output
17:31:33: Using text output
17:31:33: ------------------------------------------
17:31:33: device: INOGENI HDMI/DVI-2-USB3,
17:31:33: device id \\?\usb#vid_2997&pid_0001&mi_00#6&8931867&0&0000#{65e8773d-8f56-11d0-a3b9-00a0c9223196}\global,
17:31:33: chosen type: YUY2, usingFourCC: false, res: 1920x1080 - 1920x1080, frameIntervals: 166666-417083
17:31:33: use buffering: false - 0, fourCC: 'YUY2'
17:31:33: audio device: Digitale Audioschnittstelle (IN,
17:31:33: audio device id (null),
17:31:33: audio time offset 0,
17:31:33:
17:31:33: device audio info - bits per sample: 16, channels: 2, samples per sec: 44100, block size: 4
17:31:33: Using directshow input
17:31:33: Scene buffering time set to 700
17:31:33: Found QSV hardware support
17:31:34: ------------------------------------------
17:31:34: QSV version 1.6 using MFX_IMPL_HARDWARE_ANY | MFX_IMPL_VIA_D3D9 (actual: MFX_IMPL_HARDWARE | MFX_IMPL_VIA_D3D9)
17:31:34: Using 13 bitstreams and 16 frame buffers
17:31:34: ------------------------------------------
17:31:34: Video Encoding: QSV
17:31:34: fps: 60
17:31:34: width: 1920, height: 1080
17:31:34: target-usage: MFX_TARGETUSAGE_1_BEST_QUALITY
17:31:34: profile: MFX_PROFILE_AVC_HIGH
17:31:34: CBR: no
17:31:34: CFR: yes
17:31:34: max bitrate: 10000
17:31:34: buffer size: 10000
17:31:34: ------------------------------------------
(...)
17:32:31: Total frames encoded: 3414, total frames duplicated: 2140 (62.68%)
17:32:31: Number of frames skipped due to encoder lag: 1929 (56.50%)
17:32:31: Total frames rendered: 1459, number of late frames: 112 (7.68%) (it's okay for some frames to be late)
17:32:31: Profiler time results:
17:32:31:
17:32:31: ==============================================================
17:32:31: video thread frame - [100%] [avg time: 20.696 ms] [children: 94.2%] [unaccounted: 5.8%]
17:32:31: | scene->Preprocess - [66.4%] [avg time: 13.75 ms]
17:32:31: | GPU download and conversion - [27.8%] [avg time: 5.746 ms] [children: 10.6%] [unaccounted: 17.2%]
17:32:31: | | flush - [0.633%] [avg time: 0.131 ms]
17:32:31: | | CopyResource - [9.25%] [avg time: 1.914 ms]
17:32:31: | | conversion to 4:2:0 - [0.672%] [avg time: 0.139 ms]
17:32:31: Convert444Thread - [100%] [avg time: 14.791 ms] [children: 99.7%] [unaccounted: 0.264%]
17:32:31: | Convert444toNV12 - [99.7%] [avg time: 14.752 ms]
17:32:31: encoder thread frame - [100%] [avg time: 15.404 ms] [children: 99.5%] [unaccounted: 0.532%]
17:32:31: | QueueEncodeTask - [0.0584%] [avg time: 0.009 ms]
17:32:31: | ProcessEncodedFrame - [99.4%] [avg time: 15.307 ms]
17:32:31: | sending stuff out - [0.039%] [avg time: 0.006 ms]
17:32:31: ==============================================================
17:32:31:
17:32:31:
17:32:31: Profiler CPU results:
17:32:31:
17:32:31: ==============================================================
17:32:31: video thread frame - [cpu time: avg 13.921 ms, total 20311.3 ms] [avg calls per frame: 1]
17:32:31: | scene->Preprocess - [cpu time: avg 12.82 ms, total 18704.5 ms] [avg calls per frame: 1]
17:32:31: | GPU download and conversion - [cpu time: avg 0.299 ms, total 436.802 ms] [avg calls per frame: 1]
17:32:31: | | flush - [cpu time: avg 0.074 ms, total 109.2 ms] [avg calls per frame: 1]
17:32:31: | | CopyResource - [cpu time: avg 0.088 ms, total 124.801 ms] [avg calls per frame: 1]
17:32:31: | | conversion to 4:2:0 - [cpu time: avg 0.011 ms, total 15.6 ms] [avg calls per frame: 1]
17:32:31: Convert444Thread - [cpu time: avg 13.742 ms, total 19390.9 ms] [avg calls per frame: 1]
17:32:31: | Convert444toNV12 - [cpu time: avg 13.698 ms, total 19328.5 ms] [avg calls per frame: 1]
17:32:31: encoder thread frame - [cpu time: avg 0.05 ms, total 171.601 ms] [avg calls per frame: 1]
17:32:31: | QueueEncodeTask - [cpu time: avg 0.004 ms, total 15.6 ms] [avg calls per frame: 1]
17:32:31: | ProcessEncodedFrame - [cpu time: avg 0.037 ms, total 124.801 ms] [avg calls per frame: 1]
17:32:31: | sending stuff out - [cpu time: avg 0 ms, total 0 ms] [avg calls per frame: 1]
17:32:31: ==============================================================
17:32:31:
17:32:31: =====Stream End: 2014-05-26, 17:32:31=================================================


Maybe there is some DirectShow-Filter with hardware support missing or something on my system...? Are there any possibilities to speed this up? Maybe it's possible to use the Intel Media SDK for hardware accelerated color conversion or something?

Thank you and keep up the great work! :)
spyro
 
Last edited:

Floatingthru

Community Helper
Post your entire log please. Also, you can try changing the QSV preset to a faster one (7 is the fastest) then see if that eliminates all the duplicate and late frames.
 

spyro

New Member
Hello Floatingthru,

thanks for your response. Unfortunely I can post the full log not right now as I got no access to our grabbing hardware the next days. But I will do so as far as I can.

Anyway, this high CPU load even with QuickSync is very strange. When I convert a 1080p60 file with Handbrake (nightly) I got only a few percent CPU load and about 120-140 fps, even when using the highest QSV preset:

omuvxre8.png


jlwcrlg5.png


So, are you sure that QSV ist the problem here? Especially scene->Preprocess - [66.4%] [avg time: 13.75 ms] and Convert444toNV12 - [99.7%] [avg time: 14.752 ms] seems very high to me. Are this normal values?

Maybe it would be possible to use the highly optimized functions from the Intel IPP package to accelerate the color conversation (see https://software.intel.com/en-us/articles/intel-media-sdk-tutorial-simple-6-encode-ipp-cc)?

Thanks again,
spyro
 
Last edited:

Boildown

Active Member
Check and see which capture method you're using and check the stickied posts that you're doing it right. Without the full log we can't say for sure what your problem is though.
 

spyro

New Member
Here's the full logfile (testet streaming on 1080p).

Here's the interesting part:

12:33:05: =====Stream Start: 2014-09-12, 12:33:05===============================================
12:33:05: Multithreaded optimizations: On
12:33:05: Base resolution: 1920x1080
12:33:05: Output resolution: 1920x1080
12:33:05: ------------------------------------------
12:33:05: Loading up D3D10 on Intel(R) HD Graphics 4600 (Adapter 1)...
12:33:05: ------------------------------------------
12:33:05: Audio Format: 48000 Hz
12:33:05: ------------------------------------------
12:33:05: Audio Channels: 2 Ch
12:33:05: Playback device {0.0.0.00000000}.{7a9ddaa5-5261-4345-9889-ada38b0447b7}
12:33:05: ------------------------------------------
12:33:05: Using desktop audio input: USB Audio (Dummy) (2- USB Audio Device)
12:33:05: Global Audio time adjust: 500
12:33:05: ------------------------------------------
12:33:05: Using auxilary audio input: Audiomix (USB) (2- USB Audio Device)
12:33:05: Mic time offset: 500
12:33:05: ------------------------------------------
12:33:05: Audio Encoding: AAC
12:33:05: bitrate: 128
12:33:05: Using text output
12:33:05: Using text output
12:33:05: ------------------------------------------
12:33:05: device: XI100DUSB-HDMI Video,
12:33:05: device id \\?\usb#vid_2935&pid_0001&mi_00#6&36c2c457&0&0000#{65e8773d-8f56-11d0-a3b9-00a0c9223196}\global,
12:33:05: chosen type: YUY2, usingFourCC: false, res: 1920x1080 - 1920x1080, frameIntervals: 166667-666667
12:33:05: use buffering: false - 0, fourCC: 'YUY2'
12:33:05: audio device: Disable,
12:33:05: audio device id Disabled,
12:33:05: audio time offset 0,
12:33:05:
12:33:05: Using directshow input
12:33:05: ------------------------------------------
12:33:05: device: INOGENI HDMI/DVI-2-USB3,
12:33:05: device id \\?\usb#vid_2997&pid_0001&mi_00#6&8931867&0&0000#{65e8773d-8f56-11d0-a3b9-00a0c9223196}\global,
12:33:05: chosen type: YUY2, usingFourCC: false, res: 1920x1080 - 1920x1080, frameIntervals: 166666-417083
12:33:05: use buffering: false - 0, fourCC: 'YUY2'
12:33:05: audio device: Disable,
12:33:05: audio device id Disabled,
12:33:05: audio time offset 0,
12:33:05:
12:33:05: Using directshow input
12:33:05: Scene buffering time set to 700
12:33:05: Found QSV hardware support
12:33:06: ------------------------------------------
12:33:06: QSV version 1.8 using MFX_IMPL_HARDWARE_ANY | MFX_IMPL_VIA_D3D9 (actual: MFX_IMPL_HARDWARE | MFX_IMPL_VIA_D3D9)
12:33:06: Using 13 bitstreams and 16 frame buffers
12:33:06: ------------------------------------------
12:33:06: Video Encoding: QSV
12:33:06: fps: 30
12:33:06: width: 1920, height: 1080
12:33:06: target-usage: MFX_TARGETUSAGE_4_BALANCED
12:33:06: profile: MFX_PROFILE_AVC_HIGH
12:33:06: CBR: no
12:33:06: CFR: yes
12:33:06: max bitrate: 9000
12:33:06: buffer size: 9000
12:33:06: rate control: VBR
12:33:06: ------------------------------------------
12:33:06: CLRHost:: HDM-STREAMER: OnStartStream
12:33:06: CLRHost:: HDM-STREAMER: ShowSetActiveFirstMessage()
12:33:06: Audio timestamp for device 'Audiomix (USB) (2- USB Audio Device)' was behind target timestamp by 93453531
12:33:07: Error: all frames are in use
12:33:07: Error: all frames are in use
12:33:11: Error: all frames are in use
12:33:11: Error: all frames are in use
12:33:12: Error: all frames are in use
12:33:12: Error: all frames are in use
12:33:12: Error: all frames are in use
12:33:24: FlushBufferedVideo: Flushing 12 packets over 367 ms
12:33:25: Total frames encoded: 546, total frames duplicated: 94 (17.22%)
12:33:25: Number of frames skipped due to encoder lag: 4 (0.73%)
12:33:25: Total frames rendered: 551, number of late frames: 3 (0.54%) (it's okay for some frames to be late)
12:33:25: CLRHost:: HDM-STREAMER: OnStopStream()
12:33:25:
12:33:25: Profiler time results:
12:33:25:
12:33:25: ==============================================================
12:33:25: video thread frame - [100%] [avg time: 24.048 ms] [children: 94.1%] [unaccounted: 5.91%]
12:33:25: | scene->Preprocess - [81.9%] [avg time: 19.705 ms]
12:33:25: | GPU download and conversion - [12.2%] [avg time: 2.922 ms] [children: 10.2%] [unaccounted: 1.93%]
12:33:25: | | flush - [0.94%] [avg time: 0.226 ms]
12:33:25: | | CopyResource - [7.17%] [avg time: 1.724 ms]
12:33:25: | | conversion to 4:2:0 - [2.11%] [avg time: 0.507 ms]
12:33:25: Convert444Thread - [100%] [avg time: 14.12 ms] [children: 99.8%] [unaccounted: 0.212%]
12:33:25: | Convert444toNV12 - [99.8%] [avg time: 14.09 ms]
12:33:25: encoder thread frame - [100%] [avg time: 2 ms] [children: 91.3%] [unaccounted: 8.7%]
12:33:25: | QueueEncodeTask - [1.8%] [avg time: 0.036 ms]
12:33:25: | ProcessEncodedFrame - [68.1%] [avg time: 1.362 ms]
12:33:25: | sending stuff out - [21.4%] [avg time: 0.428 ms]
12:33:25: ==============================================================
12:33:25:
12:33:25:
12:33:25: Profiler CPU results:
12:33:25:
12:33:25: ==============================================================
12:33:25: video thread frame - [cpu time: avg 19.393 ms, total 10686.1 ms] [avg calls per frame: 1]
12:33:25: | scene->Preprocess - [cpu time: avg 18.006 ms, total 9921.66 ms] [avg calls per frame: 1]
12:33:25: | GPU download and conversion - [cpu time: avg 0.283 ms, total 156 ms] [avg calls per frame: 1]
12:33:25: | | flush - [cpu time: avg 0.198 ms, total 109.2 ms] [avg calls per frame: 1]
12:33:25: | | CopyResource - [cpu time: avg 0 ms, total 0 ms] [avg calls per frame: 1]
12:33:25: | | conversion to 4:2:0 - [cpu time: avg 0.029 ms, total 15.6 ms] [avg calls per frame: 1]
12:33:25: Convert444Thread - [cpu time: avg 12.48 ms, total 6614.44 ms] [avg calls per frame: 1]
12:33:25: | Convert444toNV12 - [cpu time: avg 12.48 ms, total 6614.44 ms] [avg calls per frame: 1]
12:33:25: encoder thread frame - [cpu time: avg 0.555 ms, total 296.402 ms] [avg calls per frame: 1]
12:33:25: | QueueEncodeTask - [cpu time: avg 0 ms, total 0 ms] [avg calls per frame: 1]
12:33:25: | ProcessEncodedFrame - [cpu time: avg 0.262 ms, total 140.402 ms] [avg calls per frame: 1]
12:33:25: | sending stuff out - [cpu time: avg 0.146 ms, total 78 ms] [avg calls per frame: 1]
12:33:25: ==============================================================
12:33:25:
12:33:25: =====Stream End: 2014-09-12, 12:33:25=================================================
 

Attachments

  • 2014-09-12-1157-26.log
    62.5 KB · Views: 8
Last edited:
Top