Hi there,
I wonder why OBS has such a high CPU load when I try to preview in 1080p60 (with QuickSync enabled). The encoding itself does not seem to be the problem, as I can convert a 1080p60 video to H.264 with Handbrake (Nightly) at more than 120 fps (CPU load is about 15%). So it seems that the conversion to NV12 is by far the slowest part of the chain.
With OBS I get massive frame drops even in the preview, so 1080p60 is out of reach. The color conversion to NV12 in particular (which runs on the CPU for some reason) appears to be a real bottleneck:
17:31:25: Settings::Video: Enabling Aero
17:31:33: =====Stream Start: 2014-05-26, 17:31:33===============================================
17:31:33: Multithreaded optimizations: On
17:31:33: Base resolution: 1920x1080
17:31:33: Output resolution: 1920x1080
17:31:33: ------------------------------------------
17:31:33: Loading up D3D10 on Intel(R) HD Graphics 4600 (Adapter 1)...
17:31:33: ------------------------------------------
17:31:33: Audio Format: 48000 Hz
17:31:33: ------------------------------------------
17:31:33: Audio Channels: 2 Ch
17:31:33: Playback device Default
17:31:33: ------------------------------------------
17:31:33: Using desktop audio input: Lautsprecher (High Definition Audio-Gerät)
17:31:33: Global Audio time adjust: 0
17:31:33: ------------------------------------------
17:31:33: Audio Encoding: AAC
17:31:33: bitrate: 128
17:31:33: Using text output
17:31:33: Using text output
17:31:33: ------------------------------------------
17:31:33: device: INOGENI HDMI/DVI-2-USB3,
17:31:33: device id \\?\usb#vid_2997&pid_0001&mi_00#6&8931867&0&0000#{65e8773d-8f56-11d0-a3b9-00a0c9223196}\global,
17:31:33: chosen type: YUY2, usingFourCC: false, res: 1920x1080 - 1920x1080, frameIntervals: 166666-417083
17:31:33: use buffering: false - 0, fourCC: 'YUY2'
17:31:33: audio device: Digitale Audioschnittstelle (IN,
17:31:33: audio device id (null),
17:31:33: audio time offset 0,
17:31:33:
17:31:33: device audio info - bits per sample: 16, channels: 2, samples per sec: 44100, block size: 4
17:31:33: Using directshow input
17:31:33: Scene buffering time set to 700
17:31:33: Found QSV hardware support
17:31:34: ------------------------------------------
17:31:34: QSV version 1.6 using MFX_IMPL_HARDWARE_ANY | MFX_IMPL_VIA_D3D9 (actual: MFX_IMPL_HARDWARE | MFX_IMPL_VIA_D3D9)
17:31:34: Using 13 bitstreams and 16 frame buffers
17:31:34: ------------------------------------------
17:31:34: Video Encoding: QSV
17:31:34: fps: 60
17:31:34: width: 1920, height: 1080
17:31:34: target-usage: MFX_TARGETUSAGE_1_BEST_QUALITY
17:31:34: profile: MFX_PROFILE_AVC_HIGH
17:31:34: CBR: no
17:31:34: CFR: yes
17:31:34: max bitrate: 10000
17:31:34: buffer size: 10000
17:31:34: ------------------------------------------
(...)
17:32:31: Total frames encoded: 3414, total frames duplicated: 2140 (62.68%)
17:32:31: Number of frames skipped due to encoder lag: 1929 (56.50%)
17:32:31: Total frames rendered: 1459, number of late frames: 112 (7.68%) (it's okay for some frames to be late)
17:32:31: Profiler time results:
17:32:31:
17:32:31: ==============================================================
17:32:31: video thread frame - [100%] [avg time: 20.696 ms] [children: 94.2%] [unaccounted: 5.8%]
17:32:31: | scene->Preprocess - [66.4%] [avg time: 13.75 ms]
17:32:31: | GPU download and conversion - [27.8%] [avg time: 5.746 ms] [children: 10.6%] [unaccounted: 17.2%]
17:32:31: | | flush - [0.633%] [avg time: 0.131 ms]
17:32:31: | | CopyResource - [9.25%] [avg time: 1.914 ms]
17:32:31: | | conversion to 4:2:0 - [0.672%] [avg time: 0.139 ms]
17:32:31: Convert444Thread - [100%] [avg time: 14.791 ms] [children: 99.7%] [unaccounted: 0.264%]
17:32:31: | Convert444toNV12 - [99.7%] [avg time: 14.752 ms]
17:32:31: encoder thread frame - [100%] [avg time: 15.404 ms] [children: 99.5%] [unaccounted: 0.532%]
17:32:31: | QueueEncodeTask - [0.0584%] [avg time: 0.009 ms]
17:32:31: | ProcessEncodedFrame - [99.4%] [avg time: 15.307 ms]
17:32:31: | sending stuff out - [0.039%] [avg time: 0.006 ms]
17:32:31: ==============================================================
17:32:31:
17:32:31:
17:32:31: Profiler CPU results:
17:32:31:
17:32:31: ==============================================================
17:32:31: video thread frame - [cpu time: avg 13.921 ms, total 20311.3 ms] [avg calls per frame: 1]
17:32:31: | scene->Preprocess - [cpu time: avg 12.82 ms, total 18704.5 ms] [avg calls per frame: 1]
17:32:31: | GPU download and conversion - [cpu time: avg 0.299 ms, total 436.802 ms] [avg calls per frame: 1]
17:32:31: | | flush - [cpu time: avg 0.074 ms, total 109.2 ms] [avg calls per frame: 1]
17:32:31: | | CopyResource - [cpu time: avg 0.088 ms, total 124.801 ms] [avg calls per frame: 1]
17:32:31: | | conversion to 4:2:0 - [cpu time: avg 0.011 ms, total 15.6 ms] [avg calls per frame: 1]
17:32:31: Convert444Thread - [cpu time: avg 13.742 ms, total 19390.9 ms] [avg calls per frame: 1]
17:32:31: | Convert444toNV12 - [cpu time: avg 13.698 ms, total 19328.5 ms] [avg calls per frame: 1]
17:32:31: encoder thread frame - [cpu time: avg 0.05 ms, total 171.601 ms] [avg calls per frame: 1]
17:32:31: | QueueEncodeTask - [cpu time: avg 0.004 ms, total 15.6 ms] [avg calls per frame: 1]
17:32:31: | ProcessEncodedFrame - [cpu time: avg 0.037 ms, total 124.801 ms] [avg calls per frame: 1]
17:32:31: | sending stuff out - [cpu time: avg 0 ms, total 0 ms] [avg calls per frame: 1]
17:32:31: ==============================================================
17:32:31:
17:32:31: =====Stream End: 2014-05-26, 17:32:31=================================================
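To put the profiler numbers above in perspective: at 60 fps each frame has a budget of 1000/60, roughly 16.7 ms. The following is only a rough sketch in Python that compares the averages from the log against that budget. The helper names (max_fps, budget_share) are made up for illustration, and it assumes each thread works on one frame at a time.

```python
# Average wall-clock time per frame in ms, copied from "Profiler time results" above.
avg_ms = {
    "video thread frame": 20.696,
    "Convert444Thread": 14.791,
    "encoder thread frame": 15.404,
}

TARGET_FPS = 60
budget_ms = 1000.0 / TARGET_FPS  # time available per frame at 60 fps


def max_fps(ms_per_frame):
    """Highest frame rate a thread can sustain if every frame takes this long."""
    return 1000.0 / ms_per_frame


def budget_share(ms_per_frame, budget=budget_ms):
    """Fraction of the per-frame budget that this step uses (above 1.0 means too slow)."""
    return ms_per_frame / budget


for name, ms in avg_ms.items():
    print(f"{name}: {ms:.3f} ms, {budget_share(ms):.0%} of budget, "
          f"at most {max_fps(ms):.1f} fps")
```

By this estimate the conversion thread alone uses close to 90% of the 60 fps budget, and the video thread exceeds it.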
Maybe a DirectShow filter with hardware support is missing on my system? Is there any way to speed this up? Perhaps the Intel Media SDK could be used for hardware-accelerated color conversion?
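For anyone unfamiliar with the formats, here is a minimal sketch of what a 4:4:4 to NV12 conversion involves. It is plain Python for illustration only and is not how OBS implements Convert444toNV12. The input layout (rows of (y, u, v) tuples) and the choice to average each 2x2 block of chroma are assumptions, as are the function names.

```python
def nv12_frame_bytes(width, height):
    """Size of one NV12 frame: a full-size Y plane plus a half-size interleaved UV plane."""
    return width * height * 3 // 2


def yuv444_to_nv12(frame, width, height):
    """Convert a 4:4:4 frame to NV12.

    frame: list of rows, each a list of (y, u, v) tuples with values 0..255
           (assumed layout for this sketch); width and height must be even.
    Returns (y_plane, uv_plane) as bytes objects.
    """
    y_plane = bytearray(width * height)
    uv_plane = bytearray(width * height // 2)

    # Luma is copied unchanged, one byte per pixel.
    for row in range(height):
        for col in range(width):
            y_plane[row * width + col] = frame[row][col][0]

    # Chroma is reduced to one (U, V) pair per 2x2 block, stored interleaved.
    for row in range(0, height, 2):
        for col in range(0, width, 2):
            block = (frame[row][col], frame[row][col + 1],
                     frame[row + 1][col], frame[row + 1][col + 1])
            u = (sum(p[1] for p in block) + 2) // 4  # rounded average
            v = (sum(p[2] for p in block) + 2) // 4
            idx = (row // 2) * width + col
            uv_plane[idx] = u
            uv_plane[idx + 1] = v

    return bytes(y_plane), bytes(uv_plane)


# Tiny 2x2 example frame with uniform color.
tiny = [[(10, 20, 30), (10, 20, 30)],
        [(10, 20, 30), (10, 20, 30)]]
y, uv = yuv444_to_nv12(tiny, 2, 2)
print(len(y), len(uv), nv12_frame_bytes(1920, 1080))
```

Every pixel of every frame has to be touched, and a 1920x1080 NV12 frame is 3,110,400 bytes, so at 60 fps this step has to move a lot of data whichever processor does it.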
Thank you and keep up the great work! :)
spyro