Question / Help Elgato hd 60 pro 100% cpu usage

alpinlol

Active Member
According to the initial log, you are struggling to run x264 1080p60 on the veryfast preset, which your CPU should be capable of encoding. Keep in mind that as soon as the webcam becomes active in any way, it starts using CPU and USB bandwidth. The capture card also takes a slight toll on the CPU, but it still should not be the reason your CPU maxes out at 1080p60 on veryfast. Bitrate also doesn't really matter here.

Did you make sure that your capture card is properly configured?

And all that hardware encoder forcing is really stupid for streaming, since 1080p60 with either NVENC or AMF would need a bitrate of about 10000 kbps to look even somewhat good in fast-paced games.
 

SumDim

Member
I still don't understand where you are coming up with this 12 thread number.

Hyperthreading results in more logical processors because that's its definition by nature. An Intel CPU with hyperthreaded support has two logical processors per core. So a 4C/8T is exactly that - 4C/8T.
This, as opposed to a 4C/4T or any CPU architecture that does one to one mapping between core and logical processor.

Where are these 12 spawned threads you speak of coming from? It's not coming from the physical CPU with hyperthreading support magically creating 4 more threads on the logical processors, is it? I don't think so. It's coming from the application making an API call to create X software threads/processes.

In the table above, I said a multiplier of 1.5 per core. So given a 4-core Intel CPU with hyperthreading enabled, that is 6 threads, not 12, being used by x264 by default.

I have always looked at hyperthreading as being "just there". Software threads and processes are created through the O/S and get assigned to physical logical processors by the O/S.

By the way, ARM and AMD processors, even the Core i5, don't have hyperthreading support. So surely x264 isn't written with that in mind. I would think it offers a means to tell it how many threads you want to create, through API and command-line options.
 

SumDim

Member
Orzanel, I really think my original assertion stands true. You just don't have enough CPU power to drive 1080@60fps. You don't have NVENC to help you either.

One thing I noticed is that you only have 4GB of memory. Bring up Task Manager and watch your memory usage while you stream. My gut says that is way too low and you need more like 8GB, if not more.

Other than that, since you tried most of the suggestions for tweaking bitrate and CPU preset (which I do when CPU usage climbs and I start to drop frames), I have no other advice than to try the following, in this order:

- Run with 16GB RAM
- Get a GTX 1070 and use NVENC
- Get a higher performing CPU (Ryzen 7 1700X+ or Xeon 2630 and better)

Other than that, that's all I have on my end. Again, I stream perfectly fine at 1080@60fps with components much stronger than your setup. I even use the same Elgato HD 60 Pro card.

Good luck to you.
 

Orzanel

New Member
The i7 4790K is more than capable of 1080p 60 fps. Like I said, if I stream at that quality without the Elgato, it uses about 30% CPU. Also, it makes no sense to use NVENC on a dedicated streaming machine, because I want it to use the CPU and give me the better quality of x264; otherwise I could just use NVENC and stream on one PC. Also, the RAM is not the issue: I had the same problem with 8 GB of RAM, but one of the RAM sticks went bad.
 

Boildown

Active Member
User was warned for this post. Its content has been edited.
I still don't understand where you are coming up with this 12 thread number.

x264 measures the number of logical cores and multiplies it by 1.5, and creates that many threads. Are you saying it doesn't do that? If so, you're wrong. It does. I've already provided sources to that effect and given you a way to verify it yourself.
 

SumDim

Member
Everything you stated above was obvious to people who are aware of what hyperthreading is.

If you were paying attention, I was assuming it was PHYSICAL CORES multiplied by 1.5 and mentioned that several times.
At no point during this entire discussion, except now, did you say it is LOGICAL CORES multiplied by 1.5.

And by the way, I don't appreciate being called dense. Just so we are on the same page, I am a retired 25+ year lead Windows software engineer. I wrote everything from device drivers to popular software applications.

Be careful how you talk to people on here. You are no better than any of the rest of us who are fetching information off another website and presenting it here.
 

SumDim

Member
TLDR: x264 creates threads based on many different factors. The number of threads it creates is not a simple calculation.

The x264 encoder algorithm creates threads in the manner below:

if( h->param.i_threads == X264_THREADS_AUTO )
{
    h->param.i_threads = x264_cpu_num_processors() * (h->param.b_sliced_threads?2:3)/2;
    /* Avoid too many threads as they don't improve performance and
     * complicate VBV. Capped at an arbitrary 2 rows per thread. */
    int max_threads = X264_MAX( 1, (h->param.i_height+15)/16 / 2 );
    h->param.i_threads = X264_MIN( h->param.i_threads, max_threads );
}

1) max_threads is the number of 16-pixel rows in the video image (the height rounded up to the next multiple of 16, then divided by 16), divided by 2, and never less than 1.
2) if there are sliced threads, the number of threads is 1 times the number of processors (the multiplier is 2/2)
3) if there are no sliced threads, the number of threads is 1.5 times the number of processors (the multiplier is 3/2)
4) the number of processors has no effect on the number of threads created when max_threads is the smaller of the two values
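
As a rough worked example of the rules above, the automatic thread count can be reproduced in a few lines of standalone C. This is only a sketch that mirrors the quoted snippet: the function name auto_thread_count and its parameters are made up for illustration and are not part of the x264 API.

```c
#include <assert.h>

/* Hypothetical helper mirroring the quoted snippet; not part of x264. */
int auto_thread_count(int num_processors, int height, int sliced_threads)
{
    assert(num_processors > 0 && height > 0);

    /* 1x the processor count with sliced threads, otherwise 1.5x.
     * Integer arithmetic, exactly as in the quoted expression. */
    int threads = num_processors * (sliced_threads ? 2 : 3) / 2;

    /* Cap: one thread per two 16-pixel rows, but never fewer than one. */
    int max_threads = (height + 15) / 16 / 2;
    if (max_threads < 1)
        max_threads = 1;

    return threads < max_threads ? threads : max_threads;
}
```

Under these assumptions, auto_thread_count(8, 1080, 0) gives 12 and auto_thread_count(4, 1080, 0) gives 6, so which of the two numbers you get depends entirely on what x264_cpu_num_processors() reports.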


Further in the encoder source there is a comment discussion of the slice to thread relationship:
/* The slice structure only allows a maximum of 2 threads for 1080i/p and 1 or 5 threads for 720p */

Even further yet, there is a considerable amount of code for lookahead threads, which I'm not even going to delve into.

Therefore, there is a helluva lot more to it than just saying it's 1.5 times the number of processors.

What follows are the gory details showing that, on a Microsoft Windows platform, the number of processors in the algorithm above is based on logical processors.

x264 makes a call to x264_cpu_num_processors, whose implementation depends on definitions at compile time. There are MANY different O/S that x264 supports.

In the case of Microsoft Windows this is:

int x264_cpu_num_processors( void )
{
(snip)
#elif SYS_WINDOWS
    return x264_pthread_num_processors_np();
(snip)
}

x264_pthread_num_processors_np calls the Win32 SDK GetNativeSystemInfo and returns the number of processors:

int x264_pthread_num_processors_np( void )
{
    SYSTEM_INFO si;
    GetNativeSystemInfo(&si);
    return si.dwNumberOfProcessors;
}

A further look into the structure reveals this:

typedef struct _SYSTEM_INFO {
(snip)
    DWORD dwNumberOfProcessors;
    DWORD dwProcessorType;
(snip)
} SYSTEM_INFO;

dwNumberOfProcessors
The number of logical processors in the current group. To retrieve this value, use the GetLogicalProcessorInformation function.
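
For anyone who wants to check the reported value on their own machine, the same query can be tried outside of x264. The sketch below is not x264 code: logical_processor_count is a made-up name, and the non-Windows branch uses sysconf only as an assumed stand-in so the example also builds on other systems.

```c
#include <assert.h>
#include <limits.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>
#endif

/* Hypothetical helper, not part of x264: logical processors seen by the OS. */
int logical_processor_count(void)
{
    long n;
#ifdef _WIN32
    SYSTEM_INFO si;
    GetNativeSystemInfo(&si);          /* same Win32 call as quoted above */
    n = (long)si.dwNumberOfProcessors;
#else
    /* Assumption: sysconf stands in for the Windows call on other systems. */
    n = sysconf(_SC_NPROCESSORS_ONLN);
#endif
    if (n < 1)
        n = 1;                         /* fall back to 1 if the query fails */
    assert(n <= INT_MAX);              /* sanity check before narrowing */
    return (int)n;
}
```

On a 4C/8T CPU with hyperthreading enabled this would normally report 8, which is the number the 1.5 multiplier is then applied to.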

According to Microsoft's logical processor definition:

"A logical processor is perceived by Windows as a processor, and each logical processor is capable of executing its own stream of instructions simultaneously, to which the OS can in turn assign simultaneous independent units of work. Windows Server enables each core to appear as a logical processor, so the server shown here, which has two quad-core physical processors, can have eight logical processors. Some processors support a technology called symmetric multithreading (which Intel calls "hyperthreading"), which enables a core to execute two independent instruction streams simultaneously. If the technology were enabled here, the result would be 16 logical processors."

If you would have spent the time to sift through the source code to arrive at this conclusion instead of going to a third party website citing the rationale, I would have given you more credibility. But I never, ever accept any response from anyone who I don't know and hasn't proven to me they know what they are talking about without hard proof.

Being a lead s/w engineer, if I had asked one of my team members to go do this legwork, I would have kicked it back to them to further investigate, not provide me with a forum link from a third party website not the same as VideoLan, the creator of x264. If source code was available, I would tell them to study the source code and prove to me what is going on under the hood.

And even if you did arrive at a correct answer without doing your own homework and drilling down yourself, by presenting it in the "are you dense" manner you did, you would be sitting in a Human Resources office explaining yourself.

THE ONLY EXPERTS IN ANYTHING ARE PEOPLE WHO DO.
 

Boildown

Active Member
If you were paying attention, I was assuming it was PHYSICAL CORES multiplied by 1.5 and mentioned that several times.
At no point during this entire discussion, except now, did you say it is LOGICAL CORES multiplied by 1.5.

And by the way, I don't appreciate being called dense.

I did say it, right here: https://obsproject.com/forum/threads/elgato-hd-60-pro-100-cpu-usage.70062/#post-300379

And Hyperthreading does result in more logical cores, and logical cores is what x264 uses to determine how many threads are created. So a 4C / 8 "logical" CPU (because of Hyperthreading) will spawn 12 threads even though (and I agree) that's "too many". I had a previous observation of this here: https://obsproject.com/forum/threads/for-video-encoding-more-cores-or-more-ghz.10722/#post-60264

I don't know how I could have been more clear. Thus the "are you being purposely dense?" meme-video. I'm sorry you were offended, it wasn't meant as a serious cutdown.

The rest of what I said still stands and is correct.

You are no better than any of the rest of us who are fetching information off another website and presenting it here.

If you would have spent the time to sift through the source code to arrive at this conclusion instead of going to a third party website citing the rationale, I would have given you more credibility.

...not provide me with a forum link from a third party website not the same as VideoLan, the creator of x264.

I really have no idea what you're going on about. The "third party" web site I linked was a discussion of how the word "threads" is commonly misused as a synonym for "logical cores" (which you also claim I never talked about, but I did). It wasn't material to anything except to clear up possible confusion with terminology earlier in the discussion.

The other link is to another discussion from years ago on these forums (not a third party) where I posted about how opening "too many" threads is bad, which only supports your own position that 12 threads would be too many. Not every time I quote someone am I disagreeing with them entirely.

12 threads may be too many, yet that is how many it opens, because the algorithm was written at a time when mere dual cores were advanced and the scaling and hyperthreading of the future couldn't be foreseen. It's a legacy that mostly still does no harm, so we mostly leave it as is... but if you want to really tweak things, using threads=x can be advantageous to a small degree.

1) max_threads is the number of 16-pixel rows in the video image (the height rounded up to the next multiple of 16, then divided by 16), divided by 2, and never less than 1.
2) if there are sliced threads, the number of threads is 1 times the number of processors (the multiplier is 2/2)
3) if there are no sliced threads, the number of threads is 1.5 times the number of processors (the multiplier is 3/2)
4) the number of processors has no effect on the number of threads created when max_threads is the smaller of the two values

Further in the encoder source there is a comment discussion of the slice to thread relationship:
/* The slice structure only allows a maximum of 2 threads for 1080i/p and 1 or 5 threads for 720p */

1) This means that 720p output can use a max of (720+15)/16/2 = 22 threads, since the integer division rounds 22.5 down, if I'm reading that right. That's sufficiently greater than 12, I think, to not be any kind of limitation, and it becomes less and less of a potential limitation as vertical resolution increases.
2) There aren't any sliced threads in OBS unless someone is using custom x264 commands. They're used to help certain players, but if anything are a hindrance to encoding speed, not a help, so aren't used in live video encoding.
3) Important part. The only question left is what makes a "processor" and the answer is "logical core".
4) max_threads was just defined in part 3), so I'd hardly characterize that as "nothing to do with".
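
The cap arithmetic in point 1 can be checked with a small standalone C sketch. It assumes only the expressions in the quoted snippet; both function names are made up for illustration and do not exist in x264.

```c
#include <assert.h>

/* Hypothetical helper: the height-based cap from the quoted snippet. */
int max_threads_for_height(int height)
{
    assert(height > 0);
    int rows = (height + 15) / 16;   /* 16-pixel rows, rounded up */
    int cap = rows / 2;              /* integer division truncates */
    return cap < 1 ? 1 : cap;
}

/* Hypothetical helper: smallest processor count whose default of 1.5x
 * exceeds the cap, i.e. the point where the cap starts to matter. */
int processors_where_cap_binds(int height)
{
    int cap = max_threads_for_height(height);
    int n = 1;
    while (n * 3 / 2 <= cap)
        n++;
    return n;
}
```

With these assumptions the 720p cap comes out at 22 (45 rows, halved and truncated) and the 1080p cap at 34, so the cap only takes effect from 16 logical processors at 720p and from 24 at 1080p.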

What's more, experimentally I've shown (via the OBS-internal link) that 1.5x is how it actually does work, and explained how you can prove it to yourself. If you can show evidence to the contrary, I'd like to see that a lot more than quotes from the x264 source code which, after a lot of reading, only support my position anyways.
 

Boildown

Active Member
Post a log with webcam downscaled to 540p from a 5 minute or longer attempt. It really is a problem for most people when they run it at 1080p.
 

Orzanel

New Member
I ran it without the webcam and it was the same issue. Anyway, the capture card died, so I'm getting it replaced. I hope it was faulty from the start and will run OK when I get a new one.
 

Orzanel

New Member
The Elgato support team also said they can't help me further and that I should look for further help from OBS support. Is my 4790K really incapable of handling this capture card, or what?
 

Videophile

Elgato
Hi there. I work at Elgato and can try to give you one more thing to try:

Using the HD60 Pro 3PS/multi device driver.

  1. Download the driver from http://e.lga.to/MultiHD60Pro
  2. Install it
  3. Reboot your PC
  4. Open the Elgato Game Capture software.
  5. If it prompts you to reboot again, do so. If it does not, reboot anyways.
  6. Now in OBS Studio, remove any video capture source that uses the "Game Capture HD" source
  7. Now add a new video capture device and from the dropdown choose "Elgato Game Capture HD60 Pro (video)(#01)"
  8. Now apply these settings:
    1. Resolution: 1920x1080
    2. FPS: 59.94 or 60
    3. Format: YV12
    4. Only change YUV color space/range if you know what they do
    5. Buffering: Disable
    6. Use custom audio device: Check
    7. From the dropdown, choose the "Elgato Game Capture HD60 Pro (Audio)(#01)" device
  9. Now hit ok.
You may also want to change the "Audio format" in the OBS Studio audio settings to 48 kHz, as that is the native format of the capture card and will make it so that no audio format conversion has to be done, potentially saving even more CPU (may not be a noticeable amount).

If after this you still get high CPU usage, then just turn down your render settings.
 

Orzanel

New Member
I think this worked. I'm getting 25% CPU usage during no-motion scenes and around 50-60% during motion. It still spikes up to 90% sometimes, but it works and there's no image freeze. Thanks!
 

Unkas

New Member
I think this worked. I'm getting 25% CPU usage during no-motion scenes and around 50-60% during motion. It still spikes up to 90% sometimes, but it works and there's no image freeze. Thanks!
Did you fix it with @LtRoyalShrimp's solution? I had the same problem with a dual-rig config with the Elgato HD60 Pro, but on the stream rig I don't have any graphics card and OBS always shows me "encoding overload", even on the veryfast preset.
 