Question / Help — it's part rant, part suggestion

OK, so... I have a low-power system. Let me clear something up here so you get an idea of my situation (not an uncommon one; no elitists please). I'm extremely happy with my build; it keeps up with systems costing 3x its price ^_^ It just sucks in one area: recording real-time gaming, because I'm building it as cheap as possible. I scored an A10-7860K and a Sapphire Nitro R7 370 4GB, which I've already overclocked heavily from 980MHz to 1,090MHz on the core and from 1400MHz to 1600MHz on the VRAM, very stable. Games run nearly maxed out with that OC at 1080p@60FPS and 1440p@30FPS with VSR (I have a 1080p monitor). I spent $150 on the card, $50 on a used APU, $60 on an A88X motherboard, $70 on 16GB of HyperX 1866MHz memory from a local shop, and got a freebie 500-watt EVGA power supply from a POS donor PC.

So here's my idea. My CPU cores aren't weak, but they aren't strong either (good for one large task like GTA 5 at a time, or a ton of small tasks). Is it possible to force encoding onto the iGPU of the A10-7860K? It's very fast on the graphics side, it supports VCE and H.264 encoding, and it isn't under load in games, which makes it ideal for capture/encoding. I tested this using the VCE branch of OBS, but it doesn't seem to want to capture via the A10: it records audio, but the video is a black screen. What got me excited, though, is that unlike audio captured along with video (or even a black screen) on the CPU, the audio in that black-screen recording did not clip, cut, or lag. Knowing my APU is very strong on the graphics side, wouldn't it make sense to use that instead of the CPU to encode or capture? My CPU absolutely sucks at capturing and encoding; it's not the worst, it does a stellar job if I want to make a grainy, blocky video. I mean, how hard could it be to utilize the iGPU?
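Just to illustrate what I'm imagining (and that it at least seems plausible), here is a rough sketch of binding a VCE encoder to a chosen GPU, assuming the AMD AMF SDK headers and the g_AMFFactory helper from its samples. The function name, the adapter index, and the property values are placeholders for illustration only; this is not how the OBS VCE branch actually does it:

```cpp
// Sketch: create a VCE/AVC encoder on a specific adapter (e.g. the APU's iGPU).
// Assumes the AMD AMF SDK plus its sample helper g_AMFFactory; error handling trimmed.
#include <d3d11.h>
#include <dxgi.h>
#include <wrl/client.h>
#include "public/common/AMFFactory.h"
#include "public/include/components/VideoEncoderVCE.h"
#pragma comment(lib, "d3d11.lib")
#pragma comment(lib, "dxgi.lib")

void CreateEncoderOnAdapter(UINT adapterIndex)   // hypothetical helper, illustration only
{
    // Create a D3D11 device on the chosen adapter. Index 1 would typically be the
    // second GPU DXGI reports (possibly the iGPU, but the ordering varies by system).
    Microsoft::WRL::ComPtr<IDXGIFactory1> dxgiFactory;
    CreateDXGIFactory1(IID_PPV_ARGS(&dxgiFactory));
    Microsoft::WRL::ComPtr<IDXGIAdapter1> adapter;
    dxgiFactory->EnumAdapters1(adapterIndex, &adapter);

    Microsoft::WRL::ComPtr<ID3D11Device> device;
    D3D11CreateDevice(adapter.Get(), D3D_DRIVER_TYPE_UNKNOWN, nullptr, 0,
                      nullptr, 0, D3D11_SDK_VERSION, &device, nullptr, nullptr);

    // Hand that device to AMF so the encoder runs on the same GPU.
    g_AMFFactory.Init();
    amf::AMFContextPtr context;
    g_AMFFactory.GetFactory()->CreateContext(&context);
    context->InitDX11(device.Get());

    amf::AMFComponentPtr encoder;
    g_AMFFactory.GetFactory()->CreateComponent(context, AMFVideoEncoderVCE_AVC, &encoder);
    encoder->SetProperty(AMF_VIDEO_ENCODER_TARGET_BITRATE, 10000000);              // 10 Mbps, arbitrary
    encoder->SetProperty(AMF_VIDEO_ENCODER_FRAMESIZE, ::AMFConstructSize(1920, 1080));
    encoder->SetProperty(AMF_VIDEO_ENCODER_FRAMERATE, ::AMFConstructRate(60, 1));
    encoder->Init(amf::AMF_SURFACE_NV12, 1920, 1080);

    // In a real app you would keep the device, context, and encoder alive and start
    // submitting captured frames as AMF surfaces here.
}
```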

I think you guys and gals working on this awesome software should definitely add this capability: second-GPU capture/encoding, including APUs.

And second of all, it's a bit of a waste to have such a nice APU and then not be able to utilize the iGPU for what it's worth.
For the A10-7860K, this is what the A10 Kaveri APU equals internally in one package: the CPU side is an Athlon X4 860K, and the GPU side is an R7 250X.

I'm sure it's possible to add this here; I somehow captured audio using the iGPU to encode, just not video. I also tried Bandicam (another user here did it and posted, and I tested it): I toggled it over to the iGPU for encoding and capture, and captured gameplay via the preview screen with encoding disabled while previewing, and that works great. Albeit it's a lot of work, and it's not ideal, since having three layers of video adds a small delay... and Bandicam isn't free. But if that can do it, I'm sure something can be figured out.

*edited for some clarification..maybe?*
 

Dilaz

Member
The GPU is very good at parallel tasks where you can do a lot of stuff at the same time. Encoding is not one of those tasks, so you pretty much have to stick with VCE.

Also, game capturing is always done on the GPU, and audio encoding is a pretty effortless task for a modern computer, so you really don't need to think about it.
 
Yeah, but my point is that my APU has a GPU built in that's not doing anything.
And the iGPU inside the APU has VCE 2.x support... which is... like... twice as good... as... 1.0 *nods*

Basically the iGPU would be better at capturing and encoding than my dedicated GPU, if it could be utilized alongside it.
 

FerretBomb

Active Member
A minor counterpoint. If you're just recording locally, follow the high-quality local recordings guide here:
https://obsproject.com/forum/resources/how-to-make-high-quality-local-recordings.16/
It should sort out your performance issues nicely. Just read the link and ignore the rest of this post if you are only recording locally.


That out of the way, there are a few points as far as livestreaming goes:
1. Hardware encoding is low-quality at best. VCE, NVENC, and to a somewhat lesser extent QSV all result in bad compression for the bitrate. Really, the only reason to use them is as a band-aid if you have a CPU too weak to otherwise stream using x264. Generally no one serious about streaming is going to use a hardware encoder, regardless of whether it's just sitting there, as the result will look like crap without using bitrates far surpassing realistic streaming rates.

2. AMD APUs are... notoriously anemic, to be polite. If your intent is to livestream, you are doing yourself a grave disservice by buying one. Generally you end up stuck with one, and want to stream on the side. Hence the VCE branch, allowing you to use a fallback method to bypass the poor overall multicore performance of the line of chips. Yes, most games do not utilize all of the cores; that's one reason a cheapo budget two-core $70 Intel Pentium G3258 can 'keep up' with a $1000 i7-5960X in many games. It's the same thing with APUs: they're fine for gaming, but when it comes to computationally intense tasks (and real-time video encoding is one of the 'heaviest' around at present) they fall over twitching.

I think it's more a question of the devs not wanting to spend their limited and very much in-demand time to enable use of a poor-quality method that's meant to be a contingency plan to begin with, more than anything else. There's no need to also use a shovel when you're trying to paint a picture, just because it's there. You tell the person to put down the shovel, and go get a paintbrush. A tool suited to the task, instead of makeshift.
 

dping

Active Member
inside the APU has VCE 2.x support..which is.. like.....twice as good ....as ...1.0 *nods..*

The only difference between VCE 1.0 and VCE 2.0 (which you have with your A10-7860K) is B-frames. Added B-frames "can" do a better job at creating smaller files from equal recordings; beyond that, they both handle recordings about the same. VCE 3.0 is supposed to encode even faster (rumored up to 4K), but the current software SDK limits it to the same as VCE 1.0 and 2.0.

TL;DR: all gens are about the same right now, sadly, with some minor improvements in between.
 
..counterpoints made back:

1: I tested the iGPU portion of the APU (it can be used separately; the VCE Media SDK allows this). The slowest it will go is MJPG with the quality setting at 10/10 in Afterburner, and it still mustered 105.9 frames a second with heavy compression. iGPU H.264 was hovering in the high 200s, with all the normal Windows 10 OS load on the dGPU and the iGPU selected for the test. I'm still looking for an accurate CPU test that does real-time encoding stress or benchmarks. I know the compute cores are sh.t, but for the most part they are fast at everyday tasks and simple data streaming.

2: The A10-7750 and up APUs are VCE 2.0 compliant; I found the data sheet on AMD's server ( http://developer.amd.com/community/blog/2014/02/19/introducing-video-coding-engine-vce/ )

3: I was able to make it work using Bandicam while previewing the stream via the GPU (no encoding, so no CPU usage spike from creating or capturing frames separately with Bandicam's CPU-bound frame plugins; just the OBS preview, which runs flawlessly at 1080p 30 and 60) and manually configuring it to run off the iGPU (Radeon R7 graphics) for the actual encoding, without a hiccup or stutter. While it totally works, it's a mess, especially with the VCE OBS branch's instability, or having to buy Bandicam or another app.

4: I considered a cheap dual-core Intel CPU; however, several games I play require at least 4 cores/4 threads, even with weak compute like the APU's. It can be 2 cores hyperthreaded or 4 physical cores, but Far Cry 4, for example, uses a lot of multithreading and runs great on my APU and dGPU, with the iGPU basically sitting idle. And I plan on getting a Corsair H60 or H80 liquid cooler for the APU if this can be sorted out.

5, and finally: right now I'm not big on livestreaming. I plan on doing it later, using a $200 Elgato HD60 Pro PCIe adapter and leveraging the iGPU for it, as I seem to be one of the few who knows how to make it run in the background for other stuff (strange to think).
So my main concern is creating usable footage locally. My drives are all fast enough: my 1TB 2.5-inch WD Black touches 112MB/sec and my SSD touches almost 600MB/sec, so that's not even close to being a bottleneck.

Bonus round: I played with Adobe Premiere and a trial of Sony Vegas Pro as well, and was able to utilize the iGPU for rendering. Get this: there was no performance hit while gaming, until temps pushed 63C on the APU or a CPU-bound task couldn't be forced down to the GPU; then it began throttling on my air cooling. Albeit rendering 10 minutes of 100Mb/sec 1080p footage with a bunch of filters (forced on at the iGPU level) as a test is a bit steep. It proves an interesting concept of leveraging it almost as a separate idle GPU for anything that can utilize it in the background.

I also use it for a second monitor when I can, for chats and browsing, because while my dGPU can handle that, it actually seems to eat away frames: watching HD video in Chrome, for example, makes the game stutter on my weaker dGPU. It's not an issue if I plug the second display into the onboard HDMI using the iGPU; it plays far nicer, with only some hiccups on CPU-bound tasks like opening files, since rendering/decoding in Chrome likely uses whichever GPU is most local. And I'm not using Eyefinity, as for whatever reason that seems to want to combine resources too much; I just extend the desktop through the Windows API at the OS level.


So it seems the APU is just blindly frowned on by traditionalists :(
(and I agree, the CPU side is garbage, but it happens to do very well, even to my own surprise!)

It almost seems like you guys can't even imagine the iGPU having any use once a dedicated card is plugged in, yet a lot of Intel encoding happens at the iGPU level in the background. By that logic, since AMD APU graphics cores are far superior to Intel's graphics cores, why can't it be handled in a similar, the same, or a better way? I have greater access to the iGPU on the OS/user side (plug a second monitor into it, or select it from a dropdown, assuming you forced the integrated GPU on in the BIOS); it's entirely usable on its own. Whereas with an Intel iGPU, once a dGPU is installed it isn't an option anymore to use the iGPU for anything up front; it's all API-based.

Also, sorry for my English; it's not my first language.
 
I'm sorry if this makes it seem like I'm whiny and all, but hear me out. So many people ignore the capabilities this hardware has if used properly (and plenty of people overstate it too), but I think AMD needs to make better examples of this and advertise it better, because it's a huge advantage to be able to run separate, unrelated (or semi-related, in this case) workloads in parallel without a performance hit. The only time I take a hit running unrelated parallel tasks is when something uses the compute cores, temps push up, or my limited memory (16GB) is nearly all utilized. Seriously, if I could write code and contribute here, I would love to create a plugin or rework the app specifically for APU + dGPU usage; I know it's possible. The downside is, I can't: I don't know how to create apps or plugins, and I have no experience writing or rewriting code. I tried to learn in school, it's just not my strongest course. I hate to just be a whiny brat about it.
 

VooDoo

Member
Generally no one serious about streaming is going to use a hardware encoder, regardless of whether it's just sitting there, as the result will look like crap without using bitrates far surpassing realistic streaming rates.

Wait, huh??? I think you meant serious streamers wouldn't use SOFTWARE encoders. CPU encoding is directly a hardware encoder, not the GPU doing a virtual re-render, basically.
 

Osiris

Active Member
Encoding on the CPU is referred to as software encoding and has the highest quality at a given bitrate.
 
Uhm, yeah, I'm trying to avoid encoding on the CPU as much as possible. Let's put it this way: I have a multi-GPU config, basically R7 graphics built into my APU and R7 graphics in the PCIe slot, both recognized independently, and the OBS logs even show a multi-GPU config. My point is that I want to encode on one GPU, say the R7 built into my APU, while letting my dedicated GPU in the PCIe slot have more room to breathe.

Now, I decided to try to maximize my local recording thanks to one of the great posters above, using the VCE branch of OBS. I manually shuffled the device index between 0, 1, and 2. My results: index 0 gives me in-game lag, which I assume is occurring on the dedicated card where the game (GTA 5) is running. Pushing it to device index 1, there's no on-screen lag and the video recording looks 90% smooth (a great improvement, but still off). Setting it to device index 2 gave me the same results as index 0. From my log (I kind of know how to read it), device 0 = dedicated GPU, device 1 is the secondary or other GPU, and any further number would follow the same logic if present.
That being said, my secondary GPU (at least the way other apps see it) is the iGPU (the R7 built into my APU). Reading further into the log, it shows that device 0 (the dedicated GPU) is still the primary one in use for recording. But I noticed that while in-game footage is smoother and output footage is smoother as well using device index 1, the latency increased. My only guess is that the application (OBS VCE) is taking the long way around and looping back or something of that nature; however, it seems to recover and render more frames with fewer lost frames (1.28% vs 5.8%) using device index 1 vs 0, even with this added latency bug, and it also seems to run faster. Now, I could be totally wrong, but logically it makes sense that primary = 0, as it's closest, and 1 = secondary, as next in line.
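For what it's worth, here is a tiny sketch (plain DXGI, nothing OBS-specific, purely illustrative) of how a "device index" normally maps onto whatever adapters Windows enumerates, with index 0 usually being the GPU the primary display is on. And if the encoder really does land on adapter 1 while the game renders on adapter 0, my guess about the extra latency would fit: frames would have to take a trip through system memory to get from one adapter to the other.

```cpp
// Sketch: list the adapters DXGI reports, in the same order a "device index"
// setting usually follows (index 0 = adapter the primary display is on).
// Plain Win32/DXGI; not the OBS VCE branch's actual code.
#include <cstdio>
#include <dxgi.h>
#include <wrl/client.h>
#pragma comment(lib, "dxgi.lib")

int main()
{
    Microsoft::WRL::ComPtr<IDXGIFactory1> factory;
    if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory))))
        return 1;

    Microsoft::WRL::ComPtr<IDXGIAdapter1> adapter;
    for (UINT i = 0; factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i)
    {
        DXGI_ADAPTER_DESC1 desc = {};
        adapter->GetDesc1(&desc);
        // On a Kaveri APU + R7 370 box this should list both the dGPU and the
        // integrated "AMD Radeon R7 Graphics" as separate adapters.
        wprintf(L"device index %u: %s\n", i, desc.Description);
    }
    return 0;
}
```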

but semi good news!
 
Further testing: using window capture on the integrated R7 graphics, the result is that it captures frame for frame exactly. Now we're onto something. The downside: window capture appears to have high CPU overhead even with VCE in use. The good news is it captured frame for frame, so the moment a frame dropped in game from the lag of the CPU overhead, the video had an equal drop. I was also able to use higher quality, but that comes with even more CPU overhead.

So, next question: is there a better plugin for this? Something to capture a game/window from a second/separate GPU, integrated or not, without the extra CPU overhead? Something from AMD, or just openly compatible? If so, how can I go about adding or using it? I'm getting closer to what I want. Once I get a constant frame rate in game and in the recording (locally for both), I plan on a water cooler and bumping my OC on the compute side from 4.1 to 4.45GHz (tested it on an H60 hydro; it was happy on liquid at that clock). I don't mind an extra 5-10%, but I'm getting an extra 20-35% CPU using window capture, in compatibility mode or not.

GTA at 1080p 30fps, on the highest settings I can get with the dedicated GPU overclocked to hell, peaks at 60% CPU, and adding 35% on top of that means 95% load on a weak compute core structure. *edit* At 1080p 60FPS the CPU also peaks at 60%, and at 1440p (VSR) it peaks at 77%. I tested driving and flying really fast past buildings and grassy areas, where it's offloading, reloading, and flushing data in and out (at least in theory); just standing in the road with not much happening it's in the 30% range at either 1080p or 1440p, and it records that just fine, obviously. Add chaos from online (not a modder server, just a crazy one) but still on foot, and I touched the 40% CPU range; again, it captured fine. But 90% of my GTA playtime is crazy sh.t at speed with explosions and physics, and it works fine until I add that 35% CPU overhead from window capture, which has given me my best frame-by-frame capture, using the iGPU (Radeon R7 graphics) selected in the capture drop-down... so the concept lives on! :D
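My guess at why window capture costs so much CPU: window capture is typically a GDI-style copy where the CPU pulls every frame back through system memory, instead of the frame staying on the GPU the way game capture's shared-texture hook does. A minimal sketch of that kind of capture path (plain Win32 GDI, purely illustrative, not OBS's actual code):

```cpp
// Minimal GDI capture of one frame of a window; illustrative only.
// Every call here runs on the CPU and the pixels end up in system memory,
// which is roughly why window capture adds CPU overhead on every frame.
#include <vector>
#include <windows.h>

bool CaptureWindowOnce(HWND hwnd, std::vector<BYTE>& pixelsOut, int& width, int& height)
{
    RECT rc{};
    if (!GetClientRect(hwnd, &rc))
        return false;
    width  = rc.right - rc.left;
    height = rc.bottom - rc.top;

    HDC windowDC = GetDC(hwnd);                   // source: the game window
    HDC memDC    = CreateCompatibleDC(windowDC);  // destination: an offscreen bitmap
    HBITMAP bmp  = CreateCompatibleBitmap(windowDC, width, height);
    HGDIOBJ old  = SelectObject(memDC, bmp);

    // CPU-driven copy of the window contents into the offscreen bitmap.
    BOOL ok = BitBlt(memDC, 0, 0, width, height, windowDC, 0, 0, SRCCOPY);

    // Pull the pixels into system memory (another CPU copy) for the encoder.
    BITMAPINFO bmi{};
    bmi.bmiHeader.biSize        = sizeof(BITMAPINFOHEADER);
    bmi.bmiHeader.biWidth       = width;
    bmi.bmiHeader.biHeight      = -height;        // negative height = top-down rows
    bmi.bmiHeader.biPlanes      = 1;
    bmi.bmiHeader.biBitCount    = 32;
    bmi.bmiHeader.biCompression = BI_RGB;
    pixelsOut.resize(static_cast<size_t>(width) * height * 4);
    ok = ok && GetDIBits(memDC, bmp, 0, height, pixelsOut.data(), &bmi, DIB_RGB_COLORS);

    SelectObject(memDC, old);
    DeleteObject(bmp);
    DeleteDC(memDC);
    ReleaseDC(hwnd, windowDC);
    return ok != 0;
}
```

Doing something like that 30 to 60 times a second at 1080p, plus a color conversion before the encoder sees the frame, would line up with the extra 20-35% CPU I'm measuring.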

I also changed the resolution in game during a recording, and OBS crashed my DX11 in an amazing array of screen flashes and dots everywhere, plus a total lockup for a minute until it recovered itself. But beyond that it worked... good.
 
There's just what there is in the VCE fork thread. No one else has written any other VCE encoding options.

Well, that sucks. I'll be checking here regularly to see if there's any news; maybe it will help out my friend's situation, since he's in a similar place. But I guess that means I'm definitely going Elgato. My internet is great for streaming, 50Mbps up and 200Mbps down, so a low-compression feed is OK with me once I get to the point of Twitch and YouTube gaming. I'm paying $65/month for it. Now on to buying the $200 Elgato and enjoying uploading footage without bogging my system down...
 

Boildown

Active Member
Do yourself a favor and get a PCIe internal card instead of a USB based device. Also, capture cards are only good for two-PC or console to PC solutions. They have no advantages for single PC solutions, so if you plan to use a capture card on the same computer you're gaming on, just scrap those plans now.
 
Do yourself a favor and get a PCIe internal card instead of a USB based device.

Yes, the $200 Elgato HD60 Pro is a PCIe card with passive HDMI 1.4 pass-through, which should at least allow higher than 1080p 60fps to pass through, and I can tinker with settings for downsampling if it does. I imagine it should allow up to 4K at 30fps pass-through; I have a few HDMI 1.4 pass-through devices, and while they don't do anything extra at 4K, they let me run the desktop just fine at 4K 30Hz. OK, off I go to spend almost as much as I spent building this desktop on a good capture card. I also looked at the Blackmagic 4K PCIe card; would you recommend that over the Elgato PCIe capture card? The reason I'm considering the Elgato over it is the software updates and gaming features, but if Blackmagic's PCIe solution is more user-friendly and better quality, sell me on it. It's $10 cheaper.
 

Boildown

Active Member
Elgato has tons of people making threads asking for help with it, so I don't really recommend them. But Blackmagic seems to be even worse. At least Elgato eventually works, albeit with frequent audio/video desync issues; maybe the PCIe device avoids those problems, I'm not sure. I don't own either, and I'm not interested in "selling you" on either of them. All I know is that there are frequent posts on these forums from people asking why their Elgato has problems and why their Blackmagic doesn't work at all.
 
Yeah, I know, but welcome to the internet, I guess. There are plenty of people who can't turn on and use anything that requires forethought; if it doesn't just work, a lot of people freak out and blame the company. I'm not totally stupid with tech, but I'm not the most brilliant either, and that's why I chose the $200 PCIe version over the USB 3.0 HD60, which I could pick up used for $100. Same reason: I heard the complaints, but no offense, after forcing myself to figure out OBS, I feel well equipped to deal with the Elgato.

It also leaves me in a rather awkward position at $200... I could just get a GTX 900-series card and use ShadowPlay, but I like that AMD cards age so much better than Nvidia cards, especially seeing that even CrossFired R7 370s are outpacing some higher-end GTX 900s for the same parts cost (around $300) in the current few DX12 tests.

Also, my Sapphire 4GB (256-bit bus) Nitro R7 370, running at 1090MHz core and 1600MHz VRAM, plays games as well as or better than a mid-level GTX 960/970. Very happy with it... but it sucks that there's no good recording solution and VCE appears to be broken itself. Even Raptr fails hard: it works for me sometimes, then stutters hard for seemingly no reason after a few minutes, and after an update or reinstall, forget it, it's unusable for recording. The Raptr team blames my SSD or HDD, or says my card isn't powerful enough and that's why it isn't working; meanwhile, in the open support tickets and forums, people with a Fury X and an Intel Skylake i7 are having exactly the same issues. Omg, so annoying. *giving up on it until AMD's VCE solutions are in order*
...I don't think the issue is related directly to the cards or CPU. I think it's because the VCE capture path isn't purely an AMD thing and is more of a Microsoft thing, and Microsoft is better at breaking things than fixing them...

/rant
 
OK, so a funny thing happened, I think. I swapped my APU for a slightly lower-end A8-7600 (non-K), just to hopefully save heat and power since I ordered the Elgato HD60 Pro PCIe model. But I hate leaving puzzles unfinished :( it bugs me, like an obsession.

..so I tested OBS VCE again. Not sure how I did it, but with the same exact settings, untouched in OBS, using window capture with AMD VCE checked and my previous local recording settings, it recorded nearly half an hour of Black Ops 3 in high quality, nearly 15GB, with no problem. Then I got a little feisty with the settings, wanting 1:1 picture quality, meaning as good as it looks in game without any loss, figuring it was possible since my integrated R7 was using only 30% of its total power but nearly all of its 1.1GB of reserved memory. So I entered the BIOS, forced 2GB to the iGPU, and rebooted. No problem, except now OBS freaks out after recording exactly 60 frames (1 second) and reports a D3D11 texture error: out of memory! It makes literally no sense, as my dedicated GPU has 4GB with just over 2GB in use in BO3, and I hadn't touched a single OBS setting yet, just doing a second control run to make sure it works. But every time, preview window or recording, the same thing happens again and again. Being cunning, I put the A10 back in with the same memory forced on board: same result. Set to auto: choppy. Set to 1GB: choppy. Put the A8 back in on auto: very clean. 1GB: very clean. 2GB: nothing; it freaks out after it captures 60 frames (1 second). Yet in the log file OBS shows I have 2GB available (2.1 technically, by the counted available bits), and I ran a GPU memory test, selected the integrated GPU, and it passed; all 2GB are available. So... huh? Why is OBS/VCE not playing nice here? I tried changing the DX mode to 11, and to host, and got the same memory errors; DX9 doesn't even work on the iGPU (somehow), so I know I selected the correct device for encoding, because it errored with "no native D3D9" and another DX9 error.
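In case anyone wants to reproduce this: a quick way to sanity-check what Windows actually thinks the iGPU has after a BIOS change is to dump the DXGI adapter descriptions. This is just a generic diagnostic sketch, nothing OBS-specific, and even if the totals look right, the out-of-memory error could still be the app asking for one texture bigger than what's left rather than the total being wrong:

```cpp
// Dump each adapter's dedicated VRAM / dedicated system memory / shared system
// memory as DXGI reports them. Useful for checking whether a BIOS "UMA frame
// buffer" change (e.g. 1GB -> 2GB for the iGPU) is what D3D11 actually sees.
#include <cstdio>
#include <dxgi.h>
#include <wrl/client.h>
#pragma comment(lib, "dxgi.lib")

int main()
{
    Microsoft::WRL::ComPtr<IDXGIFactory1> factory;
    if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory))))
        return 1;

    Microsoft::WRL::ComPtr<IDXGIAdapter1> adapter;
    for (UINT i = 0; factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i)
    {
        DXGI_ADAPTER_DESC1 desc = {};
        adapter->GetDesc1(&desc);
        wprintf(L"%u: %s\n  dedicated video:  %llu MB\n  dedicated system: %llu MB\n  shared system:    %llu MB\n",
                i, desc.Description,
                static_cast<unsigned long long>(desc.DedicatedVideoMemory)  >> 20,
                static_cast<unsigned long long>(desc.DedicatedSystemMemory) >> 20,
                static_cast<unsigned long long>(desc.SharedSystemMemory)    >> 20);
    }
    return 0;
}
```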

So, long story short, even importing my saved configuration, it refuses to work that smoothly again. My temps are fine, my memory is all good, I still have 16GB, with 1.1GB or 2.1GB reserved for the iGPU and 4GB on the main graphics card, and my CPU isn't stressing out any extra in BO3 either.

BO3: there's a sizing glitch, easily corrected by using windowed mode in game, setting it to 1080p/1080p in game (no scaling, 100%), then setting the base resolution of the window in OBS. The first recording was amazing, top notch; the second recording, before I messed up the settings, was rough and choppy again, though better than my other games. The CPU doesn't seem to have much of a sh.t fit, but I now get the memory errors in OBS with it. CPU with no recording: 15-35%; dedicated GPU + CPU recording: 20-35%; iGPU + CPU recording: 20-35%.

I tested it in GTA 5, and it seems that maybe it's the GTA update/files not playing terribly nice with the software during recording. It's the only game with hugely erratic CPU spikes, even though recordings in all games look pretty rough and I've only gotten lucky a couple of times with clean ones. GTA 5 makes a bit of a mess: it vomits on either recording method, and while using the iGPU for window capture is a noticeable improvement, the video files that come out are as bad as the frame drops in game, regardless of quality. No recording: CPU 40-50%; while recording with OBS, the CPU bounces around like the Easter Bunny on meth but never drops below the 70% range, regardless of which GPU is used.

Dying Light has zero cares in the world, it seems, adding only a 2% CPU spike on the A10 and a 10% spike on the A8 during recording, and the numbers are low enough it doesn't matter. But the recording at any native encoding settings is rough as always. No in-game chop with either the dedicated GPU or the iGPU capturing the window; video chop and stutter are the same as everything else. Non-recording CPU usage: 20-30%; recording with iGPU or dedicated GPU: 20-30% CPU.

FO4: again, a small spike on the CPU during recording, and the video is rough; however, it's smooth as always in game, unless I use the dedicated GPU for window capture, in which case my CPU has a sh.t fit and maxes out on the A8 (about the same on the A10), with noticeable screen chop in game and in the video. So using the iGPU for window capture helps massively, but it's buggy and a slower API. CPU in game, no recording: 30-40%; iGPU + CPU recording: 35-55%; dedicated GPU + CPU: 80-90%.


So it seems there may be more to why it has a sh.t fit than meets the eye. Also of note: there's no noticeable percentage increase in the dedicated GPU's workload with recording added, so it may be a hidden task or layered in somewhere else; however, there is a percentage increase for the iGPU, but only to 30% from 0% when recording 1080p 60FPS, while the A10 pushes a 25% increase from 0% (a 5% difference in encoding workload?). My memory is set up with all four slots filled with 4GB each of Kingston HyperX 1866MHz CL10, with a tested bandwidth per channel of 14.9GB/sec (allegedly; 1866 MT/s times 8 bytes is roughly 14.9GB/sec per channel, so about 29.9GB/sec in dual channel). So even if my board is dual channel and can only hit about 29GB/sec max bandwidth in one direction, that should be enough speed. However, I can see my A10 gets 58GB/sec and my A8 gets 29GB/sec. This makes no sense either way, as the A8 is recording better to my eye than the A10, with less power, less bandwidth, and no easy way to overclock either side of the APU (CPU or GPU), while I tested the heck out of the A10 overclocked on both CPU and iGPU. Not sure what to make of it, other than some broken code somewhere hindering this, or just bad/slow APIs wrecking the potential either way. The plot thickens.


*edited correction to my numbers*
*edit 3: I should really proofread before submitting*
 