
All Versions NVIDIA NvEnc Guide

rockbottom

Active Member
Clearly missed it. It's automatic; you just have to adjust your preset accordingly so there's no lag. This applies whether there are 1, 2 or 3 NVENC encoders on the die.

NVENC Performance




With every generation of NVIDIA GPUs (Maxwell 1st/2nd gen, Pascal, Volta, Turing, Ampere and Ada), NVENC performance has increased steadily. Table 2 provides indicative NVENC performance on Pascal, Turing, and Ada GPUs for different presets and rate control modes (these two factors play a major role in determining the performance and quality). Note that the performance numbers in Table 2 are measured on GeForce hardware, with the assumptions listed under the table. Performance varies across GPU classes (e.g. Quadro, Tesla) and scales (almost) linearly with the clock speed of each piece of hardware.



While first-generation Maxwell GPUs had one NVENC engine per chip, certain variants of the second-generation Maxwell, Pascal, Volta and Ada GPUs have two or three NVENC engines per chip. This increases the aggregate encoder performance of the GPU. The NVIDIA driver takes care of load balancing among the NVENC engines on the chip, so applications don't require any special code to take advantage of multiple encoders and automatically benefit from the higher encoder capacity of higher-end GPU hardware. The encode performance listed in Table 2 is given per NVENC engine. Thus, if the GPU has 2 NVENCs (e.g. GP104, AD104), multiply the corresponding number in Table 2 by the number of NVENCs per chip to get the aggregate maximum performance (applicable only when running multiple simultaneous encode sessions). Note that unless Split Frame Encoding is enabled, the performance of a single encoding session cannot exceed the performance per NVENC, regardless of the number of NVENCs present on the GPU. Multi NVENC Split Frame Encoding is a feature introduced in SDK 12.0 on Ada GPUs for HEVC and AV1. Refer to the NVENC Video Encoder API Programming Guide for more details on this feature.
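To make the multiplication rule concrete, here is a rough Python sketch. The function name and its simplified model are my own assumptions, not anything from the SDK: it treats each session as tied to one engine unless Split Frame Encoding is on, and it ignores any overhead, so the result is only an optimistic upper bound.

```python
def max_aggregate_fps(per_engine_fps: float, num_engines: int,
                      num_sessions: int, split_frame: bool = False) -> float:
    """Optimistic upper bound on total encode fps for one GPU.

    Hypothetical helper (not an NVIDIA API). Assumes:
      - per_engine_fps is a per-NVENC figure such as those in Table 2,
      - without Split Frame Encoding, one session uses at most one engine,
      - with Split Frame Encoding, a session may use every engine,
      - no scheduling or splitting overhead (real numbers will be lower).
    """
    if num_engines < 1 or num_sessions < 1:
        raise ValueError("need at least one engine and one session")
    if split_frame:
        busy_engines = num_engines
    else:
        # Extra engines sit idle when there are fewer sessions than engines.
        busy_engines = min(num_engines, num_sessions)
    return per_engine_fps * busy_engines


# Ada, H.264, p1, CBR, LL is listed at 910 fps per engine.
print(max_aggregate_fps(910, num_engines=2, num_sessions=1))  # single session
print(max_aggregate_fps(910, num_engines=2, num_sessions=4))  # several sessions
```

The single-session call stays at the per-engine figure, while the multi-session call doubles it, which is the point of the "applicable only when running multiple simultaneous encode sessions" caveat.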



NVENC hardware natively supports multiple hardware encoding contexts with negligible context-switching penalty. As a result, subject to the hardware performance limit and available memory, an application can encode multiple videos simultaneously. The NVENCODE API exposes several presets, rate control modes and other parameters for programming the hardware. Combining these parameters enables video encoding at varying quality and performance levels. In general, one can trade performance for quality and vice versa.
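As a back-of-the-envelope illustration of the "hardware performance limit", the sketch below estimates how many real-time streams fit into a given encode rate. The helper name is hypothetical, and the model is an assumption: it looks only at throughput for streams matching the table's 1080p test conditions and ignores memory and any other limits.

```python
import math


def realtime_stream_capacity(encoder_fps: float, stream_fps: float) -> int:
    """Estimate how many real-time streams an encode rate can sustain.

    Hypothetical helper: divides the available encode fps by the frame
    rate of each stream and rounds down. Memory and other limits are
    not modelled.
    """
    if encoder_fps < 0 or stream_fps <= 0:
        raise ValueError("encoder_fps must be >= 0 and stream_fps > 0")
    return math.floor(encoder_fps / stream_fps)


# Ada, H.264, p5, CBR, LL is listed at 291 fps per engine.
print(realtime_stream_capacity(291, 60))  # 1080p60 streams on one engine
# Ada, HEVC, p7, VBR, HQ is listed at 181 fps per engine.
print(realtime_stream_capacity(181, 60))
```

This also shows the preset trade-off in practice: a slower, higher-quality preset leaves room for fewer simultaneous streams.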

Table 2. Indicative NVENC performance in fps, per NVENC engine

Preset  RC    Tuning  | H.264                                   | HEVC                                    | AV1
        mode  info    | Pascal  Turing  Ampere  Ada   Blackwell | Pascal  Turing  Ampere  Ada   Blackwell | Ada   Blackwell
---------------------------------------------------------------------------------------------------------------------------
p1      CBR   LL      | 667     855     868     910   977       | 539     932     943     1055  1134      | 1090  1076
p1      VBR   HQ      | 692     833     846     885   948       | 506     920     939     1037  1119      | 741   957
p3      CBR   LL      | 649     600     613     652   718       | 442     463     467     494   529       | 774   798
p3      VBR   HQ      | 398     602     617     647   708       | 443     552     557     706   947       | 549   678
p5      CBR   LL      | 363     271     273     291   323       | 370     305     307     343   506       | 512   624
p5      VBR   HQ      | 327     264     266     283   317       | 371     334     335     411   521       | 440   552
p7      CBR   LL      | 321     229     231     247   264       | 345     306     308     343   464       | 356   395
p7      VBR   HQ      | 250     207     213     211   227       | 260     171     171     181   181       | 323   401


  • Resolution/Input Format/Bit depth: 1920 × 1080/YUV 4:2:0/8-bit
  • Above measurements are made using the following GPUs: GTX 1060 for Pascal, RTX 8000 for Turing, RTX 3090 for Ampere, and RTX 4090 for Ada. All measurements are done at the highest video clocks as reported by nvidia-smi (i.e. 1708 MHz, 1950 MHz, 1950 MHz, 2415 MHz for GTX 1060, RTX 8000, RTX 3090, and RTX 4090 respectively). The performance should scale according to the video clocks as reported by nvidia-smi for other GPUs of every individual family. Information on nvidia-smi can be found at https://developer.nvidia.com/nvidia-system-management-interface.
  • H.264 and HEVC encoding fps for Volta GPU can be obtained by multiplying the Pascal fps in the above table by ratio of the clocks, as reported by nvidia-smi.
  • Software: Windows 11, Video Codec SDK v13.0
  • CBR: constant bitrate rate control mode; VBR: variable bitrate rate control mode; LL: low latency tuning info; HQ: high quality tuning info
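The clock-scaling notes above boil down to a simple ratio. Here is a minimal Python sketch, assuming strictly linear scaling (the text itself only says "almost" linear) and a hypothetical helper name. The target clock in the example is a made-up value for illustration; in practice you would read the video clock from nvidia-smi.

```python
def scale_fps_by_clock(table_fps: float, reference_clock_mhz: float,
                       target_clock_mhz: float) -> float:
    """Estimate fps on another GPU of the same family by video clock ratio.

    Hypothetical helper, assuming exactly linear scaling with the video
    clock. table_fps and reference_clock_mhz come from the table and its
    footnotes; target_clock_mhz is the video clock of the GPU of interest.
    """
    if reference_clock_mhz <= 0 or target_clock_mhz <= 0:
        raise ValueError("clocks must be positive")
    return table_fps * target_clock_mhz / reference_clock_mhz


# Pascal, H.264, p1, CBR, LL: 667 fps measured on a GTX 1060 at 1708 MHz.
# 1500 MHz is an illustrative target clock, not a measured value.
estimate = scale_fps_by_clock(667, reference_clock_mhz=1708, target_clock_mhz=1500)
print(f"{estimate:.0f} fps")
```

The same ratio is what the Volta note describes: take the Pascal figure and multiply it by the Volta video clock divided by 1708 MHz.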
 