Question / Help Sharing Experience of x264 (seeking perfection suggestions)

koala

Active Member
MSE (mean squared error) is a term from statistics. For the layman, it is a score for the difference between two values, one original and one estimated: the squared difference between them. The higher it is, the more the estimated value differs from the original. The lower it is, the more similar the estimated value is to the original. So you try to get an MSE as low as possible.

For images (or one frame of a video), the MSE of the image is the average of the squared differences of all pixels when you compare an original image with its encoded counterpart.

For videos, it is the average of the MSE of all the frames if you compare the original raw footage with its encoded version.
Actually, ffmpeg doesn't simply average: it computes a PSNR from the MSE, which puts more weight on bigger MSE values, so single badly encoded frames weigh heavily and are not drowned out by a huge number of well-encoded frames.
The higher the PSNR (that is, the lower the MSE), the better the quality of the encoded video. It can be used directly as a sorting criterion if you compare a series of videos that were encoded with differing parameters. The best setting is the one that produced the video with the highest PSNR.
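As a minimal sketch of the arithmetic described above (not ffmpeg's actual code), the following computes per-frame MSE and converts it to PSNR. The pixel values are made up, and the 8-bit peak of 255 is an assumption. PSNR is defined so that a lower MSE gives a higher PSNR.

```python
import math


def mse(original, encoded):
    """Mean squared error between two equally sized lists of pixel values."""
    if len(original) != len(encoded):
        raise ValueError("frames must have the same number of pixels")
    return sum((o - e) ** 2 for o, e in zip(original, encoded)) / len(original)


def psnr(mse_value, peak=255):
    """PSNR in dB for a given MSE; identical frames give infinity."""
    if mse_value == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse_value)


# Toy "frames": flat lists of 8-bit pixel values (made-up numbers).
original = [[100, 120, 140, 160], [100, 120, 140, 160], [100, 120, 140, 160]]
encoded = [[101, 119, 140, 161], [100, 121, 139, 160], [110, 110, 150, 150]]

frame_mse = [mse(o, e) for o, e in zip(original, encoded)]

# Two ways to get one number for the whole clip.
mean_of_psnr = sum(psnr(m) for m in frame_mse) / len(frame_mse)
psnr_of_mean_mse = psnr(sum(frame_mse) / len(frame_mse))

print(frame_mse)
print(round(mean_of_psnr, 2), round(psnr_of_mean_mse, 2))
```

The third frame is the badly encoded one. Taking the PSNR of the averaged MSE lets that frame pull the score down much further than averaging the per-frame PSNR values does, which is the "not drowned out" effect mentioned above.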

If you compare visually, you're doing too much work, and with questionable results, because your eyes can be deceived. Math, on the other hand, and PSNR is math, cannot be deceived. The best video is simply the encoded video with the highest PSNR, so the best settings are the settings that were used for producing the video with the highest PSNR. And you can do this automatically. You can write scripts to vary a bunch of parameters and create 1000 videos with them, compute the PSNR, write a list, sort by PSNR, and grab the settings from the video on top.
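A rough sketch of such a script, covering only the bookkeeping: generating the parameter combinations, pulling an average PSNR out of a log line, and sorting. The encoding and measuring steps are left out. The sample log line is illustrative and its format is an assumption modelled on the summary line of ffmpeg's `psnr` filter. The function names are hypothetical.

```python
import itertools
import re


def option_grid(choices):
    """Yield one x264 option string per combination of the given values."""
    names = sorted(choices)
    for values in itertools.product(*(choices[n] for n in names)):
        yield " ".join(f"{n}={v}" for n, v in zip(names, values))


PSNR_AVERAGE = re.compile(r"average:([0-9.]+|inf)")


def parse_average_psnr(log_line):
    """Pull the 'average:' value out of a PSNR summary line (assumed format)."""
    match = PSNR_AVERAGE.search(log_line)
    if match is None:
        raise ValueError("no average PSNR found")
    return float(match.group(1))


def rank(results):
    """Sort (options, psnr) pairs so the highest PSNR comes first."""
    return sorted(results, key=lambda item: item[1], reverse=True)


# Illustrative line only; real output would come from measuring each encode.
SAMPLE_LINE = "PSNR y:40.12 u:44.80 v:45.31 average:41.37 min:38.02 max:47.90"

for options in option_grid({"subme": [4, 5], "ref": [1, 2]}):
    print(options)
print(parse_average_psnr(SAMPLE_LINE))
```

In a real run, each option string would be used for one encode, the measured PSNR appended to a list together with that string, and `rank()` called once at the end.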
 

BigYuckFou

Member
I just wanted to come back and share the final settings I now use.

4k->1080p->GPU-downscale-lanczos->CPU-render
profile->high
preset->placebo
keyframe->1
70%cpu
render time->2.8/3.0

x264 options

threads=26 b_adapt=1 direct-pred=spatial me=tesa trellis=2 bframes=30 ref=1 subme=4 analyse=all rc_lookahead=240 me_range=22 aq-mode=2 partitions=all
 

energizerfellow

New Member
@BigYuckFou Given what x264's placebo preset actually is, here are a few things to consider:

- GOP length is fixed by the 2-second keyframe interval most streaming providers specify.
- Doing more than ~2 seconds of lookahead seems pointless, and that is roughly x264's default anyway if you assume x264's defaults are built around 24/30 fps film/video material.
- OBS talks directly to x264 and doesn't use the FFmpeg CLI. That means it's rc-lookahead, for instance, not rc_lookahead.
- The aq-mode=2 option doesn't seem to improve much. It's aq-mode=3 that re-balances the bit distribution in H.264 more towards how displays and human vision work.
- High reference frame counts seem to break some players. Apple's iOS seems broken on >6 ref, for instance.
- In theory you could have up to 34 threads on 1080p due to some math, but more threads make things look worse. The automatic threads setting is 1.5 times the number of logical CPU threads, but setting threads to 1:1 with logical CPU threads is likely best (and is the default if you enable sliced threads).

With that in mind, try this on placebo for 1080p60:
aq-mode=3 direct-pred=spatial rc-lookahead=120 ref=1 threads=[lowest you can get away with, up to 1:1 with logical CPU threads]
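The lookahead and thread numbers above are simple arithmetic. A small sketch, with hypothetical helper names and a 16-thread CPU chosen purely as an example:

```python
def lookahead_frames(fps, seconds=2):
    """Number of frames that cover the given lookahead duration."""
    return round(fps * seconds)


def thread_candidates(logical_cpus):
    """Return (automatic, one_to_one) thread counts as described above.

    automatic  : 1.5 times the logical CPU threads
    one_to_one : one encoder thread per logical CPU thread
    """
    return logical_cpus * 3 // 2, logical_cpus


print(lookahead_frames(60))    # 2 seconds at 60 fps
print(lookahead_frames(30))    # 2 seconds at 30 fps
print(thread_candidates(16))   # example CPU with 16 logical threads
```

This is why rc-lookahead=120 corresponds to about two seconds at 1080p60, while the same duration at 30 fps needs only 60 frames.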
 

BigYuckFou

Member
I freaking love you! thank you for the help!

Can you possibly elaborate on why the OBS log file would not show an error when using rc_lookahead? It seems to accept it.
 

BigYuckFou

Member
just wanted to report back on the final settings that I use after your suggestions.

4k->1080p->GPU-downscale-lanczos->CPU-render
profile->high
preset->placebo
keyframe->1
55%cpu (went down 15%)
render time->2.8/3.0
threads=21 b_adapt=1 direct-pred=temporal me=tesa trellis=2 bframes=16 ref=1 subme=5 analyse=all rc-lookahead=120 me_range=16 aq-mode=3 partitions=all



**The changes were**
reduced threads from 26 to 21 (great improvement)
aq-mode=3 instead of 2
reduced bframes from 30 to 16 (high B-frame counts can cause playback issues)
direct-pred=temporal instead of spatial (better motion searching)
subme=5 instead of 4 to do multi-QPel + bidirectional ME --- 5 is damn good
rc-lookahead reduced from 240 -> 120
me_range dropped from 22 -> 16 (so I could use temporal instead of spatial direct-pred)
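For anyone who wants to check a change list like this mechanically, here is a small sketch that compares the two option strings posted in this thread. Folding underscores to hyphens is a choice made here only so that rc_lookahead and rc-lookahead compare as the same option. The function names are hypothetical.

```python
def parse_options(option_string):
    """Turn 'a=1 b_c=2' into {'a': '1', 'b-c': '2'}."""
    options = {}
    for token in option_string.split():
        name, _, value = token.partition("=")
        options[name.replace("_", "-")] = value
    return options


def diff_options(old, new):
    """Return {name: (old_value, new_value)} for every option that differs."""
    old_opts, new_opts = parse_options(old), parse_options(new)
    names = list(old_opts) + [n for n in new_opts if n not in old_opts]
    return {
        name: (old_opts.get(name), new_opts.get(name))
        for name in names
        if old_opts.get(name) != new_opts.get(name)
    }


OLD = ("threads=26 b_adapt=1 direct-pred=spatial me=tesa trellis=2 bframes=30 "
       "ref=1 subme=4 analyse=all rc_lookahead=240 me_range=22 aq-mode=2 "
       "partitions=all")
NEW = ("threads=21 b_adapt=1 direct-pred=temporal me=tesa trellis=2 bframes=16 "
       "ref=1 subme=5 analyse=all rc-lookahead=120 me_range=16 aq-mode=3 "
       "partitions=all")

for name, (before, after) in diff_options(OLD, NEW).items():
    print(f"{name}: {before} -> {after}")
```

Run against the two lists above, it reports seven changed options, including me-range going from 22 to 16.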
 

BigYuckFou

Member
I did this for someone else in another thread and will move it here for posterity.
These are just my opinions and what I have learned over time. I am a noob, remember that.

----

It has taken me 9 months of testing to get to those settings, and they are specific to each PC/use case. I will be happy to assist if you find yourself in need.

I can give you a few pointers. The goal I had was squeezing 1080p@60fps fast motion content into 6000kbps so it looked as good as possible. I can tell you that no matter how much CPU you have, at some point you play a balancing game between encoder lag/dropped render frames and quality.
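To put that goal in numbers, here is a quick back-of-the-envelope calculation. It assumes 6000kbps means 6,000,000 bits per second and, for the comparison with raw video, 8-bit 4:2:0 material at 12 bits per pixel. Both are assumptions, not something stated in this thread.

```python
def bits_per_pixel(bitrate_kbps, width, height, fps):
    """Average encoded bits available per pixel per frame."""
    return bitrate_kbps * 1000 / (width * height * fps)


budget = bits_per_pixel(6000, 1920, 1080, 60)
raw_bits_per_pixel = 12  # assumption: 8-bit 4:2:0 source

print(f"{budget:.4f} bits per pixel per frame")
print(f"roughly {raw_bits_per_pixel / budget:.0f}:1 compression versus raw")
```

With under a twentieth of a bit per pixel to spend, every setting that saves bitrate on fast motion matters, which is what the option choices below are trying to do.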

Here are the settings that matter most on fast motion content. I will explain my understanding of each, and their impact on encoder speed. When testing these, if you start your stream and immediately get dropped encoder or render frames, you set them too high and need to back them off. Even if the drops stop right after the initial one, you WILL get encoder overload at some point during your stream.

My current x264 options (separated by spaces when entered in OBS):
threads=21
b_adapt=1
direct-pred=temporal
me=tesa
trellis=2
bframes=16
ref=1
subme=5
analyse=all
rc-lookahead=120
me_range=16
aq-mode=3
partitions=all
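Since the options have to be entered as one space-separated line, a tiny sketch like this can keep them readable as a list and join them at the end. The dictionary simply restates the values listed above.

```python
# Python dicts keep insertion order, so the output follows the list above.
X264_OPTIONS = {
    "threads": 21,
    "b_adapt": 1,
    "direct-pred": "temporal",
    "me": "tesa",
    "trellis": 2,
    "bframes": 16,
    "ref": 1,
    "subme": 5,
    "analyse": "all",
    "rc-lookahead": 120,
    "me_range": 16,
    "aq-mode": 3,
    "partitions": "all",
}


def to_option_string(options):
    """Join name=value pairs with single spaces."""
    return " ".join(f"{name}={value}" for name, value in options.items())


print(to_option_string(X264_OPTIONS))
```

The printed line can then be pasted into the x264 options field as-is.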


Explanation:

threads=21

(1-32) You can go higher, but don't. Get this as low as possible, preferably equal to the number of real CPU cores you have, for best quality. If you raise the other settings below and start to get dropped encoder frames, raising this can prevent that in some cases.

b_adapt=1
(values are 1 or 2; 0 turns it off) 2 is newer and significantly slower. 2 is not necessary, yet preferred. With 2, if you use bframes greater than two, the speed decreases significantly, and I mean SIGNIFICANTLY, so I leave it at 1. I prefer more B-frames, at 16, for better compression and overall quality gains. b_adapt=2 with bframes >3 will give your CPU a heart attack.

direct-pred=temporal
(spatial or temporal) Spatial is perfectly fine. Temporal is better looking and searches a more complex pattern for motion vectors.

Temporal has a quite significant speed impact, depending on other settings. Leave this at spatial, get the other settings as high as possible, and if temporal then works, use it. Otherwise use spatial. The quality gain from temporal is almost insignificant.

me=tesa
(dia, hex, umh, esa, tesa) Do not use dia or esa. If you are going to use esa you might as well use tesa. In my opinion this setting has the most impact on fast-moving content. tesa is the best you can get, and dropping to umh has a significant impact on quality during fast motion. So set this to tesa and then raise everything else after.

trellis=2
(0, 1, or 2) 0 is off, 1 is OK, and 2 is superb. The speed impact is small. It helps to use 2 so that you get better bitrate usage, about 5-10% savings on the video as a whole. So you get better quality out of less bitrate, say the preferred 6000.

bframes=16
(0-16) As far as I can tell, x264 clamps anything above 16, so a value like 30 behaves the same as 16. These help with video quality. Do not go too high, as playback issues for viewers may arise. 16 is great. You will read that they have a significant impact on bitrate usage, but the quality gain is immense. B-frames look great. IMO at 6000kbps you have plenty of headroom for B-frames. Use them!

ref=1
Reference frames are tough. They take A LOT of CPU to process. I use 1, as I have found more are simply not necessary in my use case (fast motion looking good). Obviously for recording you want higher, and higher is better, but the speed impact of going from 1 to 2 is HUGE, especially with me=tesa from above. Leave this at 1. If you get the other settings where you want them and can raise it, do it, but your CPU will have a heart attack and you will get dropped encoder frames. This one setting is why most people cannot encode using the slow preset but can do medium. Hell, most people can use placebo as long as they set ref=1...

subme=5
(0-11) You won't read about 11 anywhere. 10 is insane. 8-9 is preferred but very CPU intensive. 6-7 is superb. 5 is premium and gives a great overall look. 4 is perfectly fine. I started with 4, perfected everything, then moved to 5 in the end. But again, 4 looks wonderful, and the speed at 4 is superb. Moving to 5 has a significant speed impact and may cause dropped frames.

analyse=all
This setting seems to smooth the content overall. I have found very little information about it. If you find something, let me know. I leave it on "all", but you can remove this one if you feel the need.

rc-lookahead=120
(0-240) If you stream at 60fps, then a lookahead of 120 is two seconds' worth of looking forward for frame type determination. 120 @60fps is IMO preferred. 240 is outrageous. 60 is sufficient in most cases. The speed impact is negligible, but quality suffers if this is lower than your output FPS.

me_range=16

(0-???) This has a significant impact on speed. It is the range of the motion vector search when using me=umh, esa, or tesa. With hex it is locked at 16 and cannot be raised higher. It has a significant impact on fast motion smoothness. I use 16. 24 is top dog, but SIGNIFICANTLY slower.

aq-mode=3
(0, 1, 2, or 3) 3 is preferred, and is what you should use for perceived visual quality.

  • 0: Do not use AQ at all.
  • 1: Allow AQ to redistribute bits within each frame.
  • 2: Allow AQ to redistribute bits across the whole video.
  • 3: Auto-variance AQ with bias to dark scenes.

partitions=all
Allows the use of all macroblock partition sizes. "all" is best, but the default of the slow through placebo presets will also work just fine. "all" is slightly slower and gives slightly better quality, just a tad, and I mean a tad.

 

BigYuckFou

Member
If you want to test this, start with the settings below and then raise them slowly. This is not for most people. This is for the quest to get 1080p@60fps crunched into 6000kbps at high quality with fast motion content. Most people cannot even use [me=tesa] without significant issues.

threads=32
b_adapt=1
direct-pred=spatial
me=tesa
trellis=1
bframes=16
ref=1
subme=4
analyse=all
rc-lookahead=120
me_range=12
aq-mode=3
partitions=all


You want to get to these settings or better if possible:

threads=(equal to your logical CPU cores) I have 14 cores but can only get it to 21 (24 if I want zero dropped encoder frames)

b_adapt=1 (two is better but extremely slow) just use 1 unless bframes is set to <= 2

direct-pred=temporal (only a slight quality increase over spatial)

me=tesa (use this if you can) umh is terrible comparatively for fast motion

trellis=2 (two is what you want. off [0] may be best if you need a slight speed increase) won't affect quality, only speed and bitrate usage

bframes=16 (the higher the better, lower than half your fps)

ref=1 (raising will be hard. significant quality gains at significant dropped frames possibility) get above 5 for more quality otherwise use 1

subme=5 (4 is great. 5 looks awesome. 6-7 is wonderful. 8-9 is tremendous. 10 is magnificent. 11 is insane) use 4 or 5.

analyse=all (just use this) I cannot tell you why.

rc-lookahead=120 (fps at 60, each +60 is one second of lookahead) double fps is preferred.

me_range=16 (12-16 will look great) higher numbers significantly increase CPU/encoder strain and dropped frames.

aq-mode=3 (three is visually preferred) if you need a slight speed increase set this to 0, but quality does drop a bit

partitions=all (or just use the default of your preset) use all if you want the absolute best, but it is really not necessary.
 

energizerfellow

New Member
BigYuckFou said: "Can you possibly elaborate on why the OBS log file would not show an error when using rc_lookahead? It seems to accept it."
The right ones most definitely are the libx264-native versions. I'd have to go digging through some source code to see why some of the FFmpeg versions work.
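One possible explanation, offered here as an assumption rather than something checked against the source: if the option parser treats underscores and hyphens in option names as interchangeable, both spellings would end up as the same option and no error would be logged. A minimal sketch of that idea, with hypothetical function names:

```python
def normalize_option_name(name):
    """Fold underscores to hyphens (assumed parser behaviour, not verified)."""
    return name.replace("_", "-")


def normalize_options(option_string):
    """Normalize every name in a space-separated 'name=value' list."""
    normalized = []
    for token in option_string.split():
        name, sep, value = token.partition("=")
        normalized.append(normalize_option_name(name) + sep + value)
    return " ".join(normalized)


print(normalize_options("rc_lookahead=120 me_range=16 aq-mode=3"))
```

Under that assumption rc_lookahead and rc-lookahead are the same thing to the encoder. Using the hyphenated native names is still the safer habit.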

In relation to other settings, having a short GOP will hurt things. H.264 typically runs into the multi-second range in the modern era and more modern things like the HEVC-generating x265 default to open-GOP. Services like Twitch mandate a 2-second GOP.

As a test I suggest stepping through the medium/slow/slower/veryslow/placebo presets on high profile while keeping only these:
aq-mode=3 b-adapt=1 me=tesa ref=[something between 1-4] threads=[fewest you can get away with]

The size limits of the 4.x levels pretty much mean you can't get >4 reference frames on 1080p anyways. You'll see errors relating to this in the OBS logs if you do.
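The limit follows from the decoded picture buffer size each H.264 level allows, counted in 16x16 macroblocks. A short sketch of the calculation, using the MaxDpbMbs values from the H.264 level table as I remember them (worth double-checking against the specification):

```python
# MaxDpbMbs per level, in macroblocks (values quoted from memory).
MAX_DPB_MBS = {
    "4.0": 32768,
    "4.1": 32768,
    "4.2": 34816,
    "5.0": 110400,
    "5.1": 184320,
}


def max_ref_frames(width, height, level):
    """Largest reference frame count that fits the level's picture buffer."""
    mbs_wide = (width + 15) // 16   # frame width in macroblocks, rounded up
    mbs_high = (height + 15) // 16  # frame height in macroblocks, rounded up
    frame_mbs = mbs_wide * mbs_high
    return min(MAX_DPB_MBS[level] // frame_mbs, 16)  # H.264 caps refs at 16


for level in ("4.0", "4.1", "4.2", "5.0"):
    print(level, max_ref_frames(1920, 1080, level))
```

A 1080p frame is 120 x 68 = 8160 macroblocks, so every 4.x level leaves room for 4 reference frames, and you need level 5.0 or above for more.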

Leaving direct-pred on either spatial or auto seems best, as forcing temporal seems to cause problems. Even with auto, you'll likely be on mostly spatial anyway if you poke around with something like ffprobe.
 

Cripzor

New Member
Hello, I've read this thread and find it very interesting.
It's been a while since this thread was updated so I was wondering if any of the settings have changed?
I'll be replacing my 2700x with a 3900x soon and I've used some of the settings here which greatly improve my overall quality.

Here are the x264 flags I use on my 2700x:

CPU Usage Preset : medium
b-adapt=2
direct-pred=spatial (I reckon I could remove this flag since the medium preset already uses spatial)
trellis=2
bframes=2
ref=1
subme=5 (I tried with subme=7 but with rc-lookahead=40 because rc-lookahead=60 causes encoder lag. But subme=5 rc-lookahead=60 looks better and I get no encoder lag)
analyse=all
rc-lookahead=60
aq-mode=3
partitions=all
 

ignaciomtds

New Member
Hello! I've read all this and I'm interested. I have a Ryzen 5900X and use 6000kbps, medium, main, 1080p, 60 fps and lanczos. I want to know if there are any new settings now. Thanks!
 

mipsou

New Member
I use a dedicated, watercooled streaming PC with a Ryzen TR4 2950X and 32 GB of DDR4 at 3600 MHz. I tried your flags with Forza Horizon 4 and Warframe @ 1080p60.
I play at 1440p144.
 

vankedisiTV

New Member
I'm at slow x264, 6000 bitrate, 936p
Tune animation
Profile main
Threads 0
Rc-lookahead 80
Me tesa
Trellis 2
Bframes 16
Partitions all
Subme 4
Let's see if my cpu can take this
 

vankedisiTV

New Member
Threads 0
Rc-lookahead 120
Trellis 2
Subme 5
Partitions all
Direct pred temporal
Aq mode 3


However, B frames, tune animation, and me=tesa have failed.

Slow x264

Should I try setting bframes to 2?


When you select the slow preset, is bframes 0 by default, or turned off? I tried to Google the documentation for OBS x264 preset details but I couldn't find much info. Does my additional tuning increase quality? I ask because I realised my CPU usage didn't change much with the values that worked fine.
 

vankedisiTV

New Member
Hi

I realised that with the ref=1 setting I can go slower!

But I'm torn between 2 options here (even 3)

Slower preset:
Threads 0
Rc-lookahead 120
Trellis 2
Subme 5
Partitions all
Direct pred temporal
Aq mode 3
Ref 2
Bframes 2

Or

Slow preset:
Threads 0
Rc-lookahead 120
Trellis 2
Subme 5
Partitions all
Direct pred temporal
Aq mode 3
Ref 1
Me=tesa
Bframes 4

The difference is me tesa, ref, preset and bframes. Thanks!
 

msiamax

New Member
Just to bump the thread: I have been using x264 with custom settings for 2 years now and I have my own perfect settings, but does anyone know why threads= does nothing on AMD 16-core parts, yet works on my 12900K?

i have settings im eager to test with upcoming 13900k and 7950x cpus
 

msiamax

New Member
just ran ya settings on spider man remastered

i have 4 more settings that makes it perfect in my eyes
hope the new cpus can hang lol

 

vankedisiTV

New Member
You can just set slower, profile high, tune none
threads=18 rc-lookahead=80 partitions=all
That's all you need. Test if you can do 6000 on x264.

If you can't, go down to slow and add me=umh. You will get better quality if you rescale the output to 936p while canvas and output stay at 1080p.
This way your CPU's iGPU can do the recording at a much higher bitrate without hurting performance.
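A quick way to see why 936p can help at a fixed bitrate is to compare pixel counts. This assumes a 16:9 frame, so 936p means 1664x936. The helper names are hypothetical.

```python
def width_for_16_9(height):
    """Width of a 16:9 frame with the given height."""
    return height * 16 // 9


def pixel_ratio(height_small, height_large):
    """Fraction of the larger frame's pixels that the smaller frame has."""
    small = width_for_16_9(height_small) * height_small
    large = width_for_16_9(height_large) * height_large
    return small / large


ratio = pixel_ratio(936, 1080)
print(width_for_16_9(936), f"{ratio:.3f}", f"{1 / ratio:.2f}x bits per pixel")
```

At the same 6000 bitrate the encoder has about a third more bits for each pixel at 936p than at 1080p, at the cost of some resolution.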

The settings above from older comments are for peasant CPUs lol. But with a 16-core CPU I would lower my threads manually for better visual quality.

Especially with i9s since they have lot more threads. More threads = more nvenc like poor quality

X264 is really good in heavy detailed areas.
 