Encoding boost--Advance Profile Options...

jmc

Active member
Been running quick x264.mp4 encodes of a dvd.mpg on new 3950X 16/32 thread cpu.
Found one definite boost in Advance Profile Options... "Encoding threads (0=Auto)".

Unchecked or on Auto: 159 fps, cpu use maxing out in the 40s%.
If set to the full 32 threads: 167 fps (several seconds of 70% cpu use).

So "Auto" or unchecked is not what you want to have. Check it and max your threads.

I've got most of the H.264 check boxes checked and am unchecking one at a time to see what happens.
Found a goody right off the bat!

Testing just a Default x264.mp4 profile with dual pass and the threads setting...
Unchecked "threads" 211 fps
Checked "threads" 342 fps.
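
To put the fps pairs above in relative terms, here is a minimal Python sketch. The helper name `percent_change` is made up for illustration; the fps figures are the ones quoted in this post.

```python
def percent_change(baseline_fps: float, new_fps: float) -> float:
    """Relative change of new_fps versus baseline_fps, in percent."""
    if baseline_fps <= 0:
        raise ValueError("baseline_fps must be positive")
    return (new_fps - baseline_fps) / baseline_fps * 100.0


# fps figures quoted in the post: (threads unchecked/Auto, threads set to 32)
runs = {
    "first quick encode": (159, 167),
    "default profile, dual pass": (211, 342),
}

for label, (auto_fps, forced_fps) in runs.items():
    print(f"{label}: {percent_change(auto_fps, forced_fps):+.1f}%")
```

By this measure the first pair is a gain of about 5% and the dual pass pair a gain of about 62%.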

-----------------------------------
(EDIT....Have not been able to duplicate this (below) on my 3950x or my 6 core, Gotta redo this on the threadripper again)

Weirdness... Single pass (unchecked threads) 30s% 378fps
(32 checked threads) 40-50s% 287fps. No idea what's going on here,
More cpu use and less fps!
----------------------------------

So for Dual pass, check the threads box. For Single pass, leave the threads box unchecked.

EDIT... if anyone gets the same results or If I've got things upside down and backwards with the "threads checkbox" results let me know.

Way too many combinations to keep straight.
 

Otter

Member
Way too many combinations to keep straight
Too True - Any recode is a VERY complex chain of processes. Some can be processor based and some GPU based depending on your rig, software and settings.

First your ripped dvd file will be in a container (maybe a .TS) and have to be demuxed into MPEG video and AC3 audio streams.
Then the MPEG has to be decoded and expanded to a RAW stream of video frames - decoding could be HW, SW based or a combo of both.
If doing 2-pass, scene & frame complexity are analyzed on 1st pass and extra bitrate flagged where needed and less where image is simpler.
CPU load can vary a lot during this analysis as frame complexity changes constantly.
On 2nd pass, your encoder (VRD/H264) comes into play and does recoding, cropping, any resizing and remuxing. All this constantly changes CPU load and fps.
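
The two-pass flow described above can be sketched outside VRD as well. The following is only an illustration under stated assumptions: it uses the ffmpeg command line with libx264 rather than VideoReDo (whose internals are not shown in this thread), the function name `two_pass_x264_commands` and the file names are hypothetical, and the 128k AAC audio setting is an arbitrary choice. It builds the two command lines without running them.

```python
import os


def two_pass_x264_commands(src, dst, video_kbps=1300, threads=0, preset="medium"):
    """Build ffmpeg argument lists for a two-pass libx264 encode.

    threads=0 lets the encoder choose the thread count automatically.
    Assumption: ffmpeg with libx264 is installed; this is not how VRD is driven.
    """
    common = [
        "ffmpeg", "-y", "-i", src,
        "-c:v", "libx264", "-preset", preset,
        "-b:v", f"{video_kbps}k",
        "-threads", str(threads),
    ]
    # Pass 1: analysis only, no audio, output discarded.
    first = common + ["-pass", "1", "-an", "-f", "null", os.devnull]
    # Pass 2: real encode using the stats file written by pass 1.
    second = common + ["-pass", "2", "-c:a", "aac", "-b:a", "128k", dst]
    return first, second


first, second = two_pass_x264_commands("dvd.mpg", "out.mp4", threads=32)
print(" ".join(first))
print(" ".join(second))
```

Passing each list to `subprocess.run` would execute the passes in order; changing only `threads` between runs keeps the comparison to one variable.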

If you're running Win10, there will be a lot of background processes jumping in and out grabbing CPU time while you are benching - enough to skew both CPU load and FPS from run to run
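
Because of that run-to-run noise, one way to compare two settings is to repeat each encode a few times and look at the median and the spread rather than a single number. A minimal sketch follows; the function names are hypothetical and the sample fps values are made-up placeholders, not measurements from this thread.

```python
import statistics


def summarize_fps(samples):
    """Return (median fps, spread as a percentage of the median)."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate spread")
    median = statistics.median(samples)
    spread_pct = (max(samples) - min(samples)) / median * 100.0
    return median, spread_pct


def is_clear_gain(baseline, candidate):
    """Crude check: the candidate's slowest run beats the baseline's fastest run."""
    return min(candidate) > max(baseline)


# Placeholder numbers for illustration only.
auto_runs = [158, 160, 159]
forced_runs = [166, 167, 168]

print(summarize_fps(auto_runs))
print(summarize_fps(forced_runs))
print(is_clear_gain(auto_runs, forced_runs))
```

If the two sets of runs overlap, the difference is probably within the noise and more runs are needed.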

Then there is your Ryzen 3950X --- It is actually 2 CPUs in one package. Each CPU chiplet has 8 cores and can run 8 Primary + 8 Secondary threads. The 8+8 threads have to run on the same 8 cores and juggle access to the same on-CPU memory caches. The other chiplet is identical - 8 cores / 8 Pri threads and 8 Sec threads, all sharing its internal caches.
The two 8-core CPUs can and do need to send each other data, but over a much slower pipeline called the "Infinity Fabric" bus. The IF speed is half the rated speed of the motherboard's memory modules.
I run DDR4 3600 mem and my Infinity runs at 1800 MHz.
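
As a quick worked example of that rule of thumb, assuming the fabric clock runs 1:1 with the memory clock (the function name is hypothetical):

```python
def fabric_clock_mhz(ddr_rating: int) -> float:
    """Infinity Fabric clock for a given DDR4 rating, assuming 1:1 coupling.

    DDR memory transfers data twice per clock, so the real memory clock
    is half the advertised rating (e.g. DDR4-3600 runs at 1800 MHz).
    """
    if ddr_rating <= 0:
        raise ValueError("ddr_rating must be positive")
    return ddr_rating / 2


for rating in (2133, 3200, 3600):
    print(f"DDR4-{rating}: fabric at {fabric_clock_mhz(rating):.0f} MHz")
```

So a board left at a low default memory speed would also be running the link between the two chiplets slower.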

AMD only "guarantees" that at least 1 of your 16 cores will hit the advertised turbo - the rest can be slower. How many "fast" cores you get is called the "binning lottery". I have a 3900X where 2 cores on chiplet1 will hit 4600+ MHz the other 4 and all 6 cores on chiplet2 run 50-200 MHz slower. Not the best "bin" , but not the worst.

The latest version of Win10 v1909 has an updated thread scheduler that interfaces with the latest AMD BIOS/AGESA 1.0.0.4 B and the Ryzen3 controller to allocate multi-task data to the "best" cores and threads at that microsecond. A single data result often depends on other data being available before it can be completed. If the work flow often has to hold up waiting for the next data, things slow down. Data will process fastest on the Primary threads within a chiplet, Secondary threads will be slower, and needing data transferred from the other chiplet will be even slower.

By forcing an encode to all 32 threads, it might well run slower than letting the "Automatic" scheduling do its work. Results will also vary a lot, depending on whether few or many chiplet-to-chiplet data transfers end up being needed.
 

tobyW

Member
Very interesting! Thanks for taking the time to explain this to the rest of us.
 

jmc

Active member
Thanks for the info.

I just got the 3950x set up last week. That plus moving to W10-1909 from W7...LOTS to take in.
I imagine my mem speed in the bios is a low default, as I had to do two bios updates to get the 3950x support on the "Gigabyte 370 Gaming 5".

No bios tweaking yet. Just getting a feel for what it is like "stock". No overclocking. Want to try to "under volt" and reduce the temps.
Hot at 82C with a CPUZ stress test. My overclocked (3.9GHz) 1950x Threadripper is only 68-70C. Both have top Noctua air coolers.

Others report 80sC also. Thankfully, as I was starting to think I had a bad mount or uneven heat spreader.
I'm thinking that it is the cpu size. My old gen1 Threadripper has 2.4 times the contact area to transfer heat with and is 12C cooler!

Oh, CPUZ bench test indicates the 3950x is 25 to 30% faster than the old 1950x Threadripper. But boy do you lose PCIe lanes!

The "threads" checkbox test results were from a few quick runs to see if anything interesting happened.
With the very odd "single pass", "double pass" results I'm going to do a lot more testing with other cpus and core counts.
And use my heavily modified MP4 Profile instead of the "almost" Default one I tested with.

Wonder if my old 3930 6/12threads cpu will see a large effect or not.

Thanks again,
jmc
 

jmc

Active member
Sat down to do some more testing on the 0 to 32 Threads setting effect.
32 threads helped with "Dual Pass" only: a 7-8% boost over leaving the Threads check box unchecked, on both my 16 core (160fps) and my 6 core (107fps).

This time I'm using my very modified Profile that I use for the files I keep.

And I was stuck on thinking that it is CPU core/threads as in my 6/12 core/thread cpu...NOPE.
Threads in general. I set my 6 core 3930 to 12 threads and not good, 60% use and 100fps.
Set to 32 threads the 6 core went up to 98% (on Second Pass) and 107fps...gain of 7%!

EDIT....
Did two more tests and the drop in fps on a 32 thread "Single Pass" happens ONLY on a very simple Default x264.mp4 Profile.
0 threads, single pass. 194fps.
32 threads single pass 128fps

With the VERY modified (Preset VerySlow, etc) Profile that I used today...The Threads setting and "Single Pass" NO EFFECT AT ALL.

I am tested out for the day...was interesting tho. That Single Pass thing was bugging me. Thank goodness I remembered I used a simple profile before!
-------------------

So with a low bitrate (1.3Mbps) x264 encode you can not beat "Dual Pass" encoding for quality.
 

Danr

Administrator
Staff member
I'm interested in the thread for 2 reasons:

1. Been planning on purchasing a 3950x and now that it's in stock was wondering if you are pleased with your purchase? Also, did you go with a water cooler or an air cooler?

2. The other reason is do you think there's a need for dual pass now that CRF is available? I've been doing almost all my encodes in CRF and have been pleased with the results although I haven't formally compared the CRF with dual pass.
 

jmc

Active member

1. Finally got my 3950x AM4 put together last week and still learning. Can see why AMD says water cool it.
It maxes out at 81-82C. with the top Noctua AM4 cooler. Even 2000 RPM or 3000 RPM fans make little difference.

My 1950x 16 core Threadripper overclocked to 3900 MHz maxes out at 69-70C.
And this is with a carbon fiber (Carbonaut) pad, not paste.
That is 12-13C cooler than the 3950x.

I was thinking I had a bad mount or uneven heatspreader until I googled it and saw 80s on air was common.
Then it popped into my head about the die size.
If I got the figures correct the Threadripper has 2.4 times the surface area to transfer heat.
The 3950x has 40% of that surface area.
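
The two figures above are the same ratio seen from both sides. A small check, taking the 2.4x figure from this post as given (the function name is hypothetical):

```python
def smaller_area_fraction(big_to_small_ratio: float) -> float:
    """Fraction of the larger contact area that the smaller part has."""
    if big_to_small_ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 / big_to_small_ratio


ratio = 2.4  # Threadripper contact area relative to the 3950x, per the post
fraction = smaller_area_fraction(ratio)
print(f"3950x has about {fraction * 100:.1f}% of the Threadripper's contact area")
# At equal power, heat per unit of contact area scales with the same ratio.
print(f"Heat density at equal watts: about {ratio:.1f}x higher")
```

That works out to roughly 42%, in line with the "40%" estimate.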

I tried a Carbonaut pad with the 3950x, and when I removed it to apply Kryonaut paste (should drop temps a couple degrees) you could tell where there was good contact with the heatspreader. Looked like 80% of the center, like you would want. The edges did not show the same contact.

When I installed the latest Ryzen Master it had something about 95C. Think it said 95C was the MAX temp for a Ryzen.
So I'm 13 degrees below that.(82C). I'm still learning and I hope low 80s is ok. I don't like it but hope it is ok(?).

I'm not going to try to get into water cooling. I think it is all down to the small transfer surface area.
I was not worried about the watts because the 1950x was ok, Never thought about the smaller die size.

If you are not anti water cooling like I am, get an all in one (AIO) and get the temps down.

And I like it! It is in the ballpark of 25% to 30% faster than my first gen 1950x Threadripper.

AM4 not having all those PCIe lanes was a loss but I already owned the Gigabyte 370 motherboard (had a Ryzen 1700 on it).
If I was buying new I would go with the 570 boards. PCIe 4 speeds can make up some for the loss of PCIe lanes.
---------------------------------------------------------------------------------

2. I did do CRF X264.MP4 encoding and compared.
Same Profile just changed to Dual Pass or CRF.

I tested CRF till I got a similar file size/bitrate results. I'm thinking that it was CRF18.
I believe VRD6 said the CRF results was 1.37Mbps. I use 1.30Mbps Dual Pass X264.MP4 Preset VerySlow etc. Close enough.

So looking at the same frame in VRD6. Each taking up half the monitor screen side by side.
(I seem to notice things best by looking at a human face).
Eyes, edges, hair. It seemed to me that the Dual Pass had a little more detail, a little sharper eyebrows.

You are going to have to sit down zoom and hunt to find differences between the two encode types in a frame.

Not like NVEnc (a Medium X264 Preset level result at best), where X264 Dual Pass kicked its butt...I like a clear win.
 

Danr

Administrator
Staff member
jmc, Thanks for your 3950x summary. I too am partial to air cooled, but wanted to use the new Dune case (case clone of Macpro) and a Noctua cooler is about 5 mm too tall for it. Linus Tech Tips on YouTube did a comparison of air cooled vs water cooled for the 3950x and found that air cooled was about the same as water cooled. I've never used an AIO and am afraid that I'll wake up one morning to either find a puddle of water at the base of my tower or that the pump in the AIO had failed. I've been considering waiting for a 3960/3970, but those CPUs draw about 250w compared with 105w for the 3950x. PCIe lane counts don't bother me much, I just need one or 2 GPU cards for testing, a couple NVMe drives, and 2 8TB SATA drives in a RAID 1 for user file samples.

For encoding, I rotate about every 2-3 weeks between encoders and settings so that I can give different configurations (both settings and encoder types) a workout. Currently I'm using NVENC H264 at CRF 16. Quality is fine although I'm getting about 9 Mbps on 1080i content, which isn't very stressful on video quality. I've been having problems playing HEVC on my old FireTV w/KODI, lots of video stuttering. Got a new FireTV 4K stick today, will load KODI and see if it can play HEVC streams smoothly.
 

jmc

Active member
On the air cooling...
Someone in the Tech Report forums matched my cooling system (Noctua D15S) and CPUZ stress test with a 3900x, which uses a tiny bit more power than my 3950x due to the extreme binning.

The important part is that his temps (72C) are 10C lower than mine (82C)! So I must have some kind of seating problem or non-flat heat spreader. I'm going to be so ticked off if I have to do any cpu lapping to get temps down.
(just realized I don't know his ambient temp. Mine is around 28C/82F)

Several people said that 80sC is fine...Not the way I was raised it isn't. :)

Video encoding...
9Mbps wow, I doubt I would ever see a difference at that level.
At my level 1.3Mbps every little bit counts. My 43 minute "hour" shows take up a half Gig each.
I'm waiting for the time HEVC runs standard on every tv. My 2TB SSD is almost full. Next stop 4TB - ouch!
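
As a rough check on the "half Gig" figure, file size follows directly from bitrate and running time. In this sketch the 1.3 Mbps video bitrate and 43 minute length come from the post; the 192 kbps audio bitrate is purely an assumption for illustration, and the function name is hypothetical.

```python
def stream_size_mb(bitrate_kbps: float, minutes: float) -> float:
    """Size in megabytes (10^6 bytes) of a stream at a constant average bitrate."""
    bits = bitrate_kbps * 1000 * minutes * 60
    return bits / 8 / 1e6


video_mb = stream_size_mb(1300, 43)   # 1.3 Mbps video, from the post
audio_mb = stream_size_mb(192, 43)    # assumed audio bitrate, not from the post

print(f"video: {video_mb:.0f} MB")
print(f"audio: {audio_mb:.0f} MB")
print(f"total: {video_mb + audio_mb:.0f} MB")
```

The video stream alone comes to about 419 MB, so with an audio track the total lands close to half a gigabyte per episode.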

thanks,
jmc
 

Danr

Administrator
Staff member
Cooling:
Wouldn't the thermal paste avoid the need for cpu lapping? You could pull the cooler, clean the paste and re-apply. Perhaps there's an accidental bare spot on the HS.

Why would the 3900x run hotter than the 3950x? It's only 12 cores. Also, doesn't the 3900x come with the Wraith cooler in the box, which is much smaller than the Noctua?

Which GPU are you using?

Encoding:
I rarely use SSD for video files, just spinning media.
 

jmc

Active member
The thermal paste would be fine if there is not a real issue with the heat spreader flatness or heatsink alignment with the cpu.
Need to remove the cpu and check it for flatness against the heatsink surface.

When I first installed the cpu with a Carbonaut carbon fiber pad (temps +2-3 degrees over good paste) it was 85C. With just a reseat the temp dropped 3C.
There had to be a problem with the seating of the heatsink to get such a temp drop.

Changing to kryonaut paste only dropped the temp around another degree...not as expected.
So going to have to take it apart and try to figure out the alignment problem or most likely keep reseating it till I get lucky!
Probably play with the screw pressures also while watching the temp

EDIT...------------------
"If I could get the temps to drop to 72C like the person with the 3900x I would be thrilled."

The person responded and his ambient is 22C. 6C cooler then my 28C ambient. So not the 10C difference I thought.
Think I'm ok with my temps...(not happy but ok)

EDIT...
I just read at Hardware Unboxed, that his new build of 3950x water cooled runs at 58C over ambient.
My Air Cooled runs at 55 over ambient! So I'm gradually seeing that it's doing better than I thought.
---------------------------
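
Comparing temperature rise over ambient, rather than raw readings, is what makes the two setups comparable. A small sketch using the figures quoted in this thread (the function name is hypothetical):

```python
def rise_over_ambient(cpu_c: float, ambient_c: float) -> float:
    """CPU temperature rise above room temperature, in degrees C."""
    return cpu_c - ambient_c


# (cpu temp under stress test, ambient temp), as quoted in the thread
mine = rise_over_ambient(82, 28)     # 3950x on air
theirs = rise_over_ambient(72, 22)   # 3900x on the same cooler model

print(f"3950x: {mine}C over ambient")
print(f"3900x: {theirs}C over ambient")
print(f"gap after correcting for ambient: {mine - theirs}C")
```

On that basis the gap between the two systems is about 4C instead of the 10C the raw readings suggest.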

From what I've read, due to the extreme chip binning of the 3950x it runs at a lower voltage than the 3900x does and most of the time uses less power.
Yes, the 3900x does come with the much smaller Wraith cooler.

On my 6 core W7 pc I have a RX 460 videocard. Still trying to force myself to move over to my 1950x with the Vega 56...(hate W10).
The 3950x has an old Radeon 290 card.
I did my testing of NVEnc on a Gigabyte 1660 Super which is sitting on a shelf as I don't care for Nvidia.

I keep my videos on a SSD because I carry it over to someone's house to watch movies or shows. It's in a USB ext case, very light and safe to handle.
Just plug and play with the tv.
 

jmc

Active member
jmc, Thanks for your 3950x summary. I too am partial to air cooled, but wanted to use the new Dune case (case clone of Macpro) and it won't tolerate a Noctua cooler height by about 5 mm.
Hey, I found this...The NH-D15S is 5mm SHORTER! Maybe it will fit.

"the total height of the NH-D15 including fans is 165mm whereas the total height of the NH-D15S including its single center fan is only 160mm."

https://noctua.at/en/products/cpu-cooler-retail/nh-d15s/faq

------------------------
Noctua FAQ...
Furthermore, Ryzen 3000 CPUs are using the rated temperature headroom (up to 95°C) quite aggressively in order to reach higher boost clocks. As a result, it is absolutely no problem and not alarming if the processor runs into this temperature limit. The clock speed and supply voltage will be adjusted automatically by the processor itself in order to remain within AMD’s specifications and to prevent overheating.

Due to the higher heat density, higher thermal limits and more aggressive boost clock usage, it is perfectly normal that Ryzen 3000 CPUs are reaching higher temperatures than previous generation Ryzen CPUs with the same TDP rating. Higher CPU temperatures are normal for Ryzen 3000 processors and not a sign of that there is anything wrong with the CPU cooler.
------------------------
 