I have a high-end HPE ProLiant DL325 G10 server with an AMD EPYC 7401P 24-core/48-thread CPU, 128GB DDR4 RAM, and an Intel P4800X PCIe NVMe drive. I tried to run ffmpeg to convert a video (MKV) to MP4 for online streaming, and it does not utilize all 24 cores. During the encoding process it uses about 10% of the CPU according to top. An example of the top output is below.
I've read similar questions on Stack Exchange sites, but they are all unanswered and 6-8 years old. I tried adding the -threads parameter, both before and after -i, with values of 0, 24, 48 and various others, but ffmpeg seems to ignore it. I'm not scaling the video either.
I'm encoding to H.264. Below are some of the commands I've used. I can't figure out what I'm doing wrong or what exactly the bottleneck is.
Any suggestions how I can go about this?
Command used:
ffmpeg -threads 24 -i input.mkv -c:v libx264 -preset medium -c:a copy -vf subtitles=input.mkv output.mp4
I also tried using -sws_flags fast_bilinear and -x264-params sliced-threads=1, but neither changes much. I did notice that -tune zerolatency slightly increased CPU use on some cores, but overall CPU usage stayed below 15%.
top output:
top - 16:14:13 up 57 min, 2 users, load average: 2.65, 0.59, 0.50
Tasks: 509 total, 1 running, 280 sleeping, 1 stopped, 0 zombie
%Cpu0 : 5.7 us, 0.7 sy, 12.2 ni, 80.4 id, 0.0 wa, 0.0 hi, 1.0 si, 0.0 st
%Cpu1 : 2.6 us, 0.3 sy, 4.6 ni, 92.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 6.2 us, 0.3 sy, 6.9 ni, 86.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 4.0 us, 0.0 sy, 1.7 ni, 94.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu4 : 2.0 us, 0.0 sy, 13.9 ni, 84.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 2.3 us, 0.0 sy, 3.3 ni, 94.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6 : 6.6 us, 0.3 sy, 8.2 ni, 84.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 8.1 us, 0.0 sy, 12.8 ni, 79.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu8 : 1.7 us, 0.3 sy, 31.5 ni, 66.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu9 : 21.6 us, 0.0 sy, 2.6 ni, 75.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu10 : 15.3 us, 0.7 sy, 9.6 ni, 74.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu11 : 12.3 us, 0.0 sy, 10.3 ni, 77.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu12 : 1.3 us, 0.3 sy, 26.4 ni, 71.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu13 : 3.3 us, 0.0 sy, 12.0 ni, 84.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu14 : 4.0 us, 0.0 sy, 7.7 ni, 88.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu15 : 2.0 us, 0.3 sy, 20.3 ni, 77.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu16 : 4.0 us, 0.3 sy, 29.7 ni, 66.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu17 : 2.3 us, 0.3 sy, 21.7 ni, 75.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu18 : 11.6 us, 0.0 sy, 22.8 ni, 65.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu19 : 6.0 us, 0.0 sy, 20.4 ni, 73.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu20 : 6.0 us, 0.0 sy, 26.3 ni, 67.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu21 : 0.3 us, 0.3 sy, 44.5 ni, 54.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu22 : 0.0 us, 0.3 sy, 26.1 ni, 73.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu23 : 20.1 us, 0.0 sy, 15.7 ni, 64.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu24 : 7.6 us, 0.7 sy, 5.0 ni, 86.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu25 : 3.0 us, 0.3 sy, 14.9 ni, 81.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu26 : 0.3 us, 0.0 sy, 16.6 ni, 83.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu27 : 3.0 us, 0.7 sy, 0.0 ni, 96.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu28 : 0.7 us, 0.0 sy, 0.3 ni, 99.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu29 : 8.3 us, 0.0 sy, 0.7 ni, 91.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu30 : 6.7 us, 0.3 sy, 9.7 ni, 83.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu31 : 4.0 us, 0.3 sy, 22.3 ni, 73.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu32 : 1.0 us, 0.0 sy, 24.6 ni, 74.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu33 : 10.6 us, 0.3 sy, 9.9 ni, 79.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu34 : 5.4 us, 0.0 sy, 2.7 ni, 91.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu35 : 6.6 us, 0.0 sy, 4.6 ni, 88.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu36 : 4.3 us, 0.0 sy, 17.1 ni, 78.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu37 : 5.6 us, 0.3 sy, 3.3 ni, 90.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu38 : 2.3 us, 0.0 sy, 8.6 ni, 89.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu39 : 2.3 us, 0.0 sy, 22.0 ni, 75.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu40 : 1.7 us, 0.0 sy, 17.4 ni, 80.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu41 : 6.0 us, 0.3 sy, 3.3 ni, 90.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu42 : 4.7 us, 0.3 sy, 15.7 ni, 79.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu43 : 7.9 us, 0.7 sy, 24.2 ni, 67.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu44 : 7.0 us, 0.3 sy, 28.3 ni, 64.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu45 : 29.1 us, 1.0 sy, 13.9 ni, 56.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu46 : 0.0 us, 0.0 sy, 45.3 ni, 54.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu47 : 0.0 us, 0.0 sy, 48.5 ni, 51.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
ffmpeg info:
ffmpeg version 3.4.4-0ubuntu0.18.04.1 Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 7 (Ubuntu 7.3.0-16ubuntu3)
configuration: --prefix=/usr --extra-version=0ubuntu0.18.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
libavutil 55. 78.100 / 55. 78.100
libavcodec 57.107.100 / 57.107.100
libavformat 57. 83.100 / 57. 83.100
libavdevice 57. 10.100 / 57. 10.100
libavfilter 6.107.100 / 6.107.100
libavresample 3. 7. 0 / 3. 7. 0
libswscale 4. 8.100 / 4. 8.100
libswresample 2. 9.100 / 2. 9.100
libpostproc 54. 7.100 / 54. 7.100
Ubuntu version:
Linux ubuntu 4.15.0-46-generic #49-Ubuntu SMP Wed Feb 6 09:33:07 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
And lastly, here is a snippet of the output during the encoding process:
frame=34048 fps=316 q=-1.0 Lsize= 119840kB time=00:23:40.03 bitrate= 691.3kbits/s dup=2 drop=0 speed=13.2x
-threads 24 -i input.mkv --> this sets decoding threads, not encoding. Place -threads after the inputs. AFAIK, x264 tops out at (vertical resolution / 40) threads. That limit is set for effective motion search. – Gyan Mar 29 '19 at 05:04
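To make the comment's two points concrete, here is a minimal shell sketch. It assumes the "vertical resolution / 40" rule of thumb from the comment holds; that figure is the commenter's recollection, not something confirmed against the x264 source. The function names and the 1080-line height are hypothetical, since the question does not state the video's resolution. The ffmpeg options are the ones from the question, with only -threads moved after -i so that it applies to the encoder.

```shell
#!/bin/sh
# Rule of thumb quoted in the comment (an approximation, not an x264 API):
# useful x264 threads ~= vertical resolution / 40.
x264_thread_ceiling() {
    # $1 = frame height in pixels; integer division, never less than 1
    ceiling=$(( $1 / 40 ))
    if [ "$ceiling" -lt 1 ]; then
        ceiling=1
    fi
    echo "$ceiling"
}

# Hypothetical helper: same options as the command in the question, but with
# -threads placed after -i so it is an output (encoder) option, not a
# decoder option. Defined here only; it is not invoked by this script.
encode_with_encoder_threads() {
    # $1 = input file, $2 = output file, $3 = encoder thread count
    ffmpeg -i "$1" -threads "$3" -c:v libx264 -preset medium \
        -c:a copy -vf "subtitles=$1" "$2"
}

# Assumed example height of 1080 lines (not taken from the question).
x264_thread_ceiling 1080
```

For an assumed 1080-line source this prints 27, so by that rule requesting 48 threads would not be expected to help much. A call such as encode_with_encoder_threads input.mkv output.mp4 24 would then run the reordered command.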