Support for nVidia GPU acceleration

<<

Ty_Bower

Serviio newbie

Posts: 5

Joined: Wed Sep 09, 2020 6:55 pm

Post Thu Sep 10, 2020 11:00 pm

Support for nVidia GPU acceleration

nVidia GPUs from the Pascal generation onward have capable hardware decoders and encoders for both H.264 and H.265 (HEVC) media.

I have some Android tablets which cannot play H.265 files directly. They fall back to software decoding (MX Player, VLC), which results in stuttering and heavy battery consumption.

I would very much value the ability to transcode H.265 source files to H.264 as they are served to my tablet. I have a capable nVidia GPU which I would like to use to offload this transcoding, as otherwise it places quite a burden on the Serviio server's CPU.

I've added request #1153 on Bitbucket so that it can be recorded and prioritized.
<<

Ty_Bower

Serviio newbie

Posts: 5

Joined: Wed Sep 09, 2020 6:55 pm

Post Sat Sep 12, 2020 4:24 pm

Re: Support for nVidia GPU acceleration

I've tested FFmpeg v4.3.1 using an nVidia GTX 1060 graphics board. The setup was quite straightforward, using the Zeranoe build. The command line string was found in an old forum post concerning a similar request for Quick Sync Video (Intel QSV) support. The encoding rate is very good by my standards. The Core i5-2320 processor in this machine can barely muster 0.8x to 1.5x realtime, while the GPU manages 6x to 8x realtime.


J:\Video>ffmpeg.exe -threads 3 -i demo_video.mp4 -y -c:v h264_nvenc -pix_fmt yuv420p -b:v 12000k -maxrate:v 12000k -bufsize:v 12000k -preset:v fast -r 24000/1001 -g 15 -bsf:v h264_mp4toannexb -flags -global_header -c:a:0 ac3 -b:a:0 384k -ac:a:0 6 -map 0:0 -map 0:1 -sn -f mpegts C:\Transcode\Serviio\transcoding-temp-test.stf

ffmpeg version 4.3.1 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 10.2.1 (GCC) 20200726

configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libsrt --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libgsm --enable-librav1e --disable-w32threads --enable-libmfx --enable-ffnvcodec --enable-cuda-llvm --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf

libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'demo_video.mp4':

Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2mp41
creation_time : 2020-03-21T03:45:17.000000Z
title : demo_video
encoder : Lavf58.20.100
comment : demo_video

Duration: 02:03:02.42, start: 0.000000, bitrate: 2229 kb/s

Stream #0:0(und): Video: hevc (Main 10) (hev1 / 0x31766568), yuv420p10le(tv), 1920x804 [SAR 1:1 DAR 160:67], 1999 kb/s, 23.98 fps, 23.98 tbr, 1200k tbn, 23.98 tbc (default)
Metadata:
creation_time : 2020-03-21T03:45:17.000000Z
handler_name : VideoHandler

Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 224 kb/s (default)
Metadata:
creation_time : 2020-03-21T03:45:17.000000Z
handler_name : SoundHandler

Stream mapping:
Stream #0:0 -> #0:0 (hevc (native) -> h264 (h264_nvenc))
Stream #0:1 -> #0:1 (aac (native) -> ac3 (native))

Press [q] to stop, [?] for help

Output #0, mpegts, to 'C:\Transcode\Serviio\transcoding-temp-test.stf':

Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2mp41
comment : demo_video
title : demo_video
encoder : Lavf58.45.100
Stream #0:0(und): Video: h264 (h264_nvenc) (Main), yuv420p, 1920x804 [SAR 1:1 DAR 160:67], q=-1--1, 12000 kb/s, 23.98 fps, 90k tbn, 23.98 tbc (default)
Metadata:
creation_time : 2020-03-21T03:45:17.000000Z
handler_name : VideoHandler
encoder : Lavc58.91.100 h264_nvenc
Side data:
cpb: bitrate max/min/avg: 12000000/0/12000000 buffer size: 12000000 vbv_delay: N/A
Stream #0:1(eng): Audio: ac3, 48000 Hz, 5.1, fltp, 384 kb/s (default)
Metadata:
creation_time : 2020-03-21T03:45:17.000000Z
handler_name : SoundHandler
encoder : Lavc58.91.100 ac3

frame=176996 fps=146 q=9.0 Lsize=10776477kB time=02:03:02.40 bitrate=11958.3kbits/s dup=0 drop=2 speed= 6.1x
video:10130220kB audio:346050kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.865590%
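
As a quick sanity check on the final progress line above, the reported speed should be roughly the encode frame rate divided by the source frame rate (24000/1001 fps, from the -r option). The Python sketch below just does that arithmetic. The function names are made up for illustration and are not part of FFmpeg or Serviio.

```python
from fractions import Fraction

# Source frame rate, taken from the "-r 24000/1001" option in the command above.
SOURCE_FPS = Fraction(24000, 1001)


def realtime_speed(encode_fps: float, source_fps: Fraction = SOURCE_FPS) -> float:
    """Return how many times faster than realtime the encode runs."""
    return encode_fps / float(source_fps)


def encode_minutes(frames: int, encode_fps: float) -> float:
    """Return the approximate wall-clock encode time in minutes."""
    return frames / encode_fps / 60


if __name__ == "__main__":
    # fps=146 and frame=176996 are read from the final progress line.
    print(f"speed: {realtime_speed(146):.1f}x")
    print(f"encode time: {encode_minutes(176996, 146):.1f} min")
```

With the values from the log this comes out at about 6.1x, in line with the speed FFmpeg reported, and roughly 20 minutes of wall-clock time for the two-hour file.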
<<

Ty_Bower

Serviio newbie

Posts: 5

Joined: Wed Sep 09, 2020 6:55 pm

Post Sun Sep 13, 2020 1:08 pm

Re: Support for nVidia GPU acceleration

I have access to a number of different processors and graphics boards, so I tested various encode options with FFmpeg. All CPUs are 4th-generation Intel (Haswell, launched in 2013) or older. The nVidia GPU was released in 2016, but can be purchased today on the used market for around $100. The Radeon RX5700 was released last year (2019) and typically sells for over $300.

My goal is to find a cost-effective combination of hardware that offers solid transcoding performance. I am hopeful that support for these GPU-accelerated encoders can be added to Serviio in the near future.

Here is a quick summary of results. In all cases, GPU (or iGPU) encoding outperforms CPU encoding. More cores and more threads outperform fewer. Newer generation hardware typically outperforms older, except in cases where other factors (such as core count) weigh heavily.

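Hardware                              Encoder       Speed (x realtime)
Core i3-4170 (Haswell)                libx264       0.835
Core i5-2320 (Sandy Bridge)           libx264       0.861
Xeon E3-1271 v3 (Haswell)             libx264       1.73
Core i3-4170 (Haswell), Quick Sync    h264_qsv      1.87
nVidia GTX 1060 3GB                   h264_nvenc    5.96
AMD Radeon RX5700 8GB                 h264_amf      7.2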
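
To put the summary in perspective, it may help to compare each hardware encoder against libx264 on the same host rather than across machines. The Python sketch below does that with the speeds listed above. The host pairings come from the per-run details further down, and the dictionary and function names are only illustrative.

```python
# Realtime speeds copied from the summary, grouped by host machine.
# Each value is (libx264 speed on the CPU, hardware-encoder speed).
RESULTS = {
    "Core i3-4170 + Quick Sync": (0.835, 1.87),
    "Core i5-2320 + GTX 1060": (0.861, 5.96),
    "Xeon E3-1271 v3 + RX5700": (1.73, 7.2),
}


def speedup(cpu_speed: float, hw_speed: float) -> float:
    """Return how many times faster the hardware encoder is than libx264."""
    return hw_speed / cpu_speed


if __name__ == "__main__":
    for host, (cpu, hw) in RESULTS.items():
        print(f"{host}: {speedup(cpu, hw):.1f}x faster than libx264 on the same host")
```

On these numbers the GTX 1060 gives the largest relative gain (about 6.9x over the i5-2320), the RX5700 about 4.2x over the Xeon, and Quick Sync about 2.2x over the i3-4170's own cores.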

========================================
The same input file was used in all test cases.
All transcodes were performed using:

ffmpeg version 4.3.1 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 10.2.1 (GCC) 20200726
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libsrt --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libgsm --enable-librav1e --disable-w32threads --enable-libmfx --enable-ffnvcodec --enable-cuda-llvm --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
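
The configuration line above is also a quick way to tell whether a given FFmpeg build was compiled with the hardware encoders used in these tests. A minimal Python sketch follows, assuming the flag-to-encoder mapping shown in the comments (for FFmpeg 4.x builds, Quick Sync is enabled through --enable-libmfx). The function name is hypothetical. A flag being present only means the encoder was compiled in, not that a working driver and GPU are available at runtime.

```python
from typing import List

# H.264 hardware encoders and the configure flag that enables each one.
# All three flags appear in the configuration line above.
ENCODER_FLAGS = {
    "h264_nvenc": "--enable-nvenc",   # nVidia NVENC
    "h264_qsv": "--enable-libmfx",    # Intel Quick Sync (FFmpeg 4.x builds)
    "h264_amf": "--enable-amf",       # AMD AMF
}


def available_hw_encoders(configuration: str) -> List[str]:
    """Return the hardware encoders whose configure flag is present."""
    flags = set(configuration.split())
    return [enc for enc, flag in ENCODER_FLAGS.items() if flag in flags]


if __name__ == "__main__":
    # Shortened excerpt of the configuration line; paste the full line in practice.
    config = "--enable-gpl --enable-libx264 --enable-libmfx --enable-nvenc --enable-amf"
    print(available_hw_encoders(config))
```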

========================================
Core i5-2320 (Sandy Bridge), 4C/4T, 3.0 GHz, libx264 (CPU encode - transcoding aborted due to lack of time)

J:\Video>ffmpeg.exe -threads 3 -i demo_video.mp4 -y -c:v libx264 -pix_fmt yuv420p -b:v 12000k -maxrate:v 12000k -bufsize:v 12000k -preset:v fast -r 24000/1001 -g 15 -bsf:v h264_mp4toannexb -flags -global_header -c:a:0 ac3 -b:a:0 384k -ac:a:0 6 -map 0:0 -map 0:1 -sn -f mpegts C:\Transcode\Serviio\transcoding-libx264.stf

frame=57600 fps= 21 q=-1.0 Lsize= 3640044kB time=00:40:02.49 bitrate=12411.8kbits/s speed=0.861x
video:3426479kB audio:112617kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.852378%

========================================
Core i5-2320, nVidia GTX 1060 3GB (Pascal), h264_nvenc (GPU encode)

J:\Video>ffmpeg.exe -threads 3 -i demo_video.mp4 -y -c:v h264_nvenc -pix_fmt yuv420p -b:v 12000k -maxrate:v 12000k -bufsize:v 12000k -preset:v fast -r 24000/1001 -g 15 -bsf:v h264_mp4toannexb -flags -global_header -c:a:0 ac3 -b:a:0 384k -ac:a:0 6 -map 0:0 -map 0:1 -sn -f mpegts C:\Transcode\Serviio\transcoding-h264_nvenc.stf

frame=176996 fps=143 q=9.0 Lsize=10776477kB time=02:03:02.40 bitrate=11958.3kbits/s dup=0 drop=2 speed=5.96x
video:10130220kB audio:346050kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.865590%

========================================
Core i3-4170 (Haswell), 2C/4T, 3.7 GHz, libx264 (CPU encode)

E:\Video>ffmpeg.exe -threads 3 -i demo_video.mp4 -y -c:v libx264 -pix_fmt yuv420p -b:v 12000k -maxrate:v 12000k -bufsize:v 12000k -preset:v fast -r 24000/1001 -g 15 -bsf:v h264_mp4toannexb -flags -global_header -c:a:0 ac3 -b:a:0 384k -ac:a:0 6 -map 0:0 -map 0:1 -sn -f mpegts C:\Transcode\Serviio\transcoding-libx264.stf

frame=176996 fps= 20 q=-1.0 Lsize=10904180kB time=02:03:02.40 bitrate=12100.0kbits/s dup=0 drop=2 speed=0.835x
video:10253964kB audio:346050kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.869495%

========================================
Core i3-4170 (Haswell), 2C/4T, 3.7 GHz, h264_qsv (Quick Sync Video encode)

E:\Video>ffmpeg.exe -threads 3 -i demo_video.mp4 -y -c:v h264_qsv -pix_fmt yuv420p -b:v 12000k -maxrate:v 12000k -bufsize:v 12000k -preset:v fast -r 24000/1001 -g 15 -bsf:v h264_mp4toannexb -flags -global_header -c:a:0 ac3 -b:a:0 384k -ac:a:0 6 -map 0:0 -map 0:1 -sn -f mpegts C:\Transcode\Serviio\transcoding-temp.stf

Incompatible pixel format 'yuv420p' for codec 'h264_qsv', auto-selecting format 'nv12'
[mpegts @ 0000005fb0a05a80] Non-monotonous DTS in output stream 0:0; previous: 7508, current: 7508; changing to 7509. This may result in incorrect timestamps in the output file.

frame=176996 fps= 45 q=10.0 Lsize=11477859kB time=02:03:02.40 bitrate=12736.6kbits/s dup=0 drop=2 speed=1.87x
video:10816133kB audio:346050kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.828080%

========================================
Xeon E3-1271 v3 (Haswell, w/o iGPU), 4C/8T, 3.6~4.0 GHz, libx264 (CPU encode)

G:\Video>ffmpeg.exe -threads 3 -i demo_video.mp4 -y -c:v libx264 -pix_fmt yuv420p -b:v 12000k -maxrate:v 12000k -bufsize:v 12000k -preset:v fast -r 24000/1001 -g 15 -bsf:v h264_mp4toannexb -flags -global_header -c:a:0 ac3 -b:a:0 384k -ac:a:0 6 -map 0:0 -map 0:1 -sn -f mpegts G:\Transcode\Serviio\transcoding-libx264.stf

frame=176996 fps= 42 q=-1.0 Lsize=10814763kB time=02:03:02.40 bitrate=12000.8kbits/s dup=0 drop=2 speed=1.73x
video:10166430kB audio:346050kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.875468%

========================================
Xeon E3-1271 v3, AMD Radeon RX5700 8GB (Navi/RDNA1), h264_amf (GPU encode)

G:\Video>ffmpeg.exe -threads 3 -i demo_video.mp4 -y -c:v h264_amf -pix_fmt yuv420p -b:v 12000k -maxrate:v 12000k -bufsize:v 12000k -preset:v fast -r 24000/1001 -g 15 -bsf:v h264_mp4toannexb -flags -global_header -c:a:0 ac3 -b:a:0 384k -ac:a:0 6 -map 0:0 -map 0:1 -sn -f mpegts G:\Transcode\Serviio\transcoding-h264_amf.stf

frame=176996 fps=173 q=-0.0 Lsize=11453620kB time=02:03:02.40 bitrate=12709.7kbits/s dup=0 drop=2 speed= 7.2x
video:10792486kB audio:346050kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.828779%

========================================
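
The runs above differ only in the -c:v value and the output path, so the command line can be generated from one template. Below is a rough Python sketch of that idea, not how Serviio builds its commands. The function and constant names are hypothetical, and every flag is copied verbatim from the commands above. The one deviation is an assumption: nv12 is requested up front for h264_qsv, because the Quick Sync log shows FFmpeg rejecting yuv420p and auto-selecting nv12.

```python
from typing import List

# Encoders exercised in the runs above.
ENCODERS = ("libx264", "h264_nvenc", "h264_qsv", "h264_amf")

# Assumption based on the Quick Sync log: h264_qsv rejected yuv420p and
# FFmpeg auto-selected nv12, so ask for nv12 directly for that encoder.
PIX_FMT_OVERRIDES = {"h264_qsv": "nv12"}


def build_args(encoder: str, src: str, dst: str) -> List[str]:
    """Build the FFmpeg argument list used in the tests for one encoder."""
    if encoder not in ENCODERS:
        raise ValueError(f"untested encoder: {encoder}")
    pix_fmt = PIX_FMT_OVERRIDES.get(encoder, "yuv420p")
    return [
        "ffmpeg", "-threads", "3", "-i", src, "-y",
        "-c:v", encoder, "-pix_fmt", pix_fmt,
        "-b:v", "12000k", "-maxrate:v", "12000k", "-bufsize:v", "12000k",
        "-preset:v", "fast", "-r", "24000/1001", "-g", "15",
        "-bsf:v", "h264_mp4toannexb", "-flags", "-global_header",
        "-c:a:0", "ac3", "-b:a:0", "384k", "-ac:a:0", "6",
        "-map", "0:0", "-map", "0:1", "-sn", "-f", "mpegts", dst,
    ]


if __name__ == "__main__":
    args = build_args("h264_nvenc", "demo_video.mp4", "transcoding-h264_nvenc.stf")
    print(" ".join(args))
```

The resulting list can be handed to subprocess.run(args, check=True) on a machine where ffmpeg is on the PATH. Keeping the encoder as the only variable is what makes the speed figures comparable across runs.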
<<

GhostNZ

Serviio newbie

Posts: 1

Joined: Tue Sep 15, 2020 6:08 am

Post Tue Sep 15, 2020 6:56 am

Re: Support for nVidia GPU acceleration

Hopefully this gains traction. GPUs that support transcoding are getting cheaper all the time, so it would be a huge feature if it can be included.
My poor HP MicroServer G7 is starting to struggle; I might have to get an Nvidia Quadro P400 and run some tests myself.
<<

zip

Serviio developer / Site Admin

Posts: 17154

Joined: Sat Oct 24, 2009 12:24 pm

Location: London, UK

Post Wed Sep 16, 2020 12:04 pm

Re: Support for nVidia GPU acceleration

Looks good. I'll have a look.
<<

Ty_Bower

Serviio newbie

Posts: 5

Joined: Wed Sep 09, 2020 6:55 pm

Post Wed Sep 16, 2020 12:41 pm

Re: Support for nVidia GPU acceleration

Thanks. Let me know if I can do anything to help test.
<<

Ty_Bower

Serviio newbie

Posts: 5

Joined: Wed Sep 09, 2020 6:55 pm

Post Mon Oct 12, 2020 2:59 pm

Re: Support for nVidia GPU acceleration

Ty_Bower wrote: I've tested FFmpeg v4.3.1 using an nVidia GTX 1060 graphics board. The setup was quite straightforward, using the Zeranoe build.


It seems the Zeranoe site is no longer with us. For the moment, the FFmpeg builds are available here:

http://acyun.org/
<<

atc98092

DLNA master

Posts: 4118

Joined: Fri Aug 17, 2012 10:22 pm

Location: Washington (the state)

Post Mon Oct 12, 2020 10:14 pm

Re: Support for nVidia GPU acceleration

Ty_Bower wrote:
Ty_Bower wrote: I've tested FFmpeg v4.3.1 using an nVidia GTX 1060 graphics board. The setup was quite straightforward, using the Zeranoe build.


It seems the Zeranoe site is no longer with us. For the moment, the FFmpeg builds are available here:

http://acyun.org/


Well, that's a bummer. I'm sure someone will continue to maintain the Windows version.
Dan

LG NANO85 4K TV, Samsung JU7100 4K TV, Sony BDP-S3500, Sharp 4K Roku TV, Insignia Roku TV, Roku Ultra, Premiere and Stick, Nvidia Shield, Yamaha RX-V583 AVR.
Primary server: Intel i5-6400, 8 gig ram, Windows 10 Pro, 22 TB hard drive space | Test server Windows 10 Pro, AMD Phenom II X4 965, 8 gig ram
