Question / Help Build for ARM64/AARCH64

LinuxDevice

New Member
I'm trying to build on Ubuntu 16.04 arm64/aarch64, which has neither the x86 SIMD intrinsics nor support for these compiler options:
-msse -msse2 -mmmx

Related file xmmintrin.h also does not exist:
obs-studio/libobs/graphics/vec4.h:21:23: fatal error: xmmintrin.h: No such file or directory

Is there a configuration possible to avoid the x86 dependent code for build on 64-bit ARMv8-a Linux?
 

Harold

Active Member
Considering that OBS requires hardware OpenGL 3.2 for scene composition AND decent H.264-capable encoders, no.
 

LinuxDevice

New Member
This platform has a Pascal GPU with NVIDIA hardware acceleration and current OpenGL capable of 4k@60fps.
https://developer.nvidia.com/embedded/buy/jetson-tx2-devkit

When you see UHD video shot from a drone, this is what goes behind it. There is no issue with OpenGL version support, h264, NVENC, or even h265. I'm trying to get around x86 options not possible on ARMv8-a. So the question is still whether or not the OBS code can be built with the "-msse -msse2 -mmmx" options disabled for non-x86 hardware?

I'm hoping to get it before the next version releases:
https://developer.nvidia.com/embedded/jetson-xavier-faq
 

Harold

Active Member
OBS uses hardware OpenGL 3.2 for scene composition on Linux. So yes, the OpenGL version support does actually matter.
 

LinuxDevice

New Member
Yes, this platform supports OpenGL3.2 using hardware acceleration. I'm not concerned about GPU here...I'm wondering about the x86-centric code. Is there a simple means to leave out "-msse -msse2 -mmmx"? Or is there actual x86 code required, e.g., assembler?
 

Tuna

Member
The code currently targets x86 platforms exclusively. It uses MMX/SSE intrinsics and these are not #ifdef guarded. So you would first have to add some code that lets you enable/disable this path, and then either port the code in question to NEON or provide unoptimized C variants of these code blocks.
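
To make that concrete, here is a minimal sketch of what such a guard could look like for a single four-float add. The function and macro names (vec4_sketch_add, VEC4_SKETCH_SSE, VEC4_SKETCH_NEON) are made up for illustration and are not taken from the OBS source; the real code in libobs/graphics would need the same treatment for every intrinsic it uses.

```c
#include <assert.h>
#include <stddef.h>

/* Pick one backend at compile time. The VEC4_SKETCH_* names are
 * hypothetical; only the compiler-defined macros are real. */
#if defined(__SSE__)
#include <xmmintrin.h>
#define VEC4_SKETCH_SSE 1
#elif defined(__ARM_NEON)
#include <arm_neon.h>
#define VEC4_SKETCH_NEON 1
#endif

/* out = a + b for four floats. Illustrative helper, not an OBS function. */
static void vec4_sketch_add(float out[4], const float a[4], const float b[4])
{
    assert(out != NULL && a != NULL && b != NULL);
#if defined(VEC4_SKETCH_SSE)
    /* x86 path: unaligned load, add, unaligned store */
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
#elif defined(VEC4_SKETCH_NEON)
    /* ARM path: the NEON equivalents of the three calls above */
    vst1q_f32(out, vaddq_f32(vld1q_f32(a), vld1q_f32(b)));
#else
    /* unoptimized C fallback */
    for (int i = 0; i < 4; i++)
        out[i] = a[i] + b[i];
#endif
}

/* Reports which path was compiled in, handy when checking a build. */
static const char *vec4_sketch_backend(void)
{
#if defined(VEC4_SKETCH_SSE)
    return "sse";
#elif defined(VEC4_SKETCH_NEON)
    return "neon";
#else
    return "scalar";
#endif
}
```

With a guard like this, "-msse -msse2 -mmmx" would only be passed when building for x86, and the same source would compile on aarch64 without xmmintrin.h.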
 

LinuxDevice

New Member
"Were you able to get this working? I'm hoping to do the same on the Jetson Xavier."
I have not even started on it yet, too many other things to do. The more I see on Xavier the more it seems OBS would be a great addition, but I don't know if/when I'll get to it.
 

DanielR

New Member
Hey there

Same here... I tried to get obs-studio compiling and running on the Nvidia Jetson Nano, but so far without success.
Convincing the compiler not to use MMX was not that difficult (just edit CMakeLists.txt). But like Tuna already said: The hard part is to "port the code in question to NEON [arm_neon.h] or provide unoptimized C variants of these code blocks."

Has anyone gotten any further with that?
 

LinuxDevice

New Member
I have not yet tried as I've ended up on other projects. I do believe the conversion to NEON is possible. There might be cases where CUDA could be used as well. The Nano might have trouble keeping up with OBS, although this could in theory run. My thought is that Xavier could handle this without much issue.
 

mortanian

New Member
https://imgur.com/a/XqsTDRf

I've gotten it to work, and yeah you have to convert the Intel intrinsics to NEON intrinsics. Besides that, on the Jetson Nano the GL library doesn't feel like linking to libobs-opengl.so for some reason (even though the compiler line shows -lGL like 5 times.)

I haven't played with nvmpi encoding yet, since support for that in ffmpeg (via https://github.com/jocover/jetson-ffmpeg) isn't exactly reliable (or I don't know how to use it correctly). Same with RPi4, you'll need to use a custom encoder line or the gstreamer encoder pipeline (though I've not tested this.)

That said, I get a smooth 30fps at 720p with 2 60fps video sources using just libx264. I didn't really understand how this could work so well, considering it's software encoding, until I realised it also uses NEON instructions which probably helps a lot.

I borrowed and modified an sse2neon.h project, Gist contains credits for that and a link to the Git repo.

https://gist.github.com/danieloneill/9952820ad3ae0fe36738b4106ae6a775

Currently against Git Master (as of Nov 1ish, 2019).

Also, I only tested this on 64-bit ARM (Jetson Nano)
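
For anyone wondering what the sse2neon-style approach amounts to, here is a rough sketch of the idea, not the contents of the Gist. The x86 intrinsic names are kept and are redefined on top of NEON, so the calling code does not change. Only four intrinsics are covered, and scale4_sketch is a made-up caller for illustration. The plain C branch is an extra I added so the sketch compiles anywhere.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON)
/* ARM: provide the SSE names on top of NEON, as an sse2neon-style header does */
#include <arm_neon.h>
typedef float32x4_t __m128;
static inline __m128 _mm_loadu_ps(const float *p) { return vld1q_f32(p); }
static inline __m128 _mm_set1_ps(float w) { return vdupq_n_f32(w); }
static inline __m128 _mm_mul_ps(__m128 a, __m128 b) { return vmulq_f32(a, b); }
static inline void _mm_storeu_ps(float *p, __m128 a) { vst1q_f32(p, a); }
#elif defined(__SSE__)
/* x86: the real header already provides these names */
#include <xmmintrin.h>
#else
/* Anything else: plain C stand-ins (assumption, not part of sse2neon) */
typedef struct { float f[4]; } __m128;
static inline __m128 _mm_loadu_ps(const float *p)
{
    __m128 r;
    for (int i = 0; i < 4; i++) r.f[i] = p[i];
    return r;
}
static inline __m128 _mm_set1_ps(float w)
{
    __m128 r;
    for (int i = 0; i < 4; i++) r.f[i] = w;
    return r;
}
static inline __m128 _mm_mul_ps(__m128 a, __m128 b)
{
    __m128 r;
    for (int i = 0; i < 4; i++) r.f[i] = a.f[i] * b.f[i];
    return r;
}
static inline void _mm_storeu_ps(float *p, __m128 a)
{
    for (int i = 0; i < 4; i++) p[i] = a.f[i];
}
#endif

/* Hypothetical caller written only against the SSE names:
 * multiplies four floats in place by s. */
static void scale4_sketch(float v[4], float s)
{
    assert(v != NULL);
    _mm_storeu_ps(v, _mm_mul_ps(_mm_loadu_ps(v), _mm_set1_ps(s)));
}
```

The appeal of this route is that the existing intrinsic calls stay untouched; the cost is that every intrinsic the code base uses needs a mapping, and some SSE operations have no one-to-one NEON counterpart.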
 

LinuxDevice

New Member
Would you mind if I post this information on the official NVIDIA Jetson web pages? There would probably be a lot of people testing and trying this out (along with questions).
 

mortanian

New Member
I don't mind, but I don't know how to get around the need to LD_PRELOAD on libGL.so to launch it. (I just preload it in a script each time.)

It's not really a finished thing until that Just Works.
 

LinuxDevice

New Member
The correct location to link against is the NVIDIA-provided driver in "/usr/lib/aarch64-linux-gnu/" (there are also some symlinked versions there). To check that all of the NVIDIA driver files are intact you can run "sha1sum -c /etc/nv_tegra_release". If something isn't valid, it is likely because the software-only Nouveau driver has overwritten the NVIDIA version.

Note that you can specifically link against a file during compile, but then the file will always have to be there.

If you run this command from any Linux system (including Jetsons), then you can see everything the linker automatically finds:
Code:
ldconfig -p

To see which libGL entries the linker finds and where they point:
Code:
ldconfig -p | grep 'libGL[.]'

If you examine "/etc/ld.so.conf.d/", you'll see the file "aarch64-linux-gnu.conf". This file merely lists the directories for the "aarch64-linux-gnu" architecture. If the libGL.so you want to link against is not listed by default, check whether its directory is mentioned in any of the files under "/etc/ld.so.conf.d/". If the library appears more than once, an application linking against it will use the first occurrence.

Which specific libGL.so are you looking to link against?
 

mortanian

New Member
I'm not. I'm using the OBS Studio CMake build system, and it adds -lGL to the link line. ldd later reports it linking to egl and gldispatch, but not libGL.

This is good information, I'll look into it when I recover a bit. Cheers.
 

mortanian

New Member
Well, following your tips I checked what the linker sees and got:

Code:
root@jetson:~# ldconfig -p | grep libGL\\.
    libGL.so.1 (libc6,AArch64) => /usr/lib/aarch64-linux-gnu/libGL.so.1
    libGL.so (libc6,AArch64) => /usr/lib/aarch64-linux-gnu/libGL.so
root@jetson:~# dpkg -S /usr/lib/aarch64-linux-gnu/libGL.so
libglvnd-dev:arm64: /usr/lib/aarch64-linux-gnu/libGL.so
root@jetson:~# dpkg -l libglvnd-dev
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                                         Version                     Architecture                Description
+++-============================================-===========================-===========================-==============================================================================================
ii  libglvnd-dev:arm64                           1.0.0-2ubuntu2.3            arm64                       Vendor neutral GL dispatch library -- development files
root@jetson:~#

So the "libglvnd-dev" thing is an NVIDIA library to abstract vendor GL libraries from the app. I suppose from the docs at <https://github.com/NVIDIA/libglvnd> that the actual GL symbol table is loaded when there's a call to "glXMakeCurrent", which isn't called by libobs-opengl.

I don't understand why "LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libGL.so" before launch DOES work, though. The only odd thing I noticed about it is the soname is "libGL.so.1" instead of "libGL.so", but .... I don't think that should matter? I'd love to have somebody try to whip this together in a proper way and can chime in with the fix for this issue.
 

LinuxDevice

New Member
The soname is generally the "logical" name and becomes useful whenever there might be more than one version available. In particular, when there are lots of patch levels and several packages with a few extra patch numbers on the end. If they use a given API revision, then the soname should be the same.

The LD_PRELOAD question is indeed interesting. Normally the linker has a set of libraries it searches in a default order. If it finds a version it considers compatible, it stops looking further. As an example, the soname above might be the same on several libraries, and yet you are interested in one with a particular patch. A library named in LD_PRELOAD is loaded first, so it becomes the first one found. Preloading is mostly about load order, although you could also preload a library which is not in the linker's search path.

This tends to suggest that the CMake files either skip the standard linker path and miss the libGL.so which is needed, or else reorder things in such a way that the wrong libGL.so is found first. I have not looked at it, but my bet is that there is some sort of custom linking rule and the preload makes visible something not normally visible (i.e., it is possible the make content excludes that directory from the search and you're adding it back in to the search list).

It wouldn't be uncommon to see the make linker setup assume the desktop PC architecture. In that case you'd be missing the aarch64 libs (I don't know if this is what is happening, but from the above it is a reasonable guess).

You can use "ldd /some/file/bin/name" and see what files are linked against for a given executable. This isn't directly useful to you, but it might list a number of directories where a linked file exists. You might see a pattern where for example the "aarch64" directory is only mentioned on the libGL.so file you ran LD_PRELOAD on.

Another possibility is that someone mistakenly made the soname too specific...either in the library, or in the CMake content. As mentioned before, sometimes a particular patch is important, but the same patch might not exist on aarch64.

If you have a verbose compile log which does not hide what it does, then anything related to the linking setup or to that particular link step could be compared between the working case (with LD_PRELOAD) and the failing case (without it). The difference would be a big clue as to what to look at in the CMake content/code.
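
A small test program can also help narrow this down. The following is only a sketch: it asks the dynamic loader to open a library by soname through the normal search path (no LD_PRELOAD) and checks whether a symbol can be resolved from it. The function name library_exports_sketch is made up; dlopen, dlsym and dlclose are the standard <dlfcn.h> calls. On older glibc versions you may need to link with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Returns -1 if the loader cannot open `soname` via its normal search,
 *          0 if it opens but `symbol` is not resolvable through it,
 *          1 if `symbol` resolves.
 * Hypothetical diagnostic helper, not part of OBS. */
static int library_exports_sketch(const char *soname, const char *symbol)
{
    assert(soname != NULL && symbol != NULL);

    /* A bare soname (no slash) makes the loader use its usual search order */
    void *handle = dlopen(soname, RTLD_LAZY | RTLD_LOCAL);
    if (handle == NULL)
        return -1;

    int found = dlsym(handle, symbol) != NULL;
    dlclose(handle);
    return found;
}
```

For example, calling library_exports_sketch("libGL.so.1", "glXMakeCurrent") on the Nano would show whether the libGL.so.1 found by default is loadable and exposes that entry point. If it returns -1, the search path is the problem; if it returns 1, the library itself is fine and the question shifts to why libobs-opengl.so does not end up depending on it.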
 

boblinux12

New Member
Can anyone provide the compile steps outlined above by mortanian, or another method to get OBS installed on the Jetson Nano?

I understand from the posts that this requires additional steps to get it installed.
 