Compiling OpenCV

Contents

The Problem
Important Compilation Flags
Python Virtualenv Example
Conda Notes
System Package Build
Printing Build Information

The Problem

Compiling OpenCV ourselves becomes a necessity once we want to deploy it beyond small toy examples. Dependency and complexity management within the Python realm, especially in computer vision, is best described as a giant pain in the ass. OpenCV is a giant software package with many compilation options and, therefore, many dependencies. Furthermore, there may be license issues if it is compiled with certain flags. A common feature missing from most precompiled, redistributable versions of OpenCV is CUDA support. Other features, such as GStreamer support (handy for streaming processed images out without having to build a proper GStreamer pipeline ourselves), are often missing from the default packages available via pip or conda but may be present if OpenCV is provided by the OS. Moreover, building our own version of OpenCV gives us fine-grained control over which linear algebra package to use (e.g. Intel MKL or OpenBLAS) and which instruction set extensions to enable.

When building our own packages, we basically have 3 options:

building packages on an OS level (painful for DEB- and RPM-based distributions, straightforward for ArchLinux/Manjaro/Gentoo)
conda (you end up replicating almost an entire OS in a userland folder)
virtualenv (you only end up replicating relevant packages)

Important Compilation Flags

Some algorithms are patented or not available for commercial use. OPENCV_ENABLE_NONFREE controls whether they are built: ON enables them, OFF leaves them out.

Some OpenCV functions are implemented using instruction set extensions such as AVX2 (x86_64) or NEON (ARM). The baseline is the minimum set of extensions the CPU must provide to run the build at all, whereas dispatched extensions are compiled in as well but only used at runtime if the CPU actually supports them:

CPU_BASELINE_REQUIRE=SSE2,SSE3,SSE4_2
CPU_DISPATCH=FP16,FMA3,AVX,AVX2,AVX512_ICL
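Before settling on a baseline, it can help to check what the target CPU actually reports. The following is a minimal sketch, assuming a Linux x86_64 machine where /proc/cpuinfo contains a "flags" line; the helper names opencv_features and missing_from_baseline are made up for illustration, and the mapping only covers a handful of common extensions:

```python
# Map Linux /proc/cpuinfo flag names to the feature names used in
# CPU_BASELINE_REQUIRE / CPU_DISPATCH. Only a small subset is covered here.
CPUINFO_TO_OPENCV = {
    "sse2": "SSE2",
    "pni": "SSE3",  # the kernel reports SSE3 as "pni"
    "ssse3": "SSSE3",
    "sse4_1": "SSE4_1",
    "sse4_2": "SSE4_2",
    "popcnt": "POPCNT",
    "f16c": "FP16",
    "fma": "FMA3",
    "avx": "AVX",
    "avx2": "AVX2",
}


def opencv_features(cpuinfo_text):
    """Return the OpenCV feature names found in the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            present = set(value.split())
            return [name for flag, name in CPUINFO_TO_OPENCV.items() if flag in present]
    return []


def missing_from_baseline(baseline, cpuinfo_text):
    """List requested baseline features that the CPU does not report."""
    available = set(opencv_features(cpuinfo_text))
    return [feature for feature in baseline.split(",") if feature not in available]


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        text = ""  # not on Linux: nothing to report
    print("available:", ",".join(opencv_features(text)))
    print("missing from baseline:", missing_from_baseline("SSE2,SSE3,SSE4_2", text))
```

Anything listed as missing belongs in CPU_DISPATCH rather than in the baseline if the resulting binaries have to run on that machine.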

Full CUDA support can be enabled using the following flags:

WITH_CUDA=ON
WITH_NVCUVID=ON
CUDA_FAST_MATH=ON
WITH_CUBLAS=ON
WITH_CUFFT=ON
OPENCV_DNN_CUDA=ON
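Flag lists like this one get long quickly, so it may be convenient to keep them in a script and generate the cmake arguments from there. A minimal sketch; the helper cmake_args and the dictionary name CUDA_OPTIONS are assumptions for illustration, only the flag names come from the list above:

```python
def cmake_args(options):
    """Turn {"WITH_CUDA": True, ...} into ["-DWITH_CUDA=ON", ...]."""
    args = []
    for name, value in options.items():
        if isinstance(value, bool):
            value = "ON" if value else "OFF"
        args.append(f"-D{name}={value}")
    return args


# The CUDA-related flags from the list above, all switched on.
CUDA_OPTIONS = {
    "WITH_CUDA": True,
    "WITH_NVCUVID": True,
    "CUDA_FAST_MATH": True,
    "WITH_CUBLAS": True,
    "WITH_CUFFT": True,
    "OPENCV_DNN_CUDA": True,
}

if __name__ == "__main__":
    # Prints a single line that can be pasted after "cmake".
    print(" ".join(cmake_args(CUDA_OPTIONS)))
```

Non-boolean values are passed through unchanged, so entries such as "CPU_DISPATCH": "AVX,AVX2" work the same way.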

Which OpenCV CUDA modules are going to be built can be controlled separately:

BUILD_opencv_cudaarithm=ON
BUILD_opencv_cudabgsegm=ON
BUILD_opencv_cudacodec=ON
BUILD_opencv_cudafeatures2d=ON
BUILD_opencv_cudafilters=ON
BUILD_opencv_cudaimgproc=ON
BUILD_opencv_cudaobjdetect=ON
BUILD_opencv_cudaoptflow=ON
BUILD_opencv_cudastereo=ON
BUILD_opencv_cudawarping=ON
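If only a few of these modules are needed, the ON/OFF switches can be generated from a short list of wanted modules. This is a sketch under that assumption; cuda_module_flags is a hypothetical helper, and the module names are the ones listed above:

```python
CUDA_MODULES = [
    "cudaarithm", "cudabgsegm", "cudacodec", "cudafeatures2d", "cudafilters",
    "cudaimgproc", "cudaobjdetect", "cudaoptflow", "cudastereo", "cudawarping",
]


def cuda_module_flags(wanted):
    """Return one -DBUILD_opencv_<module> flag per known CUDA module."""
    unknown = sorted(set(wanted) - set(CUDA_MODULES))
    if unknown:
        # Fail early on typos instead of silently ignoring them.
        raise ValueError(f"unknown CUDA modules: {unknown}")
    return [
        f"-DBUILD_opencv_{name}={'ON' if name in wanted else 'OFF'}"
        for name in CUDA_MODULES
    ]


if __name__ == "__main__":
    for flag in cuda_module_flags({"cudaarithm", "cudaimgproc", "cudawarping"}):
        print(flag)
```

Keep in mind that the modules depend on each other; modules that CMake had to switch off for that reason show up under "Disabled by dependency" in the build information.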

There is still some legacy support for AMD:

WITH_OPENCLAMDFFT enable an OpenCL-based FFT by AMD
WITH_OPENCLAMDBLAS enable an OpenCL-based BLAS by AMD

Important options for video I/O:

WITH_FFMPEG enable FFmpeg support (make sure FFmpeg itself was compiled beforehand with all optimizations and, if desired, hardware accelerator support)
WITH_GSTREAMER enable GStreamer support (make sure GStreamer itself was compiled beforehand with all optimizations, including the good/bad/ugly plugins)
WITH_V4L enable use with Video4Linux

OpenCL and OpenGL support are handy as well:

WITH_VULKAN enable Vulkan support (OpenGL successor that also covers compute)
WITH_OPENGL enable OpenGL support
WITH_OPENCL enable OpenCL support

Some more useful CPU optimizations (parallel computing):

WITH_IPP enable Intel IPP support
WITH_TBB enable Intel TBB support
WITH_EIGEN enable libeigen support
WITH_OPENMP enable OpenMP support (CPU multi-threading)

If you plan to use OpenCV in environments without a display/X server (aka headless), make sure you disable:

WITH_GTK
WITH_QT
WITH_VTK

I highly recommend rebuilding TIFF, WEBP, JPEG and PNG when working with conda!

BUILD_TIFF build libtiff from source
BUILD_JPEG build libjpeg from source
BUILD_PNG build libpng from source
BUILD_WEBP build webp from source

Additional Deep Neural Network backends can be compiled with:

WITH_TIMVX - support for various NPUs
WITH_HALIDE - halide backend support
WITH_INF_ENGINE - OpenVINO support
WITH_ONNX - support for the ONNX Runtime

Python Virtualenv Example

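You can install virtualenv from your distribution's package management system and set it up accordingly. The mkvirtualenv and workon commands used below are provided by virtualenvwrapper, so install that as well.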

Let’s assume that you want to compile and install OpenCV with full CUDA support within a user’s home folder. All you need to do is create a virtual environment:

mkvirtualenv opencvCU --python=/usr/bin/python3.10

workon opencvCU
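Before installing anything, it can help to confirm that the interpreter in use really is the one from the virtual environment and to see where its site-packages folder lives. A minimal sketch using only the standard library; the helper names in_virtualenv and site_packages_dir are made up for illustration:

```python
import sys
import sysconfig


def in_virtualenv():
    """True if the running interpreter belongs to a virtual environment."""
    # venv and current virtualenv releases set sys.base_prefix;
    # older virtualenv releases set sys.real_prefix instead.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


def site_packages_dir():
    """Folder where pure-Python packages are installed for this interpreter."""
    return sysconfig.get_paths()["purelib"]


if __name__ == "__main__":
    print("inside virtualenv:", in_virtualenv())
    print("site-packages:", site_packages_dir())
```

Run it with the python of the activated environment; the printed site-packages folder is where the cv2 package has to end up.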

Make sure that you have installed everything you’ll need for compiling. This will vary according to your demands. Next, we have to get OpenCV’s source code. I recommend cloning the git repos and using git checkout to select a version:

git clone https://github.com/opencv/opencv
cd opencv
git submodule add https://github.com/opencv/opencv_contrib

Selecting a release version using git:

# in the opencv folder
git checkout tags/4.5.5
cd opencv_contrib
git checkout tags/4.5.5
cd ../

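Now, you can follow the basic OpenCV build instructions except for one thing:

You should make a subfolder to store OpenCV as a user, e.g. ~/.opencv_version/4_5_5/, to orchestrate different versions. This implies that you have to set the CMake option -DCMAKE_INSTALL_PREFIX=~/.opencv_version/4_5_5/ (write $HOME instead of ~ if your shell does not expand the tilde in this position). Run cmake with all the flags we desire, including -DOPENCV_EXTRA_MODULES_PATH pointing at opencv_contrib/modules so that the contrib modules are picked up, then compile it and run make install as a user.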
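The configure, compile and install steps can be collected in a small script so that every version is built the same way. The following is only a sketch: the helper names install_prefix and build_commands, the build folder opencv/build and the job count are assumptions, and the -S/-B arguments need CMake 3.13 or newer. It assumes the folder layout from above, with opencv_contrib checked out inside the opencv folder:

```python
import os


def install_prefix(version, root="~/.opencv_version"):
    """Map "4.5.5" to a path such as /home/user/.opencv_version/4_5_5."""
    return os.path.join(os.path.expanduser(root), version.replace(".", "_"))


def build_commands(version, source_dir="opencv", build_dir="opencv/build", jobs=4):
    """Return the configure, compile and install commands without running them."""
    contrib = os.path.join(os.path.abspath(source_dir), "opencv_contrib", "modules")
    configure = [
        "cmake", "-S", source_dir, "-B", build_dir,
        "-DCMAKE_BUILD_TYPE=Release",
        f"-DCMAKE_INSTALL_PREFIX={install_prefix(version)}",
        f"-DOPENCV_EXTRA_MODULES_PATH={contrib}",
        # further -D flags (CUDA, video I/O, ...) would be appended here
    ]
    compile_step = ["make", "-C", build_dir, f"-j{jobs}"]
    install_step = ["make", "-C", build_dir, "install"]
    return [configure, compile_step, install_step]


if __name__ == "__main__":
    # Only print the commands; pass each list to subprocess.run(..., check=True)
    # to actually execute them.
    for command in build_commands("4.5.5"):
        print(" ".join(command))
```

Because the prefix lives in the user's home folder, none of the steps needs root permissions.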

NB!: You may have to install the Python package manually by copying ./lib/python3.10/site-packages/cv2/ (in the install folder) to ~/.virtualenvs/opencvCU/lib/python3.10/site-packages/.
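The manual copy can be scripted so that it is repeatable after every rebuild. A minimal sketch, assuming the folder layout described above; copy_cv2 is a hypothetical helper, and the paths in the usage note are the examples from this article, not fixed locations:

```python
import shutil
from pathlib import Path


def copy_cv2(install_prefix, virtualenv_dir, python_version="3.10"):
    """Copy the installed cv2 package into the virtualenv's site-packages."""
    relative = Path("lib") / f"python{python_version}" / "site-packages"
    source = Path(install_prefix).expanduser() / relative / "cv2"
    target = Path(virtualenv_dir).expanduser() / relative / "cv2"
    if not source.is_dir():
        raise FileNotFoundError(f"no cv2 package found at {source}")
    # dirs_exist_ok (Python 3.8+) lets a rebuild overwrite an earlier copy.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

For the example in this section, the call would be copy_cv2("~/.opencv_version/4_5_5", "~/.virtualenvs/opencvCU").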

Conda Notes

I do like to rant about the conda ecosystem, as it is considerably less maintainable than any classical system-wide package management when custom packages are required. Not even conda-forge maintains OpenCV with CUDA support. If no CUDA support is needed (NVENC support seems to be available via conda-forge GStreamer plugins), then using conda is less of an issue. As soon as CUDA is required, using system packages is a much better and, more importantly, more maintainable approach. Yes, even inside fucking docker containers, system-wide packages make it a lot easier than rebuilding everything all the time or running some weird “container to container shit”. Spending a day or two to set up a proper build system will reduce debugging efforts massively!

System Package Build

System-wide packages are the old-school and most likely the most maintainable approach to building and managing custom OpenCV installations.

The process is straightforward:

1.) Set up a custom repo (or install the output package manually if the setup is more “experimental”)

2.) Grab a basic template of the build script:

3.) Modify the templates to match requirements (be careful when OpenCV is split into multiple packages; this requires some extra work)

4.) Run the build process

5.) If all tests pass: add package to custom repo (or install it on a single machine)

The biggest advantage of this approach is that it is more maintainable with respect to dependency management. There usually are a couple more packages that might need re-compilation, e.g. for proper CUDA support. Hence, having a single repo for all of them makes them easy to integrate and also to debug later (if records are kept).

I publish the following build scripts:

Printing Build Information

When debugging source code that uses OpenCV, it may come in handy to know how the package was built. We can print this information to the console.

// C++

#include <opencv2/core.hpp>

#include <iostream>

int main(){
	std::cout << cv::getBuildInformation() << std::endl;
	return 0;
}

# python

import cv2

print(cv2.getBuildInformation())

The build information of the (standard) ArchLinux package (with CUDA support), for example, may look like this:

General configuration for OpenCV 4.5.5 =====================================
  Version control:               unknown

  Extra modules:
    Location (extra):            /build/opencv/src/opencv_contrib-4.5.5/modules
    Version control (extra):     unknown

  Platform:
    Timestamp:                   2022-02-19T21:32:22Z
    Host:                        Linux 5.16.9-arch1-1 x86_64
    CMake:                       3.22.2
    CMake generator:             Unix Makefiles
    CMake build tool:            /usr/bin/make
    Configuration:               Release

  CPU/HW features:
    Baseline:                    SSE SSE2
      requested:                 SSE3
      required:                  SSE2
      disabled:                  SSE3
    Dispatched code generation:  SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
      requested:                 SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
      SSE4_1 (16 files):         + SSE3 SSSE3 SSE4_1
      SSE4_2 (1 files):          + SSE3 SSSE3 SSE4_1 POPCNT SSE4_2
      FP16 (0 files):            + SSE3 SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
      AVX (4 files):             + SSE3 SSSE3 SSE4_1 POPCNT SSE4_2 AVX
      AVX2 (31 files):           + SSE3 SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
      AVX512_SKX (5 files):      + SSE3 SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX

  C/C++:
    Built as dynamic libs?:      YES
    C++ standard:                11
    C++ Compiler:                /usr/bin/c++  (ver 11.2.0)
    C++ flags (Release):         -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions         -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security         -fstack-clash-protection -fcf-protection -Wp,-D_GLIBCXX_ASSERTIONS -flto -fno-lto   -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG  -DNDEBUG
    C++ flags (Debug):           -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions         -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security         -fstack-clash-protection -fcf-protection -Wp,-D_GLIBCXX_ASSERTIONS -flto -fno-lto   -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -fvisibility=hidden -fvisibility-inlines-hidden -g  -DDEBUG -D_DEBUG
    C Compiler:                  /usr/bin/cc
    C flags (Release):           -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions         -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security         -fstack-clash-protection -fcf-protection -flto -fno-lto   -fsigned-char -W -Wall -Werror=return-type -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -fvisibility=hidden -O3 -DNDEBUG  -DNDEBUG
    C flags (Debug):             -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions         -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security         -fstack-clash-protection -fcf-protection -flto -fno-lto   -fsigned-char -W -Wall -Werror=return-type -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -fvisibility=hidden -g  -DDEBUG -D_DEBUG
    Linker flags (Release):      -Wl,--exclude-libs,libippicv.a -Wl,--exclude-libs,libippiw.a -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -flto -fno-lto  -Wl,--gc-sections -Wl,--as-needed
    Linker flags (Debug):        -Wl,--exclude-libs,libippicv.a -Wl,--exclude-libs,libippiw.a -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -flto -fno-lto  -Wl,--gc-sections -Wl,--as-needed
    ccache:                      NO
    Precompiled headers:         NO
    Extra dependencies:          m pthread cudart_static dl rt nppc nppial nppicc nppidei nppif nppig nppim nppist nppisu nppitc npps cublas cudnn cufft -L/opt/cuda/lib64 -L/lib64
    3rdparty dependencies:

  OpenCV modules:
    To be built:                 alphamat aruco barcode bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev cvv datasets dnn dnn_objdetect dnn_superres dpm face features2d flann freetype fuzzy gapi hdf hfs highgui img_hash imgcodecs imgproc intensity_transform java line_descriptor mcc ml objdetect optflow phase_unwrapping photo plot python3 quality rapid reg rgbd saliency shape stereo stitching structured_light superres surface_matching text tracking video videoio videostab viz wechat_qrcode xfeatures2d ximgproc xobjdetect xphoto
    Disabled:                    world
    Disabled by dependency:      -
    Unavailable:                 julia matlab ovis python2 sfm ts
    Applications:                examples apps
    Documentation:               NO
    Non-free algorithms:         YES

  GUI:                           QT5
    QT:                          YES (ver 5.15.2 )
      QT OpenGL support:         YES (Qt5::OpenGL 5.15.2)
    GTK+:                        NO
    OpenGL support:              YES (/lib64/libGL.so /lib64/libGLU.so)
    VTK support:                 YES (ver 9.1.0)

  Media I/O:
    ZLib:                        /lib64/libz.so (ver 1.2.11)
    JPEG:                        /lib64/libjpeg.so (ver 80)
    WEBP:                        /lib64/libwebp.so (ver encoder: 0x020f)
    PNG:                         /lib64/libpng.so (ver 1.6.37)
    TIFF:                        /lib64/libtiff.so (ver 42 / 4.3.0)
    JPEG 2000:                   OpenJPEG (ver 2.4.0)
    OpenEXR:                     OpenEXR::OpenEXR (ver 3.1.4)
    HDR:                         YES
    SUNRASTER:                   YES
    PXM:                         YES
    PFM:                         YES

  Video I/O:
    DC1394:                      YES (2.2.6)
    FFMPEG:                      YES
      avcodec:                   YES (58.134.100)
      avformat:                  YES (58.76.100)
      avutil:                    YES (56.70.100)
      swscale:                   YES (5.9.100)
      avresample:                NO
    GStreamer:                   YES (1.20.0)
    v4l/v4l2:                    YES (linux/videodev2.h)

  Parallel framework:            TBB (ver 2021.5 interface 12050)

  Trace:                         YES (with Intel ITT)

  Other third-party libraries:
    Intel IPP:                   2020.0.0 Gold [2020.0.0]
           at:                   /build/opencv/src/build-cuda/3rdparty/ippicv/ippicv_lnx/icv
    Intel IPP IW:                sources (2020.0.0)
              at:                /build/opencv/src/build-cuda/3rdparty/ippicv/ippicv_lnx/iw
    VA:                          YES
    Lapack:                      YES (/usr/lib/liblapack.so /usr/lib/libblas.so /usr/lib/libcblas.so)
    Eigen:                       YES (ver 3.4.0)
    Custom HAL:                  NO
    Protobuf:                    /lib64/libprotobuf.so (3.19.4)

  NVIDIA CUDA:                   YES (ver 11.6, CUFFT CUBLAS)
    NVIDIA GPU arch:             35 37 50 52 60 61 70 75 80 86
    NVIDIA PTX archs:

  cuDNN:                         YES (ver 8.3.1)

  Vulkan:                        YES
    Include path:                /build/opencv/src/opencv-4.5.5/3rdparty/include
    Link libraries:              Dynamic load

  OpenCL:                        YES (INTELVA)
    Include path:                /build/opencv/src/opencv-4.5.5/3rdparty/include/opencl/1.2
    Link libraries:              Dynamic load

  Python 3:
    Interpreter:                 /usr/bin/python3 (ver 3.10.2)
    Libraries:                   /lib64/libpython3.10.so (ver 3.10.2)
    numpy:                       /usr/lib/python3.10/site-packages/numpy/core/include (ver 1.22.2)
    install path:                lib/python3.10/site-packages

  Python (for build):            /usr/bin/python3

  Java:
    ant:                         /bin/ant (ver 1.10.11)
    JNI:                         /usr/lib/jvm/default/include /usr/lib/jvm/default/include/linux /usr/lib/jvm/default/include
    Java wrappers:               YES
    Java tests:                  NO

  Install to:                    /usr
-----------------------------------------------------------------
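Output like the one above is long, so checking a handful of entries programmatically can be quicker than reading it. The following sketch flattens the "key: value" lines into a dictionary; parse_build_info and is_enabled are hypothetical helpers, and the approach is deliberately naive (section nesting is ignored and only the first occurrence of a repeated key such as "Include path" is kept):

```python
def parse_build_info(text):
    """Collect 'key: value' lines of the build information into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() and value.strip():
            # setdefault keeps the first occurrence when a key repeats.
            info.setdefault(key.strip(), value.strip())
    return info


def is_enabled(info, key):
    """True if the value stored for key starts with YES."""
    return info.get(key, "").startswith("YES")


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        cv2 = None  # OpenCV is not installed for this interpreter
    if cv2 is not None:
        info = parse_build_info(cv2.getBuildInformation())
        for key in ("NVIDIA CUDA", "cuDNN", "GStreamer", "FFMPEG"):
            print(f"{key}: {'yes' if is_enabled(info, key) else 'no'}")
```

The keys used in the loop are taken from the example output; other builds may omit some of them, in which case is_enabled simply reports them as not enabled.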