Contents
- The Problem
- Important Compilation Flags
- Python Virtualenv Example
- Conda Notes
- System Package Build
- Printing Build Information
The Problem
Compiling OpenCV is a necessity if we want to deploy it beyond some small toy examples. Especially in computer vision, dependency and complexity management within the Python realm is best described as a giant pain in the ass. OpenCV is a giant software package with many compilation options and therefore many dependencies. Furthermore, there may be license issues involved if it is compiled with certain flags. A common feature that is missing from most redistributable builds of OpenCV is the CUDA functionality. Other features, such as GStreamer support (handy for streaming processed images out without building a proper GStreamer pipeline ourselves), are often missing from the default packages available via pip or conda but may be present if OpenCV is provided by the OS. Moreover, building our own version of OpenCV gives us fine control over which linear algebra package to use (e.g. Intel MKL or OpenBLAS) and which instruction set extensions to use.
When building our own packages, we basically have 3 options:
- building packages on an OS level (for DEB- and RPM-based distributions it can be painful, for ArchLinux/Manjaro/Gentoo it is straightforward)
- conda (you end up replicating almost an entire OS in a userland folder)
- virtualenv (you only end up replicating relevant packages)
Important Compilation Flags
Some algorithms are patented or not available for commercial usage. OPENCV_ENABLE_NONFREE=ON enables building these algorithms; OFF disables it.
Some OpenCV functions are implemented using instruction set extensions such as AVX2 (x86_64) or NEON (ARM). The baseline is the minimum requirement that needs to be available on the CPU whereas dispatch will build them but use them only if the CPU provides them accordingly:
CPU_BASELINE_REQUIRE=SSE2,SSE3,SSE4_2
CPU_DISPATCH=FP16,FMA3,AVX,AVX2,AVX512_ICL
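To decide what can safely go into the baseline, it helps to compare the requested features with what the target CPU actually reports. Below is a minimal sketch, assuming a Linux x86_64 machine whose /proc/cpuinfo contains a "flags" line. The mapping table and the helper names (CPUINFO_NAMES, cpu_flags, missing_features) are illustrative and not part of OpenCV, and the table covers only a few common features.

```python
# Map OpenCV feature names to the flag names Linux reports in /proc/cpuinfo.
# Assumption: x86_64 only; extend the table for other features as needed.
CPUINFO_NAMES = {
    "SSE2": "sse2",
    "SSE3": "pni",      # Linux reports SSE3 as "pni"
    "SSSE3": "ssse3",
    "SSE4_1": "sse4_1",
    "SSE4_2": "sse4_2",
    "POPCNT": "popcnt",
    "FP16": "f16c",     # half-precision conversion instructions
    "FMA3": "fma",
    "AVX": "avx",
    "AVX2": "avx2",
}


def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def missing_features(requested, cpuinfo_text):
    """List the requested OpenCV features that the CPU does not report."""
    available = cpu_flags(cpuinfo_text)
    return [
        feature
        for feature in requested
        if CPUINFO_NAMES.get(feature, feature.lower()) not in available
    ]


# A shortened sample line; on a real machine pass open("/proc/cpuinfo").read().
sample = "flags\t\t: fpu sse sse2 pni ssse3 sse4_1 sse4_2 popcnt avx"
print(missing_features(["SSE2", "SSE3", "SSE4_2", "AVX2"], sample))  # prints ['AVX2']
```

Anything reported as missing on the oldest machine you want to support probably belongs in CPU_DISPATCH rather than in the baseline.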
Full CUDA support can be enabled using the following flags:
WITH_CUDA=ON
WITH_NVCUVID=ON
CUDA_FAST_MATH=ON
WITH_CUBLAS=ON
WITH_CUFFT=ON
OPENCV_DNN_CUDA=ON
Which OpenCV CUDA modules are built can be controlled separately:
BUILD_opencv_cudaarithm=ON
BUILD_opencv_cudabgsegm=ON
BUILD_opencv_cudacodec=ON
BUILD_opencv_cudafeatures2d=ON
BUILD_opencv_cudafilters=ON
BUILD_opencv_cudaimgproc=ON
BUILD_opencv_cudaobjdetect=ON
BUILD_opencv_cudaoptflow=ON
BUILD_opencv_cudastereo=ON
BUILD_opencv_cudawarping=ON
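With this many switches, it can be easier to keep the flags in one place and generate the cmake arguments from them. Below is a minimal sketch, assuming the flags are kept in a Python dict. The helper name cmake_defines and the choice to switch off cudastereo are purely illustrative.

```python
def cmake_defines(options):
    """Turn a dict of CMake cache options into a list of -D arguments."""
    args = []
    for name, value in options.items():
        # Booleans become ON/OFF; everything else is passed through as text.
        if isinstance(value, bool):
            value = "ON" if value else "OFF"
        args.append(f"-D{name}={value}")
    return args


cuda_options = {
    "WITH_CUDA": True,
    "WITH_CUBLAS": True,
    "WITH_CUFFT": True,
    "OPENCV_DNN_CUDA": True,
    # Illustrative choice: skip a CUDA module that is not needed.
    "BUILD_opencv_cudastereo": False,
    "CPU_DISPATCH": "FP16,FMA3,AVX,AVX2",
}

# Print the arguments so they can be pasted after "cmake".
print(" ".join(cmake_defines(cuda_options)))
```

Keeping the dict under version control also leaves a record of how each build was configured.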
There is still some legacy support for AMD:
- WITH_OPENCLAMDFFT: enable an OpenCL FFT version by AMD
- WITH_OPENCLAMDBLAS: enable OpenCL-based BLAS by AMD
Important options for video I/O:
- WITH_FFMPEG: enable ffmpeg support (make sure to compile ffmpeg beforehand with all optimizations, and with hardware accelerator support if desired)
- WITH_GSTREAMER: enable gstreamer support (make sure to compile gstreamer beforehand with all optimizations, including the good/bad/ugly plugins)
- WITH_V4L: enable use with Video4Linux
OpenCL and OpenGL support are handy as well:
- WITH_VULKAN: enable Vulkan support (OpenCL and OpenGL successor)
- WITH_OPENGL: enable OpenGL support
- WITH_OPENCL: enable OpenCL support
Some more useful CPU optimizations (parallel computing):
- WITH_IPP: enable Intel IPP support
- WITH_TBB: enable Intel TBB support
- WITH_EIGEN: enable libeigen support
- WITH_OPENMP: enable OpenMP support (CPU multi-processing)
If you plan to use OpenCV in environments without a display/xserver (aka headless), make sure you disable
WITH_GTK
WITH_QT
WITH_VTK
I highly recommend rebuilding TIFF, WEBP, JPEG and PNG when working with conda!
- BUILD_TIFF: build libtiff from source
- BUILD_JPEG: build libjpeg from source
- BUILD_PNG: build libpng from source
- BUILD_WEBP: build webp from source
Additional Deep Neural Network backends can be compiled with:
- WITH_TIMVX: support for various NPUs
- WITH_HALIDE: halide backend support
- WITH_INF_ENGINE: OpenVINO support
- WITH_ONNX: support for the ONNX Runtime
Python Virtualenv Example
You can install virtualenv (and virtualenvwrapper, which provides the mkvirtualenv and workon commands used below) from your distribution’s package management system and set it up accordingly.
Let’s assume that you want to compile and install OpenCV with full CUDA support within a user’s home folder. All you need to do is create a virtual environment:
mkvirtualenv opencvCU --python=/usr/bin/python3.10
workon opencvCU
Make sure that you have installed everything you’ll need for compiling. This will vary according to your demands. Next, we have to get OpenCV’s source code. I recommend cloning the git repos and using git checkout to select a version:
git clone https://github.com/opencv/opencv
cd opencv
git submodule add https://github.com/opencv/opencv_contrib
Selecting a release version using git:
# in the opencv folder
git checkout tags/4.5.5
cd opencv_contrib
git checkout tags/4.5.5
cd ../
Now, you can follow the basic OpenCV build instructions except for one thing: you should make a subfolder to store OpenCV as a user, e.g. ~/.opencv_version/4_5_5/, to orchestrate different versions. This implies that you have to set the CMake option -DCMAKE_INSTALL_PREFIX=$HOME/.opencv_version/4_5_5/ (use $HOME here, since most shells do not expand ~ in the middle of an argument). Run cmake with all the options we desire, compile it and run make install as a user.
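The configure, build and install steps for a per-version prefix can be sketched as follows. This is a minimal sketch under a few assumptions: a reasonably recent CMake (one that supports -S/-B and --install), opencv_contrib checked out inside the opencv folder as above, and the script being run from that opencv folder. The helper names install_prefix and build_commands are made up for illustration. The home folder is expanded explicitly instead of relying on the shell to expand ~.

```python
import os
import shlex


def install_prefix(version, root="~/.opencv_version"):
    """Return the per-version install folder, e.g. 4.5.5 -> <root>/4_5_5."""
    return os.path.join(os.path.expanduser(root), version.replace(".", "_"))


def build_commands(version, jobs=4):
    """Return the configure, build and install commands as argument lists."""
    configure = [
        "cmake", "-S", ".", "-B", "build",
        "-DCMAKE_BUILD_TYPE=Release",
        f"-DCMAKE_INSTALL_PREFIX={install_prefix(version)}",
        # Assumption: opencv_contrib lives inside the current (opencv) folder.
        "-DOPENCV_EXTRA_MODULES_PATH=" + os.path.abspath("opencv_contrib/modules"),
    ]
    build = ["cmake", "--build", "build", "--parallel", str(jobs)]
    install = ["cmake", "--install", "build"]
    return [configure, build, install]


# Only print the commands; add further -D flags before running them.
for command in build_commands("4.5.5"):
    print(shlex.join(command))
```

Because the prefix lives in the user's home folder, none of these steps should need root privileges.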
NB!: You may have to copy the Python package manually by copying ./lib/python3.10/site-packages/cv2/ (in the install folder) to ~/.virtualenvs/opencvCU/lib/python3.10/site-packages/.
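If the manual copy turns out to be necessary, it can be scripted. Below is a rough sketch, assuming the package sits under lib/python3.10/site-packages/cv2 inside the install folder (the exact layout may differ between OpenCV versions, so check the install folder first). The function name copy_cv2 and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def copy_cv2(install_prefix, venv_dir, python_tag="python3.10"):
    """Copy the cv2 package from an OpenCV install tree into a virtualenv."""
    source = Path(install_prefix) / "lib" / python_tag / "site-packages" / "cv2"
    target = Path(venv_dir) / "lib" / python_tag / "site-packages" / "cv2"
    if not source.is_dir():
        raise FileNotFoundError(f"no cv2 package found at {source}")
    # dirs_exist_ok lets a rebuilt package overwrite an earlier copy (Python 3.8+).
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

For the setup above, the call would look like copy_cv2(Path.home() / ".opencv_version/4_5_5", Path.home() / ".virtualenvs/opencvCU"). Afterwards, import cv2 inside the activated environment should pick up the custom build.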
Conda Notes
I do like to rant about the conda ecosystem, as it is considerably less maintainable than any classical system-wide package management when custom packages are required. Not even conda-forge maintains OpenCV with CUDA support. If no CUDA support is needed (NVENC support seems to be available via the conda-forge GStreamer plugins), then using conda is less of an issue. As soon as CUDA support is required, using system packages is a much better and, more importantly, more maintainable approach. Yes, even inside fucking docker containers, system-wide packages make it a lot easier than rebuilding everything all the time or running some weird “container to container shit”. Spending a day or two to set up a proper build system will reduce debugging efforts massively!
System Package Build
System-wide packages are the old-school and most likely the most maintainable approach to building and managing custom OpenCV installations.
The process is straightforward:
1.) Set up a custom repo (or install the output package manually if the setup is more “experimental”)
2.) Grab a basic template of the build script:
3.) Modify the templates to match requirements (be careful when opencv is split into multiple packages - requires some extra work)
4.) Run the build process
5.) If all tests pass: add package to custom repo (or install it on a single machine)
The biggest advantage of this approach is that it is more maintainable with respect to dependency management. There are usually a couple more packages that need recompilation, e.g. for proper CUDA support, so having a single repo for all of them makes them easy to integrate and also to debug later (if records are kept).
I publish the following build scripts:
- OpenCV as a single package for Debian/Ubuntu with full CUDA support
- OpenCV as a single package for Debian/Ubuntu with all features but CUDA
Printing Build Information
When debugging source code that uses OpenCV, it may come in handy to know how the package was built. We can print this information to the console.
// C++
#include <opencv2/core.hpp>
#include <iostream>
int main() {
    std::cout << cv::getBuildInformation() << std::endl;
    return 0;
}
# python
import cv2
print(cv2.getBuildInformation())
An example build information of the (standard) ArchLinux package (with cuda support) may look like this:
General configuration for OpenCV 4.5.5 =====================================
Version control: unknown
Extra modules:
Location (extra): /build/opencv/src/opencv_contrib-4.5.5/modules
Version control (extra): unknown
Platform:
Timestamp: 2022-02-19T21:32:22Z
Host: Linux 5.16.9-arch1-1 x86_64
CMake: 3.22.2
CMake generator: Unix Makefiles
CMake build tool: /usr/bin/make
Configuration: Release
CPU/HW features:
Baseline: SSE SSE2
requested: SSE3
required: SSE2
disabled: SSE3
Dispatched code generation: SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
requested: SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
SSE4_1 (16 files): + SSE3 SSSE3 SSE4_1
SSE4_2 (1 files): + SSE3 SSSE3 SSE4_1 POPCNT SSE4_2
FP16 (0 files): + SSE3 SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
AVX (4 files): + SSE3 SSSE3 SSE4_1 POPCNT SSE4_2 AVX
AVX2 (31 files): + SSE3 SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
AVX512_SKX (5 files): + SSE3 SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX
C/C++:
Built as dynamic libs?: YES
C++ standard: 11
C++ Compiler: /usr/bin/c++ (ver 11.2.0)
C++ flags (Release): -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -Wp,-D_GLIBCXX_ASSERTIONS -flto -fno-lto -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG -DNDEBUG
C++ flags (Debug): -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -Wp,-D_GLIBCXX_ASSERTIONS -flto -fno-lto -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -fvisibility=hidden -fvisibility-inlines-hidden -g -DDEBUG -D_DEBUG
C Compiler: /usr/bin/cc
C flags (Release): -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -flto -fno-lto -fsigned-char -W -Wall -Werror=return-type -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -fvisibility=hidden -O3 -DNDEBUG -DNDEBUG
C flags (Debug): -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -flto -fno-lto -fsigned-char -W -Wall -Werror=return-type -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -fvisibility=hidden -g -DDEBUG -D_DEBUG
Linker flags (Release): -Wl,--exclude-libs,libippicv.a -Wl,--exclude-libs,libippiw.a -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -flto -fno-lto -Wl,--gc-sections -Wl,--as-needed
Linker flags (Debug): -Wl,--exclude-libs,libippicv.a -Wl,--exclude-libs,libippiw.a -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -flto -fno-lto -Wl,--gc-sections -Wl,--as-needed
ccache: NO
Precompiled headers: NO
Extra dependencies: m pthread cudart_static dl rt nppc nppial nppicc nppidei nppif nppig nppim nppist nppisu nppitc npps cublas cudnn cufft -L/opt/cuda/lib64 -L/lib64
3rdparty dependencies:
OpenCV modules:
To be built: alphamat aruco barcode bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev cvv datasets dnn dnn_objdetect dnn_superres dpm face features2d flann freetype fuzzy gapi hdf hfs highgui img_hash imgcodecs imgproc intensity_transform java line_descriptor mcc ml objdetect optflow phase_unwrapping photo plot python3 quality rapid reg rgbd saliency shape stereo stitching structured_light superres surface_matching text tracking video videoio videostab viz wechat_qrcode xfeatures2d ximgproc xobjdetect xphoto
Disabled: world
Disabled by dependency: -
Unavailable: julia matlab ovis python2 sfm ts
Applications: examples apps
Documentation: NO
Non-free algorithms: YES
GUI: QT5
QT: YES (ver 5.15.2 )
QT OpenGL support: YES (Qt5::OpenGL 5.15.2)
GTK+: NO
OpenGL support: YES (/lib64/libGL.so /lib64/libGLU.so)
VTK support: YES (ver 9.1.0)
Media I/O:
ZLib: /lib64/libz.so (ver 1.2.11)
JPEG: /lib64/libjpeg.so (ver 80)
WEBP: /lib64/libwebp.so (ver encoder: 0x020f)
PNG: /lib64/libpng.so (ver 1.6.37)
TIFF: /lib64/libtiff.so (ver 42 / 4.3.0)
JPEG 2000: OpenJPEG (ver 2.4.0)
OpenEXR: OpenEXR::OpenEXR (ver 3.1.4)
HDR: YES
SUNRASTER: YES
PXM: YES
PFM: YES
Video I/O:
DC1394: YES (2.2.6)
FFMPEG: YES
avcodec: YES (58.134.100)
avformat: YES (58.76.100)
avutil: YES (56.70.100)
swscale: YES (5.9.100)
avresample: NO
GStreamer: YES (1.20.0)
v4l/v4l2: YES (linux/videodev2.h)
Parallel framework: TBB (ver 2021.5 interface 12050)
Trace: YES (with Intel ITT)
Other third-party libraries:
Intel IPP: 2020.0.0 Gold [2020.0.0]
at: /build/opencv/src/build-cuda/3rdparty/ippicv/ippicv_lnx/icv
Intel IPP IW: sources (2020.0.0)
at: /build/opencv/src/build-cuda/3rdparty/ippicv/ippicv_lnx/iw
VA: YES
Lapack: YES (/usr/lib/liblapack.so /usr/lib/libblas.so /usr/lib/libcblas.so)
Eigen: YES (ver 3.4.0)
Custom HAL: NO
Protobuf: /lib64/libprotobuf.so (3.19.4)
NVIDIA CUDA: YES (ver 11.6, CUFFT CUBLAS)
NVIDIA GPU arch: 35 37 50 52 60 61 70 75 80 86
NVIDIA PTX archs:
cuDNN: YES (ver 8.3.1)
Vulkan: YES
Include path: /build/opencv/src/opencv-4.5.5/3rdparty/include
Link libraries: Dynamic load
OpenCL: YES (INTELVA)
Include path: /build/opencv/src/opencv-4.5.5/3rdparty/include/opencl/1.2
Link libraries: Dynamic load
Python 3:
Interpreter: /usr/bin/python3 (ver 3.10.2)
Libraries: /lib64/libpython3.10.so (ver 3.10.2)
numpy: /usr/lib/python3.10/site-packages/numpy/core/include (ver 1.22.2)
install path: lib/python3.10/site-packages
Python (for build): /usr/bin/python3
Java:
ant: /bin/ant (ver 1.10.11)
JNI: /usr/lib/jvm/default/include /usr/lib/jvm/default/include/linux /usr/lib/jvm/default/include
Java wrappers: YES
Java tests: NO
Install to: /usr
-----------------------------------------------------------------
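Reading such a report by eye gets tedious when only a few features matter. Below is a minimal sketch of a checker that works on the report text, assuming the "key: value" layout shown above. The helper names build_info_entries and is_enabled are illustrative and not part of OpenCV. Nested sections are flattened, so repeated keys such as "Include path" overwrite each other.

```python
def build_info_entries(report):
    """Collect the 'key: value' lines of a build report into a dict."""
    entries = {}
    for line in report.splitlines():
        # Split at the first colon only, so values may contain colons.
        key, sep, value = line.strip().partition(":")
        if sep and key:
            entries[key.strip()] = value.strip()
    return entries


def is_enabled(report, key):
    """True if the report lists the key with a value starting with YES."""
    return build_info_entries(report).get(key, "").startswith("YES")


# A shortened sample; in practice pass cv2.getBuildInformation() instead.
sample = """
NVIDIA CUDA: YES (ver 11.6, CUFFT CUBLAS)
GTK+: NO
GStreamer: YES (1.20.0)
"""
print(is_enabled(sample, "NVIDIA CUDA"), is_enabled(sample, "GTK+"))  # prints True False
```

A check like this can run at application start-up to fail early when, for example, a headless deployment accidentally picked up a build without CUDA.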