Computer Vision with Rust

Contents

Introduction
Software Libraries
Examples
- OpenCV Wrapper

Introduction

As mentioned in the comment on Machine Learning with Rust, the situation of computer vision with Rust is probably similar. Many computer vision libraries are written in C++, and those for industrial applications in particular enforce every best practice out there plus some additional safety guidelines. It is therefore really difficult to sell the idea of doing computer vision with Rust unless we are trying to make something even faster than C/C++. The rather poor support for CUDA, and for GPGPU in general, adds to this problem.

The vast majority of computer vision applications, especially industrial ones, involve far more than deploying a neural network via some REST API or gRPC, which seem to be the standard scenarios that people new to computer vision and deep learning think of. In this case, using Rust for gateways/interfaces that integrate e.g. libraries programmed in C++ or even Python via some safe API would be a sane approach.
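To make the "safe API" idea a bit more concrete, here is a minimal sketch of a safe Rust wrapper around a native function. Note that native_threshold is a hypothetical stand-in that is defined in Rust here only to keep the example self-contained; in a real project it would be declared in an extern "C" block and linked from the C/C++ library.

```rust
use std::os::raw::c_int;

// Hypothetical stand-in for a function exported by a C/C++ vision library.
// In a real project this would be declared in an `extern "C" { ... }` block
// and linked from the native library instead of being defined in Rust.
unsafe extern "C" fn native_threshold(data: *mut u8, len: usize, thresh: u8) -> c_int {
    if data.is_null() {
        return -1;
    }
    // The caller has to guarantee that `data` points to `len` valid bytes.
    let pixels = unsafe { std::slice::from_raw_parts_mut(data, len) };
    for p in pixels.iter_mut() {
        *p = if *p > thresh { 255 } else { 0 };
    }
    0
}

/// Safe wrapper: the slice guarantees a valid pointer and a matching length,
/// and the C-style status code is converted into a `Result`.
pub fn threshold(pixels: &mut [u8], thresh: u8) -> Result<(), String> {
    let status = unsafe { native_threshold(pixels.as_mut_ptr(), pixels.len(), thresh) };
    if status == 0 {
        Ok(())
    } else {
        Err(format!("native_threshold failed with status {}", status))
    }
}

fn main() {
    let mut row = vec![10u8, 200, 128, 90];
    threshold(&mut row, 100).expect("thresholding failed");
    println!("{:?}", row); // prints [0, 255, 255, 0]
}
```

The point of the sketch is that callers of threshold never touch raw pointers. The unsafe part stays confined to one small, reviewable function.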

However, there are many real-time applications where “classical computer vision” does the job well enough and extremely low latencies are required. In such a case, rewriting everything in Rust, including cross-compilers for e.g. FPGAs or GPUs, is neither feasible nor necessary, as existing systems are already built extremely cautiously. This does not mean that there is never a case for reprogramming everything, as done with e.g. Braid, but in industrial/commercial settings a rewrite might simply not be worth the overhead attached to it. If someone starts from scratch, e.g. a startup using hardly anything beyond some basic frame I/O libraries, then this changes if using Rust can be turned into a sales argument.

The largest computer vision applications programmed in Rust that can be found on GitHub are:

Braid
- low-latency 2D and 3D tracking using single/multiple high-speed cameras
gyroflow
- video stabilization software

Software Libraries

Software libraries might be grouped into wrappers and some basic rewrites.

Wrappers

darknet-rust
- bindings for the Darknet framework
opencv-rust
- OpenCV wrapper
gstreamer-rs
- Wrapper for GStreamer
paddle-sys
- bindings for the Paddle inference engine (programmed in C)
rust-ffmpeg
- Wrapper for FFmpeg
tch-rs
- Rust bindings for libtorch

Native Crates

cv
- Rust CV mono repo to replicate OpenCV in pure Rust
image-rs
- image-rs is a GitHub organization which provides many repos for basic image processing and I/O
strawlab
- strawlab is the GitHub organization of Andrew Straw’s lab at the University of Freiburg, Germany. It contains wrappers for e.g. industrial cameras and many other repos that are used to build the strand-braid project.

Examples

OpenCV Wrapper

Whether someone likes OpenCV or not is usually irrelevant for initial prototyping. It is simply too useful not to use. Proper or optimized rewrites, e.g. in pure Rust, may follow later when it is clear exactly which functions are needed.

Perhaps the simplest thing we could do with OpenCV is to print its build information. OpenCV is most commonly used via its C++ API or Python wrapper. Printing the build information in these two programming languages looks like this:

// C++

#include <iostream>

#include <opencv2/core.hpp>

int main()
{
	std::cout << cv::getBuildInformation() << std::endl;
	return 0;
}

# Python

import cv2 as cv

def main():
    print(cv.getBuildInformation())

if __name__ == '__main__':
    main()

Except for explicit typing and including the correct header files, the C++ and Python code look almost identical with respect to OpenCV.

While the header-based structure seems to match what is exposed via the Rust wrapper, there is one thing that is really hard to get used to for experienced OpenCV users. In OpenCV, classes are usually written in PascalCase and functions in camelCase. The Rust wrapper seems to keep PascalCase for classes, which appear as structs, whereas functions are converted from camelCase to snake_case. Therefore getBuildInformation is renamed to get_build_information. Furthermore, the namespace is more explicit: functions live in their module (e.g. opencv::core) and are not “exported” to a single cv:: namespace. This is probably the most challenging part when using OpenCV within software programmed in Rust. Another thing to get used to is that e.g. an opencv::core::Mat needs to be initialized explicitly as a default Mat, whereas in C++ a simple cv::Mat frame; would be enough. The more experience someone has with using OpenCV from C++ or Python, the longer it may take to adjust.
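The renaming rule can be illustrated with a few lines of plain Rust. This is only a simplified sketch of the camelCase to snake_case convention described above. The helper camel_to_snake is hypothetical and is not how the wrapper generates its bindings; names containing acronyms or digits may well be handled differently there.

```rust
/// Converts a camelCase identifier to snake_case.
/// Simplified: acronyms and digits are not treated specially.
fn camel_to_snake(name: &str) -> String {
    let mut out = String::with_capacity(name.len() + 4);
    for (i, ch) in name.chars().enumerate() {
        if ch.is_ascii_uppercase() {
            // insert a separator before every uppercase letter except the first character
            if i > 0 {
                out.push('_');
            }
            out.push(ch.to_ascii_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}

fn main() {
    for name in ["getBuildInformation", "imshow", "waitKey"] {
        println!("{} -> {}", name, camel_to_snake(name));
    }
}
```

For the three names used in this article the sketch yields get_build_information, imshow and wait_key, which makes it a handy mental rule when searching the wrapper's documentation for a known C++ or Python function.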

Printing the build information would look like this:

// print_build_info.rs

use opencv;

fn main() {
    let build_information = opencv::core::get_build_information().unwrap();
    println!("{}", build_information);
}

A very simple example of opening a video source and displaying frames read would look like this:

use opencv;
use opencv::prelude::*;

fn main() -> opencv::Result<()>{
    // argument parsing to input correct source
    let args: Vec<String> = std::env::args().collect();
    let source_raw: &str = args.get(1).map(String::as_str).unwrap_or("no source");
    // a non-numeric argument is treated as a file path or URL (marked by -1)
    let source_as_int: i64 = source_raw.parse::<i64>().unwrap_or(-1);

    // modify source to use VideoCapture::from_file with camera source
    let source = if source_as_int == -1 {
        source_raw.to_string()
    } else {
        format!("/dev/video{}", source_as_int)
    };

    if source != "no source" {
        let mut frame = opencv::core::Mat::default();
        let mut cap = opencv::videoio::VideoCapture::from_file(
            &source, opencv::videoio::CAP_ANY)?;

        if cap.is_opened()? {
            let mut counter: u64 = 0;
            let start_time = std::time::Instant::now();
            loop {
                cap.read(&mut frame)?;
                if frame.empty() {
                    break;
                }
                // count the frame as soon as it has been read successfully
                counter += 1;
                opencv::highgui::imshow("Display", &frame)?;
                let key = opencv::highgui::wait_key(1)?;
                // 27 is the key code of ESC
                if key == 27 {
                    break;
                }
            }
            
            // print fps (floating point to avoid truncation by integer division)
            let run_time = start_time.elapsed().as_secs_f64();
            if run_time > 0.0 {
                println!("Frames read: {}", counter);
                println!("FPS: {:.1}", counter as f64 / run_time);
            }
        } else {
            println!("Could not open video source");
        }

    } else {
        println!("No source specified!");
    }

    Ok(())
}
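One way to keep such a program testable is to separate the pure logic (choosing the source string, computing the frame rate) from the OpenCV calls. The following is a minimal sketch using only the standard library. The helper names resolve_source and frames_per_second are made up for this illustration, and the /dev/video prefix assumes a Linux system, as in the example above. Unlike the example above, only non-negative integers are treated as camera indices here.

```rust
use std::time::Duration;

/// Maps a command-line argument to a string that could be passed to
/// `VideoCapture::from_file`. A non-negative integer is interpreted as a
/// camera index, everything else is passed through as a path or URL.
fn resolve_source(arg: Option<&str>) -> Option<String> {
    let raw = arg?;
    match raw.parse::<u32>() {
        Ok(index) => Some(format!("/dev/video{}", index)),
        Err(_) => Some(raw.to_string()),
    }
}

/// Average frame rate, or `None` if no measurable time has passed.
fn frames_per_second(frames: u64, elapsed: Duration) -> Option<f64> {
    let secs = elapsed.as_secs_f64();
    if secs > 0.0 {
        Some(frames as f64 / secs)
    } else {
        None
    }
}

fn main() {
    let arg = std::env::args().nth(1);
    match resolve_source(arg.as_deref()) {
        Some(source) => println!("Source: {}", source),
        None => println!("No source specified!"),
    }

    // 120 frames in 4 seconds
    if let Some(fps) = frames_per_second(120, Duration::from_secs(4)) {
        println!("FPS: {:.1}", fps); // prints FPS: 30.0
    }
}
```

Both helpers can be covered by ordinary unit tests without a camera or an OpenCV installation. Returning an Option also avoids a sentinel string such as "no source".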

Have a look at the official repo examples for more examples of the OpenCV wrapper for Rust.