Deploying Neural Networks in Real-Time Applications

Deploying models using Python is getting worse every year. The obvious reason to avoid Python is related to performance issues. As soon as we’re facing strong latency requirements, the Python GIL removes Python from the set of useful languages. The other reason why Python is a pain in the ass choice of language is not related to the language itself but to the ecosystem. While there are certain shortcommings that could be mitigate using e.g. cython or accept certain performance penalities (as long as they don’t render a product or service pointless), ecosystem related problems are severe. Example? Ever tried to use CuPy, Rapids and PyTorch in a single project on recent Ampere GPUs? Another ecosystem related issue is the licensing. Ever tried to resolve all the licences of all the packages used? And are they really trustworthy? So, what are the alternatives?

As most Python libraries are written in C++ anyhow, C++ is an obvious choice. However, C++ has its flaws and programming it correctly with proper error handling is somehow not that simple, rather slow (for prototyping) and “the ecosystem” might be even more disconnected. Actually, writing proper and useful error handling considering a rather small stack (opencv, libtorch + perhaps boost for network bound I/)) is almost the most time consuming part of it. Many applications, especially computer vision related, can be centered around 2-4 larger libraries that are somewhat well-maintained and have clear licensing but many other things have to be written from scratch which has a very negative impact on prototyping (clean re-write for production is something that is often lacking nowadays).

Rust is a decent choice as well. The language itself has many features to overcome all the issues everyone faces with C or C++. However, there are also some features it lacks and the “workarounds” are everything but beautiful. The more critical issue is related to Rust’s ecosystem. The nice thing first, licensing is usually Apache or MIT (or dual). The bad thing is the dependency hell. Even rather small libraries have so many dependencies that it is really difficult to estimate if there are any security fuck-ups involved when compiling all. Further, it seems like many libraries, excuse me crates, were hacked together rather rapidly and maintained for only a short period of time. Being dependend on such libraries, even as dependency of well maintained libraries, is a tricky thing to evaluation when aiming to deploy it in production environments.

What to choose? I don’t have a definite answer to this but I find myself designing well-optimized C++ pipelines quite frequently. Rust certainly has some nice guarantees when it comes to network bound I/O which justifies spending signficantly more time writing some of the computer vision related parts from scratch. Probably both programming languages are good sources for upcoming blog posts.