Introduction

When deploying neural networks or other data science projects, it is important to benchmark all components to identify bottlenecks. This matters even more on the deployment side of a project than on the development/training side. That said, there is no point in choosing a much slower solution on the development/training side just to avoid a few changes.

Most of my notes contain some micro-benchmarks. The list here covers a wider range of benchmarks that may be worth considering when architecting a data science or machine learning project. NB: these benchmarks provide initial impressions, but they do not free us from micro-benchmarking our own source code to identify further bottlenecks.
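
For quick micro-benchmarks of our own code, Python's built-in timeit module is often a good first tool. A minimal sketch with a made-up toy workload (the function names and sizes are illustrative, not taken from any of the benchmarks listed below):

```python
import timeit

# Toy workload: sum 100,000 floats two different ways
# (purely illustrative; not from any benchmark below).
data = [float(i) for i in range(100_000)]

def builtin_sum():
    return sum(data)

def manual_sum():
    total = 0.0
    for x in data:
        total += x
    return total

# timeit calls each function `number` times and returns the total wall time.
print("sum():   ", timeit.timeit(builtin_sum, number=100))
print("for loop:", timeit.timeit(manual_sum, number=100))
```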

Database/Data Frame Benchmarks

  • Database-like ops benchmark
    • benchmarks for database-like operations at various data sizes on data frames such as pandas, data.table and others (see the pandas sketch after this list)
  • modin-benchmark
    • a benchmark I wrote in 2019, shortly after Modin was released
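
To get a feel for what a database-like ops benchmark measures, here is a minimal sketch that times a single groupby-mean in pandas; the frame size, column names and seed are arbitrary assumptions, and a real comparison would repeat this across libraries and data sizes:

```python
import time

import numpy as np
import pandas as pd

# Synthetic frame: 1e6 rows, 1,000 groups -- sizes are arbitrary.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "key": rng.integers(0, 1_000, size=1_000_000),
    "value": rng.random(1_000_000),
})

start = time.perf_counter()
result = df.groupby("key", sort=False)["value"].mean()
print(f"groupby-mean over 1e6 rows: {time.perf_counter() - start:.3f} s")
```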

I/O Benchmarks

  • CSV readers
    • benchmarks for reading CSV files with Julia, Python and Rust (a minimal pandas timing sketch follows below)
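
As a rough illustration of such an I/O benchmark, the sketch below writes a synthetic CSV and times pandas.read_csv on it; the file name and shape are made up, and a serious comparison would also cover the Julia and Rust readers from the linked benchmark:

```python
import time

import numpy as np
import pandas as pd

# Write a synthetic CSV first so the snippet is self-contained;
# the shape (500,000 x 10) is arbitrary.
pd.DataFrame(np.random.rand(500_000, 10)).to_csv("synthetic.csv", index=False)

start = time.perf_counter()
df = pd.read_csv("synthetic.csv")
print(f"pandas.read_csv: {time.perf_counter() - start:.3f} s for {len(df)} rows")
```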

General Computation Benchmarks

Graph/Network Benchmarks

Hardware Benchmarks

  • Deep Learning GPU Benchmarks
    • deep learning benchmarks across NVIDIA GPUs, varying the number of GPUs, the batch size and the model (a minimal PyTorch timing sketch follows below)
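
As a minimal sketch of how such a GPU timing can be taken by hand, assuming PyTorch is installed (matrix size and repeat count are arbitrary; benchmarks like the one linked above measure full training steps instead):

```python
import time

import torch

# Fall back to CPU when no CUDA device is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)

torch.matmul(a, b)  # warm-up so one-time setup cost does not skew the timing
if device == "cuda":
    torch.cuda.synchronize()

start = time.perf_counter()
for _ in range(10):
    torch.matmul(a, b)
if device == "cuda":
    torch.cuda.synchronize()  # CUDA kernels run async; wait before stopping the clock
print(f"10 matmuls of 4096x4096 on {device}: {time.perf_counter() - start:.3f} s")
```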

Machine Learning Framework Benchmarks

Machine Learning Model Benchmarks

Web Framework Benchmarks