Anyone working on deep learning for computer vision has heard of YOLO. Whether it is actually a useful architecture for many real-life deployment cases is something I’ll leave open here. It certainly is useful for quick initial prototypes when object detection is required. A comparison of model performance can be found here.
Over the past years, an enormous number of object detectors have been named YOLO. Here is a list of them:
The darknet family
- YOLOv3
- the last version by YOLO's original author
- YOLOv4
- most likely the most widely used of the original darknet-based versions
Based on PyTorch and other frameworks
- YOLOv5
- most likely one of the most popular versions, as it allows easy export of its trained PyTorch models to all kinds of formats
- YOLOv8
- successor to YOLOv5 by the same company
- YOLOv11
- successor to YOLOv8 by the same company
- YOLOv12
- “Attention-Centric Real-Time Object Detectors”
- YOLO-NAS
- model generated using neural architecture search (NAS)
- Gold-YOLO