Detecting and recognizing objects in an image quickly and reliably is an important problem in computer vision. People look at an image and immediately know what objects it contains, where they are located, and how they interact. The human visual system is fast and accurate, allowing us to perform complex tasks, such as driving, with little conscious thought. In contrast, locating objects with a computer is not so simple. This thesis presents and analyzes machine learning and computer vision algorithms for detecting objects in images and video.
The aim of this thesis is to provide a general analysis of the prevailing machine learning algorithms for object detection available today, and to study in depth the use of convolutional neural networks for object recognition and localization in images, in videos, and in applications that require object detection in real time. Faster R-CNN, SSD, RetinaNet, EfficientDet, YOLOv3, and YOLOv4 are the state-of-the-art object detection algorithms studied thoroughly in this thesis. It also aims to train object detection models and apply them to detect weapons in images, in videos, and in real time. Finally, metrics for the performance and effectiveness of these algorithms are presented.