Real-Time Hand Gesture Recognition Using YOLO and (Darknet-19) Convolution Neural Networks


  • Raad Ahmed Mohamed, Computer Science, Iraqi Commission for Computers and Informatics, Informatics Institute for Postgraduate Studies, Baghdad, Iraq
  • Karim Q. Hussein, Mustansiriyah University, Faculty of Science, Computer Science Dept., Baghdad, Iraq



Deaf and dumb, sign language, gesture detection, hand gestures, region-based convolution, human hands, DarkNet-19, YOLO


There are at least three hundred and fifty million people in the world who cannot hear or speak; they are commonly referred to as deaf and dumb. This segment of society is often partially isolated from the rest of society because of the difficulty of dealing with, communicating with, and understanding one another. In response to this problem, a number of solutions have been proposed to bridge the gap between this segment and the rest of society, chiefly by simplifying the understanding of sign language. The basic idea is to build a program that recognizes the hand movements of the interlocutor and converts them from images into the symbols or letters found in the dictionary of the deaf and dumb. This process relies mainly on applications of artificial intelligence: the palm of the hand must be distinguished, identified, and extracted from the ordinary images received from the camera, and the image of the palm or hand movement must then be converted into understandable symbols. In this paper, image processing and artificial intelligence, represented by artificial neural networks, were applied after formulating the problem under study. The first part scans the image to locate the regions of the right and left palms; non-traditional methods that use artificial intelligence, namely convolutional neural networks, were employed for this part, and YOLO v2 in particular was used in the current research with excellent results. The second part builds a pictorial dictionary of the letters used in teaching the deaf and dumb; after generating the image database for the dictionary, the DarkNet-19 neural network was used to identify (classify) the character images extracted from the first part of the program.
The results obtained from the research show that neural networks, especially convolutional neural networks, are very suitable in terms of accuracy, speed of performance, and generalization to previously unseen input data. Many of the limitations usually associated with such a program, such as requiring specific shapes and templates or imposing constraints on hand shape, hand speed, hand color, and other physical attributes, as well as the need for additional physical aids, were overcome through the optimal use of convolutional neural networks.
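The two-stage pipeline described in the abstract, first detecting hand regions in a camera frame, then classifying each cropped region as a letter of the sign alphabet, can be sketched structurally as follows. This is a minimal illustration, not the authors' implementation: `detect_hands` and `classify_sign` are hypothetical stubs standing in for the trained YOLO v2 detector and DarkNet-19 classifier, which in practice would be loaded through a deep-learning framework.

```python
import numpy as np

# Hypothetical stand-ins for the two trained networks described in the paper.
# In a real system, detect_hands would run a YOLO v2 hand detector and
# classify_sign would run a DarkNet-19 classifier on the cropped region.
def detect_hands(frame):
    """Return bounding boxes (x, y, w, h) for hands found in the frame."""
    h, w = frame.shape[:2]
    return [(w // 4, h // 4, w // 2, h // 2)]  # stub: one centered box

def classify_sign(crop, alphabet="ABC"):
    """Map a hand crop to a letter of the sign alphabet (stub logic)."""
    return alphabet[int(crop.mean()) % len(alphabet)]

def frame_to_letters(frame):
    """Two-stage pipeline: detect hand regions, then classify each crop."""
    letters = []
    for (x, y, w, h) in detect_hands(frame):
        crop = frame[y:y + h, x:x + w]   # stage 1 output feeds stage 2
        letters.append(classify_sign(crop))
    return letters

frame = np.zeros((416, 416, 3), dtype=np.uint8)  # dummy camera frame
print(frame_to_letters(frame))  # -> ['A'] for the all-black stub frame
```

The key design point mirrored here is the decoupling of detection from classification: the detector only needs to be general about where hands are, while the classifier only ever sees tight hand crops, which is what lets the system tolerate variation in background, hand color, and speed.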




How to Cite

Mohamed, R. A., & Hussein, K. Q. (2023). Real-Time Hand Gesture Recognition Using YOLO and (Darknet-19) Convolution Neural Networks. International Journal of Innovative Computing, 13(1-2), 73–79.