Sci Rep. 2025 Oct 31;15(1):38149. doi: 10.1038/s41598-025-22070-7.
ABSTRACT
Gesture recognition (GR) is an emerging and wide-ranging area of research. GR is extensively applied in sign language, Immersive game technology, and other computer interfaces, among others. People with visual impairments face challenges in completing tasks, including navigating environments, using technologies, and engaging in social interactions. Additionally, people face challenges in balancing their individuality with the need for protection in their day-to-day work. It is likely to recognize the communication of visually challenged and deaf people by recording their speech, and in comparison, with recent datasets, hence establishing their objectives. The conventional machine learning (ML) model attempts to utilize handcrafted features, but often underperforms in real-time environments. Deep learning (DL) models have become a sensation amongst investigators recently, making conventional ML techniques comparatively old. Therefore, this study presents a new approach, Enhancing Gesture Recognition for the Visually Impaired using Deep Learning and an Improved Snake Optimization Algorithm (EGRVI-DLISOA), in an IoT environment. The EGRVI-DLISOA approach is an advanced GR system powered by DL in an IoT environment, designed to provide real-time interpretation of gestures to assist the visually impaired. Initially, the EGRVI-DLISOA technique utilizes the Sobel filter (SF) technique for the noise elimination process. For feature extraction, the SqueezeNet model is utilized due to its efficiency in capturing meaningful features from complex visual data. For an accurate GR process, the long short-term memory (LSTM) approach is implemented. To fine-tune the hyperparameter values of the LSTM classifier, the improved snake optimization algorithm (ISOA) is utilized. The experimentation of the EGRVI-DLISOA technique is investigated under the hand gestures dataset. 
A comparative study showed that the EGRVI-DLISOA technique achieves a superior accuracy of 98.62% relative to existing models.
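The preprocessing stage above applies a Sobel filter to the input imagery before features are extracted. As a minimal sketch of that step only (the exact preprocessing in the paper is not specified beyond naming the Sobel filter, and the function name and padding choice here are assumptions), the 3x3 Sobel gradient magnitude of a grayscale image can be computed with plain NumPy:

```python
import numpy as np

def sobel_magnitude(img: np.ndarray) -> np.ndarray:
    """Gradient magnitude of a 2-D grayscale image using 3x3 Sobel kernels.

    Edge-replicate padding keeps the output the same shape as the input.
    This is an illustrative sketch, not the paper's implementation.
    """
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)  # horizontal-gradient kernel
    ky = kx.T                                  # vertical-gradient kernel
    h, w = img.shape
    padded = np.pad(img.astype(float), 1, mode="edge")
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    # Correlate each kernel tap with the shifted image (magnitude is
    # unaffected by the correlation-vs-convolution sign difference).
    for i in range(3):
        for j in range(3):
            patch = padded[i:i + h, j:j + w]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    return np.hypot(gx, gy)
```

For example, on a 5x5 image whose right-hand columns are bright, the response is zero in the flat regions and strong along the vertical edge; the SqueezeNet feature extractor would then consume such edge-enhanced frames.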
PMID:41173994 | PMC:PMC12578915 | DOI:10.1038/s41598-025-22070-7