Abstract
Speech recognition technologies have evolved significantly from early rule-based systems to modern deep learning models; however, conventional audio-only approaches remain constrained by noise interference, diverse accents, and speech impairments, which limit their robustness in real-world applications. Recent research highlights the value of multimodal systems that combine auditory and visual cues, with lipreading offering complementary information where audio signals alone may fail. This study proposes the Wearable Audio-Visual Enhanced Speech-recognition System (WAVESS), a conceptual model realized as smart glasses equipped with a microphone array and a miniature camera that captures lip movements. The system integrates audio and video inputs through dedicated preprocessing pipelines: noise reduction and Mel-Frequency Cepstral Coefficient (MFCC) extraction for audio, and lip-region detection and feature extraction for video, before fusing them in a real-time multimodal recognition engine. The fused representation improves recognition accuracy, adaptability, and resilience in challenging conditions such as noisy environments, hearing-impairment contexts, and human–machine interaction scenarios. The model also incorporates connectivity features for wireless or edge-based computation and provides multimodal feedback through augmented-reality overlays, audio, or haptic signals. WAVESS demonstrates the comparative advantage of wearable multimodal systems in accessibility, communication, education, and security applications while addressing scalability and ethical considerations. The conceptual framework establishes a foundation for future prototyping, dataset expansion, and real-world deployment, advancing robust speech recognition research.
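To make the fusion pipeline described above concrete, the following minimal sketch (not from the paper) illustrates feature-level audio-visual fusion, assuming librosa for MFCC extraction; lip_features is a hypothetical stub standing in for the video branch, and simple concatenation stands in for the recognition engine's fusion step.

import numpy as np
import librosa

def audio_features(wav_path: str, n_mfcc: int = 13) -> np.ndarray:
    """Load audio and compute MFCCs, averaged over time into one vector."""
    y, sr = librosa.load(wav_path, sr=16000)          # resample to 16 kHz
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)                          # (n_mfcc,) summary vector

def lip_features(frame_stack: np.ndarray) -> np.ndarray:
    """Hypothetical placeholder for lip-region feature extraction from
    cropped video frames of shape (T, H, W); a real system would run
    landmark detection and a visual encoder here."""
    return frame_stack.reshape(len(frame_stack), -1).mean(axis=0)

def fuse(audio_vec: np.ndarray, visual_vec: np.ndarray) -> np.ndarray:
    """Early (feature-level) fusion by concatenating the two modalities."""
    return np.concatenate([audio_vec, visual_vec])

The fused vector would then feed a downstream classifier or sequence model; concatenation is chosen here only as the simplest fusion strategy consistent with the abstract's description.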

