Harold Harrison and Mazlina Mamat and Farah Wong and Hoe Tung Yew and Racheal Lim and Wan Mimi Diyana Wan Zaki (2025) A vision-based deep learning approach for non-contact vibration measurement using (2+1)D CNN and optical flow. JOURNAL OF VIBROENGINEERING. pp. 1-20. ISSN 1392-8716
Abstract
This paper introduces a proof-of-concept vision-based deep learning approach for vibration measurement, proposing a factorized (2+1)D Convolutional Neural Network (CNN) model to predict four vibration metrics: acceleration, velocity, displacement, and frequency, with a focus on rigid body motion. Unlike conventional neural network models, which focus primarily on frequency prediction, this approach estimates all four metrics simultaneously, offering a comprehensive and cost-effective alternative to traditional contact-based sensors such as accelerometers. The framework relies on the visibility of a training fiducial marker and eliminates the need for calibration in controlled settings, which enhances scalability across specific environments. A curated dataset was generated using a controlled experimental setup comprising a single object in a lab-scale environment, and was augmented synthetically to enhance frequency diversity. An optical flow-based preprocessing algorithm synchronized motion features in the recorded video inputs with the measured vibration labels, improving measurement accuracy. The proposed model achieved an average Mean Absolute Percentage Error (MAPE) of 7.51 % across varying brightness levels and object-camera distances, with acceleration predictions exhibiting the lowest error at 4.84 % and displacement the highest at 8.80 %. Region of Interest (ROI) cropping and multi-section frame extraction were implemented to reduce computational complexity while further enhancing accuracy. These results highlight the framework’s potential for non-invasive vibration analysis, though its generalizability is limited by the single-object dataset. Future work will expand the dataset, integrate multi-sensor inputs, explore marker-less tracking methods, and enable real-time deployment for predictive maintenance and structural health monitoring.
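For readers unfamiliar with the error measure quoted in the abstract, the following is a minimal sketch of how a per-metric MAPE and its average across metrics could be computed. The abstract does not state how the reported average is formed, so the unweighted mean used here is an assumption, and the function names and all numeric values are hypothetical illustrations, not data from the paper.

```python
def mape(measured, predicted):
    """Mean Absolute Percentage Error, in percent.

    Undefined when any measured value is zero, so that case is rejected.
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(m == 0 for m in measured):
        raise ValueError("MAPE is undefined for zero measured values")
    total = sum(abs((m - p) / m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


def average_mape(per_metric):
    """Unweighted mean of per-metric MAPE values (an assumed convention)."""
    return sum(per_metric.values()) / len(per_metric)


# Hypothetical sensor readings and model predictions for two metrics.
measured = {"acceleration": [10.0, 20.0, 40.0], "displacement": [1.0, 2.0]}
predicted = {"acceleration": [9.0, 22.0, 40.0], "displacement": [1.0, 2.0]}

per_metric = {name: mape(measured[name], predicted[name]) for name in measured}
overall = average_mape(per_metric)
```

Because MAPE normalizes each error by the measured value, it lets metrics with different units (acceleration, velocity, displacement, frequency) be compared and averaged on one percentage scale.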
| Item Type: | Article |
|---|---|
| Keyword: | vibration, non-contact, vision, deep learning |
| Subjects: | T Technology > TA Engineering (General). Civil engineering (General) > TA1-2040 Engineering (General). Civil engineering (General) > TA401-492 Materials of engineering and construction. Mechanics of materials |
| Department: | FACULTY > Faculty of Engineering |
| Depositing User: | DG MASNIAH AHMAD - |
| Date Deposited: | 31 Oct 2025 17:19 |
| Last Modified: | 31 Oct 2025 17:19 |
| URI: | https://eprints.ums.edu.my/id/eprint/45594 |

