A Comparative Analysis of CNN and SVM for Static Sign Language Recognition Using MediaPipe Landmarks
DOI: https://doi.org/10.26740/jistel.v1n2.p225-238
Keywords: CNN, SVM, MediaPipe, SIBI, Handsign classification
Abstract
This study presents a direct comparison of a Convolutional Neural Network (CNN) and a Support Vector Machine (SVM) for static Indonesian Sign Language (SIBI) alphabet recognition, using landmarks extracted with MediaPipe. Its primary contribution is a performance benchmark that evaluates the trade-offs between a deep learning model (CNN) and a classical machine learning model (SVM) in accuracy, computational efficiency, and robustness under a unified experimental framework. The evaluation, based on accuracy, F1-score, balanced accuracy, and ROC AUC, reveals divergent performance profiles. The CNN achieved a perfect score (1.00) on every metric, and its learning curve showed stable, effective generalization. The SVM achieved a test accuracy of 80% and a ROC AUC of 0.99, but misclassified some visually similar gestures. The SVM trained far faster, completing its training in under 0.09 seconds, whereas the CNN required approximately 0.5 seconds per epoch. These findings indicate that while the CNN offers superior accuracy, the SVM remains a relevant and efficient alternative for applications with constrained computational resources. The results offer a reference for developers selecting an architecture for real-time sign language recognition systems.
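The three label-based metrics named in the abstract can give different pictures of the same predictions. The sketch below is a minimal, dependency-free illustration of accuracy, balanced accuracy, and macro-averaged F1. The function names and the toy labels ("A", "M", "N") are hypothetical and are not taken from the study's data or code. The averaging scheme (macro) is also an assumption, since the abstract does not state which one was used. ROC AUC is left out because it needs per-class scores rather than hard labels.

```python
from collections import Counter


def _counts(y_true, y_pred):
    """Per-class true positives, false positives, and false negatives."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the sample was not p
            fn[t] += 1  # a sample of true class t was missed
    return tp, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall over the classes that occur in y_true."""
    tp, _, fn = _counts(y_true, y_pred)
    classes = sorted(set(y_true))
    return sum(tp[c] / (tp[c] + fn[c]) for c in classes) / len(classes)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 = 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = _counts(y_true, y_pred)
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels (NOT the study's data): ten samples, three letters,
# with two confusions between the letters "M" and "N".
y_true = ["A", "A", "A", "A", "M", "M", "N", "N", "N", "N"]
y_pred = ["A", "A", "A", "A", "N", "M", "N", "N", "N", "M"]

print("accuracy:", accuracy(y_true, y_pred))
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred))
```

In this toy case the overall accuracy is 0.80 while balanced accuracy and macro F1 are both 0.75, because the errors are concentrated in the two less frequent classes. Reporting several metrics side by side, as the abstract describes, makes that kind of gap visible.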

