Machine learning-based Annotation of Customer-Operator Conversation Clips for Voice Activity Detection
Published in SIU, 2019
This study presents the development of a voice activity detection (VAD) system tested on call center telephony data obtained from our local site. The concept of bag of audio words (BoAW) combined with a naive Bayes classifier was applied to achieve the task. It was formulated as a binary classification problem with speech as the positive class and silence/background noise as the negative class. All the processing was performed on the Mel-frequency cepstral coefficients (MFCCs) extracted from the audio recordings. The results, which are presented as accuracy score and receiver operating characteristics (ROC), indicate an excellent performance of the developed model. The system is to be deployed within our call center to aid data analysis and improve overall efficiency of the center.
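As a rough illustration of the kind of pipeline the abstract describes (frame-level features, a bag-of-audio-words histogram per clip, then a naive Bayes classifier), a minimal sketch might look like the following. It is not the authors' implementation. It assumes synthetic Gaussian vectors in place of real MFCCs, a codebook learned with plain k-means, and a multinomial naive Bayes model; the paper may make different choices. All function names and parameters (`make_clip`, `fit_codebook`, `k=8`, and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_clip(is_speech, n_frames=50, n_mfcc=13):
    """Synthetic stand-in for one clip's MFCC matrix (frames x coefficients)."""
    center = 2.0 if is_speech else -2.0
    return rng.normal(center, 1.0, size=(n_frames, n_mfcc))

def assign(frames, centroids):
    """Index of the nearest codeword (audio word) for every frame."""
    dists = ((frames[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

def fit_codebook(frames, k=8, n_iter=20):
    """Plain k-means over pooled frames; each centroid is one audio word."""
    centroids = frames[rng.choice(len(frames), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(frames, centroids)
        for j in range(k):
            members = frames[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids

def boaw_histogram(clip, centroids):
    """Bag of audio words: count how often each codeword occurs in a clip."""
    return np.bincount(assign(clip, centroids), minlength=len(centroids))

def fit_naive_bayes(X, y, alpha=1.0):
    """Multinomial naive Bayes with Laplace smoothing (an assumed variant)."""
    classes = np.unique(y)
    log_prior = np.log(np.array([(y == c).mean() for c in classes]))
    counts = np.array([X[y == c].sum(axis=0) for c in classes]) + alpha
    log_lik = np.log(counts / counts.sum(axis=1, keepdims=True))
    return classes, log_prior, log_lik

def predict(X, model):
    """Pick the class with the highest posterior log-score for each clip."""
    classes, log_prior, log_lik = model
    return classes[(X @ log_lik.T + log_prior).argmax(axis=1)]

# Label 1 = speech (positive class), 0 = silence/background noise.
y_train = np.array([1, 0] * 20)
y_test = np.array([1, 0] * 10)
train_clips = [make_clip(bool(label)) for label in y_train]
test_clips = [make_clip(bool(label)) for label in y_test]

# Learn the codebook from all training frames, then encode every clip.
codebook = fit_codebook(np.vstack(train_clips))
X_train = np.array([boaw_histogram(c, codebook) for c in train_clips])
X_test = np.array([boaw_histogram(c, codebook) for c in test_clips])

model = fit_naive_bayes(X_train, y_train)
accuracy = (predict(X_test, model) == y_test).mean()
print(f"accuracy on synthetic clips: {accuracy:.2f}")
```

On real recordings, the synthetic `make_clip` step would be replaced by MFCC extraction from audio, and the score would depend on the data. The well-separated toy classes here say nothing about the performance reported in the paper.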
Recommended citation: L.O. Iheme, Ş. Ozan, E. Akagündüz. (2019). "Machine learning-based Annotation of Customer-Operator Conversation Clips for Voice Activity Detection." SIU 2019. http://sukruozan.github.io/files/2019-SIU-2.pdf