APPLICATION OF MEL-FREQUENCY CEPSTRAL COEFFICIENTS IN AUTOMATIC SPEAKER RECOGNITION AS PART OF IOT SOLUTIONS FOR SECURITY AND OPTIMIZATION IN SMART CITIES

DOI:

https://doi.org/10.46793/AlfaTech1.1.05J

Keywords:

Automatic speaker recognition, Mel-frequency cepstral coefficients, MFCCs, Covariance matrix, Exponential, Sigmoidal

Abstract

This paper presents an implementation of automatic speaker recognition that uses feature vectors of 21 mel-frequency cepstral coefficients (MFCCs), intended as part of an IoT-driven solution for enhancing security and optimization in smart cities. Experiments are conducted on the Solo portion of the CHAINS database, which contains 33 unique sentences spoken by each of 36 speakers. The results show that recognition accuracy depends on the choice of training and test data and improves as the test recordings get longer. A comparison of MFCC calculation methods shows that accuracy is generally higher when the frequency-selective bands use a sigmoidal squared-amplitude characteristic rather than an exponential one. Each speaker is modeled by the covariance matrix of the feature vectors extracted from that speaker's recordings, and applying a sigmoid function to the elements of the model yields a 5% increase in recognition accuracy in most cases. These findings highlight the potential of MFCC-based speaker recognition as a scalable, data-driven IoT tool for security, public safety, and resource optimization in smart cities.
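The covariance-based modeling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how a test recording is compared against the speaker models, so the Frobenius distance used here is an assumption, as are all function names and the synthetic 21-dimensional vectors that stand in for real MFCC frames.

```python
import math
import random


def covariance_matrix(vectors):
    """Sample covariance matrix of a list of equal-length feature vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for v in vectors:
        d = [v[i] - mean[i] for i in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += d[i] * d[j]
    return [[c / (n - 1) for c in row] for row in cov]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def speaker_model(vectors, use_sigmoid=True):
    """Speaker model: covariance matrix, optionally passed through a sigmoid."""
    cov = covariance_matrix(vectors)
    if use_sigmoid:
        cov = [[sigmoid(c) for c in row] for row in cov]
    return cov


def model_distance(a, b):
    """Frobenius distance between two models (an assumed comparison measure)."""
    return math.sqrt(
        sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    )


def identify(test_vectors, models):
    """Return the name of the enrolled model closest to the test recording."""
    test_model = speaker_model(test_vectors)
    return min(models, key=lambda name: model_distance(test_model, models[name]))


def synthetic_frames(rng, scale, n_frames=200, dim=21):
    """Hypothetical stand-in for MFCC frames: Gaussian noise with a given spread."""
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(n_frames)]
```

To try it, enroll two synthetic "speakers" with `speaker_model(synthetic_frames(rng, 0.5))` and `speaker_model(synthetic_frames(rng, 1.5))`, store them in a dictionary, and pass fresh frames to `identify`. Real use would replace `synthetic_frames` with 21-coefficient MFCC vectors extracted from speech.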

Published

22-04-2025
