AUTOMATIC KEYWORD DETECTION AND CRIME PREDICTION FROM PHONE CALL ANALYSIS (SEERIUM)

Abstract

Traditional voice recognition systems rely predominantly on anatomical features such as vocal tract geometry and articulatory patterns. This study introduces an alternative approach centered on the acoustic characteristics of speech, with an emphasis on frequency-domain features. The proposed system combines spectral analysis, Mel-Frequency Cepstral Coefficients (MFCCs), and deep learning architectures to perform three tasks: keyword detection, gender classification, and speaker identification. On a custom dataset of real-world, noise-contaminated voice recordings, the model achieved accuracies of 94% for keyword detection, 80.5% for gender classification, and 71% for speaker identification. These results indicate that frequency-based features remain usable under non-ideal conditions and suggest their applicability to privacy-sensitive, locally processed voice recognition systems. A web-based platform was also developed so that users can upload voice recordings and receive immediate analysis results without needing MATLAB. Future work will explore advanced neural architectures and signal enhancement techniques to further improve performance across diverse environments.
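
To illustrate the kind of frequency-domain feature the abstract refers to, the following is a minimal sketch of MFCC extraction for a single audio frame. It is not the study's MATLAB implementation: the 8 kHz sample rate, the 256-sample frame, the 20-filter mel bank, the 13 output coefficients, and all function names are assumptions chosen for illustration, and a synthetic tone stands in for a real recording.

```python
import math

def hz_to_mel(hz):
    # Common mel-scale formula (O'Shaughnessy variant).
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum of one Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(w * math.cos(2 * math.pi * k * i / n) for i, w in enumerate(windowed))
        im = -sum(w * math.sin(2 * math.pi * k * i / n) for i, w in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mel_points = [low + (high - low) * i / (num_filters + 1)
                  for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, num_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising edge
            row[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge, peak at center
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank

def mfcc_frame(frame, sample_rate, num_filters=20, num_coeffs=13):
    """MFCCs of one frame: power spectrum -> mel energies -> log -> DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Floor the energies so the logarithm is always defined.
    log_energies = [math.log(max(sum(f * p for f, p in zip(row, spec)), 1e-10))
                    for row in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_energies))
            for c in range(num_coeffs)]

# Hypothetical input: a 440 Hz tone sampled at an assumed 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]
coeffs = mfcc_frame(frame, sample_rate)
print(len(coeffs))
```

In practice, a recording would be split into overlapping frames and the per-frame coefficient vectors stacked into a matrix that a classifier can consume; a library routine with an FFT would replace the naive DFT used here for self-containment.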

Keywords

Voice recognition, Frequency-domain features, Deep learning, Keyword detection, Speaker identification, MFCCs, Noisy data
