Acoustic Scene Classification with Modulation Spectrogram Features and a Convolutional Recurrent Network
Published in: 11th International Conference on Acoustics and Vibration
Year of publication: 1400 (Solar Hijri)
Document type: Conference paper
Language: English
Length: 7 pages (PDF)
National scientific document ID:
ISAV11_037
Indexing date: 20 Bahman 1400 (9 February 2022)
Abstract:
One of the major objectives of artificial intelligence systems is to make machines aware of their environment. Acoustic scene classification (ASC) aims to identify the acoustic scene in which a sound was recorded. In this paper, we propose a novel feature extraction approach based on the modulation spectrogram instead of the commonly used Mel spectrogram, since the modulation spectrogram provides more discriminative features for classification. We split each recording into several temporal segments and compute the modulation spectrogram of each segment individually. The resulting feature tensors form the input to a Convolutional Long Short-Term Memory (Conv-LSTM) model for classification. The convolutional layers extract the spectral structure of the audio signal, while the LSTM captures the temporal information that is useful for classification. The proposed model outperforms state-of-the-art methods in terms of prediction accuracy on the evaluation data for ASC on the DCASE 2017 dataset.
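The segment-wise feature extraction described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the frame length, hop size, number of segments, or the exact modulation spectrogram definition, so all of these are assumptions. Here the modulation spectrogram is taken to be the magnitude of a second Fourier transform applied along the time axis of the STFT magnitude, and the function names `modulation_spectrogram` and `segment_features` are hypothetical.

```python
import numpy as np


def modulation_spectrogram(segment, frame_len=256, hop=128):
    """Return an (acoustic_bins, modulation_bins) array for one segment.

    Assumed definition: magnitude of the FFT taken over time of each
    acoustic-frequency band of the STFT magnitude.
    """
    n_frames = 1 + (len(segment) - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the segment into overlapping, windowed frames.
    frames = np.stack(
        [segment[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    # First transform: STFT magnitude, shape (n_frames, frame_len // 2 + 1).
    stft_mag = np.abs(np.fft.rfft(frames, axis=1))
    # Remove each band's mean so the zero-modulation bin does not dominate
    # (an assumption of this sketch, not stated in the abstract).
    envelopes = stft_mag - stft_mag.mean(axis=0, keepdims=True)
    # Second transform along time: modulation frequency per acoustic band.
    mod = np.abs(np.fft.rfft(envelopes, axis=0))
    return mod.T


def segment_features(signal, n_segments=4, **kwargs):
    """Split a recording into equal segments and stack their features.

    Returns a tensor of shape (n_segments, acoustic_bins, modulation_bins),
    i.e. a short sequence of 2-D maps of the kind a Conv-LSTM consumes.
    """
    seg_len = len(signal) // n_segments
    # Drop trailing samples so every segment has the same length.
    segments = signal[:seg_len * n_segments].reshape(n_segments, seg_len)
    return np.stack([modulation_spectrogram(s, **kwargs) for s in segments])
```

For a 4-second recording sampled at 8 kHz and the default parameters above, `segment_features` returns a tensor of shape `(4, 129, 31)`. The leading axis is the segment sequence that a recurrent layer would step through, and each `129 x 31` map is what the convolutional layers would process.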
Keywords:
Acoustic scene classification, Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Conv-LSTM, modulation spectrogram.
Authors
Sayeh Mirzaei
School of Engineering Science, College of Engineering, University of Tehran, Tehran, Iran.
Saeedeh Davoudi
School of Engineering Science, College of Engineering, University of Tehran, Tehran, Iran.