Audio Event Detection Based on Deep Learning
First published: 2020-12-30
Abstract: Neural network methods are widely used for audio event detection and tagging. In the Detection and Classification of Acoustic Scenes and Events (DCASE) challenge, an internationally recognized competition, most systems take either the time-domain audio signal or the log-mel spectrogram of the audio as input, and they achieve strong results. This paper presents the 2D-Wave and 2D-Wave-LogMel systems. Relying on the learning capacity of neural networks, the systems take the time-domain signal as input and learn a corresponding frequency-domain representation, which is then combined with the log-mel spectrogram to give a richer representation of the audio signal as input. On the FSD50K dataset, this approach outperforms the baseline system.
Keywords: audio event detection; neural networks; DCASE; FSD50K
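The abstract names two kinds of input: a frequency-like representation learned from the raw waveform, and a log-mel spectrogram. The sketch below illustrates, under stated assumptions, how two such feature maps could be computed and stacked as channels. It is not the paper's architecture, which the abstract does not specify. The function names, the frame and hop sizes, the 64-band setting, and the random `kernels` that stand in for trained convolution weights are all hypothetical.

```python
import numpy as np


def hz_to_mel(f):
    # HTK-style mel scale.
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters spaced evenly on the mel scale, shape (n_mels, n_fft//2 + 1).
    n_bins = n_fft // 2 + 1
    fft_freqs = np.linspace(0.0, sr / 2.0, n_bins)
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, n_bins))
    for i in range(n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def frame_signal(x, n_fft, hop):
    # Slice the waveform into overlapping frames, shape (n_frames, n_fft).
    n_frames = 1 + (len(x) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def log_mel(x, sr, n_fft=1024, hop=512, n_mels=64):
    # Fixed front-end: windowed power spectrum projected onto mel bands, then log.
    frames = frame_signal(x, n_fft, hop) * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power @ mel_filterbank(sr, n_fft, n_mels).T + 1e-10)


def learned_frontend(x, kernels, hop=512):
    # Stand-in for a strided 1-D convolution over the raw waveform:
    # each row of `kernels` plays the role of one learned filter.
    frames = frame_signal(x, kernels.shape[1], hop)
    return np.log(np.abs(frames @ kernels.T) + 1e-10)


def two_channel_features(x, sr, kernels, hop=512):
    # Stack the learned map and the log-mel map as two image-like channels.
    n_mels, n_fft = kernels.shape
    learned = learned_frontend(x, kernels, hop)
    fixed = log_mel(x, sr, n_fft=n_fft, hop=hop, n_mels=n_mels)
    return np.stack([learned, fixed], axis=0)
```

In a trained system the `kernels` would be network parameters updated by backpropagation rather than random values, and the stacked array would feed a 2-D convolutional classifier. Stacking as channels is only one plausible way to combine the two representations.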