Speech enhancement using dictionary learning in the wavelet packet transform domain
|Article code||Publication year||English article length|
|124406||2017||26 pages (PDF)|
The English article contains approximately 15,903 words.
Publisher : Elsevier - ScienceDirect
Journal : Computer Speech & Language, Volume 44, July 2017, Pages 22-47
Sparse coding, as a successful representation method for many signals, has been recently employed in speech enhancement. This paper presents a new learning-based speech enhancement algorithm via sparse representation in the wavelet packet transform domain. We propose sparse dictionary learning procedures for training data of speech and noise signals based on a coherence criterion, for each subband of decomposition level. Using these learning algorithms, self-coherence between atoms of each dictionary and mutual coherence between speech and noise dictionary atoms are minimized along with the approximation error. The speech enhancement algorithm is introduced in two scenarios, supervised and semi-supervised. In each scenario, a voice activity detector scheme is employed based on the energy of sparse coefficient matrices when the observation data is coded over corresponding dictionaries. In the proposed supervised scenario, we take advantage of domain adaptation techniques to transform a learned noise dictionary to a dictionary adapted to noise conditions captured based on the test environment circumstances. Using this step, observation data is sparsely coded, based on the current situation of the noisy space, with low sparse approximation error. This technique has a prominent role in obtaining better enhancement results particularly when the noise is non-stationary. In the proposed semi-supervised scenario, adaptive thresholding of wavelet coefficients is carried out based on the variance of the estimated noise in each frame of different subbands. The proposed approaches lead to significantly better speech enhancement results in comparison with the earlier methods in this context and the traditional procedures, based on different objective and subjective measures as well as a statistical test.
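The coherence quantities mentioned in the abstract can be illustrated with a small sketch. It uses the standard definition of coherence (the largest absolute normalized inner product between two distinct atoms) and is not taken from the paper, whose exact learning objective is not given here. The function names `normalize` and `coherence` and the toy dictionaries are hypothetical, chosen only for illustration.

```python
import math


def normalize(atom):
    """Scale an atom (a list of floats) to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in atom))
    if norm == 0.0:
        raise ValueError("an atom must not be the zero vector")
    return [x / norm for x in atom]


def coherence(dict_a, dict_b=None):
    """Largest absolute normalized inner product between atoms.

    With one argument this is the self-coherence of dict_a (pairs of
    distinct atoms only). With two arguments it is the mutual coherence
    between the atoms of dict_a and the atoms of dict_b.
    Each dictionary is a list of atoms; this layout is an assumption.
    """
    same = dict_b is None
    a = [normalize(v) for v in dict_a]
    b = a if same else [normalize(v) for v in dict_b]
    best = 0.0
    for i, u in enumerate(a):
        for j, v in enumerate(b):
            if same and i == j:
                continue  # skip an atom paired with itself
            best = max(best, abs(sum(x * y for x, y in zip(u, v))))
    return best


# Toy dictionaries (made-up values, not real speech or noise data).
speech_atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
noise_atoms = [[0.0, 0.0, 2.0], [1.0, 1.0, 0.0]]

self_coh = coherence(speech_atoms)
mutual_coh = coherence(speech_atoms, noise_atoms)
print(self_coh, mutual_coh)
```

A low self-coherence means the atoms within one dictionary are less redundant, and a low mutual coherence means speech atoms are less likely to represent noise (and vice versa), which is presumably why the abstract describes minimizing both together with the approximation error.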