This paper presents a statistical method for single-channel speech dereverberation using a variational autoencoder (VAE) for modelling the speech spectra. One popular approach for modelling speech spectra is to use non-negative matrix factorization …
Biophysically realistic models of the cochlea are based on cascaded transmission-line (TL) models which capture longitudinal coupling, cochlear nonlinearities, as well as the human frequency selectivity. However, these models are slow to compute …
Popular neural network-based speech enhancement systems operate on the magnitude spectrogram and ignore the phase mismatch between the noisy and clean speech signals. Recently, conditional generative adversarial networks (cGANs) have shown promise in …
Recent advances in neural network (NN)-based speech enhancement schemes are shown to outperform most conventional techniques. However, the performance of such systems in adverse listening conditions such as negative signal-to-noise ratios and unseen …
Exemplar-based techniques, where the noisy speech is decomposed as a linear combination of the speech and noise exemplars stored in a dictionary, have been successfully used for speech enhancement in noisy environments. This paper extends this …
Deep neural network (DNN) based acoustic modelling has been shown to yield significant improvements over Gaussian Mixture Models (GMM) for a variety of automatic speech recognition (ASR) tasks. In addition, it is also becoming popular to use rich …
We present a novel automatic speech recognition (ASR) scheme which uses the recently proposed noise robust exemplar matching framework for speech enhancement in the front-end. The proposed system employs a GMM-HMM back-end to recognize the enhanced …
Exemplar-based feature enhancement successfully exploits a wide temporal signal context. We extend this technique with hybrid input spaces that are chosen for a more effective separation of speech from background noise. This work investigates the use …
In this paper, we propose a single-channel speech enhancement system based on the noise robust exemplar matching (N-REM) framework using coupled dictionaries. N-REM approximates noisy speech segments as a sparse linear combination of speech and noise …
Deep neural network (DNN) based acoustic modelling has been successfully used for a variety of automatic speech recognition (ASR) tasks, thanks to its ability to learn higher-level information using multiple hidden layers. This paper investigates the …