Classification Of Spontaneous And Scripted Speech For Multilingual Audio
Shahar Elisha, Andrew McDowell, Mariano Beguerisse-Díaz, Emmanouil Benetos
Variational autoencoders are a versatile class of deep latent variable models. They learn expressive latent representations of high dimensional data. However, the latent variance is not a reliable estimate of how uncertain the model is about a given input point. We address this issue by introducing a sparse Gaussian process encoder. The Gaussian process leads to more reliable uncertainty estimates in the latent space. We investigate the implications of replacing the neural network encoder with a Gaussian process in light of recent research. We then demonstrate how the Gaussian Process encoder generates reliable uncertainty estimates while maintaining good likelihood estimates on a range of anomaly detection problems. Finally, we investigate the sensitivity to noise in the training data and show how an appropriate choice of Gaussian process kernel can lead to automatic relevance determination.
Shahar Elisha, Andrew McDowell, Mariano Beguerisse-Díaz, Emmanouil Benetos
A. Ghazimatin, E. Garmash, G. Penha, K. Sheets, M. Achenbach, O. Semerci, R. Galvez, M. Tannenberg, S. Mantravadi, D. Narayanan, O. Kalaydzhyan, D. Cole, B. Carterette, A. Clifton, P. N. Bennett, C. Hauff, M. Lalmas-Roelleke
Amar Ashar, Karim Ginena, Maria Cipollone, Renata Barreto, Henriette Cramer