Language Technologies

Spotify listeners use language to express their needs, whether by typing queries or by speaking the names of songs they would like to hear. Songs and podcasts also contain language that we can understand, classify, and match to user interests. We conduct research on all aspects of language technologies that apply to audio streaming. We help Spotify understand you through conversational, multilingual, and interactive technologies. We also learn the semantics of audio content and creators from language descriptions, including a knowledge graph of entities. We make sure our methods are scalable, and they include approaches to developing and maintaining shared vocabularies and ontologies. Our research areas range from computational linguistics, natural language processing, and speech applications to machine learning applied to all aspects of language.

Latest Language Technologies Publications

September 2023 | CLEF

Cem Mil Podcasts: A Spoken Portuguese Document Corpus For Multi-modal, Multi-lingual and Multi-Dialect Information Access Research

Ekaterina Garmash, Edgar Tanaka, Ann Clifton, Joana Correia, Sharmistha Jat, Winstead Zhu, Rosie Jones, Jussi Karlgren

August 2023 | Interspeech

Lightweight and Efficient Spoken Language Identification of Long-form Audio

Winstead Zhu, Md Iftekhar Tanveer, Yang Janet Liu, Seye Ojumu, Rosie Jones

May 2023 | TheWebConf

Improving Content Retrievability in Search with Controllable Query Generation

Gustavo Penha, Enrico Palumbo, Maryam Aziz, Alice Wang, Hugues Bouchard