Automatic Music Transcription – An Overview

Abstract

The capability of transcribing music audio into music notation is a fascinating example of human intelligence. It involves perception (analyzing complex auditory scenes), cognition (recognizing musical objects), knowledge representation (forming musical structures), and inference (testing alternative hypotheses). Automatic music transcription (AMT), i.e., the design of computational algorithms to convert acoustic music signals into some form of music notation, is a challenging task in signal processing and artificial intelligence. It comprises several subtasks, including multipitch estimation (MPE), onset and offset detection, instrument recognition, beat and rhythm tracking, interpretation of expressive timing and dynamics, and score typesetting.

Related

April 2025 | 2024 IEEE Spoken Language Technology Workshop (SLT)

Classification Of Spontaneous And Scripted Speech For Multilingual Audio

Shahar Elisha, Andrew McDowell, Mariano Beguerisse-Díaz, Emmanouil Benetos

November 2024 | SIAM Journal on Mathematics of Data Science

Topological Fingerprints for Audio Identification

Wojciech Reise, Ximena Fernández, Maria Dominguez, Heather A. Harrington, Mariano Beguerisse-Díaz

August 2023 | Interspeech

Lightweight and Efficient Spoken Language Identification of Long-form Audio

Winstead Zhu, Md Iftekhar Tanveer, Yang Janet Liu, Seye Ojumu, Rosie Jones