Undergraduate ypec-2024

UG15 – Pop Music to MIDI Piano Cover Generation

Automatic Music Transcription (AMT) of piano music is a well-established research area. This paper addresses the broader problem of automatically generating MIDI piano covers of pop music. The task is more challenging because pop music spans diverse styles and instrumentation. In addition, there is a lack of large-scale datasets of paired, synchronized audio waveforms and piano MIDI files. In this paper, we build a pipeline for data collection and synchronization using music information retrieval (MIR) techniques, resulting in a dataset of 3000 paired audio and piano MIDI samples. We propose two metrics for measuring the quality of synchronization, which are used to filter out poorly synchronized samples. Using a Transformer neural network, our model generates a piano cover song solely from the raw audio, without any external input. We design an evaluation metric that scores the quality of a generated cover song against the ground-truth labels. Our method outperforms Pop2Piano, a recent similar work by Choi and Lee, on this metric.
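The abstract does not say which MIR technique the synchronization step uses. As an illustration only, the sketch below shows dynamic time warping (DTW), a common way to align two feature sequences such as frame-level features of a pop recording and of its piano cover. The function name `dtw_path` and the toy one-dimensional features are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def dtw_path(a, b):
    """Align two feature sequences with classic DTW (illustrative sketch).

    a: array of shape (n, d), b: array of shape (m, d).
    Returns the warping path as (index_in_a, index_in_b) pairs
    and the total accumulated alignment cost.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)

    # Pairwise Euclidean distance between every frame of a and every frame of b.
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

    # acc[i, j] = cheapest cost of aligning a[:i] with b[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1],  # advance in both sequences
                acc[i - 1, j],      # advance in a only
                acc[i, j - 1],      # advance in b only
            )

    # Backtrack from the end to recover the warping path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (acc[i - 1, j - 1], i - 1, j - 1),
            (acc[i - 1, j], i - 1, j),
            (acc[i, j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return path, float(acc[n, m])

# Toy example: the second sequence holds its first frame one step longer.
path, total = dtw_path([[0.0], [1.0], [2.0]], [[0.0], [0.0], [1.0], [2.0]])
```

In a real pipeline the inputs would be frame-level features (for example chroma vectors) rather than toy scalars, and the warping path would be used to re-time the MIDI notes onto the audio timeline. The two synchronization-quality metrics proposed in the paper are not reproduced here.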
