Transcription and analysis

tldr: a ready-to-use pipeline is demucs + basic pitch + omnizart drum + omnizart chord + all-in-one (+ sheet sage chords). Or buy a RipX license, it's worth it.

Audio to midi

RipX

Suppose you have a wav/mp3 of Western music and you want to produce a midi.

Well, take RipX DeepRemix and you'll get a certain quality.

Questions:

  • Does it use spleeter? Is demucs better?
  • How does it draw precise Melodyne-like pitch contours? I see them in the Basic Pitch demo, but are they lost on midi export?
  • How does it estimate midi notes?
  • Does it extract midi notes first and then Q-filter a certain part over them for playback? Or how else does it split a part into harmonics/notes in a polyphonic texture?

The rest of this doc focuses on how to build your own RipX from existing tools.

Demixing

Demucs v4 (2022) ▶️ splits a track into four stems: bass, drums, vocals and other. Previously spleeter (2020) was widely used; it can also produce an additional fifth stem with piano.
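A minimal sketch of driving the separation from Python, assuming the `demucs` package is installed and runnable as `python -m demucs` with its `-n` (model) and `-o` (output folder) options. The helper names, the `htdemucs` default and the `separated/<model>/<track>/<stem>.wav` output layout are assumptions based on Demucs v4 defaults; check them against your installed version.

```python
import subprocess
import sys
from pathlib import Path

STEMS = ("bass", "drums", "vocals", "other")  # the four stems named above


def demucs_command(track, out_dir="separated", model="htdemucs"):
    """Build the command line for one track (nothing is executed here)."""
    return [sys.executable, "-m", "demucs", "-n", model, "-o", str(out_dir), str(track)]


def expected_stems(track, out_dir="separated", model="htdemucs"):
    """Paths where the stems are expected to land (assumed default layout)."""
    base = Path(out_dir) / model / Path(track).stem
    return {stem: base / f"{stem}.wav" for stem in STEMS}


def separate(track, out_dir="separated", model="htdemucs"):
    """Run Demucs and return only the stems that exist afterwards."""
    subprocess.run(demucs_command(track, out_dir, model), check=True)
    stems = expected_stems(track, out_dir, model)
    return {name: path for name, path in stems.items() if path.exists()}
```

Calling `separate("song.mp3")` would then hand you one file per stem to feed into the later steps; the drum stem goes to drum transcription and the rest to pitch recognition.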

Beat tracking

Next, you want beats and downbeats (measure starts) as millisecond timestamps and, ideally, a verse/chorus/bridge form annotation. Since July 2023 you can get all of that from All-in-One (2023) 🤗.

There's also a nice visualizer.

Questions:

  • does All-in-One improve madmom significantly?
  • is Spotify's API any good?
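Whichever tracker you pick, the downstream step looks the same: turn beat and downbeat times into millisecond timestamps and group the beats into measures. A minimal sketch, assuming the tracker returns times in seconds as plain lists (for All-in-One that would be something like the `beats` and `downbeats` fields of the result of `allin1.analyze(path)`, which is an assumption to verify against the version you install). `to_ms` and `group_into_measures` are hypothetical helper names.

```python
def to_ms(times_s):
    """Convert timestamps in seconds to integer milliseconds."""
    return [round(t * 1000) for t in times_s]


def group_into_measures(beats_ms, downbeats_ms):
    """Split beats into measures; each measure starts at a downbeat.

    Beats before the first downbeat (a pickup) form their own leading group.
    """
    starts = sorted(downbeats_ms)
    measures, current, i = [], [], 0
    for beat in sorted(beats_ms):
        # Close the current measure each time a downbeat is reached.
        while i < len(starts) and beat >= starts[i]:
            if current:
                measures.append(current)
            current = []
            i += 1
        current.append(beat)
    if current:
        measures.append(current)
    return measures


beats = to_ms([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5])
downbeats = to_ms([1.0, 3.0])
measures = group_into_measures(beats, downbeats)
# measures == [[500], [1000, 1500, 2000, 2500], [3000, 3500]]
```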

Pitch recognition

For each of the non-drum parts you want pitch recognition. The general-purpose SOTA is Basic Pitch.

Other:

  • Sheet Sage (2022) for melody, although when I tested it the results were far from great
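A minimal sketch of the per-stem step with Basic Pitch, assuming the `basic-pitch` package and its `predict` function, which returns the raw model output, a `pretty_midi.PrettyMIDI` object and a list of note events. The tuple layout `(start_s, end_s, midi_pitch, amplitude, pitch_bends)` and both helper names are assumptions, and the thresholds are arbitrary illustration values, not recommended settings.

```python
def transcribe_stem(audio_path, midi_path):
    """Run Basic Pitch on one stem and save the MIDI (requires basic-pitch)."""
    from basic_pitch.inference import predict  # imported lazily: heavy dependency

    _model_output, midi_data, note_events = predict(audio_path)
    midi_data.write(midi_path)  # midi_data is a pretty_midi.PrettyMIDI object
    return note_events


def drop_weak_notes(note_events, min_duration_s=0.06, min_amplitude=0.3):
    """Remove very short or very quiet notes, which may be spurious detections."""
    return [
        ev for ev in note_events
        if (ev[1] - ev[0]) >= min_duration_s and ev[3] >= min_amplitude
    ]
```

Filtering on the note events rather than on the exported midi keeps the amplitude and pitch-bend information around for as long as possible.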

Piano

Historically, a lot of effort went specifically into solo piano recognition, so now we have Onsets and Frames (2018) 🐟.

Automatic drum tracking

Assemble midi

Doing it straight away via MT3 gives very bizarre results.

So: get your parts Basic Pitched, get your drums separately, get your downbeat timings, and then merge the parts back together with GPT-4-written mido scripts, midisox, or Ableton (XML-hackable!).

Caveat: Basic Pitch produces midi with a resolution of 480 ticks per beat, whereas omnizart outputs 220. Merging them as-is can lead to errors when using online tools.
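The resolution mismatch can be handled by rescaling delta times before merging. A minimal sketch: `rescale_deltas` is pure Python and converts via absolute tick positions so that rounding errors don't accumulate; `merge_midi` is one assumed way to apply it with mido (`MidiFile`, `MidiTrack`, `msg.copy(time=...)`). Function names and file paths are hypothetical. Tempo and other meta messages are copied unchanged, so files with conflicting tempo maps would still need manual attention.

```python
def rescale_deltas(deltas, src_tpb, dst_tpb):
    """Rescale MIDI delta times from src_tpb to dst_tpb ticks per beat."""
    out, abs_src, prev_dst = [], 0, 0
    for delta in deltas:
        abs_src += delta
        # Round the absolute position, not each delta, to avoid drift.
        abs_dst = round(abs_src * dst_tpb / src_tpb)
        out.append(abs_dst - prev_dst)
        prev_dst = abs_dst
    return out


def merge_midi(paths, out_path, target_tpb=480):
    """Merge the tracks of several MIDI files into one file (requires mido)."""
    import mido  # imported lazily so rescale_deltas works without it

    merged = mido.MidiFile(ticks_per_beat=target_tpb)
    for path in paths:
        src = mido.MidiFile(path)
        for track in src.tracks:
            deltas = rescale_deltas(
                [msg.time for msg in track], src.ticks_per_beat, target_tpb
            )
            new_track = mido.MidiTrack()
            new_track.extend(msg.copy(time=t) for msg, t in zip(track, deltas))
            merged.tracks.append(new_track)
    merged.save(out_path)


# A quarter note, an eighth and a sixteenth at 220 ticks per beat:
rescaled = rescale_deltas([0, 220, 110, 55], src_tpb=220, dst_tpb=480)
# rescaled == [0, 480, 240, 120]
```

A call such as `merge_midi(["bass.mid", "drums.mid"], "merged.mid")` would then put every part on the same 480-tick grid.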

Harmonic analysis: chords, keys

Possibly, with "omnizart chord" and any beat tracking you can already build a clone of Chordify and even improve it by adding form annotation.

There's also a paper by Chordify (2014) with the original references.
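As a rough illustration of the Chordify-clone idea, the sketch below labels every beat with the chord sounding at that moment. It assumes chord segments are already available as `(start_s, end_s, label)` tuples and beats as times in seconds; that is an assumed intermediate format, not necessarily what omnizart writes out, and `chords_per_beat` is a hypothetical helper.

```python
from bisect import bisect_right


def chords_per_beat(chord_segments, beat_times, no_chord="N"):
    """Return [(beat_time, chord_label)] with one entry per beat."""
    segments = sorted(chord_segments)
    starts = [seg[0] for seg in segments]
    labelled = []
    for beat in beat_times:
        # Index of the last segment starting at or before this beat.
        i = bisect_right(starts, beat) - 1
        if i >= 0 and beat < segments[i][1]:
            labelled.append((beat, segments[i][2]))
        else:
            labelled.append((beat, no_chord))  # beat falls outside any segment
    return labelled


chart = chords_per_beat(
    [(0.0, 2.0, "C:maj"), (2.0, 4.0, "G:maj")],
    [0.5, 1.5, 2.0, 3.5, 4.5],
)
```

Grouping the resulting labels by measure (using the downbeats) and by section (using the form annotation) would give the chord chart with form that the paragraph above describes.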

PDF sheet music to MusicXML (Optical Music Recognition, OMR)

It's tough and not solved holistically yet.

Notes