The results seemed impressive until I noticed some of the "Thinking" statements in the UI.
One made it apparent that the model / agent / whatever had read the title from the screenshot and was off searching for existing ABC transcriptions of the piece, "Ode to Joy".
So the whole thing was far less impressive after that: it wasn't reading the score anymore, just reading the title and using the internet to answer my query.