That's fascinating. This sounds like a variational autoencoder. The embeddings, which from my humble point of view (as a trained musician) are a largely unexplored field not really supported by existing theory, are at the same time game-deciding. Have you found a good solution for this?