If I remember correctly, the DeepMind x UCL RL Lecture Series proves the underlying Bellman equation in this video: https://www.youtube.com/watch?v=zSOMeug_i_M
As for games with hidden information, I thought the trick was to concatenate the current observation with the full history of past observations and treat that history as the new state, which makes the process Markovian again (at the cost of an ever-growing state space).
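As a rough sketch of that idea: a wrapper that carries the whole observation history forward as the state. This is illustrative only, assuming a toy environment with a `reset`/`step` interface (the names `ToyEnv` and `HistoryWrapper` are made up, not any particular library's API).

```python
class ToyEnv:
    """Toy partially observable process emitting 0/1 observations."""
    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        obs = self.t % 2                      # single observation is not Markovian
        reward = 1.0 if action == obs else 0.0
        done = self.t >= 3
        return obs, reward, done


class HistoryWrapper:
    """State = full observation history, turning the process into an MDP."""
    def __init__(self, env):
        self.env = env
        self.history = []

    def reset(self):
        self.history = [self.env.reset()]
        return tuple(self.history)            # the history itself is the new state

    def step(self, action):
        obs, reward, done = self.env.step(action)
        self.history.append(obs)
        return tuple(self.history), reward, done
```

In practice you would truncate the history (e.g. frame stacking) or summarize it with a recurrent network, since the full history grows without bound.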