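If you can’t simulate your environment, stick to multi-armed bandits: full reinforcement learning typically needs far more interaction data than a real system can supply.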
https://sites.google.com/view/deep-rl-bootcamp/lectures
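
For readers who haven't met multi-armed bandits, here is a minimal sketch of one common strategy, epsilon-greedy. It is not taken from the linked lectures. The names `epsilon_greedy`, `pull`, and `fake_pull`, and the reward values, are made up for illustration. In practice `pull` would call the real system (show a variant, read back a reward); here a fixed stand-in is used only so the snippet runs.

```python
import random


def epsilon_greedy(pull, n_arms, n_rounds, epsilon=0.1, seed=0):
    """Run an epsilon-greedy bandit; return (pull counts, mean reward per arm).

    `pull(arm)` is a hypothetical callback that plays one arm and returns
    a numeric reward.
    """
    rng = random.Random(seed)
    counts = [0] * n_arms    # how many times each arm was played
    means = [0.0] * n_arms   # running mean reward of each arm

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far
            # (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda a: means[a])

        reward = pull(arm)
        counts[arm] += 1
        # Incremental update of the running mean.
        means[arm] += (reward - means[arm]) / counts[arm]

    return counts, means


# Stand-in for a real system: assumed fixed rewards per arm (illustrative only).
def fake_pull(arm):
    return [0.0, 1.0, 0.5][arm]


counts, means = epsilon_greedy(fake_pull, n_arms=3, n_rounds=1000)
print(counts, means)
```

Each round costs exactly one real interaction and there is no state to model, which is why this setting stays workable without a simulator.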