For those wanting to play w/ the data, there are a lot of resources [0]. I've personally combined older Retrosheet data [1] with modern MLB data for some neat uses, not least to try out tech like Druid (big data, live slicing, etc). E.g., if you wanted data from Sunday's Houston vs Texas game, GDX has tons of XML to parse at [2]. There are plenty of guides that explain what everything is, of course. It has been on my mind to train a TensorFlow model on the existing data to help me win some FanDuel/DraftKings money, but I haven't yet (and I should note the MLB data has restrictions against bulk or commercial use).
0 - https://github.com/baseballhackday/data-and-resources/wiki/R...
1 - http://retrosheet.org/
2 - http://gdx.mlb.com/components/game/mlb/year_2017/month_06/da...
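
If you just want a feel for the XML parsing step, here is a rough sketch using only Python's standard library. Note that the sample document, the element names (`game`, `inning`, `atbat`), the attributes, and the `extract_atbats` helper are all made up for illustration; they are not the real GDX schema, so check one of the guides for the actual field names before adapting it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: these element and attribute names are invented for
# illustration and are NOT the real GDX schema.
SAMPLE_XML = """
<game id="example">
  <inning num="1">
    <atbat batter="101" pitcher="201" event="Single"/>
    <atbat batter="102" pitcher="201" event="Strikeout"/>
  </inning>
  <inning num="2">
    <atbat batter="103" pitcher="202" event="Home Run"/>
  </inning>
</game>
"""


def extract_atbats(xml_text):
    """Flatten nested inning/atbat elements into a list of dicts (one per at-bat)."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for inning in root.iter("inning"):
        for atbat in inning.iter("atbat"):
            rows.append({
                "inning": int(inning.get("num")),
                "batter": atbat.get("batter"),
                "pitcher": atbat.get("pitcher"),
                "event": atbat.get("event"),
            })
    return rows


rows = extract_atbats(SAMPLE_XML)
for row in rows:
    print(row)
```

Flat rows like these are easy to load into whatever you are trying out next (Druid, a dataframe, etc), which is the main reason to flatten the nested XML up front.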