> Interesting, but surely it has to retain some information about the past to be able to know how to update that belief vector?
The belief vector is updated on the fly. When a player makes a move, we combine the current belief vector with our CFR-generated move probabilities using Bayes' rule. Once the belief vector is updated, we throw out all information about the specific move that player took.
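
To make that update concrete, here is a minimal sketch of a single Bayes step. It is not DeepRole's actual code: the function name `update_belief`, the dictionary layout, and the example numbers are all assumptions for illustration. `move_probs[a]` stands in for the strategy's probability of the observed move under role assignment `a`.

```python
def update_belief(belief, move_probs):
    """One Bayes step: posterior(a) is proportional to prior(a) * P(move | a).

    belief:     dict mapping a role assignment to its prior probability
    move_probs: dict mapping a role assignment to the probability of the
                observed move under that assignment (hypothetical stand-in
                for the CFR-generated move probabilities)
    """
    unnormalized = {a: p * move_probs[a] for a, p in belief.items()}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observed move has zero probability under every assignment")
    return {a: v / total for a, v in unnormalized.items()}


# Toy example with two candidate assignments (made-up numbers).
prior = {"A is evil": 0.5, "B is evil": 0.5}
move_probs = {"A is evil": 0.2, "B is evil": 0.6}
posterior = update_belief(prior, move_probs)
```

After this call only `posterior` needs to be kept; the move itself can be discarded, which is the point made above.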
> Outside of directly being on a mission, the only major "good" or "evil" actions are voting for/against missions, once you know if the mission succeeded or failed. Do you just not take that into account?
DeepRole takes all player actions into account - the key to good performance in Avalon is knowing how to interpret the voting/proposal actions of all the players. We explored this in our paper: LogicBot only uses the mission fail results to deduce who is good, and has a lower win rate than DeepRole in all situations.
> If it's literally just a representation of the outcomes of the missions and who went on them, then isn't the Belief Vector just the venn diagram of how every mission went with some iterative statistics laid over it?
While you can tease the "Venn diagram" aspect out of the belief vector (it will assign 0 probability to impossible assignments), it's far richer than that - it weights the possible assignments based on all of the moves it has observed.
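
A toy illustration of the difference, under loud assumptions: three players, exactly one evil player, an evil player who always fails a mission they are on, and invented vote probabilities. None of this is Avalon's real role set or DeepRole's code. The first update is the "Venn diagram" part (ruling assignments out), and the second is the extra weighting from a vote.

```python
def bayes_update(belief, likelihood):
    """Multiply the prior by the likelihood of the observation, then renormalize."""
    unnormalized = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Index i means "player i is the single evil player" (toy setup).
belief = [1 / 3, 1 / 3, 1 / 3]

# A mission with players 0 and 1 failed. Assuming evil always fails,
# the assignment "player 2 is evil" becomes impossible.
fail_likelihood = [1.0, 1.0, 0.0]
after_mission = bayes_update(belief, fail_likelihood)

# Player 0 then approves a proposal. These probabilities of that vote
# under each assignment are made up for illustration.
vote_likelihood = [0.9, 0.3, 0.5]
final_belief = bayes_update(after_mission, vote_likelihood)
```

The mission result alone leaves players 0 and 1 equally suspicious; the vote is what separates them, while the impossible assignment stays at 0.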
In some sense, DeepRole is playing with one of its hands tied behind its back. All it knows about the state of the game is this belief vector, the number of succeeds, the number of fails, and the proposal count. It doesn't know the specific moves that led to that point in the game. The fact that we can summarize everyone's previous moves into this belief vector is somewhat surprising, considering that human players can look back at the game history and reinterpret it for new insights.