noirbot 7y ago

Interesting, but surely it has to retain some information about the past to be able to know how to update that belief vector? Outside of directly being on a mission, the only major "good" or "evil" actions are voting for/against missions, once you know if the mission succeeded or failed. Do you just not take that into account?

It's possible this is in the paper - a lot of the more math/modeling parts went a bit over my head, so feel free to point me to a section to specifically read if I missed out.

If it's literally just a representation of the outcomes of the missions and who went on them, then isn't the Belief Vector just the venn diagram of how every mission went with some iterative statistics laid over it? I would have assumed any regular/competitive players would be fairly good at keeping that mental model themselves, so it's confusing to me that the Agent would be better than that, unless it's essentially saying that the game is played better if you play purely logically and ignore all context, which defeats the fun of playing it?

3 comments · 2 top-level

Detry322 7y ago

> Interesting, but surely it has to retain some information about the past to be able to know how to update that belief vector?

The belief vector is updated on the fly. When a player makes a move, we combine the belief vector with our CFR-generated move probabilities and apply Bayes' rule. Once the belief vector is updated, we throw out all information related to the specific move they took.
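To make that update concrete, here is a minimal sketch of a Bayes' rule step over role assignments. It is not DeepRole's code: the function name `bayes_update`, the three-player toy game, and the likelihood numbers are all made up for illustration, and in the real system the likelihoods would come from the CFR-generated move probabilities.

```python
def bayes_update(belief, move_likelihood):
    """Return the posterior over role assignments after one observed move.

    belief:          dict mapping assignment -> prior probability
    move_likelihood: dict mapping assignment -> P(observed move | assignment)
    """
    # Posterior is proportional to prior times likelihood.
    unnormalized = {a: p * move_likelihood[a] for a, p in belief.items()}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observed move is impossible under every assignment")
    return {a: w / total for a, w in unnormalized.items()}


# Toy game (hypothetical): three players, exactly one is evil.
# Each assignment simply names the evil player.
prior = {"p0_evil": 1 / 3, "p1_evil": 1 / 3, "p2_evil": 1 / 3}

# Made-up probabilities that player 0 rejects a proposal under each assignment.
reject_likelihood = {"p0_evil": 0.8, "p1_evil": 0.3, "p2_evil": 0.3}

posterior = bayes_update(prior, reject_likelihood)
```

After the call, only `posterior` needs to be kept. The move itself (who rejected what) can be discarded, which is the sense in which no per-move history is stored.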

> Outside of directly being on a mission, the only major "good" or "evil" actions are voting for/against missions, once you know if the mission succeeded or failed. Do you just not take that into account?

DeepRole takes all player actions into account - the key to good performance in Avalon is knowing how to interpret the voting/proposal actions of all the players. We explored this in our paper: LogicBot only uses the mission fail results to deduce who is good, and has a lower win rate than DeepRole in all situations.

> If it's literally just a representation of the outcomes of the missions and who went on them, then isn't the Belief Vector just the venn diagram of how every mission went with some iterative statistics laid over it?

While you can tease the "venn diagram" aspect out of the belief vector (it will assign 0 probability to impossible assignments), it's far richer than that - it weights the possible assignments based on all of the moves it has observed.
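The "venn diagram" part on its own can be sketched as a filter over assignments. The snippet below is an illustrative assumption, not code from the paper: it ignores special roles, treats an assignment as just the set of evil players, and the helper names `uniform_belief` and `apply_mission_result` are hypothetical.

```python
from itertools import combinations


def uniform_belief(n_players, n_evil):
    """Uniform prior over every possible set of evil players."""
    assignments = [frozenset(c) for c in combinations(range(n_players), n_evil)]
    return {a: 1 / len(assignments) for a in assignments}


def apply_mission_result(belief, team, n_fails):
    """Zero out assignments that cannot explain the fail cards, then renormalize.

    An assignment is impossible if the team contains fewer evil players
    than the number of fail cards that were played.
    """
    team = set(team)
    filtered = {
        a: (p if len(a & team) >= n_fails else 0.0) for a, p in belief.items()
    }
    total = sum(filtered.values())
    if total == 0:
        raise ValueError("no assignment is consistent with this mission result")
    return {a: p / total for a, p in filtered.items()}


# Five players, two evil. Players 0 and 1 go on a mission and one fail appears.
belief = uniform_belief(n_players=5, n_evil=2)
belief = apply_mission_result(belief, team=[0, 1], n_fails=1)

possible = [a for a, p in belief.items() if p > 0]
```

This only rules assignments in or out. Every surviving assignment stays equally likely, whereas weighting them by observed votes and proposals is the extra information described above.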

In some sense, DeepRole is playing with one of its hands tied behind its back. All it knows about the state of the game is this belief vector, the number of succeeds, the number of fails, and the proposal count. It doesn't know the specific moves that led to that point in the game. The fact that we can summarize everyone's previous moves into this belief vector is somewhat surprising, considering human players can look back at the game history and re-synthesize it for new insights.
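As a rough picture of how little that is, the whole state can be written as a four-field record. The class name `PublicState`, the field names, and the example values are assumptions for illustration only, not DeepRole's actual data structure.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PublicState:
    """Hypothetical summary of a game in progress; note there is no move history."""
    belief: tuple        # one probability per possible role assignment
    succeeds: int        # missions that have succeeded so far
    fails: int           # missions that have failed so far
    proposal_count: int  # proposals made in the current round


# Example values are made up.
state = PublicState(belief=(0.5, 0.25, 0.25), succeeds=1, fails=1, proposal_count=2)
field_names = [f.name for f in fields(PublicState)]
```

Two games that reached the same four values through different votes and proposals would be indistinguishable under this representation.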

chadmeister 7y ago

I think you are exactly right. The algo is simply outmemorizing its human counterparts. This isn't a very good paper at all.

Detry322 7y ago

I don't think this is true. On ProAvalon, human players can see the full history of the game at all times [1], and use it to make decisions. DeepRole, on the other hand, can only use its internal belief state. This belief state is only a summary of what has happened in the game - DeepRole has no way of knowing who went on previous missions, or how people voted, or who proposed what. See above for more detail.

[1] See this video for an example: https://www.youtube.com/watch?v=LKdY4Us0Ci4
