56.1k views
1 vote
Minimax and alpha-beta are great, but they both assume that you are playing against an adversary who makes optimal decisions. As anyone who has ever won tic-tac-toe can tell you, this is not always the case. In this question, you will implement the which is useful for modeling probabilistic behavior of agents who may make suboptimal choices. As with the search and constraint satisfaction problems covered so far in this class, the beauty of these algorithms is their general applicability. To expedite your own development, we've supplied some test cases based on generic trees. You can debug your implementation on small game trees using the command:

1 Answer

3 votes

An example of how to implement Expectimax for a game where the opponent has a 50% chance of playing optimally and a 50% chance of playing randomly is given below:

Python

def expectimax(node, depth, isMaximizingPlayer):

if depth == 0:

return node.utility

if isMaximizingPlayer:

bestValue = -float("inf")

for child in node.children:

bestValue = max(bestValue, expectimax(child, depth - 1, False))

return bestValue

else:

totalValue = 0.0

p = 0.5

for child in node.children:

totalValue += p * expectimax(child, depth - 1, True)

return totalValue

So, to implement Expectimax, the steps are:

  1. Define a utility function
  2. Recursively evaluate game states
  3. Handle chance nodes
User Alexander Fuchs
by
7.9k points