Godot Version
4.2
Question
In this article, there is an MCTS (Monte Carlo Tree Search) implementation in Python. I think I was able to adapt it to GDScript, but I need help understanding this particular method:
def best_child(self, c_param=0.1):
    choices_weights = [(c.q() / c.n()) + c_param * np.sqrt((2 * np.log(self.n()) / c.n())) for c in self.children]
    return self.children[np.argmax(choices_weights)]
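My reading, which is an assumption on my part and not something the article states, is that the list comprehension is just a loop that computes one score per child. Here is a minimal sketch of that reading in plain Python, with `math` swapped in for numpy. The function name `uct_weights` and the `(q, n)` tuples are hypothetical stand-ins for the article's node objects:

```python
import math


def uct_weights(parent_visits, child_stats, c_param=0.1):
    """Return one score per child.

    child_stats is a list of (q, n) pairs: total reward and visit count
    of each child (hypothetical stand-in for c.q() and c.n()).
    """
    weights = []
    for q, n in child_stats:
        # Average reward of this child so far.
        exploitation = q / n
        # Bonus that grows for children visited rarely relative to the parent.
        exploration = c_param * math.sqrt(2 * math.log(parent_visits) / n)
        weights.append(exploitation + exploration)
    return weights
```

If this reading is correct, `choices_weights` is a list with one float per child, not a single number. Both `np.log` and `math.log` are the natural logarithm.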
Since I’m not used to Python, I’m not sure whether I adapted it correctly to GDScript:
func get_best_child(exploration_parameter:float = 0.1) -> MCTSNode:
	var choices_weights:int
	for _c in children:
		var c:MCTSNode = _c
		choices_weights = (c.get_winning_score()/c.get_visits()) + exploration_parameter * sqrt((2 * log(get_visits()) / c.get_visits()))
	return children[max(choices_weights)]
Is that right? Is this method running a for loop over all the children in children and calculating whatever choices_weights would be for each of them? When I run the code, I get an error on the return line saying that max() requires 2 arguments. What would be the GDScript equivalent of argmax()?
NOTE: I changed the name of some methods to be more readable to me.
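For reference, this is what I understand `np.argmax` to do on a one-dimensional list: return the index of the largest value, with the first one winning on ties. Below is a small sketch in plain Python of the loop I would expect to port. The helper name `argmax` is my own, not something taken from the article:

```python
def argmax(values):
    """Return the index of the largest value; the first one wins on ties."""
    if len(values) == 0:
        raise ValueError("argmax of an empty sequence")
    best_index = 0
    for i in range(1, len(values)):
        # Strict comparison keeps the earliest index when values are equal.
        if values[i] > values[best_index]:
            best_index = i
    return best_index
```

With something like this, the last line of the method would index `children` with the position of the best weight, not with the weight itself.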