Channel: Data Science, Analytics and Big Data discussions - Latest topics

Calculating UCB in MCTS


In this article, in iteration 4, for S1, UCB1 is calculated as follows:

10+2*sqrt(ln(3)/2)

Shouldn't it be the following instead?

20+2*sqrt(ln(3)/2)

The UCB1 formula is given as:

$$\mathrm{UCB1}(S_i) = V_i + 2\sqrt{\frac{\ln N}{n_i}}$$

where V_i is the average reward/value of all nodes beneath this node, N is the number of visits to the parent node, and n_i is the number of visits to this node. Does V_i at S1 drop from 20 in iteration 3 to 10 in iteration 4 because S1 has 2 more children in iteration 4? If so, I can't see exactly why. Can someone please explain?
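To make the two candidate calculations concrete, here is a minimal Python sketch. It assumes the usual MCTS bookkeeping, where each node stores a total reward and a visit count and the average value is their ratio. The function name `ucb1` and the specific totals (S1 holding a total reward of 20 after 2 visits, with the parent visited 3 times) are assumptions chosen to reproduce the numbers quoted above, not values confirmed by the article.

```python
import math


def ucb1(total_reward, visits, parent_visits, c=2.0):
    """UCB1 score for a node (hypothetical helper for illustration).

    The exploitation term is the AVERAGE reward, total_reward / visits,
    so it shrinks when visits grow faster than the accumulated reward.
    """
    if visits == 0:
        # Unvisited nodes are conventionally given infinite priority.
        return math.inf
    mean_value = total_reward / visits
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    return mean_value + exploration


# Assumed state at iteration 4: S1 has total reward 20 over 2 visits,
# and its parent has been visited 3 times.
with_average = ucb1(total_reward=20, visits=2, parent_visits=3)

# The alternative from the question: keeping 20 as the value term.
with_total = 20 + 2 * math.sqrt(math.log(3) / 2)

print(f"average-based: {with_average:.4f}")
print(f"total-based:   {with_total:.4f}")
```

Under these assumptions the average-based version matches 10+2*sqrt(ln(3)/2): the value term is 20/2 = 10 because the visit count went from 1 to 2 while the total reward stayed at 20, not because of the number of children as such.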
