Question 48
(Choose 1 answer)
Which exploration strategy selects actions according to a probability distribution that balances the known
rewards with the potential for discovering new rewards?
A. Softmax Action Selection
B. Epsilon-Greedy
C. Upper Confidence Bound (UCB)
D. Temporal Difference Learning