Q48.webp
Sakura_chan

Kizspy Question: 48
(Choose 1 answer)
Which exploration strategy selects actions according to a probability distribution that balances the known rewards with the potential for discovering new rewards?
A. Softmax Action Selection
B. Epsilon-Greedy
C. Upper Confidence Bound (UCB)
D. Temporal Difference Learning
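
For reference, the description in the question matches what is usually called softmax (Boltzmann) action selection: each action is sampled with probability proportional to exp(Q / temperature), so higher-valued actions are favored while lower-valued ones are still tried occasionally. Below is a minimal sketch of that idea, not taken from the course material. The names `softmax_probs`, `softmax_action`, `q_values`, and `temperature` are made up for illustration.

```python
import math
import random


def softmax_probs(q_values, temperature=1.0):
    """Turn estimated action values into selection probabilities.

    Hypothetical helper: `q_values` is a list of value estimates, one per
    action. A higher `temperature` flattens the distribution (more
    exploration); a lower one sharpens it (more exploitation).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    max_q = max(q_values)
    exps = [math.exp((q - max_q) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_action(q_values, temperature=1.0, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    estimates = [1.0, 2.0, 3.0]  # made-up value estimates for three actions
    print(softmax_probs(estimates, temperature=1.0))
    print(softmax_action(estimates, temperature=1.0))
```

By contrast, epsilon-greedy explores uniformly at random with a fixed probability regardless of the value estimates, UCB picks actions deterministically from an optimism bonus, and temporal difference learning is a value-update method rather than an exploration strategy.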

Information

Category
REL301m
Added by
Sakura_chan
Date added
Views
496
Comments
1
Rating
0.00 star(s) 0 ratings
