17.
(Choose 1 answer)
You are training an RNN and find that your weights and activations all take on the value NaN ("Not a Number"). Which of these is the most likely cause of this problem?
A. Vanishing gradient problem
B. Exploding gradient problem
C. ReLU activation function g(.) used to compute g(z), where z is too large
D. Sigmoid activation function g(.) used to compute g(z), where z is too large
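
To reason about the options numerically, the following is a minimal sketch, not a real RNN. It assumes a scalar linear recurrence in float32, so that backpropagating through each time step multiplies the gradient by a single recurrent factor. The names `gradient_magnitudes` and `sgd_step_then_forward`, the learning rate, and the toy weights are all hypothetical choices made for illustration.

```python
import numpy as np


def gradient_magnitudes(recurrent_scale, steps=200):
    """Multiply a unit gradient by the recurrent factor once per time step (float32)."""
    grad = np.float32(1.0)
    scale = np.float32(recurrent_scale)
    history = []
    # Silence overflow/underflow warnings; we want to inspect the values themselves.
    with np.errstate(over="ignore", under="ignore"):
        for _ in range(steps):
            grad = grad * scale
            history.append(float(grad))
    return history


def sgd_step_then_forward(grad_magnitude, lr=0.01):
    """Apply one SGD update with gradients of opposite sign, then one forward pass."""
    weights = np.array([0.5, -0.5], dtype=np.float32)   # toy weights (assumption)
    grads = np.array([grad_magnitude, -grad_magnitude], dtype=np.float32)
    inputs = np.array([1.0, 1.0], dtype=np.float32)
    with np.errstate(over="ignore", invalid="ignore"):
        weights = weights - np.float32(lr) * grads
        activation = np.tanh(weights @ inputs)
    return weights, activation


large = gradient_magnitudes(3.0)   # recurrent factor > 1
small = gradient_magnitudes(0.5)   # recurrent factor < 1

print("factor 3.0, last gradient:", large[-1])
print("factor 0.5, last gradient:", small[-1])
print("forward pass after update (factor 3.0):", sgd_step_then_forward(large[-1])[1])
print("forward pass after update (factor 0.5):", sgd_step_then_forward(small[-1])[1])
```

Comparing the two recurrent factors shows which regime leaves the values finite and which one does not. For options C and D, recall that a sigmoid output is bounded in [0, 1] and a ReLU output equals its input for positive z.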