(Choose 1 answer)
Which of the following statements about Adam is false?
A. We usually use "default" values for the hyperparameters β₁, β₂ and ε in Adam (β₁ = 0.9, β₂ = 0.999, ε = 10⁻⁸)
B. The learning rate hyperparameter α in Adam usually needs to be tuned
C. Adam combines the advantages of RMSProp and momentum.
D. Adam should be used with batch gradient computations, not with mini-batches.
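
For reference, the quantities named in the options can be seen together in a minimal sketch of one Adam update for a single scalar parameter. The helper names `adam_step` and `minimize_quadratic` and the toy objective f(x) = x² are assumptions made for illustration only; this is a sketch of the commonly stated update rule, not any particular library's implementation.

```python
import math


def adam_step(param, grad, m, v, t,
              alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter (hypothetical helper)."""
    # Momentum-style running average of the gradient (first moment).
    m = beta1 * m + (1 - beta1) * grad
    # RMSProp-style running average of the squared gradient (second moment).
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction for the zero-initialised averages; t counts from 1.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Step scaled by the learning rate alpha and the second-moment estimate.
    param = param - alpha * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


def minimize_quadratic(x0=5.0, steps=2000, alpha=0.01):
    """Run Adam on the toy objective f(x) = x**2 (assumed for illustration)."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * x  # derivative of x**2
        x, m, v = adam_step(x, grad, m, v, t, alpha=alpha)
    return x
```

Calling `minimize_quadratic()` moves `x` from its starting value toward the minimizer at 0. In this sketch only `alpha` is changed from its default, while `beta1`, `beta2` and `eps` keep the values listed in option A.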