1
(Choose 1 answer)
Compared to the encoder-decoder model below (which does not use an attention mechanism), we expect the attention model to have the greatest advantage when:
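[Figure: encoder-decoder model without attention. The encoder starts from a<0> and reads the inputs x<1>, ..., x<Tx>; the decoder produces the outputs y<1>, ..., y<Ty>.]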
a. The input sequence length Tx is large.
b. The input sequence length Tx is small.