Best Practices: Deep Learning for Natural Language Processing (Part II)

August 20, 2017 · 待字闺中 · Sebastian Ruder

(continued from Part I)

Attention

Attention is most commonly used in sequence-to-sequence models to attend to encoder states, but can also be used in any sequence model to look back at past states. Using attention, we obtain a context vector