2014
Seq2Seq
Encoder compresses the whole source sequence into one fixed-length vector.
2014
Bahdanau attention
Learned soft alignment per decoded token.
2015
Show, Attend & Tell
Attention over CNN feature grids for image captioning.
2015
Luong attention
Dot, general & concat scoring; global vs. local attention windows.
2015
Listen, Attend and Spell
End-to-end attention-based speech recognition.
2015
End-to-End MemNets
Multi-hop soft attention over a memory bank.
2016
GNMT
Attention + deep LSTMs in production translation.
2017
Transformer
Drops recurrence; scaled dot-product attention, multiple heads, positional encoding.
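
The scaled dot-product step named in the last entry can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: it covers a single head with no masking, batching, or learned projections, and the function names `softmax` and `scaled_dot_product_attention` are hypothetical helpers chosen for this example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention over lists of vectors (illustrative sketch).

    queries: list of vectors, each of length d_k
    keys:    list of vectors, each of length d_k
    values:  list of vectors, one per key
    Returns one output vector per query.
    """
    d_k = len(keys[0])
    scale = math.sqrt(d_k)
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        # Soft alignment: weights are non-negative and sum to 1.
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs
```

When every key is identical, the scores tie and the output reduces to the plain mean of the values, which makes a quick sanity check. Multi-head attention, in this picture, runs several such computations on different learned projections of the inputs and concatenates the results.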