
Luong, M.-T., Pham, H., and Manning, C. D. Effective approaches to attention-based neural machine translation. In Conference on Empirical Methods in Natural Language Processing (2015).

Motivation

The paper examines two classes of attentional mechanism for neural machine translation (NMT): a global attention model, which attends to all source words, and a local attention model, which attends to a subset of source words at a time, with the goal of improving attention-based NMT.

Approach

The proposed model is trained with the standard NMT objective.
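As given in the paper, with $\mathbb{D}$ denoting the parallel training corpus, the objective is

$$J_t = \sum_{(x, y) \in \mathbb{D}} -\log p(y \mid x).$$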

Based on a stacking LSTM architecture, they introduce a variable-length alignment vector $a_t$ for the two kinds of attentional mechanism. In both, a context vector $c_t$ is computed as a weighted average of the source hidden states and is combined with the target hidden state $h_t$ to predict the current word.
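As defined in the paper, the attentional hidden state and the output distribution are

$$\tilde{h}_t = \tanh\!\left(W_c [c_t; h_t]\right), \qquad p(y_t \mid y_{<t}, x) = \mathrm{softmax}(W_s \tilde{h}_t).$$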

In the global attention model, the size of the alignment vector "equals the number of time steps on the source side": each weight compares the current target hidden state $h_t$ with one source hidden state $\bar{h}_s$.
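Reproduced from the paper:

$$a_t(s) = \mathrm{align}(h_t, \bar{h}_s) = \frac{\exp\left(\mathrm{score}(h_t, \bar{h}_s)\right)}{\sum_{s'} \exp\left(\mathrm{score}(h_t, \bar{h}_{s'})\right)}$$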

In the local attention model, an aligned position $p_t$ is generated for each target word, and attention is restricted to the window $[p_t - D, p_t + D]$; the size of $a_t$ therefore equals the window size $2D + 1$ and is fixed.

Also, the paper introduced two variants of the local attention model: monotonic alignment (local-m), which simply sets $p_t = t$, and predictive alignment (local-p), which predicts $p_t$ and favors alignment points near it.
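As given in the paper, with $S$ the source sentence length and $\sigma = D/2$, local-p predicts the aligned position and reweights the alignment weights with a Gaussian centered around $p_t$:

$$p_t = S \cdot \mathrm{sigmoid}\!\left(v_p^\top \tanh(W_p h_t)\right), \qquad a_t(s) = \mathrm{align}(h_t, \bar{h}_s) \exp\!\left(-\frac{(s - p_t)^2}{2\sigma^2}\right)$$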

The score function used by both the global and local attention models is "referred as a content-based function", with three alternatives.
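From the paper:

$$\mathrm{score}(h_t, \bar{h}_s) = \begin{cases} h_t^\top \bar{h}_s & \text{dot} \\ h_t^\top W_a \bar{h}_s & \text{general} \\ v_a^\top \tanh\!\left(W_a [h_t; \bar{h}_s]\right) & \text{concat} \end{cases}$$

Below is a minimal NumPy sketch of global attention with these three score functions. The dimensions, the random weights, and the helper names (`score`, `softmax`) are illustrative assumptions, not part of the paper.

```python
import numpy as np

# Toy dimensions and random weights: illustrative assumptions only.
rng = np.random.default_rng(0)
d, S = 4, 6                        # hidden size, source length (assumed)
h_bar = rng.normal(size=(S, d))    # source hidden states h_bar_s
h_t = rng.normal(size=d)           # current target hidden state h_t

W_a = rng.normal(size=(d, d))        # weight for the "general" score
W_cat = rng.normal(size=(d, 2 * d))  # weight for the "concat" score
v_a = rng.normal(size=d)

def score(h_t, h_bar_s, variant):
    """The three content-based score functions from the paper."""
    if variant == "dot":
        return h_t @ h_bar_s
    if variant == "general":
        return h_t @ W_a @ h_bar_s
    if variant == "concat":
        return v_a @ np.tanh(W_cat @ np.concatenate([h_t, h_bar_s]))
    raise ValueError(variant)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Global attention: alignment weights over all source time steps,
# then the context vector as a weighted average of source states.
a_t = softmax(np.array([score(h_t, h_bar[s], "general") for s in range(S)]))
c_t = a_t @ h_bar
```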

The paper also proposed an input-feeding approach, in which the attentional vector $\tilde{h}_t$ is concatenated with the input at the next time step, so that past alignment information is taken into account in subsequent alignment decisions.
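A toy sketch of the input-feeding loop follows, assuming a single tanh layer as a stand-in for the stacked LSTM and the "general" score for global attention; all names and dimensions here are hypothetical, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
d, S, T = 4, 6, 3
h_bar = rng.normal(size=(S, d))      # source hidden states
embeds = rng.normal(size=(T, d))     # target input embeddings (assumed)

W_dec = 0.1 * rng.normal(size=(d, 2 * d))  # stand-in for the LSTM update
W_a = rng.normal(size=(d, d))              # "general" attention score
W_c = 0.1 * rng.normal(size=(d, 2 * d))    # combines [c_t; h_t]

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

h_tilde = np.zeros(d)  # h_tilde_0, fed into the first decoder step
for t in range(T):
    # Input feeding: concatenate the previous attentional vector
    # with the current input embedding.
    x = np.concatenate([embeds[t], h_tilde])
    h_t = np.tanh(W_dec @ x)           # decoder hidden state (toy update)
    a_t = softmax(h_bar @ W_a @ h_t)   # global alignment weights
    c_t = a_t @ h_bar                  # context vector
    h_tilde = np.tanh(W_c @ np.concatenate([c_t, h_t]))  # attentional vector
```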

Limitation