Luong, M.-T., Pham, H., and Manning, C. D. Effective approaches to attention-based neural machine translation. In Conference on Empirical Methods in Natural Language Processing (2015).
The paper examines two classes of attentional mechanisms for improving attention-based neural machine translation (NMT): a global attention model, which attends to all source words, and a local attention model, which only looks at a subset of source words at a time.
The proposed model is trained with the objective function

$$J_t = \sum_{(x, y) \in \mathbb{D}} -\log p(y \mid x),$$

where $\mathbb{D}$ is the parallel training corpus.
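As a concrete illustration, here is a minimal NumPy sketch of this objective for a single sentence pair; the vocabulary size, logits, and target indices are toy stand-ins, not values from the paper.

```python
import numpy as np

def nll_loss(logits, targets):
    """Negative log-likelihood -log p(y | x) for one target sentence.

    logits:  (T, V) unnormalized decoder scores, one row per target step
    targets: (T,)   gold word indices y_1 ... y_T
    """
    # Numerically stable log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Sum of -log p(y_t | y_<t, x) over all target positions.
    return -log_probs[np.arange(len(targets)), targets].sum()

# Toy example: 3 target words over a 5-word vocabulary.
rng = np.random.default_rng(0)
print(nll_loss(rng.normal(size=(3, 5)), np.array([1, 0, 4])))
```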
Building on a stacked LSTM encoder-decoder, they introduce a variable-length alignment vector $a_t$ for the two kinds of attentional mechanism. Under global attention, $a_t$ is derived by comparing the current target hidden state $h_t$ with each source hidden state $\bar{h}_s$:

$$a_t(s) = \mathrm{align}(h_t, \bar{h}_s) = \frac{\exp\big(\mathrm{score}(h_t, \bar{h}_s)\big)}{\sum_{s'} \exp\big(\mathrm{score}(h_t, \bar{h}_{s'})\big)},$$
where the size of the alignment vector "equals the number of time steps on the source side."
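To make the shapes concrete, here is a minimal NumPy sketch of global attention using the dot score (one of three score variants, listed further below); the dimensions and the names `h_t`, `h_src` are illustrative choices, not fixed by the paper.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax.
    e = np.exp(x - x.max())
    return e / e.sum()

def global_attention(h_t, h_src):
    """Global attention with the dot score.

    h_t:   (d,)   current target hidden state
    h_src: (S, d) all source hidden states, one row per source step
    Returns the alignment vector a_t (one weight per source time step)
    and the context vector c_t.
    """
    scores = h_src @ h_t   # score(h_t, h_bar_s) = h_t . h_bar_s
    a_t = softmax(scores)  # alignment weights over the whole source
    c_t = a_t @ h_src      # context = weighted average of source states
    return a_t, c_t

rng = np.random.default_rng(0)
a_t, c_t = global_attention(rng.normal(size=4), rng.normal(size=(6, 4)))
print(a_t.shape, c_t.shape)  # (6,) (4,) -- a_t spans all 6 source steps
```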
Under local attention, the model first predicts an aligned position $p_t$ for each target word and attends only to source positions within the window $[p_t - D, p_t + D]$, so the size of $a_t$ equals the window size $2D + 1$.
Also, the paper introduces two variants of the local attention model: monotonic alignment (local-m), which simply sets $p_t = t$, and predictive alignment (local-p), which predicts the aligned position as $p_t = S \cdot \mathrm{sigmoid}\big(v_p^\top \tanh(W_p h_t)\big)$, where $S$ is the source sentence length. The alignment weights of local-p are then focused around $p_t$ by a Gaussian:

$$a_t(s) = \mathrm{align}(h_t, \bar{h}_s)\, \exp\!\left(-\frac{(s - p_t)^2}{2\sigma^2}\right),$$

with $\sigma = D/2$.
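Here is a corresponding sketch of local-p under the same illustrative conventions; `W_p` and `v_p` stand in for the learned position-prediction parameters, and the dot score is again assumed.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def local_p_attention(h_t, h_src, W_p, v_p, D=2):
    """Local attention with predictive alignment (local-p).

    h_t:   (d,)   current target hidden state
    h_src: (S, d) source hidden states
    W_p:   (d, d), v_p: (d,)  position-prediction parameters
    D:     half window size; a_t covers at most 2D + 1 source positions
    """
    S = len(h_src)
    # Predicted aligned position p_t in [0, S].
    p_t = S * sigmoid(v_p @ np.tanh(W_p @ h_t))
    # Attend only within the window [p_t - D, p_t + D].
    lo, hi = max(0, int(p_t) - D), min(S, int(p_t) + D + 1)
    window = h_src[lo:hi]
    scores = window @ h_t                  # dot score inside the window
    e = np.exp(scores - scores.max())
    align = e / e.sum()
    # Gaussian centered at p_t favors positions near the predicted alignment.
    sigma = D / 2.0
    s = np.arange(lo, hi)
    a_t = align * np.exp(-((s - p_t) ** 2) / (2 * sigma ** 2))
    c_t = a_t @ window                     # context over the window only
    return p_t, a_t, c_t

rng = np.random.default_rng(0)
d, S = 4, 10
p_t, a_t, c_t = local_p_attention(rng.normal(size=d), rng.normal(size=(S, d)),
                                  rng.normal(size=(d, d)), rng.normal(size=d))
print(round(p_t, 2), a_t.shape)  # window has at most 2*D + 1 = 5 weights
```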
For both the global and local attention models, the score function, "referred to as a content-based function," comes in three variants:

$$\mathrm{score}(h_t, \bar{h}_s) = \begin{cases} h_t^\top \bar{h}_s & \text{dot} \\ h_t^\top W_a \bar{h}_s & \text{general} \\ v_a^\top \tanh\big(W_a [h_t; \bar{h}_s]\big) & \text{concat} \end{cases}$$
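A quick sketch of the three variants; `W_a` and `v_a` are the learned parameters from the equations above, replaced here by random stand-ins with matching shapes.

```python
import numpy as np

def score_dot(h_t, h_s):
    # dot: h_t^T h_bar_s
    return h_t @ h_s

def score_general(h_t, h_s, W_a):
    # general: h_t^T W_a h_bar_s
    return h_t @ W_a @ h_s

def score_concat(h_t, h_s, W_a, v_a):
    # concat: v_a^T tanh(W_a [h_t; h_bar_s])
    return v_a @ np.tanh(W_a @ np.concatenate([h_t, h_s]))

rng = np.random.default_rng(0)
d = 4
h_t, h_s = rng.normal(size=d), rng.normal(size=d)
print(score_dot(h_t, h_s),
      score_general(h_t, h_s, rng.normal(size=(d, d))),
      score_concat(h_t, h_s, rng.normal(size=(d, 2 * d)), rng.normal(size=d)))
```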
The paper also proposes an input-feeding approach to take past alignment information into account in future alignment decisions: the attentional vector $\tilde{h}_t = \tanh\big(W_c [c_t; h_t]\big)$, which combines the context vector $c_t$ with the hidden state $h_t$, is concatenated with the input at the next time step, making the model aware of previous alignment choices.
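Finally, a toy decoding loop that sketches input-feeding: the previous attentional vector $\tilde{h}_{t-1}$ is concatenated with the current target embedding before entering the decoder RNN. The single-layer tanh cell is a simplification of the paper's stacked LSTM, and all parameter names here are illustrative.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def decode_with_input_feeding(embeds, h_src, params, d):
    """Toy decoder where [embedding; h_tilde_{t-1}] is the RNN input."""
    W_in, W_h, W_c = params      # illustrative stand-ins for learned weights
    h = np.zeros(d)              # decoder hidden state
    h_tilde = np.zeros(d)        # attentional vector, fed back each step
    outputs = []
    for x in embeds:
        inp = np.concatenate([x, h_tilde])  # input-feeding concatenation
        h = np.tanh(W_in @ inp + W_h @ h)   # simplified RNN cell (not LSTM)
        a = softmax(h_src @ h)              # global attention weights
        c = a @ h_src                       # context vector c_t
        h_tilde = np.tanh(W_c @ np.concatenate([c, h]))  # h~_t = tanh(W_c [c_t; h_t])
        outputs.append(h_tilde)
    return np.stack(outputs)

rng = np.random.default_rng(0)
d, S, T = 4, 6, 3
params = (rng.normal(size=(d, 2 * d)), rng.normal(size=(d, d)),
          rng.normal(size=(d, 2 * d)))
out = decode_with_input_feeding(rng.normal(size=(T, d)),
                                rng.normal(size=(S, d)), params, d)
print(out.shape)  # (3, 4): one attentional vector per target step
```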