Due October 17, 2024 at 11:59 PM
In this assignment, you will be building a sequence-to-sequence neural machine translation model.
If you have a clone of the repository from a previous homework:
git pull origin master
Alternatively, get a fresh copy:
git clone https://github.com/xutaima/jhu-mt-hw
In this assignment, you will be building a basic NMT model with attention. The next assignment, which builds on this one, will add extensions and speedups.
This code is based on the tutorial by Sean Robertson found here. Students MAY NOT view that tutorial or use it as a reference in any way.
Your task is to implement this paper, which describes neural machine translation with attention. As in the paper, you should also implement a visualization of the attention mechanism and discuss selected plots in your writeup.
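To get you started on the plots, here is a minimal sketch of an attention heatmap using matplotlib. The names attentions, input_tokens, and output_tokens are placeholders, not part of the starter code; adapt them to however you collect the decoder's attention weights.

import matplotlib.pyplot as plt
import numpy as np

def plot_attention(attentions, input_tokens, output_tokens, path="attention.png"):
    """Save a heatmap of attention weights for one translated sentence.

    attentions: array of shape (len(output_tokens), len(input_tokens)).
    """
    fig, ax = plt.subplots()
    im = ax.imshow(np.asarray(attentions), cmap="bone")
    fig.colorbar(im, ax=ax)
    # Label the axes with the tokens so alignments are easy to read off.
    ax.set_xticks(range(len(input_tokens)))
    ax.set_xticklabels(input_tokens, rotation=90)
    ax.set_yticks(range(len(output_tokens)))
    ax.set_yticklabels(output_tokens)
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)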
The starter code for this assignment is written in PyTorch, a framework for neural networks.
INSTALL_NOTES.txt includes the instructions to install PyTorch inside a conda environment. We have provided instructions that are tested on the cs ugradx machine (which currently runs Fedora release 27). We have also tested this assignment on Ubuntu 14.04.
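One quick sanity check after installation (not part of the starter code) is to confirm that your conda environment can import PyTorch and report its version:

python -c "import torch; print(torch.__version__)"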
The primary file for this assignment is seq2seq.py. Once you have installed PyTorch, you can view the arguments by running:
python seq2seq.py -h
The arguments have reasonable default values for training the initial system (e.g. the file paths to the data should not need to be changed). You can inspect the defaults in the code.
One argument you should note is the load_checkpoint argument. This allows you to load a model that was saved during a previous training run (which may be useful if your training script is killed partway through).
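For reference, checkpoints in PyTorch are typically saved and restored with torch.save and torch.load. The sketch below is illustrative only: the names model, optimizer, and checkpoint.pt are hypothetical, and the starter code defines its own checkpoint format, which is what the load_checkpoint argument expects.

import torch

def save_checkpoint(model, optimizer, epoch, path="checkpoint.pt"):
    # Store state dicts rather than the objects themselves so the file
    # remains loadable even if the surrounding code changes.
    torch.save({
        "epoch": epoch,
        "model_state": model.state_dict(),
        "optimizer_state": optimizer.state_dict(),
    }, path)

def resume_from_checkpoint(model, optimizer, path="checkpoint.pt"):
    checkpoint = torch.load(path)
    model.load_state_dict(checkpoint["model_state"])
    optimizer.load_state_dict(checkpoint["optimizer_state"])
    return checkpoint["epoch"]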
The portions of the code you will need to fill in are denoted by “** YOUR CODE HERE **”. Further instructions and references are also in the provided code.