Challenges of Neural Machine Translation

Neural Machine Translation (NMT) is one of the most promising fields in Machine Learning and Deep Learning, as the need to understand multiple languages keeps growing day after day. Although companies like Google and Systran have deployed NMT successfully, efforts targeting low-resource languages, such as DARPA's LORELEI program, still show poor performance. This is largely because of the challenges inherent to NMT. I'll be going over some of the major challenges in this domain.
- Domain Mismatch
This is one of the oldest and best-known challenges in language translation: the same word can be used very differently in different domains. A common way to overcome this obstacle is to fine-tune the model on in-domain text for at least a few epochs before deploying it (a minimal fine-tuning sketch follows this list).
- Amount of training data
Training data for NMT must be parallel, i.e. every sentence in the input language paired with its translation in the target language. According to Google statistics, over 6,500 languages are spoken all over the world, and parallel data for converting from one language to another simply does not exist for most combinations of them.
- Rare Words
Every language has a vast vocabulary, and the training set cannot cover every word. When the model encounters a word it has never seen before, it has no reliable way to translate it, which causes a lot of discrepancies in the output; this can happen even within an in-domain model (a toy illustration follows this list).
- Long Sentences
Neural machine translators have never been good with long sentences: handling them well requires both heavy computation during training and a large training corpus that actually contains such sentences. Since both are often lacking, models tend to perform poorly on the long sentences usually found in speeches.
- Word Alignment
As we all know, the number of words in an input sentence can differ from the number of words in its translation, and the structure of the sentences can differ between the two languages as well. This makes it hard to line up which source words produced which target words, especially for medium-length and long sentences (see the attention sketch after this list).
- Beam Search
An NMT model does not emit a translation in one shot; it generates the output word by word, and at each step beam search keeps only the few most promising partial translations. If the beam is too narrow, the eventual best translation can be pruned away early, and, counterintuitively, making the beam much wider often hurts quality as well, as Koehn and Knowles report. A minimal beam search decoder is sketched below.
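
To make the in-domain fine-tuning remedy from the Domain Mismatch point concrete, here is a minimal sketch using the Hugging Face transformers library. The checkpoint name, the optimizer settings, and the two-sentence "medical" corpus are illustrative assumptions, not a recipe; in practice you would fine-tune on thousands of in-domain sentence pairs.

```python
# Minimal in-domain fine-tuning sketch (assumes: pip install torch transformers sentencepiece).
# The checkpoint and the toy "medical" sentence pair below are illustrative assumptions.
import torch
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-de"        # pretrained general-domain EN->DE model
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# A handful of in-domain sentence pairs; real fine-tuning needs far more.
src_texts = ["The patient was administered 5 mg of the drug."]
tgt_texts = ["Dem Patienten wurden 5 mg des Medikaments verabreicht."]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for epoch in range(3):                            # "a few epochs" of in-domain training
    batch = tokenizer(src_texts, text_target=tgt_texts,
                      return_tensors="pt", padding=True)
    loss = model(**batch).loss                    # cross-entropy over target tokens
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(f"epoch {epoch}: loss={loss.item():.3f}")
```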
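
For the Rare Words point, this toy word-level encoder shows the failure mode directly: every word outside the (made-up) training vocabulary collapses to the same `<unk>` id, so the model cannot even distinguish one unseen word from another. Subword segmentation such as byte-pair encoding is the usual workaround.

```python
# Toy illustration of the rare-word (out-of-vocabulary) problem.
# The vocabulary below is a made-up assumption for demonstration.
vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4, "<unk>": 5}

def encode(sentence: str) -> list[int]:
    """Map each word to its vocabulary id; unseen words collapse to <unk>."""
    return [vocab.get(word, vocab["<unk>"]) for word in sentence.lower().split()]

print(encode("the cat sat on the mat"))         # [0, 1, 2, 3, 0, 4] -- fully covered
print(encode("the ocelot sat on the ottoman"))  # [0, 5, 2, 3, 0, 5] -- two *different*
# rare words become the same <unk> id, so the model cannot tell them apart.
```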
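
For Word Alignment, modern NMT sidesteps a hard one-to-one mapping with attention, which computes a soft alignment between each target position and all source positions. Below is a bare-bones dot-product attention sketch; the vectors are random stand-ins for what a real encoder and decoder would learn.

```python
# Soft-alignment sketch: dot-product attention between one decoder state
# and a variable number of encoder states. All vectors are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
src_len, dim = 5, 8                                # 5 source words; need not match target length
encoder_states = rng.normal(size=(src_len, dim))   # one vector per source word
decoder_state = rng.normal(size=(dim,))            # current target position

scores = encoder_states @ decoder_state            # one relevance score per source word
weights = np.exp(scores - scores.max())
weights /= weights.sum()                           # softmax -> soft alignment over source words
context = weights @ encoder_states                 # weighted mix of source information

print("alignment weights:", np.round(weights, 3)) # sums to 1; no fixed 1-to-1 mapping
```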
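
And for Beam Search, here is a minimal decoder over a hand-written next-token distribution. The three-token vocabulary and the fixed probabilities are assumptions standing in for a trained model's predictions.

```python
# Minimal beam search sketch over a hand-written next-token distribution.
# The vocabulary and probabilities are made-up assumptions, not a real model.
import math

def next_token_probs(prefix: tuple[str, ...]) -> dict[str, float]:
    """Stand-in for the model: a fixed distribution regardless of prefix."""
    return {"<eos>": 0.2, "a": 0.5, "b": 0.3}

def beam_search(beam_size: int = 2, max_len: int = 4):
    beams = [((), 0.0)]                      # (token sequence, log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, logp in beams:
            if seq and seq[-1] == "<eos>":   # finished hypotheses carry over unchanged
                candidates.append((seq, logp))
                continue
            for tok, p in next_token_probs(seq).items():
                candidates.append((seq + (tok,), logp + math.log(p)))
        # Keep only the beam_size most probable partial translations.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams

for seq, logp in beam_search():
    print(" ".join(seq), f"(log-prob {logp:.2f})")
```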
Before finishing the article, I would like to point out one other common challenge: understanding the model itself. Since NMT models are neural networks, and neural networks are widely considered black boxes, understanding why a model produced a particular translation can be difficult.
Keep learning, and I hope you've understood the major challenges in the domain. Here is a link to a paper by Philipp Koehn and Rebecca Knowles, "Six Challenges for Neural Machine Translation", on the same topic. Have a great day :)