# Unsupervised Neural Machine Translation from West African Pidgin (Creole) to English

This repository contains the implementation of an Unsupervised NMT model from West African Pidgin (Creole) to English, trained without using a single parallel sentence.

Paper: accepted at the NeurIPS 2019 Workshop on Machine Learning for the Developing World.

- The alignment of Pidgin word vectors with English word vectors, which achieves a Nearest Neighbor accuracy of 0.1282. This is significantly better than the baseline of 0.0093, which is the probability of selecting the right nearest neighbor from the evaluation set of 108 pairs (1/108). The aligned vectors will be helpful for various downstream tasks and for the transfer of models from English to Pidgin.
- The creation of an Unsupervised Neural Machine Translation model between Pidgin and English, which achieves a BLEU score of 7.93 from Pidgin to English and 5.18 from English to Pidgin on a test set of 2101 sentence pairs. The test set was obtained from the JW300 dataset and preprocessed by the Masakhane group.

For our results, at each training step, we performed the following:

- Discriminator training to constrain the encoder to map both languages to the same latent space
- Denoising autoencoder training on each language, which acts as language model training
- On-the-fly back translation and reconstruction of sentences, which acts as translation training

[Figure: algorithm for our training process]

We trained for 8 epochs on a V100 (approx. 3 days) and selected the best model by evaluating on our test dataset.

Below are some translations by our model. In the three-line examples, the lines are the source sentence, the model's translation, and the reference translation; (pd) marks Pidgin and (en) marks English.

- afta dem cancel dia first attempt (pd)
  - After they cancelled their first attempt (en)
- they began to thank god for the fish (en)
- this is not the first time job has come to africa to do crusade. (en)
  - Dis na no be di first time wey job don come africa to do am for crusade. (pd)
  - Dis no be di first time job don come africa to do crusade. (pd)
- with dis list of ministers e sure for us say we go knack dem results (pd)
  - With this list of ministers we were confident we will knack results. (en)
  - With this list of ministers we are confident that we will provide those results. (en)
- wetin we fit do to get better result when we dey preach for open place ? (pd)
  - What could we do to get better result when we preach in open place ? (en)
  - How can public witnessing prove to be effective ? (en)
- what are most people today not aware of ? (en)
  - Wetin most people are today no dey aware of (pd)
- one student began coming to the kingdom hall. (en)
  - One of my student come start to come kingdom hall. (pd)
  - One student wey begin dey come di kingdom hall. (pd)
- india space oga yarn say agency don come back kampe. (pd)
  - India 's space agency said it is coming back to be kampe. (en)
  - India space head has said the agency has returned stronger. (en)
- since when michael job arrived kenya he has become very popular. (en)
  - Since when michael get job for kenya he don become very popular. (pd)
  - Since wey michael job land kenya im don popular well well. (pd)
- di woman wey dey learn gymnastics just start dey waka with bristol student. (pd)
  - The woman 's gymnastics learn just to start walking with bristol student. (en)
  - The woman that learned gymnastics just started to walk with bristol student. (en)
- as fishermen wey dem be dem see the whale. (pd)
  - Given that they are fishermen, they saw the whale. (en)
- they have praised the mission that they arranged it. (en)
  - Dem dey praised the mission say make dem arranged it. (pd)
  - Dem don hail di mission say na dem package am. (pd)

As we can see, the language model helps produce translations that are not necessarily word-for-word but are also grammatically correct, including changing to possessive forms.

More example translations are in the translations folder.

## Dependencies

- NLTK (for tokenization, make sure to download punkt)
- RCSLS Alignment (for generating cross-lingual embeddings)

## Running Unsupervised NMT

### Aligning Word Vectors

Check the Alignment folder and run the notebook to align the Pidgin word vectors to the English word vectors and evaluate the result. You can check the pretrained folder for already aligned Pidgin word vectors.

The first thing to do to run the NMT model is to preprocess data. To do so, just follow the instructions in the preprocessing notebook, which converts your text file data to serialized data for training. Note that although this repository supports the use of BPE codes, all work was done with regular word vectors.
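The nearest-neighbor accuracy and its chance baseline reported above can be illustrated with a small sketch. It is not the repository's evaluation code: the function names (`cosine`, `nn_accuracy`, `chance_baseline`), the use of cosine similarity as the retrieval criterion, and the toy vectors are all assumptions made for illustration. A pair counts as correct when the target vector at the same index is the most similar target for the source vector.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nn_accuracy(src_vecs, tgt_vecs):
    """Fraction of source vectors whose nearest target (by cosine
    similarity, an assumption) is the target at the same index."""
    hits = 0
    for i, src in enumerate(src_vecs):
        best = max(range(len(tgt_vecs)), key=lambda j: cosine(src, tgt_vecs[j]))
        if best == i:
            hits += 1
    return hits / len(src_vecs)


def chance_baseline(n_pairs):
    """Probability of picking the right neighbor uniformly at random."""
    return 1.0 / n_pairs


# Toy, hypothetical "aligned" vectors: pairs 0 and 1 line up, pair 2 does not.
toy_src = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_tgt = [[0.9, 0.1], [0.1, 0.9], [1.0, -1.0]]

accuracy = nn_accuracy(toy_src, toy_tgt)
baseline_108 = round(chance_baseline(108), 4)
print(accuracy, baseline_108)
```

With 108 evaluation pairs, `chance_baseline(108)` rounds to the 0.0093 baseline quoted above; the toy accuracy only demonstrates the mechanics and says nothing about the real vectors.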