Machine-Translation-Based-On-RNN-Model

Introduction

In this notebook, I will build a deep neural network that functions as part of an end-to-end machine translation pipeline. I completed pipeline will accept English text as input and return the French translation.

Preprocess - I'll convert text to sequence of integers.
Models Create models which accepts a sequence of integers as input and returns a probability distribution over possible translations. After learning about the basic types of neural networks that are often used for machine translation, you will engage in your own investigations, to design your own model!
Prediction Run the model on English text.

Dataset

We begin by investigating the dataset that will be used to train and evaluate the pipeline. The most common datasets used for machine translation are from WMT. However, that will take a long time to train a neural network on. We'll be using a dataset we created for this project that contains a small vocabulary. We'll be able to train our model in a reasonable time with this dataset.
For comparison, Alice's Adventures in Wonderland contains 2,766 unique words of a total of 15,500 words.

Preprocess

For this project, we won't use text data as input to our model. Instead, we'll convert the text into sequences of integers using the following preprocess methods:
1. Tokenize the words into ids
2. Add padding to make all the sequences the same length.

Tokenize

For a neural network to predict on text data, it first has to be turned into data it can understand. Text data like "dog" is a sequence of ASCII character encodings. Since a neural network is a series of multiplication and addition operations, the input data needs to be number(s).
We can turn each character into a number or each word into a number. These are called character and word ids, respectively. Character ids are used for character level models that generate text predictions for each character. A word level model uses word ids that generate text predictions for each word. Word level models tend to learn better, since they are lower in complexity, so we'll use those.

Padding

When batching the sequence of word ids together, each sequence needs to be the same length. Since sentences are dynamic in length, we can add padding to the end of the sequences to make them the same length.

Models

In this section, I will experiment with various neural network architectures. I will begin by training four relatively simple architectures.

Model 1 is a simple RNN
Model 2 is a RNN with Embedding
Model 3 is a Bidirectional RNN
Model 4 is an optional Encoder-Decoder RNN
After experimenting with the four simple architectures, I will construct a deeper architecture that is designed to outperform all four models.

Model 1: RNN

A basic RNN model is a good baseline for sequence data.

Model 2: Embedding

You've turned the words into ids, but there's a better representation of a word. This is called word embeddings. An embedding is a vector representation of the word that is close to similar words in n-dimensional space, where the n represents the size of the embedding vectors.

Model 3: Bidirectional RNNs

One restriction of a RNN is that it can't see the future input, only the past. This is where bidirectional recurrent neural networks come in. They are able to see the future data.

Model 4: Encoder-Decoder

Time to look at encoder-decoder models. This model is made up of an encoder and decoder. The encoder creates a matrix representation of the sentence. The decoder takes this matrix as input and predicts the translation as output.

Results

Finally, our model achieved 93.48% accuracy. In this model, we added embedding layer, bidirectional RNN, encoding layer, and decoding layer. So in this project, I showed how to use Keras to design a RNN model for machine learning. Please let me know, if you have any questions.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
data		data
images		images
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
helper.py		helper.py
machine_translation.ipynb		machine_translation.ipynb
project_tests.py		project_tests.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Machine-Translation-Based-On-RNN-Model

Introduction

Dataset

Preprocess

Tokenize

Padding

Models

Model 1: RNN

Model 2: Embedding

Model 3: Bidirectional RNNs

Model 4: Encoder-Decoder

Results

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Machine-Translation-Based-On-RNN-Model

Introduction

Dataset

Preprocess

Tokenize

Padding

Models

Model 1: RNN

Model 2: Embedding

Model 3: Bidirectional RNNs

Model 4: Encoder-Decoder

Results

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages