This repository contains a Neural Machine Translation (NMT) model designed to translate English sentences into Hindi. The model uses an encoder-decoder architecture with Long Short-Term Memory (LSTM) layers and is trained on the Hindi-English Truncated Corpus dataset.
Key Features:
- Data preprocessing for text normalization and tokenization.
- Custom batch generator for memory-efficient training.
- Encoder-Decoder architecture with embedding layers.
- Trained using the Keras deep learning framework.
The model is trained on the Hindi-English Truncated Corpus dataset, which contains English and Hindi sentence pairs sourced from TED talks. Only sentences of at most 20 words are used for training.
- Convert all text to lowercase.
- Remove special characters, numbers, and extra spaces.
- Add `START_` and `_END` tokens to Hindi sentences for better decoding.
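The preprocessing steps above, together with the 20-word limit, could be sketched roughly as follows. This is a minimal illustration rather than the repository's actual code: the names `clean_text`, `prepare_pair`, and `MAX_WORDS` are hypothetical, "special characters" is assumed to mean ASCII punctuation, and the length check is assumed to apply before the boundary tokens are added.

```python
import re
import string

MAX_WORDS = 20  # length limit stated in the dataset description

# Assumption: "special characters" means ASCII punctuation only.
# Devanagari punctuation such as the danda is not covered by this table.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_text(text):
    """Lowercase, strip punctuation and digits, and collapse whitespace."""
    text = text.lower()
    text = text.translate(_PUNCT_TABLE)
    # For str patterns, \d matches any Unicode decimal digit,
    # so Devanagari digits are removed as well.
    text = re.sub(r"\d+", "", text)
    return re.sub(r"\s+", " ", text).strip()


def prepare_pair(english, hindi):
    """Return a cleaned (english, hindi) pair, or None if either side is too long."""
    english, hindi = clean_text(english), clean_text(hindi)
    if len(english.split()) > MAX_WORDS or len(hindi.split()) > MAX_WORDS:
        return None
    # Boundary tokens go on the target (Hindi) side only.
    return english, "START_ " + hindi + " _END"
```

Calling `prepare_pair` on each row of the corpus and dropping the `None` results would leave only the pairs used for training.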
The model is built using the Encoder-Decoder architecture with the following components:
- Encoder:
  - Embedding layer for English sentences.
  - LSTM layer to generate context vectors (hidden and cell states).
- Decoder:
  - Embedding layer for Hindi sentences.
  - LSTM layer initialized with encoder states.
  - Dense layer with a softmax activation for predicting the target words.
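As a rough sketch of how these components fit together in Keras, the following builds an encoder-decoder model of this shape. The function name `build_model`, the vocabulary-size arguments, and the `latent_dim` default of 256 are assumptions made for illustration; the repository's actual layer sizes and options may differ.

```python
from tensorflow.keras.layers import Dense, Embedding, Input, LSTM
from tensorflow.keras.models import Model


def build_model(num_encoder_tokens, num_decoder_tokens, latent_dim=256):
    """Build a training-time encoder-decoder model (sizes are illustrative)."""
    # Encoder: embed English token ids, keep only the final LSTM states.
    encoder_inputs = Input(shape=(None,))
    enc_emb = Embedding(num_encoder_tokens, latent_dim)(encoder_inputs)
    _, state_h, state_c = LSTM(latent_dim, return_state=True)(enc_emb)
    encoder_states = [state_h, state_c]

    # Decoder: embed Hindi token ids and start from the encoder states.
    decoder_inputs = Input(shape=(None,))
    dec_emb = Embedding(num_decoder_tokens, latent_dim)(decoder_inputs)
    decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
    decoder_outputs, _, _ = decoder_lstm(dec_emb, initial_state=encoder_states)

    # Softmax over the Hindi vocabulary at every time step.
    decoder_outputs = Dense(num_decoder_tokens, activation="softmax")(decoder_outputs)

    model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
    model.compile(optimizer="rmsprop", loss="categorical_crossentropy")
    return model
```

The decoder LSTM returns the full output sequence because the Dense layer predicts one target word per time step, while the encoder passes on only its final hidden and cell states.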
Ensure you have the following installed:
- Python 3.7 or later
- TensorFlow/Keras
- NumPy
- Pandas
- Matplotlib
- Seaborn
- Clone the repository and change into its directory:
git clone https://github.com/mbithesss/Language-Translation-with-Deep-Learning.git
cd Language-Translation-with-Deep-Learning
- Install the required libraries:
pip install -r requirements.txt
The model is trained using the following parameters:
- Optimizer: RMSprop
- Loss Function: Categorical Crossentropy
- Batch Size: 128
- Epochs: 100
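The custom batch generator listed among the key features might look something like the sketch below. It is an assumption-laden illustration, not the repository's code: `generate_batch`, the word-to-index dictionaries `src_index` and `tgt_index`, and the padding lengths are hypothetical, and index 0 is assumed to be reserved for padding. The point it shows is that the large one-hot target array is only ever built for one batch at a time.

```python
import numpy as np


def generate_batch(pairs, src_index, tgt_index, max_src_len, max_tgt_len, batch_size=128):
    """Yield ((encoder_input, decoder_input), decoder_target) batches forever."""
    num_tgt_tokens = len(tgt_index) + 1  # +1 because index 0 is reserved for padding
    while True:
        for start in range(0, len(pairs), batch_size):
            batch = pairs[start:start + batch_size]
            enc_in = np.zeros((len(batch), max_src_len), dtype="float32")
            dec_in = np.zeros((len(batch), max_tgt_len), dtype="float32")
            # One-hot targets are allocated per batch, not for the whole corpus.
            dec_out = np.zeros((len(batch), max_tgt_len, num_tgt_tokens), dtype="float32")
            for i, (src, tgt) in enumerate(batch):
                for t, word in enumerate(src.split()):
                    enc_in[i, t] = src_index[word]
                tgt_words = tgt.split()
                for t, word in enumerate(tgt_words):
                    if t < len(tgt_words) - 1:
                        # Decoder input: every token except the final _END.
                        dec_in[i, t] = tgt_index[word]
                    if t > 0:
                        # Decoder target: the same sequence shifted one step
                        # ahead, so it skips START_.
                        dec_out[i, t - 1, tgt_index[word]] = 1.0
            yield (enc_in, dec_in), dec_out
```

A generator like this can be passed to `model.fit` along with `steps_per_epoch` set to the number of batches per pass over the data, since the generator itself never terminates.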
Checkpoints are saved after every epoch to ensure progress is not lost.
Checkpoints are stored in the checkpoints.h5 directory. To load a checkpoint:
model.load_weights('/checkpoint.h5')
Contributions are welcome! Feel free to open issues or submit pull requests.
- Fork the repository.
- Create a new branch:
git checkout -b feature-branch
- Commit your changes and push them to your fork:
git push origin feature-branch
- Open a pull request.