Course: Python for Data Science (Dec 2023)
Project: Machine Translation from English to Vietnamese
Technical Requirements: Python, PyTorch, Gradio,...
Team Members:
- Vo Thi Khanh Linh
- Nguyen Nhat Minh Thu
- Nguyen Dang Anh Thu
Description:
- A natural language processing project using a Seq2Seq architecture combined with Bahdanau attention and teacher forcing
- Preprocessing and tokenization are performed for each language, with custom Vocabulary and Dataset classes created for both the training and testing phases
- Uses pre-trained embeddings (GloVe for English, PhoW2V for Vietnamese) to improve translation accuracy and fluency
- Uses metrics such as BLEU and F1-score to evaluate and visualize performance
- Simple user interface built with Gradio and hosted on Hugging Face
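
For readers unfamiliar with Bahdanau (additive) attention, the scoring step can be sketched in plain Python as below. This is an illustration only, not the project's PyTorch code: the function names, the list-based vectors, and the parameters W_q, W_k and v are all hypothetical. Each encoder state (key) is scored against the decoder state (query) as v . tanh(W_q q + W_k k), the scores are normalized with a softmax, and the context vector is the weighted sum of the encoder states.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def additive_attention(query, keys, W_q, W_k, v):
    """Hypothetical sketch of additive attention over a list of encoder states.

    score_i = v . tanh(W_q @ query + W_k @ key_i)
    weights = softmax(scores)
    context = sum_i weights_i * key_i
    """
    q_proj = matvec(W_q, query)
    scores = []
    for key in keys:
        k_proj = matvec(W_k, key)
        hidden = [math.tanh(a + b) for a, b in zip(q_proj, k_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    # Softmax with the maximum subtracted for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the encoder states.
    dim = len(keys[0])
    context = [sum(w * key[d] for w, key in zip(weights, keys)) for d in range(dim)]
    return context, weights
```

In a Seq2Seq decoder, the query would typically be the previous decoder hidden state and the keys the encoder outputs; the resulting context vector is then fed into the next decoding step.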
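
A custom vocabulary class of the kind mentioned above might look roughly like the following. This is a minimal sketch under stated assumptions: the special tokens, the min_freq parameter, and the method names (build, encode, decode) are hypothetical and are not taken from the project.

```python
from collections import Counter

# Special tokens and their order are assumptions made for this sketch.
PAD, SOS, EOS, UNK = "<pad>", "<sos>", "<eos>", "<unk>"


class Vocabulary:
    """Maps tokens to integer ids and back (hypothetical sketch)."""

    def __init__(self, min_freq=1):
        self.min_freq = min_freq
        self.itos = [PAD, SOS, EOS, UNK]
        self.stoi = {tok: i for i, tok in enumerate(self.itos)}

    def build(self, tokenized_sentences):
        """Add every token seen at least min_freq times, in sorted order."""
        counts = Counter(tok for sent in tokenized_sentences for tok in sent)
        for tok, freq in sorted(counts.items()):
            if freq >= self.min_freq and tok not in self.stoi:
                self.stoi[tok] = len(self.itos)
                self.itos.append(tok)
        return self

    def encode(self, tokens):
        """Convert tokens to ids, wrapped in <sos> ... <eos>; unknowns map to <unk>."""
        unk = self.stoi[UNK]
        ids = [self.stoi.get(tok, unk) for tok in tokens]
        return [self.stoi[SOS]] + ids + [self.stoi[EOS]]

    def decode(self, ids):
        """Convert ids back to tokens, dropping <pad>, <sos> and <eos>."""
        specials = {self.stoi[PAD], self.stoi[SOS], self.stoi[EOS]}
        return [self.itos[i] for i in ids if i not in specials]

    def __len__(self):
        return len(self.itos)
```

One such vocabulary would be built per language (English and Vietnamese), and a Dataset class could then call encode on each tokenized sentence pair before batching.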