
francescobaio / sentence_reordering


This project was developed for the final exam of the Deep Learning course. Its primary objective is to design and implement a deep learning model for sentence reordering, a challenging Natural Language Processing (NLP) task that involves rearranging the words of a shuffled sentence back into their original order.

Jupyter Notebook 100.00%
deep-learning sentence-embeddings sentence-reordering


Sentence Reordering with Transformers

This repository contains the implementation of a sentence reordering model using the Transformer architecture, inspired by the "Attention Is All You Need" paper. The notebook documents the process of developing, training, and evaluating the model.

Introduction

The goal of this project is to develop a model that can reorder sentences to form coherent paragraphs. This task is crucial for applications such as text summarization and generation. The primary architecture used in this project is the Transformer model, known for its effectiveness in natural language processing tasks.

Requirements

To run this notebook, you'll need the following packages:

  • Python 3.7+
  • TensorFlow
  • Keras
  • NumPy
  • Pandas
  • Matplotlib
  • Scikit-learn

You can install the required packages using the following command:

pip install tensorflow keras numpy pandas matplotlib scikit-learn

Dataset

The dataset used in this project consists of text data where each instance contains sentences in a shuffled order. The goal is to reorder these sentences to their original, coherent order. The dataset is split into training, validation, and test sets to evaluate the model’s performance effectively.

Notebook Structure

The notebook is divided into several sections:

  1. Introduction: An overview of the project and its objectives.
  2. Data Loading and Preprocessing: Loading the dataset and preprocessing steps, including tokenization and padding.
  3. Model Development: Implementation of the Transformer model for sentence reordering.
  4. Training and Evaluation: Training the model and evaluating its performance on the validation set.
  5. Experiments: Various experiments conducted to fine-tune the model, including different architectures and hyperparameters.
  6. Conclusion: Summary of the results and final thoughts.
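The tokenization and padding step listed under Data Loading and Preprocessing can be illustrated with a minimal, framework-free sketch. Everything in it is an assumption made for illustration and is not taken from the notebook: the whitespace tokenizer, the reserved ids `PAD = 0` and `UNK = 1`, the helper names `build_vocab` and `encode`, and the toy sentences.

```python
from collections import Counter

# Reserved ids (an assumed convention, not taken from the notebook).
PAD, UNK = 0, 1


def build_vocab(texts, max_size=10000):
    """Map the most frequent lowercase words to integer ids."""
    counts = Counter(word for text in texts for word in text.lower().split())
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for word, _ in counts.most_common(max_size - len(vocab)):
        vocab[word] = len(vocab)
    return vocab


def encode(text, vocab, max_len):
    """Tokenize on whitespace, truncate to max_len, then right-pad with PAD."""
    ids = [vocab.get(word, UNK) for word in text.lower().split()][:max_len]
    return ids + [PAD] * (max_len - len(ids))


# Toy data, for illustration only.
texts = ["the cat sat on the mat", "a dog barked"]
vocab = build_vocab(texts)
batch = [encode(text, vocab, max_len=8) for text in texts]
```

After this step every row of `batch` has the same length, which is what allows the sequences to be stacked into a single tensor and fed to the model.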

Custom Callback

A custom validation callback is implemented to monitor the model’s performance in real-time during training. This callback allows for dynamic adjustments and early stopping based on validation performance, helping to prevent overfitting and improve generalization.
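The early-stopping logic that such a callback might apply can be sketched independently of any framework. The class name `ValidationMonitor`, the `patience` and `min_delta` parameters, and the scores below are hypothetical and not taken from the notebook. In Keras, logic of this kind would typically live in the `on_epoch_end` method of a `tf.keras.callbacks.Callback` subclass, which can end training by setting `self.model.stop_training = True`.

```python
class ValidationMonitor:
    """Track the best validation score and signal when to stop training."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum gain that counts as improvement
        self.best_score = float("-inf")
        self.best_epoch = None
        self.wait = 0

    def update(self, epoch, score):
        """Record a validation score; return True when training should stop."""
        if score > self.best_score + self.min_delta:
            self.best_score = score
            self.best_epoch = epoch
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience


# Made-up validation scores, for illustration only.
monitor = ValidationMonitor(patience=2)
scores = [0.40, 0.52, 0.51, 0.50, 0.55]
stopped_at = None
for epoch, score in enumerate(scores):
    if monitor.update(epoch, score):
        stopped_at = epoch
        break
```

With `patience=2`, the loop stops after two consecutive epochs without improvement, and `monitor.best_epoch` records which epoch's weights would be worth restoring.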

Cosine Decay Restart

The learning rate schedule used in this project is Cosine Decay with Restarts. Within each cycle the learning rate decays along a cosine curve; at the end of the cycle it is reset to a high value (a warm restart) and the decay begins again. These periodic restarts can help the model escape poor local minima and continue learning effectively.
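The shape of this schedule can be sketched in plain Python. The function below is an illustrative reimplementation, not the notebook's code. Its parameter names mirror those of `tf.keras.optimizers.schedules.CosineDecayRestarts`, which provides this schedule in TensorFlow, and the values used in the example (`initial_lr=1e-3`, `first_decay_steps=100`) are assumptions.

```python
import math


def cosine_decay_restarts(step, initial_lr, first_decay_steps,
                          t_mul=2.0, m_mul=1.0, alpha=0.0):
    """Learning rate at `step` under cosine decay with warm restarts.

    t_mul: factor by which each cycle is longer than the previous one.
    m_mul: factor applied to the peak learning rate at each restart.
    alpha: floor of the schedule, as a fraction of initial_lr.
    """
    cycle_len = float(first_decay_steps)
    cycle_start = 0.0
    peak = 1.0
    # Walk forward through the cycles until the one containing `step`.
    while step >= cycle_start + cycle_len:
        cycle_start += cycle_len
        cycle_len *= t_mul
        peak *= m_mul
    progress = (step - cycle_start) / cycle_len  # in [0, 1)
    cosine = 0.5 * peak * (1.0 + math.cos(math.pi * progress))
    return initial_lr * ((1.0 - alpha) * cosine + alpha)


# Sample the schedule over the first two cycles (steps 0-99 and 100-299).
learning_rates = [cosine_decay_restarts(s, 1e-3, 100) for s in range(300)]
```

Plotting `learning_rates` shows the rate falling towards zero by step 99 and jumping back to its peak at step 100, with the second cycle twice as long as the first.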

Usage

  1. Clone this repository:
     git clone https://github.com/your_username/sentence-reordering-transformers.git
  2. Navigate to the project directory:
     cd sentence-reordering-transformers
  3. Open the notebook:
     jupyter notebook francesco_baiocchi_Sentence_Reordering.ipynb
  4. Run the cells in the notebook to execute the code.

Results

During training, a custom validation callback tracks the model’s performance on the validation set. The best model achieved an average score of 0.573 on the test set.

Conclusion

After numerous experiments and iterations, the Transformer model achieved only moderate performance on the sentence reordering task. Several architectures and hyperparameter settings were explored, but the improvements were marginal. The final test score was 0.573.

