sentence-splitter's Introduction

Discourse based sentence splitter

Fine-tuning ELECTRA to break a sentence into two parts when the discourse marker is missing

Problem

  • Being able to predict discourse in a sentence based on discourse markers is quite useful for many NLP tasks
  • However, some discourse markers are commonly skipped; "then" and "," are good examples
  • Some real-world scenarios
    • In a chat, people naturally skip them, as it's not a formal typing environment
    • Speech-to-text software that doesn't translate verbal cues, like pauses, into commas
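
One way to picture the problem is to take a sentence that does contain a marker, drop the marker, and remember where it was. The sketch below does that with plain whitespace tokenization. The `make_example` helper and the `MARKERS` tuple are hypothetical names for illustration only; this is not necessarily how the repository prepares its data.

```python
from typing import Optional, Sequence, Tuple

# Assumption: only the two markers named above are handled.
MARKERS = ("then", ",")


def make_example(
    sentence: str, markers: Sequence[str] = MARKERS
) -> Optional[Tuple[str, int]]:
    """Drop the first discourse marker and return (text, split_index).

    split_index is the word index at which the second part starts.
    Returns None when no marker separates two non-empty parts.
    """
    # Put spaces around commas so that "," becomes its own token.
    tokens = sentence.replace(",", " , ").split()
    for i, tok in enumerate(tokens):
        if tok.lower() not in markers:
            continue
        left, right = tokens[:i], tokens[i + 1:]
        # Also drop markers that directly follow, e.g. ", then".
        while right and right[0].lower() in markers:
            right = right[1:]
        if left and right:
            return " ".join(left + right), len(left)
    return None


print(make_example("If it rains, then we stay home"))
```

Here the marker-free text is "If it rains we stay home" and the split index is 3, meaning the second part starts at the word "we".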

Solution

  • Even when those discourse markers are absent, the parts before and after the missing marker still show recognizable grammatical structure
  • A pre-trained language model (LM) can be fine-tuned to recognize where a new discourse starts in a sentence
  • This repository shows fine-tuning of ELECTRA, an LM with a BERT-like architecture that is pre-trained with a novel procedure that resembles GANs but does not actually use adversarial training. The advantage is that the model uses significantly less memory and compute than other comparable pre-trained LMs.
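
As a rough illustration of what "recognizing the start of a discourse" could mean in practice, the sketch below assumes the fine-tuned model yields one score per word and splits the sentence at the highest-scoring position. The `split_at_best` function and the score values are made up for this example; the output format of the model in the notebook may differ.

```python
from typing import List, Tuple


def split_at_best(words: List[str], scores: List[float]) -> Tuple[str, str]:
    """Split `words` in two at the index with the highest score.

    Index 0 is never chosen, so the first part is always non-empty.
    """
    if len(words) != len(scores):
        raise ValueError("words and scores must have the same length")
    if len(words) < 2:
        raise ValueError("need at least two words to split")
    best = max(range(1, len(words)), key=lambda i: scores[i])
    return " ".join(words[:best]), " ".join(words[best:])


words = "If it rains we stay home".split()
# Hypothetical per-word scores standing in for real model output.
scores = [0.01, 0.02, 0.05, 0.80, 0.07, 0.05]
print(split_at_best(words, scores))
```

With these scores the split falls before "we", giving the parts "If it rains" and "we stay home".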

Training details

  • Trained on Google Colab with 16GB of GPU available
  • ELECTRA-base model with 110M parameters was used
  • A batch size of 32 was used; I haven't tried bigger ones to see if 16GB would still be enough
  • Fine-tuning for 2 epochs took about an hour

Results

  • With little-to-no effort on hyperparameter tuning, fine-tuning the model for 2 epochs reaches a test accuracy of 91.8%, while training accuracy is 95.4%
  • Check the notebook or try it on Colab for more details

[Figure: Sample prediction]

TODOs

  • Make the pretrained model available, probably via Hugging Face's own model upload facility
  • Try a custom metric that uses the distance of the predicted discourse position from the actual one; that should be able to explain the error more meaningfully
  • Modularize the notebook to pull out the Model class as a package
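
The distance-based metric mentioned in the TODOs could look something like the sketch below. It assumes predictions and labels are word indices of the split position; `mean_position_distance` and `accuracy_within` are hypothetical names, and nothing like this exists in the repository yet.

```python
from typing import Sequence


def mean_position_distance(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Average absolute distance between predicted and actual split positions."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("need at least one example")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def accuracy_within(
    predicted: Sequence[int], actual: Sequence[int], tolerance: int = 0
) -> float:
    """Share of predictions at most `tolerance` positions away from the label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("need at least one example")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(predicted)


predicted = [3, 5, 2]
actual = [3, 4, 6]
print(mean_position_distance(predicted, actual))
print(accuracy_within(predicted, actual, tolerance=1))
```

Plain accuracy treats an off-by-one split the same as a split that is far off, whereas the mean distance and the tolerance-based accuracy separate near misses from larger errors.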

References & Acknowledgements
