Read the full article about this project on my Medium blog --> here
Generate summaries of news articles or blog posts using Google's language model Pegasus via 🤗 HuggingFace's API.
Pegasus is an encoder-decoder transformer trained specifically for abstractive summarization. For this app I used the checkpoint google/pegasus-cnn_dailymail, which was trained on the CNN/DailyMail corpus.
For more information about the model, see the original paper PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization by Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu, published on Dec 18, 2019.
You will need an API key from HuggingFace. In case you don't have one already, follow these steps:
- Create a free account or log in
- Go to Settings and then Access Tokens
- Create a new token (select the 'read' role)
- Paste your API key in the app's text box
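For readers who want to try the same model outside the app, here is a minimal sketch of how the key could be used to request a summary over HTTP. It is not the app's actual code. The endpoint URL, the `{"inputs": ...}` payload, and the `[{"summary_text": ...}]` response shape are assumptions based on the hosted Inference API and should be checked against the current HuggingFace documentation. The helper names `build_request` and `summarize` are made up for this illustration.

```python
import json
import urllib.request

# Assumption: endpoint of the hosted Inference API for this checkpoint.
# Verify it against the current HuggingFace documentation before relying on it.
API_URL = "https://api-inference.huggingface.co/models/google/pegasus-cnn_dailymail"


def build_request(article_text: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking the model for a summary."""
    payload = json.dumps({"inputs": article_text}).encode("utf-8")
    headers = {
        # The access token created in the steps above goes in this header.
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(API_URL, data=payload, headers=headers, method="POST")


def summarize(article_text: str, api_key: str, timeout: float = 60.0) -> str:
    """Send the request and return the summary text (hypothetical helper)."""
    request = build_request(article_text, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        result = json.loads(response.read().decode("utf-8"))
    # Assumed response shape: a list with one dict holding "summary_text".
    return result[0]["summary_text"]
```

Calling `summarize(article_text, api_key)` with a valid token performs the network request. Building the request separately from sending it makes it easy to inspect the headers and payload first.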
Considerations:
- The model works best with articles in English
- Articles behind a paywall can't be accessed
- Longer articles require more processing time and resources
- It may not be possible to scrape some websites
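One common way to cope with long articles is to split the text into smaller pieces and summarize each piece separately. The sketch below only illustrates that idea. The app itself may handle long inputs differently. The function name `split_into_chunks` and the 400-word default are arbitrary choices made for this example.

```python
import re
from typing import List


def split_into_chunks(text: str, max_words: int = 400) -> List[str]:
    """Group whole sentences into chunks of roughly max_words words.

    A single sentence longer than max_words is kept whole, so a chunk
    can exceed the limit in that case.
    """
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current: List[str] = []
    count = 0
    for sentence in sentences:
        if not sentence:
            continue  # skip empty strings, e.g. when the input text is empty
        n_words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the limit.
        if current and count + n_words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk could then be summarized on its own and the partial summaries joined, at the cost of more requests and more total processing time.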