Advanced submission for an open source challenge, covering text summarization and question answering. The project has two parts, an experimental notebook and a notebook with UI components; both are explained below.
https://drive.google.com/file/d/1q_6n5Xvh5gktansLY-MI-cn4lZWISCLA/view?usp=sharing
Text summarization in NLP is the process of condensing the information in large texts for quicker consumption. The intention is to create a coherent and fluent summary containing only the main points of the document. Question-answering models are machine learning or deep learning models that can answer questions given some context, and sometimes without any context. Automatic text summarization and question answering are common problems in machine learning and natural language processing (NLP). My project implements a UI. The user inputs a link (preferably an article or informative website), the question they want to ask about the text, and how many bullet points they want generated. From those parameters the model generates a summary of the raw text, the bullet points, and an answer to the question.
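As a rough illustration of how the three inputs and three outputs could be wired into Gradio, here is a minimal sketch. It is not taken from the notebook: `summarize_and_answer` and `build_ui` are hypothetical names, and the handler returns placeholder strings where the real scraping and models would run.

```python
def summarize_and_answer(url, question, n_points):
    """Placeholder handler: the real project would scrape `url` and run its models here."""
    n = max(1, int(n_points))  # Gradio's number input arrives as a float
    summary = f"(summary of {url})"
    bullets = "\n".join(f"- (point {i + 1})" for i in range(n))
    answer = f"(answer to: {question})"
    return summary, bullets, answer


def build_ui():
    """Wrap the handler in a Gradio interface with three inputs and three outputs."""
    import gradio as gr  # imported lazily so the handler can be tried without Gradio

    return gr.Interface(
        fn=summarize_and_answer,
        inputs=["text", "text", "number"],   # link, question, number of bullet points
        outputs=["text", "text", "text"],    # summary, bullet points, answer
    )


# To start the UI (prints a local link to open in the browser):
# build_ui().launch()
```

Keeping the handler separate from the interface makes it possible to swap in the real summarization and question-answering code without touching the UI definition.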
Here are some resources to learn about text summarization, question answering, and the UI that I used:
- A Quick Introduction to Text Summarization in Machine Learning
- Comprehensive Guide to Text Summarization using Deep Learning in Python
- Text Summarization Approaches for NLP
- Allen NLP
- NLP - Building a Question Answering model
- Google AI Blog on Question Answering
- Gradio Getting Started
Results:
- Input: Check .ipynb
- Output: Check .ipynb
- Clone git repository
- Make sure the correct packages are installed (listed below)
- Run textsummarizerwithui.py
- Or, for easier overall usage, run TextSummarizerWithUI.ipynb in Google Colaboratory or Jupyter Notebook
- Open the link that is printed
- Enter all parameters and wait for the summary and answers to be generated
The following packages are needed to run the software; install each with pip.
- transformers:
pip install transformers
- NLTK:
pip install nltk
- BeautifulSoup:
pip install bs4
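- Regular Expressions (only needed for the third-party regex package; Python's built-in re module requires no installation):
pip install regex
- HeapQ (heapq is part of the Python standard library, so no installation is needed)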
- AllenNLP:
pip install allennlp==1.0.0 allennlp-models==1.0.0
- Gradio:
pip install gradio
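Before running the script, it can help to confirm that everything imports. The snippet below is a small sketch, not part of the project; `MODULES` and `missing_modules` are hypothetical names. Note that it checks import names rather than pip names (BeautifulSoup is imported as `bs4`).

```python
import importlib.util

# Import names, not pip names. `re` and `heapq` ship with Python itself.
MODULES = ["transformers", "nltk", "bs4", "allennlp", "gradio", "re", "heapq"]


def missing_modules(names):
    """Return the module names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```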
https://drive.google.com/file/d/1Nn7iHfcbu3lcpfqAeb-m1TtwpBeh50Zo/view?usp=sharing
Text summarization in NLP is the process of condensing the information in large texts for quicker consumption. The intention is to create a coherent and fluent summary containing only the main points of the document. Automatic text summarization is a common problem in machine learning and natural language processing (NLP). My project takes a link to a website (adjustable by the user) and scrapes the raw text from it. From this text my model generates a number of bullet points (adjustable by the user) and a summary. The project also has a question-and-answer feature. Question-answering models are machine learning or deep learning models that can answer questions given some context, and sometimes without any context. The model takes the scraped text as context and a question as input, and a transformer-based model then answers the question with relatively accurate results.
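One common way to pick bullet points from scraped text is to score each sentence by the frequency of its words and keep the highest-scoring ones with `heapq.nlargest`. The sketch below illustrates that idea only; it is an assumption, not the notebook's actual code. `STOP_WORDS`, `split_sentences`, and `top_sentences` are hypothetical names, and a simple regex splitter stands in for NLTK's tokenizers so the example runs with the standard library alone.

```python
import heapq
import re
from collections import Counter

# Tiny illustrative stop-word list; NLTK's stopwords corpus would be the fuller choice.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "this", "to", "was", "with",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def top_sentences(text, n_points=3):
    """Return the n_points highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]
    if not words:
        return sentences[:n_points]

    freq = Counter(words)
    top = max(freq.values())  # used to normalize frequencies into the range (0, 1]

    def score(index):
        tokens = re.findall(r"[a-z']+", sentences[index].lower())
        return sum(freq[t] / top for t in tokens if t in freq)

    best = heapq.nlargest(n_points, range(len(sentences)), key=score)
    return [sentences[i] for i in sorted(best)]


if __name__ == "__main__":
    sample = "Cats chase mice. Cats sleep a lot. Dogs bark. Cats and mice are animals."
    for point in top_sentences(sample, n_points=2):
        print("-", point)
```

The `n_points` argument plays the role of the user-adjustable number of bullet points; returning the chosen sentences in document order keeps the bullets readable.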
NOTE
Many of the customizable features of this version, which has no UI, are shown in the demo video.
Here are some resources to learn about text summarization and question answering that I used:
- A Quick Introduction to Text Summarization in Machine Learning
- Comprehensive Guide to Text Summarization using Deep Learning in Python
- Text Summarization Approaches for NLP
- Allen NLP
- NLP - Building a Question Answering model
- Google AI Blog on Question Answering
Results:
- Input: Check .ipynb
- Output: Check .ipynb
- Clone git repository
- Make sure the correct packages are installed (listed below)
- Run level3textsummarization.py
- Or, for easier overall usage, run Level3TextSummarization.ipynb in Google Colaboratory or Jupyter Notebook
The following packages are needed to run the software; install each with pip.
- transformers:
pip install transformers
- NLTK:
pip install nltk
- BeautifulSoup:
pip install bs4
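- Regular Expressions (only needed for the third-party regex package; Python's built-in re module requires no installation):
pip install regex
- HeapQ (heapq is part of the Python standard library, so no installation is needed)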
- AllenNLP:
pip install allennlp==1.0.0 allennlp-models==1.0.0
See the open issues for a list of proposed features (and known issues).
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.
Nalin Nagar - [email protected]