OBQnA

An OpenBook Question 'n' Answer System

Introduction

OBQnA is a high-level, object-oriented Python package that aims to provide an easy and intuitive way of creating an OpenBook Question 'n' Answer system.

The package parses PDF files using Apache Tika, splits the corpus into passages, and computes a dense vector representation of each passage with a Transformer NLP model. For each question asked, the system performs Dense Passage Retrieval using an efficient similarity search library (Faiss, ScaNN or Annoy) and extracts the answer from the retrieved passages.


Install

To install, simply run pip install -r requirements.txt

  • note: If you want to use a GPU, please install CUDA.

Python code example

The following example uses J. R. R. Tolkien's The Lord of the Rings trilogy and The Hobbit.

For a more detailed explanation, please read the Documentation.


Parsing PDFs and performing some basic text cleaning

from obqna.process import PDFParser, Passages
from obqna.qa import QuestionAnswering

parser = PDFParser("../books/") # Path of PDFs
books = parser.parse()
books = parser.clean(books)

Splitting the corpus into passages

passages = Passages()
corpus = passages.df2passages(books)
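
The source does not show how Passages splits the text, so the following is only a minimal sketch of one common approach: fixed-size word windows with a small overlap, so that a sentence cut at a boundary still appears whole in a neighbouring passage. The function name split_into_passages and the parameters max_words and overlap are hypothetical and are not part of the obqna API.

```python
def split_into_passages(text, max_words=100, overlap=20):
    """Split text into overlapping word windows.

    Hypothetical helper for illustration only; obqna's own
    Passages class may split the corpus differently.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap  # how far each window advances
    passages = []
    for start in range(0, len(words), step):
        passages.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reaches the end of the text
    return passages


if __name__ == "__main__":
    sample = "In a hole in the ground there lived a hobbit"
    for passage in split_into_passages(sample, max_words=4, overlap=1):
        print(passage)
```

A larger overlap lowers the chance of cutting an answer in half, at the cost of more passages to embed and index.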

Calculate the vector representation of each passage and store the corresponding indices

searcher_type = "scann" # other choices: "faiss", "annoy"
qna = QuestionAnswering(searcher_type)
qna.prepare(corpus)
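
To make the idea behind this step concrete, here is a toy sketch of embedding passages and retrieving the closest ones for a question. It is not how obqna works internally. A normalized word-count vector stands in for the Transformer embedding (a real dense vector is a fixed-length float array), and an exhaustive scan stands in for Faiss, ScaNN or Annoy. The names embed, similarity and BruteForceSearcher are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a Transformer encoder: a unit-length word-count vector."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {word: c / norm for word, c in counts.items()} if norm else {}


def similarity(u, v):
    """Dot product of two unit vectors, which equals their cosine similarity."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    return sum(weight * v.get(word, 0.0) for word, weight in u.items())


class BruteForceSearcher:
    """Exhaustive nearest-neighbour search (assumption: stands in for a real index)."""

    def __init__(self, passages):
        self.passages = list(passages)
        self.vectors = [embed(p) for p in self.passages]  # the "prepare" step

    def search(self, question, top_k=3):
        query = embed(question)
        scored = [(similarity(query, vec), i) for i, vec in enumerate(self.vectors)]
        # Highest score first; ties are broken by passage order.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [(self.passages[i], score) for score, i in scored[:top_k]]


if __name__ == "__main__":
    searcher = BruteForceSearcher([
        "Galadriel is the Lady of Lorien",
        "Denethor is the father of Boromir",
        "The ring was forged in Mount Doom",
    ])
    for passage, score in searcher.search("Who is Galadriel?", top_k=2):
        print(f"{score:.3f}  {passage}")
```

An exhaustive scan compares the question against every passage, so its cost grows linearly with the corpus. Approximate nearest-neighbour libraries trade a little accuracy for much faster lookups on large collections.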

Ask questions

questions = [
    "Who is Galadriel?",
    "Who is Isildur?",
    "Who is Boromir's father?",
    "Who is Aragorn?",
    "Was the ring destroyed?",
    "What language is on the One Ring inscription?"
]

for question in questions:
    print(f"Question: {question}: ")
    results = qna.ask(question)
    print(f"Answer: {results['answer']}")
    print(10*'-')

Output:

Question: Who is Galadriel?: 
Answer: The Lady of Lorien
----------
Question: Who is Isildur?: 
Answer: Elendils son
----------
Question: Who is Boromir's father?: 
Answer: Lord Denethor
----------
Question: Who is Aragorn?: 
Answer: Heir of Isildur
----------
Question: Was the ring destroyed?: 
Answer: it perished from the world in the ruin of his first realm
----------
Question: What language is on the One Ring inscription?: 
Answer: Black Speech
----------

Contributors

nikosnalmpantis, nsantavas, ktoulgaridis
