adam_qas's Issues

Feature Extraction from Question.

I am completely confused. Could you point me to some files to start with for feature extraction, so that I can take the first baby steps?

Elasticsearch full-text search strategies.

The current search implementation uses the match query DSL, the standard query for performing full-text queries, including fuzzy matching and phrase or proximity queries.

Match Query

The match query is of type boolean: the query text goes through the analysis phase and the resulting terms are combined into a boolean query. The operator parameter can be set to and or or (or being the default). A match query can in turn be placed under the should, must and must_not clauses of a bool query.

The match query supports multi-terms synonym expansion with the synonym_graph token filter. By default the parameter auto_generate_synonyms_phrase_query is set to true.

This issue explores other alternatives to the current implementation.
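As a starting point for comparing alternatives, here is a minimal sketch of the request bodies involved. The helper names are made up for illustration, and the field names `content`, `content_info` and `content_table` are taken from the debug logs quoted elsewhere on this page; nothing here is confirmed to be how adam_qas builds its queries.

```python
def match_query(text, field="content", operator="or", fuzziness=None):
    """Body for a match query; `operator` is "or" (the default) or "and"."""
    options = {"query": text, "operator": operator}
    if fuzziness is not None:
        options["fuzziness"] = fuzziness  # e.g. "AUTO" for fuzzy matching
    return {"query": {"match": {field: options}}}


def match_phrase_query(text, field="content", slop=0):
    """Body for a phrase query; `slop` allows that many positions of leeway."""
    return {"query": {"match_phrase": {field: {"query": text, "slop": slop}}}}


def multi_match_query(text, fields, match_type="best_fields"):
    """Body for a multi_match query over several fields."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "type": match_type,
                "fields": list(fields),
            }
        }
    }


strict = match_query("linux kernel release", operator="and")
phrase = match_phrase_query("linux kernel", slop=1)
spread = multi_match_query(
    "linux kernel release",
    ["content", "content_info", "content_table"],
    match_type="most_fields",
)
```

Each dictionary would be passed as the `body` of a search request, which makes it straightforward to swap strategies and compare the ranked results.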

Running same query twice is causing error.

Your Environment

  • Operating System: Windows
  • Python Version Used: 3.6.3
  • Elasticsearch Version Used: 6.2.4
  • adam_qas Version Used: Latest
  • Environment Information:
  • Question you were trying to ask:

When I run python -m qas.adam -vv "When was linux kernel version 4.0 released ?" for the first time, I get the answer perfectly.
But when I run the same query again, it gives an error.

Traceback (most recent call last):
  File "C:\Users\BLR\AppData\Local\Programs\Python\Python36\Lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\BLR\AppData\Local\Programs\Python\Python36\Lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\home\Aman Dalmia\Experiments\testing\adam_qas\qas\adam.py", line 273, in <module>
    run()
  File "D:\home\Aman Dalmia\Experiments\testing\adam_qas\qas\adam.py", line 269, in run
    main(sys.argv[1:])
  File "D:\home\Aman Dalmia\Experiments\testing\adam_qas\qas\adam.py", line 262, in main
    answer = qas.process_answer()
  File "D:\home\Aman Dalmia\Experiments\testing\adam_qas\qas\adam.py", line 108, in process_answer
    search_wikipedia(self.question_keywords, self.search_depth)
  File "D:\home\Aman Dalmia\Experiments\testing\adam_qas\qas\wiki\wiki_search.py", line 22, in search_wikipedia
    wikif.parse_wiki_page()
  File "D:\home\Aman Dalmia\Experiments\testing\adam_qas\qas\wiki\wiki_fetch.py", line 51, in parse_wiki_page
    res = self.es_ops.upsert_wiki_article_if_updated(page, wiki_revid, wiki_title, wiki_html_text)
  File "D:\home\Aman Dalmia\Experiments\testing\adam_qas\qas\esstore\es_operate.py", line 83, in upsert_wiki_article_if_updated
    res = self.es_conn.update(index=__index_name__, doc_type=__doc_type__, body=wiki_body, id=pageid)
  File "D:\home\AMANDA~1\EXPERI~1\testing\venv\lib\site-packages\elasticsearch\client\utils.py", line 76, in _wrapped
    return func(*args, params=params, **kwargs)
  File "D:\home\AMANDA~1\EXPERI~1\testing\venv\lib\site-packages\elasticsearch\client\__init__.py", line 547, in update
    doc_type, id, '_update'), params=params, body=body)
  File "D:\home\AMANDA~1\EXPERI~1\testing\venv\lib\site-packages\elasticsearch\transport.py", line 314, in perform_request
    status, headers_response, data = connection.perform_request(method, url, params, body, headers=headers, ignore=ignore, timeout=timeout)
  File "D:\home\AMANDA~1\EXPERI~1\testing\venv\lib\site-packages\elasticsearch\connection\http_urllib3.py", line 180, in perform_request
    self._raise_error(response.status, raw_data)
  File "D:\home\AMANDA~1\EXPERI~1\testing\venv\lib\site-packages\elasticsearch\connection\base.py", line 125, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.RequestError: TransportError(400, 'illegal_argument_exception', '[jTr5AMq][127.0.0.1:9300][indices:data/write/update[s]]')

Update Dockerfile

Hi!
Could you also update the Dockerfile, please? Yours doesn't work for me.
I can provide the logs if you wish but I suppose that the Dockerfile is just not up to date (updated last time 1 year ago).
Thanks!

Elasticsearch with custom English analyzer

The built-in English analyzer for Elasticsearch seems to use a weak stemmer (the Porter stemmer). A token like 'friendly' gets stemmed to 'friendli' and not 'friend'. A lemmatizer would be a better fit for such use cases.

Lemmatization is a much more complicated and expensive process that needs to understand the context in which words appear in order to make decisions about what they mean.

Stemming usually refers to a crude heuristic process that chops off the ends of words in the hope of achieving this goal correctly most of the time, and often includes the removal of derivational affixes. Lemmatization usually refers to doing things properly with the use of a vocabulary and morphological analysis of words, normally aiming to remove inflectional endings only and to return the base or dictionary form of a word, which is known as the lemma. Source
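To make experimenting with the analyzer easier, the sketch below builds the `analysis` part of the index settings with the stemmer language as a parameter. The names `adam_analyzer` and `english_stop` come from the mapping dump quoted in another issue on this page; the function name and the `english_stemmer` filter name are made up for illustration. Note that this only swaps one stemmer for another (for example the less aggressive `light_english` or `minimal_english`); it does not add lemmatization, which would need a separate component.

```python
def analysis_settings(stemmer_language="porter2"):
    """Build the `analysis` section of index settings for a custom analyzer.

    `stemmer_language` is passed to Elasticsearch's stemmer token filter,
    e.g. "english", "porter2", "light_english" or "minimal_english".
    """
    return {
        "analysis": {
            "filter": {
                "english_stop": {"type": "stop", "stopwords": "_english_"},
                "english_stemmer": {
                    "type": "stemmer",
                    "language": stemmer_language,
                },
            },
            "analyzer": {
                "adam_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "english_stop", "english_stemmer"],
                }
            },
        }
    }


light = analysis_settings("light_english")
```

The effect of each choice on a token such as 'friendly' can then be checked against a running cluster with the analyze API before reindexing.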

trying to run adam.py but getting DeprecationWarning

I'm trying to run adam.py, but I keep getting a DeprecationWarning:

home/idoroiengel/PycharmProjects/adam_qas/adam_qas/venv/lib/python3.6/site-packages/sklearn/externals/joblib/__init__.py:15: DeprecationWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.
  warnings.warn(msg, category=DeprecationWarning)
/home/idoroiengel/PycharmProjects/adam_qas/adam_qas/venv/lib/python3.6/site-packages/sklearn/base.py:306: UserWarning: Trying to unpickle estimator LinearSVC from version 0.19.1 when using version 0.21.2. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)

output of: $ python -m qas.sys_info

System: Linux
Platform: #55-Ubuntu SMP Wed May 15 14:27:21 UTC 2019
Python: 3.6.8
Elasticsearch: (7, 0, 2)
Old Index Mapping. Manually reindex the index to persist your data.

 -- Old Index Mapping. Manually reindex the index to persist your data.--

I tried to change the scikit-learn version in the requirements, but that did not help; it says the current one is already the highest version number.

Implement Tree Data Structure for representing data.

A tree can be used to represent the associations between a primary topic and its related topics. Any of the related topics can later act as the primary topic for another query targeted at that topic.
e.g. When did Linus release the first prototype of Linux ?
[Linus] -> [Linux], [Git], [Subsurface], [Linux Foundation], [Finland].....
Here [Linus] is the root node and [Linux], [Git], [Subsurface], [Linux Foundation], [Finland] are the nodes of the tree at level 1.
Now if the next question posed is: What is the license of Linux ?
We can pursue the node [Linux] further, treating it as the root node for the current question.
[Linux] -> [Linus], [Android], [GPL 2]......
Thus we can maintain the context throughout the interaction with the system, without the user feeling lost.
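A minimal sketch of the idea, using the topics from the example above. The class and method names are hypothetical and not part of adam_qas; how related topics would actually be discovered is left open.

```python
from collections import deque


class TopicNode:
    """A topic with its related topics as children."""

    def __init__(self, topic):
        self.topic = topic
        self.children = {}

    def add(self, topic):
        """Attach a related topic (once) and return its node."""
        if topic not in self.children:
            self.children[topic] = TopicNode(topic)
        return self.children[topic]

    def find(self, topic):
        """Breadth-first search for a topic in this subtree."""
        queue = deque([self])
        while queue:
            node = queue.popleft()
            if node.topic == topic:
                return node
            queue.extend(node.children.values())
        return None


# First question: [Linus] is the root, related topics sit at level 1.
root = TopicNode("Linus")
for related in ["Linux", "Git", "Subsurface", "Linux Foundation", "Finland"]:
    root.add(related)

# Follow-up question about Linux: reuse the existing node as the new root.
context = root.find("Linux")
for related in ["Linus", "Android", "GPL 2"]:
    context.add(related)
```

Because `context` is the same object that hangs under `root`, the topics gathered for the follow-up question stay connected to the earlier conversation.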

mapper_parsing_exception error

I am using Elasticsearch version 6.1.1 and Python 3.6.4.

I am getting a mapper_parsing_exception error when trying to run the code. Below is the error:

ERROR:qas.esstore.es_connect:Error Occurred in Index creation:
{
    'error': 
    {
        'root_cause': 
        [{
            'type': 'mapper_parsing_exception', 
            'reason': 'Root mapping definition has unsupported parameters:  [article : {_meta={version=2}, properties={content_table={analyzer=adam_analyzer, type=text}, raw={type=object, enabled=false}, content_info={analyzer=adam_analyzer, type=text}, title={analyzer=adam_analyzer, type=text}, updated={type=date}, content={analyzer=adam_analyzer, type=text}, revision={type=long}}}]'
        }],
        'type': 'mapper_parsing_exception', 
        'reason': 'Failed to parse mapping [_doc]: Root mapping definition has unsupported parameters:  [article : {_meta={version=2}, properties={content_table={analyzer=adam_analyzer, type=text}, raw={type=object, enabled=false}, content_info={analyzer=adam_analyzer, type=text}, title={analyzer=adam_analyzer, type=text}, updated={type=date}, content={analyzer=adam_analyzer, type=text}, revision={type=long}}}]', 
        'caused_by': 
        {
            'type': 'mapper_parsing_exception', 
            'reason': 'Root mapping definition has unsupported parameters:  [article : {_meta={version=2}, properties={content_table={analyzer=adam_analyzer, type=text}, raw={type=object, enabled=false}, content_info={analyzer=adam_analyzer, type=text}, title={analyzer=adam_analyzer, type=text}, updated={type=date}, content={analyzer=adam_analyzer, type=text}, revision={type=long}}}]'
        }
    }, 
    'status': 400
}

How to fix this ?
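The message says the mapping was parsed for `_doc` and that `article` was found as an unexpected top-level parameter. One setup known to produce this message is sending a mapping that still has a type-name level to a server that expects a typeless mapping (Elasticsearch 7 and later); whether that applies to the 6.1.1 install reported here is not confirmed in this thread. Under that assumption, a hypothetical helper that strips the type-name level could look like this:

```python
# Keys that may legitimately appear at the top of a typeless mapping.
MAPPING_LEVEL_KEYS = {
    "properties", "_meta", "_source", "_routing",
    "dynamic", "dynamic_templates",
}


def to_typeless_mapping(mappings):
    """Return the mapping without a surrounding type-name level.

    {"article": {"properties": {...}}} becomes {"properties": {...}};
    a mapping that is already typeless is returned unchanged.
    """
    if len(mappings) == 1:
        (name, inner), = mappings.items()
        if (
            name not in MAPPING_LEVEL_KEYS
            and isinstance(inner, dict)
            and "properties" in inner
        ):
            return inner
    return mappings


typed = {
    "article": {
        "_meta": {"version": 2},
        "properties": {"title": {"type": "text", "analyzer": "adam_analyzer"}},
    }
}
typeless = to_typeless_mapping(typed)
```

The other direction is to keep the typed mapping and run a server and client version that still accept it.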

Incomplete Requirements

README installation instructions are incomplete.
For an example setup.py and requirements.txt (created mostly by PyScaffold) see PR-13: #13

No module named 'dill.dill'

When I try to run your project I get this error:
(A friend also tried on a different PC running Ubuntu and got the same error.)

adrian:~/Worskpace/adam_qas$ python3 -m qas.adam -vv "When was linux kernel version 4.0 released ?"

[2018-07-02 22:52:44] DEBUG:__main__:Thinking...
I think what you want to know is: When was linux kernel version 4.0 released ?
[2018-07-02 22:52:57] DEBUG:qas.classifier.question_classifier:WH : When | WH-POS : WRB | WH-NBOR-POS : VBD | Root-POS : VBN
[2018-07-02 22:52:57] DEBUG:qas.classifier.question_classifier:Union Columns: 63
[2018-07-02 22:52:57] DEBUG:qas.classifier.question_classifier:Training data: (5365, 63)
[2018-07-02 22:52:57] DEBUG:qas.classifier.question_classifier:Target data: (1, 63)
Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/adrian/Worskpace/adam_qas/qas/adam.py", line 286, in <module>
    run()
  File "/home/adrian/Worskpace/adam_qas/qas/adam.py", line 282, in run
    main(sys.argv[1:])
  File "/home/adrian/Worskpace/adam_qas/qas/adam.py", line 274, in main
    qas.process_question()
  File "/home/adrian/Worskpace/adam_qas/qas/adam.py", line 106, in process_question
    self.question_class = classify_question(self.question_doc)
  File "/home/adrian/Worskpace/adam_qas/qas/classifier/question_classifier.py", line 164, in classify_question
    question_clf = load_classifier_model()
  File "/home/adrian/Worskpace/adam_qas/qas/classifier/question_classifier.py", line 96, in load_classifier_model
    return joblib.load(training_model_path)
  File "/home/adrian/Worskpace/adam_qas/virtualenv/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 578, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/home/adrian/Worskpace/adam_qas/virtualenv/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 508, in _unpickle
    obj = unpickler.load()
  File "/usr/lib/python3.5/pickle.py", line 1039, in load
    dispatch[key[0]](self)
  File "/usr/lib/python3.5/pickle.py", line 1334, in load_global
    klass = self.find_class(module, name)
  File "/usr/lib/python3.5/pickle.py", line 1384, in find_class
    __import__(module, level=0)
ImportError: No module named 'dill.dill'

Can't figure out why i got this.

Result of pip freeze :

boto==2.48.0
boto3==1.7.48
botocore==1.10.48
bz2file==0.98
certifi==2018.4.16
chardet==3.0.4
cymem==1.31.2
cytoolz==0.8.2
dill==0.2.8.2
docutils==0.14
elasticsearch==6.3.0
en-core-web-md==2.0.0
en-core-web-sm==2.0.0
gensim==3.4.0
idna==2.7
jmespath==0.9.3
lxml==4.2.3
msgpack-numpy==0.4.1
msgpack-python==0.5.6
murmurhash==0.28.0
numpy==1.14.5
pandas==0.23.1
pathlib==1.0.1
pkg-resources==0.0.0
plac==0.9.6
preshed==1.0.0
python-dateutil==2.7.3
pytz==2018.5
regex==2017.4.5
requests==2.19.1
s3transfer==0.1.13
scikit-learn==0.19.1
scipy==1.1.0
six==1.11.0
smart-open==1.6.0
spacy==2.0.11
termcolor==1.1.0
thinc==6.10.2
toolz==0.9.0
tqdm==4.23.4
ujson==1.35
urllib3==1.23
wrapt==1.10.11
python --version
Python 3.5.2

Does anyone have the same issue ?
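The traceback shows pickle trying to import `dill.dill` while loading the saved classifier, and the `pip freeze` output lists dill 0.2.8.2, so the pickled model appears to reference a module path that the installed dill does not provide. Installing a dill release that still has that module, or re-serializing the model, are the obvious routes (neither is verified here). The general mechanism, and a generic way around it, can be sketched with the standard library alone; `fakelib_old` and `fakelib_new` are invented stand-ins, not real packages.

```python
import io
import pickle
import sys
import types


class Model:
    """Stand-in for a pickled classifier."""

    def __init__(self, weights):
        self.weights = weights


# Pretend the class lives in a module called "fakelib_old" and pickle it.
Model.__qualname__ = "Model"
Model.__module__ = "fakelib_old"
old_module = types.ModuleType("fakelib_old")
old_module.Model = Model
sys.modules["fakelib_old"] = old_module
payload = pickle.dumps(Model([0.1, 0.2]))

# Simulate an upgrade in which the module was renamed to "fakelib_new".
del sys.modules["fakelib_old"]
new_module = types.ModuleType("fakelib_new")
new_module.Model = Model
Model.__module__ = "fakelib_new"
sys.modules["fakelib_new"] = new_module


class RenamingUnpickler(pickle.Unpickler):
    """Unpickler that redirects old module names to their new location."""

    RENAMED = {"fakelib_old": "fakelib_new"}

    def find_class(self, module, name):
        return super().find_class(self.RENAMED.get(module, module), name)


try:
    pickle.loads(payload)
    plain_load_failed = False
except ImportError:  # the pickle still points at the old module name
    plain_load_failed = True

restored = RenamingUnpickler(io.BytesIO(payload)).load()
```

Note that the traceback goes through joblib's own loader, so this sketch illustrates why the import fails rather than being a drop-in patch for `load_classifier_model`.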

connection issue

When I try to run the qas.adam, it gives me the following errors.

(venv) admin@BLR:~/aman/experiments/adam_qas$ python -m qas.adam "When was linux kernel version 4.0 released ?"
I think what you want to know is: When was linux kernel version 4.0 released ?
Traceback (most recent call last):
File "/home/administrator/aman/experiments/adam_qas/venv/lib/python3.6/site-packages/urllib3/connection.py", line 141, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "/home/administrator/aman/experiments/adam_qas/venv/lib/python3.6/site-packages/urllib3/util/connection.py", line 83, in create_connection
raise err
File "/home/administrator/aman/experiments/adam_qas/venv/lib/python3.6/site-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/administrator/aman/experiments/adam_qas/venv/lib/python3.6/site-packages/elasticsearch/connection/http_urllib3.py", line 166, in perform_request
response = self.pool.urlopen(method, url, body, retries=False, headers=request_headers, **kw)
File "/home/administrator/aman/experiments/adam_qas/venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 639, in urlopen
_stacktrace=sys.exc_info()[2])
File "/home/administrator/aman/experiments/adam_qas/venv/lib/python3.6/site-packages/urllib3/util/retry.py", line 333, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/home/administrator/aman/experiments/adam_qas/venv/lib/python3.6/site-packages/urllib3/packages/six.py", line 686, in reraise
raise value
File "/home/administrator/aman/experiments/adam_qas/venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
chunked=chunked)
File "/home/administrator/aman/experiments/adam_qas/venv/lib/python3.6/site-packages/urllib3/connectionpool.py", line 357, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.6/http/client.py", line 1239, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1285, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1026, in _send_output
self.send(msg)
File "/usr/lib/python3.6/http/client.py", line 964, in send
self.connect()
File "/home/administrator/aman/experiments/adam_qas/venv/lib/python3.6/site-packages/urllib3/connection.py", line 166, in connect
conn = self._new_conn()
File "/home/administrator/aman/experiments/adam_qas/venv/lib/python3.6/site-packages/urllib3/connection.py", line 150, in _new_conn
self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fe75d441e10>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/administrator/aman/experiments/adam_qas/qas/adam.py", line 273, in <module>
run()
File "/home/administrator/aman/experiments/adam_qas/qas/adam.py", line 269, in run
main(sys.argv[1:])
File "/home/administrator/aman/experiments/adam_qas/qas/adam.py", line 262, in main
answer = qas.process_answer()
File "/home/administrator/aman/experiments/adam_qas/qas/adam.py", line 108, in process_answer
search_wikipedia(self.question_keywords, self.search_depth)
File "/home/administrator/aman/experiments/adam_qas/qas/wiki/wiki_search.py", line 21, in search_wikipedia
wikif = WikiFetch(wiki_page_ids)
File "/home/administrator/aman/experiments/adam_qas/qas/wiki/wiki_fetch.py", line 37, in __init__
self.es_ops = ElasticSearchOperate()
File "/home/administrator/aman/experiments/adam_qas/qas/esstore/es_operate.py", line 24, in __init__
es = ElasticSearchConn()
File "/home/administrator/aman/experiments/adam_qas/qas/esstore/es_connect.py", line 18, in __call__
cls._instances[cls] = super(ElasticSearchMeta, cls).__call__(*args, **kwargs)
File "/home/administrator/aman/experiments/adam_qas/qas/esstore/es_connect.py", line 31, in __init__
self.set_up_index()
File "/home/administrator/aman/experiments/adam_qas/qas/esstore/es_connect.py", line 121, in set_up_index
index_exists = self.es_conn.indices.exists(index=index_name)
File "/home/administrator/aman/experiments/adam_qas/venv/lib/python3.6/site-packages/elasticsearch/client/utils.py", line 76, in _wrapped
return func(*args, params=params, **kwargs)
File "/home/administrator/aman/experiments/adam_qas/venv/lib/python3.6/site-packages/elasticsearch/client/indices.py", line 213, in exists
params=params)
File "/home/administrator/aman/experiments/adam_qas/venv/lib/python3.6/site-packages/elasticsearch/transport.py", line 314, in perform_request
status, headers_response, data = connection.perform_request(method, url, params, body, headers=headers, ignore=ignore, timeout=timeout)
File "/home/administrator/aman/experiments/adam_qas/venv/lib/python3.6/site-packages/elasticsearch/connection/http_urllib3.py", line 175, in perform_request
raise ConnectionError('N/A', str(e), e)
elasticsearch.exceptions.ConnectionError: ConnectionError(<urllib3.connection.HTTPConnection object at 0x7fe75d441e10>: Failed to establish a new connection: [Errno 111] Connection refused) caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x7fe75d441e10>: Failed to establish a new connection: [Errno 111] Connection refused)

Your Environment

  • Operating System: Linux
  • Python Version Used: 3.6.3

Error With Running

[2019-05-18 23:44:29] DEBUG:qas.esstore.es_operate:{'query': {'bool': {'must': [{'multi_match': {'query': 'linux kernel version 4.0 release', 'type': 'most_fields', 'fields': ['content', 'content_info', 'content_table']}}], 'should': [], 'must_not': []}}}
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/mnt/drive2/programs/adam_qas/qas/adam.py", line 286, in <module>
    run()
  File "/mnt/drive2/programs/adam_qas/qas/adam.py", line 282, in run
    main(sys.argv[1:])
  File "/mnt/drive2/programs/adam_qas/qas/adam.py", line 275, in main
    answer = qas.process_answer()
  File "/mnt/drive2/programs/adam_qas/qas/adam.py", line 124, in process_answer
    wiki_pages = search_rank(self.query)
  File "/mnt/drive2/programs/adam_qas/qas/doc_search_rank.py", line 13, in search_rank
    result_all = es.search_wiki_article(query)
  File "/mnt/drive2/programs/adam_qas/qas/esstore/es_operate.py", line 266, in search_wiki_article
    es_result = self.es_conn.search(index=__index_name__, doc_type=__doc_type__, body=search_body)
  File "/usr/local/lib/python3.6/dist-packages/elasticsearch/client/utils.py", line 84, in _wrapped
    return func(*args, params=params, **kwargs)
TypeError: search() got an unexpected keyword argument 'doc_type'
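The TypeError means the installed Elasticsearch Python client's `search()` no longer accepts `doc_type`, while `es_operate.py` still passes it. Installing a client release that matches the server and still accepts the argument is one route (not verified here). Another is a small compatibility wrapper, sketched below with plain functions standing in for the client; `search_compat` and both stub functions are hypothetical.

```python
def search_compat(search, index, body, doc_type=None):
    """Call `search` with doc_type when accepted, without it otherwise.

    `search` is any callable with the keyword interface of a client's
    search method. Catching TypeError is a blunt instrument: it would
    also hide a TypeError raised for another reason on the first call.
    """
    if doc_type is not None:
        try:
            return search(index=index, doc_type=doc_type, body=body)
        except TypeError:
            pass  # this client does not know doc_type; retry without it
    return search(index=index, body=body)


# Stand-ins for an older and a newer client method.
def old_style_search(index=None, doc_type=None, body=None):
    return ("old", index, doc_type)


def new_style_search(index=None, body=None):
    return ("new", index)


old_result = search_compat(old_style_search, "wiki", {}, doc_type="article")
new_result = search_compat(new_style_search, "wiki", {}, doc_type="article")
```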

Guidance on providing additional languages

Hello,
I would like to invest some time in helping to add support for multiple languages. I am mainly interested in Italian, but I guess the approach could be generic.

Could you provide some pointers on how you would proceed, possibly broken down into subtasks?

Thank you

Question Answering

Super epic piece of Python! It took me a while to get it to work (I tried Windows), but now on Linux Mint it's working!

The question classification works epicly well (if I don't do weird things like WHERE is amsterdam?). I'm looking into this library for my final internship and am amazed! The only thing I notice is that the answers are often quite weird. Some are fine some are not even close. Is this known? I know it is work in progress tho.

An example is below (the question is copied from commented code in feature_extractor.py).

using spacy 1.9!

Q:>What's the American dollar equivalent for 8 pounds in the U.K. ?
Question: What's the American dollar equivalent for 8 pounds in the U.K. ?
Class: ['DESC']
Question Features: ['equivalent american dollar', '8', 'pounds', 'U.K.', 'be']
Question Query: [['equivalent american dollar', '8', 'pounds', 'U.K.', 'be'], [], [], []]
Fetching Knowledge source...
Pages Fetched: 11
Documents pre-processed
Ranked Pages: [(0, 0.21964739), (2, 0.20715606), (3, 0.19018666)]
Searching:  ['american dollar', '8', 'dollar', 'american', 'u.k.', 'equivalent american dollar', 'equivalent american', 'be', 'pounds', 'equivalent']
Candidate Answer: (5) ['Kate Plus 8 is an American reality television show.', 'American trade dollars are 90% silver and weigh 420 grains (27.2 g), which is about 0.5 g heavier than the regular circulation Seated Liberty Dollars and Morgan Dollars.', 'The opening ceremony of the Summer Olympics in Beijing started at 8 seconds and 8 minutes past 8 pm (local time) on 8 August 2008.', 'The currency was ultimately replaced by the silver dollar at the rate of 1 silver dollar to 1000 continental dollars.', 'The Sacagawea dollar is one example of the copper alloy dollar.']
Answer: Kate Plus 8 is an American reality television show. American trade dollars are 90% silver and weigh 420 grains (27.2 g), which is about 0.5 g heavier than the regular circulation Seated Liberty Dollars and Morgan Dollars. The opening ceremony of the Summer Olympics in Beijing started at 8 seconds and 8 minutes past 8 pm (local time) on 8 August 2008. The currency was ultimately replaced by the silver dollar at the rate of 1 silver dollar to 1000 continental dollars. The Sacagawea dollar is one example of the copper alloy dollar.
Total time : 20.559940099716187

Process finished with exit code 0

Q:>How many colored squares are there on a Rubik's Cube
Question: How many colored squares are there on a Rubik's Cube
Class: ['NUM']

Question Features: ['colored squares', 'Rubik', 'Cube', 'be']
Question Query: [['colored squares', 'Rubik', 'Cube', 'be'], [], [], []]
Fetching Knowledge source...
Pages Fetched: 7
Documents pre-processed
Ranked Pages: [(3, 0.54112279), (5, 0.32183689), (4, 0.27689046)]
Searching:  ['cube', 'squares', 'be', 'rubik', 'colored', 'colored squares']
Candidate Answer: (5) ["Tesseract Trapezohedron Miscellaneous cubes Cube (film) Diamond cubic Lövheim cube of emotion Cube of Heymans Necker Cube OLAP cube Prince Rupert's cube Rubik's Cube The Cube (game show) Unit cube Yoshimoto Cube Kaaba 'Rubik's Cube is a 3-D combination puzzle invented in 1974 by Hungarian sculptor and professor of architecture Ernő Rubik.", 'ISBN 157912805X. Official website Safecracker Method: Solving Rubik\'s Cube with just 10 Numbers Rubik\'s Cube at DMOZ How to solve a Rubik\'s Cube on YouTube World record 5.55 second solution on YouTube World Cube Association List of related puzzles and solutions Complete disassembly of a 3^3 classic Rubik\'s cube Speedsolving Wiki "Rubik\'s Cube". Google Doodle. Retrieved 2014-05-19.', "Mathematics of the Rubik's Cube Design.", 'Google has released the Chrome Cube Lab in association with Ernő Rubik.', "Ideal rebranded The Magic Cube to the Rubik's Cube before its introduction to an international audience in 1980."]
Answer: Tesseract Trapezohedron Miscellaneous cubes Cube (film) Diamond cubic Lövheim cube of emotion Cube of Heymans Necker Cube OLAP cube Prince Rupert's cube Rubik's Cube The Cube (game show) Unit cube Yoshimoto Cube Kaaba 'Rubik's Cube is a 3-D combination puzzle invented in 1974 by Hungarian sculptor and professor of architecture Ernő Rubik. ISBN 157912805X. Official website Safecracker Method: Solving Rubik's Cube with just 10 Numbers Rubik's Cube at DMOZ How to solve a Rubik's Cube on YouTube World record 5.55 second solution on YouTube World Cube Association List of related puzzles and solutions Complete disassembly of a 3^3 classic Rubik's cube Speedsolving Wiki "Rubik's Cube". Google Doodle. Retrieved 2014-05-19. Mathematics of the Rubik's Cube Design. Google has released the Chrome Cube Lab in association with Ernő Rubik. Ideal rebranded The Magic Cube to the Rubik's Cube before its introduction to an international audience in 1980.
Total time : 16.014000177383423

Invalid Answers

When asking: "Who was the first president of the United States?"

The answer is:

Normally vice presidents hold some power and special responsibilities below that of the president. The amendment also specifies that if any eligible person serves as president or acting president for more than two years of a term for which some other eligible person was elected president, the former can only be elected president once. Mitt Romney for president. Perhaps the best known sub-national presidents are the borough presidents of the Five Boroughs of New York City. The president fulfills various ceremonial duties.

Implement custom Wikipedia scraper

Endpoint: https://en.wikipedia.org/w/api.php (with a User-Agent header)

  • format: json
  • action: query
  • list: search
  • srsearch: Search for all page titles (or content) that have this value.
  • srwhat: Search inside the text or titles.
  • srlimit: How many total pages to return. No more than 50 (500 for bots) allowed. (Default: 10)
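The parameters above can be assembled into a request with the standard library. This is a sketch that only builds the request object and does not send it; the function name and the User-Agent string are placeholders to be replaced with real project details.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_search_request(term, what="text", limit=10,
                         user_agent="adam_qas-example/0.1 (placeholder)"):
    """Build (but do not send) a list=search query against the API."""
    params = {
        "format": "json",
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srwhat": what,             # "text" or "title"
        "srlimit": min(limit, 50),  # non-bot clients are capped at 50
    }
    url = API_ENDPOINT + "?" + urlencode(params)
    return Request(url, headers={"User-Agent": user_agent})


request = build_search_request("linux kernel", limit=80)
```

Passing `request` to `urllib.request.urlopen` and decoding the JSON body would then yield the matching pages under `query.search` in the response.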

Anaphora resolution

Pronouns are affecting the accuracy of the vector space model; to avoid that, I am trying to implement anaphora resolution.

Windows memory error

python -m qas.adam -vv "When was linux kernel version 4.0 released ?"
C:\Program Files (x86)\Python37-32\lib\site-packages\gensim\utils.py:1212: UserWarning: detected Windows; aliasing chunkize to chunkize_serial
warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
[2018-12-03 20:28:22] DEBUG:__main__:Thinking...
I think what you want to know is: When was linux kernel version 4.0 released ?
....
....
[2018-12-03 20:29:34] DEBUG:qas.esstore.es_operate:Article Updated:updated
[2018-12-03 20:29:34] DEBUG:qas.wiki.wiki_parse:Parsed content length: 992
[2018-12-03 20:29:34] INFO:qas.wiki.wiki_parse:Inserted parsed content for: 24845611
[2018-12-03 20:29:34] DEBUG:qas.esstore.es_operate:{'query': {'bool': {'must': [{'multi_match': {'query': 'linux kernel version 4.0 release', 'type': 'most_fields', 'fields': ['content', 'content_info', 'content_table']}}], 'should': [], 'must_not': []}}}
[2018-12-03 20:29:34] DEBUG:root:Ranked Wiki Pages Title: ['Linux kernel', 'Linux', 'Kernel-based Virtual Machine', 'Linux kernel interfaces', 'Paint.net', 'Release early, release often', 'Creative Commons license', '4.0', 'Software release life cycle', 'Jeff Bridges']
[2018-12-03 20:29:34] INFO:__main__:Pages retrieved: 10
Traceback (most recent call last):
File "C:\Program Files (x86)\Python37-32\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "C:\Program Files (x86)\Python37-32\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Vineet\Python\adam_qas\qas\adam.py", line 286, in <module>
run()
File "C:\Vineet\Python\adam_qas\qas\adam.py", line 282, in run
main(sys.argv[1:])
File "C:\Vineet\Python\adam_qas\qas\adam.py", line 275, in main
answer = qas.process_answer()
File "C:\Vineet\Python\adam_qas\qas\adam.py", line 128, in process_answer
self.candidate_answers, keywords = get_candidate_answers(self.query, wiki_pages, self.nlp)
File "C:\Vineet\Python\adam_qas\qas\candidate_ans.py", line 122, in get_candidate_answers
en_doc = en_nlp(u'' + combined_document_str)
File "C:\Program Files (x86)\Python37-32\lib\site-packages\spacy\language.py", line 346, in __call__
doc = proc(doc)
File "nn_parser.pyx", line 338, in spacy.syntax.nn_parser.Parser.__call__
File "nn_parser.pyx", line 401, in spacy.syntax.nn_parser.Parser.parse_batch
File "nn_parser.pyx", line 730, in spacy.syntax.nn_parser.Parser.get_batch_model
File "nn_parser.pyx", line 85, in spacy.syntax.nn_parser.precompute_hiddens.__init__
File "C:\Program Files (x86)\Python37-32\lib\site-packages\spacy\_ml.py", line 149, in begin_update
self.W.reshape((self.nF*self.nO*self.nP, self.nI)).T)
MemoryError

Environment

python -m qas.sys_info
System: Windows
Platform: 10.0.17134
Python: 3.7.1
Elasticsearch: (6, 3, 1)
Elasticsearch Mapping: {'settings': {'number_of_shards': 1, 'number_of_replicas': 0, 'analysis': {'filter': {'english_stop': {'type': 'stop', 'stopwords': 'english'}, 'english_porter2': {'type': 'stemmer', 'language': 'porter2'}}, 'analyzer': {'adam_analyzer': {'type': 'custom', 'tokenizer': 'standard', 'filter': ['lowercase', 'english_stop', 'english_porter2']}}}}, 'mappings': {'article': {'_meta': {'version': 2}, 'properties': {'title': {'type': 'text', 'analyzer': 'adam_analyzer'}, 'updated': {'type': 'date'}, 'raw': {'type': 'object', 'enabled': 'false'}, 'content': {'type': 'text', 'analyzer': 'adam_analyzer'}, 'content_info': {'type': 'text', 'analyzer': 'adam_analyzer'}, 'content_table': {'type': 'text', 'analyzer': 'adam_analyzer'}, 'revision': {'type': 'long'}}}}}

* Question you were trying to ask: Exception as shown above.
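The install path `Python37-32` suggests a 32-bit interpreter. A 32-bit process can address only a few gigabytes regardless of how much RAM is installed, which may be relevant when spaCy parses a large combined document; the thread does not confirm that this is the cause. The interpreter's word size can be checked like this (the function name is made up for illustration):

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer width of the running Python build (32 or 64)."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "is a",
          interpreter_bits(), "bit build")
```

If it reports 32, retrying with a 64-bit Python build would be a cheap experiment.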

running in Jupyter

Hi,
I'm trying to run the code in Jupyter and have everything installed, but when running:

import sys
sys.path.insert(0, '/the/path/in/my/computer')

import qas
from qas import adam
import spacy
from qas.constants import EN_MODEL_MD, EN_MODEL_DEFAULT, EN_MODEL_SM

adam.main("when was WW2?")

I get the error:

usage: ipykernel_launcher.py [-h] [--version] [-l XX] [-n Y] [--lite]
[--model XXX_XX] [-v] [-vv]
"QUESTION"
ipykernel_launcher.py: error: unrecognized arguments: h e n w a s W W 2 ?

An exception has occurred, use %tb to see the full traceback.

SystemExit: 2
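The error lists single characters ("h e n w a s W W 2 ?"), which is what argparse produces when it receives a string where it expects a list of arguments: the string is split into characters, the first one is taken as the question and the rest are rejected. The tracebacks in other issues show the entry point being called as `main(sys.argv[1:])`, i.e. with a list. The behaviour can be reproduced with a stand-in parser (this is not the actual parser from qas.adam):

```python
import argparse

# Hypothetical parser with one positional argument, like the usage text.
parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("question", metavar='"QUESTION"')
parser.add_argument("-v", action="store_true")

# A bare string is treated as a sequence of one-character arguments.
from_string, leftover_chars = parser.parse_known_args("when was WW2?")

# A list with one element is parsed as a single question.
from_list, leftover_args = parser.parse_known_args(["when was WW2?"])
```

Here `from_string.question` is just `"w"` with the remaining characters left over, while `from_list.question` holds the whole sentence. On that basis, `adam.main(["when was WW2?"])` is probably the call that was intended.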
