
wikipedia's Introduction

Wikipedia


Wikipedia is a Python library that makes it easy to access and parse data from Wikipedia.

Search Wikipedia, get article summaries, get data like links and images from a page, and more. Wikipedia wraps the MediaWiki API so you can focus on using Wikipedia data, not getting it.

>>> import wikipedia
>>> print wikipedia.summary("Wikipedia")
# Wikipedia (/ˌwɪkɨˈpiːdiə/ or /ˌwɪkiˈpiːdiə/ WIK-i-PEE-dee-ə) is a collaboratively edited, multilingual, free Internet encyclopedia supported by the non-profit Wikimedia Foundation...

>>> wikipedia.search("Barack")
# [u'Barak (given name)', u'Barack Obama', u'Barack (brandy)', u'Presidency of Barack Obama', u'Family of Barack Obama', u'First inauguration of Barack Obama', u'Barack Obama presidential campaign, 2008', u'Barack Obama, Sr.', u'Barack Obama citizenship conspiracy theories', u'Presidential transition of Barack Obama']

>>> ny = wikipedia.page("New York")
>>> ny.title
# u'New York'
>>> ny.url
# u'http://en.wikipedia.org/wiki/New_York'
>>> ny.content
# u'New York is a state in the Northeastern region of the United States. New York is the 27th-most exten'...
>>> ny.links[0]
# u'1790 United States Census'

>>> wikipedia.set_lang("fr")
>>> wikipedia.summary("Facebook", sentences=1)
# Facebook est un service de réseautage social en ligne sur Internet permettant d'y publier des informations (photographies, liens, textes, etc.) en contrôlant leur visibilité par différentes catégories de personnes.

Note: this library was designed for ease of use and simplicity, not for advanced use. If you plan on doing serious scraping or automated requests, please use Pywikipediabot (or one of the other more advanced Python MediaWiki API wrappers), which has a larger API, rate limiting, and other features so we can be considerate of the MediaWiki infrastructure.

Installation

To install Wikipedia, simply run:

$ pip install wikipedia

Wikipedia is compatible with Python 2.6+ (2.7+ to run unittest discover) and Python 3.3+.

Documentation

Read the docs at https://wikipedia.readthedocs.org/en/latest/.

To run tests, clone the repository on GitHub, then run:

$ pip install -r requirements.txt
$ bash runtests  # will run tests for python and python3
$ python -m unittest discover tests/ '*test.py'  # manual style

in the root project directory.

To build the documentation yourself, after installing requirements.txt, run:

$ pip install sphinx
$ cd docs/
$ make html

License

MIT licensed. See the LICENSE file for full details.

Credits

  • wiki-api by @richardasaurus for inspiration
  • @nmoroze and @themichaelyang for feedback and suggestions
  • The Wikimedia Foundation for giving the world free access to data

wikipedia's People

Contributors

arcolife, astavonin, bitdeli-chef, crazybmanp, frewsxcv, fusiongyro, goldsmith, imkevinxu, infothrill, javierprovecho, jongoodnow, jvanasco, kazuar, legoktm, mjpieters, mkasprz, razerm, sachavakili, salty-horse, theopolisme, wronglink


wikipedia's Issues

WikipediaException: unknown exception on summary

An unknown error has been thrown during summary()

Here is the code snippet calling it; " ".join(request_words[1:]) resolved here to ":-)". The language was set to "fr". This seems to be easily reproducible.

try:
    science = wikipedia.summary(" ".join(request_words[1:]),
                                sentences = 1)
    say(science)
except wikipedia.exceptions.DisambiguationError:
    say("aba c ambigü dsl")
except wikipedia.exceptions.PageError:
    say("sa nexist pa B|")

The traceback:

  File "/home/shgck/edmond/brain.py", line 354, in handle_request
    sentences = 1)
  File "/usr/local/lib/python3.4/site-packages/wikipedia/util.py", line 28, in __call__
    ret = self._cache[key] = self.fn(*args, **kwargs)
  File "/usr/local/lib/python3.4/site-packages/wikipedia/wikipedia.py", line 231, in summary
    page_info = page(title, auto_suggest=auto_suggest, redirect=redirect)
  File "/usr/local/lib/python3.4/site-packages/wikipedia/wikipedia.py", line 270, in page
    results, suggestion = search(title, results=1, suggestion=True)
  File "/usr/local/lib/python3.4/site-packages/wikipedia/util.py", line 28, in __call__
    ret = self._cache[key] = self.fn(*args, **kwargs)
  File "/usr/local/lib/python3.4/site-packages/wikipedia/wikipedia.py", line 109, in search
    raise WikipediaException(raw_results['error']['info'])
wikipedia.exceptions.WikipediaException: An unknown error occured: "La recherche d’arrière-plan a renvoyé une erreur : ". Please report it on GitHub!

Pool queue is full

autowikibot-commenter.py is my script.

Traceback (most recent call last):
  File "autowikibot-commenter.py", line 272, in <module>
    url_string, bit_comment_start = process_summary_call(post)
      File "autowikibot-commenter.py", line 177, in process_summary_call
    trialsummary = wikipedia.summary(term,auto_suggest=True)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/util.py", line 23, in __call__
    ret = self._cache[key] = self.fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 169, in summary
    page_info = page(title, auto_suggest=auto_suggest, redirect=redirect)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 204, in page
    results, suggestion = search(title, results=1, suggestion=True)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/util.py", line 23, in __call__
    ret = self._cache[key] = self.fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 91, in search
    raise WikipediaException(raw_results['error']['info'])
WikipediaException: An unknown error occured: "Pool queue is full". Please report it on GitHub!
[2014-01-13 23:05:36] GLOBAL: An unknown error occured: "Pool queue is full". Please report it on GitHub!

summary() will sometimes cut off sentences

Using the sentences parameter with wikipedia.summary() will sometimes lead to sentences being cut off. For example:

>>> wikipedia.summary('Induced radioactivity', sentences = 3)

This yields: 'Induced radioactivity occurs when a previously stable material has been made radioactive by exposure to specific radiation. Most radioactivity does not induce other material to become radioactive. This Induced radioactivity was discovered by Irène Curie and F.'

When it should actually be:

'Induced radioactivity occurs when a previously stable material has been made radioactive by exposure to specific radiation. Most radioactivity does not induce other material to become radioactive. This Induced radioactivity was discovered by Irène Curie and F. Joliot in 1934.'

I've only noticed this in summaries of articles containing abbreviated names (period), like 'F. Joliot'.
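
A possible workaround until the API behaves: fetch the full summary and split sentences client-side, only splitting after a period that follows a lowercase letter or digit, so initials like 'F.' stay attached. A minimal sketch (the regex is illustrative, not the library's tokenizer):

import re
import wikipedia

# Take the first three sentences without breaking on abbreviated names.
full = wikipedia.summary('Induced radioactivity')
sentences = re.split(r'(?<=[a-z0-9]\.)\s+', full)
print(' '.join(sentences[:3]))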

Sections returned empty


All of these sections do exist in the article, but they are returned either as an empty unicode string u'' or simply as None.

I think the last two have something to do with URL encoding.

Exception on passing emoji to search() and page()

Passing the emoji character "⌛" (U+231B HOURGLASS) to the Wikipedia MediaWiki API returns the redirect info, page id, and title for the article "Hourglass" that it redirects to:

<?xml version="1.0"?>
<api>
  <query>
    <redirects>
      <r from="⌛" to="Hourglass" />
    </redirects>
    <pages>
      <page pageid="4166493" ns="0" title="Hourglass" />
    </pages>
  </query>
</api>

but both the search() and page() methods throw an exception on such a query:

>>> wikipedia.page("⌛")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "build/bdist.macosx-10.9-intel/egg/wikipedia/wikipedia.py", line 270, in page
  File "build/bdist.macosx-10.9-intel/egg/wikipedia/util.py", line 28, in __call__
  File "build/bdist.macosx-10.9-intel/egg/wikipedia/wikipedia.py", line 109, in search
wikipedia.exceptions.WikipediaException: An unknown error occured: "The search backend returned an error: ". Please report it on GitHub!

Update: in the case of page(), this is apparently happening because of the auto_suggest default value. Therefore:

>>> wikipedia.page(u"⌛", auto_suggest=False)
<WikipediaPage 'Hourglass'>

Some puzzles when using Chinese

Hey, it is a really cool project! But today when I tried using it with Chinese, I encountered some problems: when I set the language to "zh", start my script with a Chinese word, and use the links it returns to fetch the webpages again, it throws some syntax errors.
The word I used is 双名法; the script got it from keyboard input. I hope you can solve my puzzles. Thank you!

wikipedia.page(<title>...) returns wrong page (in Arabic, maybe other languages)

Hi Jonathan,

I caught a bug (or what I think is a bug) by accident. I was grabbing parallel page titles in English and Arabic to enhance a translation system, and I noticed that "October" got translated as "medical diagnosis". This happened because I had grabbed the page "Medical diagnosis" in English, found the parallel page title in Arabic (using urllib2), and then pulled up the whole parallel page using 'wikipedia.page(<arabic_title>)', and got a completely different page. Even though the page title I got using urllib2 is correct ("diagnosis" in Arabic), the call to 'wikipedia.page(<diagnosis_in_arabic>)' brings up the Arabic page for "October".

$ python
>>> import wikipedia, urllib2, re
>>> wikipedia.set_lang("en")
>>> urllib2_agent = urllib2.build_opener()
>>> urllib2_agent.addheaders = [('User-agent', 'Mozilla/5.0')]
>>> en_page = wikipedia.page("Medical diagnosis")
>>> print en_page.title
Medical diagnosis
>>> parallel_title_data = urllib2_agent.open(u'http://www.wikidata.org/w/api.php?action=wbgetentities&sites=enwiki&titles=' + urllib2.quote(en_page.title.encode("utf-8")) + u'&languages=ar&props=labels&format=xml')
>>> parallel_title = re.findall('value="([^"]*)"', parallel_title_data.read())[0]
>>> print parallel_title
تشخيص
>>> wikipedia.set_lang("ar")
>>> parallel_page = wikipedia.page(parallel_title)
>>> print parallel_page.title
أكتوبر

Note that the first title means "diagnosis", and the second means "October". Google translate will confirm. (You can also pass 'parallel_page.content' to Google translate.) Very strange.
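
As with the emoji report above and the WiFi report below, the auto_suggest default looks like a plausible culprit here; a quick, untested check worth trying:

>>> parallel_page = wikipedia.page(parallel_title, auto_suggest=False)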

This is on (a very old version of) Redhat Linux with Python 2.7.2

$ cat /etc/redhat-release
Scientific Linux SL release 5.1 (Boron)

but I can reproduce the issue on Windows 7, Cygwin (1.7.30), Python 2.7.5.

Let me know if there's any more information you might need, and thanks for the tool.

Best,
Dennis

Sentence cut short


This is probably because of the period with a space after it. A sentence should not be cut off if, say, the period is inside a bracket.

Python 3: Charmap codec can't encode character

This is a pretty standard issue that occurs a lot in Python 3, especially when I'm building scrapers using requests and beautifulsoup4. I have forked this repo and will work on the issue, but I thought the core developers should know. I have been struggling with this problem; in Python 2 it is fixed using the .encode() and .decode() methods of strings, but in Python 3 it's a different case.

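For reference, a common workaround for charmap errors on a Windows console is to rewrap stdout; a sketch, assuming Python 3 and a console codepage that cannot render the characters:

import io
import sys
import wikipedia

# Encode output as UTF-8 instead of the console codepage; replace anything
# the stream still cannot handle rather than raising.
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8', errors='replace')
print(wikipedia.summary('Wikipedia'))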

The sections attribute is an empty list.

The example page where I've noticed the issue: Comparison of MUTCD-Influenced Traffic Signs.

Here's what's happening:

import wikipedia
mutcd = wikipedia.page('Comparison of MUTCD-Influenced Traffic Signs')
mutcd.sections

This outputs [].

I would expect the section headers from the ToC to appear in the list as mentioned in the documentation. Let me know if I'm just doing it wrong!
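
One way to cross-check is to ask the MediaWiki parse API for the section list directly, independent of the library's sections attribute; a sketch using requests:

import requests

# Fetch the ToC section headers straight from the parse API.
resp = requests.get('https://en.wikipedia.org/w/api.php', params={
    'action': 'parse',
    'page': 'Comparison of MUTCD-Influenced Traffic Signs',
    'prop': 'sections',
    'format': 'json',
})
print([s['line'] for s in resp.json()['parse']['sections']])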

Limit on page length with wikipedia.page

The following simple code:

page = wikipedia.page("List_of_poets_from_the_United_States")
print page.links

returns only about a quarter of the links on that page. Where is the limitation coming from?
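
For comparison, the MediaWiki API pages its link results, so every 'continue' batch has to be followed to get the full list; a direct-API sketch (not the library's code):

import requests

params = {
    'action': 'query',
    'titles': 'List_of_poets_from_the_United_States',
    'prop': 'links',
    'pllimit': 'max',
    'format': 'json',
    'continue': '',
}
links = []
while True:
    data = requests.get('https://en.wikipedia.org/w/api.php', params=params).json()
    for page in data['query']['pages'].values():
        links.extend(link['title'] for link in page.get('links', []))
    if 'continue' not in data:
        break
    # Feed the continuation token back into the next request.
    params.update(data['continue'])
print(len(links))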

wikipedia.geosearch() not working?

I'm trying geosearch() from the Python console; could I be missing some dependency?

>>> import wikipedia as wiki
>>> wiki.geosearch(-34.2, -54.3, radius = 1000)

Traceback (most recent call last):
  File "<pyshell#43>", line 1, in <module>
    wiki.geosearch(-34.2, -54.3, radius = 1000)
AttributeError: 'module' object has no attribute 'geosearch'

KeyError thrown when HTTP request times out (wikipedia.search)

Calling a series of searches like wikipedia.search(item, results=1) occasionally results in the error:

File "/Library/Python/2.7/site-packages/wikipedia/wikipedia.py", line 47, in search
    search_results = (d['title'] for d in raw_results['query']['search'])
KeyError: 'query'

This is because the raw_results dict looks like this:
{u'servedby': u'mw1118', u'error': {u'info': u'HTTP request timed out.', u'code': u'srsearch-error'}}

Maybe it would be better if some sort of HTTP exception were thrown instead of the KeyError? That is, if I'm understanding what's happening correctly.
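
A defensive check along the lines the reporter suggests might look like this; a sketch only (HTTPTimeoutError does exist in wikipedia.exceptions, but this is not the library's actual code):

from wikipedia.exceptions import HTTPTimeoutError, WikipediaException

def check_for_api_error(raw_results, query):
    # Surface API-level errors before the caller indexes raw_results['query'].
    if 'error' in raw_results:
        info = raw_results['error'].get('info', '')
        if info == 'HTTP request timed out.':
            raise HTTPTimeoutError(query)
        raise WikipediaException(info)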

ImportError on installing from pip in virtualenv

If I create a new virtualenv and do a pip install wikipedia, it gives:

ImportError: No module named requests

This is due to 'import wikipedia' in the setup.py file. I think this shouldn't be present in setup.py (since the requirements haven't finished installing yet).
Also, could the requests version be updated to 2.3.0?
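
The usual pattern is to read the version out of the source instead of importing the package; a sketch, assuming __version__ is defined as a tuple in wikipedia/__init__.py:

import re

# Parse the version tuple without importing the package (and thus without
# needing requests to be installed yet).
with open('wikipedia/__init__.py') as f:
    match = re.search(r"__version__\s*=\s*\(([^)]*)\)", f.read())
version = '.'.join(part.strip() for part in match.group(1).split(','))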

Method to change API endpoint?

Currently the API endpoint is hardcoded into wikipedia/wikipedia.py on line 15: API_URL = 'http://en.wikipedia.org/w/api.php'.

This prevents access to other wiki sites that speak the MediaWiki API but live at different endpoints.

For example, I would like to use the MediaWiki API to access Wikiquote data; this would involve changing the API endpoint to http://en.wikiquote.org/w/api.php.

This is a problem I'm running into with my project; I'd like to access quotes for a specific person, and I'm missing some good way to access the WikiQuote API from Python, save wrapping it myself. Would you be open to a pull request that allows the user to override the API endpoint to access other Wikimedia sites?
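
In the meantime, since API_URL is a module-level constant, it can be monkey-patched; a fragile sketch (set_lang() rebuilds the URL and would undo this):

import wikipedia

# Point every subsequent request at Wikiquote instead of Wikipedia.
wikipedia.wikipedia.API_URL = 'http://en.wikiquote.org/w/api.php'
print(wikipedia.search('Mark Twain'))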

Lower the minimum of the `requests` requirement range?

First, thanks for switching the 'requests' requirement to a range.

I wanted to ask if the minimum version could be lowered. On Python 2 (I don't have 3), '2.1.0' and '2.0.0' both pass all the tests.

Looking at the commit log, the minimum version seems to have been introduced arbitrarily. If that's the case, allowing an earlier version would be very helpful, as many libraries require that package at different minimum levels.

Redirect page error on article with no actual redirect

e.g. http://en.wikipedia.org/wiki/Whisky

>>> wiki = wikipedia.WikipediaPage('Whiskey')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/wikipedia/wikipedia.py", line 224, in __init__
    self.load(redirect=redirect, preload=preload)
  File "/Library/Python/2.7/site-packages/wikipedia/wikipedia.py", line 276, in load
    self.__init__(title, redirect=redirect, preload=preload)
  File "/Library/Python/2.7/site-packages/wikipedia/wikipedia.py", line 224, in __init__
    self.load(redirect=redirect, preload=preload)
  File "/Library/Python/2.7/site-packages/wikipedia/wikipedia.py", line 256, in load
    raise PageError(self.title)
wikipedia.exceptions.PageError: "is a redirect from a title with a different spelling. Pages that use this link may be updated to link directly to the target page. It is not necessary to replace these redirected links with a piped link. For more information, follow the category link." does not match any pages. Try another query!

Unexpected page result involving redirection

While arbitrary tries with wikipedia.page seem to work great, I just encountered the following unexpected results:

>>> p = wikipedia.page('WiFi')
>>> p
<WikipediaPage 'Wife'>
>>> p = wikipedia.page('WiFi', auto_suggest=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "wikipedia/wikipedia.py", line 211, in page
    return WikipediaPage(title, redirect=redirect, preload=preload)
  File "wikipedia/wikipedia.py", line 224, in __init__
    self.load(redirect=redirect, preload=preload)
  File "wikipedia/wikipedia.py", line 277, in load
    self.__init__(title, redirect=redirect, preload=preload)
  File "wikipedia/wikipedia.py", line 224, in __init__
    self.load(redirect=redirect, preload=preload)
  File "wikipedia/wikipedia.py", line 297, in load
    self.url = data['fullurl']
KeyError: 'fullurl'

Any ideas?

Syntax Error When Lib Moved via SSH

I copied the library over to a computer cluster and get the syntax error below:

  File "wikipedia.py", line 682
    for lang in languages
    ^

I can't install with pip as I am not root on the cluster. The cluster is running Python 2.6.6.
Any ideas?

invalid syntax in wikipedia.py

I'm working on CentOS 6.6 with Python 2.6.6.
When I install wikipedia with pip or from the git source, I get this error:

File "/usr/lib/python2.6/site-packages/wikipedia/init.py", line 1, in
from .wikipedia import *
File "/usr/lib/python2.6/site-packages/wikipedia/wikipedia.py", line 699
for lang in languages
^
SyntaxError: invalid syntax

I solved it with this syntax:

a = {}
for lang in languages:
    b = lang['code']
    c = lang['*']
    a[b] = c
return a

instead of the dict comprehension (a Python 2.7+ feature):

return {
    lang['code']: lang['*']
    for lang in languages
}

I inserted this code into wikipedia.py in the git source and installed with setup.py.

KeyError: u'normalized'

wikipedia version 1.2.1

>>> import wikipedia
>>> wikipedia.page("Communist Party", auto_suggest=False)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/wikipedia/wikipedia.py", line 274, in page
    return WikipediaPage(title, redirect=redirect, preload=preload)
  File "/usr/local/lib/python2.7/site-packages/wikipedia/wikipedia.py", line 297, in __init__
    self.__load(redirect=redirect, preload=preload)
  File "/usr/local/lib/python2.7/site-packages/wikipedia/wikipedia.py", line 352, in __load
    normalized = query['normalized'][0]
KeyError: u'normalized'

API is searching for CHAD instead of CHAID

When I try this code, it throws an error:

try:
    text = wikipedia.summary("CHAID", sentences=2)
except DisambiguationError:
    print("Multiple pages with the same name. Disambiguation Error was thrown.")
print text.encode('utf-8')

Somehow the API is searching for CHAD instead of CHAID, although pages exist for both terms.
Can someone please explain what the issue might be, or whether I need to modify something in my code?

Traceback (most recent call last):
  File "C:\Users\mowgli\workspace\SubTypeRelationshipExtractor\article_in_category_retriever.py", line 46, in <module>
    aicr.find_articles_in_category()
  File "C:\Users\mowgli\workspace\SubTypeRelationshipExtractor\article_in_category_retriever.py", line 31, in find_articles_in_category
    text = wikipedia.summary("CHAID", sentences=2)
  File "build\bdist.win32\egg\wikipedia\util.py", line 23, in __call__
  File "build\bdist.win32\egg\wikipedia\wikipedia.py", line 182, in summary
  File "build\bdist.win32\egg\wikipedia\wikipedia.py", line 227, in page
  File "build\bdist.win32\egg\wikipedia\wikipedia.py", line 250, in __init__
  File "build\bdist.win32\egg\wikipedia\wikipedia.py", line 295, in load
wikipedia.exceptions.PageError: Page id "CHAD" does not match any pages. Try another id!

Problem with installation on python3

Installation of the wikipedia package on Python 3 fails because of some problems in setup.py.

$ pip install wikipedia
Downloading/unpacking wikipedia
  Downloading wikipedia-1.0.0.tar.gz
  Running setup.py egg_info for package wikipedia
    Traceback (most recent call last):
      File "<string>", line 16, in <module>
      File "/Users/wronglink/.virtualenvs/p33/build/wikipedia/setup.py", line 29, in <module>
        'License :: OSI Approved :: MIT License',
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/distutils/core.py", line 148, in setup
        dist.run_commands()
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/distutils/dist.py", line 941, in run_commands
        self.run_command(cmd)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/distutils/dist.py", line 960, in run_command
        cmd_obj.run()
      File "<string>", line 13, in replacement_run
      File "/Users/wronglink/.virtualenvs/p33/lib/python3.3/site-packages/distribute-0.6.28-py3.3.egg/setuptools/command/egg_info.py", line 384, in write_pkg_info
        metadata.write_pkg_info(cmd.egg_info)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/distutils/dist.py", line 1039, in write_pkg_info
        self.write_pkg_file(pkg_info)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/distutils/dist.py", line 1060, in write_pkg_file
        long_desc = rfc822_escape(self.get_long_description())
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/distutils/util.py", line 490, in rfc822_escape
        lines = header.split('\n')
    TypeError: Type str doesn't support the buffer API
    Complete output from command python setup.py egg_info:
    running egg_info
    creating pip-egg-info/wikipedia.egg-info
    writing requirements to pip-egg-info/wikipedia.egg-info/requires.txt
    writing pip-egg-info/wikipedia.egg-info/PKG-INFO
    ...
    TypeError: Type str doesn't support the buffer API
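
The usual fix for this class of failure is to make sure setup() receives a decoded str for the long description rather than bytes; a sketch under the assumption that setup.py reads a README file:

import io
from setuptools import setup

# Decode explicitly so distutils gets str, not bytes, on Python 3.
with io.open('README.rst', encoding='utf-8') as f:
    long_description = f.read()

setup(
    name='wikipedia',
    long_description=long_description,
    # ... rest of the metadata unchanged
)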

DisambiguationError: perhaps a better response?

For example:

    wikipedia.summary('recommendation')

gives the following error and terminates the program:

    290       may_refer_to = [li.a.get_text() for li in filtered_lis if li.a]
    291 
--> 292       raise DisambiguationError(self.title, may_refer_to)
    293 
    294     else:

DisambiguationError: "Recommendation" may refer to: 
norm (philosophy)
Recommender systems
European Union recommendation
W3C recommendation
letter of recommendation

Perhaps a better response, maybe in JSON format, would help: no termination, and instead a prompt for further clarification?
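
A caller can already get this behaviour today by catching the exception, whose options attribute carries the candidate titles; a minimal sketch:

import json
import wikipedia

def summary_or_options(query):
    # Return a summary, or the disambiguation options as JSON, instead of
    # letting the exception terminate the program.
    try:
        return json.dumps({'summary': wikipedia.summary(query)})
    except wikipedia.exceptions.DisambiguationError as e:
        return json.dumps({'disambiguation': e.options})

print(summary_or_options('recommendation'))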

Installation problem in python 2.7

oquidave@davemint ~/workspace/python/django_projects/wikiped $ sudo pip install wikipedia
Downloading/unpacking wikipedia
Downloading wikipedia-1.2.1.tar.gz
Running setup.py egg_info for package wikipedia
Traceback (most recent call last):
  File "<string>", line 14, in <module>
  File "/home/oquidave/workspace/python/django_projects/wikiped/build/wikipedia/setup.py", line 11, in <module>
    dependencies = [str(ir.req) for ir in install_reqs]
  File "/usr/lib/python2.7/dist-packages/pip/req.py", line 1200, in parse_requirements
    skip_regex = options.skip_requirements_regex
AttributeError: 'NoneType' object has no attribute 'skip_requirements_regex'

Command python setup.py egg_info failed with error code 1
Storing complete log in /home/oquidave/.pip/pip.log
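
The traceback shows setup.py calling pip's private parse_requirements() helper, whose signature changes between pip releases. A sketch of a setup.py snippet that avoids pip internals entirely:

# Read requirements.txt directly instead of using pip.req.parse_requirements.
with open('requirements.txt') as f:
    dependencies = [line.strip() for line in f
                    if line.strip() and not line.startswith('#')]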

Empty 'extract' in Wikipedia response causes 'TypeError: list indices must be integers, not str'.

>>> import wikipedia
>>> wikipedia.page('Fully connected network', auto_suggest=False, redirect=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/wikipedia/wikipedia.py", line 211, in page
    return WikipediaPage(title, redirect=redirect, preload=preload)
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/wikipedia/wikipedia.py", line 224, in __init__
    self.load(redirect=redirect, preload=preload)
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/wikipedia/wikipedia.py", line 276, in load
    self.__init__(title, redirect=redirect, preload=preload)
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/wikipedia/wikipedia.py", line 224, in __init__
    self.load(redirect=redirect, preload=preload)
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/wikipedia/wikipedia.py", line 250, in load
    pages = request['query']['pages']
TypeError: list indices must be integers, not str

WikipediaPage.images throws KeyError on 'query'

First, thank you for building and maintaining this project. It's going to help a lot with a recipe generator I'm making. My error is that when I run the following:

testerPage = wikipedia.page('Babić (grape)')
test = testerPage.images

I get the following KeyError:

KeyError                                  Traceback (most recent call last)
<ipython-input-63-6eab4e7ee84f> in <module>()
      1 testerPage = wikipedia.page('Babić (grape)')
----> 2 test = testerPage.images

/opt/anaconda/envs/np18py27-1.9/lib/python2.7/site-packages/wikipedia/wikipedia.pyc in images(self)
    371       request = _wiki_request(**query_params)
    372 
--> 373       image_keys = request['query']['pages'].keys()
    374       images = (request['query']['pages'][key] for key in image_keys)
    375       self._images = [image['imageinfo'][0]['url'] for image in images if image.get('imageinfo')]

KeyError: 'query'

I'm afraid I haven't been able to isolate the problem. Calling testerPage.summary, which also seems to use the 'query' key, does work. I haven't had problems with the .images property otherwise; it's just this one page. I tried putting in a ternary in case an empty list of images was the problem, but that didn't help. Any ideas?

KeyError: u'extlinks' when using preload=True in page()

>>> wikipedia.page("747", preload=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\Python27\lib\site-packages\wikipedia\wikipedia.py", line 276, in page
    return WikipediaPage(title, redirect=redirect, preload=preload)
  File "c:\Python27\lib\site-packages\wikipedia\wikipedia.py", line 303, in __init__
    getattr(self, prop)
  File "c:\Python27\lib\site-packages\wikipedia\wikipedia.py", line 592, in references
    'ellimit': 'max'
  File "c:\Python27\lib\site-packages\wikipedia\wikipedia.py", line 423, in __continued_query
    for datum in pages[self.pageid][prop]:
KeyError: u'extlinks'
>>> wikipedia.__version__
(1, 4, 0)

update requests to 2.2.1

requests==1.2.3 conflicts with the latest version, 2.2.1, which may be required by the requirements.txt files of other packages. Please update it to 2.2.1.
Also, gemnasium.com is a good way to keep tabs on this. :-)

UnicodeEncodeError during random page visiting

Hey, cool project!
I found a bug while evaluating your library: it seems that you've got a problem with Unicode.

>>> a = wikipedia.random()
>>> type(a)
<type 'unicode'>
>>> wikipedia.page(wikipedia.random())
Traceback (most recent call last):
  File "<input>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 19: ordinal not in range(128)
>>> wikipedia.page(wikipedia.random())
Traceback (most recent call last):
  File "<input>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf3' in position 25: ordinal not in range(128)
>>> wikipedia.page(wikipedia.random())
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 143, in page
    return WikipediaPage(title, redirect=redirect, preload=preload)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 155, in __init__
    self.load(redirect=redirect, preload=preload)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 209, in load
    may_refer_to = [li.a.get_text() for li in BeautifulSoup(html).ul.find_all('li')]
AttributeError: 'NoneType' object has no attribute 'get_text'
>>> wikipedia.page(wikipedia.random())
<WikipediaPage 'Chinatown, Salt Lake City'>


Get the same page in another language

Hello,

Is there a way to get a page in one language and then request the same page in another language, like what we can do using the language links in the left bar on the website?

Update:
I’m looking for an API like this:

my_page = wikipedia.search("Something")
my_page_in_french = my_page.get_lang("fr")

I’m currently looking at the API and will make a pull request if I can find a simple way to do what I want. Thoughts on this?

Update 2:
Since the language is managed globally in the module, it'd be hard to add this feature without changing a lot of things or going the hacky way (change API_URL, make a request, and change it back to its previous value).
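
Until such an API exists, the MediaWiki langlinks prop can do this directly; a sketch with a hypothetical helper (not part of the library):

import requests

def get_langlink(title, from_lang='en', to_lang='fr'):
    # Ask the source wiki for the parallel title in the target language.
    resp = requests.get('https://%s.wikipedia.org/w/api.php' % from_lang, params={
        'action': 'query',
        'titles': title,
        'prop': 'langlinks',
        'lllang': to_lang,
        'format': 'json',
    })
    for page in resp.json()['query']['pages'].values():
        for link in page.get('langlinks', []):
            return link['*']
    return None

# e.g. get_langlink('Medical diagnosis', 'en', 'fr')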

Error with accented chars in search term: KeyError: u'fullurl'

e.g.:

import sys
import wikipedia as wp

s = wp.summary(str(sys.argv[1:]))

Then running script.py "Après" fails with:

Traceback (most recent call last):
  File "/home/me/.bin/w", line 25, in <module>
    s = wp.summary(str(sys.argv[1:]))
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/util.py", line 28, in __call__
    ret = self._cache[key] = self.fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 231, in summary
    page_info = page(title, auto_suggest=auto_suggest, redirect=redirect)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 276, in page
    return WikipediaPage(title, redirect=redirect, preload=preload)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 299, in __init__
    self.__load(redirect=redirect, preload=preload)
  File "/usr/local/lib/python2.7/dist-packages/wikipedia/wikipedia.py", line 398, in __load
    self.url = page['fullurl']
KeyError: u'fullurl'
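
Part of the problem may be the call itself: str(sys.argv[1:]) stringifies the whole list, so the query sent to the API is the list's repr (brackets, quotes, and escape sequences included), not the search term. A sketch that decodes the first argument instead, assuming a UTF-8 terminal:

import sys
import wikipedia as wp

# Decode the raw bytes from the shell into unicode (Python 2).
term = sys.argv[1].decode('utf-8')
print wp.summary(term)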

wikipedia.summary gives different results than expected

wikipedia.summary('boson') gives results for "Boston" instead of "Boson" :

'Boston (pronounced /\u02c8b\u0252st\u0259n/) is the capital and largest city of
the state of Massachusetts (officially the Commonwealth of Massachusetts), in the
United States. Boston also serves as county seat of Suffolk County.'
...

I have version 1.3 installed.
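
This resembles the auto_suggest behaviour reported in other issues here; disabling suggestion may return the intended article (untested sketch):

>>> wikipedia.summary('boson', auto_suggest=False)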

UnicodeEncodeError on using summary for "Paris"

I tried the following example:

import wikipedia
p = "Paris"    
print wikipedia.summary(p)

And received the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u02c8' in position 16: ordinal not in range(128)

The command was executed on an OpenEmbedded Linux system modified for a robot. I don't know why this happens; normally a UnicodeEncodeError occurs when there is a character that is not valid ASCII (ordinal 128 or above).

By the way: it works correctly if I use

page = wikipedia.page(p)
print page.summary
