anthonyhseb / googlesearch
Python library for scraping Google search results
License: MIT License
Describe what you were trying to get done.
Tell us what happened, what went wrong, and what you expected to happen.
$ ./go
from: too many arguments
./go: line 2: syntax error near unexpected token `('
./go: line 2: `response = GoogleSearch().search("something")'
googlesearch/googlesearch/googlesearch.py
Line 26 in 099550d
has a prefetch_pages
parameter. The readme ( https://github.com/anthonyhseb/googlesearch/blob/3dc96936ab857c3173ecb4c7a4856e11a2d346f5/README.rst#features ) mentions a prefetch_results
parameter that does not appear to exist.
I was trying to work out why I get 503 errors so often (worse than with manual searches) and ran into this mismatch.
Traceback (most recent call last):
File "c:\Abhi\installer\Anaconda3\Lib\site-packages\googlesearch\tempCodeRunnerFile.py", line 1, in
urllib.request
NameError: name 'urllib' is not defined
File "c:\Abhi\installer\Anaconda3\Lib\site-packages\googlesearch\googlesearch.py" response = search.search(query, count)
File "c:\Abhi\installer\Anaconda3\Lib\site-packages\googlesearch\googlesearch.py", line 39, in search
totalText = soup.select(GoogleSearch.TOTAL_SELECTOR)[0].children.next().encode('utf-8')
PS C:\Users\Abhishek Singh\Desktop\Automate Tasks> python -u "c:\Users\Abhishek Singh\Downloads\test2.py"
Traceback (most recent call last):
File "c:\Users\Abhishek Singh\Downloads\test2.py", line 2, in
response = GoogleSearch().search("something")
File "C:\Abhi\installer\Anaconda3\lib\site-packages\googlesearch\googlesearch.py", line 39, in search
ch1 = soup.select(GoogleSearch.TOTAL_SELECTOR)[0].children
IndexError: list index out of range
PS C:\Users\Abhishek Singh\Desktop\Automate Tasks> python -u "c:\Users\Abhishek Singh\Downloads\test2.py"
Traceback (most recent call last):
File "c:\Users\Abhishek Singh\Downloads\test2.py", line 2, in
response = GoogleSearch().search("something")
File "C:\Abhi\installer\Anaconda3\lib\site-packages\googlesearch\googlesearch.py", line 39, in search
totalText = soup.select(GoogleSearch.TOTAL_SELECTOR)[0].children.next().encode('utf-8')
IndexError: list index out of range
from googlesearch.googlesearch import GoogleSearch
response = GoogleSearch().search("something")
for result in response.results:
    print("Title: " + result.title)
    print("Content: " + result.getText())
Paste the command(s) you ran and the output.
If there was a crash, please include the traceback here.
For a couple of days now, search has not been working; this is due to a failing selector.
It would be nice if we could override the constants. Currently they are bound to the GoogleSearch class, so extending the class doesn't change the behaviour, because the class attributes are accessed as GoogleSearch.<attr> instead of self.<attr>.
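A minimal sketch (hypothetical class and attribute names) of the lookup problem described above: a class-qualified reference always resolves on that class, while a self-qualified one resolves on the instance's actual class, so only the latter can be overridden by a subclass.

```python
# Class-qualified vs self-qualified attribute lookup.
class Base:
    SELECTOR = "div.old"

    def via_class(self):
        return Base.SELECTOR   # ignores any subclass override

    def via_self(self):
        return self.SELECTOR   # picks up a subclass override


class Patched(Base):
    SELECTOR = "div.new"
```

Patched().via_class() still returns "div.old", while Patched().via_self() returns "div.new"; switching GoogleSearch to self-qualified lookups would make the constants overridable.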
Hi all. I'm using googlesearch in a project and recently received the error:
HTTP Error 429: Too Many Requests
Anybody else getting this? How did you resolve it? Do I need to switch to the custom search API?
I just installed google-search from PyPI for Python 3.5. Importing the library raises ImportError: No module named 'urllib2'. I think the package needs an update in this regard; this might help.
merose@Panther:~$ sudo -H pip3 install --upgrade google-search
Collecting google-search
Collecting lxml (from google-search)
Downloading lxml-4.1.0-cp35-cp35m-manylinux1_x86_64.whl (5.5MB)
100% |████████████████████████████████| 5.6MB 239kB/s
Requirement already up-to-date: beautifulsoup4 in /usr/local/lib/python3.5/dist-packages (from google-search)
Installing collected packages: lxml, google-search
Found existing installation: lxml 3.5.0
Uninstalling lxml-3.5.0:
Successfully uninstalled lxml-3.5.0
Successfully installed google-search-1.0.2 lxml-4.1.0
merose@Panther:~$ python3
Python 3.5.2 (default, Sep 14 2017, 22:51:06)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from googlesearch.googlesearch import GoogleSearch
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.5/dist-packages/googlesearch/googlesearch.py", line 6, in <module>
import urllib2
ImportError: No module named 'urllib2'
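A common workaround (a sketch, not the library's actual code) is a compatibility import, since urllib2 was merged into urllib.request in Python 3:

```python
# Compatibility shim: prefer the Python 3 module, fall back to urllib2 on
# Python 2. Code below can then call urllib2.urlopen(...) on either version.
try:
    import urllib.request as urllib2  # Python 3
except ImportError:
    import urllib2                    # Python 2
```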
googlesearch.__version__ does not match: the install says google-search-1.1.1.
Simple search.
I get no results, no errors, and nothing.
sander@brixit:~$ python3 -m pip install google-search
Collecting google-search
Downloading google_search-1.1.1-py2.py3-none-any.whl (6.5 kB)
Requirement already satisfied: soupsieve in ./.local/lib/python3.8/site-packages (from google-search) (2.2.1)
Requirement already satisfied: lxml in ./.local/lib/python3.8/site-packages (from google-search) (4.6.2)
Requirement already satisfied: beautifulsoup4 in ./.local/lib/python3.8/site-packages (from google-search) (4.9.3)
Installing collected packages: google-search
Successfully installed google-search-1.1.1
sander@brixit:~$ python3
Python 3.8.5 (default, Jan 27 2021, 15:41:15)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from googlesearch.googlesearch import GoogleSearch
>>> response = GoogleSearch().search("something")
>>> response.results
[]
>>>
Same result on another machine:
>>> from googlesearch.googlesearch import GoogleSearch
>>> response = GoogleSearch().search("something")
>>> print(response.results)
[]
>>> response.
response.results response.total
>>> response.total
4
>>>
Am I doing something wrong?
It gives this error when I execute: ModuleNotFoundError: No module named 'urllib2'.
I'm trying to run this code, which I got from the library's page on pypi.org:
from googlesearch.googlesearch import GoogleSearch
response = GoogleSearch().search("something")
for result in response.results:
    print("Title: " + result.title)
    print("Content: " + result.getText())
Hi 👊
This is my first visit to this fine repo, but it seems you have been working hard to keep all dependencies updated so far.
Once you have closed this issue, I'll create separate pull requests for every update as soon as I find one.
That's it for now!
Happy merging! 🤖
Hi, I just ran the sample code and got the error message "Service Unavailable".
Could you please help me understand it? Has Google blocked it? Is there another way to load Google search results into Python?
Thank you very much!
File "C:\ProgramData\Anaconda3\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
HTTPError: Service Unavailable
Hello,
How can I set the default search language?
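The library exposes no documented language option, but Google's results page honours the `hl` query parameter, so one workaround is to build (or patch) the search URL yourself. A sketch with a hypothetical helper:

```python
from urllib.parse import urlencode

# Hypothetical helper: build a Google search URL with an explicit interface
# language. `base` mirrors what GoogleSearch.SEARCH_URL points at, but the
# exact constant in the library may differ.
def build_search_url(query, hl="en", base="https://www.google.com/search"):
    return base + "?" + urlencode({"q": query, "hl": hl})
```

For example, build_search_url("something", hl="de") produces a URL containing q=something&hl=de.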
Is there planned support for Python 3?
Thanks,
Robert
When I try to run the snippet in the example, I get the following error:
FileNotFoundError: [Errno 2] No such file or directory: '/Projects/PycharmProjects/webcrawler/venv/lib/python3.8/site-packages/googlesearch/browser_agents.txt'
gs = GoogleSearch()
results = gs.search(term, 20)
for result in results:
    print("Title: " + result.title)
    print("Content: " + result.getText())
    print()
The issue is in MANIFEST.in. By default, when building Python packages, non-.py files are not included. Adding
include googlesearch/browser_agents.txt
to MANIFEST.in should resolve the issue.
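For reference, the packaging change would look roughly like this (a sketch; with setuptools, include_package_data=True in setup() is the complementary switch that copies the file into installed packages, not just the sdist):

```
# MANIFEST.in
include googlesearch/browser_agents.txt
```

Alternatively, package_data={"googlesearch": ["browser_agents.txt"]} in setup() achieves the same for built distributions.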
Hi everyone :),
I have an input CSV file that provides the keywords I want to search. I set up my search to fetch the first two URLs, but I want to filter them: if a URL isn't good, I want to move on to the next one.
Is that possible with the google-search library?
Thank you
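The filtering itself isn't something google-search provides, but it's easy in plain Python. A sketch with hypothetical helper names, where the URL check is supplied by the caller:

```python
import csv

def keywords_from_csv(path, column=0):
    """Read one keyword per row from the given CSV column."""
    with open(path, newline="") as f:
        return [row[column] for row in csv.reader(f) if row]

def first_good_url(urls, is_good):
    """Return the first URL accepted by is_good, or None if none pass."""
    for url in urls:
        if is_good(url):
            return url
    return None
```

For each keyword you could then run GoogleSearch().search(keyword, num_results=2) as elsewhere in this thread, collect [r.url for r in response.results], and pass that list to first_good_url with your own predicate.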
Always getting HTTP Error (429?)
File "C:\Users\hp\Documents\pythonPractice\googleSearchResult.py", line 9, in
response = GoogleSearch().search("okayu",num_results=5,prefetch_pages=False,num_prefetch_threads=10)
File "C:\Users\hp\anaconda3\lib\site-packages\googlesearch\googlesearch.py", line 57, in search
with closing(opener.open(GoogleSearch.SEARCH_URL +
File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 640, in http_response
response = self.parent.error(
File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 563, in error
result = self._call_chain(*args)
File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 502, in _call_chain
result = func(*args)
File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 755, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 640, in http_response
response = self.parent.error(
File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 563, in error
result = self._call_chain(*args)
File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 502, in _call_chain
result = func(*args)
File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 755, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 640, in http_response
response = self.parent.error(
File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 569, in error
return self._call_chain(*args)
File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 502, in _call_chain
result = func(*args)
File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
HTTPError: Too Many Requests
from googlesearch.googlesearch import GoogleSearch
response = GoogleSearch().search("okayu",num_results=5,prefetch_pages=False,num_prefetch_threads=10)
for result in response.results:
    print("Title: " + result.title)
    print("Content: " + result.getText())
HTTPError: HTTP Error 500: Internal Server Error
from googlesearch.googlesearch import GoogleSearch
response = GoogleSearch().search("tacita dean", num_results=150)
for i, result in enumerate(response.results):
    if 'test' in result.url.lower():
        print(i, result.url)
        break
Hi, this documentation is a little thin. Could you please expand it?
Ali İlteriş Keskin
__version__ from __init__.py says 1.0.0?
As the title says, when using the default prefetch_pages argument and trying to get more than 10 results, the program feels like it is running forever!
Here is a screenshot with prefetch_pages=False:
Here is a screenshot with prefetch_pages=True; almost 3 minutes in, the program is still running.
Trying to get more than 10 results
I'm using a simple test script, barely modified from the example in the readme. I think the TOTAL_SELECTOR is missing from the results page, which seems to happen when Google displays "card" results (or whatever those are called). You can see an example with the query "danny elfman emmy award":
This could presumably be fixed by checking for the presence of the selector. However, I am interested in the total count as well, so I'm wondering if you have any ideas for another way to get that, for queries that trigger the "card" display.
Traceback (most recent call last):
File "google_search_test.py", line 21, in <module>
r = GoogleSearch().search(query, prefetch_pages=False)
File "/Users/abernstein/venvs/0/lib/python2.7/site-packages/googlesearch/googlesearch.py", line 39, in search
totalText = soup.select(GoogleSearch.TOTAL_SELECTOR)[0].children.next().encode('utf-8')
IndexError: list index out of range
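A defensive sketch of the guard suggested above (hypothetical helper name): check the selector result before indexing, and return None when the total is absent:

```python
def parse_total(snippets):
    """snippets: texts found by soup.select(TOTAL_SELECTOR); may be empty."""
    if not snippets:
        return None                      # "card" layout: no total shown
    digits = "".join(ch for ch in snippets[0] if ch.isdigit())
    return int(digits) if digits else None
```

Callers would then treat a None total as "unknown" instead of crashing with IndexError; it doesn't recover the count for "card" pages, but it keeps the result list usable.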
encoding error : input conversion failed due to input error, bytes 0x81 0x84 0x35 0x29
from googlesearch.googlesearch import GoogleSearch
response = GoogleSearch().search("tacita dean", num_results=50)
for i, result in enumerate(response.results):
    print(i, result.url)
I've been trying to create a way to handle the 429 error that comes up when google tries to throttle you, but no matter what I do the app still crashes. Any idea why or what I can do to handle it?
The code I'm using is:
try:
    k = search(query, num=1, stop=1, pause=20)
except urllib.error.HTTPError:
    print('failed')
Thanks,
Dan
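One way to handle the 429 without crashing is to retry with exponential backoff rather than a single try/except. A sketch with hypothetical names; `fetch` is whatever call raises on throttling, and the delay values are arbitrary:

```python
import time

def search_with_backoff(fetch, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(), retrying with doubling delays; re-raise on final failure."""
    delay = base_delay
    for attempt in range(retries):
        try:
            return fetch()
        except Exception:        # e.g. urllib.error.HTTPError with code 429
            if attempt == retries - 1:
                raise
            sleep(delay)
            delay *= 2           # exponential backoff
```

Even with backoff, Google will keep throttling aggressive scraping, so the Custom Search API remains the reliable option for sustained volume.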
response = GoogleSearch().search("something")
File "C:\ProgramData\Anaconda3\lib\site-packages\googlesearch\googlesearch.py", line 64, in search
totalText = soup.select(GoogleSearch.TOTAL_SELECTOR)[0].children.__next__()
IndexError: list index out of range