
googlesearch's People

Contributors

anthonyhseb, michaelbukachi, pyup-bot

googlesearch's Issues

too many arguments

  • google-search version:
  • Python version:
  • Operating System:

Description

What I Did

$ ./go
from: too many arguments
./go: line 2: syntax error near unexpected token `('
./go: line 2: `response = GoogleSearch().search("something")'

prefetch_pages vs prefetch_results

  • google-search version: 8070f8e (master as of July 2018)
  • Python version: any
  • Operating System: any

Description

def search(self, query, num_results = 10, prefetch_pages = True, prefetch_threads = 10):
uses a prefetch_pages parameter. The README ( https://github.com/anthonyhseb/googlesearch/blob/3dc96936ab857c3173ecb4c7a4856e11a2d346f5/README.rst#features ) mentions a prefetch_results parameter, which does not appear to exist.

What I Did

I was trying to work out why I get 503 errors so often (more often than with manual searches) and came across this mismatch.

ERROR

  • google-search version:
  • Python version:
  • Operating System:

Description

Traceback (most recent call last):
  File "c:\Abhi\installer\Anaconda3\Lib\site-packages\googlesearch\tempCodeRunnerFile.py", line 1, in <module>
    urllib.request
NameError: name 'urllib' is not defined
File "c:\Abhi\installer\Anaconda3\Lib\site-packages\googlesearch\googleseller\Anaconda3\Lib\site-packages\googlesearch\googlesearch.py" response = search.search(query, count)
File "c:\Abhi\installer\Anaconda3\Lib\site-packages\googlesearch\googlesearch.py", line 39, in search
totalText = soup.select(GoogleSearch.TOTAL_SELECTOR)[0].children.next().encode('utf-8')
PS C:\Users\Abhishek Singh\Desktop\Automate Tasks> python -u "c:\Users\Abhishek Singh\Downloads\test2.py"
Traceback (most recent call last):
  File "c:\Users\Abhishek Singh\Downloads\test2.py", line 2, in <module>
    response = GoogleSearch().search("something")
  File "C:\Abhi\installer\Anaconda3\lib\site-packages\googlesearch\googlesearch.py", line 39, in search
    ch1 = soup.select(GoogleSearch.TOTAL_SELECTOR)[0].children
IndexError: list index out of range
PS C:\Users\Abhishek Singh\Desktop\Automate Tasks> python -u "c:\Users\Abhishek Singh\Downloads\test2.py"
Traceback (most recent call last):
  File "c:\Users\Abhishek Singh\Downloads\test2.py", line 2, in <module>
    response = GoogleSearch().search("something")
  File "C:\Abhi\installer\Anaconda3\lib\site-packages\googlesearch\googlesearch.py", line 39, in search
    totalText = soup.select(GoogleSearch.TOTAL_SELECTOR)[0].children.next().encode('utf-8')
IndexError: list index out of range

What I Did

from googlesearch.googlesearch import GoogleSearch
response = GoogleSearch().search("something")
for result in response.results:
    print("Title: " + result.title)
    print("Content: " + result.getText())

Selector not working anymore

  • google-search version: 1.1.1
  • Python version: 3.8
  • Operating System: Ubuntu

Description

For a couple of days now, search has not been working. This is due to a failing selector:
(screenshot: screenshot_20210420042639)

It would be nice to be able to override the constants. Currently they are bound to the GoogleSearch class, so extending the class does not change the behaviour: the class attributes are accessed via the GoogleSearch. prefix instead of self.
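
A minimal sketch of the difference described here, using a hypothetical stand-in class rather than the library's real code (the class names, method names, and selector strings below are all made up for illustration). It shows why a lookup through the class name ignores a subclass override, while a lookup through self picks it up:

```python
class Searcher:
    # Hypothetical stand-in for the library's class; the selector value is invented.
    RESULT_SELECTOR = "div.old"

    def selector_via_class(self):
        # Hard-coded class name: a subclass cannot change what this returns.
        return Searcher.RESULT_SELECTOR

    def selector_via_self(self):
        # Lookup through self: a subclass override is used.
        return self.RESULT_SELECTOR


class PatchedSearcher(Searcher):
    # Override the constant in a subclass.
    RESULT_SELECTOR = "div.new"


patched = PatchedSearcher()
print(patched.selector_via_class())  # div.old
print(patched.selector_via_self())   # div.new
```

If the library read its constants through self, a subclass like PatchedSearcher would be enough to swap in a working selector without waiting for a release.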

Too many requests

Hi all. I'm using googlesearch in a project and recently received the error:

HTTP Error 429: Too Many Requests

Anybody else getting this? How did you resolve it? Do I need to switch to the custom search API?

first and second result are the same

  • google-search version:
  • Python version:
  • Operating System:

Description

What I Did

Urllib2 causing troubles

I just installed google-search from PyPI for Python 3.5. Importing the library raises ImportError: No module named 'urllib2'. I think the package needs an update in this regard; this might help.

merose@Panther:~$ sudo -H pip3 install --upgrade google-search
Collecting google-search
Collecting lxml (from google-search)
  Downloading lxml-4.1.0-cp35-cp35m-manylinux1_x86_64.whl (5.5MB)
    100% |████████████████████████████████| 5.6MB 239kB/s 
Requirement already up-to-date: beautifulsoup4 in /usr/local/lib/python3.5/dist-packages (from google-search)
Installing collected packages: lxml, google-search
  Found existing installation: lxml 3.5.0
    Uninstalling lxml-3.5.0:
      Successfully uninstalled lxml-3.5.0
Successfully installed google-search-1.0.2 lxml-4.1.0
merose@Panther:~$ python3
Python 3.5.2 (default, Sep 14 2017, 22:51:06) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from googlesearch.googlesearch import GoogleSearch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/googlesearch/googlesearch.py", line 6, in <module>
    import urllib2
ImportError: No module named 'urllib2'
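
For context, urllib2 only exists on Python 2; on Python 3 its contents were split between urllib.request and urllib.error. A common way to support both is a guarded import, sketched below. This is only an illustration of the pattern, not the fix the package actually shipped, and which names the library needs from urllib2 is an assumption here:

```python
try:
    # Python 2: everything lives in urllib2.
    from urllib2 import HTTPError, build_opener
except ImportError:
    # Python 3: the same names are split across two modules.
    from urllib.request import build_opener
    from urllib.error import HTTPError

# From here on, the rest of the module can use the names
# without caring which interpreter it runs on.
opener = build_opener()
```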

a search gives no result, no error, nothing

  • google-search version: from pypi. There is no googlesearch.__version__, but install says google-search-1.1.1
  • Python version: 3.8.5
  • Operating System: Ubuntu 20.04

Description

Simple search.
I get no results, no errors, and nothing.

What I Did


sander@brixit:~$ python3 -m pip install google-search
Collecting google-search
  Downloading google_search-1.1.1-py2.py3-none-any.whl (6.5 kB)
Requirement already satisfied: soupsieve in ./.local/lib/python3.8/site-packages (from google-search) (2.2.1)
Requirement already satisfied: lxml in ./.local/lib/python3.8/site-packages (from google-search) (4.6.2)
Requirement already satisfied: beautifulsoup4 in ./.local/lib/python3.8/site-packages (from google-search) (4.9.3)
Installing collected packages: google-search
Successfully installed google-search-1.1.1


sander@brixit:~$ python3
Python 3.8.5 (default, Jan 27 2021, 15:41:15) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from googlesearch.googlesearch import GoogleSearch
>>> response = GoogleSearch().search("something")
>>> response.results
[]
>>> 


Same result on another machine:

>>> from googlesearch.googlesearch import GoogleSearch
>>> response = GoogleSearch().search("something")
>>> print(response.results)
[]
>>> response.
response.results  response.total    
>>> response.total
4
>>> 

Am I doing something wrong?

ModuleNotFoundError: No module named 'urllib2'

  • google-search version: 1.0.2
  • Python version: 3.7.5
  • Operating System: Windows 10 64 Bits

Description

I get this error when I run the code: ModuleNotFoundError: No module named 'urllib2'.

What I Did

I'm trying to run this code, which I got from the library's page on pypi.org.

from googlesearch.googlesearch import GoogleSearch
response = GoogleSearch().search("something")
for result in response.results:
  print("Title: " + result.title)
  print("Content: " + result.getText())

Initial Update

Hi 👊

This is my first visit to this fine repo, but it seems you have been working hard to keep all dependencies updated so far.

Once you have closed this issue, I'll create separate pull requests for every update as soon as I find one.

That's it for now!

Happy merging! 🤖

HTTPError: Service Unavailable

Hi, I just ran the sample code and got the error message "Service Unavailable".

Could you please help me understand it? Is Google simply blocking the request? Is there another way to load Google search results into Python?

Thank you very much!

File "C:\ProgramData\Anaconda3\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)

HTTPError: Service Unavailable

Search language

  • Python version: 2.7
  • Operating System: win10, cygwin64

Hello,
How can I set the default search language?
Is Python 3 support planned?

Thanks,
Robert

browser_agents.txt not found

  • google-search version: 1.1.0
  • Python version: 3.9
  • Operating System: Ubuntu

Description

When I try to run the snippet in the example, I get the following error:

FileNotFoundError: [Errno 2] No such file or directory: '/Projects/PycharmProjects/webcrawler/venv/lib/python3.8/site-packages/googlesearch/browser_agents.txt'

What I Did

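from googlesearch.googlesearch import GoogleSearch

gs = GoogleSearch()
response = gs.search(term, 20)
# The returned object exposes the hits through its results attribute.
for result in response.results:
    print("Title: " + result.title)
    print("Content: " + result.getText())
    print()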

The issue is in MANIFEST.in. By default, when building python packages, non .py files are not included. See the image below:
screenshot_20210308153728

Adding include googlesearch/browser_agents.txt to MANIFEST.in should resolve the issue.

How to access the desired link number?

Hi everyone :),
I have an issue. My input is a CSV file that provides the keywords I want to search for. I set up my search to get the first 2 URLs, but I want to filter them: if a URL is not good, I want to move on to the next one.

Is it possible to do that with the google-search library?

Thank you
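
One possible approach, sketched under assumptions: request more results than needed (for example via the num_results argument that appears in other reports here) and filter them in your own code. The helper names, the blocked-host rule, and the sample URLs below are hypothetical; nothing in this sketch calls the library itself:

```python
from itertools import islice
from urllib.parse import urlparse


def first_good_urls(urls, is_good, limit=2):
    """Return up to `limit` URLs for which is_good(url) is true, in order."""
    return list(islice((u for u in urls if is_good(u)), limit))


def not_blocked(url, blocked_hosts=("example.org",)):
    # Hypothetical rule: a URL is "not good" if its host is on a block list.
    return urlparse(url).hostname not in blocked_hosts


# Stand-in for the URLs of a larger result set.
candidates = [
    "https://example.org/a",
    "https://example.com/b",
    "https://example.net/c",
    "https://example.com/d",
]
picked = first_good_urls(candidates, not_blocked)
print(picked)  # ['https://example.com/b', 'https://example.net/c']
```

Any predicate can replace not_blocked; the point is that skipping a bad URL simply moves on to the next candidate until the limit is reached.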

HttpError

  • google-search version: 1.1.1
  • Python version: Python3.8
  • Operating System: Windows 10 Home 64bit

Description

I always get an HTTP error (429?).

  File "C:\Users\hp\Documents\pythonPractice\googleSearchResult.py", line 9, in <module>
    response = GoogleSearch().search("okayu",num_results=5,prefetch_pages=False,num_prefetch_threads=10)
  File "C:\Users\hp\anaconda3\lib\site-packages\googlesearch\googlesearch.py", line 57, in search
    with closing(opener.open(GoogleSearch.SEARCH_URL +
  File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 640, in http_response
    response = self.parent.error(
  File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 563, in error
    result = self._call_chain(*args)
  File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 502, in _call_chain
    result = func(*args)
  File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 755, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 640, in http_response
    response = self.parent.error(
  File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 563, in error
    result = self._call_chain(*args)
  File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 502, in _call_chain
    result = func(*args)
  File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 755, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 640, in http_response
    response = self.parent.error(
  File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 569, in error
    return self._call_chain(*args)
  File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 502, in _call_chain
    result = func(*args)
  File "C:\Users\hp\anaconda3\lib\urllib\request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
HTTPError: Too Many Requests

What I Did

from googlesearch.googlesearch import GoogleSearch
response = GoogleSearch().search("okayu",num_results=5,prefetch_pages=False,num_prefetch_threads=10)
for result in response.results:
    print("Title: " + result.title)
    print("Content: " + result.getText())

HTTPError: HTTP Error 500: Internal Server Error

  • v1.0.2
  • 2.7.13
  • OSX 10.13.3

Description

HTTPError: HTTP Error 500: Internal Server Error

What I Did

from googlesearch.googlesearch import GoogleSearch

response = GoogleSearch().search("tacita dean", num_results=150)
for i, result in enumerate(response.results):
    if 'test' in result.url.lower():
        print i, result.url
        break

New Document Please

Hi, the documentation is a little sparse. Please provide new, more extensive documentation.

Ali İlteriş Keskin

Default prefetch_pages argument makes the program feel like it runs forever!

  • google-search version: latest clone (e2d3e74). I am not sure which version this is; __version__ in __init__.py says 1.0.0.
  • Python version: 3.8.10
  • Operating System: Windows 10 (Ubuntu WSL)

Description

As the title says, when using the default prefetch_pages argument and trying to get more than 10 results, the program feels like it runs forever!

Here is a picture with prefetch_pages=False:
(image: false)
Here is a picture with prefetch_pages=True, after almost 3 minutes with the program still running:
(image: true)

What I Did

Trying to get more than 10 results

Search fails when results page shows no total

  • google-search version: 1.0.2
  • Python version: 2.7.13
  • Operating System: macOS 10.12.6

Description

I'm using a simple test script, barely modified from the example in the readme. I think the TOTAL_SELECTOR is missing from the results page, which seems to happen when Google displays "card" results (or whatever those are called). You can see an example with the query "danny elfman emmy award":

screen shot 2018-01-23 at 10 44 43 am

This could presumably be fixed by checking for the presence of the selector. However, I am interested in the total count as well, so I'm wondering if you have any ideas for another way to get that, for queries that trigger the "card" display.

What I Did

Traceback (most recent call last):
  File "google_search_test.py", line 21, in <module>
    r = GoogleSearch().search(query, prefetch_pages=False)
  File "/Users/abernstein/venvs/0/lib/python2.7/site-packages/googlesearch/googlesearch.py", line 39, in search
    totalText = soup.select(GoogleSearch.TOTAL_SELECTOR)[0].children.next().encode('utf-8')
IndexError: list index out of range
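
A minimal sketch of the "check for the presence of the selector" idea mentioned above. The function takes the list of matches and returns None when it is empty instead of indexing blindly. Here the matches are plain strings standing in for whatever soup.select(...) returns, and the "About 1,230,000 results" wording with comma thousands separators is an assumption, not something confirmed by the report:

```python
import re


def parse_total(matches):
    """Return the result count from the first match, or None if unavailable.

    `matches` stands in for the list returned by a selector query;
    in this sketch it is simply a list of text strings.
    """
    if not matches:
        # Selector not present on the page (e.g. a "card" layout).
        return None
    # Take the first run of digits and commas, e.g. "1,230,000".
    found = re.search(r"\d[\d,]*", matches[0])
    if found is None:
        return None
    return int(found.group().replace(",", ""))


print(parse_total([]))                                          # None
print(parse_total(["About 1,230,000 results (0.45 seconds)"]))  # 1230000
```

This avoids the IndexError but does not recover the total for "card" pages; the caller has to handle None.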

input conversion failed due to input error

  • v1.0.2
  • 2.7.13
  • OSX 10.13.3

Description

encoding error : input conversion failed due to input error, bytes 0x81 0x84 0x35 0x29

What I Did

from googlesearch.googlesearch import GoogleSearch
  
response = GoogleSearch().search("tacita dean", num_results=50)
for i, result in enumerate(response.results):
    print i, result.url

Handling "HTTP Error 429" error.

I've been trying to create a way to handle the 429 error that comes up when Google tries to throttle you, but no matter what I do the app still crashes. Any idea why, or what I can do to handle it?

The code I'm using is:

import urllib.error

try:
    k = search(query, num=1, stop=1, pause=20)
except urllib.error.HTTPError:
    print('failed')

Thanks,

Dan
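
One thing worth checking, as an assumption since the snippet does not show where search is imported from: if search returns a lazy generator, the request is only made when the results are iterated, which would happen outside the try block and so escape the except clause. The sketch below forces the work to happen inside a retry helper and backs off on HTTP 429. The names call_with_backoff and flaky, the retry count, and the delays are all hypothetical; flaky merely simulates a throttled call so the example runs without network access:

```python
import time
import urllib.error


def call_with_backoff(func, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on HTTP 429, wait and retry with exponentially growing delays."""
    for attempt in range(retries + 1):
        try:
            return func()
        except urllib.error.HTTPError as err:
            # Re-raise anything that is not throttling, or when out of retries.
            if err.code != 429 or attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)


# Simulated throttled call: fails twice with 429, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise urllib.error.HTTPError(
            "http://example.invalid", 429, "Too Many Requests", None, None
        )
    return "ok"


delays = []  # record the delays instead of actually sleeping
result = call_with_backoff(flaky, sleep=delays.append)
print(result, delays)  # ok [1.0, 2.0]
```

With a real, lazily evaluated search function, the callable passed in would need to consume the results itself, for example lambda: list(search(query, num=1, stop=1, pause=20)), so that any HTTPError is raised inside the helper.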

IndexError: list index out of range

  • google-search version: 1.1.1
  • Python version: 3.8.3
  • Operating System: Windows 10

    response = GoogleSearch().search("something")
  File "C:\ProgramData\Anaconda3\lib\site-packages\googlesearch\googlesearch.py", line 64, in search
    totalText = soup.select(GoogleSearch.TOTAL_SELECTOR)[0].children.__next__()
IndexError: list index out of range
