
pyppeteer's People

Contributors

alexadusei, berekuk, cdbridger, chrismuir, clarksun, cnicodeme, d33tah, ephes, esemi, hartym, hubertroy, marksteward, miyakogi, pdesgarets, pythad, rs2, rymdhund, scp10011, stellarhoof, therefromhere, wackazong, yykani

pyppeteer's Issues

Demo can not run on Windows 7

I get an error when running the README example "Example: open web page and take a screenshot":

Error in data transfer
Traceback (most recent call last):
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 496, in transfer_data
msg = yield from self.read_message()
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 526, in read_message
frame = yield from self.read_data_frame(max_size=self.max_size)
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 591, in read_data_frame
frame = yield from self.read_frame(max_size)
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 632, in read_frame
extensions=self.extensions,
File "D:\Python36\lib\site-packages\websockets\framing.py", line 100, in read
data = yield from reader(2)
File "D:\Python36\lib\asyncio\streams.py", line 668, in readexactly
yield from self._wait_for_data('readexactly')
File "D:\Python36\lib\asyncio\streams.py", line 458, in _wait_for_data
yield from self._waiter
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 669, in write_frame
yield from self.writer.drain()
File "D:\Python36\lib\asyncio\streams.py", line 323, in drain
raise exc
File "D:\Python36\lib\asyncio\selector_events.py", line 762, in write
n = self._sock.send(data)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host.
Traceback (most recent call last):
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 496, in transfer_data
msg = yield from self.read_message()
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 526, in read_message
frame = yield from self.read_data_frame(max_size=self.max_size)
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 591, in read_data_frame
frame = yield from self.read_frame(max_size)
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 632, in read_frame
extensions=self.extensions,
File "D:\Python36\lib\site-packages\websockets\framing.py", line 100, in read
data = yield from reader(2)
File "D:\Python36\lib\asyncio\streams.py", line 668, in readexactly
yield from self._wait_for_data('readexactly')
File "D:\Python36\lib\asyncio\streams.py", line 458, in _wait_for_data
yield from self._waiter
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 669, in write_frame
yield from self.writer.drain()
File "D:\Python36\lib\asyncio\streams.py", line 323, in drain
raise exc
File "D:\Python36\lib\asyncio\selector_events.py", line 762, in write
n = self._sock.send(data)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:\JavaL\WorkSpace\PyTest\Test\pyppeteerDemo.py", line 11, in <module>
asyncio.get_event_loop().run_until_complete(main())
File "D:\Python36\lib\asyncio\base_events.py", line 466, in run_until_complete
return future.result()
File "D:\JavaL\WorkSpace\PyTest\Test\pyppeteerDemo.py", line 9, in main
await browser.close()
File "D:\Python36\lib\site-packages\pyppeteer\browser.py", line 156, in close
await self.disconnect()
File "D:\Python36\lib\site-packages\pyppeteer\browser.py", line 160, in disconnect
await self._connection.dispose()
File "D:\Python36\lib\site-packages\pyppeteer\connection.py", line 139, in dispose
await self._on_close()
File "D:\Python36\lib\site-packages\pyppeteer\connection.py", line 128, in _on_close
await self.connection.close()
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 370, in close
self.timeout, loop=self.loop)
File "D:\Python36\lib\asyncio\tasks.py", line 352, in wait_for
return fut.result()
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 674, in write_frame
raise ConnectionClosed(self.close_code, self.close_reason)
websockets.exceptions.ConnectionClosed: WebSocket connection is closed: code = 1006 (connection closed abnormally [internal]), no reason

page.type('2222222') raises an error

Hi,

await page.click('input[name="password"]')
await page.type('2222222')

When I select an input element and type something, it raises an error:

Task exception was never retrieved
future: <Task finished coro=<Page.press() done, defined at /usr/local/lib/python3.6/site-packages/pyppeteer/page.py:688> exception=KeyError('2',)>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/pyppeteer/page.py", line 696, in press
    await self._keyboard.up(key)
  File "/usr/local/lib/python3.6/site-packages/pyppeteer/input.py", line 61, in up
    self._pressedKeys.remove(key)
KeyError: '2'

But if I change the code to type one character at a time, it works:

await page.type('2')
await page.type('2')
await page.type('2')
await page.type('2')
await page.type('2')

Browser does not start in Docker

Hello and thank you for your awesome package!
I tried it locally on my Ubuntu machine and liked it a lot. Now I'm trying to add it to my project's integration testing setup and I'm running into the following minimal issue:
[image: screenshot of the error]

I guess it might require installation of some additional Debian (because python:3.6 docker image is based on Debian 8) packages or env variables but I don't know which ones exactly.

Please help me:)

Compatibility with Python 3.5?

Hi,

I was wondering if there is any reason for Pyppeteer to be compatible only with Python 3.6+?
I'm referring to PEP 498 (formatted string literals).

The issue is that the latest Debian installation comes with Python 3.5, not Python 3.6, which renders Pyppeteer unusable there.

Unless there is a better reason (for instance, using async features that are not available before 3.6), it would be better to replace f'' with ''.format().

(I can do a PR if needed)
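
To make the proposed change concrete, here is a small sketch of the two spellings side by side. The message text and the value are made up for illustration and are not taken from the pyppeteer source.

```python
# Two spellings of the same message; only the first needs Python 3.6+.
code = 127  # hypothetical value, for illustration only

with_fstring = f'Unexpected return code: {code}'         # PEP 498, Python 3.6+
with_format = 'Unexpected return code: {}'.format(code)  # also valid on 3.5

# Both produce the same string.
assert with_fstring == with_format
```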

Grab href value from anchor tags?

Hello,

I'm having difficulty figuring out how I can grab href values from anchor tags using pyppeteer.

I'm able to find the ElementHandle of these anchor tags, but how am I supposed to get the href attribute?
Is this done using evaluate function from Page class? I've tried many variations of grabbing the href attribute with no success.

prize_url_element = await giveaway.xpath(
                '//a[@class="a-link-normal giveAwayItemDetails"]/@href'
)

prize_url = await ga_page.evaluate(
                '(prize_url_element) => prize_url_element.href',
                prize_url_element
)

Might be doing something wrong but maybe someone can help me out.

Thanks!

KeyError: 'webSocketDebuggerUrl'

This code was working on a previous OS (Alpine Linux), but I'm now deploying to "production" (a fresh Alpine 3.7 Linux system) and running into this error. I'm not sure what to expect, since this is "working" code:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/asyncio/tasks.py", line 180, in _step
    result = coro.send(None)
  File "/home/teratorn/code/catwiz/catwiz/pdf.py", line 58, in _main
    browser = await launch(executablePath=self.exe_path)
  File "/home/teratorn/code/catwiz/venv3/lib/python3.6/site-packages/pyppeteer/launcher.py", line 243, in launch
    return await Launcher(options, **kwargs).launch()
  File "/home/teratorn/code/catwiz/venv3/lib/python3.6/site-packages/pyppeteer/launcher.py", line 160, in launch
    self.browserWSEndpoint = self._get_ws_endpoint()
  File "/home/teratorn/code/catwiz/venv3/lib/python3.6/site-packages/pyppeteer/launcher.py", line 179, in _get_ws_endpoint
    return data['webSocketDebuggerUrl']
KeyError: 'webSocketDebuggerUrl'

Errors while using the latest version of websockets

Hello.
I was playing with sample code like this:

import asyncio
import logging

from pyppeteer import launch

logging.basicConfig(level=logging.DEBUG)

async def main():
    browser = launch(options={
        'headless': True,
        'timeout': 10000,  # Maximum time in milliseconds to wait for the browser instance to start
    })
    url = 'http://httpbin.org/anything'
    page = await browser.newPage()

    response = await page.goto(url, options={
        'timeout': 3000,
        'waitUntil': 'load'})
    print('response status: {}'.format(response.status))
    await browser.close()

loop = asyncio.get_event_loop()
loop.set_debug(enabled=True)
loop.run_until_complete(main())

With Python 3.6 and websockets==3.3 it works without any trouble.
But after I upgraded websockets to 4.0.1, it started failing with this message:

response status: 200
Task exception was never retrieved
future: <Task finished coro=<Connection._recv_loop() done, defined at /usr/local/lib/python3.6/site-packages/pyppeteer/connection.py:47> exception=InvalidState('Cannot write to a WebSocket in the CLOSING state',)>
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/pyppeteer/connection.py", line 53, in _recv_loop
resp = await self.connection.recv()
File "/usr/local/lib/python3.6/site-packages/websockets/protocol.py", line 309, in recv
loop=self.loop, return_when=asyncio.FIRST_COMPLETED)
File "/usr/local/lib/python3.6/asyncio/tasks.py", line 307, in wait
return (yield from _wait(fs, timeout, return_when, loop))
File "/usr/local/lib/python3.6/asyncio/tasks.py", line 390, in _wait
yield from waiter
concurrent.futures._base.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/pyppeteer/connection.py", line 58, in _recv_loop
break
File "/usr/local/lib/python3.6/site-packages/websockets/client.py", line 390, in __aexit__
yield from self.ws_client.close()
File "/usr/local/lib/python3.6/site-packages/websockets/protocol.py", line 370, in close
self.timeout, loop=self.loop)
File "/usr/local/lib/python3.6/asyncio/tasks.py", line 352, in wait_for
return fut.result()
File "/usr/local/lib/python3.6/site-packages/websockets/protocol.py", line 642, in write_frame
"in the {} state".format(self.state.name))
websockets.exceptions.InvalidState: Cannot write to a WebSocket in the CLOSING state

Any suggestions on how to handle this?

Submitting a form should open a new tab but doesn't, and the program hangs

This happened when I tried to visit a website using pyppeteer 0.0.17. I clicked a submit button and the browser should have opened a new tab, but it didn't. The page just seemed to reload, and I had to refill the form and submit again. The code I used is as follows:

result_page_future = asyncio.get_event_loop().create_future()
browser.once('targetcreated', lambda target: result_page_future.set_result(target))
await page.click('#_searchButton', {"delay": 0.5})
result_page = await (await result_page_future).page()

The program just hung and never got as far as assigning result_page. I don't understand why it didn't time out.
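
A future created with create_future() has no timeout of its own, so awaiting it blocks until something resolves it. A minimal sketch of bounding the wait with asyncio.wait_for follows; the helper name wait_for_target and the choice to return None on timeout are illustrative assumptions.

```python
import asyncio


async def wait_for_target(future, timeout: float):
    """Await `future`, but give up after `timeout` seconds."""
    try:
        return await asyncio.wait_for(future, timeout)
    except asyncio.TimeoutError:
        # wait_for cancels the future on timeout; the caller decides
        # whether to retry the click or raise.
        return None


# Usage inside a coroutine, with the future from the snippet above:
#     target = await wait_for_target(result_page_future, timeout=10)
#     if target is None:
#         ...  # no 'targetcreated' event arrived in time
```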

Websocket connection is lost on some websites (ConnectionClosed)

I have a script that basically opens a tab, plays with a website, closes the tab, and loops.

On my laptop (OSX), it works just fine. Now when I run that in a container (either with /dev/shm disabled, or with --shm-size=2g), it does work, but at some point the websocket connection is lost and an exception happens in pyppeteer.connection.Connection, in _async_send() (websocket.ConnectionClosed I think).

[EDIT 04/12] this is not container related, I can have the same error happen on my local laptop. For some reason, it happens less, but still happens.

Error:

websockets.exceptions.ConnectionClosed  WebSocket connection is closed: code = 1006 (connection closed abnormally [internal]), no reason
ERR.:0054:asyncio: Task exception was never retrieved
future: <Task finished coro=<Connection._async_send() done, defined at /env/lib/python3.6/site-packages/pyppeteer/connection.py:61> exception=ConnectionClosed('WebSocket connection is closed: code = 1006 (connection closed abnormally [internal]), no reason',)>
│ Traceback (most recent call last):
│   File "/env/lib/python3.6/site-packages/pyppeteer/connection.py", line 64, in _async_send
│     await self.connection.send(msg)
│   File "/env/lib/python3.6/site-packages/websockets/protocol.py", line 334, in send
│     yield from self.ensure_open()
│   File "/env/lib/python3.6/site-packages/websockets/protocol.py", line 470, in ensure_open
│     raise ConnectionClosed(self.close_code, self.close_reason)
└ websockets.exceptions.ConnectionClosed  WebSocket connection is closed: code = 1006 (connection closed abnormally [internal]), no reason

I tried enabling asyncio's debug info and looking at the logs. It seems that at some point, for no reason that I can think of, an OP_CLOSE frame really does go through the write_frame() coroutine in the ws protocol. I lack some knowledge of asyncio and can't find a way to trace it back to the reason for this close.

Then, the next async_send() complains, with an uncaught error.

    async def _async_send(self, msg: str) -> None:
        while not self._connected:
            await asyncio.sleep(self._delay)
        await self.connection.send(msg)

I tried to modify the above code to catch the connection error around the send() call and reconnect blindly (OK, that was a bit naive, but who knows), but that did not do the trick, probably because I do it at a random time without knowing any state.

Note that the browser process does not die, and I can reproduce the problem more systematically if I wrap the goto call in an asyncio.wait_for with a small timeout (small enough that the navigation cannot actually complete). I guess the latter is a bad idea, and I stopped doing it, but the error still happens at completely random times even without it: low or high goto timeouts, 50 tabs or only 1, 10 seconds after starting the script or 30 minutes after, and never on the same URL.

I don't know what kind of information I can provide to help, or what kind of tools I can use to debug and fix this. Asyncio's "hey, here is your empty stack trace, have fun" way of helping is giving me a hard time.

Support browser contexts to launch different sessions

puppeteer/puppeteer#85

Proposed addition at line 110 of browser.py:
https://github.com/miyakogi/pyppeteer/blob/dev/pyppeteer/browser.py#L110

async def newIncognitoPage(self) -> Tuple[Page, str]:  # needs `from typing import Tuple`
    """Make a new incognito page on this browser.

    Returns the page together with the ID of its browser context.
    """
    browserContextId = (await self._connection.send(
        'Target.createBrowserContext', {})).get('browserContextId')
    targetId = (await self._connection.send(
        'Target.createTarget',
        {'url': 'about:blank', 'browserContextId': browserContextId})).get('targetId')
    target = self._targets.get(targetId)
    if target is None:
        raise BrowserError('Failed to create target for page.')
    if not await target._initializedPromise:
        raise BrowserError('Failed to create target for page.')
    page = await target.page()
    if page is None:
        raise BrowserError('Failed to create page.')
    return page, browserContextId

No usable sandbox! error. Chrome exits with exit code 1.

pyppeteer downloaded a chrome instance to ~/.pyppeteer. However, it drops an error if I try to launch it:

$ ./chrome  
[9995:9995:0305/230635.827218:FATAL:zygote_host_impl_linux.cc(124)] No usable sandbox! 
Update your kernel or see https://chromium.googlesource.com/chromium/src/+/master/docs/linux_suid_sandbox_development.md 
for more information on developing with the SUID sandbox. 
If you want to live dangerously and need an immediate workaround, you can try using --no-sandbox.
#0 0x55802f65c6ac base::debug::StackTrace::StackTrace()
#1 0x55802f673e23 logging::LogMessage::~LogMessage()
#2 0x55802e73da61 content::ZygoteHostImpl::Init()
...
[end of stack trace]
Calling _exit(1). Core file will not be generated.
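
Chrome's own message above names --no-sandbox as an immediate workaround. A minimal sketch of passing that flag through pyppeteer follows, assuming a version whose launch() accepts an args list. The helper name sandboxless_launch_options is hypothetical, and running without the sandbox weakens isolation, so it is probably only appropriate for trusted pages or throwaway containers.

```python
def sandboxless_launch_options(headless: bool = True) -> dict:
    """Build keyword arguments for launch() that disable the Chrome sandbox."""
    return {
        'headless': headless,
        # Flag suggested by Chrome's error message; it reduces isolation.
        'args': ['--no-sandbox'],
    }


# Usage inside a coroutine (assumes launch() is awaitable in your version):
#     from pyppeteer import launch
#     browser = await launch(**sandboxless_launch_options())
```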

ConnectionResetError

pyppeteer==0.0.16

code:

# coding:utf-8
import asyncio
from pyppeteer import launch

async def main():
    brower = await launch()
    await brower.close()

asyncio.get_event_loop().run_until_complete(main())

error:

Error in data transfer
Traceback (most recent call last):
  File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 496, in transfer_data
    msg = yield from self.read_message()
  File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 526, in read_message
    frame = yield from self.read_data_frame(max_size=self.max_size)
  File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 591, in read_data_frame
    frame = yield from self.read_frame(max_size)
  File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 632, in read_frame
    extensions=self.extensions,
  File "D:\code\py3\.venv\lib\site-packages\websockets\framing.py", line 100, in read
    data = yield from reader(2)
  File "F:\python3\Lib\asyncio\streams.py", line 674, in readexactly
    yield from self._wait_for_data('readexactly')
  File "F:\python3\Lib\asyncio\streams.py", line 464, in _wait_for_data
    yield from self._waiter
  File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 669, in write_frame
    yield from self.writer.drain()
  File "F:\python3\Lib\asyncio\streams.py", line 329, in drain
    raise exc
  File "F:\python3\Lib\asyncio\selector_events.py", line 761, in write
    n = self._sock.send(data)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host.
Traceback (most recent call last):
  File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 496, in transfer_data
    msg = yield from self.read_message()
  File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 526, in read_message
    frame = yield from self.read_data_frame(max_size=self.max_size)
  File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 591, in read_data_frame
    frame = yield from self.read_frame(max_size)
  File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 632, in read_frame
    extensions=self.extensions,
  File "D:\code\py3\.venv\lib\site-packages\websockets\framing.py", line 100, in read
    data = yield from reader(2)
  File "F:\python3\Lib\asyncio\streams.py", line 674, in readexactly
    yield from self._wait_for_data('readexactly')
  File "F:\python3\Lib\asyncio\streams.py", line 464, in _wait_for_data
    yield from self._waiter
  File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 669, in write_frame
    yield from self.writer.drain()
  File "F:\python3\Lib\asyncio\streams.py", line 329, in drain
    raise exc
  File "F:\python3\Lib\asyncio\selector_events.py", line 761, in write
    n = self._sock.send(data)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/code/py3/pyppeteer/hello.py", line 15, in <module>
    asyncio.get_event_loop().run_until_complete(main())
  File "F:\python3\Lib\asyncio\base_events.py", line 467, in run_until_complete
    return future.result()
  File "D:/code/py3/pyppeteer/hello.py", line 11, in main
    await brower.close()
  File "D:\code\py3\.venv\lib\site-packages\pyppeteer\browser.py", line 156, in close
    await self.disconnect()
  File "D:\code\py3\.venv\lib\site-packages\pyppeteer\browser.py", line 160, in disconnect
    await self._connection.dispose()
  File "D:\code\py3\.venv\lib\site-packages\pyppeteer\connection.py", line 139, in dispose
    await self._on_close()
  File "D:\code\py3\.venv\lib\site-packages\pyppeteer\connection.py", line 128, in _on_close
    await self.connection.close()
  File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 370, in close
    self.timeout, loop=self.loop)
  File "F:\python3\Lib\asyncio\tasks.py", line 358, in wait_for
    return fut.result()
  File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 674, in write_frame
    raise ConnectionClosed(self.close_code, self.close_reason)
websockets.exceptions.ConnectionClosed: WebSocket connection is closed: code = 1006 (connection closed abnormally [internal]), no reason

Process finished with exit code 1

RuntimeWarning: coroutine 'Browser.close' was never awaited

I just ran the demo and hit a bug.

import asyncio
from pyppeteer.launcher import launch

async def main(browser):
    page = await browser.newPage()
    await page.goto('http://example.com')
    await page.screenshot({'path': 'example.png'})

    dimensions = await page.evaluate('''() => {
        return {
            width: document.documentElement.clientWidth,
            height: document.documentElement.clientHeight,
            deviceScaleFactor: window.devicePixelRatio,
        }
    }''')

    print(dimensions)
browser = launch(
    executablePath="/Applications/Chromium.app/Contents/MacOS/Chromium")
asyncio.get_event_loop().run_until_complete(main(browser=browser))
browser.close()

the result is

$ python3 test.py
{'width': 800, 'height': 600, 'deviceScaleFactor': 1}
test.py:21: RuntimeWarning: coroutine 'Browser.close' was never awaited
  browser.close()
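
The warning says that Browser.close is a coroutine in the installed version, so calling it without await only creates a coroutine object that never runs. A minimal sketch of the pattern follows, using a stand-in class so it runs without a browser. StubBrowser is hypothetical and is not part of pyppeteer.

```python
import asyncio


class StubBrowser:
    """Hypothetical stand-in for an object whose close() is a coroutine."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


async def main(browser: StubBrowser) -> None:
    try:
        pass  # page work (newPage, goto, screenshot, ...) would go here
    finally:
        # Awaiting inside the coroutine is what actually runs close().
        await browser.close()
```

With the real library the same shape would apply: await the close call inside main() instead of calling it after run_until_complete() has returned.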

Task exception was never retrieved

I randomly (but quite frequently) get this error when I try to crawl several pages of a website:

Task exception was never retrieved
future: <Task finished coro=<CDPSession.send() done, defined at /Users/user/.virtualenvs/project/lib/python3.6/site-packages/pyppeteer/connection.py:180> exception=NetworkError('Protocol Error: Unknown event id: 25 None',)>
Traceback (most recent call last):
  File "/Users/user/.virtualenvs/project/lib/python3.6/site-packages/pyppeteer/connection.py", line 200, in send
    return await callback

Any idea where this error could come from?

Also, this does not seem to affect the scenario but it looks like tasks are not handled the way asyncio expects in the connection class.

Cannot launch chromium

Background:

My OS is Windows 7 64bit.

I downloaded chromium at GitHub because I cannot access https://storage.googleapis.com/chromium-browser-snapshots. I also put it into C:\Users\Administrator\.pyppeteer\local-chromium\533271\chrome-win32.

chromium version is 66.0.3336.0.

When I first run the example code:

import asyncio
from pyppeteer import launch

async def main():
    browser = launch()
    page = await browser.newPage()
    await page.goto('http://example.com')
    await page.screenshot({'path': 'example.png'})
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

I got a UnicodeDecodeError (because the output contains Chinese), so I changed the decoding from UTF-8 to GBK in launch.py:

msg = self.proc.stdout.readline().decode()

to

msg = self.proc.stdout.readline().decode('GBK')
print(msg)

Issue

I ran the example code again and got no output.
It stayed pending until I killed the chrome.exe process.
I printed msg and got this information:

[0309/215247.680:ERROR:network_change_notifier_win.cc(157)] WSALookupServiceBegin failed with: 10108

[0309/215247.696:ERROR:gpu_process_transport_factory.cc(1019)] Lost UI shared context.

[0309/215247.712:ERROR:tcp_socket_win.cc(272)] CreatePlatformSocket() returned an error: Unable to load or initialize the requested service provider. (0x277A)

[0309/215247.712:WARNING:net_errors_win.cc(119)] Unknown error 10106 mapped to net::ERR_FAILED

[0309/215247.712:ERROR:tcp_socket_win.cc(272)] CreatePlatformSocket() returned an error: Unable to load or initialize the requested service provider. (0x277A)

[0309/215247.712:WARNING:net_errors_win.cc(119)] Unknown error 10106 mapped to net::ERR_FAILED

[0309/215247.712:ERROR:devtools_http_handler.cc(249)] Cannot start http server for devtools. Stop devtools.

  1. It stays pending forever.
  2. What can I do so that I can use pyppeteer?

Thanks for your response.

Usage example has an error

import asyncio
from pyppeteer.launcher import launch

async def main():
    browser = launch()
    page = await browser.newPage()
    await page.goto('http://example.com')
    await page.screenshot({'path': 'example.png'})
    browser.close()

asyncio.get_event_loop().run_until_complete(main())

Running this example gives:

Traceback (most recent call last):
  File "D:/Project/Pytho_Project/ComputerVision/pyppe2.py", line 12, in <module>
    loop.run_until_complete(main())
  File "D:\python3.6\lib\asyncio\base_events.py", line 467, in run_until_complete
    return future.result()
  File "D:/Project/Pytho_Project/ComputerVision/pyppe2.py", line 9, in main
    browser.close()
  File "D:\python3.6\lib\site-packages\pyppeteer\browser.py", line 34, in close
    asyncio.get_event_loop().run_until_complete(self._connection.dispose())
  File "D:\python3.6\lib\asyncio\base_events.py", line 454, in run_until_complete
    self.run_forever()
  File "D:\python3.6\lib\asyncio\base_events.py", line 408, in run_forever
    raise RuntimeError('This event loop is already running')
RuntimeError: This event loop is already running

Deleting browser.close() fixes it.

Webpage to PDF

Hi,

Thanks for this port!

I was looking at the docs, which say the Page.pdf feature is not yet implemented. Do you have plans to complete that feature anytime soon?

Regexp "DevTools listening..." does not match chromium-browser output

I was trying to run pyppeteer in an Alpine container, where the downloaded Chrome does not work due to libc issues.
Installing chromium worked, but pyppeteer could not connect to it. Looking at the code, I found out why:
pyppeteer uses this regexp

DevTools listening on (ws://.*)

but chromium-browser returns this string:

DevTools listening on 127.0.0.1:12345

and that does not match. Can we change the code to match both scenarios? In my fork I hacked it to match only my scenario, but a structural solution is of course preferred.
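
One possible shape for a pattern that accepts both banners is sketched below. This is an illustration, not the pattern pyppeteer uses, and the names DEVTOOLS_BANNER and parse_devtools_banner are hypothetical. A bare host:port is not itself a WebSocket URL, so the caller would still have to resolve it to one.

```python
import re

# Accepts either a full ws:// URL or a bare host:port after the banner text.
DEVTOOLS_BANNER = re.compile(r'DevTools listening on (ws://\S+|[\w.\-]+:\d+)')


def parse_devtools_banner(line: str):
    """Return the endpoint text from a banner line, or None if absent."""
    match = DEVTOOLS_BANNER.search(line)
    return match.group(1) if match else None


# Both banner styles quoted in this issue are recognised:
#     parse_devtools_banner('DevTools listening on 127.0.0.1:12345')
#     parse_devtools_banner('DevTools listening on ws://127.0.0.1:12345/devtools/browser/x')
```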

waitForNavigation doesn't work after clicking a link

# coding:utf-8
import asyncio
from pyppeteer import launch

async def main():
    brower = await launch(headless=False)
    page = await brower.newPage()
    await page.setViewport(dict(width=1200, height=1000))
    await page.goto("https://github.com")
    await page.click('.HeaderMenu [href="/features"]')
    await page.waitForNavigation()

asyncio.get_event_loop().run_until_complete(main())

The page https://github.com/features has loaded, but the navigation timed out:

Traceback (most recent call last):
  File "D:/code/py3/pyppeteer/hello.py", line 42, in <module>
    asyncio.get_event_loop().run_until_complete(main())
  File "F:\python3\Lib\asyncio\base_events.py", line 467, in run_until_complete
    return future.result()
  File "D:/code/py3/pyppeteer/hello.py", line 14, in main
    await page.waitForNavigation()
  File "D:\code\py3\.venv\lib\site-packages\pyppeteer\page.py", line 698, in waitForNavigation
    raise error
pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 30000 ms exceeded.

puppeteer solution

https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pageclickselector-options

const [response] = await Promise.all([
  page.waitForNavigation(waitOptions),
  page.click(selector, clickOptions),
]);

But how can this be done in Python?
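
A likely Python counterpart of the Promise.all pattern is asyncio.gather, which starts the wait before the click has a chance to finish. A minimal sketch follows; the helper name click_and_wait is hypothetical.

```python
import asyncio


async def click_and_wait(page, selector: str):
    """Click `selector` and wait for the navigation it triggers."""
    # Both coroutines are scheduled together, so the navigation
    # cannot complete before the wait has started.
    response, _ = await asyncio.gather(
        page.waitForNavigation(),
        page.click(selector),
    )
    return response


# Usage inside a coroutine:
#     await click_and_wait(page, '.HeaderMenu [href="/features"]')
```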

Set Download Location in Pyppeteer in headless mode

In puppeteer there is a way to choose where downloaded files are stored in headless mode, as described in this Stack Overflow answer:
await page._client.send('Page.setDownloadBehavior', {behavior: 'allow', downloadPath: './myAwesomeDownloadFolder'});

Is there any similar method that exists in pyppeteer?

pyppeteer.errors.BrowserError: Unexpectedly chrome process closed with return code: 127

Dev branch

When I run the first README example script (Example: open web page and take a screenshot), I get this error:

python3 example.py
Traceback (most recent call last):
  File "example.py", line 11, in <module>
    asyncio.get_event_loop().run_until_complete(main())
  File "/usr/lib/python3.6/asyncio/base_events.py", line 467, in run_until_complete
    return future.result()
  File "example.py", line 5, in main
    browser = launch()
  File "/usr/local/lib/python3.6/dist-packages/pyppeteer/launcher.py", line 161, in launch
    return Launcher(options, **kwargs).launch()
  File "/usr/local/lib/python3.6/dist-packages/pyppeteer/launcher.py", line 127, in launch
    raise BrowserError('Unexpectedly chrome process closed with '
pyppeteer.errors.BrowserError: Unexpectedly chrome process closed with return code: 127

cannot import name 'launch'

Hi,

I'm trying to run the example but got this error:

Traceback (most recent call last):
  File "pyppeteer.py", line 2, in <module>
    from pyppeteer import launch
  File "test\pyppeteer.py", line 2, in <module>
    from pyppeteer import launch
ImportError: cannot import name 'launch'

Tried and failed on both Mac and Windows.

Installation failed: UnicodeDecodeError

Hello, I had a problem when I try to install the package. The following is my environment.

  • Python 3.5.2 :: Anaconda custom (64-bit)
  • Windows 10
  • pip 9.0.1

I used pip install pyppeteer to install the package. The error is:

Collecting pyppeteer
  Using cached pyppeteer-0.0.12.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Users\secsi\AppData\Local\Temp\pip-build-mbsh4qol\pyppeteer\setup.py", line 27, in <module>
        compile_files(in_dir, out_dir, target)
      File "c:\users\secsi\anaconda3\lib\site-packages\py_backwards\compiler.py", line 85, in compile_files
        dependencies.update(_compile_file(paths, target))
      File "c:\users\secsi\anaconda3\lib\site-packages\py_backwards\compiler.py", line 57, in _compile_file
        code = f.read()
    UnicodeDecodeError: 'gbk' codec can't decode byte 0xb6 in position 2147: illegal multibyte sequence

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in C:\Users\secsi\AppData\Local\Temp\pip-build-mbsh4qol\pyppeteer\

Pyppeteer hangs on some websites (Navigation Timeout Exceeded: 30000 ms exceeded).

Pyppeteer hangs on some websites, throwing the following error:
pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 30000 ms exceeded.

This is the example:

import asyncio
from pyppeteer import launch

async def main():
    browser = await launch({'headless': False})
    page = await browser.newPage()
    await page.goto('https://www.ferragamo.com/shop/us/en/women/silk-bijoux/shawls-and-stoles')
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

I have tried both 'waitUntil': 'networkidle0' and 'waitUntil': 'networkidle2'. Neither helps for this particular website.

local variable 'watchdog' referenced before assignment

On the dev branch, when timeout is set to 0, commit 158bfaf causes line 98 (if not watchdog.done():) to raise UnboundLocalError: local variable 'watchdog' referenced before assignment.

Traceback (most recent call last):
  File "./test_pype.py", line 48, in <module>
    asyncio.get_event_loop().run_until_complete(main())
  File "/usr/lib64/python3.6/asyncio/base_events.py", line 467, in run_until_complete
    return future.result()
  File "./test_pype.py", line 44, in main
    await page.goto(url, timeout=0)
  File "/home/centos/p3venv/lib64/python3.6/site-packages/pyppeteer/page.py", line 487, in goto
    error = await navigationPromise
  File "/home/centos/p3venv/lib64/python3.6/site-packages/pyppeteer/navigator_watcher.py", line 98, in waitForNavigation
    if not watchdog.done():
UnboundLocalError: local variable 'watchdog' referenced before assignment
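
The traceback above is the usual symptom of a name that is only assigned on one branch (here, presumably only when a timeout is set) but read on every path. The sketch below illustrates that pattern and one way to guard it. It is an assumption-laden stand-in, not the code in navigator_watcher.py: `wait_for_navigation` and its parameters are hypothetical.

```python
import asyncio
from typing import Any, Awaitable


async def wait_for_navigation(navigation: Awaitable[Any], timeout_ms: int) -> Any:
    """Await `navigation`; start a timeout watchdog only when timeout_ms > 0."""
    # Bind the name on every path so the cleanup below cannot raise
    # UnboundLocalError when no watchdog is created (timeout_ms == 0).
    watchdog = None
    nav = asyncio.ensure_future(navigation)
    waiters = [nav]
    if timeout_ms:
        watchdog = asyncio.ensure_future(asyncio.sleep(timeout_ms / 1000))
        waiters.append(watchdog)

    await asyncio.wait(waiters, return_when=asyncio.FIRST_COMPLETED)

    # Only touch the watchdog if it exists.
    if watchdog is not None and not watchdog.done():
        watchdog.cancel()
    if not nav.done():
        nav.cancel()
        raise asyncio.TimeoutError(
            "Navigation Timeout Exceeded: {} ms exceeded".format(timeout_ms))
    return nav.result()


if __name__ == "__main__":
    # timeout_ms=0 means "no timeout" and must not fail on the watchdog check.
    print(asyncio.run(wait_for_navigation(asyncio.sleep(0, result="ok"), 0)))
```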

[feature] Enhance launcher to let the end user manage signals / exit handlers if they want to

Currently the Launcher contains the following code:

        # dont forget to close browser process
        atexit.register(_close_process)
        if self.options.get('handleSIGINT', True):
            signal.signal(signal.SIGINT, _close_process)
        if self.options.get('handleSIGTERM', True):
            signal.signal(signal.SIGTERM, _close_process)
        if not sys.platform.startswith('win'):
            # SIGHUP is not defined on windows
            if self.options.get('handleSIGHUP', True):
                signal.signal(signal.SIGHUP, _close_process)

Although this is handy when getting started with pyppeteer, it causes various problems when pyppeteer is integrated into other code. What if signal handlers are already installed? What if an error happens in the _close_process callable? And so on.

To overcome this, I have my own Launcher that is basically a copy-paste of the original launcher without this part.

I propose making this configurable. We can keep the current behaviour as the default (even though I don't think it's a good idea).

I can work on a PR if you're not against it, let me know.
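
To make the proposal concrete, here is a minimal sketch of how the decision could be separated from the registration itself. It only computes which hooks would be installed and installs nothing. The `handleSIGINT`, `handleSIGTERM` and `handleSIGHUP` keys come from the launcher code quoted above; the `autoClose` key and the `plan_exit_hooks` function are hypothetical names used for illustration.

```python
import sys
from typing import Any, Dict, List


def plan_exit_hooks(options: Dict[str, Any], platform: str = sys.platform) -> List[str]:
    """Return the names of the exit hooks a launcher would install for `options`."""
    hooks = []  # type: List[str]
    # Hypothetical option: lets the caller opt out of the atexit handler.
    if options.get('autoClose', True):
        hooks.append('atexit')
    if options.get('handleSIGINT', True):
        hooks.append('SIGINT')
    if options.get('handleSIGTERM', True):
        hooks.append('SIGTERM')
    # SIGHUP is not defined on Windows.
    if not platform.startswith('win') and options.get('handleSIGHUP', True):
        hooks.append('SIGHUP')
    return hooks


if __name__ == "__main__":
    # Defaults keep the current behaviour.
    print(plan_exit_hooks({}))
    # An embedding application that manages its own handlers opts out of everything.
    print(plan_exit_hooks({'autoClose': False, 'handleSIGINT': False,
                           'handleSIGTERM': False, 'handleSIGHUP': False}))
```

Keeping every default at True preserves the existing behaviour, while an application with its own signal handling can switch each hook off individually.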

pyppeteer fails in a simple test case where puppeteer does not

I tried a small comparative test; this puppeteer code works and successfully takes a screenshot:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://davepeck.org/');
  await page.screenshot({path: '/tmp/dave-js.png'});

  await browser.close();
})();

The equivalent pyppeteer code does not work:

import asyncio
from pyppeteer.launcher import launch

async def main(browser):
    page = await browser.newPage()
    await page.goto('https://davepeck.org/')
    await page.screenshot({'path': '/tmp/dave-py.png'})

browser = launch()
asyncio.get_event_loop().run_until_complete(main(browser))
browser.close()

It hits its 3 second navigation timeout; explicitly setting a different timeout (something that will definitely be long enough to load the page in question, like await page.goto('https://davepeck.org/', timeout=150000)) similarly fails.

Details:
- Pyppeteer: Python 3.6.4, Pyppeteer 0.0.9, chromium revision 497674 (62.0.3198.0), macOS 10.13.3
- Puppeteer: Node 9.4.0, Puppeteer 1.0.0, chromium revision 526987 (65.0.3312.0), macOS 10.13.3

Thoughts:
- It wouldn't surprise me if the very different chromium versions are to blame, although I'm not certain.
- I noticed this with my personal blog, but I suspect we'll find similar behavior for other pages?

Happy to test whatever you'd like.

iFrames are not loaded

I have tested several different sites that contain iframes, and whenever I try to access page.frames an AttributeError is raised. I have manually checked that the iframes are in place.

Traceback (most recent call last):
  File "async_test.py", line 44, in <module>
    loop.run_until_complete(c.main())
  File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/base_events.py", line 467, in run_until_complete
    return future.result()
  File "async_test.py", line 23, in main
    await self.start()
  File "async_test.py", line 29, in start_challenge
    print(self.page.frames)
  File "/Users/ordanis/.local/share/virtualenvs/test-mL5IBpML/lib/python3.6/site-packages/pyppeteer/page.py", line 171, in frames
    return list(self._frames.values())
AttributeError: 'Page' object has no attribute '_frames'

Default example on main readme doesn't work.

The example indicated here:
https://github.com/miyakogi/pyppeteer#usage

import asyncio
from pyppeteer import launch

async def main():
    browser = launch()
    page = await browser.newPage()
    await page.goto('http://example.com')
    await page.screenshot({'path': 'example.png'})
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

is not working for me. I get a RuntimeError: This event loop is already running, because browser.close() also makes a run_until_complete call in the Connection.py file.

I kind of "fixed" it by doing the following, which is not perfect. Does anyone have a better solution?

asyncio.get_event_loop().run_until_complete(main())

browser.close()

try:
    asyncio.get_event_loop().run_until_complete(asyncio.gather(*asyncio.Task.all_tasks()))
except Exception as e:
    # Will almost every time show an exception of either:
    # <class 'websockets.exceptions.InvalidState'> Cannot write to a WebSocket in the CLOSING state
    # Or a CancelledError
    print("Exception when closing:")
    print(e.__class__, e)
    pass
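
The RuntimeError described above can be reproduced without pyppeteer: asyncio refuses a nested `run_until_complete()` call on a loop that is already running. The sketch below demonstrates that behaviour and the usual alternative of awaiting the work directly. Whether `browser.close()` does exactly this internally is the reporter's reading, not something verified here; `nested_run`, `awaited_run` and `work` are hypothetical names.

```python
import asyncio


async def work() -> str:
    """Stand-in for an asynchronous cleanup step."""
    return "done"


async def nested_run() -> str:
    """Call run_until_complete() from inside a coroutine and report the error."""
    loop = asyncio.get_running_loop()
    pending = loop.create_future()
    try:
        # The loop is already running this coroutine, so this call is rejected.
        loop.run_until_complete(pending)
    except RuntimeError as exc:
        return str(exc)
    finally:
        pending.cancel()
    return "no error"


async def awaited_run() -> str:
    """Inside a coroutine, await the work instead of nesting the loop."""
    return await work()


if __name__ == "__main__":
    print(asyncio.run(nested_run()))
    print(asyncio.run(awaited_run()))
```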

"browser.close()" doesn't terminate browser process

I looked at the source code of the method "browser.close()", which is as follows:

async def close(self) -> None:
    """Close connections and terminate browser process."""
    await self._closeCallback()
    await self.disconnect()

It does not seem to terminate the browser process: the Chromium process is still running.

Building docs failed with sphinx

Environment:
Windows 10
Python 3.6.2
Sphinx 1.7.1

When running the command make.bat html | text | etc., I got an error saying that the sphinxcontrib.asyncio extension was not installed. Removing the extension from conf.py fixed the issue.

The documentation should explicitly mention a step to get around this.

ElementHandle .querySelector() and J() missing

The docs mention ElementHandle.querySelector() and ElementHandle.J(), but neither seems to be implemented.

(Pdb) elem.querySelector("h1")
*** AttributeError: 'ElementHandle' object has no attribute 'querySelector'

Can we put chromium download into the install process?

Hi, thanks for this puppeteer port. It works pretty well except for one thing: it downloads the required Chromium during the first launch of the browser.

This is a potential point of failure for every production service: if the resource is (somehow) not available, the service will break, and you will only find out once the service is online.

I'm currently using pyppeteer in a Docker container, and I think that if anything is going to break, it had better break during the build process instead of after deployment.

Support QtWebengine

I would like to use pyppeteer to control a QtWebengine, but it throws an error.

/usr/bin/python3.6 /home/matty/PycharmProjects/GhostAuto/AutoWeb.py
Traceback (most recent call last):
  File "/home/matty/PycharmProjects/GhostAuto/AutoWeb.py", line 17, in <module>
    asyncio.get_event_loop().run_until_complete(main())
  File "/usr/lib/python3.6/asyncio/base_events.py", line 467, in run_until_complete
    return future.result()
  File "/home/matty/PycharmProjects/GhostAuto/AutoWeb.py", line 7, in main
    browserWSEndpoint="ws://127.0.0.1:55551/devtools/browser/<id>")
  File "/usr/lib/python3.6/site-packages/pyppeteer/launcher.py", line 263, in connect
    connection, options, None, lambda: connection.send('Browser.close'))
  File "/usr/lib/python3.6/site-packages/pyppeteer/browser.py", line 80, in create
    await connection.send('Target.setDiscoverTargets', {'discover': True})
pyppeteer.errors.NetworkError: Protocol Error: {'code': -32601, 'message': "'Target.setDiscoverTargets' wasn't found"}

I think this is because pyppeteer can't get the browser id, since "http://127.0.0.1:55551/json/version" doesn't supply it. Maybe it is possible to connect straight to a page?

Complete noob question

I've been using phantomjs (which, by the way, is horrible to use with Python), and then I found this. Thank you @miyakogi.

I have some noob questions:

  1. How do you make sure that pyppeteer follows redirects?

  2. How do you fetch a list of links (along with the innerHTML text of the links)?

  3. How do you fetch the PDF that's requested in the header?
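
For the second question, one possible approach is to evaluate a small JavaScript function in the page and let it return the link data. The sketch below assumes that the installed pyppeteer version's `page.evaluate()` accepts a JavaScript function string and returns JSON-serializable results; `collect_links` and `LINKS_JS` are hypothetical names, and `page` is expected to be a pyppeteer Page obtained from `browser.newPage()`.

```python
import asyncio
from typing import Any, Dict, List

# JavaScript run in the page: one {href, text} object per <a> element.
LINKS_JS = """() => Array.from(document.querySelectorAll('a'))
    .map(a => ({href: a.href, text: a.innerHTML}))"""


async def collect_links(page: Any) -> List[Dict[str, str]]:
    """Return href and innerHTML for every link on the page.

    Assumption: `page` exposes an awaitable evaluate(js) method, as a
    pyppeteer Page is expected to.
    """
    return await page.evaluate(LINKS_JS)
```

Inside an existing coroutine it would be called as `links = await collect_links(page)` after `page.goto(...)` has completed.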

waitFor networkidle does not work

Exceptions are thrown if waitFor networkidle is used. One of them is caused by a typo on line 92 in navigator_watcher.py (lambda f: instead of lambda:). Even with that fixed, it does not work. I'll see if I can fix this and submit a proper pull request.

Chrome Hangs

When I run the screenshot sample code it attempts to download Chrome. But after downloading it hangs.

[W:pyppeteer.chromium_downloader] start chromium download. Download may take a few minutes.
[W:pyppeteer.chromium_downloader] chromium download done.
[W:pyppeteer.chromium_downloader] chromium extracted to: /Users/<username>/.pyppeteer/local-chromium/497674

It does download the Chrome app. If I kill the script and retry, it won't download the app again but it doesn't do anything either.

This is the trace when I ^C

^CTraceback (most recent call last):
  File "test.py", line 13, in <module>
    browser = launch()
  File "/usr/local/lib/python3.6/site-packages/pyppeteer/launcher.py", line 117, in launch
    return Launcher(options, **kwargs).launch()
  File "/usr/local/lib/python3.6/site-packages/pyppeteer/launcher.py", line 84, in launch
    time.sleep(0.1)
KeyboardInterrupt

Any ideas?

Navigation Timeout Exceeded: 3000 ms exceeded

Exception in callback NavigatorWatcher.waitForNavigation.<locals>.watchdog_cb(<Task finishe...> result=None>) at /usr/local/lib/python3.6/dist-packages/pyppeteer/navigator_watcher.py:49
handle: <Handle NavigatorWatcher.waitForNavigation.<locals>.watchdog_cb(<Task finishe...> result=None>) at /usr/local/lib/python3.6/dist-packages/pyppeteer/navigator_watcher.py:49>
Traceback (most recent call last):
  File "/usr/lib/python3.6/asyncio/events.py", line 127, in _run
    self._callback(*self._args)
  File "/usr/local/lib/python3.6/dist-packages/pyppeteer/navigator_watcher.py", line 52, in watchdog_cb
    self._timeout)
  File "/usr/local/lib/python3.6/dist-packages/pyppeteer/navigator_watcher.py", line 40, in _raise_error
    raise error
concurrent.futures._base.TimeoutError: Navigation Timeout Exceeded: 3000 ms exceeded

Hi, what does this mean? Is it that the page can't be rendered?

Running in Travis

Good day, thanks for the awesome project!
I'm having some problems running it in Travis, and it seems that you succeeded. Could you help by documenting how to do it?

I've tried several options, both from the Travis docs and from other puppeteer configurations, but after
[W:pyppeteer.chromium_downloader] chromium extracted to: /home/travis/.pyppeteer/local-chromium/543305
I get a pyppeteer.errors.BrowserError: Failed to connect to browser port: http://127.0.0.1:39289/json/version, and I'm not sure how to debug it.
Do you have any suggestions?
