miyakogi / pyppeteer
Headless chrome/chromium automation library (unofficial port of puppeteer)
License: Other
I get an error when running the README example "open web page and take a screenshot":
Error in data transfer
Traceback (most recent call last):
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 496, in transfer_data
msg = yield from self.read_message()
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 526, in read_message
frame = yield from self.read_data_frame(max_size=self.max_size)
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 591, in read_data_frame
frame = yield from self.read_frame(max_size)
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 632, in read_frame
extensions=self.extensions,
File "D:\Python36\lib\site-packages\websockets\framing.py", line 100, in read
data = yield from reader(2)
File "D:\Python36\lib\asyncio\streams.py", line 668, in readexactly
yield from self._wait_for_data('readexactly')
File "D:\Python36\lib\asyncio\streams.py", line 458, in _wait_for_data
yield from self._waiter
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 669, in write_frame
yield from self.writer.drain()
File "D:\Python36\lib\asyncio\streams.py", line 323, in drain
raise exc
File "D:\Python36\lib\asyncio\selector_events.py", line 762, in write
n = self._sock.send(data)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host.
Traceback (most recent call last):
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 496, in transfer_data
msg = yield from self.read_message()
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 526, in read_message
frame = yield from self.read_data_frame(max_size=self.max_size)
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 591, in read_data_frame
frame = yield from self.read_frame(max_size)
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 632, in read_frame
extensions=self.extensions,
File "D:\Python36\lib\site-packages\websockets\framing.py", line 100, in read
data = yield from reader(2)
File "D:\Python36\lib\asyncio\streams.py", line 668, in readexactly
yield from self._wait_for_data('readexactly')
File "D:\Python36\lib\asyncio\streams.py", line 458, in _wait_for_data
yield from self._waiter
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 669, in write_frame
yield from self.writer.drain()
File "D:\Python36\lib\asyncio\streams.py", line 323, in drain
raise exc
File "D:\Python36\lib\asyncio\selector_events.py", line 762, in write
n = self._sock.send(data)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\JavaL\WorkSpace\PyTest\Test\pyppeteerDemo.py", line 11, in <module>
asyncio.get_event_loop().run_until_complete(main())
File "D:\Python36\lib\asyncio\base_events.py", line 466, in run_until_complete
return future.result()
File "D:\JavaL\WorkSpace\PyTest\Test\pyppeteerDemo.py", line 9, in main
await browser.close()
File "D:\Python36\lib\site-packages\pyppeteer\browser.py", line 156, in close
await self.disconnect()
File "D:\Python36\lib\site-packages\pyppeteer\browser.py", line 160, in disconnect
await self._connection.dispose()
File "D:\Python36\lib\site-packages\pyppeteer\connection.py", line 139, in dispose
await self._on_close()
File "D:\Python36\lib\site-packages\pyppeteer\connection.py", line 128, in _on_close
await self.connection.close()
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 370, in close
self.timeout, loop=self.loop)
File "D:\Python36\lib\asyncio\tasks.py", line 352, in wait_for
return fut.result()
File "D:\Python36\lib\site-packages\websockets\protocol.py", line 674, in write_frame
raise ConnectionClosed(self.close_code, self.close_reason)
websockets.exceptions.ConnectionClosed: WebSocket connection is closed: code = 1006 (connection closed abnormally [internal]), no reason
Hi,
await page.click('input[name="password"]')
await page.type('2222222')
When I select an input element and type something, it raises an error:
Task exception was never retrieved
future: <Task finished coro=<Page.press() done, defined at /usr/local/lib/python3.6/site-packages/pyppeteer/page.py:688> exception=KeyError('2',)>
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/pyppeteer/page.py", line 696, in press
await self._keyboard.up(key)
File "/usr/local/lib/python3.6/site-packages/pyppeteer/input.py", line 61, in up
self._pressedKeys.remove(key)
KeyError: '2'
But when I change the code like this:
await page.type('2')
await page.type('2')
await page.type('2')
await page.type('2')
await page.type('2')
it works.
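The KeyError is raised by self._pressedKeys.remove(key), which suggests the pressed keys are tracked in a set and the same key was released twice. Below is a minimal sketch of that failure mode and a tolerant alternative. The PressedKeys class is a hypothetical stand-in written for this illustration, not pyppeteer's actual Keyboard class.

```python
class PressedKeys:
    """Hypothetical stand-in for a keyboard's pressed-key tracker."""

    def __init__(self):
        self._pressed = set()

    def down(self, key):
        self._pressed.add(key)

    def up_strict(self, key):
        # set.remove raises KeyError if the key was already released.
        self._pressed.remove(key)

    def up_tolerant(self, key):
        # set.discard is a no-op when the key is absent.
        self._pressed.discard(key)


keys = PressedKeys()
# Two overlapping presses of the same character collapse into one set entry.
keys.down('2')
keys.down('2')
keys.up_tolerant('2')
keys.up_tolerant('2')  # the second release is harmless
```

Whether discard is the right fix for the library depends on why the presses overlap in the first place; the sketch only shows why repeated characters can trigger the error while single characters do not.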
Hello and thank you for your awesome package!
I tried it locally on Ubuntu and liked it a lot. Now I'm trying to add it to my project's integration testing setup and am running into the following minimal issue:
I guess it might require installing some additional Debian packages (the python:3.6 Docker image is based on Debian 8) or setting environment variables, but I don't know which ones exactly.
Please help me :)
Hi,
I was wondering whether there is any reason for Pyppeteer to be compatible only with Python 3.6+.
I'm referring to PEP 498 (formatted string literals).
The issue is that the latest Debian installation ships with Python 3.5, not Python 3.6, which makes Pyppeteer unusable there.
Unless there is a stronger reason (for instance, async features that are unavailable before 3.6), it would be better to replace f'' with ''.format().
(I can open a PR if needed.)
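For illustration, here is what such a replacement could look like. The variable names and the message are made up for this example and do not come from pyppeteer's source.

```python
name = 'pyppeteer'
version = (3, 5)

# Python 3.6+ only (PEP 498):
#   message = f'running {name} on Python {version[0]}.{version[1]}'
# Equivalent that also runs on Python 3.5:
message = 'running {} on Python {}.{}'.format(name, version[0], version[1])
print(message)
```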
Puppeteer adds elementHandle.screenshot support in v0.13.0. It would be great to have that available in pyppeteer.
Hi,
Here's an issue:
instance = launch(headless=True)
instance.close()
.close() should check whether the connection is active before trying to close it :)
This error happens frequently when I visit a website using pyppeteer. Is it a bug from puppeteer? I saw some people discussing this issue:
puppeteer/puppeteer#1325
Hello,
I'm having difficulty figuring out how I can grab href values from anchor tags using pyppeteer.
I'm able to find the ElementHandle of these anchor tags, but how am I supposed to get the href attribute?
Is this done using evaluate function from Page class? I've tried many variations of grabbing the href attribute with no success.
prize_url_element = await giveaway.xpath(
    '//a[@class="a-link-normal giveAwayItemDetails"]/@href'
)
prize_url = await ga_page.evaluate(
    '(prize_url_element) => prize_url_element.href',
    prize_url_element
)
Might be doing something wrong but maybe someone can help me out.
Thanks!
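One possible approach, sketched under assumptions: select the anchor elements themselves (without the trailing /@href), then read the href property of each handle through evaluate. The sketch assumes that page.xpath returns a list of element handles and that page.evaluate accepts a handle as an argument; get_hrefs is a hypothetical helper name.

```python
import asyncio


async def get_hrefs(page, xpath_expr):
    """Return the href of every element matched by xpath_expr."""
    # The expression should match the <a> elements, not their @href
    # attribute nodes, so each handle refers to an element.
    handles = await page.xpath(xpath_expr)
    return [await page.evaluate('(el) => el.href', handle) for handle in handles]

# Possible usage inside a coroutine (selector taken from the question):
#   hrefs = await get_hrefs(ga_page, '//a[@class="a-link-normal giveAwayItemDetails"]')
```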
This is code that was working on a previous OS (Alpine Linux), but I'm now deploying to "production" (a fresh Alpine 3.7 Linux system) and running into this error. I'm not sure what to expect, since this is "working" code:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/asyncio/tasks.py", line 180, in _step
result = coro.send(None)
File "/home/teratorn/code/catwiz/catwiz/pdf.py", line 58, in _main
browser = await launch(executablePath=self.exe_path)
File "/home/teratorn/code/catwiz/venv3/lib/python3.6/site-packages/pyppeteer/launcher.py", line 243, in launch
return await Launcher(options, **kwargs).launch()
File "/home/teratorn/code/catwiz/venv3/lib/python3.6/site-packages/pyppeteer/launcher.py", line 160, in launch
self.browserWSEndpoint = self._get_ws_endpoint()
File "/home/teratorn/code/catwiz/venv3/lib/python3.6/site-packages/pyppeteer/launcher.py", line 179, in _get_ws_endpoint
return data['webSocketDebuggerUrl']
KeyError: 'webSocketDebuggerUrl'
From the documentation located here:
https://github.com/miyakogi/pyppeteer#usage
The line
from pyppeteer import launch
doesn't work, it should be:
from pyppeteer.launcher import launch
Hello.
I was playing with sample code like this:
import asyncio
import logging
from pyppeteer import launch
logging.basicConfig(level=logging.DEBUG)
async def main():
    browser = launch(options={
        'headless': True,
        'timeout': 10000,  # Maximum time in milliseconds to wait for the browser instance to start
    })
    url = 'http://httpbin.org/anything'
    page = await browser.newPage()
    response = await page.goto(url, options={
        'timeout': 3000,
        'waitUntil': 'load'})
    print('response status: {}'.format(response.status))
    await browser.close()
loop = asyncio.get_event_loop()
loop.set_debug(enabled=True)
loop.run_until_complete(main())
Using Python 3.6 and websockets==3.3, it works without any trouble.
But after I upgraded websockets to 4.0.1, it started failing with this message:
response status: 200
Task exception was never retrieved
future: <Task finished coro=<Connection._recv_loop() done, defined at /usr/local/lib/python3.6/site-packages/pyppeteer/connection.py:47> exception=InvalidState('Cannot write to a WebSocket in the CLOSING state',)>
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/pyppeteer/connection.py", line 53, in _recv_loop
resp = await self.connection.recv()
File "/usr/local/lib/python3.6/site-packages/websockets/protocol.py", line 309, in recv
loop=self.loop, return_when=asyncio.FIRST_COMPLETED)
File "/usr/local/lib/python3.6/asyncio/tasks.py", line 307, in wait
return (yield from _wait(fs, timeout, return_when, loop))
File "/usr/local/lib/python3.6/asyncio/tasks.py", line 390, in _wait
yield from waiter
concurrent.futures._base.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/pyppeteer/connection.py", line 58, in _recv_loop
break
File "/usr/local/lib/python3.6/site-packages/websockets/client.py", line 390, in __aexit__
yield from self.ws_client.close()
File "/usr/local/lib/python3.6/site-packages/websockets/protocol.py", line 370, in close
self.timeout, loop=self.loop)
File "/usr/local/lib/python3.6/asyncio/tasks.py", line 352, in wait_for
return fut.result()
File "/usr/local/lib/python3.6/site-packages/websockets/protocol.py", line 642, in write_frame
"in the {} state".format(self.state.name))
websockets.exceptions.InvalidState: Cannot write to a WebSocket in the CLOSING state
Any suggestions on how to handle this?
This happened when I tried to visit a website using pyppeteer 0.0.17. I clicked a submit button, and the browser should have created a new tab but didn't; the page just appeared to reload, and I had to refill the form and submit again. The code I used is as follows:
result_page_future = asyncio.get_event_loop().create_future()
browser.once('targetcreated', lambda target: result_page_future.set_result(target))
await page.click('#_searchButton', {"delay": 0.5})
result_page = await (await result_page_future).page()
The script then hung indefinitely and never produced result_page. I don't understand why it didn't time out.
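A bare future has no timeout of its own: awaiting it blocks until something sets a result, so if the targetcreated event never fires, the await never returns. A minimal sketch of bounding the wait with asyncio.wait_for follows; wait_for_event and demo are hypothetical names, and the never-resolved future stands in for result_page_future.

```python
import asyncio


async def wait_for_event(future, timeout):
    """Await future, raising asyncio.TimeoutError after timeout seconds."""
    return await asyncio.wait_for(future, timeout=timeout)


async def demo():
    loop = asyncio.get_event_loop()
    never_resolved = loop.create_future()  # stands in for result_page_future
    try:
        await wait_for_event(never_resolved, timeout=0.05)
    except asyncio.TimeoutError:
        return 'timed out'
    return 'resolved'
```

With this in place, a missing new tab surfaces as a TimeoutError that the caller can handle, for example by retrying the click.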
because I got an error as
urllib.error.URLError: <urlopen error [Errno 101] Network is unreachable>
I have a script that basically opens a tab, plays with a website, and closes the tab, in a loop.
On my laptop (OSX), it works just fine. When I run it in a container (either with /dev/shm disabled, or with --shm-size=2g), it does work, but at some point the websocket connection is lost and an exception is raised in pyppeteer.connection.Connection, in _async_send() (websockets.exceptions.ConnectionClosed, I think).
[EDIT 04/12] this is not container related, I can have the same error happen on my local laptop. For some reason, it happens less, but still happens.
Error:
websockets.exceptions.ConnectionClosed WebSocket connection is closed: code = 1006 (connection closed abnormally [internal]), no reason
ERR.:0054:asyncio: Task exception was never retrieved
future: <Task finished coro=<Connection._async_send() done, defined at /env/lib/python3.6/site-packages/pyppeteer/connection.py:61> exception=ConnectionClosed('WebSocket connection is closed: code = 1006 (connection closed abnormally [internal]), no reason',)>
Traceback (most recent call last):
File "/env/lib/python3.6/site-packages/pyppeteer/connection.py", line 64, in _async_send
await self.connection.send(msg)
File "/env/lib/python3.6/site-packages/websockets/protocol.py", line 334, in send
yield from self.ensure_open()
File "/env/lib/python3.6/site-packages/websockets/protocol.py", line 470, in ensure_open
raise ConnectionClosed(self.close_code, self.close_reason)
websockets.exceptions.ConnectionClosed WebSocket connection is closed: code = 1006 (connection closed abnormally [internal]), no reason
I enabled asyncio debug logging and looked at the logs. At some point, for no reason that I can think of, an OP_CLOSE frame does go through the write_frame() coroutine in the websockets protocol. I don't know asyncio well enough to trace it back to the cause of this close.
Then the next _async_send() call fails with an uncaught error.
async def _async_send(self, msg: str) -> None:
    while not self._connected:
        await asyncio.sleep(self._delay)
    await self.connection.send(msg)
I tried to modify the above code to catch the connection error around the send() call and blindly reconnect (a bit naive, but who knows), but that did not do the trick, probably because I reconnect at an arbitrary time without knowing any state.
Note that the browser process does not die, and I can reproduce the error more systematically if I wrap the browser.goto call in an asyncio.wait_for with a small timeout (small enough that the goto cannot actually complete). I guess the latter is a bad idea, and I stopped doing it, but the error still happens at completely random times even without it: low or high goto timeouts, 50 tabs or only 1, 10 seconds or 30 minutes after starting the script, and never on the same URL.
I don't know what kind of information I can provide to help, or what kind of tools I can use to debug and fix this. asyncio's "hey, here is your empty stack trace, have fun" way of helping is giving me a hard time.
Proposal: add the following at line 110 of browser.py:
https://github.com/miyakogi/pyppeteer/blob/dev/pyppeteer/browser.py#L110
async def newIncognitoPage(self) -> Tuple[Page, str]:  # needs: from typing import Tuple
    """Make new incognito page on this browser; return the page and its browser context id."""
    browserContextId = (await self._connection.send(
        'Target.createBrowserContext', {})).get('browserContextId')
    targetId = (await self._connection.send(
        'Target.createTarget',
        {'url': 'about:blank', 'browserContextId': browserContextId})).get('targetId')
    target = self._targets.get(targetId)
    if target is None:
        raise BrowserError('Failed to create target for page.')
    if not await target._initializedPromise:
        raise BrowserError('Failed to create target for page.')
    page = await target.page()
    if page is None:
        raise BrowserError('Failed to create page.')
    return page, browserContextId
pyppeteer downloaded a chrome instance to ~/.pyppeteer. However, it fails with an error if I try to launch it:
$ ./chrome
[9995:9995:0305/230635.827218:FATAL:zygote_host_impl_linux.cc(124)] No usable sandbox!
Update your kernel or see https://chromium.googlesource.com/chromium/src/+/master/docs/linux_suid_sandbox_development.md
for more information on developing with the SUID sandbox.
If you want to live dangerously and need an immediate workaround, you can try using --no-sandbox.
#0 0x55802f65c6ac base::debug::StackTrace::StackTrace()
#1 0x55802f673e23 logging::LogMessage::~LogMessage()
#2 0x55802e73da61 content::ZygoteHostImpl::Init()
...
[end of stack trace]
Calling _exit(1). Core file will not be generated.
pyppeteer==0.0.16
code:
# coding:utf-8
import asyncio
from pyppeteer import launch
async def main():
    brower = await launch()
    await brower.close()
asyncio.get_event_loop().run_until_complete(main())
error:
Error in data transfer
Traceback (most recent call last):
File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 496, in transfer_data
msg = yield from self.read_message()
File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 526, in read_message
frame = yield from self.read_data_frame(max_size=self.max_size)
File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 591, in read_data_frame
frame = yield from self.read_frame(max_size)
File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 632, in read_frame
extensions=self.extensions,
File "D:\code\py3\.venv\lib\site-packages\websockets\framing.py", line 100, in read
data = yield from reader(2)
File "F:\python3\Lib\asyncio\streams.py", line 674, in readexactly
yield from self._wait_for_data('readexactly')
File "F:\python3\Lib\asyncio\streams.py", line 464, in _wait_for_data
yield from self._waiter
File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 669, in write_frame
yield from self.writer.drain()
File "F:\python3\Lib\asyncio\streams.py", line 329, in drain
raise exc
File "F:\python3\Lib\asyncio\selector_events.py", line 761, in write
n = self._sock.send(data)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host.
Traceback (most recent call last):
File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 496, in transfer_data
msg = yield from self.read_message()
File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 526, in read_message
frame = yield from self.read_data_frame(max_size=self.max_size)
File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 591, in read_data_frame
frame = yield from self.read_frame(max_size)
File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 632, in read_frame
extensions=self.extensions,
File "D:\code\py3\.venv\lib\site-packages\websockets\framing.py", line 100, in read
data = yield from reader(2)
File "F:\python3\Lib\asyncio\streams.py", line 674, in readexactly
yield from self._wait_for_data('readexactly')
File "F:\python3\Lib\asyncio\streams.py", line 464, in _wait_for_data
yield from self._waiter
File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 669, in write_frame
yield from self.writer.drain()
File "F:\python3\Lib\asyncio\streams.py", line 329, in drain
raise exc
File "F:\python3\Lib\asyncio\selector_events.py", line 761, in write
n = self._sock.send(data)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:/code/py3/pyppeteer/hello.py", line 15, in <module>
asyncio.get_event_loop().run_until_complete(main())
File "F:\python3\Lib\asyncio\base_events.py", line 467, in run_until_complete
return future.result()
File "D:/code/py3/pyppeteer/hello.py", line 11, in main
await brower.close()
File "D:\code\py3\.venv\lib\site-packages\pyppeteer\browser.py", line 156, in close
await self.disconnect()
File "D:\code\py3\.venv\lib\site-packages\pyppeteer\browser.py", line 160, in disconnect
await self._connection.dispose()
File "D:\code\py3\.venv\lib\site-packages\pyppeteer\connection.py", line 139, in dispose
await self._on_close()
File "D:\code\py3\.venv\lib\site-packages\pyppeteer\connection.py", line 128, in _on_close
await self.connection.close()
File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 370, in close
self.timeout, loop=self.loop)
File "F:\python3\Lib\asyncio\tasks.py", line 358, in wait_for
return fut.result()
File "D:\code\py3\.venv\lib\site-packages\websockets\protocol.py", line 674, in write_frame
raise ConnectionClosed(self.close_code, self.close_reason)
websockets.exceptions.ConnectionClosed: WebSocket connection is closed: code = 1006 (connection closed abnormally [internal]), no reason
Process finished with exit code 1
I just ran the demo and hit a bug.
import asyncio
from pyppeteer.launcher import launch
async def main(browser):
    page = await browser.newPage()
    await page.goto('http://example.com')
    await page.screenshot({'path': 'example.png'})
    dimensions = await page.evaluate('''() => {
        return {
            width: document.documentElement.clientWidth,
            height: document.documentElement.clientHeight,
            deviceScaleFactor: window.devicePixelRatio,
        }
    }''')
    print(dimensions)
browser = launch(
    executablePath="/Applications/Chromium.app/Contents/MacOS/Chromium")
asyncio.get_event_loop().run_until_complete(main(browser=browser))
browser.close()
the result is
$ python3 test.py
{'width': 800, 'height': 600, 'deviceScaleFactor': 1}
test.py:21: RuntimeWarning: coroutine 'Browser.close' was never awaited
browser.close()
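The warning says that Browser.close is a coroutine function: calling it only creates a coroutine object, which does nothing until it is awaited or driven by the event loop. A minimal sketch of the difference, using a hypothetical FakeBrowser stand-in so that it runs without a real browser:

```python
import asyncio


class FakeBrowser:
    """Hypothetical stand-in whose close() is a coroutine, like Browser.close."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


browser = FakeBrowser()
loop = asyncio.new_event_loop()
# A bare browser.close() would only create a coroutine object and warn.
# From synchronous code, hand it to the loop; inside a coroutine, use
# "await browser.close()" instead.
loop.run_until_complete(browser.close())
loop.close()
print(browser.closed)
```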
I randomly (but quite frequently) get this error when I try to crawl several pages of a website:
Task exception was never retrieved
future: <Task finished coro=<CDPSession.send() done, defined at /Users/user/.virtualenvs/project/lib/python3.6/site-packages/pyppeteer/connection.py:180> exception=NetworkError('Protocol Error: Unknown event id: 25 None',)>
Traceback (most recent call last):
File "/Users/user/.virtualenvs/project/lib/python3.6/site-packages/pyppeteer/connection.py", line 200, in send
return await callback
Any idea where this error could come from?
Also, this does not seem to affect the scenario, but it looks like tasks are not handled the way asyncio expects in the connection class.
My OS is Windows 7 64bit.
I downloaded chromium from GitHub because I cannot access https://storage.googleapis.com/chromium-browser-snapshots. I also put it into C:\Users\Administrator\.pyppeteer\local-chromium\533271\chrome-win32.
The chromium version is 66.0.3336.0.
When I first ran the example code:
import asyncio
from pyppeteer import launch
async def main():
    browser = launch()
    page = await browser.newPage()
    await page.goto('http://example.com')
    await page.screenshot({'path': 'example.png'})
    await browser.close()
asyncio.get_event_loop().run_until_complete(main())
I got a UnicodeDecodeError (because the output contains Chinese), so I also changed 'UTF-8' to 'GBK' in launch.py, from:
msg = self.proc.stdout.readline().decode()
to:
msg = self.proc.stdout.readline().decode('GBK')
print(msg)
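Hard-coding GBK only works on systems whose console uses that code page. A more tolerant sketch tries a list of encodings and falls back to lossy decoding; decode_output and its default encoding list are assumptions made for this illustration, not part of pyppeteer.

```python
def decode_output(raw, encodings=('utf-8', 'gbk')):
    """Decode subprocess output, trying each encoding in turn."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: keep going with replacement characters instead of raising.
    return raw.decode(encodings[0], errors='replace')
```

A call such as decode_output(self.proc.stdout.readline()) would then replace the bare .decode() in the snippet above.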
I executed the example code again and got nothing.
It hung until I killed the chrome.exe process.
I printed msg and got this output:
[0309/215247.680:ERROR:network_change_notifier_win.cc(157)] WSALookupServiceBegin failed with: 10108
[0309/215247.696:ERROR:gpu_process_transport_factory.cc(1019)] Lost UI shared context.
[0309/215247.712:ERROR:tcp_socket_win.cc(272)] CreatePlatformSocket() returned an error: Unable to load or initialize the requested service provider. (0x277A)
[0309/215247.712:WARNING:net_errors_win.cc(119)] Unknown error 10106 mapped to net::ERR_FAILED
[0309/215247.712:ERROR:tcp_socket_win.cc(272)] CreatePlatformSocket() returned an error: Unable to load or initialize the requested service provider. (0x277A)
[0309/215247.712:WARNING:net_errors_win.cc(119)] Unknown error 10106 mapped to net::ERR_FAILED
[0309/215247.712:ERROR:devtools_http_handler.cc(249)] Cannot start http server for devtools. Stop devtools.
pyppeteer?
Thank you for your response.
import asyncio
from pyppeteer.launcher import launch
async def main():
    browser = launch()
    page = await browser.newPage()
    await page.goto('http://example.com')
    await page.screenshot({'path': 'example.png'})
    browser.close()
asyncio.get_event_loop().run_until_complete(main())
Running this example gives:
Traceback (most recent call last):
File "D:/Project/Pytho_Project/ComputerVision/pyppe2.py", line 12, in <module>
loop.run_until_complete(main())
File "D:\python3.6\lib\asyncio\base_events.py", line 467, in run_until_complete
return future.result()
File "D:/Project/Pytho_Project/ComputerVision/pyppe2.py", line 9, in main
browser.close()
File "D:\python3.6\lib\site-packages\pyppeteer\browser.py", line 34, in close
asyncio.get_event_loop().run_until_complete(self._connection.dispose())
File "D:\python3.6\lib\asyncio\base_events.py", line 454, in run_until_complete
self.run_forever()
File "D:\python3.6\lib\asyncio\base_events.py", line 408, in run_forever
raise RuntimeError('This event loop is already running')
RuntimeError: This event loop is already running
Deleting browser.close() fixes it.
Hi,
Thanks for this port!
I was looking at the docs, and they say the Page.pdf feature is not yet implemented. Do you have plans to complete that feature anytime soon?
I was trying to run pyppeteer in an Alpine container, where the downloaded chrome does not work due to libc issues.
Installing chromium worked, but pyppeteer would not find it. Looking in the code, I found out why:
pyppeteer is using this regexp
DevTools listening on (ws://.*)
but chromium-browser prints this string:
DevTools listening on 127.0.0.1:12345
and that does not match. Can we change the code to match both scenarios? In my fork I hacked it to match only my scenario, but a structural solution is of course preferred.
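A pattern that accepts both output formats could look like the sketch below. The names DEVTOOLS_RE and parse_devtools_line are made up for this illustration; in the host:port case the caller would still have to obtain the WebSocket URL separately.

```python
import re

# Matches either "ws://..." or a bare "host:port" after the fixed prefix.
DEVTOOLS_RE = re.compile(
    r'DevTools listening on (?:(?P<ws>ws://\S+)|(?P<hostport>[\w.\-]+:\d+))'
)


def parse_devtools_line(line):
    """Return ('ws', url), ('hostport', 'host:port'), or None if no match."""
    match = DEVTOOLS_RE.search(line)
    if match is None:
        return None
    if match.group('ws'):
        return ('ws', match.group('ws'))
    return ('hostport', match.group('hostport'))
```

Returning a tagged tuple keeps the two cases explicit, so the launcher can decide how to proceed for each format.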
# coding:utf-8
import asyncio
from pyppeteer import launch
async def main():
    brower = await launch(headless=False)
    page = await brower.newPage()
    await page.setViewport(dict(width=1200, height=1000))
    await page.goto("https://github.com")
    await page.click('.HeaderMenu [href="/features"]')
    await page.waitForNavigation()
asyncio.get_event_loop().run_until_complete(main())
The page https://github.com/features was loaded, but the navigation timed out:
Traceback (most recent call last):
File "D:/code/py3/pyppeteer/hello.py", line 42, in <module>
asyncio.get_event_loop().run_until_complete(main())
File "F:\python3\Lib\asyncio\base_events.py", line 467, in run_until_complete
return future.result()
File "D:/code/py3/pyppeteer/hello.py", line 14, in main
await page.waitForNavigation()
File "D:\code\py3\.venv\lib\site-packages\pyppeteer\page.py", line 698, in waitForNavigation
raise error
pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 30000 ms exceeded.
https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pageclickselector-options
const [response] = await Promise.all([
page.waitForNavigation(waitOptions),
page.click(selector, clickOptions),
]);
But how do I do the same thing in Python?
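The closest asyncio counterpart to Promise.all is asyncio.gather, which starts both coroutines before waiting, so the navigation listener is already in place when the click happens. A sketch follows, with a hypothetical FakePage so that it runs without a browser; with pyppeteer one would pass a real page object instead.

```python
import asyncio


class FakePage:
    """Hypothetical stand-in for a page; only mimics click and navigation."""

    def __init__(self):
        self._navigated = asyncio.Event()

    async def click(self, selector):
        await asyncio.sleep(0.01)
        self._navigated.set()  # pretend the click triggered a navigation

    async def waitForNavigation(self):
        await asyncio.wait_for(self._navigated.wait(), timeout=1)
        return 'response'


async def click_and_wait(page, selector):
    """Start waiting for navigation and click concurrently."""
    response, _ = await asyncio.gather(
        page.waitForNavigation(),
        page.click(selector),
    )
    return response


async def demo():
    return await click_and_wait(FakePage(), '.HeaderMenu [href="/features"]')
```

gather returns results in argument order, so the first element is the navigation result, as in the JavaScript destructuring.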
There is a method to store downloaded files in puppeteer's headless mode, as described in this SO answer:
await page._client.send('Page.setDownloadBehavior', {behavior: 'allow', downloadPath: './myAwesomeDownloadFolder'});
Is there any similar method that exists in pyppeteer?
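A possible translation, sketched under the assumption that the page object exposes the same private _client DevTools session as in puppeteer and that its send method takes a method name and a params dict. allow_downloads is a hypothetical helper name.

```python
import asyncio


async def allow_downloads(page, download_dir):
    """Ask the browser to save downloads into download_dir."""
    # Assumption: page._client is the page's DevTools session, as in puppeteer.
    await page._client.send('Page.setDownloadBehavior', {
        'behavior': 'allow',
        'downloadPath': download_dir,
    })

# Possible usage inside a coroutine:
#   await allow_downloads(page, './myAwesomeDownloadFolder')
```

Because _client is a private attribute, this may change between releases.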
On the dev branch, when I run the first README example script (Example: open web page and take a screenshot), I get this error:
python3 example.py
Traceback (most recent call last):
File "example.py", line 11, in <module>
asyncio.get_event_loop().run_until_complete(main())
File "/usr/lib/python3.6/asyncio/base_events.py", line 467, in run_until_complete
return future.result()
File "example.py", line 5, in main
browser = launch()
File "/usr/local/lib/python3.6/dist-packages/pyppeteer/launcher.py", line 161, in launch
return Launcher(options, **kwargs).launch()
File "/usr/local/lib/python3.6/dist-packages/pyppeteer/launcher.py", line 127, in launch
raise BrowserError('Unexpectedly chrome process closed with '
pyppeteer.errors.BrowserError: Unexpectedly chrome process closed with return code: 127
Hi,
I'm trying to run the example but got this error:
Traceback (most recent call last):
File "pyppeteer.py", line 2, in <module>
from pyppeteer import launch
File "test\pyppeteer.py", line 2, in <module>
from pyppeteer import launch
ImportError: cannot import name 'launch'
Tried and failed on both Mac and Windows.
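The traceback shows a script named pyppeteer.py importing from pyppeteer, so the import resolves to the script itself rather than to the installed package. A small sketch of a check for this kind of name clash; shadows_module is a hypothetical helper, not part of any library.

```python
import os


def shadows_module(script_path, module_name):
    """Return True if the script's file name equals the module it imports."""
    stem = os.path.splitext(os.path.basename(script_path))[0]
    return stem == module_name


print(shadows_module('test/pyppeteer.py', 'pyppeteer'))
print(shadows_module('test/example.py', 'pyppeteer'))
```

Renaming the script to something other than pyppeteer.py (and removing any stale compiled copy next to it) is the usual way out of such a clash.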
Hello, I ran into a problem when trying to install the package. The following is my environment.
I used pip install pyppeteer to install the package. The error is:
Collecting pyppeteer
Using cached pyppeteer-0.0.12.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\secsi\AppData\Local\Temp\pip-build-mbsh4qol\pyppeteer\setup.py", line 27, in <module>
compile_files(in_dir, out_dir, target)
File "c:\users\secsi\anaconda3\lib\site-packages\py_backwards\compiler.py", line 85, in compile_files
dependencies.update(_compile_file(paths, target))
File "c:\users\secsi\anaconda3\lib\site-packages\py_backwards\compiler.py", line 57, in _compile_file
code = f.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xb6 in position 2147: illegal multibyte sequence
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in C:\Users\secsi\AppData\Local\Temp\pip-build-mbsh4qol\pyppeteer\
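The traceback ends in f.read() failing with the 'gbk' codec, which is what happens when a file is opened without an explicit encoding on a system whose locale default is GBK. A minimal sketch of reading source files as UTF-8 regardless of locale; read_source is a hypothetical helper, and whether the fix belongs in py_backwards or in setup.py is not something this sketch decides.

```python
import os
import tempfile


def read_source(path):
    """Read a source file as UTF-8, independent of the platform locale."""
    with open(path, encoding='utf-8') as f:
        return f.read()


# Round-trip a file with non-ASCII text to show the explicit encoding at work.
fd, tmp_path = tempfile.mkstemp(suffix='.py')
with os.fdopen(fd, 'w', encoding='utf-8') as f:
    f.write('# caf\u00e9\n')
content = read_source(tmp_path)
os.remove(tmp_path)
```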
Pyppeteer hangs on some websites and throws the following error:
pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 30000 ms exceeded.
This is the example:
import asyncio
from pyppeteer import launch
async def main():
    browser = await launch({'headless': False})
    page = await browser.newPage()
    await page.goto('https://www.ferragamo.com/shop/us/en/women/silk-bijoux/shawls-and-stoles')
    await browser.close()
asyncio.get_event_loop().run_until_complete(main())
I have tried 'waitUntil': 'networkidle0' and 'waitUntil': 'networkidle2'. Neither helps for this particular website.
On branch dev, when timeout is set to 0, commit 158bfaf causes line 98, if not watchdog.done():, to raise UnboundLocalError: local variable 'watchdog' referenced before assignment
Traceback (most recent call last):
File "./test_pype.py", line 48, in <module>
asyncio.get_event_loop().run_until_complete(main())
File "/usr/lib64/python3.6/asyncio/base_events.py", line 467, in run_until_complete
return future.result()
File "./test_pype.py", line 44, in main
await page.goto(url, timeout=0)
File "/home/centos/p3venv/lib64/python3.6/site-packages/pyppeteer/page.py", line 487, in goto
error = await navigationPromise
File "/home/centos/p3venv/lib64/python3.6/site-packages/pyppeteer/navigator_watcher.py", line 98, in waitForNavigation
if not watchdog.done():
UnboundLocalError: local variable 'watchdog' referenced before assignment
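This kind of UnboundLocalError appears when a variable is assigned only inside a conditional branch (here, presumably only when the timeout is non-zero) but read unconditionally afterwards. A sketch of the usual guard, binding the name to None first; wait_with_watchdog is a hypothetical function written for this illustration and does not reproduce navigator_watcher.py.

```python
import asyncio


async def wait_with_watchdog(awaitable, timeout_ms):
    """Await awaitable; a timeout of 0 disables the watchdog entirely."""
    loop = asyncio.get_event_loop()
    task = asyncio.ensure_future(awaitable)
    watchdog = None  # always bound, even when no timeout is requested
    if timeout_ms:
        # Cancel the task if it is still pending when the timer fires.
        watchdog = loop.call_later(timeout_ms / 1000, task.cancel)
    try:
        return await task
    finally:
        if watchdog is not None:
            watchdog.cancel()


async def answer():
    await asyncio.sleep(0)
    return 7
```

In this sketch the watchdog is a timer handle rather than a future, so the cleanup calls cancel() instead of checking done().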
I want to send a pull request, but can I rename some attributes as advised here: http://legacy.python.org/dev/peps/pep-0008/#prescriptive-naming-conventions? For example, self._ignoreHTTPSErrors would become self._ignore_https_errors.
Currently the Launcher contains the following code:
# dont forget to close browser process
atexit.register(_close_process)
if self.options.get('handleSIGINT', True):
    signal.signal(signal.SIGINT, _close_process)
if self.options.get('handleSIGTERM', True):
    signal.signal(signal.SIGTERM, _close_process)
if not sys.platform.startswith('win'):
    # SIGHUP is not defined on windows
    if self.options.get('handleSIGHUP', True):
        signal.signal(signal.SIGHUP, _close_process)
Although this is handy when getting started with pyppeteer, it causes various problems when pyppeteer is integrated into other code. What if there are already signal handlers? What if an error happens in the _close_process callable? And so on.
To work around this, I have my own Launcher that is basically a copy-paste of the original launcher without this part.
I propose making this configurable; we can keep the current behaviour as the default (even if I think it's not a good idea).
I can work on a PR if you're not against it; let me know.
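One way such a configurable installer could look is sketched below: it skips signals that are disabled in the options and returns the previous handlers so the caller can restore them. install_close_handlers and restore_handlers are hypothetical names; the option keys are the ones from the snippet above. Note that signal.signal may only be called from the main thread.

```python
import signal


def install_close_handlers(close_process, options=None):
    """Install handlers per options; return {signum: previous_handler}."""
    options = options or {}
    previous = {}
    for sig_name, option in (('SIGINT', 'handleSIGINT'),
                             ('SIGTERM', 'handleSIGTERM'),
                             ('SIGHUP', 'handleSIGHUP')):
        signum = getattr(signal, sig_name, None)  # SIGHUP is absent on Windows
        if signum is None or not options.get(option, True):
            continue
        previous[signum] = signal.signal(
            signum, lambda _signum, _frame: close_process())
    return previous


def restore_handlers(previous):
    """Put back the handlers that were active before installation."""
    for signum, handler in previous.items():
        if handler is not None:  # None means the old handler was not set from Python
            signal.signal(signum, handler)
```

Returning the previous handlers leaves it to the embedding application to decide whether to chain them or restore them after the browser is closed.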
Currently request interception doesn't work. I'll pick this up if I can, but if it's not tomorrow, it'll be the week after.
The resourceType parameter from the network_manager.Request is never populated.
See https://github.com/miyakogi/pyppeteer/blob/dev/pyppeteer/network_manager.py#L285
I tried a small comparative test; this puppeteer code works and successfully takes a screenshot:
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://davepeck.org/');
await page.screenshot({path: '/tmp/dave-js.png'});
await browser.close();
})();
The equivalent pyppeteer code does not work:
import asyncio
from pyppeteer.launcher import launch
async def main(browser):
page = await browser.newPage()
await page.goto('https://davepeck.org/')
await page.screenshot({'path': '/tmp/dave-py.png'})
browser = launch()
asyncio.get_event_loop().run_until_complete(main(browser))
browser.close()
It hits its 3 second navigation timeout. Explicitly setting a different timeout (something that will definitely be long enough to load the page in question, like await page.goto('https://davepeck.org/', timeout=150000)) similarly fails.
Details:
- Pyppeteer: Python 3.6.4, Pyppeteer 0.0.9, chromium revision 497674 (62.0.3198.0), macOS 10.13.3
- Puppeteer: Node 9.4.0, Puppeteer 1.0.0, chromium revision 526987 (65.0.3312.0), macOS 10.13.3
Thoughts:
- It wouldn't surprise me if the very different chromium versions are to blame, although I'm not certain.
- I noticed this with my personal blog, but I suspect we'll find similar behavior for other pages?
Happy to test whatever you'd like.
I have tested several different sites that contain iframes, and whenever I try to access page.frames an AttributeError is raised. I have manually checked that the iframes are in place.
Traceback (most recent call last):
File "async_test.py", line 44, in <module>
loop.run_until_complete(c.main())
File "/usr/local/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/base_events.py", line 467, in run_until_complete
return future.result()
File "async_test.py", line 23, in main
await self.start()
File "async_test.py", line 29, in start_challenge
print(self.page.frames)
File "/Users/ordanis/.local/share/virtualenvs/test-mL5IBpML/lib/python3.6/site-packages/pyppeteer/page.py", line 171, in frames
return list(self._frames.values())
AttributeError: 'Page' object has no attribute '_frames'
I want to run something like the code below:
page.on("dialog", async (dialog) => {
  this.log("got dialog (type: %s, message: %s)", dialog.type, dialog.message());
  await dialog.dismiss();
});
I want to close dialogs automatically. How can I do it?
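A Python counterpart of the snippet above might look like the following sketch. It assumes that the page object exposes an on(event, listener) method and that the dialog object has a dismiss() coroutine, mirroring the puppeteer API; whether the installed pyppeteer version provides both is an assumption to verify. The function name auto_dismiss_dialogs is hypothetical.

```python
import asyncio


def auto_dismiss_dialogs(page):
    """Register a 'dialog' listener that dismisses every dialog it receives.

    Assumes page.on(event, listener) and an awaitable dialog.dismiss().
    """
    def on_dialog(dialog):
        # Listeners are plain callables, so the coroutine is scheduled on the
        # running event loop rather than awaited here.
        asyncio.ensure_future(dialog.dismiss())

    page.on('dialog', on_dialog)
    return on_dialog
```

The listener has to be registered before the navigation or click that triggers the dialog, otherwise the dialog is emitted with nobody listening.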
The example indicated here:
https://github.com/miyakogi/pyppeteer#usage
import asyncio
from pyppeteer import launch
async def main():
    browser = launch()
    page = await browser.newPage()
    await page.goto('http://example.com')
    await page.screenshot({'path': 'example.png'})
    await browser.close()
asyncio.get_event_loop().run_until_complete(main())
is not working for me. I get a RuntimeError: This event loop is already running because browser.close() also uses a run_until_complete call in the Connection.py file.
I kind of "fixed" it by doing the following, which is not perfect. Does anyone have a better solution?
asyncio.get_event_loop().run_until_complete(main())
browser.close()
try:
    asyncio.get_event_loop().run_until_complete(asyncio.gather(*asyncio.Task.all_tasks()))
except Exception as e:
    # Will almost every time show an exception of either:
    # <class 'websockets.exceptions.InvalidState'> Cannot write to a WebSocket in the CLOSING state
    # Or a CancelledError
    print("Exception when closing:")
    print(e.__class__, e)
    pass
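The RuntimeError described above can be reproduced without pyppeteer. The sketch below uses only asyncio: blocking_wait is a hypothetical stand-in for any synchronous helper that calls run_until_complete internally. Calling it from a coroutine that is already being driven by the same loop fails, while calling it once the loop is idle works, which is why moving browser.close() outside main() avoids the error.

```python
import asyncio


def blocking_wait(loop, future):
    """Stand-in for a synchronous helper that drives the loop itself."""
    return loop.run_until_complete(future)


async def call_from_coroutine():
    """Call the blocking helper while the loop is already running."""
    loop = asyncio.get_running_loop()
    done = loop.create_future()
    done.set_result('closed')
    try:
        return blocking_wait(loop, done)
    except RuntimeError as exc:
        # The loop cannot be re-entered from inside one of its own tasks.
        return str(exc)


def call_after_loop_finished():
    """Call the same helper on a loop that is not running."""
    loop = asyncio.new_event_loop()
    try:
        done = loop.create_future()
        done.set_result('closed')
        return blocking_wait(loop, done)
    finally:
        loop.close()
```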
I looked at the source code of the method "browser.close()", which is as follows:
async def close(self) -> None:
    """Close connections and terminate browser process."""
    await self._closeCallback()
    await self.disconnect()
However, it does not seem to terminate the browser process: the Chromium process is still running.
Environment:
Windows 10
Python 3.6.2
Sphinx 1.7.1
When running the command make.bat html | text | etc., I got an error saying that the sphinxcontrib.asyncio extension was not installed. Removing the extension from conf.py fixed the issue.
The documentation should explicitly mention a step to get around this.
The docs mention ElementHandle.querySelector() and ElementHandle.J(), but neither seems to be implemented.
(Pdb) elem.querySelector("h1")
*** AttributeError: 'ElementHandle' object has no attribute 'querySelector'
Hi, thanks for this puppeteer port. It works pretty well except for one thing: it downloads the required Chromium during the first launch of the browser.
This is a potential point of failure for every production service: if the resource is (somehow) not available, the service will break, and you will only find out once the service is online.
I'm currently using pyppeteer in a Docker container, and I think that if anything is going to break, it had better break during the build process rather than after deployment.
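A build-time check along these lines could make the failure happen early. The sketch below is deliberately independent of pyppeteer: ensure_chromium is a hypothetical helper that takes two callables, one that reports whether the browser is present and one that performs the download, and raises if the browser is still missing afterwards.

```python
def ensure_chromium(is_installed, download):
    """Fail fast if the browser cannot be made available (hypothetical helper).

    Returns True when a download was performed and False when the browser
    was already present.
    """
    if is_installed():
        return False
    download()
    if not is_installed():
        raise RuntimeError('Chromium is still missing after the download step')
    return True
```

In a Docker build step this might be called as ensure_chromium(check_chromium, download_chromium), assuming the installed pyppeteer version exposes those two functions in pyppeteer.chromium_downloader; that wiring is an assumption and should be checked against the installed release.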
I would like to use pyppeteer to control a QtWebEngine instance, but it throws an error.
/usr/bin/python3.6 /home/matty/PycharmProjects/GhostAuto/AutoWeb.py
Traceback (most recent call last):
File "/home/matty/PycharmProjects/GhostAuto/AutoWeb.py", line 17, in <module>
asyncio.get_event_loop().run_until_complete(main())
File "/usr/lib/python3.6/asyncio/base_events.py", line 467, in run_until_complete
return future.result()
File "/home/matty/PycharmProjects/GhostAuto/AutoWeb.py", line 7, in main
browserWSEndpoint="ws://127.0.0.1:55551/devtools/browser/<id>")
File "/usr/lib/python3.6/site-packages/pyppeteer/launcher.py", line 263, in connect
connection, options, None, lambda: connection.send('Browser.close'))
File "/usr/lib/python3.6/site-packages/pyppeteer/browser.py", line 80, in create
await connection.send('Target.setDiscoverTargets', {'discover': True})
pyppeteer.errors.NetworkError: Protocol Error: {'code': -32601, 'message': "'Target.setDiscoverTargets' wasn't found"}
I think this is because pyppeteer can't get the browser id, since "http://127.0.0.1:55551/json/version" doesn't supply it. Maybe it is possible to connect straight to a page?
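One way to check that hypothesis is to look at what the /json/version endpoint actually returns. Chrome reports the browser-level endpoint under the webSocketDebuggerUrl key; whether QtWebEngine does the same is exactly the open question here. The helper name browser_ws_endpoint is hypothetical.

```python
import json


def browser_ws_endpoint(version_body):
    """Return the browser-level WebSocket URL from a /json/version body.

    Returns None when the response has no webSocketDebuggerUrl key, which
    would support the guess that the browser id is not supplied.
    """
    info = json.loads(version_body)
    return info.get('webSocketDebuggerUrl')
```

The response body can be fetched with urllib.request.urlopen('http://127.0.0.1:55551/json/version').read().decode() while the QtWebEngine process is running with remote debugging enabled.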
I had been using PhantomJS (which, incidentally, is horrible to use with Python), and then I found this. Thank you @miyakogi
I have some beginner questions:
How do you make sure that pyppeteer follows redirects?
How do you fetch the list of links (along with the innerHTML text of each link)?
How do you fetch the PDF that's requested in the header?
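For the second question, a sketch of collecting links is shown below. It assumes the page object has an evaluate() coroutine that accepts a JavaScript function as a string and returns JSON-serializable results, as in puppeteer; the name collect_links and the LINKS_JS constant are hypothetical.

```python
import asyncio

# JavaScript evaluated in the page: one {href, html} object per anchor element.
LINKS_JS = "() => Array.from(document.querySelectorAll('a')).map(a => ({href: a.href, html: a.innerHTML}))"


async def collect_links(page):
    """Return a list of {'href': ..., 'html': ...} dicts for the page's links.

    Assumes page.evaluate(js_function_string) is awaitable and returns the
    evaluated value converted to Python objects.
    """
    links = await page.evaluate(LINKS_JS)
    # Drop anchors that have no target, such as <a name="..."> markers.
    return [link for link in links if link.get('href')]
```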
Exceptions are thrown if waitFor networkidle is used. One of them is caused by a typo on line 92 in navigator_watcher.py (lambda f: instead of lambda:). Even then, it does not work. I'll see if I can fix this and submit a proper pull request.
When I run the screenshot sample code it attempts to download Chrome. But after downloading it hangs.
[W:pyppeteer.chromium_downloader] start chromium download. Download may take a few minutes.
[W:pyppeteer.chromium_downloader] chromium download done.
[W:pyppeteer.chromium_downloader] chromium extracted to: /Users/<username>/.pyppeteer/local-chromium/497674
It does download the Chrome app. If I kill the script and retry, it won't download the app again but it doesn't do anything either.
This is the trace when I ^C
^CTraceback (most recent call last):
File "test.py", line 13, in <module>
browser = launch()
File "/usr/local/lib/python3.6/site-packages/pyppeteer/launcher.py", line 117, in launch
return Launcher(options, **kwargs).launch()
File "/usr/local/lib/python3.6/site-packages/pyppeteer/launcher.py", line 84, in launch
time.sleep(0.1)
KeyboardInterrupt
Any ideas?
Exception in callback NavigatorWatcher.waitForNavigation.<locals>.watchdog_cb(<Task finishe...> result=None>) at /usr/local/lib/python3.6/dist-packages/pyppeteer/navigator_watcher.py:49
handle: <Handle NavigatorWatcher.waitForNavigation.<locals>.watchdog_cb(<Task finishe...> result=None>) at /usr/local/lib/python3.6/dist-packages/pyppeteer/navigator_watcher.py:49>
Traceback (most recent call last):
File "/usr/lib/python3.6/asyncio/events.py", line 127, in _run
self._callback(*self._args)
File "/usr/local/lib/python3.6/dist-packages/pyppeteer/navigator_watcher.py", line 52, in watchdog_cb
self._timeout)
File "/usr/local/lib/python3.6/dist-packages/pyppeteer/navigator_watcher.py", line 40, in _raise_error
raise error
concurrent.futures._base.TimeoutError: Navigation Timeout Exceeded: 3000 ms exceeded
Hi, what does this mean? Does it mean the page can't be rendered?
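Going by the message alone, the error says that navigation did not finish within the 3000 ms limit; it does not by itself say the page cannot render. The mechanism can be sketched with asyncio.wait_for as below. This is an illustration of a navigation timeout in general, not pyppeteer's actual implementation, and navigate_with_timeout is a hypothetical name.

```python
import asyncio


async def navigate_with_timeout(navigation, timeout_ms):
    """Await navigation, raising TimeoutError if it exceeds timeout_ms."""
    try:
        return await asyncio.wait_for(navigation, timeout_ms / 1000)
    except asyncio.TimeoutError:
        # Re-raise with a message in the same style as the traceback above.
        raise TimeoutError(
            'Navigation Timeout Exceeded: {} ms exceeded'.format(timeout_ms)
        ) from None
```

A slow page or a load event that never fires would both surface as this same error, so raising the limit only helps in the first case.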
Good day, thanks for the awesome project!
I'm having some problems running it in Travis, and since it seems that you succeeded, could you help document how to do it?
I've tried several options, both from the Travis docs and from other puppeteer configurations, but after
[W:pyppeteer.chromium_downloader] chromium extracted to: /home/travis/.pyppeteer/local-chromium/543305
I get a pyppeteer.errors.BrowserError: Failed to connect to browser port: http://127.0.0.1:39289/json/version and I'm not sure how to debug it.
Do you have any suggestions?