cdproto's Introduction

About chromedp

Package chromedp is a faster, simpler way to drive browsers supporting the Chrome DevTools Protocol in Go without external dependencies.

Installing

Install in the usual Go way:

$ go get -u github.com/chromedp/chromedp

Examples

Refer to the Go reference for the documentation and examples. Additionally, the examples repository contains more examples on complex actions, and other common high-level tasks such as taking full page screenshots.

Frequently Asked Questions

I can't see any Chrome browser window

By default, Chrome is run in headless mode. See DefaultExecAllocatorOptions, and an example to override the default options.

I'm seeing "context canceled" errors

When the connection to the browser is lost, chromedp cancels the context, and it may result in this error. This occurs, for example, if the browser is closed manually, or if the browser process has been killed or otherwise terminated.

Chrome exits as soon as my Go program finishes

On Linux, chromedp is configured to avoid leaking resources by force-killing any started Chrome child processes. If you need to launch a long-running Chrome instance, manually start Chrome and connect using RemoteAllocator.

Executing an action without Run results in "invalid context"

By default, a chromedp context does not have an executor; however, one can be specified manually if necessary. See issue #326 for an example.

I can't use an Action with Run because it returns many values

Wrap it with an ActionFunc:

ctx, cancel := chromedp.NewContext(context.Background())
defer cancel()
chromedp.Run(ctx, chromedp.ActionFunc(func(ctx context.Context) error {
	_, err := domain.SomeAction().Do(ctx)
	return err
}))

I want to use chromedp on a headless environment

The simplest way is to run the Go program that uses chromedp inside the chromedp/headless-shell image. That image contains headless-shell, a smaller headless build of Chrome, which chromedp is able to find out of the box.

cdproto's People

Contributors

kenshaw avatar mvdan avatar

cdproto's Issues

Cookie structure interoperability

Hello and first off, thanks for this wonderful project :)

I am trying to map a cdproto/network.Cookie into a net/http.Cookie but it seems that the two structures are slightly different and not interoperable. Additionally there seems to be no method to percent-encode a network.Cookie into a string. My objective is to pass the network.Cookie to another program that uses a net/http.Client to communicate with its endpoint. What would be the right way to do that?
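One possible approach is to copy the overlapping fields by hand. The sketch below is only an illustration: cdpCookie is a hypothetical stand-in that mirrors a few fields of network.Cookie so that the example builds with the standard library alone, and it assumes that Expires holds seconds since the Unix epoch with a non-positive value for session cookies.

```go
package main

import (
	"fmt"
	"math"
	"net/http"
	"time"
)

// cdpCookie is a hypothetical stand-in mirroring a few fields of
// cdproto/network.Cookie; it is not the real type.
type cdpCookie struct {
	Name     string
	Value    string
	Domain   string
	Path     string
	Expires  float64 // assumed: seconds since the Unix epoch, <= 0 for session cookies
	HTTPOnly bool
	Secure   bool
}

// toHTTPCookie copies the overlapping fields into a net/http cookie.
func toHTTPCookie(c cdpCookie) *http.Cookie {
	hc := &http.Cookie{
		Name:     c.Name,
		Value:    c.Value,
		Domain:   c.Domain,
		Path:     c.Path,
		HttpOnly: c.HTTPOnly,
		Secure:   c.Secure,
	}
	if c.Expires > 0 {
		// Split the float into whole seconds and a fractional part.
		sec, frac := math.Modf(c.Expires)
		hc.Expires = time.Unix(int64(sec), int64(frac*1e9)).UTC()
	}
	return hc
}

func main() {
	hc := toHTTPCookie(cdpCookie{
		Name:    "sid",
		Value:   "abc123",
		Domain:  "example.com",
		Path:    "/",
		Expires: 1600000000,
		Secure:  true,
	})
	// String renders the cookie in Set-Cookie header form.
	fmt.Println(hc.String())
}
```

With a net/http.Client, the converted cookie can be attached to an outgoing request with req.AddCookie(hc), or stored in the client's cookie jar.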

prop:bug ? browser.SetDownloadBehavior vs. old page.SetDownloadBehavior

github.com/chromedp/cdproto v0.0.0-20200424080200-0de008e41fa0
github.com/chromedp/chromedp v0.5.3
go version go1.14.3 linux/amd64

I changed the old call:
page.SetDownloadBehavior(page.SetDownloadBehaviorBehaviorAllow).WithDownloadPath(FileDir + "/download")

to:
browser.SetDownloadBehavior(browser.SetDownloadBehaviorBehaviorAllow).WithDownloadPath(FileDir + "/download")

The code compiles successfully, but at run time it fails with this error:

'Browser.setDownloadBehavior' wasn't found (-32601)

What is wrong?

UnmarshalMessage panics when encountering deprecated messages

For example, 8d5e1d0 got rid of Page.frameScheduledNavigation and all of its types. In UnmarshalMessage, that event method now doesn't fall under any of the cases, and ends up calling json.Unmarshal(buf, v) where v == nil.

The fix here is probably to modify cdproto-gen so that UnmarshalMessage handles this case gracefully:

diff --git a/cdproto.go b/cdproto.go
index 21f111e..fb513e9 100644
--- a/cdproto.go
+++ b/cdproto.go
@@ -2243,6 +2243,9 @@ func UnmarshalMessage(msg *Message) (interface{}, error) {

        case EventTracingTracingComplete:
                v = new(tracing.EventTracingComplete)
+
+       default:
+               return nil, errors.New("unknown or deprecated event method")
        }

        var buf easyjson.RawMessage

I've worked around this in chromedp v0 by capturing these events before the function is called, which is not ideal: chromedp/chromedp@e9aa66f

/cc @kenshaw

jsonUnmarshal fail from a jsonMarshal result of *network.Cookie

The following code will panic:

package main

import (
	"encoding/json"

	"github.com/chromedp/cdproto/network"
)

func main() {
	b, err := json.Marshal(&network.Cookie{})
	if err != nil {
		panic(err)
	}
	var c *network.Cookie
	err = json.Unmarshal(b, &c)
	if err != nil {
		panic(err)
	}
}

error

github.com/chromedp/cdproto/cdp

../../github.com/chromedp/cdproto/cdp/easyjson.go:31:12: in.UnsafeFieldName undefined (type *jlexer.Lexer has no field or method UnsafeFieldName)
../../github.com/chromedp/cdproto/cdp/easyjson.go:118:12: in.UnsafeFieldName undefined (type *jlexer.Lexer has no field or method UnsafeFieldName)
../../github.com/chromedp/cdproto/cdp/easyjson.go:206:12: in.UnsafeFieldName undefined (type *jlexer.Lexer has no field or method UnsafeFieldName)
../../github.com/chromedp/cdproto/cdp/easyjson.go:319:12: in.UnsafeFieldName undefined (type *jlexer.Lexer has no field or method UnsafeFieldName)
../../github.com/chromedp/cdproto/cdp/easyjson.go:443:12: in.UnsafeFieldName undefined (type *jlexer.Lexer has no field or method UnsafeFieldName)
../../github.com/chromedp/cdproto/cdp/easyjson.go:912:12: in.UnsafeFieldName undefined (type *jlexer.Lexer has no field or method UnsafeFieldName)
../../github.com/chromedp/cdproto/cdp/easyjson.go:1158:12: in.UnsafeFieldName undefined (type *jlexer.Lexer has no field or method UnsafeFieldName)
../../github.com/chromedp/cdproto/cdp/easyjson.go:1238:12: in.UnsafeFieldName undefined (type *jlexer.Lexer has no field or method UnsafeFieldName)

Modify withHeaderTemplate hint

func (p PrintToPDFParams) WithHeaderTemplate(headerTemplate string) *PrintToPDFParams {

Hi, the hint for this function says that it is enough to pass <span class=title></span> to it and it will work. In practice, it does not work unless you also set a font size. This was not clear to me, and I could not find any information about it for a long time. Please mention this in the description if possible.
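For illustration, a template that sets the font size explicitly might be built as below. This is a minimal sketch: the helper name headerTemplate, the wrapper div, and the chosen styles are assumptions made for the example, not values taken from the library documentation.

```go
package main

import "fmt"

// headerTemplate returns header markup whose wrapper sets an explicit font
// size, since the report above says the title span alone is not rendered.
// The wrapper element and its styles are illustrative assumptions.
func headerTemplate(fontSizePx int) string {
	return fmt.Sprintf(
		`<div style="font-size:%dpx; width:100%%; text-align:center;"><span class="title"></span></div>`,
		fontSizePx,
	)
}

func main() {
	fmt.Println(headerTemplate(10))
}
```

The resulting string would then be passed to WithHeaderTemplate in place of the bare span.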

Why do I only get three screenshots using this method?

var options = []chromedp.ExecAllocatorOption{
	chromedp.Flag("headless", true),
	chromedp.Flag("hide-scrollbars", false),
	chromedp.Flag("mute-audio", false),
	chromedp.UserAgent(`Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36`),
}

func ChromedpTest(url string) {
	err := os.Mkdir("./img", os.ModePerm)
	if err != nil {
		fmt.Println(err)
	}

	options = append(chromedp.DefaultExecAllocatorOptions[:], options...)

	ctx, _ := chromedp.NewExecAllocator(context.Background(), options...)
	ctx, _ = context.WithTimeout(ctx, 20*time.Second)
	ctx, cancel := chromedp.NewContext(ctx)
	defer cancel()

	ListenTarget(ctx)

	err = chromedp.Run(ctx, chromedp.Tasks{
		chromedp.Navigate(url),
		chromedp.ActionFunc(func(ctx context.Context) error {
			scr := page.StartScreencast()
			scr.WithFormat("png")
			scr.WithQuality(100)
			scr.WithMaxWidth(1000)
			scr.WithMaxHeight(1000)
			scr.WithEveryNthFrame(18)
			err := scr.Do(ctx)
			if err != nil {
				fmt.Println("do", err)
			}
			return nil
		}),
		chromedp.Sleep(2 * time.Second),
	})
	if err != nil {
		fmt.Println(err)
	}
}

func ListenTarget(ctx context.Context) {
	n := "1"
	chromedp.ListenTarget(ctx, func(v interface{}) {
		switch ev := v.(type) {
		case *page.EventScreencastFrame:
			b := Base64Decode(ev.Data)
			f, _ := os.OpenFile("./img/"+n+"xx.png", os.O_RDWR|os.O_CREATE, os.ModePerm)
			defer f.Close()
			f.Write(b)
			n = n + "1"
		default:
			// fmt.Println("hello world")
		}
	})
}

func Base64Decode(str string) []byte {
	decoded, _ := base64.StdEncoding.DecodeString(str)
	return decoded
}

Invalid withlandscape

When generating a PDF with chromedp, WithLandscape has no effect: I use it to generate a horizontal PDF, but the generated PDF is still vertical.

code:

err := chromedp.Run(ctx, chromedp.Tasks{
	chromedp.Navigate(url),
	chromedp.WaitReady("body"),
	chromedp.ActionFunc(func(ctx context.Context) error {
		var err error
		p := page.PrintToPDF().WithLandscape(true)
		buf, _, err = p.Do(ctx)
		return err
	}),
})

easyjson generated marshalling methods use value receiver on type containing mutex

Hi! This might ultimately be an easyjson bug, but running go vet reported mutex copies in the cdp package.

Seen on latest master 75a047418dbe8787d588a1cd0abc4b1ba9d3e477

$ go vet ./...
# github.com/chromedp/cdproto/cdp
cdp/easyjson.go:353:81: easyjsonC5a4559bEncodeGithubComChromedpCdprotoCdp1 passes lock by value: github.com/chromedp/cdproto/cdp.Node
cdp/easyjson.go:562:9: MarshalJSON passes lock by value: github.com/chromedp/cdproto/cdp.Node
cdp/easyjson.go:564:57: call of easyjsonC5a4559bEncodeGithubComChromedpCdprotoCdp1 copies lock value: github.com/chromedp/cdproto/cdp.Node
cdp/easyjson.go:569:9: MarshalEasyJSON passes lock by value: github.com/chromedp/cdproto/cdp.Node
cdp/easyjson.go:570:56: call of easyjsonC5a4559bEncodeGithubComChromedpCdprotoCdp1 copies lock value: github.com/chromedp/cdproto/cdp.Node
cdp/easyjson.go:631:81: easyjsonC5a4559bEncodeGithubComChromedpCdprotoCdp2 passes lock by value: github.com/chromedp/cdproto/cdp.Frame
cdp/easyjson.go:684:9: MarshalJSON passes lock by value: github.com/chromedp/cdproto/cdp.Frame
cdp/easyjson.go:686:57: call of easyjsonC5a4559bEncodeGithubComChromedpCdprotoCdp2 copies lock value: github.com/chromedp/cdproto/cdp.Frame
cdp/easyjson.go:691:9: MarshalEasyJSON passes lock by value: github.com/chromedp/cdproto/cdp.Frame
cdp/easyjson.go:692:56: call of easyjsonC5a4559bEncodeGithubComChromedpCdprotoCdp2 copies lock value: github.com/chromedp/cdproto/cdp.Frame

I'm new to this project and don't quite understand how all the code is generated, but I'm happy to keep digging if it's not a straightforward fix.

Error while unmarshaling network messages

Hello,

I am trying to analyze the resource usage of a website with the help of the chromedp library. When enabling the network domain I get corrupt json messages.

From my go.mod file the versions I am using are:
github.com/chromedp/cdproto v0.0.0-20190429085128-1aa4f57ff2a9
github.com/chromedp/chromedp v0.3.1

A short example of my code:

func main() {
  cpctx, cancel := chromedp.NewContext(ctx,
  		     chromedp.WithDebugf(devToolHandler))
  defer cancel()

  err := chromedp.Run(cpctx,
           network.Enable(),
           chromedp.Navigate(baseUrl.String()),
             ...
         )
}

// The handler being used in the context
func devToolHandler(s string, is ...interface{}) {
	/*
	   Uncomment the following line to have a log of the events
	   log.Printf(s, is...)
	*/
	/*
	   We need this to be on a separate goroutine
	   otherwise we block the browser and we don't receive messages
	*/
	go func() {
		for _, elem := range is {
			var msg cdproto.Message
			// The CDP messages are sent as strings so we need to convert them back
			err := json.Unmarshal([]byte(fmt.Sprintf("%s", elem)), &msg)
			// possible source of empty msg!!!!!!!!!!!!!
			if err != nil {
				log.Println(err)
				log.Printf("Faulty element:\n%v\n", fmt.Sprintf("%s", elem))
			}

			msgChan <- msg
		}
	}()
}

Some examples of those corrupt messages follow.
The first message has been cut off inside the targetId field:

2019/08/16 11:27:14 unexpected end of JSON input
2019/08/16 11:27:14 Faulty element:
{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"12C28C69BFC88E78A8BD2AA83AB2D475","message":"{\"method\":\"Network.loadingFinished\",\"params\":{\"requestId\":\"1000030524.136\",\"timestamp\":180150.31724,\"encodedDataLength\":31389,\"shouldReportCorbBlocking\":false}}","targetId":"6296FD1F8C4D089

An example message from the network domain; this time the error occurs after the targetId:

2019/08/16 11:27:14 invalid character 'a' after top-level value
2019/08/16 11:27:14 Faulty element:
{"method":"Target.receivedMessageFromTarget","params":{"sessionId":"12C28C69BFC88E78A8BD2AA83AB2D475","message":"{\"method\":\"Network.loadingFinished\",\"params\":{\"requestId\":\"1000030524.95\",\"timestamp\":180150.112695,\"encodedDataLength\":23093,\"shouldReportCorbBlocking\":false}}","targetId":"6296FD1F8C4D08988ADD6EC222A30C3F"}}ary.com/flyby/t_thumbOWP/15584289868777.png\",\"status\":200,\"statusText\":\"\",\"headers\":{\"date\":\"Fri, 16 Aug 2019 09:27:14 GMT\",\"via\":\"1.1 varnish\",\"age\":\"1301358\",<CUTOFF>}}

It looks as if the beginning of the URL is missing in all these messages. Note that the end of the message is complete; I have cut off the rest of the line to shorten the message a bit.

First I did some experiments with the EnableParams (WithMaxTotalBufferSizes) until I realized that there are filters by message size so that overly large messages won't hit my code.

I guess that either I made a mistake setting up chromedp.NewContext(), or there is a bug in how the slice of interfaces for the is parameter is created. I tracked the call chain into the Read() func in chromedp's conn.go file.

To me, the observed behaviour could be explained by two buffers being copied into a third, with the first overwriting parts of the second.
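The cause is not confirmed anywhere in this report. One hypothesis consistent with the guess above is that the values handed to the debug handler alias a buffer that is reused after the handler returns, so a goroutine started by the handler may read bytes that have since been overwritten. Under that assumption, a defensive copy taken before the goroutine starts would avoid the problem. The snapshot helper below is hypothetical and uses only the standard library.

```go
package main

import "fmt"

// snapshot copies every argument that carries text or bytes, so that later
// writes to the original backing array cannot change what the copies hold.
func snapshot(args ...interface{}) [][]byte {
	out := make([][]byte, 0, len(args))
	for _, a := range args {
		switch v := a.(type) {
		case []byte:
			out = append(out, append([]byte(nil), v...))
		case string:
			out = append(out, []byte(v))
		case fmt.Stringer:
			out = append(out, []byte(v.String()))
		}
	}
	return out
}

func main() {
	buf := []byte(`{"method":"a"}`)
	msgs := snapshot(buf)

	// Simulate the reader reusing its buffer for the next message.
	copy(buf, "XXXXXXXXXXXXXX")

	fmt.Println(string(msgs[0])) // prints {"method":"a"}
}
```

In the handler above, this would mean calling snapshot(is...) synchronously and letting the goroutine unmarshal only the copies.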

Kind regards,
Marcus

Warning when running page.setDocumentContent

github.com/chromedp/chromedp v0.5.3
github.com/chromedp/cdproto v0.0.0-20200709115526-d1f6fc58448b
Google Chrome 85.0.4183.102
go version go1.14.4 darwin/amd64
macOS Catalina version 10.15.4

I've been implementing a simple HTML to PDF service.
Everything works perfectly and the program delivers the right output.
However, I noticed that every time I call page.setDocumentContent I get the following warning:

ERROR: received DOM.documentUpdated when there's no top-level frame

Following is an example of how to trigger the error:

package main

import (
	"context"

	"github.com/chromedp/cdproto/cdp"
	"github.com/chromedp/cdproto/page"
	"github.com/chromedp/chromedp"
)

type pageHandler func(ctx context.Context, frameID cdp.FrameID) error

func main() {
	exCtx, cancel := newExecContext()
	defer cancel()

	newPageExec(exCtx, func(tabCtx context.Context, frameID cdp.FrameID) error {
		page.
			SetDocumentContent(frameID, `
				<html>
					<head></head>
					<body>
						<h1>Test</h1>
					</body>
				</html>
			`).
			Do(tabCtx)

		return nil
	})
}

func newExecContext() (context.Context, context.CancelFunc) {
	allocatorOptions := append(
		chromedp.DefaultExecAllocatorOptions[:],
		chromedp.Flag("headless", true),
		chromedp.Flag("disable-gpu", true),
	)

	ctx, cancel := chromedp.NewExecAllocator(
		context.Background(),
		allocatorOptions...,
	)

	return ctx, cancel
}

func newPageExec(execCtx context.Context, handler pageHandler) {
	newTabCtx, cancel := chromedp.NewContext(execCtx)
	chromedp.Run(newTabCtx, taskHandler(handler))
	cancel()
}

func taskHandler(handler pageHandler) chromedp.Tasks {
	return chromedp.Tasks{
		chromedp.ActionFunc(func(ctx context.Context) error {
			frameTree, err := page.GetFrameTree().Do(ctx)

			if err != nil {
				return err
			}

			err = handler(ctx, frameTree.Frame.ID)

			if err != nil {
				return err
			}

			return nil
		}),
	}
}

Json to HAR

How do I take a HAR export (JSON content) and load it into the HAR struct? There is no Unmarshal method on the HAR or Log objects.
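As a fallback that does not depend on the generated types, a HAR export can be decoded with encoding/json into a locally defined struct. The harFile type below is a hypothetical, minimal view that follows the usual HAR JSON layout (a top-level log object with version and entries) and models only a handful of fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// harFile is a hypothetical, minimal view of a HAR export; it is not the
// cdproto HAR type and covers only a few fields.
type harFile struct {
	Log struct {
		Version string `json:"version"`
		Entries []struct {
			Request struct {
				Method string `json:"method"`
				URL    string `json:"url"`
			} `json:"request"`
			Response struct {
				Status int `json:"status"`
			} `json:"response"`
		} `json:"entries"`
	} `json:"log"`
}

// parseHAR decodes raw HAR JSON into the minimal struct above.
func parseHAR(data []byte) (*harFile, error) {
	var h harFile
	if err := json.Unmarshal(data, &h); err != nil {
		return nil, err
	}
	return &h, nil
}

func main() {
	data := []byte(`{"log":{"version":"1.2","entries":[{"request":{"method":"GET","url":"https://example.com/"},"response":{"status":200}}]}}`)
	h, err := parseHAR(data)
	if err != nil {
		panic(err)
	}
	for _, e := range h.Log.Entries {
		fmt.Println(e.Request.Method, e.Request.URL, e.Response.Status)
	}
}
```

Fields that are not declared in the struct are ignored by encoding/json, so the view can be extended one field at a time as needed.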

Could we support go1.12?

For various reasons, we cannot upgrade from Go 1.12 to Go 1.13. Could Go 1.12 be supported in a separate branch?
