Install poppler-glib and cairo (Debian/Ubuntu):
apt-get install libpoppler-glib-dev libcairo2-dev
Install the Go package:
go get github.com/cheggaaa/go-poppler
Go wrapper for the Poppler PDF rendering library
License: GNU General Public License v2.0
I would like to use go-poppler in one of my projects, but it doesn't have a license. Might I suggest the MIT license?
The lack of a license also prevents pkg.go.dev from displaying the package documentation.
golang/src/github.com/cheggaaa/go-poppler/document.go:22: cannot use d.doc (type poppDoc) as type *C.struct__PopplerDocument in argument to _Cfunc_poppler_document_get_pdf_version_string
golang/src/github.com/cheggaaa/go-poppler/document.go:23: cannot use d.doc (type poppDoc) as type *C.struct__PopplerDocument in argument to _Cfunc_poppler_document_get_title
golang/src/github.com/cheggaaa/go-poppler/document.go:24: cannot use d.doc (type poppDoc) as type *C.struct__PopplerDocument in argument to _Cfunc_poppler_document_get_author
...
I changed code like this:
diff -ur go-poppler.orig/image.go go-poppler/image.go
--- go-poppler.orig/image.go 2015-10-15 17:26:17.000000000 +0200
+++ go-poppler/image.go 2015-10-15 17:27:27.565100316 +0200
@@ -11,7 +11,7 @@
type Image struct {
Id int
Area Rectangle
- p poppDoc
+ p poppPage
}
type Rectangle struct {
diff -ur go-poppler.orig/page.go go-poppler/page.go
--- go-poppler.orig/page.go 2015-10-15 17:26:17.000000000 +0200
+++ go-poppler/page.go 2015-10-15 17:27:06.805461872 +0200
@@ -8,7 +8,7 @@
//import "fmt"
type Page struct {
- p poppDoc
+ p poppPage
}
func (p *Page) Text() string {
diff -ur go-poppler.orig/poppler.go go-poppler/poppler.go
--- go-poppler.orig/poppler.go 2015-10-15 17:26:17.000000000 +0200
+++ go-poppler/poppler.go 2015-10-15 17:26:49.165219888 +0200
@@ -12,7 +12,8 @@
"path/filepath"
)
-type poppDoc *[0]byte
+type poppDoc *C.struct__PopplerDocument
+type poppPage *C.struct__PopplerPage
func Open(filename string) (doc *Document, err error) {
filename, err = filepath.Abs(filename)
But now I am getting this:
golang/src/github.com/cheggaaa/go-poppler/image.go:24: cannot use ci (type *C.struct__cairo_surface) as type *cairo.C.struct__cairo_surface in argument to cairo.NewSurfaceFromC
golang/src/github.com/cheggaaa/go-poppler/image.go:24: cannot use ctx (type *C.struct__cairo) as type *cairo.C.struct__cairo in argument to cairo.NewSurfaceFromC
It is the same type, only with cairo. in front of it. Do you have any idea why this happens, and can the value be cast? I am using poppler-0.35.0 and cairo-1.14.2.
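Some background on the error above: cgo translates each C type into an unexported Go type separately for every Go package, so a C type used in one package is a different Go type from the same C type used in another, even when both describe the same C struct. The usual workaround is to pass the pointer across the package boundary as an unsafe.Pointer and convert it on the other side. The sketch below shows only that conversion in plain Go. The types surfaceA and surfaceB and the helper asB are hypothetical stand-ins, not part of go-poppler or of any cairo binding.

```go
package main

import (
	"fmt"
	"unsafe"
)

// surfaceA and surfaceB are hypothetical stand-ins for the two
// package-local translations of one C struct. They have the same
// memory layout but are distinct Go types.
type surfaceA struct{ width, height int32 }
type surfaceB struct{ width, height int32 }

// asB reinterprets a *surfaceA as a *surfaceB without copying.
// Plain Go would also accept a direct pointer conversion here; the
// unsafe.Pointer step is shown because cgo types are unexported and
// cannot be named from another package.
func asB(a *surfaceA) *surfaceB {
	return (*surfaceB)(unsafe.Pointer(a))
}

func main() {
	a := &surfaceA{width: 595, height: 842}
	b := asB(a)
	fmt.Println(b.width, b.height) // prints "595 842"
}
```

The conversion is only safe when both types really describe the same C layout, which is the case when they are two translations of the same header.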
Here is a function that converts a page to an image, but the resulting image is blurry, and I don't yet know how to fix that.
func (p *Page) Convert2Image() image.Image {
// Render the PDF file
width, height := p.Size()
surface := cairo.NewSurface(cairo.FORMAT_ARGB32, int(width), int(height))
pattern := cairo.NewPatternForSurface(surface)
pattern.SetFilter(cairo.CAIRO_FILTER_NEAREST)
defer pattern.Destroy()
_, drawcontext := surface.Native()
surface.SetSource(pattern)
C.poppler_page_render_for_printing(p.p, (*C.cairo_t)(unsafe.Pointer(drawcontext)))
defer surface.Finish()
return surface.GetImage()
}
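One plausible cause of the blur, offered as an assumption rather than a confirmed diagnosis: Poppler reports page sizes in PDF points (1/72 inch), so a surface of int(width) by int(height) pixels holds only one pixel per point, which is a 72 DPI render. Rendering at a higher resolution means allocating a larger surface and scaling the cairo context by the same factor before rendering. The sketch below covers only the arithmetic. renderSize is a hypothetical helper, and the calls needed to apply the scale depend on the cairo binding in use, so they are not shown.

```go
package main

import (
	"fmt"
	"math"
)

// pointsPerInch is the PDF unit: one point is 1/72 of an inch.
const pointsPerInch = 72.0

// renderSize (hypothetical helper) returns the scale factor and the
// pixel dimensions needed to render a page measuring widthPt by
// heightPt points at the requested dpi.
func renderSize(widthPt, heightPt, dpi float64) (scale float64, widthPx, heightPx int) {
	scale = dpi / pointsPerInch
	// Round up so the last partial pixel row and column are not cut off.
	widthPx = int(math.Ceil(widthPt * scale))
	heightPx = int(math.Ceil(heightPt * scale))
	return scale, widthPx, heightPx
}

func main() {
	// An A4 page is roughly 595 x 842 points.
	scale, w, h := renderSize(595, 842, 144)
	fmt.Println(scale, w, h) // prints "2 1190 1684"
}
```

With these values, the surface would be created at w by h pixels and the context scaled by scale in both directions before the page is rendered.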
Hi there, first I want to say thank you for writing this wrapper 👍
My file descriptor limit is 256, and this code fails on the 252nd iteration (i == 251):
for i := 0; i < 300; i++ {
fmt.Println(i)
doc, _ := poppler.Open("file.pdf")
doc.GetPage(0)
}
Error log:
(process:30447): Poppler-CRITICAL **: 19:37:52.120: int poppler_document_get_n_pages(PopplerDocument *): assertion 'POPPLER_IS_DOCUMENT (document)' failed
OS: macOS Catalina 10.15.4
Per golang/go#23749 (comment), cflags-only-I is no longer a whitelisted cgo option for the Go compiler, hence I was not able to build using go get. I set the environment variable as indicated as a temporary fix, but I'm not familiar enough with the flags and how you're using them to propose a permanent fix. If you could look into it for future users of your library, I'm sure folks would appreciate that. Thanks!
Hi. I couldn't find any examples in your documentation of how to convert a whole PDF into multiple images, although poppler itself can do this. Can it be done with your wrapper?
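Independently of what the wrapper supports, Poppler's own command-line tool pdftoppm can write one image per page, and it can be driven from Go with os/exec. This is a workaround sketch, not a feature of go-poppler. The helper pdftoppmArgs, the file name file.pdf, the output prefix page and the 150 DPI value are all illustrative assumptions.

```go
package main

import (
	"fmt"
	"os/exec"
	"strconv"
)

// pdftoppmArgs (hypothetical helper) builds the arguments for
// "pdftoppm -png -r <dpi> <pdf> <prefix>", which writes one PNG
// per page using outPrefix as the file name root.
func pdftoppmArgs(pdfPath, outPrefix string, dpi int) []string {
	return []string{"-png", "-r", strconv.Itoa(dpi), pdfPath, outPrefix}
}

func main() {
	cmd := exec.Command("pdftoppm", pdftoppmArgs("file.pdf", "page", 150)...)
	// Print the command instead of running it, so the sketch works
	// without Poppler installed. Call cmd.Run() to execute it.
	fmt.Println(cmd.Args)
}
```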
Poppler ships a tool called "pdfseparate".
You can run something like:
pdfseparate main_pdf.pdf pdf_page_%d.pdf
This gives you pdf_page_1.pdf, pdf_page_2.pdf, pdf_page_3.pdf, etc.
Alternatively, the wrapper could write out a single page. Right now you can open a PDF file and iterate over its pages like this:
pdf, err := poppler.Open("file.pdf")
if err != nil {
	log.Fatal(err) // stop if the PDF cannot be opened
}
pageCount := pdf.GetNPages()
for i := 0; i < pageCount; i++ {
	pdfPage := pdf.GetPage(i)
	_ = pdfPage // write pdfPage to a file?
}
The loop above could also write only specific pages, for example with an if statement that searches the result of Text() for some string.
Is there a way to do this? Thank you.
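The page-selection part of this idea can be sketched independently of how a page would be written out. The code below assumes only that a page offers a Text() string method, as Page does in this package. The textPage interface, the matchingPages helper and the fakePage type are hypothetical names introduced for illustration, and fakePage stands in for real pages so the sketch runs without Poppler.

```go
package main

import (
	"fmt"
	"strings"
)

// textPage (hypothetical) is the only behaviour the selection needs.
type textPage interface {
	Text() string
}

// matchingPages returns the indexes of the pages whose text
// contains needle.
func matchingPages(pages []textPage, needle string) []int {
	var hits []int
	for i, p := range pages {
		if strings.Contains(p.Text(), needle) {
			hits = append(hits, i)
		}
	}
	return hits
}

// fakePage is a stand-in for a real page, used only in this sketch.
type fakePage string

func (f fakePage) Text() string { return string(f) }

func main() {
	pages := []textPage{
		fakePage("Invoice 001"),
		fakePage("Terms and conditions"),
		fakePage("Invoice 002"),
	}
	fmt.Println(matchingPages(pages, "Invoice")) // prints "[0 2]"
}
```

The returned indexes could then be fed to whatever mechanism ends up writing individual pages, which is the open question in this request.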