phpdave11 / gofpdi

Go Free PDF Document Importer

License: MIT License

gofpdi's Issues

Counting pages in pdf?

I'd like to use this library to count the number of pages in a PDF. The simplest method I can see using currently exported functionality is this:

func countPages(data []byte) int {
	reader := io.ReadSeeker(bytes.NewReader(data))
	importer := gofpdi.NewImporter()
	importer.SetSourceStream(&reader)
	return len(importer.GetPageSizes())
}

However, if I could access unexported fields, I believe this would be a much better option:

func countPages(data []byte) (int, error) {
	pdfReader, err := gofpdi.NewPdfReaderFromStream(bytes.NewReader(data))
	if err != nil {
		return 0, err
	}
	return len(pdfReader.pages), nil
}

A .PageCount() could be easily added to PdfReader to allow this (I might submit a pull request), but I'm wondering if you see any reason not to do that? I assume the reason it doesn't exist already is that you haven't needed it, but I wanted to make sure.

Why do Importer methods panic instead of returning errors?

This seemed like a strange choice to me, as Go libraries generally return errors instead of panicking. I would like to use the Importer to import PDFs submitted to my API, but as it stands I'm going to have to recover any panics caused by bad PDF formatting so that I can inform the caller of the issue. Any thoughts on this? Would you be opposed to a v2 that returns errors instead?
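
Until the library returns errors, the recovery described above can be wrapped in one helper. This is a minimal sketch using only the standard library; `safeCall` is a hypothetical name, and the closure passed to it stands in for any Importer call that may panic on a malformed PDF.

```go
package main

import (
	"errors"
	"fmt"
)

// safeCall runs fn and converts any panic raised inside it into an error.
// fn is a stand-in for a call that may panic, such as an importer method.
func safeCall(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			// Preserve the original error for errors.Is / errors.As when possible.
			if e, ok := r.(error); ok {
				err = fmt.Errorf("recovered panic: %w", e)
				return
			}
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	fn()
	return nil
}

func main() {
	err := safeCall(func() {
		// Hypothetical failure, standing in for a parser panic.
		panic(errors.New("Failed to read pdf"))
	})
	fmt.Println("error:", err)
}
```

An API handler could call `safeCall` around the import step and report the returned error to the caller instead of crashing the process.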

Adding protection makes imported PDF page content blank

I followed the example for importing a PDF page, and it works like a charm:

package main

import (
    "bytes"
    "github.com/phpdave11/gofpdf"
    "github.com/phpdave11/gofpdf/contrib/gofpdi"
    "io"
    "io/ioutil"
    "net/http"
)

func main() {
    var err error

    pdf := gofpdf.New("P", "mm", "A4", "")

    // Download a PDF into memory                                                                                                                     
    res, err := http.Get("https://tcpdf.org/files/examples/example_038.pdf")
    if err != nil {
        panic(err)
    }
    pdfBytes, err := ioutil.ReadAll(res.Body)
    res.Body.Close()
    if err != nil {
        panic(err)
    }

    // convert []byte to io.ReadSeeker                                                                                                                
    rs := io.ReadSeeker(bytes.NewReader(pdfBytes))

    // Import in-memory PDF stream with gofpdi free pdf document importer                                                                             
    tpl1 := gofpdi.ImportPageFromStream(pdf, &rs, 1, "/TrimBox")

    pdf.AddPage()

    pdf.SetFillColor(200, 700, 220)
    pdf.Rect(20, 50, 150, 215, "F")

    // Draw imported template onto page                                                                                                               
    gofpdi.UseImportedTemplate(pdf, tpl1, 20, 50, 150, 0)

    pdf.SetFont("Helvetica", "", 20)
    pdf.Cell(0, 0, "Import PDF stream into gofpdf document with gofpdi")

    err = pdf.OutputFileAndClose("example.pdf")
    if err != nil {
        panic(err)
    }
}

But when I use the SetProtection function, all content of the imported page is gone:
pdf.SetProtection(gofpdf.CnProtectAnnotForms, "1", "a")

I need to import a PDF page and set a password on the output PDF file. Please help me work through this.

runtime error: invalid memory address or nil pointer dereference

The file at http://www.campbell-lange.net/media/files/example6.pdf causes the following error:

file: /tmp/f4a42574-28b8-4fc8-8322-fdbd885f10f3.pdf
   pages:   2
   page :   1 processed
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x98 pc=0x5407b6]

goroutine 1 [running]:
github.com/phpdave11/gofpdi.(*PdfReader).rebuildContentStream(0xc00007a1e0, 0xc0002623c0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/reader.go:1376 +0xb6
github.com/phpdave11/gofpdi.(*PdfReader).getContent(0xc00007a1e0, 0x2, 0xc00022d860, 0x0, 0x0, 0x1)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/reader.go:1329 +0x1f3
github.com/phpdave11/gofpdi.(*PdfWriter).ImportPage(0xc00016cc80, 0xc00007a1e0, 0x2, 0x5c1e52, 0x9, 0x1, 0x0, 0x0)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/writer.go:148 +0x163
github.com/phpdave11/gofpdi.(*Importer).ImportPage(0xc0000a5480, 0x2, 0x5c1e52, 0x9, 0x0)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/importer.go:122 +0x103
github.com/phpdave11/gofpdf/contrib/gofpdi.(*Importer).getTemplateID(0xc00009e570, 0x60a0c0, 0xc000124000, 0x2, 0x5c1e52, 0x9, 0x7)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/contrib/gofpdi/gofpdi.go:61 +0x50
github.com/phpdave11/gofpdf/contrib/gofpdi.(*Importer).ImportPage(0xc00009e570, 0x60a0c0, 0xc000124000, 0x7fffdeeeb5f7, 0x2d, 0x2, 0x5c1e52, 0x9, 0x0)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/contrib/gofpdi/gofpdi.go:45 +0x89
github.com/phpdave11/gofpdf/contrib/gofpdi.ImportPage(...)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/contrib/gofpdi/gofpdi.go:115
main.main()
	/home/rory/src/go-gofpdi-test/gofpdi-t3.go:47 +0x214

Please note that this file will be removed shortly.

Failed to read xref table: Failed to read prev xref: Unsupported /DecodeParms

Hi @phpdave11

I'm getting the following error on 1.0.13:

panic: Failed to initialize parser: Failed to read pdf: Failed to read xref table: Failed to read prev xref: Unsupported /DecodeParms - only tested with /Columns <= 4 and /Predictor <= 12

I get this error with a number of large PDFs (such as books) and PDFs made with Google Docs.

How to reproduce:

package main
  
import (
    "fmt"
    "github.com/phpdave11/gofpdf"
    "github.com/phpdave11/gofpdf/contrib/gofpdi"
    "io"
    "io/ioutil"
    "net/http"
    "os"
    rpdf "rsc.io/pdf"
)

const PDF_WIDTH_IN_MM = 222.6264
const PDF_HEIGHT_IN_MM = 297.0000
const MM_TO_RMPOINTS = 2.83465

var Verbose bool = true

var url string = "http://www.campbell-lange.net/media/files/example_2020a.pd

func main() {

    resp, err := http.Get(url)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    tmpFile, err := ioutil.TempFile(os.TempDir(), "pdfexample-")
    if err != nil {
        panic(err)
    }
    filer := tmpFile.Name()
    defer os.Remove(filer)

    _, err = io.Copy(tmpFile, resp.Body)
    if err != nil {
        panic(err)
    }


    // get pdf file information
    fi, err := rpdf.Open(filer)
    if err != nil {
        panic(err)
    }
    pageNum := fi.NumPage()
    if Verbose {
        fmt.Printf("Example PDF from %s\n", url)
        fmt.Printf("file: %s\n   pages: %3d\n", filer, pageNum)
    }

    // construct new gofpdf document
    pdf := gofpdf.NewCustom(&gofpdf.InitType{
        UnitStr: "pt",
        Size: gofpdf.SizeType{
            Wd: PDF_WIDTH_IN_MM * MM_TO_RMPOINTS,
            Ht: PDF_HEIGHT_IN_MM * MM_TO_RMPOINTS},
    })

    for p := 1; p <= pageNum; p++ {
        pdf.AddPage()

        bgpdf := gofpdi.ImportPage(pdf, filer, p, "/MediaBox")
        gofpdi.UseImportedTemplate(pdf, bgpdf, 0, 0, 210*MM_TO_RMPOINTS, 297

        if Verbose {
            fmt.Printf("   page : %3d processed\n", p)
        }
    }
}

Thanks for your efforts!
Rory

There seems to be a memory leak somewhere?

Our app dies after getting this error

"fatal error: concurrent map iteration and map write"

stacktrace leads here

{"line":"\t/go/src/github.com/phpdave11/gofpdi/importer.go:186 +0xdb fp=0xc14d43ee08 sp=0xc14d43ed38 pc=0xbde04b","source":"stderr","tag":"ss-dms"}

Also attaching pprof profile
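
The Go runtime raises this fatal error when one goroutine iterates over a map while another writes to it. Assuming the cause here is a single object with internal maps being shared across goroutines (not confirmed by the report), a common remedy is to serialize access with a mutex. The sketch below uses a hypothetical `templateCache` type as a stand-in; it is not gofpdi code.

```go
package main

import (
	"fmt"
	"sync"
)

// templateCache is a hypothetical stand-in for any object that keeps
// internal maps and is not safe for concurrent use on its own.
type templateCache struct {
	mu   sync.Mutex
	tpls map[string]int
}

func newTemplateCache() *templateCache {
	return &templateCache{tpls: make(map[string]int)}
}

// Put records a template ID under key while holding the lock.
func (c *templateCache) Put(key string, id int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.tpls[key] = id
}

// Len iterates over the map under the same lock, so the iteration
// can never overlap with a write from another goroutine.
func (c *templateCache) Len() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	n := 0
	for range c.tpls {
		n++
	}
	return n
}

func main() {
	cache := newTemplateCache()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			cache.Put(fmt.Sprintf("page-%d", i), i)
			_ = cache.Len()
		}(i)
	}
	wg.Wait()
	fmt.Println("entries:", cache.Len())
}
```

Another option under the same assumption is to create one importer per goroutine so that no state is shared at all.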

Failed to initialize parser: Failed to read pdf: Failed to read xref table: Expected xref to start with 'xref'

Certain PDF documents do not contain an xref table, which causes this parsing error.
I am attaching a PDF so you can reproduce the error:
example_012a.pdf

Getting errors when trying to use importers on certain pdfs

When I try to load a PDF using importer.SetSourceStream, I get the following panic:

panic: Failed to initialize parser: Failed to read pdf: Failed to to read pages: Failed to read kids: Failed to resolve page/pages object: Failed to read value for token: <<: Token is empty
goroutine 50 [running]:
runtime/debug.Stack()
        /usr/lib/go/src/runtime/debug/stack.go:24 +0x5e
github.com/gofiber/fiber/v2/middleware/recover.defaultStackTraceHandler(0x18?, {0x1007980?, 0xc000c86468})
        /home/kittycat/go/pkg/mod/github.com/gofiber/fiber/[email protected]/middleware/recover/recover.go:12 +0x26
github.com/LegalForceLawRAPC/notarysign/src/core/fiber.RunServer.New.func2.1()
        /home/kittycat/go/pkg/mod/github.com/gofiber/fiber/[email protected]/middleware/recover/recover.go:31 +0x72
panic({0x1007980?, 0xc000c86468?})
        /usr/lib/go/src/runtime/panic.go:914 +0x21f
github.com/phpdave11/gofpdi.(*Importer).SetSourceStream(0xc00094cf60, 0xc00041b790)
        /home/kittycat/go/pkg/mod/github.com/phpdave11/[email protected]/importer.go:95 +0x1a8
github.com/LegalForceLawRAPC/notarysign/common/utils.AddPageToPDF({0xc0008be000, 0x247b6, 0x2a000}, 0x3)
        /run/media/kittycat/Linux_files/work/Signmarkia-Backend/common/utils/pdf_utils.go:91 +0x137
...

This is the code I wrote. It just deletes a page from the PDF:

func DeletePageFromPDF(pdfBuf []byte, pageNumber int) ([]byte, error) {

	pdf = InitGoPdf()
	importer := gofpdi.NewImporter()
	pdfReader := bytes.NewReader(pdfBuf)
	r := io.ReadSeeker(pdfReader)
	if r == nil {
		return nil, errors.New("failed to read pdf")
	}
	importer.SetSourceStream(&r)
	totalPagesOfPdf := importer.GetNumPages()

	for i := 0; i < totalPagesOfPdf; i++ {
		if pageNumber != i+1 {
			importedPage := pdf.ImportPageStream(&r, i+1, "/MediaBox")
			data := importer.GetPageSizes()
			pageWidth := data[i+1]["/MediaBox"]["w"]
			pageHeight := data[i+1]["/MediaBox"]["h"]
			pdf.AddPageWithOption(gopdf.PageOption{PageSize: &gopdf.Rect{W: pageWidth, H: pageHeight}})
			pdf.UseImportedTemplate(importedPage, 0, 0, pageWidth, pageHeight)
		}

	}

	var newPDFBuf bytes.Buffer
	_, err := pdf.WriteTo(&newPDFBuf)
	if err != nil {
		return nil, err
	}

	return newPDFBuf.Bytes(), nil
}

The issue seems to happen randomly: it works once on a PDF, then fails the next time on the same PDF.

Here is a link to the entire code if needed:
https://gist.github.com/Lioncat2002/f40306aba45ccafe39dcceca294f50dc

Reading pdf plain text

How can I read pdf plain text with this package?

Unable to import PDF exported from browsers (Chromium/Firefox/Blink/Gecko)

Hello, this package is great, but I'm unable to import PDFs that have been exported from browsers.

It works completely fine when importing PDFs exported from jung-kurt/gofpdf, but if I try to import a PDF exported from Chromium or Firefox I get the following error:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x51831b]

goroutine 1 [running]:
github.com/phpdave11/gofpdi.(*PdfReader).resolveObject(0xc00010c4d0, 0x0)
        /mnt/Portable/Programming/Go/pkg/mod/github.com/phpdave11/[email protected]/reader.go:599 +0x19b
github.com/phpdave11/gofpdi.(*PdfReader).readPages(0xc00010c4d0)
        /mnt/Portable/Programming/Go/pkg/mod/github.com/phpdave11/[email protected]/reader.go:1228 +0x52
github.com/phpdave11/gofpdi.(*PdfReader).read(0xc00010c4d0)
        /mnt/Portable/Programming/Go/pkg/mod/github.com/phpdave11/[email protected]/reader.go:1622 +0x5e
github.com/phpdave11/gofpdi.(*PdfReader).init(0xc00010c4d0)
        /mnt/Portable/Programming/Go/pkg/mod/github.com/phpdave11/[email protected]/reader.go:75 +0x105
github.com/phpdave11/gofpdi.NewPdfReader({0x55f1ff, 0xc000016390})
        /mnt/Portable/Programming/Go/pkg/mod/github.com/phpdave11/[email protected]/reader.go:61 +0xdc
github.com/phpdave11/gofpdi.(*Importer).SetSourceFile(0xc00002b580, {0x55f1ff, 0x0})
        /mnt/Portable/Programming/Go/pkg/mod/github.com/phpdave11/[email protected]/importer.go:67 +0x6a
github.com/phpdave11/gofpdf/contrib/gofpdi.(*Importer).ImportPage(0xc00000e5b0, {0x597b58, 0xc000142000}, {0x55f1ff, 0x55d01b}, 0x2, {0x55e516, 0x9})
        /mnt/Portable/Programming/Go/pkg/mod/github.com/phpdave11/[email protected]/contrib/gofpdi/gofpdi.go:43 +0x4b
github.com/phpdave11/gofpdf/contrib/gofpdi.ImportPage(...)
        /mnt/Portable/Programming/Go/pkg/mod/github.com/phpdave11/[email protected]/contrib/gofpdi/gofpdi.go:115
main.main()
        /mnt/Portable/Programming/Go/src/github.com/barjoio/pdf-header-footer/main.go:14 +0x116
exit status 2

gofpdi version: v1.4.2

Problem importing a page

Hello @phpdave11, I'm having problems with the ImportPage function.

I'm porting my PHP code to Go. I can import my PDF file in the PHP project, but the same file fails to import in the Go project.

The error is:
Failed to put imported objects: Unable to resolve object: Expected next token to be: endstream, got:

If you can help, please contact me on LinkedIn.

panic: Failed to get page rotation

This is a good question. I wrote a small program to test whether this can be done but I ran into the following problem with gofpdi:

panic: Failed to get page rotation: Failed to get page rotation for parent: No parent for page rotation

@phpdave11, do you know why this happens? Here is the PDF I read as a template:

func samplePDF() (fileStr string, err error) {
  pdf := gofpdf.New(gofpdf.OrientationPortrait, "mm", "A4", "")
  pdf.AddPage()
  pdf.SetFont("Arial", "B", 16)
  pdf.Cell(40, 10, "Hello World!")
  fileStr = "hello.pdf"
  err = pdf.OutputFileAndClose(fileStr)
  return
}

The generated PDF is smaller than the size of the buffer (1500) that is read, so the following change was made to gofpdi:

diff --git a/reader.go b/reader.go
index 3376b45..afdaff9 100644
--- a/reader.go
+++ b/reader.go
@@ -536,11 +536,20 @@ func (this *PdfReader) resolveObject(objSpec *PdfValue) (*PdfValue, error) {
 
 // Find the xref offset (should be at the end of the PDF)
 func (this *PdfReader) findXref() error {
+       const bufSize = 1500
        var result int
        var err error
        var toRead int64
 
        toRead = 1500
+       info, err := this.f.Stat()
+       if err != nil {
+               return errors.Wrap(err, "Failed to obtain file information")
+       }
+       toRead = info.Size()
+       if toRead > bufSize {
+               toRead = bufSize
+       }
 
        // 0 means relative to the origin of the file,
        // 1 means relative to the current offset,

Originally posted by @jung-kurt in jung-kurt/gofpdf#261 (comment)
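
The clamping idea in the diff can also be written against any io.ReadSeeker. The sketch below is not the patch itself: it obtains the size with Seek instead of Stat, and the names `readTail`, `tailOf`, and `tailWindow` are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// tailWindow is the maximum number of trailing bytes to scan, matching the
// 1500-byte buffer mentioned in the diff.
const tailWindow = 1500

// readTail returns up to tailWindow bytes from the end of rs. The read
// length is clamped to the stream size, so a file smaller than the window
// does not cause a seek before the start of the stream.
func readTail(rs io.ReadSeeker) ([]byte, error) {
	size, err := rs.Seek(0, io.SeekEnd)
	if err != nil {
		return nil, fmt.Errorf("failed to determine stream size: %w", err)
	}
	toRead := size
	if toRead > tailWindow {
		toRead = tailWindow
	}
	if _, err := rs.Seek(-toRead, io.SeekEnd); err != nil {
		return nil, fmt.Errorf("failed to seek to tail: %w", err)
	}
	buf := make([]byte, toRead)
	if _, err := io.ReadFull(rs, buf); err != nil {
		return nil, fmt.Errorf("failed to read tail: %w", err)
	}
	return buf, nil
}

// tailOf is a convenience wrapper for in-memory data.
func tailOf(data []byte) ([]byte, error) {
	return readTail(bytes.NewReader(data))
}

func main() {
	tail, err := tailOf([]byte("%PDF-1.4\nstartxref\n9\n%%EOF"))
	if err != nil {
		panic(err)
	}
	fmt.Println(len(tail), bytes.Contains(tail, []byte("startxref")))
}
```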

gofpdi import from stream changes the first page of the pdf

Below is my code for the PDF importer. I am trying to merge two PDFs into one: rsbp is one PDF and rsinv is the other. The first page of the output should have the content of rsbp, and the rest of the output should be rsinv. When I run this code, the pages from rsinv are fine, but the first page (which should have the content of rsbp) has the content of the first page of rsinv.

If I switch to ImportPage and pass the two files by name, everything works; the problem only happens when the inputs are converted to streams.

If I remove the loop, the output PDF has the content of rsbp, which is correct.

I'm not sure what the issue is here.

pdf := gofpdf.New("P", "mm", "A4", "")

w, h := m.GetPageSize()
imp1 := gofpdi.NewImporter()
rsbp := io.ReadSeeker(bytes.NewReader(pdfm.Bytes()))
tpl1 := imp1.ImportPageFromStream(pdf, &rsbp, 1, "/MediaBox")
pdf.AddPage()
imp1.UseImportedTemplate(pdf, tpl1, 0, 0, w, h)
rsinv := io.ReadSeeker(bytes.NewReader(bikepass.Invoicebyte))
tpl1 = imp1.ImportPageFromStream(pdf, &rsinv, 1, "/MediaBox")
nrPages := len(imp1.GetPageSizes())

for i := 1; i <= nrPages; i++ {
	pdf.AddPage()
	if i > 1 {
		tpl1 = imp1.ImportPageFromStream(pdf, &rsinv, i, "/MediaBox")
	}		
	imp1.UseImportedTemplate(pdf, tpl1, 0, 0, w, h)		
}
err := pdf.OutputFileAndClose("pdfs/example.pdf")
if err != nil {
	fmt.Println(fmt.Errorf("error in generating the output pdf file: %w", err))
}

Failed to read xref table: Unsupported field size

Hi. Here is an issue with v1.0.10:

panic: Failed to initialize parser: Failed to read pdf: Failed to read xref table: Unsupported field size in cross-reference stream dictionary - only tested with /W [1 2 1]

goroutine 1 [running]:
github.com/phpdave11/gofpdi.(*Importer).SetSourceFile(0xc0000634c0, 0x5aad2a, 0xc)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/importer.go:69 +0x20d
github.com/phpdave11/gofpdf/contrib/gofpdi.(*Importer).ImportPage(0xc00000e580, 0x5e44a0, 0xc0000c6000, 0x5aad2a, 0xc, 0x1, 0x5aa1ee, 0x9, 0x4083b88b1c22f64f)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/contrib/gofpdi/gofpdi.go:43 +0x46
github.com/phpdave11/gofpdf/contrib/gofpdi.ImportPage(...)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/contrib/gofpdi/gofpdi.go:115
main.main()
	/home/rory/src/go-gofpdi-test/gofpdi-t2.go:32 +0x16c
exit status 2

Example file: http://www.campbell-lange.net/media/files/example3.pdf

Unable to import more than one file

If I try to import more than one file, the last file imported is used in place of all the others. This is because the same hash is generated for both files, similar to what is reported in #33. The changes in v1.0.12...v1.0.13 seemed to solve it, but not when importing from a stream.

Other issues that seem related:

sample code:

importer := gofpdi.NewImporter()
for _, attachment := range msg.Attachments {
   rs := io.ReadSeeker(bytes.NewReader(attachment.Bytes))

   tplID := importer.ImportPageFromStream(pdf, &rs, 1, "/MediaBox")
   pdf.AddPage()

   pdf.SetFillColor(238, 238, 238)
   pdf.Rect(0, MarginTop, 210, ContentHeight-MarginTop, "F")

   importer.UseImportedTemplate(pdf, tplID, 0, MarginTop, 0, ContentHeight-MarginTop)
}

similar hashes:

In this case i = 1 in both instances of the loop, and this.r.sourcefile = ''.

Conclusion:

It seems the code is written to differentiate between filenames, and although each stream is assigned a unique this.sourceFile, it is not used in any of the readers or writers. I found that just setting that name made it work for me.

I'll open up a pull request and hopefully this can get merged.
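
To illustrate the collision, here is a minimal sketch. It assumes the template key is derived from the source name, page number, and box; `templateKey` and `streamName` are hypothetical helpers written for this illustration and are not gofpdi's actual hashing code.

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// templateKey derives a cache key from the source name, page number, and box.
// It mirrors the idea described above, not the library's real implementation.
func templateKey(source string, page int, box string) string {
	sum := sha1.Sum([]byte(fmt.Sprintf("%s-%d-%s", source, page, box)))
	return fmt.Sprintf("%x", sum)
}

// streamName builds a distinct source name for an in-memory PDF by hashing
// its content, so two different streams never share an empty name.
func streamName(data []byte) string {
	return fmt.Sprintf("stream-%x", sha1.Sum(data))
}

func main() {
	first, second := []byte("first pdf"), []byte("second pdf")

	// With an empty source name, page 1 of both streams maps to the same key.
	collide := templateKey("", 1, "/MediaBox") == templateKey("", 1, "/MediaBox")

	// With a per-stream name, the keys differ.
	distinct := templateKey(streamName(first), 1, "/MediaBox") !=
		templateKey(streamName(second), 1, "/MediaBox")

	fmt.Println(collide, distinct)
}
```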

SetSourceStream should not require a pointer to an interface

I don't understand why SetSourceStream requires a pointer to an interface.
The parameter should be
rs io.ReadSeeker
instead of
rs *io.ReadSeeker

The conversion to the interface would then no longer be necessary, as the *bytes.Reader returned by bytes.NewReader already implements io.ReadSeeker:

rs := bytes.NewReader(pdfBytes)
tpl1 := gofpdi.ImportPageFromStream(pdf, rs, 1, "/TrimBox")

Let me know if you want me to create a pull request for this. I understand that this is a rather tedious issue, since it breaks the interface and would mean that this library, along with the gofpdf and gopdf libraries, would have to release a new major version.
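
A standalone sketch of the point, using only the standard library: a function that takes io.ReadSeeker by value accepts a *bytes.Reader with no conversion and no address-of operator. `sourceSize` and `sizeOfBytes` are hypothetical names and not part of gofpdi.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// sourceSize accepts the interface value directly. Any type that implements
// io.ReadSeeker (such as *bytes.Reader or *os.File) can be passed as is.
func sourceSize(rs io.ReadSeeker) (int64, error) {
	size, err := rs.Seek(0, io.SeekEnd)
	if err != nil {
		return 0, err
	}
	// Rewind so the caller can still read from the beginning.
	if _, err := rs.Seek(0, io.SeekStart); err != nil {
		return 0, err
	}
	return size, nil
}

// sizeOfBytes shows the call site: no io.ReadSeeker(...) conversion is needed.
func sizeOfBytes(data []byte) (int64, error) {
	return sourceSize(bytes.NewReader(data))
}

func main() {
	n, err := sizeOfBytes([]byte("%PDF-1.4"))
	if err != nil {
		panic(err)
	}
	fmt.Println("size:", n)
}
```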

Some minor issues with gofpdi.

Dear gofpdi author, hello. It appears that gofpdi.NewImporter().SetSourceFile(pdfFileName) keeps holding on to the PDF file after the call, and the API does not provide a method to release it.
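
One possible workaround, assuming the problem is an open file handle, is to read the file into memory first and hand the importer a stream. The sketch below shows only the standard-library part; `loadPDF` is a hypothetical helper, and the resulting reader would then be passed to the stream-based API (SetSourceStream) described in the other issues on this page.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// loadPDF reads the whole file into memory and returns a seekable reader.
// os.ReadFile opens and closes the file itself, so no handle stays open
// after this function returns.
func loadPDF(path string) (*bytes.Reader, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("failed to read %s: %w", path, err)
	}
	return bytes.NewReader(data), nil
}

func main() {
	// Create a small placeholder file for the demonstration.
	tmp, err := os.CreateTemp("", "example-*.pdf")
	if err != nil {
		panic(err)
	}
	path := tmp.Name()
	if _, err := tmp.WriteString("%PDF-1.4\n"); err != nil {
		panic(err)
	}
	if err := tmp.Close(); err != nil {
		panic(err)
	}

	rs, err := loadPDF(path)
	if err != nil {
		panic(err)
	}
	// The file can be removed right away because nothing holds it open.
	if err := os.Remove(path); err != nil {
		panic(err)
	}
	fmt.Println("bytes still readable:", rs.Len())
}
```

The trade-off is memory: the whole PDF is held in RAM for as long as the reader is in use.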

Import Removes Links?

Hello! Testing this with the "go-pdf/fpdf" package -- importing a pdf that has link boxes inside it causes those links to disappear. Is there a way to get them back? Is there an option that's not documented for this?

Failed to get page resources: Page <n> does not exist!!

The following code produces an error on 1.0.11:

package main
  
import (
    "fmt"
    "github.com/phpdave11/gofpdf"
    "github.com/phpdave11/gofpdf/contrib/gofpdi"
    "io"
    "io/ioutil"
    "net/http"
    "os"
    rpdf "rsc.io/pdf"
)

const PDF_WIDTH_IN_MM = 222.6264
const PDF_HEIGHT_IN_MM = 297.0000
const MM_TO_RMPOINTS = 2.83465

var Verbose bool = true

var url string = "http://www.campbell-lange.net/media/files/example4.pdf"

func main() {

    resp, err := http.Get(url)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    tmpFile, err := ioutil.TempFile(os.TempDir(), "pdfexample-")
    if err != nil {
        panic(err)
    }
    filer := tmpFile.Name()
    defer os.Remove(filer)

    _, err = io.Copy(tmpFile, resp.Body)
    if err != nil {
        panic(err)
    }

    // get pdf file information
    fi, err := rpdf.Open(filer)
    if err != nil {
        panic(err)
    }
    pageNum := fi.NumPage()
    if Verbose {
        fmt.Printf("Example PDF from %s\n", url)
        fmt.Printf("file: %s\n   pages: %3d\n", filer, pageNum)
    }

    // construct new gofpdf document
    pdf := gofpdf.NewCustom(&gofpdf.InitType{
        UnitStr: "pt",
        Size: gofpdf.SizeType{
            Wd: PDF_WIDTH_IN_MM * MM_TO_RMPOINTS,
            Ht: PDF_HEIGHT_IN_MM * MM_TO_RMPOINTS},
    })

    for p := 1; p <= pageNum; p++ {
        pdf.AddPage()
        
        bgpdf := gofpdi.ImportPage(pdf, filer, p, "/MediaBox")
        gofpdi.UseImportedTemplate(pdf, bgpdf, 0, 0, 210*MM_TO_RMPOINTS, 297*MM_TO_RMPOINTS)
        
        if Verbose {
            fmt.Printf("   page : %3d processed\n", p)
        }
    }
}

The error is as follows:

Example PDF from http://www.campbell-lange.net/media/files/example4.pdf
file: /tmp/pdfexample-051294996
   pages:  50
   page :   1 processed
   page :   2 processed
   page :   3 processed
   page :   4 processed
   page :   5 processed
   page :   6 processed
   page :   7 processed
panic: Failed to get page resources: Page 8 does not exist!!

goroutine 1 [running]:
github.com/phpdave11/gofpdi.(*Importer).ImportPage(0xc000079440, 0x8, 0x793158, 0x9, 0x0)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/importer.go:124 +0x366
github.com/phpdave11/gofpdf/contrib/gofpdi.(*Importer).getTemplateID(0xc0000a4560, 0x81b0a0, 0xc000224000, 0x8, 0x793158, 0x9, 0x7)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/contrib/gofpdi/gofpdi.go:61 +0x50
github.com/phpdave11/gofpdf/contrib/gofpdi.(*Importer).ImportPage(0xc0000a4560, 0x81b0a0, 0xc000224000, 0xc000196100, 0x19, 0x8, 0x793158, 0x9, 0x0)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/contrib/gofpdi/gofpdi.go:45 +0x89
github.com/phpdave11/gofpdf/contrib/gofpdi.ImportPage(...)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/contrib/gofpdi/gofpdi.go:115
main.main()
	/home/rory/src/go-gofpdi-test/gofpdi-t3-web.go:64 +0x402
exit status 2

Failed to put imported objects: Unable to resolve object

The following code on v1.0.11:

package main
  
import (
    "fmt"
    "github.com/phpdave11/gofpdf"
    "github.com/phpdave11/gofpdf/contrib/gofpdi"
    "io"
    "io/ioutil"
    "net/http"
    "os"
    rpdf "rsc.io/pdf"
)

const PDF_WIDTH_IN_MM = 222.6264
const PDF_HEIGHT_IN_MM = 297.0000
const MM_TO_RMPOINTS = 2.83465

var Verbose bool = true

var url string = "http://www.campbell-lange.net/media/files/example5.pdf"

func main() {

    resp, err := http.Get(url)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    tmpFile, err := ioutil.TempFile(os.TempDir(), "pdfexample-")
    if err != nil {
        panic(err)
    }
    filer := tmpFile.Name()
    defer os.Remove(filer)

    _, err = io.Copy(tmpFile, resp.Body)
    if err != nil {
        panic(err)
    }

    // get pdf file information
    fi, err := rpdf.Open(filer)
    if err != nil {
        panic(err)
    }
    pageNum := fi.NumPage()
    if Verbose {
        fmt.Printf("Example PDF from %s\n", url)
        fmt.Printf("file: %s\n   pages: %3d\n", filer, pageNum)
    }

    // construct new gofpdf document
    pdf := gofpdf.NewCustom(&gofpdf.InitType{
        UnitStr: "pt",
        Size: gofpdf.SizeType{
            Wd: PDF_WIDTH_IN_MM * MM_TO_RMPOINTS,
            Ht: PDF_HEIGHT_IN_MM * MM_TO_RMPOINTS},
    })

    for p := 1; p <= pageNum; p++ {
        pdf.AddPage()

        bgpdf := gofpdi.ImportPage(pdf, filer, p, "/MediaBox")
        gofpdi.UseImportedTemplate(pdf, bgpdf, 0, 0, 210*MM_TO_RMPOINTS, 297*MM_TO_RMPOINTS)

        if Verbose {
            fmt.Printf("   page : %3d processed\n", p)
        }
    }
}

Results in the following error:

Example PDF from http://www.campbell-lange.net/media/files/example5.pdf
file: /tmp/pdfexample-315244371
   pages:   2
   page :   1 processed
panic: Failed to put imported objects: Unable to resolve object: Expected next token to be: endstream, got: <<

goroutine 1 [running]:
github.com/phpdave11/gofpdi.(*Importer).PutFormXobjectsUnordered(0xc000079440, 0x2)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/importer.go:162 +0x32e
github.com/phpdave11/gofpdf/contrib/gofpdi.(*Importer).getTemplateID(0xc0000a4560, 0x81b0a0, 0xc0001fe000, 0x2, 0x793158, 0x9, 0x7)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/contrib/gofpdi/gofpdi.go:66 +0x6b
github.com/phpdave11/gofpdf/contrib/gofpdi.(*Importer).ImportPage(0xc0000a4560, 0x81b0a0, 0xc0001fe000, 0xc000018320, 0x19, 0x2, 0x793158, 0x9, 0x0)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/contrib/gofpdi/gofpdi.go:45 +0x89
github.com/phpdave11/gofpdf/contrib/gofpdi.ImportPage(...)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/contrib/gofpdi/gofpdi.go:115
main.main()
	/home/rory/src/go-gofpdi-test/gofpdi-t3-web.go:64 +0x402
exit status 2

Support for Linearized PDFs

The following PDF cannot currently be imported, because there is an error parsing the xref table.

white-background.pdf

Originally posted by @hugoglt in jung-kurt/gofpdf#261 (comment)

panic: runtime error: invalid memory address or nil pointer dereference

Hi @phpdave11

I've found a PDF which causes a nil pointer dereference:

panic: runtime error: invalid memory address or nil pointer dereference

Perhaps it is because the PDF has form fields?

To reproduce:

package main
  
import (
    "fmt"
    "github.com/phpdave11/gofpdf"
    "github.com/phpdave11/gofpdf/contrib/gofpdi"
    "io"
    "io/ioutil"
    "net/http"
    "os"
    rpdf "rsc.io/pdf"
)

const PDF_WIDTH_IN_MM = 222.6264
const PDF_HEIGHT_IN_MM = 297.0000
const MM_TO_RMPOINTS = 2.83465

var Verbose bool = true

var url string = "http://www.campbell-lange.net/media/files/example_2020b.pdf"

func main() {

    resp, err := http.Get(url)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    tmpFile, err := ioutil.TempFile(os.TempDir(), "pdfexample-")
    if err != nil {
        panic(err)
    }
    filer := tmpFile.Name()
    defer os.Remove(filer)

    _, err = io.Copy(tmpFile, resp.Body)
    if err != nil {
        panic(err)
    }
    // get pdf file information
    fi, err := rpdf.Open(filer)
    if err != nil {
        panic(err)
    }
    pageNum := fi.NumPage()
    if Verbose {
        fmt.Printf("Example PDF from %s\n", url)
        fmt.Printf("file: %s\n   pages: %3d\n", filer, pageNum)
    }

    // construct new gofpdf document
    pdf := gofpdf.NewCustom(&gofpdf.InitType{
        UnitStr: "pt",
        Size: gofpdf.SizeType{
            Wd: PDF_WIDTH_IN_MM * MM_TO_RMPOINTS,
            Ht: PDF_HEIGHT_IN_MM * MM_TO_RMPOINTS},
    })

    for p := 1; p <= pageNum; p++ {
        pdf.AddPage()

        bgpdf := gofpdi.ImportPage(pdf, filer, p, "/MediaBox")
        gofpdi.UseImportedTemplate(pdf, bgpdf, 0, 0, 210*MM_TO_RMPOINTS, 297

        if Verbose {
            fmt.Printf("   page : %3d processed\n", p)
        }
    }
}

By the way, pdftotext parses this ok so I assume the PDF is valid.

Cheers,
Rory

Error while inserting a random PDF to another PDF

I am getting a nil pointer error in the resolveObject method while inserting an existing PDF into a new PDF:

package main

import "github.com/signintech/gopdf"

func main() {
	gPdf := gopdf.GoPdf{}
	gPdf.Start(gopdf.Config{PageSize: *gopdf.PageSizeA4}) //595.28, 841.89 = A4

	pdf := gPdf.ImportPage("/tmp/documents/agreement.pdf", 1, "/MediaBox")
	gPdf.UseImportedTemplate(pdf, 0, 0, 150, 0)

	_ = gPdf.WritePdf("/tmp/documents/see-all.pdf")
}

StackTrace:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x0 pc=0x1003d8a5c]

goroutine 1 [running]:
github.com/phpdave11/gofpdi.(*PdfReader).resolveObject(0x140000b4210, 0x0)
        /Users/vishaljain/go/pkg/mod/github.com/phpdave11/[email protected]/reader.go:599 +0x13c
github.com/phpdave11/gofpdi.(*PdfReader).readPages(0x140000b4210)
        /Users/vishaljain/go/pkg/mod/github.com/phpdave11/[email protected]/reader.go:1228 +0x4c
github.com/phpdave11/gofpdi.(*PdfReader).read(0x140000b4210)
        /Users/vishaljain/go/pkg/mod/github.com/phpdave11/[email protected]/reader.go:1622 +0x4c
github.com/phpdave11/gofpdi.(*PdfReader).init(0x140000b4210)
        /Users/vishaljain/go/pkg/mod/github.com/phpdave11/[email protected]/reader.go:75 +0x11c
github.com/phpdave11/gofpdi.NewPdfReader({0x1003f9d75, 0x3d})
        /Users/vishaljain/go/pkg/mod/github.com/phpdave11/[email protected]/reader.go:61 +0xc4
github.com/phpdave11/gofpdi.(*Importer).SetSourceFile(0x1400009e040, {0x1003f9d75?, 0x5000?})
        /Users/vishaljain/go/pkg/mod/github.com/phpdave11/[email protected]/importer.go:69 +0x80
github.com/signintech/gopdf.(*GoPdf).ImportPage(0x140000cc000, {0x1003f9d75?, 0x0?}, 0x0?, {0x1003f0cbc, 0x9})
        /Users/vishaljain/go/pkg/mod/github.com/signintech/[email protected]/gopdf.go:1387 +0x40
main.main()
        /Users/vishaljain/go/src/pdftester/main.go:9 +0xcc
exit status 2

It would be really helpful if a nil check could be added to the method; the panic occurs on this specific file.

Problem importing pages from two PDF files

On checking, only one PDF file's page is imported. Is it a bug?

var tp_i_page int
var flag_i_page int
var tp_page int
var flag_page int
for ....... {
	// FLAGxxx, like FLAG018 FLAG2
	if strings.HasPrefix(_fi, "FLAG") {
		fg, _ := strconv.Atoi(strings.Replace(_fi, "FLAG", "", -1))

		if flag_fil != "" && fg != flag_i_page {
			flag_i_page = fg
			flag_page = pdf.ImportPage(flag_fil, flag_i_page, "/MediaBox")
			pdf.UseImportedTemplate(flag_page, PT_(_sz[0]), PT_(_sz[1]), 0, 0)
		}
		continue
	}

	// PAGExxx, like PAGE18 PAGE2
	if strings.HasPrefix(_fi, "PAGE") {
		pg, _ := strconv.Atoi(strings.Replace(_fi, "PAGE", "", -1))
		if pg != tp_i_page {
			tp_i_page = pg
			tp_page = pdf.ImportPage(fil, tp_i_page, "/MediaBox")
			log.Println(tp_page, "PAGE")
		}
	}

	pdf.UseImportedTemplate(tp_page, PT_(_sz[0]), PT_(_sz[1]), 0, 0)
}

PDF file without xref table.

I have a PDF 1.7 file. There is no xref table in the file. Can gofpdi read such a file?
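
Since PDF 1.5, a file may store its cross-reference data in a cross-reference stream object instead of a classic table that starts with the `xref` keyword. The sketch below is a rough, standard-library-only check of which form a file uses, assuming the last `startxref` offset is valid; `xrefKind` is a hypothetical helper and does not call gofpdi.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"regexp"
	"strconv"
)

// objHeader matches the start of an indirect object such as "12 0 obj".
var objHeader = regexp.MustCompile(`^\d+\s+\d+\s+obj`)

// xrefKind reports whether the cross-reference section referenced by the
// last startxref entry is a classic table ("table"), an indirect object
// and therefore probably a cross-reference stream ("stream"), or neither.
func xrefKind(data []byte) (string, error) {
	idx := bytes.LastIndex(data, []byte("startxref"))
	if idx < 0 {
		return "", errors.New("startxref keyword not found")
	}
	fields := bytes.Fields(data[idx+len("startxref"):])
	if len(fields) == 0 {
		return "", errors.New("missing offset after startxref")
	}
	offset, err := strconv.Atoi(string(fields[0]))
	if err != nil || offset < 0 || offset >= len(data) {
		return "", fmt.Errorf("invalid xref offset %q", fields[0])
	}
	section := data[offset:]
	switch {
	case bytes.HasPrefix(section, []byte("xref")):
		return "table", nil
	case objHeader.Match(section):
		return "stream", nil
	}
	return "unknown", nil
}

func main() {
	// Minimal fake inputs; real PDFs contain much more between these parts.
	classic := []byte("xref\n0 1\ntrailer\nstartxref\n0\n%%EOF")
	stream := []byte("5 0 obj\n<< /Type /XRef >>\nstartxref\n0\n%%EOF")
	k1, _ := xrefKind(classic)
	k2, _ := xrefKind(stream)
	fmt.Println(k1, k2)
}
```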

Understanding document rotation

Hi, I am reading a number of documents using the library. Some of these are rotated 90 degrees, some are not. When I get the height and width of a page, I always get the unrotated measurements, regardless of whether the document is rotated. To explain this better, take an A4 page: if it is rotated, I would expect the width to be greater than the height, but the library always returns width < height. This causes problems later in my processing. Could this be implemented, or is there already a way to tell whether a document is rotated? Thank you.
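
If the page's /Rotate value can be obtained by some other means (how to read it through gofpdi is not shown here and is left as an assumption), the displayed size follows from the box size by swapping width and height for 90 and 270 degrees. `displaySize` is a hypothetical helper.

```go
package main

import "fmt"

// displaySize returns the width and height as a viewer would show them,
// given the unrotated box size and the page's /Rotate value in degrees.
// PDF requires /Rotate to be a multiple of 90; other values are left as is.
func displaySize(w, h float64, rotate int) (float64, float64) {
	// Normalize negative and large angles into the range [0, 360).
	r := ((rotate % 360) + 360) % 360
	if r == 90 || r == 270 {
		return h, w
	}
	return w, h
}

func main() {
	// A4 in points, rotated by 90 degrees.
	w, h := displaySize(595.28, 841.89, 90)
	fmt.Println(w, h)
}
```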

How can I import multiple pages at once?

Hi, I use gofpdi with gofpdf to import PDFs. My template PDF has 12 pages, so I have to call gofpdf.ImportPage for every page, which is too slow for me. How can I import multiple pages at once? Thanks.

panic: unexpected EOF with github.com/phpdave11/[email protected]

The following code:

package main

import (
	"github.com/phpdave11/gofpdf"
	"github.com/phpdave11/gofpdf/contrib/gofpdi"

	"io"
	"io/ioutil"
	"net/http"
	"os"
)

const PDF_WIDTH_IN_MM = 222.6264
const PDF_HEIGHT_IN_MM = 297.0000
const MM_TO_RMPOINTS = 2.83465

var url string = "http://www.campbell-lange.net/media/files/example.pdf"

func main() {

	resp, err := http.Get(url)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	tmpFile, err := ioutil.TempFile(os.TempDir(), "pdfexample-")
	if err != nil {
		panic(err)
	}
	defer os.Remove(tmpFile.Name())

	_, err = io.Copy(tmpFile, resp.Body)
	if err != nil {
		panic(err)
	}

	pdf := gofpdf.NewCustom(&gofpdf.InitType{
		UnitStr: "pt",
		Size: gofpdf.SizeType{
			Wd: PDF_WIDTH_IN_MM * MM_TO_RMPOINTS,
			Ht: PDF_HEIGHT_IN_MM * MM_TO_RMPOINTS},
	})

	pdf.AddPage()
	var importPage = 1

	bgpdf := gofpdi.ImportPage(pdf, tmpFile.Name(), importPage, "/MediaBox")
	gofpdi.UseImportedTemplate(pdf, bgpdf, 0, 0, 210*MM_TO_RMPOINTS, 297*MM_TO_RMPOINTS)

}

Fails with

panic: unexpected EOF

goroutine 1 [running]:
github.com/phpdave11/gofpdi.(*PdfReader).readXref(0xc0001a40a0, 0x0, 0x0)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/reader.go:976 +0x2670
github.com/phpdave11/gofpdi.(*PdfReader).read(0xc0001a40a0, 0xc00018e050, 0x40dd48)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/reader.go:1518 +0x50
github.com/phpdave11/gofpdi.(*PdfReader).init(0xc0001a40a0, 0xc0001a40a0, 0xc0001a20d0)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/reader.go:72 +0xc2
github.com/phpdave11/gofpdi.NewPdfReader(0xc0001760a0, 0x19, 0xc0001760a0, 0x19, 0xa23d00)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/reader.go:58 +0x116
github.com/phpdave11/gofpdi.(*Importer).SetSourceFile(0xc0000af400, 0xc0001760a0, 0x19)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/importer.go:67 +0x166
github.com/phpdave11/gofpdf/contrib/gofpdi.(*Importer).ImportPage(0xc0000a0560, 0x7fa600, 0xc00019a000, 0xc0001760a0, 0x19, 0x1, 0x77fad8, 0x9, 0x4083b88b1c22f64f)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/contrib/gofpdi/gofpdi.go:43 +0x46
github.com/phpdave11/gofpdf/contrib/gofpdi.ImportPage(...)
	/home/rory/go/pkg/mod/github.com/phpdave11/[email protected]/contrib/gofpdi/gofpdi.go:115
main.main()
	/home/rory/src/go-gofpdi-test/gofpdi-t.go:48 +0x37b

Apologies if I've done something stupid with my code.

The example PDF is in the url referenced in the snippet. It is a PDF compiled from a .tex file using pdflatex.
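
Until the underlying parse failure is fixed, a caller can at least stop a panic like the one above from taking down the whole process by converting it into an error. This is a minimal sketch using only the standard library. `importOrError` is a hypothetical helper, and the closure in `main` merely stands in for the real import call.

```go
package main

import (
	"errors"
	"fmt"
)

// importOrError runs fn and turns any panic raised inside it into an
// ordinary error. Hypothetical helper: fn is expected to wrap the
// import call that may panic.
func importOrError(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("pdf import failed: %v", r)
		}
	}()
	fn()
	return nil
}

func main() {
	err := importOrError(func() {
		// Stand-in for the failing import; the real call would go here.
		panic(errors.New("unexpected EOF"))
	})
	fmt.Println(err)
}
```

This does not repair the file or the parser. It only lets the program report the failure and carry on.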

Corrupted imported image

@phpdave11 this update is fantastic! It's working well with a lot of the PDFs my company imports. One is still having problems, though: the PDF imports successfully, but one of its pages is almost completely blank. On the second page, the whole page image appears to be squashed at the top. The file is attached. We had to blot out some personal information, so I hope you can still reproduce what we saw with the original.
7007327.pdf

Originally posted by @tylerzika in #16 (comment)

panic: Failed to initialize parser: Failed to read pdf: Failed to read xref table

I am getting this panic from gofpdi.ImportPage() for a blank PDF that was downloaded from elsewhere. If I create the blank PDF with gofpdf instead, it works fine.

code is here:

pdf := gofpdf.New("P", "pt", "A4", "")
// create a new Importer instance
imp := gofpdi.NewImporter()

w,h := pdf.GetPageSize()

fmt.Println("pageSize >>>", w, ">>>>>", h)

// import first page and determine page sizes
tpl := imp.ImportPage(pdf, "barcode.pdf", 1, "/MediaBox")
pageSizes := imp.GetPageSizes()
nrPages := len(imp.GetPageSizes())

// add all pages from template pdf
for i := 1; i <= nrPages; i++ {
	pdf.AddPageFormat("P", gofpdf.SizeType{Wd: pageSizes[i]["/MediaBox"]["w"] , Ht:pageSizes[i]["/MediaBox"]["h"]})
	w,h := pdf.GetPageSize()
	fmt.Println("pageSize >>>", w, ">>>>>", h)
	if i > 1 {
		tpl = imp.ImportPage(pdf, "barcode.pdf", i, "/MediaBox")
	}
	imp.UseImportedTemplate(pdf, tpl, 0, 0, pageSizes[i]["/MediaBox"]["w"], pageSizes[i]["/MediaBox"]["h"])
}

// output
err := pdf.OutputFileAndClose("generated-barcode.pdf")
if err != nil {
	fmt.Println(err)
}

Here is the error I get when importing the page into the PDF:

And here is the blank PDF:
barcode.pdf

This is the PDF I want to use with ImportPage().

@phpdave11 Please help me to resolve this thing.

Thanks!

Karmdip Joshi

Closing files

I've encountered a problem when using the gofpdf library: when importing pages, it reads the file but does not close it. I then found that the library can't close it, because it has no access to the file opened by this library. As far as I can see, files opened in NewPdfReader and NewPdfWriter cannot be closed from outside the library, so when the libraries use the reader and writer, the files remain open. There should be a public function on PdfReader and PdfWriter that simply closes the file. Something like this, maybe?

Did I misunderstand how something in the library works? Or am I right, and if so, should I make a PR with the methods shown in the screenshot, or is it simple enough for you to do a quick implementation? By the way, the method in the screenshot was not tested; I just wanted to see how it would be implemented.
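
A possible workaround in the meantime, assuming the stream-based import mentioned in other issues on this page (ImportPageFromStream) fits the use case: read the file into memory yourself, so no file handle outlives the read, and hand the library an in-memory io.ReadSeeker. The sketch below uses only the standard library. `loadPDF` is a hypothetical helper and the import call itself is left out.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// loadPDF reads the whole file into memory and returns a seekable
// in-memory reader. os.ReadFile opens, reads, and closes the file, so
// nothing stays open afterwards. Hypothetical helper, not part of gofpdi.
func loadPDF(path string) (io.ReadSeeker, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read %s: %w", path, err)
	}
	return bytes.NewReader(data), nil
}

func main() {
	// Write a small placeholder file so the example runs on its own.
	tmp, err := os.CreateTemp("", "example-*.pdf")
	if err != nil {
		panic(err)
	}
	defer os.Remove(tmp.Name())
	if _, err := tmp.WriteString("%PDF-1.4\n"); err != nil {
		panic(err)
	}
	if err := tmp.Close(); err != nil {
		panic(err)
	}

	rs, err := loadPDF(tmp.Name())
	if err != nil {
		panic(err)
	}
	// &rs could now be passed to a function expecting *io.ReadSeeker.
	size, _ := rs.Seek(0, io.SeekEnd)
	fmt.Println("bytes in memory:", size)
}
```

The trade-off is memory: the whole PDF is held in RAM for as long as the reader is in use.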

Failed to initialize parser: Failed to read pdf: Failed to read xref table: Expected xref to start with 'xref'. Got: 40

How can I import a PDF template?

My code:
pdf := gofpdf.New("P", "mm", "A4", "")
tpl1 := gofpdi.ImportPage(pdf, "contract-1.pdf", 1, "")
gofpdi.UseImportedTemplate(pdf, tpl1, 20, 50, 150, 0)

var buf bytes.Buffer
err := pdf.Output(&buf)
pdf.Close()

return err, buf

A weird problem: default pdf text language changed after importing pdf page

import (
	"github.com/phpdave11/gofpdf"
	"github.com/phpdave11/gofpdi"
)

var FontType string = "NotoSansSC-Regular.ttf"

func NewPdf() *gofpdf.Fpdf {
	pdf := gofpdf.New("P", "pt", "A4", "")
	// add Chinese font
	pdf.AddUTF8Font(FontType, "", "static/font/"+FontType)
	return pdf
}

var imp = gofpdi.NewImporter()

func ImportPdfPages(pdf *gofpdf.Fpdf, pdffile string) *gofpdf.Fpdf {
	imp.SetSourceFile(pdffile)
	pageSizes := imp.GetPageSizes()
	total := len(pageSizes)

	pdfReader := imp.GetReader()

	for i := 1; i <= total; i++ {
		rotation, _ := pdfReader.GetPageRotation(i)
		curWidth := pageSizes[i]["/MediaBox"]["w"]
		curHeight := pageSizes[i]["/MediaBox"]["h"]
		angle := rotation.Int % 360
		// Normalize angle
		tpl := getTemplateID(imp, pdf, i, "/MediaBox")
		if angle != 0 && (angle/90)%2 != 0 {
			pdf.AddPageFormat(gofpdf.OrientationLandscape, gofpdf.SizeType{Wd: curWidth, Ht: curHeight})
			pdf.UseImportedTemplate(imp.UseTemplate(tpl, 0, 0, curHeight, curWidth))
		} else {
			pdf.AddPage()
			pdf.UseImportedTemplate(imp.UseTemplate(tpl, 0, 0, curWidth, curHeight))
		}
	}
	return pdf
}

// github.com/phpdave11/gofpdi
func getTemplateID(i *gofpdi.Importer, f *gofpdf.Fpdf, pageno int, box string) int {
	tp := i.ImportPage(pageno, box)
	tplObjIDs := i.PutFormXobjectsUnordered()
	f.ImportTemplates(tplObjIDs)
	imported := i.GetImportedObjectsUnordered()
	f.ImportObjects(imported)
	importedObjPos := i.GetImportedObjHashPos()
	f.ImportObjPos(importedObjPos)
	return tp
}

Tested on Windows 7 and CentOS 7.5 with Go 1.16.
The PDF text is in Chinese before importing (sample PDF, page 2).

It changes to English after importing.

I don't know the reason.

The sample PDF is here:

BODYL-P1.pdf

First page gets duplicated when using ImportPageFromStream

Hi! I'm trying to import multiple pages from a stream into a new PDF object. When using the following code:

tp := gofpdi.ImportPageFromStream(pdf, &rs, 1, "/MediaBox")
pdf.AddPage()
gofpdi.UseImportedTemplate(pdf, tp, 0, 0, paperWidth, 0)

I get the first page of the PDF perfectly fine. But when using a loop, or when adding a second page the same way as above (duplicating the code and changing the second import call to page 2), the first page gets replaced by the second page, so the first two pages of the output are both the second page of the document I am importing from. The same code works perfectly with ImportPage, so I'm not sure why it behaves this way when importing from a stream instead.

This code duplicates the first page, all other pages are fine:

for i := 1; i <= 13; i++ {
	tp := gofpdi.ImportPageFromStream(pdf, &rs, i, "/MediaBox")
	pdf.AddPage()
	gofpdi.UseImportedTemplate(pdf, tp, 0, 0, paperWidth, 0)
}

Can't open template PDF file on Google Cloud Functions

Hi, thank you for creating this great module. It is very helpful for me.
I'm a Go beginner, and I'm trying to implement a PDF-creation function on Google Cloud Functions.
My code runs locally (Go 1.12), but it doesn't run on Cloud Functions (Go 1.11).
I know gofpdi requires Go 1.12 (in go.mod), but does gofpdi really need 1.12?

The error message is below:

textPayload:  "Function panic: Failed to read pdf: Failed to read xref table: Expected xref to start with 'xref'

github.com/phpdave11/gofpdi.(*Importer).SetSourceFile(0xc00002a540, 0x7e3706, 0xc)
	/builder/home/go/pkg/mod/github.com/phpdave11/[email protected]/importer.go:63 +0x212

Thank you.

zlib.NewReader(bytes.NewBuffer(stream)) can return an error and a nil value

I have a pdf where the encoding does not seem to be supported, but
I think you should check the error here, otherwise I get a runtime error.

runtime error: invalid memory address or nil pointer dereference

gofpdi/reader.go

Line 509 in cf771f6

zlibReader, _ := zlib.NewReader(bytes.NewBuffer(compressedObj.Stream.Bytes))
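
For illustration, here is the general shape of a checked call, sketched with the standard library only. `inflate` and `deflate` are hypothetical helper names and not code from reader.go. The point is only that the error from zlib.NewReader is returned instead of being discarded, so a bad stream surfaces as an error rather than a nil pointer dereference.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

// inflate decompresses a zlib stream and reports a malformed header or
// body as an error instead of continuing with a nil reader.
func inflate(stream []byte) ([]byte, error) {
	zr, err := zlib.NewReader(bytes.NewReader(stream))
	if err != nil {
		return nil, fmt.Errorf("open zlib stream: %w", err)
	}
	defer zr.Close()

	out, err := io.ReadAll(zr)
	if err != nil {
		return nil, fmt.Errorf("inflate zlib stream: %w", err)
	}
	return out, nil
}

// deflate compresses data; it exists only so the example has valid input.
func deflate(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := zlib.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Invalid input yields an error, not a panic.
	_, err := inflate([]byte("not zlib data"))
	fmt.Println("invalid input:", err)

	packed, err := deflate([]byte("BT /F1 12 Tf ET"))
	if err != nil {
		panic(err)
	}
	out, err := inflate(packed)
	fmt.Println(string(out), err)
}
```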

Problem import page

Hello, I'm having problems with ImportPage.

The code:

documents := [...]string{"146654.pdf", "141368.pdf", "146654.pdf"}
pdf := gofpdf.New("P", "mm", "A4", "")

for _, document := range documents {
	pdf.AddPage()
	gofpdi.UseImportedTemplate(pdf, gofpdi.ImportPage(pdf, document, 1, "/MediaBox"), 0, 0, 210, 297)
}

err = pdf.OutputFileAndClose("example.pdf")
if err != nil {
	panic(err)
}

The result is a 3-page PDF, but each page is the first page of document "146654.pdf".

Sorry for my English

gofpdi fails to correctly parse streams on some pdfs

When reading some PDFs (seen this typically when importing scanned-in PDFs), gofpdi will fail to detect 'endstream', panicking with panic: Failed to get content: Failed to get page content: Failed to resolve object: Expected next token to be: endstream, got: dstream.

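When reading a PDF stream, the reader should start reading the stream data immediately after the first end-of-line sequence (CRLF) that follows the stream keyword. Instead, it skips all leading whitespace, which can consume bytes that belong to the stream and result in reading past the 'endstream' token.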
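
To make the difference concrete, here is a small standalone sketch, not the code in reader.go. The PDF specification requires the `stream` keyword to be followed by either CRLF or a single LF, so only that one end-of-line marker is skipped. `streamDataStart` is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"fmt"
)

// streamDataStart returns the index of the first byte of stream data,
// given the index just past the "stream" keyword. It skips exactly one
// end-of-line marker (CRLF or LF) and nothing else, so whitespace bytes
// that belong to the data are preserved. It returns -1 when the index
// is out of range.
func streamDataStart(buf []byte, afterKeyword int) int {
	if afterKeyword < 0 || afterKeyword > len(buf) {
		return -1
	}
	rest := buf[afterKeyword:]
	switch {
	case bytes.HasPrefix(rest, []byte("\r\n")):
		return afterKeyword + 2
	case bytes.HasPrefix(rest, []byte("\n")):
		return afterKeyword + 1
	}
	return afterKeyword
}

func main() {
	// The data itself begins with "\n " here; skipping all whitespace
	// would drop those two bytes and shift every later offset.
	buf := []byte("stream\r\n\n data")
	start := streamDataStart(buf, len("stream"))
	fmt.Printf("data starts at %d: %q\n", start, buf[start:])
}
```

With a /Length of N, the data is then `buf[start : start+N]`, and `endstream` is expected after it.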

Here's a test PDF that shows the described behaviour.
BRW2C6FC94B5488_000827.pdf

Adobe Acrobat error 110 when opening file using UseImportedTemplate

When generating a PDF file that uses imported pages I get the following error in Adobe Acrobat:

There was an error processing a page. There was a problem reading this document (110).

The file opens and displays correctly in Chrome, Brave, Safari and iOS Viewer.

If I comment out the following lines it does open in Adobe:

	page := gen.doc.ImportPage(gen.cacheFile(importData.Name), importData.Page, "/MediaBox")
	gen.doc.UseImportedTemplate(page, 0, 0, WIDTH, HEIGHT)

Also I checked, the source file does open in Acrobat as well.

Problem importing one template PDF and saving it thousands of times

Hey - first off, just want to say I love this library, but I've been having an issue with importing the same PDF over and over again.

I have a static PDF file - "my_template_pdf.pdf". I want to import one (or both) pages from it multiple times, create a new PDF, add some text to it, and save the new PDF with a new name (basically, filling out forms).

I was just wondering if there's a way to do it in this library without creating a new importer for every output file? I'm interested in increasing performance.

Ideally I could just import the page(s) I need once, and then refer to those page(s) as I need them, but the only way I can get it to work is to create a new importer for every output file that I want to create.

func do_the_same_pdf_over_and_over(this_wg *sync.WaitGroup, i int) {
	defer this_wg.Done()
	this_output_form := gofpdf.New("L", "pt", "Letter", "")
	this_output_form.AddPage()

	// A new importer is created for every output file.
	this_importer := gofpdi.NewImporter()

	THIS_TEMPLATE_PDF := this_importer.ImportPage(this_output_form, "my_template_pdf.pdf", 1, "/MediaBox")
	this_importer.UseImportedTemplate(this_output_form, THIS_TEMPLATE_PDF, 0, 0, 792, 612)

	this_output_form.SetFont("Helvetica", "", 20)
	to_print := fmt.Sprintf("%v %v %v %v", "test print", "test", "person", i)
	this_output_form.Text(50, 50, to_print)
	this_output_path := fmt.Sprintf("test files/pdf output/output %v.pdf", to_print)
	err := this_output_form.OutputFileAndClose(this_output_path)
	if err != nil {
		fmt.Println(err)
	}
}
Basically I have to make a NewImporter() for every output file that I want, even if I'm using the same input template for 1000 output files.

If I don't make a new importer every time, then the file size is optimal for the first form, but every form after that the forms get larger and larger which is obviously bad for performance / memory. I can go into Adobe and do "Optimize PDF" and it removes a huge portion of "Document Overhead" and fixes the file size to be more or less optimal. So I think the library is saving some document overhead multiple times when I don't reinitialize the importer?

Is it by design that I should use a new importer for every output file that I want, even if I'm reusing the same template 1000 times? Am I just using the package incorrectly?

Add ability to import PDFs into new encrypted PDF document

This is an interesting problem. Protection involves encrypting the data stream as it is written to the internal buffer. @phpdave11, do you have any insights into what it would take to apply the encryption to embedded PDF templates? This problem may involve templates in general, not just those that contain embedded PDF pages. If a solution is too complex to handle at this point, we may want to document the limitation and set an error internally if an incompatible embedding method is called.

Originally posted by @jung-kurt in jung-kurt/gofpdf#265 (comment)

Add support for importing PDF streams

As far as I can tell (correct me if I am wrong, @phpdave11), you will need to write the downloaded PDF to a file first, and then convert it to a template with a call like

tp := gofpdi.ImportPage(pdf, "rxImage.pdf", 1, "/MediaBox")

@phpdave11, a new import function that uses an io.Reader would be a welcome enhancement to the gofpdi package.

Originally posted by @jung-kurt in jung-kurt/gofpdf#258 (comment)

Why is "*io.ReadSeeker" used instead of "io.ReadSeeker" in ImportPageFromStream?

Hello!

Why, in principle, use a pointer to an interface instead of just the interface? In Go this is considered an anti-pattern, no?
Because of this, the following simple code does not compile (nor does using bytes.NewReader as in the example):

package main

import (
	"os"

	"github.com/jung-kurt/gofpdf"
	"github.com/jung-kurt/gofpdf/contrib/gofpdi"
)

func main() {
	pdf := gofpdf.New("P", "mm", "A4", "")

	tmpl, _ := os.Open("/tmp/example.pdf")
	defer tmpl.Close()

	gofpdi.ImportPageFromStream(pdf, tmpl, 1, "/TrimBox")
}

cannot use tmpl (type *os.File) as type *io.ReadSeeker in argument

Please point me to my mistake or misunderstanding. Thanks!
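
For anyone hitting the same compile error: a concrete value such as *os.File or *bytes.Reader has to be stored in a variable of the interface type first, and the address of that variable is what gets passed. The sketch below shows the pattern with the standard library only. `readMagic` is a hypothetical stand-in for a function that takes *io.ReadSeeker the way ImportPageFromStream does, and `newSeeker` exists only for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// readMagic is a hypothetical stand-in for an API that takes a pointer
// to an io.ReadSeeker. It returns the first five bytes of the stream.
func readMagic(rs *io.ReadSeeker) (string, error) {
	if _, err := (*rs).Seek(0, io.SeekStart); err != nil {
		return "", err
	}
	buf := make([]byte, 5)
	if _, err := io.ReadFull(*rs, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

// newSeeker wraps a byte slice in a value of the interface type.
func newSeeker(data []byte) io.ReadSeeker {
	return bytes.NewReader(data)
}

func main() {
	src := bytes.NewReader([]byte("%PDF-1.7\n"))

	// readMagic(src) and readMagic(&src) do not compile: neither
	// *bytes.Reader nor **bytes.Reader is a *io.ReadSeeker.
	var rs io.ReadSeeker = src // store the value in an interface variable
	magic, err := readMagic(&rs)
	fmt.Println(magic, err)
}
```

In the snippet above, that would presumably mean declaring `var rs io.ReadSeeker = tmpl` and passing `&rs` as the stream argument.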

Not able to compare pdf byte by byte

Steps to reproduce:

1. I generated pdf1, which imports a template PDF, and stored it as the reference.
2. I generated pdf2 the same way, by importing the template.
3. I used referenceCompare() from the gofpdf example package, which compares two PDFs byte by byte.
4. There are a lot of differences between the two files.

How should I compare the files for my testing?
Note: I generated some simple PDFs with gofpdf that did not involve importing templates, and I was able to compare them successfully.

panic: did not set root object

I get a different error, panic: did not set root object, with http://www.campbell-lange.net/media/files/example2.pdf, which was made by a Google Docs PDF conversion.

It's possible my example code isn't useful. Let me know if so.

Originally posted by @rorycl in #23 (comment)
