wootski / peepdf

Automatically exported from code.google.com/p/peepdf

License: GNU General Public License v3.0

Python 100.00%

peepdf's Issues

Additional cross-reference entry when creating a new PDF file

What steps will reproduce the problem?

1. Run "./peepdf.py -i"
2. Run "create pdf" in peepdf console
3. Run "save 'test.pdf'" in peepdf console


What is the expected output? What do you see instead?

Below are the cross-reference table and trailer of test.pdf. The subsection
header (0 4) and /Size both declare 4 entries, but the table contains 5. The
last entry is useless: it does not point to any object.

xref
0 4
0000000000 65535 f 
0000000009 00000 n 
0000000059 00000 n 
0000000118 00000 n 
0000000119 00000 n 
trailer
<< /Size 4
/Root 1 0 R >>
startxref
210
%%EOF
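
A minimal sketch of how the mismatch above could be detected, written for Python 3. The function name `check_xref_subsections` and the sample string are made up for illustration and are not part of peepdf; the parser only handles the classic plain-text xref layout shown in this report.

```python
import re

def check_xref_subsections(xref_text):
    """Return a list of (first_object, declared_count, actual_count) tuples."""
    results = []
    header = None   # (first_object, declared_count) of the current subsection
    count = 0       # entries seen so far in the current subsection
    for raw in xref_text.splitlines():
        line = raw.strip()
        if not line or line == "xref":
            continue
        if line == "trailer":
            break
        match = re.fullmatch(r"(\d+) (\d+)", line)
        if match:
            # A new subsection header closes the previous subsection.
            if header is not None:
                results.append((header[0], header[1], count))
            header = (int(match.group(1)), int(match.group(2)))
            count = 0
        elif re.fullmatch(r"\d{10} \d{5} [fn]", line):
            count += 1
    if header is not None:
        results.append((header[0], header[1], count))
    return results

# The table from this report: declared 4 entries, 5 present.
SAMPLE = """xref
0 4
0000000000 65535 f
0000000009 00000 n
0000000059 00000 n
0000000118 00000 n
0000000119 00000 n
trailer
<< /Size 4
/Root 1 0 R >>
"""

for first, declared, actual in check_xref_subsections(SAMPLE):
    if declared != actual:
        print("subsection %d: declared %d entries, found %d" % (first, declared, actual))
```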


What version of the product are you using? On what operating system?

The version of peepdf is r42. The operating system is ubuntu-11.10 x86_64.


Please provide any additional information below.

Original issue reported on code.google.com by czchen on 24 Oct 2011 at 11:50

Enhancement: 'dumpstream' command

I needed to dump streams directly to file, e.g. extracting fonts from a PDF.

Attached is a patch which duplicates the 'stream' command, but accepts a 
filename to output to rather than the console.

Original issue reported on code.google.com by [email protected] on 11 Nov 2012 at 5:07

Attachments:

dumpstream.patch

UnboundLocalError: local variable 'ret' referenced before assignment in PDFFilters.py

When processing certain files, peepdf crashes with the following error:

UnboundLocalError: local variable 'ret' referenced before assignment

The bug lies in the PDFFilters.py file in the decodeStream() function, line 92:

{{{
    Traceback (most recent call last):
      File "my_script.py", line 45, in <module>
        ret, pdf = PDFCore.PDFParser().parse(filepath, True, True)
      File "/home/travesti/peepdf_0.2/PDFCore.py", line 6727, in parse
        ret = body.updateObjects()
      File "/home/travesti/peepdf_0.2/PDFCore.py", line 4126, in updateObjects
        object.resolveReferences()
      File "/home/travesti/peepdf_0.2/PDFCore.py", line 2470, in resolveReferences
        ret = self.decode()
      File "/home/travesti/peepdf_0.2/PDFCore.py", line 2001, in decode
        ret = decodeStream(self.encodedStream, self.filter.getValue(), self.filterParams)
      File "/home/travesti/peepdf_0.2/PDFFilters.py", line 92, in decodeStream
        return ret
    UnboundLocalError: local variable 'ret' referenced before assignment
}}}

The exception is raised because the "ret" variable is never assigned before
the return statement in decodeStream(). If none of the conditions is true,
"ret" never gets a value, the function's return statement is reached, and
Python raises the UnboundLocalError exception.

I patched the function by adding the following line at the beginning of
decodeStream():

{{{
    ret = (-1, "")
}}}

But it keeps raising errors in other modules :(
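
For illustration, a minimal Python 3 sketch of the pattern behind that one-line patch. `decode_stream` is a hypothetical stand-in, not the real decodeStream() from PDFFilters.py; it only handles one filter so the fall-through case is easy to see.

```python
import zlib

def decode_stream(data, filter_name):
    """Hypothetical stand-in for decodeStream(); returns (status, payload)."""
    # Assign a default first, so every path returns a tuple even when
    # no branch below recognises the filter name.
    ret = (-1, "Unsupported filter: %s" % filter_name)
    if filter_name in ("/FlateDecode", "/Fl"):
        try:
            ret = (0, zlib.decompress(data))
        except zlib.error as exc:
            ret = (-1, "Error decompressing data: %s" % exc)
    return ret
```

With the default in place, callers still have to check the status code; returning an error tuple only moves the failure to wherever the payload is used next, which may be why errors keep showing up in other modules.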

Original issue reported on code.google.com by [email protected] on 8 Mar 2014 at 3:11

invalid dictOwnerPass prevents further processing

What steps will reproduce the problem?
1. https://www.virustotal.com/en/file/784d1ebd1faccec27f98970cc266859eaf5676da1c451e3304fb55435d8c8473/analysis/
2. run peepdf.py -f vtfile


What is the expected output? What do you see instead?

#Expected:

Warning: PyV8 is not installed!!
Warning: pylibemu is not installed!!
Decryption error: Bad format for /O!!
Decryption error: Bad format for /U!!
Decryption error: Default user password not working here!!

File: tp_22340_utf8_88292d7181514fda5390292d73da28d4
MD5: 88292d7181514fda5390292d73da28d4
SHA1: fbc3856fd689e1ac0f8fb56bbd7d0a2b8332a928
Size: 807079 bytes
Version: 1.4
Binary: True
Linearized: False
Encrypted: True (RC4 40 bits)
Updates: 0
Objects: 7
Streams: 1
Comments: 0
Errors: 5

Version 0:
    Catalog: 1
    Info: No
    Objects (7): [1, 2, 3, 4, 5, 8, 9]
        Errors (1): [5]
    Streams (1): [5]
        Encoded (1): [5]
        Decoding errors (1): [5]
    Suspicious elements:
        /AcroForm: [1]
        /OpenAction: [1]
        /JS: [1]
        /JavaScript: [1]

#Instead see:

Traceback (most recent call last):
  File "peepdf.py", line 352, in <module>
    ret,pdf = pdfParser.parse(fileName, options.isForceMode, options.isLooseMode, options.isManualAnalysis)
  File "/Users/tross/Code/satori/peepdf_service/peepdf-svn/PDFCore.py", line 6822, in parse
    ret = pdfFile.decrypt()
  File "/Users/tross/Code/satori/peepdf_service/peepdf-svn/PDFCore.py", line 5179, in decrypt
    ret = computeUserPass(password, dictO, fileId, perm, keyLength, revision, encryptMetadata)
  File "/Users/tross/Code/satori/peepdf_service/peepdf-svn/PDFCrypto.py", line 164, in computeUserPass
    ret = computeEncryptionKey(userPassString, dictO, dictU, dictOE, dictUE, fileID, pElement, keyLength, revision, encryptMetadata)
  File "/Users/tross/Code/satori/peepdf_service/peepdf-svn/PDFCrypto.py", line 58, in computeEncryptionKey
    md5input = password + dictOwnerPass + struct.pack('<I',abs(int(pElement))) + fileID
TypeError: cannot concatenate 'str' and 'instance' objects

What version of the product are you using? On what operating system?
latest version from svn, any os

Please provide any additional information below.
When running in force mode on a file with errors, the dictO/dictOwnerPass
object does not resolve to a simple string, so the concatenation fails and
processing stops.
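
A rough sketch of the kind of guard that would avoid the TypeError, written for Python 3. The helper name `to_plain_string` is hypothetical and this is not necessarily what the attached patch does; the only peepdf detail it relies on is that objects expose a getValue() method, as seen in the tracebacks on this page.

```python
def to_plain_string(value, name):
    """Return value as str/bytes, or raise a descriptive ValueError."""
    if isinstance(value, (str, bytes)):
        return value
    # Assumption: wrapped objects expose getValue(), as in the tracebacks.
    getter = getattr(value, "getValue", None)
    if callable(getter):
        inner = getter()
        if isinstance(inner, (str, bytes)):
            return inner
    raise ValueError("Bad format for %s: expected a string, got %s"
                     % (name, type(value).__name__))
```

Calling something like this on /O and /U before the key computation would turn the crash into a reportable decryption error.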

Original issue reported on code.google.com by [email protected] on 5 Sep 2013 at 3:35

Attachments:

PDFCore.py_patch

Exception when opening RC4 encrypted PDF

peepdf raises an exception when opening the attached sample.pdf because it
does not handle the P key of the standard encryption dictionary properly. The
attached rc4.patch fixes this problem.
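
The report does not say what exactly goes wrong with /P, so the following is only a sketch of one common pitfall, not a description of rc4.patch. /P is a signed 32-bit integer and is usually negative, while the key computation needs it as a 32-bit unsigned value, low-order byte first. The function name `pack_permissions` is made up for this example.

```python
import struct

def pack_permissions(p_value):
    """Pack the /P entry as 4 little-endian bytes for the key computation."""
    # Masking gives the two's-complement bit pattern of a negative /P;
    # packing a negative number with '<I' directly raises struct.error.
    return struct.pack("<I", int(p_value) & 0xFFFFFFFF)

print(pack_permissions(-44))
```

Note that taking abs() of the value, as in the traceback from the dictOwnerPass issue on this page, yields a different bit pattern than the two's-complement form.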

Original issue reported on code.google.com by czchen on 21 Oct 2011 at 1:10

Attachments:

Patch: Permit in-memory scanning (from a variable)

We have an automated malware analysis system that runs a variety of scans in 
memory on input files.  We patched PDFCore.py to enable string input of file 
contents, rather than a filename.  It is attached, in case anyone finds it 
useful.

Original issue reported on code.google.com by [email protected] on 22 Mar 2012 at 2:48

Attachments:

peepdf-inmemory.patch

error during analysis pdf

This is the error.log:

Traceback (most recent call last):
  File "./peepdf.py", line 541, in <module>
    console.cmdloop()
  File "/usr/lib/python2.7/cmd.py", line 142, in cmdloop
    stop = self.onecmd(line)
  File "/usr/lib/python2.7/cmd.py", line 219, in onecmd
    return func(arg)
  File "/usr/local/peepdf/PDFConsole.py", line 2721, in do_open
    ret = pdfParser.parse(fileName, forceMode, looseMode)
  File "/usr/local/peepdf/PDFCore.py", line 6838, in parse
    sys.exit('Error: An error has occurred while parsing an indirect object!!')
SystemExit: Error: An error has occurred while parsing an indirect object!!
Traceback (most recent call last):
  File "./peepdf.py", line 541, in <module>
    console.cmdloop()
  File "/usr/lib/python2.7/cmd.py", line 142, in cmdloop
    stop = self.onecmd(line)
  File "/usr/lib/python2.7/cmd.py", line 219, in onecmd
    return func(arg)
  File "/usr/local/peepdf/PDFConsole.py", line 2721, in do_open
    ret = pdfParser.parse(fileName, forceMode, looseMode)
  File "/usr/local/peepdf/PDFCore.py", line 6838, in parse
    sys.exit('Error: An error has occurred while parsing an indirect object!!')
SystemExit: Error: An error has occurred while parsing an indirect object!!


Do you need any other info?

Thanks a lot.

Original issue reported on code.google.com by [email protected] on 23 Jun 2014 at 3:26

JSAnalysis.py always requires PyV8

What steps will reproduce the problem?
1. Don't install PyV8
2. Try to run peepdf.py on any PDF with JavaScript

What is the expected output? What do you see instead?
peepdf should load even when PyV8 is not installed.

Instead, it fails with this traceback:

Traceback (most recent call last):
  File "peepdf.py", line 32, in <module>
    from PDFCore import PDFParser, vulnsDict
  File "/Users/tross/Code/satori/peepdf_service/peepdf-svn/PDFCore.py", line 31, in <module>
    from JSAnalysis import *
  File "/Users/tross/Code/satori/peepdf_service/peepdf-svn/JSAnalysis.py", line 36, in <module>
    class Global(PyV8.JSClass):

NameError: name 'PyV8' is not defined

What version of the product are you using? On what operating system?
any


Please provide any additional information below.
Placing the Global class inside the try block fixes it, though there is
probably a better fix:
try:
    import PyV8
    JS_MODULE = True
    class Global(PyV8.JSClass):
        evalCode = ''

        def evalOverride(self, expression):
            self.evalCode += '\n\n// New evaluated code\n' + expression
            return
except ImportError:
    # PyV8 is optional; without it, run with JS emulation disabled
    JS_MODULE = False

Original issue reported on code.google.com by [email protected] on 5 Sep 2013 at 3:18

metadata command crashed peepdf

What steps will reproduce the problem?
1. running metadata in the console on a malformed PDF


What is the expected output? What do you see instead?
The program crashed with:

Traceback (most recent call last):
  File "/home/.../bin/peepdf.py", line 465, in <module>
    console.cmdloop(stats + newLine)
  File "/usr/lib64/python2.6/cmd.py", line 142, in cmdloop
    stop = self.onecmd(line)
  File "/usr/lib64/python2.6/cmd.py", line 219, in onecmd
    return func(arg)
  File "/home/.../src/svn/sec/peepdf-read-only/PDFConsole.py", line 2290, in do_metadata
    type = object.getElementByName('/Type').getValue()
AttributeError: 'list' object has no attribute 'getValue'


What version of the product are you using? On what operating system?
r158 from svn

Please provide any additional information below.

I don't know if the patch is the right long-term solution, but it solved my 
crash.

Maybe every interactive command should be in a try/except block, so the program 
does not crash on the user?
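
A minimal sketch of that idea, assuming the console is a subclass of Python's standard cmd.Cmd (the tracebacks on this page go through cmd.py). Rather than wrapping each do_* method, it overrides onecmd() once. `SafeConsole` and the failing do_metadata body are made up for the demo; this is Python 3.

```python
import cmd
import io

class SafeConsole(cmd.Cmd):
    """Console that reports command errors instead of crashing."""

    def onecmd(self, line):
        try:
            return cmd.Cmd.onecmd(self, line)
        except Exception as exc:
            # Report the failure and keep the interactive loop alive.
            self.stdout.write("*** Error: %s: %s\n" % (type(exc).__name__, exc))
            return False

    def do_metadata(self, arg):
        # Simulates the crash from this report.
        raise AttributeError("'list' object has no attribute 'getValue'")

# Demo: the command fails, but the console does not exit.
out = io.StringIO()
console = SafeConsole(stdout=out)
stop = console.onecmd("metadata")
print(out.getvalue(), end="")
```

Catching Exception rather than using a bare except leaves SystemExit and KeyboardInterrupt alone, so quitting the console still works.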

Original issue reported on code.google.com by [email protected] on 30 Nov 2012 at 3:27

Attachments:

peepdf-metadata-crash.patch

Enhancement: add ASCII85Decode filter

Add the ASCII85Decode filter to peepdf, using the decoder
from pdfminer.
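
For comparison, a small sketch that does not use the pdfminer decoder from the patch: on Python 3.4 and later the standard library's base64.a85decode can do the decoding. The wrapper name `ascii85_decode` is made up; it strips the `~>` end marker (and an optional `<~` prefix) by hand before decoding.

```python
import base64

def ascii85_decode(data):
    """Decode an ASCII85Decode stream body given as bytes."""
    data = data.strip()
    if data.startswith(b"<~"):
        data = data[2:]
    if data.endswith(b"~>"):
        data = data[:-2]
    # a85decode skips embedded whitespace and expands the 'z' shortcut.
    return base64.a85decode(data)

encoded = base64.a85encode(b"hello world") + b"~>"
print(ascii85_decode(encoded))
```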

Original issue reported on code.google.com by [email protected] on 30 Nov 2012 at 2:49

Attachments:

peepdf-ascii85decode.patch

PDFCore.py's search for elements/actions/events needs a space

What steps will reproduce the problem?
1. Have a PDF containing /AAPL:Keywords; it gets flagged as /AA because of the
substring match on line 43 of PDFCore.py. Adding a space after each of the
items on lines 43-45 (e.g. '/AA ') still produces hits for legitimate
Additional Actions but avoids false positives where another name merely
contains _part_ of the string being matched.

What is the expected output? What do you see instead?
Expected to flag only on the correct Event/Action/Element names but instead you 
may receive false hits.
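
A sketch of a slightly stricter variant of the same idea, in Python 3. Instead of requiring a trailing space, it requires the name to be followed by any character that ends a PDF name token (whitespace or a delimiter), so `/AA<<...>>` without a space would still match. The name list and `find_names` are illustrative only and are not the actual lists from PDFCore.py.

```python
import re

# Illustrative subset; the real lists live in PDFCore.py.
SUSPICIOUS_NAMES = ["/AA", "/OpenAction", "/JS", "/JavaScript"]

# A PDF name ends at whitespace, at a delimiter, or at end of input.
_NAME_END = r"(?=[\x00\t\n\x0c\r ()<>\[\]{}/%]|$)"

def find_names(content, names=SUSPICIOUS_NAMES):
    """Return the names that occur in content as complete name tokens."""
    found = []
    for name in names:
        if re.search(re.escape(name) + _NAME_END, content):
            found.append(name)
    return found

print(find_names("<< /AAPL:Keywords [(x)] >>"))
print(find_names("<< /AA<< /O 5 0 R >> >>"))
```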

What version of the product are you using? On what operating system?
Version included in REMnux - checked the latest trunk version and it should 
still be the same.

Please provide any additional information below.
pdfxray_lite also has this issue since it uses peepdf on the back end. However,
since it uses its own copy of PDFCore.py, its owner will be contacted
separately if this issue is accepted, as it will also need this small change.

Original issue reported on code.google.com by [email protected] on 11 Jun 2012 at 10:56

PNG prediction decode only decodes part of the image

When using PDFs containing PNG images with prediction > 10, the current 
implementation only decodes part of the image (1/3 of each row of the image).

Luckily, I already found the problem and I will attach a patch with a possible 
solution :)
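
The contents of the patch are not shown here, so the following is only a sketch of PNG row un-filtering as described in the PNG specification, written for Python 3. Decoding a third of each row would be consistent with computing the row length from /Columns alone for 3-component images, but that is a guess; the report does not confirm it. The function names and the default parameter values are assumptions of this sketch.

```python
def paeth(a, b, c):
    """Paeth predictor: pick whichever of left, up, upper-left is closest."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c

def png_unpredict(data, columns, colors=1, bits_per_component=8):
    """Undo PNG prediction; each row is prefixed by one filter-type byte."""
    bpp = max(1, (colors * bits_per_component) // 8)        # bytes per pixel
    row_len = (columns * colors * bits_per_component + 7) // 8
    out = bytearray()
    prev = bytearray(row_len)                               # row above, zeros at first
    for start in range(0, len(data), row_len + 1):
        ftype = data[start]
        row = bytearray(data[start + 1:start + 1 + row_len])
        for i in range(len(row)):
            a = row[i - bpp] if i >= bpp else 0             # left
            b = prev[i]                                     # up
            c = prev[i - bpp] if i >= bpp else 0            # upper-left
            if ftype == 0:
                pred = 0
            elif ftype == 1:
                pred = a
            elif ftype == 2:
                pred = b
            elif ftype == 3:
                pred = (a + b) // 2
            elif ftype == 4:
                pred = paeth(a, b, c)
            else:
                raise ValueError("Unknown PNG filter type: %d" % ftype)
            row[i] = (row[i] + pred) & 0xFF
        out.extend(row)
        prev = row
    return bytes(out)

# Two RGB pixels per row: row 1 uses Sub (1), row 2 uses Up (2).
encoded = bytes([1, 10, 20, 30, 1, 2, 3,
                 2, 2, 2, 2, 0, 0, 0])
decoded = png_unpredict(encoded, columns=2, colors=3)
```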

Original issue reported on code.google.com by [email protected] on 17 Sep 2013 at 9:55

Attachments:

png_prediction.patch

Add a jjdecoder function

CVE-2013-3346 pdf samples have obfuscated Javascript code using jjencode 
(http://utf-8.jp/public/jjencode.html). It would be nice to have a jjdecoder in 
peepdf to quickly deobfuscate the code.

Sample jjdecoder written in Javascript can be found here: 
http://csc.cs.utm.my/syed/images/files/jjdecode/jjdecode.html

Some explanation about how a jjdecoder works can be found here: 
http://corkami.googlecode.com/svn-history/r399/trunk/misc/jjencode.txt

Original issue reported on code.google.com by [email protected] on 12 Dec 2013 at 12:28

XFA JS decapsulation invalid for invalid xref table

What steps will reproduce the problem?
1. Get this specially forged PDF:
https://www.virustotal.com/en-gb/file/be9c0025b99f0f8c55f448ba619ba303fc65eba862cac65a00ea83d480e5efec/analysis/
2. run peepdf -fi filename
3. run js_analysis object 6

What is the expected output? What do you see instead?

The JS code should run in PyV8.

Because the code is wrapped in opening and closing XFA tags, JS emulation fails:

*** Error analysing Javascript: SyntaxError: Unexpected token < (  @ 1 : 0 )  
-> <? xml version = "1.0"
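
A rough sketch of one way to unwrap the JavaScript before handing it to the JS engine, assuming the object content is well-formed XML (which may not hold for a deliberately malformed sample like this one). It uses Python 3's standard xml.etree.ElementTree; `extract_xfa_scripts`, the sample document, and its namespace URI are made up for illustration.

```python
import xml.etree.ElementTree as ET

def extract_xfa_scripts(xfa_xml):
    """Return the text of every <script> element, ignoring namespaces."""
    root = ET.fromstring(xfa_xml)
    scripts = []
    for elem in root.iter():
        if not isinstance(elem.tag, str):
            continue  # skip anything that is not a regular element
        # Namespaced tags look like "{uri}script"; keep the local part.
        local = elem.tag.rsplit("}", 1)[-1]
        if local == "script" and elem.text and elem.text.strip():
            scripts.append(elem.text.strip())
    return scripts

# Hypothetical sample; the namespace URI is a placeholder.
SAMPLE = (
    '<?xml version="1.0"?>'
    '<template xmlns="urn:example:xfa-template">'
    '<subform><event activity="initialize">'
    '<script contentType="application/x-javascript">var a = 1;</script>'
    '</event></subform></template>'
)

print(extract_xfa_scripts(SAMPLE))
```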


What version of the product are you using? On what operating system?


Version: peepdf 0.2 r203
Ubuntu 12.10

Please provide any additional information below.

Original issue reported on code.google.com by [email protected] on 18 Oct 2013 at 2:53

Problem with Filter LZW

What steps will reproduce the problem?

1. ./peepdf -i
2. create pdf
3. embed file
4. filters 4 lzw
5. save test.pdf
6. exit
7. ./peepdf -i test.pdf
8. peepdf shows decode error in object 4


What is the expected output? What do you see instead?

peepdf should encode and decode the LZW filter successfully.


What version of the product are you using? On what operating system?

The peepdf version is r45
The python version is 2.7.2+
The operating system is ubuntu 11.10 x86_64


Please provide any additional information below.

The resulting test.pdf cannot be decoded by other PDF tools such as origami-pdf either.

Original issue reported on code.google.com by czchen on 27 Oct 2011 at 12:38
