
python-uncompyle6's Introduction


A native Python cross-version decompiler and fragment decompiler. The successor to decompyle, uncompyle, and uncompyle2.

uncompyle6 translates Python bytecode back into equivalent Python source code. It accepts bytecodes from Python version 1.0 to version 3.8, spanning over 24 years of Python releases. We include Dropbox's Python 2.5 bytecode and some PyPy bytecodes.

Ok, I'll say it: this software is amazing. It is more than your normal hacky decompiler. Using compiler technology, the program creates a parse tree of the program from the instructions; nodes at the upper levels look a little like what might come from a Python AST. So we can really classify and understand what's going on in sections of Python bytecode.

Building on this, another thing that makes this different from other CPython bytecode decompilers is the ability to deparse just fragments of source code and give source-code information around a given bytecode offset.

I use the tree fragments to deparse fragments of code at run time inside my trepan debuggers. For that, bytecode offsets are recorded and associated with fragments of the source code. This purpose, although compatible with the original intention, is still a little bit different. See this for more information.

Python fragment deparsing given an instruction offset is useful in showing stack traces and can be incorporated into any program that wants to show a location in more detail than just a line number at runtime. This code can also be used when source-code information does not exist and there is just bytecode. Again, my debuggers make use of this.
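As a rough illustration of where such an instruction offset comes from, the sketch below pulls the bytecode offset out of a traceback using only the standard library. It does not call uncompyle6 at all; the helper name `exception_location` and the `divide` function are made up for this example. A fragment deparser would take the code object and offset found here as its input.

```python
import sys


def exception_location():
    """Return (function name, line number, bytecode offset) for the
    innermost frame of the exception currently being handled."""
    tb = sys.exc_info()[2]
    # Walk to the innermost traceback entry, where the error was raised.
    while tb.tb_next is not None:
        tb = tb.tb_next
    code = tb.tb_frame.f_code
    # tb_lasti is the offset of the last instruction attempted in that frame.
    return code.co_name, tb.tb_lineno, tb.tb_lasti


def divide(a, b):
    return a / b


try:
    divide(1, 0)
except ZeroDivisionError:
    name, lineno, offset = exception_location()
    print("%s: line %d, bytecode offset %d" % (name, lineno, offset))
```

The line number alone cannot tell two operations on the same line apart; the offset can, which is what makes it a useful key for fragment lookup.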

There were (and still are) a number of decompyle, uncompyle, uncompyle2, uncompyle3 forks around. Many of them come basically from the same code base, and (almost?) all of them are no longer actively maintained. One was really good at decompiling Python 1.5-2.3, another really good at Python 2.7, but that only. Another handles Python 3.2 only; another patched that and handled only 3.3. You get the idea. This code pulls all of these forks together and moves forward. There is some serious refactoring and cleanup in this code base over those old forks. Even more experimental refactoring is going on in decompyle3.

This demonstrably does the best in decompiling Python across all Python versions. And even when there is another project that only provides decompilation for a subset of Python versions, we generally do demonstrably better for those as well.

How can we tell? By taking Python bytecode that comes distributed with that version of Python and decompiling these. Among those that successfully decompile, we can then make sure the resulting programs are syntactically correct by running the Python interpreter for that bytecode version. Finally, in cases where the program has a test for itself, we can run the check on the decompiled code.
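The syntax-check step can be sketched with the built-in `compile()`. This is only an illustration, not the project's test harness: the helper name is an assumption, and the check is only meaningful when the running interpreter matches the Python version the bytecode came from.

```python
def is_syntactically_valid(source, filename="<decompiled>"):
    """Return True if `source` parses as a module under the running interpreter."""
    try:
        # "exec" mode parses a whole module without running it.
        compile(source, filename, "exec")
    except SyntaxError:
        return False
    return True


print(is_syntactically_valid("x = 1\n"))
print(is_syntactically_valid("def broken(:\n    pass\n"))
```

Passing this check says nothing about whether the decompiled program behaves like the original; that is what running a program's own tests is for.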

We use automated processes to find bugs. In the issue trackers for other decompilers, you will find a number of bugs we've found along the way. Very few to none of them are fixed in the other decompilers.

The code in the git repository can be run from Python 2.4 to the latest Python version, with the exception of Python 3.0 through 3.2. Volunteers are welcome to address these deficiencies if there is a desire to do so.

The way it does this though is by segregating consecutive Python versions into git branches:

master
Python 3.6 and up (uses type annotations)
python-3.3-to-3.5
Python 3.3 through 3.5 (Generic Python 3)
python-2.4
Python 2.4 through 2.7 (Generic Python 2)

PyPy 3-2.4 and later works as well.

The bytecode files it can read have been tested on Python bytecodes from versions 1.4, 2.1-2.7, and 3.0-3.8 and later PyPy versions.

You can install from PyPI using the name uncompyle6:

pip install uncompyle6

To install from source code, this project uses setup.py, so it follows the standard Python routine:

$ pip install -e .  # set up to run from source tree

or:

$ python setup.py install # may need sudo

A GNU Makefile is also provided so make install (possibly as root or sudo) will do the steps above.

make check

A GNU makefile has been added to smooth over setting up and running the right command, and running tests from fastest to slowest.

If you have remake installed, you can see the list of all tasks including tests via remake --tasks

Run

$ uncompyle6 *compiled-python-file-pyc-or-pyo*

For usage help:

$ uncompyle6 -h

In older versions of Python it was possible to verify bytecode by decompiling bytecode, and then compiling using the Python interpreter for that bytecode version. Having done this, the bytecode produced could be compared with the original bytecode. However as Python's code generation got better, this no longer was feasible.
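The comparison step of that older scheme can be sketched with the standard `dis` module (Python 3.4 or later). Here two source strings stand in for the original program and its decompiled form; the function names are illustrative, and nested code objects (functions, classes) are not compared.

```python
import dis


def instruction_summary(source):
    """Compile `source` and list (opname, argrepr) for its module-level code."""
    code = compile(source, "<example>", "exec")
    return [(ins.opname, ins.argrepr) for ins in dis.get_instructions(code)]


def same_bytecode(source_a, source_b):
    """True when both sources compile to the same instruction sequence."""
    return instruction_summary(source_a) == instruction_summary(source_b)


print(same_bytecode("a = b + c", "a = b + c"))
print(same_bytecode("a = b + c", "a = b - c"))
```

A strict instruction-by-instruction match like this is exactly what stops being practical once the compiler can emit different instructions for equivalent source.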

If you want Python syntax verification of the correctness of the decompilation process, add the --syntax-verify option. However, since Python syntax changes, you should use this option only if the bytecode is the right bytecode for the Python interpreter that will be checking the syntax.

You can also cross compare the results with another version of uncompyle6 since there are sometimes regressions in decompiling specific bytecode as the overall quality improves.

For Python 3.7 and 3.8, the code in decompyle3 is generally better.

Or try another specific Python decompiler like uncompyle2, unpyc37, or pycdc. Since the latter two work differently, bugs here often aren't in those, and vice versa.

There is an interesting class of programs that is readily available and gives stronger verification: those programs that test themselves when run. Our test suite includes these.

And Python comes with another set of programs like this: its test suite for the standard library. We have some code in test/stdlib to facilitate this kind of checking too.

The biggest known and possibly fixable (but hard) problem has to do with handling control flow. (Python has probably the most diverse and screwy set of compound statements I've ever seen; there are "else" clauses on loops and try blocks that I suspect many programmers don't know about.)
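For readers who have not met them, here is a small example of the "else" clauses mentioned above. The function names are invented for the illustration; the semantics are standard Python.

```python
def first_negative(numbers):
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # Runs only when the loop finished without hitting "break".
        result = None
    return result


def doubled_int(text):
    try:
        value = int(text)
    except ValueError:
        return None
    else:
        # Runs only when the try body raised no exception.
        return value * 2


print(first_negative([3, -1, 4]), first_negative([3, 4]))
print(doubled_int("21"), doubled_int("not a number"))
```

In bytecode, each of these clauses is just a set of jumps, so a decompiler has to work out from the jump targets alone whether a block was an "else" clause or ordinary code following the statement.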

All of the Python decompilers that I have looked at have problems decompiling Python's control flow. In some cases we can detect an erroneous decompilation and report that.

Python support is pretty good for Python 2

On the lower end of Python versions, decompilation seems pretty good although we don't have any automated testing in place for Python's distributed tests. Also, we don't have a Python interpreter for versions 1.6, and 2.0.

In the Python 3 series, Python support is strongest around 3.4 or 3.3 and drops off as you move further away from those versions. Python 3.0 is weird in that it in some ways resembles 2.6 more than it does 3.1 or 2.7. Python 3.6 changes things drastically by using word codes rather than byte codes. As a result, the jump offset field in a jump instruction argument has been reduced. This makes EXTENDED_ARG instructions more prevalent in jump instructions; previously they had been rare. Perhaps to compensate for the additional EXTENDED_ARG instructions, additional jump optimization has been added. So in sum, handling control flow by ad hoc means, as is currently done, is worse.
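To make the wordcode point concrete: from Python 3.6 on, each instruction is two bytes, an opcode and a one-byte argument, so any argument above 255 needs one or more EXTENDED_ARG prefixes. The encoder below is a simplified sketch of that layout, not code from uncompyle6 or CPython; `JUMP_OP` is an arbitrary opcode number chosen for the example.

```python
import dis

EXTENDED_ARG = dis.opmap["EXTENDED_ARG"]


def encode_wordcode(opcode, arg):
    """Encode one instruction as 2-byte units, adding EXTENDED_ARG
    prefixes when the argument does not fit in a single byte."""
    if not 0 <= arg <= 0xFFFFFFFF:
        raise ValueError("argument out of range")
    # Split into bytes, most significant first, dropping leading zero bytes.
    parts = [(arg >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    while len(parts) > 1 and parts[0] == 0:
        parts.pop(0)
    encoded = bytearray()
    for high_byte in parts[:-1]:
        encoded += bytes([EXTENDED_ARG, high_byte])
    encoded += bytes([opcode, parts[-1]])
    return bytes(encoded)


JUMP_OP = 110  # hypothetical opcode number, used only for illustration
print(len(encode_wordcode(JUMP_OP, 200)))  # fits in one byte
print(len(encode_wordcode(JUMP_OP, 300)))  # needs one EXTENDED_ARG prefix
```

With the earlier two-byte arguments, a jump target had to exceed 65535 before a prefix was needed, which is why such instructions used to be rare.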

Between Python 3.5, 3.6, 3.7 there have been major changes to the MAKE_FUNCTION and CALL_FUNCTION instructions.

Python 3.8 removes the SETUP_LOOP, SETUP_EXCEPT, BREAK_LOOP, and CONTINUE_LOOP instructions, which may make control-flow detection harder, lacking the more sophisticated control-flow analysis that is planned. We'll see.

Currently not all Python magic numbers are supported. Specifically in some versions of Python, notably Python 3.6, the magic number has changed several times within a version.

We support only released versions, not candidate versions. Note however that the magic of a released version is usually the same as the last candidate version prior to release.
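For orientation, the magic number is the first four bytes of a .pyc file: a two-byte little-endian integer followed by `\r\n`. The sketch below reads it back from a freshly compiled file and compares it with the running interpreter's own value. It uses only the standard library; the helper names are invented for the example, and this is not how uncompyle6 itself looks up magic numbers.

```python
import importlib.util
import os
import py_compile
import tempfile


def read_magic(pyc_path):
    """Return the 4-byte magic header of a .pyc file."""
    with open(pyc_path, "rb") as f:
        return f.read(4)


def magic_version_word(magic):
    """Decode the first two bytes (little-endian) into the integer
    that identifies the bytecode format."""
    return int.from_bytes(magic[:2], "little")


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "example.py")
    pyc = os.path.join(tmp, "example.pyc")
    with open(src, "w") as f:
        f.write("x = 1\n")
    py_compile.compile(src, cfile=pyc)
    magic = read_magic(pyc)

print(magic_version_word(magic), magic == importlib.util.MAGIC_NUMBER)
```

A decompiler has to map this integer to a specific bytecode format before it can read anything else in the file, which is why an unrecognized magic number stops it immediately.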

There are also customized Python interpreters, notably Dropbox, which use their own magic and encrypt bytecode. With the exception of the Dropbox's old Python 2.5 interpreter this kind of thing is not handled.

We also don't handle PJOrion or otherwise obfuscated code. For PJOrion try: PJOrion Deobfuscator to unscramble the bytecode to get valid bytecode before trying this tool; pydecipher might help with that.

This program can't decompile Microsoft Windows EXE files created by Py2EXE, although we can probably decompile the code after you extract the bytecode properly. Pydeinstaller may help with unpacking PyInstaller bundles.

Handling pathologically long lists of expressions or statements is slow. We don't handle Cython or MicroPython which don't use bytecode.

There are numerous bugs in decompilation. And that's true for every other CPython decompiler I have encountered, even the ones that claimed to be "perfect" on some particular version like 2.4.

As Python progresses, decompilation also gets harder because the compilation is more sophisticated and the language itself is more sophisticated. I suspect that there will be fewer ad-hoc attempts like unpyc37 (which is based on a 3.3 decompiler) simply because it is harder to do so. The good news, at least from my standpoint, is that I think I understand what's needed to address the problems in a more robust way. But right now, until such time as the project is better funded, I do not intend to make any serious effort to support Python versions 3.8 or 3.9, including bugs that might come in. I imagine at some point I may be interested in it.

You can easily find bugs by running the tests against the standard test suite that Python uses to check itself. At any given time, there are dozens of known problems that are pretty well isolated and that could be solved if one were to put in the time to do so. The problem is that there aren't that many people who have been working on bug fixing.

Some of the bugs in 3.7 and 3.8 are simply a matter of back-porting the fixes in decompyle3. Any volunteers?

You may run across a bug that you want to report. Please do so after reading How to report a bug and follow the instructions when opening an issue.

Be aware that it might not get my attention for a while. If you sponsor or support the project in some way, I'll prioritize your issues above the queue of other things I might be doing instead. In rare situations, I can do a hand decompilation of bytecode for a fee. However this is expensive, usually beyond what most people are willing to spend.

  • https://github.com/rocky/python-decompile3 : Much smaller and more modern code, focusing on 3.7 and 3.8. Changes in that will get migrated back here.
  • https://code.google.com/archive/p/unpyc3/ : supports Python 3.2 only. The above projects use a different decompiling technique than what is used here. Currently unmaintained.
  • https://github.com/figment/unpyc3/ : fork of above, but supports Python 3.3 only. Includes some fixes like supporting function annotations. Currently unmaintained.
  • https://github.com/wibiti/uncompyle2 : supports Python 2.7 only, but does that fairly well. There are situations where uncompyle6 results are incorrect while uncompyle2 results are not, but more often uncompyle6 is correct when uncompyle2 is not. Because uncompyle6 adheres to accuracy over idiomatic Python, uncompyle2 can produce more natural-looking code when it is correct. Currently uncompyle2 is lightly maintained. See its issue tracker for more details.
  • How to report a bug
  • The HISTORY file.
  • https://github.com/rocky/python-xdis : Cross Python version disassembler
  • https://github.com/rocky/python-xasm : Cross Python version assembler
  • https://github.com/rocky/python-uncompyle6/wiki : Wiki Documents which describe the code and aspects of it in more detail
  • https://github.com/zrax/pycdc : The README for this C++ code says it aims to support all versions of Python. You can aim your slingshot for the moon too, but I doubt you are going to hit it. This code is best for Python versions around 2.7 and 3.3 when the code was initially developed. Accuracy for current versions of Python3 and early versions of Python is lacking. Without major effort, it is unlikely it can be made to support current Python 3. See its issue tracker for details. Currently lightly maintained.

python-uncompyle6's People

Contributors

andrem-eberle, berbe, bloerwald, byehack, cclauss, elfring, graingert, grkov90, henryhjung, htgoebel, jameshilliard, jbremer, jlugjb, jwilk, kernelsmith, lelicopter, lostbeta, moagstar, mysterie, rocky, skyfion, supervirus, tangboxuan, tey, thedrow, timgates42, trengri, wangym5106

python-uncompyle6's Issues

uncompyle6-2.9.8 test failures

Previous releases have been passing tests fine.
...
Tue Jan 3 12:24:01 2017
Source directory: bytecode_pypy2.7
Output directory: /var/tmp/portage/dev-python/uncompyle6-2.9.8/temp/py-dis-84_2hf_1/bytecode_pypy2.7
decompiled 22 files: 0 okay, 0 failed
Everything good, removing /var/tmp/portage/dev-python/uncompyle6-2.9.8/temp/py-dis-84_2hf_1
Tue Jan 3 12:24:02 2017
Source directory: bytecode_1.5
Output directory: /var/tmp/portage/dev-python/uncompyle6-2.9.8/temp/py-dis-84_2hf_1/bytecode_1.5
decompiled 34 files: 0 okay, 0 failed
Everything good, removing /var/tmp/portage/dev-python/uncompyle6-2.9.8/temp/py-dis-84_2hf_1
/usr/bin/python3.4 test_pythonlib.py --bytecode-3.4 --verify
Tue Jan 3 12:24:10 2017
Source directory: bytecode_3.4
Output directory: /var/tmp/portage/dev-python/uncompyle6-2.9.8/temp/py-dis-mnghpppd/bytecode_3.4
Code differs in '..readline' at offset 43_0 [COME_FROM_LOOP] != [POP_BLOCK]

Code differs in ..readline

3 0 SETUP_LOOP 43 'to to 43'
11 0 SETUP_LOOP 44 'to to 44'

4 3 LOAD_FAST 'self'
12 3 LOAD_FAST 'self'
6 LOAD_ATTR 'join_lines' 6 LOAD_ATTR 'join_lines'
9 POP_JUMP_IF_FALSE 39 'to 39' 9 POP_JUMP_IF_FALSE 39 'to 39'

5 12 LOAD_FAST 'line'
13 12 LOAD_FAST 'line'
15 POP_JUMP_IF_FALSE 24 'to 24' 15 POP_JUMP_IF_FALSE 24 'to 24'

6 18 CONTINUE 3 'to 3'
14 18 CONTINUE 3 'to 3'
21 JUMP_FORWARD 24 'to to 24' 21 JUMP_FORWARD 24 'to to 24'

7 24 LOAD_FAST 'self'
15 24 LOAD_FAST 'self'
27 POP_JUMP_IF_FALSE 39 'to 39' 27 POP_JUMP_IF_FALSE 39 'to 39'

8 30 CONTINUE 3 'to 3'
16 30 CONTINUE 3 'to 3'
33 JUMP_ABSOLUTE 39 'to 39' 33 JUMP_ABSOLUTE 39 'to 39'
36 JUMP_FORWARD 39 'to to 39' 36 JUMP_FORWARD 39 'to to 39'

10 39 LOAD_FAST 'line'
17 39 LOAD_FAST 'line'
42 RETURN_VALUE 42 RETURN_VALUE
43_0 COME_FROM_LOOP '0' 43 POP_BLOCK
43 LOAD_CONST '' 44_0 COME_FROM_LOOP '0'
46 RETURN_VALUE 44 LOAD_CONST ''

decompiled 53 files: 0 okay, 0 failed, 1 verify failed

Error Verifying 10_while.pyc

Code differs in '..readline' at offset 43_0 [COME_FROM_LOOP] != [POP_BLOCK]

Code differs in ..readline

3 0 SETUP_LOOP 43 'to to 43'
11 0 SETUP_LOOP 44 'to to 44'

4 3 LOAD_FAST 'self'
12 3 LOAD_FAST 'self'
6 LOAD_ATTR 'join_lines' 6 LOAD_ATTR 'join_lines'
9 POP_JUMP_IF_FALSE 39 'to 39' 9 POP_JUMP_IF_FALSE 39 'to 39'

5 12 LOAD_FAST 'line'
13 12 LOAD_FAST 'line'
15 POP_JUMP_IF_FALSE 24 'to 24' 15 POP_JUMP_IF_FALSE 24 'to 24'

6 18 CONTINUE 3 'to 3'
14 18 CONTINUE 3 'to 3'
21 JUMP_FORWARD 24 'to to 24' 21 JUMP_FORWARD 24 'to to 24'

7 24 LOAD_FAST 'self'
15 24 LOAD_FAST 'self'
27 POP_JUMP_IF_FALSE 39 'to 39' 27 POP_JUMP_IF_FALSE 39 'to 39'

8 30 CONTINUE 3 'to 3'
16 30 CONTINUE 3 'to 3'
33 JUMP_ABSOLUTE 39 'to 39' 33 JUMP_ABSOLUTE 39 'to 39'
36 JUMP_FORWARD 39 'to to 39' 36 JUMP_FORWARD 39 'to to 39'

10 39 LOAD_FAST 'line'
17 39 LOAD_FAST 'line'
42 RETURN_VALUE 42 RETURN_VALUE
43_0 COME_FROM_LOOP '0' 43 POP_BLOCK
43 LOAD_CONST '' 44_0 COME_FROM_LOOP '0'
46 RETURN_VALUE 44 LOAD_CONST ''

make[2]: *** [Makefile:43: check-3.4] Error 3
make[2]: Leaving directory '/var/tmp/portage/dev-python/uncompyle6-2.9.8/work/uncompyle6-2.9.8/test'
make[1]: *** [Makefile:32: check-3.4] Error 2
make[1]: Leaving directory '/var/tmp/portage/dev-python/uncompyle6-2.9.8/work/uncompyle6-2.9.8'
make: *** [Makefile:23: check] Error 2

  • ERROR: dev-python/uncompyle6-2.9.8::gentoo failed (test phase):
  • emake failed

COME_FROMs need to be segregated more

We are getting incorrect control flow because grammar rules with multiple COME_FROMS get mangled with other grammar rules. A fix is to separate COME_FROMs from except statements, vs others. Same may be true with loops, if/else and so on.

Synchronizing nonterminal names with Python AST grammar nonterminal names

Write some sort of filter to turn uncompyle6's parse tree into Python's AST. See the other transformers in uncompyle6/semantics for other kinds of transformations.

Some things to note. uncompyle6's parse tree is not an AST (abstract syntax tree). In particular it is not abstract: a traversal of the leaves of the tree will produce the entire (pseudo-instruction augmented) instruction stream. So in contrast to an AST, there is nothing removed. See http://rocky.github.io/pycon2018.co/#/ for examples of how they differ.

As with other phases of the program, this phase will no doubt need customization per version, as I'm sure the AST changes by Python version, if for no other reason than as a result of new language features. (But I suspect there are other things as well.)
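The "not abstract" property can be shown on a toy tree. The `Node` class and the string tokens below are stand-ins invented for this sketch; they are not uncompyle6's own tree classes.

```python
class Node:
    """A nonterminal with an ordered list of children.
    Instruction tokens are represented as plain strings."""

    def __init__(self, kind, children=()):
        self.kind = kind
        self.children = list(children)


def leaves(node):
    """Yield instruction tokens in left-to-right order."""
    if isinstance(node, str):
        yield node
    else:
        for child in node.children:
            for token in leaves(child):
                yield token


# compare ::= expr expr COMPARE_OP
tree = Node("compare", [
    Node("expr", ["LOAD_FAST x"]),
    Node("expr", ["LOAD_FAST y"]),
    "COMPARE_OP <",
])

print(list(leaves(tree)))
```

A Python AST for `x < y` would keep only the comparison and its operands; here every instruction survives as a leaf, so a filter producing an AST has to decide what to drop.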

@belph See above.

Getting more Publicity and users for UnCompyle6.

How can we get more publicity and users for UnCompyle6?

I saw that your Awesome Python pull request
vinta/awesome-python#809

is still open. Bummer. So how can we get you more publicity and users?

How about a link exchange?

https://Pylang.info

is the world's largest Python directory. It starts off with the Awesome Python information, and adds additional links. It lists not just libraries, but also blogs, videos, articles and other interesting content. It is a much deeper and richer hierarchy than the github pages. I invite you to watch the lightning talk video on the web site.

PyLang is also just starting to list companies that are doing Python development. To learn more about the software for listing companies, I invite you to take a look at the video on the sister site

https://ioscompanies.info

Please let me know if you are interested in a link exchange.

This offer applies to all Python (or iOS) related Blogs, Articles and Libraries.

Warm Regards
Christopher Lozinski

Decompilation failed with if-else ternary operator embedded into a boolean expression

Hello,

thanks for making your tool freely available: this is exactly what I was looking for!
Unfortunately I'm encountering a little problem.
I've just cloned the source repository and tested uncompyle6 with one of my libraries.
The decompilation failed with the following error code:

Syntax error at or near `RETURN_END_IF' token at offset 21

I succeeded in restricting the failing code to the one generated by the following function:

def minimum(x, y):
    return x or (x if x < y else y)

The same error is generated if I replace the boolean operator or with and.

Below is the whole transcript of my little session:

python -c 'import testcase' && uncompyle6 testcase.pyc
# Python 2.7 (decompiled from Python 2.7)
# Embedded file name: testcase.py
# Compiled at: 2016-04-04 17:47:25


def minimum--- This code section failed: ---

   2       0    LOAD_FAST         'x'
           3    JUMP_IF_FALSE_OR_POP '25'
           6    LOAD_FAST         'x'
           9    LOAD_FAST         'y'
          12    COMPARE_OP        '<'
          15    POP_JUMP_IF_FALSE '22'
          18    LOAD_FAST         'x'
          21    RETURN_END_IF     ''
          22    LOAD_FAST         'y'
        25_0    COME_FROM         '3'
          25    RETURN_VALUE      ''
          -1    RETURN_LAST       ''


# Deparsing stopped due to parse error
# Can't uncompile testcase.pyc
Syntax error at or near `RETURN_END_IF' token at offset 21                                     

This seems the same issue described in Conditional lambda problems

Python 3.6 opcodes

Python 3.6 beta introduces some new opcodes:

  • STORE_ANNOTATION (removed in 3.7)
  • CALL_FUNCTION_EX
  • SETUP_ANNOTATIONS (no operational semantics; handle by filtering out?)
  • BUILD_STRING
  • BUILD_TUPLE_UNPACK_WITH_CALL (possibly)

Decompilation of these opcodes needs to be added. I started work on BUILD_STRING and have it almost working with some edge cases failing at the moment.

For the other opcodes I think some minimal test cases should be found first.

setup_requires incorrect usage

s/setup_requires/tests_require/

Nose is a test only dependency. Using setup_requires forces it to be installed.
It should be a tests_require setting.

Grammar rules for new Python 3.5 opcodes

New 3.5 opcodes are:

  • BINARY_MATRIX_MULTIPLY
  • INPLACE_MATRIX_MULTIPLY
  • GET_AITER (possibly)
  • GET_ANEXT
  • BEFORE_ASYNC_WITH
  • GET_YIELD_FROM_ITER
  • GET_AWAITABLE (possibly)
  • WITH_CLEANUP_START
  • WITH_CLEANUP_FINISH
  • BUILD_LIST_UNPACK
  • BUILD_MAP_UNPACK
  • BUILD_MAP_UNPACK_WITH_CALL (possibly)
  • BUILD_TUPLE_UNPACK
  • BUILD_SET_UNPACK

Testing on BUILD_...UNPACK may be buggy.

  • SETUP_ASYNC_WITH

Make sure we handle each of these.

AST Formatting

In an AST display I find it hard to figure out the children of an internal node. This is needed in writing semantic actions. e368ab2 is an attempt to address this. Consider this output:

$ ./bin/uncompyle6 -t test/bytecode_2.5/10_if_else_ternary.pyc
# Python bytecode 2.5 (decompiled from Python 2.6)
# Embedded file name: simple_source/branching/10_if_else_ternary.py
...

(2)  sstmt
   (2)  return_stmt
      (1)  ret_expr
         (1)  expr
            (4)  or
               (1)  expr

                       11       0   LOAD_FAST         'x'
               (1)  jmp_true
                             3  JUMP_IF_TRUE      '27'
               (1)  expr
                  (6)  conditional
                     (1)  expr
                        (1)  cmp
                           (3)  compare
                              (1)  expr
                                            6   LOAD_FAST         'x'
                              (1)  expr
                                            9   LOAD_FAST         'y'
                                        12  COMPARE_OP        '<'
                     (1)  jmp_false
                                  15    JUMP_IF_FALSE     '24'
                     (1)  expr
                                  18    LOAD_FAST         'x'
                               21   JUMP_FORWARD      '27'
                     (1)  expr
                                  24    LOAD_FAST         'y'
                             27_0   COME_FROM         '21'
               (0)  come_from_opt
                27  RETURN_VALUE      ''
             -1 RETURN_LAST       ''

The numbers in parens are the counts of the children. So for example come_from_opt has no children. While compare has 3: expr, expr, COMPARE_OP.

And conditional has 6: expr, jmp_false , expr, JUMP_FORWARD, expr, COME_FROM.
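The display idea, a child count in parentheses before each nonterminal, can be sketched on a toy tree. The `Node` class and `format_tree` function are hypothetical and only mimic the layout shown above; they are not the code from e368ab2.

```python
class Node:
    """A nonterminal with an ordered list of children.
    Instruction tokens are represented as plain strings."""

    def __init__(self, kind, children=()):
        self.kind = kind
        self.children = list(children)


def format_tree(node, depth=0):
    """Return display lines, prefixing each nonterminal with its child count."""
    indent = "   " * depth
    if isinstance(node, str):
        return [indent + node]
    lines = ["%s(%d)  %s" % (indent, len(node.children), node.kind)]
    for child in node.children:
        lines.extend(format_tree(child, depth + 1))
    return lines


tree = Node("compare", [
    Node("expr", ["LOAD_FAST 'x'"]),
    Node("expr", ["LOAD_FAST 'y'"]),
    "COMPARE_OP '<'",
])

print("\n".join(format_tree(tree)))
print("\n".join(format_tree(Node("come_from_opt"))))
```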

@moagstar your thoughts?

getting an error on STORE_LOCALS

--- This code section failed: ---

102 0 LOAD_FAST 'locals'
3 STORE_LOCALS ''
4 LOAD_NAME 'name'
7 STORE_NAME 'module'
10 LOAD_CONST '_zone'
13 STORE_NAME 'qualname'

103 16 LOAD_NAME 'staticmethod'
19 LOAD_CONST '<code_object add_sim>'
22 LOAD_CONST '_zone.add_sim'
25 MAKE_FUNCTION_0 ''
28 CALL_FUNCTION_1 ''
31 STORE_NAME 'add_sim'

106 34 LOAD_NAME 'staticmethod'
37 LOAD_CONST '<code_object remove_sim>'
40 LOAD_CONST '_zone.remove_sim'
43 MAKE_FUNCTION_0 ''
46 CALL_FUNCTION_1 ''
49 STORE_NAME 'remove_sim'

Syntax error at or near `STORE_LOCALS' token at offset 3

These guys fixed this here zrax/pycdc#63 - I'll take a look.

Handle line breaks in fragments parser

The fragments version of the semantics routines is a little behind the batch (pysource) code. What would be very helpful would be to use the line number marks for instructions that are in the batch routine here. That way in a debugger or in an error callback, if you are stopped at or hit an error at say:

def foo(a,           
        b=len(baz), c=a/b):

You know where you are.

deparse_code in eval mode doesn't always emit expressions for 'x if y else z'

e.g. consider the following running on 2.7 (the same happens on 3.5 if you change the Python version):

>>> _ = deparse_code(2.7, (lambda x: (1 if x else 2)).__code__, compile_mode='eval')
if x:
    return 1
return 2

Despite having a compile_mode of eval the result is not an expression but instead multiple statements.

It's not just a lack of support for if/else expressions as the following works as expected:

>>> _ = deparse_code(2.7, (lambda x: (1 if x else 2) * 2).__code__, compile_mode='eval')
return (1 if x else 2) * 2

This was tested on uncompyle 2.8.2.
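One way a caller could detect the difference between the two results above is to check whether the deparsed text is a single return statement. The helper below is a sketch built on the standard `ast` module; it takes the deparsed text as a plain string and does not call uncompyle6, and its name is made up for this example.

```python
import ast
import textwrap


def is_single_return(body_source):
    """True if `body_source` is exactly one `return <expr>` statement."""
    # "return" is only legal inside a function, so wrap the text in one.
    wrapped = "def _wrapper():\n" + textwrap.indent(body_source, "    ")
    func = ast.parse(wrapped).body[0]
    return len(func.body) == 1 and isinstance(func.body[0], ast.Return)


print(is_single_return("return (1 if x else 2) * 2"))
print(is_single_return("if x:\n    return 1\nreturn 2"))
```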

Decompilation failed #2

Python 2.7
uncompyle6 v 2.7.1

Syntax error at or near `POP_BLOCK' token at offset 6

Source code:

while 1:pass

More precise line breaks

uncompyle6 uses the source-code line number table to help it know when to split a line.
However, it usually learns that there was a line break only after the string containing it has already been added/printed.

Consequently, it is usually one behind in a list with line breaks.

Fix somehow.

uncompyle6 2.9.2 can throw an AttributeError when calling `scanners.scanner2.opsize`

This was originally spotted in HypothesisWorks/hypothesis#379, although it probably affects other users of this library.


uncompyle6 2.9.0 bumped the required version of xdis, specifically xdis >= 3.2.0, < 3.3.0. This coincides with a new release of xdis – previously it used 3.1.x.

In 2.9.0, there’s a function op_size which refers to self.opc.hasArgumentExtended:

def op_size(self, op):
    if op < self.opc.HAVE_ARGUMENT and op not in self.opc.hasArgumentExtended:
        return 1
    else:
        return 3

This name does not appear anywhere else in the uncompyle6 source tree. It did appear in xdis 3.1.0, but was removed in xdis 3.2.0, under rocky/python-xdis@6046fa2.

As a result, if you call this function with uncompyle6 2.9.0 and xdis 3.2.0, you throw an AttributeError in the second half of the if condition. We end up calling this function as part of Hypothesis, and this causes test runs to crash. We can work around this by pinning the versions, but this will affect other users of the library.

Given the rationale for removing this opcode in rocky/python-xdis@6046fa2 was “It’s not used”, it seems like the solution is probably just to revert that commit and release a new minor version of xdis with the opcode restored.
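Besides restoring the attribute in xdis, a defensive lookup on the calling side would also avoid the crash. The sketch below is an alternative, not the fix proposed above or the one that was applied: it rewrites `op_size` as a stand-alone function and uses a `SimpleNamespace` as a stand-in for the real `opc` module.

```python
from types import SimpleNamespace


def op_size(opc, op):
    """Size in bytes of a Python 2 instruction, tolerating an opcode
    table that lacks `hasArgumentExtended`."""
    # Fall back to an empty tuple when the attribute is missing.
    extended = getattr(opc, "hasArgumentExtended", ())
    if op < opc.HAVE_ARGUMENT and op not in extended:
        return 1
    return 3


# Stand-in for an opcode table without the removed attribute.
opc_without_attr = SimpleNamespace(HAVE_ARGUMENT=90)

print(op_size(opc_without_attr, 1))    # below HAVE_ARGUMENT: no argument
print(op_size(opc_without_attr, 100))  # at or above HAVE_ARGUMENT: has argument
```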

xdis.std.Instruction should not take "has_arg" parameter

I think we want has_stdarg in the underlying bytecode instruction. However in std.Instruction it shouldn't be passed as a parameter but computed based on the opc, e.g. has_arg = opcode >= opc.HAVE_ARGUMENT.

In an ideal world, fields argrepr and opname wouldn't need to be passed in std.Instruction(). We could either create another function or let setting these to None mean that they get filled in. And opname probably should be checked against opcode.
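A minimal sketch of that idea: a factory that derives `has_arg` and `opname` from the opcode table instead of accepting them as parameters. The `Instruction` tuple, the `make_instruction` helper, and the stand-in `opc` object are all hypothetical; they are not the xdis API. The three opcode values mirror CPython 2.7.

```python
from collections import namedtuple
from types import SimpleNamespace

Instruction = namedtuple("Instruction", "opname opcode arg has_arg")


def make_instruction(opc, opcode, arg=None):
    """Build an Instruction, deriving opname and has_arg from `opc`."""
    opname = opc.opname[opcode]             # looked up, so it cannot disagree
    has_arg = opcode >= opc.HAVE_ARGUMENT   # computed, not passed in
    return Instruction(opname, opcode, arg if has_arg else None, has_arg)


# Stand-in opcode table with a few CPython 2.7 values.
opc = SimpleNamespace(HAVE_ARGUMENT=90, opname={1: "POP_TOP", 100: "LOAD_CONST"})

print(make_instruction(opc, 100, arg=0))
print(make_instruction(opc, 1))
```

Because the name is looked up from the opcode, the "opname should be checked against opcode" concern disappears: there is no second value to check.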

runtime errors after installing from uncompyle6-2.5.0.tar.gz

Downloaded
https://pypi.python.org/packages/d3/b4/af8130cbf6826d2fbc66dedaf4a14e449db880557413d63bd1fb79973209/uncompyle6-2.5.0.tar.gz#md5=b632fff4a10273055cf8a59c44c75892

installed via "python setup.py install"

tried on:
RH Linux 6, Python 2.6.6
Windows 10 + Cygwin, Python 2.7.10
Windows 10, Python 2.7.12

In each case, when executing uncompyle6 I always get some form of import error (formatting is slightly different depending on version), here for Windows:

c:\Python27\Scripts>uncompyle6.exe -v
Traceback (most recent call last):
  File "c:\Python27\Scripts\uncompyle6-script.py", line 9, in <module>
    load_entry_point('uncompyle6==2.5.0', 'console_scripts', 'uncompyle6')()
  File "c:\Python27\lib\site-packages\pkg_resources\__init__.py", line 542, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "c:\Python27\lib\site-packages\pkg_resources\__init__.py", line 2569, in load_entry_point
    return ep.load()
  File "c:\Python27\lib\site-packages\pkg_resources\__init__.py", line 2229, in load
    return self.resolve()
  File "c:\Python27\lib\site-packages\pkg_resources\__init__.py", line 2239, in resolve
    raise ImportError(str(exc))
ImportError: 'module' object has no attribute 'main_bin'

With the 2.4 version I have no such issue.

uncompyle6 has no API stability statement

I gather that uncompyle6 is mostly intended for command line use, but I'd like to use it as an API, and it's currently a little hard to do because there's no way to tell what APIs are "public" and how likely they are to break. In particular, I'd like to know how much of the behaviour of deparse_code I can rely on.

Context: I'm probably going to be adding uncompyle6 as a dependency to Hypothesis because it would allow me to replace some terrible code that doesn't work properly and I hate maintaining.

Initial experiments using it worked great, so I'm probably just going to use it anyway and add compatibility layers between versions, but it would be nice to know what I can depend on so I don't have to. :-)

Go over grammar for why there are duplicate (more) reductions.

Consider this deparse:

$ ./bin/uncompyle6 -g test/bytecode_2.5/10_if_else_ternary.pyc
# Python bytecode 2.5 (decompiled from Python 2.6)
# Embedded file name: simple_source/branching/10_if_else_ternary.py
...
exprlist ::= expr
compare ::= expr expr COMPARE_OP
compare ::= expr expr COMPARE_OP
cmp ::= compare
cmp ::= compare
expr ::= cmp
expr ::= cmp
...

In the good old days for Python 2.7 this used to be:

compare ::= expr expr COMPARE_OP
cmp ::= compare
expr ::= cmp

I suspect adding various optional nonterminals, e.g. come_from_opt for Python 3.5, is what caused this. And even here, the optional nonterminals were a result of fixing problems before I realized that Python 2.7 structure analysis was needed in Python 3.

It is also possible that the duplication comes from changes to use left-recursive grammars. I read in Wikipedia that left-recursive grammars for Earley parsers generally give linear parse time, but perhaps I shouldn't believe Wikipedia? Or it could be the other way around, maybe I added a right-recursive rule instead of a left-recursive one. (If that's the case we should fix up spark to warn about right, or is it left, recursive rules.)

Tracking this down can be as simple as a git bisect, or reinstating the old 2.7 grammar say from uncompyle2 which should be in git history.
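One way to make the duplication measurable, for instance as the pass/fail check in a git bisect run, is to look for identical reductions printed back to back in the -g output. The following is a minimal sketch and not part of uncompyle6: the helper name adjacent_duplicate_reductions is hypothetical, it assumes each reduction is printed on its own line in the "lhs ::= rhs" form shown above, and treating adjacent identical lines as duplicates is only a heuristic.

```python
from itertools import groupby


def adjacent_duplicate_reductions(lines):
    """Return (rule, count) pairs for reductions repeated on consecutive lines.

    Lines without "::=" (comments, "...", blank lines) are ignored.
    """
    # Normalize whitespace so that equal rules compare equal.
    reductions = [" ".join(line.split()) for line in lines if "::=" in line]
    repeated = []
    for rule, group in groupby(reductions):
        count = sum(1 for _ in group)
        if count > 1:
            repeated.append((rule, count))
    return repeated


sample = """\
exprlist ::= expr
compare ::= expr expr COMPARE_OP
compare ::= expr expr COMPARE_OP
cmp ::= compare
cmp ::= compare
expr ::= cmp
expr ::= cmp
""".splitlines()

repeated = adjacent_duplicate_reductions(sample)
for rule, count in repeated:
    print(count, rule)
```

A bisect script could feed it the captured output of the command above and fail whenever the returned list is non-empty.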

By the way, we now have ways to better select more specific grammars for sets of Python versions, so that could and should be used.

@moagstar is this something that interests you?

Verification idea

Right now the Python verification process (decompile, compile, bytecode check) is complicated because there may be differences we need to ignore: timestamps, order of hash elements and so on.

Another approach is to take the test programs that come with a Python distribution and, for each one:

  • compile (if needed)
  • decompile
  • run decompiled test program

Here, we take advantage of the property that the test programs when run will report if they are correct or not.
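The three steps above could be scripted roughly as follows. This is a sketch under stated assumptions, not the project's test harness: decompile is a hypothetical callable supplied by the caller that maps a .pyc path to source text, the function name passes_after_roundtrip is made up, and a zero exit status is assumed to mean the test program passed.

```python
import os
import py_compile
import subprocess
import sys
import tempfile


def passes_after_roundtrip(source_path, decompile):
    """Compile source_path, decompile the bytecode, and run the result.

    `decompile` is a caller-supplied function (hypothetical here) that takes
    the path of a .pyc file and returns the decompiled source as a string.
    Returns True when the decompiled program exits with status 0.
    """
    with tempfile.TemporaryDirectory() as workdir:
        # Step 1: compile.
        pyc_path = os.path.join(workdir, "compiled.pyc")
        py_compile.compile(source_path, cfile=pyc_path, doraise=True)

        # Step 2: decompile.
        decompiled_path = os.path.join(workdir, "decompiled.py")
        with open(decompiled_path, "w") as out:
            out.write(decompile(pyc_path))

        # Step 3: run the decompiled test program; it reports its own result.
        completed = subprocess.run([sys.executable, decompiled_path])
        return completed.returncode == 0
```

No bytecode comparison is involved, so timestamps and hash ordering never enter the picture.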

Assertion error

When I run this command on any file, for example:
C:\Users\UnknownWarrior8910\Desktop\uncomp>uncompyle6 modloader.pyc > un.txt
This happens:
Traceback (most recent call last):
  File "C:\Users\UnknownWarrior8910\AppData\Local\Programs\Python\Python36-32\Scripts\uncompyle6-script.py", line 11, in <module>
    load_entry_point('uncompyle6==2.9.9', 'console_scripts', 'uncompyle6')()
  File "C:\Users\UnknownWarrior8910\AppData\Local\Programs\Python\Python36-32\lib\site-packages\uncompyle6-2.9.9-py3.6.egg\uncompyle6\bin\uncompile.py", line 163, in main_bin
  File "C:\Users\UnknownWarrior8910\AppData\Local\Programs\Python\Python36-32\lib\site-packages\uncompyle6-2.9.9-py3.6.egg\uncompyle6\main.py", line 145, in main
  File "C:\Users\UnknownWarrior8910\AppData\Local\Programs\Python\Python36-32\lib\site-packages\uncompyle6-2.9.9-py3.6.egg\uncompyle6\main.py", line 72, in uncompyle_file
  File "C:\Users\UnknownWarrior8910\AppData\Local\Programs\Python\Python36-32\lib\site-packages\uncompyle6-2.9.9-py3.6.egg\uncompyle6\main.py", line 46, in uncompyle
  File "C:\Users\UnknownWarrior8910\AppData\Local\Programs\Python\Python36-32\lib\site-packages\uncompyle6-2.9.9-py3.6.egg\uncompyle6\semantics\pysource.py", line 2256, in deparse_code
  File "C:\Users\UnknownWarrior8910\AppData\Local\Programs\Python\Python36-32\lib\site-packages\uncompyle6-2.9.9-py3.6.egg\uncompyle6\scanners\scanner2.py", line 135, in ingest
  File "C:\Users\UnknownWarrior8910\AppData\Local\Programs\Python\Python36-32\lib\site-packages\uncompyle6-2.9.9-py3.6.egg\uncompyle6\scanners\scanner2.py", line 971, in find_jump_targets
  File "C:\Users\UnknownWarrior8910\AppData\Local\Programs\Python\Python36-32\lib\site-packages\uncompyle6-2.9.9-py3.6.egg\uncompyle6\scanners\scanner2.py", line 635, in detect_control_flow
  File "C:\Users\UnknownWarrior8910\AppData\Local\Programs\Python\Python36-32\lib\site-packages\uncompyle6-2.9.9-py3.6.egg\uncompyle6\scanners\scanner2.py", line 474, in next_except_jump
AssertionError

What should I do with this?
I'm using it on Windows 10.

Scripts without header

Hi rocky. Is it possible to decompile scripts without a header? I have a game which uses Python 2.7.x (CPython) with compiled scripts that do not contain the MAGIC and TIMESTAMP fields. Also, some strings are written in Unicode. Can you check it?

Samples: https://www.sendspace.com/file/zgxhk2

Thanks in advance! 👍 :)
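For reference, a Python 2.7 .pyc file begins with an 8-byte header: a 4-byte magic value followed by a 4-byte timestamp, with the marshalled code object after that. If the game strips exactly those 8 bytes (an assumption; it may also alter the payload), putting a header back could be sketched like this. The function name add_py27_header is made up for illustration, and 62211 is the Python 2.7 magic number that uncompyle6 prints in its output banner.

```python
import struct

# The magic value is a little-endian 16-bit number followed by CR LF.
PY27_MAGIC = struct.pack("<H", 62211) + b"\r\n"


def add_py27_header(payload, timestamp=0):
    """Prefix a headerless marshalled code object with a Python 2.7 pyc header."""
    return PY27_MAGIC + struct.pack("<I", timestamp) + payload


# Dummy payload standing in for the real headerless file contents.
fixed = add_py27_header(b"c\x00\x00")
print(fixed[:8])
```

With a real file, the payload would be the bytes read from the headerless script, and the result would be written out as a new .pyc before decompiling.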

directly pass code objects

hi there,

I'm pretty new to fiddling with Python bytecode. However, I have a binary file whose format seems to be
cluttered with some extra data; it is missing the correct Python magic and so on, so it is not a regular
pyc. It was created by PyInstaller on a Windows system, so the encoding is also a bit different. uncompyle6
complains about the wrong format. I can, however, unmarshal the code, so I end
up having a code object:

code
<code object at 0x7ff718016270, file "main.py", line 3>

which of course can be disassembled using dis.dis()

Now I wonder: is there a way I can transform it back to source using uncompyle6, by making
it use the code object directly or the dumped disassembled code?

Thank you
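The unmarshalling step described above can be sketched with only the standard library. Note the assumptions: marshal can only load data written by a compatible Python version, so this has to run under the Python version that produced the file, and the byte string below is generated in place as a stand-in for the payload extracted from the PyInstaller archive. Whether a given uncompyle6 release accepts a code object directly depends on its API; tracebacks elsewhere on this page show deparse_code in uncompyle6/semantics/pysource.py being called internally, but its signature is not documented here.

```python
import dis
import marshal

# Stand-in for the bytes extracted from the archive: a marshalled code object
# with no pyc header in front of it.
raw = marshal.dumps(compile("x = 1\nprint(x)\n", "main.py", "exec"))

code = marshal.loads(raw)  # only works for data from a compatible Python version
print(code.co_filename)    # file name recorded at compile time
dis.dis(code)              # disassemble the recovered code object
```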

ImportError: No module named pkg_resources

I installed python-2.7.2.msi (md5=44c8bbe92b644d78dd49e18df354386f).
I unpacked python-uncompyle6-master.zip (https://codeload.github.com/rocky/python-uncompyle6/zip/master) and put a bat file containing
C:\Python27\python.exe setup.py install
in the same folder as setup.py (in my case "I:\anime\python-uncompyle6-master\python-uncompyle6-master").
After running the bat file I get:

I:\anime\python-uncompyle6-master\python-uncompyle6-master>C:\Python27\python.exe setup.py install
Traceback (most recent call last):
  File "setup.py", line 11, in <module>
    __import__('pkg_resources')
ImportError: No module named pkg_resources

Modularize semantic actions

The tables for semantic actions in semantics/{pysource,fragment}.py span all versions. We need a better way to isolate version-specific rules for Python semantic actions.

DRY opcodes

DRY Python2 opcodes like we do for Python3. At the same time we could merge both. But right now I'd like to do this in two steps rather than one.

ImportError: Unknown magic number 62215

It seems this old issue is still not resolved:
wibiti/uncompyle2#23

➔ uncompyle6 zipfile.pyc 
Traceback (most recent call last):
  File "/usr/bin/uncompyle6", line 9, in <module>
    load_entry_point('uncompyle6==2.5.0', 'console_scripts', 'uncompyle6')()
  File "/usr/share/python2.7/site-packages/uncompyle6/bin/uncompile.py", line 160, in main_bin
  File "/usr/share/python2.7/site-packages/uncompyle6/main.py", line 119, in main
  File "/usr/share/python2.7/site-packages/uncompyle6/main.py", line 52, in uncompyle_file
  File "/usr/share/python2.7/site-packages/xdis/load.py", line 89, in load_module
ImportError: Unknown magic number 62215 in zipfile.pyc

To reproduce this, download the Dropbox Linux client and unpack it:

  • wget http://dl-web.dropbox.com/u/17/dropbox-lnx.x86_64-5.4.24.tar.gz
  • tar xf dropbox-lnx.x86_64-5.4.24.tar.gz
  • uncompyle6 .dropbox-dist/dropbox-lnx.x86_64-5.4.24/tornado-4.2-py2.7-linux-x86_64.egg/tornado/test/__init__.pyc

I just picked the smallest .pyc file from the dist, but any file would do.
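To see which magic number a file carries before handing it to the decompiler, its first four bytes can be inspected directly. A minimal sketch (the helper name read_magic is hypothetical): the magic number is a little-endian 16-bit integer followed by "\r\n". Stock CPython 2.7 uses 62211, so a file reporting 62215 differs from the stock 2.7 value.

```python
import struct


def read_magic(header):
    """Return the integer magic number from the first 4 bytes of a pyc file."""
    if len(header) < 4 or header[2:4] != b"\r\n":
        raise ValueError("not a pyc header: %r" % (header[:4],))
    (magic,) = struct.unpack("<H", header[:2])
    return magic


print(read_magic(b"\x03\xf3\r\n"))  # stock CPython 2.7 header bytes
print(read_magic(b"\x07\xf3\r\n"))  # header bytes that encode 62215
```

With a real file, pass the result of open(path, "rb").read(4).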

how to run uncompyle6?

win7
Python34

>python34 uncompyle6 -h
  File "uncompyle6", line 52
    program =  os.path.basename(__file__)
          ^
SyntaxError: invalid syntax

Missing comma.

Had an error decompiling a library. Here is a repro case where the decompiled file is not the same as the original. Would it be possible to fix it?

Original file

def some_function():
    return ['some_string']

def some_other_function():
    some_variable, = some_function()
    print(some_variable)

Decompiled file

# Python bytecode 2.7 (62211) disassembled from Python 2.7
# Embedded file name: E:/Junk/Uncompyle6_repro\some_file_original.py
# Compiled at: 2016-10-04 15:35:56


def some_function():
    return ['some_string']


def some_other_function():
    some_variable = some_function()
    print some_variable

Look at the comma after the 'some_variable' declaration in the source file, missing in the decompiled file. Personally would have coded it differently but eh...
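The comma matters because the two forms are not equivalent: "some_variable, = ..." unpacks a one-element sequence, while "some_variable = ..." binds the whole list. A short demonstration using the function from the report (the names unpacked, whole and opnames are added for illustration), which also shows that only the first form compiles to an UNPACK_SEQUENCE instruction:

```python
import dis


def some_function():
    return ['some_string']


unpacked, = some_function()   # single-element unpacking: binds the string
whole = some_function()       # plain assignment: binds the list itself
print(repr(unpacked), repr(whole))


def opnames(source):
    """Return the set of opcode names used by a compiled source snippet."""
    code = compile(source, "<example>", "exec")
    return {instruction.opname for instruction in dis.get_instructions(code)}


with_comma = "UNPACK_SEQUENCE" in opnames("a, = f()")
without_comma = "UNPACK_SEQUENCE" in opnames("a = f()")
print(with_comma, without_comma)
```

So a decompiler that drops the comma changes what the program prints, not just how it looks.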

repro.zip

Thanks a lot for your help.

Parse error at or near `POP_BLOCK'

$ python2 -m compileall test.py
Compiling test.py ...

$ python3 -c "import uncompyle6, sys;uncompyle6.uncompyle_file('test.pyc', sys.stdout)"
# uncompyle6 version 2.9.9
# Python bytecode 2.7 (62211)
# Decompiled from: Python 3.5.2 (default, Nov 17 2016, 17:05:23) 
# [GCC 5.4.0 20160609]
# Embedded file name: test.py
# Compiled at: 2017-01-27 16:38:32
Instruction context:
   
  11      43  JUMP_BACK             3  'to 3'
             46  JUMP_BACK             3  'to 3'
->           49  POP_BLOCK        


def test--- This code section failed: ---

   3       0  SETUP_LOOP           47  'to 50'
           3  LOAD_GLOBAL           0  'True'
           6  POP_JUMP_IF_FALSE    49  'to 49'

   4       9  LOAD_CONST            1  1
          12  STORE_FAST            1  'x'

   5      15  LOAD_FAST             1  'x'
          18  POP_JUMP_IF_FALSE    37  'to 37'

   6      21  LOAD_FAST             1  'x'
          24  POP_JUMP_IF_FALSE     3  'to 3'

   7      27  BREAK_LOOP       
          28  JUMP_ABSOLUTE        46  'to 46'

   9      31  JUMP_BACK             3  'to 3'
          34  JUMP_BACK             3  'to 3'

  10      37  LOAD_FAST             1  'x'
          40  POP_JUMP_IF_FALSE     3  'to 3'

  11      43  JUMP_BACK             3  'to 3'
          46  JUMP_BACK             3  'to 3'
          49  POP_BLOCK        
        50_0  COME_FROM                '0'

Parse error at or near `POP_BLOCK' instruction at offset 49Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/uncompyle6-2.9.9-py3.5.egg/uncompyle6/main.py", line 46, in uncompyle
  File "/usr/local/lib/python3.5/dist-packages/uncompyle6-2.9.9-py3.5.egg/uncompyle6/semantics/pysource.py", line 2302, in deparse_code
uncompyle6.semantics.pysource.SourceWalkerError: Deparsing stopped due to parse error

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/uncompyle6-2.9.9-py3.5.egg/uncompyle6/main.py", line 72, in uncompyle_file
  File "/usr/local/lib/python3.5/dist-packages/uncompyle6-2.9.9-py3.5.egg/uncompyle6/main.py", line 49, in uncompyle
uncompyle6.semantics.pysource.SourceWalkerError: Deparsing stopped due to parse error
$ cat test.py
def test(self):
    while True:
        x = 1
        if x:
            if x:
                break
            else:
                continue
        elif x:
            pass

If test.py is compiled with Python3 decompilation is successful
Python2 version is Python 2.7.12

Parse error at `EXTENDED_ARG'

$> uncompyle6 XXX_2.cpython-35.pyc

# uncompyle6 version 2.9.9
# Python bytecode 3.5 (3350)
# Decompiled from: Python 3.5.1 (v3.5.1:37a07cee5969, Dec  6 2015, 15:54:25) [MSC v.1900 64 bit (AMD64)]
# Embedded file name: C:\Users\.....\XXX_2.py
# Compiled at: 2016-12-25 15:22:48
# Size of source mod 2**32: 615 bytes
Instruction context:

  12      12  LOAD_NAME                'XXX_1'
             15  LOAD_ATTR                'RContext'
             18  LOAD_CONST               ('ctx',)
             21  LOAD_CONST               '<code_object enterR>'
             24  LOAD_CONST               'XXX_2.enterR'
->	         27  EXTENDED_ARG          2  ''
             30  MAKE_FUNCTION_A_2_0        '0 positional, 0 keyword pair, 2 annotated'
             33  STORE_NAME               'enterR'

# file XXX_2.cpython-35.pyc
# --- This code section failed: ---

   9       0  LOAD_NAME                '__name__'
           3  STORE_NAME               '__module__'
           6  LOAD_CONST               'XXX_2'
           9  STORE_NAME               '__qualname__'

  12      12  LOAD_NAME                'XXX_1'
          15  LOAD_ATTR                'RContext'
          18  LOAD_CONST               ('ctx',)
          21  LOAD_CONST               '<code_object enterR>'
          24  LOAD_CONST               'XXX_2.enterR'
          27  EXTENDED_ARG          2  ''
          30  MAKE_FUNCTION_A_2_0        '0 positional, 0 keyword pair, 2 annotated'
          33  STORE_NAME               'enterR'

  16      36  LOAD_NAME                'XXX_1'
          39  LOAD_ATTR                'RContext'
          42  LOAD_CONST               ('ctx',)
          45  LOAD_CONST               '<code_object exitR>'
          48  LOAD_CONST               'XXX_2.exitR'
          51  EXTENDED_ARG          2  ''
          54  MAKE_FUNCTION_A_2_0        '0 positional, 0 keyword pair, 2 annotated'
          57  STORE_NAME               'exitR'

Parse error at or near `EXTENDED_ARG' instruction at offset 27

syntax error at or near `POP_BLOCK`...

python34 C:\Python34\Scripts\uncompyle6 --asm -o C:\dc\ 1_Lisa.pyc

...
...
        1510    LOAD_NAME         'resp'
        1513    CALL_FUNCTION     '1 positional, 0 keyword pair'
        1516    POP_TOP           ''
        1517    JUMP_FORWARD      'to 1520'
      1520_0    COME_FROM         '1517'
        1520    JUMP_BACK         1476
        1523    POP_BLOCK         ''
      1524_0    COME_FROM         '1456'
      1524_1    COME_FROM         '1459'
        1524    JUMP_BACK         1200
        1527    POP_BLOCK         ''
      1528_0    COME_FROM         '1197'
        1528    JUMP_BACK         1163
        1531    POP_BLOCK         ''
      1532_0    COME_FROM         '1160'
        1532    JUMP_FORWARD      'to 1535'
      1535_0    COME_FROM         '1532'
        1535    LOAD_CONST        ''
        1538    RETURN_VALUE      ''
decompiled 0 files: 0 okay, 1 failed

python34 C:\Python34\Scripts\uncompyle6 --asm 1_Lisa.pyc

...
...
        1524    JUMP_BACK         1200
        1527    POP_BLOCK         ''
      1528_0    COME_FROM         '1197'
        1528    JUMP_BACK         1163
        1531    POP_BLOCK         ''
      1532_0    COME_FROM         '1160'
        1532    JUMP_FORWARD      'to 1535'
      1535_0    COME_FROM         '1532'

Syntax error at or near `POP_BLOCK' token at offset 1527

# Can't uncompile C:\Python34\home\python-uncompyle6-master\bin\1_Lisa.pyc

The pyc and the files decompiled using pycdc: https://drive.google.com/open?id=0B69FzC4j62F9ZmlILUt3eWlvSVE
pycdc decompiled a larger number of files than uncompyle6.

Breakage with xdis 3.2.0

I have this library installed for using hypothesis in one of my projects. Today I tried upgrading to the latest xdis (3.2.0) and got this breakage:

self = <uncompyle6.scanners.scanner27.Scanner27 object at 0x7fb928b3f790>
op = 12
    def op_size(self, op):
        """
            Return size of operator with its arguments
            for given opcode <op>.
            """
>       if op < self.opc.HAVE_ARGUMENT and op not in self.opc.hasArgumentExtended:
E       AttributeError: 'module' object has no attribute 'hasArgumentExtended'

Full stack trace can be seen here on Travis: https://travis-ci.org/adamchainz/mariadb-dyncol/builds/170691586

Decompilation failed

Python 2.7
uncompyle6 v 2.7.0

Syntax error at or near `JUMP_BACK' token at offset 33

Source code:

for a in b:
    try:c()
    except:continue

Hardcoded Python path in last release

When I install from PyPI and try to run uncompyle6 I get:

$ uncompyle6 
/usr/local/Cellar/pyenv/20160422/libexec/pyenv-exec: /usr/local/var/pyenv/versions/2.7.11/bin/uncompyle6: /home/rocky/.pyenv/versions/3.5.1/bin/python: bad interpreter: No such file or directory
/usr/local/Cellar/pyenv/20160422/libexec/pyenv-exec: line 47: /usr/local/var/pyenv/versions/2.7.11/bin/uncompyle6: Undefined error: 0
$

As you can see I'm using pyenv, but I think something went wrong when building the whl file or something, that it hardcoded the /home/rocky/.pyenv/versions/3.5.1/bin/python path. Any idea how to fix it?
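A "bad interpreter" message means the first line of the installed script (its #! line) names an interpreter that does not exist on this machine. A quick way to confirm that is to read that line and test the path. This is a diagnostic sketch only: the helper names shebang_interpreter and has_valid_interpreter are hypothetical, and only the simple "#!/path/to/python" form is handled.

```python
import os


def shebang_interpreter(script_path):
    """Return the interpreter path from a script's #! line, or None if absent."""
    with open(script_path, "rb") as script:
        first_line = script.readline().decode("utf-8", "replace").strip()
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    return parts[0] if parts else None


def has_valid_interpreter(script_path):
    """True when the script's #! line points at an existing file."""
    interpreter = shebang_interpreter(script_path)
    return interpreter is not None and os.path.exists(interpreter)
```

Running shebang_interpreter on the installed uncompyle6 script would show whether the path was baked in at build time.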
