hprof -- a Python library for .hprof file analysis

The hprof library allows convenient access to Java .hprof files, generated by e.g. OpenJDK's JVM or Android's ART. It was built to enable scripted or interactive analysis of heap dumps.

Key features

  • Java classes and objects are modeled 1:1 as Python classes and objects
  • Fields are exposed as Python attributes
  • Arrays are indexable and iterable
  • isinstance() and issubclass() work as expected
  • Tab completion works, enabling convenient exploration

Examples

Class lookups by name:

>>> car_cls, = heap.classes['com.example.cars.Car']
>>> car_cls
<JavaClass 'com.example.cars.Car'>

Finding all instances of a class, field access, instanceof:

>>> vehicles = heap.all_instances('com.example.cars.Vehicle')
>>> vehicles = sorted(vehicles, key=lambda v: str(v.make))
>>> for v in vehicles:
...     print(type(v), v.make, isinstance(v, car_cls))
com.example.cars.Bike Axes False
com.example.cars.Bike Fånark False
com.example.cars.Car Lolvo True
com.example.cars.Limo Stretch True
com.example.cars.Car Toy Yoda True

Reading object arrays:

>>> carex, = heap.all_instances('com.example.Cars')
>>> carex
<com.example.Cars 0x...>
>>> carex.vehicles
<com.example.cars.Vehicle[5] 0x...>
>>> print(carex.vehicles)
Vehicle[5] {<com.example.cars.Car 0x...>, <com.example...}
>>> carex.vehicles[0]
<com.example.cars.Car 0x...>
>>> print(carex.vehicles[0])
Car@...

Accessing a field, printing a java.lang.String:

>>> carex.vehicles[0].make
<java.lang.String 0x...>
>>> print(carex.vehicles[0].make)
Lolvo
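
For scripted analysis, a common first step is a per-type histogram of some set of objects. The sketch below is a minimal illustration, not part of the library: count_by_type is a hypothetical helper, and the Car and Bike classes are plain Python stand-ins so the snippet runs without a heap dump. It relies only on type() and iteration, both of which appear in the examples above.

```python
from collections import Counter


def count_by_type(instances):
    """Count objects per type, keyed by the printed name of each type."""
    return Counter(str(type(obj)) for obj in instances)


# Stand-in classes (hypothetical) so the sketch runs on its own.
class Car:
    pass


class Bike:
    pass


counts = count_by_type([Car(), Bike(), Car()])

# Print the most common types first.
for name, n in counts.most_common():
    print(n, name)
```

With a loaded heap, the same helper could presumably be fed the result of heap.all_instances(...) or an object array such as carex.vehicles.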

Limitations

Supports heap dumps only

At least for now.

.hprof files are quite versatile, with many different record types. The hprof library currently parses only the record types that were needed for heap dump analysis.

Reference types

.hprof files don't contain information about the declared type of object references. Hence, there is no way to tell the difference between a and b in this case:

String a = "hello";
Object b = a;

In Java, we would see only Object fields when accessing b; any fields declared in String, including any that hide inherited fields of the same name, would only be accessible through a, or by casting b to String.

Since the hprof library cannot know the declared type, both a and b will behave as if they were references to the exact type of the object -- String, in this case.

The cast() function allows emulating the Java behavior by explicitly supplying the type of the reference.
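
To make the idea of a declared reference type concrete, here is a toy model written from scratch. It is not the library's cast() and says nothing about its signature or implementation; ToyClass, ToyObject, ToyRef and toy_cast are all hypothetical names. The sketch only shows how limiting field lookup to the declared type and its superclasses reproduces the Java behavior described above.

```python
class ToyClass:
    """Hypothetical stand-in for a Java class."""

    def __init__(self, name, superclass=None, fields=()):
        self.name = name
        self.superclass = superclass
        self.fields = tuple(fields)

    def ancestry(self):
        # Yield this class, then each superclass up to the root.
        cls = self
        while cls is not None:
            yield cls
            cls = cls.superclass

    def visible_fields(self):
        # Fields declared on this class or any of its superclasses.
        return {f for cls in self.ancestry() for f in cls.fields}


class ToyObject:
    """Hypothetical heap object: an exact class plus field values."""

    def __init__(self, cls, values):
        self.cls = cls
        self.values = dict(values)


class ToyRef:
    """A reference that remembers its declared type."""

    def __init__(self, obj, declared):
        self._obj = obj
        self._declared = declared

    def __getattr__(self, name):
        # Only fields visible through the declared type are reachable.
        if name in self._declared.visible_fields():
            return self._obj.values[name]
        raise AttributeError(f"{self._declared.name} has no field {name!r}")


def toy_cast(obj, declared):
    # Like a Java cast, refuse types the object does not actually have.
    if declared not in obj.cls.ancestry():
        raise TypeError(f"cannot cast {obj.cls.name} to {declared.name}")
    return ToyRef(obj, declared)


object_cls = ToyClass("java.lang.Object")
string_cls = ToyClass("java.lang.String", object_cls, fields=["value"])
hello = ToyObject(string_cls, {"value": "hello"})

a = toy_cast(hello, string_cls)  # String a = "hello";
b = toy_cast(hello, object_cls)  # Object b = a;

print(a.value)
print(hasattr(b, "value"))
```

Through a, the String field is reachable; through b, the lookup raises AttributeError, so hasattr() reports False.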

Name collisions

Since Java objects and classes are modeled as Python objects and classes, there may be name collisions between values from the .hprof file and the library's Python code.

The library prefixes all its internal fields with _hprof_ to minimize this risk, but this provides no guarantees. In particular, a Java application could pick its names maliciously to maximize collisions. On top of collisions with library-internal fields, there may also be name conflicts with Python special methods such as __dir__ or __str__.

It will not be possible to avoid all these situations, but we still welcome reports of bugs caused by such collisions; many may be fixable.
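
The following toy model shows how such collisions can arise. It assumes, purely for illustration, that Java fields are served through __getattr__ and that internal state lives under a prefixed attribute; ToyJavaObject and the _toy_ prefix are hypothetical, and the library's actual mechanism may differ.

```python
class ToyJavaObject:
    """Hypothetical model: Java fields are exposed via __getattr__."""

    def __init__(self, fields):
        # Internal state uses a prefixed name, similar in spirit to _hprof_.
        self._toy_fields = dict(fields)

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails.
        try:
            return self._toy_fields[name]
        except KeyError:
            raise AttributeError(name) from None


obj = ToyJavaObject({
    "make": "Lolvo",           # ordinary field name: no collision
    "__str__": "java value",   # collides with a Python special method
    "_toy_fields": "clash",    # collides with the model's internal state
})

print(obj.make)                              # the Java field
print(obj.__str__ == "java value")           # Python's own __str__ wins
print(isinstance(obj._toy_fields, dict))     # internal dict wins
```

In this model the two colliding Java fields are silently hidden, which is the kind of failure a prefix can make unlikely but cannot rule out.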

Object/Class hierarchy

Our model of the Java class hierarchy breaks down at the top where java.lang.Object and java.lang.Class meet. This is because Class is a subclass of Object, while Object (like every class) is itself an instance of Class; this circular relationship proved too difficult to model.

There are some special cases in functions like isinstance() that try to patch this up, but there will probably be holes remaining.

This should not be a problem for most use cases, but may be good to know if you're doing really advanced stuff.
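
As an analogy only, Python's own object model has the same circular shape, with object and type in the roles of java.lang.Object and java.lang.Class. The snippet below uses nothing but built-ins and does not show how the library handles the Java side.

```python
# object plays the role of java.lang.Object, type that of java.lang.Class.
checks = {
    "object is an instance of type": isinstance(object, type),
    "type is a subclass of object": issubclass(type, object),
    "type is an instance of itself": isinstance(type, type),
}

for description, result in checks.items():
    print(f"{description}: {result}")
```

All three checks hold in Python, where the interpreter bootstraps this loop internally.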

High memory usage

The hprof library instantiates all heap objects when you open your file. This makes the implementation a little simpler, and explicitly ensures that all heap references are valid at load time, so that you won't have nasty surprises later.

This can be quite memory intensive, especially when working with large .hprof files.
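
One way to gauge the cost before opening a very large dump is to measure peak Python allocations while loading a smaller one. The sketch below uses the standard tracemalloc module. peak_memory_of is a hypothetical helper, and fake_load is a stand-in for whatever call opens your .hprof file, which this README does not show.

```python
import tracemalloc


def peak_memory_of(load):
    """Run load() and return (result, peak traced bytes)."""
    tracemalloc.start()
    try:
        result = load()
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def fake_load():
    # Stand-in: eagerly create many small objects, as a loader might.
    return [object() for _ in range(100_000)]


objs, peak = peak_memory_of(fake_load)
print(f"{len(objs)} objects, peak of about {peak / 1e6:.1f} MB traced")
```

tracemalloc only sees allocations made through Python's allocator and slows execution noticeably, so treat the number as a rough lower bound rather than the full process footprint.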

Callstacks not supported (yet?)

.hprof files may contain various callstacks, perhaps most interestingly allocation callstacks. The hprof library currently skips over them while parsing.

Does not expose all heap dump information (yet?)

The hprof library does not expose all information available in the heap dumps. If your tool needs something, feel free to contribute to the library!

Contributions

Contributions of all kinds are welcome!

Please see CONTRIBUTING.md for details.

Credits

Originally written by Snild Dolkow at Sony Mobile Communications Inc.

Please see AUTHORS and the git log for other contributors and detailed history.
