python-typedstream's Introduction

pytypedstream

A pure Python, cross-platform library/tool for reading Mac OS X and NeXTSTEP typedstream files.

The typedstream format is a serialization format for C and Objective-C data structures. It is used by Apple's implementation of the Foundation classes NSArchiver and NSUnarchiver, and is based on the data format originally used by NeXTSTEP's NXTypedStream APIs.

The NSArchiver and NSUnarchiver classes and the typedstream format are superseded by NSKeyedArchiver and NSKeyedUnarchiver, which use binary property lists for serialization. NSArchiver and NSUnarchiver have been deprecated since macOS 10.13 (but are still available as of macOS 10.15) and have never been available to application developers on other Apple platforms (iOS, watchOS, tvOS). Despite this, the typedstream data format is still used by some macOS components and applications, such as the Stickies and Grapher applications.

Note: The typedstream data format is undocumented and specific to the Mac OS X implementation of NSArchiver/NSUnarchiver. Other Objective-C/Foundation implementations may also provide NSArchiver/NSUnarchiver, but might not use the same data format. For example, GNUstep's NSArchiver/NSUnarchiver implementations use a completely different format, with a GNUstep archive signature string.
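
Because other implementations use different formats, it can be useful to check whether a blob looks like an Apple typedstream before trying to decode it. The following is a minimal sketch, not part of pytypedstream: the helper name `sniff_typedstream` is hypothetical, and the header layout (a streamer version byte, a length byte, then a signature string) is inferred from the sample bytes and the command-line output shown in the Examples section. The big-endian signature `typedstream` is an assumption that does not appear in this README.

```python
from typing import Optional, Tuple

# Signature strings mapped to byte order. b"streamtyped" appears in this
# README's sample data together with "byte order little"; b"typedstream"
# for big-endian data is an assumption.
_SIGNATURES = {
    b"streamtyped": "little",
    b"typedstream": "big",
}


def sniff_typedstream(data: bytes) -> Optional[Tuple[int, str]]:
    """Return (streamer_version, byte_order) if data starts like a typedstream.

    Returns None if the header does not match the assumed layout.
    """
    if len(data) < 2:
        return None
    streamer_version = data[0]  # first byte, 4 in the sample data
    signature_length = data[1]  # second byte, 11 in the sample data
    signature = data[2:2 + signature_length]
    byte_order = _SIGNATURES.get(signature)
    if byte_order is None:
        return None
    return streamer_version, byte_order
```

For the sample bytes used in the simple example in this README, this returns `(4, "little")`. Anything that does not start with one of the two signatures, such as a binary property list, yields `None`.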

Features

  • Pure Python, cross-platform - no native Mac APIs are used.

  • Provides both a Python API (for use in programs and in the REPL) and a command-line tool (for quick inspection of files from the command line).

  • Typedstream data is automatically parsed and translated to appropriate Python data types.

    • Standard Foundation objects and structures are recognized and translated to dedicated Python objects with properly named attributes.
    • Users of the library can create custom Python classes to decode and represent classes and structure types that aren't supported by default.
    • Unrecognized objects and structures are still parsed, but are represented as generic objects whose contents can only be accessed by index (because typedstream data doesn't include field names).
  • Unlike with the Objective-C NSCoder API, values can be read without knowing their exact type beforehand. However, when the expected types in a certain context are known, the types in the typedstream are checked to make sure that they match.

  • In addition to the high-level NSUnarchiver-style decoder, an iterative stream/event-based reader is provided, which can be used (to some extent) to read large typedstreams without storing them fully in memory.

Requirements

Python 3.6 or later. No other libraries are required, except on older Python versions, which need additional libraries. These libraries are declared as platform-specific dependencies and will be installed automatically as needed.

Installation

pytypedstream is available on PyPI and can be installed using pip:

$ python3 -m pip install pytypedstream

Alternatively, you can download the source code manually and run this command in the source code directory to install it:

$ python3 -m pip install .

Examples

Simple example

>>> import typedstream
>>> data = b"\x04\x0bstreamtyped\x81\xe8\x03\x84\x01@\x84\x84\x84\x08NSString\x01\x84\x84\x08NSObject\x00\x85\x84\x01+\x0cstring value\x86"
>>> typedstream.unarchive_from_data(data)
NSString('string value')
>>> _.value
'string value'

Low-level stream reading

>>> from typedstream.stream import TypedStreamReader
>>> ts = TypedStreamReader.from_data(data)
>>> for event in ts:
...     print(event)
typedstream.stream.BeginTypedValues([b"@"])
typedstream.stream.BeginObject()
typedstream.stream.SingleClass(name=b'NSString', version=1)
...
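
Because the reader yields one event at a time, a stream can be summarized without collecting its events in a list. This is a minimal sketch that assumes only the `TypedStreamReader.from_data` call and the iteration shown above; the helper names `count_event_types` and `tally_typedstream` are hypothetical and not part of pytypedstream.

```python
from collections import Counter
from typing import Iterable


def count_event_types(events: Iterable[object]) -> Counter:
    """Tally events by class name, looking at one event at a time."""
    counts: Counter = Counter()
    for event in events:
        counts[type(event).__name__] += 1
    return counts


def tally_typedstream(data: bytes) -> Counter:
    """Count the low-level events in typedstream data (needs pytypedstream)."""
    # Imported here so that count_event_types stays usable on its own.
    from typedstream.stream import TypedStreamReader

    return count_event_types(TypedStreamReader.from_data(data))
```

Calling `tally_typedstream(data)` with the `data` bytes from the simple example returns a `Counter` keyed by event class name, such as `BeginTypedValues` or `BeginObject`. Plain values in the stream are counted under their Python type names.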

Command-line interface

Full high-level decoding:

$ pytypedstream decode StickiesDatabase
NSMutableArray, 2 elements:
    object of class Document v1, extends NSObject v0, contents:
        NSMutableData(b"rtfd\x00\x00\x00\x00[...]")
        0
        struct ?:
            struct ?:
                8.0
                224.0
            struct ?:
                442.0
                251.0
        0
        <NSDate: 2003-03-05 02:13:27.454397+00:00>
        <NSDate: 2007-09-26 14:51:07.340778+00:00>
    [...]

Low-level stream-based reading/dumping:

$ pytypedstream read StickiesDatabase
streamer version 4, byte order little, system version 1000

begin typed values (types [b'@'])
    begin literal object (#0)
        class NSMutableArray v0 (#1)
        class NSArray v0 (#2)
        class NSObject v0 (#3)
        None
        begin typed values (types [b'i'])
            2
        end typed values
        begin typed values (types [b'@'])
            begin literal object (#4)
                class Document v1 (#5)
                <reference to class #3>
                [...]
            end literal object
        end typed values
        [...]
    end literal object
end typed values
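
The `(#0)`, `(#1)` and later numbers in this dump suggest that literal objects and classes are numbered in the order in which they first appear, so that later occurrences such as `<reference to class #3>` can point back at them. The following is a simplified model of that idea, inferred from the dump above rather than taken from pytypedstream's implementation; `ReferenceTable`, its methods, and `replay_dump` are hypothetical names.

```python
from typing import List


class ReferenceTable:
    """Numbers entries in order of first appearance, like the (#n) labels."""

    def __init__(self) -> None:
        self._entries: List[str] = []

    def add(self, label: str) -> int:
        """Register a newly seen object or class and return its number."""
        self._entries.append(label)
        return len(self._entries) - 1

    def resolve(self, number: int) -> str:
        """Look up the entry that a back-reference points to."""
        return self._entries[number]


def replay_dump() -> ReferenceTable:
    """Register the entries in the same order as the dump above."""
    table = ReferenceTable()
    for label in [
        "literal object (NSMutableArray instance)",  # 0
        "class NSMutableArray v0",                   # 1
        "class NSArray v0",                          # 2
        "class NSObject v0",                         # 3
        "literal object (Document instance)",        # 4
        "class Document v1",                         # 5
    ]:
        table.add(label)
    return table
```

With this model, `replay_dump().resolve(3)` gives `"class NSObject v0"`, which matches the superclass that `Document v1` refers to in the dump instead of repeating the class information.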

Limitations

Many common classes and structure types from Foundation, AppKit, and other standard frameworks are not supported yet. How each class encodes its data in a typedstream is almost never documented, and the relevant Objective-C implementation source code is normally not available, so usually the only way to find out the meaning of the values in a typedstream is through experimentation and educated guessing.

Writing typedstream data is not supported at all.

License

Copyright (C) 2020 dgelessus

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this program. If not, see https://www.gnu.org/licenses/.

Changelog

Version 0.1.1 (next version)

  • (no changes yet)

Version 0.1.0

  • Initial version.

python-typedstream's Issues

How to access content once decoded

Once you have decoded a typedstream, how do you access its content (e.g. "bar")?

foo = typedstream.unarchive_from_file("typedstream")
# <-- What comes here?
type b'@': object of class NSMutableAttributedString v0, extends NSAttributedString v0, extends NSObject v0:
        super object: <NSObject>
        type b'@': NSMutableString('bar')

Maybe time to release?

While this thing could use some more instructions, if you can figure out how to use it, it works pretty well as is. This issue is just a suggestion to make it available via pip. Don't let perfect be the enemy of good enough :-) And thank you for making this; it is very useful for parsing some data fields in the macOS Messages SQLite database.

Unarchiving of NSURL

Thanks for this great library. I added support for NSURL in a personal project. See the diff below. I don't care about credit and am happy to just move this forward.

@archiving.archived_class
class NSURL(NSObject):
	value: str

	def _init_from_unarchiver_(self, unarchiver: archiving.Unarchiver, class_version: int) -> None:
		if class_version != 0:
			raise ValueError(f"Unsupported version: {class_version}")

		# Unknown flag
		unarchiver.decode_value_of_type(b"c")

		# URL string
		_nsstr_rep = unarchiver.decode_value_of_type(b"@")
		self.value = _nsstr_rep.value

	def __repr__(self) -> str:
		return f"{type(self).__name__}({self.value!r})"
