
zodbpickle's Introduction

zodbpickle README

This package presents a uniform pickling interface for ZODB:

  • Under Python2, this package forks both Python 2.7's pickle and cPickle modules, adding support for the protocol 3 opcodes. It also provides a new subclass of bytes, zodbpickle.binary, which Python2 applications can use to pickle binary values such that they will be unpickled as bytes under Py3k.
  • Under Py3k, this package forks the pickle module (and the supporting C extension) from both Python 3.2 and Python 3.3. The fork adds support for the noload operations used by ZODB.

Caution

zodbpickle relies on Python's pickle module. The pickle module is not intended to be secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source as arbitrary code might be executed.

Also see https://docs.python.org/3.6/library/pickle.html
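
One common way to reduce (not eliminate) this risk is to restrict which globals an unpickler may resolve, as described in the Python pickle documentation. The sketch below uses the standard library's pickle as a stand-in; that zodbpickle's forked Unpickler honours the same find_class() override is an assumption, and the names SAFE_BUILTINS, RestrictedUnpickler and restricted_loads are hypothetical.

```python
import builtins
import io
import pickle  # stdlib stand-in; the zodbpickle fork is assumed to behave alike

# Hypothetical allowlist: only these builtins may be resolved while unpickling.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Refuse every global that is not explicitly allowed.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))


def restricted_loads(data):
    """Like pickle.loads(), but with the allowlist applied."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# A harmless pickle still loads.
ok = restricted_loads(pickle.dumps([1, 2, range(3)]))

# A pickle that tries to reach os.system is rejected before anything runs.
try:
    restricted_loads(b"cos\nsystem\n(S'echo hello'\ntR.")
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

An allowlist narrows the attack surface but is no substitute for only unpickling trusted data.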

General Usage

To get compatibility between Python 2 and 3 pickling, replace:

import pickle

by:

from zodbpickle import pickle

This provides compatibility, but has the effect that you get the fast implementation in Python 3, while Python 2 uses the slow version.

To get a more deterministic choice of the implementation, use one of:

from zodbpickle import fastpickle # always C
from zodbpickle import slowpickle # always Python

Both modules can co-exist, which is helpful for comparison.
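
A minimal sketch of such a comparison follows. It assumes both modules expose dumps() and loads() like the standard pickle module; if zodbpickle is not installed, the standard library is used as a stand-in so the snippet still runs. The helper load_impl is hypothetical.

```python
import importlib


def load_impl(name):
    """Import a pickle implementation by dotted name, or return None."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


candidates = ("zodbpickle.fastpickle", "zodbpickle.slowpickle")
available = {}
for name in candidates:
    module = load_impl(name)
    if module is not None:
        available[name] = module

if not available:
    # Stand-in so the sketch runs without zodbpickle installed.
    import pickle
    available["pickle"] = pickle

value = {"answer": 42, "items": [1, 2, 3]}

# Round-trip the same value through every implementation that was found.
results = {
    name: module.loads(module.dumps(value, 2))
    for name, module in available.items()
}
```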

But there is a bit more to consider, so please read on!

Loading/Storing Python 2 Strings

In all their wisdom, the Python developers have decided that Python 2 str instances should be loaded as Python 3 str objects (i.e. unicode strings). Patches were proposed in Python issue 6784 but were never applied. This code base contains those patches.

Example 1: Loading Python 2 pickles on Python 3:

$ python2
>>> import pickle
>>> pickle.dumps('\xff', protocol=0)
"S'\\xff'\np0\n."
>>> pickle.dumps('\xff', protocol=1)
'U\x01\xffq\x00.'
>>> pickle.dumps('\xff', protocol=2)
'\x80\x02U\x01\xffq\x00.'

$ python3
>>> from zodbpickle import pickle
>>> pickle.loads(b"S'\\xff'\np0\n.", encoding='bytes')
b'\xff'
>>> pickle.loads(b'U\x01\xffq\x00.', encoding='bytes')
b'\xff'
>>> pickle.loads(b'\x80\x02U\x01\xffq\x00.', encoding='bytes')
b'\xff'

Example 2: Loading Python 3 pickles on Python 2:

$ python3
>>> from zodbpickle import pickle
>>> pickle.dumps(b"\xff", protocol=0)
b'c_codecs\nencode\np0\n(V\xff\np1\nVlatin1\np2\ntp3\nRp4\n.'
>>> pickle.dumps(b"\xff", protocol=1)
b'c_codecs\nencode\nq\x00(X\x02\x00\x00\x00\xc3\xbfq\x01X\x06\x00\x00\x00latin1q\x02tq\x03Rq\x04.'
>>> pickle.dumps(b"\xff", protocol=2)
b'\x80\x02c_codecs\nencode\nq\x00X\x02\x00\x00\x00\xc3\xbfq\x01X\x06\x00\x00\x00latin1q\x02\x86q\x03Rq\x04.'

$ python2
>>> import pickle
>>> pickle.loads('c_codecs\nencode\np0\n(V\xff\np1\nVlatin1\np2\ntp3\nRp4\n.')
'\xff'
>>> pickle.loads('c_codecs\nencode\nq\x00(X\x02\x00\x00\x00\xc3\xbfq\x01X\x06\x00\x00\x00latin1q\x02tq\x03Rq\x04.')
'\xff'
>>> pickle.loads('\x80\x02c_codecs\nencode\nq\x00X\x02\x00\x00\x00\xc3\xbfq\x01X\x06\x00\x00\x00latin1q\x02\x86q\x03Rq\x04.')
'\xff'
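
The long pickles in Example 2 arise because protocols 0 to 2 have no opcode for bytes, so Python 3 writes a call to _codecs.encode(text, 'latin1') instead. The sketch below inspects this with the standard library's pickle and pickletools; using the standard library rather than zodbpickle here is a stand-in chosen so the snippet runs anywhere.

```python
import pickle
import pickletools

# Protocol 2 has no bytes opcode, so bytes are stored as a call to
# _codecs.encode(<text>, 'latin1').
data = pickle.dumps(b"\xff", protocol=2)
global_args = [
    arg for op, arg, _pos in pickletools.genops(data) if op.name == "GLOBAL"
]

# Protocol 3 introduced dedicated bytes opcodes.
proto3_ops = [
    op.name
    for op, _arg, _pos in pickletools.genops(pickle.dumps(b"\xff", protocol=3))
]

restored = pickle.loads(data)
```

Because the reconstruction is an ordinary function call, Python 2 can load such a pickle without knowing anything about Python 3 bytes.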

Example 3: everything breaks down:

$ python2
>>> class Foo(object):
...     def __init__(self):
...         self.x = 'hello'
...
>>> import pickle
>>> pickle.dumps(Foo(), protocol=0)
"ccopy_reg\n_reconstructor\np0\n(c__main__\nFoo\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'x'\np6\nS'hello'\np7\nsb."
>>> pickle.dumps(Foo(), protocol=1)
'ccopy_reg\n_reconstructor\nq\x00(c__main__\nFoo\nq\x01c__builtin__\nobject\nq\x02Ntq\x03Rq\x04}q\x05U\x01xq\x06U\x05helloq\x07sb.'
>>> pickle.dumps(Foo(), protocol=2)
'\x80\x02c__main__\nFoo\nq\x00)\x81q\x01}q\x02U\x01xq\x03U\x05helloq\x04sb.'

$ python3
>>> from zodbpickle import pickle
>>> class Foo(object): pass
...
>>> foo = pickle.loads(b"ccopy_reg\n_reconstructor\np0\n(c__main__\nFoo\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'x'\np6\nS'hello'\np7\nsb.", encoding='bytes')
>>> foo.x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Foo' object has no attribute 'x'

Wait, what?

>>> foo.__dict__
{b'x': b'hello'}

oooh. So we use encoding='ASCII' (the default) and errors='bytes' and hope it works:

>>> foo = pickle.loads(b"ccopy_reg\n_reconstructor\np0\n(c__main__\nFoo\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'x'\np6\nS'hello'\np7\nsb.", errors='bytes')
>>> foo.x
'hello'

falling back to bytes if necessary:

>>> pickle.loads(b'\x80\x02U\x01\xffq\x00.', errors='bytes')
b'\xff'

Support for noload()

The ZODB uses cPickle's noload() method to retrieve all persistent references from a pickle without loading any objects. This feature was removed from Python 3's pickle. Unfortunately, without it every object has to be loaded just to find its references, which unnecessarily fills the pickle cache.

This module provides a noload() method again.
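
To illustrate what noload() saves, here is a sketch of collecting persistent references with the standard library alone, which has to perform a full load(). The classes Ref, RefPickler and Collector are hypothetical; with zodbpickle, the assumption is that the unpickler's noload() would be called in place of load() so that no objects are constructed along the way.

```python
import io
import pickle  # stdlib stand-in: it has load() but no noload()


class Ref:
    """Hypothetical stand-in for a persistent object with an object id."""

    def __init__(self, oid):
        self.oid = oid


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Store only the id for Ref instances; pickle everything else as usual.
        return obj.oid if isinstance(obj, Ref) else None


class Collector(pickle.Unpickler):
    def __init__(self, file):
        super().__init__(file)
        self.refs = []

    def persistent_load(self, pid):
        # Record the reference instead of resolving it.
        self.refs.append(pid)
        return None


buf = io.BytesIO()
RefPickler(buf, protocol=2).dump({"a": Ref("oid-1"), "b": [Ref("oid-2")]})

collector = Collector(io.BytesIO(buf.getvalue()))
collector.load()  # full load: the dict and list are built just to find refs
found = collector.refs
```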

zodbpickle's People

Contributors

agroszer, ctismer, dataflake, icemac, jamadden, jugmac00, kedder, mgedmin, strichter, tseaver

zodbpickle's Issues

ImportError: No module named pythainlp.tokenize.multi_cut

I get this error while running:

Traceback (most recent call last):
  File "dictionary.py", line 13, in <module>
    w2w = Word2word(codes[0], codes[1], dict_path=dict_path)
  File "/word2word/word2word.py", line 12, in __init__
    self.word2x, self.y2word, self.x2ys = download_or_load(lang1, lang2, dict_path)
  File "/word2word/word2word/utils.py", line 52, in download_or_load
    word2x, y2word, x2ys = pickle.load(open(fpath, 'rb'))
  File "/usr/local/lib/python2.7/site-packages/zodbpickle/pickle_2.py", line 1525, in load
    return Unpickler(file).load()
  File "/usr/local/lib/python2.7/site-packages/zodbpickle/pickle_2.py", line 881, in load
    dispatch[key](self)
  File "/usr/local/lib/python2.7/site-packages/zodbpickle/pickle_2.py", line 1142, in load_global
    klass = self.find_class(module, name)
  File "/usr/local/lib/python2.7/site-packages/zodbpickle/pickle_2.py", line 1176, in find_class
    __import__(module)
ImportError: No module named pythainlp.tokenize.multi_cut

As suggested, I have added pythainlp with pip install pythainlp -U, but then I get

[nltk_data] Downloading package wordnet to
[nltk_data]     /Users/loretoparisi/nltk_data...
[nltk_data]   Unzipping corpora/wordnet.zip.
[nltk_data] Downloading package omw to
[nltk_data]     /Users/loretoparisi/nltk_data...
[nltk_data]   Unzipping corpora/omw.zip.
Traceback (most recent call last):
  File "dictionary.py", line 13, in <module>
    w2w = Word2word(codes[0], codes[1], dict_path=dict_path)
  File "/word2word/word2word/word2word.py", line 12, in __init__
    self.word2x, self.y2word, self.x2ys = download_or_load(lang1, lang2, dict_path)
  File "/word2word/word2word/utils.py", line 52, in download_or_load
    word2x, y2word, x2ys = pickle.load(open(fpath, 'rb'))
  File "/usr/local/lib/python2.7/site-packages/zodbpickle/pickle_2.py", line 1525, in load
    return Unpickler(file).load()
  File "/usr/local/lib/python2.7/site-packages/zodbpickle/pickle_2.py", line 881, in load
    dispatch[key](self)
  File "/usr/local/lib/python2.7/site-packages/zodbpickle/pickle_2.py", line 1142, in load_global
    klass = self.find_class(module, name)
  File "/usr/local/lib/python2.7/site-packages/zodbpickle/pickle_2.py", line 1176, in find_class
    __import__(module)
ImportError: No module named multi_cut
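
Both tracebacks end in find_class(), which imports whatever module name the pickle recorded when it was written. If that module was later renamed or removed, one possible workaround is to remap the name while loading. The sketch below uses the standard library on Python 3 (the report above is on Python 2.7); the module name old_pkg.old_mod and the mapping RENAMED are hypothetical and are not the actual pythainlp layout.

```python
import io
import pickle

# Hypothetical mapping from a module path stored in old pickles to the
# module that provides the same name today.
RENAMED = {"old_pkg.old_mod": "collections"}


class RemappingUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Translate the recorded module name before the import happens.
        return super().find_class(RENAMED.get(module, module), name)


# A hand-written protocol 0 pickle that references old_pkg.old_mod.OrderedDict.
data = b"cold_pkg.old_mod\nOrderedDict\n."

try:
    pickle.loads(data)
    missing = None
except ImportError as exc:  # same failure mode as in the tracebacks above
    missing = exc

resolved = RemappingUnpickler(io.BytesIO(data)).load()
```

Remapping only helps when an equivalent class still exists somewhere; otherwise the pickle has to be regenerated with the library version that wrote it.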

Build warnings: conversion from 'Py_ssize_t' to 'int', possible loss of data

Here are some excerpts from the build log:

C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\BIN\x86_amd64\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG -Ic:\Python26_64\include -Ic:\Python26_64\PC /Tcsrc/zodbpickle/_pickle_27.c /Fobuild\temp.win-amd64-2.6\Release\src/zodbpickle/_pickle_27.obj
_pickle_27.c
src/zodbpickle/_pickle_27.c(846) : warning C4244: 'function' : conversion from 'Py_ssize_t' to 'long', possible loss of data
src/zodbpickle/_pickle_27.c(883) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'char', possible loss of data
src/zodbpickle/_pickle_27.c(1107) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(1164) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(1166) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(1370) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(1381) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(1685) : warning C4244: 'function' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(1710) : warning C4244: 'function' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(2231) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(2232) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(2331) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(2332) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(4097) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(4586) : warning C4244: 'return' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(4607) : warning C4244: 'return' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(4632) : warning C4244: 'function' : conversion from 'Py_ssize_t' to 'long', possible loss of data
src/zodbpickle/_pickle_27.c(4636) : warning C4244: 'return' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(4659) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(4662) : warning C4244: 'return' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(4739) : warning C4244: 'return' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(4745) : warning C4244: 'return' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(4913) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_27.c(5375) : warning C4244: 'return' : conversion from 'Py_ssize_t' to 'int', possible loss of data

and also on Python 3.2:

C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\BIN\x86_amd64\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG -Ic:\Python32_64\include -Ic:\Python32_64\PC /Tcsrc/zodbpickle/_pickle_32.c /Fobuild\temp.win-amd64-3.2\Release\src/zodbpickle/_pickle_32.obj
_pickle_32.c
src/zodbpickle/_pickle_32.c(5502) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data
src/zodbpickle/_pickle_32.c(5513) : warning C4244: '=' : conversion from 'Py_ssize_t' to 'int', possible loss of data

Fails to build on Python 3.2 on Windows: 'stackUnderflow' undefined

Here's an excerpt from the build log:

building 'zodbpickle._pickle' extension
creating build\temp.win32-3.2
creating build\temp.win32-3.2\Release
creating build\temp.win32-3.2\Release\src
creating build\temp.win32-3.2\Release\src\zodbpickle
C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\BIN\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG -Ic:\Python32_32\include -Ic:\Python32_32\PC /Tcsrc/zodbpickle/_pickle_32.c /Fobuild\temp.win32-3.2\Release\src/zodbpickle/_pickle_32.obj
_pickle_32.c
src/zodbpickle/_pickle_32.c(5618) : warning C4013: 'stackUnderflow' undefined; assuming extern returning int
creating build\lib.win32-3.2
creating build\lib.win32-3.2\zodbpickle
C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\BIN\link.exe /DLL /nologo /INCREMENTAL:NO /LIBPATH:c:\Python32_32\libs /LIBPATH:c:\Python32_32\PCbuild /EXPORT:PyInit__pickle build\temp.win32-3.2\Release\src/zodbpickle/_pickle_32.obj /OUT:build\lib.win32-3.2\zodbpickle\_pickle.pyd /IMPLIB:build\temp.win32-3.2\Release\src/zodbpickle\_pickle.lib /MANIFESTFILE:build\temp.win32-3.2\Release\src/zodbpickle\_pickle.pyd.manifest
   Creating library build\temp.win32-3.2\Release\src/zodbpickle\_pickle.lib and object build\temp.win32-3.2\Release\src/zodbpickle\_pickle.exp
_pickle_32.obj : error LNK2019: unresolved external symbol _stackUnderflow referenced in function _do_noload_setitems
build\lib.win32-3.2\zodbpickle\_pickle.pyd : fatal error LNK1120: 1 unresolved externals

Fails to build on Python 3.10.0a2

Trying to build zodbpickle on Python 3.10.0a2 fails with

  gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/home/mg/src/zodbbrowser/.tox/py310/include -I/home/mg/opt/python310/include/python3.10 -c src/zodbpickle/_pickle_33.c -o build/temp.linux-x86_64-3.10/src/zodbpickle/_pickle_33.o
  src/zodbpickle/_pickle_33.c: In function ‘Pdata_New’:
  src/zodbpickle/_pickle_33.c:179:19: error: lvalue required as left operand of assignment
    179 |     Py_SIZE(self) = 0;
        |                   ^
  src/zodbpickle/_pickle_33.c: In function ‘Pdata_clear’:
  src/zodbpickle/_pickle_33.c:205:19: error: lvalue required as left operand of assignment
    205 |     Py_SIZE(self) = clearto;
        |                   ^
  src/zodbpickle/_pickle_33.c: In function ‘Pdata_pop’:
  src/zodbpickle/_pickle_33.c:247:23: error: lvalue required as decrement operand
    247 |     return self->data[--Py_SIZE(self)];
        |                       ^~
  src/zodbpickle/_pickle_33.c: In function ‘Pdata_push’:
  src/zodbpickle/_pickle_33.c:257:29: error: lvalue required as increment operand
    257 |     self->data[Py_SIZE(self)++] = obj;
        |                             ^~
  src/zodbpickle/_pickle_33.c: In function ‘Pdata_poptuple’:
  src/zodbpickle/_pickle_33.c:283:19: error: lvalue required as left operand of assignment
    283 |     Py_SIZE(self) = start;
        |                   ^
  src/zodbpickle/_pickle_33.c: In function ‘Pdata_poplist’:
  src/zodbpickle/_pickle_33.c:300:19: error: lvalue required as left operand of assignment
    300 |     Py_SIZE(self) = start;
        |                   ^
  src/zodbpickle/_pickle_33.c: In function ‘save_long’:
  src/zodbpickle/_pickle_33.c:1646:18: warning: implicit declaration of function ‘_PyUnicode_AsStringAndSize’; did you mean ‘PyUnicode_FromStringAndSize’? [-Wimplicit-function-declaration]
   1646 |         string = _PyUnicode_AsStringAndSize(repr, &size);
        |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~
        |                  PyUnicode_FromStringAndSize
  src/zodbpickle/_pickle_33.c:1646:16: warning: assignment to ‘char *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
   1646 |         string = _PyUnicode_AsStringAndSize(repr, &size);
        |                ^
  src/zodbpickle/_pickle_33.c: In function ‘save_pers’:
  src/zodbpickle/_pickle_33.c:2942:29: warning: assignment to ‘char *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
   2942 |             pid_ascii_bytes = _PyUnicode_AsStringAndSize(pid_str, &size);
        |                             ^
  src/zodbpickle/_pickle_33.c: In function ‘Pickler_init’:
  src/zodbpickle/_pickle_33.c:3582:9: warning: implicit declaration of function ‘_PyObject_HasAttrId’; did you mean ‘_PyObject_SetAttrId’? [-Wimplicit-function-declaration]
   3582 |     if (_PyObject_HasAttrId((PyObject *)self, &PyId_persistent_id)) {
        |         ^~~~~~~~~~~~~~~~~~~
        |         _PyObject_SetAttrId
  src/zodbpickle/_pickle_33.c: In function ‘load_pop’:
  src/zodbpickle/_pickle_33.c:4840:30: error: lvalue required as left operand of assignment
   4840 |         Py_SIZE(self->stack) = len;
        |                              ^
  src/zodbpickle/_pickle_33.c: In function ‘do_append’:
  src/zodbpickle/_pickle_33.c:5145:38: error: lvalue required as left operand of assignment
   5145 |                 Py_SIZE(self->stack) = x;
        |                                      ^
  src/zodbpickle/_pickle_33.c:5151:30: error: lvalue required as left operand of assignment
   5151 |         Py_SIZE(self->stack) = x;
        |                              ^
  src/zodbpickle/_pickle_33.c: In function ‘Pdata_pop’:
  src/zodbpickle/_pickle_33.c:248:1: warning: control reaches end of non-void function [-Wreturn-type]
    248 | }
        | ^
  error: command '/usr/lib/ccache/gcc' failed with exit code 1

When installing ZODB with pip, the zodbpickle failed building wheel

BUG/PROBLEM REPORT / FEATURE REQUEST

What I did:

Tried to install ZODB framework with pip 24.0

What I expect to happen:

Install the ZODB framework.

What actually happened:

When installing the ZODB framework, this error occurs in your library:

ERROR: Failed building wheel for zodbpickle
Running setup.py clean for zodbpickle
Failed to build zodbpickle
ERROR: Could not build wheels for zodbpickle, which is required to install pyproject.toml-based projects

It only worked when I ran the command pip install zodbpickle==3.2

What version of Python and Zope/Addons I am using:

Windows 10 Enterprise 22H2

Python 3.12.1

`noload` broken for ZODB multi-database references

Over on the zodb-dev mailing list, Jim Fulton and I got into a discussion about zc.zodbdgc being broken under Python 2.7 with both cPickle and zodbpickle.

To summarize the details of that discussion, it turns out that when cPickle in Python 2.7 fixed issue 1101399 (dict subclasses and noload) in October 2009 it wound up breaking the unpickling of the list objects that are used for persistent multi-database references: they are always empty. This in turn breaks zc.zodbdgc (with an IndexError) or anything else that wants to look at references using noload when multi-databases are involved.

This same code is in zodbpickle, so references are also broken when unpickled using this code.

It wasn't real clear what the right solution was. Jim said he wanted to think about it, at least as far as zc.zodbdgc is concerned. I hoped it would be useful to bring this issue up in this forum too if only to make sure people were aware or could find it easier if they encounter the problem.

Should noload_ext1() et al instantiate an object?

Tackling #21 I noticed a discrepancy. Unpickler.noload_global() reads the arguments from the pickle, discards them, then appends None to the stack. Unpickler.noload_ext1() (and variants) read the code from the pickle, load the object onto the stack (by calling Unpickler.get_extension()), then replace it on the stack with None.

Shouldn't noload_ext1() do the same as noload_global()? I.e. make no attempt to load (instantiate) an object, then append None to the stack?

Status of 'noload' in Python 3 stdlib ? (i.e. let's try to kill this module for Python 3)

zodbpickle has recently entered Debian. And since it forks code from Python, it led to the following discussion:
https://lists.debian.org/debian-security-tracker/2018/04/msg00021.html

In particular, I discovered that https://bugs.python.org/issue6784 is fixed and if we manage to get back the noload operation in Python 3 (it existed in Python 2), we could stop forking the stdlib modules.

  1. About issue 6784, there's apparently only a small difference between upstream and zodbpickle. Upstream does not have errors='bytes' (used in ZODB._compat), but I guess we can achieve the same result with:

try:
    return loads(s)
except UnicodeDecodeError:
    return loads(s, encoding='bytes')

  2. What's the status of 'noload' in the Python 3 stdlib? I could not find anything on bugs.python.org.
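
The fallback proposed in item 1 can be tried with the standard library alone. The sketch below wraps it in a hypothetical helper, loads_py2(), and feeds it two hand-written Python 2 style pickles. One difference from errors='bytes' as shown in the README seems worth noting: encoding='bytes' applies to the whole pickle, so after a fallback every Python 2 str in it comes back as bytes, not only the ones that failed to decode.

```python
import pickle


def loads_py2(data):
    """Load a Python 2 pickle, retrying with bytes when ASCII decoding fails."""
    try:
        return pickle.loads(data)
    except UnicodeDecodeError:
        # Coarser than errors='bytes': all Python 2 str values become bytes.
        return pickle.loads(data, encoding="bytes")


# SHORT_BINSTRING pickles as Python 2 would write them (without memo opcodes).
ascii_text = loads_py2(b"U\x05hello.")
binary_blob = loads_py2(b"U\x01\xff.")
```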

PyPI wheels for Python 3.8

It would be nice to have Python 3.8 wheels for Windows at least (Linux users tend to have C compilers) published on PyPI.

This can be done with a little extra snippet in appveyor.yml without having to wait for official Appveyor Python 3.8 support: see zopefoundation/persistent#118

Breaks ZODB after 0.5.0

I noticed today that database packing doesn't seem to work (at least on Python 3.3).

Minimal example:

from ZODB import FileStorage, DB

storage = FileStorage.FileStorage('/tmp/mystorage.fs')
db = DB(storage)
db.pack()
db.close()

Expected: No output / db.pack() should succeed, although there is
likely nothing to pack.

Downgrading zodbpickle to 0.5.0 solves this issue (thx Tres Seaver)

Use PyFloat_Pack8() on Python 3.11 alpha7

The private _PyFloat_Pack8() and _PyFloat_Unpack8() functions were removed in Python 3.11, but Python 3.11 alpha7 adds new clean public PyFloat_Pack8() and PyFloat_Unpack8() functions: see https://bugs.python.org/issue46906

zodbpickle should use them to be compatible with Python 3.11.
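
For context on why a pickler needs these functions: the binary float opcode (BINFLOAT) stores a float as eight bytes in big-endian IEEE 754 format, which is the layout PyFloat_Pack8() produces when asked for big-endian output. The Python sketch below reproduces that layout with struct as a stand-in for the C call; that zodbpickle's C code uses the function for exactly this opcode, as CPython's own _pickle does, is an assumption.

```python
import pickle
import struct

value = 1.5

# PROTO 2 header, BINFLOAT opcode 'G', 8-byte big-endian double, STOP.
expected = b"\x80\x02" + b"G" + struct.pack(">d", value) + b"."

actual = pickle.dumps(value, protocol=2)
matches = actual == expected
```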

make a 0.5.1 tag

Hi Stefan,

thanks a lot that you merged my work!

Now there is a little bit missing:
For some reason, the 0.5.1 tag was not merged from the clone.
All files are there, but the tag is still 0.5.0.

I would like to push it on PyPI, but for that I need a new tag.

Can you please add a new tag, like 0.5.1 ?

Thanks & cheers -- chris

RFE: is it possible to start making github releases?🤔

When a GitHub release entry is created, an email notification is sent to everyone who has set Watch->Releases for the repository in the web UI.
A gh release can contain additional comments (like a changelog) or additional assets such as release tarballs (by default it contains only the assets from the git tag), but none of these parts are obligatory.
In the simplest variant a gh release can be empty, because the subject of the email that is sent contains the git tag name.

I'm asking because my automation process uses those email notifications to attempt preliminary automated upgrades of package builds, which saves some time on maintaining packaging procedures.
Other people may well be interested in being informed immediately about new releases too.

Documentation and examples of generating gh releases:
https://docs.github.com/en/repositories/releasing-projects-on-github/managing-releases-in-a-repository
https://cli.github.com/manual/gh_release_upload/
jbms/sphinx-immaterial#282
https://github.com/marketplace/actions/github-release
https://pgjones.dev/blog/trusted-plublishing-2023/
jbms/sphinx-immaterial#281 (comment)
tox target to publish on pypi and make gh release https://github.com/jaraco/skeleton/blob/928e9a86d61d3a660948bcba7689f90216cc8243/tox.ini#L42-L58

2.3: pytest is failing

I'm trying to package your module as an rpm package, so I'm using the typical PEP 517 based build, install and test cycle that is used when building packages from a non-root account.

  • python3 -sBm build -w --no-isolation
  • because I'm calling build with --no-isolation, only locally installed modules are used throughout the whole process
  • install the .whl file in </install/prefix>
  • run pytest with PYTHONPATH pointing to sitearch and sitelib inside </install/prefix>

import zodbpickle.pickle UnicodeDecodeError

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-33-82dc2cbee6e8> in <module>()
----> 1 import zodbpickle.pickle

C:\Users\user\Miniconda3\lib\site-packages\zodbpickle\pickle.py in <module>()
      2 
      3 if sys.version_info[0] >= 3:
----> 4     from .pickle_3 import *
      5 else:
      6     from .pickle_2 import *

C:\Users\user\Miniconda3\lib\site-packages\zodbpickle\pickle_3.py in <module>()
   1481 # Use the faster _pickle if possible
   1482 try:
-> 1483     from zodbpickle._pickle import *
   1484 except ImportError:
   1485     Pickler, Unpickler = _Pickler, _Unpickler

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x83 in position 4: invalid start byte

Set up Appveyor to build and upload Windows wheels to PyPI

Checklist from zopefoundation/zope.proxy#24:

  • a copy of appveyor.yml from zope.interface or similar
  • someone with an Appveyor account (e.g. me) to enable this project
  • encrypted PyPI upload credentials in the appveyor.yml (no change required from the copy in zope.interface if I'm the one who enables the Appveyor account)
  • grant zope.wheelbuilder PyPI rights
  • push the appveyor.yml commit to master
  • (optionally) a push of a temporary branch that builds and uploads zodbpickle 0.6.0 wheels (nah, let's get 0.7.0 out the door instead)

Possible discrepancy in Unpickler.noload_obj()

Unpickler.noload_obj() does not append a None to the stack. This appears to be inconsistent with noload_inst() et al, which do - in order to substitute for an instantiated object.

Like #40, I'm not sure if this is a very obscure issue or a misunderstanding on my part. I'll investigate further.

RFC: CPython 3.8.0b1 stdlib pickle.py depends on C _pickle module

I'd like to draw your attention to https://bugs.python.org/issue37210, in case you wish to express an interest. In CPython 3.8.0b1 the pure Python version of the pickle module is no longer independent of the C _pickle module. The former imports PickleBuffer from the latter, which is essential to implementing pickle protocol 5.

Although bpo-37210 was opened at my behest I'm proposing to close it as WONTFIX.

Should you have no preference, then sorry for the noise and feel free to close this ticket without comment.

zodbpickle completeness

Hi Tres et al,

I appreciate the creation of zodbpickle very much, because that is a really working
solution for pickling compatibility.

What I do not understand:

Why does zodbpickle not try to be complete:

Python 3's pickle defines a DEFAULT_PROTOCOL module global,
which zodbpickle.pickle does not have.
Is there any good reason for not defining these standard things?
Maybe intentionally, to be a common denominator, or is it just forgotten?

Many thanks, anyway, this saved me a couple of days ;-)

cheers - chris
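
Until the attribute is available everywhere, calling code can guard against its absence. The sketch below is a caller-side workaround and not part of zodbpickle; falling back to HIGHEST_PROTOCOL is an arbitrary choice made for illustration, and the standard library is used as a stand-in when zodbpickle is not installed.

```python
try:
    from zodbpickle import pickle
except ImportError:
    import pickle  # stand-in so the sketch runs without zodbpickle

# Use DEFAULT_PROTOCOL when the module defines it, else pick a fallback.
default_protocol = getattr(pickle, "DEFAULT_PROTOCOL", pickle.HIGHEST_PROTOCOL)

payload = pickle.dumps(["spam", 42], protocol=default_protocol)
restored = pickle.loads(payload)
```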

Add coverage tests

Currently there is the ability via tox.ini to test for coverage but it uses nosetests and is currently broken. Additionally coveralls.io should be activated for this package.

Support pypy

running build_ext
building 'zodbpickle._pickle' extension
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/src
creating build/temp.linux-x86_64-2.7/src/zodbpickle
cc -O2 -fPIC -Wimplicit -I/nix/store/p4rfgk7an0x6vb04v48vzb4r0z3b84cw-pypy-2.3.1/pypy-c/include -c src/zodbpickle/_pickle_27.c -o build/temp.linux-x86_64-2.7/src/zodbpickle/_pickle_27.o
src/zodbpickle/_pickle_27.c:2:23: fatal error: cStringIO.h: No such file or directory
 #include "cStringIO.h"

RFE: please prepare test suite to be able use it with pytest as well

As we discussed in #70, I pointed out that the test suite is currently not ready to be used with pytest.
Part of the issue is that pytest only scans test* files, so some files need to be renamed before it is even possible to assess how much of the test suite has to be adapted for pytest.

At the end of our conversation you mentioned that this work takes time.
If you want, I can try to prepare a PR adding the necessary adaptations.

Create more wheels

Currently there are only the sdist and Windows wheels on PyPI. Having more wheels there would be nice.

Tests fail under Py3k

Tests fail on the trunk which passed with 0.5.0:

$ tox -e py33
======================================================================
ERROR: test_load_str_protocol_1 (zodbpickle.tests.test_pickle_3.PyPicklerBytestrTests)
Test str from protocol=1
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/tests/pickletester_3.py", line 1280, in test_load_str_protocol_1
    b'bytestring \x00\xa0')
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/tests/pickletester_3.py", line 1265, in unpickleEqual
    loaded = self.loads(data, encoding="bytes")
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/tests/test_pickle_3.py", line 44, in loads
    return u.load()
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/pickle_3.py", line 844, in load
    dispatch[key[0]](self)
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/pickle_3.py", line 1033, in load_short_binstring
    value = str(data, self.encoding, self.errors)
LookupError: unknown encoding: bytes

======================================================================
ERROR: test_load_str_protocol_2 (zodbpickle.tests.test_pickle_3.PyPicklerBytestrTests)
Test str from protocol=2
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/tests/pickletester_3.py", line 1287, in test_load_str_protocol_2
    b'bytestring \x00\xa0')
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/tests/pickletester_3.py", line 1265, in unpickleEqual
    loaded = self.loads(data, encoding="bytes")
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/tests/test_pickle_3.py", line 44, in loads
    return u.load()
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/pickle_3.py", line 844, in load
    dispatch[key[0]](self)
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/pickle_3.py", line 1033, in load_short_binstring
    value = str(data, self.encoding, self.errors)
LookupError: unknown encoding: bytes

======================================================================
ERROR: test_pop_empty_stack (zodbpickle.tests.test_pickle_3.CPicklerTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/tests/pickletester_3.py", line 741, in test_pop_empty_stack
    self.assertRaises((pickle.UnpicklingError, IndexError), self.loads, s)
File "/opt/Python-3.3.1/lib/python3.3/unittest/case.py", line 571, in assertRaises
    return context.handle('assertRaises', callableObj, args, kwargs)
File "/opt/Python-3.3.1/lib/python3.3/unittest/case.py", line 135, in handle
    callable_obj(*args, **kwargs)
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/tests/test_pickle_3.py", line 44, in loads
    return u.load()
_pickle.UnpicklingError: unpickling stack underflow

======================================================================
ERROR: test_reduce_bad_iterator (zodbpickle.tests.test_pickle_3.CPicklerTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/tests/pickletester_3.py", line 1127, in test_reduce_bad_iterator
    self.dumps(C(), proto)
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/tests/test_pickle_3.py", line 37, in dumps
    p.dump(arg)
_pickle.PicklingError: fourth element of the tuple returned by __reduce__ must be an iterator, not list

======================================================================
ERROR: test_reduce_bad_iterator (zodbpickle.tests.test_pickle_3.CDumpPickle_LoadPickle)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/tests/pickletester_3.py", line 1127, in test_reduce_bad_iterator
    self.dumps(C(), proto)
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/tests/test_pickle_3.py", line 37, in dumps
    p.dump(arg)
_pickle.PicklingError: fourth element of the tuple returned by __reduce__ must be an iterator, not list

======================================================================
ERROR: test_pop_empty_stack (zodbpickle.tests.test_pickle_3.DumpPickle_CLoadPickle)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/tests/pickletester_3.py", line 741, in test_pop_empty_stack
    self.assertRaises((pickle.UnpicklingError, IndexError), self.loads, s)
File "/opt/Python-3.3.1/lib/python3.3/unittest/case.py", line 571, in assertRaises
    return context.handle('assertRaises', callableObj, args, kwargs)
File "/opt/Python-3.3.1/lib/python3.3/unittest/case.py", line 135, in handle
    callable_obj(*args, **kwargs)
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/tests/test_pickle_3.py", line 44, in loads
    return u.load()
_pickle.UnpicklingError: unpickling stack underflow

======================================================================
ERROR: test_insecure_strings (zodbpickle.tests.test_pickle_3.InMemoryPickleTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/tests/pickletester_3.py", line 645, in test_insecure_strings
    self.assertRaises(ValueError, self.loads, buf)
File "/opt/Python-3.3.1/lib/python3.3/unittest/case.py", line 571, in assertRaises
    return context.handle('assertRaises', callableObj, args, kwargs)
File "/opt/Python-3.3.1/lib/python3.3/unittest/case.py", line 135, in handle
    callable_obj(*args, **kwargs)
File "/home/tseaver/projects/Zope/ZODB/zodbpickle/src/zodbpickle/tests/test_pickle_3.py", line 64, in loads
    return pickle.loads(buf, **kwds)
_pickle.UnpicklingError: pickle data was truncated

----------------------------------------------------------------------
Ran 393 tests in 1.835s

FAILED (errors=7, skipped=20)

Install only files suitable for the version of Python the package is being installed for

The package contains source files that are meant only for Python 3 and are not valid on Python 2, for example pickle_3.py. It would be nice if files meant for a specific Python version were installed only on that version.

Below is the error one gets when building RPM containing zodbpickle.

Compiling /.../env/lib/python2.7/site-packages/zodbpickle/pickle_3.py ...
  File "/.../env/lib/python2.7/site-packages/zodbpickle/pickle_3.py", line 178
    def __init__(self, file, protocol=None, *, fix_imports=True):
                                             ^
SyntaxError: invalid syntax
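
One way to picture the request is a filter that drops the files written for the other major version before they are installed. The sketch below is only an illustration: the function name `modules_for_running_python` is hypothetical, and it assumes the `_2.py` / `_3.py` suffix convention seen in the file names in these reports. How such a filter would be wired into the package's build is not shown here.

```python
import sys


def modules_for_running_python(filenames, major=None):
    """Return the file names that are valid for the given Python major version.

    Assumption: version-specific modules end in "_2.py" or "_3.py";
    every other file is treated as shared between both versions.
    """
    if major is None:
        major = sys.version_info[0]
    # Suffix used by files that belong to the *other* major version.
    other_suffix = "_2.py" if major >= 3 else "_3.py"
    return [name for name in filenames if not name.endswith(other_suffix)]


candidates = [
    "__init__.py",
    "pickle.py",
    "pickle_2.py",
    "pickle_3.py",
    "pickletools_2.py",
    "pickletools_3.py",
]

print(modules_for_running_python(candidates, major=2))
print(modules_for_running_python(candidates, major=3))
```

With a filter like this, pickle_3.py would never reach a Python 2.7 site-packages directory, so the byte-compilation step would not see the keyword-only argument syntax.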

Add support for pickle protocol 4: It can be *much* faster

Pickle protocol 4, introduced with Python 3.4, brings some nice enhancements. In particular, it turns out that the framing feature introduced in protocol 4 "with a potentially huge performance impact" really does have a big impact.

I'm using zodbpickle to gain access to pickle protocol 3 under Python 2 in RelStorage. Under Python 3, I use the builtin pickle and (under 3.4 and up) protocol 4.

In one test in RelStorage, protocol 4 takes about 5 seconds to handle an object, whereas protocol 3 takes about 30 seconds. (To work around this in Python 2, I have to buffer a large object into memory; this is even when using buffered IO from an SSD and with the data putatively in the buffer cache. Of course, this varies by system, but the slower the filesystem/stream, the worse the problem.)

It would be nice to have protocol 4 available to Python 2.

I don't know how much work would be involved in the backport (and making sure that no_load stays supported). I personally don't have any concrete plans to do this work right now (maybe someday), but I thought I'd throw this feature request out there along with the motivating numbers in case it intrigues anyone else.
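
For readers who have not looked at framing, the following sketch shows where it appears in the byte stream. It uses the standard library `pickle` on Python 3.4 or later, not zodbpickle, and the payload is made up for illustration. It only demonstrates that protocol 4 output carries FRAME opcodes while protocol 3 output does not; it does not reproduce the timings quoted above.

```python
import pickle
import pickletools


def opcode_names(data):
    """List the opcode names that make up a pickle byte string."""
    return [opcode.name for opcode, arg, pos in pickletools.genops(data)]


# Hypothetical payload; any reasonably large object would do.
payload = {"oid": b"\x00" * 8, "values": list(range(1000))}

proto3 = pickle.dumps(payload, protocol=3)
proto4 = pickle.dumps(payload, protocol=4)

has_frame3 = "FRAME" in opcode_names(proto3)
has_frame4 = "FRAME" in opcode_names(proto4)

print("protocol 3 uses FRAME:", has_frame3)
print("protocol 4 uses FRAME:", has_frame4)
```

Each FRAME opcode announces the length of the chunk that follows, which lets an unpickler fetch that chunk from the stream in one read instead of many small ones.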

2.3: Python 2.x syntax in `zodbpickle` code

It looks like the latest version still contains Python 2.x syntax.
This prevents generating Python 3.x bytecode for the package.

+ /usr/bin/python3 -sBm compileall2 -f -j48 -o 0 -o 1 -o 2 -s /home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64 -p / /home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages /home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib/python3.8/site-packages
Listing '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages'...
Listing '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle'...
Compiling '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle/__init__.py'...
Compiling '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle/fastpickle.py'...
Listing '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle/tests'...
Compiling '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle/pickle_3.py'...
Compiling '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle/pickle.py'...
Compiling '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle/pickle_2.py'...
Compiling '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle/pickletools_2.py'...
Compiling '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle/pickletools_3.py'...
Compiling '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle/slowpickle.py'...
Compiling '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle/tests/__init__.py'...
Compiling '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle/tests/pickletester_2.py'...
Compiling '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle/tests/pickletester_3.py'...
Compiling '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle/tests/test_pickle.py'...
Listing '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle-2.3.dist-info'...
Compiling '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle/tests/test_pickle_2.py'...
***   File "/usr/lib64/python3.8/site-packages/zodbpickle/tests/pickletester_2.py", line 433
    x = [0, 1L, 2.0, 3.0+0j]
             ^
SyntaxError: invalid syntax

Compiling '/home/tkloczko/rpmbuild/BUILDROOT/python-zodbpickle-2.3-2.fc35.x86_64/usr/lib64/python3.8/site-packages/zodbpickle/tests/test_pickle_3.py'...
***   File "/usr/lib64/python3.8/site-packages/zodbpickle/pickle_2.py", line 882
    except _Stop, stopinst:
                ^
SyntaxError: invalid syntax

***   File "/usr/lib64/python3.8/site-packages/zodbpickle/pickletools_2.py", line 1803
    print "skipping %r: it doesn't look like an opcode name" % name
          ^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print("skipping %r: it doesn't look like an opcode name" % name)?
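
As a possible packaging-side workaround, the standard library's `compileall.compile_dir` accepts an `rx` argument: files whose full path matches that regular expression are skipped. The sketch below assumes that the Python 2-only modules can be recognised by a `_2.py` suffix, as in the log above, and builds a throwaway directory to show the effect. The helper name `compile_tree` and the sample files are hypothetical.

```python
import compileall
import os
import re
import tempfile

# Assumption: Python 2-only modules end in "_2.py".
PY2_ONLY = re.compile(r"_2\.py$")


def compile_tree(path, skip=None):
    """Byte-compile a directory tree; return True if every file compiled.

    Files whose full path matches the `skip` pattern are left alone.
    """
    return bool(compileall.compile_dir(path, rx=skip, quiet=2))


with tempfile.TemporaryDirectory() as tmp:
    # A module that is valid on Python 3.
    with open(os.path.join(tmp, "pickle_3.py"), "w") as f:
        f.write("x = 1\n")
    # A module that uses the Python 2 print statement.
    with open(os.path.join(tmp, "pickle_2.py"), "w") as f:
        f.write("print 'python 2 only'\n")

    ok_without_filter = compile_tree(tmp)
    ok_with_filter = compile_tree(tmp, skip=PY2_ONLY)

print("without filter:", ok_without_filter)
print("with filter:", ok_with_filter)
```

This hides the errors at build time only; the Python 2 files would still be installed next to the Python 3 ones.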

DOC: Pickle is Unsafe

From http://docs.python.org/2/library/pickle.html#pickle-python-object-serialization

Warning: The pickle module is not intended to be secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.

On deserialization (.loads, .load), Python pickles may execute arbitrary code.
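
To make the risk concrete: a pickle can name any importable callable and ask the unpickler to call it. The sketch below uses the standard library `pickle` on Python 3 (not zodbpickle) and follows the "restricting globals" approach of overriding `Unpickler.find_class`. The class name `NoGlobalsUnpickler` and the sample byte string are illustrative. The hostile pickle is never executed, because the lookup of `os.system` is refused before the call opcode is reached. This reduces the attack surface but is not a guarantee of safety.

```python
import io
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global (class or function)."""

    def find_class(self, module, name):
        # Called for every global the pickle refers to; reject them all.
        raise pickle.UnpicklingError(
            "global %s.%s is forbidden" % (module, name)
        )


def restricted_loads(data):
    """Load a pickle that may contain only plain built-in data."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


# A protocol 0 pickle that would call os.system('echo pwned') if loaded
# with an unrestricted unpickler.
hostile = b"cos\nsystem\n(S'echo pwned'\ntR."

try:
    restricted_loads(hostile)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

plain = restricted_loads(pickle.dumps([1, 2, {"a": b"x"}]))
print(blocked, plain)
```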

Because of the warning in the Python documentation, this functionality of Pickle is not an:

References:

What's the reason behind `zodbpickle.binary`?

The code says it is there for converting binary data to Python 3. I tried it and failed:

Python 2.7:

>>> import pickle, pickletools, zodbpickle
>>> pickle.dumps(zodbpickle.binary('a'))
"ccopy_reg\n_reconstructor\np0\n(czodbpickle\nbinary\np1\nc__builtin__\nstr\np2\nS'a'\np3\ntp4\nRp5\n."
>>> pickletools.dis(pickle.dumps(zodbpickle.binary('a')))
    0: c    GLOBAL     'copy_reg _reconstructor'
   25: p    PUT        0
   28: (    MARK
   29: c        GLOBAL     'zodbpickle binary'
   48: p        PUT        1
   51: c        GLOBAL     '__builtin__ str'
   68: p        PUT        2
   71: S        STRING     'a'
   76: p        PUT        3
   79: t        TUPLE      (MARK at 28)
   80: p    PUT        4
   83: R    REDUCE
   84: p    PUT        5
   87: .    STOP
highest protocol among opcodes = 0

Trying to read this pickle on Python 3 failed:

>>> import pickle, pickletools, zodbpickle
>>> pickle.loads(b"ccopy_reg\n_reconstructor\np0\n(czodbpickle\nbinary\np1\nc__builtin__\nstr\np2\nS'a'\np3\ntp4\nRp5\n.")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/copyreg.py", line 45, in _reconstructor
    obj = base.__new__(cls, state)
TypeError: str.__new__(bytes): bytes is not a subtype of str

The reason seems to be that the base class str from Python 2 is still str (see the dis in Python 3):

>>> pickletools.dis(b"ccopy_reg\n_reconstructor\np0\n(czodbpickle\nbinary\np1\nc__builtin__\nstr\np2\nS'a'\np3\ntp4\nRp5\n.")
    0: c    GLOBAL     'copy_reg _reconstructor'
   25: p    PUT        0
   28: (    MARK
   29: c        GLOBAL     'zodbpickle binary'
   48: p        PUT        1
   51: c        GLOBAL     '__builtin__ str'
   68: p        PUT        2
   71: S        STRING     'a'
   76: p        PUT        3
   79: t        TUPLE      (MARK at 28)
   80: p    PUT        4
   83: R    REDUCE
   84: p    PUT        5
   87: .    STOP
highest protocol among opcodes = 0

It gets even worse when trying to unpickle a non-ASCII Python 2 string:

>>> pickle.dumps(zodbpickle.binary('ä'))
"ccopy_reg\n_reconstructor\np0\n(czodbpickle\nbinary\np1\nc__builtin__\nstr\np2\nS'\\xc3\\xa4'\np3\ntp4\nRp5\n."

Trying to read it in Python 3 results in:

>>> pickle.loads(b"ccopy_reg\n_reconstructor\np0\n(czodbpickle\nbinary\np1\nc__builtin__\nstr\np2\nS'\\xc3\\xa4'\np3\ntp4\nRp5\n.")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

What am I doing wrong? Which piece of the puzzle am I missing?
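
The transcripts above use the standard library `pickle` at protocol 0, which has no opcode for byte strings. The sketch below runs on Python 3 with the standard library only and shows the difference between the old STRING opcode and the bytes opcode introduced with protocol 3. Whether `zodbpickle.binary` needs zodbpickle's own pickler at protocol 3 to produce that opcode on Python 2 is an assumption drawn from the README, not something this example verifies.

```python
import pickle
import pickletools

# What Python 2 writes for the str '\xc3\xa4' at protocol 0 (STRING opcode).
py2_pickle = b"S'\\xc3\\xa4'\np0\n."

# With the default encoding ('ASCII') the STRING payload cannot be decoded.
try:
    pickle.loads(py2_pickle)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

# The caller has to decide how to interpret the 8-bit string.
as_bytes = pickle.loads(py2_pickle, encoding="bytes")
as_text = pickle.loads(py2_pickle, encoding="utf-8")

# Protocol 3 has a dedicated opcode, so bytes stay bytes without any hint.
py3_pickle = pickle.dumps(b"\xc3\xa4", protocol=3)
py3_opcodes = [op.name for op, arg, pos in pickletools.genops(py3_pickle)]

print(default_failed, as_bytes, as_text, py3_opcodes)
```

The STRING opcode carries no information about whether the payload was text or binary data, which is why the reader has to supply an encoding, while SHORT_BINBYTES is unambiguous.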

SyntaxError !!!

BUG/PROBLEM REPORT (OR OTHER COMMON ISSUE)

What I did:

Starting the Plone Docker container throws the syntax errors shown below.

What I expect to happen:

I expect those files to be compatible with both Python 2 and Python 3.

What actually happened:

File "/plone/buildout-cache/eggs/zodbpickle-2.0.0-py3.8-linux-x86_64.egg/zodbpickle/pickletools_2.py", line 1803
    print "skipping %r: it doesn't look like an opcode name" % name
          ^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print("skipping %r: it doesn't look like an opcode name" % name)?

  File "/plone/buildout-cache/eggs/zodbpickle-2.0.0-py3.8-linux-x86_64.egg/zodbpickle/pickle_2.py", line 882
    except _Stop, stopinst:
                ^
SyntaxError: invalid syntax

  File "/plone/buildout-cache/eggs/zodbpickle-2.0.0-py3.8-linux-x86_64.egg/zodbpickle/tests/pickletester_2.py", line 433
    x = [0, 1L, 2.0, 3.0+0j]
             ^
SyntaxError: invalid syntax

What version of Python and Zope/Addons I am using:

Python 3

Debian

Py2: binary class is large and tracked by gc; implement in C?

Instances of zodbpickle.binary on CPython 2.7 are at least 32 bytes larger than the equivalent bytes/str object:

>>> import sys
>>> from zodbpickle import binary
>>> sys.getsizeof(binary(''))
69
>>> sys.getsizeof('')
37

They are also tracked by the garbage collector, where bytes (which are known to be immutable) are not:

>>> import gc
>>> b = binary('')
>>> gc.collect(); gc.collect()
0
0
>>> gc.is_tracked(b)
True
>>> gc.is_tracked('')
False

Adding __slots__ = () changes none of this. (16 bytes of the overhead would be for the two GC pointers, another 8 for the __dict__ pointer, if present. I can't explain the final 8. Perhaps alignment? Perhaps the char* is no longer stored at the end of the object when subclassed so there's an extra pointer involved? I haven't looked into it.)
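
The same two effects can be observed with a plain subclass of `bytes` on CPython 3. The sketch below is a stand-in, not zodbpickle's actual `binary` class: the name `Binary` is hypothetical, the exact byte counts differ between CPython versions, and it makes no claim about what `__slots__` changes.

```python
import gc
import sys


class Binary(bytes):
    """Hypothetical stand-in for zodbpickle.binary: a plain bytes subclass."""


plain = b"oid"
wrapped = Binary(b"oid")

# getsizeof includes the GC header for objects that have one.
extra_bytes = sys.getsizeof(wrapped) - sys.getsizeof(plain)

plain_tracked = gc.is_tracked(plain)
wrapped_tracked = gc.is_tracked(wrapped)

print("extra bytes per instance:", extra_bytes)
print("bytes tracked by gc:", plain_tracked)
print("Binary tracked by gc:", wrapped_tracked)
```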

This adds up surprisingly quickly because ZODB uses zodbpickle.binary to store OIDs. They get turned into str in some cases, but in ghosts you can see the binary objects:

>>> import persistent
>>> import ZODB
>>> db = ZODB.DB(None)
>>> with db.transaction() as c:
...     c.root.key = persistent.Persistent()
...
>>> with db.transaction() as c:
...     type(c.root.key._p_oid)
...
<type 'str'>
>>> db.cacheMinimize()
>>> with db.transaction() as c:
...     type(c.root.key._p_oid)
...
<class 'zodbpickle.binary'>

In one application, binary was the largest type of object tracked by the GC by an order of magnitude (according to objgraph):

binary                                1141836
LOBucket                              316823
tuple                                 282777
LLBucket                              236532
dict                                  233084
list                                  159828
function                              124778

That's about a 35MB difference in memory used compared to str, but even worse, because all those objects are tracked by the GC, GC times increase by 7x (the relative impact diminishes as other objects are added but the constant cost remains):

$ python -m pyperf timeit \
    -s "strs = [str(i) for i in range(1141836)]; import gc" \
    "gc.collect()"
.....................
Mean +- std dev: 10.5 ms +- 0.9 ms
$ python -m pyperf timeit \
    -s "from zodbpickle import binary; strs = [binary(i) for i in range(1141836)]; import gc" \
    "gc.collect()"
.....................
Mean +- std dev: 69.8 ms +- 3.0 ms

I don't know of a way to solve these problems in Python, but I'm guessing/hoping it should be pretty simple to solve them by implementing binary using a C extension.
