zopefoundation / btrees
License: Other
When an object's reference count is decremented to 0, its destructor is called automatically.
An error occurs if the GC runs while the destructor is still executing.
To eliminate the error, we added the statement on line 4 of the function below, which removes the object from GC tracking at the start of the destructor.
static void
Wrapper_dealloc(Wrapper *self)
{
PyObject_GC_UnTrack((PyObject *)self); // Added: stop the GC from tracking self
Wrapper_clear(self);
self->ob_type->tp_free((PyObject *)self);
}
Patch: ZODB3-3.10.7.zip
PyPI currently has wheels only for Windows, not for Linux: https://pypi.org/project/BTrees/4.5.0/#files
See https://travis-ci.org/zopefoundation/BTrees/builds/428605264
The OID is now presented in hex.
Currently the size of the tree (__len__) is calculated on demand and involves loading and querying all buckets. I think it would be reasonable to track the number of items explicitly (by counting through insertions and deletions), so that querying it doesn't need to load anything except the root BTree object. This only becomes problematic if client code pries out buckets from a tree and manipulates these itself.
Naturally this changes the pickle layout, so it would require extra compat code and would not be backwards compatible.
Perhaps something for a major release?
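The counting idea above can be sketched in a few lines. This is only an illustration on a plain dict, not the BTrees implementation; the class name CountedMapping is hypothetical, and a real change would also have to store the counter in the pickled state.

```python
class CountedMapping:
    """Hypothetical mapping that tracks its size on every insert and delete."""

    def __init__(self):
        self._data = {}
        self._count = 0

    def __setitem__(self, key, value):
        # Only a new key changes the size; overwriting does not.
        if key not in self._data:
            self._count += 1
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]  # raises KeyError before the counter is touched
        self._count -= 1

    def __len__(self):
        # Constant time: no need to visit the stored entries.
        return self._count


m = CountedMapping()
m["a"] = 1
m["a"] = 2
m["b"] = 3
print(len(m))  # 2
```

As the report notes, a counter like this goes stale as soon as some code modifies the underlying containers directly.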
Getting a clang error when trying to install ZODB. Some issue with compiler arguments. Have latest Command Line Tools and Xcode 5.1
creating build/temp.macosx-10.9-intel-2.7/src/BTrees
cc -fno-strict-aliasing -fno-common -dynamic -arch x86_64 -arch i386 -g -Os -pipe -fno-common -fno-strict-aliasing -fwrapv -mno-fused-madd -DENABLE_DTRACE -DMACOSX -DNDEBUG -Wall -Wstrict-prototypes -Wshorten-64-to-32 -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch x86_64 -arch i386 -pipe -Isrc -I/System/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c src/BTrees/_OOBTree.c -o build/temp.macosx-10.9-intel-2.7/src/BTrees/_OOBTree.o
clang: error: unknown argument: '-mno-fused-madd' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
error: command 'cc' failed with exit status 1
Clang + LLVM Version:
gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 5.1 (clang-503.0.38) (based on LLVM 3.4svn)
Target: x86_64-apple-darwin13.1.0
Thread model: posix
We are getting a SystemError where a POSKeyError should be raised, probably due to the exception not being set in the C code. Here is an example traceback (data is an OOBTree):
>>> data.get(u'photo', None)
2018-05-25 18:19:20,336 WARNI [relstorage][MainThread] POSKeyError on oid 57955827: no tid found; history-free adapter
2018-05-25 18:19:20,337 ERROR [ZODB.Connection][MainThread] Couldn't load state for BTrees.OOBTree.OOBTree 0x037455f3
Traceback (most recent call last):
File "/srv/osfkarl/.buildout/eggs/cp27mu/ZODB-5.2.0-py2.7.egg/ZODB/Connection.py", line 796, in setstate
p, serial = self._storage.load(oid)
File "/srv/osfkarl/.buildout/eggs/cp27mu/perfmetrics-2.0-py2.7.egg/perfmetrics/__init__.py", line 127, in call_with_metric
return f(*args, **kw)
File "/srv/osfkarl/.buildout/eggs/cp27mu/RelStorage-2.1a2-py2.7-linux-i686.egg/relstorage/storage.py", line 587, in load
raise POSKeyError(oid)
POSKeyError: 0x037455f3
Traceback (most recent call last):
File "<console>", line 1, in <module>
SystemError: error return without exception set
This is very similar to #82, but since this is a POSKeyError it might come from a different place in the C code. Or it could be this is just a duplicate. The difference with the other issue is that there is no client disconnect involved.
ValueError: incomparable [0/1943]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/tmp/nix-build-python3.5-BTrees-4.1.4.drv-0/BTrees-4.1.4/BTrees/tests/testBTrees.py", line 426, in testFoo
t[DoesntLikeBeingCompared()] = None
File "/tmp/nix-build-python3.5-BTrees-4.1.4.drv-0/BTrees-4.1.4/BTrees/tests/testBTrees.py", line 416, in __cmp__
raise ValueError('incomparable')
SystemError: <class 'ValueError'> returned a result with an error set
Built on my own Linux machine.
Thank you!
https://ci.appveyor.com/project/mgedmin/btrees/build/1.0.6/job/qg5a0mkrfo00jyg2#L51
ERROR: test_extremes (BTrees.tests.test_OLBTree.OLBTreeTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "c:\projects\btrees\BTrees\tests\test_OLBTree.py", line 120, in test_extremes
btree['SMALLEST_64_BITS'] = SMALLEST_64_BITS
OverflowError: Python int too large to convert to C long
All 32 bit versions work fine, 64 bit python 2.7 passes as well. https://ci.appveyor.com/project/mgedmin/btrees/build/1.0.6
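I ran into a problem with the multiunion function: with the default C implementation it reports an error, but it works correctly with the Python implementation. For example:
from BTrees.OOBTree import OOBTree
from BTrees import family32
IF = family32.IF  # assumption: the report does not show how IF was imported
fwd_index = OOBTree()
tp = (1, 2, 3)
fwd_index["a"] = tp
IF.multiunion(fwd_index.values(("a",)))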
If an object's class attribute is changed, then it won't be correctly unghostified from a retrieval via B-tree since the class is baked into the bucket contents (as an optimization).
Perhaps a solution would be to listen to a commit event for objects loaded through a B-tree and fix-up any buckets if necessary.
I had to pin BTrees < 4.4 in zope.index to get rid of the test failure.
See zopefoundation/zope.index#10.
I am not sure whether the problem is within BTrees or whether zope.index has to be changed.
Any help is welcome.
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/opt/lib/python2.7/site-packages/ZODB3-3.9.7-py2.7-linux-x86_64.egg -I/opt/include/python2.7 -c BTrees/_OOBTree.c -o build/temp.linux-x86_64-2.7/BTrees/_OOBTree.o
In file included from BTrees/BTreeModuleTemplate.c:420:0,
from BTrees/_OOBTree.c:39:
BTrees/BTreeTemplate.c: In function ‘_BTree_set’:
BTrees/BTreeTemplate.c:736:5: warning: implicit declaration of function ‘PER_READCURRENT’ [-Wimplicit-function-declaration]
PER_READCURRENT(self, goto Error);
^
BTrees/BTreeTemplate.c:736:27: error: expected expression before ‘goto’
PER_READCURRENT(self, goto Error);
^
error: command 'gcc' failed with exit status 1
Installing the BTrees 4.4.1 manylinux wheels for Python 2.7 doesn't get you the C extension.
This shows up as a build failure in anything that needs the C extension. Notably, because of zopefoundation/zope.security#20, security checks on iter(tree.items()) fail.
These wheels should probably be deleted from PyPI.
(Unfortunately they can't be re-uploaded again.)
(screening)NY120-14-143BX:~ vkarri$ pip install btrees
Collecting btrees
Downloading BTrees-4.2.0.tar.gz (209kB)
100% |████████████████████████████████| 212kB 1.5MB/s
Complete output from command python setup.py egg_info:
Download error on https://pypi.python.org/simple/persistent/: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590) -- Some packages may not be found!
Couldn't find index page for 'persistent' (maybe misspelled?)
Download error on https://pypi.python.org/simple/: EOF occurred in violation of protocol (_ssl.c:590) -- Some packages may not be found!
No local packages or download links found for persistent
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/sg/sjflst_x7tg9zq115lqy0yrs5z65tg/T/pip-build-8sK7EZ/BTrees/setup.py", line 160, in <module>
"""
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/core.py", line 111, in setup
_setup_distribution = dist = klass(attrs)
File "/Users/vkarri/pyenvs/screening/lib/python2.7/site-packages/setuptools/dist.py", line 268, in __init__
self.fetch_build_eggs(attrs['setup_requires'])
File "/Users/vkarri/pyenvs/screening/lib/python2.7/site-packages/setuptools/dist.py", line 313, in fetch_build_eggs
replace_conflicting=True,
File "/Users/vkarri/pyenvs/screening/lib/python2.7/site-packages/pkg_resources/__init__.py", line 836, in resolve
dist = best[req.key] = env.best_match(req, ws, installer)
File "/Users/vkarri/pyenvs/screening/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1081, in best_match
return self.obtain(req, installer)
File "/Users/vkarri/pyenvs/screening/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1093, in obtain
return installer(requirement)
File "/Users/vkarri/pyenvs/screening/lib/python2.7/site-packages/setuptools/dist.py", line 380, in fetch_build_egg
return cmd.easy_install(req)
File "/Users/vkarri/pyenvs/screening/lib/python2.7/site-packages/setuptools/command/easy_install.py", line 623, in easy_install
raise DistutilsError(msg)
distutils.errors.DistutilsError: Could not find suitable distribution for Requirement.parse('persistent')
We have a corrupt BTree in a ZODB. I don't know exactly how we got here, but it happened recently, with the most recent releases of BTrees and its C extension, so I suspect there's a lurking bug.
This happens to be an IOBTree in a catalog, so it's probably pretty large.
The first symptom is that doing list(btree.keys()) results in RuntimeError: the bucket being iterated changed size. But this is a single-threaded program, and that's an atomic call implemented in C, so changing size is not possible.
When we subsequently do btree._check() we get this happy error: AssertionError: Bucket next pointer is damaged. Those two errors seem to fit correctly together (I think).
Here's where it gets interesting. If we do the same two operations with the Python implementation instead of the C implementation, they both pass. The iteration doesn't do that kind of checking, so the absence of a RuntimeError is not surprising. But the Python _check does look for next pointers being damaged, and doesn't raise an error on our tree:
for i in range(len(data)-1):
    assert_(data[i].child._next is data[i+1].child,
            "Bucket next pointer is damaged")
So we appear to have a case where Python and C are interpreting the same pickle data in different ways. (That's bad.)
It's highly probable that this BTree was written to by both the Python and C implementations at different times over the past few weeks. I think we removed the BTree C extension while we investigated zopefoundation/persistent#62 (BTree being another library with a C extension that had recently changed). It seems the obvious compatibility issues of mixing C and Python implementations may be gone, but there's still something going on there.
I can try to take a look at this (if no one wants to beat me to it 😄 ), but it'll probably be awhile before I can get back to it in depth.
The BTree package only builds if I specify PURE_PYTHON. This is due to some inside baseball nonsense between setuptools and distutils and that thing that always works out: monkey-patching.
While not a technical issue with this project, it does seem possible to mitigate the issue by changing around how things are imported: https://bitbucket.org/pypa/setuptools/issue/309/error-each-element-of-ext_modules-option
I have a BTree which I am persisting via ZODB. In this BTree I add and remove items over time. Somehow, I've managed to get a BTree which has a key for a particular item, but raises a KeyError if I try to actually access that item. Whatever happened to this BTree, it's been persisted to my ZODB, and even closing/reloading from the DB maintains the same behavior.
In the code snippets below, repro is the BTree in question, loaded from ZODB.
print('job-0000000014' in repro) # prints True
print('job-0000000386' in repro) # prints False
for item in repro:
    print(item)
shows
...
job-0000000014
...
job-0000000386
Finally, this code raises a KeyError on job-0000000386:
for item in repro:
    print('{} = {}'.format(item, repro[item]))
I stepped through a bit in a debugger and it seems that this item is the last item in the last bucket, and interestingly, I can get the item from the bucket directly -- the following code works fine and returns the object t2.
bucket = repro._firstbucket
while bucket._next is not None:
    bucket = bucket._next
t1 = bucket.get('job-0000000281')
t2 = bucket.get('job-0000000386')
Digging a bit more, it seems that calling repro.maxKey() triggers an access violation: -1073741819 (0xC0000005).
If you need more details about the structure of my BTree which reproduces the problem, I can share them with you. (Unfortunately I am not able to construct a new one which reproduces this problem deterministically, but I have a copy of one from a ZODB.)
The current buildout relies on being able to treat scripts as executables. This only works on newer Linux kernels.
The need for this seems to be driven by:
Some alternatives:
-s argument in the test runner to only run the BTrees tests.
I vote for using submodules.
It seems like there was some sort of unlogged problem building the Windows wheels for the last release. @mgedmin would you mind restarting this build: https://ci.appveyor.com/project/mgedmin/btrees/build/1.0.60
I helped set up wheel building for Windows and Mac OS X about a year ago. In the meantime, the infrastructure for building manylinux1 wheels has gotten better.
Is there interest in PRs for building manylinux1 wheels for this?
Here's an example of how I've been using Travis to build manylinux wheels:
The C implementation does a comparison by first checking whether the two keys are the same pointer; the Python implementation just goes right to the == operator. In some cases of (broken?) objects this leads to a discrepancy: The C implementation can find a key, but the Python implementation cannot.
I would suggest that the Python implementation should use k is key or k == key to clear this up.
See https://groups.google.com/forum/#!topic/zodb/xhVM0ejl6aE
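The discrepancy can be shown without BTrees at all. The sketch below uses a hypothetical Broken class and two hypothetical lookup helpers; it illustrates the two comparison strategies and is not the library's code.

```python
class Broken:
    """Hypothetical key whose __eq__ never reports equality, even with itself."""

    def __eq__(self, other):
        return False

    def __hash__(self):
        return id(self)


def find_eq_only(keys, key):
    """Lookup that relies on == alone."""
    return any(k == key for k in keys)


def find_identity_first(keys, key):
    """Lookup that checks identity before falling back to ==."""
    return any(k is key or k == key for k in keys)


key = Broken()
keys = [key]
print(find_eq_only(keys, key))         # False
print(find_identity_first(keys, key))  # True
```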
This was requested on the BTrees mailing list:
On Mar 14, 2019, at 13:16, Tres Seaver wrote:
On 3/14/19 10:28 AM, Qiwen Chen wrote:
Is there an easy way to extend the F (32-bit C float) type in BTrees (such as IFBTree, LFBTree) to a DOUBLE type?
I currently use IOBTree and LOBTree for storing double type. But I would assume it's not as efficient.
There is no knob to allow that. You could cobble it together yourself, following the pattern for the float-value trees:
- Copy 'BTrees/floatvalue.h' -> 'doublevalue.h' and make the appropriate changes (figure out which Python API calls need to change).
- Add new interfaces to 'BTrees/Interfaces.py' (IIntegerDoubleBTreeModule, ILongDoubleBTreeModule)
- Copy 'BTrees/_IFBTree.c' -> '_IDBTree.c' and make appropriate changes ('#include "doublevaluemacros.h"', etc.). Ditto for 'BTrees/_LFBTree.c'.
- Copy 'BTrees/IFBTree.py' -> 'IDBTree.py' and make appropriate changes. Ditto for 'BTrees/LFBTree.py'.
- Copy 'BTrees/tests/test_IFBTree.py' and make appropriate changes. Ditto for 'BTrees/tests/test_LFBTree.py'. You could maybe skip this part, but they would be required if you wanted to get your branch merged.
- Update 'setup.py' to add the new 'FLAVORS' and 'FAMILIES' values.
- Test thoroughly, including measuring RAM saved over the 'IO' / 'LO' variants. :)
It would be awesome if BTrees implemented the __reversed__ method.
from collections.abc import Reversible
from BTrees.IOBTree import IOBTree
isinstance(IOBTree(), Reversible)
# >>> False
for _ in reversed(IOBTree()):
    pass
# >>> TypeError: 'BTrees.IOBTree.IOBTree' object is not reversible
What do you think?
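For reference, here is what the request amounts to at the protocol level. This is a sketch on a hypothetical SortedKeys class, not a patch for BTrees: defining __iter__ and __reversed__ is enough for both the isinstance check and reversed() to work.

```python
from collections.abc import Reversible


class SortedKeys:
    """Hypothetical ordered container, standing in for a tree's key view."""

    def __init__(self, keys):
        self._keys = sorted(keys)

    def __iter__(self):
        return iter(self._keys)

    def __reversed__(self):
        # Walk the same data from the largest key down to the smallest.
        return reversed(self._keys)


keys = SortedKeys([3, 1, 2])
print(isinstance(keys, Reversible))  # True
print(list(reversed(keys)))          # [3, 2, 1]
```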
Reading the docs, it seems this library is made to work with ZODB. However, I'm trying to use it simply as an in-memory BTree to store some ordered data and query it efficiently afterwards. Since it's implemented in C (or the critical parts at least), it seems to yield slightly better results than a simple Python list with a bisect algorithm, not to mention a much cleaner syntax for range searches.
Am I using this wrong since it's optimized around ZODB? Should I consider some other implementation?
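For comparison, the list-plus-bisect baseline mentioned above might look like the sketch below. The helper name range_query is hypothetical. With a BTree the same query would normally be written with the min and max arguments of keys(), which is the cleaner syntax the question refers to.

```python
import bisect


def range_query(sorted_keys, low, high):
    """Return the keys k with low <= k <= high from an already sorted list."""
    start = bisect.bisect_left(sorted_keys, low)
    stop = bisect.bisect_right(sorted_keys, high)
    return sorted_keys[start:stop]


keys = list(range(0, 100, 5))
print(range_query(keys, 12, 31))  # [15, 20, 25, 30]
```

The lookup itself is logarithmic, but keeping the list sorted costs linear time per insertion, which is where a tree structure tends to pay off.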
Example:
Python 3.4.5 (default, Jun 27 2016, 04:57:21)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import BTrees, pickle
>>> bt = BTrees.LOBTree.BTree()
>>> for i in range(15000): bt[i] = i
...
>>> pickle.dumps(bt)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded while calling a Python object
You can of course call sys.setrecursionlimit to make this go away...up to a point, at which time your Python segfaults with something like this (I think _pickle.so here is from zodbpickle):
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 org.python.python 0x0000000105f41b1e _PyUnicode_AsUTF8String + 26
1 org.python.python 0x0000000105f416f5 PyUnicode_AsEncodedString + 409
2 cPersistence.so 0x0000000107981978 Per_getattro + 232
3 org.python.python 0x0000000105ef2054 PyObject_CallMethodObjArgs + 124
4 cPersistence.so 0x0000000107980050 pickle___reduce__ + 144
5 org.python.python 0x0000000105ef19f2 PyObject_Call + 103
6 org.python.python 0x0000000105f899d2 PyEval_CallObjectWithKeywords + 165
7 org.python.python 0x0000000105f3b256 object_reduce_ex + 196
8 org.python.python 0x0000000105ef19f2 PyObject_Call + 103
9 _pickle.so 0x0000000107e3a71d _Pickle_FastCall + 51
10 _pickle.so 0x0000000107e394a6 save + 4598
11 _pickle.so 0x0000000107e3b8b0 store_tuple_elements + 59
12 _pickle.so 0x0000000107e39003 save + 3411
13 _pickle.so 0x0000000107e3ab80 save_reduce + 1068
14 _pickle.so 0x0000000107e39a1d save + 5997
15 _pickle.so 0x0000000107e3b8b0 store_tuple_elements + 59
16 _pickle.so 0x0000000107e39003 save + 3411
17 _pickle.so 0x0000000107e3ab80 save_reduce + 1068
18 _pickle.so 0x0000000107e39a1d save + 5997
19 _pickle.so 0x0000000107e3b8b0 store_tuple_elements + 59
The exact number of items it takes for this to happen depends on the BTree type; an IIBTree, for example, can hold more items before it gets here (different bucket sizes?). I suspect it also depends on the system and the word size. On my system the limit is somewhere under 15,000 items for the LOBTree above (where even the values are primitive types).
This is a consequence of the way a BTree is composed of buckets which are composed of other buckets...and their stored state all boils down to tuples:
>>> state = bt.__getstate__()
>>> len(state), type(state)
(2, <class 'tuple'>)
>>> type(state[0][0]), type(state[0][0].__getstate__()[0])
(<class 'BTrees.LOBTree.LOBucket'>, <class 'tuple'>)
This works out fine when used in the context of ZODB because the sub-buckets are replaced with persistent object identifiers. It just means that a large-ish BTree can't be pickled outside of ZODB or some similar system.
I doubt anything can be done about this without major redesign and breaking compatibility, but maybe someone will have an idea. And maybe it's worth a mention in some docs somewhere? (I also wanted to leave this here for Google's sake.)
(For what it's worth, this isn't actually a problem for me. It came up in the context of testing some cache persistence strategies for RelStorage. Pickling a single large dict with ~600K keys in it is very slow prior to pickle protocol 4, so I wondered if BTrees might work better. Because of this they didn't, and I didn't want to roll a mini-persistent object system, so I went a different direction.)
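The effect can be reproduced with plain tuples. The sketch below is an assumption-laden stand-in: nested() builds a chain of 1-tuples that only loosely imitates state that contains more state, and pickle._Pickler is CPython's pure-Python pickler (an underscore-prefixed implementation detail), used here so that the interpreter's recursion limit applies predictably.

```python
import io
import pickle
import sys


def nested(depth):
    """Build a chain of 1-tuples, each wrapping the previous one."""
    state = ()
    for _ in range(depth):
        state = (state,)
    return state


def pickles_ok(obj):
    """Return True if the pure-Python pickler can serialize obj."""
    try:
        # The pickler descends one level per nested tuple, so the depth of
        # the data becomes the depth of the call stack.
        pickle._Pickler(io.BytesIO(), 2).dump(obj)
    except RecursionError:
        return False
    return True


shallow = nested(10)
deep = nested(sys.getrecursionlimit() * 2)
print(pickles_ok(shallow))  # True
print(pickles_ok(deep))     # False
```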
As I found, OOBTreeItems cannot be sliced backwards or with a step greater than 1, seemingly due to this line.
In [1]: from BTrees.OOBTree import OOBTree
In [2]: a = OOBTree({1:"one", 2:"two", 3:"three"})
In [3]: list(a.keys()[1:2:-1])
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-3-2e5a94b46d47> in <module>()
----> 1 list(a.keys()[1:2:-1])
RuntimeError: slices must have step size of 1
Are there any plans to develop this feature? Does it require a lot of work?
For the moment, I use workarounds like reversed(a.keys()[1:2]) to simulate a -1 step, but I suppose it would be more efficient if that was coded in C?
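The workaround generalizes to any step. In the sketch below, StepOneOnly is a hypothetical stand-in that rejects other steps the way the report describes, and slice_with_step is a hypothetical helper. Note that it applies the step to the already selected window, like the reversed() workaround, which is not how Python's own extended slices treat negative steps.

```python
class StepOneOnly:
    """Stand-in for a view that only supports slices with step 1."""

    def __init__(self, data):
        self._data = list(data)

    def __getitem__(self, index):
        if isinstance(index, slice) and index.step not in (None, 1):
            raise RuntimeError("slices must have step size of 1")
        return self._data[index]


def slice_with_step(view, start, stop, step):
    """Take the step-1 window first, then apply the step to a copy of it."""
    window = list(view[start:stop])
    return window[::step]


view = StepOneOnly([1, 2, 3, 4, 5])
print(slice_with_step(view, 1, 4, -1))  # [4, 3, 2]
print(slice_with_step(view, 1, 4, 2))   # [2, 4]
```

This copies the window into a list, so it costs memory proportional to the slice length.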
I ran into an implementation difference between the C version and the Python versions.
Here's PyPy:
Python 2.7.10 (7e8df3df9641, Jun 14 2016, 13:30:54)
[PyPy 5.3.1 with GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>> import BTrees
>>>> BTrees.OOBTree.OOBTree().get(object())
>>>>
Here's CPython:
Python 2.7.12 (default, Jul 11 2016, 16:16:26)
[GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import BTrees
>>> BTrees.OOBTree.OOBTree().get(object())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Object has default comparison
>>>
The specific type of object with default comparison doesn't really matter. And they both raise a TypeError if you try to set a key with default comparison (albeit with different error messages).
Which is the right behaviour here? I would argue that the Python behaviour is certainly nicer and friendlier to the user. It makes it easier to substitute a BTree for a dict in more places. In fact, on CPython, I had to subclass the BTree to be able to use it in place of a dict because of this TypeError (even though I had already made sure that we would never try to set such a value, sometimes we still query for them).
If there's consensus I can put together a PR to bring them into line.
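Until the two implementations agree, a small wrapper is one alternative to subclassing. The sketch below uses a hypothetical StrictMapping as a stand-in for the C behaviour and a hypothetical lenient_get helper; catching TypeError this broadly can also hide unrelated errors, so treat it as an illustration only.

```python
class StrictMapping(dict):
    """Stand-in that rejects plain object() keys on lookup, as the C get() does."""

    def get(self, key, default=None):
        if type(key) is object:
            raise TypeError("Object has default comparison")
        return super().get(key, default)


def lenient_get(mapping, key, default=None):
    """Treat a key the mapping refuses to compare as simply absent."""
    try:
        return mapping.get(key, default)
    except TypeError:
        return default


m = StrictMapping(a=1)
print(lenient_get(m, "a"))            # 1
print(lenient_get(m, object(), "x"))  # x
```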
The Python implementation currently allows this.
See https://groups.google.com/d/msg/zodb/xhVM0ejl6aE/7mXRaZeEAQAJ
OS X 10.8.3, Python 3.3.1, ZODB 4.0.0b2. Traceback below. Python config.log at:
https://gist.github.com/pauleveritt/5621925
In file included from BTrees/_LOBTree.c:41:
BTrees/BTreeModuleTemplate.c:105:18: warning: signed shift result (0x100000000)
requires 34 bits to represent, but 'int' only has 32 bits
[-Wshift-overflow]
maxint = INT_GETMAX();
^~~~~~~~~~~~
BTrees/_compat.h:27:24: note: expanded from macro 'INT_GETMAX'
~^ ~~
1 warning generated.
In file included from BTrees/_OLBTree.c:41:
BTrees/BTreeModuleTemplate.c:105:18: warning: signed shift result (0x100000000)
requires 34 bits to represent, but 'int' only has 32 bits
[-Wshift-overflow]
maxint = INT_GETMAX();
^~~~~~~~~~~~
BTrees/_compat.h:27:24: note: expanded from macro 'INT_GETMAX'
~^ ~~
1 warning generated.
In file included from BTrees/_LLBTree.c:43:
BTrees/BTreeModuleTemplate.c:105:18: warning: signed shift result (0x100000000)
requires 34 bits to represent, but 'int' only has 32 bits
[-Wshift-overflow]
maxint = INT_GETMAX();
^~~~~~~~~~~~
BTrees/_compat.h:27:24: note: expanded from macro 'INT_GETMAX'
~^ ~~
1 warning generated.
In file included from BTrees/_LFBTree.c:43:
BTrees/BTreeModuleTemplate.c:105:18: warning: signed shift result (0x100000000)
requires 34 bits to represent, but 'int' only has 32 bits
[-Wshift-overflow]
maxint = INT_GETMAX();
^~~~~~~~~~~~
BTrees/_compat.h:27:24: note: expanded from macro 'INT_GETMAX'
The interface for IFBTree.values and LFBTree.values led me to believe I could obtain an iterator for entries with float values falling between two points, min= and max=.
However, it throws 'expected integer key' when using min or max. Perhaps I misunderstand the interface?
To reproduce, paste the following to Python >=3.4 repl:
from BTrees.LFBTree import LFBTree
import random
lft = LFBTree()
for i in range(10000):
    lft.update({random.randint(a=1, b=9999999): random.random()})
vals = lft.values(min=0.1, max=0.2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: expected integer key
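One reading of the error is that min and max bound the keys (integers in an LFBTree) rather than the float values, which would explain why 0.1 is rejected. Under that assumption, selecting by value means scanning the values. The sketch below uses a plain dict as a stand-in and a hypothetical helper name.

```python
def values_between(mapping, low, high):
    """Yield the values v with low <= v <= high; this visits every entry."""
    for value in mapping.values():
        if low <= value <= high:
            yield value


data = {1: 0.05, 2: 0.15, 3: 0.12, 4: 0.9}
print(sorted(values_between(data, 0.1, 0.2)))  # [0.12, 0.15]
```

If range queries on the floats are frequent, a second index keyed by the value is the usual alternative to a full scan.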
Would it be possible to have binary wheels for Mac and Windows on PyPI? I'd be happy to build them or set up scripts for AppVeyor and Travis CI to build them for you.
Release 4.1.1 fails one test on 32-bit Linux systems with python 3.4:
Traceback (most recent call last):
File "/builddir/build/BUILD/python-BTrees-4.1.1/python3-BTrees-4.1.1/BTrees/tests/test_OLBTree.py", line 118, in test_extremes
btree['SMALLEST_64_BITS'] = SMALLEST_64_BITS
OverflowError: Python int too large to convert to C long
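A quick way to see whether a platform can hit this error is to check the width of a C long there. The sketch below assumes the test constant is the smallest signed 64-bit integer (the report does not show its definition), and fits_in_c_long is a hypothetical helper.

```python
import ctypes

# Assumption: the test constant is the smallest signed 64-bit integer.
SMALLEST_64_BITS = -(1 << 63)


def fits_in_c_long(value):
    """Return True if value is representable as a C long on this platform."""
    bits = ctypes.sizeof(ctypes.c_long) * 8
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


# Prints the width of a C long in bits and whether the constant fits in it.
print(ctypes.sizeof(ctypes.c_long) * 8, fits_in_c_long(SMALLEST_64_BITS))
```

Where a C long is 32 bits wide, the constant does not fit, which is consistent with the OverflowError message above.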
This occasionally shows up in the ZODB tests of multiple threads under PyPy:
Error in test check7ZODBThreads (ZODB.tests.testMVCCMappingStorage.MVCCMappingStorageTests)
Traceback (most recent call last):
File "/opt/python/pypy2.7-7.1.1/lib-python/2.7/unittest/case.py", line 329, in run
testMethod()
File "/home/travis/build/zopefoundation/ZODB/src/ZODB/tests/MTStorage.py", line 234, in check7ZODBThreads
self._checkNThreads(7, ZODBClientThread, db, self)
File "/home/travis/build/zopefoundation/ZODB/src/ZODB/tests/MTStorage.py", line 222, in _checkNThreads
t.join(60)
File "/home/travis/build/zopefoundation/ZODB/src/ZODB/tests/MTStorage.py", line 45, in join
self._exc_info[0], self._exc_info[1], self._exc_info[2])
File "/home/travis/build/zopefoundation/ZODB/src/ZODB/tests/MTStorage.py", line 37, in run
self.runtest()
File "/home/travis/build/zopefoundation/ZODB/src/ZODB/tests/MTStorage.py", line 68, in runtest
self.commit(d, i)
File "/home/travis/build/zopefoundation/ZODB/src/ZODB/tests/MTStorage.py", line 75, in commit
transaction.commit()
File "/home/travis/build/zopefoundation/ZODB/eggs/transaction-2.4.0-py2.7.egg/transaction/_manager.py", line 252, in commit
return self.manager.commit()
File "/home/travis/build/zopefoundation/ZODB/eggs/transaction-2.4.0-py2.7.egg/transaction/_manager.py", line 131, in commit
return self.get().commit()
File "/home/travis/build/zopefoundation/ZODB/eggs/transaction-2.4.0-py2.7.egg/transaction/_transaction.py", line 316, in commit
self._synchronizers.map(lambda s: s.afterCompletion(self))
File "/home/travis/build/zopefoundation/ZODB/eggs/transaction-2.4.0-py2.7.egg/transaction/weakset.py", line 61, in map
f(elt)
File "/home/travis/build/zopefoundation/ZODB/eggs/transaction-2.4.0-py2.7.egg/transaction/_transaction.py", line 316, in <lambda>
self._synchronizers.map(lambda s: s.afterCompletion(self))
File "/home/travis/build/zopefoundation/ZODB/src/ZODB/Connection.py", line 757, in afterCompletion
self.newTransaction(transaction, False)
File "/home/travis/build/zopefoundation/ZODB/src/ZODB/Connection.py", line 737, in newTransaction
invalidated = self._storage.poll_invalidations()
File "/home/travis/build/zopefoundation/ZODB/src/ZODB/tests/MVCCMappingStorage.py", line 99, in poll_invalidations
excludemin=True, excludemax=False):
File "/home/travis/build/zopefoundation/ZODB/eggs/BTrees-4.5.1-py2.7-linux-x86_64.egg/BTrees/_base.py", line 1218, in __iter__
for k in getattr(bucket, itertype)(*iterargs):
File "/home/travis/build/zopefoundation/ZODB/eggs/BTrees-4.5.1-py2.7-linux-x86_64.egg/BTrees/_base.py", line 397, in <genexpr>
for i in xrange(*self._range(*args, **kw)))
IndexError: list index out of range
This grows out of the discussion in #21. What happened in my program is this: I have some greenlets writing changes to a BTree, and another greenlet committing the changes. Under enough load, the BTrees start to throw KeyErrors when reading keys they're supposed to contain. This only happens when using ClientStorage and a server, not on direct access to a DemoStorage. So please let me know if I should redirect this error report to the ZEO package.
I'm aware it's not an ideal setup, and I probably should have used a separate transaction manager for every single greenlet. However, I was using this setup before with direct file access, and thought having the database run as a separate server process should be a drop-in replacement. Here is the code:
"""
As counterpart for this test, run a ZEO server with this config:
<zeo>
address /tmp/zodbsock
</zeo>
<demostorage>
</demostorage>
"""
from gevent import monkey, joinall, spawn, coros, sleep
monkey.patch_all()
import logging
import random
import string
import transaction
from BTrees.OOBTree import OOBTree
from ZODB import DB, DemoStorage
from ZEO import ClientStorage
class Database(object):
    _closed = True

    def __init__(self, socket):
        self.socket = socket
        self.commit_lock = coros.RLock()
        self.open()

    def open(self):
        if not self._closed:
            return
        if self.socket:
            self.storage = ClientStorage.ClientStorage(self.socket)
        else:
            self.storage = DemoStorage.DemoStorage()
        self.db = DB(self.storage)
        self.tm = transaction.TransactionManager()
        self.connection = self.db.open(transaction_manager=self.tm)
        self.root = self.connection.root()
        self._closed = False

    def close(self):
        if self._closed:
            return
        self.tm.commit()
        self.connection.close()
        self.db.close()
        self.storage.close()
        del self.root
        del self.connection
        del self.db
        del self.storage
        self._closed = True

    def commit(self):
        with self.commit_lock:
            self.tm.commit()
            self.tm.begin()
            print("Commited")

NOBJS = 10000

def randstr(n):
    return ''.join(random.choice(string.ascii_lowercase) for _ in range(n))

def keyfunc(k1, k2='asdf', **kwargs):
    return str(k1) + '_' + k2

def init(sock=None):
    db = Database(sock)
    if 'test' in db.root:
        del db.root['test']
        db.commit()
    db.root['test'] = test = OOBTree()
    for i in xrange(NOBJS):
        test[keyfunc(i)] = randstr(20)
    db.commit()
    return db

def _stress_inner(tree, j):
    """ Read, update and insert data """
    i = random.randint(0, NOBJS - 1)
    # Without reading, no errors happen
    _ = tree[keyfunc(i)]
    tree[keyfunc(i)] = randstr(30)
    # Adding no items reduces the KeyErrors
    tree[keyfunc(i + j * 1000, 'extra')] = randstr(20)

def stress(db, i):
    tree = db.root['test']
    for j in range(100):
        _stress_inner(tree, j)
        # Using a plain sleep() reduces the chance for KeyErrors
        sleep(random.random() / 1000)

def main():
    logging.basicConfig(level=logging.INFO)
    # Using plain init() and the KeyErrors are gone
    db = init('/tmp/zodbsock')
    try:
        stressers = [spawn(stress, db, i) for i in xrange(100)]
        while not all(s.ready() for s in stressers):
            db.commit()
            sleep(0.1)
        joinall(stressers)
        db.commit()
    finally:
        db.close()

if __name__ == '__main__':
    main()
The errors I get look like this:
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/gevent/greenlet.py", line 327, in run
result = self._run(*self.args, **self.kwargs)
File "crashdemo.py", line 106, in stress
_stress_inner(tree, j)
File "crashdemo.py", line 97, in _stress_inner
_ = tree[keyfunc(i)]
KeyError: '2063_asdf'
<Greenlet at 0x7fdcdff6e7d0: stress(<__main__.Database object at 0x7fdce040b190>, 36)> failed with KeyError
Depending a bit on luck, I'd say they happen in 90% of the cases, maybe 50 per run. As you can see from the code, all these missing keys got created beforehand, so they should be available.
Given this class:
class Bad(object):
    def __eq__(self, other):
        return False
Under Python 2.7, you can use instances of this class as a key in any size C BTree.
Under Python 3, you can insert this into an empty tree only. Once you do, the tree is broken. Jim Fulton suggested that the key should be compared to itself on an empty tree to catch this (but given that pointer equality is checked first, it's not clear exactly how that would be done).
The Python implementation rejects this object in all versions in all size trees.
See https://groups.google.com/forum/#!topic/zodb/xhVM0ejl6aE
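The self-comparison check suggested above can be sketched as a standalone validator. The function name check_key is hypothetical, and this is not how either implementation is written. As the report points out, a container that tests identity before equality would never reach the == call, so the check has to be made explicitly.

```python
class Bad(object):
    def __eq__(self, other):
        return False


def check_key(key):
    """Reject keys that do not compare equal to themselves."""
    if not (key == key):
        raise TypeError("key does not compare equal to itself: %r" % (key,))
    return key


check_key("job-1")  # passes silently
try:
    check_key(Bad())
except TypeError as exc:
    print("rejected:", type(exc).__name__)  # rejected: TypeError
```

The same check also rejects float('nan'), which is another value that is not equal to itself.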
Keys such as None currently result in KeyError in the subscript cases, even if they're in the tree. (How are they in the tree? Unpickling old data, or #52.) We should be able to delete them.
The Python implementation allows this.
See https://groups.google.com/d/msg/zodb/xhVM0ejl6aE/NMBsQDNcAQAJ
@gforcada and I are running into semi-rare weird error messages that seem to indicate that something in the BTrees can't handle issues when an object can't be loaded properly. This seems to be somewhat of a race condition around the C-level implementation of getitem, as many times the exception will occur around something like those lines of code:
File "/srv/s-derfreitag/deployment/work/plone/eggs/zope.intid-3.7.2-py2.7.egg/zope/intid/__init__.py", line 87, in getId
return self.ids[key]
SystemError: error return without exception set
Module Products.PluginIndexes.common.UnIndex, line 458, in _apply_index
SystemError: error return without exception set
return self._tree[id].__of__(self)
SystemError: error return without exception set
Whenever this appears, either the Zope server is being shut down at the moment and ZEO has already disconnected, or the ZEO server has had a disconnect and immediately before those errors we see the ZEO client complaining:
File "/srv/s-derfreitag/deployment/work/plone/eggs/ZODB3-3.10.7-py2.7-linux-x86_64.egg/ZODB/Connection.py", line 901, in _setstate
p, serial = self._storage.load(obj._p_oid, '')
AttributeError: 'NoneType' object has no attribute 'load'
File "/srv/s-derfreitag/deployment/work/plone/eggs/ZODB3-3.10.7-py2.7-linux-x86_64.egg/ZEO/ClientStorage.py", line 88, in __getattr__
raise ClientDisconnected()
One thing I'm wondering about is why there would be anything still processing data or requests if Zope already disconnected the database. My memory tells me that Zope would allow threads to finish their requests and then shut down.
So, from reading and googling the error message, it seems that somewhere an error is signalled to the Python C API but no exception is set. I looked around a bit and a wild guess would be that this could be an issue from cPersistence in the PER_USE macro (it seems to signal an error condition but apparently doesn't set an exception) or from the _bucket_get function in BucketTemplate.c:86, where we return NULL to indicate an error and seem to implicitly rely on someone else already having set an exception. I could be digging at the completely wrong end, though ... :)
from BTrees.OOBTree import OOBTree
keys = [('a', 100), ('b', 200), ('c', None)]
tree = OOBTree()
for i, key in enumerate(keys):
    tree[key] = i
tree[('c', None)] # works
tree[('a', None)] # TypeError
tree[(None, None)] # TypeError
I recently migrated from py2, where the above code worked fine. With py3, the stricter comparison seems to cause some trouble. Tested with the latest version from PyPI.
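The pattern of failures above is consistent with how Python 3 compares tuples: elements are compared pairwise, and only the first pair that differs is ordered. The sketch below shows this with plain tuples, no BTrees required. The `can_order` and `sortable` helpers are hypothetical names introduced for illustration; `sortable` is one possible workaround, not something BTrees provides.

```python
def can_order(a, b):
    """Return True if `a < b` can be evaluated without a TypeError."""
    try:
        a < b
    except TypeError:
        return False
    return True


stored = ('a', 100)

print(can_order(('c', None), stored))   # first elements differ; None is never compared
print(can_order(('a', None), stored))   # first elements tie, so None < 100 is attempted
print(can_order((None, None), stored))  # None < 'a' is attempted


def sortable(key):
    """Hypothetical helper: wrap each element so that None sorts before any value."""
    return tuple((0,) if part is None else (1, part) for part in key)


print(can_order(sortable(('a', None)), sortable(stored)))
```

For a wrapper like this to help, every key would have to be wrapped both when it is stored and when it is looked up, and elements of different non-None types in the same position would still fail to compare.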
This has become common for zopefoundation packages such as persistent, Acquisition, zope.security and so on. It's convenient for debugging and exploring the system, and also for testing.
BTrees does not support this variable, and this causes problems because it conflicts with persistent, which does:
$ pip freeze | egrep "BTree|pers"
BTrees==4.4.1
persistent==4.2.4.2
$ cat test.py
import persistent
from BTrees.OOBTree import OOBTree
class BaseClass(persistent.Persistent):
    pass
class MyTree(OOBTree, BaseClass):
    pass
$ python --version
Python 3.6.2
$ python test.py # C Extensions work
$ PURE_PYTHON=1 python test.py # Mixed does not
Traceback (most recent call last):
  File "test.py", line 7, in <module>
    class MyTree(OOBTree, BaseClass):
TypeError: multiple bases have instance lay-out conflict
If this is agreeable, I can work up a PR when I get some time.
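The same kind of TypeError can be reproduced with the standard library alone. This is a minimal sketch, assuming the mixed case ends up with two base classes that each fix their own instance layout; the class names are hypothetical and stand in for the C-based tree class and the pure-Python persistent base.

```python
class SlottedA(object):
    __slots__ = ('a',)


class SlottedB(object):
    __slots__ = ('b',)


try:
    # Both bases define their own instance layout, so CPython cannot merge them.
    type('Mixed', (SlottedA, SlottedB), {})
    conflict = None
except TypeError as exc:
    conflict = str(exc)

print(conflict)
```

When both bases come from the same implementation (all C or all Python), they share a common layout and the class statement succeeds, which matches the "C Extensions work" line above.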
Why isn't there any documentation for the BTrees.Length.Length class? Is that because it doesn't have an interface and the documentation is supposed to contain only interfaces? Is Length even an official part of BTrees if it's undocumented? It is mentioned in the ZODB docs, that's why I'm wondering...
Seen during test runs under Python 2
//BTrees/BTrees/_base.py:1502: DeprecationWarning: integer argument expected, got float
if not unpack("i", pack("i", v))[0] == v: #pragma: no cover
//BTrees/BTrees/_base.py:1521: DeprecationWarning: integer argument expected, got float
if not unpack("q", pack("q", v))[0] == v: #pragma: no cover
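The two flagged lines are round-trip range checks: a value is packed into a fixed-width integer format and must come back unchanged. A minimal sketch of such a check that rejects floats before they reach `pack` is shown below. `fits_format` is a hypothetical helper, not the code in `_base.py`, and it is written for Python 3; under Python 2 the type check would also need to accept `long`.

```python
import struct


def fits_format(value, fmt):
    """Hypothetical helper: True if `value` is an int that survives a
    pack/unpack round trip with the given struct format ("i" or "q")."""
    if not isinstance(value, int):
        # Reject floats (and other types) before they reach struct.pack.
        return False
    try:
        return struct.unpack(fmt, struct.pack(fmt, value))[0] == value
    except struct.error:
        # struct.pack raises when the value is out of range for the format.
        return False


print(fits_format(1, "i"), fits_format(2 ** 31, "i"), fits_format(1.0, "i"))
```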
I recently switched from running ZODB in-process to a client-server solution. Since the switch, I keep getting segfaults in the client process that look like this:
#0 _BTree_get(self=0x7f298..., keyarg=0x7f2988... has_key=0) at BTrees/BTreeTemplate.c:268
child = <optimized out>
key = 0x7f29....
result = 0x0
copied = <optimized out>
#1 0x00000000000000... in PyEval_EvalFrameEx ()
.....
The client is using gevent and accesses the database via zlibstorage/clientstorage. I'm not sure whether the gevent stuff is relevant in terms of thread safety and whether the crash may be related to that, but looking at several of these stack traces, they all seem to happen in this _BTree_get function.
If I somehow can provide more helpful debug output, please let me know.
Edit:
Doing some more experiments, I caught another segfault, this time in rangeSearch:
Program received signal SIGSEGV, Segmentation fault.
BTree_rangeSearch (self=0x7fffe1f74c50, args=<optimized out>, kw=<optimized out>, type=<optimized out>) at src/BTrees/BTreeTemplate.c:1595
1595 src/BTrees/BTreeTemplate.c: No such file or directory.
(gdb) info threads
Id Target Id Frame
16 Thread 0x7fffbb7fe700 (LWP 22623) "python" 0x00007ffff7bca0c9 in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=0, futex=0x7fffac000bc0) at sem_waitcommon.c:42
15 Thread 0x7fffbbfff700 (LWP 22622) "python" 0x00007ffff7bca0c9 in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=0, futex=0x7fffb4000e00) at sem_waitcommon.c:42
14 Thread 0x7fffd8ff9700 (LWP 22621) "python" 0x00007ffff7bca0c9 in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=0, futex=0x7fffc0000e30) at sem_waitcommon.c:42
13 Thread 0x7fffd97fa700 (LWP 22620) "python" 0x00007ffff7bca0c9 in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=0, futex=0x7fffbc000bf0) at sem_waitcommon.c:42
12 Thread 0x7fffd9ffb700 (LWP 22619) "python" 0x00007ffff7bca0c9 in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=0, futex=0x7fffc8000e00) at sem_waitcommon.c:42
11 Thread 0x7fffda7fc700 (LWP 22618) "python" 0x00007ffff7bca0c9 in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=0, futex=0x7fffc4001090) at sem_waitcommon.c:42
10 Thread 0x7fffdaffd700 (LWP 22617) "python" 0x00007ffff7bca0c9 in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=0, futex=0x7fffd0000bc0) at sem_waitcommon.c:42
9 Thread 0x7fffdb7fe700 (LWP 22616) "python" 0x00007ffff7bca0c9 in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=0, futex=0x7fffcc000e10) at sem_waitcommon.c:42
8 Thread 0x7fffdbfff700 (LWP 22615) "python" 0x00007ffff7bca0c9 in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=0, futex=0x7fffd4000bf0) at sem_waitcommon.c:42
7 Thread 0x7fffe0ea6700 (LWP 22614) "python" 0x00007ffff7bca0c9 in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=0, futex=0x7fffdc000cd0) at sem_waitcommon.c:42
6 Thread 0x7fffee64a700 (LWP 22607) "python" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
5 Thread 0x7fffeee4b700 (LWP 22606) "python" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
4 Thread 0x7fffef64c700 (LWP 22605) "python" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
3 Thread 0x7ffff41fb700 (LWP 22604) "python" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
* 1 Thread 0x7ffff7fca700 (LWP 22596) "python" BTree_rangeSearch (self=0x7fffe1f74c50, args=<optimized out>, kw=<optimized out>, type=<optimized out>) at src/BTrees/BTreeTemplate.c:1595
So I'm as clueless as before about how to fix it, but I guess the problem is gevent and ClientStorage not playing well together rather than BTrees, so I'll close the issue here.
This grows out of the discussion in zopefoundation/persistent#32. The user had been using the C implementation of BTrees and had many saved pickles. Upon updating dependencies, the user discovered he was using the Python implementation. This was slow, so the user switched back to the C implementation.
This leads to two related issues. The first, most serious, issue is that a BTree pickled by the Python implementation will only ever unpickle as the Python implementation, though a BTree pickled by the C implementation will unpickle as whichever implementation is available (but see below). Because of the tree-like structure of BTree objects, this can lead to AttributeErrors when the parent object is (re)pickled as the Python implementation and still has children that are pickled as the C implementation.
The second issue I discovered when testing the first issue. An empty C BTree unpickled with the Python implementation isn't initialized correctly, leading to its own AttributeErrors.
In environments like my own organization, which is slowly rolling out PyPy, it will be common to have developers and production/test environments sharing pickles where everyone is on some combination of PyPy and CPython. Maybe even different production (micro)services will be on PyPy and CPython at the same time. So the first AttributeError could be a real problem.
It seems to me like the two implementations should produce pickles that always result in loading the "best" available implementation, i.e., the Python implementation should not specify the Py suffix in pickle names. I think that's the only way to avoid the first issue. There is an argument to be made that someone might explicitly want to pickle the Python implementation, though I'm not sure why. I'm not sure how to fix this without changing the pickling format in a backwards-incompatible way, but I'm not a pickle expert. I can try to look into that a bit.
I suspect the second issue is easier to fix.
If there's consensus that something needs to be done on this I can do the work to submit a PR.
When installing BTrees, the index-url of all dependencies seems to be "forced" to pypi.python.org regardless of the index-url setting in requirements.txt or in ~/.pip.ini
This creates serious problems in environments where Artifactory is used as a "proxy" to pypi.python.org and where direct connectivity to pypi.python.org is not permitted by policy. This causes the install to hang, and then finally fail with an error saying it failed to retrieve a pypi.python.org URL.
This is easy to reproduce if you have an Artifactory server or some other non-pypi.python.org PyPI repository URL configured in your ~/.pip.ini like this:
--- snip ---
[global]
index-url = http://my.local.repo/artifactory/api/pypi/pypi-repos/simple
--- snip ---
When I do pip install BTrees, the first HTTP/HTTPS request is made to grab the BTrees package from my local Artifactory server, as expected. After that, the 'persistent' package is fetched, since it is a dependency of BTrees. But it is not fetched from the Artifactory server as one would expect; instead, pip always goes directly to pypi.python.org to try to fetch the package.
The workaround is to manually install all of the dependencies before installing BTrees, so that it doesn't need to fetch any dependencies.
I would like to submit a pull request to fix the issue, but I'm not terribly familiar with setuptools or pip. Maybe this is an easy fix for someone who knows the tools better?
Thanks.
Just like with persistent. The fix is probably the same.
$ wget https://files.pythonhosted.org/packages/10/71/3f6de6470513f11311bc91aeae374c0cc539cf237656853c8bef08001dd9/BTrees-4.5.1-cp27-cp27m-macosx_10_6_intel.whl
...
2018-10-22 16:33:02 (2.79 MB/s) - ‘BTrees-4.5.1-cp27-cp27m-macosx_10_6_intel.whl’ saved [838249/838249]
$ unzip -l BTrees-4.5.1-cp27-cp27m-macosx_10_6_intel.whl
...
2703 08-10-2018 11:05 BTrees/tests/test_utils.py
31 08-10-2018 11:05 terryfy/__init__.py
811 08-10-2018 11:05 terryfy/bdist_wheel.py
720 08-10-2018 11:05 terryfy/cp_suff_real_libs.py
1099 08-10-2018 11:05 terryfy/fuse_suff_real_libs.py
1708 08-10-2018 11:05 terryfy/monkeyexec.py
1018 08-10-2018 11:05 terryfy/repath_lib_names.py
2148 08-10-2018 11:05 terryfy/test_travisparse.py
1763 08-10-2018 11:05 terryfy/travisparse.py
9118 08-10-2018 11:05 terryfy/wafutils.py
Travis currently only has 3.7.0a4+, but that should be good enough to test with for this package, I am guessing.
I think it has something to do with the new (18.0) release of pip, but I'm still investigating.
Historically my company has used uuid4 to identify its data. So right now we use uuid strings as OOBTree keys, but the BTrees documentation advises that we can also use data structures specialized to integers, which are faster and use less memory. I thought about using the uuids as integers instead of strings, so we could benefit from objects like LOBTree.
It seems that 128-bit integers are not supported by BTrees. Some people have solutions to convert a uuid to a 64-bit int, but I would prefer a cleaner way.
Is 128-bit int support being considered for BTrees?
import uuid, BTrees.LOBTree
BTrees.LOBTree.LOBTree()[uuid.uuid4().int]
# ValueError: long integer out of range
Thank you!
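The size mismatch can be checked with the standard library alone. The sketch below assumes that LOBTree keys are signed 64-bit integers (the `q` struct format), and shows one lossless way to split a UUID into two such integers. `fits_signed_64`, `uuid_to_pair` and `pair_to_uuid` are hypothetical helpers, not part of BTrees.

```python
import struct
import uuid


def fits_signed_64(value):
    """Return True if `value` fits in a signed 64-bit integer."""
    try:
        struct.pack("q", value)
    except struct.error:
        return False
    return True


def uuid_to_pair(u):
    """Hypothetical helper: split a UUID into two signed 64-bit ints (lossless)."""
    return struct.unpack(">qq", u.bytes)


def pair_to_uuid(pair):
    """Inverse of uuid_to_pair: rebuild the UUID from its two halves."""
    return uuid.UUID(bytes=struct.pack(">qq", *pair))


u = uuid.uuid4()
print(fits_signed_64(u.int))      # the full 128-bit value does not fit
high, low = uuid_to_pair(u)
print(fits_signed_64(high), fits_signed_64(low))
print(pair_to_uuid((high, low)) == u)
```

How such a pair would then be used as a key is left open here; one possibility would be an outer integer-keyed tree on the first half whose values are inner trees keyed on the second half. Note that the signed halves do not preserve the numeric ordering of `u.int`.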