patx / pickledb
pickleDB is an open source key-value store using Python's json module.
Home Page: https://patx.github.io/pickledb
License: BSD 3-Clause "New" or "Revised" License
The documentation states:
'''Remove a list and all of its values'''
but the function actually returns the number of entries in the list, not its values.
import pickledb
db = pickledb.load('example.db', False)
db.set('key', 'value')
db.dump()
db = pickledb.load('example.db', False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Documents\Python Projects\garbage-chatter\venv\lib\site-packages\pickledb.py", line 39, in load
    return pickledb(location, auto_dump)
  File "C:\Users\Documents\Python Projects\garbage-chatter\venv\lib\site-packages\pickledb.py", line 49, in __init__
    self.load(location, auto_dump)
  File "C:\Users\Documents\Python Projects\garbage-chatter\venv\lib\site-packages\pickledb.py", line 79, in load
    self.loaddb()
  File "C:\Users\Documents\Python Projects\garbage-chatter\venv\lib\site-packages\pickledb.py", line 96, in loaddb
    self.db = json.load(open(self.loco, 'rb'))
  File "C:\Users\AppData\Local\Programs\Python\Python35\Lib\json\__init__.py", line 268, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "C:\Users\AppData\Local\Programs\Python\Python35\Lib\json\__init__.py", line 312, in loads
    s.__class__.__name__))
TypeError: the JSON object must be str, not 'bytes'
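The traceback above comes from json.load() receiving a file opened in binary mode, which Python 3.5's json module rejects (binary file objects are only accepted from Python 3.6 on). A minimal workaround sketch, outside pickledb itself (the filename and sample content are assumptions for illustration), is to open the db file in text mode before parsing:

```python
import json

# Create a sample db file so the sketch is runnable (filename assumed).
with open('example.db', 'w', encoding='utf-8') as f:
    json.dump({'key': 'value'}, f)

# Workaround: open the file in text mode so json.load receives str,
# not bytes (Python 3.5's json.load rejects binary file objects).
with open('example.db', 'r', encoding='utf-8') as f:
    data = json.load(f)

print(data)  # -> {'key': 'value'}
```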
pickleDB is a lightweight and simple key-value store. It is built upon Python's simplejson module and was inspired by redis. It is licensed under the BSD three-clause license.
When a termination signal (e.g. SIGTERM, or SIGINT from hitting Ctrl-C in a terminal) is received during dump() execution, the process exits immediately, which corrupts the pickledb database file. This can be very annoying, as it might mean the loss of critical data.
Upon execution of load() after a restart, I get this error:
Traceback (most recent call last):
  File "test.py", line 22, in <module>
    main()
  File "test.py", line 14, in main
    db = get_pickledb()
  File "test.py", line 9, in get_pickledb
    db = pickledb.load(DB_FILEPATH, False)
  File "/home/don/tmp/pickledb_robust/pickledb/pickledb.py", line 34, in load
    return pickledb(location, option)
  File "/home/don/tmp/pickledb_robust/pickledb/pickledb.py", line 41, in __init__
    self.load(location, option)
  File "/home/don/tmp/pickledb_robust/pickledb/pickledb.py", line 49, in load
    self._loaddb()
  File "/home/don/tmp/pickledb_robust/pickledb/pickledb.py", line 186, in _loaddb
    self.db = json.load(open(self.loco, 'rb'))
  File "/usr/lib/python2.7/json/__init__.py", line 290, in load
    **kw)
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
My pull request will fix this: #17
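One common way to make a dump crash-safe is to write to a temporary file and atomically replace the real one, so an interrupting signal leaves the previous file intact. This is only a sketch of that idea (the helper name is hypothetical, and this is not necessarily how the linked pull request fixes it):

```python
import json
import os
import tempfile

def atomic_dump(db, location):
    """Write db as JSON to a temp file in the same directory, then
    atomically swap it into place. A crash or signal mid-write leaves
    the old database file untouched."""
    dirname = os.path.dirname(os.path.abspath(location))
    fd, tmp_path = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, 'w') as f:
            json.dump(db, f)
        os.replace(tmp_path, location)  # atomic on POSIX and Windows
    except BaseException:
        os.remove(tmp_path)  # clean up the partial temp file
        raise

atomic_dump({'key': 'value'}, 'example.db')
```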
Hi,
I would have some ideas for improvements.
First, it should be called jsondb. You use json instead of pickle, so the name is very misleading. The line "import json as pickle" is terrible.
I would put everything in a class. Then you could have an interface like this:
jdb = JsonDb('test.db') # __init__ would do the job of your load() function
jdb.set('key', 'value')
jdb.get('key')
Using global variables is discouraged, but if you group everything in a class, you can access those variables through 'self', e.g. self.db, etc.
Also, you do lots of I/O operations. The data should be kept in memory, and the user can call dump himself (jdb.dump()), which then writes everything to a file.
Add comments to every function. At the end of the source, add this:
if __name__ == "__main__":
    # write an example for every function
    # then the users will understand immediately how to use it
Laszlo
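The suggested interface above could be sketched like this. This is only an illustration of the proposal, not pickledb's actual code: the class name JsonDb and the method bodies are assumptions.

```python
import json
import os

class JsonDb:
    """Sketch of the proposed class-based interface."""

    def __init__(self, location):
        # __init__ does the job of the module-level load() function.
        self.location = location
        if os.path.exists(location):
            with open(location, 'r') as f:
                self.db = json.load(f)
        else:
            self.db = {}

    def set(self, key, value):
        # Kept in memory; no I/O until dump() is called.
        self.db[key] = value

    def get(self, key):
        return self.db.get(key)

    def dump(self):
        # All file I/O happens here, on the user's request.
        with open(self.location, 'w') as f:
            json.dump(self.db, f)

jdb = JsonDb('test.db')
jdb.set('key', 'value')
print(jdb.get('key'))  # -> value
jdb.dump()
```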
Delete a particular key from memory after a given time in seconds, just like the redis expire option.
Ex:
db.set('key', 'value', expire=1)
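pickledb has no such option; a minimal sketch of how the requested behaviour could work (the class and its API are hypothetical) is to store a deadline next to each value and treat stale keys as absent on read:

```python
import time

class ExpiringStore:
    """Illustrative in-memory store where set() can take an expiry
    in seconds and get() lazily drops expired keys."""

    def __init__(self):
        self.db = {}

    def set(self, key, value, expire=None):
        deadline = time.monotonic() + expire if expire else None
        self.db[key] = (value, deadline)

    def get(self, key):
        item = self.db.get(key)
        if item is None:
            return None
        value, deadline = item
        if deadline is not None and time.monotonic() >= deadline:
            del self.db[key]  # expired: delete lazily on access
            return None
        return value

store = ExpiringStore()
store.set('key', 'value', expire=1)
print(store.get('key'))  # -> value (read back within the 1 s window)
```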
As diagnosed by the maintainers here:
#37
this shows the user supplying an int while the framework changes the type to string.
JSON supports integer types as values, so talking about JSON "values" confuses this discussion: in a key/value pair database, it is the key type that is being changed, not the value.
Can someone fix it?
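The underlying cause is easy to demonstrate with the json module directly: JSON object keys must be strings, so a round trip silently coerces int keys to str.

```python
import json

# An int key goes in, a str key comes back out after a dump/load cycle,
# which is why exists() no longer finds the original int key.
restored = json.loads(json.dumps({123456789: 'blah'}))
print(restored)  # -> {'123456789': 'blah'}
```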
setup.py
mentions that this project is BSD 3-clause licensed but it is missing a license file. Please add one to make it clearer how it is licensed. This is very helpful when packaging this for conda-forge.
First, I think pickleDB is a very good tool,
and I want to know: why use the json module?
The version on PyPI still has the problem that it can't save the database file in projects written in Python 3. I think it needs updating.
BSD three-caluse license
should be BSD three-clause license
:)
Because of change 89897b4, the way pickledb's get
was used changed completely.
If the change is to be kept, it should at least warrant a minor version bump, not just a patch.
Test script:
# main.py
import os
import pickledb

DBCHANGES = False
base_dir = '/tmp'
filename = 'example.db'

db = pickledb.load(os.path.join(base_dir, filename), False)

key = 123456789
value = 'blah'

if not db.exists(key):
    db.set(key, value)
    DBCHANGES = True

if db.exists(key):
    print(db.get(key))

if DBCHANGES:
    db.dump()
    print('dumped db')
results in:
~$ python main.py
blah
dumped db
~$ python main.py
blah
dumped db
~$ cat /tmp/example.db
{"123456789": "blah", "123456789": "blah"}%
And we now have duplication in the database due to the failure of the exists method.
You should add serializer and unserializer members to pickledb,
so instead of

def _loaddb(self):
    '''Load or reload the json info from the file'''
    self.db = json.load(open(self.loco, 'rb'))

def _dumpdb(self, forced):
    '''Dump (write, save) the json dump into the file'''
    if forced:
        json.dump(self.db, open(self.loco, 'wb'))
it will be
def _loaddb(self):
    '''Load or reload the json info from the file'''
    self.db = self.unserializer(open(self.loco, 'rb'))

def _dumpdb(self, forced):
    '''Dump (write, save) the json dump into the file'''
    if forced:
        self.serializer(self.db, open(self.loco, 'wb'))
default values will be:
unserializer=json.load
serializer=json.dump
but it will be possible to set
unserializer=simplejson.load
serializer=simplejson.dump
or
unserializer=jsonpickle.decode
serializer=jsonpickle.encode
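The idea can be sketched as constructor parameters with json as the default. This is only an illustration of the suggestion (the class name and file handling are assumptions, not pickledb's actual code); note that jsonpickle's encode/decode work on strings rather than file objects, so they would need thin wrappers.

```python
import json

class PickleDbSketch:
    """Minimal sketch: the (de)serializer pair is injectable,
    defaulting to json.dump/json.load."""

    def __init__(self, location, serializer=json.dump, unserializer=json.load):
        self.loco = location
        self.serializer = serializer
        self.unserializer = unserializer
        self.db = {}

    def _loaddb(self):
        '''Load or reload the data from the file via the injected unserializer.'''
        with open(self.loco, 'r') as f:
            self.db = self.unserializer(f)

    def _dumpdb(self):
        '''Write the data to the file via the injected serializer.'''
        with open(self.loco, 'w') as f:
            self.serializer(self.db, f)
```

A user could then pass simplejson.dump/simplejson.load (same signatures) without touching the class.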
I noticed that the only dependency that is not Python 2.3 compatible is json. By importing simplejson as json, it works with python 2.3. I would like to suggest this as an option when loading the library.
I also noticed that the code hasn't been touched in two years, but for anyone searching for this: Just use simplejson.
Hi,
would you consider adding a command that can list all the names of databases?
Thanks
Usage of the lrem function in the test cases fails because it was renamed to lremlist.
I see that many (all?) examples use
pickledb.load('pickle.db', False)
What would happen if this was set to True?
The signal handler is wrong: it should take two arguments, but it takes zero. If the script is killed, this exception appears:
TypeError: sigterm_handler() takes 0 positional arguments but 2 were given
Tested with Python 3.6.10.
Here is a simple script which reveals the error. It simply kills itself:
import pickledb
import os
db = pickledb.load('test.json', False)
os.system('kill $PPID')
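For reference, Python's signal module always calls a handler with two positional arguments, the signal number and the current stack frame, so the handler signature needs to look like this (the body is an illustrative assumption):

```python
import signal
import sys

def sigterm_handler(signum, frame):
    # A handler registered with signal.signal() always receives
    # (signal number, current stack frame). Dump/flush the db here,
    # then exit.
    sys.exit(0)

signal.signal(signal.SIGTERM, sigterm_handler)
```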
In the latest versions, a bad decision was made to replace None with False. It is now impossible to distinguish between False and an unset entry, because neither KeyError nor None is returned, unlike all other standard Python interfaces (e.g. dict).
The only good way to solve this would be to mimic Python's dict and introduce a full-featured get() method with a user-configurable default argument, used when there is no entry in the db, with the default default set to None.
Please, can you revert it back and use None in every case where there is no single item to return, or at least introduce an optional default argument, so it is possible for the user to overcome this bad decision without maintaining their own downstream version.
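The requested dict-style semantics can be sketched over the plain dict that backs the store (the helper and sentinel names are hypothetical):

```python
_MISSING = object()  # sentinel so a stored None/False is never confused with "absent"

def get_with_default(db, key, default=None):
    """dict-style get(): return `default` (None by default) when the
    key is absent, so a stored False stays distinguishable from a
    missing entry."""
    value = db.get(key, _MISSING)
    return default if value is _MISSING else value

db = {'flag': False}
print(get_with_default(db, 'flag'))     # -> False (the stored value)
print(get_with_default(db, 'missing'))  # -> None (no such entry)
```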
This project should really be called "jsondb". I understand it's called "pickledb" for historical reasons, but that's like calling a business that sells cars "bike store" because it used to sell bikes in the past.
I was looking for a simple JSON database and scrolled past pickledb because it had "pickle" in its name. I found out it doesn't use pickle as storage format much later. I could've easily not figured that out at all; it was pure luck.
Is it possible that this project is no longer maintained?
There have been open pull requests since 2013.
Originally posted by @Benjamin-Dewey in #18 (comment)
Please can we get this fork merged back to master?
I've run into this issue when using PickleDB with GUnicorn and Flask because the workers are run as child threads from the worker, so of course, PickleDB can't install its handler.
Hi, thank you for reading this.
What do you think about the idea of switching from an ordinary dictionary to an OrderedDict? I find that most of the time in my projects I need data to be sorted by keys, so I have to do it manually. I thought maybe I'm not the only one, and we could all benefit from making it a built-in feature? :) OrderedDict is located in the "collections" std-lib module.
Trying to use it on a MacBook. Here's the code:
import pickledb
db = pickledb.load('./test.db', False)
db.set('key', 'value')
db.get('key')
db.dump('test.db')
I got these error messages:
Traceback (most recent call last):
  File "zero1.py", line 15, in <module>
    db = pickledb.load('./test.db', False)
  File "/Users/macbookpro/anaconda2/lib/python2.7/site-packages/pickledb.py", line 34, in load
    return pickledb(location, option)
  File "/Users/macbookpro/anaconda2/lib/python2.7/site-packages/pickledb.py", line 41, in __init__
    self.load(location, option)
  File "/Users/macbookpro/anaconda2/lib/python2.7/site-packages/pickledb.py", line 49, in load
    self._loaddb()
  File "/Users/macbookpro/anaconda2/lib/python2.7/site-packages/pickledb.py", line 196, in _loaddb
    self.db = simplejson.load(open(self.loco, 'rb'))
  File "/Users/macbookpro/anaconda2/lib/python2.7/site-packages/simplejson/__init__.py", line 459, in load
    use_decimal=use_decimal, **kw)
  File "/Users/macbookpro/anaconda2/lib/python2.7/site-packages/simplejson/__init__.py", line 516, in loads
    return _default_decoder.decode(s)
  File "/Users/macbookpro/anaconda2/lib/python2.7/site-packages/simplejson/decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "/Users/macbookpro/anaconda2/lib/python2.7/site-packages/simplejson/decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Add a has_key function if it's not there.
I am thinking of writing a web API wrapper around pickledb so that it can be accessed over HTTP among multiple servers.
Example:
http://someurl:8080/set/a
POST: value
http://someurl:8080/get/a
returns its value in JSON format.
So I wanted to ask the community whether it's worth the effort or not.
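A stdlib-only sketch of what such a wrapper could look like (the routes, port, and in-memory dict stand in for pickledb; everything here is an assumption about the proposal, not an existing API):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

store = {}  # stand-in for a pickledb instance

class StoreHandler(BaseHTTPRequestHandler):
    """GET /get/<key> returns the value as JSON;
    POST /set/<key> stores the request body as the value."""

    def do_GET(self):
        if self.path.startswith('/get/'):
            key = self.path[len('/get/'):]
            body = json.dumps({'key': key, 'value': store.get(key)}).encode()
            self.send_response(200)
            self.send_header('Content-Type', 'application/json')
            self.end_headers()
            self.wfile.write(body)

    def do_POST(self):
        if self.path.startswith('/set/'):
            key = self.path[len('/set/'):]
            length = int(self.headers.get('Content-Length', 0))
            store[key] = self.rfile.read(length).decode()
            self.send_response(200)
            self.end_headers()

    def log_message(self, *args):
        pass  # keep the sketch quiet

def serve(port=8080):
    HTTPServer(('', port), StoreHandler).serve_forever()
```

A real deployment would also need locking around the store and periodic dump() calls, which is part of why the effort question is fair.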
I'm following the example from readme:
>>> import pickledb
>>> db = pickledb.load('test.db', False)
>>> db.set('key', 'value')
>>> db.get('key')
'value'
>>> db.dump()
However, on dump(), I'm getting:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/piotrek/.local/lib/python3.6/site-packages/pickledb.py", line 56, in dump
    self._dumpdb(True)
  File "/home/piotrek/.local/lib/python3.6/site-packages/pickledb.py", line 201, in _dumpdb
    simplejson.dump(self.db, open(self.loco, 'wb'))
  File "/home/piotrek/.local/lib/python3.6/site-packages/simplejson/__init__.py", line 279, in dump
    fp.write(chunk)
TypeError: a bytes-like object is required, not 'str'
The version of Python is 3.6.3.
To anyone who might have stumbled upon this package: as far as my understanding goes, pickleDB won't work for large dictionaries, since it keeps the entire dictionary in RAM (and loads the entire db file at once using simplejson). So, if you are looking to work with very large disk-based dictionaries, you might be better off using an SQLite database and writing a dictionary-like wrapper on top.
Also, the shelve module seems to work nicely for large dictionaries. However, it has the downside that it uses pickle internally, and thus the database files are not directly readable.
Please correct me if I am wrong. Thanks!
Or is it normal?
In the code, there is
key_string_error = TypeError('Key/name must be a string!')
which creates the exception once, in a place where it is not raised.
Later, the code uses
raise self.key_string_error
which appends a new traceback to the existing one in this exception object, so the traceback grows without bound.
This makes the code impossible to debug later, because the traceback can become very long. It also introduces a memory leak, because key_string_error is a class variable and is never freed.
Raise a new instance of TypeError('Key/name must be a string!') every time, or create a new exception type with a predefined message for this.
import pickledb
import traceback

db = pickledb.load('example.db', False)
for i in range(100):
    try:
        db.set({}, "")
    except TypeError:
        print("***********************************")
        traceback.print_exc()
json.dumps can't serialize user-defined instances. If you allow us to pass configuration through to json.dumps, we can solve the problem.

def my_instance_json_encoder(x):
    return x.__dict__

s = json.dumps(users, default=my_instance_json_encoder)

Better yet:

config = {
    "default": lambda x: x.__dict__,
    "indent": "  ",
}
json.dumps(users, **config)
In my case, this is mainly to pass the indent argument, which makes the db easier to inspect visually and to view diffs.
Is this library thread-safe?
Cheers!
db.lcreate('users') raises
TypeError: a bytes-like object is required, not 'str'
Should we open the file not as binary?
Create a method that returns a count of the number of keys currently in the db.
import pickledb
db = pickledb.load('database.sql', True)
db.set('key', ['one', 'two', 'three'])
db.ladd('key', 'four')
db.lremvalue('key', 'four') # You need to use db.dump() as a workaround
# db.dump()
Hi! I am looking to retrieve all entries whose keys start with an exact specified prefix (partial match).
Is this feature available or upcoming?
Thanks !
Dumping the key-value pairs from the dictionary to disk is done by creating a new thread from the main thread. However, since the process runs on a single core and only one thread can run at a time, this is as good as dumping to disk from the main thread.
If, however, the new thread were replaced by a new process created with fork, it could help while writing to disk: the new (child) process would have the task of dumping the key-value pairs to disk, while the parent process continues to run alongside. This ensures consistency, because even if the parent process crashes due to some error, the child process still writes the key-value pairs to disk.
This consistency issue is not addressed by multi-threading: if the process crashes due to some error, all the threads of the process are killed and the key-value pairs might not be completely written to disk, leaving the file inconsistent.
@patx please let me know your thoughts.
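The fork-based approach described above can be sketched as follows (POSIX only, since os.fork is unavailable on Windows; the helper name and file handling are illustrative assumptions, not pickledb code):

```python
import json
import os

def dump_in_child(db, location):
    """Fork a child that writes the snapshot to disk while the
    parent continues running; a parent crash after the fork no
    longer interrupts the write."""
    pid = os.fork()
    if pid == 0:
        # Child: write the snapshot, then exit without running
        # any parent cleanup handlers.
        with open(location, 'w') as f:
            json.dump(db, f)
        os._exit(0)
    return pid  # parent may os.waitpid(pid, 0) later to reap the child

pid = dump_in_child({'key': 'value'}, 'example.db')
os.waitpid(pid, 0)  # waiting here only to show the file is complete
```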
It will be nice to provide a get method which return a default value when nothing was stored for a given key into database like a defaultdict
http://docs.python.org/2/library/collections.html#collections.defaultdict
db.get('key', 'default_value')
The LRANGE command takes a key plus start and end values as input and returns the list of values in that range.
list = [2,3,4,5]
LRANGE 0 2 => [2, 3]
LRANGE 0 -1 => [2,3,4]
It is similar to LRANGE in redis
Shall I raise a pull request for this command implementation?
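The examples above follow Python slice semantics (end index exclusive, negatives allowed), which differ slightly from Redis's LRANGE, where the end index is inclusive. A sketch under that reading (the function name and dict-backed store are assumptions):

```python
def lrange(db, name, start, end):
    """Return the slice [start:end] of the list stored under `name`,
    using Python slice semantics as in the examples above."""
    return db[name][start:end]

db = {'nums': [2, 3, 4, 5]}
print(lrange(db, 'nums', 0, 2))   # -> [2, 3]
print(lrange(db, 'nums', 0, -1))  # -> [2, 3, 4]
```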
Hello, first of all thanks for maintaining and developing this project, it really fits my needs.
Is pickledb thread-safe? Can it be used in bigger projects as, for example, user database?
When you do db.lgetall("key"), you actually get the same list object that is stored in the db. If you then call list.pop(), you modify the db itself, outside of its API. This is a bad abstraction leak, even if there is some performance reason for it. It would be much better to return a copy().
Hello,
The latest release of pickledb, 0.6.2,
does not support Python 3. It is already fixed on the master branch; however, it hasn't been released. Are you planning to release a new version with Python 3 support?
There seems to be an issue with the example provided by pickledb itself. When calling the dump function with Python 3, the code produces a TypeError.
Here's what was inputted in a test file:
import pickledb
db = pickledb.load("test.db", False)
db.set("key", "value")
print(db.get("key"))
db.dump()
Traceback:
Traceback (most recent call last):
  File "tests.py", line 6, in <module>
    db.dump()
  File "/Users/shreydesai/GitHub/pickledb/pickledb.py", line 56, in dump
    self._dumpdb(True)
  File "/Users/shreydesai/GitHub/pickledb/pickledb.py", line 201, in _dumpdb
    simplejson.dump(self.db, open(self.loco, 'wb'))
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/simplejson/__init__.py", line 277, in dump
    fp.write(chunk)
TypeError: 'str' does not support the buffer interface
Being able to append to the pickledb.
Every time I call db.dump(), it overwrites the existing file.
It would be awesome to have optional encryption and decryption function supplied by the user so that they can encrypt the data before writing to file or decrypt using the decryptor function before reading from given file.
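The suggested hooks could be sketched as user-supplied bytes-to-bytes callables applied around serialization. Everything here is illustrative: the function names are hypothetical, and base64 below is only a stand-in transform to show the plumbing, not real encryption.

```python
import base64
import json

def dump_transformed(db, location, encrypt):
    """Serialize db to JSON, pass the bytes through the user-supplied
    `encrypt` callable, and write the result to disk."""
    with open(location, 'wb') as f:
        f.write(encrypt(json.dumps(db).encode()))

def load_transformed(location, decrypt):
    """Read the file, pass the bytes through the user-supplied
    `decrypt` callable, and parse the resulting JSON."""
    with open(location, 'rb') as f:
        return json.loads(decrypt(f.read()).decode())

dump_transformed({'key': 'value'}, 'secret.db', base64.b64encode)
print(load_transformed('secret.db', base64.b64decode))  # -> {'key': 'value'}
```

A user wanting real encryption would plug in, e.g., functions backed by a vetted cryptography library in place of the base64 pair.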
Hi!
simplejson.dump(self.db, open(self.loco, 'wb'))
gives us TypeError: 'str' does not support the buffer interface
because in Python 3 bare literal strings are unicode (I read about it here).
I just changed the file-opening mode from wb to wt for compatibility.
Please take a look:
https://github.com/approximatenumber/pickledb/tree/my_branch