patx / pickledb
pickleDB is an open source key-value store using Python's json module.
Home Page: https://patx.github.io/pickledb
License: BSD 3-Clause "New" or "Revised" License
The documentation states:
'''Remove a list and all of its values'''
but the function actually returns the number of entries in the list, not its values.
import pickledb
db = pickledb.load('example.db', False)
db.set('key', 'value')
db.dump()
db = pickledb.load('example.db', False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Documents\Python Projects\garbage-chatter\venv\lib\site-packages\pickledb.py", line 39, in load
    return pickledb(location, auto_dump)
  File "C:\Users\Documents\Python Projects\garbage-chatter\venv\lib\site-packages\pickledb.py", line 49, in __init__
    self.load(location, auto_dump)
  File "C:\Users\Documents\Python Projects\garbage-chatter\venv\lib\site-packages\pickledb.py", line 79, in load
    self.loaddb()
  File "C:\Users\Documents\Python Projects\garbage-chatter\venv\lib\site-packages\pickledb.py", line 96, in loaddb
    self.db = json.load(open(self.loco, 'rb'))
  File "C:\Users\AppData\Local\Programs\Python\Python35\Lib\json\__init__.py", line 268, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "C:\Users\AppData\Local\Programs\Python\Python35\Lib\json\__init__.py", line 312, in loads
    s.__class__.__name__))
TypeError: the JSON object must be str, not 'bytes'
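The traceback above comes from json.load() receiving a file opened in binary mode, which Python 3.5's json module rejects (binary file objects are only accepted from Python 3.6 on). A minimal workaround sketch, outside pickledb itself (the filename and sample content are assumptions for illustration), is to open the db file in text mode before parsing:

```python
import json

# Create a sample db file so the sketch is runnable (filename assumed).
with open('example.db', 'w', encoding='utf-8') as f:
    json.dump({'key': 'value'}, f)

# Workaround: open the file in text mode so json.load receives str,
# not bytes (Python 3.5's json.load rejects binary file objects).
with open('example.db', 'r', encoding='utf-8') as f:
    data = json.load(f)

print(data)  # -> {'key': 'value'}
```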
pickleDB is a lightweight and simple key-value store. It is built upon Python's simplejson module and was inspired by redis. It is licensed under the BSD three-clause license.
When a termination signal (e.g. SIGTERM, or SIGINT from hitting Ctrl-C in a terminal) is received during dump() execution, the process exits immediately, which corrupts the pickledb database file. This can be very annoying, as it might mean the loss of critical data.
Upon execution of load() after a restart, I get this error:
Traceback (most recent call last):
  File "test.py", line 22, in <module>
    main()
  File "test.py", line 14, in main
    db = get_pickledb()
  File "test.py", line 9, in get_pickledb
    db = pickledb.load(DB_FILEPATH, False)
  File "/home/don/tmp/pickledb_robust/pickledb/pickledb.py", line 34, in load
    return pickledb(location, option)
  File "/home/don/tmp/pickledb_robust/pickledb/pickledb.py", line 41, in __init__
    self.load(location, option)
  File "/home/don/tmp/pickledb_robust/pickledb/pickledb.py", line 49, in load
    self._loaddb()
  File "/home/don/tmp/pickledb_robust/pickledb/pickledb.py", line 186, in _loaddb
    self.db = json.load(open(self.loco, 'rb'))
  File "/usr/lib/python2.7/json/__init__.py", line 290, in load
    **kw)
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
My pull request will fix this: #17
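One common way to make a dump crash-safe is to write to a temporary file and atomically replace the real one, so an interrupting signal leaves the previous file intact. This is only a sketch of that idea (the helper name is hypothetical, and this is not necessarily how the linked pull request fixes it):

```python
import json
import os
import tempfile

def atomic_dump(db, location):
    """Write db as JSON to a temp file in the same directory, then
    atomically swap it into place. A crash or signal mid-write leaves
    the old database file untouched."""
    dirname = os.path.dirname(os.path.abspath(location))
    fd, tmp_path = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, 'w') as f:
            json.dump(db, f)
        os.replace(tmp_path, location)  # atomic on POSIX and Windows
    except BaseException:
        os.remove(tmp_path)  # clean up the partial temp file
        raise

atomic_dump({'key': 'value'}, 'example.db')
```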
Hi,
I would have some ideas for improvements.
First, it should be called jsondb. You use json instead of pickle, so the name is very misleading. The line "import json as pickle" is terrible.
I would put everything in a class. Then you could have an interface like this:
jdb = JsonDb('test.db') # __init__ would do the job of your load() function
jdb.set('key', 'value')
jdb.get('key')
Using global variables is discouraged, but if you group everything in a class, you can access those variables through 'self', e.g. self.db, etc.
Also, you do lots of I/O operations. The data should be kept in memory, and the user can call dump himself (jdb.dump()), which then writes everything to a file.
Add comments to every function. At the end of the source, add this:
if __name__ == "__main__":
    # write an example for every function
    # then the users will understand immediately how to use it
Laszlo
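The suggested interface above could be sketched like this. This is only an illustration of the proposal, not pickledb's actual code: the class name JsonDb and the method bodies are assumptions.

```python
import json
import os

class JsonDb:
    """Sketch of the proposed class-based interface."""

    def __init__(self, location):
        # __init__ does the job of the module-level load() function.
        self.location = location
        if os.path.exists(location):
            with open(location, 'r') as f:
                self.db = json.load(f)
        else:
            self.db = {}

    def set(self, key, value):
        # Kept in memory; no I/O until dump() is called.
        self.db[key] = value

    def get(self, key):
        return self.db.get(key)

    def dump(self):
        # All file I/O happens here, on the user's request.
        with open(self.location, 'w') as f:
            json.dump(self.db, f)

jdb = JsonDb('test.db')
jdb.set('key', 'value')
print(jdb.get('key'))  # -> value
jdb.dump()
```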
Delete a particular key from memory after a given time in seconds, just like the redis expire option.
Ex:
db.set('key', 'value', expire=1)
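pickledb has no such option; a minimal sketch of how the requested behaviour could work (the class and its API are hypothetical) is to store a deadline next to each value and treat stale keys as absent on read:

```python
import time

class ExpiringStore:
    """Illustrative in-memory store where set() can take an expiry
    in seconds and get() lazily drops expired keys."""

    def __init__(self):
        self.db = {}

    def set(self, key, value, expire=None):
        deadline = time.monotonic() + expire if expire else None
        self.db[key] = (value, deadline)

    def get(self, key):
        item = self.db.get(key)
        if item is None:
            return None
        value, deadline = item
        if deadline is not None and time.monotonic() >= deadline:
            del self.db[key]  # expired: delete lazily on access
            return None
        return value

store = ExpiringStore()
store.set('key', 'value', expire=1)
print(store.get('key'))  # -> value (read back within the 1 s window)
```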
As diagnosed by the maintainers here:
#37
this shows the user supplying an int while the framework changes the type to string.
JSON supports integer types as values, so talking about JSON "values" confuses this discussion: in a key/value pair database, it is the key type that is being changed, not the value.
Can someone fix it?
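The underlying cause is easy to demonstrate with the json module directly: JSON object keys must be strings, so a round trip silently coerces int keys to str.

```python
import json

# An int key goes in, a str key comes back out after a dump/load cycle,
# which is why exists() no longer finds the original int key.
restored = json.loads(json.dumps({123456789: 'blah'}))
print(restored)  # -> {'123456789': 'blah'}
```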
setup.py
mentions that this project is BSD 3-clause licensed but it is missing a license file. Please add one to make it clearer how it is licensed. This is very helpful when packaging this for conda-forge.
First, I think pickleDB is a very good tool,
and I want to know: why use the json module?
The version on PyPI still has the problem that it can't save the database file in projects written in Python 3. I think it needs updating.
BSD three-caluse license
should be BSD three-clause license
:)
Because of change 89897b4, the way pickledb's get
was used changed completely.
If the change is to be kept, it should at least warrant a minor version bump, not just a patch.
Test script:
# main.py
import os
import pickledb

DBCHANGES = False
base_dir = '/tmp'
filename = 'example.db'

db = pickledb.load(os.path.join(base_dir, filename), False)

key = 123456789
value = 'blah'

if not db.exists(key):
    db.set(key, value)
    DBCHANGES = True

if db.exists(key):
    print(db.get(key))

if DBCHANGES:
    db.dump()
    print('dumped db')
results in:
~$ python main.py
blah
dumped db
~$ python main.py
blah
dumped db
~$ cat /tmp/example.db
{"123456789": "blah", "123456789": "blah"}%
And we now have duplication in the database due to the failure of the exists method.
You should add serializer and unserializer members to pickledb,
so instead of

def _loaddb(self):
    '''Load or reload the json info from the file'''
    self.db = json.load(open(self.loco, 'rb'))

def _dumpdb(self, forced):
    '''Dump (write, save) the json dump into the file'''
    if forced:
        json.dump(self.db, open(self.loco, 'wb'))
it will be
def _loaddb(self):
    '''Load or reload the json info from the file'''
    self.db = self.unserializer(open(self.loco, 'rb'))

def _dumpdb(self, forced):
    '''Dump (write, save) the json dump into the file'''
    if forced:
        self.serializer(self.db, open(self.loco, 'wb'))
default values will be:
unserializer=json.load
serializer=json.dump
but it will be possible to set
unserializer=simplejson.load
serializer=simplejson.dump
or
unserializer=jsonpickle.decode
serializer=jsonpickle.encode
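The idea can be sketched as constructor parameters with json as the default. This is only an illustration of the suggestion (the class name and file handling are assumptions, not pickledb's actual code); note that jsonpickle's encode/decode work on strings rather than file objects, so they would need thin wrappers.

```python
import json

class PickleDbSketch:
    """Minimal sketch: the (de)serializer pair is injectable,
    defaulting to json.dump/json.load."""

    def __init__(self, location, serializer=json.dump, unserializer=json.load):
        self.loco = location
        self.serializer = serializer
        self.unserializer = unserializer
        self.db = {}

    def _loaddb(self):
        '''Load or reload the data from the file via the injected unserializer.'''
        with open(self.loco, 'r') as f:
            self.db = self.unserializer(f)

    def _dumpdb(self):
        '''Write the data to the file via the injected serializer.'''
        with open(self.loco, 'w') as f:
            self.serializer(self.db, f)
```

A user could then pass simplejson.dump/simplejson.load (same signatures) without touching the class.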
I noticed that the only dependency that is not Python 2.3 compatible is json. By importing simplejson as json, it works with python 2.3. I would like to suggest this as an option when loading the library.
I also noticed that the code hasn't been touched in two years, but for anyone searching for this: Just use simplejson.
Hi,
would you consider adding a command that can list all the names of databases?
Thanks
Usage of the lrem function in the test cases fails because it was renamed to lremlist.
I see that many (all?) examples use
pickledb.load('pickle.db', False)
What would happen if this was set to True?
The signal handler is wrong: it should take two arguments, but it takes zero. If the script is killed, this exception appears:
TypeError: sigterm_handler() takes 0 positional arguments but 2 were given
Tested with Python 3.6.10.
Here is a simple script which reveals the error. It simply kills itself:
import pickledb
import os
db = pickledb.load('test.json', False)
os.system('kill $PPID')
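For reference, Python's signal module always calls a handler with two positional arguments, the signal number and the current stack frame, so the handler signature needs to look like this (the body is an illustrative assumption):

```python
import signal
import sys

def sigterm_handler(signum, frame):
    # A handler registered with signal.signal() always receives
    # (signal number, current stack frame). Dump/flush the db here,
    # then exit.
    sys.exit(0)

signal.signal(signal.SIGTERM, sigterm_handler)
```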
In the latest versions, a bad decision was made to replace None with False. It is now impossible to distinguish between False and an unset entry, because neither KeyError nor None is returned, unlike all other standard Python interfaces (e.g. dict).
The only good way to solve this would be to mimic Python's dict and introduce a full-featured get() method with a user-configurable default argument, used when there is no entry in the db, with the default default set to None.
Please, can you revert it back and use None in every case where there is no single item to return, or at least introduce an optional default argument, so it is possible for the user to overcome this bad decision without maintaining their own downstream version.
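The requested dict-style semantics can be sketched over the plain dict that backs the store (the helper and sentinel names are hypothetical):

```python
_MISSING = object()  # sentinel so a stored None/False is never confused with "absent"

def get_with_default(db, key, default=None):
    """dict-style get(): return `default` (None by default) when the
    key is absent, so a stored False stays distinguishable from a
    missing entry."""
    value = db.get(key, _MISSING)
    return default if value is _MISSING else value

db = {'flag': False}
print(get_with_default(db, 'flag'))     # -> False (the stored value)
print(get_with_default(db, 'missing'))  # -> None (no such entry)
```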
This project should really be called "jsondb". I understand it's called "pickledb" for historical reasons, but that's like calling a business that sells cars "bike store" because it used to sell bikes in the past.
I was looking for a simple JSON database and scrolled past pickledb because it had "pickle" in its name. I found out it doesn't use pickle as storage format much later. I could've easily not figured that out at all; it was pure luck.
Is it possible that this project is no longer maintained?
There have been open pull requests since 2013.
Originally posted by @Benjamin-Dewey in #18 (comment)
Please can we get this fork merged back to master?
I've run into this issue when using PickleDB with GUnicorn and Flask because the workers are run as child threads from the worker, so of course, PickleDB can't install its handler.
Hi, thank you for reading this.
What do you think about the idea of switching from an ordinary dictionary to an OrderedDict? I find that most of the time in my projects I need data to be sorted by keys, so I have to do it manually. I thought maybe I'm not the only one, and we could all benefit from making it a built-in feature? :) OrderedDict is located in the "collections" std-lib module.
Trying to use it on a MacBook. Here's the code:
import pickledb
db = pickledb.load('./test.db', False)
db.set('key', 'value')
db.get('key')
db.dump('test.db')
I got these error messages:
Traceback (most recent call last):
  File "zero1.py", line 15, in <module>
    db = pickledb.load('./test.db', False)
  File "/Users/macbookpro/anaconda2/lib/python2.7/site-packages/pickledb.py", line 34, in load
    return pickledb(location, option)
  File "/Users/macbookpro/anaconda2/lib/python2.7/site-packages/pickledb.py", line 41, in __init__
    self.load(location, option)
  File "/Users/macbookpro/anaconda2/lib/python2.7/site-packages/pickledb.py", line 49, in load
    self._loaddb()
  File "/Users/macbookpro/anaconda2/lib/python2.7/site-packages/pickledb.py", line 196, in _loaddb
    self.db = simplejson.load(open(self.loco, 'rb'))
  File "/Users/macbookpro/anaconda2/lib/python2.7/site-packages/simplejson/__init__.py", line 459, in load
    use_decimal=use_decimal, **kw)
  File "/Users/macbookpro/anaconda2/lib/python2.7/site-packages/simplejson/__init__.py", line 516, in loads
    return _default_decoder.decode(s)
  File "/Users/macbookpro/anaconda2/lib/python2.7/site-packages/simplejson/decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "/Users/macbookpro/anaconda2/lib/python2.7/site-packages/simplejson/decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Add a has_key function if it's not there.
I am thinking of writing a web API wrapper around pickledb so that it can be accessed over HTTP among multiple servers.
Example:
http://someurl:8080/set/a
POST: value
http://someurl:8080/get/a
returns its value in JSON format.
So I wanted to ask the community whether it's worth the effort or not.
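A stdlib-only sketch of what such a wrapper could look like (the routes, port, and in-memory dict stand in for pickledb; everything here is an assumption about the proposal, not an existing API):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

store = {}  # stand-in for a pickledb instance

class StoreHandler(BaseHTTPRequestHandler):
    """GET /get/<key> returns the value as JSON;
    POST /set/<key> stores the request body as the value."""

    def do_GET(self):
        if self.path.startswith('/get/'):
            key = self.path[len('/get/'):]
            body = json.dumps({'key': key, 'value': store.get(key)}).encode()
            self.send_response(200)
            self.send_header('Content-Type', 'application/json')
            self.end_headers()
            self.wfile.write(body)

    def do_POST(self):
        if self.path.startswith('/set/'):
            key = self.path[len('/set/'):]
            length = int(self.headers.get('Content-Length', 0))
            store[key] = self.rfile.read(length).decode()
            self.send_response(200)
            self.end_headers()

    def log_message(self, *args):
        pass  # keep the sketch quiet

def serve(port=8080):
    HTTPServer(('', port), StoreHandler).serve_forever()
```

A real deployment would also need locking around the store and periodic dump() calls, which is part of why the effort question is fair.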
I'm following the example from readme:
>>> import pickledb
>>> db = pickledb.load('test.db', False)
>>> db.set('key', 'value')
>>> db.get('key')
'value'
>>> db.dump()
However, on dump(), I'm getting:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/piotrek/.local/lib/python3.6/site-packages/pickledb.py", line 56, in dump
    self._dumpdb(True)
  File "/home/piotrek/.local/lib/python3.6/site-packages/pickledb.py", line 201, in _dumpdb
    simplejson.dump(self.db, open(self.loco, 'wb'))
  File "/home/piotrek/.local/lib/python3.6/site-packages/simplejson/__init__.py", line 279, in dump
    fp.write(chunk)
TypeError: a bytes-like object is required, not 'str'
The version of Python is 3.6.3.
To anyone who might have stumbled upon this package: as far as my understanding goes, pickleDB won't work for large dictionaries, since it keeps the entire dictionary in RAM (and loads the entire db file at once using simplejson). So, if you are looking to work with very large disk-based dictionaries, you might be better off using an SQLite database and writing a dictionary-like wrapper on top.
Also, the shelve module seems to work nicely for large dictionaries. However, it has the downside that it uses pickle internally, and thus the database files are not directly readable.
Please correct me if I am wrong. Thanks!
Or is it normal?
In the code, there is
key_string_error = TypeError('Key/name must be a string!')
which creates the exception once, in a place where it is not raised.
Later, the code uses
raise self.key_string_error
which appends a new traceback to the existing one in this exception object, so the traceback grows without bound.
This makes the code impossible to debug later, because the traceback can become very long. It also introduces a memory leak, because key_string_error is a class variable and is never freed.
Raise a new instance of TypeError('Key/name must be a string!') every time, or create a new exception type with a predefined message for this.
import pickledb
import traceback

db = pickledb.load('example.db', False)
for i in range(100):
    try:
        db.set({}, "")
    except TypeError:
        print("***********************************")
        traceback.print_exc()
json.dumps can't serialize user-defined instances. If you allow us to pass configuration through to json.dumps, we can solve the problem.

def my_instance_json_encoder(x):
    return x.__dict__

s = json.dumps(users, default=my_instance_json_encoder)

Better yet:

config = {
    "default": lambda x: x.__dict__,
    "indent": "  ",
}
json.dumps(users, **config)
In my case, this is mainly to pass the indent argument, which makes the db easier to inspect visually and to view diffs.
Is this library thread-safe?
Cheers!
db.lcreate('users') raises
TypeError: a bytes-like object is required, not 'str'
Should we open the file not as binary?
Create a method that returns a count of the number of keys currently in the db.
import pickledb
db = pickledb.load('database.sql', True)
db.set('key', ['one', 'two', 'three'])
db.ladd('key', 'four')
db.lremvalue('key', 'four') # You need to use db.dump() as a workaround
# db.dump()
Hi! I am looking to retrieve all entries whose keys start with an exact specified prefix (partial match).
Is this feature available or upcoming?
Thanks !
Dumping the key-value pairs from the dictionary to disk is done by creating a new thread from the main thread. However, since the process runs on a single core and only one thread can run at a time, this is as good as dumping to disk from the main thread.
If, however, the new thread were replaced by a new process created with fork, it could help while writing to disk: the new (child) process would have the task of dumping the key-value pairs to disk, while the parent process continues to run alongside. This ensures consistency, because even if the parent process crashes due to some error, the child process still writes the key-value pairs to disk.
This consistency issue is not addressed by multi-threading: if the process crashes due to some error, all the threads of the process are killed and the key-value pairs might not be completely written to disk, leaving the file inconsistent.
@patx please let me know your thoughts.
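The fork-based approach described above can be sketched as follows (POSIX only, since os.fork is unavailable on Windows; the helper name and file handling are illustrative assumptions, not pickledb code):

```python
import json
import os

def dump_in_child(db, location):
    """Fork a child that writes the snapshot to disk while the
    parent continues running; a parent crash after the fork no
    longer interrupts the write."""
    pid = os.fork()
    if pid == 0:
        # Child: write the snapshot, then exit without running
        # any parent cleanup handlers.
        with open(location, 'w') as f:
            json.dump(db, f)
        os._exit(0)
    return pid  # parent may os.waitpid(pid, 0) later to reap the child

pid = dump_in_child({'key': 'value'}, 'example.db')
os.waitpid(pid, 0)  # waiting here only to show the file is complete
```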
It will be nice to provide a get method which return a default value when nothing was stored for a given key into database like a defaultdict
http://docs.python.org/2/library/collections.html#collections.defaultdict
db.get('key', 'default_value')
The LRANGE command takes a key plus start and end values as input and returns the list of values in that range.
list = [2,3,4,5]
LRANGE 0 2 => [2, 3]
LRANGE 0 -1 => [2,3,4]
It is similar to LRANGE in redis
Shall I raise a pull request for this command implementation?
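The examples above follow Python slice semantics (end index exclusive, negatives allowed), which differ slightly from Redis's LRANGE, where the end index is inclusive. A sketch under that reading (the function name and dict-backed store are assumptions):

```python
def lrange(db, name, start, end):
    """Return the slice [start:end] of the list stored under `name`,
    using Python slice semantics as in the examples above."""
    return db[name][start:end]

db = {'nums': [2, 3, 4, 5]}
print(lrange(db, 'nums', 0, 2))   # -> [2, 3]
print(lrange(db, 'nums', 0, -1))  # -> [2, 3, 4]
```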
Hello, first of all thanks for maintaining and developing this project, it really fits my needs.
Is pickledb thread-safe? Can it be used in bigger projects as, for example, user database?
When you do db.lgetall("key"), you actually get the same list object that is stored in the db. If you then call list.pop(), you modify the db itself, outside of its API. This is a bad abstraction leak, even if there is some performance reason for it. It would be much better to return a copy().
Hello,
The latest release of pickledb, 0.6.2,
does not support Python 3. It is already fixed on the master branch; however, it hasn't been released. Are you planning to release a new version with Python 3 support?
There seems to be an issue with the example provided by pickledb itself. When calling the dump function with Python 3, the code produces a TypeError.
Here's what was inputted in a test file:
import pickledb
db = pickledb.load("test.db", False)
db.set("key", "value")
print(db.get("key"))
db.dump()
Traceback:
Traceback (most recent call last):
  File "tests.py", line 6, in <module>
    db.dump()
  File "/Users/shreydesai/GitHub/pickledb/pickledb.py", line 56, in dump
    self._dumpdb(True)
  File "/Users/shreydesai/GitHub/pickledb/pickledb.py", line 201, in _dumpdb
    simplejson.dump(self.db, open(self.loco, 'wb'))
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/simplejson/__init__.py", line 277, in dump
    fp.write(chunk)
TypeError: 'str' does not support the buffer interface
Being able to append to the pickledb.
Every time I call db.dump(), it overwrites the existing file.
It would be awesome to have optional encryption and decryption function supplied by the user so that they can encrypt the data before writing to file or decrypt using the decryptor function before reading from given file.
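The suggested hooks could be sketched as user-supplied bytes-to-bytes callables applied around serialization. Everything here is illustrative: the function names are hypothetical, and base64 below is only a stand-in transform to show the plumbing, not real encryption.

```python
import base64
import json

def dump_transformed(db, location, encrypt):
    """Serialize db to JSON, pass the bytes through the user-supplied
    `encrypt` callable, and write the result to disk."""
    with open(location, 'wb') as f:
        f.write(encrypt(json.dumps(db).encode()))

def load_transformed(location, decrypt):
    """Read the file, pass the bytes through the user-supplied
    `decrypt` callable, and parse the resulting JSON."""
    with open(location, 'rb') as f:
        return json.loads(decrypt(f.read()).decode())

dump_transformed({'key': 'value'}, 'secret.db', base64.b64encode)
print(load_transformed('secret.db', base64.b64decode))  # -> {'key': 'value'}
```

A user wanting real encryption would plug in, e.g., functions backed by a vetted cryptography library in place of the base64 pair.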
Hi!
simplejson.dump(self.db, open(self.loco, 'wb'))
gives us TypeError: 'str' does not support the buffer interface
because in Python 3 bare literal strings are unicode (I read about it here).
I just changed the file-opening mode from wb to wt for compatibility.
Please take a look:
https://github.com/approximatenumber/pickledb/tree/my_branch