dpath-maintainers / dpath-python

A python library for accessing and searching dictionaries via /slashed/paths ala xpath.

License: MIT License

Python 100.00%


dpath-python's Issues

dpath.util.new() should raise TypeError when user asks to create a new path in an existing but unsubscriptable object

OS X 10.9.5
Tested on Python 3.5.0, 3.4.3, and 2.7.10

So, I'm trying to use dpath in one of my projects, and the examples in the README docs work just fine. However, when I fiddle around on my own, dpath.util.new seems to think I want to tack on an array when I really don't:

In [1]: import dpath.util

In [2]: d = {"a": 1, "b": 2}

In [3]: dpath.util.new(d, "a/c", "something")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/Users/[username]/.pyenv/versions/3.5.0/lib/python3.5/site-packages/dpath/path.py in path_types(obj, path)
     29         try:
---> 30             result.append([path[-1], cur[path[-1]].__class__])
     31         except TypeError:

TypeError: 'int' object is not subscriptable

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-3-e13767a0cab0> in <module>()
----> 1 dpath.util.new(d, "a/c", "something")

/Users/[username]/.pyenv/versions/3.5.0/lib/python3.5/site-packages/dpath/util.py in new(obj, path, value, separator)
     44     """
     45     pathlist = __safe_path__(path, separator)
---> 46     pathobj = dpath.path.path_types(obj, pathlist)
     47     return dpath.path.set(obj, pathobj, value, create_missing=True)
     48

/Users/[username]/.pyenv/versions/3.5.0/lib/python3.5/site-packages/dpath/path.py in path_types(obj, path)
     30             result.append([path[-1], cur[path[-1]].__class__])
     31         except TypeError:
---> 32             result.append([path[-1], cur[int(path[-1])].__class__])
     33     except (KeyError, IndexError):
     34         result.append([path[-1], path[-1].__class__])

ValueError: invalid literal for int() with base 10: 'c'

I don't understand what the substantive difference is between this and the analogous example in the docs. Am I missing anything obvious, here?
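
A minimal sketch of the behavior the issue title asks for, written without dpath so it stays self-contained. `new_path` is a hypothetical helper, not dpath's implementation; it only handles dict containers and raises a TypeError as soon as an existing intermediate value cannot hold children.

```python
def new_path(obj, path, value, separator="/"):
    """Hypothetical helper: create `path` in nested dicts and set the leaf to `value`."""
    keys = [k for k in path.split(separator) if k]
    cur = obj
    for depth, key in enumerate(keys[:-1]):
        if key not in cur:
            cur[key] = {}
        child = cur[key]
        if not isinstance(child, dict):
            # The existing value (an int in the report above) cannot be pathed into.
            where = separator.join(keys[: depth + 1])
            raise TypeError(
                "cannot create %r: %r holds a %s, not a container"
                % (path, where, type(child).__name__)
            )
        cur = child
    cur[keys[-1]] = value
    return obj


d = {"a": 1, "b": 2}
try:
    new_path(d, "a/c", "something")
except TypeError as exc:
    error_message = str(exc)
```

The check runs before anything is written under the offending key, so `d` is left unchanged when the call fails.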

dpath.path.validate chokes on integer keys

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "build/bdist.linux-x86_64/egg/dpath/util.py", line 89, in search
  File "build/bdist.linux-x86_64/egg/dpath/util.py", line 75, in _search_view
  File "build/bdist.linux-x86_64/egg/dpath/path.py", line 93, in search
  File "build/bdist.linux-x86_64/egg/dpath/path.py", line 52, in paths
  File "build/bdist.linux-x86_64/egg/dpath/path.py", line 42, in paths
  File "build/bdist.linux-x86_64/egg/dpath/path.py", line 10, in validate
TypeError: argument of type 'int' is not iterable

This occurs whenever dpath.path.validate() receives a key that is an integer, such as when iterating over a list.
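
The TypeError is what Python raises for an expression such as `"/" in 3`. One possible guard, sketched with a hypothetical `key_contains_separator` helper (not a dpath function), is to test only string keys:

```python
def key_contains_separator(key, separator="/"):
    """Hypothetical check: only string keys can contain the separator."""
    return isinstance(key, str) and separator in key


# Integer keys (for example list indexes) are simply reported as clean.
checks = [key_contains_separator(k) for k in ("a", "a/b", 0, 7)]
```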

provide example of list of dictionaries and filtering by intermediate key

Say I have this:

d = {'x': [{'a': 5, 'b': 2}, {'a': 6, 'b': 4}]}

and I want to list all the values of a. I cannot tell from the README that I should use:

dpath.util.values(d, '/x/*/a')

Also, how can I look up the value of 'b' for when 'a' is 6, for example? Is there a place to look for documentation (that could be linked from the README?) Am I not looking for docs in the right place just because I’m unfamiliar with python? ;-)

Thanks for any help or doc updates!

** search does not handle lists properly

{'a': {'b': {'c': [], '3': 2, '43': 30, 'd': ['red', 'buggy', 'bumpers']}}}
ipdb> dpath.search(data, '**')
{'a': {'b': {'3': 2, '43': 30, 'd': {0: 'red', 1: 'buggy', 2: 'bumpers'}}}}

There are 2 issues here: the "c" key has been completely removed and the "d" key has been converted from a list to a dictionary. Is this an issue or am I just not understanding the expected behaviour?

Fatal error with empty keys

Even if you set dpath.options.ALLOW_EMPTY_STRING_KEYS = True, calling dpath.util.search on a dict with empty keys raises an IndexError, because the check elif (skip and k[0] == '+'): in dpath.path.paths indexes into the empty string.

dpath-python doesn't respond properly to queries that begin with '/'

This is going to frustrate a lot of people, but dpath doesn't understand '/' at the beginning of queries, and will return empty result sets in that case.

Glob matches are case-normalized on Windows

Using fnmatch for globbing means that both patterns and keys are case-normalized using os.path.normcase.

On *nix, this has no effect, but on Windows:

>>> d = {'a': 42}
>>> dpath.util.get(d, 'a')
42
>>> dpath.util.get(d, 'A') # should be a KeyError
42
>>> d = {r'a\b': 42}
>>> dpath.util.get(d, 'a/b') # should be a KeyError
42

(IIRC, fnmatch used to do its own case normalization instead of calling os.path.normcase until somewhere around 2.7 and 3.2, so in 2.6 or 3.1 the last one won't be a problem, only the first.)

To fix this, if the only function out of fnmatch that you're using is fnmatch, just use fnmatchcase instead.
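
A small standard-library illustration of the suggested fix. It does not touch dpath; it only shows that `fnmatch.fnmatchcase` compares case-sensitively regardless of platform.

```python
import fnmatch

keys = ["a", "A"]
# fnmatchcase never applies os.path.normcase, so "A" does not match "a".
matched = [k for k in keys if fnmatch.fnmatchcase(k, "a")]
```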

Possible to merge lists uniquely?

Is it possible to add something like MERGE_UNIQUE, so that two lists are combined into a single list of unique values?

merge(['1', '3', '4', '5'], ['1', '2', '3'])
['1', '2', '3', '4', '5']

Would be great - thanks
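
Until such a flag exists, the combination can be sketched in plain Python. `merge_unique` is a hypothetical helper, not part of dpath, and it assumes the list items are hashable.

```python
def merge_unique(dst, src):
    """Hypothetical helper: combine two lists, keeping each value once in first-seen order."""
    seen = set()
    merged = []
    for item in list(dst) + list(src):
        if item not in seen:
            seen.add(item)
            merged.append(item)
    return merged


combined = merge_unique(['1', '3', '4', '5'], ['1', '2', '3'])
# First-seen order puts '2' last; sorted(combined) gives the ordering requested above.
```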

Is it possible to filter multiple times?

[{'signal_name': 's111',
'op_cond': [
	 {'name': 'op1',
		 'duration': 1,
		 'properties': [
			{'type': 't0', 'param0': 1}]},
	 {'name': 'op2',
		 'duration': 2,
		 'properties': [
			{'type': 't1', 'param1': 1, 'param2': 10},
			{'type': 't1', 'param1': 2, 'param2': 11}]}
			]},
{'signal_name': 's222',
'op_cond': [
	 {'name': 'op1',
		 'duration': 1,
		 'properties': [
			{'type': 't0', 'param0': 1}]},
	 {'name': 'op2',
		 'duration': 2,
		 'properties': [
			{'type': 't1', 'param1': 1, 'param2': 10},
			{'type': 't1', 'param1': 2, 'param2': 11}]}
			]}]

Let's say I want all duration values where signal_name == 's222' and name == 'op2'.
Is it possible to do this with a one-line command?

Could not figure it out reading the docs...
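
For comparison, here is how the two conditions chain in plain Python, without dpath. The data is a trimmed copy of the structure above (the properties lists are omitted), and the variable names are only illustrative.

```python
signals = [
    {"signal_name": "s111",
     "op_cond": [{"name": "op1", "duration": 1}, {"name": "op2", "duration": 2}]},
    {"signal_name": "s222",
     "op_cond": [{"name": "op1", "duration": 1}, {"name": "op2", "duration": 2}]},
]

# Filter on the outer key first, then on the inner key.
durations = [
    op["duration"]
    for sig in signals if sig["signal_name"] == "s222"
    for op in sig["op_cond"] if op["name"] == "op2"
]
```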

MERGE_REPLACE does not replace but updates a dst list

from dpath.util import merge, MERGE_REPLACE

dict1 = {
    1: [1, 2, 3]
}
dict2 = {
    1: ['a']
}
merge(dict1, dict2, flags=MERGE_REPLACE)
print(dict1)

Actual result: {1: ['a', 2, 3]}
Expected result: {1: ['a']}

Is this the correct behavior? The documentation says:
MERGE_REPLACE : Instead of combining list objects, when
2 list objects are at an equal depth of merge, replace
the destination with the source.

Glob characters cannot be escaped

Python's fnmatch and glob don't allow escaping glob characters with a backslash.

This is by design for Python (even if it's a weird design), but I don't think it's what most people expect from dpath, even if they see the reference to fnmatch:

>>> d = {'a': 23, '*': 42}
>>> dpath.util.get(d, '*')
ValueError: dpath.util.get() globs must match only one leaf : *
>>> dpath.util.search(d, '*')
{'a': 23, '*': 42}
>>> dpath.util.get(d, r'\*')
KeyError: '\\*'
>>> dpath.util.search(d, r'\*')
{}
>>> dpath.util.get(d, '[*]') # is anyone going to guess this?
42
>>> dpath.util.search(d, '[*]')
{'*': 42}

I've run into a similar issue in the past with many projects that want "globbing like the glob module but with backslash escapes". The simplest solution (but make sure to verify compatible licensing) is to fork fnmatch into your own code, and change this part of the translate function:

    if c == '*':

… to this:

    if c == '\\' and i < n:
        res = res + re.escape(pat[i])
        i = i+1
    elif c == '*':

However, that may not be all you need here.

I'm guessing (given the KeyError) that you have some code that checks for a wildcard and uses fnmatch if found but does a direct lookup if not. If so, your wildcard test has to change to understand backslashes (re.search on (?<!\\)[*?[]) and your direct-lookup has to manually unescape the same characters (re.sub on a similar pattern but that also handles ] and \\). Although once you're getting into manually unescaping, you may want to consider what you want to do with a backslash before any other character—leave it as a backslash, unicode-unescape it, raise an exception, …

Fixing this could also make it easy to provide an alternate fix for #30 by allowing users to backslash-escape the path separator character.
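
The `[*]` trick shown above does not have to be written by hand: the standard library's `glob.escape` (Python 3.4+) produces it. The sketch below uses only `glob` and `fnmatch`; whether dpath would accept the escaped pattern the same way is an assumption based on the session above.

```python
import fnmatch
import glob

# glob.escape wraps each glob metacharacter in brackets, so "*" becomes "[*]".
literal_star = glob.escape("*")

keys = ["a", "*"]
# Only the key that is literally "*" matches the escaped pattern.
matches = [k for k in keys if fnmatch.fnmatchcase(k, literal_star)]
```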

Document how to seek keys with more than one letter

Right now all examples use keys with len==1: result = dpath.util.search(x, "a/b/[cd]").

It would be helpful to know how to seek longer keys. For example, if we have keys called 'c1' and 'c2' under a/b, do we use dpath.util.search(x, "a/b/[c1c2]") or something else? How do we prevent it from grabbing keys like '1c'?
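
Assuming dpath's globs follow Python's `fnmatch` rules, as other issues on this page state, the bracket syntax can be explored with the standard library alone. A bracket expression is a character class that matches exactly one character, so `[c1c2]` cannot match a two-character key.

```python
import fnmatch

keys = ["c1", "c2", "1c", "c", "c3"]

# "[c1c2]" is a single character class: it matches only one-character keys.
single_char = [k for k in keys if fnmatch.fnmatchcase(k, "[c1c2]")]

# "c[12]" matches a literal "c" followed by either "1" or "2".
wanted = [k for k in keys if fnmatch.fnmatchcase(k, "c[12]")]
```

Under that assumption, a path such as "a/b/c[12]" would select 'c1' and 'c2' while skipping '1c'.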

Improve documentation

There have been several emails and bugs filed about dpath's documentation. Apparently there are several users who find it confusing, and even the maintainers have found the documentation frustrating at times.

Let's please link all documentation-related incidents here, so we can get them all knocked out at once.

#42 - The documentation for .values() is right there in the README, but somehow the user missed it. How can we make that more obvious?
#15 - the documentation implies to some people that keys must be 1 character in length.
#38 - Document the globbing syntax

Bug on list merge

def test_merge_list():
    src = {"l": [1]}
    p1 = {"l": [2], "v": 1}
    p2 = {"v": 2}
    dst1 = {}
    for d in [src, p1]:
        dpath.util.merge(dst1, d)
    dst2 = {}
    for d in [src, p2]:
        dpath.util.merge(dst2, d)

    assert dst1["l"] == [1, 2]
    assert dst2["l"] == [1]

dst2["l"] should be [1]

pytest result:

tests/test_util_merge.py::test_merge_list FAILED

========================================================================== FAILURES ==========================================================================
______________________________________________________________________ test_merge_list _______________________________________________________________________

    def test_merge_list():
        src = {"l": [1]}
        p1 = {"l": [2], "v": 1}
        p2 = {"v": 2}
        dst1 = {}
        for d in [src, p1]:
            dpath.util.merge(dst1, d)
        dst2 = {}
        for d in [src, p2]:
            dpath.util.merge(dst2, d)

        assert dst1["l"] == [1, 2]
>       assert dst2["l"] == [1]
E       assert [1, 2] == [1]
E         Left contains more items, first extra item: 2
E         Full diff:
E         - [1, 2]
E         + [1]

tests/test_util_merge.py:123: AssertionError
============================================================= 1 failed, 8 passed in 0.13 seconds =============================================================

Filtering doesn't work as expected

I guess I will look into this sometime

>>> import dpath.util
>>> a = { 'actions' : [ { 'type': 'correct' }, { 'type': 'incorrect' } ] }
>>> dpath.util.values(a, 'actions/*')
[{'type': 'correct'}, {'type': 'incorrect'}]
>>> dpath.util.values(a, 'actions/*',afilter=lambda x: x.get('type', None) is 'correct')
[]
>>> def f(x):
>>>     print x
>>> dpath.util.values(a, 'actions/*',afilter=f)
[]

PEP 440 compliance

Hi there,

I have been using dpath for several months and it's all good, nice work.

However, one thing that recently changed in setuptools is that it apparently treats versions that are not compatible with PEP 440 differently.

In particular, when using setuptools under zc.buildout and trying to install, say, dpath 1.2-70, setuptools turns the version into 1.2.post70 and the build ends with

Error: There is a version conflict.
We already have: dpath 1.2.post70

This is part of a bigger build mechanism so I'm not sure exactly which part of setuptools or zc.buildout does it but the root cause is that this is 1.2-70 rather than 1.2.70.

The question is, could you please start using the latter format for package versions? It doesn't make any real difference except that it makes both setuptools and zc.buildout happy.

Thanks.

delete() uses a hardcoded '/' separator, not the separator specified in the function call

separators are not passed through to _inner_search

If we want to allow "/" characters in keys, we need to pass in a different separator. However, the inner search called by search ignores it and calls .paths, which uses the default "/" character, so everything fails.
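
To illustrate what passing the separator through means, here is a self-contained sketch. `get_by_path` is a hypothetical lookup, not dpath code; the point is only that every step splits on the caller's separator instead of a hardcoded "/".

```python
def get_by_path(obj, path, separator="/"):
    """Hypothetical lookup that uses the caller's separator at every step."""
    cur = obj
    for key in path.split(separator):
        if key == "":
            continue  # tolerate a leading separator
        cur = cur[key]
    return cur


# A key containing "/" is reachable once a different separator is used.
data = {"ansible_facts": {"/dev/sdc6": {"size": 42}}}
size = get_by_path(data, "ansible_facts;/dev/sdc6;size", separator=";")
```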

Cannot merge at various levels

I am attempting to search a dictionary for a key, retrieve the value, pop the key, and insert the contents of a file in its place.
Given dictionary:
{ 'z': ['1.1.1.1', '1.1.1.2'], 'include': 'file1.json', 'u': {'include': 'file2.json'}, 'hello': 'goodbye'}
Is it possible to merge at multiple levels without having the source files match the levels of the destination dictionary?

Asterisk search does not appear to work

I noticed that the example in the readme doesn't actually work:

import dpath.util
import json

x = {
    "a": {
        "b": {
        "3": 2,
        "43": 30,
        "c": [],
        "d": ['red', 'buggy', 'bumpers'],
        }
    }
}

print(str(dpath.util.values(x, '/a/b/d/\*')))
>>> []

The documentation suggests it should return this:
['red', 'buggy', 'bumpers']

Document globbing syntax

The readme could use some documentation on how globs like [cd] work. This doesn't look like regex or even bash globbing, and there aren't too many results on what eglob is supposed to refer to outside of references to dpath-python.

dpath.path.get fails to properly retrieve values from nested collections

This test illustrates the problem:

def test_search_return_list_globbed():
    tdict = {
        "a": {
            "b": [
                0,
                1,
                2]
            }
        }
    res = dpath.util.search(tdict, '/a/b/[02]')
    print res
    assert(isinstance(res['a']['b'], list))
    assert(len(res['a']['b']) == 2)
    assert(res['a']['b'] == [0, 2])

... and the nosetest fails:

======================================================================
FAIL: test_util_search.test_search_return_list_globbed
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nose-1.3.0-py2.7.egg/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/cygdrive/c/Users/akesterson/source/upstream/dpath-python/tests/test_util_search.py", line 148, in test_search_return_list_globbed
    assert(len(res['a']['b']) == 2)
AssertionError

----------------------------------------------------------------------

... it fails because get is absolutely not doing what we expect. Check out this bit of dpath.path.get() :

    if view:
        if isinstance(tail, dict):
            if issubclass(pair[1], (list, dict)):
                tail[key] = pair[1]()
            else:
                tail[key] = None
            up = tail
            tail = tail[key]

... What's happening here is that when it paths to /a/b, up gets set to {'b': []}, not [] like we would expect. So when the last element in the path comes through, being /0 and /2, we wind up with {0: 0, 2: 2, 'b': []} ... Illustrated with some prints in the test:

======================================================================
FAIL: test_util_search.test_search_return_list_globbed
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nose-1.3.0-py2.7.egg/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/cygdrive/c/Users/akesterson/source/upstream/dpath-python/tests/test_util_search.py", line 148, in test_search_return_list_globbed
    assert(len(res['a']['b']) == 2)
AssertionError:
-------------------- >> begin captured stdout << ---------------------
{'a': {0: 0, 'b': []}}
{'a': {2: 2, 'b': []}}

--------------------- >> end captured stdout << ----------------------

... So that much is obvious. The intent of 'up', obviously, was to maintain a pointer to the most recent collection created, so that when the end of the path is reached, we can just insert into 'up' and be done. Unfortunately, if I restructure it like this:

        if isinstance(tail, dict):
            if issubclass(pair[1], (list, dict)):
                tail[key] = pair[1]()
                up = tail
            else:
                tail[key] = None
            tail = tail[key]

... It just makes things worse:

======================================================================
ERROR: test_util_merge.test_merge_filter
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nose-1.3.0-py2.7.egg/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/cygdrive/c/Users/akesterson/source/upstream/dpath-python/tests/test_util_merge.py", line 93, in test_merge_filter
    dpath.util.merge(dst, src, filter=filter)
  File "/cygdrive/c/Users/akesterson/source/upstream/dpath-python/dpath/util.py", line 120, in merge
    src = search(src, '**', filter=filter)
  File "/cygdrive/c/Users/akesterson/source/upstream/dpath-python/dpath/util.py", line 93, in search
    return _search_view(obj, glob)
  File "/cygdrive/c/Users/akesterson/source/upstream/dpath-python/dpath/util.py", line 77, in _search_view
    val = dpath.path.get(obj, path, filter=filter, view=True)
  File "/cygdrive/c/Users/akesterson/source/upstream/dpath-python/dpath/path.py", line 263, in get
    up[key] = target
TypeError: 'NoneType' object does not support item assignment

======================================================================
FAIL: test_util_search.test_search_return_dict_globbed
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nose-1.3.0-py2.7.egg/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/cygdrive/c/Users/akesterson/source/upstream/dpath-python/tests/test_util_search.py", line 121, in test_search_return_dict_globbed
    assert(res['a']['b'] == {0: 0, 2: 2})
AssertionError

======================================================================
FAIL: test_util_search.test_search_return_list_globbed
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nose-1.3.0-py2.7.egg/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/cygdrive/c/Users/akesterson/source/upstream/dpath-python/tests/test_util_search.py", line 148, in test_search_return_list_globbed
    assert(len(res['a']['b']) == 2)
AssertionError

----------------------------------------------------------------------
Ran 30 tests in 0.044s

FAILED (errors=1, failures=2)

... GRRRR ..... I HATES LISTS TO PIECES .....

Bamboo builds are not pushing tags back to git

See #19

Can I remove a key?

Following your example with x, I can see that we can search, modify an existing key, and add a new key. My use case requires removing a key. Is there a way to do it? Thanks.

Should use `Mapping`, `Sequence` rather than `Mutable` versions

dpath (after #35 was fixed) distinguishes between mappings, sequences, and other objects using MutableMapping and MutableSequence, instead of Mapping and Sequence. This means that it can't search within immutable sequences, like tuples:

>>> dpath.util.get([1,2,3], '1')
2
>>> dpath.util.get((1,2,3), '1')
KeyError: '1'

What about functions that need to mutate the input, like set or add? I think that even there, you want to path through using Mapping and Sequence. You may need to check at the "bottom" level, where you do the update/insert, that it's actually a MutableMapping or MutableSequence, but I'm not sure even that is necessary—and if it is, you probably still want to check Mapping or Sequence first, so you correctly report that a tuple is immutable instead of reporting that it's not something you can path into.

Consider:

>>> x = [[]]
>>> dpath.util.new(x, '0/0', 'hi')
>>> x
[['hi']]
>>> x = ([], )
>>> dpath.util.new(x, '0/0', 'hi') # should succeed and give us (['hi'],)
TypeError: Unable to path into elements of type ([],) at []
>>> x = [()]
>>> dpath.util.new(x, '0/0', 'hi') # should fail because x[0] is immutable
TypeError: Unable to path into elements of type () at [0]

(Also, the text of the TypeError is a bit weird—the problem is pathing into elements of (), or of type tuple, not of type ().)

This does raise the question of whether you want to allow pathing into strings, since issubclass(str, Sequence). If you don't want that, you'd need to explicitly exclude it with a second isinstance which checks (unicode, bytearray) for 2.x, (bytes, bytearray) for 3.0-3.4, and ByteString for 3.5+. But maybe it's fine for dpath.util.get(['abc'], '0/0') to return 'a' instead of raising a TypeError about being unable to path into strings.
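
The distinctions discussed above can be checked with `collections.abc` directly. `is_pathable_sequence` is a hypothetical helper (Python 3 only) that accepts any Sequence except text and byte strings; it is one possible policy, not what dpath does.

```python
from collections.abc import MutableSequence, Sequence


def is_pathable_sequence(obj):
    """Hypothetical check: any Sequence except text and byte strings."""
    return isinstance(obj, Sequence) and not isinstance(obj, (str, bytes, bytearray))


results = {
    "list": is_pathable_sequence([1, 2, 3]),
    "tuple": is_pathable_sequence((1, 2, 3)),
    "str": is_pathable_sequence("abc"),
    # A tuple is a Sequence but not a MutableSequence, which is why it is skipped today.
    "tuple_is_mutable": isinstance((1, 2, 3), MutableSequence),
}
```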

Unable to install with easy_install

After it seems to install, I try to use it and get errors like:

help(dpath.util.search)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'dpath' is not defined

Another example of the error is:

result = dpath.util.search(x, "a/b/[cd]")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'util'

Creating items in nested lists

Hi,

I would expect that running dpath.util.new on row[0][field1] would create a nested structure that results in a dict -> list -> dict:

+ {'row': [{'field1': 'val1', 'field2': 'val2'}, {'field1': 'val3'}]}
?         ^                                                        +

instead of 

- {'row': {'0': {'field1': 'val1', 'field2': 'val2'}, '1': {'field1': 'val3'}}}
?         ^^^^^^                                     -----                    -

What are your thoughts? I'd love it. Note: it would break backward compatibility.

Coding

Coding it seems quite easy:
https://github.com/akesterson/dpath-python/blob/d0380f817265ffd043a261eac8a3f5380f7577a0/dpath/path.py#L27
should be turned into

 is_num = re.match(r'\d+$', elem)
 result.append([elem, (list if is_num else dict)])

plus maybe accessors should be modified.

BTW. Thank you for this lib!

Attempting to use dpath for this output

How would I use dpath, or is it even possible, to get this output given this list:

B-A.txt
B-B.txt
B-C
B-C/B-C-A.txt
B-C/B-C-B.txt
B-C/B-C-C
B-C/B-C-C/B-C-C-B.txt
B-C/B-C-C/B-C-C-A.txt

Output:

{
   "fullpath": "uploads_fortest/v999999/B.tar.gz",
   "name": "B.tar.gz",
   "paths": [
      {
         "fullpath": "uploads_fortest/v999999/B.tar.gz/B-A.txt",
         "name": "B-A.txt",
         "type": "file"
      },
      {
         "fullpath": "uploads_fortest/v999999/B.tar.gz/B-B.txt",
         "name": "B-B.txt",
         "type": "file"
      },
      {
         "fullpath": "uploads_fortest/v999999/B.tar.gz/B-C",
         "name": "B-C",
         "paths": [
            {
               "fullpath": "uploads_fortest/v999999/B.tar.gz/B-C/B-C-A.txt",
               "name": "B-C-A.txt",
               "type": "file"
            },
            {
               "fullpath": "uploads_fortest/v999999/B.tar.gz/B-C/B-C-B.txt",
               "name": "B-C-B.txt",
               "type": "file"
            },
            {
               "fullpath": "uploads_fortest/v999999/B.tar.gz/B-C/B-C-C",
               "name": "B-C-C",
               "paths": [
                  {
                     "fullpath": "uploads_fortest/v999999/B.tar.gz/B-C/B-C-C/B-C-C-B.txt",
                     "name": "B-C-C-B.txt",
                     "type": "file"
                  },
                  {
                     "fullpath": "uploads_fortest/v999999/B.tar.gz/B-C/B-C-C/B-C-C-A.txt",
                     "name": "B-C-C-A.txt",
                     "type": "file"
                  }
               ],
               "type": "file"
            }
         ],
         "type": "file"
      }
   ],
   "type": "tarfile"
}

What I am trying to do is iterate over the output of tarfile.getmembers() and build a json tree without having to recurse using your dpath.

So far my recursion kinda works (not really as I can't set "type" appropriately)...

import os
import tarfile

def add(branch, trunk):
    parts = branch.split('/', 1)
    if "paths" not in trunk:
        trunk["paths"] = []

    if len(parts) > 1:
        node, others = parts
        # print(trunk["paths"][-1])
        add(others, trunk["paths"][-1])

    elif len(parts) == 1:
        # trunk["type"] = file_type
        trunk["paths"].append(dict(fullpath=os.path.join(trunk["fullpath"], parts[0]),
                                   name=os.path.basename(parts[0]),
                                   type="file"))

result = {"fullpath": tar_path, "name": os.path.basename(tar_path), "type": "tarfile"}
tar = tarfile.open(tar_path)
tar_infos = tar.getmembers()
for member in tar_infos:
    add(member.name, result)

This is the contents of a tar file therefore I can't use os.listdir and the like. Also I'd like to be able to create the master json recursively for any tar files found inside a tarfile with a wrapper function.

Any good ideas ;) ?
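
One non-recursive approach, sketched without dpath: keep a flat dict that maps each member path to its node and attach every new node to its parent. `build_tree` is a hypothetical helper. It assumes every directory appears as its own member (as in the listing above), and the "directory" type label is an assumption, since the desired output above still shows "file" for directories.

```python
import posixpath


def build_tree(root_path, member_names):
    """Hypothetical helper: turn a flat list of archive member names into a nested tree."""
    root = {"fullpath": root_path, "name": posixpath.basename(root_path), "type": "tarfile"}
    nodes = {"": root}  # maps each member path to its node; "" is the archive root
    # Sorting guarantees that a directory is seen before anything inside it.
    for name in sorted(member_names):
        parent_path, _, base = name.rpartition("/")
        parent = nodes[parent_path]
        if parent is not root:
            parent["type"] = "directory"  # assumption: a node with children is a directory
        node = {"fullpath": posixpath.join(root_path, name), "name": base, "type": "file"}
        parent.setdefault("paths", []).append(node)
        nodes[name] = node
    return root


names = [
    "B-A.txt", "B-B.txt", "B-C",
    "B-C/B-C-A.txt", "B-C/B-C-B.txt", "B-C/B-C-C",
    "B-C/B-C-C/B-C-C-B.txt", "B-C/B-C-C/B-C-C-A.txt",
]
tree = build_tree("uploads_fortest/v999999/B.tar.gz", names)
```

With a real archive, the names would come from `[m.name for m in tar.getmembers()]`, and `TarInfo.isdir()` could replace the guess about which nodes are directories.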

Can't install - missing module dpath.version

Thanks for creating this. Looks like there is a missing module in the download. Where can I get it so I can install?

C:\Python27\Lib\site-packages\dpath-python-master>python steup.py install
python: can't open file 'steup.py': [Errno 2] No such file or directory

C:\Python27\Lib\site-packages\dpath-python-master>python setup.py install
Traceback (most recent call last):
File "setup.py", line 2, in <module>
import dpath.version
ImportError: No module named version

C:\Python27\Lib\site-packages\dpath-python-master>

Spike - encapsulate Segments API into an overridable class

Currently virtually all of dpath's 2.0 behavior is wrapped up in the segments library. We'd like to be able to easily specify whether the v1 or v2 API is being used. We could easily break this out into "v1" and "v2" submodules, but that would break backwards compatibility for 1.x users who expect everything to be in dpath.util (and, frankly, it gets in the way of dpath's simplicity - ultimately this is a utility library of only a few methods).

Let's see how it looks when we take the segments API and move it into some kind of "DPathBehavior" class, which is (by default) 2.0 behavior. Then let's see how easy that is to subclass to v1.0 and make the API accept a behavior implementation class, instead of a ton of flags.

util.search on certain nested lists/dictionaries doesn't return anything.

I'm not sure if there is a bug or if I am just using the search call incorrectly, but the following tests don't do anything.

from dpath import util as dpathutil

if __name__ == '__main__':
    a = {'entry' : ['a', 'b'] }
    for x in dpathutil.search(a, 'entry', yielded=True):
        print x
    b = { 'entry' : [ 'a', 'b', ['c'] ] }
    for x in dpathutil.search(b, 'entry', yielded=True):
        print x
    c = [['validate_code', 200], ['validate_body_json', {'Errors': None}], ['validate_keys_json', ['Location/0/Distance', 'Location/0/Availability']]]
    for x in dpathutil.search(c, '0', yielded=True):
        print x
    new = { "a": { "b": { "c": [], "d": [ "red", "buggy", "bumpers" ] } } }
    for x in dpathutil.search(new, "a/b/[cd]", yielded=True):
        print x

The last test in the snippet above is from the README and doesn't output what is expected. Neither do any of the tests above it.

util.new seems to work fine for nested dictionary/list/string objects.

Check for interfaces instead of concrete types

I was hoping dpath would work with objects implementing e.g. collections.MutableMapping but unfortunately it will only work with concrete types like dict, list and tuple.

import dpath
from collections import Mapping, MutableMapping

class Namespace(MutableMapping):
    def __init__(self, data={}):
        self._mapping = {}
        self._mapping.update(data)

    def __len__(self):
        return len(self._mapping)

    def __iter__(self):
        return iter(self._mapping)

    def __contains__(self, key):
        return key in self._mapping

    def __getitem__(self, key):
        return self._mapping[key]

    def __setitem__(self, key, value):
        self._mapping[key] = value

    def __delitem__(self, key):
        del self._mapping[key]

x = Namespace()
x['user'] = Namespace()
x['user']['group'] = 'test'

assert isinstance(dict(), Mapping) == True
assert isinstance({}, Mapping) == True

assert x['user']['group'] == 'test'
assert dpath.util.get(x, '/user/group') == 'test'

feature request - anchor & collapse path

I've used Perl's Data::DPath (https://metacpan.org/pod/Data::DPath) in the past and found its 'special elements' feature really useful (pasted below). Would it be possible to implement the same functionality?

//
Anchors to any hash or array inside the data structure below the currently found points (or the root).

Typically used at the start of a path to anchor the path anywhere instead of only the root node:

//FOO/BAR
but can also happen inside paths to skip middle parts:

/AAA/BBB//FARAWAY
This allows any way between BBB and FARAWAY.

<< this bullet is supposed to be an asterisk

Matches one step of any value relative to the current points (or the root). This step might be any hash key or all values of an array in the step before.

Documentation does not state that you should import dpath.util not import dpath

See issue #10, this is confusing for some users.

get root of dictionary

Hi.

It is possible to get leaves of a dictionary, e.g. x = {'p':{'a':{'t':{'h':'value'}}, 'f':'misc'}}, with dpath by e.g.
dpath.util.get(x, "/p/a/t/h").
But how do I get the root? I expected something like dpath.util.get(x, "/") to return x, but this does not currently work.

Thanks,
H.

dpath-python should throw an exception whenever it encounters a path component or key that contains the separator character

See summary, this means that if someone sets:

>>> x = {"some/key": 0}

... then we will fail to search it, and we will fail silently. An exception should be raised when a path component is located that contains invalid characters.

Simple 'get()' or 'first()' or 'value()' method for dpath

Currently dpath assumes you will get multiple results for pretty much any query. However, if you know an exact path into a dictionary, and you want to extract the single value at that exact location, dpath makes this less than trivial. Consider a test case:

>>> print json.dumps(x, indent=4, sort_keys=True)
{
    "a": {
        "b": {
            "3": 2, 
            "43": 30, 
            "c": [], 
            "d": [
                "red", 
                "buggy", 
                "bumpers"
            ]
        }
    }
}
>>> dpath.util.search(x, 'a/b/43')
{'a': {'b': {'43': 30}}}

dpath.util.search assumes you want a dict because it assumes there will be multiple results (which, in most cases, is accurate). Even using the yielded iterator produces a suboptimal user experience here, because you have to iterate over the results:

>>> dpath.util.search(x, 'a/b/43', yielded=True)[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'generator' object is unsubscriptable
>>> for item in dpath.util.search(x, 'a/b/43', yielded=True):
...     print item
... 
('a/b/43', 30)

You can, however, easily get around this by using a list comprehension - but that just feels quirky and hacky:

>>> [y for y in dpath.util.search(x, 'a/b/43', yielded=True)][0][1]
30

We should, instead, be able to say "give me all the values for this query". Something like:

>>> dpath.util.values(x, 'a/b/43')
[30]

... And when you know/expect to only get a single value, or you only care about the first one ("first" being a dangerous term here unless the dict is ordered), you could call:

>>> dpath.util.get(x, 'a/b/43')
30
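
For illustration, here is a standalone sketch of the two proposed helpers. It does not use dpath; leaf_items is a hypothetical stand-in for a yielded search, it only matches exact paths (no globbing), and it treats lists as leaves. The point is that next() on a generator avoids the list-comprehension workaround.

```python
def leaf_items(obj, prefix=""):
    """Yield (path, value) for every leaf of nested dicts, like a yielded search."""
    for key, value in obj.items():
        path = prefix + "/" + str(key) if prefix else str(key)
        if isinstance(value, dict):
            for item in leaf_items(value, path):
                yield item
        else:
            yield path, value


def values(obj, wanted):
    """Return the values of all leaves whose path equals wanted."""
    return [value for path, value in leaf_items(obj) if path == wanted]


def first(obj, wanted):
    """Return the first matching value, or raise KeyError if nothing matches."""
    matches = (value for path, value in leaf_items(obj) if path == wanted)
    try:
        return next(matches)
    except StopIteration:
        raise KeyError(wanted)


x = {"a": {"b": {"3": 2, "43": 30, "c": [], "d": ["red", "buggy", "bumpers"]}}}
print(values(x, "a/b/43"))  # [30]
print(first(x, "a/b/43"))   # 30
```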

Doesn't allow dicts with keys that contain separator in them

I'm using the default separator of /. I can't retrieve anything if any keys have a / in them. I could change the separator but might run into another problem. I suspect that it could be modified to not care about what data is in the keys.

  File "/Users/marca/dev/surveymonkey/InventorySvc/.tox/py27/lib/python2.7/site-packages/dpath/path.py", line 57, in validate
    separator))
InvalidKeyName: /dev/sdc6 at ansible_facts/ohai_filesystem contains the separator /
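
As a sketch of how a lookup can avoid caring about the characters in keys: if the path is given as a sequence of keys instead of a single string, no separator is involved at all. The helper get_by_keys and the sample data (including the "mount" entry) are hypothetical and only loosely modelled on the traceback above.

```python
from functools import reduce
from operator import getitem


def get_by_keys(obj, keys):
    """Follow keys one by one; each key is used as-is, so it may contain '/'."""
    return reduce(getitem, keys, obj)


# Hypothetical data with a key that contains the default separator.
facts = {"ansible_facts": {"ohai_filesystem": {"/dev/sdc6": {"mount": "/data"}}}}
print(get_by_keys(facts, ["ansible_facts", "ohai_filesystem", "/dev/sdc6", "mount"]))  # /data
```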

Query for "entire structure?"

I assumed that there would be a way to return the entire dictionary somehow, but I can't seem to find one. Would it be useful?

ie:

x = {1:"first", 2:"second"}

dpath.util.get(x, "/") => return entire dictionary

Possible bug in search with multiple stars in path.

Consider the following input:

testdata = {
    "a": [
        {
            "b": [
                {"c": 1},
                {"c": 2},
                {"c": 3}
            ]
        }
    ]
}

testpath = "a/*/b/*/c"

I would expect search(testdata, testpath) to produce the following result(return the same object as the input):

{'a': [{'b': [{'c': 1}, {'c': 2}, {'c': 3}]}]}

Instead the result is:

{'a': [{'b': [{'c': 1}]},
       {'b': [{'c': 2}]},
       {'b': [{'c': 3}]}]}

Am I misunderstanding the behavior of search?
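
To make the expectation concrete, the following standalone sketch lists the leaf paths that the glob should match when each path component is compared separately. It is not dpath's implementation; leaf_paths and matching_paths are hypothetical names, and list indices are treated as string components.

```python
from fnmatch import fnmatchcase


def leaf_paths(obj, trail=()):
    """Yield (components, value) for each leaf, descending into dicts and lists."""
    if isinstance(obj, dict):
        children = obj.items()
    elif isinstance(obj, list):
        children = enumerate(obj)
    else:
        yield trail, obj
        return
    for key, value in children:
        for item in leaf_paths(value, trail + (str(key),)):
            yield item


def matching_paths(obj, glob, separator="/"):
    """Return (path, value) pairs whose components match the glob one by one."""
    patterns = glob.split(separator)
    result = []
    for components, value in leaf_paths(obj):
        if len(components) == len(patterns) and all(
            fnmatchcase(c, p) for c, p in zip(components, patterns)
        ):
            result.append((separator.join(components), value))
    return result


testdata = {"a": [{"b": [{"c": 1}, {"c": 2}, {"c": 3}]}]}
for path, value in matching_paths(testdata, "a/*/b/*/c"):
    print(path, value)
# a/0/b/0/c 1
# a/0/b/1/c 2
# a/0/b/2/c 3
```

All three matches sit under index 0 of "a", which is consistent with expecting a single element in the outer list of the result.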

Bug infinite loop in list merge

A test to show the bug:

def test_merge_list():
    src = {"l": [1]}
    p1 = {"l": [2], "v": 1}
    p2 = {"v": 2}
    dst1 = {}

    for d in [src, p1]:
        dpath.util.merge(dst1, d)
    dst2 = {}
    for d in [src, p2]:
        dpath.util.merge(dst2, d)
    dst3 = {}
    for d in [src, p2]:
        dpath.util.merge(dst3, d)

    assert dst1["l"] == [1, 2]
    assert dst2["l"] == [1]
    assert dst3["l"] == [1]

Unicode support

It seems dpath.util.merge cannot handle dictionaries with unicode keys. For example,

a = {'中': ['zhong']}
b = {'文': ['wen']}
dpath.util.merge(a, b)

The above code will fail with

UnicodeEncodeError: 'ascii' codec can't encode character u'\uXXXX' in position 0
: ordinal not in range(128)

Unicode strings in dict keys cause UnicodeEncodeError to be raised

>>> import dpath.path
>>> d = { u"Key Contains Unicode \u00af\u00f5": "value does not" }
>>> list(dpath.path.paths(d))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/danielpops/venv/local/lib/python2.7/site-packages/dpath/path.py", line 93, in paths
    validate(newpath)
  File "/home/danielpops/venv/local/lib/python2.7/site-packages/dpath/path.py", line 53, in validate
    strkey = str(key)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 21-22: ordinal not in range(128)

readme file example does not work

{'a': {'b': {'c': [], '3': 2, '43': 30, 'd': ['red', 'buggy', 'bumpers']}}}
ipdb> dpath.search(data, "a/b/[cd]")
{}

The ability to select multiple keys at the same level using the syntax as described in the readme file does not appear to work.

dpath.util.merge with dpath.util.MERGE_REPLACE doesn't work as expected

Case:

import dpath.util
dct_a = {"a": {"b": [1,2,3]}}
dct_b = {"a": {"b": [1]}}
dpath.util.merge(dct_a, dct_b, flags=dpath.util.MERGE_REPLACE)
dct_a

Expected:
{'a': {'b': [1]}}

Actual:
{'a': {'b': [1, 2, 3]}}
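
The difference between the two results can be sketched with a small standalone merge function. This is not dpath's code: the replace parameter is a stand-in for the MERGE_REPLACE flag, and "additive" is assumed here to mean extending the destination list.

```python
import copy


def merge(dst, src, replace=False):
    """Merge src into dst in place. Lists are extended unless replace is True."""
    for key, value in src.items():
        current = dst.get(key)
        if isinstance(value, dict) and isinstance(current, dict):
            merge(current, value, replace)  # recurse into nested dicts
        elif isinstance(value, list) and isinstance(current, list) and not replace:
            current.extend(copy.deepcopy(value))  # additive behaviour (assumed)
        else:
            dst[key] = copy.deepcopy(value)  # replace behaviour
    return dst


dct_a = {"a": {"b": [1, 2, 3]}}
dct_b = {"a": {"b": [1]}}
print(merge(dct_a, dct_b, replace=True))  # {'a': {'b': [1]}}
```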

dpath.util.get should not raise a KeyError if the value is None

Hi,

Consider the following example:

>>> d = {'p': {'a': {'t': {'h': 'value'}}}}
>>> dpath.util.get(d, 'p/a/t/h')
'value'
>>> d = {'p': {'a': {'t': {'h': None}}}}
>>> dpath.util.get(d, 'p/a/t/h')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  [snip]
    raise KeyError(glob)
KeyError: 'p/a/t/h'

I would expect the second call to return None.
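
A common way to tell "nothing found" apart from "found None" is a private sentinel object. The sketch below is standalone and assumes the search results arrive as an iterable of (path, value) tuples; get_single and _MISSING are hypothetical names.

```python
_MISSING = object()  # unique sentinel: cannot collide with any stored value


def get_single(pairs, glob):
    """Return the single value in pairs, an iterable of (path, value) tuples."""
    found = _MISSING
    for _path, value in pairs:
        if found is not _MISSING:
            raise ValueError("glob must match only one leaf: %s" % glob)
        found = value
    if found is _MISSING:
        raise KeyError(glob)
    return found


print(get_single([("p/a/t/h", None)], "p/a/t/h"))  # None
```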

ALLOW_EMPTY_STRING_KEYS doesn't work when skip=True

A code path exists (dpath.util.get) wherein ALLOW_EMPTY_STRING_KEYS ultimately interacts poorly with skip=True in the dpath.path.paths function.

Repro:

>>> import dpath.util
>>> import dpath.options
>>> dpath.options.ALLOW_EMPTY_STRING_KEYS=True
>>> d = {"": "", "Empty": { "": "value", "key": ""}}
>>> dpath.util.get(d, 'foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/nail/home/dpopes/venv/local/lib/python2.7/site-packages/dpath/util.py", line 103, in get
    for item in search(obj, glob, yielded=True, separator=separator):
  File "/nail/home/dpopes/venv/local/lib/python2.7/site-packages/dpath/util.py", line 144, in _search_yielded
    for path in _inner_search(obj, globlist, separator, dirs=dirs):
  File "/nail/home/dpopes/venv/local/lib/python2.7/site-packages/dpath/util.py", line 159, in _inner_search
    for path in dpath.path.paths(obj, dirs, leaves, skip=True):
  File "/nail/home/dpopes/venv/local/lib/python2.7/site-packages/dpath/path.py", line 90, in paths
    elif (skip and k[0] == '+'):
IndexError: string index out of range

Similarly, at the source:

>>> import dpath.path
>>> import dpath.options
>>> dpath.options.ALLOW_EMPTY_STRING_KEYS=True
>>> d = {"": "", "Empty": { "": "value", "key": ""}}
>>> list(dpath.path.paths(d, skip=True))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/nail/home/dpopes/venv/local/lib/python2.7/site-packages/dpath/path.py", line 90, in paths
    elif (skip and k[0] == '+'):
IndexError: string index out of range
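
The IndexError comes from indexing into an empty string. A sketch of a check that tolerates empty keys, assuming string keys and using the hypothetical helper name is_skipped:

```python
def is_skipped(key):
    """Return True for string keys with a leading '+', without indexing into the key."""
    return isinstance(key, str) and key.startswith("+")


print(is_skipped(""))       # False
print(is_skipped("+meta"))  # True
print(is_skipped("key"))    # False
```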

dpath.util.merge needlessly expands lists during merge

While contemplating answers for #5 and #6, I discovered this bit of unhappy:

>>> for path in dpath.path.search(data, ['**'], dirs=False):
...     print dpath.path.get(data, path, view=True)
...
{'a': {'b': {'d': ['red']}}}
{'a': {'b': {'d': [None, 'buggy']}}}
{'a': {'b': {'d': [None, None, 'bumpers']}}}
{'a': {'b': {'3': 2}}}
{'a': {'b': {'43': 30}}}
>>> for path in dpath.path.search(data, ['**'], dirs=False):
...     dpath.util.merge(res, dpath.path.get(data, path, view=True))
...
>>> print json.dumps(res, indent=4, sort_keys=True)
{
    "a": {
        "b": {
            "3": 2,
            "43": 30,
            "d": [
                "red",
                null,
                null,
                null,
                null,
                "bumpers"
            ]
        }
    }
}

... obviously this is wrong. fix it.

Returning default value

Hey, can you please update the code so that the get function returns a default value if an object is not found?

The "get" function in dpath/util.py could be changed as shown below to achieve this:

def get(obj, glob, separator="/", default=None):
    """
    Given an object which contains only one possible match for the given glob,
    return the value for the leaf matching the given glob.
    If more than one leaf matches the glob, ValueError is raised. If the glob is
    not found, KeyError is raised.
    """
    ret = None
    for item in search(obj, glob, yielded=True, separator=separator):
        if ret is not None:
            raise ValueError("dpath.util.get() globs must match only one leaf : %s" % glob)
        ret = item[1]
    if ret is None:
        if (default != None):
            ret = default
        else:
            raise KeyError(glob)
    return ret
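
One limitation of using None as the "no default" marker is that a caller cannot ask for None as the default. A possible variation, sketched standalone with hypothetical names (get_or_default, _NO_DEFAULT) and a plain dict lookup standing in for the real get, wraps any lookup that raises KeyError:

```python
_NO_DEFAULT = object()  # marks "caller supplied no default"


def get_or_default(lookup, glob, default=_NO_DEFAULT):
    """Call lookup(glob); on KeyError return default if one was given, else re-raise."""
    try:
        return lookup(glob)
    except KeyError:
        if default is _NO_DEFAULT:
            raise
        return default


data = {"a/b": 1}  # flat stand-in for a real path lookup
print(get_or_default(data.__getitem__, "a/b"))                    # 1
print(get_or_default(data.__getitem__, "missing", default=None))  # None
```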

Loose typechecking for MutableMapping objects in _check_typesafe

It would be nice if _check_typesafe could check whether obj1[key] and obj2[key] are functional equivalents, not just exact type equivalents, particularly for the various kinds of dict-like objects.

When attempting to merge two dicts, I encountered an issue where an object subclassing dict in src could not be merged into a dict object in dst. I think checking whether both objects are instances of MutableMapping may be sufficient to cover most cases. It seems to me that two dict-like objects should continue the recursive merge and not force a wholesale replacement of dst by src.

Perhaps an additional flag could be added to permit this loose functional equivalence matching or at the least MutableMapping instance matching. As far as the resulting object type, I feel that merge should keep the dst's current object type, but perhaps that could be configured by a flag.
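
A rough sketch of what such a loose check might look like. The function types_compatible and the Config subclass are hypothetical and only illustrate the idea; this is not the body of _check_typesafe.

```python
from collections.abc import MutableMapping


def types_compatible(a, b):
    """Treat any two mutable mappings as compatible; otherwise require the same type."""
    if isinstance(a, MutableMapping) and isinstance(b, MutableMapping):
        return True
    return type(a) is type(b)


class Config(dict):
    """Example dict subclass, as might appear in src."""


print(type(Config()) is type({}))      # False: the exact type check rejects the pair
print(types_compatible(Config(), {}))  # True: both are mutable mappings
print(types_compatible({}, []))        # False: a dict and a list still differ
```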