
Comments (3)

shawnbrown commented on June 12, 2024

You are correct about the accepted.tolerance() behavior: it is only applied to direct child values, not to values within nested dictionaries. A workaround is to convert the nested dictionaries into a flat dictionary with composite keys.

Here's a function that converts nested dictionaries into a flat dictionary with composite tuple keys:

def flatten(d, parent_key=()):
    """Helper function to flatten nested dictionaries."""
    items = []
    for k, v in d.items():
        # Build the path as a tuple so multi-character string keys stay whole.
        path = parent_key + (k,)
        if isinstance(v, dict):
            items.extend(flatten(v, path).items())
        else:
            # Top-level keys are kept as-is; nested keys become tuples.
            items.append((path if parent_key else k, v))
    return dict(items)

Using the function above, you could flatten the dictionaries like so:

>>> dict1 = {"x": {"a": 0.99, "b": 2.0}, "y": 3.0}
>>> dict2 = {"x": {"a": 1.0, "b": 1.99}, "y": 2.99}
>>> flatten(dict1)
{('x', 'a'): 0.99, ('x', 'b'): 2.0, 'y': 3.0}
>>> flatten(dict2)
{('x', 'a'): 1.0, ('x', 'b'): 1.99, 'y': 2.99}

This would let you change your sample code to the following:

import pytest
from datatest import validate, accepted, ValidationError


def flatten(d, parent_key=()):
    """Helper function to flatten nested dictionaries."""
    items = []
    for k, v in d.items():
        # Build the path as a tuple so multi-character string keys stay whole.
        path = parent_key + (k,)
        if isinstance(v, dict):
            items.extend(flatten(v, path).items())
        else:
            # Top-level keys are kept as-is; nested keys become tuples.
            items.append((path if parent_key else k, v))
    return dict(items)


def test_datatest():
    dict1 = {"x": {"a": 0.991, "b": 2.0}, "y": 3.0}
    dict2 = {"x": {"a": 1.0, "b": 1.991}, "y": 2.991}

    with accepted.tolerance(0.01):
        validate(flatten(dict1), flatten(dict2))  # <- Flattened for validation.

# NOTE: I changed the `.99`s in this sample code because
# the floating point math was giving me a difference of
# `0.010000000000000009` (outside the accepted tolerance).
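
To see the floating point issue from the note above in isolation, here is a small sketch that uses only plain Python arithmetic. It does not call datatest and makes no claim about how datatest computes deviations internally; the variable names are just for illustration.

```python
# Plain float subtraction, independent of datatest.
tolerance = 0.01

diff_original = 1.0 - 0.99    # values from the first example
diff_adjusted = 1.0 - 0.991   # values from the adjusted test

# Neither 0.99 nor 0.01 has an exact binary representation, so the
# computed difference lands slightly above the intended 0.01.
print(diff_original)
print(diff_original <= tolerance)  # False
print(diff_adjusted <= tolerance)  # True
```

If exact decimal behavior matters for your data, a slightly wider tolerance (or values that are not right on the boundary) avoids this kind of surprise.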

I like the idea of validating nested dictionary values directly, but the implementation gets more complex than it might initially seem. Since ValidationError differences reflect the structure of the tested data, nested dictionaries would mean nested difference handling. At this time, the internal acceptance machinery is not set up to handle this sort of thing, and in combination with accepted.count() it would have resulted in non-deterministic behavior when running on older versions of Python. This is because it was written to support versions of Python that didn't guarantee a stable dictionary order.

That said, future versions of datatest will drop support for those old versions of Python and direct validation of nested values should be possible. But that's not something I can add in the short term. For now, the dictionaries will need to be flattened for validation.

This is a good question though and I should definitely add a page to the How-to Guide that addresses this use case.

from datatest.

shawnbrown commented on June 12, 2024

A different flatten() function could combine the keys into a single string value. Doing this is less precise than the tuple-keys version shown previously, but many use cases don't need to preserve the keys exactly, and the result can be more readable:

def flatten(d, parent_key="", sep="."):
    """Helper function to flatten nested dictionaries."""
    items = []
    for k, v in d.items():
        new_key = f"{parent_key}{sep}{k}" if parent_key else k
        if isinstance(v, dict):
            items.extend(flatten(v, new_key, sep=sep).items())
        else:
            items.append((new_key, v))
    return dict(items)

This function would give more compact keys:

>>> dict1 = {"x": {"a": 0.99, "b": 2.0}, "y": 3.0}
>>> dict2 = {"x": {"a": 1.0, "b": 1.99}, "y": 2.99}
>>> flatten(dict1)
{'x.a': 0.99, 'x.b': 2.0, 'y': 3.0}
>>> flatten(dict2)
{'x.a': 1.0, 'x.b': 1.99, 'y': 2.99}
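
As a rough illustration of what "less precise" can mean here, the sketch below repeats the string-key flatten() and feeds it a made-up dictionary (the `nested` value is hypothetical, not from the original question) in which a top-level key already contains the separator. Two different paths then collapse into the same string key, and one value is silently lost:

```python
def flatten(d, parent_key="", sep="."):
    """Helper function to flatten nested dictionaries."""
    items = []
    for k, v in d.items():
        new_key = f"{parent_key}{sep}{k}" if parent_key else k
        if isinstance(v, dict):
            items.extend(flatten(v, new_key, sep=sep).items())
        else:
            items.append((new_key, v))
    return dict(items)


# Hypothetical input: the top-level key "x.a" collides with the path x -> a.
nested = {"x.a": 1.0, "x": {"a": 2.0}}

collided = flatten(nested)
print(collided)  # {'x.a': 2.0}  (the value 1.0 was overwritten)

# Choosing a separator that never appears in the keys avoids the collision.
safe = flatten(nested, sep="/")
print(safe)  # {'x.a': 1.0, 'x/a': 2.0}
```

If the keys in your data never contain the separator, this is not a concern and the string version works fine.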


teese commented on June 12, 2024

Thanks @shawnbrown for the excellent, fast response. The workaround with flatten() that combined the keys into a single string was perfect for my use-case.
It's no problem if you want to close this issue, preferably after updating the documentation :).
