
getschema's People

Contributors

daigotanaka, mlavoie-sm360

getschema's Issues

Improved handling for None values

I suspect that changes made since 0.2.0 have introduced what I would consider a regression, though it may be intended behavior. I'd like to know your thoughts.

After updating to 0.2.4, I noticed "None" strings and similar defaults in my data: values that I expected to remain None had become 0 for integers, False for booleans, and "None" for strings. I did a code comparison and spotted new behavior that also had a matching test (test_invalid_obj_type).

Essentially, when a value is None, the new code fills in a default, such as 0 or False. But if the schema allows ["null", "integer"], does that not mean that a None value should be left as is? If, on the other hand, the schema were ["integer"] only, would we then want None to be converted to the default value for the type (0, False, etc.)?

Maybe my understanding of how the schema works is at fault.
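
The behavior proposed in this issue can be sketched as follows. This is only an illustration, not getschema's implementation: the helper name resolve_none and the DEFAULTS table (including the empty-string default for strings) are assumptions made for the example.

```python
# Hypothetical per-type defaults; the empty string for "string" is an assumption.
DEFAULTS = {"integer": 0, "number": 0.0, "boolean": False, "string": ""}


def resolve_none(value, schema_type):
    """Return value unchanged unless it is None.

    A None is kept when the schema type allows "null"; otherwise it is
    replaced by the default of the first type that has one.
    """
    # JSON Schema allows "type" to be a single string or a list of strings.
    types = [schema_type] if isinstance(schema_type, str) else list(schema_type)
    if value is not None:
        return value
    if "null" in types:
        return None  # nullable: leave the None alone
    for type_name in types:
        if type_name in DEFAULTS:
            return DEFAULTS[type_name]
    raise ValueError(f"No default available for types {types}")


kept = resolve_none(None, ["null", "integer"])   # stays None
filled = resolve_none(None, ["integer"])         # becomes 0
```

With this rule, a default is only substituted when the schema gives no way to represent a missing value.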

Incorrect handling of nullable

Schema:

{
  "type": "object",
  "properties": {
    "index": {
      "type": ["null", "integer"]
    },
    "array": {
      "type": ["null", "array"],
      "items": {"type": ["null", "number"]}
    },
    "nested_field": {
      "type": ["null", "object"],
      "properties": {
        "some_prop": {"type": ["null", "integer"]}
      }
    },
    "boolean_field": {"type": ["null", "boolean"]}, 
    "another_boolean_field": {"type": ["null", "boolean"]}
  }
}

Test:

null_index = {
    "index": None,
    "array": [
        "1",
    ],
    "nested_field": {
        "some_prop": "3",
    },
    "boolean_field": None,
    "another_boolean_field": True,
}

This fails with ValueError: Null object given at ['properties', 'index'], even though index should be able to take a null value.
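
As a rough cross-check of the expectation in this report, the sketch below walks a record against the schema and lists only those paths where a None appears but "null" is not among the allowed types. The helpers allows_null and null_violations are hypothetical names for this example and are not part of getschema.

```python
def allows_null(node):
    """True if the schema node lists "null" among its types."""
    declared = node.get("type", [])
    types = [declared] if isinstance(declared, str) else declared
    return "null" in types


def null_violations(obj, node, path=()):
    """Return schema paths where obj holds None but the schema forbids it."""
    if obj is None:
        return [] if allows_null(node) else [list(path)]
    found = []
    if isinstance(obj, dict):
        for key, child in node.get("properties", {}).items():
            if key in obj:
                found += null_violations(obj[key], child, path + ("properties", key))
    elif isinstance(obj, list):
        for item in obj:
            found += null_violations(item, node.get("items", {}), path + ("items",))
    return found


schema = {
    "type": "object",
    "properties": {
        "index": {"type": ["null", "integer"]},
        "array": {"type": ["null", "array"], "items": {"type": ["null", "number"]}},
        "nested_field": {
            "type": ["null", "object"],
            "properties": {"some_prop": {"type": ["null", "integer"]}},
        },
        "boolean_field": {"type": ["null", "boolean"]},
        "another_boolean_field": {"type": ["null", "boolean"]},
    },
}
record = {
    "index": None,
    "array": ["1"],
    "nested_field": {"some_prop": "3"},
    "boolean_field": None,
    "another_boolean_field": True,
}

violations = null_violations(record, schema)
```

Under this reading of the schema, the record above contains no disallowed nulls, so violations is empty.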

getschema.fix_type function: A valid null object is rejected

v0.2.6

fix_type does not handle a null object (a None where a dictionary is expected) correctly.

This test:

import getschema

null_entries = {
    "index": None,
    "array": [
        "1.5",
        None,
    ],
    "nested_field": None,
    "boolean_field": None,
    "number_field": None,
    "string_field": None,
}

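def test_reject_null_object():
    # NOTE: `records` is not defined in this report; it is assumed to be
    # sample data from which a schema with nullable fields is inferred.
    schema = getschema.infer_schema(records)
    # With nullable types in the schema, this passes.
    _ = getschema.fix_type(null_entries, schema)

    # Make nested_field non-nullable; fix_type should now reject the None.
    schema["properties"]["nested_field"]["type"] = ["object"]
    try:
        _ = getschema.fix_type(null_entries, schema)
    except Exception as e:
        assert str(e).startswith("Null object given at")
    else:
        raise Exception("Supposed to fail with null value")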

...will fail with:

KeyError: "property type (object) Expected a dict object.Got: <class 'NoneType'>...
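
One way to get the error the test expects is to perform the null check before the dict check. The sketch below illustrates that ordering only; check_object is a hypothetical helper, not the code in getschema, and the messages are copied from the errors quoted in this report.

```python
def check_object(obj, types, path):
    """Validate a value against an object-typed schema node.

    The None case is handled first, so a disallowed null is reported as a
    null error rather than as a "not a dict" error.
    """
    if obj is None:
        if "null" in types:
            return None  # nullable object: None is valid
        raise ValueError(f"Null object given at {path}")
    if type(obj) is not dict:
        raise KeyError(
            f"property type (object) Expected a dict object.Got: {type(obj)} {obj}"
        )
    return obj


path = ["properties", "nested_field"]
accepted = check_object(None, ["null", "object"], path)  # returns None

try:
    check_object(None, ["object"], path)
except ValueError as error:
    message = str(error)
```

Here message begins with "Null object given at", which is what the assertion in the test above checks for.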

Error when dict object is empty

Line 268 of impl.py reads:

if not (type(obj) is dict and obj.keys()):

With this check, an exception is raised when obj is a dict but obj.keys() is falsy, which is the case for an empty dict.

This causes my tap to fail even though it should run correctly. In my data, one field is nullable (it's defined in the schema as 'integrations': {'type': ['null', 'object']}); it is sometimes populated and sometimes an empty dict ({}).

This causes the somewhat misleading exception to be raised:

KeyError: "property type (object) Expected a dict object.Got: <class 'dict'> {}"

Is the extra check for obj.keys() necessary? By removing it, I can successfully run the tap, since the empty dict doesn't cause any issues.
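
The difference between the two conditions can be shown in isolation. This is a minimal sketch, assuming the check is used only to decide whether obj is an acceptable dict; the function names are made up for the example and do not appear in impl.py.

```python
def passes_current_check(obj):
    """Mirrors the quoted condition: a dict that also has at least one key."""
    # An empty keys view is falsy, so {} fails this check.
    return type(obj) is dict and bool(obj.keys())


def passes_relaxed_check(obj):
    """Same condition with the obj.keys() test removed."""
    return type(obj) is dict


empty_current = passes_current_check({})         # False: {} is rejected
empty_relaxed = passes_relaxed_check({})         # True: {} is accepted
filled_current = passes_current_check({"a": 1})  # True either way
```

Both versions still reject non-dict values such as None or a list; only the handling of the empty dict changes.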
