
strictyaml's Introduction

StrictYAML

StrictYAML is a type-safe YAML parser that parses and validates a restricted subset of the YAML specification.

Priorities:

  • Beautiful API
  • Refusing to parse the ugly, hard-to-read and insecure features of YAML, such as the implicit typing behind the Norway problem.
  • Strict validation of markup and straightforward type casting.
  • Clear, readable exceptions with code snippets and line numbers.
  • Acting as a near drop-in replacement for pyyaml, ruamel.yaml or poyo.
  • Ability to read in YAML, make changes and write it out again with comments preserved.
  • Not speed, currently.

Simple example:

# All about the character
name: Ford Prefect
age: 42
possessions:
- Towel

from strictyaml import load, Map, Str, Int, Seq, YAMLError

Default parse result:

>>> load(yaml_snippet)
YAML({'name': 'Ford Prefect', 'age': '42', 'possessions': ['Towel']})

All data is string, list or OrderedDict:

>>> load(yaml_snippet).data
{'name': 'Ford Prefect', 'age': '42', 'possessions': ['Towel']}

Quickstart with schema:

from strictyaml import load, Map, Str, Int, Seq, YAMLError

schema = Map({"name": Str(), "age": Int(), "possessions": Seq(Str())})

42 is now parsed as an integer:

>>> person = load(yaml_snippet, schema)
>>> person.data
{'name': 'Ford Prefect', 'age': 42, 'possessions': ['Towel']}

A YAMLError will be raised if there are syntactic problems, violations of your schema or use of disallowed YAML features:

# All about the character
name: Ford Prefect
age: 42

For example, a schema violation:

try:
    person = load(yaml_snippet, schema)
except YAMLError as error:
    print(error)

while parsing a mapping
  in "<unicode string>", line 1, column 1:
    # All about the character
     ^ (line: 1)
required key(s) 'possessions' not found
  in "<unicode string>", line 3, column 1:
    age: '42'
    ^ (line: 3)

If parsed correctly:

from strictyaml import load, Map, Str, Int, Seq, YAMLError, as_document

schema = Map({"name": Str(), "age": Int(), "possessions": Seq(Str())})

You can modify values and write out the YAML with comments preserved:

person = load(yaml_snippet, schema)
person['age'] = 43
print(person.as_yaml())

# All about the character
name: Ford Prefect
age: 43
possessions:
- Towel

As well as look up line numbers:

>>> person = load(yaml_snippet, schema)
>>> person['possessions'][0].start_line
5

And construct YAML documents from dicts or lists:

print(as_document({"x": 1}).as_yaml())

x: 1

Install

$ pip install strictyaml

Why StrictYAML?

There are a number of formats and approaches that can achieve more or less the same purpose as StrictYAML. I've tried to make it the best one. Below is a series of documented justifications:

Using StrictYAML

How to:

Compound validators:

Scalar validators:

Restrictions:

Design justifications

There are some design decisions in StrictYAML which are controversial and/or not obvious. Those are documented here:

Star Contributors

  • @wwoods
  • @chrisburr
  • @jnichols0

Other Contributors

  • @eulores
  • @WaltWoods
  • @ChristopherGS
  • @gvx
  • @AlexandreDecan
  • @lots0logs
  • @tobbez
  • @jaredsampson
  • @BoboTIG

StrictYAML also includes code from ruamel.yaml, Copyright Anthon van der Neut.

Contributing

  • Before writing any code, please read the tutorial on contributing to hitchdev libraries.
  • Before writing any code, if you're proposing a new feature, please raise it on GitHub. If it's an existing feature / bug, please comment and briefly describe how you're going to implement it.
  • All code needs to come accompanied with a story that exercises it or a modification to an existing story. This is used both to test the code and build the documentation.

strictyaml's People

Contributors

bobotig, chrisburr, christophergs, crdoconnor, danilomendesdias, ezag, gvx, holmboe, jaredsampson, jnichols0, lots0logs, marhar, psanan, scooter-dangle, tfuzeau, tharvik, tobbez, waldyrious, wwoods

strictyaml's Issues

FEATURE: schema from data

Provide a way to construct a schema from data. This would make it possible to store the schema as a YAML file. I have a specific use case for this in the project that I am currently developing. Roughly speaking, there are several tiers of configurability, the first being the data and the second, among other things, being the schema for that data. Currently I am using Cerberus for that purpose, but since I intend to use strictyaml anyway, it feels a bit awkward to have two validators in the codebase.
Of course, I could build this functionality on top of strictyaml myself, but then I would need to both keep it in sync with strictyaml (should any new validators be added in the future) and write the end-user documentation on the "schema schema", whereas with Cerberus I can rely on existing documentation and just refer the end user to it.

Remove dependency on ruamel.yaml

strictyaml is underpinned by ruamel.yaml - this is a dependency used to parse the YAML. It parses the YAML into an abstract syntax tree which is then run through the validator. ruamel.yaml is a YAML 1.2 parser and as such, contains an awful lot of cruft related to the complications involved in adhering to the spec. StrictYAML uses only a limited subset of its features.

Owing to the complications involved in parsing the YAML spec, the code has grown rather large and complicated, so it would be preferable if strictyaml handled parsing and roundtripping itself. There is also an unfortunate dependency on the internal workings of the library, which has led to breakages in the past.

This refactoring should not change the behavior of the library itself, but simply remove the dependency on ruamel.yaml.

[Question] Get value of key of type Enum as string?

Getting the value of a key of type Enum as a string does not feel intuitive to me. Am I doing it correctly?

from strictyaml import Map, Enum, load, YAML

yaml_snippet = """
a: A
"""
schema = Map({"a": Enum(["A", "B", "C"])})
parsed = load(yaml_snippet, schema)
# parsed is YAML(OrderedDict([('a', YAML(A))]))
# parsed['a'].value is YAML(A)
# parsed['a'].value.value is 'A'
# parsed['a'].data.data is 'A'

Support request: How to pprint?

I have parsed a strictyaml document into a strictyaml object, like so:

import strictyaml
data = None
with open("doc.strictyaml", 'r') as yfile:
    data = strictyaml.load(yfile.read())

I wish to print data with pprint

import pprint
pprint.pprint(data)

but it does not format prettily because it is a YAML object, and pprint does not know what that is. When I use data = data.value, data becomes a dictionary of YAML objects.

How can I convert a strictyaml object into an object that is made of only dictionaries, lists, and strings?

Exception when accessing parsed YAML data using key other than Str() for MapPattern

Attempting to retrieve the OrderedDict form of data from a YAML object results in an exception if the schema contains a MapPattern with a key other than Str():

Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from strictyaml import *
>>> my_dict = {10.25: 23, 20.33: 76}
>>> schema = MapPattern(Float(), Int())
>>> yaml = as_document(my_dict, schema)
>>> yaml.as_yaml()
'10.25: 23\n20.33: 76\n'
>>> yaml.data
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\venv\lib\site-packages\strictyaml\representation.py", line 114, in data
    mapping[key.data] = value.data
AttributeError: 'str' object has no attribute 'data'

Changing the key from Float() to Str() and then converting separately would be a less than ideal workaround.

Environment:

  • Python 3.7.3
  • strictyaml 1.0.1
  • ruamel.yaml 0.15.96

Scalars as mapping key are interpreted as non-strings without validator

strictyaml.load("""
42: 42
2.0: 2.0
null: null
2016-02-01: 2016-02-01
true: true
""")

The result is {True: 'True', 42: '42', None: '', 2.0: '2.0', datetime.date(2016, 2, 1): '2016-02-01'}

Additionally, null is parsed as the empty string when it is a mapping or sequence value, and true and false are capitalized.

[query] What is the set relationship between StrictYAML and JSON?

It is clear that:

  • StrictYAML is a subset of YAML
  • JSON is a subset of YAML

...but what is the relationship between StrictYAML and JSON?

  1. All JSON is a subset of StrictYAML
  2. There is some meaningful overlap ("StrictJSON", readable by both StrictYAML and JSON parsers)
  3. There is some overlap, but it doesn't have practical application (readable, but missing essential structures for common use cases)
  4. There is no overlap
  5. Something else?

...if it is 2, is there an easy way to check that a given StrictYAML string is also JSON parseable? This would allow me to use one library (StrictYAML) to generate and parse all my JSON and YAML data.
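
Independently of StrictYAML, one way to check a particular string is to hand it to the standard library's json module and see whether it parses. The sketch below assumes a hypothetical helper name, is_json_parseable, and says nothing about whether StrictYAML itself would accept the same text:

```python
import json


def is_json_parseable(text):
    """Return True if `text` is a valid JSON document (hypothetical helper)."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


# Block-style YAML is not JSON; a JSON object literal is.
print(is_json_parseable("name: Ford Prefect\nage: 42\n"))
print(is_json_parseable('{"name": "Ford Prefect", "age": 42}'))
```

A check like this only answers the JSON half of the question; the same string would still have to be loaded with StrictYAML to confirm it is accepted there too.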

Indent should be consistent

  • Feature request

How many indent spaces should I use?

I encounter many YAML files: some indent with 4 spaces, some with 2 spaces,
some indent list items (the - entries) by 2 spaces and others do not.
I get confused when these styles are mixed:

example:
  indent: 2 spaces
  items:
  - one
  - two
  others:
    - three
    - four
  inner:
      indent: 4 spaces
      items:
      - five
      - six
      others:
          - seven
          - eight

Add support to load a yaml from a file

While ruamel.yaml supports any kind of stream for its load method, this is not the case for strictyaml, which currently checks the following:

    if stream_type not in ("<type 'unicode'>", "<type 'str'>", "<class 'str'>"):
        raise TypeError("StrictYAML can only read a string of valid YAML.")

I think strictyaml.load should implicitly accept all the types that are supported by the underlying call to ruamel.yaml.load.

I also strongly believe you should avoid this kind of type comparison and use isinstance instead. Also consider adapting validators.py, where types are checked using ==, which is clearly a bad idea ;-)
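
As a rough illustration of the isinstance approach, here is a minimal sketch of an input check that accepts either a string or a readable stream. The helper name read_yaml_source is hypothetical and is not part of strictyaml; decoding bytes as UTF-8 is also an assumption:

```python
import io


def read_yaml_source(source):
    """Return the text of `source`, which may be a str or a file-like object.

    Hypothetical helper, not part of strictyaml.
    """
    if isinstance(source, str):  # also accepts subclasses of str
        return source
    if hasattr(source, "read"):  # duck-typed check for streams
        text = source.read()
        if isinstance(text, bytes):
            text = text.decode("utf-8")  # assumption: streams are UTF-8 encoded
        return text
    raise TypeError(
        "expected a string or a readable stream, got {0}".format(type(source).__name__)
    )


print(read_yaml_source("a: 1"))
print(read_yaml_source(io.StringIO("b: 2")))
```

Unlike comparing the string form of type(...), isinstance also accepts subclasses of str, and the hasattr check covers any object that behaves like a file.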

Or validation is not working as intended

When you load a YAML file and your schema has an Or validator, you cannot change the type of the data afterwards.

Example:

from strictyaml import Map, Bool, Int, YAMLValidationError, load

schema = Map({"a": Bool() | Int()})
data = load('a: yes', schema)
data['a'] = 5

*** strictyaml.exceptions.YAMLValidationError: when expecting a boolean value (one of "yes", "true", "on", "1", "y", "no", "false", "off", "0", "n")
found an arbitrary integer
  in "None", line 1, column 1:
    '5'
     ^ (line: 1)

I found a workaround. What you actually need to do is change the validator:

from strictyaml import Map, Bool, Int, YAMLValidationError, load

schema = Map({"a": Bool() | Int()})
data = load('a: yes', schema)
data['a']._validator = Int()
data['a'] = 5
assert data['a'] == 5

But I would say that this doesn't work as intended.

Loading to named tuple

Hi,
Thanks for an interesting project - we live in a very strange world, where something as basic as a human-readable file format has no good and popular solution.
StrictYAML looks like a good candidate.
I have a small question/feature request - is it possible to load a StrictYAML file not into an OrderedDict, but into a named tuple?
There are two reasons for this:

  1. It's immutable - for example, it's safer for a global config.
  2. It's more readable: you can write person.name instead of person['name']
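
Until something like this exists in the library, a plain dict (for example the .data of a parsed document) can be converted after loading. The sketch below uses a hypothetical helper, to_namedtuple, and assumes that every key is a valid Python identifier:

```python
from collections import namedtuple


def to_namedtuple(name, mapping):
    """Recursively convert a dict into an immutable namedtuple (hypothetical helper).

    Keys must be valid Python identifiers; lists become tuples.
    """
    def convert(key, value):
        if isinstance(value, dict):
            return to_namedtuple(key, value)
        if isinstance(value, list):
            return tuple(convert(key, item) for item in value)
        return value

    record_type = namedtuple(name, list(mapping.keys()))
    return record_type(**{key: convert(key, value) for key, value in mapping.items()})


person = to_namedtuple(
    "person", {"name": "Ford Prefect", "age": 42, "possessions": ["Towel"]}
)
print(person.name)          # attribute access instead of person['name']
print(person.possessions)   # lists are converted to tuples
```

Assigning to person.age afterwards raises AttributeError, which gives the immutability asked for in point 1.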

New Mapping Type

Create a new kind of mapping validator where it looks for one key and then once it sees that key it expects several others (some may be optional).

Primary use case is here : https://github.com/hitchdev/seleniumdirector/blob/master/seleniumdirector/webdirector.py where the following groups of keys in the mapping are acceptable:

  • id, in iframe, which, but parent, subelements (in iframe, which, but parent, subelements) all optional
  • class, in iframe, which, but parent, subelements (ditto, all optional)
  • attribute, in iframe, which, but parent, subelements (ditto, all optional)
  • text is, in iframe, which, but parent, subelements (ditto, all optional)
  • text contains, in iframe, which, but parent, subelements (ditto, all optional)

However, id, class, attribute, text is and text contains should never be seen together. With the current validator, the only way to express this is to make every key optional.
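
The rule being asked for can be sketched in plain Python, outside of any strictyaml validator. The function name check_selector_mapping is hypothetical, and requiring exactly one selector key is an assumption drawn from the description above:

```python
SELECTOR_KEYS = {"id", "class", "attribute", "text is", "text contains"}
OPTIONAL_KEYS = {"in iframe", "which", "but parent", "subelements"}


def check_selector_mapping(mapping):
    """Raise ValueError unless exactly one selector key is present (hypothetical check)."""
    keys = set(mapping)
    selectors = sorted(keys & SELECTOR_KEYS)
    if len(selectors) != 1:
        raise ValueError(
            "expected exactly one of {0}, found {1}".format(sorted(SELECTOR_KEYS), selectors)
        )
    unexpected = sorted(keys - SELECTOR_KEYS - OPTIONAL_KEYS)
    if unexpected:
        raise ValueError("unexpected key(s): {0}".format(unexpected))
    return selectors[0]


print(check_selector_mapping({"id": "login", "in iframe": "main"}))
```

A mapping containing both id and class, or none of the selector keys, is rejected, while the four shared keys stay optional.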

Raise exception on bad list items

Issue raised by @Carpetsmoker in #23:

foo:
  - one
  - two
    - three

is (yaml module):

{'foo': ['one', 'two - three']}

or (strictyaml module):

YAML(OrderedDict([('foo', ['one', 'two - three'])]))

Desired outcome: some kind of exception. Maybe prevent multiline list items unless | is used.

Input and comments on this issue welcomed.

Support for nested maps?

Does strictyaml support nested maps like this one? Is the number of nesting levels restricted?

a_nested_map:
  key: value
  another_key: Another Value
  another_nested_map:
    hello: hello

Empty document not allowed to be parsed with all-optional fields schema

Found when working on gjcarneiro/yacron#19:

When there is a schema, and all outer fields are optional, strictyaml doesn't allow parsing the empty document. Example:

>>> from strictyaml import Map, Str, Optional as Opt
>>> import strictyaml
>>> strictyaml.load("", Map({Opt("foo"): Str()}))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/gjc/projects/yacron/env/lib/python3.6/site-packages/strictyaml-0.11.7-py3.6.egg/strictyaml/parser.py", line 276, in load
    return schema(YAMLChunk(document, label=label))
  File "/home/gjc/projects/yacron/env/lib/python3.6/site-packages/strictyaml-0.11.7-py3.6.egg/strictyaml/validators.py", line 14, in __call__
    self.validate(chunk)
  File "/home/gjc/projects/yacron/env/lib/python3.6/site-packages/strictyaml-0.11.7-py3.6.egg/strictyaml/compound.py", line 99, in validate
    items = chunk.expect_mapping()
  File "/home/gjc/projects/yacron/env/lib/python3.6/site-packages/strictyaml-0.11.7-py3.6.egg/strictyaml/yamllocation.py", line 87, in expect_mapping
    "found {0}".format(self.found())
  File "/home/gjc/projects/yacron/env/lib/python3.6/site-packages/strictyaml-0.11.7-py3.6.egg/strictyaml/yamllocation.py", line 34, in expecting_but_found
    self
strictyaml.exceptions.YAMLValidationError: when expecting a mapping
found a blank string
  in "<unicode string>", line 1, column 1:
    ''
     ^ (line: 1)
>>> 

Optional default can’t be empty list

I’m unsure whether this is a bug report or a feature request (i.e. whether the current behaviour is by design).

My desired behaviour is this: I would like to specify an optional map element that can be either a non-empty sequence or a non-empty comma-separated list. If the element isn’t specified, it should default to an empty list.

To clarify, here are valid examples:

a:
  - foo
b: bar

b: bar

And this is an invalid example:

a:
b: bar

Currently, the following works (from strictyaml import *):

schema = Map({Optional('a', None): Seq(Str()), 'b': Str()})
list(map(lambda t: load(t, schema), ['a:\n  - x\nb:\n', 'b:\n']))

However, when changing the default to [], it fails:

schema = Map({Optional('a', []): Seq(Str()), 'b': Str()})

InvalidOptionalDefault: Optional default for 'a' failed validation:
Expected a non-empty list, found an empty list.
Use EmptyList validator to serialize empty lists.

Likewise, the following fails, with a different error, at a different time:

schema = Map({Optional('a', []): CommaSeparated(Str()), 'b': Str()})
list(map(lambda t: load(t, schema), ['a: x\nb:\n', 'b:\n']))

IndexError: string index out of range (in strictyaml/utils.py L96)

As suggested in the first error message above the following works, but it describes a different schema (namely, it allows an existing, empty a)!

schema = Map({Optional('a', list()): EmptyList() | Seq(Str()), 'b': Str()})

Support question: OpenAPI 3.0.x flow mapping workarounds

Thanks for the great library! I'm hoping to use this instead of pyyaml when parsing OpenAPI 3.0.x documents, but I'm running into an issue with the Security Requirements Object of a path item.

From that link, emphasis mine:

Each name MUST correspond to a security scheme which is declared in the Security Schemes under the Components Object. If the security scheme is of type "oauth2" or "openIdConnect", then the value is a list of scope names required for the execution. For other security scheme types, the array MUST be empty.

This means I've got paths like this:

paths:
  /users/{userId}/widgets/{widgetId}/revisions:
    post:
      operationId: createWidgetRevision
      security:
        - apiToken: []

Which fails as expected with FlowMappingDisallowed:

$ ipython
Python 3.6.5 (default, Apr  1 2018, 15:30:28) 
Type 'copyright', 'credits' or 'license' for more information
IPython 6.5.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from strictyaml import load

In [2]: doc="""paths:
   ...:   /users/{userId}/widgets/{widgetId}/revisions:
   ...:     post:
   ...:       operationId: createWidgetRevision
   ...:       security:
   ...:         - apiToken: []"""

In [3]: load(doc)
FlowMappingDisallowed [traceback clipped]

Any recommendations? There's also a commonly-used form for an unauthenticated method with a similar issue:

paths:
  /items/{itemId}:
    put:
      operationId: updateItem
      security:
        - {}

Documentation not rendering 'given' sections

The documentation doesn't correctly display most given: code snippets, rendering the generated docs mostly unusable:

Parsing comma separated items shows:

variations:
        Parse as int:
          given:
            yaml_snippet: |
              a: 1, 2, 3
          steps:
          - Run:
              code: |
                Ensure(load(yaml_snippet, int_schema)).equals({"a": [1, 2, 3]})

but the rendered HTML page doesn't display the given: snippet. This is consistent across all of the generated docs.

Bug: Or on deeper structures breaks

Hi, I found a bug when trying to "or" deeper nested structures:

from strictyaml import load, Map, MapPattern, Str, Int

yaml="""
configuration: 
  test1:
    helptext: "some text"
  test2:
    helptext: "some other text."
    min: 0
"""

schema = Map(
    {
        "configuration": MapPattern(
            Str(), 
            Map({
                "helptext": Str(), 
            }) | Map({
                "helptext": Str(), 
                "min": Int(), 
            }),
            minimum_keys=1
        ),
    }
)

data = load(yaml, schema=schema)
print(data.data)

which results in

strictyaml.exceptions.YAMLValidationError: when expecting an integer
found arbitrary text
  in "<unicode string>", line 5, column 1:
        helptext: some other text.
    ^ (line: 5)

The strange thing is, it "works" when the min validator is a Str(), so I guess the validators are applied to the wrong content?

Can not load document created by StrictYAML

StrictYAML dumps empty dictionary as {} which it later refuses to load because it's "ugly flow style". That makes it impossible to load documents created by StrictYAML back into itself.

Example (strictyaml==0.13.0)

>>> data = {'hello': {}}
>>> doc = as_document(data)
>>> doc
YAML(OrderedDict([('hello', OrderedDict())]))
>>> serialized = doc.as_yaml()
>>> serialized
'hello: {}\n'
>>> doc2 = load(serialized)

…

strictyaml.exceptions.FlowMappingDisallowed: While scanning
  in "<unicode string>", line 1, column 8:
    hello: {}
           ^ (line: 1)
Found ugly disallowed JSONesque flow mapping (surround with ' and ' to make text appear literally)
  in "<unicode string>", line 1, column 9:
    hello: {}
            ^ (line: 1)

Difficult to use Optional("key", None)

Great project, just a few usability issues.

Assume the following schema:

import strictyaml as s
s.load("a: 8", s.Map({ 
    s.Optional("a", 1): s.Int(),
    s.Optional("key_a", None): s.Str(), 
    s.Optional("key_b", {}): s.EmptyDict() | s.Map({})
}))

A few issues with usability arise:

  1. Why is a: 8 required from the YAML file? It seems like a blank file (or one consisting only of comments) should be perfectly valid with this schema.
  2. The result is YAML(OrderedDict([('a', 8), ('key_b', {})])). It is confusing that key_a is missing entirely from the results, despite having a default value of None. A common pattern for fixing this in Python is to define Optional using e.g.:
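missing = object()  # Unique sentinel: distinguishes "no default" from None

class Optional:
    def __init__(self, key, default=missing):
        self.key = key
        self.default = default

# Later
optional = Optional("key_a")
if optional.default is missing:
    # User specified no default, OK not to include in output
    pass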

In this way, Optional with no default specified can result in the absence of the key, while Optional with any default would always be present in the output.
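
To make the proposed behaviour concrete, here is a small sketch of how defaults could be applied to a parsed mapping. OptionalKey and apply_defaults are hypothetical names used for illustration only; they are not strictyaml's Optional or its internals:

```python
_MISSING = object()  # sentinel meaning "no default was given"


class OptionalKey:
    """Hypothetical stand-in for an optional-key marker; not strictyaml's Optional."""

    def __init__(self, key, default=_MISSING):
        self.key = key
        self.default = default


def apply_defaults(parsed, optional_keys):
    """Return a copy of `parsed` with explicit defaults filled in for absent keys."""
    result = dict(parsed)
    for optional in optional_keys:
        if optional.key in result:
            continue  # the document supplied a value
        if optional.default is _MISSING:
            continue  # no default given: leave the key out
        result[optional.key] = optional.default
    return result


filled = apply_defaults(
    {"a": 8},
    [OptionalKey("a", 1), OptionalKey("key_a", None), OptionalKey("key_c")],
)
print(filled)
```

Here key_a appears in the result with the value None because a default was given explicitly, while key_c, which has no default, is left out.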

UnicodeEncodeError in Python2.7

If the YAML file has any non-latin1 character, strictyaml can't load the document.

Test case:

# coding=utf-8
import strictyaml

yaml = u'''
name: Olé
age: 38
'''
strictyaml.load(yaml).data

FAQ has some wonky wording in "What is wrong with explicit syntax typing" section

In the What is wrong with explicit syntax typing in a readable configuration languages section, there is this line:

StrictYAML does require quotation marks for strings that are implicitly converted to other types (e.g. yes or 1.5), but it does require quotation marks for strings that are syntactically confusing (e.g. "{ text in curly brackets }")

Based on the functioning of the module, it seems like this is intended to actually be something more like:

StrictYAML does not require quotation marks for strings that would be implicitly converted to other types in ordinary YAML (e.g. yes or 1.5), but it does require quotation marks for strings that are syntactically confusing (e.g. "{ text in curly brackets }")

Language independent schema

Hi,
Currently, StrictYAML uses Python code to define the schema of the data, but to become popular and widely used, StrictYAML must be portable to other languages.
Maybe it would be good if the data schema were itself written in YAML/StrictYAML?

StrictYAMLError should not inherit from YAMLError

Consider the following:
https://github.com/crdoconnor/strictyaml/blob/master/strictyaml/exceptions.py#L4

StrictYAMLError should not inherit from YAMLError, allowing people to make a clear distinction between errors that are raised by your library and errors that are raised by the underlying one.

I know one could simply do:

try:
  # ...
except YAMLError:
  pass
except StrictYAMLError:
  pass

But (1) this implies that YAMLError is in the current scope (it is not even exposed by your library, but that's not the point), and (2) YAMLError is caught before StrictYAMLError, so the second handler is never reached.
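
The ordering problem in point (2) can be reproduced with two stand-in classes. The classes below only mimic the inheritance relationship described in this issue and are not imported from strictyaml or ruamel.yaml:

```python
class YAMLError(Exception):
    """Stand-in for the underlying parser's base exception."""


class StrictYAMLError(YAMLError):
    """Stand-in for a library error that inherits from YAMLError."""


def base_first(exc):
    """Report which handler catches `exc` when the base class is listed first."""
    try:
        raise exc
    except YAMLError:
        return "YAMLError handler"
    except StrictYAMLError:  # never reached: the handler above matches first
        return "StrictYAMLError handler"


def subclass_first(exc):
    """Same handlers, with the more specific class listed first."""
    try:
        raise exc
    except StrictYAMLError:
        return "StrictYAMLError handler"
    except YAMLError:
        return "YAMLError handler"


print(base_first(StrictYAMLError("schema violation")))
print(subclass_first(StrictYAMLError("schema violation")))
```

As long as the inheritance stays in place, listing the subclass first is the only way to tell the two kinds of error apart.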

Implement Mapping abstract base class

The YAML class already meets almost all requirements of the Mapping abstract base class (collections.abc.Mapping). The only missing method is __iter__.

If you implement that method and have YAML subclass (or register with) Mapping, YAML instances will be considered instances of the Mapping ABC, which will enable passing them to third-party functions that require generic mappings.

Thank you!
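
For reference, collections.abc.Mapping needs __getitem__, __iter__ and __len__, and it is not checked structurally, so a class has to inherit from it (or be registered with Mapping.register) for isinstance to succeed. A minimal sketch with a hypothetical Document class, which is not strictyaml's YAML class:

```python
from collections.abc import Mapping


class Document(Mapping):
    """Hypothetical read-only wrapper; not strictyaml's YAML class."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


doc = Document({"name": "Ford Prefect", "age": "42"})
print(isinstance(doc, Mapping))  # True, through inheritance
print(dict(doc))                 # keys(), items() and get() come from the Mapping mixins
```

Inheriting from Mapping also supplies keys(), items(), values(), get(), __contains__ and equality for free.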

Add label to load function

Feature requested in gjcarneiro/yacron#4

Example:

try:
    doc = strictyaml.load(data, CONFIG_SCHEMA, label=path).data
except YAMLValidationError as ex:
    print(ex)

strictyaml.exceptions.YAMLValidationError: while parsing a mapping
unexpected key not in schema 'killTimeoutx'
  in "/path/to/whatever.yaml", line 73, column 1:
      killTimeoutx: '0.5'
    ^ (line: 73)

Incorrect error location reported

strictyaml version 0.4.1

import strictyaml as sy
sy.load("1: a", sy.MapPattern(sy.Int(), sy.Int()))

 YAMLValidationError: when expecting an integer
 found non-integer
 in "<unicode string>", line 1, column 1:
     '1': a
     ^

One would expect "column 6" (i.e. the "a" character) being marked as an error location.

Basic documentation for validators

Could you please provide, eg. in the README, some examples showing how to use the different validators provided by your library?

For instance, I see in the code there is an Optional validator, a support for disjunctive validators, etc. but this is not mentioned in the "documentation" nor in some examples.

I assume those are pretty intuitive to use (ie. Optional(Str()), or Str() | Int()). Am I right?

Documents with single scalars or only comments do not parse correctly

Unless I am missing something, which is certainly possible, I do not think StrictYAML is correctly parsing documents that consist either of only comments or comments with a single scalar. Example document:

# Some comment.
"A string value."

Using ruamel.yaml:

doc = '# Some comment.\n"A string value."'

from ruamel.yaml import YAML
print(YAML().load(doc))

The output is:

A string value.

With StrictYAML:

import strictyaml
print(strictyaml.load(doc))

The output is:

# Some comment.
"A string value."

It appears that StrictYAML is including the comment in the scalar value.

Thanks for the great work on StrictYAML and for any help with this issue.

Allow extra keys in a yaml file

Currently, I am unable to set the equivalent to the json additionalProperties = True schema property, getting the error message

unexpected key not in schema ''

Would it be possible to add an optional flag to allow/disallow this?

My use case is strictly specifying some fields within a larger yaml file that I do not care about. I would be happy to contribute a PR if you are able to help guide me to where to add this!
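
The requested behaviour, sketched in plain Python: check the keys that matter and optionally tolerate the rest. The helper validate_keys and its allow_extra flag are hypothetical and are not an existing strictyaml option:

```python
def validate_keys(mapping, required, allow_extra=False):
    """Check `mapping` has every key in `required`; optionally tolerate extra keys.

    Hypothetical helper: `allow_extra` plays the role of JSON Schema's
    additionalProperties and is not an existing strictyaml option.
    """
    missing = sorted(set(required) - set(mapping))
    if missing:
        raise ValueError("required key(s) not found: {0}".format(missing))
    extra = sorted(set(mapping) - set(required))
    if extra and not allow_extra:
        raise ValueError("unexpected key(s) not in schema: {0}".format(extra))
    # Only the keys the caller cares about are returned.
    return {key: mapping[key] for key in required}


config = {"name": "job", "schedule": "daily", "unrelated": "ignored"}
print(validate_keys(config, ["name", "schedule"], allow_extra=True))
```

With allow_extra left at False, the same call raises ValueError for the unrelated key, which matches the current strict behaviour.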

DeprecationWarning on collections in Python 3.7

With Python 3.7, I get the following warning:

.tox/py37/lib/python3.7/site-packages/strictyaml/utils.py:2
  /home/daniel/Client_API_VN/.tox/py37/lib/python3.7/site-packages/strictyaml/utils.py:2: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
    from collections import Iterable

.tox/py37/lib/python3.7/site-packages/strictyaml/parser.py:57
  /home/daniel/Client_API_VN/.tox/py37/lib/python3.7/site-packages/strictyaml/parser.py:57: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
    if not isinstance(key, collections.Hashable):
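
The warning goes away when the ABCs are imported from collections.abc. A minimal sketch of the import pattern follows; the fallback branch is only needed if Python 2 still has to be supported, which is an assumption about the project's support policy:

```python
try:
    from collections.abc import Hashable, Iterable  # Python 3.3+
except ImportError:  # Python 2 fallback
    from collections import Hashable, Iterable


print(isinstance([1, 2], Iterable))  # lists can be iterated
print(isinstance([1, 2], Hashable))  # lists cannot be used as dict keys
print(isinstance("key", Hashable))   # strings can
```

Checks written as collections.Hashable then become plain Hashable, with no other change in behaviour.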

ruamel.yaml 0.15.48 breaks strictyaml

  File "/Users/fschulze/Development/devpi/devel/lib/python3.6/site-packages/strictyaml/__init__.py", line 2, in <module>
    from strictyaml.parser import load
  File "/Users/fschulze/Development/devpi/devel/lib/python3.6/site-packages/strictyaml/parser.py", line 6, in <module>
    from strictyaml import exceptions
  File "/Users/fschulze/Development/devpi/devel/lib/python3.6/site-packages/strictyaml/exceptions.py", line 1, in <module>
    from ruamel.yaml import MarkedYAMLError
ImportError: cannot import name 'MarkedYAMLError'

I haven't investigated whether this is a regression in ruamel.yaml (looks like it though) or if strictyaml used unofficial API

Package license file

Could you please add the license file to MANIFEST.in so that it will be included in sdists, whls, and other packages?

Implicit default value for optional key in mappings?

According to the optional keys docs and the sources, there is no support for an implicit default value for an optional key. My use case: I potentially have to configure a very long list of key-value mappings for my command line application. Most of the optional keys have an implicit default value that is reasonable and unsurprising from the application/user perspective. I know implicit data may be bad. Anyway... what do you think?

- mandatory-key: ...
  optional-key:
- mandatory-key: ...
  optional-key:
(... potentially very long list of other mappings ...)

Something broken?

I encounter the following error in a freshly created virtual environment for both 2.7 and 3.6 versions of Python. This happens on two different machines, Gentoo and Ubuntu.

>>> import strictyaml as sy
>>> sy.load("1: a")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/VENV2/lib/python2.7/site-packages/strictyaml/parser.py", line 215, in load
    document = ruamelyaml.load(yaml_string, Loader=StrictYAMLLoader)
  File "/tmp/VENV2/lib/python2.7/site-packages/ruamel/yaml/main.py", line 95, in load
    loader = Loader(stream, version, preserve_quotes=preserve_quotes)
  File "/tmp/VENV2/lib/python2.7/site-packages/strictyaml/parser.py", line 200, in __init__
    RoundTripScanner.__init__(self)
  File "/tmp/VENV2/lib/python2.7/site-packages/ruamel/yaml/scanner.py", line 91, in __init__
    self.fetch_stream_start()
  File "/tmp/VENV2/lib/python2.7/site-packages/ruamel/yaml/scanner.py", line 407, in fetch_stream_start
    mark = self.reader.get_mark()
  File "/tmp/VENV2/lib/python2.7/site-packages/ruamel/yaml/scanner.py", line 135, in reader
    return self.loader._reader
AttributeError: 'NoneType' object has no attribute '_reader'

FEATURE: report all violations and not just the first one

(Optionally) provide a way to report all violations, both of strictyaml's extra restrictions and those of schema, and not just the first one. I do realize that this may involve some major code overhaul but the best end user (i.e. the one that writes those yaml files) experience is the ultimate goal, isn't it?
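
The general idea is to accumulate violations in a list instead of raising on the first one. The sketch below shows this for a flat mapping checked against a simple key-to-type table. The function collect_violations and this schema format are hypothetical and are not strictyaml's API:

```python
def collect_violations(mapping, schema):
    """Return every violation found, instead of stopping at the first.

    `schema` maps each required key to the type its value must have.
    Hypothetical sketch; not strictyaml's API.
    """
    violations = []
    for key, expected_type in schema.items():
        if key not in mapping:
            violations.append("required key '{0}' not found".format(key))
        elif not isinstance(mapping[key], expected_type):
            violations.append(
                "key '{0}': expected {1}, found {2}".format(
                    key, expected_type.__name__, type(mapping[key]).__name__
                )
            )
    for key in mapping:
        if key not in schema:
            violations.append("unexpected key '{0}' not in schema".format(key))
    return violations


problems = collect_violations(
    {"name": "Ford Prefect", "age": "forty-two", "planet": "Betelgeuse Five"},
    {"name": str, "age": int, "possessions": list},
)
for problem in problems:
    print(problem)
```

A caller can then decide whether to raise a single exception that carries the whole list or to print each entry for the person editing the file.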

Independent specification

I wonder if it would make sense to write a complete specification for StrictYAML that does not depend on the YAML specification (as in, you can write a compliant parser without reading that one first), to help it spread to other programming language ecosystems.

There has been discussion in the Rust subreddit as to whether YAML's enormous and ambiguous specification or TOML's alien syntax was preferable. Rust's popular (de)serialization framework serde uses structs to define a schema, so StrictYAML would be a great fit.

Bug: StrictYaml cannot serialize `None` (null value)

Typing the following into the python interpreter

Python 3.7.1 (v3.7.1:260ec2c36a, Oct 20 2018, 14:05:16) [MSC v.1915 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import strictyaml
>>> yaml = strictyaml.as_document({'a':None})

results in

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Temp\3\venv3\lib\site-packages\strictyaml\parser.py", line 258, in as_document
    return schema(YAMLChunk(schema.to_yaml(data), label=label))
  File "D:\Temp\3\venv3\lib\site-packages\strictyaml\any_validator.py", line 45, in to_yaml
    return schema_from_data(data).to_yaml(data)
  File "D:\Temp\3\venv3\lib\site-packages\strictyaml\compound.py", line 197, in to_yaml
    for key, value in data.items()
  File "D:\Temp\3\venv3\lib\site-packages\strictyaml\compound.py", line 200, in <listcomp>
    and value != self._defaults[key]
  File "D:\Temp\3\venv3\lib\site-packages\strictyaml\scalar.py", line 148, in to_yaml
    raise YAMLSerializationError("'{}' is not a string".format(data))
strictyaml.exceptions.YAMLSerializationError: 'None' is not a string

This appears to be a bug.
(If it is not a bug, this gaping feature-hole is not documented anywhere.)
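One possible workaround, until this is fixed or documented, is to replace `None` values before calling `as_document`. The helper below is a minimal sketch: `replace_none` is a hypothetical name, not part of strictyaml, and substituting an empty string is an assumption about what the caller wants.

```python
def replace_none(data, placeholder=""):
    """Return a copy of data with every None replaced by placeholder.

    Hypothetical helper, not part of strictyaml. Recurses through
    dicts and lists; all other values are returned unchanged.
    """
    if data is None:
        return placeholder
    if isinstance(data, dict):
        return {key: replace_none(value, placeholder) for key, value in data.items()}
    if isinstance(data, list):
        return [replace_none(item, placeholder) for item in data]
    return data


cleaned = replace_none({"a": None, "b": [1, None]})
print(cleaned)
```

The cleaned structure no longer contains `None`, so the `'None' is not a string` error shown above is not triggered by those values.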

What to do about empty sequences and mappings?

I've been writing a Dumper that outputs data conforming to strictyaml, and while testing its behaviour I've found a problem: the only way to represent empty sequences and mappings in YAML is with flow style, which strictyaml disallows. I don't really see any way around this other than making an exception for [] and {}, or just refusing to dump empty lists and dicts.
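The "refuse to dump" option can be sketched as follows. This is not the Dumper mentioned above: `dump_block` is a hypothetical function, it handles only dicts, lists and plain scalars, and it does no quoting or escaping of scalar values.

```python
def dump_block(data, indent=0):
    """Serialize nested dicts/lists of scalars to block-style YAML lines.

    Hypothetical sketch: raises ValueError on empty containers, because
    block style has no way to write them.
    """
    pad = " " * indent
    lines = []
    if isinstance(data, dict):
        if not data:
            raise ValueError("cannot dump an empty mapping without flow style")
        for key, value in data.items():
            if isinstance(value, (dict, list)):
                lines.append(f"{pad}{key}:")
                lines.extend(dump_block(value, indent + 2))
            else:
                lines.append(f"{pad}{key}: {value}")
    elif isinstance(data, list):
        if not data:
            raise ValueError("cannot dump an empty sequence without flow style")
        for item in data:
            if isinstance(item, (dict, list)):
                # A bare dash followed by an indented block holds the nested value.
                lines.append(f"{pad}-")
                lines.extend(dump_block(item, indent + 2))
            else:
                lines.append(f"{pad}- {item}")
    else:
        raise TypeError("top-level value must be a dict or a list")
    return lines


print("\n".join(dump_block({"name": "Ford Prefect", "possessions": ["Towel"]})))
```

Calling `dump_block({"possessions": []})` raises `ValueError`, which makes the limitation explicit to the caller instead of silently emitting flow style.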

Clarification on tags justification, e.g. AWS cloudformation's shorthand private tags

First off, I really like this library, and the design choices you've made, so thanks!

I was looking at the removed features, and it lists explicit tags as being a form of syntax typing, which is absolutely bad when it's defined by the schema, yes! But tags don't have to be used that way. They can be used as a reserved syntax for alternate ways to provide a value. In particular, AWS CloudFormation uses them as a shorthand for its "function" syntax:

Without tag shorthand (or flow):

Parameters:
  HostedZoneName: ...
  RecordName: ...
  RecordComment: ...
  ...

Resources:
  LoadBalancer: ...

  RecordSet:
    Type: AWS::Route53::RecordSet
    Properties:
      HostedZoneName:
        Fn::Sub: '${HostedZoneName}.'
      Comment:
        Ref: RecordComment
      Name:
        Fn::Sub: '${RecordName}.${HostedZoneName}.'
      Type: A
      AliasTarget:
        DNSName:
          Fn::GetAtt:
          - LoadBalancer
          - DNSName
        HostedZoneId:
          Fn::GetAtt:
          - LoadBalancer
          - CanonicalHostedZoneNameID

With tag shorthands:

  RecordSet:
    Type: AWS::Route53::RecordSet
    Properties:
      HostedZoneName: !Sub '${HostedZoneName}.'
      Comment: !Ref RecordComment
      Name: !Sub '${RecordName}.${HostedZoneName}.'
      Type: A
      AliasTarget:
        DNSName: !GetAtt LoadBalancer.DNSName
        HostedZoneId: !GetAtt LoadBalancer.CanonicalHostedZoneNameID

Embedding a sub-syntax in strings, as suggested in #20, is a bad idea here, as the transformation is generic across the whole document (even if there are places where it's not valid), and there are plenty of values (like Comment) that permit arbitrary values. So I think AWS has the right long-hand syntax, but as you can see it quickly gets unwieldy, so the private tags are very heavily used. In this sense, tags are used as an already reserved syntax that can safely escape an embedded string syntax (rather than adding another level of escaping).

As another example that is closer to the original justification for removal, you can also (and this is probably the original intention of tags) use tags to give types a better syntax and a lower (user) implementation cost where the schema cannot provide that directly, for example:

shape:
  fill: !LinearGradient
    from: 0 10
    to: 30 10
    stops: red 0.0, blue 0.2, green 1.0
  path:
  - !Move 0 0
  - !Arc 20 0 20 20
  - !Line 20 0

That said, I'm perfectly OK with strictyaml not supporting tags for implementation or compatibility complexity, or other such reasons, but the justifications only talk about using them for syntax typing, and then only for yaml built-in types, which is insufficient for me to remove this feature on its own.

To be clear, just updating the docs would be fine, though I won't refuse adding tags support 😇

Implicit Typing in StrictYAML

Consider this YAML:

array:
  - string 1!
  - string: 2?

Which parses into this:

{'array': ['string 1!', {'string': '2?'}]}

The value determines its type.
You've eliminated this behavior for primitive types:

python: 3.5.3
postgres: 9.3

Both are strings. But there is still implicit typing between string and map.

An array element syntax like:

array:
  -
  key1: value1
  key2: value2
  -
  key1: value1
  key2: value2
  - some: text

would fix this behavior but then it wouldn't be a subset of YAML anymore 😄

A valid use case for flow style

Maybe I shouldn't use strictyaml for linear algebra heavy stuff.

matrix: [
  [ 2., 5., 1.],
  [ 3., 2., 3.],
  [ 4., 1., 2.],
  ]
matrix_strict:
  -
    - 2.
    - 5.
    - 1.
  -
    - 3.
    - 2.
    - 3.
  -
    - 4.
    - 1.
    - 2.

FEATURE: augment constructed (loaded) document with location marks

strictyaml does schema-based validation and reports the exact text positions of erroneous input. However, in addition to schema validation, the application may perform other checks which are impossible to express in terms of a schema. It would be nice to be able to report to the user the exact text position of the offending data.
I was able to come up with a quick-and-dirty proof of concept (tested only on Python 3.5) that uses type(name, bases, dict) to construct subclasses of built-in types (dict, list, int, str) that contain the start_mark and end_mark of the nodes producing the corresponding data. I create those sub-classed instances in construct_object.
I can send you the above-mentioned prototype should you find it useful.
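The general technique described above can be sketched like this. It is not the prototype itself: `with_marks` is a hypothetical helper, and the marks are plain `(line, column)` tuples here, which is an assumption; a real implementation would take them from the parser's nodes inside construct_object.

```python
_marked_types = {}


def with_marks(value, start_mark, end_mark):
    """Wrap value in a dynamically created subclass that carries positions.

    Hypothetical helper. Works for subclassable built-ins such as
    dict, list, int and str (bool and None cannot be subclassed).
    """
    base = type(value)
    if base not in _marked_types:
        # type(name, bases, dict) builds the subclass once per base type.
        name = "Marked" + base.__name__.capitalize()
        _marked_types[base] = type(name, (base,), {})
    marked = _marked_types[base](value)
    marked.start_mark = start_mark
    marked.end_mark = end_mark
    return marked


age = with_marks(42, (3, 6), (3, 8))
print(age + 1, age.start_mark, type(age).__name__)
```

Because the result is still an instance of the original built-in type, existing application code keeps working, while later checks can read `start_mark` and `end_mark` to report where the offending data came from.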

more critiques of toml

First, TOML has quirky syntax for arrays of tables. It's not obvious that adding a second pair of brackets around the table will turn it into an array. If you then have a nested table inside a table in an array of tables, it becomes even more unclear.

# not clear that this is an array
[[tables]]
foo = "foo"

# especially now, it isn't clear at all that tables is actually an array
# and we are referencing the inner table of the last table added to the tables array.
[tables.inner]
foo = "bar"

Second, all arrays have the type array. So even though arrays are homogeneous in TOML, you can, oddly, do

array = [["foo"], [1]]

# but not
array = ["foo", 1]

Which is very inconsistent.

Third, there is also inline table syntax which is the TOML equivalent of YAML's flow style. Same downsides.
