Comments (7)
@nitzmahone: Thanks! That works like a charm.
from pyyaml.
IMO this is the desired behavior, not a bug. I'm not a spec expert (maybe @ingydotnet, @perlpunk, or someone else can chime in), but IIUC the primary purpose of doc markers is to delimit multiple documents in a single YAML-only stream, not to delimit YAML documents within arbitrary mixed-content streams. Doing so reliably in a parser-friendly fashion would be difficult, especially since you can mix and match styles with doc/directive end markers (the latter also being ---) around otherwise bare documents. If the spec allowed arbitrary non-YAML content between documents, there'd need to be a way to escape non-YAML lines that might be "interesting" to the parser, and it would make the future introduction of any new top-level markers a breaking change (since the escaping mechanism would also need to be updated).
That's not saying you can't trivially roll your own framing mechanism to support such a thing, but I wouldn't anticipate any direct support for such a thing from the spec/tooling.
---
time: 20:03:20
player: Sammy Sosa
action: strike (miss)
...
Some other stuff
This is valid YAML according to the spec, but PyYAML and libyaml both error on it.
They want to see this instead: https://play.yaml.io/main/parser?input=LS0tCnRpbWU6IDIwOjAzOjIwCnBsYXllcjogU2FtbXkgU29zYQphY3Rpb246IHN0cmlrZSAobWlzcykKLi4uCi0tLQpTb21lIG90aGVyIHN0dWZmCg==
---
time: 20:03:20
player: Sammy Sosa
action: strike (miss)
...
---
Some other stuff
or:
---
time: 20:03:20
player: Sammy Sosa
action: strike (miss)
---
Some other stuff
Thanks for your clarifications!
So, for now, I'll work around that by extracting the YAML block before feeding it to the parser. My application is parsing the YAML header of Markdown documents before transforming them with Pandoc – so I can't easily change the document format because I need to keep Pandoc happy. I do have control over the document generation process so extracting the YAML header myself, reliably, is not a big deal. It just would have been very elegant to let the YAML parser do that, too.
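For illustration, here is a minimal sketch of that kind of pre-extraction in plain Python. The helper name `split_front_matter` and the sample document are hypothetical. It assumes the header opens with `---` on the very first line and closes at the next line that is exactly `---` or `...`; real documents may need stricter handling.

```python
def split_front_matter(text):
    """Split a Markdown document into (yaml_header, body).

    Assumption: the YAML header starts with '---' on the first line and
    ends at the next line that is exactly '---' or '...'.
    Returns ("", text) when no complete header is found.
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].rstrip() != "---":
        return "", text  # no opening marker on the first line
    for i in range(1, len(lines)):
        if lines[i].rstrip() in ("---", "..."):
            header = "".join(lines[1:i])    # everything between the markers
            body = "".join(lines[i + 1:])   # everything after the closing marker
            return header, body
    return "", text  # opening marker without a closing one


# Hypothetical sample document
markdown = "---\ntitle: Demo\nauthor: Phil\n---\n# Heading\n\nBody text.\n"
header, body = split_front_matter(markdown)
print(header)
```

The returned `header` string can then be handed to the YAML parser on its own, so the parser never sees the Markdown body.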
It does seem like there are some inconsistencies in PyYAML's implementation WRT the spec here, but the long-standing default behavior of "anything trailing the document is an error" does appear to have been intentional with the implementation of get_single_node(). Putting my user hat on, it's also exactly the behavior we want for Ansible (and I'd argue most users that are not explicitly supporting multi-doc scenarios)- if that were to suddenly change, it could cause a lot of subtle problems. That part is arguably an API choice (rather than a spec deviation)- basically if you want to allow/ignore trailing stuff, use the multi-doc APIs and just ignore all but the first document stream (and of course ensure that PyYAML's API is fully spec-adherent in multi-doc scenarios).
It's also not clear to me from the spec what the actual intended behavior was around non-document content between explicit documents, and how that might interact with a bare document following a doc end marker. There's mention of "communication channels", and the spec says that `...` does not start a new document- fair enough, but this test seems to imply that any content following a doc end marker that's not a marker or directive starts a new bare document. That certainly makes it a simple rule if that's in fact the case, because it precludes the possibility of non-document "junk" on any line following a doc-end marker, but the spec's references to the c-forbidden production around bare documents confuse me there (quite possibly PEBCAK on my part 😉).
I'll work around that by extracting the YAML block before feeding it to the parser.
@philpagel Since this appears to be more an API implementation choice of get_single_node() (used by the non-all versions of the top-level API functions) explicitly failing on extra document content, assuming you're using PyYAML directly, something like the following should do what you want without any extra machinations:
next(yaml.safe_load_all(doc))
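A slightly fuller sketch of the same idea, assuming PyYAML is installed. The sample `doc` string is made up for illustration. It contrasts `yaml.safe_load`, which rejects the extra document, with taking only the first document from `yaml.safe_load_all`.

```python
import yaml

# Hypothetical input: a YAML header followed by non-header content
doc = """\
---
player: Sammy Sosa
action: strike (miss)
---
Some other stuff
"""

# The single-document API refuses a stream that holds more than one document.
try:
    yaml.safe_load(doc)
    single_doc_error = None
except yaml.YAMLError as exc:
    single_doc_error = exc

# The multi-document API returns a generator; next() takes the first
# document and leaves the rest of the stream unconsumed.
header = next(yaml.safe_load_all(doc))
print(header["player"])
```

Because only the first document is pulled from the generator, whatever follows the header is never constructed as data here.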
@philpagel nice- thanks for verifying the workaround gets you the behavior you need. I'm going to close this issue, since we've verified that the underlying bits are (more or less) spec-compliant and that the correct behavior is already visible via the multi-doc APIs.