kdl-org / kdl-rs
Rust parser for KDL
Home Page: https://docs.rs/kdl
License: Other
Currently, we pretty much just throw floats at format!("{:?}") and call it a day.
Unfortunately, this yields different results depending on Rust version (for example, 1.56 vs 1.60, so not even a big gap!).
So, we need to write our own that will follow a predictable pattern. Then we'll be able to re-enable the underscored tests that deal with floating point numbers that would otherwise fail.
Ideally, this formatter will be pretty smart about formatting exponent form (1.0e-10 etc.), so that the numbers look nice as well as being correct.
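A minimal sketch of what such a formatter could look like, using only the standard library. The name `format_kdl_float` and the thresholds for switching to exponent form are assumptions made for illustration, not anything the crate defines. It builds on `{}` and `{:e}` rather than `{:?}`, on the assumption that those two outputs are stable across Rust versions.

```rust
/// Hypothetical helper: formats a finite `f64` in a fixed, predictable style.
fn format_kdl_float(value: f64) -> String {
    // Assumption: non-finite values are out of scope for this sketch.
    assert!(value.is_finite(), "only finite floats are handled here");

    let magnitude = value.abs();
    // Assumed thresholds for switching to exponent form.
    let use_exponent = magnitude != 0.0 && (magnitude < 1e-4 || magnitude >= 1e16);

    if use_exponent {
        // `{:e}` yields e.g. "1e-10" or "2.5e20"; split mantissa and exponent.
        let text = format!("{:e}", value);
        let (mantissa, exponent) = text
            .split_once('e')
            .expect("LowerExp output contains 'e'");
        if mantissa.contains('.') {
            format!("{mantissa}e{exponent}")
        } else {
            // Always keep a fractional part so the value reads as a float.
            format!("{mantissa}.0e{exponent}")
        }
    } else {
        // `{}` does not use exponent form, but drops ".0" for integral values.
        let text = format!("{}", value);
        if text.contains('.') {
            text
        } else {
            format!("{text}.0")
        }
    }
}
```

With these assumed thresholds, `1.0e-10` is written in exponent form while `1.5` and `1.0` stay in plain decimal form.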
Hey,
I have a config schema that has a peers block where I use dash vals to create a list:
peers {
    - "foo"
    - "bar"
}
I have a function which finds the correct node and then inserts a node into its child block:
match doc.nodes_mut().iter_mut().find_map(|node| {
    let node_name = node.name().repr().unwrap();
    if node_name == tree_id {
        Some(node)
    } else {
        None
    }
}) {
    Some(subtree) => {
        let block = subtree.children_mut().as_mut().expect("Invalid sub-block");
        let mut new_node = KdlNode::new("-");
        new_node.push(KdlEntry::new(item));
        block.nodes_mut().push(new_node);
    }
    None => panic!("invalid subtree ?!"),
}
This works, but the resulting formatting seems really wonky:
peers { - "foo"
- "bar"
}
This makes me think that maybe I'm doing something wrong, or that this is a bug. Either way, I thought I'd open an issue about it. Any help is much appreciated!
the current formatter basically blows away a bunch of stuff you might definitely want to keep (like comments). This can definitely be improved, imo.
I've built a tool that relies heavily on kdl-rs, specifically on some internal/private functions, for example:
I have copied the relevant code from the kdl-rs repository because those functions are private at the moment.
It would be wonderful if I could access those functions directly from the kdl crate instead of copying and pasting them.
Hi,
I struggle to figure out how to do a simple check-and-update for an attribute:
use indoc::indoc;
use kdl::{KdlDocument, KdlEntry, KdlNode, KdlValue};
// `main` returns a Result so that `?` on the parse call is valid.
fn main() -> miette::Result<()> {
    let doc = indoc! {r#"
        foo "bar"
    "#};
    println!("{doc}");
    let mut doc: KdlDocument = doc.parse()?;
    let val = doc.get_arg_mut("foo").unwrap();
    *val = KdlValue::String(String::from("oof"));
    println!("{doc}");
    Ok(())
}
This compiles, but doc is not changed. What am I missing?
fn build_abc() -> KdlDocument {
    let mut c = KdlNode::new("c");
    c.ensure_children();
    let mut b = KdlNode::new("b");
    b.ensure_children().nodes_mut().push(c);
    let mut a = KdlNode::new("a");
    a.ensure_children().nodes_mut().push(b);
    let mut doc = KdlDocument::new();
    doc.nodes_mut().push(a);
    doc.fmt();
    doc
}
#[test]
fn parse_vs_build() -> miette::Result<()> {
    let mut built = build_abc();
    let mut parsed: KdlDocument = built.to_string().parse()?;
    built.fmt();
    parsed.fmt();
    assert_eq!(built, parsed);
    Ok(())
}
This test fails.
Diff < left / right > :
KdlDocument {
< leading: None,
> leading: Some(
> "",
> ),
nodes: [
KdlNode {
< leading: None,
> leading: Some(
> "",
> ),
ty: None,
name: KdlIdentifier {
value: "a",
repr: None,
},
entries: [],
before_children: None,
children: Some(
KdlDocument {
< leading: None,
> leading: Some(
> "\n",
> ),
nodes: [
KdlNode {
< leading: None,
> leading: Some(
> " ",
> ),
ty: None,
name: KdlIdentifier {
value: "b",
repr: None,
},
entries: [],
before_children: None,
children: Some(
KdlDocument {
< leading: None,
> leading: Some(
> "\n",
> ),
nodes: [
KdlNode {
< leading: None,
> leading: Some(
> " ",
> ),
ty: None,
name: KdlIdentifier {
value: "c",
repr: None,
},
entries: [],
before_children: None,
children: Some(
KdlDocument {
< leading: None,
> leading: Some(
> "\n",
> ),
nodes: [],
< trailing: None,
> trailing: Some(
> "\n ",
> ),
},
),
< trailing: None,
> trailing: Some(
> "\n",
> ),
},
],
< trailing: None,
> trailing: Some(
> " ",
> ),
},
),
< trailing: None,
> trailing: Some(
> "\n",
> ),
},
],
< trailing: None,
> trailing: Some(
> "",
> ),
},
),
< trailing: None,
> trailing: Some(
> "\n",
> ),
},
],
< trailing: None,
> trailing: Some(
> "",
> ),
}
It looks like what the spec calls "Arguments", kdl-rs calls "values" (see the field names for the KdlNode struct). Should something be done?
I'm developing a CLI tool that can query a document of KDL by a KDL Query string.
I've run into the following case. Given this code:
fn main() {
    let document = kdl::parse_document(r#"step uses="actions/checkout@v1""#).unwrap();
    for node in document {
        println!("{}", node);
    }
}
Running it produces this output:
step uses="actions\/checkout@v1"
As you can see, the slash / has been escaped:
-step uses="actions/checkout@v1"
+step uses="actions\/checkout@v1"
From the tool's perspective, the output should be identical to the input. I'm not sure whether this is expected. Am I doing something wrong?
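For illustration, here is a sketch of an escaping routine that leaves the slash alone. The name `escape_kdl_string` is hypothetical and this is not the crate's code. It assumes that `\/` only needs to be accepted on input and never has to be produced on output.

```rust
/// Hypothetical helper: escapes text for use inside a quoted KDL string.
fn escape_kdl_string(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for ch in raw.chars() {
        match ch {
            '\\' => out.push_str("\\\\"),
            '"' => out.push_str("\\\""),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            '\u{08}' => out.push_str("\\b"),
            '\u{0C}' => out.push_str("\\f"),
            // '/' falls through here on purpose: it is written unescaped.
            other => out.push(other),
        }
    }
    out
}
```

With a routine like this, "actions/checkout@v1" would be written back unchanged.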
I just realized (after an embarrassingly long debugging session trying to understand why parsing was failing) that I actually depend on PR #7, but it doesn't appear to be on crates.io, as it missed the 1.0 release by 10 days. Is it possible that a newer version could be published? Thanks!
The following fails to parse:
node { child }
I don't think the spec specifies whether or not this should parse, but I would like it to. Looking at the code of the nodejs implementation, I think that one will parse this?
Modern parsers don't just stop at the first error they find--they continue optimistically as best they can, either collecting or printing errors as they go.
kdl-rs should do this, too, by returning a wrapper around a collection of parse errors instead of a single error.
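A rough sketch of the "collect errors and keep going" shape, using a toy line-based grammar rather than KDL. Every name here (`LineError`, `ParseErrors`, `parse_all`) and the grammar itself are assumptions made purely for illustration.

```rust
/// One problem found while parsing, with the 1-based line it occurred on.
#[derive(Debug, PartialEq)]
struct LineError {
    line: usize,
    message: String,
}

/// Hypothetical wrapper around every error found in a single pass.
#[derive(Debug, PartialEq)]
struct ParseErrors(Vec<LineError>);

/// Toy grammar (not KDL): each non-empty line is `name integer`.
fn parse_all(input: &str) -> Result<Vec<(String, i64)>, ParseErrors> {
    let mut entries = Vec::new();
    let mut errors = Vec::new();

    for (index, line) in input.lines().enumerate() {
        let line_no = index + 1;
        let mut parts = line.split_whitespace();
        let name = match parts.next() {
            Some(name) => name,
            None => continue, // skip blank lines
        };
        // Record the error and move on to the next line instead of returning.
        match parts.next().map(str::parse::<i64>) {
            Some(Ok(value)) => entries.push((name.to_string(), value)),
            Some(Err(_)) => errors.push(LineError {
                line: line_no,
                message: format!("`{name}` needs an integer value"),
            }),
            None => errors.push(LineError {
                line: line_no,
                message: format!("`{name}` is missing a value"),
            }),
        }
    }

    if errors.is_empty() {
        Ok(entries)
    } else {
        Err(ParseErrors(errors))
    }
}
```

The caller gets either the full result or every error at once, which is the behaviour being asked for here.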
This issue can dovetail with #67, since we're going to be kinda rewriting the parser anyway.
given the following:
struct Complex {
}
kdl spits out the following error:
Error:
× Expected valid node terminator.
╭─[7:1]
7 │
8 │ struct Complex {
· ───┬───
· ╰── parsed node
╰────
help: Nodes can only be terminated by `;` or a valid line ending.
It sees that Complex is a KDL ident and thinks you're trying to start a new node without finishing the struct one, but really you just forgot to wrap Complex in quotes. I'm not 100% sure about the best way to detect this situation and provide a suggestion. The current error isn't wrong, and there are some cases where it's definitely the right one.
I think maybe framing it as "Complex is wrong" might work better?
Error:
× Found a bare identifier in a node's args
╭─[7:1]
7 │
8 │ struct Complex {
· ───┬───
· ╰── this isn't valid
╰────
help: If this was supposed to be a string, wrap it in quotes.
If this was supposed to be a new node, close the previous node with `;` or `}`.
I'll ponder if this kind of detection is easy to add.
This might be a problem if files are generated from KdlDocuments.
If you chose to never ever support Windows, you'd find my deep empathy and understanding.
There should be a nice query API (possibly/probably using KQL?) to fetch individual nodes or sets of nodes.
The README.md, the description on crates.io and the description on docs.rs, mention that this project is under the Parity license with a broken link to the old license file.
It looks like the project is Apache2 now according to 0dbf75c, so those should probably be updated.
As the title says, it looks like there are no available versions of Kaydle on Crates.io. Should the mention be removed unless/until a new version is published?
The spec draft is ready for implementors to start trying it out: kdl-org/kdl#286
There should be built-in support for Serialize/Deserialize in this crate.
nom supports doing bit-level parsing. However, if you're not using that, you can disable the feature and save on a few dependencies.
I only noticed this because of ferrilab/bitvec#105 causing a compilation failure (that I was able to work around) but it would be nice to not have unused dependencies regardless.
Hi,
I am trying to write a CLI that formats KDL documents, and I want to use the KdlDocument::fmt() function for this.
While writing the first small unit test, I found some weird behaviour in the formatted output. When the following document is formatted, only the first node is indented correctly.
Input
world prop="value" {
child 1
child 2
}
Output
world prop="value" {
child 1
child 2
}
Expected Output
world prop="value" {
child 1
child 2
}
Is the current behaviour intended? For testing the issue, here is the source code for reproduction. Please feel free to message me in case you need more details.
use kdl::KdlDocument;
pub fn format_document(input: &str) -> miette::Result<String> {
    let mut doc: KdlDocument = input.parse()?;
    doc.fmt();
    Ok(doc.to_string())
}
#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn test_format_document() {
        let input = r#"
world prop="value" {
child 1
child 2
}
"#;
        let exp = r#"world prop="value" {
child 1
child 2
}
"#;
        let res = format_document(input);
        assert!(res.is_ok());
        let res = res.unwrap();
        assert_eq!(res, exp);
    }
}
use kdl::{KdlDocument, KdlEntry, KdlNode, KdlValue};
use std::str::FromStr;
fn main() {
    let mut node = KdlNode::new("what");
    node.entries_mut().push(KdlEntry::new(KdlValue::RawString(
        "\"#\nkdl-injection \"no way!!\"//".into(),
    )));
    let mut doc = KdlDocument::new();
    doc.nodes_mut().push(node);
    println!("{doc}");
    println!(
        "{} ≠ {} ‽",
        doc.nodes().len(),
        // unparse and parse the document
        KdlDocument::from_str(&doc.to_string())
            .unwrap()
            .nodes()
            .len()
    );
}
outputs
what r#""#
kdl-injection "no way!!"//"#
1 ≠ 2 ‽
This is because the implementation of write_raw_string checks for sequences of /#{n}"/ instead of /"#{n}/.
I'd suggest replacing the for loop with something like:
for char in raw.chars() {
    if char == '"' {
        consecutive = 1;
    } else if char == '#' && consecutive > 0 {
        consecutive += 1;
    } else {
        consecutive = 0;
    }
    maxhash = maxhash.max(consecutive);
}
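To make the suggestion easier to try out, here is the same loop wrapped in a self-contained sketch. `hashes_needed` and `quote_raw_string` are hypothetical names, not the crate's internals, and the `r#"..."#` output shape assumes KDL v1 raw-string syntax.

```rust
/// Counts how many `#` the delimiters need so that no `"` followed by
/// hashes inside `raw` can be mistaken for the closing delimiter.
fn hashes_needed(raw: &str) -> usize {
    let mut consecutive = 0usize;
    let mut maxhash = 0usize;
    for ch in raw.chars() {
        if ch == '"' {
            // A quote starts a potential closing sequence.
            consecutive = 1;
        } else if ch == '#' && consecutive > 0 {
            // Each hash directly after that quote extends it.
            consecutive += 1;
        } else {
            consecutive = 0;
        }
        maxhash = maxhash.max(consecutive);
    }
    maxhash
}

/// Hypothetical writer: wraps `raw` in a raw string with enough hashes.
fn quote_raw_string(raw: &str) -> String {
    let hashes = "#".repeat(hashes_needed(raw));
    format!("r{hashes}\"{raw}\"{hashes}")
}
```

For the payload in the reproduction above, this picks two hashes, so the embedded `"#` no longer terminates the string.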
Just found out about KDL, awesome stuff and exactly what I've been looking for. I know that this implementation is being redone for 1.0 compliance (or deprecated in favor of another implementation), so I didn't want to spend too much time doing a PR myself, but I wanted to put this suggestion here for whatever that new implementation is.
It would be useful if KdlNode had a pub line_number: Option<usize> field. Let's say my project has the following configuration:
workflow name=foo {
do_a
do_b
}
But the user gets confused and does
workflow name=second {
do_a
do_b
}
workflow name=foo
do_a
do_b
If I use the current API, I'll see 4 root nodes in that example, but the last 2, do_a and do_b, aren't valid as root nodes. This project is definitely meant for non-programmers, so if I throw an error saying "do_a isn't valid as a root node, it should be wrapped by a workflow", they may be confused about what I'm talking about (the first do_a is wrapped in a workflow; it's the second one that's not).
Having a line number allows me to point them to exactly which entry I'm talking about, which will be handy since my real configurations may have many do_a nodes specified across multiple workflows.
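As a sketch of how such a line number could be derived, assuming the parser can record the byte offset at which each node starts (that offset, and the helper name `line_and_column`, are assumptions, not part of the current API):

```rust
/// Hypothetical helper: 1-based (line, column) for a byte offset into `input`.
/// Columns count bytes, not characters, to keep the sketch simple.
fn line_and_column(input: &str, offset: usize) -> (usize, usize) {
    // Clamp so an out-of-range offset points just past the end.
    let offset = offset.min(input.len());
    let before = &input.as_bytes()[..offset];

    // The line number is one more than the number of newlines before the offset.
    let line = before.iter().filter(|&&b| b == b'\n').count() + 1;

    // The current line starts right after the last newline, if there is one.
    let line_start = before
        .iter()
        .rposition(|&b| b == b'\n')
        .map_or(0, |pos| pos + 1);

    (line, offset - line_start + 1)
}
```

An error message could then say which line the stray do_a is on, instead of only naming the node.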
I would expect clear_fmt to discard all formatting, including the representation of values (as described by the documentation of the method itself). But currently, it doesn't do so.
Hi! I think it's worth making a derive macro for decoding and validating KDL. This is a bit different from #17, as that has some specific mapping between KDL and the serde structure. I'm proposing something more like what clap does.
So for this example from help (adjusted a bit):
contents {
section "First section" style="emphasized" {
paragraph "This is the first paragraph"
paragraph "This is the second paragraph"
}
}
You could have structures like:
#[derive(kdl::Decode)]
struct Contents {
    #[kdl(children, filter(node_name="section"))]
    sections: Vec<Section>
}
#[derive(kdl::Decode)]
struct Section {
    #[kdl(argument)]
    title: String,
    #[kdl(property, default="normal")]
    style: String,
    #[kdl(children)]
    children: Vec<Block>
}
#[derive(kdl::Decode)]
enum Block {
    #[kdl(node_name="paragraph")]
    Paragraph(String),
    #[kdl(node_name="figure")]
    Figure(Figure),
}
And this could serve as both: more convenient structure to work with data and to validate that there are no extra arguments and properties (unless there is an explicit extension point) and types of all arguments and properties are correct.
What do you think?
Legal bare identifiers with a prefix that matches a keyword cause the parser to return errors. I submitted a PR with test cases to the main repo to help test the desired behavior.
use kdl::parse_document;
fn main() {
    println!("{:?}", parse_document("null_id"));
    println!("{:?}", parse_document("node null_id=1"));
    println!("{:?}", parse_document("true_id"));
    println!("{:?}", parse_document("node true_id=1"));
    println!("{:?}", parse_document("false_id"));
    println!("{:?}", parse_document("node false_id=1"));
}
Err(KdlError { input: "null_id", offset: 0, line: 1, column: 1, kind: Other })
Err(KdlError { input: "node null_id=1", offset: 0, line: 1, column: 1, kind: Other })
Err(KdlError { input: "true_id", offset: 0, line: 1, column: 1, kind: Other })
Err(KdlError { input: "node true_id=1", offset: 0, line: 1, column: 1, kind: Other })
Err(KdlError { input: "false_id", offset: 0, line: 1, column: 1, kind: Other })
Err(KdlError { input: "node false_id=1", offset: 0, line: 1, column: 1, kind: Other })
Ok([KdlNode { name: "null_id", values: [], properties: {}, children: [] }])
Ok([KdlNode { name: "node", values: [], properties: {"null_id": Int(1)}, children: [] }])
Ok([KdlNode { name: "true_id", values: [], properties: {}, children: [] }])
Ok([KdlNode { name: "node", values: [], properties: {"true_id": Int(1)}, children: [] }])
Ok([KdlNode { name: "false_id", values: [], properties: {}, children: [] }])
Ok([KdlNode { name: "node", values: [], properties: {"false_id": Int(1)}, children: [] }])
I did not submit test cases in the main repo for this, but I just noticed that if a keyword is prefixed with a sign character—which I think is a legal bare identifier—it fails as well. Please ignore if this is intentional! For example:
+false
node -false=1
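One way to get the desired behaviour, sketched here with standard-library code only, is to read the whole bare token first and only then decide whether it is a keyword. The names `Token`, `take_token`, and `classify` are hypothetical, and the delimiter set is a simplification of the spec's identifier rules.

```rust
#[derive(Debug, PartialEq)]
enum Token<'a> {
    Null,
    Bool(bool),
    Identifier(&'a str),
}

/// Splits off the leading bare token, stopping at a simplified set of
/// delimiters (whitespace, `=`, `{`, `}`, `;`).
fn take_token(input: &str) -> (&str, &str) {
    let end = input
        .find(|c: char| c.is_whitespace() || matches!(c, '=' | '{' | '}' | ';'))
        .unwrap_or(input.len());
    input.split_at(end)
}

/// A token is a keyword only when the *entire* token matches one.
fn classify(token: &str) -> Token<'_> {
    match token {
        "null" => Token::Null,
        "true" => Token::Bool(true),
        "false" => Token::Bool(false),
        other => Token::Identifier(other),
    }
}
```

Under this scheme null_id, true_id, and false_id are plain identifiers, and so are the sign-prefixed forms, assuming those are meant to be legal.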
Type annotations are parsed but not stored anywhere in the AST.
Can we fix this or do we have to wait for the new parser?
Steps to reproduce:
use kdl::{KdlIdentifier, KdlNode};
use std::fmt::Display; // <----- here!
fn main() {
    let mut section_node = KdlNode::new(KdlIdentifier::from("words"));
    section_node.fmt();
}
Error message:
error[E0061]: this function takes 1 argument but 0 arguments were supplied
--> crates/core/src/config.rs:266:22
|
266 | section_node.fmt();
| ^^^- supplied 0 arguments
| |
| expected 1 argument
|
note: associated function defined here
--> /home/dimitri/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:772:8
|
772 | fn fmt(&self, f: &mut Formatter<'_>) -> Result;
| ^^^
I think it would be best to rename fmt to auto_format or something like that.
The following document fails to parse, with KdlErrorKind::Other:
example 1.0
winnow really seems like a great evolution of nom! I think it would be lovely to port the current parser to it.
fn main() {
    let test = r#"
// Nodes can be separated into multiple lines
title \
"Some title"
// Files must be utf8 encoded!
smile "😁"
// Instead of anonymous nodes, nodes and properties can be wrapped
// in "" for arbitrary node names.
"!@#$@$%Q#$%~@!40" "1.2.3" "!!!!!"=true
// The following is a legal bare identifier:
foo123~!@#$%^&*.:'|?+ "weeee"
// And you can also use unicode!
ノード お名前="☜(゚ヮ゚☜)"
// kdl specifically allows properties and values to be
// interspersed with each other, much like CLI commands.
foo bar=true "baz" quux=false 1 2 3.
"#;
    let err: kdl::KdlError = test.parse::<kdl::KdlDocument>().unwrap_err();
    println!("{:?}", miette::Report::from(err));
}
× Expected valid value.
╭─[21:1]
21 │ // interspersed with each other, much like CLI commands.
22 │ foo bar=true "baz" quux=false 1 2 3.
· ─┬
· ╰── invalid float
╰────
help: Floating point numbers must be base 10, and have numbers after the decimal point.
My first time playing with KDL and I have the following test KDL:
let content = "
settings {
first second
num 12345
string \"abcdefg\"
}";
When parsing this I get the following KDL error:
KdlError { input: "\nsettings {\n first second\n num 12345\n string \"abcdefg\"\n}", offset: 1, line: 1, column: 1, kind: Other }
It was not clear to me that unquoted strings are not allowed in the KDL spec. However, this error had me stumped for over an hour because it claimed the whole document was wrong for an unknown reason. Quoting the second and the 12345 made this pass validation.
While I don't really agree with unquoted strings not being allowed (but that's a spec issue, not a kdl-rs issue), kdl-rs should at least pinpoint the word that's causing the error, and should have a better error kind for this situation, such as InvalidValueType.
It looks to me like the specification allows any kind of Value, including hex Numbers, as property values, but the following document doesn't parse:
track flags=0xdeadbeef
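For reference, a small standalone sketch of parsing such a literal. `parse_hex` is a hypothetical helper, not a function from the crate, and it assumes `_` separators may appear anywhere after the `0x` prefix, which is looser than the spec's grammar.

```rust
/// Hypothetical helper: parses a KDL-style hex literal such as `0xdeadbeef`.
/// Returns `None` for anything that is not a (possibly signed) hex literal.
fn parse_hex(token: &str) -> Option<i64> {
    // Peel off an optional sign first.
    let (negative, rest) = match token.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, token.strip_prefix('+').unwrap_or(token)),
    };

    // Require the `0x` prefix and drop `_` separators.
    let digits: String = rest
        .strip_prefix("0x")?
        .chars()
        .filter(|&c| c != '_')
        .collect();

    // Reject empty input and stray characters before converting.
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }

    let magnitude = i64::from_str_radix(&digits, 16).ok()?;
    Some(if negative { -magnitude } else { magnitude })
}
```

A property parser could split `flags=0xdeadbeef` at the `=` and hand the right-hand side to a routine like this before falling back to other value kinds.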
This repo is not KDL 1.0 compliant, but there's a separate implementation being worked on that will likely replace this one altogether. This is just a tracking issue as a TODO for the future. No further modification should be made to this code for now.
This is another crack at Serde support (linking #17 for historical reference).
The basic API that I'm thinking of would look something like this (borrowing from Serde JSON):
#[derive(Serialize, Deserialize)]
struct Config {
    root_dir: PathBuf,
    iterations: u32,
}
// Using KDL v1 syntax
let data = r#"
config {
    root_dir "/path/to/project"
    iterations 3
}
"#;
// Deserialize from string
let conf: Config = kdl::from_str(data).unwrap();
// Deserialize from file
let mut file = std::fs::File::open("config.kdl").unwrap();
let conf2: Config = kdl::read(&mut file).unwrap();
// Serialize to string
let roundtrip_data = kdl::to_string(&conf).unwrap();
// Serialize to file
let mut file = std::fs::File::create("config.kdl").unwrap();
kdl::write(&mut file, &conf).unwrap();
If no one has an issue, I'd be happy to take this on.
The insertion/removal behavior in the following cases seems a bit unintuitive to me:
Based on the doc comments for KdlNode::remove and KdlNode::insert, these both seem like bugs. If someone could confirm that is the case, I'll go ahead and PR my fork: main...jaxter184:kdl-rs:main
Also, a tangentially related sidenote: KdlNode::remove will remove only the first instance in the case of multiple keywords, which is potentially inconsistent with the KdlNode::get behavior of returning the last instance. Maybe it would make sense to remove every instance of a keyword and return only the last instance?
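To make that last suggestion concrete, here is a toy model of it. `Props` is a hypothetical stand-in for a node's keyed entries, not the real KdlNode, and values are plain integers to keep the sketch short.

```rust
/// Hypothetical stand-in for a node's keyed entries, kept in source order.
struct Props(Vec<(String, i64)>);

impl Props {
    /// "Last one wins" lookup, matching the `get` behavior described above.
    fn get(&self, key: &str) -> Option<i64> {
        self.0
            .iter()
            .rev()
            .find(|(k, _)| k == key)
            .map(|(_, v)| *v)
    }

    /// Removes every entry for `key` and returns the value that `get`
    /// would have returned beforehand, so the two stay consistent.
    fn remove(&mut self, key: &str) -> Option<i64> {
        let last = self.get(key);
        self.0.retain(|(k, _)| k != key);
        last
    }
}
```

After `remove`, a later `get` for the same key returns nothing, rather than exposing an earlier duplicate.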