Giter VIP home page Giter VIP logo

Comments (30)

asomov avatar asomov commented on June 14, 2024 2

@lread SnakeYAML 1.33 has been released. It may take a few hours to promote the JARs to Maven central

from clj-yaml.

lread avatar lread commented on June 14, 2024 2

@asomov a big sincere thank you for SnakeYAML and for this new release!

from clj-yaml.

lread avatar lread commented on June 14, 2024 1

So should we recap our current options?

option 1 - do nothing

Let @grzm and others suffer! 😈

option 2 - Preserve quotes around strings that look like numbers

option 2a - PR snakeyaml

After it is approved maybe use a fork until it gets released?

option 2b - try to get snakeyaml 1.32 to do this

Requires some snakeyaml spelunking and might not be viable.

option 3 - tell snakeyaml to not recognize octals

Not sure about the effect of this yet, but can tell snakeyaml not to resolve octals.
Maybe current octals would be parsed as strings?
Would this affect emitting?
Would need to try.

from clj-yaml.

borkdude avatar borkdude commented on June 14, 2024 1

Option 2a sounds promising, but this depends on what @asomov thinks about this.

Option 3: @asomov hinted at this here I believe.
Option 3b would be: provide an option to drop octal parsing (enabled by default or disabled by default).

from clj-yaml.

asomov avatar asomov commented on June 14, 2024 1

there is another issue to fix. Once finished, we can release SnakeYAML 1.33

from clj-yaml.

grzm avatar grzm commented on June 14, 2024 1

Thank you, @asomov !

from clj-yaml.

dpsutton avatar dpsutton commented on June 14, 2024

With "good" 108 version:

❯ clj -Sdeps '{:deps { clj-commons/clj-yaml {:mvn/version "0.7.108"}}}'
Downloading: clj-commons/clj-yaml/0.7.108/clj-yaml-0.7.108.pom from clojars
Downloading: clj-commons/clj-yaml/0.7.108/clj-yaml-0.7.108.jar from clojars
Clojure 1.11.1
user=> (require '[clj-yaml.core :as yaml])
nil
user=>  (doseq [x ["083" {:x "083"}]] (print (yaml/generate-string x)))
'083'
{x: '083'}
nil
user=>

With "good" 108 but bumping the yaml lib to 1.32 (version used in 110)

 clj -Sdeps '{:deps {org.yaml/snakeyaml {:mvn/version "1.32"} clj-commons/clj-yaml {:mvn/version "0.7.108"}}}'
Downloading: org/yaml/snakeyaml/1.32/snakeyaml-1.32.pom from central
Downloading: org/yaml/snakeyaml/1.32/snakeyaml-1.32.jar from central
Clojure 1.11.1
user=> (require '[clj-yaml.core :as yaml])
nil
user=> (doseq [x ["083" {:x "083"}]] (print (yaml/generate-string x)))
083
{x: 083}
nil

from clj-yaml.

grzm avatar grzm commented on June 14, 2024

Looks like the regression was introduced in snakeyaml 1.30. Using [org.yaml/snakeyaml "1.29"] shows the expected, quoted result.

from clj-yaml.

grzm avatar grzm commented on June 14, 2024

Reported upstream: https://bitbucket.org/snakeyaml/snakeyaml/issues/550/regression-in-handling-number-like-strings

from clj-yaml.

borkdude avatar borkdude commented on June 14, 2024

@grzm I tested this with EDN input like this:

{:foo [123 "045" "083"]}

Output:

$ clj -Sdeps '{:deps {clj-commons/clj-yaml {:mvn/version "0.7.110"} org.yaml/snakeyaml {:mvn/version "1.31"}}}}' -M -e "(require '[clj-yaml.core :as yaml]) (print (yaml/generate-string (clojure.edn/read-string (slurp \"/tmp/test.edn\"))))"
foo: [123, '045', 083]

That doesn't make sense: strings should stay strings.

from clj-yaml.

borkdude avatar borkdude commented on June 14, 2024

It looks like we're have to going to fix it in the clj-yaml library instead of snakeyaml:

https://bitbucket.org/snakeyaml/snakeyaml/issues/550/regression-in-handling-number-like-strings

from clj-yaml.

asomov avatar asomov commented on June 14, 2024

You can provide your own Resolver to SnakeYAML (and drop the octal numbers and timestamps)
https://bitbucket.org/snakeyaml/snakeyaml/src/master/src/main/java/org/yaml/snakeyaml/resolver/Resolver.java

from clj-yaml.

lread avatar lread commented on June 14, 2024

I don't fully get it yet, so I'm trying to understand better by hitting snakeyaml directly.

Given snake_load.clj

(require '[clojure.pprint :as pprint])
(import '(org.yaml.snakeyaml Yaml))

(pprint/print-table
  (for [yaml ["'083'" "083" "043" "83"]]
    {:from (pr-str yaml) :to (pr-str (.load (Yaml.) yaml))}))

And snake_dump.clj

(require '[clojure.pprint :as pprint])
(import '(org.yaml.snakeyaml Yaml))

(pprint/print-table
  (for [in ["83" "043" "083" 83]]
    {:from (pr-str in) :to (pr-str (.dump (Yaml.) in))}))

And a little snaketest.clj bb driver (we were recently on 1.26, our issue seems to have occurred between 1.29 and 1.30 and we were recently on 1.31 so...)

(require '[babashka.tasks :as t])

(doseq [script ["snake_load.clj" "snake_dump.clj"]
        snake-version ["1.26" "1.29" "1.30" "1.31" "1.32"]]
  (println "snakeyaml version" snake-version script "--------")
  (t/clojure "-Sdeps"
             (format "{:deps {org.yaml/snakeyaml {:mvn/version \"%s\"}}}" snake-version)
             "-M" script)
  (println))

When I run bb snaketest.clj I get (edited down and annotated to what is interesting).

Loading from YAML:

snakeyaml version 1.26 and 1.29 snake_load.clj --------

|   :from |   :to |
|---------+-------|
| "'083'" | "083" |
|   "083" |  83.0 | <-- this seems wrong
|   "043" |    35 |
|    "83" |    83 |

snakeyaml version 1.30, 1.31 and 1.32 snake_load.clj --------

|   :from |   :to |
|---------+-------|
| "'083'" | "083" |
|   "083" | "083" | <<-- difference
|   "043" |    35 |
|    "83" |    83 |

So: 083 is not a valid YAML octal number so it is assumed to be a string.
043 is valid YAML octal, so it gets loaded as a numeric.
(Octal 43 is 35 decimal).

Dumping to YAML:

snakeyaml version 1.26 and 1.29 snake_dump.clj --------

| :from |       :to |
|-------+-----------|
|  "83" |  "'83'\n" |
| "043" | "'043'\n" |
| "083" | "'083'\n" |
|    83 |    "83\n" |

snakeyaml version 1.30, 1.31 and 1.32 snake_dump.clj --------

| :from |       :to |
|-------+-----------|
|  "83" |  "'83'\n" |
| "043" | "'043'\n" |
| "083" |   "083\n" | <-- difference
|    83 |    "83\n" |

So: I don't get this one yet. The string "083" is being converted to what looks like an invalid YAML number? Does anybody understand the rationale for this new snakeyaml behaviour yet? After reading this snakeyaml issue, all I'm guessing is since string "083" looks like a number (starts with 0), but can't be number because it is not valid octal, it is safe to be dumped like as a YAML without quotes, because it won't be interpreted as YAML number when it is loaded? Brain twisty!

So maybe in YAML 1.1 a string that looks like a number but can't be a number can forego the quotes (but why? I've not noticed any shortage in the availability of quote characters).

Is this a bug in the YAML 1.1 spec (they did rework octal numbers for YAML 1.2) or a bug in snakeyaml or a bug in my noggin?

Anyhoo.... if my understanding above is all solid, or good enough, I think we can adapt to spitting out strings that look like numbers but are not numbers to include the YAML quotes.

@asomov are you suggesting there is a similar issue with timestamps?

from clj-yaml.

asomov avatar asomov commented on June 14, 2024

you asked a few questions, but I think you know the answers

  • Brain twisty! - imagine how much time it took for me to implement it (and understand it)
  • I would like octals to be completely removed, they bring nothing but confusion
  • timestamp should be easier for you

from clj-yaml.

lread avatar lread commented on June 14, 2024

Thank you for your reply @asomov, very much appreciated!

And also thank you so much for all the hard work on the snakeyaml libraries (sorry, I didn't make the connection until just now). I can't imagine that it is easy work!

from clj-yaml.

lread avatar lread commented on June 14, 2024

@borkdude I'll also take a peek at your roundtrip test from above, understand what is going on, and make sure we'll be good there.

from clj-yaml.

lread avatar lread commented on June 14, 2024

So @borkdude, on your example:

{:foo [123 "045" "083"]}

Output:

$ clj -Sdeps '{:deps {clj-commons/clj-yaml {:mvn/version "0.7.110"} org.yaml/snakeyaml {:mvn/version "1.31"}}}}' -M -e "(require '[clj-yaml.core :as yaml]) (print (yaml/generate-string (clojure.edn/read-string (slurp \"/tmp/test.edn\"))))"
foo: [123, '045', 083]

That doesn't make sense: strings should stay strings.

The non-intuitive thing (to me) that I learned is that 083 is actually a string in YAML because, although it looks like a number, it cannot be a number.

Under snakeyaml 1.3.1

(def x {:foo [123 "045" "083"]})
(def o (generate-string x))
o
;; => "foo: [123, '045', 083]\n"

(parse-string o)
;; => #ordered/map ([:foo (123 "045" "083")])

We can choose to explicitly exclude octal support with a custom snakeyaml Resolver.
I'd rather just emit '083' instead of 083, but not sure if we can ask snakeyaml to do that yet.

But... we can also do nothing. It could be argued that we do not have a bug here.

from clj-yaml.

borkdude avatar borkdude commented on June 14, 2024

Is it safe to propose the following based on @grzm's latest comment at SnakeYAML's repo:

clj-yaml will preserve quotes around strings that look like numbers. This is not against the yaml 1.1 spec and will undo breaking changes experienced by reading output of this library by yaml 1.2 readers.

We could coordinate with @asomov to get this change back into SnakeYAML but I think SnakeYAML doesn't have very frequent releases. Until that happens, we could perhaps implement the proposal in clj-yaml, until a new SnakeYAML comes out that has the change and then undo it here.

from clj-yaml.

lread avatar lread commented on June 14, 2024

Yes thanks, I think we've finally gotten to the crux of this issue! 🎉

from clj-yaml.

grzm avatar grzm commented on June 14, 2024

This is not against the yaml 1.1 spec

To be clear, I believe this to be true, but haven't read the spec closely enough to, say, provide expert testimony in a court of law.

from clj-yaml.

lread avatar lread commented on June 14, 2024

Oh right forgot:

option 4 - migrate from snakeyaml to snakeyaml-engine

This avoids the broken YAML 1.1 octals by moving to YAML 1.2.
But also could be a bit of work, and might introduce other issues.
Have not looked into it at all. Might be worth an exploration.

from clj-yaml.

lread avatar lread commented on June 14, 2024

Trying option 2a: Submitted an issue and PR to snakeyaml.

from clj-yaml.

borkdude avatar borkdude commented on June 14, 2024

It was merged! Thanks you @lread and @asomov!

Now, we could either wait for 1.33 or depend on a fork until 1.33 comes out. Considering that this a breaking change to other Clojure users, I'd like to have it sooner than later.

from clj-yaml.

lread avatar lread commented on June 14, 2024

Thanks @asomov! (and thanks for the variable name change, more clear for sure!).

If I can help with that final issue, just let me know.

from clj-yaml.

borkdude avatar borkdude commented on June 14, 2024

For clarity: what is that final issue?

from clj-yaml.

asomov avatar asomov commented on June 14, 2024

https://bitbucket.org/snakeyaml/snakeyaml/issues/553/loaderoptionssetcodepointlimit-not-honored

from clj-yaml.

borkdude avatar borkdude commented on June 14, 2024

@asomov Don't mean to push you, so just take this as a question:

Is there a likelihood that there will be a new SnakeYAML release within the next few days?

I'd like to release a new version of https://babashka.org/ (scripting environment for Clojure) in which clj-yaml is a built-in dependency, so it'd be nice to know if I should wait, or just go ahead without the new SnakeYAML and bump clj-yaml in a future release.

Thanks for the ongoing work!

from clj-yaml.

asomov avatar asomov commented on June 14, 2024

I would like to finish this

from clj-yaml.

borkdude avatar borkdude commented on June 14, 2024

Thanks for the heads up! I'll watch the issue

from clj-yaml.

borkdude avatar borkdude commented on June 14, 2024

@asomov Thanks so much for your work!

from clj-yaml.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.