
yaml's Introduction

YAML Haskell Library

This project contains two Haskell packages: yaml, for higher-level parsing and writing of YAML documents, and libyaml, for lower-level event-based streaming.

yaml Package

yaml provides a high-level interface based around the JSON datatypes provided by the aeson package. This allows using JSON and YAML interchangeably in your code with the aeson typeclasses. See the yaml README for more details.

libyaml Package

libyaml is a wrapper over the libyaml C library (and includes the source so no external library is needed). It is an event-based streaming API. See the libyaml README for more details.

License

This project is licensed under the BSD 3-Clause license. Copies of the license can be found in both the yaml and libyaml subdirectories.

yaml's Issues

Segfaults when `decodeFile` throws "file not found"

Even if decodeFile fails to open a file, fclose is still called via addForeignPtrFinalizerEnv, which causes a segfault.

Possible fixes:

  • Add NULL-check just before fclose in fclose_helper
  • Do addForeignPtrFinalizerEnv just after NULL-check
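The idea behind both fixes is that fclose must never see a NULL pointer. The following is a minimal sketch of that guard using only base, not the library's actual code: CFile and withOpenFile are hypothetical names, and the real fix would apply the same check inside fclose_helper or before addForeignPtrFinalizerEnv.

```haskell
import Control.Exception (finally)
import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CInt)
import Foreign.Ptr (Ptr, nullPtr)

-- Opaque stand-in for C's FILE (hypothetical name, not the library's type).
data CFile

foreign import ccall unsafe "stdio.h fopen"
  c_fopen :: CString -> CString -> IO (Ptr CFile)

foreign import ccall unsafe "stdio.h fclose"
  c_fclose :: Ptr CFile -> IO CInt

-- | Run an action on an open file handle. fclose is only reached when
-- fopen returned a non-NULL pointer, so a missing file yields Nothing
-- instead of handing NULL to fclose.
withOpenFile :: FilePath -> (Ptr CFile -> IO a) -> IO (Maybe a)
withOpenFile path action =
  withCString path $ \cpath ->
    withCString "r" $ \cmode -> do
      handle <- c_fopen cpath cmode
      if handle == nullPtr
        then pure Nothing
        else Just <$> (action handle `finally` c_fclose handle)
```

Calling withOpenFile on a path that does not exist returns Nothing without ever calling fclose.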

decodeEither doesn't provide structured exception

When it returns Left String, the String is the result of calling Show on a YamlException. But you can't get the YamlException back, since Read isn't defined for it.

It would be great if decodeEither returned Either YamlException a -- or perhaps if there were another function that did. I'd be able to use the information in the YamlException to craft a much nicer error message for users than I can get by simply displaying the string.

Control.Concurrent.STM.atomically was nested

foo.hs:

{-# LANGUAGE OverloadedStrings #-}

import Control.Concurrent.STM
import Data.Yaml

main = do
  atomically (let x = decode "" :: Maybe Value in x `seq` x)
  pure ()

result:

$ stack runghc --resolver lts-6.3 foo.hs
foo.hs: Control.Concurrent.STM.atomically was nested

Uh...? I don't believe this is actually a yaml bug because it doesn't depend on stm, but one of its dependencies, somewhere, is calling some unsafePerformIO (atomically foo)-type thing. Not too sure where to look from here.

Wrong string with digits decoding

Prelude Data.Yaml> (decode (encode "123")) :: Maybe String
Nothing

But

Prelude Data.Yaml> (decode (encode "123a")) :: Maybe String
Just "123a"

Please fix the first (wrong) case.

Data.Yaml.Include.decodeFileEither does not mimic Data.Yaml.decodeFileEither

In particular, I get an exception with the Include version but not with the older one when the file being decoded does not exist. The exception happens because eventsFromFile calls canonicalizePath, which in turn calls throwErrnoPathIfNull.

Looking at the code, it seems as though exceptions will also happen when an !include (rather than decodeFileEither itself) references a non-existent file, which I would also consider surprising, given that I expect decodeFileEither not to throw exceptions.
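The expectation here is that I/O failures come back as a Left value instead of escaping as exceptions. A minimal sketch of that pattern with base only follows; readFileEither is a hypothetical helper, and the library would need the same kind of wrapping around its canonicalizePath call.

```haskell
import Control.Exception (IOException, try)

-- | Read a file, reporting I/O failures (such as a missing file) as a
-- Left value rather than letting the exception escape to the caller.
readFileEither :: FilePath -> IO (Either IOException String)
readFileEither path = try (readFile path)
```

With this shape, a caller can pattern match on the result and never needs an exception handler for the missing-file case.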

Trouble with non-ASCII filenames on Windows

I just minimized commercialhaskell/stack#2491 to a bug in the Yaml library. Quoting myself there:

René/foo.yaml

true
...

testYaml.hs:

import qualified Data.Yaml as Yaml

loadYaml :: FilePath -> IO (Either Yaml.ParseException Bool)
loadYaml = Yaml.decodeFileEither

main :: IO ()
main = do
  print =<< loadYaml "René\\foo.yaml"
  return ()

$ stack runhaskell testYaml.hs
Left (InvalidYaml (Just (YamlException "Yaml file not found: Ren\233\\foo.yaml")))

(That stack runhaskell is inside a project with yaml installed, in particular stack master on lts-6.0, hence it uses https://www.stackage.org/lts-6.0/package/yaml-0.8.17.1; the changelog at https://hackage.haskell.org/package/yaml-0.8.18.1/changelog does not mention this issue, nor does this issue tracker).

What I've shown is on a fresh Win10 VM, inside Git Bash coming from Git for Windows.

Analysis

I expect we end up here:

yaml/Text/Libyaml.hs

Lines 525 to 533 in f88b1bf

    $ withCString file $ \file' -> withCString "r" $ \r' ->
        c_fopen file' r'
    if file' == nullPtr
        then do
            c_fclose_helper file'
            c_yaml_parser_delete ptr
            free ptr
            throwIO $ YamlException
                $ "Yaml file not found: " ++ file

That uses withCString and fopen, and withCString is documented here as "using a locale-dependent encoding".
https://www.stackage.org/haddock/lts-6.0/base-4.8.2.0/Foreign-C-String.html

Also note: the source code is UTF8, and that's required to have it accepted by stack runhaskell, whether I use codepage 437 (the default) or 65001.

Pretty-print of UnexpectedEvent is backward

Hello! In Data.Yaml.Internal:

data ParseException = ...
                    | UnexpectedEvent { _received :: Maybe Event
                                      , _expected :: Maybe Event
                                      }

prettyPrintParseException :: ParseException -> String
prettyPrintParseException pe = case pe of
  ...
  UnexpectedEvent mbExpected mbUnexpected -> unlines
    [ "Unexpected event: expected"
    , "  " ++ show mbExpected
    , "but received"
    , "  " ++ show mbUnexpected
    ]

The fields appear to be in reverse order in UnexpectedEvent mbExpected mbUnexpected; at least, the error message I'm seeing doesn't make a lot of sense as is :).
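One way to avoid this kind of positional mix-up is to match on the record fields by name. The sketch below uses a stand-in Event type and a hypothetical function name, so it illustrates the pattern and is not a patch against Data.Yaml.Internal.

```haskell
-- Stand-in for the library's Event type (assumption, for illustration only).
data Event = EventAlias String | EventStreamEnd
  deriving (Eq, Show)

data ParseException = UnexpectedEvent
  { _received :: Maybe Event
  , _expected :: Maybe Event
  }

-- | Matching on field names means the declaration order of the fields
-- cannot silently swap the two values in the message.
prettyUnexpected :: ParseException -> String
prettyUnexpected UnexpectedEvent {_received = received, _expected = expected} =
  unlines
    [ "Unexpected event: expected"
    , "  " ++ show expected
    , "but received"
    , "  " ++ show received
    ]
```

With named fields, the alias event from a failed parse shows up under "but received", where a reader would look for it.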

Possibility of splitting up this package

Would you be willing to consider splitting up this package into smaller ones to reduce dependency footprints? For example, I'm interested in using this but I don't want to depend on conduit and I don't plan on using the conduit-flavored interface in this library, so it would be helpful to split up the package so I can use (and depend on) only what I need.

Parse error context

I would like to get the information about a parse error (problem, context, offset) as fields of the YamlException constructor instead of a string.
Something like
data YamlException = YamlException String | YamlParseException String String Integer
There is a workaround - parsing the error string produced by parserParseOne', but it does not look good.
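A sketch of what the proposed constructor could look like with named fields follows. The field names and the describe function are assumptions made for illustration; only the overall shape comes from the proposal above.

```haskell
-- Hypothetical shape following the proposal; field names are assumptions.
data YamlException
  = YamlException String
  | YamlParseException
      { yamlProblem :: String  -- what went wrong
      , yamlContext :: String  -- what the parser was doing at the time
      , yamlOffset :: Integer  -- position in the input
      }
  deriving (Eq, Show)

-- | Build a user-facing message from the structured fields, with no need
-- to parse an error string.
describe :: YamlException -> String
describe (YamlException msg) = msg
describe (YamlParseException problem context offset) =
  problem ++ " " ++ context ++ " (at offset " ++ show offset ++ ")"
```

Callers that want a custom message can then pattern match on YamlParseException directly instead of going through describe.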

Parsing result type is not dictated by the output type

This bug results in the following unexpected behaviour:

{-# LANGUAGE OverloadedStrings #-}

import Data.Yaml
import Control.Applicative

data A = A { phone :: String }
  deriving Show

instance FromJSON A where
  parseJSON (Object v) = A <$> v .: "phone" 
  parseJSON _ = empty

main = do
  print (decode "phone: +72304234342" :: Maybe A)
  print (decode "phone: '+72304234342'" :: Maybe A)

outputting

Nothing
Just (A {phone = "+72304234342"})

As far as I can tell, this is caused by the parser first converting the +72304234342 value into an Int, which then fails the String type requirement further down the line.

printing of strings with exclamation mark

I am not completely sure if this is a bug or just a feature of YAML that I don't understand, but it looks to me like a bug:

λ import Data.Yaml
λ import Control.Applicative
λ encode <$> (decodeEither "['!4']" :: Either String [String])
    Right "- ! '!4'\n"

Without the exclamation mark I get:

λ encode <$> (decodeEither "['4']" :: Either String [String])
    Right "- '4'\n"

This is with yaml-0.8.10.1

`encode` does not resolve ambiguity between string and numeric values

This one is related to #20.

{-# LANGUAGE OverloadedStrings, DeriveGeneric #-}
import Data.Yaml
import GHC.Generics (Generic)

data A = A { phone :: String }
  deriving (Generic, Show)

instance ToJSON A
instance FromJSON A

main = do
  print $ encode $ A "+72304234342"
  print $ (decode $ encode $ A "+72304234342" :: Maybe A)

outputs:

"phone: +72304234342\n"
Nothing

As you can see the decode parser fails on the result of encode.

There are two ways of solving this:

  1. Resolve #20
  2. Put the ambiguous string values in quotes

I again vote for the first way.
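For comparison, the second option (quoting ambiguous values on output) can be sketched with base only. This is a simplified illustration, not the encoder's actual logic: looksNumeric, reservedWords, and quoteIfAmbiguous are hypothetical names, and the numeric check covers only an optional sign, digits, and an optional fraction.

```haskell
import Data.Char (isDigit)

-- | True when a plain (unquoted) scalar would read back as a number.
-- Deliberately simplified: optional sign, digits, optional fraction.
looksNumeric :: String -> Bool
looksNumeric s = not (null intPart) && validFraction rest
  where
    unsigned = case s of
      (c : cs) | c `elem` "+-" -> cs
      _ -> s
    (intPart, rest) = span isDigit unsigned
    validFraction "" = True
    validFraction ('.' : ds) = all isDigit ds
    validFraction _ = False

-- | Plain scalars that would not be read back as (non-empty) strings.
reservedWords :: [String]
reservedWords = ["", "true", "false", "null", "~"]

-- | Single-quote a string only when emitting it plain would be ambiguous.
-- Inside YAML single quotes, a quote character is escaped by doubling it.
quoteIfAmbiguous :: String -> String
quoteIfAmbiguous s
  | looksNumeric s || s `elem` reservedWords = "'" ++ concatMap escape s ++ "'"
  | otherwise = s
  where
    escape '\'' = "''"
    escape c = [c]
```

Here quoteIfAmbiguous "+72304234342" yields '+72304234342', while "123a" is left unquoted because it cannot be mistaken for a number.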

Add a warning to GHCJS until it's supported.

As with issue #75, it currently breaks on GHCJS because of the C library. However, it compiles and works fine unless it's actually called. Once called, it breaks with "h$yaml_parser_initialize is not defined" at runtime. This might be unexpected behavior, so I suggest adding the following until GHCJS is supported:

#if (defined (ghcjs_HOST_OS))
  module Data.Yaml {-# WARNING "GHCJS is not supported yet (will break at runtime once called)." #-}
#else
  module Data.Yaml
#endif
  where

It will spit out a warning when somebody uses Data.Yaml on GHCJS. This will make people aware that it breaks, while not unnecessarily breaking their code in case they are fine with this behavior (like for some library which depends on yaml to compile and work if yaml is not used at runtime). I can create a PR.

Runtime exception for nonexistent & empty .yaml files

This might be out of the scope of this project, but from what I've seen, YAML is used for user-level config files, where silly things like an empty or nonexistent file should be acceptable. Is there any way to integrate possibly-empty / possibly-nonexistent error handling in decodeFile? The error I get is from a pattern match failure: an EventDocumentStart is expected, but getting an EventSourceEnd throws it off the rails.

If not, I'm sure I can live with System.Directory :)

Unexpected counterexample to `decode . encode = Just`

Using yaml-0.8.2.1 imported as Y

*Schiller> Y.encode ( [""] :: [String])
"- \n"
*Schiller> Y.decode $ Y.encode ( [""] :: [String]) :: Maybe [String]
Nothing

The problem seems to be the encoding of the empty string without any quotes.

*Schiller> Y.encode ( ["a"] :: [String])
"- a\n"
*Schiller> Y.decode $ Y.encode ( ["a"] :: [String]) :: Maybe [String]
Just ["a"]
*Schiller> 

I don't know YAML well, but this is unexpected. Using aeson, imported as A, the result looks as expected.

*Schiller> A.encode ( [""] :: [String])
Chunk "[\"\"]" Empty
*Schiller> A.decode $ A.encode ( [""] :: [String]) :: Maybe [String]
Just [""]
*Schiller> 

It is thus not one of the pitfalls (http://hackage.haskell.org/packages/archive/aeson/0.6.1.0/doc/html/Data-Aeson.html#g:6) that one has to know about when using aeson.

yaml-0.8.8.3\HSyaml-0.8.8.3.o: unknown symbol `strdup' (on mingw64)

Building yesod-1.2.6 on mingw64

ghc.exe: G:\Haskell\yesod-1.2.6.cabal-sandbox\x86_64-windows-ghc-7.8.2.20140609\yaml-0.8.8.3\HSyaml-0.8.8.3.o: unknown symbol `strdup'

You can see my fix at https://github.com/stuartallenmills/yaml/blob/master/libyaml/api.c

If _WIN64 then STRDUP is defined to _strdup, otherwise strdup. strdup in the code is replaced with STRDUP.

I had to use 7.8.2.xxx because some persistent libraries wouldn't link under 7.8.2 because of a compiler bug.

encodeFile: remove "|-" for Number, Bool and Null values

The concern here is similar to issue #11: we currently prepend every Number, Bool, and Null value with "|-". This is not necessary.

For example, in GHCi, if I do

encodeFile "foo.yml" (object [("array", Array Data.Vector.empty), ("string", String "string"), ("number", Number 0), ("bool", Bool False), ("null", Null)])

then I get:

'string': 'string'
'number': |-
  0
'bool': |-
  false
'array': []
'null': |-
  null

What I would like is:

'string': 'string'
'number': 0
'bool': false
'array': []
'null': null

Am I right about the uselessness of "|-"?

Is there a way to avoid escaping non-ascii characters in output of encode?

Non-ascii characters are currently escaped, and the strings containing them put in double quotes. Sometimes an exclamation mark is added to the front, though I haven't been able to figure out the pattern:

    # examples; the two lines are from different parts of the document
    family: "Fern\xE1n"
    - ! "An\xE1n"

Sometimes one gets instead \u followed by a four-digit hex number.

It would be great if we could have an option to leave the non-ascii characters unescaped (i.e., since it's a bytestring, encoded as UTF-8). The unescaped non-ascii characters seem to pose no problem for 'decode'. (Maybe there is a way to do this already, but I couldn't find one.)

Fails to parse yaml with anchors and references on keys

Given the following yaml document:

--- 
a:
  &id5 value: 1.0
b: 
  *id5: 1.2

decodeEither returns:

Left "UnexpectedEvent {_received = Just (EventAlias \"id5\"), _expected = Nothing}"

It should decode the same as:

--- 
a: 
  value: 1.0
b: 
  value: 1.2

I've run this through a few online YAML linters/parsers to check whether it is valid, and I've not found one that rejects it, so I think it's okay. (I'm getting it as output from a Java YAML library.)

Test suite failure (missing file in hackage tarball)

As seen on the stackage build server:

Failures:

  test/Data/YamlSpec.hs:67:
  1) Data.Yaml.Data.Yaml encode/decode files with non-ASCII names
       uncaught exception: ParseException (InvalidYaml (Just (YamlException "Yaml file not found: test/resources/accent\233/foo.yaml")))

I was able to reproduce locally like so:

$ stack unpack yaml-0.8.18.2
$ cd yaml-0.8.18.2/
$ stack init --resolver nightly-2016-08-31
$ stack test

Make tag implicit for quoted strings

Tags are implicit for plain strings:

> putStrLn $ encode $ object ["abc" .= "def"] 
abc: def

To avoid the surprise of empty exclamation marks, tags should also be implicit for quoted strings.

Currently:

> putStrLn $ encode $ object ["abc " .= "def "]
! 'abc ': ! 'def '

Expected:

> putStrLn $ encode $ object ["abc " .= "def "]
'abc ': 'def '

Don't remove !include statements when not using Data.Yaml.Include

Hey Michael,

I was wondering how difficult it would be to not remove the !include statements from the YAML parsing result when the Data.Yaml.Include module is not used. More explicitly, FromJSON parsing would see (String "!include path") rather than just (String "path").

I would like to do the fetching of the include path myself because I want to be able to get remote files as well (e.g. files hosted on an HTTP server).

I'm not familiar with the C YAML library (libyaml) this package relies on, so before going too deep into it I would like to know your thoughts about this.

Cheers.

Provide a new, alternative, high-level API

The current API based on aeson works for many cases, but not all. See other issues in the issue tracker for an example. This issue is to track progress on the design and implementation of such a module.

round-tripping strings that begin with numbers doesn't work

Prelude Data.ByteString Data.Yaml> decode (encode (String "20 bar")) :: Maybe Value
Just (Number 20.0)

Prelude Data.ByteString Data.Yaml> encode (decode "20 bar" :: Maybe Value)
"20.0\n...\n"

I believe the problem is on the decoding side:

Prelude Data.ByteString Data.Yaml> decode "20 bar" :: Maybe Value
Just (Number 20.0)

Something should be parsed as a number only if there is nothing left over after parsing.
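The rule can be sketched with reads from the Prelude, which returns the unparsed remainder alongside the value. This is an illustration under assumptions: Scalar and classify are hypothetical names, and reads follows Haskell's numeric syntax, not YAML's.

```haskell
-- Hypothetical two-case scalar type, for illustration only.
data Scalar = NumberS Double | StringS String
  deriving (Eq, Show)

-- | Treat the input as a number only when the numeric parse consumes
-- everything; any leftover text means the whole input is a string.
classify :: String -> Scalar
classify s =
  case reads s :: [(Double, String)] of
    [(n, "")] -> NumberS n
    _ -> StringS s
```

With this rule, classify "20 bar" gives StringS "20 bar", because " bar" is left over, while classify "20" gives NumberS 20.0.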

more explicit scalars

The following is a list of strings. Strings starting with a period '.' character get prepended with an exclamation mark '!' and enclosed in single quotes. I'd like both the exclamation mark and the single quotes to go away, as both add needless disambiguation.

  - rnbqkbnr
  - ppppp.pp
  - ! '........'
  - ! '........'
  - ! '.....p.P'
  - ! '.....N..'
  - PPPPPPP.
  - RNBQKB.R

It would be so much better like this:

  - rnbqkbnr
  - ppppp.pp
  - ........
  - ........
  - .....p.P
  - .....N..
  - PPPPPPP.
  - RNBQKB.R

I've discovered that YAML offers both single-quoting, double-quoting, and an implicit style for scalars (seen here http://www.yaml.org/spec/1.2/spec.html#id2760844); it might be better to just make the user explicitly declare which quoting/non-quoting style to use, to better tailor it to his particular situation. For example, in the code above I am never, ever going to use characters other than ".kqrbnpKQRBNP" so it would be nice if I could tell Data.Yaml in the ToJSON instance for my type "I don't want any quoting for this type".

Make auto-quoting more consistent among special cases

Automatic multiline single-quoted strings are confusing.

Currently:

> putStrLn $ encode $ array ["abc", "abc ", "abc\n", "abc \n", "abc\n "]
- abc
- ! 'abc '
- ! 'abc

'
- ! "abc \n"
- ! "abc\n "

Expected:

> putStrLn $ encode $ array ["abc", "abc ", "abc\n", "abc \n", "abc\n "]
- abc
- "abc "
- "abc\n"
- "abc \n"
- "abc\n "

I vote for DoubleQuote being used everywhere, but I'll be ok with SingleQuote provided that multiline single quote is never introduced automatically.

Rationale: I plan to use grep on YAML files, and accidental newlines badly confuse this line-oriented tool.

Tagging should be taken care of as issue #34.
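The requested behaviour (double quotes whenever quoting is needed, with newlines escaped so every scalar stays on one line) can be sketched as follows. needsQuoting and renderScalar are hypothetical names, and the quoting rule is simplified to the cases shown in this issue.

```haskell
-- | True when a string cannot safely be emitted as a plain scalar.
-- Simplified: empty strings, leading or trailing spaces, and characters
-- that need escaping.
needsQuoting :: String -> Bool
needsQuoting s =
  null s
    || any (`elem` "\n\"\\") s
    || take 1 s == " "
    || take 1 (reverse s) == " "

-- | Emit a plain scalar when possible, otherwise a double-quoted one.
-- Newlines become the two characters \n, so output never spans lines.
renderScalar :: String -> String
renderScalar s
  | needsQuoting s = "\"" ++ concatMap escape s ++ "\""
  | otherwise = s
  where
    escape '\n' = "\\n"
    escape '"' = "\\\""
    escape '\\' = "\\\\"
    escape c = [c]
```

Mapping renderScalar over the five strings from the example reproduces the scalars in the "Expected" listing, each on a single line.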

Encoding as `Text`

I may have missed this, but I don't see any equivalent of Aeson's encodeToTextBuilder that I would hope to use for encoding as Text.

Support flow sequences

For stylistic reasons I'd like to encode some of my YAML output in flow sequences. (For me it's low priority, just nice to have.)

But I'm not sure what would be the best interface that'd allow such flexibility. One option would be to have an encoding function that'd accept a predicate like Value -> Bool (or more general Value -> Hint for some new data type Hint).

Data.Aeson-0.7.0.0 causes compile error

Data.Aeson-0.6.2.1 has the constructor Number !Number in Data.Aeson.Types-0.6.2.1.
However, Data.Aeson-0.7.0.0 has Number !Scientific instead of Number !Number in Data.Aeson.Types-0.7.0.0.

This change causes a compile error, as follows:

% cabal sandbox init
% cabal install yaml-0.8.5.2
Resolving dependencies...
...
Configuring yaml-0.8.5.2...
Building yaml-0.8.5.2...
Preprocessing library yaml-0.8.5.2...
[1 of 2] Compiling Text.Libyaml     ( Text/Libyaml.hs, dist/dist-sandbox-73a78a98/build/Text/Libyaml.o )

Text/Libyaml.hs:40:1:
    Warning: In the use of `unsafeForeignPtrToPtr'
             (imported from Foreign.ForeignPtr):
             Deprecated: "Use Foreign.ForeignPtr.Unsafe.unsafeForeignPtrToPtr instead; This function will be removed in the next release"

Text/Libyaml.hs:392:26:
    Warning: This binding for `pi' shadows the existing binding
               imported from `Prelude' at Text/Libyaml.hs:15:8-19
               (and originally defined in `GHC.Float')
[2 of 2] Compiling Data.Yaml        ( Data/Yaml.hs, dist/dist-sandbox-73a78a98/build/Data/Yaml.o )

Data/Yaml.hs:213:52:
    Couldn't match expected type `scientific-0.2.0.1:Data.Scientific.Scientific'
                with actual type `Number'
    In the return type of a call of `I'
    In the second argument of `($)', namely `I x'
    In the expression: Number $ I x
Failed to install yaml-0.8.5.2
cabal: Error: some packages failed to install:
yaml-0.8.5.2 failed during the building phase. The exception was:
ExitFailure 1

Getting decode error details

Currently it is possible to get error details only as an exception from decodeFile.

I would like to have a function that returns an error when parsing a string like
decodeEither :: FromJSON a => ByteString -> Either String a
or just exposed
decodeHelper :: FromJSON a => C.Source Parse Y.Event -> IO (Either ParseException (Maybe a))

ghcjs support

It would be nice if we could parse yaml from code compiled by ghcjs. This should be possible by binding a .js yaml parsing library instead of the c lib-yaml library here in the yaml package and choosing between them using CPP macros.

It may be difficult to get the same level of error reporting from the available JavaScript libraries. I'm not sure. The most popular JS YAML parser seems to be js-yaml.

Have you thought about anything around the issue of ghcjs? What conditions would you want to see met if you would take a PR for this kind of thing?

Decode files with multiple documents

It seems that the current Data.Yaml interface cannot decode files/strings containing multiple documents like


---
x: 10

---
y: 20

This seems easy to incorporate by creating a variant of the parse function in Yaml/Internal.hs. I would be happy to make the necessary changes and submit a patch.

What I would like advice on is the API. Which of these to use?

  • A new decodeAll function for every decode-type function (the Python yaml library does this)
  • A new boolean argument for all decode functions
  • A new decode interface for strings which consumes some input and returns the remaining input so that the caller can resume parsing to find other documents. This allows different documents to be represented by different types and the decoding of later documents to depend on earlier ones. A decodeAll for convenience can be built on top of this.
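The third option can be sketched as a step function that returns one document plus the remaining input, with the decodeAll-style convenience built on top via unfoldr. This is a naive line-based illustration and not a YAML parser: nextDocument and splitDocuments are hypothetical names, and a "---" line inside a block scalar would be split incorrectly.

```haskell
import Data.List (unfoldr)

-- | A line that carries no document content: a "---" marker or a blank line.
isSeparatorOrBlank :: String -> Bool
isSeparatorOrBlank l = l == "---" || all (== ' ') l

-- | Consume one document from the input lines and return it together with
-- the unconsumed remainder, or Nothing when no document is left.
nextDocument :: [String] -> Maybe ([String], [String])
nextDocument ls =
  case dropWhile isSeparatorOrBlank ls of
    [] -> Nothing
    body ->
      let (doc, rest) = break (== "---") body
       in Just (filter (not . all (== ' ')) doc, rest)

-- | Convenience wrapper: repeatedly apply the step function.
splitDocuments :: String -> [[String]]
splitDocuments = unfoldr nextDocument . lines
```

Because nextDocument hands back the remainder, a caller could also decode the first document, inspect it, and only then decide how to treat the rest.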

remove quotes from string-based key/values?

Whenever I export to a .yml file with the function encodeFile, I see that it gets written with tons of single quotes around strings where they are not necessary.

Is there a way to make this function avoid the use of single quotes where there is no need for any disambiguation?

E.g., I want

'game':
'result': 'black-resigns'

to look like

game:
result: black-resigns

yaml-0.8.7 requires conduit >= 1.0.11

The cabal file for yaml-0.8.7 claims that the library works for conduit >= 0.5. However, Data.Yaml.Parser imports the module Data.Conduit.Lift, which only first appeared in conduit-1.0.11. Please either raise the lower bound for the conduit dependency, or change yaml to use only the backwards-compatible part of conduit's API.

Fails to parse Compact Nested Mapping (Spec 1.2 example 2.12)

When parsing the example 2.12 from the 1.2 YAML specification, Data.Yaml fails:

Prelude Data.Yaml Data.ByteString.Char8> a
"---\n# Products purchased\n- item : Super Hoop\n quantity: 1\n- item : Basketball\n quantity: 4\n- item : Big Shoes\n quantity: 1\n"
Prelude Data.Yaml Data.ByteString.Char8> decodeEither $ pack a :: Either String Object
Left "when expecting a HashMap Text a, encountered Array instead"

Can't parse as Text fields that contain only numbers.

If you have a field whose value is parseable as a number, then you can't parse it as Text. Simple test case follows:

{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import qualified Data.Yaml as Y

data S = S Text deriving (Show)

instance Y.FromJSON S where
  parseJSON (Y.Object obj) = fmap S $ obj Y..: "val"

test :: Either String S
test = Y.decodeEither "val: \"1\""

I'm not sure what would be the best solution to this problem, so I'm just opening an issue =).
