clj-xml

A Clojure library designed to make conversions between EDN and XML a little easier.

This repository follows the guidelines and standards of the Wall Brew Open Source Policy.

Installation

A deployed copy of the most recent version of clj-xml can be found on Clojars. To use it, add the coordinates listed there as a dependency in your project.clj or deps.edn file.

The next time you build your application, Leiningen or the Clojure CLI should pull it automatically. Alternatively, you may clone or fork the repository to work with it directly.
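
As a rough sketch, a dependency entry might look like the following. The group ID com.wallbrew is an assumption based on the organization's other artifacts, and "x.y.z" is a placeholder rather than a real version; copy the exact coordinates from the Clojars page.

```clojure
;; project.clj (Leiningen). The coordinates are assumed; "x.y.z" is a placeholder.
:dependencies [[com.wallbrew/clj-xml "x.y.z"]]

;; deps.edn (Clojure CLI) equivalent, under the same assumptions.
{:deps {com.wallbrew/clj-xml {:mvn/version "x.y.z"}}}
```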

Public Functions

XML Parsing

This library is built on top of clojure.data.xml and works with the element structure it implements. Depending on where your XML lives and how it is stored, use one of the following three functions to extract it.

  • When you have already parsed an XML document into clojure.data.xml elements: xml->edn
  • When you have a string containing an XML document: xml-str->edn
  • When you have a java.io.InputStream or java.io.Reader (generally used for HTTP/File System): xml-source->edn

Each of these functions accepts an option map as an optional second argument, supporting the following keys:

| Option | Default Value | Description |
|--------|---------------|-------------|
| :preserve-keys? | false | Maintain the exact keyword structure provided by clojure.xml/parse |
| :preserve-attrs? | false | Maintain embedded XML attributes |
| :remove-empty-attrs? | false | Remove any empty attribute maps |
| :stringify-values? | false | Coerce non-nil, non-string, non-collection values to strings |
| :remove-newlines? | false | Remove any newline characters in xml-str. Only applicable for xml-str->edn |
| :force-seq? | false | Coerce all child XML nodes into an array of maps |
| :force-seq-for-paths | [] | A sequence of key-path sequences that will be selectively coerced into sequences. Read more about Key Pathing below |

xml-str->edn and xml-source->edn also support the parsing options from clojure.data.xml and Java's XMLInputFactory class. Additional documentation from Oracle is available. This library does not override the default behavior of XMLInputFactory.

| Option | Default Value | Description |
|--------|---------------|-------------|
| :include-node? | #{:element :characters} | A subset of #{:element :characters :comment}, representing the XML nodes to keep |
| :location-info | true | Whether or not location metadata should be generated and attached to results |
| :allocator | Object created Just-In-Time | An instance of an XMLInputFactory/ALLOCATOR to allocate events |
| :coalescing | true | Convert CDATA nodes to text, and append any adjacent text nodes |
| :namespace-aware | false | Whether or not XML 1.0 namespacing support is enabled |
| :replacing-entity-references | false | Whether or not internal entity text should be replaced with its XML entity form |
| :supporting-external-entities | false | Whether or not externally hosted entities will be resolved at parsing time and evaluated |
| :validating | false | Whether or not attached DTDs will be resolved and used to validate the XML document |
| :reporter | Object created Just-In-Time | An instance of an XMLInputFactory/REPORTER to use in place of the default |
| :resolver | Object created Just-In-Time | An instance of an XMLInputFactory/RESOLVER to use in place of the default |
| :support-dtd | false | Whether or not attached DTDs will be read by the parser |
| :skip-whitespace | false | Whether or not any whitespace-only elements will be preserved as nodes |
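
A minimal sketch of passing a parser option alongside one of the library's own options in a single map, assuming (as described above) that xml-str->edn forwards parser options to clojure.data.xml. The pretty-xml document is a made-up example, and the exact shape of the returned EDN is deliberately not shown.

```clojure
(require '[clj-xml.core :as xml])

;; A hypothetical, pretty-printed document whose indentation produces
;; whitespace-only text nodes between elements.
(def pretty-xml
  "<ORDER>\n  <ITEM>tea</ITEM>\n  <ITEM>milk</ITEM>\n</ORDER>")

;; Library options and parser options share one map.
(def parsed
  (xml/xml-str->edn pretty-xml {:skip-whitespace true    ; parser option
                                :preserve-keys?  true})) ; clj-xml option

(println parsed)
```

For string input, :remove-newlines? from the first table is another way to deal with line breaks in pretty-printed documents.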

Let's see how it works:

(require '[clj-xml.core :as xml])

(def xml-example
  {:tag :TEST_DOCUMENT
   :attrs {:XMLNS "https://www.fake.not/real"}
   :content
   [{:tag :HEAD
     :attrs nil
     :content
     [{:tag :META_DATA :attrs {:TYPE "title"} :content ["Some Fake Data!"]}
      {:tag :META_DATA :attrs {:TYPE "tag"} :content ["Example Content"]}]}
    {:tag :FILE
     :attrs
     {:POSTER "JANE DOE <[email protected]>"
      :DATE "2020/04/12"
      :SUBJECT "TEST DATA"}
     :content
     [{:tag :GROUPS
       :attrs nil
       :content
       [{:tag :GROUP :attrs nil :content ["test-data-club"]}]}
      {:tag :SEGMENTS
       :attrs nil
       :content
       [{:tag :SEGMENT
         :attrs {:BITS "00111010" :NUMBER "58"}
         :content ["more data"]}
        {:tag :SEGMENT
         :attrs {:BYTES "10100010" :NUMBER "-94"}
         :content ["more fake data"]}]}]}]})

(xml/xml->edn xml-example)
;; => {:test-document
;;     {:head [{:meta-data "Some Fake Data!"}
;;             {:meta-data "Example Content"}]
;;      :file {:groups [{:group "test-data-club"}]
;;             :segments [{:segment "more data"}
;;                        {:segment "more fake data"}]}}}

;; Parse a string instead
(def xml-test-string
  "<?xml version=\"1.0\" encoding=\"UTF-8\"?><TEST_DOCUMENT XMLNS=\"https://www.fake.not/real\"><HEAD><META_DATA TYPE=\"title\">Some Fake Data!</META_DATA><META_DATA TYPE=\"tag\">Example Content</META_DATA></HEAD><FILE POSTER=\"JANE DOE &lt;[email protected]&gt;\" DATE=\"2020/04/12\" SUBJECT=\"TEST DATA\"><GROUPS><GROUP>test-data-club</GROUP></GROUPS><SEGMENTS><SEGMENT BITS=\"00111010\" NUMBER=\"58\">more data</SEGMENT><SEGMENT BYTES=\"10100010\" NUMBER=\"-94\">more fake data</SEGMENT></SEGMENTS></FILE></TEST_DOCUMENT>")

(xml/xml-str->edn xml-test-string)
;; => {:test-document
;;     {:head [{:meta-data "Some Fake Data!"}
;;             {:meta-data "Example Content"}]
;;      :file {:groups [{:group "test-data-club"}]
;;             :segments [{:segment "more data"}
;;                        {:segment "more fake data"}]}}}

;; Preserve the XML_CASE
(xml/xml->edn xml-example {:preserve-keys? true})
;; => {:TEST_DOCUMENT
;;     {:HEAD [{:META_DATA "Some Fake Data!"}
;;             {:META_DATA "Example Content"}]
;;      :FILE {:GROUPS [{:GROUP "test-data-club"}]
;;             :SEGMENTS [{:SEGMENT "more data"}
;;                        {:SEGMENT "more fake data"}]}}}

;; Preserve the XML attributes
(xml/xml->edn xml-example {:preserve-attrs? true})
;; =>   {:test-document
;;      {:head [{:meta-data "Some Fake Data!" :meta-data-attrs {:type "title"}}
;;              {:meta-data "Example Content" :meta-data-attrs {:type "tag"}}]
;;       :file {:groups [{:group "test-data-club"}]
;;              :segments [{:segment "more data" :segment-attrs {:bits "00111010" :number "58"}}
;;                         {:segment "more fake data" :segment-attrs {:bytes "10100010" :number "-94"}}]}
;;       :file-attrs {:poster "JANE DOE <[email protected]>"
;;                    :date "2020/04/12"
;;                    :subject "TEST DATA"}}
;;      :test-document-attrs {:xmlns "https://www.fake.not/real"}}
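
The examples above cover xml->edn and xml-str->edn. For the third entry point, here is a small sketch that reads from a java.io.Reader. The StringReader is only a stand-in for a file or HTTP response body, and the result is printed rather than asserted because its exact shape is not documented here.

```clojure
(require '[clj-xml.core :as xml])
(import '(java.io StringReader))

;; A StringReader stands in for a file or HTTP response body.
(def source
  (StringReader. "<GROUPS><GROUP>test-data-club</GROUP></GROUPS>"))

;; The same option map accepted by the other parsing functions
;; may be passed as a second argument.
(def parsed-from-source (xml/xml-source->edn source))

(println parsed-from-source)
```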

Key Pathing

When you want to ensure that selected paths in the returned EDN are coerced to sequences, you may pick one of two options:

  • :force-seq? - to coerce all child XML nodes into a collection of nodes.
  • :force-seq-for-paths - to coerce selected child XML nodes into collections of nodes.

In the case of :force-seq-for-paths, you supply a sequence of key paths, each of which points to a child node à la assoc-in. A key path may contain three types of keys, which may be used together.

  • The namespace qualified keywords: :clj-xml.core/first, :clj-xml.core/last, and :clj-xml.core/every
    • These will modify a sequence's first, last, or every child, respectively
  • The provided alias symbols for the above keywords: first-child, last-child, and every-child
    • These will modify a sequence's first, last, or every child, respectively
  • Bare keywords
    • These will modify the matching given keyword in a map a la update

If the key path is incongruent with the current data structure (e.g. every-child applied to a hash-map), an exception will be thrown.

(require '[clj-xml.core :as xml])

(xml/xml->edn xml-example {:force-seq-for-paths [[:test-document :file :segments xml/every-child]
                                                  [:test-document]]})
;; => {:test-document
;;     [{:head [{:meta-data "Some Fake Data!"}
;;              {:meta-data "Example Content"}]
;;       :file {:groups [{:group "test-data-club"}]
;;              :segments [[{:segment "more data"}]
;;                         [{:segment "more fake data"}]]}}]}

(xml/xml->edn xml-example {:force-seq-for-paths [[xml/every-child]]})
;; => java.lang.IllegalArgumentException: The key clj-xml.core/every is incompatible with class clojure.lang.PersistentArrayMap
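
To make the path semantics concrete, here is a self-contained sketch of how such a key path could be walked. This is not the library's implementation: force-seq-at, update-nth, and the ::first, ::last, and ::every markers are hypothetical stand-ins, and leaving already-sequential values untouched at the end of a path is an assumption.

```clojure
;; Hypothetical stand-ins for :clj-xml.core/first, /last, and /every.
(def first-child ::first)
(def last-child ::last)
(def every-child ::every)

(defn- update-nth
  "Apply f to the element at index i of a sequential collection."
  [coll i f]
  (update (vec coll) i f))

(defn force-seq-at
  "Walk `path` through `data` and wrap the value found at its end in a vector.
   Assumption: values that are already sequential are left as they are."
  [data path]
  (if (empty? path)
    (if (sequential? data) data [data])
    (let [[k & more] path
          descend    #(force-seq-at % more)]
      (cond
        ;; Positional markers only make sense on sequences.
        (#{::first ::last ::every} k)
        (do
          (when-not (sequential? data)
            (throw (IllegalArgumentException.
                    (str "The key " k " is incompatible with " (class data)))))
          (condp = k
            ::first (update-nth data 0 descend)
            ::last  (update-nth data (dec (count data)) descend)
            ::every (mapv descend data)))

        ;; Bare keywords descend into a map, a la update.
        :else
        (update data k descend)))))

(force-seq-at {:segments [{:segment "more data"} {:segment "more fake data"}]}
              [:segments every-child])
;; => {:segments [[{:segment "more data"}] [{:segment "more fake data"}]]}
```

Treating positional markers and bare keywords as separate branches keeps the mismatch check in one place, which mirrors the exception shown above when every-child meets a map.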

XML Emission

To convert EDN into XML, you'll want to use one of the following functions based on the target location.

  • When you want clojure.data.xml elements: edn->xml
  • When you want an unindented string containing an XML document: edn->xml-str
  • When you want to write to a java.io.OutputStream or java.io.Writer (generally used for HTTP/File System): edn->xml-stream

Each of these functions accepts an option map as an optional final argument, supporting the following keys:

| Option | Default Value | Description |
|--------|---------------|-------------|
| :to-xml-case? | false | Whether or not the keys representing XML tags will be coerced to XML_CASE |
| :from-xml-case? | false | Whether or not the source EDN is in XML_CASE |
| :stringify-values? | false | Whether or not non-nil, non-string, non-collection values should be coerced to strings |

edn->xml-str and edn->xml-stream also support the emission options from clojure.data.xml:

| Option | Default Value | Description |
|--------|---------------|-------------|
| :encoding | UTF-8 | A string representing the character encoding the encoder should emit |
| :doctype | nil | An XML Doctype that should be written as the emitted document's DTD |
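
A brief sketch of combining a library option with an emission option in one map, assuming edn->xml-str forwards :encoding to clojure.data.xml as described. The order map is a made-up input, and the emitted string is printed rather than quoted here.

```clojure
(require '[clj-xml.core :as xml])

;; A hypothetical, minimal EDN document.
(def order {:order {:item "tea"}})

(def xml-out
  (xml/edn->xml-str order {:to-xml-case? true           ; clj-xml option
                           :encoding     "ISO-8859-1"})) ; emission option

(println xml-out)
```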

Let's see how it works:

(require '[clj-xml.core :as xml])

(def edn-example-with-attrs-and-original-keys
  {:TEST_DOCUMENT
   {:HEAD [{:META_DATA "Some Fake Data!" :META_DATA_ATTRS {:TYPE "title"}}
           {:META_DATA "Example Content" :META_DATA_ATTRS {:TYPE "tag"}}]
    :FILE {:GROUPS [{:GROUP "test-data-club"}]
           :SEGMENTS [{:SEGMENT "more data" :SEGMENT_ATTRS {:BITS "00111010" :NUMBER "58"}}
                      {:SEGMENT "more fake data" :SEGMENT_ATTRS {:BYTES "10100010" :NUMBER "-94"}}]}
    :FILE_ATTRS {:POSTER "JANE DOE <[email protected]>"
                 :DATE "2020/04/12"
                 :SUBJECT "TEST DATA"}}
   :TEST_DOCUMENT_ATTRS {:XMLNS "https://www.fake.not/real"}})

(xml/edn->xml-str edn-example-with-attrs-and-original-keys {:to-xml-case? true :from-xml-case? true :stringify-values? true})
;; => "<?xml version=\"1.0\" encoding=\"UTF-8\"?><TEST_DOCUMENT XMLNS=\"https://www.fake.not/real\"><HEAD><META_DATA TYPE=\"title\">Some Fake Data!</META_DATA><META_DATA TYPE=\"tag\">Example Content</META_DATA></HEAD><FILE POSTER=\"JANE DOE &lt;[email protected]&gt;\" DATE=\"2020/04/12\" SUBJECT=\"TEST DATA\"><GROUPS><GROUP>test-data-club</GROUP></GROUPS><SEGMENTS><SEGMENT BITS=\"00111010\" NUMBER=\"58\">more data</SEGMENT><SEGMENT BYTES=\"10100010\" NUMBER=\"-94\">more fake data</SEGMENT></SEGMENTS></FILE></TEST_DOCUMENT>"

License

Copyright © Wall Brew Co

This software is provided for free, public use as outlined in the MIT License.

clj-xml's Issues

Fix CI/CD Actions for new contributors

Bug Report

Describe the Bug

A new contributor creates a Pull Request. After a WB maintainer approves the CI Runner to execute, most of the actions fail.

Steps to Reproduce

Example here: #20

Expected Behavior

The CI works the same as if a WB maintainer executed it

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Detected dependencies

github-actions
.github/workflows/cljdoc_test.yml
  • actions/checkout v4.1.4
  • actions/cache v4
  • cljdoc/cljdoc-check-action v0.0.3
.github/workflows/clojure.yml
  • actions/checkout v4.1.4
  • actions/cache v4
.github/workflows/contributors.yml
  • wow-actions/contributors-list v1
.github/workflows/deploy_to_clojars.yml
  • actions/checkout v4.1.4
  • actions/cache v4
  • hole19/git-tag-action v1.0
.github/workflows/format.yml
  • actions/checkout v4.1.4
  • just-sultanov/setup-cljstyle v1
  • stefanzweifel/git-auto-commit-action v5.0.1
.github/workflows/greetings.yml
  • actions/first-interaction v1
.github/workflows/label.yml
  • actions/labeler v5
.github/workflows/lint.yml
  • actions/checkout v4.1.4
  • reviewdog/action-misspell v1
  • actions/checkout v4.1.4
  • nnichols/clojure-lint-action v2
  • actions/checkout v4.1.4
.github/workflows/scanner.yml
  • actions/checkout v4.1.4
  • clj-holmes/clj-holmes-action 53daa4da4ff495cccf791e4ba4222a8317ddae9e
  • github/codeql-action v3
.github/workflows/sync_labels.yml
  • actions/checkout v4.1.4
  • micnncim/action-label-syncer v1
.github/workflows/todo.yml
  • actions/checkout v4.1.4
leiningen
project.clj
  • org.clojure:clojure 1.11.3
  • org.clojure:data.xml 0.2.0-alpha9
  • com.github.clj-kondo:lein-clj-kondo 2024.03.13
  • com.wallbrew:lein-sealog 1.6.0
  • lein-project-version:lein-project-version 0.1.0
  • mvxcvi:cljstyle 0.16.630
maven
pom.xml
  • org.clojure:clojure 1.11.3
  • org.clojure:data.xml 0.2.0-alpha9

Update to Clojure 1.11

Feature Request

To remove warnings related to update-keys and update-vals on projects using Clojure 1.11, can clj-xml be updated to use the new core functions?

Problem Statement

Starting a server running Clojure 1.11 and a clj-xml dependency shows warnings.
E.g.

WARNING: update-keys already refers to: #'clojure.core/update-keys in namespace: clj-xml.impl, being replaced by: #'clj-xml.impl/update-keys

Ideal Solution

Update to Clojure 1.11 and use the new core functions.
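
For reference, a short sketch of the two core functions the request refers to. It requires Clojure 1.11 or newer; the maps are made-up examples and are not taken from clj-xml's code.

```clojure
;; Requires Clojure 1.11+, where both functions live in clojure.core.

;; update-keys applies a function to every key of a map.
(def renamed (update-keys {:TEST_DOCUMENT "a" :HEAD "b"} name))
;; => {"TEST_DOCUMENT" "a", "HEAD" "b"}

;; update-vals applies a function to every value of a map.
(def bumped (update-vals {:bits 58 :bytes -94} inc))
;; => {:bits 59, :bytes -93}
```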

force-seq-for-nested-keys

Feature Request

Problem Statement

This option exists today and does exactly what it's supposed to:
force-seq? - to coerce child XML nodes into an array of maps.

Sometimes, there will only ever be a single child XML node, but the edn returned will be a vector of maps instead of a single map.

Ideal Solution

A new option, something like:
force-seq-for-nested-keys - to coerce child XML nodes using the supplied nested keys into an array of maps

Alternative solutions

Workaround.
When clj-xml sometimes returns a map and other times a vector of maps, we put the single map into a vector so we can consistently handle a vector of maps.

(defn vectorize
  [x]
  (if (vector? x)
    x
    [x]))
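
A small usage sketch of this workaround, repeating the function so the block runs on its own. The single and several values are hypothetical parse results, not actual clj-xml output.

```clojure
(defn vectorize
  "Wrap x in a vector unless it already is one."
  [x]
  (if (vector? x)
    x
    [x]))

;; Hypothetical results: one document had a single segment, another had two.
(def single  {:segment "more data"})
(def several [{:segment "more data"} {:segment "more fake data"}])

;; Downstream code can now treat both shapes the same way.
(def single-texts (mapv :segment (vectorize single)))
;; => ["more data"]
(def several-texts (mapv :segment (vectorize several)))
;; => ["more data" "more fake data"]
```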
