note / xml-lens

XML Optics library for Scala

Home Page: https://note.github.io/xml-lens/

License: MIT License

Scala 100.00%

scala xml lenses optics

xml-lens's Issues

How would I focus on only the instances of an element which have a particular attribute value?

This lib looks great!

But I'm a bit stuck on how to do something that I'd expected to be straightforward. Maybe I just don't know enough about optics. What I want is to select only the elements that have an attribute with a given value.

So given this example:

val xml =
  """
    |<a>
    |  <b>
    |    <c example="1">1234</c>
    |    <c example="2">5678</c>
    |    <c example="3">9123</c>
    |  </b>
    |</a>
  """.stripMargin

I'd be after only the <c> that has an example attribute with value 2. Using this expression gives me all of the <c> elements:

val c = root \ "b" \ "c"
println(pl.msitko.xml.parsing.XmlParser.parse(xml).map(c.getAll))

//prints:
//Right(List(Element(Vector(Attribute(ResolvedName(,,example),1)),List(Text(1234)),Vector()), Element(Vector(Attribute(ResolvedName(,,example),2)),List(Text(5678)),Vector()), Element(Vector(Attribute(ResolvedName(,,example),3)),List(Text(9123)),Vector())))

I've tried using having, something like this:

val c = root \ "b" \ "c" having {
  case LabeledElement(_, Element(attr, _, _)) => attr.find(_.key.localName == "example").exists(_.value == "1")
}

but that seems to only pass child elements of <c> to the partial function, which doesn't give me a chance to inspect the attributes.
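For reference, the filtering being asked for can be sketched outside the library on a plain ADT. The types below are simplified stand-ins modelled on the printed output above, not the real xml-lens classes, and hasAttr / childrenWithAttr are hypothetical helper names. It only shows the predicate one would want an optic to apply to each <c> itself rather than to its children:

```scala
// Simplified stand-ins for the AST; these are NOT the real xml-lens types.
object AttrFilterSketch {
  final case class ResolvedName(prefix: String, uri: String, localName: String)
  final case class Attribute(key: ResolvedName, value: String)

  sealed trait Node
  final case class Text(text: String) extends Node
  final case class LabeledElement(label: ResolvedName, element: Element) extends Node
  final case class Element(
      attributes: Seq[Attribute] = Seq.empty,
      children: Seq[Node] = Seq.empty
  )

  // True when the element carries an attribute `name` whose value is `value`.
  def hasAttr(el: Element, name: String, value: String): Boolean =
    el.attributes.exists(a => a.key.localName == name && a.value == value)

  // Direct children of `parent` labelled `label` that carry the attribute.
  def childrenWithAttr(parent: Element, label: String, name: String, value: String): Seq[Element] =
    parent.children.collect {
      case LabeledElement(l, el) if l.localName == label && hasAttr(el, name, value) => el
    }

  private def qn(local: String) = ResolvedName("", "", local)
  private def c(example: String, body: String): Node =
    LabeledElement(qn("c"), Element(Seq(Attribute(qn("example"), example)), Seq(Text(body))))

  // The <b> element from the question, built by hand instead of parsed.
  val b: Element = Element(children = Seq(c("1", "1234"), c("2", "5678"), c("3", "9123")))
}
```

Calling AttrFilterSketch.childrenWithAttr(AttrFilterSketch.b, "c", "example", "2") keeps only the <c> whose text is 5678.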

Benchmarks

Interoperability with XML literals

Is it possible to have some kind of interoperability with XML literals?

Normalization of XML

At some point we will want to have reasonable output. Beyond the pure formatting aspect, it would be nice to, e.g., avoid multiple namespace declarations for the same namespace. Probably all namespace declarations should be moved to the root element.

Such operations should be optional: there may be cases where the user wants to avoid unnecessary transformations and keep the output as similar to the input as possible.

There's an example of such behavior (namely, many namespace declarations for one namespace) in the test replaceOrAddAttr for ResolvedNameMatcher in OpticsBuilderSpec.

XmlParser should read XML version and encoding

Using e.g. https://docs.oracle.com/javase/8/docs/api/javax/xml/stream/XMLStreamReader.html#getCharacterEncodingScheme--

On the other hand, I'm not sure whether any values other than 1.0 and utf-8 are in practical use...
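A minimal sketch of reading both values with the StAX API linked above. XmlDeclaration and readDeclaration are hypothetical names used only for illustration; how XmlParser would expose this is left open:

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets
import javax.xml.stream.XMLInputFactory

object XmlDeclSketch {
  // Hypothetical holder for the declaration data; not an xml-lens type.
  final case class XmlDeclaration(version: Option[String], encoding: Option[String])

  def readDeclaration(bytes: Array[Byte]): XmlDeclaration = {
    val reader =
      XMLInputFactory.newInstance().createXMLStreamReader(new ByteArrayInputStream(bytes))
    try {
      // Both getters are documented to return null when nothing was declared,
      // hence the Option wrappers.
      XmlDeclaration(Option(reader.getVersion), Option(reader.getCharacterEncodingScheme))
    } finally reader.close()
  }

  val sample: Array[Byte] =
    """<?xml version="1.0" encoding="UTF-8"?><a/>""".getBytes(StandardCharsets.UTF_8)
}
```

XmlDeclSketch.readDeclaration(XmlDeclSketch.sample) reads the declaration before any element is consumed, so it could in principle run as the first step of parsing.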

Decide on using monocle-cats

Performance tests mimicking real usage

There are already some simple tests, but they're very synthetic. They're useful in that they let us easily find the bottleneck. Besides them, we should have tests mimicking real-world usage: doing some transformations on real-world XML and operating on fairly big files (even a few MBs may be interesting).

It would be nice to add test results to the docs (probably in a separate MD file so as not to clutter the main docs).

PrettyPrinter

FYI xmlformat... xml ADT and encoder / decoder

In case you're interested:

https://gitlab.com/fommil/scalaz-deriving/tree/master/examples/xmlformat/src/main/scala/xmlformat

You might be interested in the XNode.scala file, and from there the encoders/decoders and the parsers/printers.

Reasonable equal implementation for Node

It's not obvious how equality should be implemented.

Probably related to #5

Equivalent to `javax.xml.stream.isCoalescing` in `XmlParser`

When replacingEntityReferences is enabled, several Text nodes may appear in a row. In theory javax.xml.stream.isCoalescing should control this behavior, but while setting it to true solves that issue, it has some unexpected side effects: namely, EntityReferences are not parsed if replacingEntityReferences is set to true. It may seem that we could set isCoalescing only when replacing... is set to true, but that will not work, as it also causes CData not to be parsed. It's described here: https://docs.oracle.com/cd/E17802_01/webservices/webservices/docs/1.5/sjsxp/ReleaseNotes.html

To avoid relying on strange behaviors of Java parsers, I think xml-lens should provide coalescing functionality itself, either as part of the parser or as post-processing (the same way minimize is done).
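The post-processing variant could look roughly like the sketch below. It runs on a deliberately tiny stand-in AST (Text and Element only), not on the real xml-lens Node hierarchy, and coalesce is a hypothetical name:

```scala
object CoalesceSketch {
  // Tiny stand-in AST; NOT the real xml-lens types.
  sealed trait Node
  final case class Text(text: String) extends Node
  final case class Element(label: String, children: List[Node]) extends Node

  // Merge runs of adjacent Text nodes, recursing into child elements.
  // Folding from the right means each Text only has to look at the head
  // of the already-processed tail.
  def coalesce(nodes: List[Node]): List[Node] =
    nodes.foldRight(List.empty[Node]) {
      case (Text(a), Text(b) :: rest) => Text(a + b) :: rest
      case (Element(l, cs), acc)      => Element(l, coalesce(cs)) :: acc
      case (other, acc)               => other :: acc
    }
}
```

In a complete AST, node types such as CData or entity references would fall into the last case and stay untouched, which is the behavior the isCoalescing flag fails to give.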

Scala.js io module

`Element`s don't compare equal when attributes are in different order

Possibly related to #7

We find that two parsed ASTs often don't compare equal because the attributes/namespace declarations are in a different order. The order of attributes should be irrelevant - https://www.w3.org/TR/REC-xml/#sec-starttags

This seems to be caused by the attributes/namespace declarations being stored in a Seq:

final case class Element(attributes: Seq[Attribute] = Seq.empty, children: Seq[Node] = Seq.empty, namespaceDeclarations: Seq[NamespaceDeclaration] = Seq.empty)

Could this be solved by using a Map? For instance:

final case class Element(attributes: Map[ResolvedName, String] = Map.empty, children: Seq[Node] = Seq.empty, namespaceDeclarations: Map[String, String] = Map.empty)

I guess this loses the use of the explicit Attribute and NamespaceDeclaration types, but it is worth the tradeoff IMHO.
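The difference between the two representations can be shown with a small sketch. SeqElement and MapElement are hypothetical, trimmed-down stand-ins for the two Element definitions above (attributes only), and toMapElement is an illustrative helper:

```scala
object AttrOrderSketch {
  final case class ResolvedName(prefix: String, uri: String, localName: String)
  final case class Attribute(key: ResolvedName, value: String)

  // Seq-based storage, as in the current definition: equality is order-sensitive.
  final case class SeqElement(attributes: Seq[Attribute] = Seq.empty)

  // Map-based storage, as proposed: equality ignores insertion order.
  final case class MapElement(attributes: Map[ResolvedName, String] = Map.empty)

  // Conversion between the two; when a key repeats, the last value wins.
  def toMapElement(e: SeqElement): MapElement =
    MapElement(e.attributes.map(a => a.key -> a.value).toMap)
}
```

Two caveats with the Map variant: duplicate attribute names collapse to a single entry, and a Map does not guarantee that the original attribute order can be reproduced when printing.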

Make AST complete

Entity references and PCData, among others, are missing.

Add more options to PrinterConfig

Ideas for additional options in PrinterConfig:

add a Boolean option for repairing namespaces (i.e. automatically defining used namespaces in case they're not yet defined)
add an option which defines how to treat multiple attributes with the same name on one element (namely <a attr="val1" attr="val2"></a>). Example behaviors: ignore it and print all of them, flatten them by concatenating them separated by spaces, use the last value, use the first value
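One possible shape for these options, as a rough sketch only. PrinterOptions, DuplicatePolicy and resolve are hypothetical names, not the existing PrinterConfig API, and attributes are reduced to (name, value) pairs:

```scala
object DuplicateAttrSketch {
  // The four behaviors listed above, as a closed set of choices.
  sealed trait DuplicatePolicy
  case object PrintAll extends DuplicatePolicy
  case object Concatenate extends DuplicatePolicy
  case object UseFirst extends DuplicatePolicy
  case object UseLast extends DuplicatePolicy

  // Hypothetical option holder; NOT the real PrinterConfig.
  final case class PrinterOptions(
      repairNamespaces: Boolean = false,
      duplicates: DuplicatePolicy = PrintAll
  )

  // Attributes as (name, value) pairs in document order.
  def resolve(attrs: List[(String, String)], policy: DuplicatePolicy): List[(String, String)] =
    policy match {
      case PrintAll => attrs
      case _ =>
        // Distinct names, in order of first appearance.
        attrs.map(_._1).distinct.map { name =>
          val values = attrs.collect { case (`name`, v) => v }
          val merged = policy match {
            case Concatenate => values.mkString(" ")
            case UseFirst    => values.head
            case _           => values.last
          }
          name -> merged
        }
    }
}
```

With Concatenate, the element <a attr="val1" attr="val2"></a> would be printed with a single attr="val1 val2".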

Ensure that no internal types are leaking into client

Add appropriate package private modifiers etc.
