u8xmlparser's People

Contributors

ikorin24, null-l, proudust, ramtype0, sebastianstehle

u8xmlparser's Issues

Feature Request: Support for mixed content

As far as I know, the following XML is perfectly valid:

<div>
<strong>Hello</strong> U8XMLParser
</div>

But in this case I have no idea how to get the inner xml.

  • InnerText is empty.
  • AsRawXml returns everything, including the div element.
  • Looping over the child XmlNodes returns only the strong node, but no text node.
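For comparison, here is a minimal sketch of how System.Xml.Linq exposes the same mixed content: the text after the strong element appears as its own XText node next to the XElement. The class and method names are hypothetical and only illustrate the behaviour being requested; nothing here is U8XmlParser API.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

public static class MixedContentSketch
{
    // Hypothetical helper: lists the direct children of the root element,
    // distinguishing element nodes from text nodes.
    public static List<string> DescribeChildren(string xml)
    {
        var result = new List<string>();
        foreach (XNode node in XElement.Parse(xml).Nodes())
        {
            if (node is XElement element)
                result.Add("element:" + element.Name.LocalName);
            else if (node is XText text)
                result.Add("text:" + text.Value.Trim());
            else
                result.Add(node.NodeType.ToString());
        }
        return result;
    }
}
```

Calling DescribeChildren on the div example above should yield one element entry for strong followed by one text entry for "U8XMLParser".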

Feature Request: Xml Validation

It would be great to have XmlValidation as part of this library using XmlSchemaSet.

Currently (using XDocument):

var schemas = new XmlSchemaSet();
schemas.Add("<namespace>", XmlReader.Create("<XsdLocation>"));
var xml = "<test></test>";

var doc = XDocument.Parse(xml);
doc.Validate(schemas, ValidationCallBack);

private void ValidationCallBack(object? sender, ValidationEventArgs e) 
{
    ...
}

System.Xml.Schema.Extensions.Validate

I'm not sure what the API would look like, but I would prefer Validate to return an object instead of taking a callback.
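One possible shape for a "returned object" API, sketched on top of the existing XDocument validation rather than U8XmlParser itself: the callback is kept internal and the collected events are returned as a list. XmlValidationSketch and its Validate method are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;
using System.Xml.Schema;

public static class XmlValidationSketch
{
    // Hypothetical wrapper: runs schema validation and returns every
    // validation event instead of forwarding it to a caller-supplied callback.
    public static IReadOnlyList<ValidationEventArgs> Validate(string xml, XmlSchemaSet schemas)
    {
        var problems = new List<ValidationEventArgs>();
        var doc = XDocument.Parse(xml);
        // System.Xml.Schema.Extensions.Validate invokes the handler once per error or warning.
        doc.Validate(schemas, (sender, e) => problems.Add(e));
        return problems;
    }
}
```

An empty list then means the document passed validation; each entry otherwise carries the Severity and Message of one problem.", schemas).Count != 0) throw new System.Exception("valid document reported problems");
if (XmlValidationSketch.Validate("<test>abc</test>", schemas).Count == 0) throw new System.Exception("invalid document reported no problems");
</test>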

`RawString.StartsWith` and `RawString.EndsWith` treat any unpaired surrogate in the argument string as "�"

RawString.StartsWith and RawString.EndsWith treat any unpaired surrogate in the argument string as "�".
For example:

[Fact]
public unsafe void UnpairedSurrogateComparison()
{
    // "\ufffd" == "�" is the default fallback character of UTF8Encoding.
    const string FallbackCharStr = "\ufffd";
    // "\ud83d" is an unpaired high surrogate.
    const string SurrogateCharStr = "\ud83d";
    var fallbackCharUtf8Bytes = Encoding.UTF8.GetBytes(FallbackCharStr);
    fixed (byte* ptr = fallbackCharUtf8Bytes)
    {
        var fallbackCharRawStr = new RawString(ptr, fallbackCharUtf8Bytes.Length);
        Assert.False(fallbackCharRawStr.StartsWith(SurrogateCharStr));
        Assert.False(fallbackCharRawStr.EndsWith(SurrogateCharStr));
    }
}

This kind of test fails.
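The behaviour can be reproduced with the .NET base class library alone. The sketch below shows that Encoding.UTF8 silently encodes a lone surrogate as the bytes of U+FFFD, while a UTF8Encoding constructed with throwOnInvalidBytes set to true rejects it. Whether the library converts the string argument this way internally is an assumption; the class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class SurrogateFallbackSketch
{
    // Default behaviour: invalid UTF-16 input is replaced with U+FFFD (EF BF BD).
    public static byte[] EncodeLenient(string s) => Encoding.UTF8.GetBytes(s);

    // Strict behaviour: invalid UTF-16 input is reported instead of replaced.
    public static bool TryEncodeStrict(string s, out byte[] bytes)
    {
        var strict = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);
        try
        {
            bytes = strict.GetBytes(s);
            return true;
        }
        catch (EncoderFallbackException)
        {
            bytes = Array.Empty<byte>();
            return false;
        }
    }
}
```

A comparison routine that wants an unpaired surrogate never to match could use the strict form and return false when encoding fails.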

XmlNode.OuterXml

I am trying to get the outer XML of a node, but currently it looks like you only have InnerText. Is this something that could be added?

EDIT:

An example would be

<Messages>
    <Message Id="1">
        <Name>Foo</Name>
    </Message>
    <Message Id="2">
        <Name>Foo</Name>
    </Message>
</Messages>

The OuterXml of the first Message in Messages.Children should then be:

<Message Id="1">
    <Name>Foo</Name>
</Message>

FormatException thrown when parsing DOCTYPE

I am currently trying to parse this xml file:

<?xml version="1.0"?>
<!DOCTYPE datafile PUBLIC "-//FB Alpha//DTD ROM Management Datafile//EN" "http://www.logiqx.com/Dats/datafile.dtd">

<datafile>
</datafile>

A FormatException is being thrown when parsing the DOCTYPE element.

System.FormatException
  HResult=0x80131537
  Message=Exception of type 'System.FormatException' was thrown.
  Source=U8XmlParser
  StackTrace:
   at U8Xml.XmlParser.<TryParseDocType>g__SkipUntil|17_0(Byte ascii, RawString data, Int32& i)
   at U8Xml.XmlParser.TryParseDocType(RawString data, Int32& i, Boolean hasNode, OptionalNodeList optional, RawStringTable& entities)
   at U8Xml.XmlParser.StartStateMachine(RawString data, CustomList`1 nodes, CustomList`1 attrs, OptionalNodeList optional, RawStringTable& entities)
   at U8Xml.XmlParser.ParseCore(UnmanagedBuffer& utf8Buf, Int32 length)
   at U8Xml.XmlParser.ParseFileCore(String filePath, Encoding encoding)
   at U8Xml.XmlParser.ParseFile(String filePath, Encoding encoding)
   at U8Xml.XmlParser.ParseFile(String filePath)

Tested on U8XmlParser v1.5.0

RawString append.

Hey @ikorin24 ,

What an amazing job, you're working with unmanaged code. =)

A question: what would be the most suitable way to append or concatenate RawStrings? I haven't found any function that already does this.

Thanks,
Cheers.
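A minimal sketch of one way to concatenate, assuming the UTF-8 bytes of each RawString can be obtained as a ReadOnlySpan<byte> (an assumption, not confirmed in this thread). The result is copied into a new managed array, so it no longer points into the parser's unmanaged buffer. Utf8ConcatSketch is a hypothetical name.

```csharp
using System;

public static class Utf8ConcatSketch
{
    // Copies both UTF-8 byte sequences into one freshly allocated array.
    public static byte[] Concat(ReadOnlySpan<byte> first, ReadOnlySpan<byte> second)
    {
        var result = new byte[first.Length + second.Length];
        first.CopyTo(result);
        second.CopyTo(result.AsSpan(first.Length));
        return result;
    }
}
```

Because the output is an ordinary byte array, it stays valid after the parsed document is disposed.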

Ampersand is not allowed in Xml attributes

This is just another example of invalid XML that is accepted by this library:

<?xml version="1.0" encoding="UTF-8"?>
<SomeData>
	<Foo url="http://google.com?quer1=1&query2=2"></Foo>
</SomeData>
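For reference, a small sketch of how System.Xml.Linq treats the same input: the raw ampersand makes the document ill-formed and parsing throws an XmlException, while the escaped form &amp; is accepted. WellFormednessSketch is a hypothetical helper name.

```csharp
using System.Xml;
using System.Xml.Linq;

public static class WellFormednessSketch
{
    // Returns true when XDocument can parse the text, false when it throws XmlException.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            XDocument.Parse(xml);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```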

FormatException should contain detailed error message.

Your example just throws a FormatException.

The closing question mark of the XML declaration is missing, and all you get is a FormatException without any explanation. Issues like that are hard to find.

<?xml version="1.0" encoding="UTF-8"> 
<SomeData>
    <Data aa="20">bbb</Data>
    <Data aa="30">ccc</Data>
</SomeData>

The parser should be able to read the xml with comments at the end.

Describe the bug:

If there is a comment at the end of the xml, parsing will fail.

Environment:

library version: 1.6.0
.NET version: .NET6
OS: Windows10

Steps to Reproduce:

Execute the following code.

using var xml = XmlParser.Parse(
@"<foo></foo>
<!-- comment -->");

Expected behavior:

No errors.

Actual Behavior:

A FormatException is thrown.
Message:

"(line 2, char 1): Xml does not have multiple root nodes."

AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.

Describe the bug:

An intermittent bug occurs when trying to get an attribute from a node while enumerating an IEnumerable<XmlNode> obtained from XmlNodeDescendantList.

Our code for reference:

var type = _root.Descendants
    .FirstOrDefault(node => node.Name != "xs:attribute" && GetAttribute<string>(node, "name") == inheritedTypeName) 
        is { IsNull: false } type
            ? Visit(type) with { Name = name ?? inheritedTypeName }
            : null;


[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static T? GetAttribute<T>(XmlNode? node, string name)
{
    if (node is null || !node.Value.TryFindAttribute(name, out var attribute)) // fails here
        return default;
    var value = attribute.Value.ToString();
    if (string.IsNullOrWhiteSpace(value) || value is "unbounded")
        return default;
    return (T?)TypeDescriptor.GetConverter(typeof(T)).ConvertFromString(value);
}

Environment:

library version: 1.6.1
.NET version: .NET6 6.0.13
OS: Windows10

Steps to Reproduce:

Call node.TryFindAttribute(<string value>, out var attribute).

Expected behavior:

The attribute is returned.

Actual Behavior:

System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
  at U8Xml.XmlAttributeEnumerableExtension.FindOrDefault[[U8Xml.XmlAttributeList, U8XmlParser, Version=1.6.1.0, Culture=neutral, PublicKeyToken=null]](U8Xml.XmlAttributeList, System.ReadOnlySpan`1<Byte>)
  at U8Xml.XmlAttributeEnumerableExtension.FindOrDefault[[U8Xml.XmlAttributeList, U8XmlParser, Version=1.6.1.0, Culture=neutral, PublicKeyToken=null]](U8Xml.XmlAttributeList, System.ReadOnlySpan`1<Char>)
  at U8Xml.XmlAttributeEnumerableExtension.FindOrDefault[[U8Xml.XmlAttributeList, U8XmlParser, Version=1.6.1.0, Culture=neutral, PublicKeyToken=null]](U8Xml.XmlAttributeList, System.String)
  at U8Xml.XmlAttributeEnumerableExtension.TryFind[[U8Xml.XmlAttributeList, U8XmlParser, Version=1.6.1.0, Culture=neutral, PublicKeyToken=null]](U8Xml.XmlAttributeList, System.String, U8Xml.XmlAttribute ByRef)
  at U8Xml.XmlNode.TryFindAttribute(System.String, U8Xml.XmlAttribute ByRef)
  at CMA.Common.Xml.Validation.Xsd.XsdParser.GetAttribute[[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Nullable`1<U8Xml.XmlNode>, System.String)
  at CMA.Common.Xml.Validation.Xsd.XsdParser+<>c__DisplayClass12_0.<VisitAttribute>b__0(U8Xml.XmlNode)
  at System.Linq.Enumerable.TryGetFirst[[U8Xml.XmlNode, U8XmlParser, Version=1.6.1.0, Culture=neutral, PublicKeyToken=null]](System.Collections.Generic.IEnumerable`1<U8Xml.XmlNode>, System.Func`2<U8Xml.XmlNode,Boolean>, Boolean ByRef)
  at System.Linq.Enumerable.FirstOrDefault[[U8Xml.XmlNode, U8XmlParser, Version=1.6.1.0, Culture=neutral, PublicKeyToken=null]](System.Collections.Generic.IEnumerable`1<U8Xml.XmlNode>, System.Func`2<U8Xml.XmlNode,Boolean>)

XML parser accepts invalid entities

According to the XML specification, entities have to be declared before they are used. For example, this is not valid XML:

<?xml version="1.0" encoding="UTF-8"?>
<SomeData>
	<Data>&copy;</Data>
</SomeData>

You have to declare the entity first:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE SomeData[
	<!ENTITY copy "&#169;">
]>
<SomeData>
	<Data>&copy;</Data>
</SomeData>
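As a point of comparison, the sketch below shows the two cases with System.Xml.Linq: without a declaration the reference to copy is reported as an XmlException, and with the DOCTYPE declaration it is expanded to the copyright sign. EntitySketch and TryReadDataText are hypothetical names; the element name Data is taken from the examples above.

```csharp
using System.Xml;
using System.Xml.Linq;

public static class EntitySketch
{
    // Reads the text of the first Data element; returns false when parsing fails,
    // for example because an entity reference has no declaration.
    public static bool TryReadDataText(string xml, out string text)
    {
        text = string.Empty;
        try
        {
            var data = XDocument.Parse(xml).Root?.Element("Data");
            if (data == null)
                return false;
            text = data.Value;
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```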
