
Comments (8)

bb commented on May 20, 2024

@rbscott: I'm afraid that's yet another array mode you're looking for. As far as I understood, it's only for sibling elements having the same tag; at least that's my use case. You're right: by using a map/object as container, there's no order, so basically you can't use this library for mixed content markup but only for data structure (de-)serialization.
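To illustrate the ordering point, here is a minimal sketch. It does not call fast-xml-parser; the `children` list and the helpers `toMap` and `toOrderedList` are hypothetical names made up for this example. It compares a map-style container with an ordered list for three interleaved siblings.

```javascript
// Hypothetical child list in document order, e.g. for <b>1</b><c>2</c><b>3</b>.
const children = [
  ["b", "1"],
  ["c", "2"],
  ["b", "3"],
];

// Map-style container: same-named siblings are grouped under one key,
// so the fact that <c> sat between the two <b> elements is lost.
function toMap(pairs) {
  const out = {};
  for (const [tag, value] of pairs) {
    if (!Object.prototype.hasOwnProperty.call(out, tag)) {
      out[tag] = value;
    } else if (Array.isArray(out[tag])) {
      out[tag].push(value);
    } else {
      out[tag] = [out[tag], value];
    }
  }
  return out;
}

// Ordered container: one entry per child, document order preserved.
function toOrderedList(pairs) {
  return pairs.map(([tag, value]) => ({ [tag]: value }));
}

const asMap = toMap(children);
const asList = toOrderedList(children);

console.log(JSON.stringify(asMap));  // {"b":["1","3"],"c":"2"}
console.log(JSON.stringify(asList)); // [{"b":"1"},{"c":"2"},{"b":"3"}]
```

The map form is convenient for data structures, while only the list form could be turned back into the original sibling sequence.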

from fast-xml-parser.

bb commented on May 20, 2024

BTW, I didn't see arrayMode in the docs, only in the source code, e.g. https://github.com/NaturalIntelligence/fast-xml-parser/blob/master/src/x2j.js#L22

If I may add a feature request here: It would be great to have the arrayMode configurable, e.g. per tag name:

parse(xml, {forceArrayTags: ["w"]}) // expected: {"x":{"y":{"z":["z1","z2"]},"w":["w1"]}}

Instead of a tag list as config value, one might consider an arrayModeDecider like this:

parse(xml, {arrayMode: a => a.match(/^w$/) }) // expected: {"x":{"y":{"z":["z1","z2"]},"w":["w1"]}}

Personally, I currently only need tag names, so any of those approaches would work for me. But maybe there are other use cases which would require other callback parameters.
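Until something like this exists in the parser, the same effect can be approximated in a second pass over the parsed output. The sketch below is only an illustration: `forceArrays` is a hypothetical helper, not a fast-xml-parser function, and `parsed` is a hand-written stand-in for parser output in which the tag is assumed to be named `w`. A single predicate covers both the tag-list and the decider variant.

```javascript
// Hypothetical helper, not part of fast-xml-parser: walks an already parsed
// object and wraps single values of selected tags into arrays.
function forceArrays(node, shouldBeArray) {
  if (Array.isArray(node)) {
    return node.map((entry) => forceArrays(entry, shouldBeArray));
  }
  if (node === null || typeof node !== "object") {
    return node; // primitive values are returned unchanged
  }
  const out = {};
  for (const [tag, value] of Object.entries(node)) {
    const converted = forceArrays(value, shouldBeArray);
    out[tag] =
      shouldBeArray(tag) && !Array.isArray(converted) ? [converted] : converted;
  }
  return out;
}

// Stand-in for parser output without any array mode.
const parsed = { x: { y: { z: ["z1", "z2"] }, w: "w1" } };

// Tag-list variant and decider variant expressed through the same predicate.
const byList = forceArrays(parsed, (tag) => ["w"].includes(tag));
const byRegex = forceArrays(parsed, (tag) => /^w$/.test(tag));

console.log(JSON.stringify(byList)); // {"x":{"y":{"z":["z1","z2"]},"w":["w1"]}}
```

The obvious cost is the extra pass over the whole result, which is what a built-in option would avoid.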


amitguptagwl commented on May 20, 2024

Array mode was added by an earlier PR. But since it had some issues and the original requirement was not clear to me, I decided to remove it from the docs during the massive refactoring to support big XML files. That way, existing projects that use FXP will not face any issues, and new users will not start using it until it is properly implemented and tested.

So I would be happy if you raised a PR with a separate test file covering all the expectations.

But before that, we need to understand the need for array mode: why and in which scenarios it helps, along with the expected output for different types of input.

Thanks for raising this issue.


bb commented on May 20, 2024

My use case is quite simple: I have elements which sometimes have one child of a name, sometimes multiple. Here's a very simple example:

<root>
    <item id="a">
        <sub>1</sub>
        <sub>2</sub>
    </item>
    <item id="b">
        <sub>3</sub>
    </item>
</root>

{
	"root": {
		"item": [{
			"sub": [1, 2]
		}, {
			"sub": 3
		}]
	}
}

I'd like to be able to always treat the value of "sub" as an array, because in most of my cases it is one. But if there's only one value, it is not. Sometimes I forget to check the single-child case and run into bugs.

Without the option arrayMode, I need to check, like this:

obj.root.item.map(item => !Array.isArray(item.sub) ? [item.sub] : item.sub)

instead of this version, which I'd use with the arrayMode option:

obj.root.item.map(item => item.sub)

I usually do not want all elements to be arrays, so instead of a global explicit boolean, I'd prefer a version which is based on tag names or xpath, or a shouldEnforceArraysForSingleItemsProcessor as given in my last example.
In xml2js, I handled those cases not with its explicitArray option but with a validator:

explicitArray: false,
validator: (xpath: string, currentValue: string, newValue: string): string | string[] => {
    if (!currentValue && xpath.split("/").pop() === "sub") {
        return [newValue];
    }
    return newValue;
}

It's (minimally) documented here: https://github.com/Leonidas-from-XIV/node-xml2js/blob/master/test/parser.test.coffee#L33
And called from here: https://github.com/Leonidas-from-XIV/node-xml2js/blob/master/src/parser.coffee#L150

For the above XML example, the validator calls that would actually be executed look like this:

validator("/root/item/sub", undefined, "1");
validator("/root/item/sub", "1", "2");
validator("/root/item", undefined, {"$":{"id":"a"},"sub":["1","2"]});
validator("/root/item/sub", undefined, "3");
validator("/root/item", {"$":{"id":"a"},"sub":["1","2"]}, {"$":{"id":"b"},"sub":"3"});
validator("/root", undefined, {"item":[{"$":{"id":"a"},"sub":["1","2"]},{"$":{"id":"b"},"sub":"3"}]});

I used this validator to get the above output:

validator: (xpath, currentValue, newValue) => {
    console.log(`validator("${xpath}", ${JSON.stringify(currentValue)}, ${JSON.stringify(newValue)});`);
    return newValue;
}

So, instead of providing explicitArray support directly, I'd rather implement a validator, which would cover all those use cases. One might even provide a few default validators, like one for explicitArrays etc.

By the way, I don't like the name validator that much. I think I'd rather call it something like elementPostProcessor. I could even imagine having multiple of those chained together for different aspects.
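A rough sketch of what such chaining could look like follows. Everything here is hypothetical: `chainPostProcessors`, `explicitArrayFor` and `toNumber` are names invented for this example, and the only assumption taken from above is the `(xpath, currentValue, newValue)` signature of the validator callback.

```javascript
// Hypothetical: compose several post-processors that share the
// (xpath, currentValue, newValue) signature. Each one receives the value
// returned by the previous one.
function chainPostProcessors(...processors) {
  return (xpath, currentValue, newValue) =>
    processors.reduce((value, fn) => fn(xpath, currentValue, value), newValue);
}

// Possible "default" processor: wrap the first value seen for a tag name.
const explicitArrayFor = (tagName) => (xpath, currentValue, newValue) =>
  !currentValue && xpath.split("/").pop() === tagName ? [newValue] : newValue;

// Another aspect: turn numeric strings into numbers, leave the rest alone.
const toNumber = (xpath, currentValue, newValue) =>
  typeof newValue === "string" &&
  newValue.trim() !== "" &&
  !Number.isNaN(Number(newValue))
    ? Number(newValue)
    : newValue;

const elementPostProcessor = chainPostProcessors(toNumber, explicitArrayFor("sub"));

const first = elementPostProcessor("/root/item/sub", undefined, "3");
const other = elementPostProcessor("/root/item", undefined, "x");

console.log(JSON.stringify(first)); // [3]
console.log(JSON.stringify(other)); // "x"
```

Keeping each aspect in its own small function means the defaults could be shipped separately and combined per project.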

One downside I see when using this kind of validator is the extra time/memory spent creating the xpath. In my cases it's worth it because I can modify the output structure directly and don't need to do another pass.

I could even imagine more complex usages like creating real classes directly, etc. - but, honestly, this might be a lot of work, both conceptually and in the actual implementation. I'll see when I have time for this and provide a PR. Feel free to close this issue if you don't hear from me in a while ;)


amitguptagwl commented on May 20, 2024

Sorry for the late response. I was away from my machine for a few days.

I appreciate that you always include the downsides of the proposed changes.

I agree on

  • having tags/values as arrays is a better option than adding extra conditions or iterating over the whole JSON again.
  • transforming the values of specific tags instead of all of them.
  • elementPostProcessor is a better name than validator. And since there is already a validator, it may confuse users.

But

The main goal of FXP is to transform XML into Nimn or JSON as fast as possible and to be compatible with as many browsers as possible. If a feature drastically impacts the speed of this library but is used by only a few users, I would rather not implement it.

Moreover, can we think of any other post-processing feature in addition to array mode for selective tags/values? If not, we don't need the generic name.

Now for the implementation, I believe we can simply check whether there is any child node with length === 1 (which is already present) and put it into an array, either for all or for specific JSON/XML paths. I hope this would not impact performance very much.
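As a rough illustration of that idea, the decision could be made at the moment a child is attached to its parent. This is a simplified sketch and not the actual FXP internals: `addChild` and `isArrayTag` are hypothetical names, and real nodes carry more state than a plain object.

```javascript
// Hypothetical sketch, not the actual fast-xml-parser internals: decide at
// insertion time whether a child value is stored as an array.
function addChild(parent, tag, value, isArrayTag) {
  if (Object.prototype.hasOwnProperty.call(parent, tag)) {
    // Second or later sibling with the same tag: ensure an array, then append.
    if (!Array.isArray(parent[tag])) {
      parent[tag] = [parent[tag]];
    }
    parent[tag].push(value);
  } else {
    // First sibling: wrap it right away only for the configured tags.
    parent[tag] = isArrayTag(tag) ? [value] : value;
  }
  return parent;
}

// Assumed configuration: only <sub> is forced to be an array.
const isArrayTag = (tag) => tag === "sub";

const itemA = {};
addChild(itemA, "sub", 1, isArrayTag);
addChild(itemA, "sub", 2, isArrayTag);

const itemB = {};
addChild(itemB, "sub", 3, isArrayTag);
addChild(itemB, "name", "b", isArrayTag);

console.log(JSON.stringify(itemA)); // {"sub":[1,2]}
console.log(JSON.stringify(itemB)); // {"sub":[3],"name":"b"}
```

The extra work per child is one predicate call on the first occurrence of a tag, so the cost depends mostly on how expensive the predicate is.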


rbscott commented on May 20, 2024

I just started looking at this library as a possibility for parsing XML, and I have a lot of cases where order is important. I think something like arrayMode is what we are looking for, as it seems there is no way to maintain order if the child elements are converted into dictionaries.


amitguptagwl commented on May 20, 2024

Array mode (provided that it does not impact performance) would be an optional setting, documented with a warning about how deserialization or the sequence of tags can be affected.

However, as insertion order is always maintained in an Array, I can't understand why we couldn't maintain the order of child elements.


amitguptagwl commented on May 20, 2024

Closing the issue due to no activity. Feel free to reopen it or to continue the discussion.

