domJSON

Convert DOM trees into compact JSON objects, and vice versa, as fast as possible.

Description

The purpose of domJSON is to create very accurate representations of the DOM as JSON, and to do it very quickly. While there are probably dozens of viable use cases for this project, I've made two quick demos to showcase the library's versatility. The first simply makes a copy of a given branch of the DOM tree, which could be useful for end user bug logging, state tracking, etc. The second demo does a batch update of a large number of DOM Nodes, but much more performantly than the "traditional" jQuery select > .each() > update pattern.

Broadly speaking, the goals of this project are:

  • Provide as accurate a copy of a given node's DOM properties as possible, but allow option filtering to remove useless information
  • Be able to rebuild JSON nodes into DOM nodes as performantly as possible
  • Speed, speed, speed
  • No frivolous data: produce JSON objects that are as compact as possible, removing all information not relevant to the developer
  • Keep the library lightweight, with no dependencies

domJSON works in the following browsers (mobile and desktop versions supported):

  • Chrome 39+
  • Firefox 24+
  • Safari 7+
  • IE 9+

Installation

Installing domJSON is easy. You can pull it from Bower...

bower install domjson

...or grab it from NPM and manually include it as a script tag...

npm install domjson --save

or just download this repo manually and include the file as a dependency.

<script src="./lib/domJSON.js"></script>

Demos

Coming soon...

Usage

Using domJSON is super simple: use the .toJSON() method to create a JSON representation of the DOM tree:

var someDOMElement = document.getElementById('sampleId');
var jsonOutput = domJSON.toJSON(someDOMElement);

And then rebuild the DOM Node from that JSON using .toDOM():

var DOMDocumentFragment = domJSON.toDOM(jsonOutput);
someDOMElement.parentNode.replaceChild(DOMDocumentFragment, someDOMElement);
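
The round trip above can be wrapped in a small helper. This is a minimal sketch, not part of domJSON: `roundTrip` is a hypothetical name, and the library object is passed in as a parameter so the helper makes no assumption about a global. The one detail worth noting is that `Node.replaceChild()` takes the new node first and the node being replaced second.

```javascript
// Hypothetical helper (not part of domJSON): serialize an element,
// rebuild it, and swap the rebuilt copy into the document.
// `lib` is expected to expose toJSON() and toDOM() as described above.
function roundTrip(lib, element) {
  var json = lib.toJSON(element);
  var fragment = lib.toDOM(json);
  // replaceChild(newChild, oldChild): the new node comes first.
  element.parentNode.replaceChild(fragment, element);
  return json;
}

// In a browser: roundTrip(domJSON, document.getElementById('sampleId'));
```

Returning the JSON keeps the serialized snapshot available to the caller, for example for logging or state tracking.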

When creating the JSON object, there are many precise options available, ensuring that developers can produce very specific and compact outputs. For example, the following will produce a JSON copy of someDOMElement's DOM tree that is only two levels deep, contains no "offset*," "client*," or "scroll*" type DOM properties, only keeps the "id" attribute on each DOM Node, and outputs a string (rather than a JSON-friendly object):

var jsonOutput = domJSON.toJSON(someDOMElement, {
	attributes: ['id'],
	domProperties: {
		exclude: true,
		values: ['clientHeight', 'clientLeft', 'clientTop', 'clientWidth', 'offsetHeight', 'offsetLeft', 'offsetTop', 'offsetWidth', 'scrollHeight', 'scrollLeft', 'scrollTop', 'scrollWidth']
	},
	deep: 2,
	stringify: true
});

FilterLists

A FilterList is a custom type for certain options passed to domJSON's .toJSON() method. It allows for very granular control over which fields are included in the final JSON output, allowing developers to eliminate useless information from the result, and produce extremely compact JSON objects. It operates based on boolean logic: an inclusive FilterList allows the developer to explicitly specify which fields they would like to see in the output, while an exclusive FilterList allows them to specify which fields they would like to omit (e.g., "Copy every available field except X, Y, and Z"). The FilterList accepts an object as an argument, or a shorthand array (which is the recommended style, since it's much less verbose).

Take this example: suppose we have a single div that we would like to convert into a JSON object using domJSON. It looks like this in its native HTML:

<div id="myDiv" class="testClass" style="margin-top: 10px;" data-foo="bar" data-quux="baz">
	This is some sample text.
</div>

Let's try to make a JSON object out of this div, but only include the class and style attributes, and only the offsetTop and offsetLeft DOM properties. Both of the following inputs will have the same output:

var myDiv = document.getElementById('myDiv');

//Using object notation for our filterLists
var jsonOutput = domJSON.toJSON(myDiv, {
	attributes: {
		values: ['class', 'style']
	},
	domProperties: {
		values: ['offsetLeft', 'offsetTop']
	},
	metadata: false
});

//Same thing, using the array notation
var jsonOutput = domJSON.toJSON(myDiv, {
	attributes: [false, 'class', 'style'],
	domProperties: [false, 'offsetLeft', 'offsetTop'],
	metadata: false
});

The result:

{
	"attributes": {
		"class": "testClass",
		"style": "margin-top: 10px;"
	},
	"childNodes": [{
		"nodeType": 3,
		"nodeValue": "This is some sample text"
	}],
	"nodeType": 1,
	"nodeValue": "This is some sample text",
	"offsetLeft": 123,
	"offsetTop": 456,
	"tagName": "DIV"
}

The array notation is much shorter, and very easy to parse: the first value is a boolean that determines whether the list is exclusive or not, and the remaining elements are just the filter values themselves. Here is an example for making an exclusive list of attributes on the same div as above:

var myDiv = document.getElementById('myDiv');

//Using object notation for our filterLists, but this time specify values to EXCLUDE
var jsonOutput = domJSON.toJSON(myDiv, {
	attributes: {
		exclude: true,
		values: ['class', 'style']
	},
	domProperties: {
		values: ['offsetLeft', 'offsetTop']
	},
	metadata: false
});

//Same thing, using the array notation
var jsonOutput = domJSON.toJSON(myDiv, {
	attributes: [true, 'class', 'style'],
	domProperties: [false, 'offsetLeft', 'offsetTop'],
	metadata: false
});

The result:

{
	"attributes": {
		"id": "myDiv",
		"data-foo": "bar",
		"data-quux": "baz"
	},
	"childNodes": [{
		"nodeType": 3,
		"nodeValue": "This is some sample text"
	}],
	"nodeType": 1,
	"nodeValue": "This is some sample text",
	"offsetLeft": 123,
	"offsetTop": 456,
	"tagName": "DIV"
}
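
To make the equivalence between the two notations concrete, here is a minimal sketch of how a FilterList could be normalized and applied. It is illustrative only and is not domJSON's internal code: `normalizeFilterList` and `applyFilterList` are hypothetical names, and the sketch assumes that an array without a leading boolean is treated as inclusive.

```javascript
// Hypothetical helper (not domJSON's implementation): turn either
// notation into the object form { exclude: boolean, values: string[] }.
function normalizeFilterList(filter) {
  if (Array.isArray(filter)) {
    var hasFlag = typeof filter[0] === 'boolean';
    return {
      exclude: hasFlag ? filter[0] : false, // assumption: no flag means inclusive
      values: hasFlag ? filter.slice(1) : filter.slice()
    };
  }
  return {
    exclude: Boolean(filter.exclude),
    values: (filter.values || []).slice()
  };
}

// Hypothetical helper: copy only the fields that pass the filter.
function applyFilterList(fields, filter) {
  var list = normalizeFilterList(filter);
  var out = {};
  Object.keys(fields).forEach(function (key) {
    var listed = list.values.indexOf(key) !== -1;
    // Inclusive lists keep listed keys; exclusive lists keep unlisted keys.
    if (listed !== list.exclude) {
      out[key] = fields[key];
    }
  });
  return out;
}

var attrs = {
  id: 'myDiv',
  'class': 'testClass',
  style: 'margin-top: 10px;',
  'data-foo': 'bar',
  'data-quux': 'baz'
};
var included = applyFilterList(attrs, [false, 'class', 'style']);
// included: { class: 'testClass', style: 'margin-top: 10px;' }
var excluded = applyFilterList(attrs, { exclude: true, values: ['class', 'style'] });
// excluded: { id: 'myDiv', 'data-foo': 'bar', 'data-quux': 'baz' }
```

The two results mirror the attribute sets shown in the inclusive and exclusive examples above.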

Objects

domJSON : object

domJSON is a global variable to store two methods: .toJSON() to convert a DOM Node into a JSON object, and .toDOM() to turn that JSON object back into a DOM Node

Typedefs

FilterList : Object | Array

An object specifying a list of fields and how to filter it, or an array with the first value being an optional boolean to convey the same information

domJSON : object

domJSON is a global variable to store two methods: .toJSON() to convert a DOM Node into a JSON object, and .toDOM() to turn that JSON object back into a DOM Node

Kind: global namespace


domJSON.toJSON(node, [opts]) ⇒ Object | string

Take a DOM node and convert it to a simple object literal (or JSON string) with no circular references and no functions or events

Kind: static method of domJSON
Returns: Object | string - A JSON-friendly object, or JSON string, of the DOM node -> JSON conversion output
Todo

  • {boolean|FilterList} [opts.parse=false] a FilterList of properties that are DOM nodes, but will still be copied PLANNED

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| node | Node | | The actual DOM Node which will be the starting point for parsing the DOM Tree |
| [opts] | Object | | A list of all method options |
| [opts.allowDangerousElements] | boolean | `false` | Use true to parse the potentially dangerous elements <link> and <script> |
| [opts.absolutePaths] | boolean \| FilterList | `'action', 'data', 'href', 'src'` | Only relevant if opts.attributes is not false; use true to convert all relative paths found in attribute values to absolute paths, or specify a FilterList of keys to boolean search |
| [opts.attributes] | boolean \| FilterList | `true` | Use true to copy all attribute key-value pairs, or specify a FilterList of keys to boolean search |
| [opts.computedStyle] | boolean \| FilterList | `false` | Use true to parse the results of "window.getComputedStyle()" on every node (specify a FilterList of CSS properties to be included via boolean search); this operation is VERY costly performance-wise! |
| [opts.cull] | boolean | `false` | Use true to ignore empty element properties |
| [opts.deep] | boolean \| number | `true` | Use true to iterate and copy all childNodes, or an INTEGER indicating how many levels down the DOM tree to iterate |
| [opts.domProperties] | boolean \| FilterList | `true` | 'false' means only 'tagName', 'nodeType', and 'nodeValue' properties will be copied, while a FilterList can specify DOM properties to include or exclude in the output (except for ones which serialize the DOM Node, which are handled separately by opts.serialProperties) |
| [opts.htmlOnly] | boolean | `false` | Use true to only iterate through childNodes where nodeType = 1 (aka, instances of HTMLElement); irrelevant if opts.deep is true |
| [opts.metadata] | boolean | `false` | Output a special object of the domJSON class, which includes metadata about this operation |
| [opts.serialProperties] | boolean \| FilterList | `true` | Use true to ignore the properties that store a serialized version of this DOM Node (ex: outerHTML, innerText, etc), or specify a FilterList of serial properties (no boolean search!) |
| [opts.stringify] | boolean | `false` | Output a JSON string, or just a JSON-ready javascript object? |


domJSON.toDOM(obj, [opts]) ⇒ DocumentFragment

Take the JSON-friendly object created by the .toJSON() method and rebuild it back into a DOM Node

Kind: static method of domJSON
Returns: DocumentFragment - A DocumentFragment (nodeType 11) containing the result of unpacking the input obj

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| obj | Object | | A JSON friendly object, or even JSON string, of some DOM Node |
| [opts] | Object | | A list of all method options |
| [opts.allowDangerousElements] | boolean | `false` | Use true to include the potentially dangerous elements <link> and <script> |
| [opts.noMeta] | boolean | `false` | true means that this object is not wrapped in metadata, which makes it somewhat more difficult to rebuild properly... |

FilterList : Object | Array

An object specifying a list of fields and how to filter it, or an array with the first value being an optional boolean to convey the same information

Kind: global typedef
Properties

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| exclude | boolean | false | If this is set to true, the filter property will specify which fields to exclude from the result (boolean difference), not which ones to include (boolean intersection) |
| values | Array.<string> | | An array of strings which specify the fields to include/exclude from some broader list |

Performance

A major goal of this library is performance. That being said, there is one way to significantly slow it down: setting opts.computedStyle to true. This forces the browser to run window.getComputedStyle() on every node in the DOM Tree, which is really, really slow, since it requires a redraw each time! Obviously, there are situations where you need the computed style and this performance hit is unavoidable, but otherwise, keep opts.computedStyle set to false. Besides that, I'm working on writing some benchmark tests to give developers an idea of how each option affects the speed of domJSON, but this will take some time!

Generally speaking, avoid using FilterList type options for best performance. The optimal settings in terms of speed, without sacrificing any information, are as follows:

var jsonOutput = domJSON.toJSON(someNode, {
	absolutePaths: false,
	attributes: false,
});

A downside of the above setup is that you may well end up with a very bloated output containing a lot of frivolous fields. As of now, this is the cost of getting a quick turnaround - in the future, domJSON will be optimized to work more quickly for more precise setups.
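
Until official benchmarks exist, the effect of an option can be estimated by hand. The helper below is a rough sketch and not part of domJSON: `averageMs` is a hypothetical name, and it uses `Date.now()`, so only differences of several milliseconds are meaningful.

```javascript
// Hypothetical benchmark helper (not part of domJSON): call `fn` a number
// of times and return the average duration of one call in milliseconds.
function averageMs(fn, runs) {
  var start = Date.now();
  for (var i = 0; i < runs; i++) {
    fn();
  }
  return (Date.now() - start) / runs;
}

// In a browser, two configurations could be compared like this:
// var fast = averageMs(function () {
//   domJSON.toJSON(someNode, { absolutePaths: false, attributes: false });
// }, 100);
// var slow = averageMs(function () {
//   domJSON.toJSON(someNode, { computedStyle: true });
// }, 100);
```

Running each configuration many times and averaging smooths out one-off pauses such as garbage collection.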

Tests

You can give the test suite for domJSON a quick run through in the browser of your choice here. You can also view results from local Chrome tests, or the entire browser compatibility suite. Please note that Rawgit caches the tests after you run them the first time, so if something seems off, clear your cache!

Contributing

Feel free to pull and contribute! If you do, please create a separate branch for your pull request, rather than pushing your changes to master. It would also be greatly appreciated if you ran the appropriate tests before submitting the request (there are three sets, listed below).

For unit testing the Chrome browser, which is the most basic target for functionality, type the following in the CLI:

gulp unit-chrome

To record the code coverage after your changes, use:

gulp coverage

And, if you have them all installed and are feeling so kind, you can also do the entire browser compatibility suite (Chrome, Canary, Firefox ESR, Firefox Developer Edition, IE 11, IE 10):

gulp unit-browsers

If you make changes that you feel need to be documented in the readme, please update the relevant files in the /docs directory, then run:

gulp docs

License

The MIT License (MIT)

Copyright (c) 2014 Alex Zaslavsky

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

domjson's Issues

has some errors on NodeJS environment

I tried to use it in a NodeJS environment, together with 'puppeteer'.
Here is my demo:
const puppeteer = require('puppeteer');
const domJSON = require('domjson');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('http://lishi.tianqi.com/beijing/201612.html');
  const ulsObjArr = await page.evaluate(() => {
    const uls = document.querySelectorAll('div.tqtongji2 ul');
    return Array.prototype.map.call(uls, function (ul) {
      return domJSON.toJSON(ul);
    });
  });
  console.log('====================================');
  console.log(ulsObjArr);
  console.log('====================================');
  await browser.close();
})();
And it throws an error:
win.location.href is not defined

Did I do something wrong? How can I solve it?

[question] How to pass pure html in toJSON() method?

I am working on a code editor and have the following question:

domJSON accepts an HTML element as the argument for domJSON.toJSON(element).
Can I pass a plain HTML string instead, e.g. domJSON.toJSON('<p>some code</p>')? I tried that and it logs the error "length is undefined", which may be because it expects an HTML element, not a string.
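
One possible approach, sketched under assumptions rather than taken from the library: let the browser parse the string into an element first, then hand that element to .toJSON(). `elementFromHTML` is a hypothetical helper name, and the document is passed in as a parameter. Markup that is only valid inside a specific parent (table rows, for example) may not parse as expected inside a div.

```javascript
// Hypothetical helper (not part of domJSON): turn an HTML string into an
// element by letting the browser parse it inside a detached container.
// `doc` is the document used to create the container.
function elementFromHTML(doc, html) {
  var container = doc.createElement('div');
  container.innerHTML = html;
  // null if the string does not contain any element
  return container.firstElementChild;
}

// In a browser:
// var jsonOutput = domJSON.toJSON(elementFromHTML(document, '<p>some code</p>'));
```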

Help guide has the wrong implementation of Node.replaceChild()

This took me a second to realize while using the plugin.

https://developer.mozilla.org/en-US/docs/Web/API/Node/replaceChild

Instead of

var DOMDocumentFragment = domJSON.toDOM(jsonOutput);
someDOMElement.parentNode.replaceChild(someDOMElement, DOMDocumentFragment);

it should be

var DOMDocumentFragment = domJSON.toDOM(jsonOutput);
someDOMElement.parentNode.replaceChild(DOMDocumentFragment, someDOMElement);

at least in Chrome/60.0.3112.113 in the console

better representation of json output

would be nice to have an option which returns only a minimal set of the dom node, such as

https://github.com/azaslavsky/domJSON

or even better:

For example, JsonML outputs dom to json like this:


var jsonMl =    ["table",{"class":"MyTable","style":"background-color:yellow"},
        ["tbody",
            ["tr",
                ["td",{"class":"MyTD","style":"border:1px solid black"},
                "#550758"],
                ["td",{"class":"MyTD","style":"background-color:red"},
                "Example text here"]
            ],
            ["tr", 
                ["td",{"class":"MyTD","style":"border:1px solid black"},
                "#993101"],
                ["td",{"class":"MyTD","style":"background-color:green"},
                "127624015"]
            ],
            ["tr", 
                ["td",{"class":"MyTD","style":"border:1px solid black"},
                "#E33D87"],
                ["td",{"class":"MyTD","style":"background-color:blue"},
                "\u00A0",
                    ["span",{"id":"mySpan","style":
                    "background-color:maroon;color:#fff !important"},"\u00A9"],
                    "\u00A0"
                ]
            ]
        ]
    ];

Your compiler returns the nodes within a group such as "childNodes" and the like, which is noisy.
Please provide an option to turn the DOM node into what JsonML produces, but with performance in mind :)
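
As a rough illustration of what such an option might do, the sketch below reshapes a domJSON-style object (with the tagName, attributes, childNodes, nodeType and nodeValue fields shown in the README output) into a JsonML-style array. `toJsonML` is a hypothetical function, not an existing domJSON feature; it assumes output produced without the metadata wrapper, and lowercasing tag names is a choice made for this example.

```javascript
// Hypothetical converter (not part of domJSON): reshape a domJSON-style
// object into a JsonML-style array: [tagName, {attributes}, ...children].
function toJsonML(node) {
  if (node.nodeType === 3) {
    return node.nodeValue; // text nodes become plain strings
  }
  var result = [String(node.tagName).toLowerCase()];
  var attrs = node.attributes || {};
  if (Object.keys(attrs).length > 0) {
    result.push(attrs);
  }
  (node.childNodes || []).forEach(function (child) {
    // Only element (1) and text (3) nodes are carried over in this sketch.
    if (child.nodeType === 1 || child.nodeType === 3) {
      result.push(toJsonML(child));
    }
  });
  return result;
}

var sample = {
  tagName: 'DIV',
  nodeType: 1,
  attributes: { 'class': 'testClass' },
  childNodes: [{ nodeType: 3, nodeValue: 'This is some sample text' }]
};
var jsonMl = toJsonML(sample);
// jsonMl: ['div', { class: 'testClass' }, 'This is some sample text']
```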

Getting the complete DOM

Hi,
I was wondering how I can get the full DOM for a page. I tried putting in the html element or the body element but it crashed on them both.
Thanks

I do not get instructions

Under usage there is:
var someDOMElement = document.getElementById('sampleId');
var jsonOutput = domJSON.toJSON(myDiv);

First we select by ID, then there is some myDiv... I am confused. Can you be more specific or descriptive? Beginner here.

Figure out a better API for absolute paths

Absolute paths are a little hard to do right now. The new API will probably split absolute paths in styles and attributes into two separate option properties. Absolutes for attributes will be handled with a FieldList type input, which will specify the attributes to perform the path check on.

Benchmark tests

So that developers have a solid idea of how each option affects performance.

"The input element's type ('checkbox') does not support selection."

Hi! Great library.

Not sure if this is a bug or a browser issue, but I think you might be interested - I seem to be having issues JSONifying a few types of input elements. Here is a JSFiddle demonstrating what can happen.

I'm using Chrome 43.0.2357.124 (64-bit).

It's as if checkbox (and radio, file, etc) will raise an exception if you so much as access selectionStart like so:

'selectionStart' in node // true
node['selectionStart'] // DOMException

... which is more or less what I think is happening here.

Missing computedStyle in `toDOM`

When using .toJSON with computedStyle: true computed styles are copied to the [node].style object. However, when using .toDOM only the [node].attributes.styles text is added to the element.

Usage w/ Jsdom: TypeError: Cannot read property 'href' of undefined

e.g.:

const dom = await JSDOM.fromURL('http://...')
const parsed = toJSON(dom.window.document) // domjson.toJSON()

Error:

TypeError: Cannot read property 'href' of undefined
    at D:\git\food-search\node_modules\domjson\dist\domJSON.js:19:28
    at domJSON (D:\path\to\my\project\node_modules\domjson\dist\domJSON.js:7:23)
    at Object.<anonymous> (D:\path\to\my\project\node_modules\domjson\dist\domJSON.js:15:3)
    at Module._compile (module.js:641:30)
    at Module._extensions..js (module.js:652:10)
    at Object.require.extensions.(anonymous function) [as .js] 

Question about banned/disallowed tags

Great tool! However, I was wondering:

domJSON defines the tags <link> and <script> as 'disallowed', and states the following:

A list of disallowed HTMLElement tags - there is no flexibility here, these cannot be processed by domJSON for security reasons!

What's the reason behind this? What would be the security implications of just serializing them as well?

Browser compatibility

Currently only works well in Chrome and Canary. Lots of failed tests on Firefox (!!), and god only knows about IE10, let alone older versions.

Standardize "FilterList" type properties

Currently, properties that accept a list of fields can accept values of true for all fields, false for none, an array listing which fields to save (aka boolean intersect; ex: ['a', 'b', 'c',...]), or an array led by a boolean false that specifies which fields NOT to save (aka boolean difference; ex: [false, 'a', 'b', 'c',...]).

I would like these fields to accept objects in the following format:

{
  intersect: true, //defaults to true
  fields: ['a', 'b', 'c'] //if empty, defaults to false
}

Add factory

Make the library compatible with AMD, CommonJS/Browserify, and standalone usage by adding a factory.

Speed optimizations

  • Speed up how domProperties are parsed when using an inclusive filterList (or boolean false)
  • Maybe rewrite the createNode function to concatenate children, then append them using innerHTML?

Write documentation

Some stuff to include in the docs:

  • Has two purposes: saving exact state, and doing very big things with WebWorkers
  • Supported vs Unsupported DOM node types
  • Goals:
    • Provide as accurate a copy of a given node's DOM properties as possible, but optionally filter only the relevant ones
    • Lightweight, no dependencies
    • As performant as possible
    • No frivolous data: produce JSON objects that are as compact as possible to meet the user's needs
  • What you lose:
    • Circular references are out - only keep the children
    • Event bindings
    • Functions
  • It does explicitly copy the style!
  • fieldList datatype, including shorthand

Issues with UMD, automatically generate UMD

You have a typo in the UMD script here (dmoJSON should be domJSON): https://github.com/azaslavsky/domJSON/blob/master/src/domJSON.js#L24

Also, in CJS this is not window, so the script fails here (win.location is undefined): https://github.com/azaslavsky/domJSON/blob/master/src/domJSON.js#L55

I can easily submit a PR for these, but I think it might be better if you wrote the code as CJS and then ran it through browserify to get UMD. If you are interested I can submit a PR.
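
For the second point, a defensive read of the location might look like the sketch below. This is a hypothetical guard, not the code in the repository: `getBaseHref` is an invented name, and returning null when no window is available is a choice made for this example.

```javascript
// Hypothetical guard (not the library's actual code): read the page URL
// only when a window-like object with a location is available.
function getBaseHref(win) {
  if (win && win.location && typeof win.location.href === 'string') {
    return win.location.href;
  }
  return null; // e.g. under Node/CommonJS, where there is no window
}

// Typical call site:
// var href = getBaseHref(typeof window !== 'undefined' ? window : undefined);
```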

IE 11 InvalidStateError during copyJSON

Hi, in IE 11 within the copyJSON function IE stops partway into the for ... in and reports InvalidStateError. Specifically when trying to access node[n] here:

if (opts.cull) {
    if (node[n] || node[n] === 0 || node[n] === false) {
        copy[n] = node[n];
    }
} else {
    copy[n] = node[n];
}
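
One way to keep the loop going when a property read throws is to wrap the access in try/catch. The function below is a sketch of that idea, not the library's actual fix: `copyReadableProps` is a hypothetical name, and silently skipping unreadable properties is an assumption about the desired behavior.

```javascript
// Hypothetical variant of the copy loop (not the library's actual fix):
// skip any property whose getter throws instead of aborting the copy.
function copyReadableProps(node, cull) {
  var copy = {};
  for (var n in node) {
    var value;
    try {
      value = node[n];
    } catch (err) {
      continue; // e.g. InvalidStateError or DOMException on some inputs
    }
    // Same culling rule as the snippet above: keep 0 and false, drop other empties.
    if (!cull || value || value === 0 || value === false) {
      copy[n] = value;
    }
  }
  return copy;
}
```

Reading each property once into a local variable also avoids triggering the same getter several times during the cull check.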

Cannot import this inside jest tests

I got this error:

>npx jest
 FAIL  src/reviewer.test.ts (5.44s)
  × renders test site (23ms)

  ● renders test site

    TypeError: Cannot read property 'href' of undefined

       7 |
    >  8 |     const domJSON = require('domjson');
         |                     ^

      at node_modules/domjson/dist/domJSON.js:19:28
      at node_modules/domjson/dist/domJSON.js:7:23
      at Object.<anonymous> (node_modules/domjson/dist/domJSON.js:15:3)
      at Object.<anonymous> (src/reviewer.test.ts:8:21)

If I import require('jsdom-global')() as suggested on #26 (Usage w/ Jsdom: TypeError: Cannot read property 'href' of undefined), jest breaks on my tear down with this:

 FAIL  src/reviewer.test.ts
  ● Test suite failed to run

    TypeError: Illegal invocation

      at removeEventListener (node_modules/jsdom/lib/jsdom/living/generated/EventTarget.js:131:15)

Create demos

Two demos: one to simply show the JSON output, and another to demonstrate batch updates via Web Workers
