upndown

JavaScript HTML to Markdown converter, for Node.js and the browser.

About

upndown converts HTML documents to Markdown documents.

upndown is designed to offer fast, reliable and whitespace-perfect conversion of HTML documents.

Install / Usage

Browser

Standard loading

Download the zip archive from GitHub, unzip it, copy it into your web folder, and add this to your HTML:

<script type="text/javascript" src="/assets/upndown/lib/upndown.bundle.min.js"></script>
<script type="text/javascript">

    var und = new upndown();
    und.convert('<h1>Hello, World !</h1>', function(err, markdown) {
        if(err) { console.error(err); }
        else { console.log(markdown); } // Outputs: # Hello, World !
    });

</script>

Using RequireJS

Download the zip archive from GitHub, unzip it, copy it into your web folder, and add this to your HTML:

<script type="text/javascript" src="http://requirejs.org/docs/release/2.1.11/minified/require.js"></script>
<script type="text/javascript">

require.config({
    paths: {
        'upndown': '/assets/upndown/lib/upndown.bundle.min'
    }
});

require(['upndown'], function(upndown) {
    var und = new upndown();
    und.convert('<h1>Hello, World !</h1>', function(err, markdown) {
        if(err) { console.error(err); }
        else { console.log(markdown); } // Outputs: # Hello, World !
    });
});
</script>

Nodejs

Install

npm install upndown

Use

var upndown = require('upndown');

var und = new upndown();
und.convert('<h1>Hello, World !</h1>', function(err, markdown) {
    if(err) { console.error(err); }
    else { console.log(markdown); } // Outputs: # Hello, World !
});

Warning: With Node < 0.12.8, you'll have to require a polyfill for the Promise functionality (like https://www.npmjs.com/package/bluebird); see #10 on how to do that.
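
As a rough sketch of what such a polyfill setup could look like: the helper below installs a Promise implementation only when the runtime lacks one. The name ensurePromise and its two parameters are hypothetical and not part of upndown; the approach described in #10 may differ.

```javascript
// Hypothetical helper (not part of upndown): installs a Promise
// implementation on `scope` only if none is defined there yet.
// `loadPolyfill` is a function returning a Promise constructor.
function ensurePromise(scope, loadPolyfill) {
    if (typeof scope.Promise === 'undefined') {
        scope.Promise = loadPolyfill();
    }
    return scope.Promise;
}
```

In Node, this would be called before requiring upndown, for example ensurePromise(global, function() { return require('bluebird'); }), assuming bluebird has been installed with npm.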

Options

decodeEntities

By default, upndown decodes all HTML entities, so source HTML like this:

<p>I'm an escaped &lt;em&gt;code sample&lt;/em&gt;.</p>

Will become:

I'm an escaped <em>code sample</em>.

If your use case does not call for that behavior and you wish HTML entities to stay encoded, you can pass an option to the constructor:

var und = new upndown({decodeEntities: false})

Then just use as normal.

Test

In the browser

Navigate to test/browser/ inside the upndown folder. Browser tests are executed by QUnit.

Nodejs

To run the tests, simply execute:

npm test

Node.js tests are executed using Mocha.

Maintainer

upndown is produced by Net Gusto. Drop us a line at [email protected]

upndown's People

Contributors

bergie, burlesona, dtinth, netgusto, praveenscience

upndown's Issues

Structural tags are not stripped

Since tags like <div> and <span> have no particular use in a Markdown document, it would be better to just drop them in conversion.
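
Until something like this is supported, one possible workaround is to drop these tags from the HTML before handing it to upndown. The sketch below uses a hypothetical helper named stripStructuralTags, which is not part of upndown. It assumes attribute values never contain a literal > character; a regular expression is not a full HTML parser.

```javascript
// Hypothetical pre-processing step (not part of upndown): removes
// opening and closing <div> and <span> tags but keeps their content.
// Assumption: no attribute value contains a literal ">".
function stripStructuralTags(html) {
    return html.replace(/<\/?(?:div|span)\b[^>]*>/gi, '');
}
```

The result can then be passed to und.convert() as usual.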

Decoding html-encoded string values when it shouldn't

I have a case where I'm using TinyMCE to generate HTML, which is then converted to Markdown.

However, if the user writes HTML in the editor, TinyMCE correctly escapes the HTML entities, but upndown decodes them again (which it shouldn't).

Simple Example:

// as generated by TinyMCE (correctly)
var html = '<p>&lt;script&gt;alert("hello");&lt;/script&gt;</p>';

und.convert(html, function (err, markdown) {
    // markdown == '<script>alert("hello");</script>'

    // expected markdown == '&lt;script&gt;alert("hello");&lt;/script&gt;'
});

It looks like it is decoding html-encoded entities when it should not.

Does not work with older versions of node

It works fine under the latest version of node (4.2.2 here), but when testing on v0.10.36 (the version Amazon's AWS Lambda runs), it fails with Promise is not defined.
I'd suggest either requiring a minimum version of node, so that npm warns on install, and mentioning it in the README, or making it compatible with older versions of node.

I'd be happy to stick a warning in the README and create a pull request if you want me to.

Crashes with script tags

Hi,

I came by a nasty crash you might want to look into. Example below:

var Und = require('upndown');
var und = new Und();

var data = '<script></script><script></script>';

und.convert(data);

This yields

/node_modules/upndown/lib/upndown.js:127
              if (text.match(/^[\n\s\t]+$/)) {
                       ^
TypeError: Cannot call method 'match' of null
    at Object._text (/node_modules/upndown/lib/upndown.js:127:24)
    at /node_modules/upndown/lib/upndown.js:611:42
    at walkDOM (/node_modules/upndown/lib/upndown.js:591:13)
    at walkDOM (/node_modules/upndown/lib/upndown.js:598:15)
    at walkDOM (/node_modules/upndown/lib/upndown.js:598:15)
    at upndown.convert (/node_modules/upndown/lib/upndown.js:609:9)
    at upndown.convert (/node_modules/upndown/lib/upndown.js:4:63)
    at Object.<anonymous> (/crash.js:6:5)
    at Module._compile (module.js:456:26)
    at Object.Module._extensions..js (module.js:474:10)
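
For illustration only, a null-safe version of the whitespace check shown in the trace might look like the sketch below. The helper name isWhitespaceOnly is hypothetical, and this is not the library's actual fix; it assumes that a missing text value should simply be treated as not matching.

```javascript
// Hypothetical guard (not the library's code): returns true only for
// non-empty strings made up entirely of whitespace. null, undefined
// and other non-string values return false instead of throwing.
function isWhitespaceOnly(text) {
    if (typeof text !== 'string') { return false; }
    return /^\s+$/.test(text); // \s already covers \n and \t
}
```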

Provide unminified bundle

The "non-min" version of the bundle currently provided still has some minified blobs at the top. What are these, what license are they under, and could we get the unminified versions?

Thanks.

Support async/await by returning Promise

Callback-style APIs are really a thing of the past now that async/await is available in node > 7!
upndown.convert(html) should return a Promise when no callback is provided!
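
In the meantime, the callback API can be wrapped on the caller's side. This is a minimal sketch, assuming a hypothetical helper named convertAsync that is not part of upndown; it relies only on the convert(html, callback) signature shown in the README.

```javascript
// Hypothetical wrapper (not part of upndown): turns the callback-style
// convert() of an upndown instance into a Promise.
function convertAsync(und, html) {
    return new Promise(function(resolve, reject) {
        und.convert(html, function(err, markdown) {
            if(err) { reject(err); }
            else { resolve(markdown); }
        });
    });
}
```

Inside an async function this could then be used as var markdown = await convertAsync(new upndown(), html);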

[request] Smaller bundle for client-side conversion?

Thank you for this amazing package. I wonder if it would be possible to somehow make a smaller bundle for client-side conversion of HTML to Markdown?

I found this package does a much better job at conversion than To-markdown which weighs only ~9kb, while Upndown weighs ~129kb.

So, can someone suggest if there's a way to substantially decrease the bundle size besides gzipping (for client-side use)?

If property has a dot in it, parsing fails

var Und = require('upndown');
var und = new Und();

var data = '<bottle .label="">\'</bottle></amount>';

console.log(und.convert(data));

yields

undefined:121
                throw "Parse Error: " + html;
                                      ^
Parse Error: <bottle .label="">'</bottle></amount></div>

Maybe just skip properties like that?
