markdownpapers's People

Contributors

lruiz

markdownpapers's Issues

Emphasis content lost in generated HTML

When I use * or _ to emphasize some text, the content between the markers is lost.

See the second line of the doxia-module page: the doxia-module.md file contains the following Markdown:

Write all your docs in **src/site/markdown/** with **md** as file extension, ...

The generated web page (http://markdown.tautua.org/doxia-module.html) shows:

Write all your docs in with as file extension, ...

A newline inside a bracket breaks the parser

If I write

[
linkme]

I get a StringIndexOutOfBoundsException:

java.lang.StringIndexOutOfBoundsException: String index out of range: -1
at java.lang.AbstractStringBuilder.charAt(AbstractStringBuilder.java:191)
at java.lang.StringBuilder.charAt(StringBuilder.java:72)
at org.tautua.markdownpapers.ast.Link.getText(Link.java:40)
at org.tautua.markdownpapers.ast.Link.getResource(Link.java:64)
at org.tautua.markdownpapers.HtmlEmitter.visit(HtmlEmitter.java:151)
at org.tautua.markdownpapers.ast.Link.accept(Link.java:92)
at org.tautua.markdownpapers.ast.SimpleNode.childrenAccept(SimpleNode.java:94)
at org.tautua.markdownpapers.HtmlEmitter.visit(HtmlEmitter.java:139)
at org.tautua.markdownpapers.ast.Line.accept(Line.java:31)
at org.tautua.markdownpapers.HtmlEmitter.visitChildrenAndAppendSeparator(HtmlEmitter.java:267)
at org.tautua.markdownpapers.HtmlEmitter.visit(HtmlEmitter.java:212)
at org.tautua.markdownpapers.ast.Paragraph.accept(Paragraph.java:39)
at org.tautua.markdownpapers.HtmlEmitter.visitChildrenAndAppendSeparator(HtmlEmitter.java:267)
at org.tautua.markdownpapers.HtmlEmitter.visit(HtmlEmitter.java:66)
at org.tautua.markdownpapers.ast.Document.accept(Document.java:55)
at org.tautua.markdownpapers.Markdown.transform(Markdown.java:34)

Embedding code snippets does not work

Embedding code snippets does not work with markdownpapers version 1.2.3.

The following code

<code>
String s1 = new String("");
String s2 = new String("");
</code>

is displayed as

String s = new String(""); String s1 = new String("");

I would expect the two statements to be displayed on separate lines.

lt and gt entities in tables

Release 1.2.3 doesn't like lt and gt entities within a td.

assertEquals("<table><tr><td>&lt;test&gt;</td></tr></table>", MarkdownUtils.transformMarkdown("<table><tr><td>&lt;test&gt;</td></tr></table>"));

** * _ not interpreted

I ran the following tests using the unit test from the Gitblit project. Bold, italic, and underscore emphasis are among the most common uses of Markdown.

Test string -> error

"*B*" -> null pointer
"* B *" -> null pointer
"**B**" -> null pointer
"** B **" -> java.text.ParseException: Encountered " "*" "* "" at line 1, column 2.

Was expecting one of:
" " ...
"\t" ...
" " ...

at com.gitblit.utils.MarkdownUtils.transformMarkdown(MarkdownUtils.java:68)
at com.gitblit.utils.MarkdownUtils.transformMarkdown(MarkdownUtils.java:44)
at com.gitblit.tests.MarkdownUtilsTest.testMarkdown(MarkdownUtilsTest.java:27)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at junit.framework.TestSuite.run(TestSuite.java:238)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)

=========== http://daringfireball.net/projects/markdown/basics
Markdown uses asterisks and underscores to indicate spans of emphasis.

Markdown:

Some of these words *are emphasized*.
Some of these words _are emphasized also_.

Use two asterisks for **strong emphasis**.
Or, if you prefer, __use two underscores instead__.

Output:

Some of these words are emphasized. Some of these words are emphasized also.

Use two asterisks for strong emphasis. Or, if you prefer, use two underscores instead.

Image will not parse with the format ![](http://some.url)

I am using version 1.4.3 and found that an image with an empty alt text is not parsed:

![](https://some.url) // will not parse

It is parsed when alt text is present:

![alt](https://some.url) // can parse

I think an image with empty alt text should be parsed, so this may be a bug.

Regressions in 1.2 compared to 1.1?

Hi Larry,

I use your library within my Git hosting project, Gitblit. I'm trying to keep updated on dependencies but the 1.2.x series is causing me trouble. I can not pinpoint the exact trouble as there seems to be more than one problem. I generate my project website from hybrid Markdown/Html sources which are kept alongside my project sources right here on GitHub. The project website built with the 1.1.1 release from the Markdown/Html sources is available here. I recognize that this is not a well defined issue, but maybe if you took a peek at my Markdown sources (docs folder) the flaw (either in my documents or in the library) would stand out.

Thanks,
James

images cannot be inserted

Hi!

While using the markdownpapers-doxia-module with the maven site plugin, some problems were found. The following code snippets led to double img tags in the html file. (<img<img)

![alt](path/to/image1.png "title")

[id]: path/to/image2.png "title"
![alt][id]

As a result, the images did not show up. In addition, this line resulted in a NullPointerException.

![alt](path/to/image.png)

Incorrect parsing of HTML entities representing numerical entities

In jjtree/Markdown.jjt, the expression that checks for numeric character references appears to be missing the '#' character that follows the ampersand. So while this expression searches for e.g. "&1234;", according to e.g. http://en.wikipedia.org/wiki/Numeric_character_reference it should instead look for "&#1234;". The same holds for hexadecimal numeric references, which start with "&#x" and not just "&x" as implemented here:

| < NUMERIC_CHAR_REF : "&" ( ( ["0"-"9"] ){1,4} | "x" ( ["0"-"9", "a"-"f", "A"-"F"] ){1,4} ) ";" >
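
To make the difference concrete, here is a minimal sketch in plain Java regular expressions (not the JavaCC grammar itself). The class and constant names are hypothetical; the `{1,4}` length limits are copied from the token above, and the "corrected" pattern is only an assumption about what the fix might look like.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of MarkdownPapers.
final class NumericCharRefCheck {
    // Mirrors the token as reported: no '#' after the ampersand.
    static final Pattern REPORTED =
            Pattern.compile("&(?:[0-9]{1,4}|x[0-9a-fA-F]{1,4});");

    // Assumed correction: '#' is required, followed by decimal digits
    // or by 'x' and hexadecimal digits.
    static final Pattern CORRECTED =
            Pattern.compile("&#(?:[0-9]{1,4}|x[0-9a-fA-F]{1,4});");

    static boolean isNumericCharRef(String s) {
        return CORRECTED.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isNumericCharRef("&#1234;"));
        System.out.println(isNumericCharRef("&#x1F;"));
        System.out.println(isNumericCharRef("&1234;"));
    }
}
```

With the reported pattern, "&1234;" is accepted and "&#1234;" is rejected; the assumed correction reverses that.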

Links don't work with digits in URL

(using version 1.3.1 from Maven)
This markup:

Try a [reference][here]. 
[here]: http://www.example-0.com/

gives

Try a reference.

0.com/

Also inline links give a parse error,

org.tautua.markdownpapers.parser.ParseException: Encountered " <NUMBERING> "0. ""

It seems to be punctuation followed by a number; www.example1.com is OK, www.example-1.com is not.

ParseException in case of "-" characters

Hi,

I have just realized that "-" characters are not handled correctly when they appear in path strings. The example below leads to a ParseException:

![id](path/to/my-photo.png)

The same is true for links.

PlainTextEmitter

It would be very convenient if you created a PlainTextEmitter that produces plain text. This could sometimes be a useful feature.

Content "<3 Play!" produces java.text.ParseException

User generated content "<3 Play!" produces java.text.ParseException : java.text.ParseException: Encountered " "!" "! "" at line 1, column 8. Was expecting one of: "=" ... "=" ... "=" ... "=" ... "=" ... "=" ... "=" ... "=" ... "=" ... "=" ... "=" ... "=" ... "=" ... "=" ... "=" ... "=" ... "=" ... "=" ... "=" ... "=" ... "=" ...

Code block within a list item

According to Markdown syntax, it should be possible to include code blocks within a list item as follows:

This is a list:

* first item, with some code:

        int x = 10;

* second item

Thank you

(Note that the code block is indented twice with 8 spaces as required)

The output from this parser is:

<p>This is a list:</p>
<ul>
<li>
<p>first item, with some code:</p>
<p>        int x = 10;</p>
</li>
<li><p>second item</p></li>
</ul>
<p>Thank you</p>

However, the expected output should be:

<p>This is a list:</p>
<ul>
<li>
<p>first item, with some code:</p>
<pre><code>int x = 10;</code></pre>
</li>
<li><p>second item</p></li>
</ul>
<p>Thank you</p>

Tested on MarkdownPapers v1.2.3 and v1.3.3, Java 1.6.0.41

Strikethrough support

It would be useful to support strikethrough mode via leading and trailing double-dashes. E.g., dash-dash-foo-dash-dash becomes foo.
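
Until something like this exists in the library, the idea can be approximated outside the parser. The following is a rough sketch, assuming a pre- or post-processing step on plain strings; the class name, the method name, and the choice of `<del>` as the output tag are all hypothetical and not part of MarkdownPapers.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of MarkdownPapers.
final class StrikethroughSketch {
    // "--text--" where the text starts and ends with a non-space character.
    // The lookarounds skip runs of three or more dashes and the
    // HTML comment markers "<!--" and "-->".
    private static final Pattern STRIKE =
            Pattern.compile("(?<![-!])--(?=\\S)(.+?)(?<=\\S)--(?![->])");

    static String applyStrikethrough(String text) {
        return STRIKE.matcher(text).replaceAll("<del>$1</del>");
    }

    public static void main(String[] args) {
        System.out.println(applyStrikethrough("--foo--")); // prints <del>foo</del>
    }
}
```

A regular expression cannot tell whether the dashes sit inside a code span or code block, so a real implementation would belong in the grammar rather than in a string filter.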

Quoted entities in code blocks are quoted twice

The following markdown text:

Some text &quot;foo&quot; `hehe` &lt; 3

    some code `with` &quot;stuff&quot; &lt; 2

boo

Results in the following HTML:

<p>Some text &quot;foo&quot; <code>hehe</code> &lt; 3</p>

<pre><code>some code `with` &amp;quot;stuff&amp;quot; &amp;lt; 2
</code></pre>

<p>boo</p>

Note the &amp;quot;, which is escaped twice.

Incorrect generation of HTML when entities are positioned inside emphasis underscores

This test case shows incorrect HTML generation when using entities positioned inside emphasis underscores.

__Bold&trade;__

should appear as:

Bold™

import java.io.StringReader;
import java.io.StringWriter;

import org.junit.Test;
import org.tautua.markdownpapers.Markdown;
import org.tautua.markdownpapers.parser.ParseException;

import static org.junit.Assert.assertEquals;

public class MarkdownTest
{

    @Test
    public void testIncorrectEntityTransform()
    {
        // incorrectly generates:
        //            <p>__Bold&trade;__</p>
        assertEquals("<p><strong>Bold&trade;</strong></p>\n", transform("__Bold&trade;__"));
    }

    @Test
    public void testNoEntityTransform()
    {
        assertEquals("<p><strong>Bold</strong></p>\n", transform("__Bold__"));
    }

    @Test
    public void testEntityNotWrappedTransform()
    {
        assertEquals("<p><strong>Bold</strong>&trade;<em>Italics</em></p>\n", transform("__Bold__&trade;_Italics_"));
    }

    private String transform(String in)
    {
        StringWriter out = new StringWriter();
        Markdown md = new Markdown();

        try
        {
            md.transform(new StringReader(in), out);
        }
        catch (ParseException e)
        {
            throw new RuntimeException("Error parsing Markdown", e);
        }

        return out.toString();        
    }

}

Using markdownpapers-core 1.2.7

Parser error in a link's title when "-" appears

The following Markdown will not parse:

[Heise-Online](http://www.heise.de "Heise-Online")

Error message

org.tautua.markdownpapers.parser.ParseException: Encountered " "-" "- "" at line 1, column 42.

Reported by Andreas Schouten

Is there a recommended way to prevent JS injection on user generated content?

Hi,

We are using MarkdownPapers through the Markdown Play! module for user-generated content: users are able to fill in their profile with Markdown-enabled content.

MarkdownPapers does not prevent JavaScript injection (and it might not be its job to do so).
For example, the following Markdown content produces working HTML and JavaScript:

<script language="javascript">function yes(){document.location.href="https://github.com/lruiz/MarkdownPapers";}</script>
<div onmouseover="yes()">Hover with your mouse</div>

Is there a recommended workaround to prevent JavaScript injection?
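
One possible workaround, sketched here under stated assumptions rather than as a recommendation from the library: neutralize raw HTML in the user input before it reaches the parser, by turning every "<" into an entity. The class and method names are hypothetical.

```java
// Hypothetical helper, not part of MarkdownPapers.
final class RawHtmlNeutralizer {
    // Replaces every '<' with the entity "&lt;" so that tags typed by the
    // user (script, div with event attributes, ...) are no longer tags.
    static String neutralize(String markdown) {
        return markdown.replace("<", "&lt;");
    }

    public static void main(String[] args) {
        String input = "<div onmouseover=\"yes()\">Hover with your mouse</div>";
        System.out.println(neutralize(input));
    }
}
```

This is deliberately blunt. It also disables legitimate inline HTML and autolinks written with angle brackets, it may lead to double escaping inside code blocks, and it does nothing about `javascript:` URLs in ordinary Markdown links. Running the generated HTML through a dedicated allow-list sanitizer is generally the more robust route.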

<img src="http://..."> tag causes parser error

Hi,

I tried several other Markdown processors and all of them parsed img tags properly. With this one, however, I got a ParseException because the parser would not accept the ":" in "http://".

example:
(edit: uhm ... this editor directly replaces the <img ... with the image, so it obviously is valid in Markdown) ... but you certainly understand what I mean ...

caused:
Exception in thread "main" org.tautua.markdownpapers.parser.ParseException: Encountered " ":" ": "" at line 32, column 14.
Was expecting one of:
......

Thx in advance for helping me here

All the best,
Thomas

Embedded html tags get escaped

MarkdownPapers escapes my embedded HTML tags when they have a data-xx attribute:

<div data-bind="foreach:...">
foo
</div>

Becomes:

<p>&lt;div data-bind=&quot;foreach:...&quot;&gt;
foo
&lt;/div&gt;</p>

Parser fails when it encounters a lone "!"

The parser fails when it encounters a "!" on its own, for example:

  Hello World !

line [1] Encountered " "!" "! "" at line 1, column 13.
[ERROR] Was expecting one of:
[ERROR] " " ...
[ERROR] "\t" ...
[ERROR] "&" ...
[ERROR] "`" ...

Rel attribute in links

When rendering Markdown with this library, generated links always have the form <a href="..">. I would like them to be emitted as <a rel="nofollow" href="..."> so that links in my user-generated content do not pass on page rank.

Maybe you could add a flag to the constructor to enable this.

Thanks.
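
In the meantime, a post-processing step on the generated HTML can get close. This is a minimal sketch that relies on the observation above that links are always emitted as `<a href=`; the class and method names are hypothetical.

```java
// Hypothetical helper, not part of MarkdownPapers.
final class NofollowPostProcessor {
    // Inserts rel="nofollow" into every anchor that starts with <a href=.
    static String addNofollow(String html) {
        return html.replace("<a href=", "<a rel=\"nofollow\" href=");
    }

    public static void main(String[] args) {
        System.out.println(addNofollow("<p><a href=\"http://example.com\">link</a></p>"));
    }
}
```

Anchors that users write as raw inline HTML in the same form are rewritten as well, which is probably desirable for user-generated content.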

URL with "!" character produces ParseException

Markdown content with a link to a URL containing a "!" character (such as a Twitter URL) produces a ParseException.
For example, the following content produces a ParseException: [My twitter account](https://twitter.com/#!/clacote)

Incorrect ending tag

When I have

        <i class="icon-download"></i> // explicitly close tag

It becomes

        <i class="icon-download"/> // the close tag is gone

Note that the <i> tag cannot be self-closed; the resulting HTML causes page rendering problems in browsers.
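
As a stopgap, the self-closed form can be expanded again after rendering. The sketch below is a hypothetical post-processor, not part of MarkdownPapers; it leaves HTML void elements such as br and img alone and rewrites every other self-closed tag into an explicit open/close pair.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of MarkdownPapers.
final class SelfClosingTagFixer {
    // Matches <name attrs/> unless name is an HTML void element.
    private static final Pattern SELF_CLOSED = Pattern.compile(
            "<(?!(?:area|base|br|col|embed|hr|img|input|link|meta|param|source|track|wbr)\\b)"
            + "([A-Za-z][A-Za-z0-9]*)((?:\\s[^<>]*?)?)\\s*/>");

    static String expand(String html) {
        // $1 is the tag name, $2 the attribute text (possibly empty).
        return SELF_CLOSED.matcher(html).replaceAll("<$1$2></$1>");
    }

    public static void main(String[] args) {
        System.out.println(expand("<i class=\"icon-download\"/>"));
    }
}
```

The pattern works on raw text, so it assumes attribute values do not contain angle brackets.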

An asterisk with a space causes a syntax error

If I'm writing a bulleted list and skip an entry, I get an error.

For example:

  • foo
  • bar

(The second line is asterisk,space)

The error is

Encountered " "\n "" at line 2, column 3. Was expecting one of: " " ... "\t" ... "&" ... "" ... "" ... "!" ... ":" ... "\"" ... "=" ... ">" ... "[" ... "(" ... "<" ... "-" ... "+" ... "]" ... ")" ... "#" ... "\'" ... "/" ... "*" ... "_" ... "<!--" ... "-->" ... <CODE_SPAN> ... <NUMBERING> ... <CHAR_ENTITY_REF> ... <NUMERIC_CHAR_REF> ... <ESCAPED_CHAR> ... <CHAR_SEQUENCE> ... " " ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... "*" ... " " ... " " ... " " ... " " ... <NUMERIC_CHAR_REF> ... <CHAR_ENTITY_REF> ... <CODE_SPAN> ... "[" ... "!" ... "<" ... "_" ... "*" ... " " ... "<" ... <CHAR_SEQUENCE> ... " " ... "&" ... "\\" ... "" ... "!" ... ":" ... "-->" ... "<!--" ... """ ... "=" ... <ESCAPED_CHAR> ... ">" ... "[" ... "(" ... "<" ... "-" ... ... "+" ... "]" ... ")" ... "#" ... "'" ... "/" ... "" ... "\t" ... "" ... "[" ... "!" ... "<" ... "" ... "" ... " " ... "<" ...

Emphasized line in a list is not highlighted correctly

As I understand Markdown, the following code

- Hello

  *World*

should create a list with one item and two paragraphs, where the second one is emphasized.

But the following test does not work:

import java.io.StringReader;
import java.io.StringWriter;
import static org.junit.Assert.*;
import org.junit.Test;
import org.tautua.markdownpapers.Markdown;
import org.tautua.markdownpapers.parser.ParseException;

public class MarkdownTest {

    @Test
    public void emphasisAroundElementInAList() {
        String strong = transform("- Hello\n\n  **World**");
        String     em = transform("- Hello\n\n  *World*");

        assertEquals(strong.replace("strong>", "em>"), em);
    }

    private String transform(String in) {
        StringWriter out = new StringWriter();
        Markdown md = new Markdown();

        try {
            md.transform(new StringReader(in), out);
        } catch (ParseException e) {
            throw new RuntimeException("Error parsing Markdown", e);
        }

        return out.toString();
    }
}

Rendering issues

I was testing your engine and found multiple differences from other implementations, especially this one: http://joncom.be/experiments/markdown-editor/edit/

The resulting HTML, when processed with MarkdownPapers, is slightly different.

Here is the test document I am using:

Syntax Cheatsheet

Phrase Emphasis

italic bold
italic bold

Links

Inline:
An example

Reference-style labels (titles are optional):
An example. Then, anywhere
else in the doc, define the link:

Images

Inline (titles are optional):
alt text

Reference-style:
alt text

Headers

Setext-style:

Header 1

Header 2

atx-style (closing #'s are optional):

Header 1

Header 2

Header 6

Lists

Ordered, without paragraphs:

  1. Foo
  2. Bar

Unordered, with paragraphs:

  • A list item.

With multiple paragraphs.

  • Bar

You can nest them:

  • Abacus
  • answer
  • Bubbles
  1. bunk
  2. bupkis
    - BELITTLER
  3. burper
  • Cunning

Blockquotes

Email-style angle brackets
are used for blockquotes.

And, they can be nested.

Headers in blockquotes

  • You can quote a list.
  • Etc.

Code Spans

<code> spans are delimited
by backticks.

You can include literal backticks like this.

Preformatted Code Blocks

Indent every line of a code block by at least 4 spaces or 1 tab.
This is a normal paragraph.

This is a preformatted
code block.

Horizontal Rules

Three or more dashes or asterisks:




Manual Line Breaks

End a line with two or more spaces:
Roses are red,
Violets are blue.

Javascript Injection

It's possible to inject malicious JavaScript code into Markdown text.

You should disallow the script tag and all event attributes (onXXXX):

<script language="javascript">
function yes(){
document.location.href="http://www.mysite.com";
}
</script>
<div onmouseover="yes()">
Bla Bla
</div>

Parsing error

ParseException occured : Encountered " "-" "- "" at line 75, column 19. Was expecting one of: " " ... "\t" ... "&" ... "" ... "" ... "\"" ... ">" ... "(" ... "<" ... ")" ... "\'" ... "/" ... ... ... ... ... " " ... "&" ... ... ... ">" ... "(" ... "<" ... ")" ... "\t" ... "/" ... "\\" ... "" ... "'" ... """ ... """ ... """ ... """ ... """ ... """ ... """ ... """ ... """ ... """ ... """ ... """ ... ... "<" ... " " ... "<" ... ... " " ... "&" ... "" ... "" ... "\"" ... ... ">" ... "(" ... "<" ... ... " " ... "&" ... ... ... ">" ... "(" ... "<" ... ")" ... "\t" ... "/" ... "\\" ... "" ... "'" ... """ ...

Line 75 is the opening div; the parser actually fails on custom-class.

<div class="custom-class" markdown="1">
This is a div wrapping some Markdown plus.  Without the DIV attribute, it ignores the 
block. 
</div>
