mdformat's People

Contributors

butler54, chrisjsewell, cube707, dragoncrafted87, fabaff, hukkin, hukkinj1, kyleking, paugier, pre-commit-ci[bot], rpdelaney

mdformat's Issues

Document `--number` option

Forgot to add this to #41, but the ordered list numbering should be documented, for both the CLI and API usage.
Also, you might want to link in the readme to tests/data/fixtures.md, as a useful source of example formatting.

Consecutive numbered lists

I can see the thinking for using 1. for all numbered list items; with a renderer handling proper numbering.
However, one of the benefits of Markdown is that it is easily readable without the need for rendering.
This is diminished with this sort of (non-)numbering, and I bet most people would opt for consecutive numbering, if given the choice.

I would say there should at least be the option (if not in "core" then via a plugin) to achieve this.
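
As a rough illustration of the two numbering styles discussed here, the following sketch renders a flat ordered list either with a repeated marker or with consecutive numbers. The function name `render_ordered_list` and its `consecutive` flag are hypothetical and are not part of mdformat's API.

```python
def render_ordered_list(items, start=1, consecutive=False):
    """Render items as a Markdown ordered list.

    With consecutive=False every item repeats the start number
    ("1." for all items); with consecutive=True numbers increment.
    """
    lines = []
    for offset, item in enumerate(items):
        number = start + offset if consecutive else start
        lines.append(f"{number}. {item}")
    return "\n".join(lines)


print(render_ordered_list(["apples", "pears", "plums"], consecutive=True))
```

Both outputs render to the same HTML list; only the readability of the Markdown source differs.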

Escape dots less

Currently a header like

## 5.0.0

would have its dots escaped and be formatted like

## 5\.0\.0

Improve dot escaping. Only escape if the dot could start an ordered list.
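
A minimal sketch of such a rule, assuming it is applied to one line of paragraph text at a time. In CommonMark an ordered list marker is 1 to 9 digits followed by `.` (or `)`) and then a space, a tab, or the end of the line, so only a dot in that position needs a backslash. The helper name `escape_dots` is hypothetical and this is not mdformat's actual implementation.

```python
import re

# 1-9 digits, a dot, then space/tab or end of line: the only dot position
# that could be read as an ordered list marker.
_ORDERED_MARKER = re.compile(r"(\d{1,9})\.(?=[ \t]|$)")


def escape_dots(line: str) -> str:
    """Escape the dot only when the line could start an ordered list."""
    stripped = line.lstrip(" ")
    indent = line[: len(line) - len(stripped)]
    match = _ORDERED_MARKER.match(stripped)
    if match is None:
        return line
    return f"{indent}{match.group(1)}\\.{stripped[match.end():]}"


print(escape_dots("5.0.0"))          # left alone
print(escape_dots("1. not a list"))  # dot gets escaped
```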

Escape square brackets less

Linked issue #110

Prettier seems to do a better job at square bracket escaping. Mdformat currently always escapes square brackets (to not accidentally produce link label enclosures) whereas it seems that Prettier keeps escapes from the source text.

The token stream we get from the parser is mainly just abstract syntax, so we don't know if the escapes were there in the source or not. Thus doing exactly what Prettier does is not currently easy to implement.

Like with other escapes, we could try to come up with logic that detects cases where link label enclosure certainly will not be accidentally produced and not escape if that is the case.

Note that this approach when implemented well enough can actually be an improvement over simply sustaining escapes from source, as it will automatically remove needless escapes.

Expose more things in the public API

As discussed in executablebooks/rst-to-myst#18 (comment):

I think we may want to work on making something like build_mdit public to alleviate the various gotchas involved (regarding plugins and default conf options for example).

Also things in mdformat/renderer/_util.py, e.g. mdformat-myst currently duplicates longest_consecutive_sequence.

Links should not be "resolved"

Hmm, this is interesting.
Running mdformat on:

# mdformat-tables

[![Build Status][ci-badge]][ci-link]
[![codecov.io][cov-badge]][cov-link]
[![PyPI version][pypi-badge]][pypi-link]

An [mdformat](https://github.com/executablebooks/mdformat) plugin for rendering tables.

[ci-badge]: https://github.com/executablebooks/mdformat-tables/workflows/CI/badge.svg?branch=master
[ci-link]: https://github.com/executablebooks/mdformat/actions?query=workflow%3ACI+branch%3Amaster+event%3Apush
[cov-badge]: https://codecov.io/gh/executablebooks/mdformat-tables/branch/master/graph/badge.svg
[cov-link]: https://codecov.io/gh/executablebooks/mdformat-tables
[pypi-badge]: https://img.shields.io/pypi/v/mdformat-tables.svg
[pypi-link]: https://pypi.org/project/mdformat-tables

goes to

# mdformat-tables

[![Build Status](https://github.com/executablebooks/mdformat-tables/workflows/CI/badge.svg?branch=master)](<https://github.com/executablebooks/mdformat/actions?query=workflow%3ACI+branch%3Amaster+event%3Apush>)
[![codecov.io](https://codecov.io/gh/executablebooks/mdformat-tables/branch/master/graph/badge.svg)](<https://codecov.io/gh/executablebooks/mdformat-tables>)
[![PyPI version](https://img.shields.io/pypi/v/mdformat-tables.svg)](<https://pypi.org/project/mdformat-tables>)

An [mdformat](<https://github.com/executablebooks/mdformat>) plugin for rendering tables.

I'd say that's not ideal behaviour 😬

If not in "core", I think we at least want a plugin that can "inhibit" this type of behaviour.

Line wrap changes short lines (and throws error on --check)

Describe the bug

When mdformat is used with the --wrap and --check options and multiple lines are shorter than the specified wrap width, the file is flagged as not formatted.

To Reproduce

Steps to reproduce the behavior:

docker run -it --rm ubuntu
apt update && apt install -y python3 python3-pip
pip3 install mdformat
cat > test.md << EOF
Test aa
and test bb
EOF
mdformat --check --wrap 80 test.md

Expected behavior

No error as no line is too long.

Additional context

One could argue that shorter lines should be left alone even without the --check option, which I think is how most other formatters behave, but I can also see arguments for changing them. In any case, keeping the check and no-check behavior consistent is probably a good idea.

Keep shortcut reference links

Currently shortcut reference links are converted into full reference links.

For example, input:

[foo]

[foo]: /url "title"

Output:

[foo][foo]

[foo]: /url "title"

Let's try to keep the shortcuts as shortcuts instead.

Linked issue: #110

Plugin system for code formatting

Make a plugin system for code formatting.

As an example, we should be able to pip install mdformat-black after which all Python code blocks will be formatted with Black.

E.g.

~~~python
'''black hates single quotes'''
~~~

will be formatted as

~~~python
"""black hates single quotes"""
~~~

Sentence-based word wrapping

Experiment with something like

from nltk import tokenize
sentences = tokenize.sent_tokenize(paragraph)

and find out if we can implement sentence-based word wrapping to reduce diffs.

Implement as an option; don't change the default mode (which preserves wrapping).
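
To make the idea concrete without the nltk dependency, here is a sketch that puts each sentence on its own line. It uses a naive regular expression as a stand-in for `sent_tokenize`, which is an assumption made for illustration only: it will split wrongly after abbreviations such as "e.g." where a real tokenizer would not. The name `wrap_by_sentence` is hypothetical.

```python
import re

# Naive sentence boundary: whitespace that follows ".", "!" or "?".
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def wrap_by_sentence(paragraph: str) -> str:
    """Reflow a paragraph so that every sentence sits on its own line."""
    # Collapse existing line breaks and runs of whitespace first.
    text = " ".join(paragraph.split())
    return "\n".join(_SENTENCE_END.split(text))


print(wrap_by_sentence("First one. Second\none! Third?"))
```

With one sentence per line, editing a sentence only touches its own line, which is what keeps diffs small.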

Support for "ignore" comments

Is your feature request related to a problem? Please describe.

Some officially supported Python Markdown extensions (like admonition) use non-standard Markdown syntax. These cases are not properly handled by the parser currently. For example:

# Some title

!!! note
This will render as a tooltip box.

Back to normal markdown content.

Describe the solution you'd like

It would be useful if there was an "ignore" comment that could be added to a markdown file to skip formatting specific sections. The prettier library uses <!-- prettier-ignore --> and <!-- prettier-ignore-end --> for this purpose.

Describe alternatives you've considered

Plugins could be written to handle cases like this, but it would be useful if there was a global way to disable formatting in specific sections of a document.

Additional context

It looks like there's a similar discussion happening on an existing PR about how to ignore blocks of code. I'd be fine with a solution where I could wrap certain sections in a div that sets a class that's globally ignored, but I feel like a plain comment would be a better interface for this use case, as it wouldn't require modifying the AST of the markdown.

Hard break in a setext heading

E.g.

First line\
second line
===========

I've never seen this in the wild, but hard break in a setext heading is something that AFAIK has no ATX representation. So we have to render a setext heading in this case.

Plugin system for extending the parser

Perhaps similar to #12, it would be nice to provide a hook for extending the parser with the available plugins: https://markdown-it-py.readthedocs.io/en/latest/plugins.html#plugin-extensions

As mentioned in executablebooks/markdown-it-py#10, I guess this would work best such that it is the plugin's responsibility to provide a parser method? Maybe allowing for the parsing of one or more renderer defined options, something like:

md = MarkdownIt(renderer_cls=MDRenderer)
md.use(a_plugin, md_seperator=MARKERS.BLOCK_SEPARATOR)
md.render(md)

Line Ending Changing with formatting

Describe the bug
The file is always output with CRLF line endings on Windows, even if no formatting was changed.

To Reproduce

  1. Start with a valid Markdown file on Windows with LF (Unix) line endings
  2. Run the file through mdformat
  3. The file now has CRLF (DOS) line endings

Expected behavior
File has same line ending style it started with
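
One way to get that behavior, sketched under the assumption that the formatter works on LF-normalized text internally: detect the newline style of the original content and re-apply it to the formatted result. The helper names are hypothetical and do not come from mdformat.

```python
def detect_newline(text: str) -> str:
    """Return "\\r\\n" if the first line break is CRLF, otherwise "\\n"."""
    first_lf = text.find("\n")
    if first_lf > 0 and text[first_lf - 1] == "\r":
        return "\r\n"
    return "\n"


def restore_newlines(formatted: str, original: str) -> str:
    """Give the formatted text the same line ending style as the original."""
    normalized = formatted.replace("\r\n", "\n")
    return normalized.replace("\n", detect_newline(original))


print(repr(restore_newlines("a\nb\n", "x\r\ny\r\n")))
```

For this to work the file has to be read and written with `open(path, newline="")`, since Python's default text mode translates line endings on its own.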

Environment

Windows 10
Python 3.9.5
pre-commit 2.13.0

Style change discussion: More distinguishable thematic breaks

Thematic breaks are currently rendered as three underscores. Rendering more underscores would make the breaks easier to detect in Markdown source, without any downsides that I can think of.

If this is a reasonable change, then the question is raised, what is a good amount of characters to render. 50? 72? 80? Surely not more than 80 as that is still a common default width for terminals...

Requires a `language_version` override in Python 2 environments

Describe the bug

When attempting to install the hook I would assume all you have to do is have a .pre-commit-config.yaml with the following configuration:

-   repo: https://github.com/executablebooks/mdformat
    rev: 0.5.6
    hooks:
        -   id: mdformat

However, for any Python 2 environment (I am on 2.7.16) it seems you have to put in a language_version override to get it to install properly resulting in (something like):

-   repo: https://github.com/executablebooks/mdformat
    rev: 0.5.6
    hooks:
        -   id: mdformat
            language_version: 3.7.7

I believe there should be a default language version set so that you don't have to override this value.

The reason I ask about this is because other hooks, black for example, also only run on Python 3 and do not need this override.

To Reproduce

Steps to reproduce the behavior:

  1. Create a python virtual environment where the version of python is not supported for this hook
  2. Attempt to create a pre-commit config without language_version set.
  3. pre-commit install --install-hooks

Expected behavior

Expected behavior is that it installs properly.

Environment

  • Python Version [e.g. 3.7.1]: 2.7.16
  • Package versions or output of jupyter-book --version: 0.5.6
  • Operating System: MacOS

Additional context

N/A

MyST plugin

Build mdformat-myst plugin.

Come up with a set of parser and renderer extensions to be able to format MyST. Many of the syntax extensions make sense outside the MyST context, so they should be self-contained plugins that mdformat-myst requires and activates, similar to how mdformat-gfm does with mdformat-tables.

Syntax extensions

TODO

  • Implement all extensions
  • Wait for compatible mdformat-footnote release (executablebooks/mdformat-myst#2)
  • Release on PyPI
  • Mention the plugin in mdformat docs

Links

  • This seems to be the place where myst-parser itself prepares a MarkdownIt instance for myst parsing. We should set the same extensions in myst_plugin.update_mdit(). Also see this place where myst-parser sets MarkdownIt options.

Convert list item markers

Convert list item marker symbols. For unordered lists, always prefer -. If there are sequential lists, alternate between - and *.

Do the same for ordered lists using . (preferred) and ).
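
The alternation matters because in CommonMark a change of marker character is what starts a new list; two adjacent lists that share a marker would be parsed as one. A small sketch of the idea for unordered lists follows. The function `render_sibling_lists` is hypothetical and handles only flat lists.

```python
def render_sibling_lists(lists, markers=("-", "*")):
    """Render adjacent lists, alternating the marker so they stay separate."""
    blocks = []
    for index, items in enumerate(lists):
        # The first list gets the preferred marker, the next one the alternate.
        marker = markers[index % len(markers)]
        blocks.append("\n".join(f"{marker} {item}" for item in items))
    return "\n\n".join(blocks)


print(render_sibling_lists([["a", "b"], ["c"], ["d"]]))
```

Passing `markers=(".", ")")` style delimiters would need a slightly different item template, since ordered markers follow a number rather than stand alone.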

Consistent indentation levels

Is your feature request related to a problem? Please describe.

I use an extension called "All in one Markdown" on VSCode; one of its most useful features is the automatic generation of a ToC. ToC generation works just fine when I have lists that are 2 layers deep (list point with sub-points) but breaks for more than that (sub-sub-points).

Describe the solution you'd like

Consistent non-adaptive indentation with empty lines between list items.

- list level 1

     - list level 2
     
          - list level 3

Describe alternatives you've considered

Nothing reasonable

Additional context

This is the mdformat-gfm formatted file (I ended up writing the ToC manually): file

GFM support

Come up with the necessary plugins to format GitHub Flavored Markdown.

Features

  • Tables (mdformat-tables)
  • Task list items (mdformat-gfm)
  • Strikethrough (mdformat-gfm)
  • Extended autolinks (not sure if changes needed, not making any right now)
  • Disallowed raw HTML (no changes needed)

Collection plugin

It might make sense to make a collection plugin (e.g. mdformat-gfm) that installs all GFM extensions.

Resources

Use upstream SyntaxTreeNode

Once markdown-it-py>0.6.2 is released, remove the bundled _tree.SyntaxTreeNode class and import the equivalent class from upstream.

API for plugins to collaboratively render a node

The current API lets plugins add new renderer functions (for node types that don't have a default one) or override existing functions. There is no good way for plugins to collaborate on the same node type, however.

This would be useful e.g. for plugins to set character escape rules by collaboratively modifying renderers of paragraph and text syntaxes. This would solve issues like executablebooks/mdformat-tables#10

Reduce asterisk (*) escaping

Like the underscore, the asterisk can start (among other things) an emphasis or strong emphasis block. We need some heuristic that ensures we don't introduce unwanted emphasis blocks if we leave the character unescaped. Currently I'm thinking of using the following:

  • Escape an asterisk always, unless
    • Both surrounding characters are Unicode whitespace, or start or end of line

Related issue about underscores: #119
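
The asterisk rule proposed above can be sketched literally as follows. This is an illustration only: `str.isspace()` is used as an approximation of "Unicode whitespace", the name `escape_asterisks` is hypothetical, and the sketch deliberately ignores the other roles of the asterisk (bullet list markers, thematic breaks), which a real implementation would also have to consider.

```python
def escape_asterisks(line: str) -> str:
    """Escape every asterisk unless whitespace or a line edge is on both sides."""
    out = []
    last = len(line) - 1
    for i, ch in enumerate(line):
        if ch != "*":
            out.append(ch)
            continue
        before_ok = i == 0 or line[i - 1].isspace()
        after_ok = i == last or line[i + 1].isspace()
        out.append("*" if before_ok and after_ok else "\\*")
    return "".join(out)


print(escape_asterisks("2 * 3 is not *emphasis*"))
```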

Don't clear trailing whitespace

Describe the problem

Markdown defines that two trailing spaces means a new line, hence trailing whitespace shouldn't be cleared from the document when running mdformat (see here).

Link to your repository or website

No response

Steps to reproduce

  1. Create a .md file
  2. Write a line, say a block quote, that has trailing whitespace like:
>this line has two trailing spaces after it  
>so that this is actually a newline
  3. Run mdformat on the file
  4. Observe the failure to preserve the whitespace

The version of Python you're using

3.9

Your operating system

Linux 5.12 kernel

Versions of your packages

mdformat 0.7.7
mdformat-gfm 0.3.2
mdformat-tables 0.4.1

Additional context

No response

Don't Escape Underscore in Links

Describe the bug

A standalone link with underscores in it was changed:

"example.com/this_page_here"

to

"example.com/this\_page\_here"

The link was standalone, e.g. not in the Markdown link format within parentheses.

  • Python 3.7
  • Mac

Handle parsing errors/warnings

Currently, exceptions in code formatting plugins are silently ignored.
I think this is a bit "dangerous" in that, if a plugin is faulty, people won't realise that their files are not being formatted as intended.

At a minimum, I think the errors should be logged, as I have added in #36.
But perhaps also it should be the default or an option (using #35) to fail on such exceptions.

Another place where reporting/failing should be considered, is when duplicate_refs are found in the env (after parsing)

Add a guard against unsafe changes

Add a guard that checks there are no significant changes to the HTML rendered from formatted Markdown.

Do something simple like:

  • Render HTML from source Markdown
  • Render HTML from formatted Markdown
  • Strip all whitespace from the HTMLs and compare them

If the HTMLs are not equal, don't apply any formatting. Instead print an error message asking to submit an issue to https://github.com/hukkinj1/mdformat/issues
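
The comparison steps above could look roughly like this. The sketch takes the HTML renderer as a parameter so that it stays independent of any particular parser; `is_safe_change` and `toy_render` are hypothetical names, and the toy renderer exists only to make the example runnable.

```python
from typing import Callable


def is_safe_change(
    source_md: str, formatted_md: str, render_html: Callable[[str], str]
) -> bool:
    """Return True if both texts render to the same HTML, ignoring whitespace."""

    def normalize(html: str) -> str:
        # Strip all whitespace, as suggested in the steps above.
        return "".join(html.split())

    return normalize(render_html(source_md)) == normalize(render_html(formatted_md))


def toy_render(md: str) -> str:
    """Stand-in renderer for the example: wraps the text in one paragraph."""
    return "<p>" + " ".join(md.split()) + "</p>"


print(is_safe_change("a  b\n", "a b\n", toy_render))
```

In practice `render_html` could be something like the `render` method of a markdown-it-py `MarkdownIt` instance. Note that stripping all whitespace also hides changes such as "a b" becoming "ab", so this check is a coarse guard rather than a proof of equivalence.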

Differences to prettier style

This issue documents the main differences noted in mdformat style and prettier style when formatting Markdown in https://github.com/lambda-fairy/maud

  1. Mdformat moves link reference definitions to the bottom of the doc
  2. Mdformat orders link reference definitions
  3. Mdformat applies non-numbering to lists (by default)
  4. Mdformat converts indented code blocks to fenced code blocks
  5. Mdformat converts shortcut reference links into full reference links, e.g. [ref link] into [ref link][ref link] EDIT: This is fixed in version 0.5.5
  6. Mdformat escapes square brackets. Prettier preserves escapes from source text.

To me everything else seems desirable, besides the last two differences (5. and 6.). I will turn those into separate issues.

Escape hashes less

Input:

- Recalculate secondary dependencies between rounds (#378)

Current output:

- Recalculate secondary dependencies between rounds (\#378)

Come up with logic that allows us to escape the hash character # in fewer cases. Only escape if the character would start an ATX heading.
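
A minimal sketch of that logic, assuming it is applied per line of paragraph text. In CommonMark an ATX heading opens with up to three spaces of indentation, then 1 to 6 `#` characters, then a space, a tab, or the end of the line; a hash anywhere else is harmless. The helper name `escape_hashes` is hypothetical.

```python
import re

# Up to 3 spaces, 1-6 hashes, then space/tab or end of line.
_ATX_OPENER = re.compile(r" {0,3}#{1,6}(?=[ \t]|$)")


def escape_hashes(line: str) -> str:
    """Escape the first hash only if the line would parse as an ATX heading."""
    if _ATX_OPENER.match(line) is None:
        return line
    first_hash = line.index("#")
    return line[:first_hash] + "\\" + line[first_hash:]


print(escape_hashes("Recalculate secondary dependencies between rounds (#378)"))
print(escape_hashes("# not meant as a heading"))
```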

Failed output Pre-commit Hook

Describe the bug

When mdformat runs as a pre-commit hook and fails, a secondary error occurs and mdformat's own error output is not shown.

Pre-commit setup:

repos:
-    repo: https://github.com/executablebooks/mdformat
     rev: 0.5.6
     hooks:
         -    id: mdformat
              args: [--check]

Error when running with pre-commit:

Traceback (most recent call last):
  File "/Users/emily.hontoria/.cache/pre-commit/reponc4o4tlo/py_env-python3.8/bin/mdformat", line 8, in <module>
    sys.exit(run())
  File "/Users/emily.hontoria/.cache/pre-commit/reponc4o4tlo/py_env-python3.8/lib/python3.8/site-packages/mdformat/__main__.py", line 8, in run
    exit_code = mdformat._cli.run(sys.argv[1:])
  File "/Users/emily.hontoria/.cache/pre-commit/reponc4o4tlo/py_env-python3.8/lib/python3.8/site-packages/mdformat/_cli.py", line 69, in run
    print_error(f'File "{path_str}" is not formatted.')
  File "/Users/emily.hontoria/.cache/pre-commit/reponc4o4tlo/py_env-python3.8/lib/python3.8/site-packages/mdformat/_cli.py", line 165, in print_error
    print_paragraphs(paragraphs)
  File "/Users/emily.hontoria/.cache/pre-commit/reponc4o4tlo/py_env-python3.8/lib/python3.8/site-packages/mdformat/_cli.py", line 157, in print_paragraphs
    sys.stderr.write(wrap_paragraphs(paragraphs))
  File "/Users/emily.hontoria/.cache/pre-commit/reponc4o4tlo/py_env-python3.8/lib/python3.8/site-packages/mdformat/_cli.py", line 179, in wrap_paragraphs
    return "\n\n".join(wrapper.fill(p) for p in paragraphs) + "\n"
  File "/Users/emily.hontoria/.cache/pre-commit/reponc4o4tlo/py_env-python3.8/lib/python3.8/site-packages/mdformat/_cli.py", line 179, in <genexpr>
    return "\n\n".join(wrapper.fill(p) for p in paragraphs) + "\n"
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/textwrap.py", line 363, in fill
    return "\n".join(self.wrap(text))
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/textwrap.py", line 354, in wrap
    return self._wrap_chunks(chunks)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/textwrap.py", line 248, in _wrap_chunks
    raise ValueError("invalid width %r (must be > 0)" % self.width)
ValueError: invalid width 0 (must be > 0)
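
The traceback ends with textwrap rejecting a width of 0. Where that 0 comes from is not shown in the report; one plausible cause, stated here purely as an assumption, is that the detected terminal width is 0 when the hook runs without a real terminal. Whatever the source, a guard like the following would avoid the crash. This is a sketch with a hypothetical `fallback_width` parameter, not mdformat's actual fix.

```python
import textwrap


def wrap_paragraphs(paragraphs, width, fallback_width=80):
    """Wrap paragraphs to width, falling back when width is not positive."""
    if width <= 0:
        # textwrap raises ValueError for width <= 0, so substitute a default.
        width = fallback_width
    wrapper = textwrap.TextWrapper(
        width=width, break_long_words=False, break_on_hyphens=False
    )
    return "\n\n".join(wrapper.fill(p) for p in paragraphs) + "\n"


print(wrap_paragraphs(['Error: File "blah.md" is not formatted.'], width=0), end="")
```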

"Error" when running normally with mdformat:

Error: File "blah.md" is
not formatted.

To Reproduce

Steps to reproduce the behavior:

  1. Intentionally create a file that will fail mdformat.
  2. Add pre-commit to the environment with the above pre-commit setup.
  3. Run pre-commit run --all-files to produce an "error" on file not formatted.
  4. Observe above ValueError.

Expected behavior

The error message from mdformat is included instead of the ValueError.
Namely I would prefer:

Error: File "blah.md" is
not formatted.

Instead of the ValueError traceback shown above.

Environment

  • Python Version [e.g. 3.7.1]: 3.8.4
  • Package version of mdformat: 0.5.6
  • Package version of pre-commit: 2.10.1
  • Operating System: MacOS Catalina 10.15.4

Additional context

No additional context.

Integrate with Markdown Linters

I use a Markdown linter with pre-commit and GitLab CI pipelines. It's useful but I'd like to use a formatter before running the linter: one of the biggest problems I have is that, as well as raising rule violations that require manual correction, the linter frequently raises large numbers of trivial formatting rule violations that could very easily be corrected automatically by a formatter. If they are going to be coupled, it's really important that the formatter and linter work together and don't conflict. I would like to suggest that mdformat be designed from the beginning to integrate with Markdown linters, particularly markdownlint [1].

I expect that this means mdformat will have to observe the markdownlint rules and the rule structure, to avoid creating linting conflicts. The advantage of this is that there's an existing structure and code (largely regexps) which can inform mdformat development.

What do you think?

[1] There are two Markdown linters I'm aware of: a Ruby version https://github.com/markdownlint/markdownlint and a node.js version: https://github.com/DavidAnson/markdownlint, both named "markdownlint" and the respective authors and maintainers work together to ensure that there is minimal divergence of rules between the two versions.
