executablebooks / mdformat
CommonMark compliant Markdown formatter
Home Page: https://mdformat.rtfd.io
License: MIT License
Forgot to add this to #41, but the ordered list numbering should be documented, for both the CLI and API usage.
Also, you might want to link in the readme to tests/data/fixtures.md, as a useful source of example formatting.
Example:
foo@bar:~$ mdformat --version
mdformat 0.3.3
I can see the thinking for using `1.` for all numbered list items, with a renderer handling proper numbering.
However, one of the benefits of Markdown is that it is easily readable without the need for rendering.
This is diminished with this sort of (non-)numbering, and I bet most people would opt for consecutive numbering, if given the choice.
I would say there should at least be the option (if not in "core" then via a plugin) to achieve this.
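As a rough illustration of what such an option could look like, here is a minimal sketch. The function name `render_ordered_list` and the `consecutive` parameter are hypothetical and not part of mdformat's API.

```python
from typing import List


def render_ordered_list(
    items: List[str], start: int = 1, consecutive: bool = False
) -> str:
    """Render list items with a constant or an incrementing number.

    Hypothetical helper for illustration only; not mdformat's implementation.
    """
    lines = []
    for offset, item in enumerate(items):
        # Constant numbering repeats the start number for every item;
        # consecutive numbering counts upwards from it.
        number = start + offset if consecutive else start
        lines.append(f"{number}. {item}")
    return "\n".join(lines)
```

Both styles render to the same HTML list; only the readability of the Markdown source differs.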
Currently a header like
## 5.0.0
would have its dots escaped and be formatted like
## 5\.0\.0
Improve dot escaping. Only escape if the dot could start an ordered list.
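One possible shape for this check, as a sketch only: a CommonMark ordered list marker is 1 to 9 digits followed by `.` (or `)`), and it only counts when followed by whitespace or the end of the line. The function name `escape_list_like_dot` is hypothetical, and the sketch assumes it is given the text of a single line with leading whitespace already stripped.

```python
import re

# 1-9 digits at the start of the line, then ".", then a space, tab or
# end of line: the only position where a dot can begin an ordered list.
_ORDERED_LIST_START = re.compile(r"^(\d{1,9})\.(?=[ \t]|$)")


def escape_list_like_dot(line: str) -> str:
    """Escape the dot only if the line would otherwise parse as a list item."""
    return _ORDERED_LIST_START.sub(r"\1\\.", line, count=1)
```

With this rule, a version number such as `5.0.0` is left alone because the first dot is followed by a digit rather than whitespace.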
For example:
Test ```inline```
becomes:
Test `inline`
Ideally, I'd prefer an option to maintain the original backticks.
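For context, a formatter that works from the parsed syntax tree only sees the code content, not the delimiter used in the source, so it has to choose a delimiter itself. The sketch below shows one simple, safe choice (one backtick more than the longest run inside the content). The function name `wrap_inline_code` is hypothetical, and whether mdformat uses exactly this rule is an assumption.

```python
import re


def wrap_inline_code(code: str) -> str:
    """Wrap code in a backtick delimiter that cannot clash with its content."""
    # Find the longest run of backticks inside the code itself.
    runs = re.findall(r"`+", code)
    longest = max((len(run) for run in runs), default=0)
    delimiter = "`" * (longest + 1)
    # Pad with spaces when the code begins or ends with a backtick, so the
    # delimiter and the content stay distinguishable.
    if code.startswith("`") or code.endswith("`"):
        code = f" {code} "
    return f"{delimiter}{code}{delimiter}"
```

Under this rule, content without backticks always gets single-backtick delimiters, which is why triple backticks in the source would not survive.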
https://github.com/executablebooks/mdformat/blob/master/STYLE.md
Linked issue #110
Prettier seems to do a better job at square bracket escaping. Mdformat currently always escapes square brackets (to not accidentally produce link label enclosures) whereas it seems that Prettier keeps escapes from the source text.
The token stream we get from the parser is mainly just abstract syntax, so we don't know if the escapes were there in the source or not. Thus doing exactly what Prettier does is not currently easy to implement.
Like with other escapes, we could try to come up with logic that detects cases where link label enclosure certainly will not be accidentally produced and not escape if that is the case.
Note that this approach, when implemented well enough, can actually be an improvement over simply keeping the escapes from the source, as it will automatically remove needless escapes.
As discussed in executablebooks/rst-to-myst#18 (comment):
I think we may want to work on making something like build_mdit public to alleviate the various gotchas involved (regarding plugins and default conf options for example).
Also things in `mdformat/renderer/_util.py`, e.g. mdformat-myst currently duplicates `longest_consecutive_sequence`.
Hmm, this is interesting.
Running mdformat on:
# mdformat-tables
[![Build Status][ci-badge]][ci-link]
[![codecov.io][cov-badge]][cov-link]
[![PyPI version][pypi-badge]][pypi-link]
An [mdformat](https://github.com/executablebooks/mdformat) plugin for rendering tables.
[ci-badge]: https://github.com/executablebooks/mdformat-tables/workflows/CI/badge.svg?branch=master
[ci-link]: https://github.com/executablebooks/mdformat/actions?query=workflow%3ACI+branch%3Amaster+event%3Apush
[cov-badge]: https://codecov.io/gh/executablebooks/mdformat-tables/branch/master/graph/badge.svg
[cov-link]: https://codecov.io/gh/executablebooks/mdformat-tables
[pypi-badge]: https://img.shields.io/pypi/v/mdformat-tables.svg
[pypi-link]: https://pypi.org/project/mdformat-tables
goes to
# mdformat-tables
[![Build Status](https://github.com/executablebooks/mdformat-tables/workflows/CI/badge.svg?branch=master)](<https://github.com/executablebooks/mdformat/actions?query=workflow%3ACI+branch%3Amaster+event%3Apush>)
[![codecov.io](https://codecov.io/gh/executablebooks/mdformat-tables/branch/master/graph/badge.svg)](<https://codecov.io/gh/executablebooks/mdformat-tables>)
[![PyPI version](https://img.shields.io/pypi/v/mdformat-tables.svg)](<https://pypi.org/project/mdformat-tables>)
An [mdformat](<https://github.com/executablebooks/mdformat>) plugin for rendering tables.
I'd say that's not ideal behaviour.
If not in "core", I think we at least want a plugin that can "inhibit" this type of behaviour.
Do this after the next release (i.e. >0.6.1).
Also change all references from `latest` to `stable` in `pyproject.toml` and `README.md`.
Describe the bug
When mdformat is used with the `--wrap` and `--check` options and multiple lines are shorter than the specified wrap width, the file is flagged as not formatted.
To Reproduce
Steps to reproduce the behavior:
docker run -it --rm ubuntu
apt update && apt install -y python3 python3-pip
pip3 install mdformat
cat > test.md << EOF
Test aa
and test bb
EOF
mdformat --check --wrap 80 test.md
Expected behavior
No error as no line is too long.
Additional context
One could argue that, without the `--check` option, shorter lines should be left alone as well. I think that is pretty much how any other formatter does it, but I can also see arguments why it should be changed. In all likelihood, keeping the check and no-check behavior consistent is probably a good idea.
Currently shortcut reference links are converted into full reference links.
For example, input:
[foo]
[foo]: /url "title"
Output:
[foo][foo]
[foo]: /url "title"
Let's try to keep the shortcuts as shortcuts instead.
Linked issue: #110
Make a plugin system for code formatting.
As an example, we should be able to `pip install mdformat-black`, after which all Python code blocks will be formatted with Black.
E.g.
~~~python
'''black hates single quotes'''
~~~
will be formatted as
~~~python
"""black hates single quotes"""
~~~
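A minimal sketch of how such a plugin system could dispatch on the code block language. Everything here is hypothetical: the `CODE_FORMATTERS` registry, `register_formatter`, `format_code_block`, and the toy formatter that stands in for Black are illustration only, not mdformat's plugin API.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a code block language to a formatter function.
CODE_FORMATTERS: Dict[str, Callable[[str], str]] = {}


def register_formatter(lang: str, func: Callable[[str], str]) -> None:
    """Register a formatter for code blocks tagged with the given language."""
    CODE_FORMATTERS[lang] = func


def format_code_block(lang: str, code: str) -> str:
    """Format code with the registered formatter, or return it unchanged."""
    formatter = CODE_FORMATTERS.get(lang)
    return formatter(code) if formatter else code


def toy_python_formatter(code: str) -> str:
    # Stand-in for Black: only normalizes triple single quotes.
    return code.replace("'''", '"""')


register_formatter("python", toy_python_formatter)
```

Code blocks in languages without a registered formatter pass through untouched, so installing a plugin only affects the languages it claims.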
Experiment with something like
from nltk import tokenize
sentences = tokenize.sent_tokenize(paragraph)
and find out if we can implement sentence-based word wrapping to reduce diffs.
Implement as an option, don't change the default mode (which preserves wrapping).
Is your feature request related to a problem? Please describe.
Some officially supported Python Markdown extensions (like admonition) use non-standard Markdown syntax. These cases are not properly handled by the parser currently. For example:
# Some title
!!! note
    This will render as a tooltip box.
Back to normal markdown content.
Describe the solution you'd like
It would be useful if there was an "ignore" comment that could be added to a markdown file to skip formatting specific sections. The `prettier` library uses `<!-- prettier-ignore -->` and `<!-- prettier-ignore-end -->` for this purpose.
Describe alternatives you've considered
Plugins could be written to handle cases like this, but it would be useful if there was a global way to disable formatting in specific sections of a document.
Additional context
It looks like there's a similar discussion happening on an existing PR about how to ignore blocks of code. I'd be fine with a solution where I could wrap certain sections in a div that sets a class that's globally ignored, but I feel like a plain comment would be a better interface for this use case, as it wouldn't require modifying the AST of the markdown.
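To make the comment-based idea concrete, here is a minimal sketch that copies everything between two marker comments verbatim and formats the rest. The marker strings `<!-- mdformat-ignore-start -->` and `<!-- mdformat-ignore-end -->` and the function `format_with_ignores` are hypothetical, modelled on the Prettier comments mentioned above. Formatting each chunk independently is a simplification; a real implementation would have to make sure a marker never splits a block.

```python
from typing import Callable, List

# Hypothetical marker comments; mdformat does not define these.
IGNORE_START = "<!-- mdformat-ignore-start -->"
IGNORE_END = "<!-- mdformat-ignore-end -->"


def format_with_ignores(text: str, format_chunk: Callable[[str], str]) -> str:
    """Apply format_chunk only to the lines outside ignore regions."""
    output: List[str] = []
    pending: List[str] = []  # lines waiting to be formatted
    ignoring = False

    def flush() -> None:
        # Format the collected lines as one chunk and reset the buffer.
        if pending:
            output.append(format_chunk("\n".join(pending)))
            pending.clear()

    for line in text.split("\n"):
        stripped = line.strip()
        if not ignoring and stripped == IGNORE_START:
            flush()
            ignoring = True
            output.append(line)
        elif ignoring:
            # Everything up to and including the end marker is copied verbatim.
            output.append(line)
            if stripped == IGNORE_END:
                ignoring = False
        else:
            pending.append(line)
    flush()
    return "\n".join(output)
```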
Make a shfmt code formatter plugin. Make it use the docker image https://hub.docker.com/r/mvdan/shfmt/ as fallback if shfmt is not installed.
E.g.
First line\
second line
===========
I've never seen this in the wild, but hard break in a setext heading is something that AFAIK has no ATX representation. So we have to render a setext heading in this case.
Create docs with either
- MkDocs
- Sphinx + CommonMark or MyST
Build the docs in Github Actions.
Publish on readthedocs.io
Perhaps similar to #12, it would be nice to provide a hook for extending the parser with the available plugins: https://markdown-it-py.readthedocs.io/en/latest/plugins.html#plugin-extensions
As mentioned in executablebooks/markdown-it-py#10, I guess this would work best such that it is the plugin's responsibility to provide a parser method? Maybe allowing for the parsing of one or more renderer defined options, something like:
md = MarkdownIt(renderer_cls=MDRenderer)
md.use(a_plugin, md_seperator=MARKERS.BLOCK_SEPARATOR)
md.render(text)  # text: the Markdown source string
Describe the bug
File is always output with CRLF line endings on Windows, even if no formatting was changed.
To Reproduce
Expected behavior
File has same line ending style it started with
Environment
Windows 10
Python 3.9.5
pre-commit 2.13.0
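One way to get the expected behavior is to detect the newline style of the input and re-apply it to the output. The sketch below assumes the formatter itself produces LF-only text; the function names `detect_newline` and `restore_newlines` are hypothetical.

```python
def detect_newline(text: str) -> str:
    """Return the newline sequence used by the first line break in text."""
    index = text.find("\n")
    if index > 0 and text[index - 1] == "\r":
        return "\r\n"
    return "\n"


def restore_newlines(formatted: str, original: str) -> str:
    """Give LF-terminated formatted text the newline style of the original."""
    newline = detect_newline(original)
    # Normalize first so that existing CRLF pairs are not doubled.
    return formatted.replace("\r\n", "\n").replace("\n", newline)
```

For this to work, the file has to be read and written with `open(path, newline="")`, since Python's default text mode translates line endings on the way in and out.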
Thematic breaks are currently rendered as three underscores. Rendering more underscores would make the breaks easier to detect in Markdown source, without any downsides that I can think of.
If this is a reasonable change, then the question is raised, what is a good amount of characters to render. 50? 72? 80? Surely not more than 80 as that is still a common default width for terminals...
As discussed in executablebooks/markdown-it-py#10, great work!
A few thoughts off the cuff:
Allow passing in directories, not just files, to mdformat. Look for `.md` files recursively in directories.
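A minimal sketch of the path expansion using `pathlib`. The function name `resolve_markdown_paths` is hypothetical; the sketch keeps explicitly named files as they are and expands directories to the `.md` files below them.

```python
from pathlib import Path
from typing import Iterable, List


def resolve_markdown_paths(paths: Iterable[str]) -> List[Path]:
    """Expand directories to the .md files below them; keep files as given."""
    resolved: List[Path] = []
    for raw in paths:
        path = Path(raw)
        if path.is_dir():
            # rglob walks the directory tree recursively; sort for stable order.
            resolved.extend(sorted(path.rglob("*.md")))
        else:
            resolved.append(path)
    return resolved
```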
Describe the bug
When attempting to install the hook I would assume all you have to do is have a `.pre-commit-config.yaml` with the following configuration:
- repo: https://github.com/executablebooks/mdformat
  rev: 0.5.6
  hooks:
    - id: mdformat
However, for any Python 2 environment (I am on `2.7.16`) it seems you have to put in a `language_version` override to get it to install properly, resulting in (something like):
- repo: https://github.com/executablebooks/mdformat
  rev: 0.5.6
  hooks:
    - id: mdformat
      language_version: 3.7.7
I believe there should be a default language version set so that you don't have to override this value.
The reason I ask about this is that other hooks, `black` for example, also only run on Python 3 and do not need this override.
To Reproduce
Steps to reproduce the behavior:
Use a `.pre-commit-config.yaml` without `language_version` set.
Run `pre-commit install --install-hooks`.
Expected behavior
Expected behavior is that it installs properly.
Environment
`jupyter-book --version`: 0.5.6
Additional context
N/A
It would be useful if we could use a feature flag to enforce "one sentence per line". This basically means following the first 4 rules of the "semantic line breaks" spec (and most importantly, following all of the `MUST` rules).
I've found this to give much of the "benefits of sembr" without much extra work.
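A naive sketch of "one sentence per line" using only a regular expression. The function name `one_sentence_per_line` is hypothetical, and the boundary rule (a period, exclamation mark or question mark followed by whitespace) is a deliberate simplification: it will break after abbreviations such as "e.g.", which is where a proper sentence tokenizer would be needed.

```python
import re

# Naive sentence boundary: ".", "!" or "?" followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def one_sentence_per_line(paragraph: str) -> str:
    """Rewrap a paragraph so that every sentence starts on its own line."""
    # Collapse existing line breaks and repeated spaces first.
    flat = " ".join(paragraph.split())
    return "\n".join(_SENTENCE_END.split(flat))
```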
Build `mdformat-myst` plugin.
Come up with a set of parser and renderer extensions to be able to format MyST. Many of the syntax extensions make sense outside the MyST context, so they should be self-contained plugins that `mdformat-myst` requires and activates, similar to how `mdformat-gfm` does with `mdformat-tables`.
`MarkdownIt` instance for myst parsing. We should set the same extensions in `myst_plugin.update_mdit()`. Also see this place where myst-parser sets MarkdownIt options.
Convert list item marker symbols. For unordered lists, always prefer `-`. If there are sequential lists, alternate between `-` and `*`.
Do the same for ordered lists using `.` (preferred) and `)`.
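The alternation rule can be sketched as follows. In CommonMark, changing the bullet character or the ordered list delimiter starts a new list, so alternating keeps adjacent lists separate after formatting. The function names `bullet_markers` and `ordered_delimiters` are hypothetical.

```python
from typing import List


def bullet_markers(list_count: int) -> List[str]:
    """Return the bullet marker for each of list_count adjacent lists."""
    # Prefer "-" and use "*" for every second list, so that two
    # neighbouring lists are not merged into one when re-parsed.
    return ["-" if index % 2 == 0 else "*" for index in range(list_count)]


def ordered_delimiters(list_count: int) -> List[str]:
    """Same idea for ordered lists: prefer "." and alternate with ")"."""
    return ["." if index % 2 == 0 else ")" for index in range(list_count)]
```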
Heya, is there a reason to use `~` over `` ` ``, which IMO is the more generally used character?
I feel this may put off general Markdown users.
I think at least this should be configurable.
See #38 (comment)
Is your feature request related to a problem? Please describe.
I use an extension called "All in one Markdown" on VSCode; one of its most useful features is the automatic generation of a ToC. ToC generation works just fine when I have lists that are 2 layers deep (list point with sub-points) but breaks for more than that (sub-sub-points).
Describe the solution you'd like
Consistent non-adaptive indentation with enjoy lines between list items.
- list level 1
  - list level 2
    - list level 3
Describe alternatives you've considered
Nothing reasonable
Additional context
This is the mdformat-gfm formatted file (I ended up writing the ToC manually): file
See #38 (comment) and #33 (comment)
Come up with the necessary plugins to format Github Flavored Markdown.
It might make sense to make a collection plugin (e.g. `mdformat-gfm`) that installs all GFM extensions.
Once `markdown-it-py>0.6.2` is released, remove the bundled `_tree.SyntaxTreeNode` class and import the equivalent class from upstream.
The current API lets plugins add new renderer functions (for node types that don't have a default one) or override existing functions. There is no good way for plugins to collaborate on the same node type, however.
This would be useful e.g. for plugins to set character escape rules by collaboratively modifying renderers of `paragraph` and `text` syntaxes. This would solve issues like executablebooks/mdformat-tables#10
Similarly as the underscore, the asterisk can start (among other things) an emphasis or strong emphasis block. We need some heuristic that ensures we don't introduce unwanted emphasis blocks, if we leave the character unescaped. Currently I'm thinking to use the following:
Related issue about underscores: #119
For a use-case: in the MyST-Markdown format, code cells are defined like:
```{code-cell} ipython3
:tags: [hide-output, show-input]
print("Hallo!")
```
and so, to create a formatter plugin for this, I would set the language as "{code-cell}", but then would also need to have access to the rest of the string to determine how to format
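A small sketch of how a formatter callback could be given both parts of the fence info string. The helper `split_info_string` is hypothetical; it simply treats the first whitespace-separated word as the language and passes the remainder along.

```python
from typing import Tuple


def split_info_string(info: str) -> Tuple[str, str]:
    """Split a fence info string into (language, remainder)."""
    parts = info.strip().split(maxsplit=1)
    if not parts:
        return "", ""
    lang = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    return lang, rest
```

For the MyST example above, a plugin registered for `{code-cell}` could then look at the remainder (`ipython3`) to decide which formatter to apply.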
Markdown defines that two trailing spaces mean a new line, hence trailing whitespace shouldn't be cleared from the document when running `mdformat` (see here).
No response
Create a `.md` file:
>this line has two trailing spaces after it
>so that this is actually a newline
Run `mdformat` on the file.
3.9
Linux 5.12 kernel
mdformat 0.7.7
mdformat-gfm 0.3.2
mdformat-tables 0.4.1
No response
Describe the bug
A standalone link with underscores in it was changed:
"example.com/this_page_here"
to
"example.com/this\_page\_here"
The link was standalone, i.e. not written in the Markdown link format within parentheses.
Strip trailing whitespace if it serves no function. At least code blocks (perhaps some other block too?) may have whitespace that should not be stripped.
Currently, exceptions in code formatting plugins are silently ignored.
I think this is a bit "dangerous" in that, if a plugin is faulty, people won't realise that their files are not being formatted as intended.
At a minimum, I think the errors should be logged, as I have added in #36.
But perhaps also it should be the default or an option (using #35) to fail on such exceptions.
Another place where reporting/failing should be considered is when `duplicate_refs` are found in the `env` (after parsing).
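A minimal sketch of the "log by default, optionally fail" behavior discussed here. The function `run_formatter`, its `strict` parameter and the logger name are hypothetical and not taken from mdformat.

```python
import logging
from typing import Callable

logger = logging.getLogger("mdformat_sketch")  # hypothetical logger name


def run_formatter(
    formatter: Callable[[str], str], code: str, *, strict: bool = False
) -> str:
    """Run a code formatter, logging failures instead of hiding them."""
    try:
        return formatter(code)
    except Exception:
        if strict:
            # Opt-in mode: let the error abort the whole run.
            raise
        # Default mode: report the failure and keep the code block unchanged.
        logger.warning(
            "Code formatter %r failed; leaving block unchanged",
            formatter,
            exc_info=True,
        )
        return code
```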
Add a guard that checks there are no significant changes to the HTML rendered from formatted Markdown.
Do something simple like:
If the HTMLs are not equal, don't apply any formatting. Instead print an error message asking to submit an issue to https://github.com/hukkinj1/mdformat/issues
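The guard could be sketched like this. To keep the example self-contained, the Markdown formatter and the HTML renderer are passed in as callables; the names `format_if_html_unchanged` and `HTMLMismatchError` are hypothetical.

```python
from typing import Callable


class HTMLMismatchError(Exception):
    """Raised when formatting would change the rendered HTML."""


def format_if_html_unchanged(
    source: str,
    format_markdown: Callable[[str], str],
    render_html: Callable[[str], str],
) -> str:
    """Return the formatted text only if it renders to the same HTML."""
    formatted = format_markdown(source)
    if render_html(formatted) != render_html(source):
        raise HTMLMismatchError(
            "Formatted Markdown renders to different HTML; "
            "formatting was not applied. Please submit an issue."
        )
    return formatted
```

A caller would catch the exception, print its message and leave the file untouched.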
This issue documents the main differences noted in mdformat style and prettier style when formatting Markdown in https://github.com/lambda-fairy/maud
[ref link]
into [ref link][ref link]
To me everything else seems desirable, besides the last two differences (5. and 6.). I will turn those into separate issues.
Input:
- Recalculate secondary dependencies between rounds (#378)
Current output:
- Recalculate secondary dependencies between rounds (\#378)
Come up with logic that allows us to escape the hash character `#` in fewer cases. Only escape if the character would start an ATX heading.
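A sketch of such logic, under the assumption that it is applied to one line of paragraph text with leading whitespace already stripped. An ATX heading starts with 1 to 6 `#` characters followed by a space, a tab or the end of the line, so only that pattern needs an escape. The function name `escape_heading_like_hash` is hypothetical.

```python
import re

# 1-6 hashes at the start of the line, followed by a space, tab or
# end of line: the only position where "#" opens an ATX heading.
_ATX_START = re.compile(r"^(#{1,6})(?=[ \t]|$)")


def escape_heading_like_hash(line: str) -> str:
    """Escape the leading hash only if the line would parse as a heading."""
    return _ATX_START.sub(lambda match: "\\" + match.group(1), line, count=1)
```

With this rule, `(#378)` in the middle of a list item stays unescaped.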
Describe the bug
When running `mdformat` as a pre-commit hook, whenever `mdformat` fails there is a secondary error that happens and does not show the error output for `mdformat`.
Pre-commit setup:
repos:
  - repo: https://github.com/executablebooks/mdformat
    rev: 0.5.6
    hooks:
      - id: mdformat
        args: [--check]
Error when running with `pre-commit`:
Traceback (most recent call last):
File "/Users/emily.hontoria/.cache/pre-commit/reponc4o4tlo/py_env-python3.8/bin/mdformat", line 8, in <module>
sys.exit(run())
File "/Users/emily.hontoria/.cache/pre-commit/reponc4o4tlo/py_env-python3.8/lib/python3.8/site-packages/mdformat/__main__.py", line 8, in run
exit_code = mdformat._cli.run(sys.argv[1:])
File "/Users/emily.hontoria/.cache/pre-commit/reponc4o4tlo/py_env-python3.8/lib/python3.8/site-packages/mdformat/_cli.py", line 69, in run
print_error(f'File "{path_str}" is not formatted.')
File "/Users/emily.hontoria/.cache/pre-commit/reponc4o4tlo/py_env-python3.8/lib/python3.8/site-packages/mdformat/_cli.py", line 165, in print_error
print_paragraphs(paragraphs)
File "/Users/emily.hontoria/.cache/pre-commit/reponc4o4tlo/py_env-python3.8/lib/python3.8/site-packages/mdformat/_cli.py", line 157, in print_paragraphs
sys.stderr.write(wrap_paragraphs(paragraphs))
File "/Users/emily.hontoria/.cache/pre-commit/reponc4o4tlo/py_env-python3.8/lib/python3.8/site-packages/mdformat/_cli.py", line 179, in wrap_paragraphs
return "\n\n".join(wrapper.fill(p) for p in paragraphs) + "\n"
File "/Users/emily.hontoria/.cache/pre-commit/reponc4o4tlo/py_env-python3.8/lib/python3.8/site-packages/mdformat/_cli.py", line 179, in <genexpr>
return "\n\n".join(wrapper.fill(p) for p in paragraphs) + "\n"
File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/textwrap.py", line 363, in fill
return "\n".join(self.wrap(text))
File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/textwrap.py", line 354, in wrap
return self._wrap_chunks(chunks)
File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/textwrap.py", line 248, in _wrap_chunks
raise ValueError("invalid width %r (must be > 0)" % self.width)
ValueError: invalid width 0 (must be > 0)
"Error" when running normally with `mdformat`:
Error: File "blah.md" is
not formatted.
To Reproduce
Steps to reproduce the behavior:
Set up pre-commit with `mdformat`.
Run `pre-commit run --all-files` to produce an "error" on file not formatted.
See the `ValueError`.
Expected behavior
The error message from mdformat is included instead of the `ValueError`.
Namely I would prefer:
Error: File "blah.md" is
not formatted.
Instead of the `ValueError` traceback shown above.
Environment
Additional context
No additional context.
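The traceback shows `textwrap` being handed a width of 0 from `wrap_paragraphs`. Where that width comes from is not visible in the report; the sketch below assumes it is derived from the terminal size, which may be reported as 0 when the process is not attached to a real terminal. The `width` and `fallback` parameters are hypothetical additions for illustration.

```python
import shutil
import textwrap
from typing import Iterable, Optional


def wrap_paragraphs(
    paragraphs: Iterable[str], width: Optional[int] = None, fallback: int = 80
) -> str:
    """Wrap paragraphs to the terminal width, guarding against width <= 0."""
    if width is None:
        # Assumption: the width is taken from the terminal size.
        width = shutil.get_terminal_size().columns
    if width <= 0:
        # textwrap raises ValueError for non-positive widths, so fall back.
        width = fallback
    wrapper = textwrap.TextWrapper(width=width)
    return "\n\n".join(wrapper.fill(p) for p in paragraphs) + "\n"
```

With the guard in place, the "not formatted" message is printed even when the reported width is 0.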
I use a Markdown linter with pre-commit and GitLab CI pipelines. It's useful but I'd like to use a formatter before running the linter: one of the biggest problems I have is that, as well as raising rule violations that require manual correction, the linter frequently raises large numbers of trivial formatting rule violations that could very easily be corrected automatically by a formatter. If they are going to be coupled, it's really important that the formatter and linter work together and don't conflict. I would like to suggest that mdformat be designed from the beginning to integrate with Markdown linters, particularly markdownlint [1].
I expect that this means mdformat will have to observe the markdownlint rules and the rule structure, to avoid creating linting conflicts. The advantage of this is that there's an existing structure and code (largely regexps) which can inform mdformat development.
What do you think?
[1] There are two Markdown linters I'm aware of: a Ruby version https://github.com/markdownlint/markdownlint and a node.js version: https://github.com/DavidAnson/markdownlint, both named "markdownlint" and the respective authors and maintainers work together to ensure that there is minimal divergence of rules between the two versions.
See #49 (comment)
Add a CLI flag for skipping Markdown validation.
I would name the flag `--fast`, `--unsafe` or `--skip-validation`.
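A minimal `argparse` sketch of such a flag, using the last of the proposed names. The parser, its `prog` name and the help text are hypothetical and do not describe mdformat's actual CLI.

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Build a toy CLI parser with an opt-out flag for validation."""
    parser = argparse.ArgumentParser(prog="mdformat-sketch")
    parser.add_argument("paths", nargs="*", help="files to format")
    parser.add_argument(
        "--skip-validation",
        action="store_true",
        help="skip Markdown validation after formatting",
    )
    return parser
```

Because the flag uses `store_true`, validation stays on unless the user explicitly opts out.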