Comments (9)

facelessuser commented on June 9, 2024

The issue states that a block processor's output gets wrapped in p tags, but that is not true. Block processors do not automatically get wrapped in p tags.

I'm not sure what extensions you've tried and what their approach is, but as you can see below, I can create a figure using a block processor and not have it wrapped in a p tag.

In this example, we use an extension that creates arbitrary HTML wrappers, in this case a figure. No p tag is wrapped around the block. The content inside the block is run through the Markdown block processors and ends up wrapped in a p tag, but the figure itself is not.

import markdown

MD = """
/// html | figure
![cool image](https://mydomain.com/cool_image.jpg)
///
"""

html = markdown.markdown(
    MD,
    extensions=['pymdownx.blocks.html'],
)

print(html)

Output:

<figure>
<p><img alt="cool image" src="https://mydomain.com/cool_image.jpg" /></p>
</figure>

I suspect they are not doing what you think they are doing. I see no reason why an extension could not create a figure without it being wrapped in a p tag when using a block processor. I cannot comment on why they may be having their content wrapped in p tags as I would have to see their implementation to explain why.

Apreche commented on June 9, 2024

@facelessuser Thanks, I think I may not have been clear. That extension manages to avoid the p tag because there is extra non-standard markdown syntax to avoid the issue. If I'm going to be including some extra weird markdown, then I might as well just put the HTML in the markdown directly.

If the markdown is only

![cool image](https://mydomain.com/cool_image.jpg)

With no additional /// html | figure or any other such extra markup permitted, can an extension be written that renders the desired HTML without a <p> tag wrapping it?

facelessuser commented on June 9, 2024

> That extension manages to avoid the p tag because there is extra non-standard markdown syntax to avoid the issue.

No, that is not true. The block is simply captured before the paragraph processor gets to it. Again, you haven't really specified how these "other extensions" approach things, but what I'm saying is that if the extension were done properly, you could get what you want.

If your block extension captures the lone image by itself, treats it as a block before the paragraph processor does, and creates the figure, you can embed the image within it and the figure will not be wrapped in a paragraph. This is completely doable, but the extensions you are using are likely not doing that. I'm likely oversimplifying some steps, but nothing innate forces a block to be wrapped in paragraphs. We have many block extensions whose output is not wrapped in paragraphs.

In short, a block processor that does what you want must treat the lone image as a block before the paragraph processor sees it.
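As a rough illustration of the approach described above, here is a minimal sketch of such a block processor. The names FigureBlockProcessor, FigureExtension, LONE_IMAGE_RE and the registry name 'lone_image_figure' are made up for this example, the regular expression only handles simple inline-style images, and the priority of 12 is an assumption chosen to sit above the paragraph processor's priority of 10.

```python
import re
import xml.etree.ElementTree as etree

import markdown
from markdown.blockprocessors import BlockProcessor
from markdown.extensions import Extension

# Hypothetical pattern: the whole block is one inline-style image,
# optionally followed by a double-quoted title.
LONE_IMAGE_RE = re.compile(
    r'!\[(?P<alt>[^\]]*)\]\((?P<src>\S+?)(?:\s+"(?P<title>[^"]*)")?\)'
)


class FigureBlockProcessor(BlockProcessor):
    """Claim blocks that consist of a single image and emit a <figure>."""

    def test(self, parent, block):
        # Only claim the block when the image is the entire block.
        return LONE_IMAGE_RE.fullmatch(block.strip()) is not None

    def run(self, parent, blocks):
        match = LONE_IMAGE_RE.fullmatch(blocks.pop(0).strip())
        figure = etree.SubElement(parent, 'figure')
        etree.SubElement(
            figure, 'img', {'src': match['src'], 'alt': match['alt']}
        )
        if match['title']:
            caption = etree.SubElement(figure, 'figcaption')
            caption.text = match['title']


class FigureExtension(Extension):
    def extendMarkdown(self, md):
        # Assumed priority: anything above the paragraph processor (10).
        md.parser.blockprocessors.register(
            FigureBlockProcessor(md.parser), 'lone_image_figure', 12
        )


lone_html = markdown.markdown(
    '![cool image](https://mydomain.com/cool_image.jpg "a cool image")',
    extensions=[FigureExtension()],
)
inline_html = markdown.markdown(
    'Some text with ![an image](small.png) in the middle.',
    extensions=[FigureExtension()],
)
print(lone_html)
print(inline_html)
```

The second call is there to show that an image in the middle of a sentence is not claimed by this processor and still falls through to the normal paragraph handling. Reference-style images are not covered by this sketch.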

Apreche commented on June 9, 2024

@facelessuser That's great! I will write the extension to do this. How can I ensure that my block extension happens before the paragraph extension?

facelessuser commented on June 9, 2024

Check out the documentation that covers priorities: https://python-markdown.github.io/extensions/api/#registries.

See code for current priorities: https://github.com/Python-Markdown/markdown/blob/master/markdown/blockprocessors.py#L42
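For a quick look at the ordering without reading the source, the registry can also be inspected at runtime. This is only a sketch: 'figure_placeholder' is a made-up name, the priority of 12 is an assumed value above the paragraph processor's, and the bare BlockProcessor base class is registered purely as an inert stand-in for a real processor.

```python
import markdown
from markdown.blockprocessors import BlockProcessor

md = markdown.Markdown()
registry = md.parser.blockprocessors  # a markdown.util.Registry

# Iterating a Registry yields its items from highest to lowest priority,
# so the fallback ParagraphProcessor should be listed last.
order = [type(processor).__name__ for processor in registry]
print(order)

# Placeholder only: the base class never claims a block, so it is inert.
registry.register(BlockProcessor(md.parser), 'figure_placeholder', 12)

placeholder_index = registry.get_index_for_name('figure_placeholder')
paragraph_index = registry.get_index_for_name('paragraph')

# A smaller index means the processor is tried earlier.
print(placeholder_index < paragraph_index)
```

Any priority higher than the paragraph processor's puts the new processor ahead of it. How far ahead it should go depends on which other processors (lists, block quotes, and so on) ought to get the first look at a block.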

waylan commented on June 9, 2024

A few observations.

The default behavior of always wrapping images in <p> tags is a result of the Markdown rules. Markdown is a subset of HTML and therefore does not support all of HTML's features. One feature that is not supported is block-level img elements (note that images are listed under "Span Elements" only, not under "Block Elements" in the document hierarchy). I realize some users don't like this, but we didn't write the rules, we just implement a parser which follows them.

Just because the default behavior is a certain way does not mean that it can't be changed. In fact, any part of the parser can be changed if one makes use of the correct part of the extension API. However, the default behavior will always follow the rules.

I haven't checked, but I suspect the various existing extensions that you have tried all use a custom inline processor, which will only ever parse span-level content. That would explain why they always result in the images being wrapped in <p> elements. However, if you implemented a block processor instead, it would output its own block-level element. Note that the ParagraphProcessor is the fallback block processor. It only gets called if no other block processor has already claimed the block. So, simply write a block processor which correctly identifies and processes your block-level images before they ever get to the ParagraphProcessor, and your output will never get wrapped in a <p> tag. The "priority" assigned to each block processor is documented (or, as @facelessuser indicated, you can check the source code).

Apreche commented on June 9, 2024

@waylan @facelessuser Thanks. I'm working on it right now.

One problem I've already run into is if I want to support reference-style block images.

Inline references have no problem because all the blocks in the entire document have been processed before inlines start getting processed. Therefore, even if a bunch of references are at the bottom of the Markdown document, they are all populated in md.references and ready to go.

Even if I put the priority of my block processor lower than the ReferenceProcessor, it doesn't help. The images I'm trying to process as blocks have not yet had their references processed, because those references are at the bottom of the Markdown document. This means it will only work for non-reference-style images, or for documents that happen to define the references above the images, which I think is a rather strange thing to do.

Because references are sort of a special case, is there some way we can scan the entire document for references at the very beginning before any other processing? That way they are ready and populated so that any other processor can refer to them.

waylan commented on June 9, 2024

> Because references are sort of a special case, is there some way we can scan the entire document for references at the very beginning before any other processing?

Yes, you can use a preprocessor. In fact, Markdown used to do that way back.
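A minimal sketch of what such a preprocessor might look like follows. ReferenceScanPreprocessor, ReferenceScanExtension, REF_RE and the registry name 'reference_scan' are hypothetical, the pattern is a simplification that only recognizes one-line definitions with an optional double-quoted title, and the priority of 25 is an arbitrary choice. The definition lines are left in place so the normal reference handling still consumes them later.

```python
import re

import markdown
from markdown.extensions import Extension
from markdown.preprocessors import Preprocessor

# Simplified, hypothetical pattern for: [id]: url "optional title"
REF_RE = re.compile(
    r'^[ ]{0,3}\[(?P<id>[^\]]+)\]:[ ]*(?P<url>\S+)'
    r'(?:[ ]+"(?P<title>[^"]*)")?[ ]*$'
)


class ReferenceScanPreprocessor(Preprocessor):
    """Collect reference definitions before any block processor runs."""

    def run(self, lines):
        for line in lines:
            match = REF_RE.match(line)
            if match:
                # Same shape the library stores: id -> (url, title).
                ref_id = match['id'].strip().lower()
                url = match['url'].strip('<>')
                self.md.references[ref_id] = (url, match['title'])
        return lines  # unchanged; definitions are removed later as usual


class ReferenceScanExtension(Extension):
    def extendMarkdown(self, md):
        # Assumed priority; the exact value is not important for this sketch.
        md.preprocessors.register(
            ReferenceScanPreprocessor(md), 'reference_scan', 25
        )


SOURCE = '''![cool image][cool]

[cool]: https://mydomain.com/cool_image.jpg "a cool image"
'''

md = markdown.Markdown(extensions=[ReferenceScanExtension()])
# Run only the scanning step to show what it collects on its own.
md.preprocessors['reference_scan'].run(SOURCE.split('\n'))
print(md.references)
```

With something like this in place, a block processor could look up self.parser.md.references while handling a reference-style image, even when the definition sits at the bottom of the document.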

Although, another possibility is that if you are exclusively using <figure> in your output, then you could create the <figure> tag and leave the image as Markdown for later processing by the inline processors. For example, your block processor could create this:

<figure>
![cool image](https://mydomain.com/cool_image.jpg "a cool image")
<figcaption>a cool image</figcaption>
</figure>

Well, actually, I suppose the figcaption would also need to be dealt with later, as a reference-style image would have the caption defined in the reference. So maybe this instead:

<figure>
![cool image](https://mydomain.com/cool_image.jpg "a cool image")
</figure>

But then the issue is that you need to render the image differently (include or exclude the caption) depending on what the parent is (figure or anything else), and there is no way to get the parent from within an inline processor. However, you could perhaps use the ANCESTOR_EXCLUDES attribute to skip the inline processor. I would have two inline processors. The first one is a replacement for the default and is an exact copy of it, with the one difference being that it has ANCESTOR_EXCLUDES set to include 'figure'. The second inline processor would not have ANCESTOR_EXCLUDES set and would insert both the img and the figcaption. So long as the first one is run first, the second will only ever see images which are in figure elements.

waylan commented on June 9, 2024

I am closing this as there is no actionable item here. If you have any additional support questions about this issue, feel free to add an additional comment. We can continue to have a discussion in the closed issue.
