
pandoc-include


Pandoc filter to allow file and header includes.

The filter script is based on the User Guide for Panflute. This repository provides a simple way to install and use it.

Features

  • Include as raw blocks
  • Indent and dedent included contents
  • Partial include: Allow including only parts of the file using options
  • Code include: Allow using !include in code blocks
  • Unix style pathname
  • Recursive include: It depends on include-entry header to work
  • YAML header merging: When an included file has its own header, it is merged into the current header. If there is a conflict, the value from the current file's header is kept.
  • Header include: Use !include-header file.yaml to include a YAML header from a file.

TODO

  • Write options to a tmp file and pass the filename by environment variable

Installation

Using pip

To install the latest published version:

pip install --user pandoc-include

To install the current (development) version hosted on the repository, use

pip install --upgrade --force --no-cache git+https://github.com/DCsunset/pandoc-include

To check the version currently installed:

pip show pandoc-include

Using Nix

pandoc-include is included in the Nixpkgs. Simply add the package to your NixOS config or use the following command:

# install in your profile
nix-env -iA nixpkgs.pandoc-include

# Or use it temporarily in a shell
nix-shell -p pandoc-include

Usage

Note: use pandoc 2.17 or later; 2.17 is the minimum version that is tested.
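Several of the "invalid api version" reports further down come from older pandoc releases, so it can help to check the installed version before running the filter. The following is a minimal sketch, not part of pandoc-include: the helper names `parse_pandoc_version` and `is_supported` are hypothetical, and it assumes the first line of `pandoc --version` contains the version number.

```python
import re

MINIMUM_TESTED = (2, 17)  # taken from the note above


def parse_pandoc_version(version_output):
    """Extract a version tuple from the text printed by `pandoc --version`."""
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.search(r"(\d+(?:\.\d+)*)", first_line)
    if match is None:
        raise ValueError(f"no version number in {first_line!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_supported(version_output, minimum=MINIMUM_TESTED):
    """Return True if the reported version is at least the minimum tested one."""
    return parse_pandoc_version(version_output) >= minimum
```

The text to pass in could be obtained with `subprocess.run(["pandoc", "--version"], capture_output=True, text=True).stdout`.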

Command

To use this filter, add --filter pandoc-include to the pandoc command:

pandoc input.md --filter pandoc-include -o output.pdf

Syntax

Each include statement has its own line and has the syntax:

!include somefolder/somefile

!include-header file.yaml

Or

$include somefolder/somefile

$include-header file.yaml

Each include statement must be in its own paragraph. That is, in its own line and separated by blank lines.

For code include, use !include statement in a code block:

```cpp
!include filename.cpp
```

The path can be either absolute or relative to the current file's directory. Unix-style pathnames can also be used. (If the include statement is used in an included file, then the path is absolute or relative to the included file itself.)

If there are special characters in the filename, use quotes:

!include "filename with space"
!include 'filename"with"quotes'

If the filename contains special sequences in markdown, use backquotes. (Note: this also applies to include statements in code blocks.)

!include `__filename__`

The second syntax ($include) may lead to wrong highlighting in a markdown editor. If that happens, use the first syntax (!include). Also make sure that there are no circular includes.

The !include command also supports options:

!include`<key1>=<value1>, <key2>=<value2>` some_file

For example, to specify line ranges in options:

!include`startLine=1, endLine=10` some_file

Or to include snippets with enclosed delimiters:

!include`snippetStart="<!-- Start -->", snippetEnd="<!-- End -->"` some_file

Or to include XML files and transform them with XSLT:

!include`format="markdown", xslt="xslt/api.xslt"` main_8h.xml

In the snippet example, <!-- Start --> and <!-- End --> are two strings occurring in some_file.

If <!-- Start --> or <!-- End --> occurs multiple times in some_file, pandoc-include includes all the blocks between the delimiters. If snippetStart is not found or not specified, the include starts from the beginning of the file; if snippetEnd is not found or not specified, it continues to the end of the file.
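As a rough illustration of the delimiter rules described above, the sketch below collects every block between two delimiter strings. It is not the filter's actual code: `extract_snippets` and its parameter names are hypothetical, and the real implementation may, for instance, work line by line rather than on raw string offsets.

```python
def extract_snippets(text, snippet_start=None, snippet_end=None,
                     include_delimiters=False):
    """Return every block of `text` enclosed by the two delimiter strings."""
    blocks = []
    pos = 0
    while True:
        found = text.find(snippet_start, pos) if snippet_start else -1
        if found == -1:
            if blocks:
                break  # no further start delimiter: done
            # Start delimiter missing or unspecified: include from the start.
            begin = body_begin = 0
        else:
            begin, body_begin = found, found + len(snippet_start)

        end = text.find(snippet_end, body_begin) if snippet_end else -1
        if end == -1:
            # End delimiter missing or unspecified: include to the end.
            body_end = stop = len(text)
        else:
            body_end, stop = end, end + len(snippet_end)

        blocks.append(text[begin:stop] if include_delimiters
                      else text[body_begin:body_end])
        if end == -1 or found == -1:
            break
        pos = stop
    return blocks
```

With two delimited regions in the input, the function returns two blocks; with `include_delimiters=True` each block keeps its surrounding delimiter strings, mirroring the includeSnippetDelimiters option.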

Supported options:

| Key | Value | Description |
| --- | --- | --- |
| startLine | int | Start line of include (default: 1) |
| endLine | int | End line of include (default: number of the last line) |
| snippetStart | str | Start delimiter of a snippet |
| snippetEnd | str | End delimiter of a snippet |
| includeSnippetDelimiters | bool | Whether to include the delimiters (default: False) |
| incrementSection | int | Increment (or decrement) section levels of include |
| dedent | int | Remove n leading whitespace characters from each line where possible (-1 means remove all) |
| format | str | The input format of the included file (see pandoc --list-input-formats). It will be automatically deduced from the path if not set. (Hint: extensions can also be enabled here) |
| raw | str | Include as raw block. The arg is the format (latex, html...) |
| xslt | str | XSLT file to use for transforming the given .xml file (this is intended to be used with Doxygen's xml output) |

Note: the values above are parsed as Python literals. So str should be quoted like 'xxx' or "xxx"; bool should be True or False.
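To make the "Python literals" rule concrete, here is a small sketch of how an option string such as `startLine=1, endLine=10` could be turned into a dictionary and applied. This is an illustration under assumptions, not the filter's own parser: `parse_options` and `select_lines` are hypothetical names, and treating endLine as inclusive is an assumption based on its documented default.

```python
import ast


def parse_options(option_string):
    """Parse 'key=value, key=value' where each value is a Python literal."""
    # Wrapping the string in dict(...) lets the Python parser split the
    # key=value pairs, so commas inside quoted strings are handled correctly.
    call = ast.parse(f"dict({option_string})", mode="eval").body
    return {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}


def select_lines(text, start_line=1, end_line=None):
    """Keep lines start_line..end_line (1-based; end assumed inclusive)."""
    lines = text.splitlines()
    if end_line is None:
        end_line = len(lines)
    return "\n".join(lines[max(start_line, 1) - 1:end_line])


options = parse_options('startLine=2, endLine=3')
excerpt = select_lines("a\nb\nc\nd", options.get("startLine", 1),
                       options.get("endLine"))
```

Because `ast.literal_eval` only accepts literals, an unquoted string value such as `raw=latex` is rejected, which matches the note that str values must be quoted.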

Header options

---
include-entry: '<path>'
include-order: 'natural'
include-resources: '<path>[:<path>]'
rewrite-path: true
pandoc-options:
  - --filter=pandoc-include
  - <other options>
---

include-entry

The include-entry option makes recursive includes work. Its value is the path, absolute or relative to the current working directory, of the directory where the entry file (the initial file) is located. It should be placed in the entry file only, not in the included files. It is optional, and the default include-entry value is . (the current working directory).

For example, to compile a file in current directory, no header is needed:

pandoc test.md --filter pandoc-include -o test.pdf

However, to compile a file not in current directory, like:

pandoc dir/test.md --filter pandoc-include -o test.pdf

The header should now be set to: include-entry: 'dir'.

include-order

The include-order option defines the order of included files when a unix-style pathname matches multiple files. The default value is natural, which means natural sort order. Other possible values are alphabetical and default; default keeps the order returned by the Python glob module.
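The difference between the three values is easiest to see on numbered file names. The sketch below uses only the standard library as a stand-in for whatever sorting the filter performs internally; `natural_key` and `order_paths` are hypothetical helpers.

```python
import re


def natural_key(path):
    """Sort key that compares runs of digits as numbers, not as text."""
    parts = re.split(r"(\d+)", path)
    # re.split with a capturing group alternates text (even index)
    # and digit runs (odd index).
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def order_paths(paths, order="natural"):
    """Order glob results according to an include-order style value."""
    if order == "natural":
        return sorted(paths, key=natural_key)
    if order == "alphabetical":
        return sorted(paths)
    if order == "default":
        return list(paths)  # keep the order the glob call returned
    raise ValueError(f"unknown include-order: {order!r}")
```

For `["chap10.md", "chap2.md", "chap1.md"]`, natural order puts chap2.md before chap10.md, while alphabetical order puts chap10.md before chap2.md.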

include-resources

The include-resources can be used to simplify relative paths of include statements by searching in the given paths for files with relative paths when otherwise not found.

Given following directory structure and pandoc command:

main.md
examples/
    hello-world.c
content/
    chapter1/
        chapter01.md
        image.png
    chapter2/
        chapter02.md
        image.png

$ pandoc --metadata include-resources=examples/ ... main.md

This makes it possible to use the following include line in chapter01.md and chapter02.md:

```cpp
!include hello-world.c
```

Instead of:

```cpp
!include ../../examples/hello-world.c
```

This is most useful if some resources, e.g. source code or Doxygen output, are located in an external directory structure.
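The lookup described in this section can be pictured as a fallback search: try the path relative to the including file first, then each resource directory. The following is a sketch of that idea only; `resolve_include` is a hypothetical function, and the exact search order used by the filter is an assumption.

```python
import os


def resolve_include(name, current_dir, resource_dirs):
    """Return the first existing candidate path for `name`, or None."""
    # First try the path relative to the including file's directory.
    candidate = os.path.join(current_dir, name)
    if os.path.exists(candidate):
        return candidate
    # Otherwise fall back to each include-resources directory in turn.
    for directory in resource_dirs:
        candidate = os.path.join(directory, name)
        if os.path.exists(candidate):
            return candidate
    return None
```

With the directory layout above, looking up hello-world.c from content/chapter1 with examples/ as the only resource directory would return the file in examples/.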

rewrite-path

The rewrite-path option is a boolean value to configure whether the relative paths of images should be rewritten to paths relative to the root file. The default value is true.

For example, consider the following directory structure:

main.md
content/
  chapter01.md
  image.png

Suppose chapter01.md uses the image image.png. It should use ![Image](image.png) if rewrite-path is true, or ![Image](content/image.png) if rewrite-path is false.
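The rewriting itself amounts to joining the image path onto the included file's directory. The sketch below shows that computation with POSIX-style paths; `rewrite_image_path` is a hypothetical helper, and leaving absolute paths and URLs untouched is an assumption rather than documented behavior.

```python
import posixpath


def rewrite_image_path(image_path, included_file):
    """Make an image path from an included file relative to the root file.

    `included_file` is the included file's path relative to the root file.
    """
    if posixpath.isabs(image_path) or "://" in image_path:
        return image_path  # assumed: absolute paths and URLs stay as they are
    base = posixpath.dirname(included_file)
    return posixpath.normpath(posixpath.join(base, image_path))
```

For the example above, `rewrite_image_path("image.png", "content/chapter01.md")` gives `content/image.png`, which is the path the root file main.md needs.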

pandoc-options

The pandoc-options option is a list to specify the pandoc options when recursively processing included files. By default, the included file will inherit the pandoc-options from its parent file, unless specified in its own file.

To make the recursive includes work, --filter=pandoc-include is necessary. The default value of pandoc-options is:

pandoc-options:
  - --filter=pandoc-include

Examples

File include

File include can be used to separate chapters into different files, or include some latex files:

---
title: Article
author: Author
toc: true
---

!include chapters/chap01.md

!include chapters/chap02.md

!include chapters/chap03.md

!include`raw="latex"` data/table.tex

Header include

For header include, it is useful to define a header template and include it in many files.

For example, in the header.yaml, we can define basic info:

name: xxx
school: yyy
email: zzz

In the main.md, we can extend the header:

---
title: Title
---

!include-header header.yaml

# Section

Body

The main.md is then equivalent to the following markdown:

---
title: Title
name: xxx
school: yyy
email: zzz
---

# Section

Body
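The merge rule from the Features list (on conflict, the current file's value is kept) can be expressed in a few lines. This is a sketch with plain dictionaries standing in for parsed YAML headers; `merge_headers` is a hypothetical name, and merging only top-level keys is an assumption.

```python
def merge_headers(current, included):
    """Merge an included header into the current one; current values win."""
    merged = dict(current)
    for key, value in included.items():
        merged.setdefault(key, value)  # only add keys that are not set yet
    return merged


main_header = {"title": "Title"}
header_yaml = {"name": "xxx", "school": "yyy", "email": "zzz", "title": "Other"}
result = merge_headers(main_header, header_yaml)
```

Here `result` keeps `title: Title` from main.md and gains name, school and email from header.yaml.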

Environment Variables

The following environment variables can be set to change the default behaviour:

| Key | Value | Description |
| --- | --- | --- |
| PANDOC_INCLUDE_NOT_FOUND_ERROR | 0 or 1 | Emit an error if file not found (default: 0) |
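When pandoc is started from a script, the variable can be set for that one call without changing the surrounding shell. The sketch below builds such an environment; `pandoc_env` is a hypothetical helper, and the commented call at the end assumes pandoc and the filter are installed.

```python
import os


def pandoc_env(strict_not_found=True, base=None):
    """Return a copy of the environment with the not-found switch set."""
    env = dict(os.environ if base is None else base)
    env["PANDOC_INCLUDE_NOT_FOUND_ERROR"] = "1" if strict_not_found else "0"
    return env


# Possible use (requires pandoc and pandoc-include to be installed):
# import subprocess
# subprocess.run(
#     ["pandoc", "input.md", "--filter", "pandoc-include", "-o", "output.pdf"],
#     env=pandoc_env(), check=True)
```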

Development

To set up a local dev environment, you can use python venv to create a virtual environment from requirements.txt.

Or if you use Nix, you can simply run nix develop.

Troubleshooting

Command-line options

The pandoc command-line options are processed in order. If you want some options to be applied in included files, make sure the --filter pandoc-include option is specified before those options.

For example, use bibliography in the included files:

pandoc main.md --filter pandoc-include --citeproc --bibliography=ref.bib -o main.pdf
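If the command is assembled in a script, the ordering rule can be enforced in one place. This is a small sketch with a hypothetical helper, `build_pandoc_command`, that always places the filter before the options meant to reach included files.

```python
def build_pandoc_command(source, output, options_for_includes=()):
    """Build an argument list with --filter pandoc-include ahead of the rest."""
    return ["pandoc", source,
            "--filter", "pandoc-include",
            *options_for_includes,
            "-o", output]


command = build_pandoc_command(
    "main.md", "main.pdf", ["--citeproc", "--bibliography=ref.bib"])
```

The resulting list matches the bibliography example above and can be passed to `subprocess.run`.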

Executable not found

On some operating systems, the path may not be set correctly after installation. Make sure that the pandoc-include executable is in a directory listed in the PATH environment variable.
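A quick way to check this is to ask Python where (or whether) the executable is found on PATH. The sketch below uses the standard library function `shutil.which`; the wrapper name `find_filter` is hypothetical.

```python
import os
import shutil


def find_filter(name="pandoc-include"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_filter()
    if location is None:
        print("pandoc-include was not found in any of these directories:")
        for directory in os.environ.get("PATH", "").split(os.pathsep):
            print("  " + directory)
    else:
        print("pandoc-include found at " + location)
```

If the result is None, the directory that contains the installed pandoc-include script needs to be added to PATH.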

License

MIT License

Contributors

aubertc, dcsunset, gullumluvl, mario45211, russkel, studerluk


pandoc-include's Issues

Bug with code blocks (containing tab?)

The following markdown file:

```
test
```

when compiled with

pandoc --filter pandoc-include test.md -o test.html

returns

Traceback (most recent call last):
  File "~/.local/bin/pandoc-include", line 8, in <module>
    sys.exit(main())
  File "~/.local/lib/python3.7/site-packages/pandoc_include/main.py", line 380, in main
    return pf.run_filter(action, doc=doc)
  File "~/.local/lib/python3.7/site-packages/panflute/io.py", line 224, in run_filter
    return run_filters([action], *args, **kwargs)
  File "~/.local/lib/python3.7/site-packages/panflute/io.py", line 205, in run_filters
    doc = doc.walk(action, doc)
  File "~/.local/lib/python3.7/site-packages/panflute/base.py", line 264, in walk
    ans = list(chain.from_iterable(ans))
  File "~/.local/lib/python3.7/site-packages/panflute/base.py", line 262, in <genexpr>
    ans = ((item,) if type(item) != list else item for item in ans)
  File "~/.local/lib/python3.7/site-packages/panflute/base.py", line 259, in <genexpr>
    ans = (item.walk(action, doc) for item in obj)
  File "~/.local/lib/python3.7/site-packages/panflute/base.py", line 275, in walk
    altered = action(self, doc)
  File "~/.local/lib/python3.7/site-packages/pandoc_include/main.py", line 363, in action
    includeType, name, config = is_code_include(elem)
  File "~/.local/lib/python3.7/site-packages/pandoc_include/main.py", line 98, in is_code_include
    value, name, config = is_include_line(new_elem)
  File "~/.local/lib/python3.7/site-packages/pandoc_include/main.py", line 67, in is_include_line
    if (len(elem.content) not in [3, 4]) \
  File "~/.local/lib/python3.7/site-packages/panflute/base.py", line 106, in content
    return self._content
AttributeError: 'CodeBlock' object has no attribute '_content'
Error running filter pandoc-include:
Filter returned error status 1

Note that the file does not perform any actual inclusion and compiles just fine without the filter.

pandoc-include crashes when the first line of some code is indented

Consider the following MWE:

```
    I'm indented!
I'm not.
```

pandoc compiles it just fine to

<pre><code>    I&#39;m indented!
I&#39;m not.</code></pre>

but with the latest update, pandoc test.md --filter pandoc_include.py -o test.html crashes with

Traceback (most recent call last):
  File "pandoc_include.py", line 272, in <module>
    main()
  File "pandoc_include.py", line 268, in main
    return pf.run_filter(action, doc=doc)
  File "/home/caubert/.local/lib/python3.7/site-packages/panflute/io.py", line 224, in run_filter
    return run_filters([action], *args, **kwargs)
  File "/home/caubert/.local/lib/python3.7/site-packages/panflute/io.py", line 205, in run_filters
    doc = doc.walk(action, doc)
  File "/home/caubert/.local/lib/python3.7/site-packages/panflute/base.py", line 264, in walk
    ans = list(chain.from_iterable(ans))
  File "/home/caubert/.local/lib/python3.7/site-packages/panflute/base.py", line 262, in <genexpr>
    ans = ((item,) if type(item) != list else item for item in ans)
  File "/home/caubert/.local/lib/python3.7/site-packages/panflute/base.py", line 259, in <genexpr>
    ans = (item.walk(action, doc) for item in obj)
  File "/home/caubert/.local/lib/python3.7/site-packages/panflute/base.py", line 275, in walk
    altered = action(self, doc)
  File "pandoc_include.py", line 251, in action
    includeType, name, config = is_code_include(elem)
  File "pandoc_include.py", line 80, in is_code_include
    value, name, config = is_include_line(new_elem)
  File "pandoc_include.py", line 48, in is_include_line
    firstElem = elem.content[0]
  File "/home/caubert/.local/lib/python3.7/site-packages/panflute/base.py", line 106, in content
    return self._content
AttributeError: 'CodeBlock' object has no attribute '_content'
Error running filter pandoc_include.py:
Filter returned error status 1

This was not the case before this update, and can be resolved by removing the indentation from the first line.

Recursive includes not working

I have a master.md file which has !include statements for dep1.md and dep2.md. dep2.md also has !include statements. When I run

$ pandoc master.md --filter pandoc-include -o processed.md -t markdown -s

the include statements in dep2.md are not processed. Documentation says that recursive includes are supported since v0.3.1. I am on v0.3.2. Is there anything else needed to make recursive imports work?

Possible to use within code blocks?

Just wanted to ask if there's a way (or alternative syntax) to use pandoc-include to process include statements within code blocks, a la the below?

```c
!include foo.c
```

We noticed that, by default, the above simply outputs !include foo.c literally, as via:

<div class="sourceCode" id="cb3"><pre class="sourceCode c"><code class="sourceCode c"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a>!include foo.c</span></code></pre>

Many thanks!

Adjust section level in included markdown files

Consider the following markdown

# Software components
## Platform

!include J1939Tp/J1939Tp.md

J1939Tp.md:

# J1939Tp

Implements the Transport Protocol (TP) 
.
.
.

Then I would like the heading J1939Tp to be on level 3 in the merged output. "Software components" is level 1, "Platform" is level 2. The filter should know the heading level at the point of include and adjust heading levels in the included file accordingly. This makes it possible to reuse included .md files in several documents.

Shared library error

Whenever I run pandoc with --filter pandoc-include I get an error about a shared library:

$ cat test.md | pandoc --filter pandoc-include
pandoc-include: error while loading shared libraries: libHSpandoc-types-1.17.5.1-8DdN87tlyJbAzvpbzH68uw-ghc8.4.3.so: cannot open shared object file: No such file or directory
Error running filter pandoc-include:
Filter returned error status 127

I do have this file on my system however, but as a newer version:

$ locate libHSpandoc-types
/usr/lib/libHSpandoc-types-1.17.5.4-8Iu0ZQ3DY7ZCF41XD1ccvX-ghc8.6.5.so

This seems more likely to be a problem in my installation, although I'm not really sure how. I installed pandoc-include from the Arch Linux AUR (and pip, for that matter). I've already tried reinstalling a lot of the dependencies of this filter, in case something went wrong during linking.

invalid api version?

I am trying to run the example file provided in the project homepage and I get this error:

Traceback (most recent call last):
  File "/home/NAME/.local/bin/pandoc-include", line 8, in <module>
    sys.exit(main())
  File "/home/NAME/.local/lib/python3.10/site-packages/pandoc_include/main.py", line 333, in main
    return pf.run_filter(action, doc=doc)
  File "/home/NAME/.local/lib/python3.10/site-packages/panflute/io.py", line 227, in run_filter
    return run_filters([action], *args, **kwargs)
  File "/home/NAME/.local/lib/python3.10/site-packages/panflute/io.py", line 200, in run_filters
    doc = load(input_stream=input_stream)
  File "/home/NAME/.local/lib/python3.10/site-packages/panflute/io.py", line 58, in load
    doc = json.load(input_stream, object_hook=from_json)
  File "/usr/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/lib/python3.10/json/__init__.py", line 359, in loads
    return cls(**kw).decode(s)
  File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.10/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "/home/NAME/.local/lib/python3.10/site-packages/panflute/elements.py", line 1413, in from_json
    return Doc(*items, api_version=api, metadata=meta)
  File "/home/NAME/.local/lib/python3.10/site-packages/panflute/elements.py", line 66, in __init__
    raise TypeError("invalid api version", api_version)
TypeError: ('invalid api version', [1, 20])
Error running filter pandoc-include:
Filter returned error status 1

The installation is local (I edited the real folder name for security). The problem seems to be an invalid api version; I am almost illiterate insofar as Python is concerned and I ask your kind help.

  • Pandoc version: 2.9.2.1-3ubuntu2.
  • I am on Linux Mint 21.
  • pandoc-include version: 1.2.0

This filter looks excellent and I would really like to be able to use it.

Thank you.

newlines in src do not work

When my included source code contained a newline I got a backtrace:

Traceback (most recent call last):
  File "/usr/bin/pandoc-include", line 8, in <module>
    sys.exit(main())
  File "/usr/lib/python3.10/site-packages/pandoc_include/main.py", line 333, in main
    return pf.run_filter(action, doc=doc)
...
  File "/usr/lib/python3.10/site-packages/pandoc_include/main.py", line 309, in action
    codes.append(read_file(fn, config))
  File "/usr/lib/python3.10/site-packages/pandoc_include/main.py", line 160, in read_file
    content = "\n".join(dedent(content, config["dedent"]))
TypeError: sequence item 4: expected str instance, NoneType found
Error running filter pandoc-include:
Filter returned error status 1

I fixed this by adding a check in removeLeadingWhitespaces():

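import re

def removeLeadingWhitespaces(s, num):
    # Blank or whitespace-only lines are returned unchanged
    if not s.strip():  # <<< empty lines didn't work
        return s
    regex = re.compile(r"[^\s]")
    m = regex.search(s)
    if m is None:
        # Defensive: never return None, because the caller joins the results
        return s
    pos = m.span()[0]
    if num < 0:
        # A negative num removes all leading whitespace
        return s[pos:]
    else:
        return s[min(pos, num):]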

Raw latex in yaml headers

I have an issue when using raw Latex in the document's yaml header:

---
title: Title
date: \today{}
---

Here is a file in the same directory:

!include file.md 

Yields the following pdf:

image

This is not a huge problem on its own, but header-includes also doesn't work, which is a bit more of an issue.

Any thoughts?

Include support for bibliography

Pandoc now allows native support for bibliography. However when a citation is made inside an included document, the citation is not rendered.

Is it possible to add support for bibliography ?

pandoc --citeproc --bibliography=doc.bib  main.md --filter pandoc-include -o main.html

with main.md

# Hello world

!include chap01.md

## References

::: {#refs}
:::

with chap01.md

# Chapter 1

[@MyBibliographicRef]

Error: invalid api version

Hi there,

I've tried to run a very simple test but I got the error below.

> python -V
Python 3.9.2
> pip -V
pip 22.3.1 from /home/souto/.local/lib/python3.9/site-packages/pip (python 3.9)

> pip install --user pandoc-include
Collecting pandoc-include
  Downloading pandoc_include-1.2.0-py3-none-any.whl (10 kB)
Collecting panflute>=2.0.5
  Downloading panflute-2.2.3-py3-none-any.whl (36 kB)
Collecting natsort>=7
  Downloading natsort-8.2.0-py3-none-any.whl (37 kB)
Requirement already satisfied: pyyaml<7,>=3 in /home/souto/.local/lib/python3.9/site-packages (from panflute>=2.0.5->pandoc-include) (6.0)
Requirement already satisfied: click<9,>=6 in /home/souto/.local/lib/python3.9/site-packages (from panflute>=2.0.5->pandoc-include) (8.1.3)
Installing collected packages: panflute, natsort, pandoc-include
Successfully installed natsort-8.2.0 pandoc-include-1.2.0 panflute-2.2.3

> pip show pandoc-include
Name: pandoc-include
Version: 1.2.0
Summary: Pandoc filter to allow file and header includes
Home-page: https://github.com/DCsunset/pandoc-include
Author: DCsunset
Author-email: [email protected]
License: MIT
Location: /home/souto/.local/lib/python3.9/site-packages
Requires: natsort, panflute
Required-by: 


> cat snippet.md             
## Hello world
> cat main.md 
# Main

!include snippet.md

> pandoc main.md --filter pandoc-include -o output.pdf
Traceback (most recent call last):
  File "/home/souto/.local/bin/pandoc-include", line 8, in <module>
    sys.exit(main())
  File "/home/souto/.local/lib/python3.9/site-packages/pandoc_include/main.py", line 333, in main
    return pf.run_filter(action, doc=doc)
  File "/home/souto/.local/lib/python3.9/site-packages/panflute/io.py", line 227, in run_filter
    return run_filters([action], *args, **kwargs)
  File "/home/souto/.local/lib/python3.9/site-packages/panflute/io.py", line 200, in run_filters
    doc = load(input_stream=input_stream)
  File "/home/souto/.local/lib/python3.9/site-packages/panflute/io.py", line 58, in load
    doc = json.load(input_stream, object_hook=from_json)
  File "/usr/lib/python3.9/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/lib/python3.9/json/__init__.py", line 359, in loads
    return cls(**kw).decode(s)
  File "/usr/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.9/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "/home/souto/.local/lib/python3.9/site-packages/panflute/elements.py", line 1362, in from_json
    return Doc(*items, api_version=api, metadata=meta)
  File "/home/souto/.local/lib/python3.9/site-packages/panflute/elements.py", line 66, in __init__
    raise TypeError("invalid api version", api_version)
TypeError: ('invalid api version', [1, 20])
Error running filter pandoc-include:
Filter returned error status 1

I'd be grateful for any tips.
Cheers, Manuel

No longer automatically inserting pagebreaks

I recently changed laptop for writing and set up my writing environment, now running pandoc-include 1.2.0 and pandoc 3.0.1.

Previously, when I included .md files in a main file, I didn't need to insert page breaks for each chapter of my book; the chapters are in separate .md files. Now I have to do this for it to work:

\pagebreak

!include forord.md

\pagebreak

!include 01.md

\pagebreak

Anything I have missed?

Implicit header references do not work, when file with header is included after the reference

Pandoc supports implicit header references, as described here.

But when using pandoc-include AND when the file with the referenced header is included after the file with the implicit reference, the reference is not rendered properly. Note that explicit references work fine in the same situation.

For example, consider the following markdown source:

---
title: Testing Implicit References
---

!include 10-chapter1.md

!include 20-chapter2.md

Here is 10-chapter1.md:

[This is Header]

[Link][This is Header]

[Explicit Link](#this-is-header)

Here is 20-chapter2.md:

# This is Header

If I convert it to HTML, I get this:

<p>[This is Header]</p>
<p>[Link][This is Header]</p>
<p><a href="#this-is-header">Explicit Link</a></p>
<h1 data-number="1" id="this-is-header"><span class="header-section-number">1</span> This is Header</h1>

Note that only the explicit reference has been properly created.
If I swap the MD files (so that 20-chapter2.md comes first), there will be no such issue.

Any way to pass options to the executed pandoc?

I execute pandoc with several options to control the resulting format, etc.

So far, however, I haven't found any way to pass the same options to the pandoc process that pandoc-include executes.

This causes inconsistencies in the resulting files, between the including and the included files.

It would be extremely useful if I could list options to be passed to the child process in the header of files to be included, for example.

Could not find executable pandoc-include

I know that Issue #1 seems very similar to this, however I have followed the solution there and it does not work for me.

Problem

When trying to use pandoc-include like pandoc .\Book.md --filter pandoc-include -o book.pdf I get the error
Error running filter pandoc-include: Could not find executable pandoc-include

Background

I have done the following:

  1. Install Python from https://www.python.org/downloads/
  2. Run pip install --upgrade pandoc, pip install pandoc-include
  3. Moved pandoc-include from the Roaming folder it was installed in (C:\Users\myuser\AppData\Roaming\Python\Python311\site-packages), to the same folder as where pandoc is installed in (C:\Users\myuser\AppData\Local\Programs\Python\Python311\Lib\site-packages)
  4. Made sure that pandoc-include is in the path variable: running echo $env:PATH returns my path, which contains C:\Users\myuser\AppData\Local\Programs\Python\Python311\Lib\site-packages

My pandoc version is 2.3 and pandoc-include is 1.2.0.

How to debug json error?

Hi,
I am using pandoc 2.11.1.1
and pandoc-include

pip show pandoc-include
Name: pandoc-include
Version: 0.8.4

I split a large markdown file that was generated from a docx and unfortunately pandoc-include fails with a json error. I checked the individual files, they all conform with utf8. So what could be the problem here?

The error message is attached below. Could it be that the pandoc and corresponding panflute updates created some incompatibilities with pandoc-include? I have had to debug other filters using panflute for that reason, but I have no idea how to deal with the underlying json error.

Thanks for help

Regards
Peter.

  File "/usr/local/bin/pandoc-include", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/pandoc_include.py", line 166, in main
    return pf.run_filter(action, doc=doc)
  File "/usr/local/lib/python3.9/site-packages/panflute/io.py", line 224, in run_filter
    return run_filters([action], *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/panflute/io.py", line 205, in run_filters
    doc = doc.walk(action, doc)
  File "/usr/local/lib/python3.9/site-packages/panflute/base.py", line 264, in walk
    ans = list(chain.from_iterable(ans))
  File "/usr/local/lib/python3.9/site-packages/panflute/base.py", line 262, in <genexpr>
    ans = ((item,) if type(item) != list else item for item in ans)
  File "/usr/local/lib/python3.9/site-packages/panflute/base.py", line 259, in <genexpr>
    ans = (item.walk(action, doc) for item in obj)
  File "/usr/local/lib/python3.9/site-packages/panflute/base.py", line 275, in walk
    altered = action(self, doc)
  File "/usr/local/lib/python3.9/site-packages/pandoc_include.py", line 137, in action
    new_elems = pf.convert_text(
  File "/usr/local/lib/python3.9/site-packages/panflute/tools.py", line 393, in convert_text
    out = json.loads(out, object_hook=from_json)
  File "/usr/local/Cellar/[email protected]/3.9.0_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/__init__.py", line 359, in loads
    return cls(**kw).decode(s)
  File "/usr/local/Cellar/[email protected]/3.9.0_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/Cellar/[email protected]/3.9.0_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 1 column 20078 (char 20077)
Error running filter pandoc-include:
Filter returned error status 1

Included figures with relative paths are not found

Hi,

I have the following folder structure and files:

.
├── A.rst
├── included 
│   ├── B.rst
│   ├── diagram.jpg

A.rst:

Heading
----------

$include included/B.rst

B.rst:

.. figure:: imageB.png

The figure in B.rst is included via a relative path (relative to B). When B is included into A, this path is included as-is and becomes invalid:

> pandoc  --filter=pandoc-include -f rst -t pdf A.rst
[WARNING] Could not fetch resource diagram.jpg: replacing image with description

For that use case to work, I guess the $include should rewrite the paths to the figures and prepend the path from A to B, so that the relative paths become relative to A. Another option would be to convert them to absolute paths during the include, although I'm not sure how that fits into the overall pandoc-filter philosophy.

reST directives are not executed in the included files

Hi,

I have two reST files, let's call them A and B:

A.rst :

Heading
---------- 

.. figure:: imageA.png

$include B.rst

B.rst :

.. figure:: image.png

The .. figure line is inserted verbatim in the output document, instead of being replaced by an image.

I'm new to pandoc, and I assume this issue arises because the filters run after the reST directives have been processed. Is there any workaround for this ?

Thanks !

Invalid API Version

I have a test file that looks like this, note that it doesn't use any include syntax.

---
title: Title
---

# Header

Text

When I compile this using the filter like so

pandoc -t latex -f markdown --filter pandoc-include -o test.pdf test.md

I get the following error

Traceback (most recent call last):
  File "/home/rutrum/.local/bin/pandoc-include", line 8, in <module>
    sys.exit(main())
  File "/home/rutrum/.local/lib/python3.8/site-packages/pandoc_include/main.py", line 333, in main
    return pf.run_filter(action, doc=doc)
  File "/home/rutrum/.local/lib/python3.8/site-packages/panflute/io.py", line 227, in run_filter
    return run_filters([action], *args, **kwargs)
  File "/home/rutrum/.local/lib/python3.8/site-packages/panflute/io.py", line 200, in run_filters
    doc = load(input_stream=input_stream)
  File "/home/rutrum/.local/lib/python3.8/site-packages/panflute/io.py", line 58, in load
    doc = json.load(input_stream, object_hook=from_json)
  File "/usr/lib/python3.8/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/lib/python3.8/json/__init__.py", line 370, in loads
    return cls(**kw).decode(s)
  File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.8/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "/home/rutrum/.local/lib/python3.8/site-packages/panflute/elements.py", line 1362, in from_json
    return Doc(*items, api_version=api, metadata=meta)
  File "/home/rutrum/.local/lib/python3.8/site-packages/panflute/elements.py", line 66, in __init__
    raise TypeError("invalid api version", api_version)
TypeError: ('invalid api version', [1, 17, 5, 4])
Error running filter pandoc-include:
Filter returned error status 1

I get similar errors when using the include syntax. I installed this using pip, and pip show pandoc-include reveals that I am using version 1.2.

add id-prefix option

Add an id-prefix option for included files; that would help avoid conflicting heading IDs from multiple files
(see the --id-prefix command line option and identifier-prefix in YAML).

This could either be set explicitly or optionally auto-generated from the basename of the included path.
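
To make the request concrete, here is a minimal sketch of how an auto-generated prefix could be derived from the basename of the included path. The function names `auto_id_prefix` and `prefixed_id` are hypothetical and are not part of pandoc-include; the slug rules are an assumption chosen for illustration.

```python
import os
import re


def auto_id_prefix(path):
    """Derive a heading-ID prefix from the basename of an included path (hypothetical helper)."""
    stem = os.path.splitext(os.path.basename(path))[0]
    # Keep letters, digits, underscores and hyphens; collapse anything else to "-".
    slug = re.sub(r"[^0-9A-Za-z_-]+", "-", stem).strip("-").lower()
    return slug + "-" if slug else ""


def prefixed_id(path, identifier, explicit_prefix=None):
    """Use the explicit prefix when given, otherwise fall back to the auto-generated one."""
    prefix = explicit_prefix if explicit_prefix is not None else auto_id_prefix(path)
    return prefix + identifier
```

With this sketch, a heading `overview` in `a/b/setup.md` would become `setup-overview`, so the same heading in two included files would no longer collide.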

Comments in code blocks return an error

It seems the issue with code blocks runs deeper than expected. The filter returns an error when a code block contains a comment with too many hash characters.

# This is a title

This is some text.

```python
################## Comment ##################
```

Compiling this with the filter active gives the following error:

> pandoc -f markdown -t latex test.md -o test.pdf --filter pandoc-include
TypeError: Header level not between 1 and 10
Error running filter pandoc-include:
Filter returned error status 1

Without the filter it compiles just fine. It seems the filter treats code blocks as separate markdown files.

Traceback
Traceback (most recent call last):
  File "/mnt/c/Users/Tom Van Eyck/Documents/KULeuven/Studentenjob/documentation/venv/bin/pandoc-include", line 10, in <module>
    sys.exit(main())
  File "/mnt/c/Users/Tom Van Eyck/Documents/KULeuven/Studentenjob/documentation/venv/lib/python3.8/site-packages/pandoc_include/main.py", line 382, in main
    return pf.run_filter(action, doc=doc)
  File "/mnt/c/Users/Tom Van Eyck/Documents/KULeuven/Studentenjob/documentation/venv/lib/python3.8/site-packages/panflute/io.py", line 224, in run_filter
    return run_filters([action], *args, **kwargs)
  File "/mnt/c/Users/Tom Van Eyck/Documents/KULeuven/Studentenjob/documentation/venv/lib/python3.8/site-packages/panflute/io.py", line 205, in run_filters
    doc = doc.walk(action, doc)
  File "/mnt/c/Users/Tom Van Eyck/Documents/KULeuven/Studentenjob/documentation/venv/lib/python3.8/site-packages/panflute/base.py", line 264, in walk
    ans = list(chain.from_iterable(ans))
  File "/mnt/c/Users/Tom Van Eyck/Documents/KULeuven/Studentenjob/documentation/venv/lib/python3.8/site-packages/panflute/base.py", line 262, in <genexpr>
    ans = ((item,) if type(item) != list else item for item in ans)
  File "/mnt/c/Users/Tom Van Eyck/Documents/KULeuven/Studentenjob/documentation/venv/lib/python3.8/site-packages/panflute/base.py", line 259, in <genexpr>
    ans = (item.walk(action, doc) for item in obj)
  File "/mnt/c/Users/Tom Van Eyck/Documents/KULeuven/Studentenjob/documentation/venv/lib/python3.8/site-packages/panflute/base.py", line 275, in walk
    altered = action(self, doc)
  File "/mnt/c/Users/Tom Van Eyck/Documents/KULeuven/Studentenjob/documentation/venv/lib/python3.8/site-packages/pandoc_include/main.py", line 365, in action
    includeType, name, config = is_code_include(elem)
  File "/mnt/c/Users/Tom Van Eyck/Documents/KULeuven/Studentenjob/documentation/venv/lib/python3.8/site-packages/pandoc_include/main.py", line 99, in is_code_include
    new_elem = pf.convert_text(elem.text)[0]
  File "/mnt/c/Users/Tom Van Eyck/Documents/KULeuven/Studentenjob/documentation/venv/lib/python3.8/site-packages/panflute/tools.py", line 459, in convert_text
    out = json.loads(out, object_hook=from_json)
  File "/usr/lib/python3.8/json/__init__.py", line 370, in loads
    return cls(**kw).decode(s)
  File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.8/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "/mnt/c/Users/Tom Van Eyck/Documents/KULeuven/Studentenjob/documentation/venv/lib/python3.8/site-packages/panflute/elements.py", line 1371, in from_json
    return _res_func[tag](c)
  File "/mnt/c/Users/Tom Van Eyck/Documents/KULeuven/Studentenjob/documentation/venv/lib/python3.8/site-packages/panflute/elements.py", line 1282, in <lambda>
    'Header': lambda c: Header(
  File "/mnt/c/Users/Tom Van Eyck/Documents/KULeuven/Studentenjob/documentation/venv/lib/python3.8/site-packages/panflute/elements.py", line 379, in __init__
    raise TypeError('Header level not between 1 and 10')
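
The traceback shows the code block's text being passed to `pf.convert_text` before the filter knows whether it is an include statement at all. One possible guard, sketched below, is a cheap text check that runs first, so ordinary code never reaches the markdown parser. The name `may_be_code_include` and the exact pattern are assumptions for illustration, not the filter's actual code.

```python
import re

# Assumption: an include statement starts with "!include" or "$include"
# on the first non-blank line of the code block.
INCLUDE_LINE = re.compile(r"^\s*[!$]include\b")


def may_be_code_include(text):
    """Return True only if the first non-blank line looks like an include statement."""
    for line in text.splitlines():
        if line.strip():
            return bool(INCLUDE_LINE.match(line))
    return False
```

A caller would parse the block as markdown only when this returns True, and leave every other code block untouched.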

Another issue with code blocks and horizontal lines

The following markdown file:

```
* * *
```

when compiled with

pandoc --filter pandoc-include test.md -o test.html

returns

Traceback (most recent call last):
  File "~/.local/bin/pandoc-include", line 8, in <module>
    sys.exit(main())
  File "~/.local/lib/python3.7/site-packages/pandoc_include/main.py", line 380, in main
    return pf.run_filter(action, doc=doc)
  File "~/.local/lib/python3.7/site-packages/panflute/io.py", line 224, in run_filter
    return run_filters([action], *args, **kwargs)
  File "~/.local/lib/python3.7/site-packages/panflute/io.py", line 205, in run_filters
    doc = doc.walk(action, doc)
  File "~/.local/lib/python3.7/site-packages/panflute/base.py", line 264, in walk
    ans = list(chain.from_iterable(ans))
  File "~/.local/lib/python3.7/site-packages/panflute/base.py", line 262, in <genexpr>
    ans = ((item,) if type(item) != list else item for item in ans)
  File "~/.local/lib/python3.7/site-packages/panflute/base.py", line 259, in <genexpr>
    ans = (item.walk(action, doc) for item in obj)
  File "~/.local/lib/python3.7/site-packages/panflute/base.py", line 275, in walk
    altered = action(self, doc)
  File "~/.local/lib/python3.7/site-packages/pandoc_include/main.py", line 363, in action
    includeType, name, config = is_code_include(elem)
  File "~/.local/lib/python3.7/site-packages/pandoc_include/main.py", line 98, in is_code_include
    value, name, config = is_include_line(new_elem)
  File "~/.local/lib/python3.7/site-packages/pandoc_include/main.py", line 67, in is_include_line
    if (len(elem.content) not in [3, 4]) \
  File "~/.local/lib/python3.7/site-packages/panflute/base.py", line 106, in content
    return self._content
AttributeError: 'HorizontalRule' object has no attribute '_content'
Error running filter pandoc-include:
Filter returned error status 1

Note that the file does not perform any actual inclusion and compiles just fine without the filter.

support mdbook include syntax

It would be nice to have an option to support the mdbook include syntax.
That would help portability of markdown documents across tools
(e.g. using mdbook to generate HTML and pandoc to generate PDF from the same markdown files).
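
As a rough illustration of what such an option might do, the sketch below rewrites the plain mdbook form `{{#include path}}` into this filter's `!include path` form. It is an assumption-laden sketch: the function name is hypothetical, and mdbook's line-range and anchor variants (such as `file.rs:2:10`) are deliberately left untouched rather than guessed at.

```python
import re

# Matches only the plain form "{{#include path}}"; a ":" in the argument
# (line ranges, anchors) makes the pattern fail on purpose.
MDBOOK_INCLUDE = re.compile(r"^\s*\{\{\s*#include\s+([^\s:}]+)\s*\}\}\s*$")


def mdbook_to_pandoc_include(line):
    """Translate a plain mdbook include line; return any other line unchanged."""
    match = MDBOOK_INCLUDE.match(line)
    return f"!include {match.group(1)}" if match else line
```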

Including LaTeX \usepackage in include-header results in an error

I have in my document

# Formatting Settings

documentclass: article
fontsize: 12pt
geometry: margin=1.0in
#geometry: "left=3cm,right=3cm,top=2cm,bottom=2cm"
header-includes:
  - \usepackage{times}
  - \usepackage[mathlines,displaymath]{lineno}
  - \linenumbers
urlcolor: blue

These settings work there. When I move them into header.yaml and use include-header, I get the following error:

! LaTeX Error: Missing \begin{document}.

See the LaTeX manual or LaTeX Companion for explanation.
Type  H <return>  for immediate help.
 ...                                              
                                                  
l.55 \textbackslash
                    usepackage\{times\}
!  ==> Fatal error occurred, no output PDF file produced!
Transcript written on /tmp/tex2pdf.-98c2dc4c9fc43ed6/input.log.

Too strict on "File not found"

When attempting to include a file that doesn't exist, version 1.2.0 would simply print a warning:

eprint('[Warn] included file not found: ' + name)

But version 1.2.1 throws an IOError, essentially aborting pandoc:

raise IOError(f"Included file not found: {name}")

That's too harsh, in my opinion. For example, I was trying to render a file which is an "instruction" on how to use pandoc-include itself. So there would be a line there, like

!include <your_file>

And this line is now failing the whole rendering.
I believe a warning was more appropriate.
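
One way to reconcile both behaviours would be to make the strictness configurable. The sketch below reuses the two messages quoted above; the function name `read_included_file` and the `strict` parameter are hypothetical and only illustrate the idea.

```python
import os
import sys


def read_included_file(name, strict=False):
    """Return the file's text, or None with a warning when it is missing.

    With strict=True (a hypothetical option), a missing file raises instead.
    """
    if not os.path.isfile(name):
        if strict:
            raise IOError(f"Included file not found: {name}")
        print(f"[Warn] included file not found: {name}", file=sys.stderr)
        return None
    with open(name, encoding="utf-8") as f:
        return f.read()
```

With the default, a placeholder line such as `!include <your_file>` would only produce a warning, while users who want a hard failure could opt in.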

Could not find executable pandoc-include

Hey @DCsunset ,

Thanks for this cool Pandoc feature. While trying it on Windows 10, running pandoc paper.md --filter pandoc-include -o output.pdf gives me:

Error running filter pandoc-include:
Could not find executable pandoc-include

My paper.md file follows the same syntax as the example provided, although this seems to be a problem with the executable. The other Pandoc filters I use are all in the Scripts/ folder inside Anaconda, but I cannot find pandoc-include there.

where pandoc-*
...\Anaconda3\Scripts\pandoc-citeproc.exe
...\Anaconda3\Scripts\pandoc-eqnos.exe
...\Anaconda3\Scripts\pandoc-fignos.exe
...\Anaconda3\Scripts\pandoc-tablenos.exe

Any ideas? Thanks in advance!
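
A quick way to narrow this down is to ask Python where it installs scripts and whether the executable is visible on PATH. This is only a diagnostic sketch using the standard library; the helper name `locate_filter` is made up for this example.

```python
import shutil
import site
import sysconfig


def locate_filter(name="pandoc-include"):
    """Report where the executable is found on PATH and where scripts usually go."""
    return {
        "on_path": shutil.which(name),                 # None if PATH lookup fails
        "scripts_dir": sysconfig.get_path("scripts"),  # scripts dir of this interpreter
        "user_base": site.getuserbase(),               # base dir used by `pip install --user`
    }


if __name__ == "__main__":
    for key, value in locate_filter().items():
        print(f"{key}: {value}")
```

If `on_path` is None, the executable may have been installed under the user base (as happens with `pip install --user`) rather than in the Anaconda Scripts folder, in which case that directory would need to be added to PATH.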

Non-English characters are not processed correctly

If I have these Spanish words in my main *.md file:
Descripción del producto
It gets correctly converted to PDF.

However, if these same words are included and hence processed by the pandoc-include filter, I get this result in the PDF:

Descripción del producto

As you can see, that special "o" character got messed up. Any idea how to prevent this?

If this can work, it will save me a lot of time!

Thanks!
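
The garbled output has the typical shape of UTF-8 bytes being decoded with a single-byte encoding. The cause in the filter is not confirmed here; the sketch below only reproduces the symptom and shows the usual remedy of opening files with an explicit encoding. The helper `read_text_utf8` is hypothetical.

```python
def read_text_utf8(path):
    """Read a file as UTF-8 regardless of the platform's default encoding."""
    with open(path, encoding="utf-8") as f:
        return f.read()


original = "Descripción del producto"
# Decoding UTF-8 bytes as Latin-1 reproduces the symptom from the report.
garbled = original.encode("utf-8").decode("latin-1")
# Reversing the wrong decoding recovers the original text.
repaired = garbled.encode("latin-1").decode("utf-8")
```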

Parameterized include

Hi,

Is it possible to parameterize an include?

Our use case is to include a customized code block.

Is there perhaps a workaround?

Regards,
Étienne

Using pandoc-include results in NotImplementedError

I installed pandoc-include through pip and added the executable to the path. Upon trying to use the filter I get this error:

Traceback (most recent call last):
  File "/Users/axelkennedal/.local/bin/pandoc-include", line 8, in <module>
    sys.exit(main())
  File "/Users/axelkennedal/.local/lib/python3.7/site-packages/pandoc_include.py", line 166, in main
    return pf.run_filter(action, doc=doc)
  File "/Users/axelkennedal/.local/lib/python3.7/site-packages/panflute/io.py", line 224, in run_filter
    return run_filters([action], *args, **kwargs)
  File "/Users/axelkennedal/.local/lib/python3.7/site-packages/panflute/io.py", line 197, in run_filters
    doc = load(input_stream=input_stream)
  File "/Users/axelkennedal/.local/lib/python3.7/site-packages/panflute/io.py", line 58, in load
    doc = json.load(input_stream, object_hook=from_json)
  File "/Users/axelkennedal/miniconda3/lib/python3.7/json/__init__.py", line 296, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/Users/axelkennedal/miniconda3/lib/python3.7/json/__init__.py", line 361, in loads
    return cls(**kw).decode(s)
  File "/Users/axelkennedal/miniconda3/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Users/axelkennedal/miniconda3/lib/python3.7/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "/Users/axelkennedal/.local/lib/python3.7/site-packages/panflute/elements.py", line 1375, in from_json
    raise NotImplementedError(f'Unknown tag: {tag}')
NotImplementedError: Unknown tag: Caption
Error running filter pandoc-include:
Filter returned error status 1

Is this a bug or an issue with my local installation? I use the filter like this in a bash script: pandoc ${DOCPATH} -o ${OUTPUTFILE} --filter pandoc-include. Compilation with pandoc works perfectly fine if I remove the pandoc-include filter.

I'm on macOS 10.15.6 with Python 3.7.6.

Filename's underscore gets escaped?

This problem may be too niche to be investigated here, but I thought it could be useful to mention it in case others have the same issue.

When I compile this website online (using a GitHub action), the underscores in file names get escaped, which breaks the inclusion.

  • What I write: !include code/overflow_example.cs

  • Where the file is: code/overflow_example.cs

  • What github's action says:

     [WARNING] Included file not found: code/overflow\_example.cs
    

As a side note, if I get rid of the underscore but use quotes, that is, if I write !include "code/overflowExample.cs" (using quotes, as indicated at https://github.com/DCsunset/pandoc-include?tab=readme-ov-file#syntax), then the quotes themselves get modified and I get

[WARNING] Included file not found: “code/overflowExample.cs”

I discuss this issue further here: csci-1301/csci-1301.github.io#155 (comment). I say this issue may be too niche because I didn't see it when I compiled this source code locally on my machine (even though the GitHub action uses the exact same versions of pandoc and pandoc-include as I do, the most recent at the time of writing in both cases).
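
Both symptoms suggest the file name reaches the filter after some markdown processing (an escaped underscore, typographic quotes). A defensive normalization step, sketched below, could undo those two transformations before the file lookup. This is an assumption about a possible workaround, not a description of the filter's behaviour, and `normalize_include_name` is a hypothetical name.

```python
import re

# Straight and curly quote characters that may wrap the file name.
_QUOTES = "\"'\u201c\u201d\u2018\u2019"


def normalize_include_name(name):
    """Strip wrapping quotes and undo markdown escapes of "_" and "*"."""
    name = name.strip().strip(_QUOTES)
    # Only "\_" and "\*" are unescaped, so other backslashes stay intact.
    return re.sub(r"\\([_*])", r"\1", name)
```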

Hard line breaks are not preserved

Pandoc offers an extension for managing hard line breaks within markdown: the hard_line_breaks extension. Unfortunately, this extension is not applied to included files.

For example, issuing:

pandoc --filter pandoc-include --from=markdown+hard_line_breaks -o mainfile.pdf mainfile.md

with this file:

mainfile.md

This will support
hard line break.

<!--- this will not  -->
!include inclusion.md

All the hard breaks are skipped in the included file.
Also, HTML tags (such as <br />) seem to be stripped from the included file.