laurentrdc / pandoc-pyplot

A Pandoc filter to generate Matplotlib/Plotly figures directly in documents

License: GNU General Public License v2.0

Haskell 95.77% PowerShell 0.88% Inno Setup 3.35%

documentation matplotlib pandoc pandoc-filter

pandoc-pyplot's Issues

[FEATURE REQUEST]: Include multiple scripts

This filter has a nice option to include a file before the code:

# file.py
x = list(range(5))

# Include in markdown
```{.pyplot include=file.py}
import matplotlib.pyplot as plt

plt.figure()
plt.plot(x, list(map(lambda x: x**2, x)))
plt.title('This is an example figure')
```

But often it may be useful to include more than one script:

# file1.py
x = list(range(5))

# file2.py
y = list(map(lambda a: a**2, x))

# Include multiple files in markdown
```{.pyplot include=file1.py include=file2.py}
import matplotlib.pyplot as plt

plt.figure()
plt.plot(x, y)
plt.title('This is an example figure')
```

If pandoc does not allow repeated attributes (include=file1.py include=file2.py), the same result could be achieved with a comma-separated value: include="file1.py,file2.py".
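
To make the comma-separated variant concrete, here is a minimal Python sketch of how such an attribute could be resolved. It is an illustration only, not the filter's (Haskell) implementation; `split_includes`, `build_script` and the `read_file` callback are hypothetical names introduced for this example.

```python
from typing import Callable, List


def split_includes(attr_value: str) -> List[str]:
    """Split a comma-separated include attribute into clean file names."""
    return [name.strip() for name in attr_value.split(",") if name.strip()]


def build_script(attr_value: str, code: str, read_file: Callable[[str], str]) -> str:
    """Concatenate the included files, in order, followed by the block's code."""
    parts = [read_file(name) for name in split_includes(attr_value)]
    parts.append(code)
    return "\n".join(parts)


# Usage with an in-memory stand-in for the file system (hypothetical contents).
files = {
    "file1.py": "x = list(range(5))",
    "file2.py": "y = list(map(lambda a: a**2, x))",
}
script = build_script("file1.py, file2.py", "result = list(zip(x, y))", files.__getitem__)
namespace = {}
exec(script, namespace)
print(namespace["result"])
```

The order of the names matters here: file2.py uses `x`, so it has to come after file1.py.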

Add Additional Parameters for saveFig

Sometimes you'd like to be able to pass in additional parameters to saveFig. The usual one here, for me at least, is bbox_inches='tight' to get around issues with plots being cut off. Here's one example on SO to that effect.

It would be nice if, like dpi, there were configuration options for most (or all?) of the parameters that a user might want to pass to saveFig.

I don't think there's currently a way to do this, judging by how the saveFig call is constructed.
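
As a rough illustration of what is being asked for, the sketch below renders a Matplotlib `plt.savefig(...)` line from arbitrary keyword options. It is written in Python for readability and is not how the filter builds the call; `render_savefig_call` is a hypothetical helper, while `dpi` and `bbox_inches` are real `savefig` parameters.

```python
def render_savefig_call(path: str, **options) -> str:
    """Render a `plt.savefig(...)` line from a path and keyword options."""
    args = [repr(path)]
    # Sort the keys so the generated line is deterministic.
    args += [f"{key}={value!r}" for key, value in sorted(options.items())]
    return f"plt.savefig({', '.join(args)})"


line = render_savefig_call("figure.png", dpi=80, bbox_inches="tight")
print(line)  # plt.savefig('figure.png', bbox_inches='tight', dpi=80)
```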

Unrelated: thanks for this library. 👍 It is making the transition from "LaTeX + Makefiles running miscellaneous figure scripts" to Pandoc a lot easier.

Error in $: Failed reading: not a valid json value

Hey there,

I've been trying to run the example from usage, but it's throwing an error:

$ pandoc --filter pandoc-pyplot input.md --output output.pdf
Error running filter pandoc-pyplot:
Error in $: Failed reading: not a valid json value

Any idea what might cause this?

My system is x86_64 Arch Linux with pandoc version 2.7.2. I've installed pandoc-pyplot via stack.

Thanks in advance.

[FEATURE REQUEST]: store state between executions of blocks

How about storing state of python objects between executions of different blocks?

The idea is the following: imagine we have two code blocks, and in the second one we want to use a variable that was defined in the previous block. At the moment, each code block executes in an isolated Python environment (in terms of the global objects available). So you are unable to do the following:

# First
```{.pyplot capture="a,b"}
import matplotlib.pyplot as plt

a = list(range(5))
b = list(range(10))

plt.figure()
plt.plot(a, list(map(lambda x: x**2, a)))
plt.title('This is an example figure')
```
# Second
```{.pyplot needs="a,b"}
import matplotlib.pyplot as plt

plt.figure()
plt.plot(a, list(map(lambda x: x**3, a)))
plt.plot(a, list(map(lambda x: x**2, b)))
plt.title('This is an example figure')
```

Sure, we can simply copy-paste a = list(range(5)) and b = list(range(10)), or put them in a separate file and include it each time we want them available in our code. But this could add performance overhead if the computation is more expensive than list(range(10)). For example, we might need to preprocess our data, show the density histogram, then normalize the preprocessed data and show a new density plot.

I can think of the following approach:

First, we need to launch only one instance of the Python process while the filter is working and feed it our scripts. It would keep our variables and function definitions in memory for as long as the filter runs. Here is how this could be achieved with bash:

# generate script files
for ((i = 0 ; i < 10 ; i++)); do echo "print($i)" > "file${i}.py"; done

# feed them to python
for ((i = 0 ; i < 10 ; i++)); do cat "file${i}.py"; done | python3
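
The same idea, one interpreter state shared by all blocks, can also be sketched in Python itself. This is only an in-process illustration using `exec` with a shared namespace, not a description of how the filter launches Python; the `Session` class is a hypothetical name.

```python
class Session:
    """Run code blocks one after another in a single shared namespace."""

    def __init__(self) -> None:
        self.namespace = {}

    def run(self, code: str) -> None:
        # Names defined by earlier blocks remain visible to later ones.
        exec(code, self.namespace)


session = Session()
session.run("a = list(range(5))\nb = list(range(10))")  # first block
session.run("cubes = [x**3 for x in a]")                # second block reuses `a`
print(session.namespace["cubes"])  # [0, 1, 8, 27, 64]
```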

Second, I don't know how yet, but it would be cool if we could provide an external Python process to this filter, so that state is maintained between runs of pandoc with this filter. Consider this example:

# Example
Here is only one plot
```{.pyplot capture="a"}
import matplotlib.pyplot as plt

a = list(range(5)) # there will be much harder task in real-life example
plt.figure()
plt.plot(a, list(map(lambda x: x**3, a)))
plt.title('This is an example figure')
```
Here are two plots:
```{.pyplot needs="a"}
import matplotlib.pyplot as plt

b = list(range(10))

plt.figure()
plt.plot(a, list(map(lambda x: x**3, a)))
plt.plot(a, list(map(lambda x: x**2, b)))
plt.title('This is an example figure')
```

Then I convert it with pandoc:

$ pandoc --filter pandoc-pyplot -t html5 example.md

Now imagine I slightly change the code:

  Here are two plots:
  ```{.pyplot needs="a"}
  import matplotlib.pyplot as plt
  
  b = list(range(10))

  plt.figure()
- plt.plot(a, list(map(lambda x: x**3, a)))
+ plt.plot(a, list(map(lambda x: x**4, a)))
  plt.plot(a, list(map(lambda x: x**2, b)))
  plt.title('This is an example figure')
  ```

If I convert it again, the first code block has to be executed again as well, even though it did not change:

$ pandoc --filter pandoc-pyplot -t html5 example.md

Maybe we can borrow some ideas from here:

dir=`mktemp -d /tmp/temp.XXX`
keep_pipe_open=$dir/keep_pipe_open
pipe=$dir/pipe

mkfifo $pipe
touch $keep_pipe_open

# Read from pipe:
python3 < $pipe &

# Keep the pipe open:
while [ -f $keep_pipe_open ]; do sleep 1; done > $pipe &

# Write to pipe:
for ((i = 0 ; i < 10 ; i++)); do cat "file${i}.py"; done > $pipe

# close the pipe:
rm $keep_pipe_open
wait

rm -rf $dir
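
A different way to avoid re-running unchanged blocks, without keeping a process alive, is to cache the captured variables on disk. The sketch below is an alternative technique rather than the named-pipe approach above, and it rests on assumptions: the names listed in `capture` hold picklable values, and the cache key is simply a hash of the block's source. `run_cached` is a hypothetical helper.

```python
import hashlib
import os
import pickle
import tempfile


def run_cached(code: str, capture: list, cache_dir: str) -> dict:
    """Execute `code` once and reuse the captured variables on later runs."""
    key = hashlib.sha256(code.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".pickle")
    if os.path.exists(path):
        # Unchanged source: load the stored variables instead of executing.
        with open(path, "rb") as handle:
            return pickle.load(handle)
    namespace = {}
    exec(code, namespace)
    captured = {name: namespace[name] for name in capture}
    with open(path, "wb") as handle:
        pickle.dump(captured, handle)
    return captured


cache_dir = tempfile.mkdtemp()
first = run_cached("a = list(range(5))", ["a"], cache_dir)
second = run_cached("a = list(range(5))", ["a"], cache_dir)  # served from the cache
print(first == second, len(os.listdir(cache_dir)))  # True 1
```

A real implementation would also have to fold the values a block needs into the cache key, since the same source can produce different results with different inputs.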

[FEATURE REQUEST]: configuration and include scripts in YAML metadata block

Hello again :)

It is possible to read a document's metadata from within a filter, so it would be much more convenient to define this filter's configuration in the YAML metadata block of the Markdown document being processed. Moreover, pandoc supports providing this metadata in a separate file. Having yet another file just for this filter is inconvenient.

It would also be convenient to add the include scripts to this YAML metadata block.

So the final example would look as follows (with #6 implemented):

---
title: How to model Hypergeometric Distribution?
pyplot_interpreter: python3
pyplot_format: svg
pyplot_links: false
pyplot_includes:
  plotly: |
    ```
    import plotly as py
    import plotly.graph_objs as go
    ```
  hg: |
    ```
    import numpy as np
    import scipy as sp

    # HG implements hypergeometric distribution
    class HG(object):
        def __init__(self, N: int, m: int, n: int):
            self.N = N
            self.m = m
            self.n = n

        def p(self, k: int) -> float:
            return sp.special.comb(self.m, k) \
                   * sp.special.comb(self.N - self.m, self.n - k) \
                   / sp.special.comb(self.N, self.n)

        def __str__(self) -> str:
            return f'HG({self.N}, {self.m}, {self.n})'
    
    xi = HG(30, 15, 20)
    hist_data_x = np.arange(xi.n+1)
    ```
---

# Try plotly

<!-- In this code block I only need to include `plotly`,
as I do not use anything from `hg`: -->
```{.pyplot includes="plotly"}
fig = go.Figure(
    data=(go.Scatter(
        x=list(range(10)),
        y=list(map(lambda x: x**3, range(10))),
        mode='markers',
    ),)
)
```

# Histogram

Here is a histogram of $HG(30,15,20)$:
```{.pyplot includes="plotly,hg"}
hg_hist_fig = go.Figure(
    data=(go.Scatter(
        x=list(hist_data_x),
        y=list(map(xi.p, hist_data_x)),
        mode='markers',
    ),),
    layout=go.Layout(
        title=go.layout.Title(
            text=r'$\xi \sim ' + str(xi) + '$',
            x=.5,
        ),
        yaxis=go.layout.YAxis(title=go.layout.yaxis.Title(
            text=r'$\mathbb{P}(\xi=k)$',
        )),
        xaxis=go.layout.XAxis(title=go.layout.xaxis.Title(
            text=r'$k$',
        )),
    ),
)
```
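
To illustrate how named snippets from the metadata could be turned into a preamble, here is a small Python sketch. It assumes the YAML has already been parsed into a dictionary and that each snippet may be wrapped in fence lines as in the example above; `strip_fence` and `resolve_includes` are hypothetical helpers, and `pyplot_includes` is the key proposed in this issue, not an existing option.

```python
FENCE = "`" * 3  # the fence marker wrapping each snippet in the metadata


def strip_fence(snippet: str) -> str:
    """Drop a leading and a trailing fence line, if present."""
    lines = snippet.strip().splitlines()
    if lines and lines[0].strip().startswith(FENCE):
        lines = lines[1:]
    if lines and lines[-1].strip().startswith(FENCE):
        lines = lines[:-1]
    return "\n".join(lines)


def resolve_includes(metadata: dict, includes_attr: str) -> str:
    """Concatenate the named snippets from `pyplot_includes` in the order given."""
    snippets = metadata.get("pyplot_includes", {})
    names = [n.strip() for n in includes_attr.split(",") if n.strip()]
    missing = [n for n in names if n not in snippets]
    if missing:
        raise KeyError("unknown include(s): " + ", ".join(missing))
    return "\n".join(strip_fence(snippets[n]) for n in names)


# Already-parsed metadata with two made-up snippets.
metadata = {
    "pyplot_includes": {
        "setup": FENCE + "\nvalues = [1, 2, 3]\n" + FENCE + "\n",
        "stats": FENCE + "\ntotal = sum(values)\n" + FENCE + "\n",
    }
}
preamble = resolve_includes(metadata, "setup,stats")
namespace = {}
exec(preamble, namespace)
print(namespace["total"])  # 6
```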

Also, it would be very nice to be able to include scripts that already appear in the document, referencing them by name without executing them separately. Consider this example:

# Modelling

Let's firstly define class `HG` which would contain information about
parameters $N$, $m$ and $n$ and method `p(k)` for calculating
the probability of event when $\xi = k$:
```{.python .pyplot .noExec name="class"}
import scipy as sp

class HG(object):
    def __init__(self, N: int, m: int, n: int):
        self.N = N
        self.m = m
        self.n = n

    def p(self, k: int) -> float:
        return sp.special.comb(self.m, k) \
               * sp.special.comb(self.N - self.m, self.n - k) \
               / sp.special.comb(self.N, self.n)

    def __str__(self) -> str:
        return f'HG({self.N}, {self.m}, {self.n})'
```
Then create object of random variable $\xi \sim HG(30, 15, 20)$:
```{.python .pyplot .noExec name="xi"}
xi = HG(30, 15, 20)
```
Next step is to define interval $\overline{0, n}$ for $k$ where we will draw our histogram:
```{.python .pyplot .noExec name="data"}
import numpy as np

hist_data_x = np.arange(xi.n+1)
```
Finally, draw the plot:
```{.python .pyplot .noExec name="plot"}
import plotly
import plotly.graph_objs as go
hg_hist_fig = go.Figure(
    data=(go.Scatter(
        x=list(hist_data_x),
        y=list(map(xi.p, hist_data_x)),
        mode='markers',
    ),),
    layout=go.Layout(
        title=go.layout.Title(
            text=r'$\xi \sim ' + str(xi) + '$',
            x=.5,
        ),
        yaxis=go.layout.YAxis(title=go.layout.yaxis.Title(
            text=r'$\mathbb{P}(\xi=k)$',
        )),
        xaxis=go.layout.XAxis(title=go.layout.xaxis.Title(
            text=r'$k$',
        )),
    ),
)
```
The result would be following:
```{.pyplot include="class,xi,data,plot"}
# I do not have to copy-paste anything here.
```
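
A sketch of how named blocks might be collected and composed, written in Python against pandoc's JSON representation of code blocks (`{"t": "CodeBlock", "c": [[id, classes, key-value pairs], code]}`). Only top-level blocks are inspected, and the `name` attribute and the helper names are the hypothetical ones from this proposal.

```python
def collect_named_blocks(blocks: list) -> dict:
    """Map each `name` attribute to the source of its CodeBlock."""
    named = {}
    for block in blocks:
        if block.get("t") != "CodeBlock":
            continue
        (_identifier, _classes, keyvals), code = block["c"]
        attributes = dict(keyvals)
        if "name" in attributes:
            named[attributes["name"]] = code
    return named


def compose(named: dict, include_attr: str) -> str:
    """Join the named blocks listed in a comma-separated attribute."""
    names = [n.strip() for n in include_attr.split(",") if n.strip()]
    return "\n".join(named[n] for n in names)


# A tiny hand-written document body with two named blocks.
blocks = [
    {"t": "CodeBlock",
     "c": [["", ["python", "pyplot", "noExec"], [["name", "xi"]]], "n = 20"]},
    {"t": "Para", "c": []},
    {"t": "CodeBlock",
     "c": [["", ["python", "pyplot", "noExec"], [["name", "data"]]],
           "xs = list(range(n + 1))"]},
]
script = compose(collect_named_blocks(blocks), "xi,data")
namespace = {}
exec(script, namespace)
print(len(namespace["xs"]))  # 21
```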

It would be awesome if you can implement this!

An error breaks the installation of pandoc-pyplot

When I install pandoc-pyplot with cabal install, I get an error:
[1 of 3] Compiling Text.Pandoc.Filter.Scripting ( src/Text/Pandoc/Filter/Scripting.hs, dist/build/Text/Pandoc/Filter/Scripting.o )

src/Text/Pandoc/Filter/Scripting.hs:45:50: error:
    * Variable not in scope:
        (<>) :: m0 ExitCode -> String -> IO ExitCode
    * Perhaps you meant one of these:
        `</>' (imported from System.FilePath), `<$>' (imported from Prelude), `*>' (imported from Prelude)
      Perhaps you want to add `<>' to the import list in the import of
      `Data.Monoid' (src/Text/Pandoc/Filter/Scripting.hs:27:1-37).

src/Text/Pandoc/Filter/Scripting.hs:57:32: error:
    * Variable not in scope: (<>) :: [Char] -> String -> t0
    * Perhaps you meant one of these:
        `</>' (imported from System.FilePath), `<$>' (imported from Prelude), `*>' (imported from Prelude)
      Perhaps you want to add `<>' to the import list in the import of
      `Data.Monoid' (src/Text/Pandoc/Filter/Scripting.hs:27:1-37).

src/Text/Pandoc/Filter/Scripting.hs:57:46: error:
    * Variable not in scope: (<>) :: t0 -> [Char] -> PythonScript
    * Perhaps you meant one of these:
        `</>' (imported from System.FilePath), `<$>' (imported from Prelude), `*>' (imported from Prelude)
      Perhaps you want to add `<>' to the import list in the import of
      `Data.Monoid' (src/Text/Pandoc/Filter/Scripting.hs:27:1-37).
cabal: Leaving directory '.'
cabal: Error: some packages failed to install:
pandoc-pyplot-1.0.2.0 failed during the building phase. The exception was:
ExitFailure 1

Environment: Ubuntu 18.04, gcc-7, ghc-8.0.2, cabal-install version 1.24.0.2.

Problem with pandoc 2.8

I'm getting the following error when using pandoc-pyplot as a filter:

pandoc-pyplot: Error in $: Incompatible API versions: encoded with [1,20] but attempted to decode with [1,17,5,4].
CallStack (from HasCallStack):
  error, called at .\Text\Pandoc\JSON.hs:111:48 in pandoc-types-1.17.5.4-1YIMZftZkkF55zFQfWBdlu:Text.Pandoc.JSON
Error running filter pandoc-pyplot:
Filter returned error status 1
make: *** [Makefile:89: result.pdf] Error 83

pandoc-pyplot: 2.2.0.0
Pandoc version: 2.8
System: Windows 10 Pro x64, Polish language
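
The message indicates that the JSON produced by pandoc 2.8 declares API version 1.20, while this build of the filter was compiled against pandoc-types 1.17.5.4. The Python sketch below shows how that mismatch can be checked from the `pandoc-api-version` field of the JSON document. The compatibility rule used here (the first two components must match) is an assumption made for illustration, and both helper names are hypothetical.

```python
import json


def api_version(document_json: str) -> tuple:
    """Read the `pandoc-api-version` field of a pandoc JSON document."""
    return tuple(json.loads(document_json)["pandoc-api-version"])


def is_compatible(encoded: tuple, supported: tuple) -> bool:
    """Assumption: versions are compatible when their first two components match."""
    return encoded[:2] == supported[:2]


# Minimal document as emitted by a pandoc with API version 1.20.
document_json = '{"pandoc-api-version": [1, 20], "meta": {}, "blocks": []}'
encoded = api_version(document_json)
print(encoded, is_compatible(encoded, (1, 17, 5, 4)))  # (1, 20) False
```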

Limited to matplotlib

Hello!

Let me first thank you for such an amazing filter! I was looking for exactly this kind of functionality and ran into this repo.

The only disadvantage of your approach is that with this filter I am limited to matplotlib and can't use any other library, such as Plotly. Matplotlib is great, but it lacks some features, and it would be awesome to add support for other libraries too.

Instead of adding support for each library individually, I suggest implementing more general functionality (I can't make a PR as I do not know Haskell yet): an exporter argument, which should be a function that takes a filename and does whatever is needed to save the figure. For matplotlib it would look something like the following:

```{.pyplot type="svg" exporter="lambda filename: plt.savefig(filename)"}
plt.figure()
plt.plot([0,1,2,3,4], [1,2,3,4,5])
plt.title('This is an example figure')
```

This approach allows you to use any library and any format that you want:

```{.pyplot type="html" exporter="lambda filename: py.offline.plot(hg_hist_fig, filename=filename)"}
import plotly as py
import plotly.graph_objs as go
import scipy as sp
import scipy.stats  # make sure sp.stats is loaded

# Hypergeometric distribution HG(80, 15, 20) and the integer range to plot
hg = sp.stats.hypergeom(80, 15, 20)
hg_hist_x = range(*map(int, sp.stats.hypergeom.interval(1, 80, 15, 20)))
hg_hist_fig = go.Figure(
    data=(go.Scatter(
        name='histogram',
        x=list(hg_hist_x),
        y=list(map(hg.pmf, hg_hist_x)),
        mode='markers',
    ),),
    layout=go.Layout(
        title=go.layout.Title(
            text=r'$$\xi \sim HG(80, 15, 20)$$',
            x=.5,
        ),
        yaxis=go.layout.YAxis(
            title=go.layout.yaxis.Title(
                text=r'$$\mathbb{P}(\xi=k)$$',
            ),
        ),
        xaxis=go.layout.XAxis(
            title=go.layout.xaxis.Title(
                text=r'$$k$$',
            ),
        ),
    ),
)
```

In this example we embed HTML in the final document (this may be useful for converting your Markdown to HTML and then to PDF).
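
For illustration, here is one way the proposed exporter attribute could be applied after a block has run. This is a hedged Python sketch, not the filter's behaviour: `run_with_exporter` is a hypothetical helper, and the example block writes a placeholder string instead of a real figure so that it runs without any plotting library.

```python
import os
import tempfile
from pathlib import Path


def run_with_exporter(code: str, exporter_source: str, filename: str) -> None:
    """Run a code block, then call the exporter expression with the target file."""
    namespace = {}
    exec(code, namespace)
    # Evaluate the lambda in the same namespace so it can see the block's names.
    exporter = eval(exporter_source, namespace)
    exporter(filename)


# Stand-in block and exporter (both hypothetical).
code = "from pathlib import Path\ncontent = '<p>figure placeholder</p>'"
exporter_source = "lambda filename: Path(filename).write_text(content)"
target = os.path.join(tempfile.mkdtemp(), "figure.html")
run_with_exporter(code, exporter_source, target)
print(Path(target).read_text())  # <p>figure placeholder</p>
```

Evaluating user-supplied expressions this way is no riskier than running the block itself, since the filter already executes arbitrary code from the document.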

P.S.: Take a look at this repo; it may contain some great ideas as well.

Support for interactive Plotly figures when possible

Based on a comment in #4:

Pandoc filters only work on the intermediate document representation.

As I see here, getting the output format is possible (I am not sure, because I don't know Haskell).

The idea is to support interactive Plotly plots (via HTML), but fall back to static images when the output format is not HTML (e.g. PDF).

From Pandoc toJSONFilter documentation:

An alternative is to use the type Maybe Format -> a -> a. This is appropriate when the first argument of the script (if present) will be the target format, and allows scripts to behave differently depending on the target format. The pandoc executable automatically provides the target format as argument when scripts are called using the --filter option.
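
The same mechanism is visible to JSON filters written in any language, because pandoc passes the target format as the first command-line argument. A minimal Python sketch of the fallback decision follows; the set of formats treated as HTML and the returned labels are assumptions made for this example.

```python
# Output formats treated as HTML for this sketch (an illustrative subset).
HTML_FORMATS = {"html", "html4", "html5", "revealjs"}


def choose_figure_kind(argv: list) -> str:
    """Pick interactive output for HTML targets and a static image otherwise."""
    # A real filter would pass sys.argv; argv[1] is the target format if present.
    target = argv[1] if len(argv) > 1 else None
    return "interactive-html" if target in HTML_FORMATS else "static-image"


print(choose_figure_kind(["filter.py", "html5"]))  # interactive-html
print(choose_figure_kind(["filter.py", "latex"]))  # static-image
print(choose_figure_kind(["filter.py"]))           # static-image
```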
