m-rossi / jupyter-docx-bundler Goto Github PK


Jupyter bundler extension to export notebook as a docx file

License: MIT License

Python 100.00%

jupyter-notebook jupyter jupyter-notebook-extension docx jupyterlab

jupyter-docx-bundler's Introduction

Jupyter docx bundler extension

Installation

Using conda

conda install -c conda-forge jupyter-docx-bundler

Using pip

Make sure you have Pandoc installed, see installing-pandoc for instructions.

pip install jupyter-docx-bundler

Usage

Adding Metadata

The bundler extension uses the notebook's metadata, if you provide it.

"title": "Notebook title"
"authors": [{"name": "author1"}, {"name": "author2"}]
"subtitle": "Notebook subtitle"
"date": "Notebook date"

The notebook metadata can be edited under Edit -> Edit Notebook Metadata.
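
The metadata keys listed above can also be written from a script instead of the notebook UI. The following is a minimal sketch that treats the notebook as plain JSON and uses only the standard library; the helper names `set_docx_metadata` and `update_notebook_file` are illustrative and not part of the bundler.

```python
import json
from pathlib import Path


def set_docx_metadata(notebook, title=None, authors=None, subtitle=None, date=None):
    """Fill in the metadata keys the bundler reads on a parsed notebook dict.

    Only fields that are passed (not None) are written; any other
    metadata already present in the notebook is left untouched.
    """
    metadata = notebook.setdefault("metadata", {})
    if title is not None:
        metadata["title"] = title
    if authors is not None:
        # The README shows authors as a list of {"name": ...} objects.
        metadata["authors"] = [{"name": name} for name in authors]
    if subtitle is not None:
        metadata["subtitle"] = subtitle
    if date is not None:
        metadata["date"] = date
    return notebook


def update_notebook_file(path, **fields):
    """Load an .ipynb file, set the metadata fields, and write it back."""
    path = Path(path)
    notebook = json.loads(path.read_text(encoding="utf-8"))
    set_docx_metadata(notebook, **fields)
    path.write_text(json.dumps(notebook, indent=1), encoding="utf-8")
```

For example, `update_notebook_file("report.ipynb", title="Notebook title", authors=["author1", "author2"])` would set the title and authors of a (hypothetical) `report.ipynb` before exporting it.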

Hiding inputs or complete code cells

You can hide individual code cells or just their inputs by defining cell tags:

nbconvert-remove-cell: Remove the entire cell
nbconvert-remove-input: Remove the input code of the cell

(Currently there are no default values configured for these tags; the ones listed above are defined in my code and not in nbconvert. This may change in the future.)

Cell tags can be shown by activating the cell toolbar under View -> Cell Toolbar -> Tags.
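
Tags are stored in each cell's metadata under the `tags` key, so they can also be set in bulk from a script. The sketch below works on the parsed notebook JSON; the helper names are illustrative, and tagging every code cell is just one possible way to avoid annotating long notebooks by hand.

```python
REMOVE_CELL = "nbconvert-remove-cell"
REMOVE_INPUT = "nbconvert-remove-input"


def add_tag(cell, tag):
    """Add a tag to a single cell dict, creating the tag list if needed."""
    tags = cell.setdefault("metadata", {}).setdefault("tags", [])
    if tag not in tags:
        tags.append(tag)
    return cell


def tag_code_cells(notebook, tag=REMOVE_INPUT):
    """Tag every code cell of a parsed notebook; return how many were tagged."""
    count = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            add_tag(cell, tag)
            count += 1
    return count
```

Markdown cells are skipped on purpose, since the tags above only make sense for cells that have an input to hide.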

Hiding all inputs

It is also possible to hide all inputs. To achieve this, add the following lines to your notebook metadata:

{
    "jupyter-docx-bundler": {
        "exclude_input": "True"
    }
}

The notebook metadata can be edited under Edit -> Edit Notebook Metadata.

Direct call from console (nbconvert)

To use the bundler directly from the console, use the nbconvert utility with the target format docx:

jupyter nbconvert --execute --to=docx <source notebook>.ipynb --output <target document>.docx

The --execute option should be used to ensure that the notebook is run before generation.
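
The same call can be scripted, for example to convert several notebooks in a row. This is a minimal sketch that builds the command shown above and runs it with `subprocess`; the function names are illustrative, and it assumes `jupyter` and the docx exporter are installed in the Python environment that runs the script.

```python
import subprocess
import sys


def build_nbconvert_command(source, target, execute=True):
    """Build the nbconvert command line for a docx export as a list of arguments."""
    command = [sys.executable, "-m", "jupyter", "nbconvert"]
    if execute:
        # Run the notebook first so the export contains fresh outputs.
        command.append("--execute")
    command += ["--to=docx", str(source), "--output", str(target)]
    return command


def convert(source, target, execute=True):
    """Run the export; raises CalledProcessError if nbconvert fails."""
    subprocess.run(build_nbconvert_command(source, target, execute), check=True)
```

Passing the arguments as a list avoids shell quoting problems with notebook names that contain spaces.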

Development

See CONTRIBUTING


jupyter-docx-bundler's Issues

Add pandoc 2.11 to CI-testing

Pandoc version 2.11 just got released: https://github.com/jgm/pandoc/releases/tag/2.11. We should add this to CI when it's available on conda-forge.

Use pytest-azurepipelines to simplify CI-commands

Use the great plugin https://github.com/tonybaloney/pytest-azurepipelines to perform testing and uploading of test results.

Tests fail intermittently

The test remove_all_inputs_notebook fails intermittently. I made a table listing the versions used:

OS	Python	Pandoc	pipeline fails
Windows	3.8.5	2.8	2
Windows	3.8.5	2.10	4
Windows	3.8.5	2.11	1
Windows	3.9.0	2.7	1
Windows	3.9.0	2.8	1
Windows	3.9.0	2.9	1
Windows	3.9.0	2.11	1

Remove the default 'Notebook' title if there is no title in the metadata

Hi, I was wondering why the output of this gives me a 'Notebook' title when the notebook metadata has no title. Is this intentional default behaviour?

Jupyterlab bundler extension

I'm unable to find any information on how to install a bundler extension into JupyterLab. Maybe someone can help?

Extension of *.docx file is missing on Windows

When exporting a notebook through Jupyter Notebook (not Lab!) the extension of the file is missing.

Modernize packaging

See https://packaging.python.org/en/latest/tutorials/packaging-projects

Crash in setup of all fixtures

Describe the bug
Crash in setup of all fixtures:

_______ ERROR at setup of test_pandas_html_table[multirow-multicolumn] ________

request = <FixtureRequest for <Function test_pandas_html_table[multirow-multicolumn]>>

    def fill(request):
        item = request._pyfuncitem
        fixturenames = getattr(item, "fixturenames", None)
        if fixturenames is None:
            fixturenames = request.fixturenames

        if hasattr(item, 'callspec'):
            for param, val in sorted_by_dependency(item.callspec.params, fixturenames):
                if val is not None and is_lazy_fixture(val):
                    item.callspec.params[param] = request.getfixturevalue(val.name)
                elif param not in item.funcargs:
>                   item.funcargs[param] = request.getfixturevalue(param)

C:\Miniconda\envs\test\lib\site-packages\pytest_lazyfixture.py:37: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
jupyter_docx_bundler\tests\conftest.py:480: in pandas_html_table_notebook
    ep.preprocess(nb, {'metadata': {'path': tmpdir}})
C:\Miniconda\envs\test\lib\site-packages\nbconvert\preprocessors\execute.py:79: in preprocess
    self.execute()
C:\Miniconda\envs\test\lib\site-packages\nbclient\util.py:74: in wrapped
    return just_run(coro(*args, **kwargs))
C:\Miniconda\envs\test\lib\site-packages\nbclient\util.py:53: in just_run
    return loop.run_until_complete(coro)
C:\Miniconda\envs\test\lib\asyncio\base_events.py:616: in run_until_complete
    return future.result()
C:\Miniconda\envs\test\lib\site-packages\nbclient\client.py:537: in async_execute
    async with self.async_setup_kernel(**kwargs):
C:\Miniconda\envs\test\lib\site-packages\async_generator\_util.py:34: in __aenter__
    return await self._agen.asend(None)
C:\Miniconda\envs\test\lib\site-packages\nbclient\client.py:495: in async_setup_kernel
    await self.async_start_new_kernel(**kwargs)
C:\Miniconda\envs\test\lib\site-packages\nbclient\client.py:407: in async_start_new_kernel
    await ensure_async(self.km.start_kernel(extra_arguments=self.extra_arguments, **kwargs))
C:\Miniconda\envs\test\lib\site-packages\nbclient\util.py:85: in ensure_async
    result = await obj
C:\Miniconda\envs\test\lib\site-packages\jupyter_client\manager.py:575: in start_kernel
    self.kernel = await self._launch_kernel(kernel_cmd, **kw)
C:\Miniconda\envs\test\lib\site-packages\jupyter_client\manager.py:556: in _launch_kernel
    res = launch_kernel(kernel_cmd, **kw)
C:\Miniconda\envs\test\lib\site-packages\jupyter_client\launcher.py:134: in launch_kernel
    proc = Popen(cmd, **kwargs)
C:\Miniconda\envs\test\lib\subprocess.py:854: in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <subprocess.Popen object at 0x0000020BC508F820>
args = 'C:/Miniconda/envs/test/bin/python -m ipykernel_launcher -f C:\\Users\\runneradmin\\AppData\\Local\\Temp\\tmpvvrm9jyc.json --HistoryManager.hist_file=:memory:'
executable = None, preexec_fn = None, close_fds = False, pass_fds = ()
cwd = 'C:\\Users\\runneradmin\\AppData\\Local\\Temp\\pytest-of-runneradmin\\pytest-0\\test_pandas_html_table_multiro1'
env = {'ACLOCAL_PATH': 'C:\\Program Files\\Git\\mingw64\\share\\aclocal;C:\\Program Files\\Git\\usr\\share\\aclocal', 'ALLUS...gramData', 'ANDROID_HOME': 'C:\\Android\\android-sdk', 'ANDROID_NDK_HOME': 'C:\\Android\\android-sdk\\ndk-bundle', ...}
startupinfo = <subprocess.STARTUPINFO object at 0x0000020BC508FFD0>
creationflags = 0, shell = False, p2cread = Handle(2912), p2cwrite = 13
c2pread = -1, c2pwrite = Handle(2896), errread = -1, errwrite = Handle(2916)
unused_restore_signals = True, unused_start_new_session = False

    def _execute_child(self, args, executable, preexec_fn, close_fds,
                       pass_fds, cwd, env,
                       startupinfo, creationflags, shell,
                       p2cread, p2cwrite,
                       c2pread, c2pwrite,
                       errread, errwrite,
                       unused_restore_signals, unused_start_new_session):
        """Execute program (MS Windows version)"""

        assert not pass_fds, "pass_fds not supported on Windows."

        if isinstance(args, str):
            pass
        elif isinstance(args, bytes):
            if shell:
                raise TypeError('bytes args is not allowed on Windows')
            args = list2cmdline([args])
        elif isinstance(args, os.PathLike):
            if shell:
                raise TypeError('path-like args is not allowed when '
                                'shell is true')
            args = list2cmdline([args])
        else:
            args = list2cmdline(args)

        if executable is not None:
            executable = os.fsdecode(executable)

        # Process startup details
        if startupinfo is None:
            startupinfo = STARTUPINFO()
        else:
            # bpo-34044: Copy STARTUPINFO since it is modified above,
            # so the caller can reuse it multiple times.
            startupinfo = startupinfo.copy()

        use_std_handles = -1 not in (p2cread, c2pwrite, errwrite)
        if use_std_handles:
            startupinfo.dwFlags |= _winapi.STARTF_USESTDHANDLES
            startupinfo.hStdInput = p2cread
            startupinfo.hStdOutput = c2pwrite
            startupinfo.hStdError = errwrite

        attribute_list = startupinfo.lpAttributeList
        have_handle_list = bool(attribute_list and
                                "handle_list" in attribute_list and
                                attribute_list["handle_list"])

        # If we were given an handle_list or need to create one
        if have_handle_list or (use_std_handles and close_fds):
            if attribute_list is None:
                attribute_list = startupinfo.lpAttributeList = {}
            handle_list = attribute_list["handle_list"] = \
                list(attribute_list.get("handle_list", []))

            if use_std_handles:
                handle_list += [int(p2cread), int(c2pwrite), int(errwrite)]

            handle_list[:] = self._filter_handle_list(handle_list)

            if handle_list:
                if not close_fds:
                    warnings.warn("startupinfo.lpAttributeList['handle_list'] "
                                  "overriding close_fds", RuntimeWarning)

                # When using the handle_list we always request to inherit
                # handles but the only handles that will be inherited are
                # the ones in the handle_list
                close_fds = False

        if shell:
            startupinfo.dwFlags |= _winapi.STARTF_USESHOWWINDOW
            startupinfo.wShowWindow = _winapi.SW_HIDE
            comspec = os.environ.get("COMSPEC", "cmd.exe")
            args = '{} /c "{}"'.format (comspec, args)

        if cwd is not None:
            cwd = os.fsdecode(cwd)

        sys.audit("subprocess.Popen", executable, args, cwd, env)

        # Start the process
        try:
>           hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
                                     # no special security
                                     None, None,
                                     int(not close_fds),
                                     creationflags,
                                     env,
                                     cwd,
                                     startupinfo)
E                                    FileNotFoundError: [WinError 2] The system cannot find the file specified

C:\Miniconda\envs\test\lib\subprocess.py:1307: FileNotFoundError

Desktop (please complete the following information):

OS: Windows
Python version: 3.7, 3.9
Pandoc version: 2.11

CI-runs:

Re-enable testing of matplotlib on macOS when matplotlib 3.1 is released

Notebook model['name'] is different on Unix and Windows

It seems like the field model['name'] is filled differently on Unix and Windows:
https://github.com/m-rossi/jupyter_docx_bundler/blob/85b312f4ac1bc152be814e652423571af95fab9c/jupyter_docx_bundler/__init__.py#L34

On Windows it also contains the path to the file.
On Unix it contains only the filename.

This can cause an issue on Windows if the file to be bundled is in another folder.
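
One possible way to get the same result on both platforms is to reduce the value to its last path component before using it. The sketch below is not the fix used in the project; it relies on `pathlib.PureWindowsPath`, which splits on both `\` and `/`, and the function names are illustrative.

```python
from pathlib import PureWindowsPath


def notebook_filename(model_name):
    """Return only the file name, whether or not a path is included.

    PureWindowsPath accepts both backslashes and forward slashes as
    separators, so it handles Windows-style and Unix-style values.
    """
    return PureWindowsPath(model_name).name


def docx_filename(model_name):
    """Return the name of the exported file, with a .docx extension."""
    return PureWindowsPath(model_name).with_suffix(".docx").name
```

A caveat of this approach: a Unix file name that legitimately contains a backslash would be split as well.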

HTML-tables in output cells are broken in export

If you create a notebook with an HTML table in the outputs, the new pandoc-only method ignores the HTML tags and just uses the text representation.

Add support for reference.docx with pandoc

When using pandoc to convert to docx, we can use a template reference.docx that contains the formatting, for example the fonts to be used.

It would be nice if this would work with Jupyter-docx-bundler.

Cell input removal broken with nbconvert 7

Describe the bug
When using nbconvert 7 the cell input removal method is broken.

Expected behavior
Removal of input with nbconvert 7

Add conda-forge recipe

When I release version 0.2.0 of the package to pypi I will also make a conda-forge recipe to make the package more broadly available.

No export for multiple (matplotlib)-images in output cell

The following code creates 5 images in a cell; however, only the first one gets exported.

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

for ii in np.arange(5):
    plt.figure()
    plt.plot(np.arange(100)**2)
    plt.show()
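
In the notebook JSON, each `plt.show()` call produces its own entry in the cell's `outputs` list, so an exporter has to walk the whole list rather than stop at the first image. The following sketch collects every image output of a code cell; it only illustrates the data layout and is not the converter's actual code.

```python
def collect_images(cell, mime_types=("image/png", "image/jpeg", "image/svg+xml")):
    """Return a list of (mime_type, payload) pairs, one per image output.

    `cell` is a code cell from the parsed notebook JSON. For each output
    the first matching MIME type in `mime_types` is used.
    """
    images = []
    for output in cell.get("outputs", []):
        # Only rich outputs carry a "data" bundle; streams and errors do not.
        if output.get("output_type") not in ("display_data", "execute_result"):
            continue
        data = output.get("data", {})
        for mime in mime_types:
            if mime in data:
                images.append((mime, data[mime]))
                break
    return images
```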

Development environment

I can't set up the development environment; the pandoc requirement cannot be satisfied:

ERROR: Could not find a version that satisfies the requirement pandoc>=2.7 (from -r requirements.txt (line 3)) (from versions: 1.0.0a0, 1.0.0a3, 1.0.0a6, 1.0.0a7, 1.0.0a8, 1.0.0a12, 1.0.0a13, 1.0.0a14, 1.0.0a16, 1.0.0a17, 1.0.0a18, 1.0.0a19, 1.0.0b1, 1.0.0b2, 1.0.0b3, 1.0.0, 1.0.1, 1.0.2, 2.0a1, 2.0a2, 2.0a3, 2.0a4)
ERROR: No matching distribution found for pandoc>=2.7 (from -r requirements.txt (line 3))
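
The versions pip offers here (1.0.x, 2.0a) do not match Pandoc's own version numbers, so the requirement probably refers to the Pandoc executable rather than to a package pip can install. Assuming that is the case, a small check like the one below can confirm whether a suitable `pandoc` binary is on the PATH; the function names are illustrative.

```python
import re
import shutil
import subprocess


def parse_pandoc_version(version_output):
    """Parse the first line of `pandoc --version` into a tuple of ints."""
    match = re.match(r"pandoc(?:\.exe)?\s+(\d+(?:\.\d+)*)", version_output)
    if match is None:
        raise ValueError("unrecognized pandoc version output")
    return tuple(int(part) for part in match.group(1).split("."))


def installed_pandoc_version():
    """Return the version of the pandoc binary on PATH, or None if absent."""
    executable = shutil.which("pandoc")
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True, check=True
    )
    return parse_pandoc_version(result.stdout)
```

Version tuples compare element by element, so `installed_pandoc_version() >= (2, 7)` expresses the `>=2.7` requirement once the `None` case has been handled.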

Export fails with complex table heading

Describe the bug
If I try to export a table with a rather complex table heading, for example one created by pint_pandas, I get an error.

To reproduce
I have a notebook with the following content

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(6, 4), columns=[list("1122"), list("ABCD")])
df.index.name = "custom-index"
df.columns.names = (None, "unit")

An export fails with the following error

[Application] ERROR | nbconvert failed: Must pass non-zero number of levels/codes
Traceback (most recent call last):
  File "C:\minforge\lib\site-packages\jupyter_server\nbconvert\handlers.py", line 131, in get
    output, resources = await run_sync(
  File "C:\minforge\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\minforge\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "C:\minforge\lib\asyncio\futures.py", line 284, in __await__
    yield self  # This tells Task to wait for completion.
  File "C:\minforge\lib\asyncio\tasks.py", line 304, in __wakeup
    future.result()
  File "C:\minforge\lib\asyncio\futures.py", line 201, in result
    raise self._exception
  File "C:\minforge\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "C:\minforge\lib\site-packages\jupyter_server\nbconvert\handlers.py", line 132, in <lambda>
    lambda: exporter.from_notebook_node(nb, resources=resource_dict)
  File "C:\minforge\lib\site-packages\jupyter_docx_bundler\__init__.py", line 79, in from_notebook_node
    converters.notebookcontent_to_docxbytes(
  File "C:\minforge\lib\site-packages\jupyter_docx_bundler\converters.py", line 262, in notebookcontent_to_docxbytes
    content = preprocess(content, path, handler=handler)
  File "C:\minforge\lib\site-packages\jupyter_docx_bundler\converters.py", line 193, in preprocess
    raise e
  File "C:\minforge\lib\site-packages\jupyter_docx_bundler\converters.py", line 185, in preprocess
    html_to_pandas_table(output['data']['text/html']).to_markdown(),
  File "C:\minforge\lib\site-packages\jupyter_docx_bundler\converters.py", line 94, in html_to_pandas_table
    df.set_index(index_column_names, inplace=True)
  File "C:\minforge\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "C:\minforge\lib\site-packages\pandas\core\frame.py", line 5555, in set_index
    index = ensure_index_from_sequences(arrays, names)
  File "C:\minforge\lib\site-packages\pandas\core\indexes\base.py", line 6984, in ensure_index_from_sequences
    return MultiIndex.from_arrays(sequences, names=names)
  File "C:\minforge\lib\site-packages\pandas\core\indexes\multi.py", line 493, in from_arrays
    return cls(
  File "C:\minforge\lib\site-packages\pandas\core\indexes\multi.py", line 321, in __new__
    raise ValueError("Must pass non-zero number of levels/codes")
ValueError: Must pass non-zero number of levels/codes

Desktop (please complete the following information):

OS: Windows 10
Browser: Firefox
Pandoc version: 2.13

Update Codecov integration

https://about.codecov.io/blog/codecov-is-updating-its-github-integration

Use GitHub Actions instead of pipelines.

Build source-distribution and wheel with GitHub actions

Use "title" instead of "name" for notebook title to match notebook format

The notebook format has a metadata field for the "title" of the notebook: https://github.com/jupyter/nbformat/blob/master/nbformat/v4/nbformat.v4.schema.json#L63
I think it should be used instead of the field "name".

Export shows the input cell code

Describe the bug
I have just updated to nbconvert 7.2.6. I ran my export as usual:
python3 -m jupyter nbconvert --execute --to=docx alpha.ipynb --output alpha.docx

The problem is that now all the code cells are showing in the word document.
The metadata is correctly telling it not to do that.

Desktop (please complete the following information):

OS: macOS 12.5.1
Browser: Chrome and Safari
Versions of Python packages: all updated today (Dec 13, 2022); see the pip list output below
Pandoc version: 2.17.1.1

pip list:
Package Version

anyio 3.6.2
appdirs 1.4.4
appnope 0.1.3
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
arrow 1.2.3
asttokens 2.2.1
attrs 22.1.0
autopep8 2.0.0
Babel 2.11.0
backcall 0.2.0
beautifulsoup4 4.11.1
black 22.12.0
bleach 5.0.1
bokeh 3.0.3
certifi 2022.12.7
cffi 1.15.1
charset-normalizer 2.1.1
click 8.1.3
colorlover 0.3.0
comm 0.1.2
contourpy 1.0.6
cufflinks 0.17.3
cycler 0.11.0
debugpy 1.6.4
decorator 5.1.1
defusedxml 0.7.1
entrypoints 0.4
executing 1.2.0
fastjsonschema 2.16.2
fonttools 4.38.0
fqdn 1.5.1
idna 3.4
import-ipynb 0.1.4
importlib-metadata 5.1.0
ipykernel 6.19.2
ipython 8.7.0
ipython-genutils 0.2.0
ipywidgets 8.0.3
isoduration 20.11.0
jedi 0.18.2
Jinja2 3.1.2
joblib 1.2.0
json5 0.9.10
jsonpointer 2.3
jsonschema 4.17.3
jupyter 1.0.0
jupyter_client 7.4.8
jupyter-console 6.4.4
jupyter-contrib-core 0.4.2
jupyter-contrib-nbextensions 0.7.0
jupyter_core 5.1.0
jupyter-docx-bundler 0.3.4
jupyter-events 0.5.0
jupyter-highlight-selected-word 0.2.0
jupyter-latex-envs 1.4.6
jupyter-nbextensions-configurator 0.6.1
jupyter_server 2.0.1
jupyter_server_terminals 0.4.2
jupyterlab 3.5.1
jupyterlab-pygments 0.2.2
jupyterlab_server 2.16.5
jupyterlab-widgets 3.0.4
kaleido 0.2.1
keras 2.11.0
kiwisolver 1.4.4
lxml 4.9.1
MarkupSafe 2.1.1
matplotlib 3.6.2
matplotlib-inline 0.1.6
mistune 2.0.4
mlxtend 0.21.0
mypy-extensions 0.4.3
nbclassic 0.4.8
nbclient 0.7.2
nbconvert 7.2.6
nbformat 5.7.0
nest-asyncio 1.5.6
notebook 6.5.2
notebook_shim 0.2.2
numpy 1.23.5
ordered-set 4.1.0
packaging 22.0
pandas 1.5.2
pandoc 2.3
pandocfilters 1.5.0
parso 0.8.3
pathspec 0.10.3
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.3.0
pip 22.3.1
platformdirs 2.6.0
plotly 5.11.0
plumbum 1.8.0
ply 3.11
prometheus-client 0.15.0
prompt-toolkit 3.0.36
psutil 5.9.4
ptyprocess 0.7.0
pure-eval 0.2.2
pycodestyle 2.10.0
pycparser 2.21
pyee 9.0.4
Pygments 2.13.0
PyLaTeX 1.4.1
pypandoc 1.10
pyparsing 3.0.9
pyppeteer 1.0.2
pyrsistent 0.19.2
python-dateutil 2.8.2
python-json-logger 2.0.4
pytz 2022.6
PyYAML 6.0
pyzmq 24.0.1
qtconsole 5.4.0
QtPy 2.3.0
requests 2.28.1
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
scikit-learn 1.2.0
scikit-multilearn 0.2.0
scipy 1.9.3
seaborn 0.12.1
Send2Trash 1.8.0
setuptools 65.6.3
six 1.16.0
sniffio 1.3.0
soupsieve 2.3.2.post1
stack-data 0.6.2
tabulate 0.9.0
tenacity 8.1.0
terminado 0.17.1
testpath 0.6.0
threadpoolctl 3.1.0
tinycss2 1.2.1
toml 0.10.2
tomli 2.0.1
tornado 6.2
tqdm 4.64.1
traitlets 5.7.1
typing_extensions 4.4.0
uri-template 1.2.0
urllib3 1.26.13
voila 0.4.0
wcwidth 0.2.5
webcolors 1.12
webencodings 0.5.1
websocket-client 1.4.2
websockets 10.4
wheel 0.38.4
widgetsnbextension 4.0.4
xyzservices 2022.9.0
yellowbrick 1.5
zipp 3.11.0

Add recent Pandoc versions to testing in CI

Linked images are not exported to the document

When you link an image with ![image](path/to/image.png) this image does not get exported to the final document.

Make bundler non-blocking for notebook server

As mentioned in Writing the bundle function, the bundling process can block the notebook server for bigger notebooks.

We should modify the bundle function based on this example:

from tornado import gen

@gen.coroutine
def bundle(handler, model):
  # simulate a long running IO op (e.g., deploying to a remote host)
  yield gen.sleep(10)

  # now respond
  handler.finish('I spent 10 seconds bundling {}!'.format(model['path']))
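
The example above only simulates waiting. If the slow part is synchronous work such as a document conversion, one common option is to push it to a worker thread so the event loop stays responsive. The following is a sketch with the standard library's `asyncio`, not the bundler's implementation; `convert_blocking` is a hypothetical stand-in for the real conversion step.

```python
import asyncio
import time


def convert_blocking(path):
    """Hypothetical stand-in for a slow, synchronous conversion."""
    time.sleep(0.1)
    return "bundled {}".format(path)


async def bundle_async(path):
    """Run the blocking conversion in a worker thread and await its result."""
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, convert_blocking, path)


if __name__ == "__main__":
    print(asyncio.run(bundle_async("notebook.ipynb")))
```

While the worker thread runs, other coroutines on the same loop can continue to be served.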

Enabling bundlerextension fails on Windows

It seems like "jupyter-bundlerextension.exe" is not present in the scripts directory for the recent Anaconda distribution. Seems to be a bug in the notebook.

(Clean install of Anaconda 5.0.1 on Windows 10, 64bit)

Update setup-miniconda to version 2.0

setup-miniconda version 2.0 was just released: https://github.com/conda-incubator/setup-miniconda/releases/tag/v2.0.0

Multi-column tables broken in export

A multi-column pandas table is broken in the docx export. An example is below:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(6, 4), columns=[list('1122'), list('ABCD')])
df

Add support for pandoc 2.15

Pandoc 2.15 just got released. Once conda-forge/pandoc-feedstock#99 is merged, we need to add 2.15 to our test matrix.

Markdown linked image with extra title - conversion fails

With an inline-style markdown cell that contains

![PI Dashboard](Screen_Shot_2020-02-18_at_1.47.26_PM.png "SCP Connected UE devices")

the converter attempts to find the file:

/home/wkv/projects/test/notebook/Screen_Shot_2020-02-18_at_1.47.26_PM.png "SCP Connected UE devices"

causing a 500 server error.

jupyter-docx-bundler-0.2.1 |           py37_0          18 KB  conda-forge
jupyter                   1.0.0                    py37_7  
notebook                  6.0.3                    py37_0
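
The failure suggests that the path and the optional title are not being separated. A sketch of how the two parts could be split with a regular expression follows; it is an illustration under simplifying assumptions (no spaces or parentheses inside the path, title in double quotes), not the converter's actual parsing code.

```python
import re

IMAGE_PATTERN = re.compile(
    r'!\[(?P<alt>[^\]]*)\]'         # ![alt text]
    r'\(\s*(?P<path>[^\s)]+)'       # (path, up to the first space or ")"
    r'(?:\s+"(?P<title>[^"]*)")?'   # optional "title"
    r'\s*\)'                        # closing parenthesis
)


def parse_image(markdown):
    """Return alt text, path and title of the first inline image, or None."""
    match = IMAGE_PATTERN.search(markdown)
    if match is None:
        return None
    return match.groupdict()
```

With the cell content from this report, the `path` group holds only the file name and the `title` group holds the quoted text, so the file lookup would no longer include the title.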

Add pandoc 2.12 to test-matrix

As pandoc version 2.12 is released we need to add it to the test matrix once it's available on conda-forge.

WARNING | Config option `template_path` not recognized by `DocxExporter`

When I run, I get the following warning.

WARNING | Config option `template_path` not recognized by `DocxExporter`

What is the best way to remove this warning?

Testing on Windows with embedded images

Currently I am unable to test all image formats on Windows because the conda pillow package seems to be broken: conda-forge/pillow-feedstock#45

Add support for JavaScript-based visualizations

JavaScript-based visualizations are more common these days. I would like to support these in the export.

Here is an incomplete list of visualization frameworks I know of and would like to support:

Plotly
~~Bokeh~~
~~Altair~~

Feel free to comment if I am missing anything here.

Error that notebook JSON is invalid is thrown during export, however notebook is exported properly

[E 10:25:46.372 LabApp] Notebook JSON is invalid: Additional properties are not allowed ('transient' was unexpected)

    Failed validating 'additionalProperties' in code_cell:

    On instance['cells'][0]:
    {'cell_type': 'code',
     'execution_count': 1,
     'metadata': {'tags': ['nbconvert-remove-input'], 'trusted': True},
     'outputs': ['...1 outputs...'],
     'source': 'jupyter-docx-bundler-remove-input',
     'transient': {'remove_source': True}}
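
The validation message points at the extra top-level `transient` key on the cell. One way to keep such bookkeeping out of the validated notebook would be to strip it from the cells before the notebook is handed back. This is only a sketch of that idea on the parsed notebook JSON; whether the bundler can drop the key at that point is an assumption.

```python
def strip_cell_keys(notebook, keys=("transient",)):
    """Remove the given top-level keys from every cell; return how many were removed."""
    removed = 0
    for cell in notebook.get("cells", []):
        for key in keys:
            if key in cell:
                del cell[key]
                removed += 1
    return removed
```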

Improve syntax highlighting

Syntax highlighting has been working since #8/#40; however, not all keywords are highlighted as you would expect from Jupyter. This is especially true for the import keyword. The reason is the default color scheme pygments. We need to provide an edited theme; for example, this definition colors the import statements:

"Import": {
    "text-color": "#008000",
    "background-color": null,
    "bold": true,
    "italic": false,
    "underline": false
},
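
An edited theme could be produced by patching an existing one. The sketch below assumes the theme is a Pandoc JSON theme, as printed by `pandoc --print-highlight-style pygments`, in which the per-token styles sit under a `"text-styles"` key; the function name is illustrative.

```python
import json

# Style from the definition above: green, bold import statements.
IMPORT_STYLE = {
    "text-color": "#008000",
    "background-color": None,
    "bold": True,
    "italic": False,
    "underline": False,
}


def patch_theme(theme, token="Import", style=IMPORT_STYLE):
    """Return a copy of a theme dict with one token style replaced."""
    patched = dict(theme)
    text_styles = dict(patched.get("text-styles", {}))
    text_styles[token] = dict(style)
    patched["text-styles"] = text_styles
    return patched


def write_theme(theme, path):
    """Write the theme as JSON so it can be passed to pandoc as a theme file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(theme, handle, indent=4)
```

The resulting file could then be passed to Pandoc through its `--highlight-style` option, which accepts a path to a `.theme` file.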

Embedded images do not work

If you embed an image inside a markdown cell with drag and drop, which creates the following code

![image.png](attachment:image.png)

the image does not show up in the exported file.
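
Images embedded this way are not files on disk: the notebook stores them base64-encoded in the cell's `attachments` field, keyed by file name and then by MIME type. A sketch of how the image bytes could be recovered from a parsed cell follows; it shows the data layout and is not the converter's actual code.

```python
import base64


def decode_attachments(cell):
    """Return {filename: (mime_type, raw_bytes)} for a cell's attachments."""
    decoded = {}
    for filename, bundle in cell.get("attachments", {}).items():
        # Each attachment is a MIME bundle; use its first entry.
        for mime, payload in bundle.items():
            decoded[filename] = (mime, base64.b64decode(payload))
            break
    return decoded
```

A reference such as `attachment:image.png` in the cell source can then be resolved by looking up `image.png` in the returned dictionary.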

Feature request: Parameter to add global cell tags

In some conversions it is useful to hide all input cells. The LaTeX converter supports templates, which provide an easy way to remove input cells or change the output in other ways. Since you have already defined specific cell tags to remove entire cells or just cell inputs, all that is missing is a way to define this tag only once for all cells.

I have long notebooks which I export to Latex/PDF without cell inputs, I want to export them to docx using this extension but without having to manually annotate every cell.

Rerun flaky tests with pytest-rerunfailures

The next time a test fails intermittently (a so-called flaky test), use pytest-rerunfailures to rerun that specific test.

dead kernel after installation with conda

Hi,
I just installed the docx-bundler in a Python 2.7 environment; now the kernel dies immediately when I open a Jupyter notebook with the py27 kernel. The same notebooks run if I change the kernel.
Since people are already using it, is this an issue with Python 2.7?

More metadata

What about adding more metadata to the notebook, like

date
subtitle
...

Currently this type of metadata is not defined in the notebook format specification: http://nbformat.readthedocs.io/en/latest/format_description.html#metadata

Bundler throwing errors and knocking classic notebook server over

I think this is a jupyter-docx-bundler issue: after installing into classic notebook, it seems to throw a lot of errors in the notebook log and cause a 500 error when serving notebooks:

 HTTPServerRequest(protocol='http', host='localhost:8897', method='GET', uri='/notebooks/work/Untitled.ipynb?kernel_name=python3', version='HTTP/1.1', remote_ip='172.17.0.1')
    Traceback (most recent call last):
      File "/usr/local/lib/python3.8/dist-packages/jupyter_docx_bundler/converters.py", line 3, in <module>
        from importlib.resources import files as resources_files
    ImportError: cannot import name 'files' from 'importlib.resources' (/usr/lib/python3.8/importlib/resources.py)
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/usr/local/lib/python3.8/dist-packages/tornado/web.py", line 1704, in _execute
        result = await result
      File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 775, in run
        yielded = self.gen.send(value)
      File "/usr/local/lib/python3.8/dist-packages/notebook/notebook/handlers.py", line 95, in get
        self.write(self.render_template('notebook.html',
      File "/usr/local/lib/python3.8/dist-packages/notebook/base/handlers.py", line 516, in render_template
        return template.render(**ns)
      File "/usr/local/lib/python3.8/dist-packages/jinja2/environment.py", line 1304, in render
        self.environment.handle_exception()
      File "/usr/local/lib/python3.8/dist-packages/jinja2/environment.py", line 925, in handle_exception
        raise rewrite_traceback_stack(source=source)
      File "/usr/local/lib/python3.8/dist-packages/notebook/templates/notebook.html", line 1, in top-level template code
        {% extends "page.html" %}
      File "/usr/local/lib/python3.8/dist-packages/notebook/templates/page.html", line 154, in top-level template code
        {% block header %}
      File "/usr/local/lib/python3.8/dist-packages/notebook/templates/notebook.html", line 115, in block 'header'
        {% for exporter in get_frontend_exporters() %}
      File "/usr/local/lib/python3.8/dist-packages/notebook/notebook/handlers.py", line 41, in get_frontend_exporters
        exporter_class = get_exporter(name)
      File "/usr/local/lib/python3.8/dist-packages/nbconvert/exporters/base.py", line 98, in get_exporter
        return entrypoints.get_single('nbconvert.exporters', name).load()
      File "/usr/local/lib/python3.8/dist-packages/entrypoints.py", line 82, in load
        mod = import_module(self.module_name)
      File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
      File "<frozen importlib._bootstrap>", line 991, in _find_and_load
      File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 783, in exec_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "/usr/local/lib/python3.8/dist-packages/jupyter_docx_bundler/__init__.py", line 7, in <module>
        from . import converters
      File "/usr/local/lib/python3.8/dist-packages/jupyter_docx_bundler/converters.py", line 5, in <module>
        from importlib_resources import files as resources_files
    ModuleNotFoundError: No module named 'importlib_resources'
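
The first traceback is expected on Python 3.8, because `importlib.resources.files` was only added in Python 3.9; the second shows that the `importlib_resources` backport used as a fallback is not installed. A small diagnostic sketch to see which of the two is available in an environment follows; the function name is illustrative.

```python
import importlib.util
import sys


def resources_backend():
    """Return the module that can provide `files()`, or None if neither can."""
    if sys.version_info >= (3, 9):
        # importlib.resources.files exists from Python 3.9 on.
        return "importlib.resources"
    if importlib.util.find_spec("importlib_resources") is not None:
        return "importlib_resources"
    return None
```

If this returns `None` on Python 3.8, installing the `importlib_resources` package into that environment should make the fallback import succeed.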

Add syntax highlighting

Currently there is no syntax highlighting in the final docx although the intermediate HTML export of nbconvert correctly highlights the code

Fails with virtualenv

I have installed jupyter-docx-bundler in a virtual environment and activated it.

However, it seems not to work:

When I install nbconvert, it does not install the nbconvert binary. There is a jupyter-nbconvert though, so I tried that.

Even though I followed the installation instructions and did:

 $ jupyter bundlerextension enable --py jupyter_docx_bundler --sys-prefix
 Enabling docx_bundler bundler jupyter_docx_bundler...

I see this:

  $ jupyter-nbconvert --execute --to=docx Simple\ Time\ Series.ipynb --output Simple\ Time\ Series.docx
Traceback (most recent call last):
  File "/Users/rpg/.virtualenvs/time-series-stats/bin/jupyter-nbconvert", line 10, in <module>
    sys.exit(main())
  File "/Users/rpg/.virtualenvs/time-series-stats/lib/python3.7/site-packages/jupyter_core/application.py", line 266, in launch_instance
    return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
  File "/Users/rpg/.virtualenvs/time-series-stats/lib/python3.7/site-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/Users/rpg/.virtualenvs/time-series-stats/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 337, in start
    self.convert_notebooks()
  File "/Users/rpg/.virtualenvs/time-series-stats/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 496, in convert_notebooks
    cls = get_exporter(self.export_format)
  File "/Users/rpg/.virtualenvs/time-series-stats/lib/python3.7/site-packages/nbconvert/exporters/base.py", line 113, in get_exporter
    % (name, ', '.join(get_export_names())))
ValueError: Unknown exporter "docx", did you mean one of: asciidoc, custom, html, latex, markdown, notebook, pdf, python, rst, script, slides?

However, the package is installed:

$ find ~/.virtualenvs/time-series-stats/ -name 'jupyter_docx_bundler*'
/Users/rpg/.virtualenvs/time-series-stats/lib/python3.7/site-packages/jupyter_docx_bundler
/Users/rpg/.virtualenvs/time-series-stats/lib/python3.7/site-packages/jupyter_docx_bundler-0.1.3.dist-info
(time-series-stats) rpg@RPG-MacBook-Pro: ~/projects/solar-guard/learning $

So it looks like the installation didn't work right. Is there any chance that the installation process doesn't "understand" virtual environments?
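
nbconvert appears to look up exporters through the `nbconvert.exporters` entry point group (the group name shows up in a traceback elsewhere on this page), so one way to narrow this down is to list what is registered in the active environment. The sketch below uses the standard library's `importlib.metadata` and handles both the older dictionary-style and the newer `select()` API; the function name is illustrative.

```python
from importlib.metadata import entry_points


def exporter_names(group="nbconvert.exporters"):
    """Return the sorted names of all entry points registered in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns an object with a select() method.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9 return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)


if __name__ == "__main__":
    print(exporter_names())
```

If `docx` is missing from the output when run with the virtualenv's Python, the package's entry point is not visible to that interpreter, which would match the "Unknown exporter" error above.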

Sympy equations ?

Describe the bug
LaTeX equations display well in Markdown mode (for example, $a^3=5$ in a Markdown section is exported to Word correctly), whereas sympy results are not. Is this a bug, or is there a specific way to render equations for docx-bundler?

To reproduce
What works: in a Markdown cell: $a^3=5$
What does not work: in a Python cell:
import sympy as sp
sp.var("a b c")
from IPython.display import display
display((a/b)**3)
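
To see what the code cell actually stores, and therefore what a converter has to work with, it can help to list the MIME types of each cell's outputs in the saved .ipynb file. The sketch below relies only on the notebook JSON layout (cells, outputs, data). The function name output_mime_types and the example dictionary are invented for illustration, and its values are hand-written, not captured from the notebook in this report.

```python
import json


def output_mime_types(notebook):
    """Map each code cell index to the sorted MIME types found in its outputs."""
    result = {}
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        mimes = set()
        for output in cell.get("outputs", []):
            # Only display_data and execute_result outputs carry a "data" bundle.
            mimes.update(output.get("data", {}).keys())
        result[index] = sorted(mimes)
    return result


# Hand-written stand-in for a saved notebook (illustrative values only).
example = {
    "cells": [
        {"cell_type": "markdown", "source": "$a^3=5$"},
        {
            "cell_type": "code",
            "source": "display((a/b)**3)",
            "outputs": [
                {
                    "output_type": "display_data",
                    "data": {
                        "text/plain": "a**3/b**3",
                        "text/latex": "$\\displaystyle \\frac{a^{3}}{b^{3}}$",
                    },
                    "metadata": {},
                }
            ],
        },
    ]
}

print(output_mime_types(example))
```

For a real notebook, load it first with json.load(open("notebook.ipynb", encoding="utf-8")) and pass the resulting dictionary to the function. If the cell carries a text/latex entry, the question becomes whether the exporter uses that representation or falls back to text/plain.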

Expected behavior
It would be great if SymPy output like this were exported to Word too!

Screenshots
in Jupyter :

in LibreOffice :

Desktop (please complete the following information):

OS: Linux Mint XFCE 20
Browser Chrome and Firefox
Versions of Python packages, find out with `pip list`:

pip list
Package Version

actionlib 1.13.2
alabaster 0.7.12
anaconda-client 1.7.2
anaconda-navigator 2.0.1
anaconda-project 0.9.1
angles 1.9.13
anyio 2.2.0
appdirs 1.4.4
argh 0.26.2
argon2-cffi 20.1.0
asn1crypto 1.4.0
astroid 2.5
astropy 4.0.2
async-generator 1.10
atomicwrites 1.4.0
attrs 20.3.0
autopep8 1.5.6
Babel 2.9.0
backcall 0.2.0
backports.functools-lru-cache 1.6.4
backports.shutil-get-terminal-size 1.0.0
backports.tempfile 1.0
backports.weakref 1.0.post1
beautifulsoup4 4.9.3
bitarray 1.9.2
bkcharts 0.2
bleach 3.3.0
bokeh 2.3.1
bondpy 1.8.6
boto 2.49.0
Bottleneck 1.3.2
brotlipy 0.7.0
camera-calibration 1.15.3
camera-calibration-parsers 1.12.0
catkin 0.8.9
certifi 2020.12.5
cffi 1.14.5
chardet 4.0.0
click 7.1.2
cloudpickle 1.6.0
clyent 1.2.2
colorama 0.4.4
conda 4.10.1
conda-build 3.21.4
conda-package-handling 1.7.3
conda-repo-cli 1.0.3
conda-token 0.2.0
conda-verify 3.4.2
contextlib2 0.6.0.post1
control 0.9.0
controller-manager 0.19.4
controller-manager-msgs 0.19.4
cryptography 3.4.7
cv-bridge 1.15.0
cycler 0.10.0
Cython 0.29.23
cytoolz 0.11.0
dask 2021.4.0
decorator 4.4.2
defusedxml 0.7.1
diagnostic-analysis 1.10.3
diagnostic-common-diagnostics 1.10.3
diagnostic-updater 1.10.3
diff-match-patch 20200713
distributed 2021.4.0
docutils 0.16
dynamic-reconfigure 1.7.1
entrypoints 0.3
et-xmlfile 1.0.1
fastcache 1.1.0
filelock 3.0.12
flake8 3.9.0
Flask 1.1.2
fsspec 0.9.0
future 0.18.2
gazebo-plugins 2.9.1
gazebo-ros 2.9.1
gencpp 0.6.5
geneus 3.0.0
genlisp 0.4.18
genmsg 0.5.16
gennodejs 2.0.2
genpy 0.6.14
gevent 21.1.2
glob2 0.7
gmpy2 2.0.8
greenlet 1.0.0
h5py 2.10.0
HeapDict 1.0.1
html5lib 1.1
idna 2.10
image-geometry 1.15.0
imageio 2.9.0
imagesize 1.2.0
importlib-metadata 3.10.0
iniconfig 1.1.1
interactive-markers 1.12.0
intervaltree 3.1.0
ipykernel 5.3.4
ipython 7.22.0
ipython-genutils 0.2.0
ipywidgets 7.6.3
isort 5.8.0
itsdangerous 1.1.0
jdcal 1.4.1
jedi 0.17.1
jeepney 0.6.0
Jinja2 2.11.3
joblib 1.0.1
joint-state-publisher 1.15.0
joint-state-publisher-gui 1.15.0
json5 0.9.5
jsonschema 3.2.0
jupyter 1.0.0
jupyter-client 6.1.12
jupyter-console 6.4.0
jupyter-core 4.7.1
jupyter-docx-bundler 0.3.2
jupyter-packaging 0.7.12
jupyter-server 1.4.1
jupyterlab 3.0.14
jupyterlab-pygments 0.1.2
jupyterlab-server 2.4.0
jupyterlab-widgets 1.0.0
keyring 22.3.0
kiwisolver 1.3.1
laser-geometry 1.6.7
lazy-object-proxy 1.6.0
libarchive-c 2.9
llvmlite 0.36.0
locket 0.2.1
lxml 4.6.3
MarkupSafe 1.1.1
matplotlib 3.3.4
mccabe 0.6.1
message-filters 1.15.9
mistune 0.8.4
mkl-fft 1.3.0
mkl-random 1.1.1
mkl-service 2.3.0
mock 4.0.3
more-itertools 8.7.0
mpmath 1.2.1
msgpack 1.0.2
multipledispatch 0.6.0
navigator-updater 0.2.1
nbclassic 0.2.6
nbclient 0.5.1
nbconvert 6.0.7
nbformat 5.1.3
nest-asyncio 1.4.2
networkx 2.5.1
nltk 3.6.1
nose 1.3.7
notebook 6.3.0
numba 0.53.1
numexpr 2.7.3
numpy 1.19.2
numpydoc 1.1.0
olefile 0.46
openpyxl 3.0.7
packaging 20.9
pandas 1.2.4
pandocfilters 1.4.3
parso 0.7.0
partd 1.2.0
path 15.1.2
pathlib2 2.3.5
pathtools 0.1.2
patsy 0.5.1
pep8 1.7.1
pexpect 4.8.0
pickleshare 0.7.5
Pillow 8.2.0
pip 21.0.1
pkginfo 1.7.0
pluggy 0.13.1
ply 3.11
prometheus-client 0.10.1
prompt-toolkit 3.0.17
psutil 5.8.0
ptyprocess 0.7.0
py 1.10.0
pycodestyle 2.6.0
pycosat 0.6.3
pycparser 2.20
pycurl 7.43.0.6
pydocstyle 6.0.0
pyee 7.0.4
pyflakes 2.2.0
Pygments 2.8.1
pylint 2.7.4
PyMeasure 0.9.0
pyodbc 4.0.0-unsupported
pyOpenSSL 20.0.1
pypandoc 1.5
pyparsing 2.4.7
pyppeteer 0.2.2
pyqtgraph 0.11.0
pyrsistent 0.17.3
pyserial 3.5
PySocks 1.7.1
pytest 6.2.3
python-dateutil 2.8.1
python-jsonrpc-server 0.4.0
python-language-server 0.35.1
python-qt-binding 0.4.3
pytz 2021.1
pyusb 1.1.1
PyVISA 1.11.3
PyVISA-py 0.5.2
PyWavelets 1.1.1
pyxdg 0.27
PyYAML 5.4.1
pyzmq 20.0.0
QDarkStyle 3.0.2
qt-dotgraph 0.4.2
qt-gui 0.4.2
qt-gui-cpp 0.4.2
qt-gui-py-common 0.4.2
QtAwesome 1.0.2
qtconsole 5.0.3
QtPy 1.9.0
regex 2021.4.4
requests 2.25.1
resource-retriever 1.12.6
rope 0.18.0
rosbag 1.15.9
rosboost-cfg 1.15.7
rosclean 1.15.7
roscreate 1.15.7
rosgraph 1.15.9
roslaunch 1.15.9
roslib 1.15.7
roslint 0.12.0
roslz4 1.15.9
rosmake 1.15.7
rosmaster 1.15.9
rosmsg 1.15.9
rosnode 1.15.9
rosparam 1.15.9
rospy 1.15.9
rosservice 1.15.9
rostest 1.15.9
rostopic 1.15.9
rosunit 1.15.7
roswtf 1.15.9
rqt-action 0.4.9
rqt-bag 0.5.1
rqt-bag-plugins 0.5.1
rqt-console 0.4.11
rqt-dep 0.4.10
rqt-graph 0.4.14
rqt-gui 0.5.2
rqt-gui-py 0.5.2
rqt-image-view 0.4.16
rqt-launch 0.4.9
rqt-logger-level 0.4.11
rqt-moveit 0.5.9
rqt-msg 0.4.9
rqt-nav-view 0.5.7
rqt-plot 0.4.13
rqt-pose-view 0.5.10
rqt-publisher 0.4.9
rqt-py-common 0.5.2
rqt-py-console 0.4.9
rqt-reconfigure 0.5.3
rqt-robot-dashboard 0.5.8
rqt-robot-monitor 0.5.13
rqt-robot-steering 0.5.12
rqt-runtime-monitor 0.5.8
rqt-rviz 0.6.1
rqt-service-caller 0.4.9
rqt-shell 0.4.10
rqt-srv 0.4.8
rqt-tf-tree 0.6.2
rqt-top 0.4.9
rqt-topic 0.4.12
rqt-web 0.4.9
Rtree 0.9.7
ruamel-yaml-conda 0.15.100
rviz 1.14.5
scikit-image 0.18.1
scikit-learn 0.24.1
scipy 1.6.2
seaborn 0.11.1
SecretStorage 3.3.1
Send2Trash 1.5.0
sensor-msgs 1.13.1
setuptools 52.0.0.post20210125
simplegeneric 0.8.1
singledispatch 0.0.0
sip 4.19.13
six 1.15.0
slycot 0.4.0.0
smach 2.5.0
smach-ros 2.5.0
smclib 1.8.6
sniffio 1.2.0
snowballstemmer 2.1.0
sortedcollections 2.1.0
sortedcontainers 2.3.0
soupsieve 2.2.1
Sphinx 3.5.4
sphinxcontrib-applehelp 1.0.2
sphinxcontrib-devhelp 1.0.2
sphinxcontrib-htmlhelp 1.0.3
sphinxcontrib-jsmath 1.0.1
sphinxcontrib-qthelp 1.0.3
sphinxcontrib-serializinghtml 1.1.4
sphinxcontrib-websupport 1.2.4
spyder 4.1.5
spyder-kernels 1.9.4
SQLAlchemy 1.4.7
statsmodels 0.12.2
sympy 1.8
tables 3.6.1
tabulate 0.8.9
tblib 1.7.0
terminado 0.9.4
testpath 0.4.4
tf 1.13.2
tf-conversions 1.13.2
tf2-geometry-msgs 0.7.5
tf2-kdl 0.7.5
tf2-py 0.7.5
tf2-ros 0.7.5
threadpoolctl 2.1.0
tifffile 2020.10.1
toml 0.10.2
toolz 0.11.1
topic-tools 1.15.9
tornado 6.1
tqdm 4.59.0
traitlets 5.0.5
typing-extensions 3.7.4.3
ujson 4.0.2
unicodecsv 0.14.1
urllib3 1.26.4
watchdog 1.0.2
wcwidth 0.2.5
webencodings 0.5.1
websockets 8.1
Werkzeug 1.0.1
wheel 0.36.2
widgetsnbextension 3.5.1
wrapt 1.12.1
wurlitzer 2.1.0
xacro 1.14.6
xlrd 2.0.1
XlsxWriter 1.3.8
xlwt 1.3.0
xmltodict 0.12.0
yapf 0.31.0
zict 2.0.0
zipp 3.4.1
zope.event 4.5.0
zope.interface 5.3.0

- Pandoc version `pandoc --version`

pandoc --version
pandoc 2.12
Compiled with pandoc-types 1.22, texmath 0.12.1.1, skylighting 0.10.4,
citeproc 0.3.0.8, ipynb 0.1.0.1
User data directory: /home/eea/.local/share/pandoc
Copyright (C) 2006-2021 John MacFarlane. Web: https://pandoc.org
This is free software; see the source for copying conditions. There is no
warranty, not even for merchantability or fitness for a particular purpose.

**Additional context**

Error running filter pandoc_filter.py

Describe the bug
I just ran jupyter nbconvert --execute --to=docx TOTO.ipynb --output toto.docx

and got the following traceback:

[NbConvertApp] Converting notebook TOTO.ipynb to docx
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/bin/jupyter-nbconvert", line 8, in <module>
    sys.exit(main())
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/jupyter_core/application.py", line 264, in launch_instance
    return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/traitlets/config/application.py", line 846, in launch_instance
    app.start()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/nbconvert/nbconvertapp.py", line 361, in start
    self.convert_notebooks()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/nbconvert/nbconvertapp.py", line 533, in convert_notebooks
    self.convert_single_notebook(notebook_filename)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/nbconvert/nbconvertapp.py", line 498, in convert_single_notebook
    output, resources = self.export_single_notebook(notebook_filename, resources, input_buffer=input_buffer)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/nbconvert/nbconvertapp.py", line 427, in export_single_notebook
    output, resources = self.exporter.from_filename(notebook_filename, resources=resources)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/nbconvert/exporters/exporter.py", line 190, in from_filename
    return self.from_file(f, resources=resources, **kw)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/nbconvert/exporters/exporter.py", line 208, in from_file
    return self.from_notebook_node(nbformat.read(file_stream, as_version=4), resources=resources, **kw)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/jupyter_docx_bundler/__init__.py", line 79, in from_notebook_node
    converters.notebookcontent_to_docxbytes(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/jupyter_docx_bundler/converters.py", line 298, in notebookcontent_to_docxbytes
    pypandoc.convert_file(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pypandoc/__init__.py", line 150, in convert_file
    return _convert_input(source_file, format, 'path', to, extra_args=extra_args,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pypandoc/__init__.py", line 351, in _convert_input
    raise RuntimeError(
RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/jupyter_docx_bundler/pandoc_filter.py:
Could not find executable python

On my system the Python executable is python3. Does jupyter-docx-bundler need a specific alias for python?
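
A quick check that may help here: pandoc chooses the interpreter for a filter from the file extension, and for a .py filter it looks for an executable named python. The error message fits a system where only python3 is on the PATH. The sketch below, which uses only the standard library, reports which of the two names resolves. The function name find_interpreters is made up for this illustration.

```python
import shutil


def find_interpreters(names=("python", "python3")):
    """Return a dict mapping each executable name to its resolved path, or None."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_interpreters().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If python comes back as not found, a shell alias is unlikely to help, because aliases are not visible to the processes pandoc starts. An actual python executable (or symlink) on the PATH would be needed.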

Desktop (please complete the following information):

OS: macOS 12.3.1
Browser: Safari
Python version: 3.10.2
All packages were installed in March/April
Pandoc version (pandoc --version):
Pandoc 2.17.1.1
Compiled with pandoc-types 1.22.1, texmath 0.12.4, skylighting 0.12.2,
citeproc 0.6.0.1, ipynb 0.2

Additional context
pandoc TOTO.ipynb -s -o new_word_file.docx works perfectly
