
Comments (8)

anandology commented on July 30, 2024

I'm not able to get this post_save_hook to work. I think it might be related to how jupyterhub reads configuration for spawned instances, but I even tried putting the jupyter_notebook_config.py in the spawned user's ~/.jupyter/... and it's still unchanged.

I think it is important to figure out how to make that work. Let's dig a bit more.

from pipalhub.

anandology commented on July 30, 2024

I have used this snippet to export the notebook to HTML automatically on every save.

# ~/.jupyter/jupyter_notebook_config.py

import io
import os

from notebook.utils import to_api_path

# Created lazily on the first save and reused afterwards.
_html_exporter = None

def html_post_save(model, os_path, contents_manager, **kwargs):
    """Export the notebook to HTML with nbconvert after every save.

    Adapted from the `jupyter notebook --script` replacement hook.
    """
    from nbconvert.exporters.html import HTMLExporter

    # The hook fires for every saved file; only notebooks are exported.
    if model['type'] != 'notebook':
        return

    global _html_exporter

    if _html_exporter is None:
        _html_exporter = HTMLExporter(parent=contents_manager)

    log = contents_manager.log

    # Write the HTML next to the notebook, using the exporter's extension.
    base, ext = os.path.splitext(os_path)
    html, resources = _html_exporter.from_filename(os_path)
    html_fname = base + resources.get('output_extension', '.txt')
    log.info("Saving HTML /%s", to_api_path(html_fname, contents_manager.root_dir))

    with io.open(html_fname, 'w', encoding='utf-8') as f:
        f.write(html)

# `c` is the config object Jupyter provides when it loads this file.
c.FileContentsManager.post_save_hook = html_post_save


nikochiko commented on July 30, 2024

We need the export to have these things:

  • A summary of each student's notebook
  • An index from which all students' notebooks can be visited
  • A single page for each student that displays the complete notebook

For this approach:

  • Components:
    • Summary: We will have to handle this specially. A new save should not re-compile the whole summary file. At the same time, saves to multiple files should not create race conditions. I don't know if that is a case we should be worried about. The summary file would have to have separate unambiguous divs for each user that can be selected and specifically swapped out. There is some complexity - we would have to figure out how to build the first copy without any content.
    • Index of students: This can be handled by simple nginx file serving
    • Student page: Can be handled by nginx file serving
  • Pros:
    • Easy to export as files
    • Mostly already implemented
  • Cons:
    • Some unknowns around race conditions to write the summary file
    • Would depend on nginx and its configuration
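
The per-student swap and the race-condition concern above could be handled roughly as follows. This is only a sketch under assumptions: the marker comments, the `summary-<user>` div ids, the `.lock` file, and the `update_summary` name are all hypothetical, and `fcntl` limits it to Unix hosts. A missing summary file is treated as empty, which covers building the first copy without any content.

```python
import fcntl
import os
import re
import tempfile
from pathlib import Path


def _section(user: str, html: str) -> str:
    # Comment markers delimit the section, so nested <div>s inside `html`
    # cannot confuse the matcher. (Hypothetical layout, not from the thread.)
    return (
        f"<!-- begin:{user} -->\n"
        f'<div id="summary-{user}">\n{html}\n</div>\n'
        f"<!-- end:{user} -->\n"
    )


def update_summary(summary_path: Path, user: str, html: str) -> None:
    """Replace (or append) one user's section without rebuilding the others."""
    lock_path = summary_path.with_suffix(".lock")
    with open(lock_path, "w") as lock:
        # Exclusive lock: concurrent saves wait here instead of overwriting
        # each other's read-modify-write cycle.
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            current = ""
            if summary_path.exists():
                current = summary_path.read_text(encoding="utf-8")
            marker = re.escape(user)
            pattern = re.compile(
                rf"<!-- begin:{marker} -->.*?<!-- end:{marker} -->\n", re.DOTALL
            )
            new_section = _section(user, html)
            if pattern.search(current):
                # A function replacement avoids backslash handling in `html`.
                updated = pattern.sub(lambda m: new_section, current, count=1)
            else:
                updated = current + new_section
            # Write to a temp file and rename it, so a reader (for example
            # nginx) never sees a half-written summary.
            fd, tmp = tempfile.mkstemp(dir=summary_path.parent)
            with os.fdopen(fd, "w", encoding="utf-8") as f:
                f.write(updated)
            os.chmod(tmp, 0o644)  # mkstemp creates the file as 0600
            os.replace(tmp, summary_path)
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

The sketch assumes the per-student HTML never contains the end marker itself; a real implementation would need to escape or validate that.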

Two other approaches that were discussed besides using a post_save_hook to directly generate HTML:

  1. Something like notebook-html without an express build step, but rather creating the HTML/JS files with knowledge of each student's name.
    • Components:
      • Summary: We will need to extend the code to be able to take a max number of code cells that we convert to HTML. This change is simple (add an extra param to settings and override the simpleBuildHTML method to use it).
      • Index of students: We will need a list of the students' user names to build this. We could either set up an HTTP endpoint that does this or write a single script that looks at a specific directory and returns the sub-directory names. I think the former is the better option once the admin interface is ready; until then a simple script would do.
      • Student page: We can return this content dynamically using a template. The notebook-html library can be used and we'll have a unique URL for each student.
    • Pros:
      • No need to run build scripts in advance
      • Content is always fresh when reloaded
      • No need for a separate build component (endpoints can be part of the same webapp as notebook docs)
    • Cons:
      • Summary page would load all students' complete notebooks into memory each time (from the URL fetch) before taking the first 10 cells out from it. This might have an impact on the instructors' experience.
      • Exporting as HTML files (for backups and sharing with students) would get harder. Or we'll have to use the Python exporter to do that.
  2. Like post_save_hook but with each cell as a SQL row and rendering dynamically
    • Components:
      • Summary: Getting the summary for each student would be an SQL query and then the HTML exporter can convert that JSON to HTML. We would only load a limited amount of data.
      • Index: Would be served by an extra endpoint on notebook-html
      • Student page: Would be served by an endpoint on notebook-html
    • Pros:
      • Everything is in the database, so it is easy to query and experiment with
      • Rendered server-side, so easy to export
      • Easy to scale to a very large number of users
    • Cons:
      • Implementation would be complex. We'd have to deal with edge cases and need to test it.
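
To make the second approach more concrete, here is a minimal sketch of storing each cell as a row and fetching only the first few for the summary. The table layout, column names, and function names are assumptions for illustration, and SQLite stands in for whatever database would actually be used.

```python
import json
import sqlite3


def open_db(path: str) -> sqlite3.Connection:
    db = sqlite3.connect(path)
    # One row per cell; `position` preserves notebook order.
    db.execute(
        "CREATE TABLE IF NOT EXISTS cells ("
        " user TEXT NOT NULL,"
        " position INTEGER NOT NULL,"
        " cell_json TEXT NOT NULL,"
        " PRIMARY KEY (user, position))"
    )
    return db


def store_notebook(db: sqlite3.Connection, user: str, notebook: dict) -> None:
    """Replace a user's rows with the cells of a freshly saved notebook."""
    with db:  # the delete and the inserts commit together, or not at all
        db.execute("DELETE FROM cells WHERE user = ?", (user,))
        db.executemany(
            "INSERT INTO cells (user, position, cell_json) VALUES (?, ?, ?)",
            [(user, i, json.dumps(cell)) for i, cell in enumerate(notebook["cells"])],
        )


def summary_cells(db: sqlite3.Connection, user: str, limit: int = 10) -> list:
    """Load only the first `limit` cells, which is all the summary needs."""
    rows = db.execute(
        "SELECT cell_json FROM cells WHERE user = ? ORDER BY position LIMIT ?",
        (user, limit),
    )
    return [json.loads(row[0]) for row in rows]
```

`store_notebook` could be called from the post_save_hook with the parsed notebook, and the cells returned by `summary_cells` could then be handed to the HTML exporter. Replacing all of a user's rows on each save is the simplest option; updating only changed cells is one of the edge cases the cons above allude to.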


nikochiko commented on July 30, 2024

I think the most reliable way to move ahead would be to use notebook-html with JavaScript. It has the fewest unknowns and the instructor is guaranteed to have a fresh copy.
For file exports, we can use Python's HTMLExporter or write some logic on the frontend to create a download button for each file after loading it.
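
Two small helpers this approach would need, as discussed earlier in the thread, are a list of student user names taken from a directory and a notebook cut down to its first few cells for the summary. The following is a sketch only: it assumes one sub-directory per student under a common root, and the function names and the default of 10 cells are illustrative.

```python
import json
from pathlib import Path


def list_students(notebooks_root: Path) -> list[str]:
    """Return the sub-directory names under the root, one per student."""
    return sorted(
        p.name
        for p in notebooks_root.iterdir()
        if p.is_dir() and not p.name.startswith(".")  # skip hidden directories
    )


def first_cells(notebook_path: Path, max_cells: int = 10) -> dict:
    """Load a notebook and keep only its first `max_cells` cells."""
    # An .ipynb file is plain JSON, so no extra dependency is needed here.
    with open(notebook_path, encoding="utf-8") as f:
        nb = json.load(f)
    nb["cells"] = nb.get("cells", [])[:max_cells]
    return nb
```

Truncating on the server still reads the whole file there, but the summary page would receive only the cells it displays.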


nikochiko commented on July 30, 2024

I wasn't able to get this exact approach to work with extensions. As an alternative, we could use a JupyterHub service instead: https://jupyterhub.readthedocs.io/en/stable/reference/services.html
and make a simple Flask app that serves the desired pages. The Flask process would be managed by JupyterHub (we won't need custom start/stop logic or systemctl) and proxied at a /services/{service_name} URL. I have tested that this works.
The notebook content cannot be fetched directly as an IPYNB; it needs an authenticated HTTP request. Because we don't want to share the auth token with the frontend, we can delegate that part to a separate unauthenticated endpoint on our service, which would in turn fetch the content with its own auth token or read it from the file system directly.
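
For the file-system option, the service endpoint would need to make sure a request cannot read outside a student's directory, since that endpoint is unauthenticated. A minimal sketch of such a loader follows; the one-directory-per-user layout and the `load_notebook` name are assumptions, and a Flask route in the service would call it with values taken from the URL.

```python
import json
from pathlib import Path


def load_notebook(home_root: Path, user: str, name: str) -> dict:
    """Read <home_root>/<user>/<name> as notebook JSON, refusing path tricks."""
    root = home_root.resolve()
    base = (root / user).resolve()
    # The user directory must sit directly under the root (rejects "..", "a/b").
    if base.parent != root:
        raise ValueError("invalid user")
    target = (base / name).resolve()
    # The notebook must stay inside the user's directory and be an .ipynb file
    # (rejects names such as "../bob/work.ipynb").
    if base not in target.parents or target.suffix != ".ipynb":
        raise ValueError("invalid notebook path")
    with open(target, encoding="utf-8") as f:
        return json.load(f)
```

The service process also needs read permission on the students' files for this to work, which depends on how the hub's users and directories are set up.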


anandology commented on July 30, 2024

We need the export to have these things:

  • A summary of each student's notebook
  • An index from which all students' notebooks can be visited
  • A single page for each student that displays the complete notebook

We already do these things as part of the build process. The issue is that the build runs repeatedly, every couple of seconds.

I think it would be easier to run the same process on every save and optimize from there, rather than taking up a completely new approach.


nikochiko commented on July 30, 2024

I'm not able to get this post_save_hook to work. I think it might be related to how jupyterhub reads configuration for spawned instances, but I even tried putting the jupyter_notebook_config.py in the spawned user's ~/.jupyter/... and it's still unchanged.


nikochiko commented on July 30, 2024

This is done.
https://github.com/pipalacademy/pipalhub/compare/3871d55..c2ff9580c38dbb98c816344798614d0507b17706

