Comments (8)
I'm not able to get this post_save_hook to work. I think it might be related to how jupyterhub reads configuration for spawned instances, but I even tried putting the jupyter_notebook_config.py in the spawned user's ~/.jupyter/... and it's still unchanged.
I think it is important to figure out how to make that work. Let's dig a bit more.
from pipalhub.
I have used this snippet to export HTML automatically on every save.
# ~/.jupyter/jupyter_notebook_config.py
import io
import os

from notebook.utils import to_api_path

_html_exporter = None  # created lazily on the first save


def html_post_save(model, os_path, contents_manager, **kwargs):
    """Convert a notebook to HTML after each save, using nbconvert."""
    from nbconvert.exporters.html import HTMLExporter

    # Only notebooks are exported; other file types are left alone.
    if model['type'] != 'notebook':
        return

    global _html_exporter
    if _html_exporter is None:
        _html_exporter = HTMLExporter(parent=contents_manager)
    log = contents_manager.log

    # Write the HTML next to the notebook, with the exporter's extension.
    base, ext = os.path.splitext(os_path)
    html, resources = _html_exporter.from_filename(os_path)
    html_fname = base + resources.get('output_extension', '.txt')
    log.info("Saving HTML /%s", to_api_path(html_fname, contents_manager.root_dir))
    with io.open(html_fname, 'w', encoding='utf-8') as f:
        f.write(html)


# `c` is the config object that Jupyter provides when it loads this file.
c.FileContentsManager.post_save_hook = html_post_save
from pipalhub.
We need the export to have these things:
- A summary of each student's notebook
- An index from which all students' notebooks can be visited
- A single page for each student that displays the complete notebook
For this approach (generating HTML from a post_save_hook):
- Components:
  - Summary: We will have to handle this specially. A new save should not re-compile the whole summary file. At the same time, saves to multiple files should not create race conditions. I don't know if that is a case we should be worried about. The summary file would have to have separate, unambiguous divs for each user that can be selected and swapped out individually. There is some complexity - we would have to figure out how to build the first copy without any content.
  - Index of students: This can be handled by simple nginx file serving
  - Student page: Can be handled by nginx file serving
- Pros:
  - Easy to export as files
  - Mostly already implemented
- Cons:
  - Some unknowns around race conditions when writing the summary file
  - Would depend on nginx and its configuration
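A rough sketch of how the per-student swap in the summary file could avoid the race condition mentioned above. This is only an illustration, not what is implemented: the marker comments, the sidecar `.lock` file, and the function name `update_student_section` are all made up for this example, and `fcntl` makes it Unix-only. Concurrent saves are serialized with an exclusive lock, and the file is replaced atomically so nginx never serves a half-written summary. The first save simply starts from an empty file.

```python
import fcntl
import os
import re
import tempfile


def update_student_section(summary_path, student, fragment_html):
    """Replace (or append) one student's block in the summary file."""
    # Hypothetical markers that delimit each student's block.
    start = f"<!-- student:{student}:start -->"
    end = f"<!-- student:{student}:end -->"
    block = f"{start}\n{fragment_html}\n{end}"
    pattern = re.compile(re.escape(start) + r".*?" + re.escape(end), re.DOTALL)

    # An exclusive lock on a sidecar file makes concurrent saves take turns.
    with open(summary_path + ".lock", "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            with open(summary_path, encoding="utf-8") as f:
                current = f.read()
        except FileNotFoundError:
            current = ""  # first save: no summary exists yet

        if pattern.search(current):
            # A function replacement avoids backslash escapes in the HTML.
            updated = pattern.sub(lambda _match: block, current, count=1)
        else:
            updated = current + block + "\n"

        # Write a temp file in the same directory, then rename it into place.
        directory = os.path.dirname(os.path.abspath(summary_path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(updated)
        os.chmod(tmp_path, 0o644)  # mkstemp creates 0600; keep it web-readable
        os.replace(tmp_path, summary_path)
    # Closing the lock file releases the lock.
```

Calling `update_student_section("summary.html", "alice", "<p>...</p>")` from the save hook would touch only that student's block and leave the others as they are.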
Two other approaches that were discussed besides using a post_save_hook to directly generate HTML:
- Something like notebook-html without an express build step, but rather creating the HTML/JS files with knowledge of each student's name.
  - Components:
    - Summary: We will need to extend the code to be able to take a max number of code cells that we convert to HTML. This change is simple (add an extra param to settings and override the simpleBuildHTML method to use it).
    - Index of students: We will need a list of the students' user names to build this. We could either set up an HTTP endpoint that does this or a single script that looks at a specific directory and returns the sub-directory names. I think the former would be the better way to do it once we have the admin interface ready; until then it would be better to use a simple script.
    - Student page: We can return this content dynamically using a template. The notebook-html library can be used and we'll have a unique URL for each student.
  - Pros:
    - No need to run build scripts in advance
    - Content is always fresh when reloaded
    - No need for a separate build component (endpoints can be part of the same webapp as notebook docs)
  - Cons:
    - Summary page would load all students' complete notebooks into memory each time (from the URL fetch) before taking the first 10 cells out of them. This might have an impact on the instructors' experience.
    - Exporting as HTML files (for backups and sharing with students) would get harder. Or we'll have to use the Python exporter to do that.
- Like post_save_hook but with each cell as a SQL row and rendering dynamically
  - Components:
    - Summary: Getting the summary for each student would be an SQL query and then the HTML exporter can convert that JSON to HTML. We would only load a limited amount of data.
    - Index: Would be served by an extra endpoint on notebook-html
    - Student page: Would be served by an endpoint on notebook-html
  - Pros:
    - Everything is in the database. Easy to play with
    - Rendered server-side, so easy to export
    - Easy to scale to a very large number of users
  - Cons:
    - Implementation would be complex. We'd have to deal with edge cases and need to test it.
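To make the "each cell as a SQL row" idea a bit more concrete, here is a minimal sketch using SQLite from the standard library. The table layout, column names, and function names are assumptions made up for illustration; the only thing taken from the notebook format itself is that a notebook's JSON has a top-level "cells" list whose items carry a "cell_type". The summary query loads only the first few code cells instead of the whole notebook.

```python
import json
import sqlite3

# Hypothetical schema: one row per notebook cell, keyed by student and position.
SCHEMA = """
CREATE TABLE IF NOT EXISTS cells (
    student   TEXT NOT NULL,
    position  INTEGER NOT NULL,
    cell_type TEXT NOT NULL,
    cell_json TEXT NOT NULL,
    PRIMARY KEY (student, position)
)
"""


def open_db(path):
    """Open the database and make sure the cells table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def store_notebook(conn, student, notebook):
    """Replace a student's rows with the cells of a freshly saved notebook."""
    rows = [
        (student, position, cell["cell_type"], json.dumps(cell))
        for position, cell in enumerate(notebook["cells"])
    ]
    with conn:  # one transaction per save, committed on success
        conn.execute("DELETE FROM cells WHERE student = ?", (student,))
        conn.executemany(
            "INSERT INTO cells (student, position, cell_type, cell_json) "
            "VALUES (?, ?, ?, ?)",
            rows,
        )


def summary_cells(conn, student, limit=10):
    """Return up to `limit` code cells for one student, in notebook order."""
    rows = conn.execute(
        "SELECT cell_json FROM cells "
        "WHERE student = ? AND cell_type = 'code' "
        "ORDER BY position LIMIT ?",
        (student, limit),
    )
    return [json.loads(row[0]) for row in rows]
```

The save hook would call `store_notebook`, and the summary endpoint would call `summary_cells` for each student and pass the result to the HTML exporter.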
from pipalhub.
I think the most reliable way to move ahead would be to use notebook-html with JavaScript. It has the fewest unknowns, and the instructor is guaranteed to have a fresh copy.
For file exports, we can use Python's HTMLExporter or write some logic on the frontend to create a download button for each file after loading it.
from pipalhub.
I wasn't able to get this exact approach to work using extensions. As an alternative, we could have a service instead (https://jupyterhub.readthedocs.io/en/stable/reference/services.html) and make a simple Flask app that will serve the desired pages. The Flask process would be managed by jupyterhub (we won't need custom start/stop logic or systemctl) and proxied at a /services/{service_name} URL. I have tested that it works.
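For reference, a hub-managed service is declared in jupyterhub_config.py roughly like this. This is a sketch, not the config that was tested: the service name "dashboard", the port, and the dashboard.py file name are placeholders. The hub starts the command itself and proxies /services/dashboard/ to the given URL, so the Flask app has to serve its routes under that prefix.

```python
def dashboard_service(port=10101):
    """Build the service entry for a hub-managed Flask app (placeholder names)."""
    return {
        "name": "dashboard",  # proxied at /services/dashboard/
        "url": f"http://127.0.0.1:{port}",  # where the Flask app listens
        "command": ["flask", "run", f"--port={port}"],
        "environment": {"FLASK_APP": "dashboard.py"},  # hypothetical app module
    }


# In jupyterhub_config.py, where `c` is provided by JupyterHub:
# c.JupyterHub.services = [dashboard_service()]
```

The hub passes the service its API token and URL prefix through environment variables (JUPYTERHUB_API_TOKEN and JUPYTERHUB_SERVICE_PREFIX), so nothing needs to be hard-coded in the app.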
The notebook content cannot be fetched directly as an IPYNB and would need an authenticated HTTP request. Because we don't want to share the auth token with the frontend, we can delegate that part to a separate unauthenticated endpoint on our service that would in turn fetch the content with its auth token, or read it from the file system directly.
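The server-side fetch could look something like the sketch below, using only the standard library. It assumes the notebook is read through the contents API of the student's running server (/user/{name}/api/contents/{path}) and that the service's token is allowed to access that server; neither is confirmed in this thread, and the function names are made up. The token stays on the server and is never sent to the browser.

```python
import json
import urllib.parse
import urllib.request


def build_contents_request(hub_url, student, notebook_path, token):
    """Build an authenticated request for one notebook via the contents API."""
    url = "{}/user/{}/api/contents/{}".format(
        hub_url.rstrip("/"),
        urllib.parse.quote(student, safe=""),
        urllib.parse.quote(notebook_path),  # keeps "/" between path segments
    )
    return urllib.request.Request(url, headers={"Authorization": f"token {token}"})


def fetch_notebook(hub_url, student, notebook_path, token):
    """Fetch the notebook model as a dict (performs a real HTTP request)."""
    request = build_contents_request(hub_url, student, notebook_path, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Inside the service, the token would come from the JUPYTERHUB_API_TOKEN environment variable, and the unauthenticated endpoint would return only the parts of the result that the page needs.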
from pipalhub.
We need the export to have these things:
- A summary of each student's notebook
- An index from which all students' notebooks can be visited
- A single page for each student that displays the complete notebook
We already do these things as part of the build process. The issue is that the build runs repeatedly, every couple of seconds.
I think it would be easier to run the same process on every save and optimize from there, rather than adopting a completely new approach.
from pipalhub.
I'm not able to get this post_save_hook to work. I think it might be related to how jupyterhub reads configuration for spawned instances, but I even tried putting the jupyter_notebook_config.py in the spawned user's ~/.jupyter/... and it's still unchanged.
from pipalhub.
This is done.
https://github.com/pipalacademy/pipalhub/compare/3871d55..c2ff9580c38dbb98c816344798614d0507b17706
from pipalhub.