Comments (16)
For (1) is it enough if users have a home directory as part of their jupyterhub login?
For (2) is this a dataset that is listed in https://github.com/earthlab/earthpy/blob/master/earthpy/io.py? If it isn't part of earthpy we could add it to the docker image directly.
I started using (3) on https://hub.earthdatascience.org/earthhub yesterday. It needs some testing to be sure, but it should be ready.
from hub-ops.
@betatim thank you.
- I'd like to see how this works, but if the students have a home directory with space to write data to (write access and space allocation), that should be sufficient!
- The data for Friday are listed in:
'spatial-vector-lidar': ('https://ndownloader.figshare.com/files/12396203', '.', 'zip')
<- this is the line in the dictionary. What I don't understand is how we get that data into their home directory for them to use. It may just be me not understanding how this is set up :) Will I need to show them how to download the data once they log in? And if they do download it, what happens if they log out and log back in, or get disconnected? Does their workspace persist?
THANK YOU!!
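For anyone following along, here is a rough sketch of what "downloading the data once they log in" could look like for an entry shaped like the dictionary line above. It is not earthpy's actual code: `DATASETS`, `fetch_dataset` and the `data_home` argument are hypothetical names, and only the `zip` format is handled.

```python
import urllib.request
import zipfile
from pathlib import Path

# Registry in the same (url, relative path, format) shape as the line quoted above.
DATASETS = {
    "spatial-vector-lidar": (
        "https://ndownloader.figshare.com/files/12396203", ".", "zip"),
}


def fetch_dataset(key, data_home, registry=DATASETS):
    """Download one registry entry and unpack it below data_home (hypothetical helper)."""
    url, rel_path, fmt = registry[key]
    if fmt != "zip":
        raise ValueError(f"unsupported format: {fmt}")
    target = Path(data_home).expanduser() / rel_path
    target.mkdir(parents=True, exist_ok=True)
    archive = target / f"{key}.zip"
    # urlretrieve writes the response body to the given local file.
    urllib.request.urlretrieve(url, archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    archive.unlink()  # keep only the extracted files
    return target
```

A student would call something like `fetch_dataset("spatial-vector-lidar", "~/data")` once; since the home directory is on persistent storage, the files would still be there after a logout or disconnect.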
The workspace persists.
I'll create a workshophub just for the workshop tomorrow morning and add this file to the image so the data is there, ready for the students to use.
What system should we use for authentication? Do they all have a CU Boulder email address? If yes, we can use that for logins.
OK. Ideally, for future setups, it would be easy for me to add a few datasets to a "hub" space! Maybe we can chat more tomorrow?
We will potentially have a mix of students. There may be some without CU authentication, as we had two drop in yesterday who are not in the system yet. Could a CSV approach work?
If there is a mixture I'd go for https://github.com/yuvipanda/jupyterhub-firstuseauthenticator with a whitelist.
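To make the CSV idea concrete, here is a minimal sketch that turns a roster file into a set of allowed usernames. It is plain Python and does not touch any authenticator API; the `username` column and the sample names are made up for illustration, and the resulting set would still need to be handed to whatever whitelist setting the chosen authenticator supports.

```python
import csv
import io


def load_allowed_users(csv_file, column="username"):
    """Return the set of normalized usernames found in a CSV file object."""
    users = set()
    for row in csv.DictReader(csv_file):
        name = (row.get(column) or "").strip().lower()
        if name:
            users.add(name)
    return users


def is_allowed(username, allowed):
    """Check a login name against the set, ignoring case and surrounding whitespace."""
    return username.strip().lower() in allowed


# Hypothetical roster; in practice this would be open("roster.csv", newline="").
roster = io.StringIO("username,affiliation\nalice,CU\nBob ,drop-in\n")
allowed = load_allowed_users(roster)
print(sorted(allowed))  # ['alice', 'bob']
```

Adding a drop-in student is then a one-line change to the CSV rather than a change to the hub configuration.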
Should we also have the material of the course directly available in people's home directory?
- put data into the image #45
- copy data from the image to the user's home directory on pod start #45
- authentication HashAuth #47
- switch node type to a bigger node or even use two different node pools
- nbgitpuller with https://github.com/earthlab-education/2018-07-20-spatial-python-workshop #45
- ~15 students expected; be prepared for up to 25
- add a notebook to admin user that generates passwords
- more grafana monitoring for nice plots
Workshop starts at 9am Boulder time.
Using https://github.com/thedataincubator/jupyterhub-hashauthenticator at the moment which does not seem to support whitelists. Maybe worth fixing at some point.
Added a second nodepool with two machines that each can handle around 14 students. The node pool is allowed to scale up to 4 machines so we should have enough resources even for last minute arrivals.
- turn off workshop-pool again after the workshop to save money
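The sizing can be checked with a few lines of arithmetic. This is only a sketch using the numbers from this thread (around 14 students per machine, 2 machines to start, up to 4 with autoscaling); the constant and function names are hypothetical.

```python
import math

STUDENTS_PER_NODE = 14  # "around 14 students" per machine, from the comment above
INITIAL_NODES = 2
MAX_NODES = 4


def nodes_needed(students, per_node=STUDENTS_PER_NODE):
    """Smallest number of nodes that fits the given number of students."""
    return math.ceil(students / per_node)


for students in (15, 25):
    print(students, "students ->", nodes_needed(students), "nodes")
print("capacity at start:", INITIAL_NODES * STUDENTS_PER_NODE)
print("capacity at max scale:", MAX_NODES * STUDENTS_PER_NODE)
```

Both the expected 15 and the worst-case 25 students fit on the two initial machines, so scale-up should only kick in beyond 28 users.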
Collecting to do items and learnings from the workshop:
- The kernel died a few times, probably from using too much RAM. The current limit was set by guessing a not-crazy number; what is a better way to determine it?
- Data copying was broken (see #58). Next time, run through the process with completely fresh users to catch problems that only show up on first use.
- Right at the start of the workshop (19 minutes in) there was a k8s master update, which caused 3 minutes of outage. Why did this upgrade happen then?
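One possible way to replace the guess with a measurement is to run the workshop notebooks once and record the kernel's peak memory. The sketch below uses the standard library `resource` module (Unix only); `peak_rss_mib` is a hypothetical helper and the `bytearray` is just a stand-in for a memory-hungry cell.

```python
import resource
import sys


def peak_rss_mib():
    """Peak resident memory of this process so far, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


before = peak_rss_mib()
buffer = bytearray(64 * 1024 * 1024)  # stand-in for a memory-hungry cell
after = peak_rss_mib()
print(f"peak RSS before: {before:.0f} MiB, after: {after:.0f} MiB")
```

The value is a high-water mark for the whole process, so calling it at the end of each notebook gives an upper bound to base the per-user limit on, with some headroom added.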
NOTE: the kernel is consistently dying on the show_hist function using rasterio! So this is definitely a package issue, but the question is what causes it to die. Maybe it is a memory hog?
From mid-workshop. This shows the actual amount of memory and CPU used, not how much memory/CPU we promised to each pod. It suggests we can increase the memory limit a bit beyond 2G without having to pay more money for bigger machines. It seems most people, most of the time, aren't using all the memory we assign them. Brief peaks for show_hist?
We can definitely increase the CPU limit as people mostly idle.
Before giving away more RAM we need to make sure that all the core services specify how much RAM they need so that they get protected by kubernetes.
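A back-of-the-envelope sketch of that trade-off: Kubernetes schedules pods by their memory requests, so a limit set above the request lets a node be overcommitted. All numbers below are hypothetical placeholders, not measurements from the workshop cluster, and the function names are invented for illustration.

```python
def users_per_node(node_mem_gib, core_requests_gib, user_guarantee_gib):
    """How many user pods fit once core services' requests are set aside."""
    usable = node_mem_gib - core_requests_gib
    return int(usable // user_guarantee_gib)


def worst_case_gib(users, user_limit_gib, core_requests_gib):
    """Memory needed if every user reaches the limit at the same time."""
    return users * user_limit_gib + core_requests_gib


# Hypothetical numbers for illustration only.
NODE_MEM = 32.0       # memory of one node, GiB
CORE_REQUESTS = 2.0   # sum of requests declared by core services, GiB
GUARANTEE = 2.0       # memory guaranteed to each user pod, GiB
LIMIT = 3.0           # memory limit for each user pod, GiB

n = users_per_node(NODE_MEM, CORE_REQUESTS, GUARANTEE)
print(n, "users fit by guarantee")
print(worst_case_gib(n, LIMIT, CORE_REQUESTS), "GiB needed if all reach the limit")
```

With these placeholder values the worst case exceeds the node's memory, which is fine only as long as simultaneous peaks stay rare and the core services have requests set so they are not the ones starved.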
@lwasser was there any promise to participants about how long their home directories will be available? Otherwise I'd clean (aka delete) them as part of shutting down the workshop cluster.
No! I told them to download their files as it would all disappear. We can clean it. I'd like to know how to do that. I also have questions post-workshop:
- Data: do the data update now? So if I add a new folder to a dataset, will it update even if the folder already exists?
- If the folder doesn't exist, will it create it? (i.e. @joemcglinchy didn't have data, I think because his data didn't update)
Hub has been turned off via #67 and following documentation in https://earthlab-hub-ops.readthedocs.io/en/latest/day-to-day.html#removing-a-hub
The answer to @lwasser's questions is: yes and yes.
If the data in the docker image change, that will be reflected in the user's home directory (once they stop and start their server to pick up the new image).
Previously there was a bug that it would not update if some of the directory existed. This should be fixed now.
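As an illustration of the behaviour described above (not the hub's actual startup hook), a merge-style copy could be sketched like this. `sync_data` and its arguments are hypothetical; the important part is `dirs_exist_ok=True`, which lets the copy proceed when part of the target tree already exists.

```python
import shutil
from pathlib import Path


def sync_data(image_dir, home_dir):
    """Copy the image's data tree into the home directory, merging with existing folders."""
    src = Path(image_dir)
    dst = Path(home_dir)
    # dirs_exist_ok=True (Python 3.8+) merges into an existing tree
    # instead of raising FileExistsError when a folder is already there.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

New folders are created and existing folders gain any new files. Note that files with the same name are overwritten by the image's copy, so students' edits to the shipped data files would be lost in this sketch.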
Closing this as the event is over and most of the action items from lessons learnt have been done.