jupyter-course's Introduction

Reproducible and Interactive Data Science

Syllabus

The aim of this course is to introduce students to the Jupyter Notebook, an open-source application that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text. Uses include data cleansing and manipulation, numerical simulations, statistical modeling, machine learning, and much more. Through notebooks, research results and the underlying analyses can be transparently reproduced and shared. As an example, see this Notebook on gravitational waves published in Physical Review Letters.

Over three days of alternating video lectures (Intro & Widgets, Libraries, ATLAS Dijet; see below) and hands-on exercises, the participants will learn to construct well-documented electronic notebooks that perform advanced data analyses and produce publication-ready plots. While the course is based on Python, prior Python knowledge is not a prerequisite, and the Jupyter Notebook supports many programming languages. The name Jupyter itself refers to Julia, Python, and R, the main languages of data science.

Credits

4 ECTS.

Workload equivalent to one working week (5 full-time days) for going through the course and seminars (1.5 hp), one working week to complete the individual project and implement corrections (1.5 hp), and 2.5 working days for the peer review of other students' projects (0.5 hp).

Logistics

The course is held in "flipped classroom" mode: after the first introductory and get-to-know-each-other session, the students are expected to go through the videos themselves and attend Q&A sessions with teachers and helpers. All students and some of the teachers will be in LINXS, near Ideon, and other teachers will be on Zoom, with the room coordinates to be given to the participants. There is also a Discord server.

Introductory sessions with teachers on December 6, 2022 from 10:15 to 17:00 (with breaks for fika and lunch)

Q&A Sessions with teachers on December 8 and 9, 2022 from 10:15 to 17:00.

Program

The course consists of a taught component with alternating video lectures (Intro & Widgets, Libraries, ATLAS Dijet; see below) and hands-on exercises. All notebooks shown in the video lectures are available on this site in the lectures folder.

Prerequisites

  • No prior knowledge in Python is required, but familiarity with programming concepts is helpful.
  • A laptop connected to the internet (eduroam, for example), running Linux, macOS, or Windows, with Anaconda installed (see below).
  • Earphones for watching lectures in your own time / re-watching them during the sessions (we will also provide breakout rooms).

If you have little experience with Python or shell programming, the following two tutorials may be helpful:

Preparation Before the First Session

  1. Watch the video lectures (Intro & Widgets, Libraries, ATLAS dijets (see above, Day 3))

  2. Install miniconda3 or, alternatively, the full anaconda3 environment on your laptop (the latter is much larger).

  3. Download the course material (this GitHub repository) and unzip it.

  4. Uncomment the line with "# - gcc # [osx]" in the file environment.yml.

  5. Install and activate the LUcompute environment described by the file environment.yml by running the following in a terminal:

    conda env create -f environment.yml
    conda activate LUcompute

Instructions for Windows:

  1. Watch the video lectures (Intro & Widgets, Libraries, ATLAS dijets (see above, Day 3))

  2. Install miniconda3.

  3. Download the course material (this GitHub repository) and unzip it.

  4. Open the Anaconda Prompt from the start menu.

  5. Navigate to the folder where the course material has been unzipped (e.g. using cd to change directory and dir to list files in a folder).

  6. Install and activate the LUcompute environment described by the file environment.yml by running the following in the Anaconda Prompt:

    conda env create -f environment.yml
    activate LUcompute

Documentation on conda environments

Project Work

The project work consists of the following steps:

  1. Each student will make a Notebook project covering topics from days 1–4 with either:
    • research, presenting data analysis and theory behind a manuscript or published paper. The Notebook should ideally be written such that it can act as supporting information (SI) for a journal. Here's some inspiration.
    • or a Notebook presenting a textbook topic of choice and aimed at students. Here's some inspiration.
    • Deadline for project: January 30th

  2. Each student will upload their project to a public GitHub repository created through GitHub Classroom. For a brief introduction to git repositories, see here. Details and repositories will be made available at the end of the course.

  3. A peer-review process where each student reviews and writes comments on two other notebooks by creating issues on the respective GitHub repositories. The review should be based on the criteria listed below. For each point, include specific suggestions for improvements. The teachers can also add feedback on how to improve the notebook. Deadline for review: February 15th.

  4. The deadline for implementing the reviewer comments on your notebook and answering the GitHub issues is March 15th. At this point you should also have a Zenodo DOI for your project; add this as a badge to your repository, or as a link in your README. You must also add (to your GitHub repository) a text file that explains what changes you've made, and why. This process simulates a peer-review for scientific papers, so you're ready

  5. Save your project to your own GitHub repository when the course has finished, as we may delete it before the next course event.

Notebook Requirements

This checklist summarizes the minimum requirements for the Notebook project to be approved. It should be used as a reference for both the development of the Notebook and the peer-review process.

  • Documentation:
    • title and abstract of the project (max 300 words)
    • includes instructions on how to run the notebook
    • includes the required packages in an environment.yml file
    • includes a brief explanation of the reason each package/library was used
    • includes rich documentation using Markdown (equations, tables, links, images or videos)
    • is reproducible, i.e., someone else should be able to redo the steps
  • Input/Output:
    • uses pandas to read large data sets or numpy to load data from text files
    • uses pandas to save to disk the processed or generated data
  • Scientific computing/data processing:
    • performs numerical operations (numpy, scipy, pandas) or manipulates, groups, and aggregates a data set (pandas)
  • Data visualization:
    • includes at least one composite plot (inset or multiple panels)
    • produces publication-ready figures (see here for an editorial guide on Graphical Excellence):
      • the figures are 89 mm wide (single column) or 183 mm wide (double column)
      • the axes are labeled
      • the font sizes are sufficiently large
      • the figures are saved as rasterized images (300 dpi) or vector art
  • Version control, sharing, and archiving:
    • is archived in a repository with a digital object identifier (DOI)
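
The figure widths in the checklist above are given in millimetres, whereas plotting libraries such as matplotlib expect figure sizes in inches. A minimal sketch of the conversion follows. The helper names `figsize_mm` and `pixels_at_dpi` and the 0.75 height-to-width ratio are illustrative assumptions, not part of the course material.

```python
MM_PER_INCH = 25.4  # exact by definition


def figsize_mm(width_mm, aspect=0.75):
    """Return (width, height) in inches for a figure that is width_mm wide.

    aspect is the height-to-width ratio (an assumed default, adjust freely).
    """
    width_in = width_mm / MM_PER_INCH
    return (width_in, width_in * aspect)


def pixels_at_dpi(size_in, dpi=300):
    """Return the pixel dimensions of a rasterized figure at the given dpi."""
    return tuple(round(side * dpi) for side in size_in)


single = figsize_mm(89)    # single-column width from the checklist
double = figsize_mm(183)   # double-column width from the checklist

print("single column:", single, pixels_at_dpi(single))
print("double column:", double, pixels_at_dpi(double))
```

The resulting tuple could be passed as `figsize` to matplotlib's `plt.subplots`, and `dpi=300` to `fig.savefig` when saving a rasterized image.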

Getting a DOI via Zenodo

Part of your project work consists of adding a Digital Object Identifier (DOI) to your work through Zenodo. To do that, first watch the "day 3" video on version control, sharing, and archiving (GitHub and Zenodo). The easiest and preferred way is to connect your GitHub account to Zenodo, enable the repository in Zenodo, and then make a tag in GitHub, following the instructions here.

Create and Export Conda Environments

The command to create a new environment with Python x.y is

conda create --name myenv python=x.y

where myenv is a name of your choice for the new environment and x.y is a specific Python version (e.g. 2.7 or 3.6). After activating the environment (conda activate myenv), you can install all the other packages within the environment. conda list shows the list of packages installed in the environment. The command to export the active environment myenv to an environment yml file (e.g. myenv.yml) is

conda env export > myenv.yml
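
For orientation, an environment file is a YAML document with a name, a list of channels, and a list of dependencies. The sketch below builds a minimal file of that shape as text. The function name `minimal_environment_yml`, the `conda-forge` channel, and the example package list are assumptions for illustration; a real `conda env export` lists every installed package with pinned versions.

```python
def minimal_environment_yml(name, python_version, packages=()):
    """Return the text of a minimal conda environment file."""
    lines = [
        f"name: {name}",
        "channels:",
        "  - conda-forge",  # assumed channel, replace with the ones you use
        "dependencies:",
        f"  - python={python_version}",
    ]
    # One list entry per additional package, without version pins.
    lines.extend(f"  - {package}" for package in packages)
    return "\n".join(lines) + "\n"


text = minimal_environment_yml("myenv", "3.6", ["numpy", "pandas"])
print(text)
```

Saved as myenv.yml, a file of this shape can be passed to `conda env create -f myenv.yml` to recreate the environment.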

Troubleshooting

If your notebook fails to connect to its kernel and the terminal shows errors similar to the lines below:

[E 12:18:57.001 NotebookApp] Uncaught exception in /api/kernels/5e16fa4b-3e35-4265-89b0-ab36bb0573f5/channels
 Traceback (most recent call last):
   File "/Library/Python/2.7/site-packages/tornado-5.0a1-py2.7-macosx-10.13-intel.egg/tornado/websocket.py", line 494, in _run_callback
     result = callback(*args, **kwargs)
   File "/Library/Python/2.7/site-packages/notebook-5.2.2-py2.7.egg/notebook/services/kernels/handlers.py", line 258, in open
     super(ZMQChannelsHandler, self).open()
   File "/Library/Python/2.7/site-packages/notebook-5.2.2-py2.7.egg/notebook/base/zmqhandlers.py", line 168, in open
     self.send_ping, self.ping_interval, io_loop=loop,
 TypeError: __init__() got an unexpected keyword argument 'io_loop'
[I 12:18:58.021 NotebookApp] Adapting to protocol v5.1 for kernel 5e16fa4b-3e35-4265

You should either a) downgrade the package "tornado" or b) change line 178 of the file

[your conda installation location]/miniconda3/envs/LUcompute/lib/python3.6/site-packages/notebook/base/zmqhandlers.py 

from

             self.send_ping, self.ping_interval, io_loop=loop,

into

             self.send_ping, self.ping_interval,

https://stackoverflow.com/questions/48090119/jupyter-notebook-typeerror-init-got-an-unexpected-keyword-argument-io-l
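
The `io_loop` keyword arguments were removed in tornado 5.0, which is consistent with the traceback above showing the pre-release 5.0a1. A small sketch for deciding whether option a) applies to a given version string follows. The function name `io_loop_removed` is hypothetical, and whether a particular notebook release still passes `io_loop` is an assumption to check against your own traceback.

```python
def io_loop_removed(tornado_version):
    """Return True if this tornado version no longer accepts io_loop arguments."""
    # Only the major version matters: the arguments were dropped in 5.0.
    major = int(tornado_version.split(".")[0])
    return major >= 5


# Version from the traceback path (tornado-5.0a1) and an older 4.x release.
for version in ("5.0a1", "4.5.3"):
    status = "affected" if io_loop_removed(version) else "not affected"
    print(version, status)
```

Inside the LUcompute environment, the installed version string is available as `tornado.version` after `import tornado`.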

External Resources

  • Cross-language interaction is a striking feature of Jupyter notebooks: The possibility to integrate multiple languages in the same notebook makes it feasible to exploit the best tools of the various languages in the different steps of data analysis. You can read more about it in this post.
  • The Jupyter notebook is a very popular tool for working with data in academia as well as in the private sector.
    • These tutorials show how the LIGO/VIRGO collaboration extensively uses Jupyter notebooks to communicate its research.
    • The streaming service Netflix currently uses Jupyter notebooks as the main tool for data analysis. For example, recommendation algorithms which suggest which movies or TV series to watch next are currently run on Jupyter notebooks. You can read more about it in this post.
    • In 2017 Jupyter received the ACM Software System Award, a prestigious award that it shares with projects such as Unix and the Web.
  • There are many freely available online resources to learn data science.
    • The best resource to find help with programming and scripting is Stack Overflow, which is a question and answer website curated by software developer communities.
    • An excellent book is "Python Data Science Handbook" by Jake VanderPlas which is freely available as Jupyter notebooks at this GitHub page. On the author's webpage, you can also find a list of excellent talks, lectures, and tutorials and a blog.
    • Yet another useful resource is the podcast Data Skeptic which features a collection of entertaining and educational mini-lectures on data science as well as interviews with experts.

Contributors

gitesei, mlund, urania277

jupyter-course's Issues

Bokeh lecture notebook

First try at making an "issue report" on GitHub. :)
The issues are in the lecture part about bokeh.

  • The "label" handles are deprecated keywords and should be changed to "legend_label". The Bokeh version in the lecture is 1.0.1 and the newest is 2.2.3.
  • Cell number 8 (the one showing the 2D array) is not actually showing anything.

Ubuntu 16.04 - conda env create -f environment.yml

I uninstalled all my python, ipython and jupyter packages but still obtain the following when doing conda env create -f environment.yml:

...
jupyter_contri 100% |##################################################################################################################################| Time: 0:00:15   1.31 MB/s
+ /home/anton/.miniconda3/envs/LUcompute/bin/python -c 'import logging; from jupyter_contrib_core.notebook_compat.nbextensions import install_nbextension_python; install_nbextension_python('\''jupyter_highlight_selected_word'\'', sys_prefix=True, logger=logging.getLogger())'

+ /home/anton/.miniconda3/envs/LUcompute/bin/python -c 'import logging; from jupyter_contrib_core.notebook_compat.nbextensions import install_nbextension_python; install_nbextension_python('\''latex_envs'\'', sys_prefix=True, logger=logging.getLogger())'

+ /home/anton/.miniconda3/envs/LUcompute/bin/jupyter-nbextensions_configurator enable --sys-prefix
Enabling: jupyter_nbextensions_configurator
- Writing config: /home/anton/.miniconda3/envs/LUcompute/etc/jupyter
    - Validating...
      jupyter_nbextensions_configurator  OK
Enabling notebook nbextension nbextensions_configurator/config_menu/main...
Enabling tree nbextension nbextensions_configurator/tree_tab/main...

+ /home/anton/.miniconda3/envs/LUcompute/bin/jupyter-contrib-nbextension install --sys-prefix
Traceback (most recent call last):
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 664, in _build_master
    ws.require(__requires__)
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 981, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 872, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (parso 0.1.1 (/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages), Requirement.parse('parso==0.1.0'), {'jedi'})

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/anton/.miniconda3/envs/LUcompute/bin/jupyter-contrib-nbextension", line 6, in <module>
    from pkg_resources import load_entry_point
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3138, in <module>
    @_call_aside
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3122, in _call_aside
    f(*args, **kwargs)
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3151, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 666, in _build_master
    return cls._build_from_requirements(__requires__)
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 679, in _build_from_requirements
    dists = ws.resolve(reqs, Environment())
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 867, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'parso==0.1.0' distribution was not found and is required by jedi

ERROR conda.core.link:_execute_actions(337): An error occurred while installing package 'conda-forge::jupyter_contrib_nbextensions-0.3.3-py36_0'.
LinkError: post-link script failed for package conda-forge::jupyter_contrib_nbextensions-0.3.3-py36_0
running your command again with `-v` will provide additional information
location of failed script: /home/anton/.miniconda3/envs/LUcompute/bin/.jupyter_contrib_nbextensions-post-link.sh
==> script messages <==
+ /home/anton/.miniconda3/envs/LUcompute/bin/jupyter-contrib-nbextension install --sys-prefix
Traceback (most recent call last):
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 664, in _build_master
    ws.require(__requires__)
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 981, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 872, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (parso 0.1.1 (/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages), Requirement.parse('parso==0.1.0'), {'jedi'})

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/anton/.miniconda3/envs/LUcompute/bin/jupyter-contrib-nbextension", line 6, in <module>
    from pkg_resources import load_entry_point
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3138, in <module>
    @_call_aside
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3122, in _call_aside
    f(*args, **kwargs)
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3151, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 666, in _build_master
    return cls._build_from_requirements(__requires__)
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 679, in _build_from_requirements
    dists = ws.resolve(reqs, Environment())
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 867, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'parso==0.1.0' distribution was not found and is required by jedi


Attempting to roll back.


LinkError: post-link script failed for package conda-forge::jupyter_contrib_nbextensions-0.3.3-py36_0
running your command again with `-v` will provide additional information
location of failed script: /home/anton/.miniconda3/envs/LUcompute/bin/.jupyter_contrib_nbextensions-post-link.sh
==> script messages <==
+ /home/anton/.miniconda3/envs/LUcompute/bin/jupyter-contrib-nbextension install --sys-prefix
Traceback (most recent call last):
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 664, in _build_master
    ws.require(__requires__)
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 981, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 872, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (parso 0.1.1 (/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages), Requirement.parse('parso==0.1.0'), {'jedi'})

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/anton/.miniconda3/envs/LUcompute/bin/jupyter-contrib-nbextension", line 6, in <module>
    from pkg_resources import load_entry_point
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3138, in <module>
    @_call_aside
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3122, in _call_aside
    f(*args, **kwargs)
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3151, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 666, in _build_master
    return cls._build_from_requirements(__requires__)
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 679, in _build_from_requirements
    dists = ws.resolve(reqs, Environment())
  File "/home/anton/.miniconda3/envs/LUcompute/lib/python3.6/site-packages/pkg_resources/__init__.py", line 867, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'parso==0.1.0' distribution was not found and is required by jedi

Later, I can't enable the rubberband and exercise2 extensions in the LUcompute environment:

(LUcompute) anton@ebbe:jupyter-course$ jupyter nbextension enable rubberband/main
Enabling notebook extension rubberband/main...
      - Validating: problems found:
        - require?  X rubberband/main
(LUcompute) anton@ebbe:jupyter-course$ jupyter nbextension enable exercise2/main
Enabling notebook extension exercise2/main...
      - Validating: problems found:
        - require?  X exercise2/main
(LUcompute) anton@ebbe:jupyter-course$ jupyter nbextension enable --py widgetsnbextension
Enabling notebook extension jupyter-js-widgets/extension...
      - Validating: OK

Any help appreciated!
