jupyter-course's Introduction

Reproducible and Interactive Data Science

Syllabus

The aim of this course is to introduce students to the Jupyter Notebook, an open-source web application for creating and sharing documents that contain live code, equations, visualizations and explanatory text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more. Through notebooks, research results and the underlying analysis can be transparently reproduced and shared. For an example of a Notebook on gravitational waves published in Phys. Rev. Lett., see here.

During three days of alternating video lectures and hands-on exercises, the participants will learn to construct well-documented electronic notebooks that perform advanced data analysis and produce publication-ready plots. The course is based on Python, but prior Python knowledge is not a prerequisite, and many other programming languages can be used in notebooks.

Credits

4 ECTS.

Program

Sessions: December 3-5, 2018. Project presentations: January 14-15, 2019. All days run from 10:15 to 15:00.

Location: Sölvegatan 27 (The observatory), Department of Astronomy and Theoretical Physics, Rooms Cassiopeia (Monday morning; Thursday afternoon; and all Friday) or Andromeda.

The course consists of five full days: three with alternating video lectures and hands-on exercises, and two with project presentations. Lectures are available as Notebooks on this site in the lectures folder.

  • Day 1. Introduction

    • Introduction and overview of Jupyter notebooks
    • Installation and package management (Anaconda, managing environments)
    • Navigating cells and IPython "magic" commands
    • Online resources and getting help
    • Documenting using Markdown: rich text, equations, images, tables, video.
    • Other languages (bash, cython, R, etc.)
    • Online viewing, conversion, sharing, version control (GitHub, Zenodo, Binder, nbviewer)
  • Day 2. Numerical Methods, Plotting, and visualization

    • Storage and manipulation of numerical arrays (numpy)
    • Plotting in Notebooks (matplotlib, seaborn)
    • Arranging plots, customizing
    • Making plots publication ready
    • Exporting to vectorized file formats
    • Interactive visualization (ipywidgets)
  • Day 3. Numerical methods and data science

    • Scientific python (scipy)
    • Symbolic math (sympy)
    • Data parsing and import (csv, excel, json, pickle, pdf, custom files, etc.)
    • Working with large datasets (pandas)
  • Day 4 and 5. Project presentations

Prerequisites

  • No prior knowledge of Python is required, but familiarity with programming concepts is helpful.
  • A laptop connected to the internet (eduroam, for example), running Unix, macOS, or Windows, with Anaconda installed (see below).
  • Earphones for silently watching lectures during the sessions.

If you have little experience with Python or shell programming, the following two tutorials may be helpful:

Preparations before the first session

  1. Watch the video lectures which will be provided here one week before course start.

  2. Install miniconda3 or, alternatively, the full anaconda3 environment on your laptop (the latter is much larger).

  3. Download the course material (this github repository) and unzip.

  4. Install and activate the LUcompute environment described by the file environment.yml by running the following in a terminal (inside the jupyter-course directory where you should see environment.yml):

    conda env create -f environment.yml
    source activate LUcompute
    jupyter nbextension enable rubberband/main
    jupyter nbextension enable exercise2/main
    jupyter nbextension enable --py widgetsnbextension

Instructions for Windows:

  1. Install miniconda3.

  2. Download the course material (this github repository) and unzip.

  3. Open the anaconda prompt from the start menu.

  4. Navigate to the folder where the course material has been unzipped (e.g. using cd to change directory and dir to list files in a folder).

  5. Install and activate the LUcompute environment described by the file environment.yml by running the following in the anaconda prompt:

    conda env create -f environment.yml
    activate LUcompute
    jupyter nbextension enable rubberband/main
    jupyter nbextension enable exercise2/main
    jupyter nbextension enable --py widgetsnbextension
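
After activating the environment, it can be useful to confirm that the packages used in the course are importable. The following is a minimal sketch, not part of the course material: the package list is an assumption taken from the course program, and the authoritative list is whatever environment.yml specifies. The helper name `check_packages` is hypothetical.

```python
import importlib.util
import sys

# Assumption: top-level package names taken from the course program;
# environment.yml is the authoritative list.
COURSE_PACKAGES = [
    "numpy", "scipy", "matplotlib", "seaborn",
    "pandas", "sympy", "ipywidgets",
]


def check_packages(names):
    """Map each top-level package name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # The executable path should point inside the LUcompute environment.
    print("Python executable:", sys.executable)
    for name, found in check_packages(COURSE_PACKAGES).items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

If a package is reported as MISSING, the environment is probably not activated or was not created from environment.yml.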

Further Information

Troubleshooting

If your notebook fails to connect to its kernel and the terminal shows an error similar to the lines below:

[E 12:18:57.001 NotebookApp] Uncaught exception in /api/kernels/5e16fa4b-3e35-4265-89b0-ab36bb0573f5/channels
 Traceback (most recent call last):
   File "/Library/Python/2.7/site-packages/tornado-5.0a1-py2.7-macosx-10.13-intel.egg/tornado/websocket.py", line 494, in _run_callback
     result = callback(*args, **kwargs)
   File "/Library/Python/2.7/site-packages/notebook-5.2.2-py2.7.egg/notebook/services/kernels/handlers.py", line 258, in open
     super(ZMQChannelsHandler, self).open()
   File "/Library/Python/2.7/site-packages/notebook-5.2.2-py2.7.egg/notebook/base/zmqhandlers.py", line 168, in open
     self.send_ping, self.ping_interval, io_loop=loop,
 TypeError: __init__() got an unexpected keyword argument 'io_loop'
[I 12:18:58.021 NotebookApp] Adapting to protocol v5.1 for kernel 5e16fa4b-3e35-4265

You should either a) downgrade the package "tornado" or b) change line 178 of the file

[your conda installation location]/miniconda3/envs/LUcompute/lib/python3.6/site-packages/notebook/base/zmqhandlers.py 

from

             self.send_ping, self.ping_interval, io_loop=loop,

into

             self.send_ping, self.ping_interval,

https://stackoverflow.com/questions/48090119/jupyter-notebook-typeerror-init-got-an-unexpected-keyword-argument-io-l
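
To decide whether option a) applies, you can check which tornado version is installed. The traceback above shows tornado 5.0a1 running with notebook 5.2.2, and tornado 5 no longer accepts the `io_loop` argument. The sketch below assumes that this version mismatch is the cause; the function names are hypothetical and only illustrate the check.

```python
try:
    import tornado
except ImportError:  # tornado is not installed in this environment
    tornado = None


def major_version(version):
    """Return the leading integer of a version string such as '5.0a1'."""
    return int(version.split(".")[0])


def tornado_is_5_or_newer(version):
    """True if the given tornado version string is 5.0 or newer."""
    return major_version(version) >= 5


if __name__ == "__main__":
    if tornado is None:
        print("tornado is not installed in this environment")
    elif tornado_is_5_or_newer(tornado.version):
        print("tornado", tornado.version, "- an older notebook package may fail with it")
    else:
        print("tornado", tornado.version, "- predates the io_loop removal")
```

Run this inside the activated LUcompute environment so that it inspects the same tornado installation the notebook server uses.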

Project Work

The project work consists of three steps:

  1. Each student will make a Notebook project covering topics from days 1-3, either as

    • research, presenting the data analysis and theory behind a manuscript or published paper. The Notebook should ideally be written such that it can act as supporting information (SI) for a journal. Here's some inspiration.
    • or a Notebook presenting a textbook topic of choice, aimed at students. Here's some inspiration.

     Deadline for project: January 2, 2019

  2. A peer-review process where each student reviews and, in writing, comments on two other notebooks. The review should be based on the criteria listed below and, for each point, include specific suggestions for improvements. Deadline for review: January 8, 2019

  3. Notebook presentation to the class (days 4 and 5). Maximum 10 minutes per participant; include your response to the referee reports.

Notebook Requirements

The notebook must

  • include rich documentation using Markdown, equations, tables, links, etc.
  • import or generate data. If generating, data should be exported to disk.
  • perform data operations using numpy, scipy, pandas or equivalent.
  • create plots of publication-ready quality. For an editorial guide on Graphical Excellence, see here.
  • include instructions on how to run the notebook, including the required packages. This could be an environment.yml file.
  • be reproducible, i.e. someone else should be able to redo the steps

Further, the notebook could

  • act as supporting information for an article
  • have a digital object identifier (DOI)
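
As an illustration of the "generate, export, and be reproducible" requirements, here is a minimal sketch that generates data with a fixed random seed, writes it to disk, and reads it back. It uses only the Python standard library so that it runs without the course environment; in a project notebook one would more typically use numpy or pandas. The file name, the seed, and the linear model are arbitrary assumptions chosen for the example.

```python
import csv
import random
import statistics
from pathlib import Path


def generate_data(n, seed=2018):
    """Generate n noisy samples of y = 2x + 1 using a fixed seed."""
    rng = random.Random(seed)  # fixed seed makes every run identical
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)) for x in range(n)]


def export_csv(rows, path):
    """Write (x, y) rows to a CSV file with a header line."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["x", "y"])
        writer.writerows(rows)


def load_csv(path):
    """Read the CSV file back into a list of (x, y) tuples."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        return [(int(row["x"]), float(row["y"])) for row in reader]


if __name__ == "__main__":
    out = Path("generated_data.csv")  # hypothetical output file name
    export_csv(generate_data(50), out)
    data = load_csv(out)
    print("rows:", len(data))
    print("mean y:", statistics.mean(y for _, y in data))
```

Because the seed is fixed and the data file is written to disk, a reviewer can either regenerate the data or start from the exported file and should arrive at the same numbers.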
