scitools-classroom / courses

Python courses for the scientific researcher

License: BSD 3-Clause "New" or "Revised" License

Shell 0.27% Jupyter Notebook 94.78% Roff 4.96%

courses's Issues

Reintroduce Travis notebook tests

Since #150 we haven't had any proper Travis testing, because we broke it.

Travis used to make the notebooks into a non-interactive document build, but we have stopped doing that.
We could re-introduce this.

Mostly, during the 'feature_self_learn' development, we tried to deliver notebooks that would all run through without errors.
The many new "sample solution" code examples are also designed to run: the test run should first seek out and un-comment the "# %load" lines.
Ideally, everything will run through, and that can be our "notebook tests" !
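As a rough sketch of that un-commenting step, the snippet below rewrites "# %load" lines in the code cells of a notebook that has already been read as JSON. The function name `uncomment_load_lines` and the sample notebook are hypothetical, and the nbformat 4 layout (a "cells" list whose entries carry "cell_type" and "source") is assumed. As far as I know, `%load` only replaces a cell's text rather than running it, so a real test run might need to inline the solution file instead.

```python
import json
import re

# Matches a commented-out load magic such as "# %load solutions/ex1.py".
LOAD_PATTERN = re.compile(r"^(\s*)#\s*(%load\b.*)$")


def uncomment_load_lines(notebook):
    """Return a copy of a notebook dict with '# %load' lines un-commented.

    Hypothetical helper: assumes an nbformat 4 style dict.
    """
    result = json.loads(json.dumps(notebook))  # deep copy of JSON-like data
    for cell in result.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # leave markdown and raw cells untouched
        source = cell.get("source", [])
        as_list = isinstance(source, list)
        lines = source if as_list else source.splitlines(keepends=True)
        fixed = [LOAD_PATTERN.sub(r"\1\2", line) for line in lines]
        cell["source"] = fixed if as_list else "".join(fixed)
    return result


# Made-up notebook content, for illustration only.
sample = {
    "cells": [
        {"cell_type": "code", "source": ["# %load solutions/ex1.py\n", "print(1)\n"]},
        {"cell_type": "markdown", "source": ["# %load not code\n"]},
        {"cell_type": "code", "source": "# %load solutions/ex2.py"},
    ]
}
converted = uncomment_load_lines(sample)
print(converted["cells"][0]["source"][0], end="")
```

The original dict is left unchanged, so the same notebook can still be compared before and after.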

( NOTE: at the moment though, we are getting stuck in a "run all cells" operation. This could be a Jupyter problem or local only ? )

There may be one or two remaining examples that fail on purpose, for "what went wrong there?" demonstrations. We would need to handle those, maybe with try/except (as in several of the code solution examples).
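One possible shape for the try/except approach is sketched here. The helper name `demonstrate_failure` and the failing list lookup are made up for illustration; they are not taken from the course notebooks.

```python
def demonstrate_failure(action, expected_error):
    """Run a purposely failing action and report the error instead of raising it."""
    try:
        action()
    except expected_error as err:
        # The cell still shows what went wrong, but the notebook keeps running.
        return f"{type(err).__name__}: {err}"
    raise AssertionError("expected the example to fail, but it succeeded")


# Hypothetical failing example: indexing past the end of a list.
message = demonstrate_failure(lambda: [1, 2, 3][5], IndexError)
print(message)  # IndexError: list index out of range
```

Catching only the expected error type means an unrelated failure would still stop a "run all cells" test.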

Adapting course material - citation/attribution

Hi there,

I just wanted to check if it is OK to use these course materials as the basis for an Iris tutorial I am writing for https://github.com/ourcodingclub/ourcodingclub.github.io (I have to develop the materials in a set layout/format, otherwise I would just deliver the course 'as is' from the notebook.)

I see it's GPL licensed but just wanted to check if you have any specific citation/attribution, or copyright messages you would like included. (Other than the GPL)

Best wishes,
Declan

University of Edinburgh GeoSciences

Iris course - conda environment

In Section 6 of the Iris course (https://github.com/scitools-classroom/courses/blob/master/course_content/iris_course/6.Data_Processing.ipynb), the solution # %load solutions/iris_exercise_6.3e raises an error that nc-time-axis is not found when running in a suitable conda environment as documented in the README ($ conda create -n testenv iris iris-sample-data jupyter).

At the AVD Surgery, I was advised to set up a conda environment with nc-time-axis specified ($ conda create -n testenv iris iris-sample-data jupyter nc-time-axis). This successfully avoids the error. Perhaps the README could be updated?

Thanks

numpy broadcasting rules could do with simplification

The Broadcasting section of the numpy course can be difficult to understand because of its confusing technical language. The concept itself is quite straightforward, but the sentences describing the rules have several layers of complexity, and they do not correspond to the graphics even though it seems they should, so the images just add to the confusion.

I think this section would be easier to understand if the images matched the rules and helped the user unpack the loaded phrases, such as 'the shape of the array with fewer dimensions ... padded ... leading (left) side'. This would be easier to follow if, for example, a code cell showed the shapes of the arrays, and then what it means to 'pad' one shape to match the other.
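To illustrate the kind of code cell meant here, the following is a minimal pure-Python sketch of the padding rule working on shape tuples only. The function names `pad_shape` and `broadcast_shape` are hypothetical, and this is a teaching aid, not numpy's implementation.

```python
def pad_shape(shape, ndim):
    """Pad a shape with ones on the leading (left) side up to ndim dimensions."""
    return (1,) * (ndim - len(shape)) + tuple(shape)


def broadcast_shape(shape_a, shape_b):
    """Work out the broadcast shape of two arrays from their shapes alone."""
    ndim = max(len(shape_a), len(shape_b))
    padded_a = pad_shape(shape_a, ndim)
    padded_b = pad_shape(shape_b, ndim)
    result = []
    for dim_a, dim_b in zip(padded_a, padded_b):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)  # equal sizes, or b is stretched to match a
        elif dim_a == 1:
            result.append(dim_b)  # a is stretched to match b
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} do not broadcast")
    return tuple(result)


print(pad_shape((3,), 2))             # (1, 3)
print(broadcast_shape((4, 3), (3,)))  # (4, 3)
print(broadcast_shape((4, 1), (3,)))  # (4, 3)
```

Printing the padded shape before the broadcast result makes the "padded on the leading (left) side" phrase concrete: (3,) becomes (1, 3) before it is compared with (4, 3).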

Incomplete sentence in Iris course section 7 (Advanced_Concepts)

One of the cells in the Advanced_Concepts part of the Iris course reads only

As you can see, by loading a

numpy course feedback May17

indexing exercise

remove commas from print output, numpy prints an array as

[1 4 5]

not

[1, 4, 5]

alter example to return

[[1]
 [4]]

not

[[1 4]]

Missing html files

The make.sh would suggest that html files should be made, but this doesn't seem to be the case.

course feedback 1mar

https://github.com/SciTools/courses/blob/a5e4a5f88d4f8c8590dc3dd6fc2ca40199031270/course_content/notebooks/numpy_intro.ipynb

The notebook says the result of arr_2d[0, ::2] is [[1, 3], [4, 6]].

This is wrong! That result comes from arr_2d[0:, ::2].

the first column, retaining the outside dimension: resulting in [[1, 4]]

tricksy, intentional??

conditional indexing is useful and interesting, include?

print(np.where(arr_2d == 4))
print(arr_2d[arr_2d % 2 == 0])
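The indexing points above can be checked with a short sketch. It assumes arr_2d is the 2x3 array [[1, 2, 3], [4, 5, 6]], which is inferred from the results quoted in this feedback and not confirmed against the notebook.

```python
import numpy as np

# Assumed contents of arr_2d, inferred from the quoted results.
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

# An integer index drops a dimension; a slice keeps it.
first_row_every_other = arr_2d[0, ::2]   # 1-D result: [1 3]
all_rows_every_other = arr_2d[0:, ::2]   # 2-D result: [[1 3] [4 6]]

# One way to keep the first column two-dimensional, with shape (2, 1).
first_column_2d = arr_2d[:, :1]

# Conditional indexing: where a value sits, and which values are even.
rows, cols = np.where(arr_2d == 4)       # row 1, column 0
evens = arr_2d[arr_2d % 2 == 0]          # [2 4 6]

print(first_row_every_other)
print(all_rows_every_other)
print(first_column_2d.shape)
print(rows, cols)
print(evens)
```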

The Array Object: Summary of key points

properties : shape, dtype
arrays are homogeneous, all elements have the same type: dtype
creation : array([list]), ones, zeros, arange, linspace
- indexing arrays to produce further arrays, subsets of the original
- multi-dimensional indexing and conditional indexing
~~indexing like Python objects : integers and slices~~
~~indexing produces further array objects~~
~~multi-dimensional indexing with multiple indices~~
~~indexing differences from list-of-lists~~

desired_result = np.array([[  1,   2,   3],
                           [104, 105, 406],
                           [407, 408, 409]])

You can assume that the ordering of the values is the same as in the earlier example. That is, the order is [day1-station1, day1-station2, day1-station3, day2-station1, ...] and so on.

this is not clear enough, needs another look

lessons for calcs:

defensive programming
clear syntax

masked array:

how to unmask a masked value
space for the exercise (new cell)
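For the unmasking point, a small sketch of two options follows. The array values and the helper `make_temps` are invented for illustration, and the default "soft" mask is assumed.

```python
import numpy as np


def make_temps():
    """Hypothetical data: three readings with the middle one masked."""
    return np.ma.masked_array([10.0, 99.0, 12.0], mask=[False, True, False])


# Option 1: assign a new valid value; with a soft mask this also unmasks the entry.
by_assignment = make_temps()
by_assignment[1] = 11.0

# Option 2: set a new mask with the flag cleared, keeping the underlying data value.
by_flag = make_temps()
new_mask = by_flag.mask.copy()
new_mask[1] = False
by_flag.mask = new_mask

print(by_assignment)
print(by_flag)
```

Option 2 exposes whatever value was stored under the mask (99.0 here), which may or may not be meaningful data.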

efficiency

when to optimise, as well as how

Feedback from Iris course (02/16)

A few niggles and oversights within the Iris course that are hanging over from the recent updates to the course:

In §Constraints, in the part on constraining on time, under the code cell iris.FUTURE.cell_datetime_objects = True, the markdown cell reads "it is now possible to do the same constraint" when the "same constraint" being referenced has been removed.
Still in §Constraints, perhaps we should add a subsection 'Time Constraints'.
In §Plotting and the comparison of iplt with qplt, we need to adjust wspace not hspace.
In §Cube maths, when the scenario difference is calculated the markdown cell below incorrectly states that "the coordinates “time” and “forecast_period” have been removed".

Out of date coordinate creation loop in: Iris Tutorial, Chapter 4

In code block 19, new coordinates are added to a cube to allow for a merge. However, their units are not hard coded, which causes them to load as unknown, whereas the existing coordinates have units of 1. Potential solution below, although a suitable explanation will also need to be added.

[dask] task graphs one to many

Dask task graphs seem to be optimised for many-to-one processing: you load multiple files and perform some sort of aggregation or reduction to arrive at a single result. This paradigm seems much less common in Iris, where you might need to...

take multiple statistics of a single cube and return all of them (e.g. retrieve mean and standard deviation concurrently)
extract multiple sub-cubes out of one or more input cubes
and so on.

Put together an example of making a graph that looks like this and then computing the graph to easily retrieve the requested data.

Use Binder Jupyterlab

I like the revamp of the courses a lot, great work!

Good to see Binder being used for the notebooks. I have a couple of suggestions of where you could go next for a better user interaction:

Use Jupyter Lab instead of Jupyter Notebooks on Binder
It is easy to implement by changing the end of the Binder url from
...?filepath=path%to%notebook.ipynb to ...?urlpath=lab/tree/path/to/notebook.ipynb
See this example from the Informatics Lab
https://binder.pangeo.io/v2/gh/informatics-lab/itk-3dvis/master?urlpath=lab/tree/itk-3dvis.ipynb
and guidance on some of the nuances in this repo
https://github.com/binder-examples/jupyterlab
Use Pangeo Binder
https://binder.pangeo.io is a Binder service from the Pangeo organisation on Google Cloud Services. It has more resources than the regular Binder deployment so might have a bit more grunt for Iris's data crunching requirements.
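The URL change described above could be scripted along these lines. This is only a sketch: the helper name `to_jupyterlab_url` is hypothetical, and the mybinder.org link in the example is an assumed form of this repository's Binder link, not one copied from the README.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_jupyterlab_url(binder_url):
    """Swap a Binder '?filepath=' query for '?urlpath=lab/tree/...'."""
    parts = urlsplit(binder_url)
    new_params = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == "filepath":
            # JupyterLab opens files under the lab/tree/ prefix.
            new_params.append(("urlpath", "lab/tree/" + value.lstrip("/")))
        else:
            new_params.append((key, value))
    query = urlencode(new_params, safe="/")  # keep slashes readable
    return urlunsplit(parts._replace(query=query))


# Assumed example link; the notebook path is the one mentioned in the conda issue.
classic = ("https://mybinder.org/v2/gh/scitools-classroom/courses/master"
           "?filepath=course_content/iris_course/6.Data_Processing.ipynb")
lab_url = to_jupyterlab_url(classic)
print(lab_url)
```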

Feedback from NumPy course (02/16)

The course needs more exercises - there are long sections of teaching with no breaks / changes of style.

For example, a short exercise (on creating/indexing arrays?) before §Multidimensional Array Creation.
Change the values / number of points in the x and y arrays in the final exercise.

Sections to possibly remove from Iris course (04/16)

Sections of the Iris course that may be excess to requirements and could be removed without leaving a big unfilled gap in what the Iris course teaches:

cell - cell comparisons: the section on cell - point comparisons shows that cell comparisons are possible. The section on cell - cell comparisons just goes back over this, but makes the situation far more complex because it isn't clear why cell - cell comparisons behave as they do.
Partial datetimes: in many ways they don't add anything that you can't do with a cell datetime object.

Produce physical resources

One way to boost learning is to improve the physical environment within which these courses are taught. We could do this by producing posters that relate to the teaching material that are added to the learning environment.

The resources should be primarily graphical and ideally colourful too. Some possible examples:

The NumPy broadcasting example image
The different types of matplotlib plots shown in the course
The pictographic representation of a cube
Cube data plotted on a cartopy map

Broken Link to iris.save documentation

In the Iris course Chapter 2.2. Saving Cubes, there is a link to iris.save documentation that does not work: https://scitools.org.uk/iris/docs/latest/iris/iris.html#iris.save

It should be updated to this link: https://scitools-iris.readthedocs.io/en/stable/userguide/saving_iris_cubes.html?highlight=iris.save#saving-iris-cubes

Merge resource behaviour in Python3 / Iris v2

When using Python3 & Iris v2, the resources for the merge exercise load differently than when using Python2 and Iris v1. Specifically, it appears that each dataset is loaded twice. This needs to be fixed, preferably so that behaviour is consistent between the two Python/Iris combinations.

Investigate:

is this being caused by Python 3 (i.e. has wildcard filename match behaviour changed?)
is this being caused by Iris v2 (i.e. has dataset load behaviour changed in a manner that we have not allowed for?)
is this being caused by a combination of the above?
is this being caused by something else?

[Dask] merge conflicts

Add a subsection detailing what to do if you use dask + Iris for parallel file loading and encounter merge conflicts when you try to merge the cubes.

For example:

import glob
import dask.bag as db
import iris
from dask import delayed

files = glob.glob(iris.sample_data_path('GloSea4', '*.pp'))
cb = db.from_sequence(files).map(iris.load_cube)
dmc = delayed(lambda cubes: iris.cube.CubeList(cubes).merge_cube())(cb)
dmc.compute()

MergeError                                Traceback (most recent call last)
...
MergeError: failed to merge into a single cube.
  Coordinates in cube.aux_coords (scalar) differ: realization.
  Coordinates in cube.aux_coords (non-scalar) differ: forecast_period.

Iris course - corrupt notebook (7.Advanced concepts)

The Advanced Concepts notebook in the Iris course seems to be corrupt.

A solution appears to be adding ", to the end of line 497.

Variables are referred to with Exercises in between

For example:
In the Iris course 3 (Subcube_Extraction), in section 3.2, the cubes variable is defined at the top of the section and is referred to at the bottom. In between, there are two exercises where the user may redefine or change the cubes variable, the second of which involves loading from another file. When loading this other file, I used the same name, cubes, which meant that, later on, the notebook was referring to a different CubeList than it was expecting.
It may be wise to redefine such variables when there is an exercise in between for better consistency.

Iris: area-averaging not explained

There is an exercise asking for area-averaging, but we don't explain it anywhere. Bit too far of a leap to skip that, so needs adding to the course.

[Dask] Explore re-ordering §load content

As noted in #107 (review), the content in the loading section may not be as clear as it could be. We should explore re-ordering the content in this section to bring all the pure Iris code together into a single subsection as the demonstration, and then provide the dask + Iris code in a following single subsection as a comparison.

Note: this may not make the content any clearer. If it doesn't, we can make a note here of the fact we explored the option and then leave it as it is.

Feedback from Iris course on 08/16

Use working examples that may appeal to, or are tailored to, scientists from:

applied science
climate science
foundation science
weather science

Introduction to python course

There is a set of things that we teach in every course that we run. These things don't directly relate to any of the specific SciTools courses; they are general Python things that are nevertheless useful to know.

They are:

*args and **kwargs,
list comprehensions,
variable unpacking,
__str__ and __repr__,
naming conventions for classes, functions and methods, and
probably a few other bits as well.

These could be added to a mini-course to be taught alongside the other existing courses.
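As an indication of how compact such a mini-course example could be, the sketch below touches each listed topic once. The `Station` class and its values are hypothetical and are not drawn from any existing course notebook.

```python
class Station:
    """Hypothetical example class; class names use CapWords."""

    def __init__(self, name, *readings, **metadata):
        # *readings collects extra positional arguments into a tuple,
        # **metadata collects extra keyword arguments into a dict.
        self.name = name
        self.readings = list(readings)
        self.metadata = metadata

    def __repr__(self):
        # Unambiguous form, aimed at developers.
        return f"Station({self.name!r}, readings={self.readings!r})"

    def __str__(self):
        # Readable form, used by print().
        return f"{self.name} ({len(self.readings)} readings)"

    def mean_reading(self):
        # Functions and methods use lowercase_with_underscores.
        return sum(self.readings) / len(self.readings)


station = Station("Exeter", 11.5, 12.5, 13.0, country="UK")

# Variable unpacking: first item, then everything else.
lowest, *others = sorted(station.readings)

# List comprehension: build a new list from an existing one.
doubled = [value * 2 for value in station.readings]

print(repr(station))  # Station('Exeter', readings=[11.5, 12.5, 13.0])
print(str(station))   # Exeter (3 readings)
```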
