scitools-classroom / courses
Python courses for the scientific researcher
License: BSD 3-Clause "New" or "Revised" License
Since #150 we have had no proper Travis testing, because we broke it.
Travis used to make the notebooks into a non-interactive document build, but we have stopped doing that.
We could re-introduce this.
Mostly, during the 'feature_self_learn' development, we tried to deliver notebooks that would all run through without errors.
The many new "sample solution" code examples are also designed to run: a test run should first seek out and un-comment the "# %load" lines.
Ideally, everything will run through, and that can be our "notebook tests" !
( NOTE: at the moment though, we are getting stuck in a "run all cells" operation. This could be a Jupyter problem or local only ? )
There may be one or two remaining examples that fail on purpose, for "what went wrong there?" demonstrations. We would need to fix those, maybe with try/except (as in several code solution examples).
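As a rough sketch of the "seek out # %load lines" step, the function below works on a notebook that has already been read in as a dictionary (for example with json.load). As far as we know, running an un-commented %load only inserts the file text into the cell rather than executing it, so this sketch inlines the solution source instead. The function name, the LOAD_PREFIX constant and the assumption that solution files carry a ".py" suffix are illustrative, not taken from the repository.

```python
from pathlib import Path

LOAD_PREFIX = "# %load "


def inline_solutions(notebook: dict, base_dir: Path) -> int:
    """Replace each '# %load <path>' line in code cells with the file's text.

    Returns the number of lines that were replaced.
    """
    replaced = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of strings.
        text = source if isinstance(source, str) else "".join(source)
        out_lines = []
        for line in text.splitlines():
            if line.startswith(LOAD_PREFIX):
                target = base_dir / line[len(LOAD_PREFIX):].strip()
                if not target.exists():
                    # Assumption: solution files are stored with a ".py" suffix.
                    target = Path(str(target) + ".py")
                out_lines.append(target.read_text().rstrip("\n"))
                replaced += 1
            else:
                out_lines.append(line)
        cell["source"] = "\n".join(out_lines)
    return replaced
```

The modified notebook could then be written back out and executed, for instance with jupyter nbconvert --to notebook --execute, so that a failing cell fails the test run.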
Hi there,
I just wanted to check if it is OK to use these course materials as the basis for an Iris tutorial I am writing for https://github.com/ourcodingclub/ourcodingclub.github.io (I have to develop the materials in a set layout/format, otherwise I would just deliver the course 'as is' from the notebook.)
I see it's GPL licensed but just wanted to check if you have any specific citation/attribution, or copyright messages you would like included. (Other than the GPL)
Best wishes,
Declan
University of Edinburgh GeoSciences
In Section 6 of the Iris course (https://github.com/scitools-classroom/courses/blob/master/course_content/iris_course/6.Data_Processing.ipynb), the solution # %load solutions/iris_exercise_6.3e raises an error that nc-time-axis is not found when running in a suitable conda environment as documented in the README ($ conda create -n testenv iris iris-sample-data jupyter).
At the AVD Surgery, I was advised to set up a conda environment with nc-time-axis specified ($ conda create -n testenv iris iris-sample-data jupyter nc-time-axis). This successfully avoids the error. Perhaps the README could be updated?
Thanks
The numpy course Broadcasting section can be difficult to understand due to confusing technical language. The concept itself is quite straightforward, but there are several layers of complexity in the sentences which describe the rules, and they do not correlate to the graphics although they seem like they should, so the images actually just become confusing.
I think that this section would be easier to understand if the images matched the rules and allowed the user to understand all the loaded phrases (like 'the shape of the array with fewer dimensions ... padded ... leading (left) side'; this would be easier to unpack if you maybe showed the shape of the dimensions in a code cell, and then what it means to 'pad' it to match the shape of the other array).
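Something along these lines might work as the code cell that unpacks the "padding" phrase. It is only a sketch: the array names and shapes are made up for illustration and are not taken from the course notebook.

```python
import numpy as np

a = np.ones((3, 4))   # shape (3, 4)
b = np.arange(4)      # shape (4,)

# Rule 1: the shape of the array with fewer dimensions is padded
# with ones on its leading (left) side until both have the same ndim.
padded_b_shape = (1,) * (a.ndim - b.ndim) + b.shape
print(padded_b_shape)

# Rule 2: any dimension of size 1 is stretched to match the other array.
b_stretched = np.broadcast_to(b.reshape(padded_b_shape), a.shape)
print(b_stretched.shape)

# The explicit steps above give the same answer as plain broadcasting.
result = a + b
print(np.array_equal(result, a + b_stretched))
```

Showing the padded shape and the stretched shape as separate printed values would let the graphics be matched to each rule in turn.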
One of the cells in the Advanced_Concepts part of the Iris course reads only: "As you can see, by loading a"
indexing exercise
remove commas from print output: numpy prints an array as [1 4 5], not [1, 4, 5]
alter example to return [[1] [4]], not [[1 4]]
The make.sh would suggest that html files should be made, but this doesn't seem to be the case.
the stated result of arr_2d[0, ::2] is [[1, 3], [4, 6]]: wrong! That is the result of arr_2d[0:, ::2]
the first column, retaining the outside dimension: resulting in [[1, 4]]
tricksy, intentional??
conditional indexing is useful and interesting, include?
print(np.where(arr_2d == 4))
print(arr_2d[arr_2d % 2 == 0])
The Array Object: Summary of key points
desired_result = np.array([[ 1, 2, 3],
[104, 105, 406],
[407, 407 8, 409]])
You can assume that the ordering of the values is the same as in the earlier example. That is, the order is [day1-station1, day1-station2, day1-station3, day2-station1, ...] and so on.
this is not clear enough, needs another look
lessons for calcs:
masked array:
efficiency
A few niggles and oversights within the Iris course that are hanging over from the recent updates to the course:
iris.FUTURE.cell_datetime_objects = True, the markdown cell reads "it is now possible to do the same constraint" when the "same constraint" being referenced has been removed.
iplt with qplt, we need to adjust wspace not hspace.
Dask task graphs seem to be optimised for many to one style processing: you load multiple files and perform some sort of aggregation / reduction to arrive at a single result. This does not seem to be the paradigm anything like as often in Iris, where you might need to...
Put together an example of making a graph that looks like this and then computing the graph to easily retrieve the requested data.
I like the revamp of the courses a lot, great work!
Good to see Binder being used for the notebooks. I have a couple of suggestions of where you could go next for a better user interaction:
Use Jupyter Lab instead of Jupyter Notebooks on Binder
It is easy to implement by changing the end of the Binder url from
...?filepath=path%to%notebook.ipynb
to ...?urlpath=lab/tree/path/to/notebook.ipynb
See this example from the Informatics Lab
https://binder.pangeo.io/v2/gh/informatics-lab/itk-3dvis/master?urlpath=lab/tree/itk-3dvis.ipynb
and guidance on some of the nuances in this repo
https://github.com/binder-examples/jupyterlab
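If there are many Binder links to update, the URL change could be scripted. The helper below is only a sketch using the standard library: the function name and the example repository path are hypothetical, and it assumes each link carries a single filepath query parameter.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def to_jupyterlab_url(binder_url: str) -> str:
    """Swap a '?filepath=...' Binder query for '?urlpath=lab/tree/...'."""
    parts = urlsplit(binder_url)
    query = parse_qs(parts.query)
    file_paths = query.pop("filepath", None)
    if not file_paths:
        # Nothing to convert: leave the link untouched.
        return binder_url
    query["urlpath"] = ["lab/tree/" + file_paths[0]]
    # Keep "/" readable rather than percent-encoding it.
    new_query = urlencode(query, doseq=True, safe="/")
    return urlunsplit(parts._replace(query=new_query))


# Hypothetical link, for illustration only.
old_url = "https://mybinder.org/v2/gh/scitools-classroom/courses/master?filepath=course_content/notebook.ipynb"
new_url = to_jupyterlab_url(old_url)
print(new_url)
```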
Use Pangeo Binder
https://binder.pangeo.io is a Binder service from the Pangeo organisation on Google Cloud Services. It has more resources than the regular Binder deployment so might have a bit more grunt for Iris's data crunching requirements.
The course needs more exercises - there are long sections of teaching with no breaks / changes of style.
x and y arrays in the final exercise.
Sections of the Iris course that may be excess to requirements and could be removed without leaving a big unfilled gap in what the Iris course teaches:
One way to boost learning is to improve the physical environment within which these courses are taught. We could do this by producing posters that relate to the teaching material that are added to the learning environment.
The resources should be primarily graphical and ideally colourful too. Some possible examples:
In the Iris course Chapter 2.2. Saving Cubes, there is a link to iris.save documentation that does not work: https://scitools.org.uk/iris/docs/latest/iris/iris.html#iris.save
It should be updated to this link: https://scitools-iris.readthedocs.io/en/stable/userguide/saving_iris_cubes.html?highlight=iris.save#saving-iris-cubes
When using Python3 & Iris v2, the resources for the merge exercise load differently than when using Python2 and Iris v1. Specifically, it appears that each dataset is loaded twice. This needs to be fixed, preferably to give consistent behaviour between the two Python/Iris combinations.
Investigate:
Add a subsection detailing what to do if you use dask + Iris for parallel file load and encounter merge conflicts when you try and merge the cubes.
For example:
import glob
import dask.bag as db
import iris
from dask import delayed
files = glob.glob(iris.sample_data_path('GloSea4', '*.pp'))
cb = db.from_sequence(files).map(iris.load_cube)
dmc = delayed(lambda cubes: iris.cube.CubeList(cubes).merge_cube())(cb)
dmc.compute()
MergeError Traceback (most recent call last)
...
MergeError: failed to merge into a single cube.
Coordinates in cube.aux_coords (scalar) differ: realization.
Coordinates in cube.aux_coords (non-scalar) differ: forecast_period.
The Advanced Concepts notebook in the Iris course seems to be corrupt.
A solution appears to be adding ", to the end of line 497.
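A quick way to locate this kind of corruption is to parse the notebook file as JSON and report where parsing stops. This is a minimal sketch with the standard json module; the function name is made up, and the reported line is where the parser gave up, which can be the line after the one that actually needs fixing.

```python
import json


def find_json_error(text: str):
    """Return (line number, message) for the first JSON error, or None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err.lineno, err.msg
    return None


# A tiny stand-in for a notebook with a missing comma after "a".
broken = '{\n "cells": [\n  "a"\n  "b"\n ]\n}'
print(find_json_error(broken))
print(find_json_error('{"cells": []}'))
```

For a real notebook, the text would come from reading the .ipynb file, for example with pathlib.Path(...).read_text().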
For example:
In the Iris course 3 (Subcube_Extraction), in section 3.2, the cubes variable is defined at the top of the section and is referred to at the bottom. Between these, there are two exercises where the user may redefine or change the cubes variable, the second of which involves loading from another file. When loading this other file, I reused the name cubes, which meant that, later on, the notebook was referring to a different CubeList than it was expecting.
It may be wise to redefine such variables when there is an exercise in between for better consistency.
There is an exercise asking for area-averaging, but we don't explain it anywhere. Bit too far of a leap to skip that, so needs adding to the course.
As noted in #107 (review), the content in the loading section may not be as clear as it could be. We should explore re-ordering the content in this section to bring all the pure Iris code together into a single subsection as the demonstration, and then provide the dask + Iris code in a following single subsection as a comparison.
Note: this may not make the content any clearer. If it doesn't, we can make a note here of the fact we explored the option and then leave it as it is.
Use working examples that may appeal to, or be tailored to, scientists from:
There is a set of things that we teach in every course that we run. These things don't directly relate to any of the specific SciTools courses; they are general Python things that are nevertheless useful to know.
They are:
*args and **kwargs,
__str__ and __repr__,
These could be added to a mini-course to be taught alongside the other existing courses.
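As a starting point for such a mini-course, a short example covering the items above might look like the following. The function and class names are invented for illustration.

```python
def describe(*args, **kwargs):
    """Accept any mix of positional and keyword arguments."""
    # args is a tuple of positional values; kwargs is a dict of named values.
    return f"{len(args)} positional, {len(kwargs)} keyword"


class Station:
    def __init__(self, name, height):
        self.name = name
        self.height = height

    def __repr__(self):
        # Unambiguous form, shown in the interpreter and inside containers.
        return f"Station(name={self.name!r}, height={self.height!r})"

    def __str__(self):
        # Readable form, used by print() and str().
        return f"{self.name} ({self.height} m)"


station = Station("Exeter", 27)
print(describe(1, 2, units="K"))
print(repr(station))
print(station)
```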