molssi-education / python-package-best-practices

Hands-on workshop showing good software development practices to create a Python package.

Home Page: http://education.molssi.org/python-package-best-practices/

License: Other

Languages: Makefile 0.77%, HTML 4.89%, CSS 8.89%, R 5.14%, Shell 0.28%, Python 66.64%, Ruby 0.30%, Jupyter Notebook 13.10%
Topics: lessons, molssi-best-practices, cookiecutter, python, best-practices

python-package-best-practices's Introduction

Python Package Best Practices

This repository contains lessons for the MolSSI Workshop "Best Practices in Python Package Development". To complete the lessons, navigate to the website. Make sure to visit setup first to obtain lesson materials and install required software.

This workshop is designed to take place over one and a half to two days. Students begin the workshop with Python code stored in a Jupyter notebook. During the workshop, we refactor and format the code into a Python package. At the end, all participants should have a repository on GitHub which implements MolSSI's Best Practices for a Python package. Depending on the level of the students taking this course, not all lessons may be completed in two days. Workshops should cover at least lessons 1-6 (outlined below), with more potentially being added for more advanced groups.

Workshop Topics

This workshop walks students through setting up a Python package, writing its code, and configuring supporting services (continuous integration, Codecov, etc.). When finished, this repo will include the following topics. Items that have a checkbox are (mostly) finished and can be followed online.

Contributing to these lessons

To contribute to this lesson, fork this repository and make changes on your local clone. Once you have changes you would like incorporated, create a pull request, and we will review your contributions.

All edits to lessons should be done under _episodes in the appropriate markdown file. We use Sphinx with a variation of the PyData Sphinx Theme and MyST parser for lessons. There are custom admonitions for overviews, key points, and exercises. You can see examples of custom admonitions in the lesson files.

python-package-best-practices's Issues

CITATION

Hi,

I'm giving a workshop as part of WiDS2021 and I am wondering if I can use some episodes (e.g. GitHub, ..) from this lesson in the tutorials. I understand that this lesson is made available under the Creative Commons Attribution license, but what is the appropriate way of citing it? I looked into the CITATION file but it's empty. Can you please advise me on the best way of using the materials, giving credit to the authors, and citing them?

add license info in footer

Consider adding a license notice in the web pages' footer (currently it has a copyright notice, which may be interpreted as "all rights reserved").

The code inside "Episode 4 - Deciding Package Structure" should follow the flow.

The code inside Episode 4 doesn't reflect starting_notebook.ipynb, as @khoivan88 pointed out in #11. I think it is important for the student/reader to be able to follow the code in a consistent manner, since the differences distract from and derail the learning path. Imagine a student/reader who is confused about which version to follow, or who asks why they are different. Or they may use the code that already has descriptive variable names, reach Episode 5, and be unable to practice the Variable Names lesson.

Also, it is important to highlight that Episode 4 is the continuation of the Using Branches exercise from Episode 2. Therefore there should be a note near the Package Structure header that warns the student/reader that they should have completed the Using Branches exercise from Episode 2.

PS. I think using prereq blockquote would be appropriate at the beginning of Episode 4 :)

symbols typed as Numpy array out of nowhere

Now I understand why, in the Episode 4 section Editing function to our package, the output of the symbols variable was written as array(['O', 'H', 'H'], dtype='<U1'), which I then updated to ['O', 'H', 'H']. When I reached the Whitespace subsection, open_pdb becomes:

import numpy as np


def open_pdb(file_location):
    # Read every line of the PDB file
    with open(file_location) as f:
        data = f.readlines()

    coordinates = []
    symbols = []
    for line in data:
        # Only ATOM and HETATM records carry atomic coordinates
        if 'ATOM' in line[0:6] or 'HETATM' in line[0:6]:
            symbols.append(line[76:79].strip())
            atom_coords = [float(x) for x in line[30:55].split()]
            coordinates.append(atom_coords)

    coords = np.array(coordinates)
    symbols = np.array(symbols)

    return symbols, coords

Notice that symbols = np.array(symbols) "suddenly" comes out of nowhere. Thus, we need to either add some explanation for this line or change starting_notebook.ipynb, and I prefer the former. I have always wondered why Episode 4 begins with Editing function to our package when the function is never edited. I believe the section was originally intended as an introduction to function editing. That would explain why there is a module check to see the contents of the symbols and coords variables, and why the output of the symbols variable is... a NumPy array!

Therefore I suggest that we add a short introduction to function editing in Editing function to our package, to teach the student how to change a function and how the change is reflected when the code is used. The example would be adding the symbols = np.array(symbols) line to open_pdb.
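As a rough sketch of what such an introduction could show, the snippet below compares the plain list with its NumPy counterpart. The variable names other than symbols are made up for illustration, and the choice of a boolean mask as the motivating example is an assumption, not something taken from the lesson.

```python
import numpy as np

# Symbols as a plain Python list (the output I expected at first)
symbols_list = ['O', 'H', 'H']

# Symbols after the extra line from the lesson: np.array(symbols)
symbols_array = np.array(symbols_list)

# An array supports element-wise comparison, which a list does not;
# is_hydrogen and hydrogen_count are hypothetical names for this sketch
is_hydrogen = symbols_array == 'H'
hydrogen_count = int(is_hydrogen.sum())

print(repr(symbols_list))
print(repr(symbols_array))
print(hydrogen_count)
```

Printing both values side by side makes it visible to the student that the single added line changes the type of the returned object.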

WSL as the command prompt & environment of choice in Windows

In PR #45 WSL was mentioned, which reminds me that there are many benefits to using it. I think the greatest benefit is that we can make the working environment similar across Windows, Linux, and macOS. WSL is also supported by VS Code and PyCharm.

So I propose using WSL for Windows users in this workshop. Hopefully, we can also use the command prompt directly inside the IDE without any trouble, which IMO is very helpful for the student compared with switching back and forth between VS Code and the command prompt (or PowerShell).

package-structure lesson has functions not exactly the same as jupyter notebook

Hi,

I was going through the Deciding Package Structure lesson and I noticed that the contents of the functions are not the same as those in the Jupyter notebook in the downloaded material. I don't think it is a big issue, I just want to let you know.

Besides, I have a few suggestions for those functions and I am just wondering if I should make a pull request for 04-package-structure.md. The suggestions would include:

  • Add space between operators and operands to match PEP8
  • Use list comprehension in place of some simple for loops (I saw that the list comprehension has been used already in the jupyter notebook)
  • Simplify some for loops that iterate over items() when only the keys or the values are used.
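A small sketch of what these three suggestions could look like together. The bonds dictionary and the scaling by 10 are hypothetical and only serve as an illustration; they are not code from the lesson.

```python
# Hypothetical data for illustration; not taken from the lesson
bonds = {(0, 1): 0.97, (0, 2): 0.97}

# Before: no spaces around the operator, and a loop over items()
# even though the key (pair) is never used
lengths_before = []
for pair, length in bonds.items():
    lengths_before.append(length*10)

# After: PEP 8 spacing, a list comprehension, and values() only
lengths_after = [length * 10 for length in bonds.values()]

print(lengths_before == lengths_after)
```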

If you think that these suggestions could be helpful (i.e. they fit the goals and the audience's Python comfort level), please let me know and I will make a PR.

Again, thank you for a nice workshop material!

Best regards,

Khoi Van

Adding Travis-CI/CodeCov to repo

In the past, and in the current episode, we add Travis-CI and CodeCov by going to the websites and authorizing the apps. An alternative would be to add these through GitHub Marketplace.

This would also connect to GitHub Actions (when we start using it for the workshop) which you can find through Marketplace.

Thoughts? (@JoaoRodrigues)

calculate_distance running into exception when build_bond_list tested

I was practicing Episode 07 - Python Testing, and when I reached the build_bond_list test, running pytest -v raised an Exception:

(molssi_best_practices) radifar->mymolpy$ pytest -v
====================================================== test session starts ======================================================
platform linux -- Python 3.7.7, pytest-5.4.3, py-1.8.2, pluggy-0.13.1 -- /home/radifar/anaconda3/envs/molssi_best_practices/bin/python
cachedir: .pytest_cache
rootdir: /media/radifar/radifar-dsl/Python/molssi-best-practice/mymolpy
plugins: cov-2.10.0
collected 4 items                                                                                                               

mymolpy/tests/test_measure.py::test_calculate_distance PASSED                                                             [ 25%]
mymolpy/tests/test_measure.py::test_calculate_angle PASSED                                                                [ 50%]
mymolpy/tests/test_molecule.py::test_build_bond_list FAILED                                                               [ 75%]
mymolpy/tests/test_mymolpy.py::test_mymolpy_imported PASSED                                                               [100%]

=========================================================== FAILURES ============================================================
_____________________________________________________ test_build_bond_list ______________________________________________________

    def test_build_bond_list():
    
        coordinates = np.array([
            [1, 1, 1],
            [2.4, 1, 1],
            [-0.4, 1, 1],
            [1, 1, 2.4],
            [1, 1, -0.4]
        ])
    
>       bonds = mymolpy.build_bond_list(coordinates)

mymolpy/tests/test_molecule.py:18: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
mymolpy/molecule.py:15: in build_bond_list
    distance = calculate_distance(coordinates[atom1], coordinates[atom2])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

rA = array([1., 1., 1.]), rB = array([1., 1., 1.])

    def calculate_distance(rA, rB):
        """Calculate the distance between two points.
    
        Parameters
        ----------
        rA, rB : np.ndarray
            The coordinates of each point.
    
        Returns
        -------
        distance : float
            The distance between the two points.
    
        Examples
        --------
        >>> r1 = np.array([0, 0, 0])
        >>> r2 = np.array([0, 0.1, 0])
        >>> calculate_distance(r1, r2)
        0.1
        """
    
        if isinstance(rA, np.ndarray) is False or isinstance(rB, np.ndarray) is False:
            raise TypeError("rA and rB must be numpy arrays")
    
        dist_vec = (rA - rB)
        distance = np.linalg.norm(dist_vec)
    
        if distance == 0.0:
>           raise Exception("Two atoms are located in the same point in space")
E           Exception: Two atoms are located in the same point in space

mymolpy/measure.py:35: Exception
==================================================== short test summary info ====================================================
FAILED mymolpy/tests/test_molecule.py::test_build_bond_list - Exception: Two atoms are located in the same point in space
================================================== 1 failed, 3 passed in 0.46s ==================================================

Btw, my package name is mymolpy instead of molecool, and here is my mymolpy/tests/test_molecule.py

"""
Unit and regression test for the molecule module
"""

import mymolpy
import numpy as np

def test_build_bond_list():

    coordinates = np.array([
        [1, 1, 1],
        [2.4, 1, 1],
        [-0.4, 1, 1],
        [1, 1, 2.4],
        [1, 1, -0.4]
    ])

    bonds = mymolpy.build_bond_list(coordinates)

    assert len(bonds) == 4

    for bond_length in bonds.values():
        assert bond_length == 1.4

Here is my mymolpy/molecule.py

"""
Miscellaneous functions related to molecules
"""

from .measure import calculate_distance

def build_bond_list(coordinates, max_bond=1.5, min_bond=0):

    # Find the bonds in a molecule (set of coordinates) based on distance criteria.
    bonds = {}
    num_atoms = len(coordinates)

    for atom1 in range(num_atoms):
        for atom2 in range(atom1, num_atoms):
            distance = calculate_distance(coordinates[atom1], coordinates[atom2])
            if distance > min_bond and distance < max_bond:
                bonds[(atom1, atom2)] = distance

    return bonds

And then I tried to debug mymolpy/molecule.py as follows (Notice that I'm adding +1 after atom1 in the inner for loop):

"""
Miscellaneous functions related to molecules
"""

from .measure import calculate_distance

def build_bond_list(coordinates, max_bond=1.5, min_bond=0):

    # Find the bonds in a molecule (set of coordinates) based on distance criteria.
    bonds = {}
    num_atoms = len(coordinates)

    for atom1 in range(num_atoms):
        for atom2 in range(atom1+1, num_atoms):
            distance = calculate_distance(coordinates[atom1], coordinates[atom2])
            if distance > min_bond and distance < max_bond:
                bonds[(atom1, atom2)] = distance

    return bonds

And the tests ran perfectly!

(molssi_best_practices) radifar->mymolpy$ pytest -v
====================================================== test session starts ======================================================
platform linux -- Python 3.7.7, pytest-5.4.3, py-1.8.2, pluggy-0.13.1 -- /home/radifar/anaconda3/envs/molssi_best_practices/bin/python
cachedir: .pytest_cache
rootdir: /media/radifar/radifar-dsl/Python/molssi-best-practice/mymolpy
plugins: cov-2.10.0
collected 4 items                                                                                                               

mymolpy/tests/test_measure.py::test_calculate_distance PASSED                                                             [ 25%]
mymolpy/tests/test_measure.py::test_calculate_angle PASSED                                                                [ 50%]
mymolpy/tests/test_molecule.py::test_build_bond_list PASSED                                                               [ 75%]
mymolpy/tests/test_mymolpy.py::test_mymolpy_imported PASSED                                                               [100%]

======================================================= 4 passed in 0.36s =======================================================
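To illustrate why the +1 matters, here is a minimal sketch that lists the index pairs each loop variant produces for three atoms. The names pairs_with_self and pairs_unique are made up for this example, and the comparison with itertools.combinations is an alternative I am assuming would behave the same way, not something used in the lesson.

```python
from itertools import combinations

num_atoms = 3

# Inner loop starting at atom1: each atom is also paired with itself,
# so calculate_distance would receive two identical points
pairs_with_self = [
    (atom1, atom2)
    for atom1 in range(num_atoms)
    for atom2 in range(atom1, num_atoms)
]

# Inner loop starting at atom1 + 1: only distinct pairs remain
pairs_unique = [
    (atom1, atom2)
    for atom1 in range(num_atoms)
    for atom2 in range(atom1 + 1, num_atoms)
]

print(pairs_with_self)
print(pairs_unique)

# itertools.combinations yields the same distinct pairs
print(pairs_unique == list(combinations(range(num_atoms), 2)))
```

The self-pairs such as (0, 0) in the first list are exactly the calls where rA and rB are the same point, which is what triggers the exception in calculate_distance.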
