carpentries-lab / good-enough-practices
Good Enough Practices in Scientific Computing
Home Page: https://carpentries-lab.github.io/good-enough-practices/
License: Other
The FAIR in bio practice lesson already has excellent material from @aromanowski and @tzielins on files and project organization.
We should use that to update the project organization episode here.
During the hackathon, together with @egountouna and @ggrimes, we discussed adding a proper exercise that presents different examples of good documentation and generates a discussion around them. The timing is already pretty tight, though.
During the hackathon, together with @egountouna and @ggrimes, we found the heading "Anticipate the need to use multiple tables, and use a unique identifier for every record" a little confusing. We should consider rephrasing it. The paragraph is about using unique IDs to link multiple tables, as in a relational database.
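As a minimal sketch of the idea behind that heading, two tables can be linked through a shared unique identifier. The table contents, the column name `sample_id`, and the helper names `check_unique` and `join_on` are all hypothetical, chosen only for illustration; they are not taken from the lesson.

```python
# Hypothetical example: two small tables linked by a unique record identifier.
samples = [
    {"sample_id": "S001", "species": "E. coli"},
    {"sample_id": "S002", "species": "S. cerevisiae"},
]
measurements = [
    {"sample_id": "S001", "od600": 0.42},
    {"sample_id": "S002", "od600": 0.57},
    {"sample_id": "S001", "od600": 0.61},
]


def check_unique(records, key):
    """Raise ValueError if any value of `key` appears more than once."""
    seen = set()
    for record in records:
        value = record[key]
        if value in seen:
            raise ValueError(f"duplicate identifier: {value}")
        seen.add(value)


def join_on(left, right, key):
    """Attach the matching `left` record to every `right` record via `key`."""
    index = {record[key]: record for record in left}
    return [{**index[row[key]], **row} for row in right]


# The identifier must be unique in the table that describes each record once.
check_unique(samples, "sample_id")
joined = join_on(samples, measurements, "sample_id")
for row in joined:
    print(row)
```

Because each sample is described once and referenced by its identifier elsewhere, the species name is never retyped in the measurements table, which is the point of the "multiple tables" advice.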
There are some discussions not formatted correctly in the software episode, so that the suggestions are displayed alongside the questions. That needs fixing.
Could you expand the set of Topics you have added to the description of this repository please? You can find some more guidance in the appendix to the Curriculum Development Handbook. A complete set of topics will ensure this lesson appears correctly on the Community Developed Lesson page. Feel free to ping me again if I can help.
Originally posted by @tobyhodges in #2 (comment)
README files are fundamental to this lesson, so it would help to have an exercise about them.
We could either
There are also some great links about how to write a good (project) README file.
Examples at awesome-readme
Requesting @tobyhodges to list this repository on the Help Wanted page so that any issues we label help-wanted will show up there for the wider community.
I can already add some points to instructor notes:
I'll get started on that document as I seem to be writing it already.
Information on what should be included in a change log entry is currently spread throughout the lesson, and is quite short.
For ease of instructing (and learning!) it would be good to pull these out into one callout box, or a headed section, with a bullet-pointed list of the things which we DO want to include in our change log entries. What we don't want to include is covered in a section further below, but it would be good to highlight what we DO want.
The conclusion section is not very coherent in its current form, and would I think be better if shorter.
Probably:
We need to add the basic information, as recommended in the lesson template README:
lesson_title
_config.yml
FIXME
in:
_config.yml
main
branch and set this as the default branch in your repository settings.
I am creating the new branch basic-info-1 for these.
During the hackathon, together with @egountouna and @ggrimes, we discussed the possibility of linking to further content on file naming and identifiers. A good place to do this would be somewhere under the "Create the data you wish to see in the world" heading.
The Good Enough Practices paper suggests a directory structure focused on programming projects:
Project organization
a. Put each project in its own directory, which is named after the project.
b. Put text documents associated with the project in the doc directory.
c. Put raw data and metadata in a data directory and files generated during cleanup and analysis in a results directory.
d. Put project source code in the src directory.
e. Put external scripts or compiled programs in the bin directory.
f. Name all files to reflect their content or function.
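The layout in items (a) to (e) could be set up with a short script. This is only a sketch: the helper `create_project` and the project name `my-project` are hypothetical, and the directories are created inside a temporary directory so that running the example leaves nothing behind.

```python
from pathlib import Path
import tempfile

# Sub-directories named in recommendations (b) to (e) of the paper.
SUBDIRS = ["doc", "data", "results", "src", "bin"]


def create_project(parent, name):
    """Create parent/name with the recommended sub-directories; return its path."""
    root = Path(parent) / name
    for sub in SUBDIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root


# Build the layout in a temporary directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    project = create_project(tmp, "my-project")  # hypothetical project name
    created = sorted(p.name for p in project.iterdir())
    print(created)
```

For a project that is better served by a different structure, such as the date-based one mentioned in this issue, only the `SUBDIRS` list would need to change.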
This suits some projects but not others; for example, some data collection projects might be best served with a date-based directory structure 01-Jan, 02-Feb, ...
Jennifer Bryan argues for date-based naming in how to name files.
After discussing this on 29th March, I think we should:
Remove placeholders such as "I am a section" from Data Management and other lessons.
Add a discussion session on what can go wrong or right in collaborative writing.
Figure 1. Four Stages of Data Loss needs a citation and credit to PhD Comics.
Explain that the track changes episode provides motivation and good practices for tracking changes, but does not aim to teach git.
The swc git-novice lesson then teaches how to use git as one (good) solution to implement all of these.
@tzielins proposed a change to the instructor notes. Most of the discussion sections can be done several ways:
Different approaches can work with different groups based on the time available.
We recommend as an efficient approach: ask learners to fill in their responses in the collaborative document, and the instructor can then read out highlights to lead and steer a group discussion.
As a second exercise, not as a first exercise, we could direct users to some real data repositories to ask what is good or not so good.
Any suggestions of repositories to try?
From my own work, some repos that follow at least some good enough practices are:
This follows the successful earlier hackathon #32. Hackathon runs 09:30-12:00, email or tag me to get the zoom link if you're coming and don't have it.
My suggested timing
09:30 - All meet, triage major issues and decide what to focus on.
10:00 - work on separate issues in breakout rooms
10:45 - coffee break
11:00 - resume in breakout rooms?
11:45 - meet in main room to evaluate progress and congratulate ourselves.
12:00 - end.
I'm not sure what to focus on, beyond #55 which definitely needs progress?
Maybe timings #33?
Let us put Edinburgh-specific resources (e.g. datavault) on a new branch edinburgh-specific, so that we keep them, but they don't end up on the main worldwide carpentries-incubator branch.
Eventually it might make more sense to put them in a fork elsewhere that has its own gh-pages branch and website.
This depends on the timings (#33).
We discussed that 5 chapters need exercises: either 1 exercise, or 1 main and 1 optional.
Formatting fixes needed for Keeping Track of Changes lesson:
The aim of this lesson is to be a 3-hour carpentries-style half-day lesson.
We need to add timings for all the episodes that reflect this.
A collaborator on the riboviz project suggested that an example of how a 'quick collaborative git manuscript workflow' might look would be really useful, and would fit into the 07-manuscripts.md lesson.
This would feature collaborators e.g. Agnes and Bob and describe the branch, edit, add/commit/push, pull request, review, merge process.
Today, I wondered if we should remove DOIs as a "key point" in the software episode. Yes, it is a key point of the data management episode. But I don't think it needs to be emphasized so much in the software episode, and it distracts from that episode's focus: that readable, reusable, and testable are all side effects of writing modular code.
So I favour removing/de-emphasizing.
What do you think @tzielins @aromanowski @ameynert @egountouna ?
Current material talks about 2 different kinds of README.md
These both have in common that they are an aid to navigation that tells you what you need to know. Read me first still applies, for either a whole project or a component of it. We should probably draw a conceptual distinction between these.
The "some git resources" section in 06-track_changes isn't lining up properly when rendered. Probably needs bullet points.
Also would benefit from nice titles/links in markdown style.
I think this lesson material could be brought up to alpha status by a small team of people working over a day or two. Let's schedule that?
The material is similar so it might give a better flow to move collaboration to the penultimate episode, just before manuscripts. To discuss.
Give the introduction episode a clearer narrative.
Currently it is still bullet points from slides so doesn't quite hang together.
I am wondering if it makes sense to have 09-resources as a separate episode, or instead to just include a list of other resources in the 08-conclusions episode.
Also, we could be less generous in linking to the resources. Is it too much?
Several still need tidying up...
Make figures display in-line in website.
This did not work when I previewed locally with `bundle exec jekyll serve`, instead giving me links to the figures. It needs to be checked on GitHub Pages after merging.
In 09-resources, the links should be made nice in the markdown link manner.
@aromanowski pointed out that the discussions are the instructor's best opportunity to figure out where the learners are at and how to adjust the material accordingly.
Needs to be updated in instructor notes
Discussions should be with "the person next to you" or "with a small group", and have suggested timings.
Some of them may also be recast as challenges
Add funding information to README, as already in fair-bio-practice.
We are developing this lesson as part of the ED-DaSH consortium, supported by the "Data driven life science skills development - equipping society for the future" UKRI-MRC grant MR/V039075/1.
Request to add @tzielins, @aromanowski and @FlicAnderson to team @good-enough-practices-maintainers (I added Tomasz and Andrew manually, but better to have them in the team)
Links to papers, e.g. #wickham2014, were not working when I made the website locally using bundle exec jekyll serve. We need to go through and fix those.
Keeping Track of Changes lesson needs a completed Overview section (timings, questions, objectives).
Currently this hasn't been added, and should match the lesson content.
I think "project organization" could use an example project directory displaying some problems. Could be used for a discussion about what doesn't work. Then contrasted with one that is well organized?
In 05-project_organization
EW delivered to CDT-BAI 1st year students with helper Sandy Nelson, on 2021-09-17.
Started at 10:05, coffee break 11:15, resumed 11:30, ended 12:30. In that time we covered the introduction, Data management, and Software sections, with a little on "What next". The students mostly seemed to have previous software experience, so they were at the more skilled end of the crowd.
Good:
Bad:
One thing you would keep about today's workshop:
One thing you would change about today's workshop:
Create learner profiles focused on a half-day "good enough practices in research computing" workshop. These are complementary to the learner profiles that @tzielins and @aromanowski have created for the longer "FAIR science in practice" workshop.
@ppxasjsm and @JenDaub have agreed to contribute to learner profiles.
I started a new branch learner-profiles-5, and added a profile to _extras/learner-profiles.md
Add drafts of
To all the draft episodes.