carpentries-incubator / data-science-ai-senior-researchers

Introduction to Data Science and AI for senior researchers

Home Page: https://carpentries-incubator.github.io/data-science-ai-senior-researchers/

License: Other

Ruby 1.08% Makefile 8.52% R 13.52% Shell 0.79% Python 76.09%
ai artificial-intelligence carpentries-incubator data-science english lesson pre-alpha

Contributors

aldenc, allcontributors[bot], arronlacey, claudioangione, jcolomb, johav, kasra-hosseini, lydiafrance, malvikasharan, tobyhodges, zkamvar

data-science-ai-senior-researchers's Issues

Suggested edits for Episode 02

  • Add a more detailed case study for simulation at the Turing (and the Crick?)
  • Use the callout formatting for the case study
  • Adding images to improve aesthetics (not critical)

Please include these references where applicable

Summary

References:

Turing resources

Other resources

Carpentries Resources

Going through the list of current incubators, here are a few projects that already deal with ML and that we should connect with and reuse

Packaging and publishing

different levels of cloud computing and HPC training materials

AI and Machine Learning

Suggested edits for the Introduction Episode

  • Adding context to the overview of the training materials (Line 54)
  • Add Learning Outcomes Section
  • Some repetition between the Modular and Flexible Learning section and the Mode of Delivery section -- combine?
  • A section on what the course will not cover?

Identify if any existing images/figures would benefit from custom illustrations

I have put in placeholder images, e.g. a ROC curve sourced from a blog/site. Would a custom illustration work better? We could produce these via an example analysis we conduct ourselves (and link a notebook), if that sounds like the way to go. Just thinking about consistency.

For example here is a placeholder image I am using in a PR here

Revisions starting March 2023

Hi, my name is Jo. Some of you know me already from other co-creation projects and activities around Open Science.
It's a great honor to join the team and impressive to see what has been built already.

Moving forward with this and the "data-science-ai-senior-researchers" repository, I'll go through the modules to make suggestions (pull requests and issues), as agreed with @malvikasharan and @jcolomb, in order to finalize the content and get both courses ready by mid-April, provided that we all agree.

Thanks for all your excellent contributions thus far. Both courses will surely be of great value to many researchers around the world.
Ping me with any comments or concerns you might have, I am looking forward to working with you.
Best wishes
Jo.

Explain how the term AI is used in different audiences

In the introduction I think we should contextualize the use of AI and its relation to ML. AI is a very high-level term that is used less often in methodology papers and similar literature, where ML and its subtypes are more common. We should explain why this is.

Suggested edits for Episode 03

  • Use correct formatting for the case studies (test the images and gifs)
  • Expand on the benefits to open tool communities and impact
  • Possibly refactor the episode into learning-outcomes.
  • Are there automation tools used by the Crick?

Content proposed for this material

This lesson should cover foundational concepts introducing

  • data science, machine learning, Deep Learning, AI
  • Address ‘why do these apply to me?’
  • Include examples to inform future directions in biomedical and related fields: breakthrough technology and applications
  • Intro to different algorithm types and how to choose what to use
  • Splitting datasets appropriately and avoiding leakage.
  • Problem of imbalanced data sets and of overfitting
  • Statistics to evaluate output (precision-recall etc)
  • Pointers to where to start with different approaches
  • Guide to common tools - perhaps this will be covered in the above sections, but a guide to available commonly used tools would be helpful.
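As a rough illustration of three of the points above (splitting datasets, imbalanced data, and precision-recall), here is a minimal sketch using only the Python standard library. The helper names `stratified_split` and `precision_recall`, the toy labels, and the "always predict negative" baseline are all assumptions made for this example; they are not taken from the lesson material.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices into train/test so each class keeps roughly the same share.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Keep at least one example of every class in the test set.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy imbalanced dataset: 90 negatives and 10 positives (made-up data).
labels = [0] * 90 + [1] * 10
train_idx, test_idx = stratified_split(labels)

# Train and test indices never overlap, so no test example leaks into training.
assert set(train_idx).isdisjoint(test_idx)

# A trivial baseline that always predicts the majority class.
y_test = [labels[i] for i in test_idx]
y_pred = [0] * len(y_test)

accuracy = sum(t == p for t, p in zip(y_test, y_pred)) / len(y_test)
precision, recall = precision_recall(y_test, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

The baseline scores high accuracy while finding none of the positive cases, which is why precision and recall are usually reported alongside accuracy for imbalanced data. In practice, an established library would typically be used for splitting and metrics rather than hand-written helpers like these.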

Examples should be pulled from the Turing where we can and also more broadly from international communities highlighting specific/specialist topics

  • Genomics: Single cell genomic analyses (scRNAseq, scATACseq, spatial transcriptomics);
  • Genomics: Cancer genomics
  • Imaging: Biological image analysis (ML for segmentation, tracking, registration, classification, de-noising).
  • AlphaFold, protein folding: structural biology/protein-protein interaction
  • See some more examples here: carpentries-incubator/managing-computational-projects#19 (comment)

Another lesson is being developed alongside for Managing open and reproducible computational projects: challenges and benefits (reproducibility, collaboration, open tools, version control and importance in grants, peer review, academic incentives)
