bioml_georgetown

BioML: Machine Learning for Biomedical Data (Georgetown School of Medicine Graduate Course)

Description:

This course covers practical and conceptual aspects of machine learning applied to high-throughput biomedical data using Python. Students will gain an understanding of the opportunities and limitations of machine learning in the context of pre-clinical and clinical research. The course combines online resources, practical assignments, and live sessions conducted online. Throughout the course, we will review several project examples that demonstrate the successes and limitations of conventional machine learning (ML) methods and deep learning (DL) using data from public repositories. After completing this course, each participant should be able to differentiate between various methods, apply the appropriate method to a dataset or problem statement, and develop a complete project using ML or DL.

Readings:

The readings will be relevant publications provided during the course. All additional content, including coding practice, will be available through the OmicsLogic Learn portal: learn.omicslogic.com

  1. OmicsLogic Learn portal: https://learn.omicslogic.com
  2. T-BioInfo server for Big Data Analysis: https://server.t-bio.info
  3. Recommended reading: Deep Learning in Omics Data Analysis and Precision Medicine (book: https://www.ncbi.nlm.nih.gov/books/NBK550335/)
  4. Overview of Machine Learning Part 1: Fundamentals and Classic Approaches (https://www.sciencedirect.com/science/article/pii/S1052514920300629?via%3Dihub)

Course Outline:

  1. Introduction to the course: objectives and outcomes (refer to https://learn.omicslogic.com)
  2. Data Processing and Exploratory Analysis (https://learn.omicslogic.com/courses/course/course-7-bioml-machine-learning-for-biomedical-data)
  3. Machine Learning Methods: unsupervised and supervised types of analysis
  4. Dimensionality Reduction: Ordination and Embedding
  5. Unsupervised Learning: Clustering
  6. Supervised Learning: Discriminant Analysis and Classification
  7. Explainable AI: Feature selection
  8. Classification vs. Regression
  9. Generalized Linear Models: an introduction to Deep Learning
  10. Network analysis: neighborhoods, manifold, and regression
  11. Deep Learning: Multi-layer Perceptron (MLP), Network Topology, Activation Function
  12. Model Accuracy and Validation: Cross-Validation, Randomized and Grid Search for Hyperparameter Optimization
  13. Project Examples and Case Studies (https://learn.omicslogic.com/courses)
  14. How to design your data science project (https://learn.omicslogic.com/courses/course/course-9-designing-a-bioinformatics-research-project)
  15. Project Submissions and Final Exam (https://learn.omicslogic.com/projects)
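
As an illustration of topic 12 in the list above (cross-validation combined with a grid search over hyperparameters), here is a minimal sketch using scikit-learn. The dataset (scikit-learn's bundled breast cancer data), the choice of a support vector classifier, and the parameter grid are all assumptions made for the example; they are not taken from the course materials.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumed example data: a small tabular biomedical dataset shipped with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling is part of the pipeline so it is refit on each training fold,
# which keeps information from the validation fold out of the model.
pipeline = make_pipeline(StandardScaler(), SVC())

# Hypothetical grid: every combination is scored with 5-fold cross-validation.
param_grid = {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

print("best parameters:", search.best_params_)
print("mean cross-validated accuracy:", round(search.best_score_, 3))
print("held-out test accuracy:", round(search.score(X_test, y_test), 3))
```

The held-out test split is scored only once, after the search has finished, so the reported test accuracy is not influenced by the hyperparameter selection. A randomized search would follow the same pattern with `RandomizedSearchCV` and sampled parameter values.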

Course Topics & Outcomes:

In-depth review of statistical concepts, ML and Deep Learning:

  1. t-test, F-test, chi-square, ANOVA and Regression
  2. PCA, t-SNE, LDA, Clustering (hierarchical, k-means, DBSCAN, Fuzzy, PAM)
  3. Classification: Decision Trees, Random Forest, Support Vector Machine, Naive Bayes
  4. Feature Selection Strategies (Feature Significance & Greedy Methods)
  5. Deep Learning: Deep Feedforward Neural Network (DFNN), Convolutional Neural Network (CNN) and other implementations for time-series data.
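
Two of the methods listed above, PCA and k-means clustering, are often chained: reduce the data to a few components, then cluster in the reduced space. The sketch below shows one way this might look with scikit-learn. The synthetic data, the group structure, and all parameter values are assumptions for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical data: 60 samples x 50 features, with two groups of 30 samples
# that differ only in the first 10 features.
group = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 50))
X[group == 1, :10] += 3.0

# Standardize features, project onto the first two principal components,
# then cluster the samples in that 2-D space.
X_scaled = StandardScaler().fit_transform(X)
embedding = PCA(n_components=2).fit_transform(X_scaled)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embedding)

# Compare the unsupervised clusters with the known groups (1.0 = perfect agreement).
print("adjusted Rand index:", adjusted_rand_score(group, labels))
```

The adjusted Rand index is used here because cluster labels are arbitrary (cluster 0 may correspond to either group), so a plain accuracy comparison would be misleading.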

What Will You Learn To Do In Python?

Coding Challenges:

  • Loading data from csv, txt, or xlsx sources and converting it to various data structures (dataframe, matrix, lists and vectors)
  • Summarizing categorical and continuous datasets
  • Data preparation using log-normal transformation and quantile normalization
  • Statistical tests and outputs (p-value, t-value, standard error, FDR, logFC)
  • Popular packages like pandas, numpy, and sklearn
  • Visualization using matplotlib, seaborn and plotly
  • Reading, understanding and loading code examples

General Coding & Data Sharing Practices:

  • Organizing your scripts with comments and functions (syntax)
  • Setting up a development environment (IDE)
  • Dealing with errors and troubleshooting code (debugging)
  • Preparing data summaries and submitting curated data and meta-data tables to sharing repositories (FAIR principles)
  • Sharing your analysis in Jupyter notebooks, on GitHub or Google Colab
  • Creating interactive visualizations in plotly
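
The data preparation challenge mentioned above (a log transformation followed by quantile normalization) could be sketched as follows with pandas and numpy. The toy table, the `quantile_normalize` helper, and the pseudocount of 1 are assumptions for this example, not part of the course code.

```python
import numpy as np
import pandas as pd


def quantile_normalize(df: pd.DataFrame) -> pd.DataFrame:
    """Give every column (sample) the same distribution of values."""
    # Mean of the k-th smallest value across samples, for every rank k.
    rank_means = np.sort(df.to_numpy(), axis=0).mean(axis=1)
    # Zero-based rank of each value within its own column;
    # ties are broken by order of appearance.
    ranks = df.rank(method="first").astype(int).to_numpy() - 1
    # Replace each value with the mean value for its rank.
    return pd.DataFrame(rank_means[ranks], index=df.index, columns=df.columns)


# Hypothetical expression table: rows are genes, columns are samples.
counts = pd.DataFrame(
    {"sample_a": [5, 2, 3, 4], "sample_b": [4, 1, 4, 2], "sample_c": [3, 4, 6, 8]},
    index=["gene_1", "gene_2", "gene_3", "gene_4"],
)

log_counts = np.log2(counts + 1)  # pseudocount of 1 avoids log2(0)
normalized = quantile_normalize(log_counts)
print(normalized.round(3))
```

After normalization, each sample contains the same set of values and only their assignment to genes differs. Breaking ties by order of appearance is a simplification; other implementations average tied ranks instead.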

Course Requirements:

The course is open to those who are just getting started and does not require in-depth knowledge of programming or machine learning. Some background in the basics of molecular biology and an introduction to bioinformatics is preferred. Please complete the following free tutorials to get a head start:

  1. Bytes and Molecules (https://learn.omicslogic.com/courses/course/course-1-bytes-and-molecules)
  2. Getting Started with Bioinformatics in Python (https://learn.omicslogic.com/courses/course/getting-started-with-bioinformatics-in-python)

Intended Outcomes & Learning Objectives:

  • Understand analytical methods for processing, visualization, and analysis of complex biomedical data
  • Learn the terminology of machine learning and artificial intelligence in biomedical discovery
  • Become familiar with project examples where ML was used effectively to achieve meaningful results
  • Gain hands-on practice applying standard unsupervised and supervised learning methods to various types of data, such as genomic, transcriptomic, metagenomic, imaging, and clinical
  • Understand the ML taxonomy and the commonly used machine learning algorithms for analyzing “omics” data
  • Understand the differences between categories of ML algorithms and the kinds of problems each can be applied to
  • Understand how ML is applied across different -omics studies and project design objectives
  • Use popular Python packages for data visualization, analysis and ML
  • Interpret and visualize the results obtained from ML analyses on omics datasets
  • Apply ML techniques to analyze public domain datasets or your own
