
microsoft / ml-for-beginners

67.0K 998.0 13.6K 222.4 MB

12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all

Home Page: https://microsoft.github.io/ML-For-Beginners/

License: MIT License

Jupyter Notebook 7.22% Python 0.07% CSS 0.01% HTML 92.70% JavaScript 0.01% Vue 0.01% Dockerfile 0.01%
ml data-science machine-learning machine-learning-algorithms machinelearning python machinelearning-python scikit-learn scikit-learn-python r

ml-for-beginners's Issues

[LESSON]

  • quiz 1
  • written content
  • quiz 2
  • challenge
  • extra reading
  • assignment

Tribute to Contributors in a different way

Is your feature request related to a problem? Please describe.
As of now, we are thanking contributors like this 👇🏻


(screenshot of the current contributors section)


I think we can do it in a better, more personalized way!

Describe the solution you'd like

We can move this section to the bottom of the README file and add a card for each contributor. The card would show the contributor's GitHub avatar and name, linked to their profile, so anyone who wants to reach out can visit it.

Describe alternatives you've considered
Alternatively, we could just add links to their profiles in the existing places. That would also help, but the cards would look more attractive and serve as a small tribute to these great people!

I would like to add this feature.

[TRANSLATIONS] - Russian

  • Base README.md
  • Quizzes
  • Introduction base README
    • Intro to ML README
    • Intro to ML assignment
    • History of ML README
    • History of ML assignment
    • Fairness README
    • Fairness assignment
    • Techniques of ML README
    • Techniques of ML assignment
  • Regression base README
    • Tools README
    • Tools assignment
    • Data README
    • Data assignment
    • Linear README
    • Linear assignment
    • Logistic README
    • Logistic assignment
  • Web app base README
    • Web app README
    • Web app assignment
  • Classification base README
    • Intro README
    • Intro assignment
    • Classifiers 1 README
    • Classifiers 1 assignment
    • Classifiers 2 README
    • Classifiers 2 assignment
    • Applied README
    • Applied assignment
  • Clustering base README
    • Visualize README
    • Visualize assignment
    • K-means README
    • K-means assignment
  • NLP base README
    • Intro README
    • Intro assignment
    • Tasks README
    • Tasks assignment
    • Translation README
    • Translation assignment
    • Reviews 1 README
    • Reviews 1 assignment
    • Reviews 2 README
    • Reviews 2 assignment
  • Time Series base README
    • Intro README
    • Intro assignment
    • ARIMA README
    • ARIMA assignment
  • Reinforcement base README
    • QLearning README
    • QLearning assignment
    • gym README
    • gym assignment
  • Real World base README
    • Real World README
    • Real World assignment

[LESSON]

  • quiz 1
  • written content
  • quiz 2
  • challenge
  • extra reading
  • assignment

Google Cloud AutoML Vision

I have taken most of the lessons... why not add an introductory lesson on GCP AutoML Vision?

It would be very interesting as a practical project!

[TRANSLATIONS] - Italian

  • Base README.md
  • Quizzes
  • Introduction base README
    • Intro to ML README
    • Intro to ML assignment
    • History of ML README
    • History of ML assignment
    • Fairness README
    • Fairness assignment
    • Techniques of ML README
    • Techniques of ML assignment
  • Regression base README
    • Tools README
    • Tools assignment
    • Data README
    • Data assignment
    • Linear README
    • Linear assignment
    • Logistic README
    • Logistic assignment
  • Web app base README
    • Web app README
    • Web app assignment
  • Classification base README
    • Intro README
    • Intro assignment
    • Classifiers 1 README
    • Classifiers 1 assignment
    • Classifiers 2 README
    • Classifiers 2 assignment
    • Applied README
    • Applied assignment
  • Clustering base README
    • Visualize README
    • Visualize assignment
    • K-means README
    • K-means assignment
  • NLP base README
    • Intro README
    • Intro assignment
    • Tasks README
    • Tasks assignment
    • Translation README
    • Translation assignment
    • Reviews 1 README
    • Reviews 1 assignment
    • Reviews 2 README
    • Reviews 2 assignment
  • Time Series base README
    • Intro README
    • Intro assignment
    • ARIMA README
    • ARIMA assignment
  • Reinforcement base README
    • QLearning README
    • QLearning assignment
    • gym README
    • gym assignment
  • Real World base README
    • Real World README
    • Real World assignment

[TRANSLATIONS] - Korean

  • Base README.md
  • Quizzes
  • Introduction base README
    • Intro to ML README
    • Intro to ML assignment
    • History of ML README
    • History of ML assignment
    • Fairness README
    • Fairness assignment
    • Techniques of ML README
    • Techniques of ML assignment
  • Regression base README
    • Tools README
    • Tools assignment
    • Data README
    • Data assignment
    • Linear README
    • Linear assignment
    • Logistic README
    • Logistic assignment
  • Web app base README
    • Web app README
    • Web app assignment
  • Classification base README
    • Intro README
    • Intro assignment
    • Classifiers 1 README
    • Classifiers 1 assignment
    • Classifiers 2 README
    • Classifiers 2 assignment
    • Applied README
    • Applied assignment
  • Clustering base README
    • Visualize README
    • Visualize assignment
    • K-means README
    • K-means assignment
  • NLP base README
    • Intro README
    • Intro assignment
    • Tasks README
    • Tasks assignment
    • Translation README
    • Translation assignment
    • Reviews 1 README
    • Reviews 1 assignment
    • Reviews 2 README
    • Reviews 2 assignment
  • Time Series base README
    • Intro README
    • Intro assignment
    • ARIMA README
    • ARIMA assignment
  • Reinforcement base README
    • QLearning README
    • QLearning assignment
    • gym README
    • gym assignment
  • Real World base README
    • Real World README
    • Real World assignment

Time requirements?

The course is described as being "12 weeks, 24 lessons", but it doesn't give an idea as to how much time is required to complete each lesson.

I'd like to do the course, but I need to figure out if I've got time in my current schedule.

PAT Rubrics

Build all PAT rubrics in the discussion board for each lesson grouping.

[Translations] - Simplified Chinese

  • Base README.md
  • Quizzes
  • Introduction base README
    • Intro to ML README
    • Intro to ML assignment
    • History of ML README
    • History of ML assignment
    • Fairness README
    • Fairness assignment
    • Techniques of ML README
    • Techniques of ML assignment
  • Regression base README
    • Tools README
    • Tools assignment
    • Data README
    • Data assignment
    • Linear README
    • Linear assignment
    • Logistic README
    • Logistic assignment
  • Web app base README
    • Web app README
    • Web app assignment
  • Classification base README
    • Intro README
    • Intro assignment
    • Classifiers 1 README
    • Classifiers 1 assignment
    • Classifiers 2 README
    • Classifiers 2 assignment
    • Applied README
    • Applied assignment
  • Clustering base README
    • Visualize README
    • Visualize assignment
    • K-means README
    • K-means assignment
  • NLP base README
    • Intro README
    • Intro assignment
    • Tasks README
    • Tasks assignment
    • Translation README
    • Translation assignment
    • Reviews 1 README
    • Reviews 1 assignment
    • Reviews 2 README
    • Reviews 2 assignment
  • Time Series base README
    • Intro README
    • Intro assignment
    • ARIMA README
    • ARIMA assignment
  • Reinforcement base README
    • QLearning README
    • QLearning assignment
    • gym README
    • gym assignment
  • Real World base README
    • Real World README
    • Real World assignment

[REQUEST] - Sketchnotes for major modules

@dasani-madipalli and @girliemac

I'd love to include one sketchnote (WebDev style if possible!) for each of the big topics in this course, to be added to each introductory lesson:

  • Introduction
  • Fairness
  • History
  • Techniques
  • Regression
  • Classification
  • Clustering
  • NLP
  • Time Series
  • Reinforcement
  • Real-World

Do you think it's possible? 🙏❤️

NLP 5 - last lesson for NLP

The last lesson, finishing NLTK for sentiment analysis.

  • quiz 1
  • written content
  • quiz 2
  • challenge
  • extra reading
  • assignment

[LESSON]

  • quiz 1
  • written content
  • quiz 2
  • challenge
  • extra reading
  • assignment

Reviews for NLP Module

Hi @jlooper,

I really love the way these learning materials and this curriculum are designed. However, I have some review comments that might help make them even better. I don't have any comments for Sections 1 and 3; I feel they are really great in their current form. Do let me know if I should make a PR addressing any of these comments.

1-Introduction-to-NLP

This section looks pretty good to me, and I don't have any comments that would help improve it.

2-Tasks

  • For this sentence

🎓 Tokenization Probably the first thing most NLP algorithms have to do is split the text into tokens, or words. While this sounds simple, having to account for punctuation and different languages' word and sentence delimiters can make it tricky.

maybe we could give some intuition on why tokenization involves more than just splitting a sentence at whitespace, by adding something like the following (a small code sketch follows the quote):

Though it might seem very straightforward to simply split your sentence into words at the whitespace, you might have to use other methods or build on top of this too.
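As a purely illustrative example of that point, here is a small Python sketch comparing naive whitespace splitting with NLTK's word_tokenize. The sentence is made up, and the exact NLTK resource name can vary between versions:

```python
# Illustrative only: naive whitespace splitting vs. NLTK's word_tokenize,
# which also separates punctuation and contractions into their own tokens.
import nltk
nltk.download("punkt", quiet=True)  # tokenizer models (resource name may differ by NLTK version)
from nltk.tokenize import word_tokenize

sentence = "Don't split me naively, please; punctuation matters!"

print(sentence.split())
# ["Don't", 'split', 'me', 'naively,', 'please;', 'punctuation', 'matters!']

print(word_tokenize(sentence))
# ['Do', "n't", 'split', 'me', 'naively', ',', 'please', ';', 'punctuation', 'matters', '!']
```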

  • For the "Tasks common to NLP" section, I feel we should add a mention of "word embeddings", owing to their importance (a rough sketch of such an exercise follows at the end of this section). Maybe we could add something like this:

🎓 Embeddings Embeddings are a way to convert your text data into meaningful numbers. This is done in such a way that words with a similar meaning, or words used together, cluster together in a high-dimensional space.

Optionally we could also add:

Try playing around with word embeddings from a quite popular model (Word2Vec) here. Can you see how clicking on one word shows words with a similar meaning clustering around it? E.g. if you inspect the word 'toy', you see it clusters with words like 'disney', 'lego', 'playstation', 'console', etc.

However, I do understand this might make the lesson a bit deeper than intended at this stage; what do you think?
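For reference, here is a minimal sketch of the kind of exercise that note could point to, using gensim's Word2Vec on a made-up toy corpus. gensim is not part of the lesson, and on such a tiny corpus the neighbours are noisy; the point is only the shape of the API:

```python
# Hypothetical sketch: train a tiny Word2Vec model and inspect which words
# end up close to 'toy' in the learned embedding space.
from gensim.models import Word2Vec

corpus = [  # each inner list is one tokenized "sentence"
    ["kids", "play", "with", "a", "toy", "lego", "set"],
    ["the", "playstation", "console", "is", "a", "popular", "toy"],
    ["disney", "sells", "toy", "figures", "and", "lego", "sets"],
    ["the", "console", "and", "the", "lego", "set", "are", "toys"],
]

model = Word2Vec(sentences=corpus, vector_size=50, window=3, min_count=1, epochs=200, seed=1)

# nearest neighbours of 'toy' in the learned vector space (noisy on a toy corpus)
print(model.wv.most_similar("toy", topn=3))
```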

3-Translation-Sentiment

This section looks pretty good to me, and I don't have any comments that would help improve it.

[LESSON]

  • quiz 1
  • written content
  • quiz 2
  • challenge
  • extra reading
  • assignment

[Content]: Real World Applications for Classic Machine Learning - Select an area

To complete the lessons, we want to add an overview of ML as it is used in the real world. This will focus on classic ML only, so don't worry about neural networks for this curriculum. Pick a domain, write a paragraph about it as a reply to this issue, and I'll add it to the lesson and credit you!

  • Finance
    Credit card fraud detection
    Wealth management

  • Education
    Predicting student behavior
    Preventing plagiarism
    Course recommendations

  • Retail
    Personalizing the customer journey
    Inventory management

  • Health Care
    Optimizing drug delivery
    Hospital re-entry management
    Disease management

  • Ecology and Green Tech
    Forest management
    Motion sensing of animals
    Energy Management

  • Insurance
    Actuarial tasks

  • Arts, Culture, and Literature
    Fake news detection
    Classifying artifacts

  • Marketing
    'Ad words'

Reviews for Clustering 2 Notebook

Hi @jlooper ,

I reviewed the solution notebook 5-Clustering/2-K-Means. Here are my experiments, on which I base the comments below: https://colab.research.google.com/drive/1oIXPkQZzvJClRaCoEcpznKw3RhXOTPay?usp=sharing

It turns out we are trying to cluster the other features to get 3 categories (artist genres), but if you look, there is almost no correlation between our features and our expected cluster basis (see this cell: https://colab.research.google.com/drive/1oIXPkQZzvJClRaCoEcpznKw3RhXOTPay#scrollTo=cpfSV8Yem1H9&line=5&uniqifier=1). The only good-enough correlation we get is between loudness and energy, which again wouldn't be a problem best solved by a clustering algorithm.

I think clustering would not be very useful for this problem in its entirety; what do you think?
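For context, this is roughly the check described above, sketched here with pandas. The file path and the artist_top_genre column name are assumptions based on the lesson's dataset and may differ from the actual notebook:

```python
# Rough sketch of the correlation check described above (path/columns assumed).
import pandas as pd

df = pd.read_csv("../data/nigerian-songs.csv")  # hypothetical path to the lesson's dataset

# correlation matrix of the numeric features: only loudness vs. energy is
# notably strong, the rest are weak
numeric = df.select_dtypes(include="number")
print(numeric.corr().round(2))

# quick look at how the features vary across the expected cluster basis (genre)
print(df.groupby("artist_top_genre")[list(numeric.columns)].mean().round(2))
```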

(untitled issue with a garbled Arabic title)

Describe the bug
A clear and concise description of what the bug is.

To Reproduce
Steps to reproduce the behavior:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

  • OS: [e.g. iOS]
  • Browser [e.g. chrome, safari]
  • Version [e.g. 22]

Smartphone (please complete the following information):

  • Device: [e.g. iPhone6]
  • OS: [e.g. iOS8.1]
  • Browser [e.g. stock browser, safari]
  • Version [e.g. 22]

Additional context
Add any other context about the problem here.

Reviews for Clustering Module Part 1

Hi @jlooper,

Here are my reviews for the Clustering module; note that I have only reviewed Part 1 for the moment, as you asked.

  • In this sentence

Derived from mathematical terminology, non-flat vs. flat geometry refers to the measure of distances between points by either 'flat' (non-Euclidean) or 'non-flat' (Euclidean) geometrical methods.

I think there might be a typo here which could get a bit confusing for readers: by flat geometry (not a flat object) we mean measuring distances, areas, and volumes using Euclidean distance, i.e. following Euclidean geometry. Thus the sentence should be rewritten as:

Derived from mathematical terminology, non-flat vs. flat geometry refers to the measure of distances between points by either 'flat' (Euclidean) or 'non-flat' (non-Euclidean) geometrical methods.

Quote from sklearn docs to support my suggestion:

"Non-flat geometry clustering is useful when the clusters have a specific shape, i.e. a non-flat manifold, and the standard euclidean distance is not the right metric."

  • In this sentence

'Flat' in this context refers to Euclidean geometry (parts of which are taught as 'plane' geometry), and non-flat refers to non-Euclidean geometry

I think adding a mention of the difference in how distances are measured would be quite helpful, since most readers might already know how to calculate Euclidean distance, and it would let them build their intuition on top of that; what do you think? (A tiny illustration follows below.)
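To make that concrete, here is a tiny illustration of my own (not from the lesson): the same pair of points can be far apart under the straight-line Euclidean ('flat') metric yet identical under a non-Euclidean measure such as cosine distance, which is exactly why the choice of geometry matters for clustering.

```python
# Small illustration: the same two points under two different distance measures.
import numpy as np
from scipy.spatial.distance import euclidean, cosine

a = np.array([1.0, 1.0])
b = np.array([10.0, 10.0])

print(euclidean(a, b))  # straight-line ('flat') distance: ~12.73, far apart
print(cosine(a, b))     # cosine distance: 0.0, identical direction, so "close"
```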

  • This one might seem like me bringing attention to a very small aspect of the material 😅, but I think it might be important for beginners, so I will leave this completely up to you.

The only strong correlation is between energy and loudness, which is not too surprising, given that loud music is usually pretty energetic. Otherwise, the correlations are relatively weak. It will be interesting to see what a clustering algorithm can make of this data.

This sentence is quite simple, but I think it might lead beginner readers to make wrong assumptions. How do I know? Well, I had this wrong intuition in my mind when starting with ML too!

Anyway, we say here that there is a good correlation between the energy of the song and its loudness; I personally think it is very important to specify here that

Correlation does not imply causation, and the two should not be confused: we have evidence of correlation but no proof of causation.

Optionally, we could link readers to Tyler Vigen's famous spurious-correlations blog/book. I think this is a quite important and highly misunderstood aspect, and this would be the best place for it (a tiny demonstration is sketched below).
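Here is the tiny demonstration mentioned above, entirely synthetic and of my own making: two series that have nothing to do with each other, but both trend upward over time, end up almost perfectly correlated.

```python
# Toy demonstration: unrelated series with a shared time trend correlate strongly.
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(2000, 2020)

ice_cream_sales = 50 + 2.0 * (years - 2000) + rng.normal(0, 1, years.size)
internet_users  = 10 + 5.0 * (years - 2000) + rng.normal(0, 3, years.size)

# Pearson correlation is close to 1 even though neither causes the other;
# a shared upward trend over time is enough.
print(np.corrcoef(ice_cream_sales, internet_users)[0, 1])
```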

  • In this sentence

You can discover concentric circles around a general point of convergence, showing the distribution of points. In general, the three genres align loosely in terms of their popularity and danceability. Determining clusters in this loosely-aligned data will be interesting:

Do you think we should tell readers a bit more about these graphs, maybe something like:

Here we use a KDE (Kernel Density Estimate) graph, which represents the data using a continuous probability density curve. This lets us interpret the data easily, especially when working with multiple distributions.
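If it helps, here is a minimal sketch of the kind of plot that note refers to, assuming the lesson's Nigerian songs dataset; the file path and column names are my guesses and may differ from the actual notebook:

```python
# Sketch only: a 2D KDE (jointplot) of popularity vs. danceability, which
# produces the concentric-contour picture described in the lesson.
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.read_csv("../data/nigerian-songs.csv")  # hypothetical path
sns.jointplot(data=df, x="popularity", y="danceability", kind="kde")
plt.show()
```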
