Issues from microsoft/ml-for-beginners

Is your feature request related to a problem? Please describe.
As of now, we thank contributors like this 👇🏻

I think we can do it in a better, more personalized way!

Describe the solution you'd like

We can move this section to the bottom of the README file and add a card for every contributor, so that anyone who wants to reach out to them can visit their GitHub profile. Each card will contain the contributor's GitHub avatar and name, linked to their profile.

Describe alternatives you've considered
Or we can just add links to their profiles in the existing places. This will also help, but the cards will look more attractive and be a small tribute to great people!

I would like to add this feature.

[TRANSLATIONS] - Russian

Translating lessons, quizzes, and assignments

Let's translate this content! Here are instructions: https://github.com/microsoft/ML-For-Beginners/blob/main/TRANSLATIONS.md

We can focus on the following for now, but please propose yours and start a draft PR:

[LESSON] 7 - Reinforcement 1

[LESSON] 2 - Regression 1

[LESSON]

[LESSON] 5 - NLP 1

[LESSON] 4 - Clustering 2

[LESSON] 3 - Classification 3

[LESSON] 6 - Time Series 2

Google Cloud AutoML Vision

I have taken most of the lessons. Why not add an introductory lesson on GCP AutoML Vision?

It will be very interesting as a practical project!

[TRANSLATIONS] - Italian

[LESSON] 1 - Ethics of ML

[LESSON] microsoft ml for beginners

[LESSON] 2 - Regression 3

[LESSON] 6 - Time Series 1

[TRANSLATIONS] - Korean

Time requirements?

The course is described as being "12 weeks, 24 lessons", but it doesn't give an idea as to how much time is required to complete each lesson.

I'd like to do the course, but I need to figure out if I've got time in my current schedule.

PAT Rubrics

Build all PAT Rubrics in the discussion board for each lesson grouping.

[LESSON] 1 - Introduction to Machine Learning

[LESSON] 5 - NLP 4

[LESSON] 2 - Regression 2

[Translations] - Simplified Chinese

[LESSON] 2 - Regression 4

[REQUEST] - Sketchnotes for major modules

@dasani-madipalli and @girliemac

I'd love to include one sketchnote (WebDev style if possible!) for each of the big topics in this course, to be added to each introductory lesson:

Do you think it's possible? 🙏❤️

NLP 5 - last lesson for NLP

The last lesson, finishing NLTK for sentiment analysis.

[LESSON] 8 - Real World Applications

[LESSON]Microsoft machine learning basic

[LESSON] 3 - Classification 2

[LESSON] Have to learn

[LESSON]

[LESSON] 5 - NLP 3

Reviews for NLP Module

Hi @jlooper,

I really love the way these learning materials and the curriculum are designed. However, I have some review comments that might help make them better. I have no comments on Sections 1 and 3, which I feel are great in their current form. Do let me know if I should make a PR addressing any of these comments.

1-Introduction-to-NLP

This section looks pretty good to me and I don't have any reviews which might help make this better.

2-Tasks

For this sentence

🎓 Tokenization Probably the first thing most NLP algorithms have to do is split the text into tokens, or words. While this sounds simple, having to account for punctuation and different language's word and sentence delimiters can make it tricky.

maybe we could give some intuition on why tokenization involves more than splitting a sentence wherever there is whitespace, by adding:

Though it might seem very straightforward to simply split your sentence into words, you might have to use some other methods or build on top of this too.
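To make the point concrete, here is a minimal sketch comparing a plain whitespace split with a simple rule-based tokenizer. The sample sentence and the regular expression are illustrative assumptions, not taken from the lesson, and real tokenizers (for example those in NLTK) handle many more cases.

```python
import re

# Sample sentence chosen for illustration only.
text = "Don't panic: it's only a test, isn't it?"

# Naive approach: split on whitespace; punctuation stays glued to words.
whitespace_tokens = text.split()

# Hypothetical rule-based tokenizer: keep words (including inner apostrophes)
# together and emit every punctuation mark as its own token.
TOKEN_PATTERN = re.compile(r"\w+(?:'\w+)*|[^\w\s]")
regex_tokens = TOKEN_PATTERN.findall(text)

print(whitespace_tokens)
print(regex_tokens)
```

With the whitespace split, "panic:" and "test," keep their punctuation attached, so they would not match the plain words "panic" and "test" in a vocabulary. The pattern-based version separates them.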

For the "Tasks common to NLP" section, I feel we should add a mention of "word embeddings" owing to their importance. Maybe we could add something like this:

🎓 Embeddings Embeddings are a way to convert your text data into numbers in a meaningful way. This is done so that words with a similar meaning, or words used together, cluster together in a high-dimensional space.

Optionally we could also add:

Try playing around with word embeddings from a popular model (Word2Vec) here. Can you see how clicking on one word shows words with similar meanings clustering around it? E.g., if you inspect the word 'toy', you see it clusters with words such as 'disney', 'lego', 'playstation', and 'console'.

However, I do understand this might go a bit too deep at this stage. What do you think?
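As a rough illustration of what "clustering together" means, the sketch below ranks words by cosine similarity. The three-dimensional vectors are made up for this example. A real model such as Word2Vec learns vectors with far more dimensions from a large corpus.

```python
import math

# Made-up 3-dimensional vectors, for illustration only.
toy_embeddings = {
    "toy": [0.9, 0.8, 0.1],
    "lego": [0.8, 0.9, 0.2],
    "console": [0.7, 0.6, 0.3],
    "tax": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, embeddings):
    """Rank every other word by its similarity to `word`, most similar first."""
    query = embeddings[word]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = most_similar("toy", toy_embeddings)
for other, score in ranking:
    print(f"{other:8s} {score:.3f}")
```

In this toy data, 'lego' and 'console' point in nearly the same direction as 'toy', while 'tax' does not, so it ranks last.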

3-Translation-Sentiment

This section looks pretty good to me and I don't have any reviews which might help make this better.

[LESSON]

[Content]: Real World Applications for Classic Machine Learning - Select an area

To complete the lessons, we want to add an overview of ML as it is used in the real world. This will focus on classic ML only, so don't worry about neural networks for this curriculum. Pick a domain and write a paragraph about it as a reply to this issue, and I'll add it to the lesson and credit you!

Finance
  Credit card fraud detection
  Wealth management
Education
  Predicting student behavior
  Preventing plagiarism
  Course recommendations
Retail
  Personalizing the customer journey
  Inventory management
Health Care
  Optimizing drug delivery
  Hospital re-entry management
  Disease management
Ecology and Green Tech
  Forest management
  Motion sensing of animals
  Energy management
Insurance
  Actuarial tasks
Arts, Culture, and Literature
  Fake news detection
  Classifying artifacts
Marketing
  'Ad words'

Reviews for Clustering 2 Notebook

Hi @jlooper,

I reviewed the solution notebook 5-Clustering/2-K-Means. Here are the experiments on which I base the comments below: https://colab.research.google.com/drive/1oIXPkQZzvJClRaCoEcpznKw3RhXOTPay?usp=sharing

It turns out we are trying to cluster the other features to get 3 categories (artist genres), but there is almost no correlation between our features and our expected cluster bases (see this cell: https://colab.research.google.com/drive/1oIXPkQZzvJClRaCoEcpznKw3RhXOTPay#scrollTo=cpfSV8Yem1H9&line=5&uniqifier=1). The only reasonably strong correlation we get is between loudness and energy, which again is not a problem a clustering algorithm would solve.

I think clustering would not be very useful for this problem as a whole. What do you think?

Issue in README

Issue in the markdown: it says no repo found on Android, etc.
File attached: screenshot.png

تتصخ

Translations files are in wrong folder for lesson groupings

Task: move all translations to their proper folders.

[Content] - Introduction to ML

Please create an outline of how you'd like to write this lesson

Reviews for Clustering Module Part 1

Hi @jlooper,

Here are my review comments for the Clustering Module. Note that I have only reviewed Part 1 so far, as you asked.

In this sentence

Derived from mathematical terminology, non-flat vs. flat geometry refers to the measure of distances between points by either 'flat' (non-Euclidean) or 'non-flat' (Euclidean) geometrical methods.

I think there might be a typo here which could confuse readers. By flat geometry (not a flat object) we mean measuring distances, areas, and volumes using Euclidean distance, i.e. following Euclidean geometry. Thus the sentence should be rewritten as:

Derived from mathematical terminology, non-flat vs. flat geometry refers to the measure of distances between points by either 'flat' (Euclidean) or 'non-flat' (non-Euclidean) geometrical methods.

Quote from sklearn docs to support my suggestion:

"Non-flat geometry clustering is useful when the clusters have a specific shape, i.e. a non-flat manifold, and the standard euclidean distance is not the right metric."

In this sentence

'Flat' in this context refers to Euclidean geometry (parts of which are taught as 'plane' geometry), and non-flat refers to non-Euclidean geometry

I think adding a mention of the difference in how distances are measured would be quite helpful, since most readers probably already know how to calculate Euclidean distance and could build their intuition on top of that. What do you think?
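One way to illustrate the difference is to compare the straight-line (Euclidean) distance with a distance measured along a curved shape. The sketch below uses the unit circle as the curved shape. This is an assumption chosen for simplicity, not an example from the lesson or from the sklearn docs.

```python
import math


def euclidean_distance(p, q):
    """Straight-line ('flat') distance between two points."""
    return math.dist(p, q)


def arc_distance_on_unit_circle(p, q):
    """Length of the shorter arc between two points lying on the unit circle."""
    angle_p = math.atan2(p[1], p[0])
    angle_q = math.atan2(q[1], q[0])
    diff = abs(angle_p - angle_q) % (2 * math.pi)
    return min(diff, 2 * math.pi - diff)


# Two opposite points on the unit circle.
a = (1.0, 0.0)
b = (-1.0, 0.0)

straight = euclidean_distance(a, b)           # cuts through the circle
along_curve = arc_distance_on_unit_circle(a, b)  # follows the circle

print(f"Euclidean: {straight:.3f}, along the circle: {along_curve:.3f}")
```

The straight line between the two points has length 2, while the path that stays on the circle has length π. When clusters follow a curved shape like this, the Euclidean distance can understate how far apart two points really are along that shape.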

This one might seem like me drawing attention to very small aspects of the material 😅 but I think it might be important for beginners, so I will leave this completely up to you.

The only strong correlation is between energy and loudness, which is not too surprising, given that loud music is usually pretty energetic. Otherwise, the correlations are relatively weak. It will be interesting to see what a clustering algorithm can make of this data.

This sentence is quite simple, but I think it might lead beginner readers to make wrong assumptions. How do I know? Well, I had this wrong intuition in my mind when starting with ML too!

Anyway, here we say that there is a good correlation between the energy of a song and its loudness. I personally think it is very important to specify here:

Correlation does not imply causation, and the two should not be confused: we have proof of correlation but no proof of causation.

Optionally we could link readers to Tyler Vigen's famous spurious correlations blog/book. I think this is a quite important and widely misunderstood aspect which would be best suited here.
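A small simulation can show how two quantities correlate strongly without one causing the other. In this sketch both series are driven by a third variable. The scenario (temperature, ice cream sales, sunburn cases) and all numbers are invented for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical common cause: daily temperature.
temperature = [rng.uniform(10, 35) for _ in range(200)]

# Both series depend on temperature plus independent noise.
# Neither one is computed from the other.
ice_cream_sales = [2.0 * t + rng.gauss(0, 3) for t in temperature]
sunburn_cases = [0.5 * t + rng.gauss(0, 1) for t in temperature]

r = pearson(ice_cream_sales, sunburn_cases)
print(f"correlation between sales and sunburn: {r:.2f}")
```

The correlation comes out high even though, by construction, ice cream sales have no effect on sunburn. The same caution applies to energy and loudness: the data shows they move together, not why.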

In this sentence

You can discover concentric circles around a general point of convergence, showing the distribution of points. In general, the three genres align loosely in terms of their popularity and danceability. Determining clusters in this loosely-aligned data will be interesting:

Do you think we should tell readers a bit more about these graphs, maybe something like:

Here we use a KDE (Kernel Density Estimate) graph, which represents the data using a continuous probability density curve. This allows us to interpret the data easily, especially when working with multiple distributions.
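For readers curious about what such a curve is made of, here is a minimal one-dimensional sketch of a Gaussian kernel density estimate written from scratch. The sample values and the bandwidth are assumptions for illustration. In practice a plotting library would compute and draw the curve for you.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at x from `samples`.

    Each sample contributes a small Gaussian bump of width `bandwidth`;
    the estimate is the average of those bumps.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Made-up popularity scores, for illustration only.
popularity = [20, 22, 25, 40, 42, 43, 45, 47, 70]
density = gaussian_kde(popularity, bandwidth=5.0)

# Crude text plot: longer bars mean higher estimated density.
for x in range(10, 81, 10):
    print(f"{x:3d} {'#' * round(density(x) * 1000)}")
```

The curve is highest where many samples sit close together and stays smooth between them, which is what makes overlapping distributions easier to compare than with histograms.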

