- For data science and machine learning, R and Python are the two most widely used programming languages. For scientific computing, two other popular choices are MATLAB and Julia. I teach programming-related courses with MATLAB, R, and Python.
- MATLAB is expensive to use for commercial purposes, but many universities subscribe to its academic licenses. In addition, the GNU project offers the excellent open-source Octave, whose syntax is largely compatible with MATLAB's. The main advantage of MATLAB is that it is extremely easy to learn: you can probably start writing your own research code after a few hours of reading online tutorials.
- R, Python, and the relatively new Julia are all open source, and each has its own strengths. R is famous for statistics and data visualization, Python is a general-purpose programming language widely used in many areas, and Julia offers excellent speed.
- A popular development environment is Anaconda, which can bundle Python, R, and Julia together. Here is a very simple tutorial (in Chinese) that I wrote for my students on how to set up Anaconda.
- Here are some tests on introductory examples comparing the following algorithms. (The tests are mainly for educational purposes.)
- Machine learning can be used to solve Dynamic Programming (DP) problems approximately. DP is a powerful and widely used tool in operations research. For many problems, such as the shortest path problem (see here for the code and example I provide), exact DP is a viable tool. More often, however, its computational complexity is prohibitive, mostly due to the famous curse of dimensionality. This site has a lot of useful information on Approximate Dynamic Programming (ADP) and learning.
- On using ADP to overcome DP's curse of dimensionality, here is a recent paper I co-authored on optimizing an automated warehouse for e-commerce. We use 'rollout', a technique from reinforcement learning, to overcome the difficulty inherited from the dynamic nature of the problem. The paper was published in Transportation Research Part E.
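As a small illustration of exact DP on the shortest path problem mentioned above, here is a minimal sketch. It is not the code linked from this page: the graph `GRAPH`, its arc costs, and the function name `shortest_path` are made up for the example, and the graph is assumed to be a directed acyclic graph so that the backward recursion terminates.

```python
from functools import lru_cache

# Hypothetical directed acyclic graph: node -> {successor: arc cost}.
GRAPH = {
    "A": {"B": 2, "C": 5},
    "B": {"C": 1, "D": 4},
    "C": {"D": 1},
    "D": {},
}


def shortest_path(graph, source, target):
    """Return (cost, path) from source to target using the Bellman recursion.

    cost_to_go(node) = min over successors of (arc cost + cost_to_go(successor)),
    with cost_to_go(target) = 0. Assumes the graph has no cycles.
    """

    @lru_cache(maxsize=None)  # memoization: each node is solved only once
    def cost_to_go(node):
        if node == target:
            return 0.0, (node,)
        best = (float("inf"), ())  # stays infinite if target is unreachable
        for successor, arc_cost in graph[node].items():
            tail_cost, tail_path = cost_to_go(successor)
            candidate = arc_cost + tail_cost
            if candidate < best[0]:
                best = (candidate, (node,) + tail_path)
        return best

    return cost_to_go(source)


cost, path = shortest_path(GRAPH, "A", "D")
print(cost, path)
```

The memoized recursion visits each node once, which is what makes exact DP practical here. The curse of dimensionality appears when the "node" is a high-dimensional state and the number of states to memoize grows exponentially.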
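- The simple form of the famous Bayes rule comes from the two ways of factoring a joint probability: P(A and B) = P(A|B)*P(B) = P(B|A)*P(A). Dividing by P(B), assuming P(B) > 0, gives P(A|B) = P(B|A)*P(A)/P(B).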
- It is very useful for understanding many important concepts and techniques, and here we briefly explain one of them: binary classification.
- Here is an example of conditional probability that illustrates some related concepts in binary classification. It concerns COVID-19 rapid testing and uses some real data together with some assumptions.
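To make the link between Bayes rule and binary classification concrete, the sketch below computes the probability of being infected given a positive test (the positive predictive value). The function name and the numbers for sensitivity, specificity, and prevalence are hypothetical placeholders chosen for illustration; they are not the real data used in the linked example.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(infected | positive test) via Bayes rule.

    sensitivity = P(positive | infected)
    specificity = P(negative | not infected)
    prevalence  = P(infected)
    """
    # Numerator of Bayes rule: P(positive | infected) * P(infected)
    true_positive = sensitivity * prevalence
    # P(positive | not infected) * P(not infected)
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    # Denominator is the total probability of a positive test
    return true_positive / (true_positive + false_positive)


# Hypothetical values, for illustration only.
ppv = positive_predictive_value(sensitivity=0.80, specificity=0.98, prevalence=0.01)
print(f"P(infected | positive) = {ppv:.3f}")
```

With these assumed inputs the result is roughly 0.29: even a fairly specific test produces mostly false positives when the condition is rare, because P(B|A) and P(A|B) can differ greatly when P(A) is small.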