Welcome!
Here is a non-exhaustive, work-in-progress list of resources for data science, machine learning, artificial intelligence, data and text analytics, and data visualization.
I've also included links for web and API development, programming languages, DevOps tools, cloud computing, and more.
Note that resources are listed in no particular order of preference or relevance. Well... maybe except for my blog :)
- InnoArchiTech
- Flowing Data
- KDnuggets
- R-bloggers
- Analytics Vidhya
- Statistical Modeling, Causal Inference, and Social Science
- Simply Statistics
- Walking Randomly
- FastML
- No Free Hunch
- Machine Learning Mastery
- Data Science Weekly
- Edwin Chen
- Harvard Data Science
- Awesome Data Science
- Data Science Resources
- Data science blogs
- Data Science Specialization resources
- Data Science Specialization notes
- Python Data Science Tutorials
- R Data Science Tutorials
- Machine Learning & Deep Learning Tutorials
- Learn Data Science open resources
- List of Data Science/Big Data Resources
- General Assembly's Data Science course materials
- Scikit-learn Tutorial
- theano-tutorial
- IPython Theano Tutorials
- ISLR-python
- Awesome R
- Data science IPython notebooks
- Data-Analysis-and-Machine-Learning-Projects
- machine_learning
- ipython-notebooks
- Spark Notebook
- Python Machine Learning book resources
- Python Machine Learning book FAQ
- Learning-Predictive-Analytics-with-R
- Data Science from Scratch book resources
- IPython Cookbook materials
- Python Data Science Handbook Supplemental Materials
- GitHub markdown cheatsheet
- GitHub markdown guide
- Machine learning algorithm cheat sheet
- 11 Steps for Data Exploration in R
- AI Cheat Sheet
- Data Science Cheat Sheet
- Data Science Weekly resources
- Data School resources
- Open Source Data Science Masters
- Open Source Data Science Masters - GitHub
- Choosing the right estimator
- Awesome Public Datasets
- AWS Public Datasets
- 100+ Interesting Data Sets for Statistics
- Kaggle Datasets
- FiveThirtyEight data
- Google BigQuery Public Datasets
- UCI Machine Learning Repository
- Stanford Large Network Dataset Collection
- AWS
- MongoDB
- Redis
- Memcache
- MySQL
- PostgreSQL
- BigTable
- S3
- Neo4j
- Couchbase
- Cassandra
- Riak
- HBase
- CouchDB
- Elasticsearch - Service that makes it easy to deploy, operate, and scale Elasticsearch in the AWS Cloud
- Hadoop
- Spark
- Keras: Deep Learning library for Theano and TensorFlow
- Weka
- Theano
- TensorFlow
- Hive
- Pig
- Anaconda
- Python
- R
- ggplot2
- ISLR
- Rcpp
- dplyr
- plyr
- stringr
- shiny
- knitr
- readr
- R Markdown
- tidyr
- lubridate
- lme4
- nlme
- mime
- mda
- lasso2
- lars
- digest
- reshape2
- colorspace
- RColorBrewer
- manipulate
- scales
- labeling
- proto
- randomForest
- glmnet
- caret
- ggvis
- rgl
- htmlwidgets
- leaflet
- dygraphs
- googleVis
- zoo
- RCurl
- jsonlite
- bitops
- devtools
- magrittr
- packrat
- haven
- DT
- mice
- rpart
- party
- nnet
- e1071
- kernlab
- gbm
- wordcloud
- C50
- class
- neuralnet
- tm
- gmodels
- RODBC
- princurve
- General CRAN List - By task
- General CRAN List - NLP/Text analytics
- General CRAN List
- AWS
- Kinesis - Real-time streaming data in the AWS cloud
- Firehose - Easily load real-time streaming data into AWS
- Analytics - Get actionable insights from streaming data in real-time
- Streams - Build custom applications that process or analyze streaming data for specialized needs
- Amazon EMR - Easily Run and Scale Apache Hadoop, Spark, HBase, Presto, Hive, and other Big Data Frameworks
- QuickSight - Fast, easy to use business analytics
- Machine Learning
- IoT - Easily and securely connect devices to the cloud
- Lambda - Serverless compute. AWS Lambda lets you run code without provisioning or managing servers
- EC2 - Web service that provides resizable compute capacity in the cloud
- Elastic Beanstalk - Deploy and scale web applications and services
- ElastiCache - Web service that makes it easy to deploy, operate, and scale an in-memory data store or cache in the cloud
- AWS Data Pipeline - Easily automate the movement and transformation of data
- AWS SES - Reliable, cost-effective email platform
- Google Cloud Platform
- DigitalOcean