DataScience

Why You Might Be Interested in This Repository

This repository holds the lessons I've learned as I've explored various sub-disciplines of data science (Python, R, statistics, neural network programming, etc.). Most of the insights and code come from my MS in Data Science studies at Indiana University, which I plan to complete in May 2020.

I hope you learn something useful as you read the code and discussions in the topics below.

Topics

My Story

I have been a data geek all my life. For science fair projects, I didn't build baking soda volcanoes; instead, I analyzed weather forecast accuracy and factors in student fitness. As an undergrad at Princeton, I aced linear algebra, multivariable calculus, and several math-track economics courses. My interests in social justice and statistical analysis aligned as I pored over US Census Bureau data to identify gentrification trends for a course in urban economics. Continuing in this vein, I enrolled in a graduate-level course in Comparative Urban Development, where I happily immersed myself in analyzing migration and growth patterns in India.

Shortly after my graduation in 1983, I joined a Christian humanitarian organization as the administrator of a nutrition and health education project in West Africa. To increase our effectiveness, my wife and I were required to abandon our American lifestyle in favor of local dress, custom, and language. Fortunately, I was not called upon to abandon my passion for data.

The previous project management had collected children's weight and age data as points on a scatter plot. While this permitted a rough evaluation of the health of enrolled children, my leadership and I wanted to explore questions such as:

What percentage of the children were improving or regressing? Our intervention would be different for perpetually malnourished children than for those who fluctuated between malnutrition and proper development.
Was there a critical age span where children were at greatest risk? If so, we wanted to focus interventions on that cohort.
Were children in some of the five regions struggling more than in others? If so, was it the result of random variance or a trend that warranted further investigation? We might, for example, want to give extra training to volunteers at centers serving the most at-risk children.

To answer these questions, I designed a dBase III-based system for tracking each child's age and weight. A colleague regularly exported the data to a spreadsheet which I used to generate analyses and visualizations for our stakeholders.

In the early 90s, I became a Merrill Lynch financial consultant. I am probably the only retail broker in the history of the firm who taught himself C in order to write an asset allocation recommendation system based on modern portfolio theory! This was the beginning of a course of self-study in programming that led to a lasting career shift.

Since 1994 I have worked as a software engineer and architect. I have coded at every application layer in at least a dozen languages; debugged assembler code in Windows 98; worked with clients to craft system requirements and architecture; led development teams; designed system collaborations via both proprietary APIs and industry-standard SOA/message exchanges; and blogged shamelessly about it all. But my favorite assignments have always involved intensive data analysis. As a Microsoft consultant, for example, I led customers through data-intensive performance labs. More recently, I have analyzed method call time-series data to identify the root causes of system outages in an extremely large federal system, and have optimized Spark jobs in a big data curation/analysis system for the Defense Intelligence Agency. I am currently working on a microservices architecture project for the VA.

katerdowdy / datascience
