DataScience

Why You Might Be Interested in This Repository

This repository holds the lessons I've learned as I've explored various sub-disciplines of data science (Python, R, statistics, neural network programming, etc.). Most of the insights and code come from my MS in Data Science studies at Indiana University, which I plan to complete in May 2020.

I hope you learn something useful as you read the code and discussions in the topics below.

Topics

My Story

I have been a data geek all my life. For science fair projects, I didn't build baking soda volcanoes; instead, I analyzed weather forecast accuracy and factors in student fitness. As an undergrad at Princeton, I aced linear algebra, multivariable calculus, and several math-track economics courses. My interests in social justice and statistical analysis aligned as I pored over US Census Bureau data to identify gentrification trends for a course in urban economics. Continuing in this vein, I enrolled in a graduate-level course in Comparative Urban Development, where I happily immersed myself in analyzing migration and growth patterns in India.

Shortly after my graduation in 1983, I joined a Christian humanitarian organization as the administrator of a nutrition and health education project in West Africa. To increase our effectiveness, my wife and I were required to abandon our American lifestyle in favor of local dress, custom, and language. Fortunately, I was not called upon to abandon my passion for data.

The previous project management had collected children's weight and age data as points on a scatter plot. While this permitted a rough evaluation of the health of enrolled children, my leadership and I wanted to explore questions such as:

  • What percentage of the children were improving or regressing? Our intervention would be different for perpetually malnourished children than for those who fluctuated between malnutrition and proper development.
  • Was there a critical age span where children were at greatest risk? If so, we wanted to focus interventions on that cohort.
  • Were children in some of the five regions struggling more than in others? If so, was it the result of random variance or a trend that warranted further investigation? We might, for example, want to give extra training to volunteers at centers serving the most at-risk children.

To answer these questions, I designed a dBase III-based system for tracking each child's age and weight. A colleague regularly exported the data to a spreadsheet, which I used to generate analyses and visualizations for our stakeholders.
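The first question above (what share of children were improving or regressing) could be approached today with a few lines of Python. The following is only a minimal sketch under stated assumptions: the records, the field layout, the `classify_trends` and `trend_percentages` helpers, and the rule of comparing first and last recorded weight are all hypothetical. It does not reproduce the original dBase III system or its data, and a real assessment would compare weight-for-age against a growth reference, which this sketch does not attempt.

```python
from collections import defaultdict

# Hypothetical records: (child_id, region, age_in_months, weight_kg).
# These values are invented purely for illustration.
RECORDS = [
    ("c1", "north", 12, 8.0),
    ("c1", "north", 15, 8.9),
    ("c2", "north", 20, 9.5),
    ("c2", "north", 23, 9.1),
    ("c3", "south", 10, 7.2),
    ("c3", "south", 13, 7.2),
    ("c4", "south", 30, 11.0),
    ("c4", "south", 33, 11.8),
]


def classify_trends(records):
    """Label each child by weight change between first and last visit.

    Assumption: raw weight change stands in for "improving" or
    "regressing"; this is a simplification, not a clinical measure.
    """
    visits = defaultdict(list)
    for child_id, _region, age, weight in records:
        visits[child_id].append((age, weight))

    trends = {}
    for child_id, points in visits.items():
        points.sort()  # order visits by age in months
        change = points[-1][1] - points[0][1]
        if change > 0:
            trends[child_id] = "gaining"
        elif change < 0:
            trends[child_id] = "losing"
        else:
            trends[child_id] = "flat"
    return trends


def trend_percentages(trends):
    """Return the percentage of children carrying each trend label."""
    counts = defaultdict(int)
    for label in trends.values():
        counts[label] += 1
    total = len(trends)
    return {label: 100.0 * n / total for label, n in counts.items()}


if __name__ == "__main__":
    print(trend_percentages(classify_trends(RECORDS)))
```

Grouping the same labels by the region field, or by age band, would be one way to start on the second and third questions, though separating a real regional trend from random variance would call for a proper statistical test on top of these counts.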

In the early 90s, I became a Merrill Lynch financial consultant. I am probably the only retail broker in the history of the firm who taught himself C in order to write an asset allocation recommendation system based on modern portfolio theory! This was the beginning of a course of self-study in programming that led to a lasting career shift.

Since 1994 I have worked as a software engineer and architect. I have coded at every application layer in at least a dozen languages; debugged assembler code in Windows 98; worked with clients to craft system requirements and architecture; led development teams; designed system collaborations via both proprietary APIs and industry-standard SOA/message exchanges; and blogged shamelessly about it all. But my favorite assignments have always involved intensive data analysis. As a Microsoft consultant, for example, I led customers through data-intensive performance labs. More recently, I have analyzed method call time-series data to identify the root causes of system outages in an extremely large federal system, and have optimized Spark jobs in a big data curation/analysis system for the Defense Intelligence Agency. I am currently working on a microservices architecture project for the VA.

Contributors

chrisfalter
