
book-of-models's Introduction



Michael's GitHub Space




I'm Michael, and I do all manner of things within the realm of data science. Here you'll find source code for modeling, packages, general programming, and various other things.




book-of-models's People

Contributors

m-clark, saberry


book-of-models's Issues

chapter goals

Maybe the intro/key-ideas sections are enough, but should we add a 'goals' section at the beginning of each chapter? For example, for the LM chapter:

Chapter goals:

  • Understand what a model is conceptually
  • Understand what a linear model is and how features are mapped to the target
  • Be able to get predictions from our model
  • Be able to understand the results of a model at a basic level
  • Get a sense of complexity and other issues
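As a rough illustration of the second and third goals, here is a minimal sketch of a one-feature linear model in plain Python. The function names (`fit_simple_lm`, `predict`) and the toy numbers are made up for this example and are not taken from the book's code.

```python
def fit_simple_lm(x, y):
    """Least-squares intercept and slope for a single feature."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(intercept, slope, x_new):
    """Map feature values to predicted targets: intercept + slope * x."""
    return [intercept + slope * xi for xi in x_new]


# Toy data (invented for illustration) that follow y = 1 + 2x exactly.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_simple_lm(x, y)
predictions = predict(intercept, slope, [5.0])
print(intercept, slope, predictions)
```

The point is only that the fitted coefficients are the mapping from feature to target, and a prediction is that mapping applied to new feature values.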

Cleanup/consistency list

Bigger changes

  • All chapters should have key ideas, why this matters, good to know, where to go, wrapping up sections (and exercises if appropriate); reference per chapter to be determined (#17)
  • All chapters/sections (at least level 2) need to be linkable and linked throughout the text
  • Add content to preface/acknowledgments/misc models
  • #19
  • Spelling/typo checks

Data

  • Data needs to be linked, possibly early on, even in the preface.
  • There should be only one version of a single data source being used (e.g. not one with and without some features)

Potential Content Trimming

  • Merge overlapping content
    • LM and model exploration chapter
    • Data, ML other models, and misc models chapters
  • ML concepts gets pretty deep, could trim a lot there
  • Could remove what amounts to a demo chapter in the intro
  • Could remove some appendix content (matrix, nix simulation, bayes)
  • Drop some deeper code examples (e.g. estimation beyond the estimation chapter may need to be nixed for length). Just comment out.
  • LM Part needs consistency update

Code

  • Code needs to be cleaned/made consistent between SB and MC
    • R:
      • No <- assignment for R code
      • 4 space indent?
    • Py:
      • uglify to PEP standards
      • Other?
    • General:
      • No more than 80 char for pdf
      • !! Make sure the printed code runs as is !!
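The 80-character limit could be checked automatically. Below is a minimal sketch, assuming the chapters are Quarto/Markdown text with triple-backtick fences around code chunks. The helper name `long_code_lines` and the sample text are hypothetical.

```python
FENCE = "`" * 3  # the triple-backtick fence that opens and closes a chunk


def long_code_lines(text, limit=80):
    """Return (line_number, length) for code-chunk lines longer than `limit`."""
    hits = []
    in_chunk = False
    for number, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith(FENCE):
            in_chunk = not in_chunk  # a fence toggles chunk state
            continue
        if in_chunk and len(line) > limit:
            hits.append((number, len(line)))
    return hits


# Hypothetical sample: one long prose line (ignored) and one long code line.
sample = "\n".join([
    "Prose outside chunks is ignored, " + "however long it gets. " * 5,
    FENCE + "{python}",
    "x = 1",
    "total = " + " + ".join(["1"] * 40),
    FENCE,
])

print(long_code_lines(sample))
```

In practice this would be run over each chapter file's contents. Only lines inside fences are checked, since prose is reflowed in the pdf but code is not.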

Tables & Figures

  • Make sure all figures and tables are linkable/cross-referenceable
  • Make sure no labels have underscores (LaTeX can't handle them)
  • Make sure no gt tables use color (LaTeX can't handle it)
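The underscore check on labels could also be scripted. This is a sketch only. It assumes labels appear either as `#| label:` chunk options or as `{#...}` attributes, and the function name and sample labels are invented for illustration.

```python
import re

# Two places a label can appear (an assumption about how the chapters are written):
LABEL_PATTERNS = [
    re.compile(r"^#\|\s*label:\s*(\S+)", re.MULTILINE),  # "#| label: fig-x"
    re.compile(r"\{#([^\s}]+)"),                          # "{#fig-x}"
]


def labels_with_underscores(text):
    """Return the sorted, de-duplicated labels that contain an underscore."""
    found = []
    for pattern in LABEL_PATTERNS:
        found.extend(pattern.findall(text))
    return sorted({label for label in found if "_" in label})


# Hypothetical sample text with two offending labels and one clean one.
sample = "\n".join([
    "#| label: fig-lm_fit",
    "#| label: tbl-metrics",
    "![Caption](img/plot.svg){#fig-shap_summary}",
])

print(labels_with_underscores(sample))
```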

Other that needs to be done

  • Add simulation demo/discussion if space permits
  • Chapter title/section casing (use css if possible)
  • Dataset descriptions
  • Bold should be for key words, italics just for emphasis
  • Save out figures as separate files if not already. Preferably svg.
  • Move programming discussion in ML to a general one in Part 3 or appendix. Trim discussion in intro?
  • quarto-dev/quarto-cli#7856
  • Fix missing crossref

Lesser issues

  • Want to remove chapter-specific folders. Keep everything in just data and img.
  • All the graphical models need separate dot files and should look the same
  • All chunks should be named
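Unnamed chunks could be found with a small script. This is a minimal sketch: it treats a chunk as named if it has a `#| label:` option directly after the opening fence, or an inline name such as `{r setup}`. The function name and sample are hypothetical.

```python
import re

# Opening fence of an executable chunk, e.g. three backticks then {python} or {r setup}.
CHUNK_OPEN = re.compile(r"^`{3}\{(\w+)([^}]*)\}\s*$")
LABEL_OPTION = re.compile(r"#\|\s*label:\s*\S+")


def unnamed_chunks(text):
    """Return 1-based line numbers of chunk openings that have no name."""
    lines = text.splitlines()
    missing = []
    for index, line in enumerate(lines):
        match = CHUNK_OPEN.match(line)
        if not match:
            continue
        # Inline name: first item after the engine, unless it is a key=value option.
        inline = match.group(2).strip(" ,")
        has_label = bool(inline) and "=" not in inline.split(",")[0]
        # Scan the "#|" option lines that directly follow the opening fence.
        for option in lines[index + 1:]:
            if not option.startswith("#|"):
                break
            if LABEL_OPTION.match(option):
                has_label = True
        if not has_label:
            missing.append(index + 1)
    return missing


FENCE = "`" * 3
# Hypothetical sample: a labeled chunk, an unlabeled one, and an inline-named one.
sample = "\n".join([
    FENCE + "{python}", "#| label: fit-model", "x = 1", FENCE, "",
    FENCE + "{r}", "#| echo: false", "y = 2", FENCE, "",
    FENCE + "{r setup}", "z = 3", FENCE,
])

print(unnamed_chunks(sample))
```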

Where can we best advise on things not to do?

It would be useful in particular chapters, or as part of the 'misc/more models' chapter, to have some kindly worded "don't do this" advice on problematic practices. For example: stepwise regression, just going with the p-value result and ignoring prediction, mean imputation, ignoring uncertainty, attributing causal effects where not warranted, ignoring baselines, using old models that are no longer necessary, etc.
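As one possible shape for such a "don't do this" note, here is a small sketch of why mean imputation is a problem: filling missing values with the mean keeps the mean but shrinks the spread. The numbers are made up for illustration.

```python
from statistics import mean, stdev

# Invented toy data: five observed values and five missing ones.
observed = [2.0, 4.0, 6.0, 8.0, 10.0]
n_missing = 5

# Mean imputation: every missing value is replaced by the observed mean.
fill = mean(observed)
imputed = observed + [fill] * n_missing

# The mean is unchanged, but the standard deviation is understated.
print("mean:", mean(observed), "->", mean(imputed))
print("sd:  ", round(stdev(observed), 2), "->", round(stdev(imputed), 2))
```

The understated spread carries through to anything computed from it, which ties in with the 'ignoring uncertainty' item in the same list.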

Reorg

The following only covers reorganization, not outright trimming

PART I

  • Move all the shap and related discussion to model exploration
  • Move model list in LM chapter to last chapter, but leave a bit of a preview to 'other models' (this actually wasn't that much content and was exclusively 'linear models', so left). Reminder to add others added to misc_models.
  • Should assumptions stay in LM or be moved to knowing? (leaving for now)
  • Move interactions to 'extensions' chapter, which might be renamed something to reflect nonlinear nature of the models covered there
  • Move metrics table and other discussion from ML to model exploration
  • In the knowing chapter, talk about feature importance for other models with a brief demo in model exploration (e.g. can demo RF/boost), but leave the bulk of the discussion to ML coverage
  • Update the knowing chapter to handle all the new content and rename from model_criticism.qmd

PART II

  • Trim content that appears to get too deep, duplicates content elsewhere, or otherwise just isn't necessary
    • Preface
    • Intro
    • Linear Models
    • Knowing
    • Estimation
    • GLM
    • Nonlin
    • ML concepts
    • ML models
    • ML more
    • Data
    • Causal
  • Finish last chapter (currently 'misc models')
  • Add exercises to appropriate chapters
  • Update Preface
  • Update Appendix

PART III

  • Recheck text where gaps are left and text is added so that it makes sense.
    • lm chapter
    • ml intro
    • model exploration
    • final chap

Add an exercise where applicable

Any main part with a model demonstration (or possibly something related) should have a single exercise very simply described. For example:

Use x model with the census data. Include one visualization that helps in interpretation. Compare its performance to a different model of your choosing.

I don't want to do more than guide practice, at least for now.

Chapter 0 complete before turn in

Work has been done and here is what's left to do before turn in.

  • Review CRC guidelines to see if anything big was left out
  • General proofread
  • Organizational changes needed?
  • Cleanup
    • Name chunks
    • Spelling
    • Rounding digits (no more than 2 for this chapter)
    • Plot consistency, possibly minimize
