Comments (13)

drewgriffith15 commented on July 22, 2024

Oh, a few days ago, I came across this: https://github.com/BillPetti/baseballr

If you haven't seen this before, it's worth a follow.

-Drew


From: Andrew Martin
Sent: Monday, March 14, 2016 2:03 AM
To: almartin82/projprep
Cc: Griffith, Warren Andrew (Analytics & Decision Support Admin)
Subject: Re: [projprep] include regular steamer (#19)

OK, I wrote the core fangraphs scrape function - it's on a new branch: https://github.com/almartin82/projprep/blob/72e7c722d9e3dcb8715607f0776d0153acf24040/R/fangraphs.R

usage is pretty basic (https://github.com/almartin82/projprep/blob/steamer/tests/testthat/test_fangraphs.R) right now, and per #29, there are some variable-type issues that I need to figure out right away. but getting there!

from projprep.

almartin82 commented on July 22, 2024

looking more closely at this, should try to generalize the fangraphs scrape to cover all 5 projection systems they display

drewgriffith15 commented on July 22, 2024

I'll look tomorrow and get back to you.

On Sat, Mar 12, 2016 at 8:18 AM -0800, "Andrew Martin" wrote:

looking more closely at this, should try to generalize the fangraphs scrape to cover all 5 projection systems they display

almartin82 commented on July 22, 2024

hey, @drewgriffith15 - didn't mean to spam you with notifications! just stumbled into your scripts and wanted to leave a breadcrumb for myself / cite my sources.

I love how compact that code is for reading in all of the steamer data by team. my current plan of attack was to turn that into a function, and then see what parameters needed to be changed to get all the other available projections - ZiPS, FanGraphs Fans, etc.

if you'd like to join in, would love to have you on this project. in your 2015 script, were you implementing standings gain points? haven't ever tried that, and would definitely be interested in seeing how valuations differ for SGP vs z-score approaches.

almartin82 commented on July 22, 2024

OK, I wrote the core fangraphs scrape function - it's on a new branch, steamer.

usage is pretty basic right now, and per #29, there are some variable-type issues that I need to figure out right away. but getting there!

drewgriffith15 commented on July 22, 2024

Hey, feel free to use whatever code that you see on my side to build functions for Fangraphs (zips, steamer, etc.). If you want to work together on some code, I'd be up for it. Just let me know.

-Drew


almartin82 commented on July 22, 2024

@drewgriffith15 got all the fangraphs projections in! here's a minimal snippet:

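  # install projprep from GitHub
  library(devtools)
  devtools::install_github('almartin82/projprep')
  library(projprep)
  library(dplyr)  # provides %>%, arrange() and desc()

  # pull the 2016 steamer projections and run the valuation
  ex <- projprep::get_steamer(2016, TRUE)
  pp <- projprep::proj_prep(ex)
  # top hitters by value; peek() is assumed to be a projprep helper
  pp$h_final %>% dplyr::arrange(desc(value)) %>% peek()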

will produce

   mlbid         fullname firstname    lastname position priority_pos projection_name
1 545361       Mike Trout      Mike       Trout       OF           OF         steamer
2 514888      Jose Altuve      Jose      Altuve       2B           2B         steamer
3 502671 Paul Goldschmidt      Paul Goldschmidt       1B           1B         steamer
4 519203    Anthony Rizzo   Anthony       Rizzo       1B           1B         steamer
5 457763     Buster Posey    Buster       Posey        C            C         steamer
6 621043    Carlos Correa    Carlos      Correa       SS           SS         steamer
   ab   r rbi sb  tb  obp r_zscore rbi_zscore sb_zscore tb_zscore obp_zscore
1 541 103 104 15 316 0.41     2.85     2.6077    0.5467      2.75       2.27
2 629  92  64 37 271 0.35     2.01    -0.0095    2.9794      1.53       2.20
3 538  92  92 14 286 0.40     2.01     1.8226    0.4362      1.94       2.02
4 552  92  99 10 287 0.37     2.01     2.2806   -0.0061      1.96       1.64
5 497  68  73  2 234 0.37     0.16     0.5794   -0.8908      0.53       0.77
6 572  80  83 20 262 0.34     1.08     1.2337    1.0996      1.29       1.12
  unadjusted_zsum replacement_pos adjustment_zscore final_zsum value hit_pitch
1            11.0              OF              -1.0       10.0    60         h
2             8.7              2B               1.2        9.9    59         h
3             8.2              1B               1.3        9.5    57         h
4             7.9              1B               1.3        9.2    55         h
5             1.2               C               7.0        8.1    49         h
6             5.8              SS               2.2        8.0    48         h
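
As a quick sanity check on the output above, the summary columns appear to relate in a simple way: unadjusted_zsum looks like the sum of the five category z-scores, and final_zsum looks like unadjusted_zsum plus adjustment_zscore. The sketch below retypes the first two printed rows by hand (values are rounded as displayed, so small differences are expected) and checks that arithmetic. This is an inference from the printed table, not a description of projprep's internals.

```r
# first two rows of the printed output, typed in by hand (rounded as displayed)
h <- data.frame(
  fullname          = c("Mike Trout", "Jose Altuve"),
  r_zscore          = c(2.85, 2.01),
  rbi_zscore        = c(2.6077, -0.0095),
  sb_zscore         = c(0.5467, 2.9794),
  tb_zscore         = c(2.75, 1.53),
  obp_zscore        = c(2.27, 2.20),
  adjustment_zscore = c(-1.0, 1.2),
  stringsAsFactors  = FALSE
)

z_cols <- c("r_zscore", "rbi_zscore", "sb_zscore", "tb_zscore", "obp_zscore")

# sum the category z-scores, then apply the positional adjustment
h$unadjusted_zsum <- rowSums(h[, z_cols])
h$final_zsum      <- h$unadjusted_zsum + h$adjustment_zscore

# round only the numeric columns for display
print(data.frame(
  fullname        = h$fullname,
  unadjusted_zsum = round(h$unadjusted_zsum, 1),
  final_zsum      = round(h$final_zsum, 1)
))
```

Rounded to one decimal, both rows agree with the unadjusted_zsum and final_zsum values printed above.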

if you take a look at the main fangraphs scrape (https://github.com/almartin82/projprep/blob/f2b6a366c0bdfa4e6e0b78a23db0c8fb60718eea/R/fangraphs.R#L23), you'll notice that instead of scraping pos=all x 30 teams, I'm doing 6 hitting positions x 30 teams. that was the only way I could figure out how to get the position eligibility field - it isn't in the standard projections, so I had to extract it from the url.
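
A rough sketch of the position x team idea described above: build one request per (position, team) pair, then stamp each scraped table with the position that was used to request it. The base URL, query parameter names, position codes and the tag_position() helper are all placeholders for illustration, not the actual fangraphs URL scheme or projprep code.

```r
# assumed position codes and team ids; the real values may differ
positions <- c("c", "1b", "2b", "ss", "3b", "of")
team_ids  <- 1:30

# one row per request: 6 positions x 30 teams
grid <- expand.grid(pos = positions, team = team_ids, stringsAsFactors = FALSE)

# placeholder URL pattern (example.com), not the real endpoint
grid$url <- sprintf(
  "https://example.com/projections?type=steamer&pos=%s&team=%d",
  grid$pos, grid$team
)

# hypothetical helper: record which position a scraped table was requested
# under, since the table itself does not carry that field
tag_position <- function(df, pos) {
  df$position <- toupper(pos)
  df
}

# toy stand-in for one scraped table
scraped <- data.frame(fullname = c("Player A", "Player B"),
                      stringsAsFactors = FALSE)
tagged <- tag_position(scraped, grid$pos[1])
print(nrow(grid))
print(tagged)
```

The trade-off is six times as many requests as a single pos=all pass per team, in exchange for a position eligibility column.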

I'm pretty happy with this - the only thing I wish the scrape preserved is the hyperlink for each player, which contains their fangraphs id. Having all those names and fangraphs ids would be nice to write back to the player metadata (https://github.com/almartin82/projprep/blob/master/vignettes/universal_metadata.Rmd), and would help solve some of the problems around players with duplicate names (issue #31). @drewgriffith15 do you have any thoughts about ways to extract those links as part of the scrape? the current readHTMLTable call (https://github.com/almartin82/projprep/blob/f2b6a366c0bdfa4e6e0b78a23db0c8fb60718eea/R/fangraphs.R#L30) loses those hyperlinks. maybe we could combine it with something from rvest that makes getting those links easier?
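
One possible way to keep the links, sketched with rvest on a toy table: read the table as usual, then separately pull the href attribute of each player anchor and parse the id out of it. The HTML, the playerid query parameter and the player names and ids here are made up for illustration; the real page markup would need to be checked.

```r
library(rvest)

# toy HTML standing in for a projections table (layout is an assumption)
html <- '<table>
  <tr><th>Name</th><th>HR</th></tr>
  <tr><td><a href="player.aspx?playerid=111">Player A</a></td><td>30</td></tr>
  <tr><td><a href="player.aspx?playerid=222">Player B</a></td><td>12</td></tr>
</table>'

page <- read_html(html)

# the table values, which is roughly what readHTMLTable would return
proj <- html_table(page)[[1]]

# the anchors inside table cells carry the per-player link
links <- html_elements(page, "td a")
player_links <- data.frame(
  fullname = html_text(links),
  href     = html_attr(links, "href"),
  stringsAsFactors = FALSE
)

# pull the numeric id out of the (assumed) playerid query parameter
player_links$fg_id <- sub(".*playerid=([0-9]+).*", "\\1", player_links$href)

print(proj)
print(player_links)
```

Joining player_links back to the table by row order or by name would then attach an id to each projection row, which is what would help with duplicate names.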

drewgriffith15 commented on July 22, 2024

I got the package to load from github. I was running a couple versions back and when I installed the new version, it worked!

drewgriffith15 commented on July 22, 2024

There is a dependency on the ensurer package when you run that snippet of code you sent me.

-Drew


almartin82 commented on July 22, 2024

ah, ok -- I have ensurer in Suggests, but maybe I will move all of those
to Imports so that they install cleanly.
http://r-pkgs.had.co.nz/description.html
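
For context, packages listed under Suggests are not installed by default alongside the package, which is why the snippet failed on a machine without ensurer. If a package stays in Suggests rather than moving to Imports, a common pattern is to check for it at the point of use. A minimal sketch, where require_suggested() is a hypothetical helper and not part of projprep:

```r
# hypothetical helper: fail early with a readable message when a
# Suggests-only package is not installed
require_suggested <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      "Package '", pkg, "' is needed for this function. ",
      "Install it with install.packages('", pkg, "').",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# 'stats' ships with R, so this check passes silently
require_suggested("stats")
```

Moving the package to Imports, as proposed above, avoids the need for this check because the dependency is then installed with the package.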

On Tue, Mar 15, 2016 at 11:00 AM, Drew Griffith wrote:

There is a dependency on the ensurer package when you run that snippet of
code you sent me.

-Drew


almartin82 commented on July 22, 2024

@drewgriffith15 just pushed that fix to master so that you can re-install. if you want to contribute (please do) I would also suggest cloning the repo locally on your machine, which makes it easier to pull down these changes and use them locally.

I try to use the branch workflow to manage these kinds of distributed projects.

drewgriffith15 commented on July 22, 2024

Cool. I've got notifications set up, so I saw it. I will do my best to contribute and to use the branches. I've never worked on a collaborative project on GitHub, so hopefully I can get it right the first time. I already noticed something else. Found an error with this:

ex <- projprep::get_fangraphs(2016, TRUE)
Error in names(df)[2] <- "fg_note" :
'names' attribute [2] must be the same length as the vector [1]
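
This error message is what base R raises when code tries to name a second column on a data frame that has only one. One guess, not confirmed from the projprep source, is that a scraped table came back with fewer columns than expected. The sketch below reproduces the error on a toy data frame and shows a defensive check; rename_note_col() is a hypothetical helper.

```r
# a one-column data frame, standing in for a scrape that returned
# fewer columns than expected (an assumed cause)
df <- data.frame(name = c("a", "b"), stringsAsFactors = FALSE)

# assigning a name to a second column that does not exist raises the error
res <- tryCatch(
  {
    names(df)[2] <- "fg_note"
    "ok"
  },
  error = function(e) conditionMessage(e)
)
print(res)

# hypothetical guard: only rename when the second column is present
rename_note_col <- function(df) {
  if (ncol(df) >= 2) {
    names(df)[2] <- "fg_note"
  }
  df
}

print(names(rename_note_col(df)))
print(names(rename_note_col(data.frame(a = 1, b = 2))))
```

A guard like this only hides the symptom; checking why the table has a single column for that request would be the real fix.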


From: Andrew Martin
Sent: Tuesday, March 15, 2016 11:16 AM
To: almartin82/projprep
Cc: Griffith, Warren Andrew (Analytics & Decision Support Admin)
Subject: Re: [projprep] include regular steamer (#19)

@drewgriffith15 just pushed that fix to master so that you can re-install. if you want to contribute (please do) I would also suggest cloning the repo locally on your machine, which makes it easier to pull down these changes and use them locally.

I try to use the branch workflow (https://git-scm.com/book/en/v2/Git-Branching-Branching-Workflows) to manage these kinds of distributed projects.

almartin82 commented on July 22, 2024

Awesome -- happy to help w/ git collaboration. This set of Atlassian writeups is a super intuitive intro to different models of git collaboration. 'Feature branch workflow' is good bang for the buck - not nearly as complicated as the gitflow workflow, but you get like 80% of the benefits. The good thing about this is that if you fork projprep and clone it locally, you'll be committing to your own copy of the project - there's no way that it could break anything in this repo. Once you have something you want to contribute back, you just create a pull request, and I can bring those changes back into the code base.

This is how I work even when I am the only contributor to a project -- if you look at the network of commits, you'll see a feature branch leave master, a bunch of commits happen, and then a pull request merges the changes back into master.

That has the advantage of keeping master stable - the bleeding edge changes live on a branch until they are ready for production. On projects where I have another lead collaborator (like mapvizieR with @chrishaid), we have established the convention of always submitting our branches to each other for code review and approval. I really like this workflow.
