Comments (13)
Oh, a few days ago, I came across this: https://github.com/BillPetti/baseballr
If you haven't seen this before, it's worth a follow.
-Drew
From: Andrew Martin
Sent: Monday, March 14, 2016 2:03 AM
To: almartin82/projprep
Cc: Griffith, Warren Andrew (Analytics & Decision Support Admin)
Subject: Re: [projprep] include regular steamer (#19)
OK, I wrote the core fangraphs scrape function - it's on a new branch (https://github.com/almartin82/projprep/blob/72e7c722d9e3dcb8715607f0776d0153acf24040/R/fangraphs.R).
usage is pretty basic (https://github.com/almartin82/projprep/blob/steamer/tests/testthat/test_fangraphs.R) right now, and per #29, there are some variable-type issues that I need to figure out right away. but getting there!
from projprep.
looking more closely at this, should try to generalize the fangraphs scrape to cover all 5 projection systems they display
from projprep.
I'll look tomorrow and get back to you.
from projprep.
hey, @drewgriffith15 - didn't mean to spam you with notifications! just stumbled into your scripts and wanted to leave a breadcrumb for myself / cite my sources.
I love how compact that code is for reading in all of the steamer data by team. my current plan of attack was to turn that into a function, and then see what parameters needed to be changed to get all the other available projections - ZIPS, fangraphs fans, etc.
if you'd like to join in, would love to have you on this project. in your 2015 script, were you implementing standings gain points? haven't ever tried that, and would definitely be interested in seeing how valuations differ for SGP vs z-score approaches.
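For anyone following along, here is a rough sketch of how the two valuation approaches differ. The players, stat lines and SGP denominators below are all made up for illustration; real SGP denominators come from a league's historical standings, and nothing here is taken from projprep itself.

```r
# Hypothetical player pool; names and stats are illustrative only.
hitters <- data.frame(
  name = c("A", "B", "C", "D"),
  r    = c(100, 90, 80, 70),
  sb   = c(5, 30, 10, 15),
  stringsAsFactors = FALSE
)

# z-score approach: standardize each category across the pool, then sum.
zscore <- function(x) (x - mean(x)) / sd(x)
hitters$z_value <- zscore(hitters$r) + zscore(hitters$sb)

# SGP approach: divide each stat by the amount needed to gain one
# standings point in that category (assumed denominators), then sum.
sgp_denoms <- c(r = 20, sb = 8)
hitters$sgp_value <- hitters$r / sgp_denoms[["r"]] +
  hitters$sb / sgp_denoms[["sb"]]

# Compare how the two methods rank the same players.
hitters[order(-hitters$z_value), c("name", "z_value", "sgp_value")]
```

The z-score total depends on who else is in the pool, while the SGP total depends on the league-specific denominators, so the two rankings can disagree.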
from projprep.
OK, I wrote the core fangraphs scrape function - it's on a new branch, steamer.
usage is pretty basic right now, and per #29, there are some variable-type issues that I need to figure out right away. but getting there!
from projprep.
Hey, feel free to use whatever code that you see on my side to build functions for Fangraphs (zips, steamer, etc.). If you want to work together on some code, I'd be up for it. Just let me know.
-Drew
from projprep.
@drewgriffith15 got all the fangraphs projections in! here's a minimal snippet:
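# install the development version from GitHub
library(devtools)
devtools::install_github('almartin82/projprep')
library(projprep)
library(dplyr)  # provides %>% and desc()
# pull the 2016 steamer projections and prep them
ex <- projprep::get_steamer(2016, TRUE)
pp <- projprep::proj_prep(ex)
# top hitters by value
pp$h_final %>% dplyr::arrange(desc(value)) %>% peek()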
will produce
   mlbid         fullname firstname    lastname position priority_pos projection_name
1 545361       Mike Trout      Mike       Trout       OF           OF         steamer
2 514888      Jose Altuve      Jose      Altuve       2B           2B         steamer
3 502671 Paul Goldschmidt      Paul Goldschmidt       1B           1B         steamer
4 519203    Anthony Rizzo   Anthony       Rizzo       1B           1B         steamer
5 457763     Buster Posey    Buster       Posey        C            C         steamer
6 621043    Carlos Correa    Carlos      Correa       SS           SS         steamer
   ab   r rbi sb  tb  obp r_zscore rbi_zscore sb_zscore tb_zscore obp_zscore
1 541 103 104 15 316 0.41     2.85     2.6077    0.5467      2.75       2.27
2 629  92  64 37 271 0.35     2.01    -0.0095    2.9794      1.53       2.20
3 538  92  92 14 286 0.40     2.01     1.8226    0.4362      1.94       2.02
4 552  92  99 10 287 0.37     2.01     2.2806   -0.0061      1.96       1.64
5 497  68  73  2 234 0.37     0.16     0.5794   -0.8908      0.53       0.77
6 572  80  83 20 262 0.34     1.08     1.2337    1.0996      1.29       1.12
  unadjusted_zsum replacement_pos adjustment_zscore final_zsum value hit_pitch
1            11.0              OF              -1.0       10.0    60         h
2             8.7              2B               1.2        9.9    59         h
3             8.2              1B               1.3        9.5    57         h
4             7.9              1B               1.3        9.2    55         h
5             1.2               C               7.0        8.1    49         h
6             5.8              SS               2.2        8.0    48         h
if you take a look at the main fangraphs scrape, you'll notice that instead of scraping pos=all x 30 teams, I'm doing 6 hitting positions x 30 teams. that was the only way I could figure out how to get the position eligibility field - it isn't in the standard projections, so I had to extract it from the url.
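A rough sketch of that position x team grid, in case it helps picture it. The URL template, the team numbering 1 to 30 and the six position codes are assumptions for illustration, not copied from the package; only the idea (one request per position/team pair, with the position read back out of the URL) comes from the comment above.

```r
# Assumed position codes and team ids; the real values may differ.
positions <- c("c", "1b", "2b", "ss", "3b", "of")
teams     <- 1:30

# One row per (position, team) pair: 6 x 30 = 180 requests.
grid <- expand.grid(pos = positions, team = teams, stringsAsFactors = FALSE)

# Hypothetical URL template for a projections page.
grid$url <- sprintf(
  "http://www.fangraphs.com/projections.aspx?pos=%s&stats=bat&type=steamer&team=%d",
  grid$pos, grid$team
)

# The scraped table has no position column, so recover it from the
# pos= query parameter of the URL that produced each table.
grid$pos_from_url <- sub(".*[?&]pos=([^&]+).*", "\\1", grid$url)
```

Each scraped table would then get its `pos_from_url` value attached as the position eligibility column before the tables are stacked together.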
I'm pretty happy with this - the only thing I wish the scrape preserved is the hyperlink for each player, which contains their fangraphs id. Having all those names and fangraphs ids would be nice to write back to the player metadata, and would help solve some of the problems around players with duplicate names (issue #31). @drewgriffith15 if you have any thoughts about ways to extract those links as part of the scrape, let me know - the current readHTMLTable call loses those hyperlinks. maybe we could combine it with something else from rvest that makes getting those links easier?
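One possible way to keep the links, sketched with rvest on a tiny inline table. The HTML, the player names, the ids and the `playerid=` link format are all invented for illustration; the real page structure would need to be checked before relying on the selector or the regex.

```r
library(rvest)

# Illustrative HTML only; the real page layout and link format are assumptions.
html <- '
<table>
  <tr><th>Name</th><th>AB</th></tr>
  <tr><td><a href="statss.aspx?playerid=111&amp;position=OF">Player A</a></td><td>541</td></tr>
  <tr><td><a href="statss.aspx?playerid=222&amp;position=2B">Player B</a></td><td>629</td></tr>
</table>'

page  <- read_html(html)
links <- html_nodes(page, "table a")

# Keep the link text (player name) alongside the href it points to.
player_links <- data.frame(
  fullname = html_text(links),
  href     = html_attr(links, "href"),
  stringsAsFactors = FALSE
)

# Pull the numeric id out of the playerid= query parameter.
player_links$fg_id <- sub(".*playerid=([0-9]+).*", "\\1", player_links$href)
```

The resulting name/id pairs could then be joined back onto the table that readHTMLTable returns, as long as both are read from the same page in the same row order.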
from projprep.
I got the package to load from github. I was running a couple versions back and when I installed the new version, it worked!
from projprep.
There is a dependency on the ensurer package when you run that snippet of code you sent me.
-Drew
from projprep.
ah, ok -- I have ensurer in suggests but maybe I will move all of those to imports so that they install cleanly.
http://r-pkgs.had.co.nz/description.html
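For context, packages listed under Suggests are not installed by default by `install_github()`, which is why the snippet failed on a fresh install. If a dependency does stay in Suggests, the usual pattern is to check for it at the point of use. A minimal sketch follows; `require_suggested()` is a hypothetical helper name, not something in projprep.

```r
# Hypothetical helper: fail with a clear message when an optional
# (Suggests-only) package is missing, instead of a cryptic error later.
require_suggested <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      sprintf(
        "Package '%s' is required for this feature. Install it with install.packages('%s').",
        pkg, pkg
      ),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Usage: call the guard before touching the optional package.
require_suggested("stats")
```

Moving the package to Imports avoids the need for a guard, at the cost of making it a hard install requirement.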
from projprep.
@drewgriffith15 just pushed that fix to master so that you can re-install. if you want to contribute (please do) I would also suggest cloning the repo locally on your machine, which makes it easier to pull down these changes and use them locally.
I try to use the branch workflow to manage these kinds of distributed projects.
from projprep.
Cool. I've got notifications set up, so I saw it. I will do my best to contribute and to use the branches. Never worked on a collaborative project on GitHub, so hopefully I can get it right the first time. I already noticed something else. Found an error with this:
ex <- projprep::get_fangraphs(2016, TRUE)
Error in names(df)[2] <- "fg_note" :
'names' attribute [2] must be the same length as the vector [1]
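This message is what R gives when a second column name is assigned to an object that only has one column. The sketch below reproduces it on a made-up one-column data frame and shows one defensive alternative. Whether the scrape actually returned a single column here is an assumption, and `rename_second_col()` is a hypothetical helper, not the package's code.

```r
# Hypothetical stand-in for a scrape that came back with a single column.
df <- data.frame(name = c("Player A", "Player B"), stringsAsFactors = FALSE)

# Renaming column 2 of a one-column data frame raises the error;
# capture the message instead of stopping.
msg <- tryCatch({
  names(df)[2] <- "fg_note"
  "no error"
}, error = function(e) conditionMessage(e))

# Defensive version: check the shape first and fail with a clearer message.
rename_second_col <- function(df, new_name) {
  if (ncol(df) < 2) {
    stop(sprintf("expected at least 2 columns, got %d", ncol(df)), call. = FALSE)
  }
  names(df)[2] <- new_name
  df
}
```

A check like this would point at the unexpected table shape coming back from the scrape rather than at the rename step.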
from projprep.
Awesome -- happy to help w/ git collaboration. This set of Atlassian writeups is awesome - super intuitive intro to different models of git collaboration. 'Feature branch workflow' is good bang for the buck - not nearly as complicated as the gitflow workflow, but you get like 80% of the benefits. The good thing about this is if you fork projprep and clone it locally, you'll be committing to your own branch of the project - there's literally no way that it could break anything on this branch. Once you have something you want to contribute back, you just create a pull request, and I can bring those changes back into the code base.
This is how I work even when I am the only contributor to a project -- if you look at the network of commits, you'll see a feature branch leave master, a bunch of commits happen, and then a pull request merges the changes back into master.
That has the advantage of keeping master stable - the bleeding edge changes live on a branch until they are ready for production. On projects where I have another lead collaborator (like mapvizieR with @chrishaid), we have established the convention of always submitting our branches to each other for code review and approval. I really like this workflow.
from projprep.