Add function to scrape historical stats #4

Open
scottfrechette opened this issue Aug 29, 2022 · 3 comments

Comments

@scottfrechette

Have you considered adding a function to scrape historical stats for a given sport and position? Historical fantasy points by player are helpful, but the stats pages also provide the underlying stats driving those points, which could be useful for deeper insights such as the % of points from TDs.

Here's a very crude example to show the URL and output:

library(dplyr)
library(rvest)

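# Read the first table on the page, keeping both header rows as data
df_stats <- read_html('https://www.fantasypros.com/nfl/stats/qb.php?year=2021&week=1&scoring=Standard&roster=consensus&range=week') %>%
  html_table(header = FALSE) %>%
  .[[1]]

# Combine the two header rows (stat group + stat name) into single column names
cols <- paste(as.character(df_stats[1,]),
              as.character(df_stats[2,]),
              sep = "_") %>%
  gsub("^_|MISC_", "", .)

# Drop the header rows, apply the cleaned names, and convert column types
df_stats %>%
  slice(-1, -2) %>%
  rename_with(~tolower(cols)) %>%
  type.convert(as.is = TRUE) %>%
  glimpse()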
#> Rows: 120
#> Columns: 18
#> $ rank          <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 1~
#> $ player        <chr> "Kyler Murray (ARI)", "Patrick Mahomes II (KC)", "Jared ~
#> $ passing_cmp   <int> 21, 27, 38, 14, 32, 27, 42, 18, 34, 20, 21, 28, 36, 22, ~
#> $ passing_att   <int> 32, 36, 57, 20, 50, 35, 58, 23, 56, 26, 33, 51, 49, 37, ~
#> $ passing_pct   <dbl> 65.6, 75.0, 66.7, 70.0, 64.0, 77.1, 72.4, 78.3, 60.7, 76~
#> $ passing_yds   <int> 289, 337, 338, 148, 379, 264, 403, 254, 435, 321, 291, 3~
#> $ `passing_y/a` <dbl> 9.0, 9.4, 5.9, 7.4, 7.6, 7.5, 6.9, 11.0, 7.8, 12.3, 8.8,~
#> $ passing_td    <int> 4, 3, 3, 5, 4, 3, 3, 4, 2, 3, 2, 3, 2, 1, 2, 2, 1, 2, 2,~
#> $ passing_int   <int> 1, 0, 1, 0, 2, 0, 1, 0, 1, 0, 0, 3, 0, 0, 0, 1, 0, 0, 0,~
#> $ passing_sacks <int> 2, 2, 3, 0, 0, 1, 1, 3, 3, 1, 1, 1, 3, 2, 2, 6, 1, 5, 3,~
#> $ rushing_att   <int> 5, 5, 3, 6, 0, 7, 4, 5, 4, 5, 4, 1, 0, 6, 3, 0, 5, 1, 4,~
#> $ rushing_yds   <int> 20, 18, 14, 37, 0, 62, 13, 9, 6, -5, 40, -2, 0, 27, 19, ~
#> $ rushing_td    <int> 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0,~
#> $ fl            <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1,~
#> $ g             <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,~
#> $ fpts          <dbl> 34.6, 33.3, 29.9, 29.6, 29.2, 28.8, 28.4, 27.1, 25.0, 24~
#> $ `fpts/g`      <dbl> 34.6, 33.3, 29.9, 29.6, 29.2, 28.8, 28.4, 27.1, 25.0, 24~
#> $ rost          <chr> "97.8%", "99.9%", "13.9%", "30.7%", "96.8%", "97.0%", "9~

Created on 2022-08-28 with reprex v2.0.2
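
To make the request concrete, the crude example could be wrapped along these lines. This is only a sketch: the function names (`fp_stats_url`, `clean_stats_table`, `scrape_fp_stats`) are hypothetical and not part of any package, and it assumes the URL pattern above also holds for other positions, years, and weeks, and that every stats page uses the same two-row header.

```r
# Hypothetical helpers; the URL pattern is copied from the example above and
# is assumed (not verified) to generalise to other positions and weeks.
fp_stats_url <- function(position, year, week) {
  sprintf(
    "https://www.fantasypros.com/nfl/stats/%s.php?year=%d&week=%d&scoring=Standard&roster=consensus&range=week",
    tolower(position), as.integer(year), as.integer(week)
  )
}

# Collapse the two header rows into one set of lower-case column names,
# drop those rows, and convert the remaining columns to their natural types.
clean_stats_table <- function(raw) {
  cols <- paste(as.character(raw[1, ]), as.character(raw[2, ]), sep = "_")
  cols <- tolower(gsub("^_|MISC_", "", cols))
  out <- as.data.frame(raw[-(1:2), , drop = FALSE], stringsAsFactors = FALSE)
  names(out) <- cols
  rownames(out) <- NULL
  utils::type.convert(out, as.is = TRUE)
}

# Network step kept separate so the cleaning logic can be tested offline.
scrape_fp_stats <- function(position, year, week) {
  raw <- rvest::html_table(
    rvest::read_html(fp_stats_url(position, year, week)),
    header = FALSE
  )[[1]]
  clean_stats_table(raw)
}
```

If the page layout is unchanged, `scrape_fp_stats("qb", 2021, 1)` should give the same table as the reprex above; pages with a different header layout would need their own cleaning step.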

@tanho63
Member

tanho63 commented Aug 29, 2022

I haven’t, mostly because I like ffscrapr’s ff_scoringhistory methodology better in most cases. I can put this on the backlog! (No word on when that’ll happen)

@scottfrechette
Author

The one drawback with ffscrapr is not having Yahoo league data because they don’t like sharing. I manually scrape their data and join it to general data like this as needed.

@tanho63
Member

tanho63 commented Aug 29, 2022

Ah! Yahoo. Okay, yeah, that would explain it. I owe a PR review for ffscrapr first but can probably knock this out decently quickly. In the meantime, could you use nflreadr::load_player_stats and adjust the fantasy points column as necessary?
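
As a rough sketch of what "adjust the fantasy points column" could look like: the helper below recomputes points from raw stat columns using a named vector of weights. `score_players` is a hypothetical name, and the column names are assumptions modelled on nflreadr-style output, so check `names()` on the real data first. The toy rows reuse the first two QB lines from the reprex above, and the weights are an assumed standard scoring that happens to reproduce those `fpts` values to one decimal.

```r
# Hypothetical helper: `weights` is a named numeric vector whose names must
# match stat columns in `stats`. Missing values are treated as zero.
score_players <- function(stats, weights) {
  missing_cols <- setdiff(names(weights), names(stats))
  if (length(missing_cols) > 0) {
    stop("Missing stat columns: ", paste(missing_cols, collapse = ", "))
  }
  m <- as.matrix(stats[names(weights)])
  m[is.na(m)] <- 0
  as.numeric(m %*% weights)
}

# Toy data standing in for a real stats table (values from the reprex above;
# column names are assumed, not taken from any package).
toy <- data.frame(
  player        = c("Kyler Murray", "Patrick Mahomes II"),
  passing_yards = c(289, 337),
  passing_tds   = c(4, 3),
  interceptions = c(1, 0),
  rushing_yards = c(20, 18),
  rushing_tds   = c(1, 1)
)

# Assumed scoring weights; swap in your league's settings.
std <- c(passing_yards = 0.04, passing_tds = 4, interceptions = -1,
         rushing_yards = 0.1, rushing_tds = 6)

toy$fpts_custom <- score_players(toy, std)

# Share of points that came from touchdowns, per player.
toy$td_share <- (std[["passing_tds"]] * toy$passing_tds +
                 std[["rushing_tds"]] * toy$rushing_tds) / toy$fpts_custom
```

The same `score_players()` call would apply to a real stats table once the names in `weights` are matched to its columns.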


2 participants