Push nflfastR 2.0.3

nflverse · Jun 15, 2020 · 5e3b7ae
1 parent 2adf0c6
Showing 7 changed files with 48 additions and 27 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -1,7 +1,7 @@
 Type: Package
 Package: nflfastR
 Title: Functions to Efficiently Scrape NFL Play by Play Data
-Version: 2.0.2
+Version: 2.0.3
 Authors@R: 
     c(person(given = "Sebastian",
              family = "Carl",

diff --git a/NEWS.md b/NEWS.md
@@ -1,3 +1,8 @@
+# nflfastR 2.0.3
+
+* Fix for NFL providing plays out of order
+* Fix for series not incrementing following defensive TD
+
 # nflfastR 2.0.2
 
 * Fixed a bug in the series and series success calculations caused by timeouts

diff --git a/R/helper_add_nflscrapr_mutations.R b/R/helper_add_nflscrapr_mutations.R
@@ -11,13 +11,23 @@ add_nflscrapr_mutations <- function(pbp) {
 
   out <-
     pbp %>%
-    dplyr::mutate(index = 1 : dplyr::n()) %>% # to re-sort after removing duplicates
+    dplyr::mutate(index = 1 : dplyr::n()) %>%
     # remove duplicate plays. can't do this with play_id because duplicate plays
     # sometimes have different play_ids
     dplyr::group_by(game_id, quarter, time, play_description) %>%
     dplyr::slice(1) %>%
     dplyr::ungroup() %>%
-    dplyr::arrange(index) %>%
+    dplyr::mutate(
+      # Modify the time column for the quarter end:
+      time = dplyr::if_else(quarter_end == 1, "00:00", time),
+      time = dplyr::if_else(play_description == 'GAME', "15:00", time),
+      # Create a column with the time in seconds remaining for the quarter:
+      quarter_seconds_remaining = lubridate::period_to_seconds(lubridate::ms(time))
+    ) %>%
+    #put plays in the right order
+    dplyr::group_by(game_id) %>%
+    dplyr::arrange(quarter, -quarter_seconds_remaining, index) %>%
+    dplyr::ungroup() %>%
     dplyr::mutate(
       # Fill in the rows with missing posteam with the lag:
       posteam = dplyr::if_else(
@@ -74,10 +84,6 @@ add_nflscrapr_mutations <- function(pbp) {
         yardline_side == posteam | yardline == "MID 50",
         100 - yardline_number, yardline_number
       ),
-      # Modify the time column for the quarter end:
-      time = dplyr::if_else(quarter_end == 1, "00:00", time),
-      # Create a column with the time in seconds remaining for the quarter:
-      quarter_seconds_remaining = lubridate::period_to_seconds(lubridate::ms(time)),
       # Create a column with the time in seconds remaining for each half:
       half_seconds_remaining = dplyr::if_else(
         quarter %in% c(1, 3),
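
The reordering step in this hunk can be illustrated on toy data. This is a minimal sketch, not the package's code: the game IDs, times, and play descriptions below are invented for illustration. One caveat worth noting is that `dplyr::arrange()` ignores grouping unless `.by_group = TRUE` is set, so the sketch lists `game_id` explicitly as the first sort key. Whether that matters for the commit depends on whether `pbp` holds more than one game at this point, which the diff does not show.

```r
library(dplyr)
library(lubridate)

# Hypothetical plays, delivered out of order within game "g1"
pbp <- tibble(
  game_id = c("g1", "g1", "g1", "g2"),
  quarter = c(1, 1, 2, 1),
  time = c("14:20", "15:00", "15:00", "15:00"),
  play_description = c("B", "A", "C", "D")
)

ordered <- pbp %>%
  mutate(
    # original row position, used only to break ties
    index = row_number(),
    # "MM:SS" clock string -> seconds left in the quarter
    quarter_seconds_remaining = period_to_seconds(ms(time))
  ) %>%
  # sort within each game: earlier quarter first, then more time remaining first
  arrange(game_id, quarter, desc(quarter_seconds_remaining), index)

ordered$play_description
```

Because `index` is the last sort key, plays that share a quarter and clock time keep the order in which the feed delivered them.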

diff --git a/R/helper_add_series_data.R b/R/helper_add_series_data.R
@@ -23,8 +23,11 @@ add_series_data <- function(pbp) {
       # AND first down after change of possession (-> drivenumber increases)
       # we don't want a first down being indicated for XP, 2P, KO
       first_down = dplyr::if_else(
-        (first_down_rush == 1 | first_down_pass == 1 |
-           first_down_penalty == 1 |
+        #earn first down
+        (first_down_rush == 1 | first_down_pass == 1 | first_down_penalty == 1 |
+        #defensive TD
+          (touchdown == 1 & td_team != posteam) |
+        #drive changes
            (drive < dplyr::lead(drive) | (drive < dplyr::lead(drive, 2) & is.na(dplyr::lead(drive))))
          ) &
           (extra_point_attempt == 0 & two_point_attempt == 0 & kickoff_attempt == 0),
@@ -49,7 +52,7 @@ add_series_data <- function(pbp) {
       ),
       series_success = dplyr::case_when(
         is.na(series) | qb_kneel == 1 | qb_spike == 1 ~ NA_real_,
-        touchdown == 1 | first_down_rush == 1 | first_down_pass == 1 |
+        (touchdown == 1 & td_team == posteam) | first_down_rush == 1 | first_down_pass == 1 |
           first_down_penalty == 1 ~ 1,
         punt_attempt == 1 | interception == 1 | fumble_lost == 1 |
           fourth_down_failed == 1 | field_goal_attempt == 1 ~ 0,
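
The effect of the `td_team == posteam` condition in this hunk can be sketched with a reduced `case_when()`. The three plays below are hypothetical, and the conditions are trimmed to the columns needed to show the difference between an offensive touchdown and a defensive one. This is a simplified illustration, not the full `series_success` logic.

```r
library(dplyr)

# Hypothetical plays: an offensive TD, a pick-six, and a punt
plays <- tibble(
  posteam = c("GB", "GB", "GB"),
  td_team = c("GB", "CHI", NA),
  touchdown = c(1, 1, 0),
  interception = c(0, 1, 0),
  punt_attempt = c(0, 0, 1)
)

result <- plays %>%
  mutate(
    series_success = case_when(
      # only a touchdown scored by the team in possession counts as success
      touchdown == 1 & td_team == posteam ~ 1,
      # turnovers and punts end the series without success
      punt_attempt == 1 | interception == 1 ~ 0,
      TRUE ~ NA_real_
    )
  )

result$series_success
```

With the old condition (`touchdown == 1` alone), the pick-six in the second row would have been marked as a successful series for the offense.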

diff --git a/README.Rmd b/README.Rmd
@@ -17,9 +17,9 @@ knitr::opts_chunk$set(
 )
 ```
 
-
-
 <!-- badges: start -->
+![GitHub release (latest by date)](https://img.shields.io/github/v/release/mrcaseb/nflfastR?label=latest%20release)
+[![Twitter Follow](https://img.shields.io/twitter/follow/nflfastR.svg?style=social)](https://twitter.com/nflfastR)
 <!-- badges: end -->
 
 `nflfastR` is a set of functions to efficiently scrape NFL play-by-play data. `nflfastR` expands upon the features of nflscrapR:
@@ -56,7 +56,7 @@ library(tidyverse)
 
 The functionality of `nflscrapR` can be duplicated by using `fast_scraper`. This obtains the same information contained in `nflscrapR` (plus some extra) but much more quickly. To compare to `nflscrapR`, we use their data repository as the program no longer functions now that the NFL has taken down the old Gamecenter feed. Note that EP differs from nflscrapR as we use a newer era-adjusted model (more on this below).
 
-This example also uses the built-in function `clean_pbp` to create a "name' column for the primary player involved (the QB on pass play or ball-carrier on run play).
+This example also uses the built-in function `clean_pbp` to create a 'name' column for the primary player involved (the QB on pass play or ball-carrier on run play).
 
 ``` {r ex1-nflscrapR, warning = FALSE, message = FALSE}
 read_csv(url('https://github.com/ryurko/nflscrapR-data/blob/master/play_by_play_data/regular_season/reg_pbp_2019.csv?raw=true')) %>%
@@ -104,7 +104,7 @@ games_2009 %>% filter(!is.na(cpoe)) %>% group_by(passer_player_name) %>%
 When scraping from the default RS feed, drive results are automatically included. Let's look at how much more likely teams were to score starting from 1st & 10 at their own 20 yard line in 2015 (the last year before touchbacks on kickoffs changed to the 25) than in 2000.
 ``` {r ex4, warning = FALSE, message = FALSE}
 games_2000 <- readRDS(url('https://raw.githubusercontent.com/guga31bb/nflfastR-data/master/data/play_by_play_2000.rds'))
-games_2015 <-readRDS(url('https://raw.githubusercontent.com/guga31bb/nflfastR-data/master/data/play_by_play_2015.rds'))
+games_2015 <- readRDS(url('https://raw.githubusercontent.com/guga31bb/nflfastR-data/master/data/play_by_play_2015.rds'))
 
 pbp <- bind_rows(games_2000, games_2015)
 
@@ -159,7 +159,7 @@ The `clean_pbp()` function does a lot of work cleaning up player names and IDs f
 
 ## `nflfastR` models
 
-`nflfastR` uses its own models for Expected Points, Win Probability, and Completion Percentage. To read about the models, [please see here](https://github.com/mrcaseb/nflfastR/blob/master/data-raw/MODEL-README.md). For a more detailed description of Expected Points models, we highly recommend this paper [from the nflscrapR team located here](https://arxiv.org/pdf/1802.00998.pdf). 
+`nflfastR` uses its own models for Expected Points, Win Probability, and Completion Probability. To read about the models, [please see here](https://github.com/mrcaseb/nflfastR/blob/master/data-raw/MODEL-README.md). For a more detailed description of Expected Points models, we highly recommend this paper [from the nflscrapR team located here](https://arxiv.org/pdf/1802.00998.pdf). 
 
 `nflfastR` includes two win probability models: one with and one without incorporating the pre-game spread.
 
@@ -184,10 +184,10 @@ Even though `nflfastR` is very fast, **for historical games we recommend downloa
 ## Special thanks
 
 * To [Nick Shoemaker](https://twitter.com/WeightRoomShoe) for [finding and making available JSON-formatted NFL play-by-play back to 1999](https://github.com/CroppedClamp/nfl_pbps) (`nflfastR` uses this source for 1999-2010)
+* To [Lau Sze Yui](https://twitter.com/903124S) for developing a scraping function to access JSON-formatted NFL play-by-play beginning in 2011.
 * To [Lee Sharpe](https://twitter.com/LeeSharpeNFL) for curating a resource for game information
 * To [Timo Riske](https://twitter.com/PFF_Moo), [Lau Sze Yui](https://twitter.com/903124S), [Sean Clement](https://twitter.com/SeanfromSeabeck), and [Daniel Houston](https://twitter.com/CowboysStats) for many helpful discussions regarding the development of the new `nflfastR` models
-* To [Zach Feldman](https://twitter.com/ZachFeldman3) and [Josh Hermsmeyer](https://twitter.com/friscojosh) for many helpful discussions about CPOE models
+* To [Zach Feldman](https://twitter.com/ZachFeldman3) and [Josh Hermsmeyer](https://twitter.com/friscojosh) for many helpful discussions about CPOE models as well as [Peter Owen](https://twitter.com/JSmoovesBrekkie) for [many helpful suggestions for the CP model](https://twitter.com/JSmoovesBrekkie/status/1268885950626623490)
 * To [Florian Schmitt](https://twitter.com/Flosch1006) for the logo design
-* To [Peter Owen](https://twitter.com/JSmoovesBrekkie) for [many helpful suggestions for the CP model](https://twitter.com/JSmoovesBrekkie/status/1268885950626623490)
 * The many users who found and reported bugs in `nflfastR` 1.0
 * And of course, the original [`nflscrapR`](https://github.com/maksimhorowitz/nflscrapR) team, Maksim Horowitz, Ronald Yurko, and Samuel Ventura, whose work represented a dramatic step forward for the state of public NFL research
diff --git a/README.md b/README.md
@@ -1,6 +1,13 @@
 nflfastR <img src='man/figures/logo.png' align="right" width="25%" />
 ================
 
+<!-- badges: start --> 
+
+![GitHub release (latest by
+date)](https://img.shields.io/github/v/release/mrcaseb/nflfastR?label=latest%20release)
+[![Twitter Follow](https://img.shields.io/twitter/follow/nflfastR.svg?style=social)](https://twitter.com/nflfastR)
+<!-- badges: end -->
+
   - [Installation](#installation)
   - [Usage](#usage)
       - [Example 1: replicate `nflscrapR` with
@@ -24,9 +31,6 @@ nflfastR <img src='man/figures/logo.png' align="right" width="25%" />
 
 <!-- README.md is generated from README.Rmd. Please edit that file -->
 
-<!-- badges: start -->
-
-<!-- badges: end -->
 
 `nflfastR` is a set of functions to efficiently scrape NFL play-by-play
 data. `nflfastR` expands upon the features of nflscrapR:
@@ -78,7 +82,7 @@ that EP differs from nflscrapR as we use a newer era-adjusted model
 (more on this below).
 
 This example also uses the built-in function `clean_pbp` to create a
-"name’ column for the primary player involved (the QB on pass play or
+‘name’ column for the primary player involved (the QB on pass play or
 ball-carrier on run play).
 
 ``` r
@@ -171,7 +175,7 @@ before touchbacks on kickoffs changed to the 25) than in 2000.
 
 ``` r
 games_2000 <- readRDS(url('https://raw.githubusercontent.com/guga31bb/nflfastR-data/master/data/play_by_play_2000.rds'))
-games_2015 <-readRDS(url('https://raw.githubusercontent.com/guga31bb/nflfastR-data/master/data/play_by_play_2015.rds'))
+games_2015 <- readRDS(url('https://raw.githubusercontent.com/guga31bb/nflfastR-data/master/data/play_by_play_2015.rds'))
 
 pbp <- bind_rows(games_2000, games_2015)
 
@@ -252,7 +256,7 @@ as the NFL changed their system for IDs in the underlying data.
 ## `nflfastR` models
 
 `nflfastR` uses its own models for Expected Points, Win Probability, and
-Completion Percentage. To read about the models, [please see
+Completion Probability. To read about the models, [please see
 here](https://github.com/mrcaseb/nflfastR/blob/master/data-raw/MODEL-README.md).
 For a more detailed description of Expected Points models, we highly
 recommend this paper [from the nflscrapR team located
@@ -306,6 +310,9 @@ Baldwin](https://twitter.com/benbbaldwin).
     and making available JSON-formatted NFL play-by-play back
     to 1999](https://github.com/CroppedClamp/nfl_pbps) (`nflfastR` uses
     this source for 1999-2010)
+  - To [Lau Sze Yui](https://twitter.com/903124S) for developing a
+    scraping function to access JSON-formatted NFL play-by-play
+    beginning in 2011.
   - To [Lee Sharpe](https://twitter.com/LeeSharpeNFL) for curating a
     resource for game information
   - To [Timo Riske](https://twitter.com/PFF_Moo), [Lau Sze
@@ -315,12 +322,12 @@ Baldwin](https://twitter.com/benbbaldwin).
     discussions regarding the development of the new `nflfastR` models
   - To [Zach Feldman](https://twitter.com/ZachFeldman3) and [Josh
     Hermsmeyer](https://twitter.com/friscojosh) for many helpful
-    discussions about CPOE models
+    discussions about CPOE models as well as [Peter
+    Owen](https://twitter.com/JSmoovesBrekkie) for [many helpful
+    suggestions for the CP
+    model](https://twitter.com/JSmoovesBrekkie/status/1268885950626623490)
   - To [Florian Schmitt](https://twitter.com/Flosch1006) for the logo
     design
-  - To [Peter Owen](https://twitter.com/JSmoovesBrekkie) for [many
-    helpful suggestions for the CP
-    model](https://twitter.com/JSmoovesBrekkie/status/1268885950626623490)
   - The many users who found and reported bugs in `nflfastR` 1.0
   - And of course, the original
     [`nflscrapR`](https://github.com/maksimhorowitz/nflscrapR) team,

diff --git a/man/figures/README-ex5-1.png b/man/figures/README-ex5-1.png