diff --git a/submissions/Why Advance Rate is Broken (But Don't Worry, We Fixed it).ipynb b/submissions/Why Advance Rate is Broken (But Don't Worry, We Fixed it).ipynb new file mode 100644 index 0000000..7b0e827 --- /dev/null +++ b/submissions/Why Advance Rate is Broken (But Don't Worry, We Fixed it).ipynb @@ -0,0 +1,1824 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "daeb3b9d", + "metadata": {}, + "source": [ + "# Why Advance Rate is Broken (But Don't Worry, We Fixed it)\n", + "\n", + "## Participants:\n", + "## Sackreligious\n", + "## Hackr6849\n", + "\n", + "https://spikeweek.com/\n", + "\n", + "Advance rate is a very commonly cited metric used to measure how impactful a player was in helping (or hurting) teams advance to the playoffs. The current method of calculating advance rate is simply the number of advancing teams that contain a specific player divided by the total number of times that player was drafted. There is a substantial amount of noise that muddies the signal contained in advance rate the way it is currently calculated. Some of the biases that reduce the value of advance rate in its current form include but are not limited to:\n", + "\n", + "*Combinations of players can occur at different rates due to things like stacking and ADPs that line up with certain draft slots\n", + "\n", + "*Roster constructions can have uneven distributions of certain players; 0 RB teams may be more likely to have certain players than robust RB teams etc.\n", + "\n", + "*Drafter skill matters. Above average drafters may be more likely to select certain players they have evaluated as \"good picks\" and less likely to select players they have evaluated as \"bad picks\". While they may be incorrect about their individual evaluation of a player with respect to the results of the specific season, they may have an overall advantage over the field leading to inflated advance rates of players they select more frequently. 
The inverse applies to low-skill drafters.\n", + "\n", + "The goals of this project are to provide less noisy, more useful metrics for measuring past player performance, and perhaps to better identify the player profiles we should be targeting in future best ball drafts. The four metrics we have developed are:\n", + "\n", + "*Roster Agnostic Advance Rate (RAAR), a new, more accurate method of determining the impact an individual player had on the ability of a team to advance to the playoffs. By swapping a specific player on to each team and recalculating the advance rate one team at a time by simulating the pod with 2022 results, we are able to remove many of the biases that currently plague advance rate. We choose the player we want to calculate RAAR for and iterate through each draft_id, adding the player to one team at a time in each 12-team draft and removing the player on the target roster that was selected with the pick closest to the ADP of the player we are swapping on to the team, provided those two players play the same position. We then recalculate the weekly scores for this team and test whether the team would have advanced with the player that we are calculating RAAR for. We repeat this process for each team in the 12-team draft (no swap is required for the team that already has the target player on its roster). 
We then repeat this process for each draft_id in BBM3.\n", + "\n", + "*Average Player Points Added (APPA), a new metric that follows a similar player swapping methodology to compare players that were being selected in a similar ADP range to help measure which picks were actually best at a given point in drafts.\n", + "\n", + "*Player Points Contributed to Advancing Teams (PPCAT), a new metric that follows a similar player swapping methodology to compare the percentage of total roster points that a specific player contributed to advancing teams.\n", + "\n", + "*Player Points Contributed to Teams (PPCT), the same methodology as PPCAT, but for all teams instead of specifically advancing teams.\n", + "\n", + "First we create a table of weekly fantasy scores for each player for 2022." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d6022f9b", + "metadata": {}, + "outputs": [], + "source": [ + "import sqlite3\n", + "import pandas as pd\n", + "import nfl_data_py as nfl\n", + "\n", + "# Scoring rules\n", + "def calculate_fantasy_points(row):\n", + " points = 0.0\n", + " points += row['receptions'] * 0.5\n", + " points += row['receiving_tds'] * 6.0\n", + " points += row['receiving_yards'] * 0.1\n", + " points += row['rushing_tds'] * 6.0\n", + " points += row['rushing_yards'] * 0.1\n", + " points += row['passing_yards'] * 0.04\n", + " points += row['passing_tds'] * 4.0\n", + " points += row['interceptions'] * -1.0\n", + " points += row['passing_2pt_conversions'] * 2.0\n", + " points += row['rushing_2pt_conversions'] * 2.0\n", + " points += row['receiving_2pt_conversions'] * 2.0\n", + " points += row['sack_fumbles_lost'] * -2.0\n", + " points += row['rushing_fumbles_lost'] * -2.0\n", + " points += row['receiving_fumbles_lost'] * -2.0\n", + " return points\n", + "\n", + "# Specify the years and columns you are interested in\n", + "years = [2022]\n", + "columns = ['player_id', 'player_name', 'player_display_name', 'position', 'season', 'week', 
'passing_yards', 'passing_tds', 'interceptions', 'sack_fumbles_lost', 'passing_2pt_conversions', 'rushing_yards', 'rushing_tds', 'rushing_fumbles_lost', 'rushing_2pt_conversions', 'receptions', 'receiving_yards', 'receiving_tds', 'receiving_fumbles_lost', 'receiving_2pt_conversions', 'special_teams_tds']\n", + "\n", + "# Fetch the weekly data\n", + "weekly_data = nfl.import_weekly_data(years, columns)\n", + "\n", + "# Calculate the fantasy points for each week and store in a new column\n", + "weekly_data['fantasy_points'] = weekly_data.apply(calculate_fantasy_points, axis=1)\n", + "\n", + "# Transform the data to the required format\n", + "weekly_scores = weekly_data.pivot_table(index=['player_display_name', 'position'], columns='week', values='fantasy_points', fill_value=0)\n", + "\n", + "# Convert the pivot table to a DataFrame and reset the index\n", + "weekly_scores_df = pd.DataFrame(weekly_scores.to_records())\n", + "\n", + "# Connect to the SQLite database\n", + "conn = sqlite3.connect('bestball.db')\n", + "\n", + "# Write the DataFrame to the SQLite table that the later cells read (UD_FPTS_2022)\n", + "weekly_scores_df.to_sql('UD_FPTS_2022', conn, if_exists='replace', index=False)\n", + "\n", + "# Close the database connection\n", + "conn.close()" + ] + }, + { + "cell_type": "markdown", + "id": "4f91a25d", + "metadata": {}, + "source": [ + "We dump the ADP data and player IDs to JSON files." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2d6a4f38", + "metadata": {}, + "outputs": [], + "source": [ + "import requests\n", + "from datetime import datetime, timedelta\n", + "import json\n", + "\n", + "\n", + "f = open('rawAdpData.datajson', 'r')\n", + "adpData = json.loads(f.read())\n", + "f.close()\n", + "\n", + "playerAdps = {}\n", + "playerIds = {}\n", + "vals = {}\n", + "\n", + "for date in adpData:\n", + "\n", + "\tdata = adpData[date]\n", + "\tfor val in data:\n", + "\t\tadp = val['adp']\n", + "\t\tdateVal = val['date']\n", + "\t\tplayerName = 
val['playerpositiondraftgroup']['player']['playerName']\n", + "\t\tplayerId = val['playerpositiondraftgroup']['playerDraftGroupId']\n", + "\t\tif playerName not in playerIds:\n", + "\t\t\tplayerIds[playerName] = []\n", + "\t\tif playerId not in playerIds[playerName]:\n", + "\t\t\tplayerIds[playerName].append(playerId)\n", + "\t\tif dateVal not in playerAdps:\n", + "\t\t\tplayerAdps[dateVal] = {}\n", + "\t\tplayerAdps[dateVal][playerId] = adp\n", + "\n", + "\n", + "f = open('playerAdpData.datajson', 'w')\n", + "f.write(json.dumps(playerAdps))\n", + "f.close()\n", + "\n", + "f = open('playerIds.datajson', 'w')\n", + "f.write(json.dumps(playerIds))\n", + "f.close()" + ] + }, + { + "cell_type": "markdown", + "id": "2c9d85ef", + "metadata": {}, + "source": [ + "Do precalculations and save to JSON files" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "df040853", + "metadata": {}, + "outputs": [], + "source": [ + "import pandas as pd\n", + "import logging\n", + "import sqlite3\n", + "import json\n", + "from tqdm import tqdm\n", + "\n", + "\n", + "def calculate_weekly_score(team, week_number):\n", + " team_scores = df_scores[df_scores['normalized'].isin(team['normalized'])]\n", + " unmatched = set(team['normalized']) - set(team_scores['normalized'])\n", + " for player in unmatched:\n", + " logging.error(\n", + " f\"Week {week_number}, Team {team['draft_entry_id'].values[0]}: Player {player} not found in 'UD_FPTS_2022' table\")\n", + "\n", + " positions = ['QB', 'RB', 'WR', 'TE']\n", + " score = 0.0\n", + " starting_lineup = {}\n", + " playersUsed = []\n", + "\n", + " for pos in positions:\n", + " num_required = 2 if pos == 'RB' else 3 if pos == 'WR' else 1\n", + " players_pos = team_scores[team_scores['position'] == pos]\n", + "\n", + " # Sort by scores for the given week, and avoid repeating players\n", + " sorted_players = players_pos.sort_values(by=f'{week_number}', ascending=False)\n", + " top_players = 
sorted_players.drop_duplicates(subset=['normalized']).head(num_required)\n", + "\n", + " score += top_players[f'{week_number}'].sum()\n", + " #starting_lineup += [{'n' : top_players['player_display_name'], 's' : top_players[f'{week_number}']}]\n", + " playerNames = list(top_players['player_display_name'])\n", + " playerScores = list(top_players[f'{week_number}'])\n", + " starting_lineup[pos] = {}\n", + " for i in range(len(playerNames)):\n", + " starting_lineup[pos][playerNames[i]] = playerScores[i]\n", + " playersUsed.append(playerNames[i])\n", + "\n", + " # Flex position\n", + " remaining_players = team_scores[(~team_scores['normalized'].isin(\n", + " [df_scores[df_scores['player_display_name'] == player]['normalized'].values[0] for player in\n", + " playersUsed])) & (team_scores['position'] != 'QB')]\n", + "\n", + " # Provide a default value for max_score_player\n", + " max_score_player = None\n", + "\n", + " if not remaining_players.empty:\n", + " max_score_player = remaining_players.loc[remaining_players[f'{week_number}'].idxmax()]\n", + "\n", + " if max_score_player is not None:\n", + " score += max_score_player[f'{week_number}']\n", + " playerName = max_score_player['player_display_name']\n", + " playerScore = max_score_player[f'{week_number}']\n", + " playersUsed.append(playerName)\n", + " position = 'FLEX'\n", + " if position not in starting_lineup:\n", + " starting_lineup[position] = {}\n", + " starting_lineup[position][playerName] = playerScore\n", + "\n", + " #top bench players\n", + " benchPlayers = {}\n", + " remaining_players = team_scores[(~team_scores['normalized'].isin(\n", + " [df_scores[df_scores['player_display_name'] == player]['normalized'].values[0] for player in\n", + " playersUsed]))]\n", + " for pos in positions:\n", + " num_required = 1\n", + " players_pos = remaining_players[remaining_players['position'] == pos]\n", + "\n", + " # Sort by scores for the given week, and avoid repeating players\n", + " sorted_players = 
players_pos.sort_values(by=f'{week_number}', ascending=False)\n", + " top_players = sorted_players.drop_duplicates(subset=['normalized']).head(num_required)\n", + "\n", + " playerNames = list(top_players['player_display_name'])\n", + " playerScores = list(top_players[f'{week_number}'])\n", + " positions = list(top_players['position'])\n", + " for i in range(len(playerNames)):\n", + " if positions[i] not in benchPlayers:\n", + " benchPlayers[positions[i]] = {}\n", + " benchPlayers[positions[i]][playerNames[i]] = playerScores[i]\n", + "\n", + " return score, starting_lineup, benchPlayers\n", + "\n", + "\n", + "def calculate_team_score(team_id):\n", + " global playerAdps\n", + " team = df_teams[df_teams['draft_entry_id'] == team_id]\n", + " print(f'Team ID: {team_id}')\n", + " print(f\"Player Names: {team['player_name'].values}\\n\")\n", + " players = list(team['player_name'])\n", + " pickRounds = list(team['team_pick_number'])\n", + " positions = list(team['position_name'])\n", + " adps = list(team['projection_adp'])\n", + " pickNums = list(team['overall_pick_number'])\n", + " draftDate = list(team['draft_time'])[0].split(' ')[0]\n", + " lineup = {}\n", + " for i in range(len(players)):\n", + " position = positions[i]\n", + " if position not in lineup:\n", + " lineup[position] = []\n", + " lineup[position].append({ 'name' : players[i], 'pos' : positions[i], 'pick' : pickNums[i] })\n", + " playerAdps[players[i]] = { 'p' : positions[i], 'a' : adps[i] }\n", + "\n", + " # For each week, calculate the weekly score\n", + " total_score = 0.0\n", + " team_output = {}\n", + "\n", + " for week in range(1, 15):\n", + " weekly_score, starting_lineup, benchLineup = calculate_weekly_score(team, week)\n", + " total_score += weekly_score\n", + " team_output[week] = { \"l\": starting_lineup, \"s\": weekly_score, \"b\" : benchLineup}\n", + "\n", + " return team_id, total_score, team_output, lineup, draftDate\n", + "\n", + "\n", + "def batch_draft_ids(df_teams, batch_size):\n", + " 
unique_draft_ids = df_teams['draft_id'].unique()\n", + " for i in range(0, len(unique_draft_ids), batch_size):\n", + " yield unique_draft_ids[i:i + batch_size]\n", + "\n", + "# create logger\n", + "logging.basicConfig(filename='error_logs.txt', level=logging.ERROR)\n", + "\n", + "# load your data\n", + "df_scores = pd.read_sql('SELECT * FROM UD_FPTS_2022', con=sqlite3.connect('bestball.db'))\n", + "con = sqlite3.connect('bestball.db')\n", + "cur = con.cursor()\n", + "res = cur.execute('SELECT distinct draft_id FROM BBMIII WHERE tournament_round_number = 1')\n", + "draftIds = res.fetchall()\n", + "\n", + "completedDrafts = []\n", + "try:\n", + " f = open('completedDrafts.txt', 'r')\n", + " lines = f.read().split('\\n')\n", + " for line in lines:\n", + " completedDrafts.append(line)\n", + " f.close()\n", + "except:\n", + " pass\n", + "\n", + "playerAdps = {}\n", + "try:\n", + " f = open('playerAdps.json', 'r')\n", + " playerAdps = json.loads(f.read())\n", + " f.close()\n", + "except:\n", + " pass\n", + "\n", + "draftNum = 0\n", + "totalDrafts = len(draftIds)\n", + "dataFiles = ['precalc_data_new_1.json','precalc_data_new_2.json','precalc_data_new_3.json','precalc_data_new_4.json','precalc_data_new_5.json','precalc_data_new_6.json','precalc_data_new_7.json','precalc_data_new_8.json','precalc_data_new_9.json','precalc_data_new_10.json']\n", + "for draftId in draftIds:\n", + " draftId = draftId[0]\n", + " if draftId in completedDrafts:\n", + " draftNum += 1\n", + " continue\n", + " draftNum += 1\n", + " print('---------------------- PROCESSING DRAFT NUMBER %d OF %d------------------------' % (draftNum,totalDrafts))\n", + " df_teams = pd.read_sql('SELECT * FROM BBMIII WHERE tournament_round_number = 1 and draft_id = \"%s\"' % (draftId,), con=sqlite3.connect('bestball.db'))\n", + " # Normalize names by removing dots and spaces, and converting to lower case\n", + " # (regex=True is required: pandas 2.x treats the pattern as a literal string by default)\n", + " df_scores['normalized'] = df_scores['player_display_name'].str.replace('[. ]', '', regex=True).str.lower()\n", + " df_teams['normalized'] = df_teams['player_name'].str.replace('[. ]', '', regex=True).str.lower()\n", + "\n", + " batch_size = 100\n", + " final_results = {}\n", + "\n", + " # Show a progress bar over the batches\n", + " for batch in tqdm(batch_draft_ids(df_teams, batch_size), desc=\"Processing batches\"):\n", + " batch_teams = df_teams[df_teams['draft_id'].isin(batch)]['draft_entry_id'].unique()\n", + " team_scores = {}\n", + "\n", + " # Show a progress bar over the teams in this batch\n", + " for team_id in tqdm(batch_teams, desc=\"Processing teams\", leave=False):\n", + " team_id, team_score, team_output, lineup, draftDate = calculate_team_score(team_id)\n", + " draft_id = df_teams[df_teams['draft_entry_id'] == team_id]['draft_id'].values[0]\n", + " if draft_id not in team_scores:\n", + " team_scores[draft_id] = [(team_id, team_score, team_output, lineup, draftDate)]\n", + " else:\n", + " team_scores[draft_id].append((team_id, team_score, team_output, lineup, draftDate))\n", + "\n", + " for draft_id, scores in team_scores.items():\n", + " sorted_scores = sorted(scores, key=lambda x: x[1], reverse=True)\n", + " final_results[draft_id] = {}\n", + " for i, (team_id, score, team_output, lineup, draftDate) in enumerate(sorted_scores, 1):\n", + " final_results[draft_id][team_id] = {\"rank\": i, \"total_score\": score, \"team_output\": team_output, \"lineup\" : lineup, \"date\" : draftDate}\n", + " if i == 1:\n", + " final_results['first'] = score\n", + " elif i == 2:\n", + " final_results['second'] = score\n", + " elif i == 3:\n", + " final_results['third'] = score\n", + " f = open('precalc_data_new_%d.json' % ((draftNum % 10) + 1), 'a')\n", + " f.write('%s\\n' % (json.dumps(final_results),))\n", + " f.close()\n", + "\n", + " f = open('completedDrafts.txt', 'a')\n", + " f.write('%s\\n' % (draftId,))\n", + " f.close()\n", + "\n", + " f = open('playerAdps.json', 'w')\n", + " f.write(json.dumps(playerAdps))\n", + " f.close()\n", + "\n", + " #break\n", + "\n", + "\n", + "\n", + 
"'''\n", + "# Save precalculation data and thresholds to JSON\n", + "with open(r'precalc_data.json', 'w') as outfile:\n", + " json.dump(final_results, outfile\n", + "'''" + ] + }, + { + "cell_type": "raw", + "id": "e90e7bb7", + "metadata": {}, + "source": [ + "Sample of precalc data:\n", + "{'0008be15-d36c-407e-8053-ba5e19cc8599': {'b8052297-36fd-45ef-8e2d-6adf18de4ed7': {'rank': 1, 'total_score': 1733.3000000000002, 'team_output': {'1': {'l': {'QB': {'Joe Burrow': 22.22}, 'RB': {'Kareem Hunt': 21.0, 'Clyde Edwards-Helaire': 20.9}, 'WR': {'Davante Adams': 25.1, 'Jaylen Waddle': 15.700000000000001, 'Tyler Lockett': 4.300000000000001}, 'TE': {'Darren Waller': 9.9}, 'FLEX': {'Aaron Jones': 9.100000000000001}}, 's': 128.22000000000003, 'b': {'QB': {'Ryan Tannehill': 19.34}, 'RB': {'Josh Jacobs': 7.800000000000001}, 'WR': {'Nico Collins': 3.6}, 'TE': {'Logan Thomas': 6.0}}}, '2': {'l': {'QB': {'Joe Burrow': 16.560000000000002}, 'RB': {'Aaron Jones': 30.5, 'Clyde Edwards-Helaire': 13.8}, 'WR': {'Jaylen Waddle': 34.6, 'Tyler Lockett': 15.200000000000001, 'Allen Robinson': 13.3}, 'TE': {'Darren Waller': 14.0}, 'FLEX': {'Logan Thomas': 11.2}}, 's': 149.16, 'b': {'QB': {'Ryan Tannehill': 2.88}, 'RB': {'Raheem Mostert': 9.400000000000002}, 'WR': {'Sammy Watkins': 10.8}, 'TE': {'Austin Hooper': 2.4000000000000004}}}, '3': {'l': {'QB': {'Joe Burrow': 23.0}, 'RB': {'Clyde Edwards-Helaire': 12.4, 'Josh Jacobs': 12.2}, 'WR': {'Jaylen Waddle': 13.100000000000001, 'Tyler Lockett': 12.100000000000001, 'Davante Adams': 11.7}, 'TE': {'Darren Waller': 3.7}, 'FLEX': {'Kareem Hunt': 7.6000000000000005}}, 's': 95.8, 'b': {'QB': {'Ryan Tannehill': 19.76}, 'RB': {'Aaron Jones': 4.2}, 'WR': {'Nico Collins': 5.1000000000000005}, 'TE': {'Austin Hooper': 2.9000000000000004}}}, '4': {'l': {'QB': {'Joe Burrow': 20.08}, 'RB': {'Josh Jacobs': 32.0, 'Clyde Edwards-Helaire': 21.9}, 'WR': {'Davante Adams': 15.000000000000002, 'Tyler Lockett': 12.1, 'Nico Collins': 9.700000000000001}, 'TE': {'Logan 
Thomas': 4.4}, 'FLEX': {'Aaron Jones': 13.0}}, 's': 128.18, 'b': {'QB': {'Ryan Tannehill': 14.38}, 'RB': {'Raheem Mostert': 9.100000000000001}, 'WR': {'Michael Gallup': 9.4}, 'TE': {'Darren Waller': 3.9000000000000004}}}, '5': {'l': {'QB': {'Joe Burrow': 18.28}, 'RB': {'Josh Jacobs': 27.8, 'Raheem Mostert': 18.700000000000003}, 'WR': {'Davante Adams': 25.9, 'Tyler Lockett': 24.9, 'Nico Collins': 8.5}, 'TE': {'Austin Hooper': 1.2000000000000002}, 'FLEX': {'Kareem Hunt': 13.2}}, 's': 138.48, 'b': {'QB': {'Ryan Tannehill': 11.54}, 'RB': {'Aaron Jones': 9.0}, 'WR': {'Michael Gallup': 6.4}, 'TE': {'Darren Waller': 0.0}}}\n", + "..." + ] + }, + { + "cell_type": "markdown", + "id": "2369108c", + "metadata": {}, + "source": [ + "Swap every player onto every roster, one roster at a time, in place of a player of the same position taken near their ADP, then simulate the pod to calculate Roster Agnostic Advance Rate (RAAR). Also calculate the average points a player added over replacement players drafted within a range of their ADP (APPA), as well as PPCAT and PPCT." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d2c4ce7a", + "metadata": {}, + "outputs": [], + "source": [ + "import json\n", + "import pandas as pd\n", + "import sqlite3\n", + "import copy\n", + "\n", + "\n", + "def get_scores_for_player(player):\n", + " nameExceptions = {\n", + " 'DJ Moore': 'D.J. Moore',\n", + " 'AJ Dillon': 'A.J. Dillon',\n", + " 'DJ Chark': 'D.J. Chark',\n", + " 'KJ Hamler': 'K.J. 
Hamler'\n", + " }\n", + " # Fetch (and cache) the weekly scores from the UD_FPTS_2022 dataframe for a given player\n", + " if player not in player_scores:\n", + " playerName = player\n", + " if player in nameExceptions:\n", + " playerName = nameExceptions[player]\n", + " scores_df = UD_FPTS_2022_df[UD_FPTS_2022_df['player_display_name'] == playerName]\n", + " player_scores[player] = {}\n", + " if len(scores_df.index) == 0:\n", + " return\n", + " for i in range(1, 15):\n", + " player_scores[player][i] = list(scores_df[str(i)])[0]\n", + " return player_scores[player]\n", + "\n", + "\n", + "def swap_player(team_data, player_to_swap, test_player, test_player_scores_dict, position):\n", + " # Make a deep copy of the team data so we don't modify the original data\n", + " new_team_data = copy.deepcopy(team_data)\n", + "\n", + " # Reset the total score, then rebuild it week by week\n", + " new_team_data['total_score'] = 0\n", + " for week_num in new_team_data['team_output']:\n", + " week_data = new_team_data['team_output'][week_num]\n", + " # First remove the swapped-out player from the starting lineup (a no-op if they did not start this week)\n", + "\n", + " # if player_to_swap in week_data['l'][position]:\n", + " # week_data['l'][player_to_swap] = test_player_scores_dict.get(week_data['week_number'], 0)\n", + " remove_player_from_lineup(week_data, player_to_swap, position)\n", + "\n", + " if not test_player_scores_dict or int(week_num) not in test_player_scores_dict:\n", + " new_team_data['total_score'] += week_data['s']\n", + " continue\n", + " add_player_to_lineup(week_data, test_player, test_player_scores_dict[int(week_num)], position)\n", + "\n", + " # If the player to swap is on the bench, replace his score with the test player's score\n", + " # elif player_to_swap in week_data['bench']:\n", + " # week_data['bench'][player_to_swap] = test_player_scores_dict.get(week_data['week_number'], 0)\n", + " new_team_data['total_score'] += week_data['s']\n", + " return new_team_data\n", + "\n", + "\n", + "'''\n", + "Remove a 
player from the lineup. If they are not in the lineup, nothing needs to happen\n", + "'''\n", + "\n", + "\n", + "def remove_player_from_lineup(lineup_data, player_to_remove, position):\n", + " if player_to_remove in lineup_data['l'][position]:\n", + " lineup_data['s'] = lineup_data['s'] - lineup_data['l'][position][player_to_remove]\n", + " lineup_data['l'][position].pop(player_to_remove, None)\n", + " if position in lineup_data['b']:\n", + " for player_to_add in lineup_data['b'][position]:\n", + " lineup_data['s'] += lineup_data['b'][position][player_to_add]\n", + " lineup_data['l'][position][player_to_add] = lineup_data['b'][position][player_to_add]\n", + " break\n", + " else:\n", + " player_to_add = None\n", + " high_score = 0\n", + " for pos in lineup_data['b']:\n", + " if pos == 'QB':\n", + " continue\n", + " for player_name in lineup_data['b'][pos]:\n", + " score = lineup_data['b'][pos][player_name]\n", + " if score > high_score:\n", + " player_to_add = player_name\n", + " high_score = score\n", + " break\n", + " if player_to_add:\n", + " lineup_data['s'] += high_score\n", + " lineup_data['l'][position][player_to_add] = high_score\n", + " elif player_to_remove in lineup_data['l']['FLEX']:\n", + " lineup_data['s'] = lineup_data['s'] - lineup_data['l']['FLEX'][player_to_remove]\n", + " lineup_data['l']['FLEX'].pop(player_to_remove, None)\n", + " player_to_add = None\n", + " high_score = 0\n", + " for position in lineup_data['b']:\n", + " if position == 'QB':\n", + " continue\n", + " for player_name in lineup_data['b'][position]:\n", + " score = lineup_data['b'][position][player_name]\n", + " if score > high_score:\n", + " player_to_add = player_name\n", + " high_score = score\n", + " break\n", + " if player_to_add:\n", + " lineup_data['s'] += high_score\n", + " lineup_data['l']['FLEX'][player_to_add] = high_score\n", + "\n", + "\n", + "def add_player_to_lineup(lineup_data, player_to_add, score, position):\n", + " if position == 'QB':\n", + " qbScore = 0\n", + " 
for playerName in lineup_data['l']['QB']:\n", + " qbScore = lineup_data['l']['QB'][playerName]\n", + " break\n", + " if qbScore < score:\n", + " lineup_data['s'] += score - qbScore\n", + " else:\n", + " # find the lowest score between the flex and that player's position to see who is replaced\n", + " lowPositionScore = 0\n", + " for playerName in lineup_data['l']['FLEX']:\n", + " lowPositionScore = lineup_data['l']['FLEX'][playerName]\n", + " break\n", + "\n", + " for playerName in lineup_data['l'][position]:\n", + " if lineup_data['l'][position][playerName] < lowPositionScore:\n", + " lowPositionScore = lineup_data['l'][position][playerName]\n", + "\n", + " if lowPositionScore < score:\n", + " lineup_data['s'] += score - lowPositionScore\n", + "\n", + "\n", + "def get_player_to_swap(lineup, test_player, test_player_adp, test_player_position):\n", + " same_team_same_position_players = []\n", + " for player in lineup[test_player_position]:\n", + " same_team_same_position_players.append(\n", + " {'name': player['name'], 'adp_difference': abs(player['pick'] - test_player_adp)})\n", + "\n", + " # If no such players exist, return None\n", + " if len(same_team_same_position_players) == 0:\n", + " return None\n", + " same_team_same_position_players.sort(key=lambda x: x['adp_difference'])\n", + " player_to_swap = same_team_same_position_players[0]['name']\n", + "\n", + " return player_to_swap\n", + "\n", + "def getPlayersToSwap(playerAdpVals, position, playerId, pickNum, playerNamesById, playerAdps):\n", + "\n", + " numPlayers = getNumPlayersToSwap(pickNum)\n", + "\n", + " offset = 0\n", + " for id in playerAdpVals:\n", + " if playerAdpVals[id] > pickNum:\n", + " break\n", + " offset += 1\n", + "\n", + " playersToSwap = []\n", + "\n", + " ids = list(playerAdpVals.keys())\n", + " startVal = max(0, offset-numPlayers)\n", + " endVal = min(len(ids)-1, offset + numPlayers)\n", + "\n", + " for i in range(startVal, endVal+1):\n", + " if startVal < 0 or endVal >= len(ids):\n", + " 
continue\n", + " id = ids[i]\n", + " if playerId == id:\n", + " continue\n", + " playerName = playerNamesById[id]\n", + " if playerName not in playerAdps:\n", + " continue\n", + " playerVals = playerAdps[playerName]\n", + " if playerVals['p'] != position:\n", + " continue\n", + " playersToSwap.append(playerName)\n", + "\n", + " return playersToSwap\n", + "\n", + "\n", + "def getNumPlayersToSwap(pickNum):\n", + "\n", + " # Draft round of this pick (12 picks per round); floor division, since round() puts picks 8-12 in round 2\n", + " roundNum = int((pickNum-1) // 12) + 1\n", + " if roundNum <= 2:\n", + " return 3\n", + " else:\n", + " return 3 + (roundNum - 2)\n", + "\n", + "\n", + "def calculate_rank(new_team_data, precalc_data):\n", + " # The new team's draft ID\n", + " new_team_draft_id = new_team_data['draft_id']\n", + "\n", + " # Extract all teams in the same draft from precalc_data\n", + " same_draft_teams = [team for team in precalc_data if team['draft_id'] == new_team_draft_id]\n", + "\n", + " # Calculate total score for new team\n", + " new_team_total_score = sum(week_data['score'] for week_data in new_team_data['team_output'])\n", + "\n", + " # Add new team to the list of same draft teams\n", + " same_draft_teams.append({'team_id': new_team_data['team_id'], 'total_score': new_team_total_score})\n", + "\n", + " # Sort teams by total score in descending order\n", + " sorted_teams = sorted(same_draft_teams, key=lambda x: x['total_score'], reverse=True)\n", + "\n", + " # Find rank of new team\n", + " new_team_rank = next(\n", + " i + 1 for i, team in enumerate(sorted_teams) if team['team_id'] == new_team_data['team_id'])\n", + "\n", + " return new_team_rank\n", + "\n", + "\n", + "if __name__ == '__main__':\n", + "\n", + " con = sqlite3.connect('bestball.db')\n", + "\n", + " playerAdps = {}\n", + " f = open('playerAdps.json', 'r')\n", + " playerAdps = json.loads(f.read())\n", + " f.close()\n", + "\n", + " playerAdps = dict(sorted(playerAdps.items(), key=lambda x: x[1]['a']))\n", + "\n", + " playerIds = {}\n", + " f = open('playerIds.datajson', 'r')\n", + " playerIds = 
json.loads(f.read())\n", + " f.close()\n", + "\n", + " playerNamesById = {}\n", + " for playerName in playerIds:\n", + " for playerId in playerIds[playerName]:\n", + " playerNamesById[playerId] = playerName\n", + "\n", + " playerAdpsAllDates = {}\n", + " f = open('playerAdpData.datajson', 'r')\n", + " playerAdpsAllDates = json.loads(f.read())\n", + " f.close()\n", + "\n", + " for date in playerAdpsAllDates:\n", + " playerAdpsAllDates[date] = dict(sorted(playerAdpsAllDates[date].items(), key=lambda x: x[1]))\n", + "\n", + " # Dictionaries to store player scores and info\n", + " player_scores = {}\n", + " player_info = {}\n", + " # Fetch data from 'UD_FPTS_2022' table\n", + "\n", + " UD_FPTS_2022_df = pd.read_sql_query(\"SELECT * FROM UD_FPTS_2022\", con)\n", + " \n", + " BATCH_SIZE = 1\n", + "\n", + " # Set up a dictionary to store advance counts for each player\n", + " advance_count = {player: {'a': 0, 't': 0} for player in playerAdps.keys()}\n", + "\n", + " # so that we don't have to recalculate lineups\n", + " player_results = {}\n", + " try:\n", + " f = open('playerResults.json', 'r')\n", + " player_results = json.loads(f.read())\n", + " f.close()\n", + " except:\n", + " pass\n", + "\n", + " try:\n", + " f = open('advanceRate.json', 'r')\n", + " advance_count = json.loads(f.read())\n", + " f.close()\n", + " except:\n", + " pass\n", + "\n", + " # Load the data\n", + " '''\n", + " with open(r'precalc_data.json', 'r') as file:\n", + " precalc_data = json.load(file)\n", + " '''\n", + "\n", + " for i in range(1, 11):\n", + " f = open('precalc_data_new_%d.json' % (i,), 'r')\n", + " precalc_data = []\n", + " while True:\n", + " line = f.readline()\n", + " if not line:\n", + " break\n", + " precalc_data.append(json.loads(line))\n", + " f.close()\n", + "\n", + " # Main loop\n", + " for test_player in playerAdps:\n", + " if playerAdps[test_player]['a'] >= 200:\n", + " continue\n", + " if test_player not in player_results:\n", + " player_results[test_player] = []\n", + " 
test_player_position = playerAdps[test_player]['p']\n", + " if test_player not in advance_count:\n", + " continue\n", + " print(f\"Player to test: {test_player}\")\n", + " playerCount = 1\n", + "\n", + " # Get the weekly scores of the test player from the 'UD_FPTS_2022' table\n", + " test_player_scores_dict = get_scores_for_player(test_player)\n", + " if not test_player_scores_dict:\n", + " continue\n", + " for val in precalc_data:\n", + " for draftId in val:\n", + " break\n", + " draftData = val[draftId]\n", + " for lineupId in draftData:\n", + " if lineupId in player_results[test_player]:\n", + " continue\n", + " lineupData = draftData[lineupId]\n", + " original_rank = lineupData[\"rank\"]\n", + "\n", + " #print(f\"Original rank: {original_rank}\")\n", + "\n", + " # If the test player is already on the team, continue to the next player\n", + " playerFound = False\n", + " for player in lineupData['lineup'][test_player_position]:\n", + " if player['name'] == test_player:\n", + " playerFound = True\n", + " break\n", + " if playerFound:\n", + "\n", + " if lineupData['rank'] <= 2:\n", + " advance_count[test_player]['a'] += 1\n", + " advance_count[test_player]['t'] += 1\n", + "\n", + " if playerCount % 5000 == 0:\n", + " print(playerCount)\n", + " player_results[test_player].append(lineupId)\n", + " f = open('playerResults.json', 'w')\n", + " f.write(json.dumps(player_results))\n", + " f.flush()\n", + " f.close()\n", + "\n", + " f = open('advanceRate.json', 'w')\n", + " f.write(json.dumps(advance_count))\n", + " f.flush()\n", + " f.close()\n", + " playerCount += 1\n", + " continue\n", + "\n", + " # Get the player to swap with the test player\n", + " playerAdpsByDate = playerAdpsAllDates[lineupData['date']]\n", + " try:\n", + " playerId = playerIds[test_player][0]\n", + " playerAdp = playerAdpsByDate[playerId]\n", + " except:\n", + " try:\n", + " playerAdp = playerAdps[test_player]['a']\n", + " except:\n", + " playerAdp = 216\n", + " player_to_swap = 
get_player_to_swap(lineupData['lineup'], test_player, playerAdp,\n", + " playerAdps[test_player]['p'])\n", + " if not player_to_swap:\n", + " continue\n", + "\n", + " #(f\"Player to swap: {player_to_swap}\")\n", + "\n", + " # Swap the player\n", + " new_team_data = swap_player(lineupData, player_to_swap, test_player, test_player_scores_dict,\n", + " test_player_position)\n", + " #print(f\"New team data: {new_team_data}\")\n", + "\n", + " if new_team_data['total_score'] > val['second']:\n", + " advance_count[test_player]['a'] += 1\n", + " advance_count[test_player]['t'] += 1\n", + "\n", + " player_results[test_player].append(lineupId)\n", + "\n", + " if playerCount % 5000 == 0:\n", + " print(playerCount)\n", + " f = open('playerResults.json', 'w')\n", + " f.write(json.dumps(player_results))\n", + " f.flush()\n", + " f.close()\n", + "\n", + " f = open('advanceRate.json', 'w')\n", + " f.write(json.dumps(advance_count))\n", + " f.flush()\n", + " f.close()\n", + "\n", + " playerCount += 1\n", + "\n", + " '''\n", + " # Calculate the new rank\n", + " new_rank = calculate_rank(new_team_data, precalc_data)\n", + " print(f\"New rank: {new_rank}\")\n", + "\n", + " # If the new rank is better than the original rank, increment the advance count for the test player\n", + " if new_rank < original_rank:\n", + " advance_count[test_player] += 1\n", + " original_rank = new_rank\n", + " '''\n", + " # break\n", + " f = open('playerResults.json', 'w')\n", + " f.write(json.dumps(player_results))\n", + " f.flush()\n", + " f.close()\n", + "\n", + " f = open('advanceRate.json', 'w')\n", + " f.write(json.dumps(advance_count))\n", + " f.flush()\n", + " f.close()\n", + "\n", + " playerPointsPercentages = {}\n", + " try:\n", + " f = open('playerPointsPercentages.json', 'r')\n", + " playerPointsPercentages = json.loads(f.read())\n", + " f.close()\n", + " except:\n", + " pass\n", + "\n", + " playersAdvancingPointsPercentages = {}\n", + " try:\n", + " f = 
open('playerAdvancingPointsPercentages.json', 'r')\n", + " playersAdvancingPointsPercentages = json.loads(f.read())\n", + " f.close()\n", + " except:\n", + " pass\n", + "\n", + " completedContests = []\n", + " try:\n", + " f = open('playerAdvancingPointsPercentagesCompletedContests.json', 'r')\n", + " completedContests = json.loads(f.read())\n", + " f.close()\n", + " except:\n", + " pass\n", + "\n", + " nameExceptions = {\n", + " 'D.J. Moore': 'DJ Moore',\n", + " 'A.J. Dillon': 'AJ Dillon',\n", + " 'D.J. Chark': 'DJ Chark',\n", + " 'K.J. Hamler': 'KJ Hamler',\n", + " 'Gabe Davis': 'Gabriel Davis'\n", + " }\n", + "\n", + " for i in range(1, 11):\n", + " f = open('precalc_data_new_%d.json' % (i,), 'r')\n", + " precalc_data = []\n", + " while True:\n", + " line = f.readline()\n", + " if not line:\n", + " break\n", + " precalc_data.append(json.loads(line))\n", + " f.close()\n", + "\n", + " for val in precalc_data:\n", + " for draftId in val:\n", + " break\n", + " if draftId in completedContests:\n", + " continue\n", + " print(f\"Calculating draft: {draftId}\")\n", + " valCount = 1\n", + " draftData = val[draftId]\n", + " for lineupId in draftData:\n", + " playerScores = {}\n", + " lineup = draftData[lineupId]['lineup']\n", + " for position in lineup:\n", + " for playerVal in lineup[position]:\n", + " playerName = playerVal['name']\n", + " playerScores[playerName] = 0\n", + " totalPoints = draftData[lineupId]['total_score']\n", + " weeklyScores = draftData[lineupId]['team_output']\n", + " for weekNum in weeklyScores:\n", + " weeklyLineup = weeklyScores[weekNum]['l']\n", + " for position in weeklyLineup:\n", + " for playerName in weeklyLineup[position]:\n", + " score = weeklyLineup[position][playerName]\n", + " if playerName in nameExceptions:\n", + " playerName = nameExceptions[playerName]\n", + " playerScores[playerName] += score\n", + " rank = draftData[lineupId]['rank']\n", + " for playerName in playerScores:\n", + " if playerName not in playerPointsPercentages:\n", + " 
playerPointsPercentages[playerName] = {'p': 0, 't': 0}\n", + " playerPointsPercentages[playerName]['p'] += playerScores[playerName]\n", + " playerPointsPercentages[playerName]['t'] += totalPoints\n", + "\n", + " if rank <= 2:\n", + " if playerName not in playersAdvancingPointsPercentages:\n", + " playersAdvancingPointsPercentages[playerName] = {'p': 0, 't': 0}\n", + " playersAdvancingPointsPercentages[playerName]['p'] += playerScores[playerName]\n", + " playersAdvancingPointsPercentages[playerName]['t'] += totalPoints\n", + "\n", + " completedContests.append(draftId)\n", + "\n", + " if valCount % 5000 == 0:\n", + " print(valCount)\n", + " f = open('playerPointsPercentages.json', 'w')\n", + " f.write(json.dumps(playerPointsPercentages))\n", + " f.flush()\n", + " f.close()\n", + " f = open('playerAdvancingPointsPercentages.json', 'w')\n", + " f.write(json.dumps(playersAdvancingPointsPercentages))\n", + " f.flush()\n", + " f.close()\n", + " f = open('playerAdvancingPointsPercentagesCompletedContests.json', 'w')\n", + " f.write(json.dumps(completedContests))\n", + " f.flush()\n", + " f.close()\n", + "\n", + " valCount += 1\n", + " f = open('playerPointsPercentages.json', 'w')\n", + " f.write(json.dumps(playerPointsPercentages))\n", + " f.flush()\n", + " f.close()\n", + " f = open('playerAdvancingPointsPercentages.json', 'w')\n", + " f.write(json.dumps(playersAdvancingPointsPercentages))\n", + " f.flush()\n", + " f.close()\n", + " f = open('playerAdvancingPointsPercentagesCompletedContests.json', 'w')\n", + " f.write(json.dumps(completedContests))\n", + " f.flush()\n", + " f.close()\n", + "\n", + " # so that we don't have to recalculate lineups\n", + " lineupsCalculated = []\n", + " try:\n", + " f = open('lineupsCalculatedPerLineupAdvanceRate.json', 'r')\n", + " lineupsCalculated = json.loads(f.read())\n", + " f.close()\n", + " except:\n", + " pass\n", + "\n", + " playerPointsAddedPerLineup = {}\n", + " try:\n", + " f = open('playerPointsAddedPerLineup.json', 'r')\n", + 
" playerPointsAddedPerLineup = json.loads(f.read())\n", + " f.close()\n", + " except:\n", + " pass\n", + "\n", + " playerScoresVals = {}\n", + "\n", + " for i in range(1, 11):\n", + " f = open('precalc_data_new_%d.json' % (i,), 'r')\n", + " precalc_data = []\n", + " while True:\n", + " line = f.readline()\n", + " if not line:\n", + " break\n", + " precalc_data.append(json.loads(line))\n", + " f.close()\n", + "\n", + " # Main loop\n", + " for contestData in precalc_data:\n", + " for contestId in contestData:\n", + " break\n", + " if contestId in lineupsCalculated:\n", + " continue\n", + " print(f\"Calculating draft: {contestId}\")\n", + " valCount = 1\n", + " for lineupId in contestData[contestId]:\n", + " lineupInfo = contestData[contestId][lineupId]\n", + " lineup = lineupInfo['lineup']\n", + " weeklyScores = lineupInfo['team_output']\n", + " draftDate = lineupInfo['date']\n", + " playerAdpVals = playerAdpsAllDates[draftDate]\n", + " for position in lineup:\n", + " for playerVal in lineup[position]:\n", + " playerName = playerVal['name']\n", + " playerIdName = playerName\n", + " if playerIdName in nameExceptions:\n", + " playerIdName = nameExceptions[playerName]\n", + " if playerIdName not in playerIds:\n", + " continue\n", + " playerId = playerIds[playerIdName]\n", + " pickNum = playerVal['pick']\n", + " position = playerVal['pos']\n", + " playersToSwap = getPlayersToSwap(playerAdpVals, position, playerId[0], pickNum, playerNamesById, playerAdps)\n", + " if len(playersToSwap) == 0:\n", + " continue\n", + " for playerToSwap in playersToSwap:\n", + " if playerToSwap not in playerScoresVals:\n", + " playerScoresVals[playerToSwap] = get_scores_for_player(playerToSwap)\n", + " new_team_data = swap_player(lineupInfo, playerName, playerToSwap,\n", + " playerScoresVals[playerToSwap],\n", + " position)\n", + " if playerName not in playerPointsAddedPerLineup:\n", + " playerPointsAddedPerLineup[playerName] = {'pointsAdded' : 0, 'numSwaps' : 0, 'position' : position, 
'totalAdp' : 0, 'totalLineups' : 0}\n", + " playerPointsAddedPerLineup[playerName]['numSwaps'] += 1\n", + " playerPointsAddedPerLineup[playerName]['pointsAdded'] += lineupInfo['total_score'] - new_team_data['total_score']\n", + " playerPointsAddedPerLineup[playerName]['totalAdp'] += pickNum\n", + " playerPointsAddedPerLineup[playerName]['totalLineups'] += 1\n", + "\n", + "\n", + " lineupsCalculated.append(contestId)\n", + " if valCount % 5000 == 0:\n", + " print(valCount)\n", + " f = open('lineupsCalculatedPerLineupAdvanceRate.json', 'w')\n", + " f.write(json.dumps(lineupsCalculated))\n", + " f.flush()\n", + " f.close()\n", + "\n", + " f = open('playerPointsAddedPerLineup.json', 'w')\n", + " f.write(json.dumps(playerPointsAddedPerLineup))\n", + " f.flush()\n", + " f.close()\n", + " valCount += 1\n", + " f = open('lineupsCalculatedPerLineupAdvanceRate.json', 'w')\n", + " f.write(json.dumps(lineupsCalculated))\n", + " f.flush()\n", + " f.close()\n", + "\n", + " f = open('playerPointsAddedPerLineup.json', 'w')\n", + " f.write(json.dumps(playerPointsAddedPerLineup))\n", + " f.flush()\n", + " f.close()\n", + "\n", + " con.close()" + ] + }, + { + "cell_type": "raw", + "id": "34e6d7ed", + "metadata": {}, + "source": [ + "Sample of advanceRate.json\n", + "{\n", + " Jonathan Taylor: {\n", + " a: 38714\n", + " t: 454940\n", + " }\n", + " Christian McCaffrey: {\n", + " a: 118223\n", + " t: 454954\n", + " }\n", + " Justin Jefferson: {\n", + " a: 143125\n", + " t: 454953\n", + " }\n", + " Cooper Kupp: {\n", + " a: 67096\n", + " t: 454947\n", + " }\n", + " Ja'Marr Chase: {\n", + " a: 57041\n", + " t: 454951\n", + " }\n", + " Austin Ekeler: {\n", + " a: 135040\n", + " t: 454952\n", + " }\n", + " Stefon Diggs: {\n", + " a: 113736\n", + " t: 454948\n", + " }\n", + " Derrick Henry: {\n", + " a: 114805\n", + " t: 454947\n", + " }\n", + " Dalvin Cook: {\n", + " a: 59397\n", + " t: 454946\n", + " }\n", + " Davante Adams: {\n", + " a: 128579\n", + " t: 454954\n", + " }\n", + " Najee 
Harris: {\n", + " a: 35049\n", + " t: 454956\n", + " }\n", + " Travis Kelce: {\n", + " a: 179332\n", + " t: 454955\n", + " }\n", + " Joe Mixon: {\n", + " a: 75182\n", + " t: 454954\n", + " }\n", + " CeeDee Lamb: {\n", + " a: 63369\n", + " t: 454954\n", + " }\n", + " Saquon Barkley: {\n", + " a: 96503\n", + " t: 454954\n", + " }\n", + " D'Andre Swift: {\n", + " a: 34286\n", + " t: 454948\n", + " }\n", + " ..." + ] + }, + { + "cell_type": "markdown", + "id": "e4a88a57", + "metadata": {}, + "source": [ + "Now we compare the reported advance rate provided by Underdog for BBMIII to our roster agnostic advance rate." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d063748d", + "metadata": {}, + "outputs": [], + "source": [ + "import json\n", + "import pandas as pd\n", + "import matplotlib.pyplot as plt\n", + "import numpy as np\n", + "from PIL import Image\n", + "\n", + "# Load the JSON data\n", + "with open(r'advanceRate.json') as f:\n", + " data = json.load(f)\n", + "\n", + "# Load the CSV data\n", + "csv_data = pd.read_csv(r'bbm3_adv_rate.csv')\n", + "\n", + "# Normalize the player names in the csv data\n", + "csv_data['Player'] = csv_data['Player'].apply(lambda x: x.strip())\n", + "\n", + "# Initialize a dictionary to store the a/t ratios\n", + "ratios = {}\n", + "\n", + "# Iterate over items in the dictionary\n", + "for name, values in data.items():\n", + " a = values.get('a', 0)\n", + " t = values.get('t', 0)\n", + " if t != 0: # To avoid division by zero\n", + " ratios[name.strip()] = a / t\n", + "\n", + "# Lists to store players, differences, percentage differences, ratios, and ADPs\n", + "players = []\n", + "diffs = []\n", + "perc_diffs = []\n", + "ratios_list = []\n", + "adps = []\n", + "adv_from_round_1 = []\n", + "\n", + "# Iterate over each row in the CSV data\n", + "for _, row in csv_data.iterrows():\n", + " player = row['Player']\n", + " if player in ratios:\n", + " # Calculate the difference and the percentage difference\n", + " 
diff = ratios[player] - row['Adv From Round 1']\n", + " perc_diff = (diff / row['Adv From Round 1']) * 100 if row['Adv From Round 1'] != 0 else None\n", + " print(f\"{player}: difference = {diff}, percentage difference = {perc_diff}%\")\n", + "\n", + " # Append the player, difference, percentage difference, ratio, ADP, and Adv From Round 1 to their respective lists\n", + " players.append(player)\n", + " diffs.append(diff)\n", + " perc_diffs.append(perc_diff)\n", + " ratios_list.append(ratios[player])\n", + " adps.append(row['ADP'])\n", + " adv_from_round_1.append(row['Adv From Round 1'])\n", + " else:\n", + " print(f\"{player} does not exist in the JSON data\")\n", + "\n", + "# Convert lists to a DataFrame\n", + "df = pd.DataFrame({\n", + " 'Player': players,\n", + " 'Difference': diffs,\n", + " 'Percentage Difference': perc_diffs,\n", + " 'RAAR': ratios_list,\n", + " 'ADP': adps,\n", + " 'Advance Rate': adv_from_round_1\n", + "})\n", + "\n", + "# Sort DataFrame by ADP for plotting\n", + "df_plot = df.sort_values('ADP')\n", + "\n", + "# Plot each group of 24 players\n", + "n = 24\n", + "bar_width = 0.35\n", + "opacity = 0.8\n", + "for i in range(0, len(df_plot), n):\n", + " df_subset = df_plot.iloc[i:i + n]\n", + " fig, ax = plt.subplots(figsize=(10, 8))\n", + "\n", + " index = np.arange(len(df_subset))\n", + "\n", + " rects1 = plt.bar(index, df_subset['Advance Rate'], bar_width, alpha=opacity, color='r', label='Advance Rate')\n", + " rects2 = plt.bar(index + bar_width, df_subset['RAAR'], bar_width, alpha=opacity, color='#36bafb', label='RAAR')\n", + "\n", + " plt.xlabel('Player', color='white')\n", + " plt.ylabel('Advance Rate', color='white')\n", + " plt.title('Advance Rate vs RAAR', color='white')\n", + "\n", + " # Adjust x ticks to avoid cutting off the first bar\n", + " plt.xticks(index + bar_width / 2, df_subset['Player'], rotation=90, color='white', ha='right')\n", + "\n", + " plt.yticks(color='white')\n", + "\n", + " legend = plt.legend()\n", + " 
plt.setp(legend.get_texts(), color='black')\n", + "\n", + " ax.set_facecolor('#313338')\n", + " fig.patch.set_facecolor('#313338')\n", + "\n", + " img = Image.open(r'SW_watermark-1.png')\n", + " plt.imshow(img, aspect='auto', extent=(min(index) - 0.5, max(index) + bar_width + 0.5, 0, max(max(df_subset['Advance Rate']), max(df_subset['RAAR'])) + 0.1), alpha=0.5)\n", + "\n", + " plt.tight_layout()\n", + " plt.show()\n", + "\n", + "# Sort DataFrame by Difference in descending order for display\n", + "df_sorted = df.sort_values('Difference', ascending=False)\n", + "\n", + "# Save the sorted DataFrame to a new CSV file\n", + "df_sorted.to_csv(r'sorted_difference.csv', index=False)\n" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "32164779", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "from 
IPython.display import display, Image\n", + "\n", + "base_url = \"https://raw.githubusercontent.com/sackreligious/bestballdatabowl/cdb0d53e05d15b3c7935315e1b9a12cadce82113/RAAR/R\"\n", + "\n", + "\n", + "for i in range(1, 19, 2):\n", + " if i == 17:\n", + " img_url = f\"{base_url}{i}.png\"\n", + " else:\n", + " img_url = f\"{base_url}{i}-{i+1}.png\"\n", + " display(Image(url=img_url))\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "675850c6", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "from IPython.display import display, Image\n", + "\n", + "base_url = \"https://raw.githubusercontent.com/sackreligious/bestballdatabowl/cdb0d53e05d15b3c7935315e1b9a12cadce82113/APPA/APPA_r\"\n", + "\n", + "for i in range(1, 19, 2):\n", + " img_url = f\"{base_url}{i}-{i+1}.png\"\n", + " 
display(Image(url=img_url))\n" + ] + }, + { + "cell_type": "markdown", + "id": "e3017bbf", + "metadata": {}, + "source": [ + "RAAR does have some limitations due to the methodology. For example, Travis Kelce sports an impressive RAAR differential of over 9% more than his traditional advance rate. However, this is because the player swapping protocol selects players within a certain ADP range and of the same position as the target player. In practice, RAAR effectively pits Travis Kelce in a 1v1 with Mark Andrews, and we all know how that played out last season. Another feature of RAAR is that the average RAAR is only 13.94%, compared to an average traditional advance rate of 16.78%. This is because in order for a team to go from non-advancing to advancing, it must beat out one of the 2 teams that actually advanced. When the target player being added to rosters is a player with a very high advance rate, this can effectively turn the advance rate check into a 17 vs. 17 of the remaining players on each roster, assuming they both now contain the high advance rate target player. 
In summary, RAAR will have a bias towards underreporting advance rate because our measuring stick is the 2nd place team in each pod, a high threshold especially for teams in the bottom half of each pod.\n", + "\n", + "Now that we've looked at RAAR and APPA, let's take a look at PPCAT and PPCT:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bbd67354", + "metadata": {}, + "outputs": [], + "source": [ + "import json\n", + "import pandas as pd\n", + "import matplotlib.pyplot as plt\n", + "import numpy as np\n", + "from PIL import Image\n", + "\n", + "# Load the JSON data\n", + "with open(r'playerPointsAddedPerLineup.json') as f:\n", + " data = json.load(f)\n", + "\n", + "# Load the CSV data\n", + "csv_data = pd.read_csv(r'bbm3_adv_rate.csv')\n", + "\n", + "# Normalize the player names in the csv data\n", + "csv_data['Player'] = csv_data['Player'].apply(lambda x: x.strip())\n", + "\n", + "# Initialize a list to store dictionaries with player information\n", + "player_info = []\n", + "\n", + "# Iterate over items in the dictionary\n", + "for name, values in data.items():\n", + " num_swaps = values.get('numSwaps', 0)\n", + "\n", + " points_added = values.get('pointsAdded', 0)\n", + " avg_points = points_added / num_swaps\n", + "\n", + " # Get the player's ADP from the CSV data, if available\n", + " adp = csv_data.loc[csv_data['Player'] == name, 'ADP'].values[0] if name in csv_data['Player'].values else None\n", + "\n", + " # Create a dictionary with player information and append it to the list\n", + " player_info.append({'player': name, 'avg_points_added': avg_points, 'adp': adp})\n", + "\n", + "# Convert list of dictionaries to a DataFrame\n", + "df = pd.DataFrame(player_info)\n", + "\n", + "# Sort DataFrame by ADP\n", + "df_sorted = df.sort_values('adp')\n", + "\n", + "# Convert DataFrame columns back to lists\n", + "players = df_sorted['player'].tolist()\n", + "avg_points_added = df_sorted['avg_points_added'].tolist()\n", + "\n", + "# Plot 
each group of 24 players\n", + "n = 24\n", + "bar_width = 0.35\n", + "opacity = 0.8\n", + "\n", + "for i in range(0, len(players), n):\n", + " fig, ax = plt.subplots(figsize=(10, 8))\n", + "\n", + " player_subset = players[i:i+n]\n", + " avg_points_subset = avg_points_added[i:i+n]\n", + "\n", + " index = np.arange(len(player_subset))\n", + "\n", + " plt.bar(index, avg_points_subset, bar_width, alpha=opacity, color='#36bafb')\n", + "\n", + " plt.xlabel('Player', color='white')\n", + " plt.ylabel('Average Player Points Added', color='white')\n", + " plt.title('Average Player Points Added Per Swap', color='white')\n", + "\n", + " # Adjust x ticks to avoid cutting off the first bar\n", + " plt.xticks(index, player_subset, rotation=45, color='white', ha='right')\n", + "\n", + " plt.yticks(color='white')\n", + "\n", + " ax.set_facecolor('#313338')\n", + " fig.patch.set_facecolor('#313338')\n", + "\n", + " img = Image.open(r'SW_watermark-1.png')\n", + " plt.imshow(img, aspect='auto', extent=(min(index) - 0.5, max(index) + bar_width + 0.5, min(avg_points_subset), max(avg_points_subset) + 0.1), alpha=0.5)\n", + "\n", + " plt.tight_layout()\n", + " plt.show()\n" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "f8315feb", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" 
+ ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "from IPython.display import display, Image\n", + "\n", + "base_url = \"https://raw.githubusercontent.com/sackreligious/bestballdatabowl/cdb0d53e05d15b3c7935315e1b9a12cadce82113/PPCAT/PPCAT_r\"\n", + "\n", + "for i in range(1, 19, 2):\n", + " img_url = f\"{base_url}{i}-{i+1}.png\"\n", + " display(Image(url=img_url))\n" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "c8617abd", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "" + ], + 
"text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "from IPython.display import display, Image\n", + "\n", + "base_url = \"https://raw.githubusercontent.com/sackreligious/bestballdatabowl/cdb0d53e05d15b3c7935315e1b9a12cadce82113/PPCT/PPCT_r\"\n", + "\n", + "for i in range(1, 19, 2):\n", + " img_url = f\"{base_url}{i}-{i+1}.png\"\n", + " display(Image(url=img_url))\n" + ] + }, + { + "cell_type": "markdown", + "id": "949c7e87", + "metadata": {}, + "source": [ + "There are many potential applications for RAAR, especially when used in conjunction with APPA, PPCAT, and PPCT. There is a large amount of existing analysis in the best ball space based around advance rate/win rate. All of that analysis can be improved by utilizing these less noisy metrics that we have developed. We plan on applying these four metrics in many future projects, and we look forward to seeing how other analysts utilize them in their work." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.2" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}