A tool to collect and analyze what shoes runners wear in the Boston Marathon. This project helps identify trends in running shoe choices among marathon participants.
- Collects runner data from the Boston Marathon results
- Shows you photos of runners from MarathonFoto
- Lets you identify what shoes each runner is wearing
- Saves this information for later analysis
You'll need:
- A computer running Windows or Mac
- Internet connection
- About 10GB of free disk space
- Basic knowledge of using command prompt (Windows) or terminal (Mac)
Install Python on Windows:
- Go to Python.org
- Download the latest Python installer
- Run the installer
- ✓ Check "Add Python to PATH" during installation
- Click "Install Now"
Install Python on Mac:
- Open Terminal
- Install Homebrew if you don't have it:
  /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
- Install Python:
  brew install python
- Install Google Chrome if you don't have it: Download Chrome
- Download ChromeDriver:
  - Go to ChromeDriver Downloads
  - Download the version that matches your Chrome browser
- Set up ChromeDriver:
  - Windows:
    - Create folder: C:\Program Files (x86)\chromedriver-win64
    - Extract chromedriver.exe to this folder
  - Mac:
    - Move chromedriver to /usr/local/bin:
      sudo mv ~/Downloads/chromedriver /usr/local/bin
- Download this project:
  - Click the green "Code" button above
  - Choose "Download ZIP"
  - Extract the ZIP file somewhere on your computer
- Open command prompt (Windows) or terminal (Mac)
- Navigate to the project folder:
  cd path/to/extracted/folder
- Install the required Python packages:
  pip install -r requirements.txt
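If you want to confirm that ChromeDriver ended up where the setup steps put it, a short script can look for it. This is only a sketch: `find_chromedriver` is a made-up helper name, it checks your PATH plus the two folders named in the setup steps, and it does not verify that the ChromeDriver version matches your Chrome browser.

```python
import shutil
from pathlib import Path

# Folders taken from the setup steps; adjust if you extracted elsewhere.
CANDIDATE_DIRS = (
    Path(r"C:\Program Files (x86)\chromedriver-win64"),  # Windows
    Path("/usr/local/bin"),  # Mac
)


def find_chromedriver(extra_dirs=CANDIDATE_DIRS):
    """Return the path to a ChromeDriver executable, or None if not found."""
    # First ask the operating system whether chromedriver is on PATH.
    on_path = shutil.which("chromedriver")
    if on_path:
        return Path(on_path)
    # Otherwise look in the folders named in the setup steps.
    for folder in extra_dirs:
        for name in ("chromedriver.exe", "chromedriver"):
            candidate = Path(folder) / name
            if candidate.is_file():
                return candidate
    return None


if __name__ == "__main__":
    location = find_chromedriver()
    print(location if location else "ChromeDriver not found; recheck the setup steps")
```

Save it under any name (for example `check_chromedriver.py`, a hypothetical file that is not part of this project) and run it with `python check_chromedriver.py`.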
The tool uses several CSV files and needs to know where to find them. You'll need to update these paths in the code:
- Open src/data/ScrapingMarathonfoto.py in a text editor
- Find and modify these paths to match your setup:
  # For shoe choices storage
  'D:\\BAAFootwear\\data\\Raw\\ShoeChoices.csv'
  # For race results data
  'D:\\BAAFootwear\\data\\Processed\\RaceTimeSeconds.csv'
- Use double backslashes (\\) on Windows or forward slashes (/) on Mac:
  - Windows example: D:\\BAAFootwear\\data\\Raw\\ShoeChoices.csv
  - Mac example: /Users/yourname/BAAFootwear/data/Raw/ShoeChoices.csv
- Make sure these directories exist on your system before running the tool
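One way to avoid slash mistakes and missing folders is to build the paths with Python's `pathlib`. The sketch below is not taken from the project code: `data_paths` is a hypothetical helper, and it assumes the `Raw` and `Processed` folders sit under a single data folder, as in the examples above.

```python
from pathlib import Path


def data_paths(root):
    """Build the two CSV paths under `root` and create their folders."""
    root = Path(root)
    shoe_choices = root / "Raw" / "ShoeChoices.csv"
    race_results = root / "Processed" / "RaceTimeSeconds.csv"
    for csv_path in (shoe_choices, race_results):
        # Create the parent folder if it is missing; existing folders are left alone.
        csv_path.parent.mkdir(parents=True, exist_ok=True)
    return shoe_choices, race_results
```

Calling `data_paths(Path.home() / "BAAFootwear" / "data")` returns the two paths with the correct separator for your system and creates the `Raw` and `Processed` folders if needed. Use `str(path)` if you need to paste the result into the script as a plain string.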
- Start the data collection:
  python src/data/make_dataset.py
  This will gather runner information from the marathon results.
- Start the shoe identification tool:
  python src/data/ScrapingMarathonfoto.py
- For each runner:
  - A window will open showing marathon photos
  - Another window will show shoe options
  - Click the shoe that matches what the runner is wearing
  - The tool automatically moves to the next runner
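To give a rough idea of what saving a choice can look like, here is a minimal sketch that appends one labeled runner to a CSV file. It is an illustration only: the column names `bib` and `shoe` and the helper `record_choice` are invented for this example, and the real layout of the shoe choices file is whatever ScrapingMarathonfoto.py writes.

```python
import csv
from pathlib import Path

# Hypothetical column names, for illustration only.
FIELDS = ["bib", "shoe"]


def record_choice(csv_path, bib, shoe):
    """Append one runner's shoe label, writing a header if the file is new."""
    csv_path = Path(csv_path)
    is_new = not csv_path.exists() or csv_path.stat().st_size == 0
    # Append mode keeps earlier labels; newline="" avoids blank rows on Windows.
    with csv_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({"bib": bib, "shoe": shoe})
```

Appending one row per click means an interrupted session loses at most the runner currently on screen.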
If you're a student helping with shoe classification:
- Create a GitHub account if you don't have one
- Share your GitHub username with John
- John will:
  - Add you as a collaborator with restricted permissions
  - Set up branch protection rules allowing you to only modify shoeChoices.csv
- Clone the repository:
  git clone https://github.com/jkuzmeski/BAAFootwear.git
- When working:
  - Commit and push your changes:
    git add shoeChoices.csv
    git commit -m "Data labeling MM-DD-YYYY"
    git push
Note: You will only be able to modify shoeChoices.csv. Other file changes will be rejected.
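Because pushes that touch other files are rejected, it can help to check what changed before committing. The sketch below parses the output of `git status --porcelain` and lists anything other than shoeChoices.csv. The helper names are hypothetical, and the parsing covers only the simple cases (modified, untracked, renamed files).

```python
import subprocess

ALLOWED = {"shoeChoices.csv"}


def changed_paths(porcelain_output):
    """Extract file paths from `git status --porcelain` output."""
    paths = []
    for line in porcelain_output.splitlines():
        if len(line) < 4:
            continue
        # Each line is two status characters, a space, then the path.
        path = line[3:]
        # Renames appear as "old -> new"; keep the new name.
        if " -> " in path:
            path = path.split(" -> ", 1)[1]
        paths.append(path.strip('"'))
    return paths


def unexpected_changes(porcelain_output, allowed=ALLOWED):
    """Return changed paths that are not in the allowed set."""
    return [p for p in changed_paths(porcelain_output) if p not in allowed]


def git_status():
    """Run `git status --porcelain` in the current folder and return its output."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

From inside the cloned repository, `unexpected_changes(git_status())` returns an empty list when shoeChoices.csv is the only file you touched.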
Common issues:
- "Python not found"
  - Reinstall Python and make sure to check "Add Python to PATH"
- "ChromeDriver error"
  - Make sure Chrome browser is installed
  - Download the matching ChromeDriver version
  - Check if ChromeDriver is in the correct location
- "Module not found"
  - Run pip install -r requirements.txt again
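If "Module not found" persists, it may be that `pip` installed the packages into a different Python than the one running the tool. A small check like the one below can show which interpreter is in use and which modules it cannot see. The module names in `wanted` are placeholders, not the project's actual requirements; replace them with the import names of the packages listed in requirements.txt (an import name can differ from the package name).

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names from `names` that cannot be imported by this Python."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical list; replace with the packages from requirements.txt.
    wanted = ["selenium", "pandas"]
    print("Python executable:", sys.executable)
    print("Missing modules:", missing_modules(wanted) or "none")
```

If modules are reported missing, running `python -m pip install -r requirements.txt` installs them for the same interpreter that printed the executable path.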