Get to know a GitHub repository's star-gazers, by using some simple analytics and visualizations of star-gazers data queried from GitHub.
The image below is a geographical snapshot of FastAI's star gazers at the beginning of November, 2018. It was created by executing the following:
$ python stars_analytics.py query-github --user=<YOUR USER> --password=<YOUR PASSWORD> --repo=https://github.com/fastai/fastai --proxy=https://<YOUR PROXY>:<PORT>
And here's the same data in a different view:
$ python3 stars_analytics.py stars-geo-tbl --cache-file=fastai_star_gazers.csv --format=plot
Choose any github repository, run a query against GitHub to download the star-gazer records, and perform analytics. Currently features:
- Classify star-gazers by country of origin, and draw this using size-bubbles on a world map.
- Create monthly starring activity statistics
- Create day-of-week statistics
Note: You will need a GitHub user and password to query GitHub. There is an API to query without using GiutHub credentials, but it limits the number of queries to a very small number, that is not practical for this application.
These instructions will help get GSA up and running on your local machine.
Notes:
- This repository has only been tested on Ubuntu 16.04 LTS, and with Python 3.5.
Clone the Star Gazers Analytics code repository from github:
$ git clone https://github.com/netaz/github_stars_analytics.git
I recommend using a Python virtual environment, but that of course, is up to you.
Before creating the virtual environment, make sure you are located in directory github_stars_analytics
. After creating the environment, you should see a directory called github_stars_analytics/env
.
If you don't have virtualenv installed, you can find the installation instructions here.
To create the environment, execute:
$ python3 -m virtualenv env
This creates a subdirectory named env
where the python virtual environment is stored, and configures the current shell to use it as the default python environment.
If you prefer to use venv
, then begin by installing it:
$ sudo apt-get install python3-venv
Then create the environment:
$ python3 -m venv env
As with virtualenv, this creates a directory called github_stars_analytics/env
.
The environment activation and deactivation commands for venv
and virtualenv
are the same.
!NOTE: Make sure to activate the environment, before proceeding with the installation of the dependency packages:
$ source env/bin/activate
Finally, install GSA's dependency packages using pip3
:
$ pip3 install -r requirements.txt
- Download some metadata We need some metadata about each of the countries in the world: its capital, size of population and all of the major cities. We also download for each country its longitude and latitude so we can draw country-specific data on a map. We will use this data to match star-gazer records to countries.
We only need to download this data once:
$ ./download_metadata.sh
- Query a specific Github repository and create a file (default name = star_gazers.csv) with the cached information. This process is a bit slow so we cache the results in a file, until we decide to get new data.
python stars_analytics.py query-github --user=<YOUR-GITHUB-USERNAME> --password=<YOUR-GITHUB-PASSWORD> --repo=<YOUR-GITHUB-REPO-URL>
The format of the URL is the same as when you access it via the web. For example: https://github.com/NervanaSystems/distiller
- Use the cached github star-gazers data file, to create analytics and visualizations. For example, to create the diagram above:
python stars_analytics.py stars-geo-map
To create a chart of the daily stars for the entire period:
python3 stars_analytics.py daily --cache-file=distiller_star_gazers.csv --format=plot