Gaming Data Science and Machine Learning

This repo demonstrates a classical Data Science and Machine Learning approach applied to gaming data, a domain I had not worked with before. The data comes from Counter-Strike: Global Offensive (CSGO) and League of Legends (LoL).

For CSGO, the approach starts analytically, producing statistical and probabilistic analyses of the games played. Later, the coordinate data (longitude, latitude) is used for movement prediction with an LSTM trained on the GPU.
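A sequence model such as an LSTM is usually fed fixed-length windows of past positions together with the next position as the target. The sketch below shows one way that windowing could look. It is not taken from the notebook: the function name `make_windows`, the window length, and the sample track are all hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # one (x, y) coordinate pair


def make_windows(track: List[Point], window: int) -> Tuple[List[List[Point]], List[Point]]:
    """Split one track into (past positions, next position) training pairs."""
    inputs, targets = [], []
    for start in range(len(track) - window):
        inputs.append(track[start:start + window])  # `window` consecutive past positions
        targets.append(track[start + window])       # the position that follows them
    return inputs, targets


# Hypothetical track of five positions for a single player.
track = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0), (3.0, 1.5), (4.0, 2.0)]
inputs, targets = make_windows(track, window=3)
```

Each entry of `inputs` would become one input sequence for the model, and the matching entry of `targets` the position it should learn to predict.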

For LoL, the approach differs because of how the dataset is composed. An analysis is done first, which then leads to a binary classification task: predicting which team wins.

CS-GO

We first look at the equipment value after buy time for each match, split by which side won.

- Counter-Terrorists tend to buy more expensive gear
- Terrorists might be saving depending on the round
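The split described above amounts to grouping records by the winning side and summarizing each side's equipment value. A minimal sketch of that grouping follows. The field names (`winner`, `ct_eq_val`, `t_eq_val`) and the numbers are hypothetical and only illustrate the shape of the computation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; field names and values are made up for illustration.
records = [
    {"winner": "CT", "ct_eq_val": 5200, "t_eq_val": 4100},
    {"winner": "CT", "ct_eq_val": 4800, "t_eq_val": 1500},
    {"winner": "T", "ct_eq_val": 3000, "t_eq_val": 4500},
    {"winner": "T", "ct_eq_val": 5000, "t_eq_val": 4700},
]


def mean_equipment_by_winner(rows):
    """Average equipment value of each side, grouped by which side won."""
    grouped = defaultdict(lambda: {"ct": [], "t": []})
    for row in rows:
        grouped[row["winner"]]["ct"].append(row["ct_eq_val"])
        grouped[row["winner"]]["t"].append(row["t_eq_val"])
    return {
        winner: {side: mean(values) for side, values in sides.items()}
        for winner, sides in grouped.items()
    }


summary = mean_equipment_by_winner(records)
```

Here `summary["CT"]["t"]` is the average Terrorist equipment value in the records that Counter-Terrorists won, which is the kind of figure that would show a saving pattern.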

Of all the matches analyzed, we find that Counter-Terrorists have a higher propensity to be the first attacker.

Terrorists have a tendency to spread around bomb site while Counter-Terrorists focus more on bomb site B and the center.

We test the difference in time between attacks for CT and T with a non-parametric statistical test, repeated 500 times at a 1% significance level. The majority of the repetitions generate values that are compatible with our data, so we do not find a statistically significant difference.
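The exact test is not named here. One common non-parametric choice is a permutation test on the difference in means, sketched below with 500 shuffles and a 1% threshold. Treat the choice of test, the function name `permutation_p_value`, and the sample timings as assumptions rather than the notebook's method.

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_permutations=500, seed=0):
    """Two-sided permutation test on the absolute difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel the observations at random
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical seconds between attacks for each side.
ct_gaps = [12.1, 15.3, 9.8, 14.2, 11.0, 13.5]
t_gaps = [11.7, 14.9, 10.4, 13.8, 12.2, 13.1]

p_value = permutation_p_value(ct_gaps, t_gaps)
significant = p_value < 0.01  # 1% significance level
```

A p-value above the threshold means the shuffled data regularly produces differences at least as large as the observed one, which matches the "no statistical difference" reading.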

League of Legends

For LoL, at first glance we find the following:

- Higher team champion level comes with more team minions killed
- More wards placed by the team also come with a higher team champion level
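Relationships like these can be quantified with a correlation coefficient. The sketch below computes a Pearson correlation by hand. The values are hypothetical, and the original analysis may have used a different measure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical per-team values at the 10-minute mark.
avg_champion_level = [6.2, 6.8, 7.0, 7.4]
minions_killed = [190, 205, 214, 230]

r = pearson(avg_champion_level, minions_killed)
```

A value of `r` close to 1 would indicate that the two quantities rise together. It does not by itself say which one drives the other.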

Teams with an extremely high kill rate tend to look very different in their first 10 minutes of the game:

- lower deaths than the non-anomaly teams
- higher kill assists
- higher kills
- more gold per minute
- higher total experience
- higher champion level
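How the high-kill teams were identified is not stated here. As one illustration, a simple z-score rule could flag them. The threshold of 2.0, the function name, and the kill counts below are all assumptions.

```python
from statistics import mean, pstdev


def flag_high_kill_teams(kills, threshold=2.0):
    """Flag teams whose kill count lies more than `threshold` std devs above the mean."""
    mu, sigma = mean(kills), pstdev(kills)
    if sigma == 0:
        return [False] * len(kills)  # no spread, so nothing stands out
    return [(k - mu) / sigma > threshold for k in kills]


# Hypothetical kills per team in the first 10 minutes.
kills = [5, 6, 4, 7, 5, 6, 5, 20]
flags = flag_high_kill_teams(kills)
anomaly_indices = [i for i, flagged in enumerate(flags) if flagged]
```

With the flags in hand, the other features (deaths, assists, gold per minute, and so on) can be averaged separately for flagged and unflagged teams to produce a comparison like the list above.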

We've got all the features for modeling

The best model is a logistic regression trained on our augmented dataset with the PCA embeddings. The second-best model is the Random Forest, a tree-based model.
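To make "augmented with PCA embeddings" concrete, the sketch below appends the first principal-component score to each feature row. It is a dependency-free illustration using power iteration, not the notebook's code, which most likely relied on a library. The function names and the toy rows are hypothetical, and only one component is computed.

```python
from math import sqrt


def first_principal_component(rows, n_iter=200):
    """Return (column means, unit vector of the top principal direction)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered features.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # starting guess; assumed not orthogonal to the top direction
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # no variance along the current direction
        v = [x / norm for x in w]
    return means, v


def augment_with_pc1(rows):
    """Append each row's score on the first principal component."""
    means, v = first_principal_component(rows)
    d = len(v)
    return [list(r) + [sum((r[j] - means[j]) * v[j] for j in range(d))]
            for r in rows]


# Toy feature rows lying exactly on one line.
features = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
means, direction = first_principal_component(features)
augmented = augment_with_pc1(features)
```

A classifier such as logistic regression would then be fit on `augmented` instead of the raw `features`.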


jakorostami/gaming_ds
