PlayerUnknown’s Battlegrounds Behavioral Analysis
This analysis uses match and death data from over 720,000 games of PUBG to study player behavior and to recommend ways of increasing player engagement with the game.
The Data
This project uses the PUBG Match Deaths and Statistics Dataset from Kaggle.
The dataset comprises player statistics from over 720,000 competitive matches, separated into two collections: Death Statistics and Aggregate Match Statistics.
Each collection is split into five CSV files of approximately 2 GB each, so I am not including the data in the GitHub repository. Everything can still be downloaded from the link above, and the code shows how to use PySpark to reduce the required disk space from over 20 GB to just under 6 GB, so that everything fits more comfortably into memory.
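As a rough illustration of the kind of PySpark step that can shrink the raw CSVs, the sketch below reads each CSV and rewrites it as Parquet, a compressed columnar format, optionally keeping only a subset of columns. This is a minimal sketch under stated assumptions and not necessarily the method used in this repository: the helper names `parquet_path_for`, `csv_to_parquet`, and `main`, and the `data/deaths` and `parquet/deaths` directories, are hypothetical, and the actual size reduction depends on the data and the columns kept.

```python
import glob
from pathlib import Path


def parquet_path_for(csv_path, out_dir):
    """Map a CSV path to a Parquet output path inside out_dir (hypothetical helper)."""
    return (Path(out_dir) / (Path(csv_path).stem + ".parquet")).as_posix()


def csv_to_parquet(spark, csv_path, out_dir, columns=None):
    """Read one CSV with Spark and rewrite it as Parquet.

    `columns` is an optional list of column names to keep; dropping unused
    columns is one simple way to cut the size further.
    """
    df = spark.read.csv(csv_path, header=True, inferSchema=True)
    if columns:
        df = df.select(*columns)
    out_path = parquet_path_for(csv_path, out_dir)
    df.write.mode("overwrite").parquet(out_path)
    return out_path


def main():
    # Imported here so the helpers above can be used without a Spark install.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("pubg-compact").getOrCreate()
    # Assumption: the downloaded death CSVs were unpacked into data/deaths/.
    for csv_path in sorted(glob.glob("data/deaths/*.csv")):
        csv_to_parquet(spark, csv_path, "parquet/deaths")
    spark.stop()
```

Calling `main()` after downloading the data would convert each file in turn; the resulting Parquet files can then be loaded back with `spark.read.parquet(...)`.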

Video Presentation of the Process and Results
Author
Xander Hieken
Acknowledgements
- Keven Pei’s PUBG Match Deaths and Statistics dataset available on Kaggle



