Microsoft sees all the big companies creating original video content and they want to get in on the fun. They have decided to create a new movie studio, but they don’t know anything about creating movies. You are charged with exploring what types of films are currently doing the best at the box office. You must then translate those findings into actionable insights that the head of Microsoft's new movie studio can use to help decide what type of films to create.
Since Microsoft is new to the filmmaking industry, it has no data of its own. We use data collected from various sources. The data includes movie ratings and basic movie information, each associated with a movie ID. The datasets provide a lot of information, including domestic and foreign gross, movie budgets, and movie genres. Each piece of data went through the following data mining process:

1. Data understanding
2. Data cleaning and preparation
3. Data visualization
4. Recommendation

For this project I used the following data sources to arrive at a conclusion:

- IMDb
- The Numbers
- Box Office Mojo
From this data we were able to see the popularity of movies around the globe, and we suggest that Microsoft consider the top 10 most popular movies.
Based on popularity, Avengers: Infinity War was the most popular movie among viewers, so a film of this kind is expected to be a hit for Microsoft's studio as well.
We also looked at the data from The Numbers to come to a decision. We based our decision on the profit generated by movies globally:

Profit = Worldwide gross - Production cost

We used worldwide data because we are targeting viewers around the globe and not just locally. The result was the top 10 movies that generated the most profit, which I believe Microsoft's studio should look into. Below we have profit against movie title.
Avatar has done exceptionally well, generating nearly 2.5 billion USD in profit, and I would recommend that Microsoft's studio look into it, not forgetting the others in the top 10, such as Titanic.
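The profit ranking described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's actual code: the field names (`title`, `production_budget`, `worldwide_gross`) and the sample rows are assumptions made for the example, and the real analysis reads the figures from The Numbers dataset.

```python
# Hypothetical sample rows; titles and amounts are made up for illustration.
movies = [
    {"title": "Movie A", "production_budget": 100, "worldwide_gross": 450},
    {"title": "Movie B", "production_budget": 200, "worldwide_gross": 300},
    {"title": "Movie C", "production_budget": 50, "worldwide_gross": 500},
]


def top_by_profit(rows, n=10):
    """Return the n rows with the highest profit, each with a 'profit' key added.

    Profit = worldwide gross - production cost.
    """
    with_profit = [
        {**row, "profit": row["worldwide_gross"] - row["production_budget"]}
        for row in rows
    ]
    # Sort from most to least profitable and keep the first n rows.
    return sorted(with_profit, key=lambda r: r["profit"], reverse=True)[:n]


for row in top_by_profit(movies, n=2):
    print(row["title"], row["profit"])
```

With the full dataset, calling `top_by_profit(rows)` with the default `n=10` would give the ten titles to plot against profit.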
Data from Box Office Mojo (BOM) contained both domestic gross and worldwide gross. We avoided the worldwide figures because they had many missing values, which would have distorted the results. Instead, we used domestic gross to calculate the gross generated per year, since it has a correlation of 0.79 with worldwide gross. A correlation of 0.79 indicates a strong positive linear relationship: movies with a higher domestic gross tend to have a proportionally higher worldwide gross, so domestic gross is a reasonable stand-in for worldwide gross. We used 2023 as the reference year.
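The proxy check described above (correlating domestic gross with worldwide gross on the rows where both are present) could look roughly like this. It is a sketch under stated assumptions: the field names `domestic_gross` and `worldwide_gross` and the sample values are hypothetical, and the Pearson coefficient is computed by hand here only to keep the example dependency-free.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_on_complete_rows(rows):
    """Correlate domestic and worldwide gross, skipping rows with a missing value."""
    pairs = [
        (r["domestic_gross"], r["worldwide_gross"])
        for r in rows
        if r["domestic_gross"] is not None and r["worldwide_gross"] is not None
    ]
    xs, ys = zip(*pairs)
    return pearson(xs, ys)


# Hypothetical rows; None marks a missing worldwide figure.
sample = [
    {"domestic_gross": 1.0, "worldwide_gross": 2.0},
    {"domestic_gross": 2.0, "worldwide_gross": 4.0},
    {"domestic_gross": 3.0, "worldwide_gross": 6.0},
    {"domestic_gross": 4.0, "worldwide_gross": None},
]
print(correlation_on_complete_rows(sample))
```

A value close to 1 on the complete rows is what justifies using domestic gross in place of the sparsely populated worldwide column.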
Gross generated per year = Domestic gross / Number of years

Number of years = 2023 - Release year
From this we were able to see how much every movie generates per year. The movie that stood out was Black Panther, generating nearly 150 million USD per year.
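The per-year calculation can be written as a small helper. This is an illustrative sketch rather than the project's code: the function name `gross_per_year` and the sample figures are assumptions. One design point worth noting is that a movie released in the reference year would give zero years and a division by zero, so the sketch rejects that case explicitly.

```python
REFERENCE_YEAR = 2023  # the reference year used in the analysis


def gross_per_year(domestic_gross, release_year, reference_year=REFERENCE_YEAR):
    """Domestic gross divided by the number of years since release."""
    years = reference_year - release_year
    if years <= 0:
        # Avoid dividing by zero (or a negative span) for very recent releases.
        raise ValueError("release_year must be earlier than reference_year")
    return domestic_gross / years


# Hypothetical figures: 500 units of gross, released 5 years before 2023.
print(gross_per_year(500, 2018))
```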