Overview
With many large companies now producing original video content, Microsoft has decided to join them by creating a new movie studio, but it does not have the knowledge needed to create video content. Microsoft has assigned me the task of working out what steps it should take to enter this field. I was provided with several data files to analyze, and I use my findings to give the head of Microsoft’s new movie studio recommendations for succeeding in movie creation.

Business Problem
Microsoft wants to start creating original video content but does not know enough about movie creation to move forward with its plan.

Objectives
Microsoft has the following objectives:
• Identify which movie genres perform best in the dataset and attract the most public attention.
• Determine the best time to release a movie.
• Identify which director is associated with the most popular genre.
Working with several data frames read from box office data helped uncover patterns and relationships that support better business decisions. Data mining helps spot movie trends across various attributes, develop smarter methods for movie creation, and predict movie performance more accurately.

METHOD: CRISP-DM
I follow the CRISP-DM process for this task. The CRoss Industry Standard Process for Data Mining (CRISP-DM) is a process model that serves as the base for a data science process. It defines six sequential phases, of which this project covers the first four:
• Business understanding – venturing into movie production.
• Data understanding – the data came from top movie websites and was provided in advance.
• Data preparation – cleaning the data, removing unwanted columns, removing outliers, and converting columns to the preferred data types.
• Modeling – visualization with matplotlib.
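
The data preparation and modeling phases can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the sample rows, the column names (`genre`, `release_date`, `gross`, `notes`), the helper names, and the choice of a 1.5 × IQR rule for outliers are all assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display; drop this in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample standing in for the provided data files.
raw = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E", "F"],
    "genre": ["Action", "Action", "Drama", "Drama", "Comedy", "Comedy"],
    "release_date": ["2015-06-12", "2016-07-01", "2015-11-20",
                     "2017-12-15", "2016-02-05", "2018-06-22"],
    "gross": ["$120,000,000", "$95,000,000", "$40,000,000",
              None, "$60,000,000", "$9,000,000,000"],
    "notes": ["", "", "", "", "", ""],
})


def prepare_movies(df):
    """Clean the raw frame: drop columns, fix types, remove outliers."""
    out = df.drop(columns=["notes"])           # remove an unwanted column
    out = out.dropna(subset=["gross"]).copy()  # drop rows with no gross figure

    # Convert to the preferred data types.
    out["gross"] = out["gross"].str.replace(r"[$,]", "", regex=True).astype(float)
    out["release_date"] = pd.to_datetime(out["release_date"])
    out["release_month"] = out["release_date"].dt.month

    # Remove outliers with the 1.5 * IQR rule (an assumed choice).
    q1, q3 = out["gross"].quantile([0.25, 0.75])
    iqr = q3 - q1
    in_range = out["gross"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out[in_range].reset_index(drop=True)


def plot_mean_gross_by_genre(df):
    """Draw a bar chart of mean gross per genre and return the Axes."""
    means = df.groupby("genre")["gross"].mean().sort_values(ascending=False)
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar(means.index, means.values)
    ax.set_xlabel("Genre")
    ax.set_ylabel("Mean gross (USD)")
    ax.set_title("Mean gross by genre")
    fig.tight_layout()
    return ax


movies = prepare_movies(raw)
```

Calling `plot_mean_gross_by_genre(movies)` returns a matplotlib `Axes`; `ax.figure.savefig("gross_by_genre.png")` would write the chart to disk. The same grouping pattern applied to `release_month` would address the release-timing objective.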