Analysing the YouTube channel of Andrew Huberman to extract keywords for developing high-performing content.
Data APIs are a great source of data for analytics projects. In this README, I walk you step by step through retrieving video data and channel data using the YouTube Data API,
as well as techniques for visualizing the obtained data.
We start this project by creating a YouTube API key, which will be our credential for accessing YouTube data. Go to the Google Developers Console and sign in with your Google account, then create a new project, enable the YouTube Data API, and create a new API key. Copy this API key and paste it into your code as a variable.
Go to the link below and look through the YouTube API documentation for the code snippet that looks up a channel by its ID: https://developers.google.com/youtube/v3/docs/channels/list?apix=true
To obtain the channel ID of a specific channel, open the channel's page on YouTube and follow the steps below.
Press Ctrl+U to open the page source in a new tab.
Press Ctrl+F and search for itemprop="channelId".
Copy the value of the content attribute on the tag that carries itemprop="channelId".
Store that value in a variable in your code.
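The manual lookup above can also be sketched in code. This is a minimal illustration that assumes you have already saved the page source as a string and that it contains a tag with itemprop="channelId", as described in the steps. The names ChannelIdParser and find_channel_id and the sample HTML are hypothetical, and YouTube's page markup may change over time.

```python
from html.parser import HTMLParser


class ChannelIdParser(HTMLParser):
    """Records the content attribute of the first tag with itemprop="channelId"."""

    def __init__(self):
        super().__init__()
        self.channel_id = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if self.channel_id is None and attr_map.get("itemprop") == "channelId":
            self.channel_id = attr_map.get("content")


def find_channel_id(page_source):
    """Return the channel ID found in the page source, or None if absent."""
    parser = ChannelIdParser()
    parser.feed(page_source)
    return parser.channel_id


# Made-up page fragment; the ID below is a placeholder, not a real channel.
sample_source = (
    '<html><head><meta itemprop="channelId" content="UC_PLACEHOLDER_ID">'
    "</head></html>"
)
channel_id = find_channel_id(sample_source)
```

Using an HTML parser rather than a plain text search keeps the lookup working even if the attributes appear in a different order.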
Install "google-api-python-client" (the Google Python package required to access YouTube API data). We will also install pandas, seaborn, and Matplotlib.
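As a rough sketch of the setup so far, the snippet below shows the install command as a comment and one way to create the API client from the key. The names API_KEY and build_youtube_client are placeholders chosen for this example; replace the key string with your own.

```python
# Install the packages once from a shell:
#   pip install google-api-python-client pandas seaborn matplotlib

API_KEY = "YOUR_API_KEY"  # placeholder: paste the key created in the console


def build_youtube_client(api_key):
    """Create a client object for version 3 of the YouTube Data API."""
    # Imported here so this sketch can be loaded before the package is installed.
    from googleapiclient.discovery import build

    return build("youtube", "v3", developerKey=api_key)


# Usage (needs a valid key and network access):
# youtube = build_youtube_client(API_KEY)
```

Avoid committing a real key to a public repository; reading it from an environment variable is a common alternative.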
The YouTube Data API returns all data in JSON form, as shown in the pictorial representation below.
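To make the JSON layout concrete, here is a trimmed, illustrative picture of a channel response written as a Python dictionary. The nesting follows the channels resource in the API documentation, but every value is a made-up placeholder and real responses contain many more fields.

```python
# Illustrative, trimmed shape of a channels.list response.
# All values are placeholders, not real data.
sample_response = {
    "kind": "youtube#channelListResponse",
    "items": [
        {
            "id": "UC_PLACEHOLDER_ID",
            "snippet": {"title": "Example Channel"},
            # Counts arrive as strings, not numbers.
            "statistics": {"viewCount": "1000", "videoCount": "10"},
            "contentDetails": {
                "relatedPlaylists": {"uploads": "UU_PLACEHOLDER_ID"}
            },
        }
    ],
}

# Each result lives in the "items" list; fields are reached by nested keys.
first_item = sample_response["items"][0]
channel_name = first_item["snippet"]["title"]
uploads_playlist = first_item["contentDetails"]["relatedPlaylists"]["uploads"]
```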
We extract channel details from YouTube, such as the channel name and playlistId. Below is an image of the code snippet for the function that extracts channel details.
We load this data into a pandas DataFrame and then store the obtained playlistId in a variable named playlist_id.
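Since the snippet is shown only as an image, here is a minimal text sketch of what such a function might look like. It is not necessarily the exact code in the image. It assumes youtube is a client created with google-api-python-client, and the function name and dictionary keys are choices made for this example.

```python
def get_channel_details(youtube, channel_id):
    """Return the channel name and uploads playlist ID for one channel."""
    response = youtube.channels().list(
        part="snippet,contentDetails",
        id=channel_id,
    ).execute()

    # Assumes the channel ID is valid, so "items" has exactly one entry.
    item = response["items"][0]
    return {
        "channel_name": item["snippet"]["title"],
        "playlist_id": item["contentDetails"]["relatedPlaylists"]["uploads"],
    }


# Usage (needs a real client and channel ID):
# import pandas as pd
# channel_df = pd.DataFrame([get_channel_details(youtube, channel_id)])
# playlist_id = channel_df.loc[0, "playlist_id"]
```

The uploads playlist is the one that lists every public video on the channel, which is why its ID is the one kept for the next step.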
Next, we build the logic to extract video IDs from the playlistId of a particular channel. Below is an image of the code snippet for the function that extracts videoIds.
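A text sketch of this step is given below, under the assumption that youtube is a google-api-python-client client and playlist_id is the uploads playlist. The function name is hypothetical. The API returns at most 50 playlist items per request, so the loop follows nextPageToken until no further page is reported.

```python
def get_video_ids(youtube, playlist_id):
    """Collect every video ID in a playlist, following result pages."""
    video_ids = []
    next_page_token = None

    while True:
        params = {
            "part": "contentDetails",
            "playlistId": playlist_id,
            "maxResults": 50,  # largest page size the API allows
        }
        if next_page_token is not None:
            params["pageToken"] = next_page_token

        response = youtube.playlistItems().list(**params).execute()

        for item in response.get("items", []):
            video_ids.append(item["contentDetails"]["videoId"])

        # A missing nextPageToken means this was the last page.
        next_page_token = response.get("nextPageToken")
        if next_page_token is None:
            break

    return video_ids


# Usage (needs a real client): video_ids = get_video_ids(youtube, playlist_id)
```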
We extract details such as the video title, video description, total number of views, and total number of likes for each video, then load these details into a pandas DataFrame.
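One way this extraction could look is sketched below. It assumes youtube is a google-api-python-client client and video_ids is a list of ID strings. The function name and column names are choices made for this example. The videos endpoint accepts up to 50 IDs per request, so the IDs are sent in batches.

```python
def get_video_details(youtube, video_ids):
    """Return one dictionary per video with title, description, views, likes."""
    rows = []
    for start in range(0, len(video_ids), 50):
        batch = video_ids[start:start + 50]
        response = youtube.videos().list(
            part="snippet,statistics",
            id=",".join(batch),
        ).execute()

        for video in response.get("items", []):
            stats = video.get("statistics", {})
            rows.append({
                "title": video["snippet"]["title"],
                "description": video["snippet"]["description"],
                # .get() leaves None where a count is missing,
                # for example when likes are hidden.
                "views": stats.get("viewCount"),
                "likes": stats.get("likeCount"),
            })
    return rows


# Usage (needs a real client):
# import pandas as pd
# video_df = pd.DataFrame(get_video_details(youtube, video_ids))
```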
To use this data for visualization, we need to convert the like and view counts into numeric form.
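The conversion is needed because the API delivers counts as strings. A small sketch with made-up rows follows. The column names views and likes are assumptions for this example, and errors="coerce" is one possible choice that turns unparseable or missing values into NaN instead of raising an error.

```python
import pandas as pd

# Made-up rows standing in for the real video data.
video_df = pd.DataFrame({
    "title": ["Video A", "Video B", "Video C"],
    "views": ["1500", "320", None],
    "likes": ["75", "12", "3"],
})

# Convert the count columns from strings to numbers.
for column in ["views", "likes"]:
    video_df[column] = pd.to_numeric(video_df[column], errors="coerce")
```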
The main purpose of exploratory data analysis (EDA) is to look at the data before making any assumptions. It can help identify obvious errors, reveal patterns within the data, detect outliers or anomalous events, and find interesting relations among the variables.
We use a scatter plot to visualize the relation between view count and like count.
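A minimal sketch of such a plot with seaborn is shown below. It assumes a DataFrame with numeric columns named views and likes. The function name, labels, and figure size are illustrative choices, not the exact code behind the original figure.

```python
def plot_views_vs_likes(video_df, output_path=None):
    """Draw a scatter plot of like count against view count."""
    # Imported here so this sketch can be loaded before the
    # plotting packages are installed.
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(8, 5))
    sns.scatterplot(data=video_df, x="views", y="likes", ax=ax)
    ax.set_xlabel("View count")
    ax.set_ylabel("Like count")
    ax.set_title("Views vs. likes per video")

    # Optionally write the figure to disk, e.g. "views_vs_likes.png".
    if output_path is not None:
        fig.savefig(output_path, bbox_inches="tight")
    return ax


# Usage: plot_views_vs_likes(video_df, "views_vs_likes.png")
```

Points that sit far from the main cloud are candidates for the outliers mentioned above.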












