Netflix Exploration

Sep 11, 2021

Netflix EDA

In these uniquely challenging times, Netflix has proven to be a great source of entertainment. The platform recorded a 23 percent increase in the number of paid members during the final quarters of 2020 compared with the same period a year earlier.

Here is an exploratory data analysis done on the ‘netflix-shows’ dataset.

Conclusion: As we can see, 30.9% of the titles in the Netflix dataset are TV shows and 69.1% are movies.
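A split like this can be computed with a normalized value count. The snippet below is a minimal sketch on made-up rows, assuming the dataset has a `type` column holding "Movie" or "TV Show"; it does not reproduce the figures above.

```python
import pandas as pd

# Toy rows for illustration only; the real dataset is not reproduced here.
# Assumption: content type is stored in a column named "type".
df = pd.DataFrame({"type": ["Movie", "Movie", "TV Show", "Movie"]})

# Share of each content type as a percentage of all titles.
type_share = df["type"].value_counts(normalize=True).mul(100).round(1)
print(type_share)
```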

Now let’s look into categorization based upon other factors.

Conclusion: After dividing the dataset by country of production, we see that the United States holds the largest share, about half of the total content. India comes second with a share of about 14 percent. (Note: NaN values in the country_main column were filled with the most frequent value, i.e. the mode, since a mean is not defined for a text column. This filling inflates the share of the United States and should be counted as a factor in its dominance.)
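The effect of filling missing countries can be checked by comparing the shares before and after the fill. The following is a rough sketch on made-up rows. It assumes that country_main is taken as the first entry of a comma-separated `country` column and that gaps are filled with the most frequent value, since a numeric mean is not defined for country names; both are assumptions, not details confirmed by the analysis.

```python
import pandas as pd

# Toy rows for illustration only.
df = pd.DataFrame(
    {"country": ["United States, Canada", "India", None, "United States", "Japan"]}
)

# Assumption: country_main is the first country listed for a title.
df["country_main"] = df["country"].str.split(",").str[0].str.strip()

# Fill missing values with the most frequent country (the mode).
most_common = df["country_main"].mode()[0]
df["country_main_filled"] = df["country_main"].fillna(most_common)

# Percentage share per country, ignoring NaN versus after filling.
share_raw = df["country_main"].value_counts(normalize=True).mul(100)
share_filled = df["country_main_filled"].value_counts(normalize=True).mul(100)

print(share_raw.round(1))
print(share_filled.round(1))
```

On these toy rows the most frequent country gains share after the fill, which is the bias the note warns about. Reporting the missing rows as a separate "Unknown" category would avoid it.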

Conclusion: The above plot shows the distribution of Movies and TV Shows separately, based on the year each title was added. There is a noticeable increase in the number of titles added over the years. For movies, 2019 records the highest number of additions, while for TV shows the peak is 2020.
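Counts per year and type like those plotted above can be derived by parsing the date each title was added. This is a sketch on made-up rows, assuming a `date_added` column stored as text in the form "September 9, 2019"; the column name and the date format are assumptions.

```python
import pandas as pd

# Toy rows for illustration only.
df = pd.DataFrame(
    {
        "type": ["Movie", "Movie", "TV Show", "TV Show", "Movie"],
        "date_added": [
            "September 9, 2019",
            "January 1, 2019",
            "August 14, 2020",
            " March 3, 2020",  # stray leading space, stripped below
            None,              # missing date, becomes NaT
        ],
    }
)

# Parse the text dates; unparseable or missing values become NaT.
added = pd.to_datetime(
    df["date_added"].str.strip(), format="%B %d, %Y", errors="coerce"
)
df["year_added"] = added.dt.year

# Drop rows without a date, then count titles per type and year.
valid = df.dropna(subset=["year_added"]).astype({"year_added": int})
per_year = valid.groupby(["type", "year_added"]).size().unstack(fill_value=0)

# Year with the most additions for each content type.
peak_year = per_year.idxmax(axis=1)
print(per_year)
print(peak_year)
```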

Conclusion: The above plot shows content distribution by target age. The size of each element in the plot is relative to the number of titles added in the given year. As the plot makes clear, the number of titles grew sharply from 2012 to 2015, and again from 2015 to 2020.

Conclusion: The above histogram represents the content added with respect to the country_main column.

Conclusion: The above plot shows the percentage of movies added by target age. As one can see, the Adults group accounts for the largest share, at 43.8%.

Conclusion: The above plot shows the percentage of TV series added by target age. As one can see, the Adults group accounts for the largest share, at 42.5%.
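A target-age breakdown of this kind is typically built by mapping each maturity rating to an audience group and then taking row-wise percentages per content type. The sketch below uses made-up rows; the `rating` column, the `target_age` name, and the mapping itself are illustrative assumptions and may differ from the grouping used for the plots.

```python
import pandas as pd

# Hypothetical mapping from maturity rating to audience group.
RATING_TO_AGE = {
    "TV-Y": "Kids", "TV-G": "Kids", "G": "Kids",
    "TV-Y7": "Older Kids", "PG": "Older Kids", "TV-PG": "Older Kids",
    "PG-13": "Teens", "TV-14": "Teens",
    "R": "Adults", "TV-MA": "Adults", "NC-17": "Adults",
}

# Toy rows for illustration only.
df = pd.DataFrame(
    {
        "type": ["Movie", "Movie", "Movie", "Movie", "TV Show", "TV Show"],
        "rating": ["R", "TV-MA", "PG-13", "G", "TV-MA", "TV-14"],
    }
)
df["target_age"] = df["rating"].map(RATING_TO_AGE)

# Percentage of each audience group within each content type.
age_share = (
    pd.crosstab(df["type"], df["target_age"], normalize="index").mul(100).round(1)
)
print(age_share)
```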

Conclusion: The above distplot is based upon the release_year of the movies/shows.

Conclusion: The above plot is based upon the duration of Movies. As we can see, the most common movie duration is 90 minutes.

Conclusion: The above plot is based upon the number of seasons of TV series. As we can see, most TV series have only 1 season. Shows discontinued due to low popularity, or shows whose next season is still pending release, might be contributing factors.
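Both duration plots rely on a numeric value for each title. A minimal sketch of one way to get it, assuming a single text column `duration` that holds values such as "90 min" for movies and "1 Season" or "3 Seasons" for TV series (the column layout is an assumption, and the rows are made up):

```python
import pandas as pd

# Toy rows for illustration only.
df = pd.DataFrame(
    {
        "type": ["Movie", "Movie", "Movie", "TV Show", "TV Show", "TV Show"],
        "duration": ["90 min", "90 min", "124 min", "1 Season", "1 Season", "3 Seasons"],
    }
)

# Pull the leading number out of the text: minutes for movies, seasons for series.
df["duration_value"] = df["duration"].str.extract(r"(\d+)", expand=False).astype(int)

movies = df[df["type"] == "Movie"]
shows = df[df["type"] == "TV Show"]

# Most frequent exact value in each group.
most_common_minutes = movies["duration_value"].mode()[0]
most_common_seasons = shows["duration_value"].mode()[0]
print(most_common_minutes, most_common_seasons)
```

The mode here is the single most frequent exact value, whereas a histogram reports the fullest bin, so the two can differ slightly depending on the bin width.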

Conclusion: Here is another pie plot, depicting the percentage of TV series by country of origin. As in the earlier breakdown, the United States holds the largest share, at approximately 44%.

Conclusion: Here is another pie plot, depicting the percentage of Movies by country of origin. As in the earlier breakdown, the United States holds the largest share, at approximately 43.3%.