Breaking down YouTube trending videos with R — Part II

December 25, 2020 by systems

Trifunovic Uros

In Part I of the analysis, we looked at the relationship between a video's view count and its number of likes, dislikes, and comments, and fitted two different linear models based on the findings. However, those models do not tell the entire story. Therefore, we next focus on categorical variables such as tags, channel, and category. Lastly, we take a look at the time when the videos were published.

YouTube tags are words and phrases that give YouTube context about a video. They are an important ranking factor in YouTube’s search algorithm. The word cloud shows that content creators repeatedly used tags like “funny” and “comedy”, suggesting that entertainment videos dominate the trending ones. The observation makes sense, as videos from this category represent 24.33% of the dataset.

Next, we expand the dataset with one logical column for each of the 500 most frequently used tags. The value in a column is “TRUE” if the tag is among the video's tags and “FALSE” otherwise. Then, we fit a Random Forest model that uses these tag columns, together with the other selected categorical variables, as predictors.
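As a rough illustration of the tag-expansion step (not the model formula, and not the article's actual code), the logical columns could be built in base R along these lines. The toy data, the pipe-separated tag format, and the `tag_` column prefix are assumptions made for the sketch.

```r
# Toy data; the pipe-separated tag format is an assumption, not taken from the article.
videos <- data.frame(
  title = c("a", "b", "c", "d"),
  tags  = c("funny|comedy", "music|live", "funny|prank", "comedy|funny"),
  stringsAsFactors = FALSE
)

top_n <- 2  # the article uses the top 500; 2 keeps the toy example readable

# Split each tag string into a character vector of lower-cased tags.
tag_list <- strsplit(tolower(videos$tags), "|", fixed = TRUE)

# Count each tag at most once per video, then keep the most frequent ones.
tag_counts <- sort(table(unlist(lapply(tag_list, unique))), decreasing = TRUE)
top_tags   <- names(tag_counts)[seq_len(min(top_n, length(tag_counts)))]

# Add one logical column per top tag: TRUE if the video carries that tag.
for (tag in top_tags) {
  videos[[paste0("tag_", tag)]] <- vapply(tag_list, function(x) tag %in% x, logical(1))
}

print(videos)
```

With real data, tag names containing spaces or punctuation would probably need cleaning (for example with `make.names()`) before they can be used as column names in a model formula.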

Fitted with 500 trees and using the selected categorical variables as predictors, the Random Forest model has an R-squared value of 0.76, which is slightly worse than that of the two linear models. However, the variable importance chart reveals that certain channels tend to get more views, probably because of a large subscriber base. The chart further confirms the popularity of entertainment and music videos.
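For readers who want to try a comparable fit, here is a minimal sketch that assumes the `randomForest` package is installed. The simulated data, the column names, and the formula are all hypothetical stand-ins; only the choice of 500 trees comes from the text above.

```r
set.seed(1)  # reproducible toy data

# Simulated stand-in for the real dataset; every column name is hypothetical.
n <- 300
videos <- data.frame(
  category   = factor(sample(c("Entertainment", "Music", "Sports"), n, replace = TRUE)),
  tag_funny  = factor(sample(c(TRUE, FALSE), n, replace = TRUE)),
  tag_comedy = factor(sample(c(TRUE, FALSE), n, replace = TRUE))
)
videos$views <- 1000 +
  500 * (videos$category == "Music") +
  300 * (videos$tag_funny == "TRUE") +
  rnorm(n, sd = 50)

# Fit only if the package is available, so the script runs either way.
if (requireNamespace("randomForest", quietly = TRUE)) {
  fit <- randomForest::randomForest(
    views ~ category + tag_funny + tag_comedy,  # illustrative formula, not the article's
    data       = videos,
    ntree      = 500,   # matches the 500 trees mentioned in the text
    importance = TRUE   # also store permutation importance
  )
  print(tail(fit$rsq, 1))               # pseudo R-squared after the last tree
  print(randomForest::importance(fit))  # one row per predictor
}
```

`randomForest::varImpPlot(fit)` would draw a variable importance chart from the same fitted object.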

Next, we take a look at the time the trending videos were published to see if there is a preferable time of day or even an hour for publishing a video to get more views. After splitting the day into four categories, we see that most videos are published in the afternoon hours, i.e. between 12:01 pm and 6 pm UTC.
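The four-way split of the day can be sketched in base R with `cut()`. Only the afternoon window (12:01 pm to 6 pm UTC) is stated in the text; the other three boundaries, the labels, and the timestamp format are assumptions.

```r
# Hypothetical publish timestamps in UTC; the ISO-like format is an assumption.
published <- c("2020-08-11T00:00:00", "2020-08-11T12:00:00", "2020-08-11T12:01:00",
               "2020-08-11T18:00:00", "2020-08-11T18:01:00")

ts <- as.POSIXct(published, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

# Minutes since midnight, so that 12:00 and 12:01 fall into different bins.
minutes <- as.integer(format(ts, "%H")) * 60L + as.integer(format(ts, "%M"))

# Right-closed six-hour bins: 12:01-18:00 is "afternoon", as in the text.
part_of_day <- cut(
  minutes,
  breaks = c(0, 360, 720, 1080, 1440),
  labels = c("night", "morning", "afternoon", "evening"),
  include.lowest = TRUE  # puts 00:00 into the first bin
)

print(table(part_of_day))
```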

Getting more granular by looking at the number of videos published by the hour, we observe that the time period between 3 pm and 5 pm UTC is particularly popular for publishing YouTube videos that end up trending.

Looking at the most popular publishing hour by category reveals that Music videos are most often uploaded to YouTube at 5 am UTC.
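The hourly counts and the per-category peak hour can be computed with `table()` and `tapply()`. The sketch below uses invented toy rows purely to show the mechanics; the column names are hypothetical and the numbers are not taken from the dataset.

```r
# Invented toy rows; `hour` is the publish hour in UTC (0-23).
videos <- data.frame(
  category = c("Music", "Music", "Music",
               "Entertainment", "Entertainment", "Entertainment", "Sports"),
  hour     = c(5L, 5L, 16L, 15L, 16L, 16L, 17L)
)

# Videos per hour, including hours with zero uploads.
videos_per_hour <- table(factor(videos$hour, levels = 0:23))
busiest_hour <- as.integer(names(videos_per_hour)[which.max(videos_per_hour)])

# Most frequent hour in a vector; ties resolve to the earliest hour.
most_common <- function(h) {
  counts <- table(h)
  as.integer(names(counts)[which.max(counts)])
}

top_hour_by_category <- tapply(videos$hour, videos$category, most_common)

print(busiest_hour)
print(top_hour_by_category)
```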

We also split the channels into four categories based on the aggregate number of views on their videos to see whether channel size plays a role in determining the time to publish a video. The chart shows that it does not.
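One way to form four channel-size groups is to sum views per channel and cut the totals at their quartiles. The text does not say how the four categories were defined, so the quartile rule, the labels, and the toy data below are assumptions.

```r
# Invented toy rows: one row per video.
videos <- data.frame(
  channel = c("A", "A", "B", "C", "C", "D", "E", "F", "G", "H"),
  views   = c(10, 20, 50, 40, 60, 200, 400, 800, 1600, 3200)
)

# Aggregate views per channel.
channel_views <- aggregate(views ~ channel, data = videos, FUN = sum)

# Quartile boundaries of the per-channel totals (assumed grouping rule).
breaks <- quantile(channel_views$views, probs = seq(0, 1, 0.25))

channel_views$size <- cut(
  channel_views$views,
  breaks = breaks,
  labels = c("small", "medium", "large", "very large"),
  include.lowest = TRUE  # keeps the smallest channel in the first group
)

# Attach the size group to every video of the channel.
videos <- merge(videos, channel_views[, c("channel", "size")], by = "channel")

print(channel_views)
```

If many channels share the same total, the quartile boundaries can coincide and `cut()` will reject the duplicated breaks; in that case the grouping rule would need adjusting.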

Finally, we fit another Random Forest model based on the observations with the following formula:

The aggregate Random Forest model outperforms all the other models, with an R-squared value of 0.98. The variable importance chart confirms that user engagement is important for increasing a video's view count. Additionally, entertainment, music, and sports videos are likely to get more views. Lastly, even though the majority of the videos are published in the afternoon hours, the ones published at night seem to perform better.

The model comparison table summarizes the accuracy of the four models.
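The models are compared by R-squared throughout. The article does not state exactly how the value was computed for each model, so the helper below is only the textbook definition (one minus the ratio of residual to total sum of squares), checked against `lm()` on toy numbers.

```r
# Textbook R-squared: 1 - SS_residual / SS_total.
r_squared <- function(actual, predicted) {
  ss_res <- sum((actual - predicted)^2)
  ss_tot <- sum((actual - mean(actual))^2)
  1 - ss_res / ss_tot
}

# Toy data, not from the dataset.
x <- 1:5
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

fit <- lm(y ~ x)

# For a linear model with an intercept, both values agree.
print(r_squared(y, fitted(fit)))
print(summary(fit)$r.squared)

# Predicting the mean for every observation gives an R-squared of 0.
print(r_squared(y, rep(mean(y), length(y))))
```

The same helper can be applied to the predictions of any model, which makes it a convenient common yardstick when comparing linear models with Random Forests.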

In summary, the analysis shows that, up to a certain point, even dislikes on a video help increase its view count. As expected, more likes lead to more views. The rising comment count, however, eventually stops contributing to the number of views. Furthermore, some videos do exceptionally well, and those are often entertainment and music videos. Although the afternoon is the most popular time of day to publish, videos uploaded overnight are more important for predicting the view count. A next step could be to narrow the focus to the outperforming videos and explore the driving forces behind their performance.

Link to the dataset: YouTube trending videos dataset

Link to the code: YouTube trending videos analysis

