Dialing for Videos: A Random Sample of YouTube

Authors

  • Ryan McGrady University of Massachusetts Amherst
  • Kevin Zheng University of Massachusetts Amherst
  • Rebecca Curran University of Massachusetts Amherst
  • Jason Baumgartner Pushshift.io
  • Ethan Zuckerman University of Massachusetts Amherst

DOI:

https://doi.org/10.51685/jqd.2023.022

Keywords:

YouTube, random sampling, methods, social media, digital infrastructure

Abstract

YouTube is one of the largest, most important communication platforms in the world, but while there is a great deal of research about the site, many of its fundamental characteristics remain unknown. To better understand YouTube as a whole, we created a random sample of videos using a new method. Through a description of the sample’s metadata, we answer essential questions about the platform, such as the distribution of views, comments, likes, subscribers, and categories. Our method also allows us to estimate the total number of publicly visible videos on YouTube and how that number has grown over time. To learn more about video content, we hand-coded a subsample to answer questions such as how many videos are primarily music, video games, or still images. Finally, we processed the videos’ audio using language detection software to determine the distribution of spoken languages. In providing basic information about YouTube as a whole, we not only learn more about an influential platform, but also provide baseline context against which samples in more focused studies can be compared.

Published

2023-12-20

Section

Articles

How to Cite

McGrady, R., Zheng, K., Curran, R., Baumgartner, J., & Zuckerman, E. (2023). Dialing for Videos: A Random Sample of YouTube. Journal of Quantitative Description: Digital Media, 3. https://doi.org/10.51685/jqd.2023.022