This repository has been archived by the owner on Oct 29, 2020. It is now read-only.
madprogramer edited this page Sep 25, 2020 · 10 revisions

ytcc-archive: Archiving YouTube's unpublished community contributions

What this project is all about

YouTube's community captions...

Why would YouTube do this?

A year ago there was a major controversy which led to YouTube restricting the feature. Because people were complaining about spam, YouTube made it so that only uploaders could publish submissions.

For a more exhaustive explanation click here!

YouTube announced that they were going to retire the feature on September 28, 2020. The decision was made because people had been complaining about the feature and YouTube did not want to reimplement it for the new editor.

Why should we archive these unpublished contributions?

The best scenario: uploaders approve, had previously complained about the feature

How a "worker" works

The tracker keeps track of the workers, and the workers collect information from videos: captions in review, title/description translations in review, and caption credits.
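The three kinds of data a worker collects can be pictured as one record per video. The following is only a minimal sketch of that idea, not the project's actual code: the names `VideoContributions`, `run_worker`, and `fake_collect` are hypothetical, and the scraping step is replaced by a stand-in that returns placeholder data.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class VideoContributions:
    """What a worker gathers for one video (field names are illustrative)."""
    video_id: str
    captions_in_review: List[str] = field(default_factory=list)
    translations_in_review: List[str] = field(default_factory=list)
    caption_credits: List[str] = field(default_factory=list)


def run_worker(
    video_ids: Iterable[str],
    collect: Callable[[str], VideoContributions],
) -> List[VideoContributions]:
    """Collect contributions for every video ID handed to this worker."""
    return [collect(video_id) for video_id in video_ids]


def fake_collect(video_id: str) -> VideoContributions:
    # Stand-in for the real collection step; returns placeholder data only.
    return VideoContributions(video_id=video_id, captions_in_review=["en"])


# In the real setup the IDs would come from the tracker; these are placeholders.
results = run_worker(["VIDEO_ID_1", "VIDEO_ID_2"], fake_collect)
```

Keeping the collection step as a function argument is a design choice of this sketch: it lets the loop be exercised without any network access.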

An "un-automated" version is available at https://github.com/Data-Horde/ytcc-exporter

Stats

https://atdash.meo.ws/d/attv2/archive-team-tracker-charts-v2?orgId=1&var-project=ext-yt-communitycontribs

https://tracker.archiveteam.org/ext-yt-communitycontribs/


Tutorial

Getting Started

To run these tools you will need to provide "session cookies". You can think of this as a lazy way of logging in to YouTube:

  • In a new/guest/Incognito browser profile, create a test Google account. (Use a separate browser profile so the cookies don't get associated with your main Google account.)
  • IMPORTANT: Set the default account language to English (United States) at https://myaccount.google.com/language
  • IMPORTANT: Visit YouTube.com. Set the YouTube site language (found by clicking on the profile image on the top right corner of youtube.com) to English (US).
  • Open developer tools and go to the Application tab in Chrome, or the Storage tab in Firefox. Click on Cookies and then https://www.youtube.com. Copy the full values for the following cookies on youtube.com: HSID, SSID, and SID. Note these values for when the archiving begins.

The cookie values are needed because a Google account (any Google account) is required to access the community contributions editor, where much of the data is gathered from.
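To illustrate how the three cookie values fit together, here is a minimal sketch that joins them into a single `Cookie` header value. It is an assumption about how such values are typically sent in an HTTP request, not a description of how the archiving tool consumes them. The helper `build_cookie_header` is hypothetical and the values shown are placeholders.

```python
from typing import Mapping

# The three cookies named in the steps above.
REQUIRED_COOKIES = ("HSID", "SSID", "SID")


def build_cookie_header(values: Mapping[str, str]) -> str:
    """Join the three session cookies into one Cookie header value.

    Raises ValueError if any of the cookies is missing or empty.
    """
    missing = [name for name in REQUIRED_COOKIES if not values.get(name)]
    if missing:
        raise ValueError("missing cookie values: " + ", ".join(missing))
    return "; ".join(f"{name}={values[name]}" for name in REQUIRED_COOKIES)


# Placeholder values; real ones are copied from the browser's developer tools.
example = {"HSID": "aaa", "SSID": "bbb", "SID": "ccc"}
header = build_cookie_header(example)
```

Checking for missing values up front means a forgotten cookie is reported before any request is made. Treat these values like a password, since anyone who has them can act as the test account.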

Run locally

WIP

Heroku

If you're familiar with Heroku, you can deploy the YTCC archiving tool directly from the project's Heroku template (the Deploy button).

Docker

WIP

You can also make a new image using the Dockerfile provided in this repo.
