WIP repository, looking for differences in cswiki editing during 2020 aka covid year

wmcz/blogpost-cswiki-in-covid-year

blogpost-cswiki-in-covid-year

This repository contains data for a post on WMCZ's blog about Czech Wikipedia during the pandemic.

Data source

This repository uses only public data published by the Wikimedia Foundation; the data are, however, processed on WMF's Hadoop cluster using Spark queries.

Page views

Data about page and project views can be downloaded from Wikimedia Dumps as the pageviews dataset. On the Hadoop cluster, the data are available in the following two tables:

  • wmf.pageview_hourly: per-page views, hourly granularity (docs)
  • wmf.projectview_hourly: per-project views, hourly granularity (docs)
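
As a rough illustration of how the per-project table might be queried, the sketch below builds a Spark SQL string that sums hourly views into monthly totals. It is not taken from this repository: the function name `monthly_pageviews_query` is hypothetical, and the column names (`project`, `view_count`, `agent_type`, `year`, `month`) and the project value `cs.wikipedia` are assumptions about the table schema that should be checked against the linked docs.

```python
def monthly_pageviews_query(project: str = "cs.wikipedia", year: int = 2020) -> str:
    """Build a Spark SQL query that sums hourly project views per month.

    Column names and the 'user' agent type are assumptions about the
    wmf.projectview_hourly schema, not confirmed by this repository.
    """
    # Reject anything that is not a plain project identifier, since the
    # value is interpolated directly into the SQL text.
    if not project.replace(".", "").replace("-", "").isalnum():
        raise ValueError(f"unexpected project name: {project!r}")

    return (
        "SELECT month, SUM(view_count) AS views\n"
        "FROM wmf.projectview_hourly\n"
        f"WHERE project = '{project}'\n"
        f"  AND year = {int(year)}\n"
        "  AND agent_type = 'user'\n"  # assumed filter to exclude automated traffic
        "GROUP BY month\n"
        "ORDER BY month"
    )


if __name__ == "__main__":
    print(monthly_pageviews_query())
```

On the cluster, the resulting string could presumably be handed to a Spark session (for example via `spark.sql(...)`); building it as plain text keeps the sketch runnable without cluster access.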

Edits

Data about edits can be downloaded from Wikimedia Dumps as the mediawiki_history dataset. On the Hadoop cluster, the data are available as wmf.mediawiki_history (docs).
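
A comparable sketch for edits is shown below: it builds a Spark SQL string that counts revisions per month for one wiki. Again, this is an illustration rather than the query used for the blog post. The function name `monthly_edits_query` and the snapshot value `2021-01` are hypothetical, and the column names (`snapshot`, `wiki_db`, `event_entity`, `event_type`, `event_timestamp`) as well as the assumption that `event_timestamp` is a string starting with `YYYY-MM` should be verified against the linked docs.

```python
import re


def monthly_edits_query(
    wiki_db: str = "cswiki",
    year: int = 2020,
    snapshot: str = "2021-01",  # hypothetical snapshot partition
) -> str:
    """Build a Spark SQL query that counts revisions per month for one wiki.

    Column names and value formats are assumptions about the
    wmf.mediawiki_history schema, not confirmed by this repository.
    """
    # Validate values that are interpolated into the SQL text.
    if not re.fullmatch(r"\w+", wiki_db):
        raise ValueError(f"unexpected wiki_db: {wiki_db!r}")
    if not re.fullmatch(r"\d{4}-\d{2}", snapshot):
        raise ValueError(f"unexpected snapshot: {snapshot!r}")

    return (
        "SELECT SUBSTR(event_timestamp, 1, 7) AS month, COUNT(*) AS edits\n"
        "FROM wmf.mediawiki_history\n"
        f"WHERE snapshot = '{snapshot}'\n"
        f"  AND wiki_db = '{wiki_db}'\n"
        "  AND event_entity = 'revision'\n"  # assumed: one row per revision event
        "  AND event_type = 'create'\n"      # assumed: revision creation = an edit
        f"  AND event_timestamp LIKE '{int(year)}-%'\n"
        "GROUP BY SUBSTR(event_timestamp, 1, 7)\n"
        "ORDER BY month"
    )


if __name__ == "__main__":
    print(monthly_edits_query())
```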
