<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Wikipedia Interactive Visualisation</title>
<script src="./build/jquery.min.js" type="text/javascript"></script>
<script src="https://requirejs.org/docs/release/2.3.5/minified/require.js" type="text/javascript"></script>
<link href="https://fonts.googleapis.com/css?family=Open+Sans" rel="stylesheet">
<meta name="author" content="Raphaël Madillo">
<style>
body {
margin: 0;
font-family: 'Open Sans', sans-serif;
width: 60%;
min-width: 500px;
margin-left: auto;
margin-right: auto;
margin-top: 40px;
padding: 40px;
line-height: 1.7em;
}
/* div#vis-info{
display: flex;
}*/
/* div.info-p{
width: 33%;
height: 40%;
border-style: solid;
margin: 5px 5px;
text-align: center;
padding: 10px;
}*/
</style>
</head>
<body>
<header>
<h1>Data analysis and visualisation of Wikipedia activity over time</h1>
</header>
<h4>Visualisation</h4>
<p>
Welcome to a temporal visualisation of Wikipedia.
On this page you can visualise user activity across several subsets of Wikipedia.<br>
Each visualisation represents a Wikipedia sub-graph: the nodes are Wikipedia articles and the edges are the hyperlinks between them.<br>
The visualisations are dynamic: the size of a node represents the number of views of the page during a given month. The activity covers three years, from November 2015 to October 2018.
</p>
<select id="choose_vis">
<option disabled selected value> -- select a visualisation -- </option>
<option value="diseases">Diseases</option>
<option value="science">Science</option>
<option value="XX_century">20th century</option>
</select>
<div id="vis-info">
<!-- <p>This dataset is composed by science pages that are part of <a href="https://snap.stanford.edu/data/wikispeedia.html">wikispedia</a> dataset <br />'
It includes both English and French wikipedia</p>
<ul>
<li>Wikipedia pages: 1.097</li>
<li>Hyperlinks: 13.971</li>
</ul>
<center>
<img src="./images/wikispedia_overview.png">
<br/>
<a href='./wikispedia/viz.html' rel='noopener noreferrer' target='_blank'>Visualise</a>
</center> -->
</div>
<h4>Data analysis</h4>
<p>
To explore our datasets further, we analyse activity patterns on Wikipedia.
</p>
<h5>Sanfilippo event</h5>
<p>Let's first look at the activity of the <a href="https://fr.wikipedia.org/wiki/Maladie_de_Sanfilippo">Maladie de Sanfilippo</a> page on French Wikipedia.</p>
<img src="./images/sanfilippo.png" width="100%">
<p>This temporal analysis shows that the activity of the Sanfilippo Wikipedia page was 24 times higher in September 2018 than on average.<br>
The explanation for this sudden spike is that on September 17, the TV film <a href="https://fr.wikipedia.org/wiki/Tu_vivras_ma_fille">Tu vivras ma fille</a> was broadcast by TF1, one of the main French TV broadcasters,
and its main character had this disease.</p>
<h5>1996 in music</h5>
<p>Let's now take a look at the English Wikipedia page <a href="https://en.wikipedia.org/wiki/1996_in_music">1996 in music</a>, where we also observe a peak of activity.</p>
<center>
<img src="./images/1996_in_music.png" width="100%">
</center>
<p style="display: inline-block;">This activity analysis shows that the 1996_in_music page received 5,447,653 views in January 2017 (34 times its average activity),
so the phenomenon is similar to the Sanfilippo case, but here the explanation is unknown.<br>
It was the most viewed page (after the Wikipedia main page and the Wikipedia search page) in mid-January, yet it does not appear in the Wikipedia Top 25 Report for either week of mid-January:
<a href="https://en.wikipedia.org/wiki/Wikipedia:Top_25_Report/January_8_to_14,_2017">January 8 to 14</a> or <a href="https://en.wikipedia.org/wiki/Wikipedia:Top_25_Report/January_15_to_21,_2017">January 15 to 21</a>.<br>
A possible reason why it does not appear in the report is that the Top 25 list excludes "articles that have almost no mobile views (5–6% or less) or almost all mobile views (94–95% or more) because they are very likely to be automated views based on our experience and research of the issue."<br>
Indeed, while the page had 1,131,599 desktop views on 15 January 2017, it had only 224 views from mobile web and 44 views from the mobile app.
</p>
<h5>Winter diseases</h5>
<p>We can also observe expected phenomena, such as the increase in views of pages about diseases that often occur in winter.<br>
This is visible on French Wikipedia for <a href="https://fr.wikipedia.org/wiki/Angine">Angine</a> (tonsillitis) and <a href="https://fr.wikipedia.org/wiki/Rhume">Rhume</a> (common cold).</p>
<center>
<img src="./images/winter_diseases.png" width="100%">
</center>
<p>However, the same analysis of the corresponding pages on English Wikipedia is less conclusive.</p>
<center>
<img src="./images/winter_diseases_en.png" width="100%">
</center>
<h5>Chipko movement</h5>
<p>Another event, at the end of March 2018, is the rise in the number of views of the <a href="https://en.wikipedia.org/wiki/Chipko_movement">Chipko movement</a> page, as the bar chart of the page's activity shows.</p>
<center>
<img src="./images/chipko.png" width="100%">
</center>
<p>This is due to the Google Doodle of March 26, which marked the 45th anniversary of the Chipko movement.</p>
<h5>Further exploration</h5>
<p>To explore further and discover the cause of an unexpected spike in activity, you can use the <a href="https://wikimedia.org/api/rest_v1/#!/Pageviews_data/">https://wikimedia.org/api/rest_v1/#!/Pageviews_data/</a> API to get the daily activity of a page and see on which day the peak occurred.
You can also consult <a href="https://en.wikipedia.org/wiki/Wikipedia:Top_25_Report/">https://en.wikipedia.org/wiki/Wikipedia:Top_25_Report/</a>: if the page is among the top 25 pages and the peak was not caused by a DDoS attack or automated views, you should be able to find the cause there.</p>
<script src="./script.js"></script>
</body>
</html>