-
Notifications
You must be signed in to change notification settings - Fork 1
/
03-data-pre-processing.qmd
70 lines (40 loc) · 3.21 KB
/
03-data-pre-processing.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
# Data pre processing
After loading the data, the creation of a flow map with Arabesque can be broken down into the following main steps.
::: {.callout-tip appearance="simple"}
## Flow mapping tips
1. Importing flow data (links/nodes)
2. Processing flow data (automatic indicators calculation, ...)
3. Geographical data computing (layering, ...)
4. Statistical data computing (filtering, ...)
5. Designing links and arrows (geometry and semiology)
6. Designing nodes (semiology)
7. Map cosmetics (title, sources, ...)
8. Export
:::
The links and nodes datasets are automatically modified by creating new columns when importing. Arabesque computes different key indicators and a default flow map is suggested.
## Indicators on links
For the moment, only the euclidean distance between the origin and destination entities is calculated.
## Indicators on nodes
A list of various simple and weighted indicators are calculated on the nodes (with reference to the Social Network Analysis (SNA) theory are proposed.
**balance** : difference between the number of in and out degrees.
**outdegree** : number of outgoing links from a places
**indegree** : number of ingoing links from a places
**weigthed Balance** : difference between the number of in and out degrees weighted by the flow value (.ie. volume)
**weigthed degree** : difference between the number of ingoing and outgoing links weigthed by the flow value (.ie. volume).
See below the additional indicators automatically calculated on the nodes of the RIcardo data
See below the **Additional variables** that have been automatically computed (ie the additional variable) on the nodes of the RIcardo data.
![](images/import_indic.png)
These indicators can be downloaded in . csv format (see Export and Save sections).
Loading links and nodes data into *Arabesque* leads to the creation of a default map, which is placed in the center of the interface.
## Suggested default flowmap
Loading data in Arabesque leads to the creation of a default map to avoid visualizing a "spaghetti effect" when entering the application ; all the defined parameters can then be modified during the exploration.
By default, the links are represented in shades of blue and the nodes in red. The map is presented in the WGS84 projection, according to the lat/lon coordinates declared during the import.
Except in the case of loading a projected geometry as input, the map is presented in WGS84.
Hereby is the Mobscol dataset default map in the central panel.
**For all default map, only a small percent ( around 10%) of the most important links (in value) are represented** and symbolized (see the automatic legend) according to their intensity (the flow variable entered at import).
The corresponding nodes are symbolized according to their degree (automatically calculated variable during the import). Their ID codes declared at the time of importation are also presented by default.
![](images/default_flowmap-mobscol.png)
The right-hand panel shows :
\- overall statistics on the proportion of flow information represented on the map ;
\- a flow value distribution diagram
All graphic and data filtering parameters can be modified using the left (geography and semiology) and right (statistical filtering) panels.