@@ -8,14 +8,7 @@ analytical pipeline. To get things going, we are going to keep it simple; our
 goal here is to get an analysis done, that's it. We won't focus on
 reproducibility (well, not beyond what was done in the previous chapter to set
 up our development environment). We are going to download some data, and analyse
-it, that's it. But before all of that, I will present the Polars package. In the
-preface, I said that this wasn’t supposed to be a book about Python, so why am
-I talking about a specific package? It’s because I think that Polars, unlike Pandas,
-has some features and design choices that actually improve reproducibility.
-
-## The Polars package and why you should ditch Pandas in its favour
-
-
+it, that's it.
 
 ## Housing in Luxembourg
 
@@ -50,11 +43,11 @@ knitr::include_graphics("images/lux_rhode_island.png")
 ```
 :::
 
-What you should also know is that the population is about 645,000 as of writing
-(January 2023), half of which are foreigners. Around 400,000 persons work in
-Luxembourg, of which half do not live in Luxembourg; so every morning from
-Monday to Friday, 200,000 people enter the country to work and then leave in the
-evening to go back to either Belgium, France or Germany, the neighbouring
+What you should also know is that the population is about 672,050 people as of
+writing (July 2024), half of which are foreigners. Around 400,000 persons
+work in Luxembourg, of which half do not live in Luxembourg; so every morning
+from Monday to Friday, 200,000 people enter the country to work and then leave
+in the evening to go back to either Belgium, France or Germany, the neighbouring
 countries. As you can imagine, this puts enormous pressure on the transportation
 system and on the roads, but also on the housing market; everyone wants to live
 in Luxembourg to avoid the horrible daily commute, and everyone wants to live
@@ -88,10 +81,11 @@ If you want to download the data, click
 Let us paste the definition of the HPI in here (taken from the HPI's
 [metadata](https://archive.is/OrQwA)^[https://archive.is/OrQwA, archived link for posterity.] page):
 
-*The House Price Index (HPI) measures inflation in the residential property market. The HPI
-captures price changes of all types of dwellings purchased by households (flats, detached houses,
-terraced houses, etc.). Only transacted dwellings are considered, self-build dwellings are
-excluded. The land component of the dwelling is included.*
+*The House Price Index (HPI) measures inflation in the residential property
+market. The HPI captures price changes of all types of dwellings purchased by
+households (flats, detached houses, terraced houses, etc.). Only transacted
+dwellings are considered, self-build dwellings are excluded. The land component
+of the dwelling is included.*
 
 So from the plot, we can see that the price of dwellings more than doubled
 between 2010 and 2021; the value of the index is 214.81 in 2021 for Luxembourg,
@@ -202,6 +196,7 @@ import polars as pl
 import polars.selectors as cs
 import re
 ```
+
 I will be using the `polars` package to manipulate data.
 
 Next, the code below downloads the data, and puts it in a data frame: