-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy path03_data_analysis.qmd
387 lines (262 loc) · 15 KB
/
03_data_analysis.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
# Data Analysis
First, we load the required libraries and files.
```{r Functions_source, message=FALSE, warning=FALSE}
source("setup.qmd")
```
Now, let's retrieve the necessary objects for data exploration:
```{r load_object1, eval=TRUE, warning=FALSE, message=FALSE}
# Load company_data
company_data <- readRDS("company_data.RDS")
cik <- readRDS("cik.RDS")
# Split the lists in company_data
company_Metadata <- company_data$company_Metadata
company_Facts <- company_data$company_Facts
company_Concept <- company_data$company_Concept
```
### Rebuilding the financial reports
The use of specific labels to indicate financial report items is regulated by accounting standards set forth by authoritative bodies. In the United States, the Financial Accounting Standards Board (FASB) establishes generally accepted accounting principles (GAAP), which provide guidelines for the preparation of financial statements, including the standardization of labels and concepts.
Different companies may use different reporting styles to indicate Facts in different ways. The use of different labels however affect our ability to efficiently retrieve the appropriate financial data.
The objective here is to generate standardized financial reports of Balance sheet, Income Statement, Cash Flow so that we can use the same label to perform specific calculation across all companies.
Let's create now a dataframe including the data retrieved from `` company_Facts$facts$`us-gaap` ``.
#### Extracting financial data
The first step is un-nest the dataset, in particular the financial data, in `company_Facts`, nested in various sub-lists.
The following code can be used to un-nest the list within `company_Facts` and create a dataframe easy to visualize and useful for our purpose[^03_data_analysis-1].
[^03_data_analysis-1]: For visualization purpose we have omitted from the printed table below the variables (columns) `df_Facts$description` and `df_Facts$accn`.
```{r unnesting_company_Facts, message=FALSE, warning=FALSE, message=FALSE, class.output="custom-table"}
df_Facts <- Fundamentals_to_Dataframe(company_data)
# Select the columns to print out and present the output with wrapped text and formatted numbers
df_Facts %>% select(-c(description,accn)) %>%
head() %>% as.data.frame() %>%
kable("html") %>%
kable_styling(full_width = FALSE)
```
Note that `df_Facts$val` is expressed in million.
The data in this dataframe is organized to facilitate analysis and comparison of financial information over different periods. Each row corresponds to a specific financial Concept (e.g. Accounts Payable) reported by the company, and the columns provide details about the reporting period, values, and other relevant attributes.
To recreate a financial statement from the data in `df_Facts` of JAKKS Pacific Inc., we need to extract the values in `df_Fact$val` of the various Concepts and corresponding date in `df_Fact$end`.
`df_Fact` has a long list of Concepts for multiple fiscal periods. The following code shows the large number of Concepts included in `df_Fact`, related to all financial reports.
```{r summarize_Facts1, message=FALSE, warning=FALSE, message=FALSE, class.output="custom-str-output"}
# Print the size in human-readable format
cat("df_Facts includes:", format(nrow(df_Facts), units = "auto"), "records \n")
```
From an extract of the `df_Facts` above, we would see repeated Concepts. Each of them is associated with a different fiscal period. The code next provides number of unique Concepts used historically JAKKS Pacific Inc. in their financial reports.
```{r summarize_Facts3, message=FALSE, warning=FALSE, message=FALSE, class.output="custom-str-output"}
# Print the number of unique records of df_Facts
df_Facts_distinct <- df_Facts %>% select(label,description) %>% distinct()
cat("df_Facts includes:", format(nrow(df_Facts_distinct), units = "auto"), "distinct Concepts \n")
```
By arranging these records by date indicated in `df_Fact$end` we can see the the full financial reports for each period in a data frame format.
```{r summarize_Facts4, message=FALSE, warning=FALSE, message=FALSE, class.output="custom-table"}
# Group the df_Fact by date in df_Fact$end
df_Facts <- df_Facts %>% arrange(desc(end))
# Show first rows of df_Facts_BS
df_Facts %>% select(-c(description,accn)) %>%
head() %>% as.data.frame() %>%
kable( "html") %>%
kable_styling(full_width = FALSE) %>%
column_spec(1,width = "250px")
```
The next step is to filter the relevant Concepts associated with each key financial statement: Balance Sheet (BS), Income Statement (IS) and Cash Flow (CF).
::: {.Note .custom-note}
------------------------------------------------------------------------
Note
As mentioned in the previous chapter, the `df_Facts` dataframe contains financial data along with corresponding reporting information, such as `df_Facts$end`, `df_Facts$fy`, `df_Facts$fp`, and `df_Facts$form`. However, there are instances where a filing (e.g., 10-K) for a specific fiscal period (`fy` and `fp`) includes a comparison with previous fiscal periods. In these cases, the financial data may actually refer to a different period, as indicated by `df_Facts$end`, rather than the period specified by `fy` and `fp`.
To handle this scenario, we construct a dataframe of financial data based on the end date (`df_Facts$end`) of the reporting period and remove the remaining attributes such as `fy`, `fp`, etc. This ensures that we focus on the actual reporting period for the financial data.
------------------------------------------------------------------------
:::
#### Balance Sheet
The following code will generate a standardized Balance Sheet (df_std_BS) based on a matching table in Excel (standardized_BS.xlsx). The standardized Balance Sheet includes Concepts that may not be present in the filings (e.g., 10-K) of all companies. If financial values (Facts) are missing, they will be estimated.
```{r standardized_BS, message=FALSE, warning=FALSE, message=FALSE, class.output="custom-table"}
# Retrieve balance sheet of JAKKS Pacific Inc. in standardized format
df_std_BS <- BS_std(df_Facts)
# Format numeric columns to 2 decimal places
df_std_BS <- df_std_BS %>%
mutate(across(where(is.numeric), ~ round(.x,2) ))
# Print the standardized balancesheet
df_std_BS %>% head() %>% as.data.frame() %>%
head() %>% as.data.frame() %>%
kable( "html") %>%
kable_styling(full_width = FALSE) %>%
column_spec(1,width = "250px")
```
#### Income Statement
The following code will create a standardized Income Statement (df_std_IS) based on a matching table in Excel (standardized_IS.xlsx).
Here few relevant remarks regarding the standardized Income Statement:
- The financial data are presented on a quarterly basis only.
- In cases where quarterly not all Facts are available, these are estimated as the difference between the yearly value and the existing quarterly values of that year, divided equally by the number of quarters that are missing.
```{r standardized_IS, message=FALSE, warning=FALSE, message=FALSE, class.output="custom-table"}
# Retrieve Income Statement of JAKKS Pacific Inc. in standardized format
df_std_IS <- IS_std(df_Facts)
# Format numeric columns to 2 decimal places
df_std_IS <- df_std_IS %>%
mutate(across(where(is.numeric), ~ round(.x,2) ))
# Print the standardized income statement
df_std_IS %>% head() %>% as.data.frame() %>%
head() %>% as.data.frame() %>%
kable( "html") %>%
kable_styling(full_width = FALSE) %>%
column_spec(1,width = "250px")
```
#### Cash Flow
Similarly to the Income Statement, the following code will create a standardized Cash Flow Statement (df_std_CF) based on a matching table in Excel (standardized_CF.xlsx).
```{r standardized_CF, message=FALSE, warning=FALSE, message=FALSE, class.output="custom-table"}
# Retrieve Cash Flow statement of JAKKS Pacific Inc. in standardized format
df_std_CF <- CF_std(df_Facts)
# Format numeric columns to 2 decimal places
df_std_CF <- df_std_CF %>%
mutate(across(where(is.numeric), ~ round(.x,2) ))
# Print the standardized cash flow
df_std_CF %>% head() %>% as.data.frame() %>%
head() %>% as.data.frame() %>%
kable( "html") %>%
kable_styling(full_width = FALSE) %>%
column_spec(1,width = "250px")
```
## Quarterly format
Finally, we can transpose the dataframes of the financial statements to improve the readability.
**Balance Sheet**
```{r BS_transpose, message=FALSE, warning=FALSE, message=FALSE, class.output="custom-table"}
library(kableExtra)
# Transpose the Balanche Sheet statement in standardized format
df_std_BS_t <- transpose_df_standardized(df_std_BS, "standardized_BS")
# Format numeric columns to 2 decimal places
df_std_BS_t <- df_std_BS_t %>%
mutate(across(where(is.numeric), ~ round(.x,2) ))
# Print the standardized Balance Sheet in standardized format
df_std_BS_t %>%
kable( "html") %>%
kable_styling(full_width = FALSE) %>%
column_spec(1, width = "30em", extra_css = "white-space: nowrap;")
```
**Income Statement**
```{r IS_transpose, message=FALSE, warning=FALSE, message=FALSE, class.output="custom-table"}
# Transpose the Income Statement in standardized format
df_std_IS_t <- transpose_df_standardized(df_std_IS, "standardized_IS")
# Format numeric columns to 2 decimal places
df_std_IS_t <- df_std_IS_t %>%
mutate(across(where(is.numeric), ~ round(.x,2) ))
# Print the standardized Income Statement in standardized format
df_std_IS_t %>%
kable( "html") %>%
kable_styling(full_width = FALSE) %>%
column_spec(1, width = "30em", extra_css = "white-space: nowrap;")
```
**Cash Flow Statement**
```{r CF_transpose, message=FALSE, warning=FALSE, message=FALSE, class.output="custom-table"}
# Transpose the Cash Flow Statement of JAKKS Pacific Inc. in standardized format
df_std_CF_t <- transpose_df_standardized(df_std_CF, "standardized_CF")
# Format numeric columns to 2 decimal places
df_std_CF_t <- df_std_CF_t %>%
mutate(across(where(is.numeric), ~ round(.x,2) ))
# Print the standardized Cash Flow Statement in standardized format
df_std_CF_t %>%
kable( "html") %>%
kable_styling(full_width = FALSE) %>%
column_spec(1, width = "30em", extra_css = "white-space: nowrap;")
```
## Trailing Month (TM) format
Finally, we will calculate the multiple trailing month value of the Income Statement and Cash Flow. These are easily comparable with the filing on SEC website.
### Twelve Trailing Month (12 month)
**Income Statement**
```{r 12TM_IS_transpose, message=FALSE, warning=FALSE, message=FALSE, class.output="custom-table"}
# Calculation Trailing month
df_std_IS_12TM <- calculate_trailing_months(df_std_IS,12)
# Define the order of columns of the transposed dataframe
order_df <- colnames(df_std_IS_12TM) %>% as.data.frame()
# Transpose the Income Statement into standardized format
df_std_IS_12TM_t <- transpose_df(df_std_IS_12TM, order_df)
# Format numeric columns to 2 decimal places
df_std_IS_12TM_t <- df_std_IS_12TM_t %>%
mutate(across(where(is.numeric), ~ round(.x,2) ))
# Print the standardized Income Statement in standardized format
df_std_IS_12TM_t %>%
kable( "html") %>%
kable_styling(full_width = FALSE) %>%
column_spec(1, width = "30em", extra_css = "white-space: nowrap;")
```
**Cash Flow Statement**
```{r 12TM_CF_transpose, message=FALSE, warning=FALSE, message=FALSE, class.output="custom-table"}
# Calculation Trailing month
df_std_CF_12TM <- calculate_trailing_months(df_std_CF,12)
# Define the order of columns of the transposed dataframe
order_df <- colnames(df_std_CF_12TM) %>% as.data.frame()
# Transpose the Income Statement into standardized format
df_std_CF_12TM_t <- transpose_df(df_std_CF_12TM, order_df)
# Format numeric columns to 2 decimal places
df_std_CF_12TM_t <- df_std_CF_12TM_t %>%
mutate(across(where(is.numeric), ~ round(.x,2) ))
# Print the standardized Income Statement in standardized format
df_std_CF_12TM_t %>%
kable( "html") %>%
kable_styling(full_width = FALSE) %>%
column_spec(1, width = "30em", extra_css = "white-space: nowrap;")
```
### Nine Trailing Month (9 month)
**Income Statement**
```{r 9TM_IS_transpose, message=FALSE, warning=FALSE, message=FALSE, class.output="custom-table"}
# Calculation Trailing month
df_std_IS_9TM <- calculate_trailing_months(df_std_IS,9)
# Define the order of columns of the transposed dataframe
order_df <- colnames(df_std_IS_9TM) %>% as.data.frame()
# Transpose the Income Statement into standardized format
df_std_IS_9TM_t <- transpose_df(df_std_IS_9TM, order_df)
# Format numeric columns to 2 decimal places
df_std_IS_9TM_t <- df_std_IS_9TM_t %>%
mutate(across(where(is.numeric), ~ round(.x,2) ))
# Print the standardized Income Statement in standardized format
df_std_IS_9TM_t %>%
kable( "html") %>%
kable_styling(full_width = FALSE) %>%
column_spec(1, width = "30em", extra_css = "white-space: nowrap;")
```
**Cash Flow Statement**
```{r 9TM_CF_transpose, message=FALSE, warning=FALSE, message=FALSE, class.output="custom-table"}
# Calculation Trailing month
df_std_CF_9TM <- calculate_trailing_months(df_std_CF,9)
# Define the order of columns of the transposed dataframe
order_df <- colnames(df_std_CF_9TM) %>% as.data.frame()
# Transpose the Income Statement into standardized format
df_std_CF_9TM_t <- transpose_df(df_std_CF_9TM, order_df)
# Format numeric columns to 2 decimal places
df_std_CF_9TM_t <- df_std_CF_9TM_t %>%
mutate(across(where(is.numeric), ~ round(.x,2) ))
# Print the standardized Income Statement in standardized format
df_std_CF_9TM_t %>%
kable( "html") %>%
kable_styling(full_width = FALSE) %>%
column_spec(1, width = "30em", extra_css = "white-space: nowrap;")
```
### Six Trailing Month (6 month)
**Income Statement**
```{r 6TM_IS_transpose, message=FALSE, warning=FALSE, message=FALSE, class.output="custom-table"}
# Calculation Trailing month
df_std_IS_6TM <- calculate_trailing_months(df_std_IS,6)
# Define the order of columns of the transposed dataframe
order_df <- colnames(df_std_IS_6TM) %>% as.data.frame()
# Transpose the Income Statement into standardized format
df_std_IS_6TM_t <- transpose_df(df_std_IS_6TM, order_df)
# Format numeric columns to 2 decimal places
df_std_IS_6TM_t <- df_std_IS_6TM_t %>%
mutate(across(where(is.numeric), ~ round(.x,2) ))
# Print the standardized Income Statement in standardized format
df_std_IS_6TM_t %>%
kable( "html") %>%
kable_styling(full_width = FALSE) %>%
column_spec(1, width = "30em", extra_css = "white-space: nowrap;")
```
**Cash Flow Statement**
```{r 6TM_CF_transpose, message=FALSE, warning=FALSE, message=FALSE, class.output="custom-table"}
# Calculation Trailing month
df_std_CF_6TM <- calculate_trailing_months(df_std_CF,6)
# Define the order of columns of the transposed dataframe
order_df <- colnames(df_std_CF_6TM) %>% as.data.frame()
# Transpose the Income Statement into standardized format
df_std_CF_6TM_t <- transpose_df(df_std_CF_6TM, order_df)
# Format numeric columns to 2 decimal places
df_std_CF_6TM_t <- df_std_CF_6TM_t %>%
mutate(across(where(is.numeric), ~ round(.x,2) ))
# Print the standardized Income Statement in standardized format
df_std_CF_6TM_t %>%
kable( "html") %>%
kable_styling(full_width = FALSE) %>%
column_spec(1, width = "30em", extra_css = "white-space: nowrap;")
```