% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/combine_GVA.R
\name{combine_GVA}
\alias{combine_GVA}
\title{Combine GVA, ABS, SIC91, and Tourism datasets}
\usage{
combine_GVA(ABS = NULL, GVA = NULL, SIC91 = NULL,
  DCMS_sectors = eesectors::DCMS_sectors, tourism = NULL,
  log_level = futile.logger::INFO, log_appender = "console")
}
\arguments{
\item{ABS}{ABS data as extracted by \code{eesectors::extract_ABS_data()}.}
\item{GVA}{GVA data as extracted by \code{eesectors::extract_GVA_data()}.}
\item{SIC91}{SIC91 data as extracted by \code{eesectors::extract_SIC91_data()}.}
\item{DCMS_sectors}{DCMS sectors data as extracted by
\code{eesectors::extract_DCMS_sectors()} or matching the
\code{eesectors::DCMS_sectors} in-built dataset.}
\item{tourism}{Tourism data as extracted by \code{eesectors::extract_tourism_data()}.}
\item{log_level}{The severity level at which log messages are written, from
least to most serious: TRACE, DEBUG, INFO, WARN, ERROR, FATAL. The default
level is INFO. See \code{?flog.threshold()} for additional details.}
\item{log_appender}{Defaults to writing the log to "console"; alternatively,
provide a character string specifying a filename to also write to. See
\code{?futile.logger::appender.file()} for additional details.}
}
\value{
A \code{data.frame} as expected by the \code{year_sector_data} class.
Log messages, including errors, are also written to the console or to a
file, depending on \code{log_appender}.
}
\description{
Combines datasets extracted from the underlying spreadsheet using
the \code{extract_XXX} functions. A notebook version of this function
(which may be easier to debug) can be downloaded using the
\code{get_GV_combine()} function. Note that this function in its current
form will only work to reproduce the 2016 SFR, and requires adjustment to
generalise it to new years.

NOTE: THIS FUNCTION RELIES ON DATA WHICH ARE CLASSIFIED AS
OFFICIAL-SENSITIVE. THE OUTPUT OF THIS FUNCTION IS AGGREGATED AND
PUBLICLY AVAILABLE IN THE FINAL STATISTICAL RELEASE; HOWEVER, CARE MUST BE
EXERCISED WHEN CREATING A PIPELINE INCLUDING THIS FUNCTION. IT IS HIGHLY
ADVISABLE TO ENSURE THAT THE DATA CREATED BY THE \code{extract_}
FUNCTIONS ARE NOT STORED IN A FOLDER WHICH IS A GITHUB REPOSITORY, TO
AVOID ACCIDENTALLY COMMITTING OFFICIAL DATA TO GITHUB. TOOLS TO
FURTHER HELP MITIGATE THIS RISK ARE AVAILABLE AT
\url{https://github.com/ukgovdatascience/dotfiles}.
}
\details{
The best way to understand what happens when you run this function
is to look at the \code{inst/combine_GVA.Rmd} notebook, which can be
downloaded automatically using the \code{get_GV_combine()} function, or by
visiting
\url{https://github.com/ukgovdatascience/eesectors/blob/master/inst/combine_GVA.Rmd}.

A brief explanation of what the function does:
\enumerate{
\item Remove SIC 91 data from \code{ABS} and swap in values from \code{SIC91}.
\item Duplicate the 2014 \code{ABS} values to use for 2015 (2015 values are
not available; this may change in future years).
\item Merge \code{eesectors::DCMS_sectors} into \code{ABS} to get the
2-digit SIC code.
\item Calculate sums across sectors and years.
\item Add in total UK GVA from \code{GVA}.
\item Match in \code{tourism} data.
\item Add \code{tourism} overlap.
\item Build the data frame into the format expected by the
\code{year_sector_data} class.
}
}
\examples{
\dontrun{
library(eesectors)
input <- 'OFFICIAL_working_file_dcms_V13.xlsm'
combine_GVA(
  ABS = eesectors::extract_ABS_data(input),
  GVA = eesectors::extract_GVA_data(input),
  SIC91 = eesectors::extract_SIC91_data(input),
  DCMS_sectors = eesectors::DCMS_sectors,
  tourism = eesectors::extract_tourism_data(input)
)
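
# A sketch of the same call with more verbose logging that is also written
# to a file, using the log_level and log_appender arguments described above.
# The filename 'combine_GVA.log' is an illustrative assumption, not a
# package default; keep it outside any git repository.
combine_GVA(
  ABS = eesectors::extract_ABS_data(input),
  GVA = eesectors::extract_GVA_data(input),
  SIC91 = eesectors::extract_SIC91_data(input),
  DCMS_sectors = eesectors::DCMS_sectors,
  tourism = eesectors::extract_tourism_data(input),
  log_level = futile.logger::DEBUG,
  log_appender = 'combine_GVA.log'
)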
}
}