local_path argument development #23

Merged: 87 commits, Oct 19, 2024
Commits
846433d
read/write local paths for `get_hvw_data()`
feinleib May 10, 2024
0a52533
`cli::cli_abort()` errors in HVW/LES paths
feinleib May 10, 2024
23f5f93
Fix CHECK note for "undefined global" in `get_voteview_members()`
feinleib May 10, 2024
75cbc03
Update NEWS.md
feinleib May 10, 2024
33840ec
Use argument names
feinleib May 13, 2024
176c967
`extract_file_ending()`
feinleib May 13, 2024
e567aab
Test `get_hvw_data()` local path args
feinleib May 13, 2024
c89ac0b
Add to HVW docs
feinleib May 13, 2024
cc52a73
Improve `extract_file_ending()`
feinleib May 13, 2024
4477845
Fix factor levels in `fix_les_coltypes`
feinleib May 13, 2024
a03718e
Fix column types in HVW data
feinleib May 14, 2024
3d2796a
`fix_hvw_coltypes()`: fix `chamber_code`s
feinleib May 16, 2024
574b908
`cli_warn()` in `match_chamber()`
feinleib May 16, 2024
a92f551
read/write local paths for `get_les()`
feinleib May 16, 2024
cd5c7d7
Remove {tidyselect} dependency
feinleib May 29, 2024
65cfa90
Mention HVW coltype fixes in NEWS
feinleib Jun 13, 2024
a3eecef
`extract_file_ending()`: pass back NULLs
feinleib Jun 13, 2024
49e1cdf
HVW: remove 'write' arg, fix DTA issues
feinleib Jun 13, 2024
799e72b
HVW: convert maj/min leader to lgl
feinleib Jun 13, 2024
d0a4892
LES: remove `write_to_local_path`
feinleib Jun 13, 2024
49506da
LES: expectation cols as factors
feinleib Jun 14, 2024
407daf9
Remove arg `fix_les_coltypes(les_2)`
feinleib Jun 14, 2024
41c5ca3
Test LES column types
feinleib Jun 14, 2024
bb94e63
LES: 0/1-char bionames, .DTA factors
feinleib Jun 14, 2024
698f4bb
LES: Test local read/write
feinleib Jun 14, 2024
23cf834
NEWS: col types, bioname NAs
feinleib Jun 14, 2024
44b9161
if/else optimizations
feinleib Jun 14, 2024
b2e7ecc
LES: short-circuit cond, cli_abort web errors
feinleib Jun 15, 2024
a6f8482
`get_online_data()`: `return_format` arg
feinleib Jun 15, 2024
1c80d99
LES: use `get_online_data()`
feinleib Jun 15, 2024
029e889
Remove {crul} dependency
feinleib Jun 15, 2024
304dd92
`create_factor_columns()` for LES/HVW
feinleib Jun 17, 2024
fcb2ef4
Docs: clarify `read_from_local_path` is optional
feinleib Jun 17, 2024
d113052
`read_local_file()`: use `...`
feinleib Jun 17, 2024
ca1d3db
`get_online_data()`: 3 tries
feinleib Jun 17, 2024
cce511c
Voteview: `read_from_local_path` arg
feinleib Jun 17, 2024
5bc8500
parties: more tests
feinleib Jun 17, 2024
f7c46f7
Remove {R.utils} dependency
feinleib Jun 17, 2024
df5f8d2
Remove `doc_arg_local()`
feinleib Jun 17, 2024
ec3682a
Remove writing from NEWS
feinleib Jun 17, 2024
75742f3
Update NEWS
feinleib Jun 17, 2024
1693511
`build_file_path()`: use `cli::cli_abort()`
feinleib Jun 17, 2024
0755344
Rename `build_file_path()` to `build_url()`
feinleib Jun 17, 2024
1987cf0
Improve error formatting
feinleib Jun 17, 2024
ba7c583
Rename files to build_url
feinleib Jun 18, 2024
b943054
DESCRIPTION: shorten lines
feinleib Jun 18, 2024
9b6082b
`get_voteview_members()` coltypes
feinleib Jun 18, 2024
5193f5d
Update NEWS
feinleib Jun 18, 2024
49810a6
NEWS: minor reformatting
feinleib Jun 21, 2024
fd97813
Remove `write_local_file()`
feinleib Jun 23, 2024
41bc705
Remove `extract_file_ending()`
feinleib Jun 23, 2024
cb94893
Expand `get_online_data()` docs
feinleib Jun 23, 2024
397342e
NEWS: update dependency changes
feinleib Jun 23, 2024
a178fa2
`get_voteview_members()`: .dta handling
feinleib Jun 23, 2024
242c6f7
Test `get_voteview_members()`
feinleib Jun 23, 2024
768e635
`filter_chamber_congress()`
feinleib Jun 24, 2024
d2f2456
Split `filter_chamber()`, `filter_congress()`
feinleib Jun 24, 2024
d349861
Rename `read_from_local_path` to `local_path`
feinleib Jun 24, 2024
36cff3e
Allow "hs" chamber code
feinleib Jun 24, 2024
5d53700
Test `filter_chamber()`
feinleib Jun 24, 2024
933cbf5
Use roxygen 7.3.2
feinleib Jul 6, 2024
6b7eba1
Export/doc/test `read_html_table()`
feinleib Jul 6, 2024
e6e7581
Add `read_html_table()` to NEWS
feinleib Jul 6, 2024
a532e2a
More NEWS
feinleib Jul 6, 2024
ce99340
test-utils: skip tests if offline
feinleib Jul 6, 2024
69750bd
`read_html_table()` GH issue #
feinleib Jul 6, 2024
3e25a15
.val in `match_chamber()` errors
feinleib Jul 11, 2024
152705b
`match_congress()`: error for invalid Congress
feinleib Jul 11, 2024
394eef1
`filter_congress()`: use checks from `match_congress()`
feinleib Jul 11, 2024
04c2338
Only filter chamber/cong for local reads
feinleib Jul 11, 2024
feb4e27
`filter_congress()`: don't filter NULL congress
feinleib Jul 11, 2024
beb1b4c
Test single-congress `filter_congress()`
feinleib Jul 11, 2024
1b283aa
Test filtering for all congresses
feinleib Jul 11, 2024
2f88a73
test `filter_congress()` exp. usage
feinleib Jul 20, 2024
232c6db
test `filter_congress()` errors
feinleib Jul 20, 2024
e912a33
skip Voteview members test if offline
feinleib Jul 20, 2024
2038660
`filter_congress()`: error if congs not found
feinleib Jul 20, 2024
e33e9c8
Test bad `congress` for local members
feinleib Jul 20, 2024
388d248
`call` arg in filter_congress, match_congress
feinleib Jul 28, 2024
1b0a7dd
Just use TSV file in read filtering test
feinleib Jul 28, 2024
04e3dfc
skip member_votes tests if offline
feinleib Jul 28, 2024
febbf9d
member_votes: match fixes in members
feinleib Jul 28, 2024
2f40446
Test member_votes coltypes
feinleib Jul 28, 2024
3408684
Drop levels of filtered-out chambers
feinleib Oct 19, 2024
b7c460b
member_votes: fix coltypes
feinleib Oct 19, 2024
56a28af
member_votes: test local read/write
feinleib Oct 19, 2024
56cbc96
Update NEWS
feinleib Oct 19, 2024
16 changes: 8 additions & 8 deletions DESCRIPTION
@@ -4,27 +4,27 @@ Version: 0.2.1.9000
 Authors@R:
     person("Max", "Feinleib", email = "[email protected]", role = c("aut", "cre", "cph"),
            comment = c(ORCID = "0009-0002-9604-3533"))
-Description: Provides easy-to-understand and consistent interfaces for accessing data on the U.S. Congress.
-    The functions in 'filibustr' streamline the process for importing data on Congress into R,
-    removing the need to download and work from CSV files and the like.
-    Data sources include 'Voteview' (<https://voteview.com/>), the U.S. Senate website (<https://www.senate.gov/>), and more.
+Description: Provides easy-to-understand and consistent interfaces for accessing data
+    on the U.S. Congress. The functions in 'filibustr' streamline the process for
+    importing data on Congress into R, removing the need to download and work from
+    CSV files and the like. Data sources include 'Voteview' (<https://voteview.com/>),
+    the U.S. Senate website (<https://www.senate.gov/>), and more.
 License: MIT + file LICENSE
 Encoding: UTF-8
 Roxygen: list(markdown = TRUE)
-RoxygenNote: 7.3.1
+RoxygenNote: 7.3.2
 Imports:
-    crul,
+    cli,
     dplyr,
     haven,
     httr2,
     lifecycle,
-    R.utils,
     readr,
     rlang,
     rvest,
     stringr,
     tidyr,
-    tidyselect
+    tools
 URL: https://github.com/feinleib/filibustr
 BugReports: https://github.com/feinleib/filibustr/issues
 Suggests:
1 change: 1 addition & 0 deletions NAMESPACE
@@ -11,6 +11,7 @@ export(get_voteview_member_votes)
 export(get_voteview_members)
 export(get_voteview_parties)
 export(get_voteview_rollcall_votes)
+export(read_html_table)
 export(year_of_congress)
 importFrom(lifecycle,deprecated)
 importFrom(rlang,.data)
29 changes: 26 additions & 3 deletions NEWS.md
@@ -1,9 +1,32 @@
# filibustr (development version)

* `get_voteview_cast_codes()` provides the cast codes used in
Voteview's member votes data (#13).
* `get_les()` now uses more specific column types (#10).
* BREAKING CHANGE: Redesigned the interface for reading from local files.
Now, to read from a local file, specify the file path using `local_path`
(#17).
* A given function call will now consistently read data from *either* online
or a local file, not try both. There is no longer an "online fallback" if
a local file is not found.
* In the Voteview functions, an invalid `congress` is now an error, instead of
silently returning data for all Congresses.
* Improved error messages with `cli::cli_abort()` (#9).
* When reading data from online, now try up to 3 times in case of HTTP errors.
* New `get_voteview_cast_codes()` provides the cast codes used in Voteview's
member votes data (#13).
* New `read_html_table()` for reading HTML tables from online. It's a nice
shortcut for a common {rvest} workflow that otherwise takes 3 functions.
`read_html_table()` was previously an internal function, but it's useful
enough that I think it should be exported, even if it's not a core
functionality of {filibustr} (#20).
* `get_les()`, `get_hvw_data()`, `get_voteview_members()`, and
`get_voteview_member_votes()` now use more specific column types, such as
integer for count data and logical for binary data (#10).
* NOTE: state abbreviations (columns `state`, `st_name`) and LES scores
relative to expectation (columns `expectation`, `expectation1`,
`expectation2`) are now factor variables.
* `get_voteview_members()`: fix factor levels in the `state_abbrev` column.
* In `get_les()`, 0- or 1-character strings for `bioname` are converted to `NA`.
* Removed dependencies: {crul}, {R.utils}, {tidyselect}.
* New dependencies: {cli}, {tools}.

# filibustr 0.2.1 (2024-05-02)

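The NEWS entries above describe two behaviors: a call reads from either a local file or the online source (with no online fallback), and online reads are tried up to 3 times. The package's actual implementation is not shown in this excerpt, so the following is only a minimal base-R sketch of that control flow. The names `with_retries()`, `read_data()`, and `flaky_fetch()` are hypothetical and are not part of {filibustr}.

```r
# Hypothetical helper: call `fetch()` up to `max_tries` times and
# return the first successful result.
with_retries <- function(fetch, max_tries = 3) {
  last_error <- NULL
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fetch(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
  }
  stop("All ", max_tries, " attempts failed: ", conditionMessage(last_error),
       call. = FALSE)
}

# Hypothetical reader: if `local_path` is given, read only from that file
# (a missing file is an error, not a trigger for an online fallback);
# otherwise read only from the online source.
read_data <- function(fetch_online, local_path = NULL) {
  if (!is.null(local_path)) {
    if (!file.exists(local_path)) {
      stop("File not found: ", local_path, call. = FALSE)
    }
    return(utils::read.csv(local_path))
  }
  with_retries(fetch_online)
}

# Simulated online source that fails twice, then succeeds.
counter <- new.env()
counter$n <- 0
flaky_fetch <- function() {
  counter$n <- counter$n + 1
  if (counter$n < 3) stop("simulated HTTP error")
  data.frame(congress = 117L)
}

online <- read_data(flaky_fetch)  # succeeds on the third try

# A missing local file is an error; the online source is not consulted.
missing_result <- tryCatch(
  read_data(flaky_fetch, local_path = file.path(tempdir(), "no_such_file.csv")),
  error = function(e) e
)
```

Keeping the two branches separate means a mistyped `local_path` fails loudly instead of silently downloading data, which matches the "no online fallback" note in the NEWS entry.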
142 changes: 0 additions & 142 deletions R/build_file_path.R

This file was deleted.

139 changes: 139 additions & 0 deletions R/build_url.R
@@ -0,0 +1,139 @@
+build_url <- function(data_source, chamber = "all", congress = NULL, sheet_type = NULL) {
+  chamber_code <- match_chamber(chamber)
+
+  congress_code <- match_congress(congress, call = rlang::caller_env())
+
+  url <- switch(
+    tolower(data_source),
+    voteview = build_voteview_url(sheet_type = sheet_type,
+                                  chamber_code = chamber_code,
+                                  congress_code = congress_code),
+    hvw = build_hvw_url(chamber_code = chamber_code),
+    lhy = build_hvw_url(chamber_code = chamber_code),
+    les = build_les_url(les_2 = sheet_type, chamber_code = chamber_code),
+    "source not implemented"
+  )
+
+  if (url == "source not implemented") {
+    cli::cli_abort(c(
+      "Invalid data source name: {.arg {data_source}}",
+      "i" = paste("Expected data sources (case-insensitive):",
+                  "{.arg Voteview},", "{.arg HVW},", "{.arg LHY},", "{.arg LES}")
+    ))
+  }
+
+  url
+}
+
+build_voteview_url <- function(sheet_type, chamber_code = "HS", congress_code = "all") {
+  source <- paste0("https://voteview.com/static/data/out/", sheet_type)
+
+  paste0(source, "/", chamber_code, congress_code, "_", sheet_type, ".csv")
+}
+
+build_hvw_url <- function(chamber_code) {
+  # no "all" option for HVW
+  if (!(chamber_code %in% c("H", "S"))) {
+    cli::cli_abort(c(
+      paste("Invalid {.arg chamber} argument ({.arg {chamber_code}})",
+            "provided for {.code get_hvw_data()}."),
+      "i" = "{.arg chamber} must be either House or Senate, not both."
+    ),
+    call = rlang::caller_env(2))
+  }
+
+  source <- "https://dataverse.harvard.edu/api/access/datafile"
+  file <- if (chamber_code == "H") "6299608" else "6299605"
+
+  paste0(source, "/", file)
+}
+
+build_les_url <- function(chamber_code, les_2 = FALSE) {
+  # no "all" option for LES
+  if (!(chamber_code %in% c("H", "S"))) {
+    cli::cli_abort(c(
+      "Invalid {.arg chamber} argument ({.arg {chamber_code}}) provided for {.code get_les()}.",
+      "i" = "{.arg chamber} must be either House or Senate, not both."
+    ),
+    call = rlang::caller_env(2))
+  }
+
+  source <- "https://thelawmakers.org/wp-content/uploads/2023/04"
+  chamber_name <- if (chamber_code == "H") "House" else "Senate"
+  sheet_type <- if (les_2) "117ReducedLES2" else "93to117ReducedClassic"
+
+  paste0(source, "/CEL", chamber_name, sheet_type, ".dta")
+}
+
+match_chamber <- function(chamber) {
+  chamber_code <- dplyr::case_match(tolower(chamber),
+                                    c("all", "congress", "hs") ~ "HS",
+                                    c("house", "h", "hr") ~ "H",
+                                    c("senate", "s", "sen") ~ "S",
+                                    .default = "HS_default")
+
+  # Warn for invalid chamber argument
+  if (chamber_code == "HS_default") {
+    cli::cli_warn(paste("Invalid {.arg chamber} argument ({.val {chamber}}) provided.",
+                        "Using {.arg chamber = {.val all}}."),
+                  call = rlang::caller_env())
+    chamber_code <- "HS"
+  }
+
+  chamber_code
+}
+
+#' Get Voteview string for a specified Congress
+#'
+#' Get a Congress number as a three-digit string.
+#' This is the format of Congress numbers in Voteview data file names.
+#'
+#' If no Congress number is given, this will return `"all"`.
+#' Any argument that is not a valid Congress number (i.e., the integers 1 to
+#' `r current_congress()`) is an error.
+#'
+#' @param congress A Congress number.
+#'
+#' Valid Congress numbers are integers between 1 and `r current_congress()`
+#' (the current Congress).
+#'
+#'
+#' @returns A three-character string.
+#'
+#' Either three digits between `"001"` and ``r paste0('"', current_congress(), '"')``,
+#' or `"all"` if no Congress is specified.
+#'
+#' @examples
+#' match_congress(118)
+#' match_congress(1)
+#'
+#' match_congress(NULL)
+#' match_congress(300)
+#' match_congress("not a valid number")
+#'
+#' @noRd
+match_congress <- function(congress = NULL, call = rlang::caller_env()) {
+  if (length(congress) > 1) {
+    return(sapply(congress,
+                  function(.x) match_congress(congress = .x, call = call)))
+  }
+
+  # default: all
+  if (is.null(congress)) {
+    return("all")
+  }
+
+  # error for invalid `congress`
+  if (!is.numeric(congress) ||
+      !all(congress %in% 1:current_congress())) {
+    cli::cli_abort(c(
+      "Invalid {.arg congress} argument ({.val {congress}}) provided.",
+      "i" = "{.arg congress} must be a whole number between {.val {1}} and {.val {current_congress()}}."
+    ),
+    call = call)
+  }
+
+  # valid Congress numbers: pad with zeros to 3 characters
+  stringr::str_pad(string = as.integer(congress),
+                   width = 3, side = "left", pad = 0)
+}