allow for source = NULL in a talkr init (#100)
* allow for source = NULL in a talkr init

* check that existing columns aren't overwritten when source = NULL
bvreede authored Aug 22, 2024
1 parent 8a4b578 commit 5413cc4
Showing 3 changed files with 38 additions and 2 deletions.
9 changes: 8 additions & 1 deletion R/init.R
@@ -6,7 +6,9 @@
#' Initializing a talkr dataset is the first step in the talkr workflow.
#'
#' @param data A dataframe object
-#' @param source The column name identifying the conversation source (e.g. a filename; is used as unique conversation ID)
+#' @param source The column name identifying the conversation source
+#' (e.g. a filename; is used as unique conversation ID). If there are no different
+#' sources in the data, set this parameter to `NULL`.
#' @param begin The column name with the begin time of the utterance (in milliseconds)
#' @param end The column name with the end time of the utterance (in milliseconds)
#' @param participant The column name with the participant who produced the utterance
@@ -43,6 +45,11 @@ init <- function(data,
     data$end <- as.numeric(data$end)
   }

+  # ensure a `source` column exists; if it does not exist, create one
+  if(!"source" %in% names(data)){
+    data$source <- "talkr"
+  }
+
   # generate UIDs
   if("uid" %in% names(data)){
     warning("Column 'uid' already exists in the dataset. This column will be renamed to `original_uid`.")
4 changes: 3 additions & 1 deletion man/init.Rd

27 changes: 27 additions & 0 deletions tests/testthat/test-init.R
@@ -113,3 +113,30 @@ test_that("Warning is generated with existing UID column", {
   expect_true("original_uid" %in% names(talkr_dataset))

 })
+
+test_that("init works with source = NULL", {
+  expect_no_error(talkr_dataset <- init(dummy_data,
+                                        source = NULL,
+                                        begin = "col1",
+                                        end = "col2",
+                                        participant = "x",
+                                        utterance = "y"))
+  expect_false("source" %in% names(dummy_data))
+  expect_true("source" %in% names(talkr_dataset))
+  expected_UIDs <- c("talkr-0001-1",
+                     "talkr-0002-2",
+                     "talkr-0003-3",
+                     "talkr-0004-4",
+                     "talkr-0005-5")
+  expect_equal(talkr_dataset$uid, expected_UIDs)
+})
+
+test_that("init does not overwrite existing columns when source = NULL", {
+  data <- data.frame(begin = 1:4,
+                     end = 5:8,
+                     participant = "Person1",
+                     utterance = "HelloWorld",
+                     source = "A.txt")
+  talkr_dataset <- init(data, source = NULL)
+  expect_equal(talkr_dataset$source, data$source)
+})
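
The fallback added to `init()` in this commit can be tried in isolation with a small sketch. The helper name `ensure_source()` and its `default` argument are hypothetical and not part of talkr; the body only mirrors the `source` check shown in the R/init.R hunk, under the assumption that nothing else in `init()` touches that column.

```r
# Hypothetical helper (not part of talkr): mirrors the fallback in the diff,
# creating a `source` column only when the data has none.
ensure_source <- function(data, default = "talkr") {
  if (!"source" %in% names(data)) {
    data$source <- default  # length-1 value is recycled to every row
  }
  data
}

# One data frame without a `source` column, one that already has it.
no_source <- data.frame(begin = 1:2, end = 3:4)
with_source <- data.frame(begin = 1:2, end = 3:4, source = "A.txt")

filled <- ensure_source(no_source)      # gains a `source` column set to the default
kept <- ensure_source(with_source)      # existing `source` values are left untouched
```

This matches what the two new tests check: a dataset without a `source` column gains one, and an existing `source` column is not overwritten.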
