Tweaks
jeroen committed Aug 3, 2024
1 parent ae058de commit 1fc09d0
Showing 4 changed files with 15 additions and 14 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -22,7 +22,7 @@ Imports:
 rappdirs,
 digest
 LinkingTo: Rcpp
-RoxygenNote: 7.3.1
+RoxygenNote: 7.3.2
 Roxygen: list(markdown = TRUE)
 Suggests:
 magick (>= 1.7),
13 changes: 7 additions & 6 deletions R/tessdata.R
@@ -22,18 +22,20 @@
 #' @family tesseract
 #' @param lang three letter code for language, see [tessdata](https://github.com/tesseract-ocr/tessdata) repository.
 #' @param datapath destination directory where to download store the file
-#' @param best download the most accurate (but slower) trained models for Tesseract 4.0 or higher
+#' @param model either `fast` or `best` is currently supported. The latter downloads
+#' more accurate (but slower) trained models for Tesseract 4.0 or higher
 #' @param progress print progress while downloading
 #' @references [tesseract wiki: training data](https://tesseract-ocr.github.io/tessdoc/Data-Files)
 #' @examples \dontrun{
 #' if(is.na(match("fra", tesseract_info()$available)))
-#' tesseract_download("fra")
+#' tesseract_download("fra", model = 'best')
 #' french <- tesseract("fra")
 #' text <- ocr("https://jeroen.github.io/images/french_text.png", engine = french)
 #' cat(text)
 #' }
-tesseract_download <- function(lang, datapath = NULL, best = FALSE, progress = interactive()) {
+tesseract_download <- function(lang, datapath = NULL, model = c("fast", "best"), progress = interactive()) {
 stopifnot(is.character(lang))
+model <- match.arg(model)
 if(!length(datapath)){
 warn_on_linux()
 datapath <- tesseract_info()$datapath
@@ -45,7 +47,7 @@ tesseract_download <- function(lang, datapath = NULL, best = FALSE, progress = i
 repo <- "tessdata"
 release <- "3.04.00"
 } else {
-repo <- ifelse(best, "tessdata_best", "tessdata_fast")
+repo <- paste0("tessdata_", model)
 release <- "4.1.0"
 }

@@ -54,8 +56,7 @@ tesseract_download <- function(lang, datapath = NULL, best = FALSE, progress = i
 destfile <- file.path(datapath, basename(url))

 if (file.exists(destfile)) {
-message("Training data already exists.")
-return(destfile)
+message(paste("Training data already exists. Overwriting", destfile))
 }

 req <- curl::curl_fetch_memory(url, curl::new_handle(
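
The argument handling introduced in R/tessdata.R can be illustrated in isolation. The following is a minimal sketch, not code from the package: `select_repo` is a hypothetical helper that only mirrors the `match.arg()` and `paste0()` steps shown in the diff, so it runs without Tesseract or network access.

```r
# Hypothetical helper (not part of the tesseract package) that mirrors
# the repository-selection step from the diff above.
select_repo <- function(model = c("fast", "best")) {
  # match.arg() returns the first choice when the argument is missing
  # and raises an error for values outside the allowed set.
  model <- match.arg(model)
  paste0("tessdata_", model)
}

select_repo()        # "tessdata_fast" (default is the first choice)
select_repo("best")  # "tessdata_best"

# An unsupported value fails early instead of producing a bad repository name.
outcome <- tryCatch(select_repo("legacy"), error = function(e) "error")
outcome              # "error"
```

Because the new signature shown in the diff has no `best` formal and no `...`, a call such as `tesseract_download("fra", best = TRUE)` would stop with an unused-argument error; callers would need to pass `model = "best"` instead.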
4 changes: 2 additions & 2 deletions man/ocr.Rd

Generated file; diff not rendered.

10 changes: 5 additions & 5 deletions man/tessdata.Rd

Generated file; diff not rendered.
