From 61087282843866ad6bb6142e96141399b634d089 Mon Sep 17 00:00:00 2001 From: hadley Date: Fri, 4 Aug 2023 19:42:36 +0000 Subject: [PATCH] =?UTF-8?q?Deploying=20to=20gh-pages=20from=20@=20tidyvers?= =?UTF-8?q?e/stringr@aee0ebc9aa5cb17b9af0a82fd5fc83e7572dea72=20?= =?UTF-8?q?=F0=9F=9A=80?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- dev/articles/from-base.html | 106 +++++++++++++++++------------------ dev/pkgdown.yml | 2 +- dev/reference/str_split.html | 2 +- dev/search.json | 2 +- 4 files changed, 56 insertions(+), 56 deletions(-) diff --git a/dev/articles/from-base.html b/dev/articles/from-base.html index 03a32387..5f8806ce 100644 --- a/dev/articles/from-base.html +++ b/dev/articles/from-base.html @@ -120,23 +120,23 @@

Overall differences#> ! Using `across()` without supplying `.cols` was deprecated in dplyr #> 1.1.0. #> Please supply `.cols` instead. -
- diff --git a/dev/pkgdown.yml b/dev/pkgdown.yml index 0f943954..bfd1dfba 100644 --- a/dev/pkgdown.yml +++ b/dev/pkgdown.yml @@ -5,7 +5,7 @@ articles: from-base: from-base.html regular-expressions: regular-expressions.html stringr: stringr.html -last_built: 2023-08-04T19:14Z +last_built: 2023-08-04T19:42Z urls: reference: https://stringr.tidyverse.org/reference article: https://stringr.tidyverse.org/articles diff --git a/dev/reference/str_split.html b/dev/reference/str_split.html index 0191bc7f..1a019cf3 100644 --- a/dev/reference/str_split.html +++ b/dev/reference/str_split.html @@ -123,7 +123,7 @@

Arguments.

diff --git a/dev/search.json b/dev/search.json index 336bd5a7..7389e892 100644 --- a/dev/search.json +++ b/dev/search.json @@ -1 +1 @@ -[{"path":"https://stringr.tidyverse.org/dev/LICENSE.html","id":null,"dir":"","previous_headings":"","what":"MIT License","title":"MIT License","text":"Copyright (c) 2020 stringr authors Permission hereby granted, free charge, person obtaining copy software associated documentation files (“Software”), deal Software without restriction, including without limitation rights use, copy, modify, merge, publish, distribute, sublicense, /sell copies Software, permit persons Software furnished , subject following conditions: copyright notice permission notice shall included copies substantial portions Software. SOFTWARE PROVIDED “”, WITHOUT WARRANTY KIND, EXPRESS IMPLIED, INCLUDING LIMITED WARRANTIES MERCHANTABILITY, FITNESS PARTICULAR PURPOSE NONINFRINGEMENT. EVENT SHALL AUTHORS COPYRIGHT HOLDERS LIABLE CLAIM, DAMAGES LIABILITY, WHETHER ACTION CONTRACT, TORT OTHERWISE, ARISING , CONNECTION SOFTWARE USE DEALINGS SOFTWARE.","code":""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"overall-differences","dir":"Articles","previous_headings":"","what":"Overall differences","title":"From base R","text":"’ll begin lookup table important stringr functions base R equivalents. str_detect(string, pattern) grepl(pattern, x) str_dup(string, times) strrep(x, times) str_extract(string, pattern) regmatches(x, m = regexpr(pattern, text)) str_extract_all(string, pattern) regmatches(x, m = gregexpr(pattern, text)) str_length(string) nchar(x) str_locate(string, pattern) regexpr(pattern, text) str_locate_all(string, pattern) gregexpr(pattern, text) str_match(string, pattern) regmatches(x, m = regexec(pattern, text)) str_order(string) order(...) 
str_replace(string, pattern, replacement) sub(pattern, replacement, x) str_replace_all(string, pattern, replacement) gsub(pattern, replacement, x) str_sort(string) sort(x) str_split(string, pattern) strsplit(x, split) str_sub(string, start, end) substr(x, start, stop) str_subset(string, pattern) grep(pattern, x, value = TRUE) str_to_lower(string) tolower(x) str_to_title(string) tools::toTitleCase(text) str_to_upper(string) toupper(x) str_trim(string) trimws(x) str_which(string, pattern) grep(pattern, x) str_wrap(string) strwrap(x) Overall main differences base R stringr : stringr functions start str_ prefix; base R string functions consistent naming scheme. order inputs usually different base R stringr. base R, pattern match usually comes first; stringr, string manupulate always comes first. makes stringr easier use pipes, lapply() purrr::map(). Functions stringr tend less, many string processing functions base R multiple purposes. output input stringr functions carefully designed. example, output str_locate() can fed directly str_sub(); true regpexpr() substr(). Base functions use arguments (like perl, fixed, ignore.case) control pattern interpreted. avoid dependence arguments, stringr instead uses helper functions (like fixed(), regex(), coll()). Next ’ll walk functions, noting similarities important differences. examples adapted stringr documentation contrasted analogous base R operations.","code":"#> Warning: There was 1 warning in `dplyr::mutate()`. #> ℹ In argument: `dplyr::across(.fns = ~paste0(\"`\", .x, \"`\"))`. #> Caused by warning: #> ! Using `across()` without supplying `.cols` was deprecated in dplyr #> 1.1.0. 
#> ℹ Please supply `.cols` instead."},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_detect-detect-the-presence-or-absence-of-a-pattern-in-a-string","dir":"Articles","previous_headings":"Detect matches","what":"str_detect(): Detect the presence or absence of a pattern in a string","title":"From base R","text":"Suppose want know whether word vector fruit names contains “”. base use grepl() (see “l” think logical) stringr use str_detect() (see verb “detect” think yes/action).","code":"fruit <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") # base grepl(pattern = \"a\", x = fruit) #> [1] TRUE TRUE TRUE TRUE # stringr str_detect(fruit, pattern = \"a\") #> [1] TRUE TRUE TRUE TRUE"},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_which-find-positions-matching-a-pattern","dir":"Articles","previous_headings":"Detect matches","what":"str_which(): Find positions matching a pattern","title":"From base R","text":"Now want identify positions words vector fruit names contain “”. base use grep() stringr use str_which() (analogy ()).","code":"# base grep(pattern = \"a\", x = fruit) #> [1] 1 2 3 4 # stringr str_which(fruit, pattern = \"a\") #> [1] 1 2 3 4"},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_count-count-the-number-of-matches-in-a-string","dir":"Articles","previous_headings":"Detect matches","what":"str_count(): Count the number of matches in a string","title":"From base R","text":"many “”s fruit? 
information can gleaned gregexpr() base, need look match.length attribute vector uses length-1 integer vector (-1) indicate match.","code":"# base loc <- gregexpr(pattern = \"a\", text = fruit, fixed = TRUE) sapply(loc, function(x) length(attr(x, \"match.length\"))) #> [1] 1 3 1 1 # stringr str_count(fruit, pattern = \"a\") #> [1] 1 3 1 1"},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_locate-locate-the-position-of-patterns-in-a-string","dir":"Articles","previous_headings":"Detect matches","what":"str_locate(): Locate the position of patterns in a string","title":"From base R","text":"Within fruit, first “p” occur? “p”s?","code":"fruit3 <- c(\"papaya\", \"lime\", \"apple\") # base str(gregexpr(pattern = \"p\", text = fruit3)) #> List of 3 #> $ : int [1:2] 1 3 #> ..- attr(*, \"match.length\")= int [1:2] 1 1 #> ..- attr(*, \"index.type\")= chr \"chars\" #> ..- attr(*, \"useBytes\")= logi TRUE #> $ : int -1 #> ..- attr(*, \"match.length\")= int -1 #> ..- attr(*, \"index.type\")= chr \"chars\" #> ..- attr(*, \"useBytes\")= logi TRUE #> $ : int [1:2] 2 3 #> ..- attr(*, \"match.length\")= int [1:2] 1 1 #> ..- attr(*, \"index.type\")= chr \"chars\" #> ..- attr(*, \"useBytes\")= logi TRUE # stringr str_locate(fruit3, pattern = \"p\") #> start end #> [1,] 1 1 #> [2,] NA NA #> [3,] 2 2 str_locate_all(fruit3, pattern = \"p\") #> [[1]] #> start end #> [1,] 1 1 #> [2,] 3 3 #> #> [[2]] #> start end #> #> [[3]] #> start end #> [1,] 2 2 #> [2,] 3 3"},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_sub-extract-and-replace-substrings-from-a-character-vector","dir":"Articles","previous_headings":"Subset strings","what":"str_sub(): Extract and replace substrings from a character vector","title":"From base R","text":"want grab part string? base use substr() substring(). former requires start stop substring latter assumes stop end string. 
stringr version, str_sub() functionality, also gives default start value (beginning string). base stringr functions order expected inputs. stringr can use negative numbers index right-hand side string: -1 last letter, -2 second last, . base R stringr subset vectorized parameters. means can either choose subset across multiple strings specify different subsets different strings. stringr automatically recycle first argument length start stop: Whereas base equivalent silently uses just first value:","code":"hw <- \"Hadley Wickham\" # base substr(hw, start = 1, stop = 6) #> [1] \"Hadley\" substring(hw, first = 1) #> [1] \"Hadley Wickham\" # stringr str_sub(hw, start = 1, end = 6) #> [1] \"Hadley\" str_sub(hw, start = 1) #> [1] \"Hadley Wickham\" str_sub(hw, end = 6) #> [1] \"Hadley\" str_sub(hw, start = 1, end = -1) #> [1] \"Hadley Wickham\" str_sub(hw, start = -5, end = -2) #> [1] \"ckha\" al <- \"Ada Lovelace\" # base substr(c(hw,al), start = 1, stop = 6) #> [1] \"Hadley\" \"Ada Lo\" substr(c(hw,al), start = c(1,1), stop = c(6,7)) #> [1] \"Hadley\" \"Ada Lov\" # stringr str_sub(c(hw,al), start = 1, end = -1) #> [1] \"Hadley Wickham\" \"Ada Lovelace\" str_sub(c(hw,al), start = c(1,1), end = c(-1,-2)) #> [1] \"Hadley Wickham\" \"Ada Lovelac\" str_sub(hw, start = 1:5) #> [1] \"Hadley Wickham\" \"adley Wickham\" \"dley Wickham\" \"ley Wickham\" #> [5] \"ey Wickham\" substr(hw, start = 1:5, stop = 15) #> [1] \"Hadley Wickham\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_sub---subset-assignment","dir":"Articles","previous_headings":"Subset strings","what":"str_sub() <-: Subset assignment","title":"From base R","text":"substr() behaves surprising way replace substring different number characters: str_sub() expect:","code":"# base x <- \"ABCDEF\" substr(x, 1, 3) <- \"x\" x #> [1] \"xBCDEF\" # stringr x <- \"ABCDEF\" str_sub(x, 1, 3) <- \"x\" x #> [1] 
\"xDEF\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_subset-keep-strings-matching-a-pattern-or-find-positions","dir":"Articles","previous_headings":"Subset strings","what":"str_subset(): Keep strings matching a pattern, or find positions","title":"From base R","text":"may want retrieve strings contain pattern interest:","code":"# base grep(pattern = \"g\", x = fruit, value = TRUE) #> character(0) # stringr str_subset(fruit, pattern = \"g\") #> character(0)"},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_extract-extract-matching-patterns-from-a-string","dir":"Articles","previous_headings":"Subset strings","what":"str_extract(): Extract matching patterns from a string","title":"From base R","text":"may want pick certain patterns string, example, digits shopping list: Base R requires combination regexpr() regmatches(); note strings without matches dropped output. stringr provides str_extract() str_extract_all(), output always length input.","code":"shopping_list <- c(\"apples x4\", \"bag of flour\", \"10\", \"milk x2\") # base matches <- regexpr(pattern = \"\\\\d+\", text = shopping_list) # digits regmatches(shopping_list, m = matches) #> [1] \"4\" \"10\" \"2\" matches <- gregexpr(pattern = \"[a-z]+\", text = shopping_list) # words regmatches(shopping_list, m = matches) #> [[1]] #> [1] \"apples\" \"x\" #> #> [[2]] #> [1] \"bag\" \"of\" \"flour\" #> #> [[3]] #> character(0) #> #> [[4]] #> [1] \"milk\" \"x\" # stringr str_extract(shopping_list, pattern = \"\\\\d+\") #> [1] \"4\" NA \"10\" \"2\" str_extract_all(shopping_list, \"[a-z]+\") #> [[1]] #> [1] \"apples\" \"x\" #> #> [[2]] #> [1] \"bag\" \"of\" \"flour\" #> #> [[3]] #> character(0) #> #> [[4]] #> [1] \"milk\" \"x\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_match-extract-matched-groups-from-a-string","dir":"Articles","previous_headings":"Subset strings","what":"str_match(): Extract matched groups from a 
string","title":"From base R","text":"may also want extract groups string. ’m going use scenario Section 14.4.3 R Data Science. extracting full match base R requires combination two functions, inputs matches dropped output.","code":"head(sentences) #> [1] \"The birch canoe slid on the smooth planks.\" #> [2] \"Glue the sheet to the dark blue background.\" #> [3] \"It's easy to tell the depth of a well.\" #> [4] \"These days a chicken leg is a rare dish.\" #> [5] \"Rice is often served in round bowls.\" #> [6] \"The juice of lemons makes fine punch.\" noun <- \"([A]a|[Tt]he) ([^ ]+)\" # base matches <- regexec(pattern = noun, text = head(sentences)) do.call(\"rbind\", regmatches(x = head(sentences), m = matches)) #> [,1] [,2] [,3] #> [1,] \"The birch\" \"The\" \"birch\" #> [2,] \"the sheet\" \"the\" \"sheet\" #> [3,] \"the depth\" \"the\" \"depth\" #> [4,] \"The juice\" \"The\" \"juice\" # stringr str_match(head(sentences), pattern = noun) #> [,1] [,2] [,3] #> [1,] \"The birch\" \"The\" \"birch\" #> [2,] \"the sheet\" \"the\" \"sheet\" #> [3,] \"the depth\" \"the\" \"depth\" #> [4,] NA NA NA #> [5,] NA NA NA #> [6,] \"The juice\" \"The\" \"juice\""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_length-the-length-of-a-string","dir":"Articles","previous_headings":"Manage lengths","what":"str_length(): The length of a string","title":"From base R","text":"determine length string, base R uses nchar() (confused length() gives length vectors, etc.) stringr uses str_length(). subtle differences base stringr . nchar() requires character vector, return error used factor. str_length() can handle factor input. Note “characters” poorly defined concept, technically nchar() str_length() returns number code points. 
usually ’d consider charcter, always:","code":"# base nchar(letters) #> [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 # stringr str_length(letters) #> [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 # base nchar(factor(\"abc\")) #> Error in nchar(factor(\"abc\")): 'nchar()' requires a character vector # stringr str_length(factor(\"abc\")) #> [1] 3 x <- c(\"\\u00fc\", \"u\\u0308\") x #> [1] \"ü\" \"ü\" nchar(x) #> [1] 1 2 str_length(x) #> [1] 1 2"},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_pad-pad-a-string","dir":"Articles","previous_headings":"Manage lengths","what":"str_pad(): Pad a string","title":"From base R","text":"pad string certain width, use stringr’s str_pad(). base R use sprintf(), unlike str_pad(), sprintf() many functionalities.","code":"# base sprintf(\"%30s\", \"hadley\") #> [1] \" hadley\" sprintf(\"%-30s\", \"hadley\") #> [1] \"hadley \" # \"both\" is not as straightforward # stringr rbind( str_pad(\"hadley\", 30, \"left\"), str_pad(\"hadley\", 30, \"right\"), str_pad(\"hadley\", 30, \"both\") ) #> [,1] #> [1,] \" hadley\" #> [2,] \"hadley \" #> [3,] \" hadley \""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_trunc-truncate-a-character-string","dir":"Articles","previous_headings":"Manage lengths","what":"str_trunc(): Truncate a character string","title":"From base R","text":"stringr package provides easy way truncate character string: str_trunc(). 
Base R function directly.","code":"x <- \"This string is moderately long\" # stringr rbind( str_trunc(x, 20, \"right\"), str_trunc(x, 20, \"left\"), str_trunc(x, 20, \"center\") ) #> [,1] #> [1,] \"This string is mo...\" #> [2,] \"...s moderately long\" #> [3,] \"This stri...ely long\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_trim-trim-whitespace-from-a-string","dir":"Articles","previous_headings":"Manage lengths","what":"str_trim(): Trim whitespace from a string","title":"From base R","text":"Similarly, stringr provides str_trim() trim whitespace string. analogous base R’s trimws() added R 3.3.0. stringr function str_squish() allows extra whitespace within string trimmed (contrast str_trim() removes whitespace beginning /end string). base R, one might take advantage gsub() accomplish effect.","code":"# base trimws(\" String with trailing and leading white space\\t\") #> [1] \"String with trailing and leading white space\" trimws(\"\\n\\nString with trailing and leading white space\\n\\n\") #> [1] \"String with trailing and leading white space\" # stringr str_trim(\" String with trailing and leading white space\\t\") #> [1] \"String with trailing and leading white space\" str_trim(\"\\n\\nString with trailing and leading white space\\n\\n\") #> [1] \"String with trailing and leading white space\" # stringr str_squish(\" String with trailing, middle, and leading white space\\t\") #> [1] \"String with trailing, middle, and leading white space\" str_squish(\"\\n\\nString with excess, trailing and leading white space\\n\\n\") #> [1] \"String with excess, trailing and leading white space\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_wrap-wrap-strings-into-nicely-formatted-paragraphs","dir":"Articles","previous_headings":"Manage lengths","what":"str_wrap(): Wrap strings into nicely formatted paragraphs","title":"From base R","text":"strwrap() str_wrap() use different algorithms. 
str_wrap() uses famous Knuth-Plass algorithm. Note strwrap() returns character vector one element line; str_wrap() returns single string containing line breaks.","code":"gettysburg <- \"Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal.\" # base cat(strwrap(gettysburg, width = 60), sep = \"\\n\") #> Four score and seven years ago our fathers brought forth on #> this continent, a new nation, conceived in Liberty, and #> dedicated to the proposition that all men are created #> equal. # stringr cat(str_wrap(gettysburg, width = 60), \"\\n\") #> Four score and seven years ago our fathers brought forth #> on this continent, a new nation, conceived in Liberty, and #> dedicated to the proposition that all men are created equal."},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_replace-replace-matched-patterns-in-a-string","dir":"Articles","previous_headings":"Mutate strings","what":"str_replace(): Replace matched patterns in a string","title":"From base R","text":"replace certain patterns within string, stringr provides functions str_replace() str_replace_all(). base R equivalents sub() gsub(). 
Note difference default input order .","code":"fruits <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") # base sub(\"[aeiou]\", \"-\", fruits) #> [1] \"-pple\" \"b-nana\" \"p-ar\" \"p-neapple\" gsub(\"[aeiou]\", \"-\", fruits) #> [1] \"-ppl-\" \"b-n-n-\" \"p--r\" \"p-n--ppl-\" # stringr str_replace(fruits, \"[aeiou]\", \"-\") #> [1] \"-pple\" \"b-nana\" \"p-ar\" \"p-neapple\" str_replace_all(fruits, \"[aeiou]\", \"-\") #> [1] \"-ppl-\" \"b-n-n-\" \"p--r\" \"p-n--ppl-\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"case-convert-case-of-a-string","dir":"Articles","previous_headings":"Mutate strings","what":"case: Convert case of a string","title":"From base R","text":"stringr base R functions convert upper lower case. Title case also provided stringr. stringr can control locale, base R locale distinctions controlled global variables. Therefore, output base R code may vary across different computers different global settings.","code":"dog <- \"The quick brown dog\" # base toupper(dog) #> [1] \"THE QUICK BROWN DOG\" tolower(dog) #> [1] \"the quick brown dog\" tools::toTitleCase(dog) #> [1] \"The Quick Brown Dog\" # stringr str_to_upper(dog) #> [1] \"THE QUICK BROWN DOG\" str_to_lower(dog) #> [1] \"the quick brown dog\" str_to_title(dog) #> [1] \"The Quick Brown Dog\" # stringr str_to_upper(\"i\") # English #> [1] \"I\" str_to_upper(\"i\", locale = \"tr\") # Turkish #> [1] \"İ\""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_flatten-flatten-a-string","dir":"Articles","previous_headings":"Join and split","what":"str_flatten(): Flatten a string","title":"From base R","text":"want take elements string vector collapse single string can use collapse argument paste() use stringr’s str_flatten(). 
advantage str_flatten() always returns vector length input; predict return length paste() must carefully read arguments.","code":"# base paste0(letters, collapse = \"-\") #> [1] \"a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-q-r-s-t-u-v-w-x-y-z\" # stringr str_flatten(letters, collapse = \"-\") #> [1] \"a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-q-r-s-t-u-v-w-x-y-z\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_dup-duplicate-strings-within-a-character-vector","dir":"Articles","previous_headings":"Join and split","what":"str_dup(): duplicate strings within a character vector","title":"From base R","text":"duplicate strings within character vector use strrep() (R 3.3.0 greater) str_dup():","code":"fruit <- c(\"apple\", \"pear\", \"banana\") # base strrep(fruit, 2) #> [1] \"appleapple\" \"pearpear\" \"bananabanana\" strrep(fruit, 1:3) #> [1] \"apple\" \"pearpear\" \"bananabananabanana\" # stringr str_dup(fruit, 2) #> [1] \"appleapple\" \"pearpear\" \"bananabanana\" str_dup(fruit, 1:3) #> [1] \"apple\" \"pearpear\" \"bananabananabanana\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_split-split-up-a-string-into-pieces","dir":"Articles","previous_headings":"Join and split","what":"str_split(): Split up a string into pieces","title":"From base R","text":"split string pieces breaks based particular pattern match stringr uses str_split() base R uses strsplit(). Unlike functions, strsplit() starts character vector modify. 
stringr package’s str_split() allows control split, including restricting number possible matches.","code":"fruits <- c( \"apples and oranges and pears and bananas\", \"pineapples and mangos and guavas\" ) # base strsplit(fruits, \" and \") #> [[1]] #> [1] \"apples\" \"oranges\" \"pears\" \"bananas\" #> #> [[2]] #> [1] \"pineapples\" \"mangos\" \"guavas\" # stringr str_split(fruits, \" and \") #> [[1]] #> [1] \"apples\" \"oranges\" \"pears\" \"bananas\" #> #> [[2]] #> [1] \"pineapples\" \"mangos\" \"guavas\" # stringr str_split(fruits, \" and \", n = 3) #> [[1]] #> [1] \"apples\" \"oranges\" \"pears and bananas\" #> #> [[2]] #> [1] \"pineapples\" \"mangos\" \"guavas\" str_split(fruits, \" and \", n = 2) #> [[1]] #> [1] \"apples\" \"oranges and pears and bananas\" #> #> [[2]] #> [1] \"pineapples\" \"mangos and guavas\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_glue-interpolate-strings","dir":"Articles","previous_headings":"Join and split","what":"str_glue(): Interpolate strings","title":"From base R","text":"’s often useful interpolate varying values fixed string. 
base R, can use sprintf() purpose; stringr provides wrapper general purpose glue package.","code":"name <- \"Fred\" age <- 50 anniversary <- as.Date(\"1991-10-12\") # base sprintf( \"My name is %s my age next year is %s and my anniversary is %s.\", name, age + 1, format(anniversary, \"%A, %B %d, %Y\") ) #> [1] \"My name is Fred my age next year is 51 and my anniversary is Saturday, October 12, 1991.\" # stringr str_glue( \"My name is {name}, \", \"my age next year is {age + 1}, \", \"and my anniversary is {format(anniversary, '%A, %B %d, %Y')}.\" ) #> My name is Fred, my age next year is 51, and my anniversary is Saturday, October 12, 1991."},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_order-order-or-sort-a-character-vector","dir":"Articles","previous_headings":"Order strings","what":"str_order(): Order or sort a character vector","title":"From base R","text":"base R stringr separate functions order sort strings. options str_order() str_sort() don’t analogous base R options. example, stringr functions locale argument control order sort. base R locale global setting, outputs sort() order() may differ across different computers. 
example, Norwegian alphabet, å comes z: stringr functions also numeric argument sort digits numerically instead treating strings.","code":"# base order(letters) #> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 #> [24] 24 25 26 sort(letters) #> [1] \"a\" \"b\" \"c\" \"d\" \"e\" \"f\" \"g\" \"h\" \"i\" \"j\" \"k\" \"l\" \"m\" \"n\" \"o\" \"p\" \"q\" #> [18] \"r\" \"s\" \"t\" \"u\" \"v\" \"w\" \"x\" \"y\" \"z\" # stringr str_order(letters) #> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 #> [24] 24 25 26 str_sort(letters) #> [1] \"a\" \"b\" \"c\" \"d\" \"e\" \"f\" \"g\" \"h\" \"i\" \"j\" \"k\" \"l\" \"m\" \"n\" \"o\" \"p\" \"q\" #> [18] \"r\" \"s\" \"t\" \"u\" \"v\" \"w\" \"x\" \"y\" \"z\" x <- c(\"å\", \"a\", \"z\") str_sort(x) #> [1] \"a\" \"å\" \"z\" str_sort(x, locale = \"no\") #> [1] \"a\" \"z\" \"å\" # stringr x <- c(\"100a10\", \"100a5\", \"2b\", \"2a\") str_sort(x) #> [1] \"100a10\" \"100a5\" \"2a\" \"2b\" str_sort(x, numeric = TRUE) #> [1] \"2a\" \"2b\" \"100a5\" \"100a10\""},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"basic-matches","dir":"Articles","previous_headings":"","what":"Basic matches","title":"Regular expressions","text":"simplest patterns match exact strings: can perform case-insensitive match using ignore_case = TRUE: next step complexity ., matches character except newline: can allow . 
match everything, including \\n, setting dotall = TRUE:","code":"x <- c(\"apple\", \"banana\", \"pear\") str_extract(x, \"an\") #> [1] NA \"an\" NA bananas <- c(\"banana\", \"Banana\", \"BANANA\") str_detect(bananas, \"banana\") #> [1] TRUE FALSE FALSE str_detect(bananas, regex(\"banana\", ignore_case = TRUE)) #> [1] TRUE TRUE TRUE str_extract(x, \".a.\") #> [1] NA \"ban\" \"ear\" str_detect(\"\\nX\\n\", \".X.\") #> [1] FALSE str_detect(\"\\nX\\n\", regex(\".X.\", dotall = TRUE)) #> [1] TRUE"},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"escaping","dir":"Articles","previous_headings":"","what":"Escaping","title":"Regular expressions","text":"“.” matches character, match literal “.”? need use “escape” tell regular expression want match exactly, use special behaviour. Like strings, regexps use backslash, \\, escape special behaviour. match ., need regexp \\.. Unfortunately creates problem. use strings represent regular expressions, \\ also used escape symbol strings. create regular expression \\. need string \"\\\\.\". \\ used escape character regular expressions, match literal \\? Well need escape , creating regular expression \\\\. create regular expression, need use string, also needs escape \\. means match literal \\ need write \"\\\\\\\\\" — need four backslashes match one! vignette, use \\. denote regular expression, \"\\\\.\" denote string represents regular expression. alternative quoting mechanism \\Q...\\E: characters ... treated exact matches. useful want exactly match user input part regular expression.","code":"# To create the regular expression, we need \\\\ dot <- \"\\\\.\" # But the expression itself only contains one: writeLines(dot) #> \\. # And this tells R to look for an explicit . 
str_extract(c(\"abc\", \"a.c\", \"bef\"), \"a\\\\.c\") #> [1] NA \"a.c\" NA x <- \"a\\\\b\" writeLines(x) #> a\\b str_extract(x, \"\\\\\\\\\") #> [1] \"\\\\\" x <- c(\"a.b.c.d\", \"aeb\") starts_with <- \"a.b\" str_detect(x, paste0(\"^\", starts_with)) #> [1] TRUE TRUE str_detect(x, paste0(\"^\\\\Q\", starts_with, \"\\\\E\")) #> [1] TRUE FALSE"},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"special-characters","dir":"Articles","previous_headings":"","what":"Special characters","title":"Regular expressions","text":"Escapes also allow specify individual characters otherwise hard type. can specify individual unicode characters five ways, either variable number hex digits (four common), name: \\xhh: 2 hex digits. \\x{hhhh}: 1-6 hex digits. \\uhhhh: 4 hex digits. \\Uhhhhhhhh: 8 hex digits. \\N{name}, e.g. \\N{grinning face} matches basic smiling emoji. Similarly, can specify many common control characters: \\: bell. \\cX: match control-X character. \\e: escape (\\u001B). \\f: form feed (\\u000C). \\n: line feed (\\u000A). \\r: carriage return (\\u000D). \\t: horizontal tabulation (\\u0009). \\0ooo match octal character. ‘ooo’ one three octal digits, 000 0377. leading zero required. (Many historical interest included sake completeness.)","code":""},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"matching-multiple-characters","dir":"Articles","previous_headings":"","what":"Matching multiple characters","title":"Regular expressions","text":"number patterns match one character. ’ve already seen ., matches character (except newline). closely related operator \\X, matches grapheme cluster, set individual elements form single symbol. example, one way representing “á” letter “” plus accent: . match component “”, \\X match complete symbol: five escaped pairs match narrower classes characters: \\d: matches digit. complement, \\D, matches character decimal digit. 
Technically, \\d includes character Unicode Category Nd (“Number, Decimal Digit”), also includes numeric symbols languages: \\s: matches whitespace. includes tabs, newlines, form feeds, character Unicode Z Category (includes variety space characters separators.). complement, \\S, matches non-whitespace character. \\p{property name} matches character specific unicode property, like \\p{Uppercase} \\p{Diacritic}. complement, \\P{property name}, matches characters without property. complete list unicode properties can found http://www.unicode.org/reports/tr44/#Property_Index. \\w matches “word” character, includes alphabetic characters, marks decimal numbers. complement, \\W, matches non-word character. Technically, \\w also matches connector punctuation, \\u200c (zero width connector), \\u200d (zero width joiner), rarely seen wild. \\b matches word boundaries, transition word non-word characters. \\B matches opposite: boundaries either word non-word characters either side. can also create character classes using []: [abc]: matches , b, c. [-z]: matches every character z (Unicode code point order). [^abc]: matches anything except , b, c. [\\^\\-]: matches ^ -. number pre-built classes can use inside []: [:punct:]: punctuation. [:alpha:]: letters. [:lower:]: lowercase letters. [:upper:]: upperclass letters. [:digit:]: digits. [:xdigit:]: hex digits. [:alnum:]: letters numbers. [:cntrl:]: control characters. [:graph:]: letters, numbers, punctuation. [:print:]: letters, numbers, punctuation, whitespace. [:space:]: space characters (basically equivalent \\s). [:blank:]: space tab. go inside [] character classes, .e. [[:digit:]AX] matches digits, , X. can also using Unicode properties, like [\\p{Letter}], various set operations, like [\\p{Letter}--\\p{script=latin}]. 
See ?\"stringi-search-charclass\" details.","code":"x <- \"a\\u0301\" str_extract(x, \".\") #> [1] \"a\" str_extract(x, \"\\\\X\") #> [1] \"á\" str_extract_all(\"1 + 2 = 3\", \"\\\\d+\")[[1]] #> [1] \"1\" \"2\" \"3\" # Some Laotian numbers str_detect(\"១២៣\", \"\\\\d\") #> [1] TRUE (text <- \"Some \\t badly\\n\\t\\tspaced \\f text\") #> [1] \"Some \\t badly\\n\\t\\tspaced \\f text\" str_replace_all(text, \"\\\\s+\", \" \") #> [1] \"Some badly spaced text\" (text <- c('\"Double quotes\"', \"«Guillemet»\", \"“Fancy quotes”\")) #> [1] \"\\\"Double quotes\\\"\" \"«Guillemet»\" \"“Fancy quotes”\" str_replace_all(text, \"\\\\p{quotation mark}\", \"'\") #> [1] \"'Double quotes'\" \"'Guillemet'\" \"'Fancy quotes'\" str_extract_all(\"Don't eat that!\", \"\\\\w+\")[[1]] #> [1] \"Don\" \"t\" \"eat\" \"that\" str_split(\"Don't eat that!\", \"\\\\W\")[[1]] #> [1] \"Don\" \"t\" \"eat\" \"that\" \"\" str_replace_all(\"The quick brown fox\", \"\\\\b\", \"_\") #> [1] \"_The_ _quick_ _brown_ _fox_\" str_replace_all(\"The quick brown fox\", \"\\\\B\", \"_\") #> [1] \"T_h_e q_u_i_c_k b_r_o_w_n f_o_x\""},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"alternation","dir":"Articles","previous_headings":"","what":"Alternation","title":"Regular expressions","text":"| alternation operator, pick one possible matches. example, abc|def match abc def: Note precedence | low: abc|def equivalent (abc)|(def) ab(c|d)ef.","code":"str_detect(c(\"abc\", \"def\", \"ghi\"), \"abc|def\") #> [1] TRUE TRUE FALSE"},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"grouping","dir":"Articles","previous_headings":"","what":"Grouping","title":"Regular expressions","text":"can use parentheses override default precedence rules: Parenthesis also define “groups” can refer backreferences, like \\1, \\2 etc, can extracted str_match(). 
example, following regular expression finds fruits repeated pair letters: can use (?:...), non-grouping parentheses, control precedence capture match group. slightly efficient capturing parentheses. useful complex cases need capture matches control precedence independently.","code":"str_extract(c(\"grey\", \"gray\"), \"gre|ay\") #> [1] \"gre\" \"ay\" str_extract(c(\"grey\", \"gray\"), \"gr(e|a)y\") #> [1] \"grey\" \"gray\" pattern <- \"(..)\\\\1\" fruit %>% str_subset(pattern) #> [1] \"banana\" \"coconut\" \"cucumber\" \"jujube\" \"papaya\" #> [6] \"salal berry\" fruit %>% str_subset(pattern) %>% str_match(pattern) #> [,1] [,2] #> [1,] \"anan\" \"an\" #> [2,] \"coco\" \"co\" #> [3,] \"cucu\" \"cu\" #> [4,] \"juju\" \"ju\" #> [5,] \"papa\" \"pa\" #> [6,] \"alal\" \"al\" str_match(c(\"grey\", \"gray\"), \"gr(e|a)y\") #> [,1] [,2] #> [1,] \"grey\" \"e\" #> [2,] \"gray\" \"a\" str_match(c(\"grey\", \"gray\"), \"gr(?:e|a)y\") #> [,1] #> [1,] \"grey\" #> [2,] \"gray\""},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"anchors","dir":"Articles","previous_headings":"","what":"Anchors","title":"Regular expressions","text":"default, regular expressions match part string. ’s often useful anchor regular expression matches start end string: ^ matches start string. $ matches end string. match literal “$” “^”, need escape , \\$, \\^. multiline strings, can use regex(multiline = TRUE). changes behaviour ^ $, introduces three new operators: ^ now matches start line. $ now matches end line. \\matches start input. \\z matches end input. 
\\Z matches end input, final line terminator, exists.","code":"x <- c(\"apple\", \"banana\", \"pear\") str_extract(x, \"^a\") #> [1] \"a\" NA NA str_extract(x, \"a$\") #> [1] NA \"a\" NA x <- \"Line 1\\nLine 2\\nLine 3\\n\" str_extract_all(x, \"^Line..\")[[1]] #> [1] \"Line 1\" str_extract_all(x, regex(\"^Line..\", multiline = TRUE))[[1]] #> [1] \"Line 1\" \"Line 2\" \"Line 3\" str_extract_all(x, regex(\"\\\\ALine..\", multiline = TRUE))[[1]] #> [1] \"Line 1\""},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"repetition","dir":"Articles","previous_headings":"","what":"Repetition","title":"Regular expressions","text":"can control many times pattern matches repetition operators: ?: 0 1. +: 1 . *: 0 . Note precedence operators high, can write: colou?r match either American British spellings. means uses need parentheses, like bana(na)+. can also specify number matches precisely: {n}: exactly n {n,}: n {n,m}: n m default matches “greedy”: match longest string possible. can make “lazy”, matching shortest string possible putting ? : ??: 0 1, prefer 0. +?: 1 , match times possible. *?: 0 , match times possible. {n,}?: n , match times possible. {n,m}?: n m, , match times possible, least n. can also make matches possessive putting + , means later parts match fail, repetition re-tried smaller number characters. advanced feature used improve performance worst-case scenarios (called “catastrophic backtracking”). ?+: 0 1, possessive. ++: 1 , possessive. *+: 0 , possessive. {n}+: exactly n, possessive. {n,}+: n , possessive. {n,m}+: n m, possessive. related concept atomic-match parenthesis, (?>...). later match fails engine needs back-track, atomic match kept : succeeds fails whole. Compare following two regular expressions: atomic match fails matches , next character C fails. 
regular match succeeds matches , C doesn’t match, back-tracks tries B instead.","code":"x <- \"1888 is the longest year in Roman numerals: MDCCCLXXXVIII\" str_extract(x, \"CC?\") #> [1] \"CC\" str_extract(x, \"CC+\") #> [1] \"CCC\" str_extract(x, 'C[LX]+') #> [1] \"CLXXX\" str_extract(x, \"C{2}\") #> [1] \"CC\" str_extract(x, \"C{2,}\") #> [1] \"CCC\" str_extract(x, \"C{2,3}\") #> [1] \"CCC\" str_extract(x, c(\"C{2,3}\", \"C{2,3}?\")) #> [1] \"CCC\" \"CC\" str_extract(x, c(\"C[LX]+\", \"C[LX]+?\")) #> [1] \"CLXXX\" \"CL\" str_detect(\"ABC\", \"(?>A|.B)C\") #> [1] FALSE str_detect(\"ABC\", \"(?:A|.B)C\") #> [1] TRUE"},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"look-arounds","dir":"Articles","previous_headings":"","what":"Look arounds","title":"Regular expressions","text":"assertions look ahead behind current match without “consuming” characters (.e. changing input position). (?=...): positive look-ahead assertion. Matches ... matches current input. (?!...): negative look-ahead assertion. Matches ... match current input. (?<=...): positive look-behind assertion. Matches ... matches text preceding current position, last character match character just current position. Length must bounded (.e. * +). (? [1] \"1\" \"2\" NA y <- c(\"100\", \"$400\") str_extract(y, \"(?<=\\\\$)\\\\d+\") #> [1] NA \"400\""},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"comments","dir":"Articles","previous_headings":"","what":"Comments","title":"Regular expressions","text":"two ways include comments regular expression. first (?#...): second use regex(comments = TRUE). form ignores spaces newlines, anything everything #. match literal space, ’ll need escape : \"\\\\ \". useful way describing complex regular expressions:","code":"str_detect(\"xyz\", \"x(?#this is a comment)\") #> [1] TRUE phone <- regex(\" \\\\(? # optional opening parens (\\\\d{3}) # area code \\\\)? # optional closing parens (?:-|\\\\ )? 
# optional dash or space (\\\\d{3}) # another three numbers (?:-|\\\\ )? # optional dash or space (\\\\d{3}) # three more numbers \", comments = TRUE) str_match(c(\"514-791-8141\", \"(514) 791 8141\"), phone) #> [,1] [,2] [,3] [,4] #> [1,] \"514-791-814\" \"514\" \"791\" \"814\" #> [2,] \"(514) 791 814\" \"514\" \"791\" \"814\""},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"getting-and-setting-individual-characters","dir":"Articles","previous_headings":"","what":"Getting and setting individual characters","title":"Introduction to stringr","text":"can get length string str_length(): now equivalent base R function nchar(). Previously needed work around issues nchar() fact returned 2 nchar(NA). fixed R 3.3.0, longer important. can access individual character using str_sub(). takes three arguments: character vector, start position end position. Either position can either positive integer, counts left, negative integer counts right. positions inclusive, longer string, silently truncated. can also use str_sub() modify strings: duplicate individual strings, can use str_dup():","code":"str_length(\"abc\") #> [1] 3 x <- c(\"abcdef\", \"ghifjk\") # The 3rd letter str_sub(x, 3, 3) #> [1] \"c\" \"i\" # The 2nd to 2nd-to-last character str_sub(x, 2, -2) #> [1] \"bcde\" \"hifj\" str_sub(x, 3, 3) <- \"X\" x #> [1] \"abXdef\" \"ghXfjk\" str_dup(x, c(2, 3)) #> [1] \"abXdefabXdef\" \"ghXfjkghXfjkghXfjk\""},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"whitespace","dir":"Articles","previous_headings":"","what":"Whitespace","title":"Introduction to stringr","text":"Three functions add, remove, modify whitespace: str_pad() pads string fixed length adding extra whitespace left, right, sides. (can pad characters using pad argument.) 
str_pad() never make string shorter: want ensure strings length (often useful print methods), combine str_pad() str_trunc(): opposite str_pad() str_trim(), removes leading trailing whitespace: can use str_wrap() modify existing whitespace order wrap paragraph text, length line similar possible.","code":"x <- c(\"abc\", \"defghi\") str_pad(x, 10) # default pads on left #> [1] \" abc\" \" defghi\" str_pad(x, 10, \"both\") #> [1] \" abc \" \" defghi \" str_pad(x, 4) #> [1] \" abc\" \"defghi\" x <- c(\"Short\", \"This is a long string\") x %>% str_trunc(10) %>% str_pad(10, \"right\") #> [1] \"Short \" \"This is...\" x <- c(\" a \", \"b \", \" c\") str_trim(x) #> [1] \"a\" \"b\" \"c\" str_trim(x, \"left\") #> [1] \"a \" \"b \" \"c\" jabberwocky <- str_c( \"`Twas brillig, and the slithy toves \", \"did gyre and gimble in the wabe: \", \"All mimsy were the borogoves, \", \"and the mome raths outgrabe. \" ) cat(str_wrap(jabberwocky, width = 40)) #> `Twas brillig, and the slithy toves did #> gyre and gimble in the wabe: All mimsy #> were the borogoves, and the mome raths #> outgrabe."},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"locale-sensitive","dir":"Articles","previous_headings":"","what":"Locale sensitive","title":"Introduction to stringr","text":"handful stringr functions locale-sensitive: perform differently different regions world. functions case transformation functions: String ordering sorting: locale always defaults English ensure default behaviour identical across systems. Locales always include two letter ISO-639-1 language code (like “en” English “zh” Chinese), optionally ISO-3166 country code (like “en_UK” vs “en_US”). 
can see complete list available locales running stringi::stri_locale_list().","code":"x <- \"I like horses.\" str_to_upper(x) #> [1] \"I LIKE HORSES.\" str_to_title(x) #> [1] \"I Like Horses.\" str_to_lower(x) #> [1] \"i like horses.\" # Turkish has two sorts of i: with and without the dot str_to_lower(x, \"tr\") #> [1] \"ı like horses.\" x <- c(\"y\", \"i\", \"k\") str_order(x) #> [1] 2 3 1 str_sort(x) #> [1] \"i\" \"k\" \"y\" # In Lithuanian, y comes between i and k str_sort(x, locale = \"lt\") #> [1] \"i\" \"y\" \"k\""},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"pattern-matching","dir":"Articles","previous_headings":"","what":"Pattern matching","title":"Introduction to stringr","text":"vast majority stringr functions work patterns. parameterised task perform types patterns match.","code":""},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"tasks","dir":"Articles","previous_headings":"Pattern matching","what":"Tasks","title":"Introduction to stringr","text":"pattern matching function first two arguments, character vector strings process single pattern match. stringr provides pattern matching functions detect, locate, extract, match, replace, split strings. ’ll illustrate work strings regular expression designed match (US) phone numbers: str_detect() detects presence absence pattern returns logical vector (similar grepl()). str_subset() returns elements character vector match regular expression (similar grep() value = TRUE)`. str_count() counts number matches: str_locate() locates first position pattern returns numeric matrix columns start end. str_locate_all() locates matches, returning list numeric matrices. Similar regexpr() gregexpr(). str_extract() extracts text corresponding first match, returning character vector. str_extract_all() extracts matches returns list character vectors. str_match() extracts capture groups formed () first match. returns character matrix one column complete match one column group. 
str_match_all() extracts capture groups matches returns list character matrices. Similar regmatches(). str_replace() replaces first matched pattern returns character vector. str_replace_all() replaces matches. Similar sub() gsub(). str_split_fixed() splits string fixed number pieces based pattern returns character matrix. str_split() splits string variable number pieces returns list character vectors.","code":"strings <- c( \"apple\", \"219 733 8965\", \"329-293-8753\", \"Work: 579-499-7527; Home: 543.355.3679\" ) phone <- \"([2-9][0-9]{2})[- .]([0-9]{3})[- .]([0-9]{4})\" # Which strings contain phone numbers? str_detect(strings, phone) #> [1] FALSE TRUE TRUE TRUE str_subset(strings, phone) #> [1] \"219 733 8965\" #> [2] \"329-293-8753\" #> [3] \"Work: 579-499-7527; Home: 543.355.3679\" # How many phone numbers in each string? str_count(strings, phone) #> [1] 0 1 1 2 # Where in the string is the phone number located? (loc <- str_locate(strings, phone)) #> start end #> [1,] NA NA #> [2,] 1 12 #> [3,] 1 12 #> [4,] 7 18 str_locate_all(strings, phone) #> [[1]] #> start end #> #> [[2]] #> start end #> [1,] 1 12 #> #> [[3]] #> start end #> [1,] 1 12 #> #> [[4]] #> start end #> [1,] 7 18 #> [2,] 27 38 # What are the phone numbers? 
str_extract(strings, phone) #> [1] NA \"219 733 8965\" \"329-293-8753\" \"579-499-7527\" str_extract_all(strings, phone) #> [[1]] #> character(0) #> #> [[2]] #> [1] \"219 733 8965\" #> #> [[3]] #> [1] \"329-293-8753\" #> #> [[4]] #> [1] \"579-499-7527\" \"543.355.3679\" str_extract_all(strings, phone, simplify = TRUE) #> [,1] [,2] #> [1,] \"\" \"\" #> [2,] \"219 733 8965\" \"\" #> [3,] \"329-293-8753\" \"\" #> [4,] \"579-499-7527\" \"543.355.3679\" # Pull out the three components of the match str_match(strings, phone) #> [,1] [,2] [,3] [,4] #> [1,] NA NA NA NA #> [2,] \"219 733 8965\" \"219\" \"733\" \"8965\" #> [3,] \"329-293-8753\" \"329\" \"293\" \"8753\" #> [4,] \"579-499-7527\" \"579\" \"499\" \"7527\" str_match_all(strings, phone) #> [[1]] #> [,1] [,2] [,3] [,4] #> #> [[2]] #> [,1] [,2] [,3] [,4] #> [1,] \"219 733 8965\" \"219\" \"733\" \"8965\" #> #> [[3]] #> [,1] [,2] [,3] [,4] #> [1,] \"329-293-8753\" \"329\" \"293\" \"8753\" #> #> [[4]] #> [,1] [,2] [,3] [,4] #> [1,] \"579-499-7527\" \"579\" \"499\" \"7527\" #> [2,] \"543.355.3679\" \"543\" \"355\" \"3679\" str_replace(strings, phone, \"XXX-XXX-XXXX\") #> [1] \"apple\" #> [2] \"XXX-XXX-XXXX\" #> [3] \"XXX-XXX-XXXX\" #> [4] \"Work: XXX-XXX-XXXX; Home: 543.355.3679\" str_replace_all(strings, phone, \"XXX-XXX-XXXX\") #> [1] \"apple\" #> [2] \"XXX-XXX-XXXX\" #> [3] \"XXX-XXX-XXXX\" #> [4] \"Work: XXX-XXX-XXXX; Home: XXX-XXX-XXXX\" str_split(\"a-b-c\", \"-\") #> [[1]] #> [1] \"a\" \"b\" \"c\" str_split_fixed(\"a-b-c\", \"-\", n = 2) #> [,1] [,2] #> [1,] \"a\" \"b-c\""},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"engines","dir":"Articles","previous_headings":"Pattern matching","what":"Engines","title":"Introduction to stringr","text":"four main engines stringr can use describe patterns: Regular expressions, default, shown , described vignette(\"regular-expressions\"). Fixed bytewise matching, fixed(). 
Locale-sensitive character matching, coll() Text boundary analysis boundary().","code":""},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"fixed-matches","dir":"Articles","previous_headings":"Pattern matching > Engines","what":"Fixed matches","title":"Introduction to stringr","text":"fixed(x) matches exact sequence bytes specified x. limited “pattern”, restriction can make matching much faster. Beware using fixed() non-English data. problematic often multiple ways representing character. example, two ways define “á”: either single character “” plus accent: render identically, ’re defined differently, fixed() doesn’t find match. Instead, can use coll(), explained , respect human character comparison rules:","code":"a1 <- \"\\u00e1\" a2 <- \"a\\u0301\" c(a1, a2) #> [1] \"á\" \"á\" a1 == a2 #> [1] FALSE str_detect(a1, fixed(a2)) #> [1] FALSE str_detect(a1, coll(a2)) #> [1] TRUE"},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"collation-search","dir":"Articles","previous_headings":"Pattern matching > Engines","what":"Collation search","title":"Introduction to stringr","text":"coll(x) looks match x using human-language collation rules, particularly important want case insensitive matching. Collation rules differ around world, ’ll also need supply locale parameter. downside coll() speed. rules recognising characters complicated, coll() relatively slow compared regex() fixed(). 
Note fixed() regex() ignore_case arguments, perform much simpler comparison coll().","code":"i <- c(\"I\", \"İ\", \"i\", \"ı\") i #> [1] \"I\" \"İ\" \"i\" \"ı\" str_subset(i, coll(\"i\", ignore_case = TRUE)) #> [1] \"I\" \"i\" str_subset(i, coll(\"i\", ignore_case = TRUE, locale = \"tr\")) #> [1] \"İ\" \"i\""},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"boundary","dir":"Articles","previous_headings":"Pattern matching > Engines","what":"Boundary","title":"Introduction to stringr","text":"boundary() matches boundaries characters, lines, sentences words. ’s useful str_split(), can used pattern matching functions: convention, \"\" treated boundary(\"character\"):","code":"x <- \"This is a sentence.\" str_split(x, boundary(\"word\")) #> [[1]] #> [1] \"This\" \"is\" \"a\" \"sentence\" str_count(x, boundary(\"word\")) #> [1] 4 str_extract_all(x, boundary(\"word\")) #> [[1]] #> [1] \"This\" \"is\" \"a\" \"sentence\" str_split(x, \"\") #> [[1]] #> [1] \"T\" \"h\" \"i\" \"s\" \" \" \"i\" \"s\" \" \" \"a\" \" \" \"s\" \"e\" \"n\" \"t\" \"e\" \"n\" \"c\" #> [18] \"e\" \".\" str_count(x, \"\") #> [1] 19"},{"path":"https://stringr.tidyverse.org/dev/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Hadley Wickham. Author, maintainer, copyright holder. . Copyright holder, funder.","code":""},{"path":"https://stringr.tidyverse.org/dev/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Wickham H (2023). stringr: Simple, Consistent Wrappers Common String Operations. 
https://stringr.tidyverse.org, https://github.com/tidyverse/stringr.","code":"@Manual{, title = {stringr: Simple, Consistent Wrappers for Common String Operations}, author = {Hadley Wickham}, year = {2023}, note = {https://stringr.tidyverse.org, https://github.com/tidyverse/stringr}, }"},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/index.html","id":"overview","dir":"","previous_headings":"","what":"Overview","title":"Simple, Consistent Wrappers for Common String Operations","text":"Strings glamorous, high-profile components R, play big role many data cleaning preparation tasks. stringr package provides cohesive set functions designed make working strings easy possible. ’re familiar strings, best place start chapter strings R Data Science. stringr built top stringi, uses ICU C library provide fast, correct implementations common string manipulations. stringr focusses important commonly used string manipulation functions whereas stringi provides comprehensive set covering almost anything can imagine. find stringr missing function need, try looking stringi. packages share similar conventions, ’ve mastered stringr, find stringi similarly easy use.","code":""},{"path":"https://stringr.tidyverse.org/dev/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Simple, Consistent Wrappers for Common String Operations","text":"","code":"# The easiest way to get stringr is to install the whole tidyverse: install.packages(\"tidyverse\") # Alternatively, install just stringr: install.packages(\"stringr\")"},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/index.html","id":"usage","dir":"","previous_headings":"","what":"Usage","title":"Simple, Consistent Wrappers for Common String Operations","text":"functions stringr start str_ take vector strings first argument: string functions work regular expressions, concise language describing patterns text. 
example, regular expression \"[aeiou]\" matches single character vowel: seven main verbs work patterns: str_detect(x, pattern) tells ’s match pattern: str_count(x, pattern) counts number patterns: str_subset(x, pattern) extracts matching components: str_locate(x, pattern) gives position match: str_extract(x, pattern) extracts text match: str_match(x, pattern) extracts parts match defined parentheses: str_replace(x, pattern, replacement) replaces matches new text: str_split(x, pattern) splits string multiple pieces: well regular expressions (default), three pattern matching engines: fixed(): match exact bytes coll(): match human letters boundary(): match boundaries","code":"x <- c(\"why\", \"video\", \"cross\", \"extra\", \"deal\", \"authority\") str_length(x) #> [1] 3 5 5 5 4 9 str_c(x, collapse = \", \") #> [1] \"why, video, cross, extra, deal, authority\" str_sub(x, 1, 2) #> [1] \"wh\" \"vi\" \"cr\" \"ex\" \"de\" \"au\" str_subset(x, \"[aeiou]\") #> [1] \"video\" \"cross\" \"extra\" \"deal\" \"authority\" str_count(x, \"[aeiou]\") #> [1] 0 3 1 2 2 4 str_detect(x, \"[aeiou]\") #> [1] FALSE TRUE TRUE TRUE TRUE TRUE str_count(x, \"[aeiou]\") #> [1] 0 3 1 2 2 4 str_subset(x, \"[aeiou]\") #> [1] \"video\" \"cross\" \"extra\" \"deal\" \"authority\" str_locate(x, \"[aeiou]\") #> start end #> [1,] NA NA #> [2,] 2 2 #> [3,] 3 3 #> [4,] 1 1 #> [5,] 2 2 #> [6,] 1 1 str_extract(x, \"[aeiou]\") #> [1] NA \"i\" \"o\" \"e\" \"e\" \"a\" # extract the characters on either side of the vowel str_match(x, \"(.)[aeiou](.)\") #> [,1] [,2] [,3] #> [1,] NA NA NA #> [2,] \"vid\" \"v\" \"d\" #> [3,] \"ros\" \"r\" \"s\" #> [4,] NA NA NA #> [5,] \"dea\" \"d\" \"a\" #> [6,] \"aut\" \"a\" \"t\" str_replace(x, \"[aeiou]\", \"?\") #> [1] \"why\" \"v?deo\" \"cr?ss\" \"?xtra\" \"d?al\" \"?uthority\" str_split(c(\"a,b\", \"c,d,e\"), \",\") #> [[1]] #> [1] \"a\" \"b\" #> #> [[2]] #> [1] \"c\" \"d\" 
\"e\""},{"path":"https://stringr.tidyverse.org/dev/index.html","id":"rstudio-addin","dir":"","previous_headings":"","what":"RStudio Addin","title":"Simple, Consistent Wrappers for Common String Operations","text":"RegExplain RStudio addin provides friendly interface working regular expressions functions stringr. addin allows interactively build regexp, check output common string matching functions, consult interactive help pages, use included resources learn regular expressions. addin can easily installed devtools:","code":"# install.packages(\"devtools\") devtools::install_github(\"gadenbuie/regexplain\")"},{"path":"https://stringr.tidyverse.org/dev/index.html","id":"compared-to-base-r","dir":"","previous_headings":"","what":"Compared to base R","title":"Simple, Consistent Wrappers for Common String Operations","text":"R provides solid set string operations, grown organically time, can inconsistent little hard learn. Additionally, lag behind string operations programming languages, things easy languages like Ruby Python rather hard R. Uses consistent function argument names. first argument always vector strings modify, makes stringr work particularly well conjunction pipe: Simplifies string operations eliminating options don’t need 95% time. Produces outputs can easily used inputs. includes ensuring missing inputs result missing outputs, zero length inputs result zero length outputs. Learn vignette(\"-base\")","code":"letters %>% .[1:10] %>% str_pad(3, \"right\") %>% str_c(letters[2:11]) #> [1] \"a b\" \"b c\" \"c d\" \"d e\" \"e f\" \"f g\" \"g h\" \"h i\" \"i j\" \"j k\""},{"path":"https://stringr.tidyverse.org/dev/reference/case.html","id":null,"dir":"Reference","previous_headings":"","what":"Convert string to upper case, lower case, title case, or sentence case — case","title":"Convert string to upper case, lower case, title case, or sentence case — case","text":"str_to_upper() converts upper case. str_to_lower() converts lower case. 
str_to_title() converts title case, first letter word capitalized. str_to_sentence() convert sentence case, first letter sentence capitalized.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/case.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Convert string to upper case, lower case, title case, or sentence case — case","text":"","code":"str_to_upper(string, locale = \"en\") str_to_lower(string, locale = \"en\") str_to_title(string, locale = \"en\") str_to_sentence(string, locale = \"en\")"},{"path":"https://stringr.tidyverse.org/dev/reference/case.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Convert string to upper case, lower case, title case, or sentence case — case","text":"string Input vector. Either character vector, something coercible one. locale Locale use comparisons. See stringi::stri_locale_list() possible options. Defaults \"en\" (English) ensure default behaviour consistent across platforms.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/case.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Convert string to upper case, lower case, title case, or sentence case — case","text":"character vector length string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/case.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Convert string to upper case, lower case, title case, or sentence case — case","text":"","code":"dog <- \"The quick brown dog\" str_to_upper(dog) #> [1] \"THE QUICK BROWN DOG\" str_to_lower(dog) #> [1] \"the quick brown dog\" str_to_title(dog) #> [1] \"The Quick Brown Dog\" str_to_sentence(\"the quick brown dog\") #> [1] \"The quick brown dog\" # Locale matters! 
str_to_upper(\"i\") # English #> [1] \"I\" str_to_upper(\"i\", \"tr\") # Turkish #> [1] \"İ\""},{"path":"https://stringr.tidyverse.org/dev/reference/invert_match.html","id":null,"dir":"Reference","previous_headings":"","what":"Switch location of matches to location of non-matches — invert_match","title":"Switch location of matches to location of non-matches — invert_match","text":"Invert matrix match locations match opposite previously matched.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/invert_match.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Switch location of matches to location of non-matches — invert_match","text":"","code":"invert_match(loc)"},{"path":"https://stringr.tidyverse.org/dev/reference/invert_match.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Switch location of matches to location of non-matches — invert_match","text":"loc matrix match locations, str_locate_all()","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/invert_match.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Switch location of matches to location of non-matches — invert_match","text":"numeric match giving locations non-matches","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/invert_match.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Switch location of matches to location of non-matches — invert_match","text":"","code":"numbers <- \"1 and 2 and 4 and 456\" num_loc <- str_locate_all(numbers, \"[0-9]+\")[[1]] str_sub(numbers, num_loc[, \"start\"], num_loc[, \"end\"]) #> [1] \"1\" \"2\" \"4\" \"456\" text_loc <- invert_match(num_loc) str_sub(numbers, text_loc[, \"start\"], text_loc[, \"end\"]) #> [1] \"\" \" and \" \" and \" \" and \" 
\"\""},{"path":"https://stringr.tidyverse.org/dev/reference/modifiers.html","id":null,"dir":"Reference","previous_headings":"","what":"Control matching behaviour with modifier functions — modifiers","title":"Control matching behaviour with modifier functions — modifiers","text":"Modifier functions control meaning pattern argument stringr functions: boundary(): Match boundaries things. coll(): Compare strings using standard Unicode collation rules. fixed(): Compare literal bytes. regex() (default): Uses ICU regular expressions.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/modifiers.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Control matching behaviour with modifier functions — modifiers","text":"","code":"fixed(pattern, ignore_case = FALSE) coll(pattern, ignore_case = FALSE, locale = \"en\", ...) regex( pattern, ignore_case = FALSE, multiline = FALSE, comments = FALSE, dotall = FALSE, ... ) boundary( type = c(\"character\", \"line_break\", \"sentence\", \"word\"), skip_word_none = NA, ... )"},{"path":"https://stringr.tidyverse.org/dev/reference/modifiers.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Control matching behaviour with modifier functions — modifiers","text":"pattern Pattern modify behaviour. ignore_case case differences ignored match? fixed(), uses simple algorithm assumes one--one mapping upper lower case letters. locale Locale use comparisons. See stringi::stri_locale_list() possible options. Defaults \"en\" (English) ensure default behaviour consistent across platforms. ... less frequently used arguments passed stringi::stri_opts_collator(), stringi::stri_opts_regex(), stringi::stri_opts_brkiter() multiline TRUE, $ ^ match beginning end line. FALSE, default, match start end input. comments TRUE, white space comments beginning # ignored. Escape literal spaces \\\\ . dotall TRUE, . also match line terminators. type Boundary type detect. 
character Every character boundary. line_break Boundaries places acceptable line break current locale. sentence beginnings ends sentences boundaries, using intelligent rules avoid counting abbreviations (details). word beginnings ends words boundaries. skip_word_none Ignore \"words\" contain characters numbers - .e. punctuation. Default NA skip \"words\" splitting word boundaries.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/modifiers.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Control matching behaviour with modifier functions — modifiers","text":"stringr modifier object, .e. character vector parent S3 class stringr_pattern.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/modifiers.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Control matching behaviour with modifier functions — modifiers","text":"","code":"pattern <- \"a.b\" strings <- c(\"abb\", \"a.b\") str_detect(strings, pattern) #> [1] TRUE TRUE str_detect(strings, fixed(pattern)) #> [1] FALSE TRUE str_detect(strings, coll(pattern)) #> [1] FALSE TRUE # coll() is useful for locale-aware case-insensitive matching i <- c(\"I\", \"\\u0130\", \"i\") i #> [1] \"I\" \"İ\" \"i\" str_detect(i, fixed(\"i\", TRUE)) #> [1] TRUE FALSE TRUE str_detect(i, coll(\"i\", TRUE)) #> [1] TRUE FALSE TRUE str_detect(i, coll(\"i\", TRUE, locale = \"tr\")) #> [1] FALSE TRUE TRUE # Word boundaries words <- c(\"These are some words.\") str_count(words, boundary(\"word\")) #> [1] 4 str_split(words, \" \")[[1]] #> [1] \"These\" \"are\" \"\" \"\" \"some\" \"words.\" str_split(words, boundary(\"word\"))[[1]] #> [1] \"These\" \"are\" \"some\" \"words\" # Regular expression variations str_extract_all(\"The Cat in the Hat\", \"[a-z]+\") #> [[1]] #> [1] \"he\" \"at\" \"in\" \"the\" \"at\" #> str_extract_all(\"The Cat in the Hat\", regex(\"[a-z]+\", TRUE)) #> [[1]] #> [1] \"The\" \"Cat\" \"in\" \"the\" \"Hat\" #> 
str_extract_all(\"a\\nb\\nc\", \"^.\") #> [[1]] #> [1] \"a\" #> str_extract_all(\"a\\nb\\nc\", regex(\"^.\", multiline = TRUE)) #> [[1]] #> [1] \"a\" \"b\" \"c\" #> str_extract_all(\"a\\nb\\nc\", \"a.\") #> [[1]] #> character(0) #> str_extract_all(\"a\\nb\\nc\", regex(\"a.\", dotall = TRUE)) #> [[1]] #> [1] \"a\\n\" #>"},{"path":"https://stringr.tidyverse.org/dev/reference/pipe.html","id":null,"dir":"Reference","previous_headings":"","what":"Pipe operator — %>%","title":"Pipe operator — %>%","text":"Pipe operator","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/pipe.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Pipe operator — %>%","text":"","code":"lhs %>% rhs"},{"path":"https://stringr.tidyverse.org/dev/reference/str_c.html","id":null,"dir":"Reference","previous_headings":"","what":"Join multiple strings into one string — str_c","title":"Join multiple strings into one string — str_c","text":"str_c() combines multiple character vectors single character vector. similar paste0() uses tidyverse recycling NA rules. One way understand str_c() works picture 2d matrix strings, argument forms column. sep inserted column, row combined together single string. collapse set, inserted row, result combined, time single string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_c.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Join multiple strings into one string — str_c","text":"","code":"str_c(..., sep = \"\", collapse = NULL)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_c.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Join multiple strings into one string — str_c","text":"... One character vectors. NULLs removed; scalar inputs (vectors length 1) recycled common length vector inputs. Like R functions, missing values \"infectious\": whenever missing value combined another string result always missing. 
Use dplyr::coalesce() str_replace_na() convert desired value. sep String insert input vectors. collapse Optional string used combine output single string. Generally better use str_flatten() needed behaviour.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_c.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Join multiple strings into one string — str_c","text":"collapse = NULL (default) character vector length equal longest input. collapse string, character vector length 1.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_c.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Join multiple strings into one string — str_c","text":"","code":"str_c(\"Letter: \", letters) #> [1] \"Letter: a\" \"Letter: b\" \"Letter: c\" \"Letter: d\" \"Letter: e\" #> [6] \"Letter: f\" \"Letter: g\" \"Letter: h\" \"Letter: i\" \"Letter: j\" #> [11] \"Letter: k\" \"Letter: l\" \"Letter: m\" \"Letter: n\" \"Letter: o\" #> [16] \"Letter: p\" \"Letter: q\" \"Letter: r\" \"Letter: s\" \"Letter: t\" #> [21] \"Letter: u\" \"Letter: v\" \"Letter: w\" \"Letter: x\" \"Letter: y\" #> [26] \"Letter: z\" str_c(\"Letter\", letters, sep = \": \") #> [1] \"Letter: a\" \"Letter: b\" \"Letter: c\" \"Letter: d\" \"Letter: e\" #> [6] \"Letter: f\" \"Letter: g\" \"Letter: h\" \"Letter: i\" \"Letter: j\" #> [11] \"Letter: k\" \"Letter: l\" \"Letter: m\" \"Letter: n\" \"Letter: o\" #> [16] \"Letter: p\" \"Letter: q\" \"Letter: r\" \"Letter: s\" \"Letter: t\" #> [21] \"Letter: u\" \"Letter: v\" \"Letter: w\" \"Letter: x\" \"Letter: y\" #> [26] \"Letter: z\" str_c(letters, \" is for\", \"...\") #> [1] \"a is for...\" \"b is for...\" \"c is for...\" \"d is for...\" \"e is for...\" #> [6] \"f is for...\" \"g is for...\" \"h is for...\" \"i is for...\" \"j is for...\" #> [11] \"k is for...\" \"l is for...\" \"m is for...\" \"n is for...\" \"o is for...\" #> [16] \"p is for...\" \"q is for...\" \"r is 
for...\" \"s is for...\" \"t is for...\" #> [21] \"u is for...\" \"v is for...\" \"w is for...\" \"x is for...\" \"y is for...\" #> [26] \"z is for...\" str_c(letters[-26], \" comes before \", letters[-1]) #> [1] \"a comes before b\" \"b comes before c\" \"c comes before d\" #> [4] \"d comes before e\" \"e comes before f\" \"f comes before g\" #> [7] \"g comes before h\" \"h comes before i\" \"i comes before j\" #> [10] \"j comes before k\" \"k comes before l\" \"l comes before m\" #> [13] \"m comes before n\" \"n comes before o\" \"o comes before p\" #> [16] \"p comes before q\" \"q comes before r\" \"r comes before s\" #> [19] \"s comes before t\" \"t comes before u\" \"u comes before v\" #> [22] \"v comes before w\" \"w comes before x\" \"x comes before y\" #> [25] \"y comes before z\" str_c(letters, collapse = \"\") #> [1] \"abcdefghijklmnopqrstuvwxyz\" str_c(letters, collapse = \", \") #> [1] \"a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z\" # Differences from paste() ---------------------- # Missing inputs give missing outputs str_c(c(\"a\", NA, \"b\"), \"-d\") #> [1] \"a-d\" NA \"b-d\" paste0(c(\"a\", NA, \"b\"), \"-d\") #> [1] \"a-d\" \"NA-d\" \"b-d\" # Use str_replace_NA to display literal NAs: str_c(str_replace_na(c(\"a\", NA, \"b\")), \"-d\") #> [1] \"a-d\" \"NA-d\" \"b-d\" # Uses tidyverse recycling rules if (FALSE) str_c(1:2, 1:3) # errors paste0(1:2, 1:3) #> [1] \"11\" \"22\" \"13\" str_c(\"x\", character()) #> character(0) paste0(\"x\", character()) #> [1] \"x\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_conv.html","id":null,"dir":"Reference","previous_headings":"","what":"Specify the encoding of a string — str_conv","title":"Specify the encoding of a string — str_conv","text":"convenient way override current encoding string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_conv.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Specify the 
encoding of a string — str_conv","text":"","code":"str_conv(string, encoding)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_conv.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Specify the encoding of a string — str_conv","text":"string Input vector. Either character vector, something coercible one. encoding Name encoding. See stringi::stri_enc_list() complete list.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_conv.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Specify the encoding of a string — str_conv","text":"","code":"# Example from encoding?stringi::stringi x <- rawToChar(as.raw(177)) x #> [1] \"\\xb1\" str_conv(x, \"ISO-8859-2\") # Polish \"a with ogonek\" #> [1] \"ą\" str_conv(x, \"ISO-8859-1\") # Plus-minus #> [1] \"±\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_count.html","id":null,"dir":"Reference","previous_headings":"","what":"Count number of matches — str_count","title":"Count number of matches — str_count","text":"Counts number times pattern found within element string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_count.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Count number of matches — str_count","text":"","code":"str_count(string, pattern = \"\")"},{"path":"https://stringr.tidyverse.org/dev/reference/str_count.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Count number of matches — str_count","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. 
Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\").","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_count.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Count number of matches — str_count","text":"integer vector length string/pattern.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_count.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Count number of matches — str_count","text":"","code":"fruit <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") str_count(fruit, \"a\") #> [1] 1 3 1 1 str_count(fruit, \"p\") #> [1] 2 0 1 3 str_count(fruit, \"e\") #> [1] 1 0 1 2 str_count(fruit, c(\"a\", \"b\", \"p\", \"p\")) #> [1] 1 1 1 3 str_count(c(\"a.\", \"...\", \".a.a\"), \".\") #> [1] 2 3 4 str_count(c(\"a.\", \"...\", \".a.a\"), fixed(\".\")) #> [1] 1 3 2"},{"path":"https://stringr.tidyverse.org/dev/reference/str_detect.html","id":null,"dir":"Reference","previous_headings":"","what":"Detect the presence/absence of a match — str_detect","title":"Detect the presence/absence of a match — str_detect","text":"str_detect() returns logical vector TRUE element string matches pattern FALSE otherwise. equivalent grepl(pattern, string).","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_detect.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Detect the presence/absence of a match — str_detect","text":"","code":"str_detect(string, pattern, negate = FALSE)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_detect.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Detect the presence/absence of a match — str_detect","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . 
default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\"). negate TRUE, inverts resulting boolean vector.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_detect.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Detect the presence/absence of a match — str_detect","text":"logical vector length string/pattern.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_detect.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Detect the presence/absence of a match — str_detect","text":"","code":"fruit <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") str_detect(fruit, \"a\") #> [1] TRUE TRUE TRUE TRUE str_detect(fruit, \"^a\") #> [1] TRUE FALSE FALSE FALSE str_detect(fruit, \"a$\") #> [1] FALSE TRUE FALSE FALSE str_detect(fruit, \"b\") #> [1] FALSE TRUE FALSE FALSE str_detect(fruit, \"[aeiou]\") #> [1] TRUE TRUE TRUE TRUE # Also vectorised over pattern str_detect(\"aecfg\", letters) #> [1] TRUE FALSE TRUE FALSE TRUE TRUE TRUE FALSE FALSE FALSE FALSE #> [12] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE #> [23] FALSE FALSE FALSE FALSE # Returns TRUE if the pattern do NOT match str_detect(fruit, \"^p\", negate = TRUE) #> [1] TRUE TRUE FALSE FALSE"},{"path":"https://stringr.tidyverse.org/dev/reference/str_dup.html","id":null,"dir":"Reference","previous_headings":"","what":"Duplicate a string — str_dup","title":"Duplicate a string — str_dup","text":"str_dup() duplicates characters within string, e.g. 
str_dup(\"xy\", 3) returns \"xyxyxy\".","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_dup.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Duplicate a string — str_dup","text":"","code":"str_dup(string, times)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_dup.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Duplicate a string — str_dup","text":"string Input vector. Either character vector, something coercible one. times Number times duplicate string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_dup.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Duplicate a string — str_dup","text":"character vector length string/times.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_dup.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Duplicate a string — str_dup","text":"","code":"fruit <- c(\"apple\", \"pear\", \"banana\") str_dup(fruit, 2) #> [1] \"appleapple\" \"pearpear\" \"bananabanana\" str_dup(fruit, 1:3) #> [1] \"apple\" \"pearpear\" \"bananabananabanana\" str_c(\"ba\", str_dup(\"na\", 0:5)) #> [1] \"ba\" \"bana\" \"banana\" \"bananana\" #> [5] \"banananana\" \"bananananana\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_equal.html","id":null,"dir":"Reference","previous_headings":"","what":"Determine if two strings are equivalent — str_equal","title":"Determine if two strings are equivalent — str_equal","text":"uses Unicode canonicalisation rules, optionally ignores case.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_equal.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Determine if two strings are equivalent — str_equal","text":"","code":"str_equal(x, y, locale = \"en\", ignore_case = FALSE, 
...)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_equal.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Determine if two strings are equivalent — str_equal","text":"x, y pair character vectors. locale Locale use comparisons. See stringi::stri_locale_list() possible options. Defaults \"en\" (English) ensure default behaviour consistent across platforms. ignore_case Ignore case comparing strings? ... options used control collation. Passed stringi::stri_opts_collator().","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_equal.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Determine if two strings are equivalent — str_equal","text":"logical vector length x/y.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_equal.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Determine if two strings are equivalent — str_equal","text":"","code":"# These two strings encode \"a\" with an accent in two different ways a1 <- \"\\u00e1\" a2 <- \"a\\u0301\" c(a1, a2) #> [1] \"á\" \"á\" a1 == a2 #> [1] FALSE str_equal(a1, a2) #> [1] TRUE # ohm and omega use different code points but should always be treated # as equal ohm <- \"\\u2126\" omega <- \"\\u03A9\" c(ohm, omega) #> [1] \"Ω\" \"Ω\" ohm == omega #> [1] FALSE str_equal(ohm, omega) #> [1] TRUE"},{"path":"https://stringr.tidyverse.org/dev/reference/str_escape.html","id":null,"dir":"Reference","previous_headings":"","what":"Escape regular expression metacharacters — str_escape","title":"Escape regular expression metacharacters — str_escape","text":"function escapes metacharacter, characters special meaning regular expression engine. 
cases better using fixed() since faster, str_escape() useful composing user provided strings pattern.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_escape.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Escape regular expression metacharacters — str_escape","text":"","code":"str_escape(string)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_escape.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Escape regular expression metacharacters — str_escape","text":"string Input vector. Either character vector, something coercible one.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_escape.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Escape regular expression metacharacters — str_escape","text":"character vector length string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_escape.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Escape regular expression metacharacters — str_escape","text":"","code":"str_detect(c(\"a\", \".\"), \".\") #> [1] TRUE TRUE str_detect(c(\"a\", \".\"), str_escape(\".\")) #> [1] FALSE TRUE"},{"path":"https://stringr.tidyverse.org/dev/reference/str_extract.html","id":null,"dir":"Reference","previous_headings":"","what":"Extract the complete match — str_extract","title":"Extract the complete match — str_extract","text":"str_extract() extracts first complete match string, str_extract_all()extracts matches string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_extract.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Extract the complete match — str_extract","text":"","code":"str_extract(string, pattern, group = NULL) str_extract_all(string, pattern, simplify = 
FALSE)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_extract.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Extract the complete match — str_extract","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\"). group supplied, instead returning complete match, return matched text specified capturing group. simplify boolean. FALSE (default): returns list character vectors. TRUE: returns character matrix.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_extract.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Extract the complete match — str_extract","text":"str_extract(): character vector length string/pattern. 
str_extract_all(): list character vectors length string/pattern.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_extract.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Extract the complete match — str_extract","text":"","code":"shopping_list <- c(\"apples x4\", \"bag of flour\", \"bag of sugar\", \"milk x2\") str_extract(shopping_list, \"\\\\d\") #> [1] \"4\" NA NA \"2\" str_extract(shopping_list, \"[a-z]+\") #> [1] \"apples\" \"bag\" \"bag\" \"milk\" str_extract(shopping_list, \"[a-z]{1,4}\") #> [1] \"appl\" \"bag\" \"bag\" \"milk\" str_extract(shopping_list, \"\\\\b[a-z]{1,4}\\\\b\") #> [1] NA \"bag\" \"bag\" \"milk\" str_extract(shopping_list, \"([a-z]+) of ([a-z]+)\") #> [1] NA \"bag of flour\" \"bag of sugar\" NA str_extract(shopping_list, \"([a-z]+) of ([a-z]+)\", group = 1) #> [1] NA \"bag\" \"bag\" NA str_extract(shopping_list, \"([a-z]+) of ([a-z]+)\", group = 2) #> [1] NA \"flour\" \"sugar\" NA # Extract all matches str_extract_all(shopping_list, \"[a-z]+\") #> [[1]] #> [1] \"apples\" \"x\" #> #> [[2]] #> [1] \"bag\" \"of\" \"flour\" #> #> [[3]] #> [1] \"bag\" \"of\" \"sugar\" #> #> [[4]] #> [1] \"milk\" \"x\" #> str_extract_all(shopping_list, \"\\\\b[a-z]+\\\\b\") #> [[1]] #> [1] \"apples\" #> #> [[2]] #> [1] \"bag\" \"of\" \"flour\" #> #> [[3]] #> [1] \"bag\" \"of\" \"sugar\" #> #> [[4]] #> [1] \"milk\" #> str_extract_all(shopping_list, \"\\\\d\") #> [[1]] #> [1] \"4\" #> #> [[2]] #> character(0) #> #> [[3]] #> character(0) #> #> [[4]] #> [1] \"2\" #> # Simplify results into character matrix str_extract_all(shopping_list, \"\\\\b[a-z]+\\\\b\", simplify = TRUE) #> [,1] [,2] [,3] #> [1,] \"apples\" \"\" \"\" #> [2,] \"bag\" \"of\" \"flour\" #> [3,] \"bag\" \"of\" \"sugar\" #> [4,] \"milk\" \"\" \"\" str_extract_all(shopping_list, \"\\\\d\", simplify = TRUE) #> [,1] #> [1,] \"4\" #> [2,] \"\" #> [3,] \"\" #> [4,] \"2\" # Extract all words str_extract_all(\"This is, suprisingly, 
a sentence.\", boundary(\"word\")) #> [[1]] #> [1] \"This\" \"is\" \"suprisingly\" \"a\" \"sentence\" #>"},{"path":"https://stringr.tidyverse.org/dev/reference/str_flatten.html","id":null,"dir":"Reference","previous_headings":"","what":"Flatten a string — str_flatten","title":"Flatten a string — str_flatten","text":"str_flatten() reduces character vector single string. summary function regardless length input x, always returns single string. str_flatten_comma() variation designed specifically flattening commas. automatically recognises last uses Oxford comma handles special case 2 elements.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_flatten.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Flatten a string — str_flatten","text":"","code":"str_flatten(string, collapse = \"\", last = NULL, na.rm = FALSE) str_flatten_comma(string, last = NULL, na.rm = FALSE)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_flatten.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Flatten a string — str_flatten","text":"string Input vector. Either character vector, something coercible one. collapse String insert piece. Defaults \"\". last Optional string use place final separator. na.rm Remove missing values? FALSE (default), result NA element string NA.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_flatten.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Flatten a string — str_flatten","text":"string, .e. 
character vector length 1.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_flatten.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Flatten a string — str_flatten","text":"","code":"str_flatten(letters) #> [1] \"abcdefghijklmnopqrstuvwxyz\" str_flatten(letters, \"-\") #> [1] \"a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-q-r-s-t-u-v-w-x-y-z\" str_flatten(letters[1:3], \", \") #> [1] \"a, b, c\" # Use last to customise the last component str_flatten(letters[1:3], \", \", \" and \") #> [1] \"a, b and c\" # this almost works if you want an Oxford (aka serial) comma str_flatten(letters[1:3], \", \", \", and \") #> [1] \"a, b, and c\" # but it will always add a comma, even when not necessary str_flatten(letters[1:2], \", \", \", and \") #> [1] \"a, and b\" # str_flatten_comma knows how to handle the Oxford comma str_flatten_comma(letters[1:3], \", and \") #> [1] \"a, b, and c\" str_flatten_comma(letters[1:2], \", and \") #> [1] \"a and b\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_glue.html","id":null,"dir":"Reference","previous_headings":"","what":"Interpolation with glue — str_glue","title":"Interpolation with glue — str_glue","text":"functions wrappers around glue::glue() glue::glue_data(), provide powerful elegant syntax interpolating strings {}. wrappers provide small set full options. Use glue() glue_data() directly glue control.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_glue.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Interpolation with glue — str_glue","text":"","code":"str_glue(..., .sep = \"\", .envir = parent.frame()) str_glue_data(.x, ..., .sep = \"\", .envir = parent.frame(), .na = \"NA\")"},{"path":"https://stringr.tidyverse.org/dev/reference/str_glue.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Interpolation with glue — str_glue","text":"... 
[expressions] Unnamed arguments taken expression string(s) format. Multiple inputs concatenated together formatting. Named arguments taken temporary variables available substitution. .sep [character(1): ‘\"\"’] Separator used separate elements. .envir [environment: parent.frame()] Environment evaluate expression . Expressions evaluated left right. .x environment, expressions evaluated environment .envir ignored. NULL passed, equivalent emptyenv(). .x [listish] environment, list, data frame used lookup values. .na [character(1): ‘NA’] Value replace NA values . NULL missing values propagated, NA result cause NA output. Otherwise value replaced value .na.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_glue.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Interpolation with glue — str_glue","text":"character vector length longest input.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_glue.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Interpolation with glue — str_glue","text":"","code":"name <- \"Fred\" age <- 50 anniversary <- as.Date(\"1991-10-12\") str_glue( \"My name is {name}, \", \"my age next year is {age + 1}, \", \"and my anniversary is {format(anniversary, '%A, %B %d, %Y')}.\" ) #> My name is Fred, my age next year is 51, and my anniversary is Saturday, October 12, 1991. # single braces can be inserted by doubling them str_glue(\"My name is {name}, not {{name}}.\") #> My name is Fred, not {name}. # You can also used named arguments str_glue( \"My name is {name}, \", \"and my age next year is {age + 1}.\", name = \"Joe\", age = 40 ) #> My name is Joe, and my age next year is 41. 
# `str_glue_data()` is useful in data pipelines mtcars %>% str_glue_data(\"{rownames(.)} has {hp} hp\") #> Mazda RX4 has 110 hp #> Mazda RX4 Wag has 110 hp #> Datsun 710 has 93 hp #> Hornet 4 Drive has 110 hp #> Hornet Sportabout has 175 hp #> Valiant has 105 hp #> Duster 360 has 245 hp #> Merc 240D has 62 hp #> Merc 230 has 95 hp #> Merc 280 has 123 hp #> Merc 280C has 123 hp #> Merc 450SE has 180 hp #> Merc 450SL has 180 hp #> Merc 450SLC has 180 hp #> Cadillac Fleetwood has 205 hp #> Lincoln Continental has 215 hp #> Chrysler Imperial has 230 hp #> Fiat 128 has 66 hp #> Honda Civic has 52 hp #> Toyota Corolla has 65 hp #> Toyota Corona has 97 hp #> Dodge Challenger has 150 hp #> AMC Javelin has 150 hp #> Camaro Z28 has 245 hp #> Pontiac Firebird has 175 hp #> Fiat X1-9 has 66 hp #> Porsche 914-2 has 91 hp #> Lotus Europa has 113 hp #> Ford Pantera L has 264 hp #> Ferrari Dino has 175 hp #> Maserati Bora has 335 hp #> Volvo 142E has 109 hp"},{"path":"https://stringr.tidyverse.org/dev/reference/str_interp.html","id":null,"dir":"Reference","previous_headings":"","what":"String interpolation — str_interp","title":"String interpolation — str_interp","text":"str_interp() superseded favour str_glue(). String interpolation useful way specifying character string depends values certain environment. allows string creation easier read write compared using e.g. paste() sprintf(). 
(template) string can include expression placeholders form ${expression} $[format]{expression}, expressions valid R expressions can evaluated given environment, format format specification valid use sprintf().","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_interp.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"String interpolation — str_interp","text":"","code":"str_interp(string, env = parent.frame())"},{"path":"https://stringr.tidyverse.org/dev/reference/str_interp.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"String interpolation — str_interp","text":"string template character string. function vectorised: character vector collapsed single string. env environment evaluate expressions.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_interp.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"String interpolation — str_interp","text":"interpolated character string.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_interp.html","id":"author","dir":"Reference","previous_headings":"","what":"Author","title":"String interpolation — str_interp","text":"Stefan Milton Bache","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_interp.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"String interpolation — str_interp","text":"","code":"# Using values from the environment, and some formats user_name <- \"smbache\" amount <- 6.656 account <- 1337 str_interp(\"User ${user_name} (account $[08d]{account}) has $$[.2f]{amount}.\") #> [1] \"User smbache (account 00001337) has $6.66.\" # Nested brace pairs work inside expressions too, and any braces can be # placed outside the expressions. 
str_interp(\"Works with } nested { braces too: $[.2f]{{{2 + 2}*{amount}}}\") #> [1] \"Works with } nested { braces too: 26.62\" # Values can also come from a list str_interp( \"One value, ${value1}, and then another, ${value2*2}.\", list(value1 = 10, value2 = 20) ) #> [1] \"One value, 10, and then another, 40.\" # Or a data frame str_interp( \"Values are $[.2f]{max(Sepal.Width)} and $[.2f]{min(Sepal.Width)}.\", iris ) #> [1] \"Values are 4.40 and 2.00.\" # Use a vector when the string is long: max_char <- 80 str_interp(c( \"This particular line is so long that it is hard to write \", \"without breaking the ${max_char}-char barrier!\" )) #> [1] \"This particular line is so long that it is hard to write without breaking the 80-char barrier!\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_length.html","id":null,"dir":"Reference","previous_headings":"","what":"Compute the length/width — str_length","title":"Compute the length/width — str_length","text":"str_length() returns number codepoints string. individual elements (often, always letters) can extracted str_sub(). str_width() returns much space string occupy printed fixed width font (.e. printed console).","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_length.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Compute the length/width — str_length","text":"","code":"str_length(string) str_width(string)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_length.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Compute the length/width — str_length","text":"string Input vector. 
Either character vector, something coercible one.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_length.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Compute the length/width — str_length","text":"numeric vector length string.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_length.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Compute the length/width — str_length","text":"","code":"str_length(letters) #> [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 str_length(NA) #> [1] NA str_length(factor(\"abc\")) #> [1] 3 str_length(c(\"i\", \"like\", \"programming\", NA)) #> [1] 1 4 11 NA # Some characters, like emoji and Chinese characters (hanzi), are square # which means they take up the width of two Latin characters x <- c(\"\\u6c49\\u5b57\", \"\\U0001f60a\") str_view(x) #> [1] │ 汉字 #> [2] │ 😊 str_width(x) #> [1] 4 2 str_length(x) #> [1] 2 1 # There are two ways of representing a u with an umlaut u <- c(\"\\u00fc\", \"u\\u0308\") # They have the same width str_width(u) #> [1] 1 1 # But a different length str_length(u) #> [1] 1 2 # Because the second element is made up of a u + an accent str_sub(u, 1, 1) #> [1] \"ü\" \"u\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_like.html","id":null,"dir":"Reference","previous_headings":"","what":"Detect a pattern in the same way as SQL's LIKE operator — str_like","title":"Detect a pattern in the same way as SQL's LIKE operator — str_like","text":"str_like() follows conventions SQL LIKE operator: Must match entire string. _ matches single character (like .). % matches number characters (like .*). \\% \\_ match literal % _. 
match case insensitive default.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_like.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Detect a pattern in the same way as SQL's LIKE operator — str_like","text":"","code":"str_like(string, pattern, ignore_case = TRUE)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_like.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Detect a pattern in the same way as SQL's LIKE operator — str_like","text":"string Input vector. Either character vector, something coercible one. pattern character vector containing SQL \"like\" pattern. See details. ignore_case Ignore case matches? Defaults TRUE match SQL LIKE operator.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_like.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Detect a pattern in the same way as SQL's LIKE operator — str_like","text":"logical vector length string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_like.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Detect a pattern in the same way as SQL's LIKE operator — str_like","text":"","code":"fruit <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") str_like(fruit, \"app\") #> [1] FALSE FALSE FALSE FALSE str_like(fruit, \"app%\") #> [1] TRUE FALSE FALSE FALSE str_like(fruit, \"ba_ana\") #> [1] FALSE TRUE FALSE FALSE str_like(fruit, \"%APPLE\") #> [1] TRUE FALSE FALSE TRUE"},{"path":"https://stringr.tidyverse.org/dev/reference/str_locate.html","id":null,"dir":"Reference","previous_headings":"","what":"Find location of match — str_locate","title":"Find location of match — str_locate","text":"str_locate() returns start end position first match; str_locate_all() returns start end position match. start end values inclusive, zero-length matches (e.g. 
$, ^, \\\\b) end smaller start.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_locate.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Find location of match — str_locate","text":"","code":"str_locate(string, pattern) str_locate_all(string, pattern)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_locate.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Find location of match — str_locate","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\").","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_locate.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Find location of match — str_locate","text":"str_locate() returns integer matrix two columns one row element string. first column, start, gives position start match, second column, end, gives position end. str_locate_all() returns list integer matrices length string/pattern. 
matrices columns start end , one row match.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_locate.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Find location of match — str_locate","text":"","code":"fruit <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") str_locate(fruit, \"$\") #> start end #> [1,] 6 5 #> [2,] 7 6 #> [3,] 5 4 #> [4,] 10 9 str_locate(fruit, \"a\") #> start end #> [1,] 1 1 #> [2,] 2 2 #> [3,] 3 3 #> [4,] 5 5 str_locate(fruit, \"e\") #> start end #> [1,] 5 5 #> [2,] NA NA #> [3,] 2 2 #> [4,] 4 4 str_locate(fruit, c(\"a\", \"b\", \"p\", \"p\")) #> start end #> [1,] 1 1 #> [2,] 1 1 #> [3,] 1 1 #> [4,] 1 1 str_locate_all(fruit, \"a\") #> [[1]] #> start end #> [1,] 1 1 #> #> [[2]] #> start end #> [1,] 2 2 #> [2,] 4 4 #> [3,] 6 6 #> #> [[3]] #> start end #> [1,] 3 3 #> #> [[4]] #> start end #> [1,] 5 5 #> str_locate_all(fruit, \"e\") #> [[1]] #> start end #> [1,] 5 5 #> #> [[2]] #> start end #> #> [[3]] #> start end #> [1,] 2 2 #> #> [[4]] #> start end #> [1,] 4 4 #> [2,] 9 9 #> str_locate_all(fruit, c(\"a\", \"b\", \"p\", \"p\")) #> [[1]] #> start end #> [1,] 1 1 #> #> [[2]] #> start end #> [1,] 1 1 #> #> [[3]] #> start end #> [1,] 1 1 #> #> [[4]] #> start end #> [1,] 1 1 #> [2,] 6 6 #> [3,] 7 7 #> # Find location of every character str_locate_all(fruit, \"\") #> [[1]] #> start end #> [1,] 1 1 #> [2,] 2 2 #> [3,] 3 3 #> [4,] 4 4 #> [5,] 5 5 #> #> [[2]] #> start end #> [1,] 1 1 #> [2,] 2 2 #> [3,] 3 3 #> [4,] 4 4 #> [5,] 5 5 #> [6,] 6 6 #> #> [[3]] #> start end #> [1,] 1 1 #> [2,] 2 2 #> [3,] 3 3 #> [4,] 4 4 #> #> [[4]] #> start end #> [1,] 1 1 #> [2,] 2 2 #> [3,] 3 3 #> [4,] 4 4 #> [5,] 5 5 #> [6,] 6 6 #> [7,] 7 7 #> [8,] 8 8 #> [9,] 9 9 #>"},{"path":"https://stringr.tidyverse.org/dev/reference/str_match.html","id":null,"dir":"Reference","previous_headings":"","what":"Extract components (capturing groups) from a match — str_match","title":"Extract components (capturing 
groups) from a match — str_match","text":"Extract number matches defined unnamed, (pattern), named, (?pattern) capture groups. Use non-capturing group, (?:pattern), need override default operate precedence want capture result.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_match.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Extract components (capturing groups) from a match — str_match","text":"","code":"str_match(string, pattern) str_match_all(string, pattern)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_match.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Extract components (capturing groups) from a match — str_match","text":"string Input vector. Either character vector, something coercible one. pattern Unlike stringr functions, str_match() supports regular expressions, described vignette(\"regular-expressions\"). pattern contain least one capturing group.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_match.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Extract components (capturing groups) from a match — str_match","text":"str_match(): character matrix number rows length string/pattern. first column complete match, followed one column capture group. columns named used \"named captured groups\", .e. (?pattern'). str_match_all(): list length string/pattern containing character matrices. 
matrix columns descrbed one row match.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_match.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Extract components (capturing groups) from a match — str_match","text":"","code":"strings <- c(\" 219 733 8965\", \"329-293-8753 \", \"banana\", \"595 794 7569\", \"387 287 6718\", \"apple\", \"233.398.9187 \", \"482 952 3315\", \"239 923 8115 and 842 566 4692\", \"Work: 579-499-7527\", \"$1000\", \"Home: 543.355.3679\") phone <- \"([2-9][0-9]{2})[- .]([0-9]{3})[- .]([0-9]{4})\" str_extract(strings, phone) #> [1] \"219 733 8965\" \"329-293-8753\" NA \"595 794 7569\" #> [5] \"387 287 6718\" NA \"233.398.9187\" \"482 952 3315\" #> [9] \"239 923 8115\" \"579-499-7527\" NA \"543.355.3679\" str_match(strings, phone) #> [,1] [,2] [,3] [,4] #> [1,] \"219 733 8965\" \"219\" \"733\" \"8965\" #> [2,] \"329-293-8753\" \"329\" \"293\" \"8753\" #> [3,] NA NA NA NA #> [4,] \"595 794 7569\" \"595\" \"794\" \"7569\" #> [5,] \"387 287 6718\" \"387\" \"287\" \"6718\" #> [6,] NA NA NA NA #> [7,] \"233.398.9187\" \"233\" \"398\" \"9187\" #> [8,] \"482 952 3315\" \"482\" \"952\" \"3315\" #> [9,] \"239 923 8115\" \"239\" \"923\" \"8115\" #> [10,] \"579-499-7527\" \"579\" \"499\" \"7527\" #> [11,] NA NA NA NA #> [12,] \"543.355.3679\" \"543\" \"355\" \"3679\" # Extract/match all str_extract_all(strings, phone) #> [[1]] #> [1] \"219 733 8965\" #> #> [[2]] #> [1] \"329-293-8753\" #> #> [[3]] #> character(0) #> #> [[4]] #> [1] \"595 794 7569\" #> #> [[5]] #> [1] \"387 287 6718\" #> #> [[6]] #> character(0) #> #> [[7]] #> [1] \"233.398.9187\" #> #> [[8]] #> [1] \"482 952 3315\" #> #> [[9]] #> [1] \"239 923 8115\" \"842 566 4692\" #> #> [[10]] #> [1] \"579-499-7527\" #> #> [[11]] #> character(0) #> #> [[12]] #> [1] \"543.355.3679\" #> str_match_all(strings, phone) #> [[1]] #> [,1] [,2] [,3] [,4] #> [1,] \"219 733 8965\" \"219\" \"733\" \"8965\" #> #> [[2]] #> [,1] [,2] [,3] 
[,4] #> [1,] \"329-293-8753\" \"329\" \"293\" \"8753\" #> #> [[3]] #> [,1] [,2] [,3] [,4] #> #> [[4]] #> [,1] [,2] [,3] [,4] #> [1,] \"595 794 7569\" \"595\" \"794\" \"7569\" #> #> [[5]] #> [,1] [,2] [,3] [,4] #> [1,] \"387 287 6718\" \"387\" \"287\" \"6718\" #> #> [[6]] #> [,1] [,2] [,3] [,4] #> #> [[7]] #> [,1] [,2] [,3] [,4] #> [1,] \"233.398.9187\" \"233\" \"398\" \"9187\" #> #> [[8]] #> [,1] [,2] [,3] [,4] #> [1,] \"482 952 3315\" \"482\" \"952\" \"3315\" #> #> [[9]] #> [,1] [,2] [,3] [,4] #> [1,] \"239 923 8115\" \"239\" \"923\" \"8115\" #> [2,] \"842 566 4692\" \"842\" \"566\" \"4692\" #> #> [[10]] #> [,1] [,2] [,3] [,4] #> [1,] \"579-499-7527\" \"579\" \"499\" \"7527\" #> #> [[11]] #> [,1] [,2] [,3] [,4] #> #> [[12]] #> [,1] [,2] [,3] [,4] #> [1,] \"543.355.3679\" \"543\" \"355\" \"3679\" #> # You can also name the groups to make further manipulation easier phone <- \"(?[2-9][0-9]{2})[- .](?[0-9]{3}[- .][0-9]{4})\" str_match(strings, phone) #> area phone #> [1,] \"219 733 8965\" \"219\" \"733 8965\" #> [2,] \"329-293-8753\" \"329\" \"293-8753\" #> [3,] NA NA NA #> [4,] \"595 794 7569\" \"595\" \"794 7569\" #> [5,] \"387 287 6718\" \"387\" \"287 6718\" #> [6,] NA NA NA #> [7,] \"233.398.9187\" \"233\" \"398.9187\" #> [8,] \"482 952 3315\" \"482\" \"952 3315\" #> [9,] \"239 923 8115\" \"239\" \"923 8115\" #> [10,] \"579-499-7527\" \"579\" \"499-7527\" #> [11,] NA NA NA #> [12,] \"543.355.3679\" \"543\" \"355.3679\" x <- c(\"
\", \" <>\", \"\", \"\", NA) str_match(x, \"<(.*?)> <(.*?)>\") #> [,1] [,2] [,3] #> [1,] \" \" \"a\" \"b\" #> [2,] \" <>\" \"a\" \"\" #> [3,] NA NA NA #> [4,] NA NA NA #> [5,] NA NA NA str_match_all(x, \"<(.*?)>\") #> [[1]] #> [,1] [,2] #> [1,] \"\" \"a\" #> [2,] \"\" \"b\" #> #> [[2]] #> [,1] [,2] #> [1,] \"\" \"a\" #> [2,] \"<>\" \"\" #> #> [[3]] #> [,1] [,2] #> [1,] \"\" \"a\" #> #> [[4]] #> [,1] [,2] #> #> [[5]] #> [,1] [,2] #> [1,] NA NA #> str_extract(x, \"<.*?>\") #> [1] \"\" \"\" \"\" NA NA str_extract_all(x, \"<.*?>\") #> [[1]] #> [1] \"\" \"\" #> #> [[2]] #> [1] \"\" \"<>\" #> #> [[3]] #> [1] \"\" #> #> [[4]] #> character(0) #> #> [[5]] #> [1] NA #>"},{"path":"https://stringr.tidyverse.org/dev/reference/str_order.html","id":null,"dir":"Reference","previous_headings":"","what":"Order, rank, or sort a character vector — str_order","title":"Order, rank, or sort a character vector — str_order","text":"str_sort() returns sorted vector. str_order() returns integer vector returns desired order used subsetting, .e. x[str_order(x)] str_sort() str_rank() returns ranks values, .e. arrange(df, str_rank(x)) str_sort(df$x).","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_order.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Order, rank, or sort a character vector — str_order","text":"","code":"str_order( x, decreasing = FALSE, na_last = TRUE, locale = \"en\", numeric = FALSE, ... ) str_rank(x, locale = \"en\", numeric = FALSE, ...) str_sort( x, decreasing = FALSE, na_last = TRUE, locale = \"en\", numeric = FALSE, ... )"},{"path":"https://stringr.tidyverse.org/dev/reference/str_order.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Order, rank, or sort a character vector — str_order","text":"x character vector sort. decreasing boolean. FALSE, default, sorts lowest highest; TRUE sorts highest lowest. na_last NA go? TRUE end, FALSE beginning, NA dropped. 
locale Locale use comparisons. See stringi::stri_locale_list() possible options. Defaults \"en\" (English) ensure default behaviour consistent across platforms. numeric TRUE, sort digits numerically, instead strings. ... options used control collation. Passed stringi::stri_opts_collator().","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_order.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Order, rank, or sort a character vector — str_order","text":"character vector length string.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_order.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Order, rank, or sort a character vector — str_order","text":"","code":"x <- c(\"apple\", \"car\", \"happy\", \"char\") str_sort(x) #> [1] \"apple\" \"car\" \"char\" \"happy\" str_order(x) #> [1] 1 2 4 3 x[str_order(x)] #> [1] \"apple\" \"car\" \"char\" \"happy\" str_rank(x) #> [1] 1 2 4 3 # In Czech, ch is a digraph that sorts after h str_sort(x, locale = \"cs\") #> [1] \"apple\" \"car\" \"happy\" \"char\" # Use numeric = TRUE to sort numbers in strings x <- c(\"100a10\", \"100a5\", \"2b\", \"2a\") str_sort(x) #> [1] \"100a10\" \"100a5\" \"2a\" \"2b\" str_sort(x, numeric = TRUE) #> [1] \"2a\" \"2b\" \"100a5\" \"100a10\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_pad.html","id":null,"dir":"Reference","previous_headings":"","what":"Pad a string to minimum width — str_pad","title":"Pad a string to minimum width — str_pad","text":"Pad string fixed width, str_length(str_pad(x, n)) always greater equal n.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_pad.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Pad a string to minimum width — str_pad","text":"","code":"str_pad( string, width, side = c(\"left\", \"right\", \"both\"), pad = \" \", use_width = TRUE 
)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_pad.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Pad a string to minimum width — str_pad","text":"string Input vector. Either character vector, something coercible one. width Minimum width padded strings. side Side padding character added (left, right ). pad Single padding character (default space). use_width FALSE, use length string instead width; see str_width()/str_length() difference.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_pad.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Pad a string to minimum width — str_pad","text":"character vector length stringr/width/pad.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_pad.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Pad a string to minimum width — str_pad","text":"","code":"rbind( str_pad(\"hadley\", 30, \"left\"), str_pad(\"hadley\", 30, \"right\"), str_pad(\"hadley\", 30, \"both\") ) #> [,1] #> [1,] \" hadley\" #> [2,] \"hadley \" #> [3,] \" hadley \" # All arguments are vectorised except side str_pad(c(\"a\", \"abc\", \"abcdef\"), 10) #> [1] \" a\" \" abc\" \" abcdef\" str_pad(\"a\", c(5, 10, 20)) #> [1] \" a\" \" a\" \" a\" str_pad(\"a\", 10, pad = c(\"-\", \"_\", \" \")) #> [1] \"---------a\" \"_________a\" \" a\" # Longer strings are returned unchanged str_pad(\"hadley\", 3) #> [1] \"hadley\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_remove.html","id":null,"dir":"Reference","previous_headings":"","what":"Remove matched patterns — str_remove","title":"Remove matched patterns — str_remove","text":"Remove matches, .e. 
replace \"\".","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_remove.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Remove matched patterns — str_remove","text":"","code":"str_remove(string, pattern) str_remove_all(string, pattern)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_remove.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Remove matched patterns — str_remove","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\").","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_remove.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Remove matched patterns — str_remove","text":"character vector length string/pattern.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_remove.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Remove matched patterns — str_remove","text":"","code":"fruits <- c(\"one apple\", \"two pears\", \"three bananas\") str_remove(fruits, \"[aeiou]\") #> [1] \"ne apple\" \"tw pears\" \"thre bananas\" str_remove_all(fruits, \"[aeiou]\") #> [1] \"n ppl\" \"tw prs\" \"thr bnns\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace.html","id":null,"dir":"Reference","previous_headings":"","what":"Replace matches with new text — str_replace","title":"Replace matches with new text — str_replace","text":"str_replace() replaces first match; 
str_replace_all() replaces matches.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Replace matches with new text — str_replace","text":"","code":"str_replace(string, pattern, replacement) str_replace_all(string, pattern, replacement)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Replace matches with new text — str_replace","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described stringi::about_search_regex. Control options regex(). perform multiple replacements element string, pass supply named vector (c(pattern1 = replacement1)). Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. replacement replacement value, usually single string, can vector length string pattern. References form \\1, \\2, etc replaced contents respective matched group (created ()). 
Alternatively, supply function, called match (right left) return value used replace match.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Replace matches with new text — str_replace","text":"character vector length string/pattern/replacement.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Replace matches with new text — str_replace","text":"","code":"fruits <- c(\"one apple\", \"two pears\", \"three bananas\") str_replace(fruits, \"[aeiou]\", \"-\") #> [1] \"-ne apple\" \"tw- pears\" \"thr-e bananas\" str_replace_all(fruits, \"[aeiou]\", \"-\") #> [1] \"-n- -ppl-\" \"tw- p--rs\" \"thr-- b-n-n-s\" str_replace_all(fruits, \"[aeiou]\", toupper) #> [1] \"OnE ApplE\" \"twO pEArs\" \"thrEE bAnAnAs\" str_replace_all(fruits, \"b\", NA_character_) #> [1] \"one apple\" \"two pears\" NA str_replace(fruits, \"([aeiou])\", \"\") #> [1] \"ne apple\" \"tw pears\" \"thre bananas\" str_replace(fruits, \"([aeiou])\", \"\\\\1\\\\1\") #> [1] \"oone apple\" \"twoo pears\" \"threee bananas\" # Note that str_replace() is vectorised along text, pattern, and replacement str_replace(fruits, \"[aeiou]\", c(\"1\", \"2\", \"3\")) #> [1] \"1ne apple\" \"tw2 pears\" \"thr3e bananas\" str_replace(fruits, c(\"a\", \"e\", \"i\"), \"-\") #> [1] \"one -pple\" \"two p-ars\" \"three bananas\" # If you want to apply multiple patterns and replacements to the same # string, pass a named vector to pattern. fruits %>% str_c(collapse = \"---\") %>% str_replace_all(c(\"one\" = \"1\", \"two\" = \"2\", \"three\" = \"3\")) #> [1] \"1 apple---2 pears---3 bananas\" # Use a function for more sophisticated replacement. This example # replaces colour names with their hex values. 
colours <- str_c(\"\\\\b\", colors(), \"\\\\b\", collapse=\"|\") col2hex <- function(col) { rgb <- col2rgb(col) rgb(rgb[\"red\", ], rgb[\"green\", ], rgb[\"blue\", ], max = 255) } x <- c( \"Roses are red, violets are blue\", \"My favourite colour is green\" ) str_replace_all(x, colours, col2hex) #> [1] \"Roses are #FF0000, violets are #0000FF\" #> [2] \"My favourite colour is #00FF00\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace_na.html","id":null,"dir":"Reference","previous_headings":"","what":"Turn NA into ","title":"Turn NA into ","text":"Turn NA \"NA\"","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace_na.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Turn NA into ","text":"","code":"str_replace_na(string, replacement = \"NA\")"},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace_na.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Turn NA into ","text":"string Input vector. Either character vector, something coercible one. replacement single string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace_na.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Turn NA into ","text":"","code":"str_replace_na(c(NA, \"abc\", \"def\")) #> [1] \"NA\" \"abc\" \"def\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_split.html","id":null,"dir":"Reference","previous_headings":"","what":"Split up a string into pieces — str_split","title":"Split up a string into pieces — str_split","text":"functions differ primarily input output types: str_split() takes character vector returns list. str_split_1() takes single string returns character vector. str_split_fixed() takes character vector returns matrix. 
str_split_i() takes character vector returns character vector.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_split.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Split up a string into pieces — str_split","text":"","code":"str_split(string, pattern, n = Inf, simplify = FALSE) str_split_1(string, pattern) str_split_fixed(string, pattern, n) str_split_i(string, pattern, i)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_split.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Split up a string into pieces — str_split","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\"). n Maximum number pieces return. Default (Inf) uses possible split positions. split_split(), determines maximum length element output. str_split_fixed(), determines number columns output; input short, result padded \"\". simplify boolean. FALSE (default): returns list character vectors. TRUE: returns character matrix. Element return. Use negative value count right hand side.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_split.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Split up a string into pieces — str_split","text":"str_split_1(): character vector. str_split(): list length string/pattern containing character vectors. str_split_fixed(): character matrix n columns number rows length string/pattern. 
str_split_i(): character vector length string/pattern.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_split.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Split up a string into pieces — str_split","text":"","code":"fruits <- c( \"apples and oranges and pears and bananas\", \"pineapples and mangos and guavas\" ) str_split(fruits, \" and \") #> [[1]] #> [1] \"apples\" \"oranges\" \"pears\" \"bananas\" #> #> [[2]] #> [1] \"pineapples\" \"mangos\" \"guavas\" #> str_split(fruits, \" and \", simplify = TRUE) #> [,1] [,2] [,3] [,4] #> [1,] \"apples\" \"oranges\" \"pears\" \"bananas\" #> [2,] \"pineapples\" \"mangos\" \"guavas\" \"\" # If you want to split a single string, use `str_split1` str_split_1(fruits[[1]], \" and \") #> [1] \"apples\" \"oranges\" \"pears\" \"bananas\" # Specify n to restrict the number of possible matches str_split(fruits, \" and \", n = 3) #> [[1]] #> [1] \"apples\" \"oranges\" \"pears and bananas\" #> #> [[2]] #> [1] \"pineapples\" \"mangos\" \"guavas\" #> str_split(fruits, \" and \", n = 2) #> [[1]] #> [1] \"apples\" \"oranges and pears and bananas\" #> #> [[2]] #> [1] \"pineapples\" \"mangos and guavas\" #> # If n greater than number of pieces, no padding occurs str_split(fruits, \" and \", n = 5) #> [[1]] #> [1] \"apples\" \"oranges\" \"pears\" \"bananas\" #> #> [[2]] #> [1] \"pineapples\" \"mangos\" \"guavas\" #> # Use fixed to return a character matrix str_split_fixed(fruits, \" and \", 3) #> [,1] [,2] [,3] #> [1,] \"apples\" \"oranges\" \"pears and bananas\" #> [2,] \"pineapples\" \"mangos\" \"guavas\" str_split_fixed(fruits, \" and \", 4) #> [,1] [,2] [,3] [,4] #> [1,] \"apples\" \"oranges\" \"pears\" \"bananas\" #> [2,] \"pineapples\" \"mangos\" \"guavas\" \"\" # str_split_i extracts only a single piece from a string str_split_i(fruits, \" and \", 1) #> [1] \"apples\" \"pineapples\" str_split_i(fruits, \" and \", 4) #> [1] \"bananas\" NA # use a negative 
number to select from the end str_split_i(fruits, \" and \", -1) #> [1] \"bananas\" \"guavas\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_starts.html","id":null,"dir":"Reference","previous_headings":"","what":"Detect the presence/absence of a match at the start/end — str_starts","title":"Detect the presence/absence of a match at the start/end — str_starts","text":"str_starts() str_ends() special cases str_detect() match beginning end string, respectively.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_starts.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Detect the presence/absence of a match at the start/end — str_starts","text":"","code":"str_starts(string, pattern, negate = FALSE) str_ends(string, pattern, negate = FALSE)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_starts.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Detect the presence/absence of a match at the start/end — str_starts","text":"string Input vector. Either character vector, something coercible one. pattern Pattern string starts ends. default interpretation regular expression, described stringi::about_search_regex. Control options regex(). Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. 
negate TRUE, inverts resulting boolean vector.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_starts.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Detect the presence/absence of a match at the start/end — str_starts","text":"logical vector.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_starts.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Detect the presence/absence of a match at the start/end — str_starts","text":"","code":"fruit <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") str_starts(fruit, \"p\") #> [1] FALSE FALSE TRUE TRUE str_starts(fruit, \"p\", negate = TRUE) #> [1] TRUE TRUE FALSE FALSE str_ends(fruit, \"e\") #> [1] TRUE FALSE FALSE TRUE str_ends(fruit, \"e\", negate = TRUE) #> [1] FALSE TRUE TRUE FALSE"},{"path":"https://stringr.tidyverse.org/dev/reference/str_sub.html","id":null,"dir":"Reference","previous_headings":"","what":"Get and set substrings using their positions — str_sub","title":"Get and set substrings using their positions — str_sub","text":"str_sub() extracts replaces elements single position string. str_sub_all() allows extract strings multiple elements every string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_sub.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get and set substrings using their positions — str_sub","text":"","code":"str_sub(string, start = 1L, end = -1L) str_sub(string, start = 1L, end = -1L, omit_na = FALSE) <- value str_sub_all(string, start = 1L, end = -1L)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_sub.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get and set substrings using their positions — str_sub","text":"string Input vector. Either character vector, something coercible one. start, end pair integer vectors defining range characters extract (inclusive). 
Alternatively, instead pair vectors, can pass matrix start. matrix two columns, either labelled start end, start length. omit_na Single logical value. TRUE, missing values arguments provided result unchanged input. value replacement string","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_sub.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get and set substrings using their positions — str_sub","text":"str_sub(): character vector length string/start/end. str_sub_all(): list length string. element character vector length start/end.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_sub.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get and set substrings using their positions — str_sub","text":"","code":"hw <- \"Hadley Wickham\" str_sub(hw, 1, 6) #> [1] \"Hadley\" str_sub(hw, end = 6) #> [1] \"Hadley\" str_sub(hw, 8, 14) #> [1] \"Wickham\" str_sub(hw, 8) #> [1] \"Wickham\" # Negative indices index from end of string str_sub(hw, -1) #> [1] \"m\" str_sub(hw, -7) #> [1] \"Wickham\" str_sub(hw, end = -7) #> [1] \"Hadley W\" # str_sub() is vectorised by both string and position str_sub(hw, c(1, 8), c(6, 14)) #> [1] \"Hadley\" \"Wickham\" # if you want to extract multiple positions from multiple strings, # use str_sub_all() x <- c(\"abcde\", \"ghifgh\") str_sub(x, c(1, 2), c(2, 4)) #> [1] \"ab\" \"hif\" str_sub_all(x, start = c(1, 2), end = c(2, 4)) #> [[1]] #> [1] \"ab\" \"bcd\" #> #> [[2]] #> [1] \"gh\" \"hif\" #> # Alternatively, you can pass in a two column matrix, as in the # output from str_locate_all pos <- str_locate_all(hw, \"[aeio]\")[[1]] pos #> start end #> [1,] 2 2 #> [2,] 5 5 #> [3,] 9 9 #> [4,] 13 13 str_sub(hw, pos) #> [1] \"a\" \"e\" \"i\" \"a\" # You can also use `str_sub()` to modify strings: x <- \"BBCDEF\" str_sub(x, 1, 1) <- \"A\"; x #> [1] \"ABCDEF\" str_sub(x, -1, -1) <- \"K\"; x #> [1] \"ABCDEK\" str_sub(x, -2, -2) <- 
\"GHIJ\"; x #> [1] \"ABCDGHIJK\" str_sub(x, 2, -2) <- \"\"; x #> [1] \"AK\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_subset.html","id":null,"dir":"Reference","previous_headings":"","what":"Find matching elements — str_subset","title":"Find matching elements — str_subset","text":"str_subset() returns elements string least one match pattern. wrapper around x[str_detect(x, pattern)], equivalent grep(pattern, x, value = TRUE). Use str_extract() find location match within string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_subset.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Find matching elements — str_subset","text":"","code":"str_subset(string, pattern, negate = FALSE)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_subset.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Find matching elements — str_subset","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\"). 
negate TRUE, inverts resulting boolean vector.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_subset.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Find matching elements — str_subset","text":"character vector, usually smaller string.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_subset.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Find matching elements — str_subset","text":"","code":"fruit <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") str_subset(fruit, \"a\") #> [1] \"apple\" \"banana\" \"pear\" \"pineapple\" str_subset(fruit, \"^a\") #> [1] \"apple\" str_subset(fruit, \"a$\") #> [1] \"banana\" str_subset(fruit, \"b\") #> [1] \"banana\" str_subset(fruit, \"[aeiou]\") #> [1] \"apple\" \"banana\" \"pear\" \"pineapple\" # Elements that don't match str_subset(fruit, \"^p\", negate = TRUE) #> [1] \"apple\" \"banana\" # Missings never match str_subset(c(\"a\", NA, \"b\"), \".\") #> [1] \"a\" \"b\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_trim.html","id":null,"dir":"Reference","previous_headings":"","what":"Remove whitespace — str_trim","title":"Remove whitespace — str_trim","text":"str_trim() removes whitespace start end string; str_squish() removes whitespace start end, replaces internal whitespace single space.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_trim.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Remove whitespace — str_trim","text":"","code":"str_trim(string, side = c(\"both\", \"left\", \"right\")) str_squish(string)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_trim.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Remove whitespace — str_trim","text":"string Input vector. Either character vector, something coercible one. 
side Side remove whitespace: \"left\", \"right\", \"\", default.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_trim.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Remove whitespace — str_trim","text":"character vector length string.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_trim.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Remove whitespace — str_trim","text":"","code":"str_trim(\" String with trailing and leading white space\\t\") #> [1] \"String with trailing and leading white space\" str_trim(\"\\n\\nString with trailing and leading white space\\n\\n\") #> [1] \"String with trailing and leading white space\" str_squish(\" String with trailing, middle, and leading white space\\t\") #> [1] \"String with trailing, middle, and leading white space\" str_squish(\"\\n\\nString with excess, trailing and leading white space\\n\\n\") #> [1] \"String with excess, trailing and leading white space\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_trunc.html","id":null,"dir":"Reference","previous_headings":"","what":"Truncate a string to maximum width — str_trunc","title":"Truncate a string to maximum width — str_trunc","text":"Truncate string fixed characters, str_length(str_trunc(x, n)) always less equal n.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_trunc.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Truncate a string to maximum width — str_trunc","text":"","code":"str_trunc(string, width, side = c(\"right\", \"left\", \"center\"), ellipsis = \"...\")"},{"path":"https://stringr.tidyverse.org/dev/reference/str_trunc.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Truncate a string to maximum width — str_trunc","text":"string Input vector. Either character vector, something coercible one. 
width Maximum width string. side, ellipsis Location content ellipsis indicates content removed.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_trunc.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Truncate a string to maximum width — str_trunc","text":"character vector length string.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_trunc.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Truncate a string to maximum width — str_trunc","text":"","code":"x <- \"This string is moderately long\" rbind( str_trunc(x, 20, \"right\"), str_trunc(x, 20, \"left\"), str_trunc(x, 20, \"center\") ) #> [,1] #> [1,] \"This string is mo...\" #> [2,] \"...s moderately long\" #> [3,] \"This stri...ely long\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_unique.html","id":null,"dir":"Reference","previous_headings":"","what":"Remove duplicated strings — str_unique","title":"Remove duplicated strings — str_unique","text":"str_unique() removes duplicated values, optional control duplication measured.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_unique.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Remove duplicated strings — str_unique","text":"","code":"str_unique(string, locale = \"en\", ignore_case = FALSE, ...)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_unique.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Remove duplicated strings — str_unique","text":"string Input vector. Either character vector, something coercible one. locale Locale use comparisons. See stringi::stri_locale_list() possible options. Defaults \"en\" (English) ensure default behaviour consistent across platforms. ignore_case Ignore case comparing strings? ... options used control collation. 
Passed stringi::stri_opts_collator().","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_unique.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Remove duplicated strings — str_unique","text":"character vector, usually shorter string.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_unique.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Remove duplicated strings — str_unique","text":"","code":"str_unique(c(\"a\", \"b\", \"c\", \"b\", \"a\")) #> [1] \"a\" \"b\" \"c\" str_unique(c(\"a\", \"b\", \"c\", \"B\", \"A\")) #> [1] \"a\" \"b\" \"c\" \"B\" \"A\" str_unique(c(\"a\", \"b\", \"c\", \"B\", \"A\"), ignore_case = TRUE) #> [1] \"a\" \"b\" \"c\" # Use ... to pass additional arguments to stri_unique() str_unique(c(\"motley\", \"mötley\", \"pinguino\", \"pingüino\")) #> [1] \"motley\" \"mötley\" \"pinguino\" \"pingüino\" str_unique(c(\"motley\", \"mötley\", \"pinguino\", \"pingüino\"), strength = 1) #> [1] \"motley\" \"pinguino\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_view.html","id":null,"dir":"Reference","previous_headings":"","what":"View strings and matches — str_view","title":"View strings and matches — str_view","text":"str_view() used print underlying representation string see pattern matches. Matches surrounded <> unusual whitespace (.e. whitespace apart \" \" \"\\n\") surrounded {} escaped. 
possible, matches unusual whitespace coloured blue NAs red.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_view.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"View strings and matches — str_view","text":"","code":"str_view( string, pattern = NULL, match = TRUE, html = FALSE, use_escapes = FALSE )"},{"path":"https://stringr.tidyverse.org/dev/reference/str_view.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"View strings and matches — str_view","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\"). match pattern supplied, elements shown? TRUE, default, shows elements match pattern. NA shows elements. FALSE shows elements match pattern. pattern supplied, elements always shown. html Use HTML output? TRUE create HTML widget; FALSE style using ANSI escapes. default prefers ANSI escapes available current terminal; can override setting options(stringr.html = TRUE). use_escapes TRUE, non-ASCII characters rendered unicode escapes. 
useful see exactly underlying values stored string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_view.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"View strings and matches — str_view","text":"","code":"# Show special characters str_view(c(\"\\\"\\\\\", \"\\\\\\\\\\\\\", \"fgh\", NA, \"NA\")) #> [1] │ \"\\ #> [2] │ \\\\\\ #> [3] │ fgh #> [4] │ NA #> [5] │ NA # A non-breaking space looks like a regular space: nbsp <- \"Hi\\u00A0you\" nbsp #> [1] \"Hi you\" # But it doesn't behave like one: str_detect(nbsp, \" \") #> [1] FALSE # So str_view() brings it to your attention with a blue background str_view(nbsp) #> [1] │ Hi{\\u00a0}you # You can also use escapes to see all non-ASCII characters str_view(nbsp, use_escapes = TRUE) #> [1] │ Hi\\u00a0you # Supply a pattern to see where it matches str_view(c(\"abc\", \"def\", \"fghi\"), \"[aeiou]\") #> [1] │ bc #> [2] │ df #> [3] │ fgh str_view(c(\"abc\", \"def\", \"fghi\"), \"^\") #> [1] │ <>abc #> [2] │ <>def #> [3] │ <>fghi str_view(c(\"abc\", \"def\", \"fghi\"), \"..\") #> [1] │ c #> [2] │ f #> [3] │ # By default, only matching strings will be shown str_view(c(\"abc\", \"def\", \"fghi\"), \"e\") #> [2] │ df # but you can show all: str_view(c(\"abc\", \"def\", \"fghi\"), \"e\", match = NA) #> [1] │ abc #> [2] │ df #> [3] │ fghi # or just those that don't match: str_view(c(\"abc\", \"def\", \"fghi\"), \"e\", match = FALSE) #> [1] │ abc #> [3] │ fghi"},{"path":"https://stringr.tidyverse.org/dev/reference/str_which.html","id":null,"dir":"Reference","previous_headings":"","what":"Find matching indices — str_which","title":"Find matching indices — str_which","text":"str_subset() returns indices ofstring least one match pattern. 
wrapper around (str_detect(x, pattern)), equivalent grep(pattern, x).","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_which.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Find matching indices — str_which","text":"","code":"str_which(string, pattern, negate = FALSE)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_which.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Find matching indices — str_which","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\"). 
negate TRUE, inverts resulting boolean vector.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_which.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Find matching indices — str_which","text":"integer vector, usually smaller string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_which.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Find matching indices — str_which","text":"","code":"fruit <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") str_which(fruit, \"a\") #> [1] 1 2 3 4 # Elements that don't match str_which(fruit, \"^p\", negate = TRUE) #> [1] 1 2 # Missings never match str_which(c(\"a\", NA, \"b\"), \".\") #> [1] 1 3"},{"path":"https://stringr.tidyverse.org/dev/reference/str_wrap.html","id":null,"dir":"Reference","previous_headings":"","what":"Wrap words into nicely formatted paragraphs — str_wrap","title":"Wrap words into nicely formatted paragraphs — str_wrap","text":"Wrap words paragraphs, minimizing \"raggedness\" lines (.e. variation length line) using Knuth-Plass algorithm.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_wrap.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Wrap words into nicely formatted paragraphs — str_wrap","text":"","code":"str_wrap(string, width = 80, indent = 0, exdent = 0, whitespace_only = TRUE)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_wrap.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Wrap words into nicely formatted paragraphs — str_wrap","text":"string Input vector. Either character vector, something coercible one. width Positive integer giving target line width (number characters). width less equal 1 put word line. indent, exdent non-negative integer giving indent first line (indent) subsequent lines (exdent). whitespace_only boolean. 
TRUE (default) wrapping occur whitespace. FALSE, can break non-word character (e.g. /, -).","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_wrap.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Wrap words into nicely formatted paragraphs — str_wrap","text":"character vector length string.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_wrap.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Wrap words into nicely formatted paragraphs — str_wrap","text":"","code":"thanks_path <- file.path(R.home(\"doc\"), \"THANKS\") thanks <- str_c(readLines(thanks_path), collapse = \"\\n\") thanks <- word(thanks, 1, 3, fixed(\"\\n\\n\")) cat(str_wrap(thanks), \"\\n\") #> R would not be what it is today without the invaluable help of these people #> outside of the (former and current) R Core team, who contributed by donating #> code, bug fixes and documentation: Valerio Aimale, Suharto Anggono, Thomas #> Baier, Gabe Becker, Henrik Bengtsson, Roger Bivand, Ben Bolker, David Brahm, #> G\"oran Brostr\"om, Patrick Burns, Vince Carey, Saikat DebRoy, Matt Dowle, Brian #> D'Urso, Lyndon Drake, Dirk Eddelbuettel, Claus Ekstrom, Sebastian Fischmeister, #> John Fox, Paul Gilbert, Yu Gong, Gabor Grothendieck, Frank E Harrell Jr, Peter #> M. Haverty, Torsten Hothorn, Robert King, Kjetil Kjernsmo, Roger Koenker, #> Philippe Lambert, Jan de Leeuw, Jim Lindsey, Patrick Lindsey, Catherine Loader, #> Gordon Maclean, Arni Magnusson, John Maindonald, David Meyer, Ei-ji Nakama, #> Jens Oehlschl\"agel, Steve Oncley, Richard O'Keefe, Hubert Palme, Roger D. Peng, #> Jose' C. Pinheiro, Tony Plate, Anthony Rossini, Jonathan Rougier, Petr Savicky, #> Guenther Sawitzki, Marc Schwartz, Arun Srinivasan, Detlef Steuer, Bill Simpson, #> Gordon Smyth, Adrian Trapletti, Terry Therneau, Rolf Turner, Bill Venables, #> Gregory R. 
Warnes, Andreas Weingessel, Morten Welinder, James Wettenhall, Simon #> Wood, and Achim Zeileis. Others have written code that has been adopted by R and #> is acknowledged in the code files, including cat(str_wrap(thanks, width = 40), \"\\n\") #> R would not be what it is today without #> the invaluable help of these people #> outside of the (former and current) R #> Core team, who contributed by donating #> code, bug fixes and documentation: #> Valerio Aimale, Suharto Anggono, Thomas #> Baier, Gabe Becker, Henrik Bengtsson, #> Roger Bivand, Ben Bolker, David Brahm, #> G\"oran Brostr\"om, Patrick Burns, Vince #> Carey, Saikat DebRoy, Matt Dowle, #> Brian D'Urso, Lyndon Drake, Dirk #> Eddelbuettel, Claus Ekstrom, Sebastian #> Fischmeister, John Fox, Paul Gilbert, #> Yu Gong, Gabor Grothendieck, Frank E #> Harrell Jr, Peter M. Haverty, Torsten #> Hothorn, Robert King, Kjetil Kjernsmo, #> Roger Koenker, Philippe Lambert, Jan #> de Leeuw, Jim Lindsey, Patrick Lindsey, #> Catherine Loader, Gordon Maclean, #> Arni Magnusson, John Maindonald, #> David Meyer, Ei-ji Nakama, Jens #> Oehlschl\"agel, Steve Oncley, Richard #> O'Keefe, Hubert Palme, Roger D. Peng, #> Jose' C. Pinheiro, Tony Plate, Anthony #> Rossini, Jonathan Rougier, Petr Savicky, #> Guenther Sawitzki, Marc Schwartz, Arun #> Srinivasan, Detlef Steuer, Bill Simpson, #> Gordon Smyth, Adrian Trapletti, Terry #> Therneau, Rolf Turner, Bill Venables, #> Gregory R. Warnes, Andreas Weingessel, #> Morten Welinder, James Wettenhall, Simon #> Wood, and Achim Zeileis. 
Others have #> written code that has been adopted by R #> and is acknowledged in the code files, #> including cat(str_wrap(thanks, width = 60, indent = 2), \"\\n\") #> R would not be what it is today without the invaluable #> help of these people outside of the (former and current) #> R Core team, who contributed by donating code, bug fixes #> and documentation: Valerio Aimale, Suharto Anggono, Thomas #> Baier, Gabe Becker, Henrik Bengtsson, Roger Bivand, Ben #> Bolker, David Brahm, G\"oran Brostr\"om, Patrick Burns, #> Vince Carey, Saikat DebRoy, Matt Dowle, Brian D'Urso, #> Lyndon Drake, Dirk Eddelbuettel, Claus Ekstrom, Sebastian #> Fischmeister, John Fox, Paul Gilbert, Yu Gong, Gabor #> Grothendieck, Frank E Harrell Jr, Peter M. Haverty, #> Torsten Hothorn, Robert King, Kjetil Kjernsmo, Roger #> Koenker, Philippe Lambert, Jan de Leeuw, Jim Lindsey, #> Patrick Lindsey, Catherine Loader, Gordon Maclean, Arni #> Magnusson, John Maindonald, David Meyer, Ei-ji Nakama, #> Jens Oehlschl\"agel, Steve Oncley, Richard O'Keefe, Hubert #> Palme, Roger D. Peng, Jose' C. Pinheiro, Tony Plate, Anthony #> Rossini, Jonathan Rougier, Petr Savicky, Guenther Sawitzki, #> Marc Schwartz, Arun Srinivasan, Detlef Steuer, Bill Simpson, #> Gordon Smyth, Adrian Trapletti, Terry Therneau, Rolf Turner, #> Bill Venables, Gregory R. Warnes, Andreas Weingessel, Morten #> Welinder, James Wettenhall, Simon Wood, and Achim Zeileis. 
#> Others have written code that has been adopted by R and is #> acknowledged in the code files, including cat(str_wrap(thanks, width = 60, exdent = 2), \"\\n\") #> R would not be what it is today without the invaluable help #> of these people outside of the (former and current) R #> Core team, who contributed by donating code, bug fixes #> and documentation: Valerio Aimale, Suharto Anggono, Thomas #> Baier, Gabe Becker, Henrik Bengtsson, Roger Bivand, Ben #> Bolker, David Brahm, G\"oran Brostr\"om, Patrick Burns, #> Vince Carey, Saikat DebRoy, Matt Dowle, Brian D'Urso, #> Lyndon Drake, Dirk Eddelbuettel, Claus Ekstrom, Sebastian #> Fischmeister, John Fox, Paul Gilbert, Yu Gong, Gabor #> Grothendieck, Frank E Harrell Jr, Peter M. Haverty, #> Torsten Hothorn, Robert King, Kjetil Kjernsmo, Roger #> Koenker, Philippe Lambert, Jan de Leeuw, Jim Lindsey, #> Patrick Lindsey, Catherine Loader, Gordon Maclean, Arni #> Magnusson, John Maindonald, David Meyer, Ei-ji Nakama, #> Jens Oehlschl\"agel, Steve Oncley, Richard O'Keefe, Hubert #> Palme, Roger D. Peng, Jose' C. Pinheiro, Tony Plate, #> Anthony Rossini, Jonathan Rougier, Petr Savicky, Guenther #> Sawitzki, Marc Schwartz, Arun Srinivasan, Detlef Steuer, #> Bill Simpson, Gordon Smyth, Adrian Trapletti, Terry #> Therneau, Rolf Turner, Bill Venables, Gregory R. Warnes, #> Andreas Weingessel, Morten Welinder, James Wettenhall, #> Simon Wood, and Achim Zeileis. 
Others have written code #> that has been adopted by R and is acknowledged in the code #> files, including cat(str_wrap(thanks, width = 0, exdent = 2), \"\\n\") #> R #> would #> not #> be #> what #> it #> is #> today #> without #> the #> invaluable #> help #> of #> these #> people #> outside #> of #> the #> (former #> and #> current) #> R #> Core #> team, #> who #> contributed #> by #> donating #> code, #> bug #> fixes #> and #> documentation: #> Valerio #> Aimale, #> Suharto #> Anggono, #> Thomas #> Baier, #> Gabe #> Becker, #> Henrik #> Bengtsson, #> Roger #> Bivand, #> Ben #> Bolker, #> David #> Brahm, #> G\"oran #> Brostr\"om, #> Patrick #> Burns, #> Vince #> Carey, #> Saikat #> DebRoy, #> Matt #> Dowle, #> Brian #> D'Urso, #> Lyndon #> Drake, #> Dirk #> Eddelbuettel, #> Claus #> Ekstrom, #> Sebastian #> Fischmeister, #> John #> Fox, #> Paul #> Gilbert, #> Yu #> Gong, #> Gabor #> Grothendieck, #> Frank #> E #> Harrell #> Jr, #> Peter #> M. #> Haverty, #> Torsten #> Hothorn, #> Robert #> King, #> Kjetil #> Kjernsmo, #> Roger #> Koenker, #> Philippe #> Lambert, #> Jan #> de #> Leeuw, #> Jim #> Lindsey, #> Patrick #> Lindsey, #> Catherine #> Loader, #> Gordon #> Maclean, #> Arni #> Magnusson, #> John #> Maindonald, #> David #> Meyer, #> Ei-ji #> Nakama, #> Jens #> Oehlschl\"agel, #> Steve #> Oncley, #> Richard #> O'Keefe, #> Hubert #> Palme, #> Roger #> D. #> Peng, #> Jose' #> C. #> Pinheiro, #> Tony #> Plate, #> Anthony #> Rossini, #> Jonathan #> Rougier, #> Petr #> Savicky, #> Guenther #> Sawitzki, #> Marc #> Schwartz, #> Arun #> Srinivasan, #> Detlef #> Steuer, #> Bill #> Simpson, #> Gordon #> Smyth, #> Adrian #> Trapletti, #> Terry #> Therneau, #> Rolf #> Turner, #> Bill #> Venables, #> Gregory #> R. #> Warnes, #> Andreas #> Weingessel, #> Morten #> Welinder, #> James #> Wettenhall, #> Simon #> Wood, #> and #> Achim #> Zeileis. 
#> Others #> have #> written #> code #> that #> has #> been #> adopted #> by #> R #> and #> is #> acknowledged #> in #> the #> code #> files, #> including"},{"path":"https://stringr.tidyverse.org/dev/reference/stringr-data.html","id":null,"dir":"Reference","previous_headings":"","what":"Sample character vectors for practicing string manipulations — stringr-data","title":"Sample character vectors for practicing string manipulations — stringr-data","text":"fruit words come rcorpora package written Gabor Csardi; data collected Darius Kazemi made available https://github.com/dariusk/corpora. sentences collection \"Harvard sentences\" used standardised testing voice.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/stringr-data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Sample character vectors for practicing string manipulations — stringr-data","text":"","code":"sentences fruit words"},{"path":"https://stringr.tidyverse.org/dev/reference/stringr-data.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Sample character vectors for practicing string manipulations — stringr-data","text":"Character vectors.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/stringr-data.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Sample character vectors for practicing string manipulations — stringr-data","text":"","code":"length(sentences) #> [1] 720 sentences[1:5] #> [1] \"The birch canoe slid on the smooth planks.\" #> [2] \"Glue the sheet to the dark blue background.\" #> [3] \"It's easy to tell the depth of a well.\" #> [4] \"These days a chicken leg is a rare dish.\" #> [5] \"Rice is often served in round bowls.\" length(fruit) #> [1] 80 fruit[1:5] #> [1] \"apple\" \"apricot\" \"avocado\" \"banana\" \"bell pepper\" length(words) #> [1] 980 words[1:5] #> [1] \"a\" \"able\" \"about\" \"absolute\" 
\"accept\""},{"path":"https://stringr.tidyverse.org/dev/reference/stringr-package.html","id":null,"dir":"Reference","previous_headings":"","what":"stringr: Simple, Consistent Wrappers for Common String Operations — stringr-package","title":"stringr: Simple, Consistent Wrappers for Common String Operations — stringr-package","text":"consistent, simple easy use set wrappers around fantastic 'stringi' package. function argument names (positions) consistent, functions deal \"NA\"'s zero length vectors way, output one function easy feed input another.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/stringr-package.html","id":"author","dir":"Reference","previous_headings":"","what":"Author","title":"stringr: Simple, Consistent Wrappers for Common String Operations — stringr-package","text":"Maintainer: Hadley Wickham hadley@rstudio.com [copyright holder] contributors: RStudio [copyright holder, funder]","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/word.html","id":null,"dir":"Reference","previous_headings":"","what":"Extract words from a sentence — word","title":"Extract words from a sentence — word","text":"Extract words sentence","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/word.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Extract words from a sentence — word","text":"","code":"word(string, start = 1L, end = start, sep = fixed(\" \"))"},{"path":"https://stringr.tidyverse.org/dev/reference/word.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Extract words from a sentence — word","text":"string Input vector. Either character vector, something coercible one. start, end Pair integer vectors giving range words (inclusive) extract. negative, counts backwards last word. default value select first word. sep Separator words. 
Defaults single space.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/word.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Extract words from a sentence — word","text":"character vector length string/start/end.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/word.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Extract words from a sentence — word","text":"","code":"sentences <- c(\"Jane saw a cat\", \"Jane sat down\") word(sentences, 1) #> [1] \"Jane\" \"Jane\" word(sentences, 2) #> [1] \"saw\" \"sat\" word(sentences, -1) #> [1] \"cat\" \"down\" word(sentences, 2, -1) #> [1] \"saw a cat\" \"sat down\" # Also vectorised over start and end word(sentences[1], 1:3, -1) #> [1] \"Jane saw a cat\" \"saw a cat\" \"a cat\" word(sentences[1], 1, 1:4) #> [1] \"Jane\" \"Jane saw\" \"Jane saw a\" \"Jane saw a cat\" # Can define words by other separators str <- 'abc.def..123.4568.999' word(str, 1, sep = fixed('..')) #> [1] \"abc.def\" word(str, 2, sep = fixed('..')) #> [1] \"123.4568.999\""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-150","dir":"Changelog","previous_headings":"","what":"stringr 1.5.0","title":"stringr 1.5.0","text":"CRAN release: 2022-12-02","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"breaking-changes-1-5-0","dir":"Changelog","previous_headings":"","what":"Breaking changes","title":"stringr 1.5.0","text":"stringr functions now consistently implement tidyverse recycling rules (#372). two main changes: vectors length 1 recycled. Previously, (e.g.) str_detect(letters, c(\"x\", \"y\")) worked, now errors. str_c() ignores NULLs, rather treating length 0 vectors. Additionally, many arguments now throw errors, rather warnings, supplied wrong type input. regex() friends now generate class names stringr_ prefix (#384). 
str_detect(), str_starts(), str_ends() str_subset() now error used either empty string (\"\") boundary(). operations didn’t really make sense (str_detect(x, \"\") returned TRUE non-empty strings) made easy make mistakes programming.","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"new-features-1-5-0","dir":"Changelog","previous_headings":"","what":"New features","title":"stringr 1.5.0","text":"Many tweaks documentation make useful consistent. New vignette(\"-base\") @sastoudt provides comprehensive comparison base R functions stringr equivalents. ’s designed help move stringr ’re already familiar base R string functions (#266). New str_escape() escapes regular expression metacharacters, providing alternative fixed() want compose pattern user supplied strings (#408). New str_equal() compares two character vectors using unicode rules, optionally ignoring case (#381). str_extract() can now optionally extract capturing group instead complete match (#420). New str_flatten_comma() special case str_flatten() designed comma separated flattening can correctly apply Oxford commas two elements (#444). New str_split_1() tailored special case splitting single string (#409). New str_split_i() extract single piece string (#278, @bfgray3). New str_like() allows use SQL wildcards (#280, @rjpat). New str_rank() complete set order/rank/sort functions (#353). New str_sub_all() extract multiple substrings string. New str_unique() wrapper around stri_unique() returns unique string values character vector (#249, @seasmith). str_view() uses ANSI colouring rather HTML widget (#370). works places requires fewer dependencies. includes number small improvements: longer requires pattern can use display strings special characters. highlights unusual whitespace characters. ’s vectorised stringandpattern` (#407). defaults displaying matches, making str_view_all() redundant (hence deprecated) (#455). New str_width() returns display width string (#380). 
stringr now licensed MIT (#351).","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"minor-improvements-and-bug-fixes-1-5-0","dir":"Changelog","previous_headings":"","what":"Minor improvements and bug fixes","title":"stringr 1.5.0","text":"Better error message supply non-string pattern (#378). new data source sentences fixed many small errors. str_extract() str_exctract_all() now work correctly pattern boundary(). str_flatten() gains last argument optionally override final separator (#377). gains na.rm argument remove missing values (since ’s summary function) (#439). str_pad() gains use_width argument control whether use total code point width number code points “width” string (#190). str_replace() str_replace_all() can use standard tidyverse formula shorthand replacement function (#331). str_starts() str_ends() now correctly respect regex operator precedence (@carlganz). str_wrap() breaks whitespace default; set whitespace_only = FALSE return previous behaviour (#335, @rjpat). word() now returns sentence using negative start parameter greater equal number words. (@pdelboca, #245)","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-141","dir":"Changelog","previous_headings":"","what":"stringr 1.4.1","title":"stringr 1.4.1","text":"CRAN release: 2022-08-20 Hot patch release resolve R CMD check failures.","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-140","dir":"Changelog","previous_headings":"","what":"stringr 1.4.0","title":"stringr 1.4.0","text":"CRAN release: 2019-02-10 str_interp() now renders lists consistently independent presence additional placeholders (@amhrasmussen). New str_starts() str_ends() functions detect patterns beginning end strings (@jonthegeek, #258). str_subset(), str_detect(), str_which() get negate argument, useful want elements match (#259, @yutannihilation). 
New str_to_sentence() function capitalize sentence case (@jonthegeek, #202).","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-131","dir":"Changelog","previous_headings":"","what":"stringr 1.3.1","title":"stringr 1.3.1","text":"CRAN release: 2018-05-10 str_replace_all() named vector now respects modifier functions (#207) str_trunc() vectorised correctly (#203, @austin3dickey). str_view() handles NA values gracefully (#217). ’ve also tweaked sizing policy hopefully work better notebooks, preserving existing behaviour knit documents (#232).","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-130","dir":"Changelog","previous_headings":"","what":"stringr 1.3.0","title":"stringr 1.3.0","text":"CRAN release: 2018-02-19","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"api-changes-1-3-0","dir":"Changelog","previous_headings":"","what":"API changes","title":"stringr 1.3.0","text":"package build, may see Error : object ‘ignore.case’ exported 'namespace:stringr'. long deprecated str_join(), ignore.case() perl() now removed.","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"new-features-1-3-0","dir":"Changelog","previous_headings":"","what":"New features","title":"stringr 1.3.0","text":"str_glue() str_glue_data() provide convenient wrappers around glue glue_data() glue package (#157). str_flatten() wrapper around stri_flatten() clearly conveys flattening character vector single string (#186). str_remove() str_remove_all() functions. wrap str_replace() str_replace_all() remove patterns strings. (@Shians, #178) str_squish() removes spaces left right side strings, also converts multiple space (space-like characters) single space within strings (@stephlocke, #197). str_sub() gains omit_na argument ignoring NA. Accordingly, str_replace() now ignores NAs keeps original strings. 
(@yutannihilation, #164)","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"bug-fixes-and-minor-improvements-1-3-0","dir":"Changelog","previous_headings":"","what":"Bug fixes and minor improvements","title":"stringr 1.3.0","text":"str_trunc() now preserves NAs (@ClaytonJY, #162) str_trunc() now throws error width shorter ellipsis (@ClaytonJY, #163). Long deprecated str_join(), ignore.case() perl() now removed.","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-120","dir":"Changelog","previous_headings":"","what":"stringr 1.2.0","title":"stringr 1.2.0","text":"CRAN release: 2017-02-18","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"api-changes-1-2-0","dir":"Changelog","previous_headings":"","what":"API changes","title":"stringr 1.2.0","text":"str_match_all() now returns NA optional group doesn’t match (previously returned ““). consistent str_match() match failures (#134).","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"new-features-1-2-0","dir":"Changelog","previous_headings":"","what":"New features","title":"stringr 1.2.0","text":"str_replace(), replacement can now function called match whose return value used replace match. New str_which() mimics grep() (#129). new vignette (vignette(\"regular-expressions\")) describes details regular expressions supported stringr. main vignette (vignette(\"stringr\")) updated give high-level overview package.","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"minor-improvements-and-bug-fixes-1-2-0","dir":"Changelog","previous_headings":"","what":"Minor improvements and bug fixes","title":"stringr 1.2.0","text":"str_order() str_sort() gain explicit numeric argument sorting mixed numbers strings. str_replace_all() now throws error replacement character vector. replacement NA_character_ replaces complete string replaces NA (#124). functions take locale (e.g. 
str_to_lower() str_sort()) default “en” (English) ensure default consistent across platforms.","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-110","dir":"Changelog","previous_headings":"","what":"stringr 1.1.0","title":"stringr 1.1.0","text":"CRAN release: 2016-08-19 Add sample datasets: fruit, words sentences. fixed(), regex(), coll() now throw error use anything plain string (#60). ’ve clarified replacement perl() regex() regexp() (#61). boundary() improved defaults splitting non-word boundaries (#58, @lmullen). str_detect() now can detect boundaries (checking str_count() > 0) (#120). str_subset() works similarly. str_extract() str_extract_all() now work boundary(). particularly useful want extract logical constructs like words sentences. str_extract_all() respects simplify argument used fixed() matches. str_subset() now respects custom options fixed() patterns (#79, @gagolews). str_replace() str_replace_all() now behave correctly replacement string contains $s, \\\\\\\\1, etc. (#83, #99). str_split() gains simplify argument match str_extract_all() etc. str_view() str_view_all() create HTML widgets display regular expression matches (#96). word() returns NA indexes greater number words (#112).","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-100","dir":"Changelog","previous_headings":"","what":"stringr 1.0.0","title":"stringr 1.0.0","text":"CRAN release: 2015-04-30 stringr now powered stringi instead base R regular expressions. improves unicode support, makes operations considerably faster. find stringr inadequate string processing needs, highly recommend looking stringi detail. stringr gains vignette, currently straight forward update article appeared R Journal. str_c() now returns zero length vector inputs zero length vectors. consistent functions, standard R recycling rules. Similarly, using str_c(\"x\", NA) now yields NA. want \"xNA\", use str_replace_na() inputs. 
str_replace_all() gains convenient syntax applying multiple pairs pattern replacement vector: str_match() now returns NA optional group doesn’t match (previously returned ““). consistent str_extract() match failures. New str_subset() keeps values match pattern. ’s convenient wrapper x[str_detect(x)] (#21, @jiho). New str_order() str_sort() allow sort order strings specified locale. New str_conv() convert strings specified encoding UTF-8. New modifier boundary() allows count, locate split character, word, line sentence boundaries. documentation got lot love, similar functions (e.g. first variants) now documented together. hopefully make easier locate function need. ignore.case(x) deprecated favour fixed|regex|coll(x, ignore.case = TRUE), perl(x) deprecated favour regex(x). str_join() deprecated, please use str_c() instead.","code":"input <- c(\"abc\", \"def\") str_replace_all(input, c(\"[ad]\" = \"!\", \"[cf]\" = \"?\"))"},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-062","dir":"Changelog","previous_headings":"","what":"stringr 0.6.2","title":"stringr 0.6.2","text":"CRAN release: 2012-12-06 fixed path str_wrap example works R installations. 
remove dependency plyr","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-061","dir":"Changelog","previous_headings":"","what":"stringr 0.6.1","title":"stringr 0.6.1","text":"CRAN release: 2012-07-25 Zero input str_split_fixed returns 0 row matrix n columns Export str_join","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-06","dir":"Changelog","previous_headings":"","what":"stringr 0.6","title":"stringr 0.6","text":"CRAN release: 2011-12-08 new modifier perl switches Perl regular expressions str_match now uses new base function regmatches extract matches - hopefully faster previous pure R algorithm","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-05","dir":"Changelog","previous_headings":"","what":"stringr 0.5","title":"stringr 0.5","text":"CRAN release: 2011-06-30 new str_wrap function gives strwrap output convenient format new word function extract words string given user defined separator (thanks suggestion David Cooper) str_locate now returns consistent type matching empty string (thanks Stavros Macrakis) new str_count counts number matches string. str_pad str_trim receive performance tweaks - large vectors give least two order magnitude speed str_length returns NA invalid multibyte strings fix small bug internal recyclable function","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-04","dir":"Changelog","previous_headings":"","what":"stringr 0.4","title":"stringr 0.4","text":"CRAN release: 2010-08-24 functions now vectorised respect string, pattern (appropriate) replacement parameters fixed() function now tells stringr functions use fixed matching, rather escaping regular expression. improve performance large vectors. new ignore.case() modifier tells stringr functions ignore case pattern. str_replace renamed str_replace_all new str_replace function added. makes str_replace consistent functions. 
new str_sub<- function (analogous substring<-) substring replacement str_sub now understands negative positions position end string. -1 replaces Inf indicator string end. str_pad side argument can left, right, (instead center) str_trim gains side argument better match str_pad stringr now namespace imports plyr (rather requiring )","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-03","dir":"Changelog","previous_headings":"","what":"stringr 0.3","title":"stringr 0.3","text":"CRAN release: 2010-02-15 fixed() now also escapes | str_join() renamed str_c() functions carefully check input return informative error messages expected. add invert_match() function convert matrix location matches locations non-matches add fixed() function allow matching fixed strings.","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-02","dir":"Changelog","previous_headings":"","what":"stringr 0.2","title":"stringr 0.2","text":"CRAN release: 2009-11-16 str_length now returns correct results used factors str_sub now correctly replaces Inf end argument length string new function str_split_fixed returns fixed number splits character matrix str_split longer uses strsplit preserve trailing breaks","code":""}] +[{"path":"https://stringr.tidyverse.org/dev/LICENSE.html","id":null,"dir":"","previous_headings":"","what":"MIT License","title":"MIT License","text":"Copyright (c) 2020 stringr authors Permission hereby granted, free charge, person obtaining copy software associated documentation files (“Software”), deal Software without restriction, including without limitation rights use, copy, modify, merge, publish, distribute, sublicense, /sell copies Software, permit persons Software furnished , subject following conditions: copyright notice permission notice shall included copies substantial portions Software. 
SOFTWARE PROVIDED “”, WITHOUT WARRANTY KIND, EXPRESS IMPLIED, INCLUDING LIMITED WARRANTIES MERCHANTABILITY, FITNESS PARTICULAR PURPOSE NONINFRINGEMENT. EVENT SHALL AUTHORS COPYRIGHT HOLDERS LIABLE CLAIM, DAMAGES LIABILITY, WHETHER ACTION CONTRACT, TORT OTHERWISE, ARISING , CONNECTION SOFTWARE USE DEALINGS SOFTWARE.","code":""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"overall-differences","dir":"Articles","previous_headings":"","what":"Overall differences","title":"From base R","text":"’ll begin lookup table important stringr functions base R equivalents. str_detect(string, pattern) grepl(pattern, x) str_dup(string, times) strrep(x, times) str_extract(string, pattern) regmatches(x, m = regexpr(pattern, text)) str_extract_all(string, pattern) regmatches(x, m = gregexpr(pattern, text)) str_length(string) nchar(x) str_locate(string, pattern) regexpr(pattern, text) str_locate_all(string, pattern) gregexpr(pattern, text) str_match(string, pattern) regmatches(x, m = regexec(pattern, text)) str_order(string) order(...) str_replace(string, pattern, replacement) sub(pattern, replacement, x) str_replace_all(string, pattern, replacement) gsub(pattern, replacement, x) str_sort(string) sort(x) str_split(string, pattern) strsplit(x, split) str_sub(string, start, end) substr(x, start, stop) str_subset(string, pattern) grep(pattern, x, value = TRUE) str_to_lower(string) tolower(x) str_to_title(string) tools::toTitleCase(text) str_to_upper(string) toupper(x) str_trim(string) trimws(x) str_which(string, pattern) grep(pattern, x) str_wrap(string) strwrap(x) Overall main differences base R stringr : stringr functions start str_ prefix; base R string functions consistent naming scheme. order inputs usually different base R stringr. base R, pattern match usually comes first; stringr, string manupulate always comes first. makes stringr easier use pipes, lapply() purrr::map(). 
Functions stringr tend less, many string processing functions base R multiple purposes. output input stringr functions carefully designed. example, output str_locate() can fed directly str_sub(); true regpexpr() substr(). Base functions use arguments (like perl, fixed, ignore.case) control pattern interpreted. avoid dependence arguments, stringr instead uses helper functions (like fixed(), regex(), coll()). Next ’ll walk functions, noting similarities important differences. examples adapted stringr documentation contrasted analogous base R operations.","code":"#> Warning: There was 1 warning in `dplyr::mutate()`. #> ℹ In argument: `dplyr::across(.fns = ~paste0(\"`\", .x, \"`\"))`. #> Caused by warning: #> ! Using `across()` without supplying `.cols` was deprecated in dplyr #> 1.1.0. #> ℹ Please supply `.cols` instead."},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_detect-detect-the-presence-or-absence-of-a-pattern-in-a-string","dir":"Articles","previous_headings":"Detect matches","what":"str_detect(): Detect the presence or absence of a pattern in a string","title":"From base R","text":"Suppose want know whether word vector fruit names contains “”. base use grepl() (see “l” think logical) stringr use str_detect() (see verb “detect” think yes/action).","code":"fruit <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") # base grepl(pattern = \"a\", x = fruit) #> [1] TRUE TRUE TRUE TRUE # stringr str_detect(fruit, pattern = \"a\") #> [1] TRUE TRUE TRUE TRUE"},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_which-find-positions-matching-a-pattern","dir":"Articles","previous_headings":"Detect matches","what":"str_which(): Find positions matching a pattern","title":"From base R","text":"Now want identify positions words vector fruit names contain “”. 
base use grep() stringr use str_which() (analogy ()).","code":"# base grep(pattern = \"a\", x = fruit) #> [1] 1 2 3 4 # stringr str_which(fruit, pattern = \"a\") #> [1] 1 2 3 4"},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_count-count-the-number-of-matches-in-a-string","dir":"Articles","previous_headings":"Detect matches","what":"str_count(): Count the number of matches in a string","title":"From base R","text":"many “”s fruit? information can gleaned gregexpr() base, need look match.length attribute vector uses length-1 integer vector (-1) indicate match.","code":"# base loc <- gregexpr(pattern = \"a\", text = fruit, fixed = TRUE) sapply(loc, function(x) length(attr(x, \"match.length\"))) #> [1] 1 3 1 1 # stringr str_count(fruit, pattern = \"a\") #> [1] 1 3 1 1"},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_locate-locate-the-position-of-patterns-in-a-string","dir":"Articles","previous_headings":"Detect matches","what":"str_locate(): Locate the position of patterns in a string","title":"From base R","text":"Within fruit, first “p” occur? 
“p”s?","code":"fruit3 <- c(\"papaya\", \"lime\", \"apple\") # base str(gregexpr(pattern = \"p\", text = fruit3)) #> List of 3 #> $ : int [1:2] 1 3 #> ..- attr(*, \"match.length\")= int [1:2] 1 1 #> ..- attr(*, \"index.type\")= chr \"chars\" #> ..- attr(*, \"useBytes\")= logi TRUE #> $ : int -1 #> ..- attr(*, \"match.length\")= int -1 #> ..- attr(*, \"index.type\")= chr \"chars\" #> ..- attr(*, \"useBytes\")= logi TRUE #> $ : int [1:2] 2 3 #> ..- attr(*, \"match.length\")= int [1:2] 1 1 #> ..- attr(*, \"index.type\")= chr \"chars\" #> ..- attr(*, \"useBytes\")= logi TRUE # stringr str_locate(fruit3, pattern = \"p\") #> start end #> [1,] 1 1 #> [2,] NA NA #> [3,] 2 2 str_locate_all(fruit3, pattern = \"p\") #> [[1]] #> start end #> [1,] 1 1 #> [2,] 3 3 #> #> [[2]] #> start end #> #> [[3]] #> start end #> [1,] 2 2 #> [2,] 3 3"},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_sub-extract-and-replace-substrings-from-a-character-vector","dir":"Articles","previous_headings":"Subset strings","what":"str_sub(): Extract and replace substrings from a character vector","title":"From base R","text":"want grab part string? base use substr() substring(). former requires start stop substring latter assumes stop end string. stringr version, str_sub() functionality, also gives default start value (beginning string). base stringr functions order expected inputs. stringr can use negative numbers index right-hand side string: -1 last letter, -2 second last, . base R stringr subset vectorized parameters. means can either choose subset across multiple strings specify different subsets different strings. 
stringr automatically recycle first argument length start stop: Whereas base equivalent silently uses just first value:","code":"hw <- \"Hadley Wickham\" # base substr(hw, start = 1, stop = 6) #> [1] \"Hadley\" substring(hw, first = 1) #> [1] \"Hadley Wickham\" # stringr str_sub(hw, start = 1, end = 6) #> [1] \"Hadley\" str_sub(hw, start = 1) #> [1] \"Hadley Wickham\" str_sub(hw, end = 6) #> [1] \"Hadley\" str_sub(hw, start = 1, end = -1) #> [1] \"Hadley Wickham\" str_sub(hw, start = -5, end = -2) #> [1] \"ckha\" al <- \"Ada Lovelace\" # base substr(c(hw,al), start = 1, stop = 6) #> [1] \"Hadley\" \"Ada Lo\" substr(c(hw,al), start = c(1,1), stop = c(6,7)) #> [1] \"Hadley\" \"Ada Lov\" # stringr str_sub(c(hw,al), start = 1, end = -1) #> [1] \"Hadley Wickham\" \"Ada Lovelace\" str_sub(c(hw,al), start = c(1,1), end = c(-1,-2)) #> [1] \"Hadley Wickham\" \"Ada Lovelac\" str_sub(hw, start = 1:5) #> [1] \"Hadley Wickham\" \"adley Wickham\" \"dley Wickham\" \"ley Wickham\" #> [5] \"ey Wickham\" substr(hw, start = 1:5, stop = 15) #> [1] \"Hadley Wickham\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_sub---subset-assignment","dir":"Articles","previous_headings":"Subset strings","what":"str_sub() <-: Subset assignment","title":"From base R","text":"substr() behaves surprising way replace substring different number characters: str_sub() expect:","code":"# base x <- \"ABCDEF\" substr(x, 1, 3) <- \"x\" x #> [1] \"xBCDEF\" # stringr x <- \"ABCDEF\" str_sub(x, 1, 3) <- \"x\" x #> [1] \"xDEF\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_subset-keep-strings-matching-a-pattern-or-find-positions","dir":"Articles","previous_headings":"Subset strings","what":"str_subset(): Keep strings matching a pattern, or find positions","title":"From base R","text":"may want retrieve strings contain pattern interest:","code":"# base grep(pattern = \"g\", x = fruit, value = TRUE) #> character(0) # stringr str_subset(fruit, pattern 
= \"g\") #> character(0)"},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_extract-extract-matching-patterns-from-a-string","dir":"Articles","previous_headings":"Subset strings","what":"str_extract(): Extract matching patterns from a string","title":"From base R","text":"may want pick certain patterns string, example, digits shopping list: Base R requires combination regexpr() regmatches(); note strings without matches dropped output. stringr provides str_extract() str_extract_all(), output always length input.","code":"shopping_list <- c(\"apples x4\", \"bag of flour\", \"10\", \"milk x2\") # base matches <- regexpr(pattern = \"\\\\d+\", text = shopping_list) # digits regmatches(shopping_list, m = matches) #> [1] \"4\" \"10\" \"2\" matches <- gregexpr(pattern = \"[a-z]+\", text = shopping_list) # words regmatches(shopping_list, m = matches) #> [[1]] #> [1] \"apples\" \"x\" #> #> [[2]] #> [1] \"bag\" \"of\" \"flour\" #> #> [[3]] #> character(0) #> #> [[4]] #> [1] \"milk\" \"x\" # stringr str_extract(shopping_list, pattern = \"\\\\d+\") #> [1] \"4\" NA \"10\" \"2\" str_extract_all(shopping_list, \"[a-z]+\") #> [[1]] #> [1] \"apples\" \"x\" #> #> [[2]] #> [1] \"bag\" \"of\" \"flour\" #> #> [[3]] #> character(0) #> #> [[4]] #> [1] \"milk\" \"x\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_match-extract-matched-groups-from-a-string","dir":"Articles","previous_headings":"Subset strings","what":"str_match(): Extract matched groups from a string","title":"From base R","text":"may also want extract groups string. ’m going use scenario Section 14.4.3 R Data Science. 
extracting full match base R requires combination two functions, inputs matches dropped output.","code":"head(sentences) #> [1] \"The birch canoe slid on the smooth planks.\" #> [2] \"Glue the sheet to the dark blue background.\" #> [3] \"It's easy to tell the depth of a well.\" #> [4] \"These days a chicken leg is a rare dish.\" #> [5] \"Rice is often served in round bowls.\" #> [6] \"The juice of lemons makes fine punch.\" noun <- \"([A]a|[Tt]he) ([^ ]+)\" # base matches <- regexec(pattern = noun, text = head(sentences)) do.call(\"rbind\", regmatches(x = head(sentences), m = matches)) #> [,1] [,2] [,3] #> [1,] \"The birch\" \"The\" \"birch\" #> [2,] \"the sheet\" \"the\" \"sheet\" #> [3,] \"the depth\" \"the\" \"depth\" #> [4,] \"The juice\" \"The\" \"juice\" # stringr str_match(head(sentences), pattern = noun) #> [,1] [,2] [,3] #> [1,] \"The birch\" \"The\" \"birch\" #> [2,] \"the sheet\" \"the\" \"sheet\" #> [3,] \"the depth\" \"the\" \"depth\" #> [4,] NA NA NA #> [5,] NA NA NA #> [6,] \"The juice\" \"The\" \"juice\""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_length-the-length-of-a-string","dir":"Articles","previous_headings":"Manage lengths","what":"str_length(): The length of a string","title":"From base R","text":"determine length string, base R uses nchar() (confused length() gives length vectors, etc.) stringr uses str_length(). subtle differences base stringr . nchar() requires character vector, return error used factor. str_length() can handle factor input. Note “characters” poorly defined concept, technically nchar() str_length() returns number code points. 
usually ’d consider charcter, always:","code":"# base nchar(letters) #> [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 # stringr str_length(letters) #> [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 # base nchar(factor(\"abc\")) #> Error in nchar(factor(\"abc\")): 'nchar()' requires a character vector # stringr str_length(factor(\"abc\")) #> [1] 3 x <- c(\"\\u00fc\", \"u\\u0308\") x #> [1] \"ü\" \"ü\" nchar(x) #> [1] 1 2 str_length(x) #> [1] 1 2"},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_pad-pad-a-string","dir":"Articles","previous_headings":"Manage lengths","what":"str_pad(): Pad a string","title":"From base R","text":"pad string certain width, use stringr’s str_pad(). base R use sprintf(), unlike str_pad(), sprintf() many functionalities.","code":"# base sprintf(\"%30s\", \"hadley\") #> [1] \" hadley\" sprintf(\"%-30s\", \"hadley\") #> [1] \"hadley \" # \"both\" is not as straightforward # stringr rbind( str_pad(\"hadley\", 30, \"left\"), str_pad(\"hadley\", 30, \"right\"), str_pad(\"hadley\", 30, \"both\") ) #> [,1] #> [1,] \" hadley\" #> [2,] \"hadley \" #> [3,] \" hadley \""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_trunc-truncate-a-character-string","dir":"Articles","previous_headings":"Manage lengths","what":"str_trunc(): Truncate a character string","title":"From base R","text":"stringr package provides easy way truncate character string: str_trunc(). 
Base R function directly.","code":"x <- \"This string is moderately long\" # stringr rbind( str_trunc(x, 20, \"right\"), str_trunc(x, 20, \"left\"), str_trunc(x, 20, \"center\") ) #> [,1] #> [1,] \"This string is mo...\" #> [2,] \"...s moderately long\" #> [3,] \"This stri...ely long\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_trim-trim-whitespace-from-a-string","dir":"Articles","previous_headings":"Manage lengths","what":"str_trim(): Trim whitespace from a string","title":"From base R","text":"Similarly, stringr provides str_trim() trim whitespace string. analogous base R’s trimws() added R 3.3.0. stringr function str_squish() allows extra whitespace within string trimmed (contrast str_trim() removes whitespace beginning /end string). base R, one might take advantage gsub() accomplish effect.","code":"# base trimws(\" String with trailing and leading white space\\t\") #> [1] \"String with trailing and leading white space\" trimws(\"\\n\\nString with trailing and leading white space\\n\\n\") #> [1] \"String with trailing and leading white space\" # stringr str_trim(\" String with trailing and leading white space\\t\") #> [1] \"String with trailing and leading white space\" str_trim(\"\\n\\nString with trailing and leading white space\\n\\n\") #> [1] \"String with trailing and leading white space\" # stringr str_squish(\" String with trailing, middle, and leading white space\\t\") #> [1] \"String with trailing, middle, and leading white space\" str_squish(\"\\n\\nString with excess, trailing and leading white space\\n\\n\") #> [1] \"String with excess, trailing and leading white space\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_wrap-wrap-strings-into-nicely-formatted-paragraphs","dir":"Articles","previous_headings":"Manage lengths","what":"str_wrap(): Wrap strings into nicely formatted paragraphs","title":"From base R","text":"strwrap() str_wrap() use different algorithms. 
str_wrap() uses famous Knuth-Plass algorithm. Note strwrap() returns character vector one element line; str_wrap() returns single string containing line breaks.","code":"gettysburg <- \"Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal.\" # base cat(strwrap(gettysburg, width = 60), sep = \"\\n\") #> Four score and seven years ago our fathers brought forth on #> this continent, a new nation, conceived in Liberty, and #> dedicated to the proposition that all men are created #> equal. # stringr cat(str_wrap(gettysburg, width = 60), \"\\n\") #> Four score and seven years ago our fathers brought forth #> on this continent, a new nation, conceived in Liberty, and #> dedicated to the proposition that all men are created equal."},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_replace-replace-matched-patterns-in-a-string","dir":"Articles","previous_headings":"Mutate strings","what":"str_replace(): Replace matched patterns in a string","title":"From base R","text":"replace certain patterns within string, stringr provides functions str_replace() str_replace_all(). base R equivalents sub() gsub(). 
Note difference default input order .","code":"fruits <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") # base sub(\"[aeiou]\", \"-\", fruits) #> [1] \"-pple\" \"b-nana\" \"p-ar\" \"p-neapple\" gsub(\"[aeiou]\", \"-\", fruits) #> [1] \"-ppl-\" \"b-n-n-\" \"p--r\" \"p-n--ppl-\" # stringr str_replace(fruits, \"[aeiou]\", \"-\") #> [1] \"-pple\" \"b-nana\" \"p-ar\" \"p-neapple\" str_replace_all(fruits, \"[aeiou]\", \"-\") #> [1] \"-ppl-\" \"b-n-n-\" \"p--r\" \"p-n--ppl-\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"case-convert-case-of-a-string","dir":"Articles","previous_headings":"Mutate strings","what":"case: Convert case of a string","title":"From base R","text":"stringr base R functions convert upper lower case. Title case also provided stringr. stringr can control locale, base R locale distinctions controlled global variables. Therefore, output base R code may vary across different computers different global settings.","code":"dog <- \"The quick brown dog\" # base toupper(dog) #> [1] \"THE QUICK BROWN DOG\" tolower(dog) #> [1] \"the quick brown dog\" tools::toTitleCase(dog) #> [1] \"The Quick Brown Dog\" # stringr str_to_upper(dog) #> [1] \"THE QUICK BROWN DOG\" str_to_lower(dog) #> [1] \"the quick brown dog\" str_to_title(dog) #> [1] \"The Quick Brown Dog\" # stringr str_to_upper(\"i\") # English #> [1] \"I\" str_to_upper(\"i\", locale = \"tr\") # Turkish #> [1] \"İ\""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_flatten-flatten-a-string","dir":"Articles","previous_headings":"Join and split","what":"str_flatten(): Flatten a string","title":"From base R","text":"want take elements string vector collapse single string can use collapse argument paste() use stringr’s str_flatten(). 
advantage str_flatten() always returns vector length input; predict return length paste() must carefully read arguments.","code":"# base paste0(letters, collapse = \"-\") #> [1] \"a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-q-r-s-t-u-v-w-x-y-z\" # stringr str_flatten(letters, collapse = \"-\") #> [1] \"a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-q-r-s-t-u-v-w-x-y-z\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_dup-duplicate-strings-within-a-character-vector","dir":"Articles","previous_headings":"Join and split","what":"str_dup(): duplicate strings within a character vector","title":"From base R","text":"duplicate strings within character vector use strrep() (R 3.3.0 greater) str_dup():","code":"fruit <- c(\"apple\", \"pear\", \"banana\") # base strrep(fruit, 2) #> [1] \"appleapple\" \"pearpear\" \"bananabanana\" strrep(fruit, 1:3) #> [1] \"apple\" \"pearpear\" \"bananabananabanana\" # stringr str_dup(fruit, 2) #> [1] \"appleapple\" \"pearpear\" \"bananabanana\" str_dup(fruit, 1:3) #> [1] \"apple\" \"pearpear\" \"bananabananabanana\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_split-split-up-a-string-into-pieces","dir":"Articles","previous_headings":"Join and split","what":"str_split(): Split up a string into pieces","title":"From base R","text":"split string pieces breaks based particular pattern match stringr uses str_split() base R uses strsplit(). Unlike functions, strsplit() starts character vector modify. 
stringr package’s str_split() allows control split, including restricting number possible matches.","code":"fruits <- c( \"apples and oranges and pears and bananas\", \"pineapples and mangos and guavas\" ) # base strsplit(fruits, \" and \") #> [[1]] #> [1] \"apples\" \"oranges\" \"pears\" \"bananas\" #> #> [[2]] #> [1] \"pineapples\" \"mangos\" \"guavas\" # stringr str_split(fruits, \" and \") #> [[1]] #> [1] \"apples\" \"oranges\" \"pears\" \"bananas\" #> #> [[2]] #> [1] \"pineapples\" \"mangos\" \"guavas\" # stringr str_split(fruits, \" and \", n = 3) #> [[1]] #> [1] \"apples\" \"oranges\" \"pears and bananas\" #> #> [[2]] #> [1] \"pineapples\" \"mangos\" \"guavas\" str_split(fruits, \" and \", n = 2) #> [[1]] #> [1] \"apples\" \"oranges and pears and bananas\" #> #> [[2]] #> [1] \"pineapples\" \"mangos and guavas\""},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_glue-interpolate-strings","dir":"Articles","previous_headings":"Join and split","what":"str_glue(): Interpolate strings","title":"From base R","text":"’s often useful interpolate varying values fixed string. 
base R, can use sprintf() purpose; stringr provides wrapper general purpose glue package.","code":"name <- \"Fred\" age <- 50 anniversary <- as.Date(\"1991-10-12\") # base sprintf( \"My name is %s my age next year is %s and my anniversary is %s.\", name, age + 1, format(anniversary, \"%A, %B %d, %Y\") ) #> [1] \"My name is Fred my age next year is 51 and my anniversary is Saturday, October 12, 1991.\" # stringr str_glue( \"My name is {name}, \", \"my age next year is {age + 1}, \", \"and my anniversary is {format(anniversary, '%A, %B %d, %Y')}.\" ) #> My name is Fred, my age next year is 51, and my anniversary is Saturday, October 12, 1991."},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/articles/from-base.html","id":"str_order-order-or-sort-a-character-vector","dir":"Articles","previous_headings":"Order strings","what":"str_order(): Order or sort a character vector","title":"From base R","text":"base R stringr separate functions order sort strings. options str_order() str_sort() don’t analogous base R options. example, stringr functions locale argument control order sort. base R locale global setting, outputs sort() order() may differ across different computers. 
example, Norwegian alphabet, å comes z: stringr functions also numeric argument sort digits numerically instead treating strings.","code":"# base order(letters) #> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 #> [24] 24 25 26 sort(letters) #> [1] \"a\" \"b\" \"c\" \"d\" \"e\" \"f\" \"g\" \"h\" \"i\" \"j\" \"k\" \"l\" \"m\" \"n\" \"o\" \"p\" \"q\" #> [18] \"r\" \"s\" \"t\" \"u\" \"v\" \"w\" \"x\" \"y\" \"z\" # stringr str_order(letters) #> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 #> [24] 24 25 26 str_sort(letters) #> [1] \"a\" \"b\" \"c\" \"d\" \"e\" \"f\" \"g\" \"h\" \"i\" \"j\" \"k\" \"l\" \"m\" \"n\" \"o\" \"p\" \"q\" #> [18] \"r\" \"s\" \"t\" \"u\" \"v\" \"w\" \"x\" \"y\" \"z\" x <- c(\"å\", \"a\", \"z\") str_sort(x) #> [1] \"a\" \"å\" \"z\" str_sort(x, locale = \"no\") #> [1] \"a\" \"z\" \"å\" # stringr x <- c(\"100a10\", \"100a5\", \"2b\", \"2a\") str_sort(x) #> [1] \"100a10\" \"100a5\" \"2a\" \"2b\" str_sort(x, numeric = TRUE) #> [1] \"2a\" \"2b\" \"100a5\" \"100a10\""},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"basic-matches","dir":"Articles","previous_headings":"","what":"Basic matches","title":"Regular expressions","text":"simplest patterns match exact strings: can perform case-insensitive match using ignore_case = TRUE: next step complexity ., matches character except newline: can allow . 
match everything, including \\n, setting dotall = TRUE:","code":"x <- c(\"apple\", \"banana\", \"pear\") str_extract(x, \"an\") #> [1] NA \"an\" NA bananas <- c(\"banana\", \"Banana\", \"BANANA\") str_detect(bananas, \"banana\") #> [1] TRUE FALSE FALSE str_detect(bananas, regex(\"banana\", ignore_case = TRUE)) #> [1] TRUE TRUE TRUE str_extract(x, \".a.\") #> [1] NA \"ban\" \"ear\" str_detect(\"\\nX\\n\", \".X.\") #> [1] FALSE str_detect(\"\\nX\\n\", regex(\".X.\", dotall = TRUE)) #> [1] TRUE"},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"escaping","dir":"Articles","previous_headings":"","what":"Escaping","title":"Regular expressions","text":"“.” matches character, match literal “.”? need use “escape” tell regular expression want match exactly, use special behaviour. Like strings, regexps use backslash, \\, escape special behaviour. match ., need regexp \\.. Unfortunately creates problem. use strings represent regular expressions, \\ also used escape symbol strings. create regular expression \\. need string \"\\\\.\". \\ used escape character regular expressions, match literal \\? Well need escape , creating regular expression \\\\. create regular expression, need use string, also needs escape \\. means match literal \\ need write \"\\\\\\\\\" — need four backslashes match one! vignette, use \\. denote regular expression, \"\\\\.\" denote string represents regular expression. alternative quoting mechanism \\Q...\\E: characters ... treated exact matches. useful want exactly match user input part regular expression.","code":"# To create the regular expression, we need \\\\ dot <- \"\\\\.\" # But the expression itself only contains one: writeLines(dot) #> \\. # And this tells R to look for an explicit . 
str_extract(c(\"abc\", \"a.c\", \"bef\"), \"a\\\\.c\") #> [1] NA \"a.c\" NA x <- \"a\\\\b\" writeLines(x) #> a\\b str_extract(x, \"\\\\\\\\\") #> [1] \"\\\\\" x <- c(\"a.b.c.d\", \"aeb\") starts_with <- \"a.b\" str_detect(x, paste0(\"^\", starts_with)) #> [1] TRUE TRUE str_detect(x, paste0(\"^\\\\Q\", starts_with, \"\\\\E\")) #> [1] TRUE FALSE"},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"special-characters","dir":"Articles","previous_headings":"","what":"Special characters","title":"Regular expressions","text":"Escapes also allow specify individual characters otherwise hard type. can specify individual unicode characters five ways, either variable number hex digits (four common), name: \\xhh: 2 hex digits. \\x{hhhh}: 1-6 hex digits. \\uhhhh: 4 hex digits. \\Uhhhhhhhh: 8 hex digits. \\N{name}, e.g. \\N{grinning face} matches basic smiling emoji. Similarly, can specify many common control characters: \\: bell. \\cX: match control-X character. \\e: escape (\\u001B). \\f: form feed (\\u000C). \\n: line feed (\\u000A). \\r: carriage return (\\u000D). \\t: horizontal tabulation (\\u0009). \\0ooo match octal character. ‘ooo’ one three octal digits, 000 0377. leading zero required. (Many historical interest included sake completeness.)","code":""},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"matching-multiple-characters","dir":"Articles","previous_headings":"","what":"Matching multiple characters","title":"Regular expressions","text":"number patterns match one character. ’ve already seen ., matches character (except newline). closely related operator \\X, matches grapheme cluster, set individual elements form single symbol. example, one way representing “á” letter “” plus accent: . match component “”, \\X match complete symbol: five escaped pairs match narrower classes characters: \\d: matches digit. complement, \\D, matches character decimal digit. 
Technically, \\d includes character Unicode Category Nd (“Number, Decimal Digit”), also includes numeric symbols languages: \\s: matches whitespace. includes tabs, newlines, form feeds, character Unicode Z Category (includes variety space characters separators.). complement, \\S, matches non-whitespace character. \\p{property name} matches character specific unicode property, like \\p{Uppercase} \\p{Diacritic}. complement, \\P{property name}, matches characters without property. complete list unicode properties can found http://www.unicode.org/reports/tr44/#Property_Index. \\w matches “word” character, includes alphabetic characters, marks decimal numbers. complement, \\W, matches non-word character. Technically, \\w also matches connector punctuation, \\u200c (zero width connector), \\u200d (zero width joiner), rarely seen wild. \\b matches word boundaries, transition word non-word characters. \\B matches opposite: boundaries either word non-word characters either side. can also create character classes using []: [abc]: matches , b, c. [-z]: matches every character z (Unicode code point order). [^abc]: matches anything except , b, c. [\\^\\-]: matches ^ -. number pre-built classes can use inside []: [:punct:]: punctuation. [:alpha:]: letters. [:lower:]: lowercase letters. [:upper:]: upperclass letters. [:digit:]: digits. [:xdigit:]: hex digits. [:alnum:]: letters numbers. [:cntrl:]: control characters. [:graph:]: letters, numbers, punctuation. [:print:]: letters, numbers, punctuation, whitespace. [:space:]: space characters (basically equivalent \\s). [:blank:]: space tab. go inside [] character classes, .e. [[:digit:]AX] matches digits, , X. can also using Unicode properties, like [\\p{Letter}], various set operations, like [\\p{Letter}--\\p{script=latin}]. 
See ?\"stringi-search-charclass\" details.","code":"x <- \"a\\u0301\" str_extract(x, \".\") #> [1] \"a\" str_extract(x, \"\\\\X\") #> [1] \"á\" str_extract_all(\"1 + 2 = 3\", \"\\\\d+\")[[1]] #> [1] \"1\" \"2\" \"3\" # Some Laotian numbers str_detect(\"១២៣\", \"\\\\d\") #> [1] TRUE (text <- \"Some \\t badly\\n\\t\\tspaced \\f text\") #> [1] \"Some \\t badly\\n\\t\\tspaced \\f text\" str_replace_all(text, \"\\\\s+\", \" \") #> [1] \"Some badly spaced text\" (text <- c('\"Double quotes\"', \"«Guillemet»\", \"“Fancy quotes”\")) #> [1] \"\\\"Double quotes\\\"\" \"«Guillemet»\" \"“Fancy quotes”\" str_replace_all(text, \"\\\\p{quotation mark}\", \"'\") #> [1] \"'Double quotes'\" \"'Guillemet'\" \"'Fancy quotes'\" str_extract_all(\"Don't eat that!\", \"\\\\w+\")[[1]] #> [1] \"Don\" \"t\" \"eat\" \"that\" str_split(\"Don't eat that!\", \"\\\\W\")[[1]] #> [1] \"Don\" \"t\" \"eat\" \"that\" \"\" str_replace_all(\"The quick brown fox\", \"\\\\b\", \"_\") #> [1] \"_The_ _quick_ _brown_ _fox_\" str_replace_all(\"The quick brown fox\", \"\\\\B\", \"_\") #> [1] \"T_h_e q_u_i_c_k b_r_o_w_n f_o_x\""},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"alternation","dir":"Articles","previous_headings":"","what":"Alternation","title":"Regular expressions","text":"| alternation operator, pick one possible matches. example, abc|def match abc def: Note precedence | low: abc|def equivalent (abc)|(def) ab(c|d)ef.","code":"str_detect(c(\"abc\", \"def\", \"ghi\"), \"abc|def\") #> [1] TRUE TRUE FALSE"},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"grouping","dir":"Articles","previous_headings":"","what":"Grouping","title":"Regular expressions","text":"can use parentheses override default precedence rules: Parenthesis also define “groups” can refer backreferences, like \\1, \\2 etc, can extracted str_match(). 
example, following regular expression finds fruits repeated pair letters: can use (?:...), non-grouping parentheses, control precedence capture match group. slightly efficient capturing parentheses. useful complex cases need capture matches control precedence independently.","code":"str_extract(c(\"grey\", \"gray\"), \"gre|ay\") #> [1] \"gre\" \"ay\" str_extract(c(\"grey\", \"gray\"), \"gr(e|a)y\") #> [1] \"grey\" \"gray\" pattern <- \"(..)\\\\1\" fruit %>% str_subset(pattern) #> [1] \"banana\" \"coconut\" \"cucumber\" \"jujube\" \"papaya\" #> [6] \"salal berry\" fruit %>% str_subset(pattern) %>% str_match(pattern) #> [,1] [,2] #> [1,] \"anan\" \"an\" #> [2,] \"coco\" \"co\" #> [3,] \"cucu\" \"cu\" #> [4,] \"juju\" \"ju\" #> [5,] \"papa\" \"pa\" #> [6,] \"alal\" \"al\" str_match(c(\"grey\", \"gray\"), \"gr(e|a)y\") #> [,1] [,2] #> [1,] \"grey\" \"e\" #> [2,] \"gray\" \"a\" str_match(c(\"grey\", \"gray\"), \"gr(?:e|a)y\") #> [,1] #> [1,] \"grey\" #> [2,] \"gray\""},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"anchors","dir":"Articles","previous_headings":"","what":"Anchors","title":"Regular expressions","text":"default, regular expressions match part string. ’s often useful anchor regular expression matches start end string: ^ matches start string. $ matches end string. match literal “$” “^”, need escape , \\$, \\^. multiline strings, can use regex(multiline = TRUE). changes behaviour ^ $, introduces three new operators: ^ now matches start line. $ now matches end line. \\matches start input. \\z matches end input. 
\\Z matches end input, final line terminator, exists.","code":"x <- c(\"apple\", \"banana\", \"pear\") str_extract(x, \"^a\") #> [1] \"a\" NA NA str_extract(x, \"a$\") #> [1] NA \"a\" NA x <- \"Line 1\\nLine 2\\nLine 3\\n\" str_extract_all(x, \"^Line..\")[[1]] #> [1] \"Line 1\" str_extract_all(x, regex(\"^Line..\", multiline = TRUE))[[1]] #> [1] \"Line 1\" \"Line 2\" \"Line 3\" str_extract_all(x, regex(\"\\\\ALine..\", multiline = TRUE))[[1]] #> [1] \"Line 1\""},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"repetition","dir":"Articles","previous_headings":"","what":"Repetition","title":"Regular expressions","text":"can control many times pattern matches repetition operators: ?: 0 1. +: 1 . *: 0 . Note precedence operators high, can write: colou?r match either American British spellings. means uses need parentheses, like bana(na)+. can also specify number matches precisely: {n}: exactly n {n,}: n {n,m}: n m default matches “greedy”: match longest string possible. can make “lazy”, matching shortest string possible putting ? : ??: 0 1, prefer 0. +?: 1 , match times possible. *?: 0 , match times possible. {n,}?: n , match times possible. {n,m}?: n m, , match times possible, least n. can also make matches possessive putting + , means later parts match fail, repetition re-tried smaller number characters. advanced feature used improve performance worst-case scenarios (called “catastrophic backtracking”). ?+: 0 1, possessive. ++: 1 , possessive. *+: 0 , possessive. {n}+: exactly n, possessive. {n,}+: n , possessive. {n,m}+: n m, possessive. related concept atomic-match parenthesis, (?>...). later match fails engine needs back-track, atomic match kept : succeeds fails whole. Compare following two regular expressions: atomic match fails matches , next character C fails. 
regular match succeeds matches , C doesn’t match, back-tracks tries B instead.","code":"x <- \"1888 is the longest year in Roman numerals: MDCCCLXXXVIII\" str_extract(x, \"CC?\") #> [1] \"CC\" str_extract(x, \"CC+\") #> [1] \"CCC\" str_extract(x, 'C[LX]+') #> [1] \"CLXXX\" str_extract(x, \"C{2}\") #> [1] \"CC\" str_extract(x, \"C{2,}\") #> [1] \"CCC\" str_extract(x, \"C{2,3}\") #> [1] \"CCC\" str_extract(x, c(\"C{2,3}\", \"C{2,3}?\")) #> [1] \"CCC\" \"CC\" str_extract(x, c(\"C[LX]+\", \"C[LX]+?\")) #> [1] \"CLXXX\" \"CL\" str_detect(\"ABC\", \"(?>A|.B)C\") #> [1] FALSE str_detect(\"ABC\", \"(?:A|.B)C\") #> [1] TRUE"},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"look-arounds","dir":"Articles","previous_headings":"","what":"Look arounds","title":"Regular expressions","text":"assertions look ahead behind current match without “consuming” characters (.e. changing input position). (?=...): positive look-ahead assertion. Matches ... matches current input. (?!...): negative look-ahead assertion. Matches ... match current input. (?<=...): positive look-behind assertion. Matches ... matches text preceding current position, last character match character just current position. Length must bounded (.e. * +). (? [1] \"1\" \"2\" NA y <- c(\"100\", \"$400\") str_extract(y, \"(?<=\\\\$)\\\\d+\") #> [1] NA \"400\""},{"path":"https://stringr.tidyverse.org/dev/articles/regular-expressions.html","id":"comments","dir":"Articles","previous_headings":"","what":"Comments","title":"Regular expressions","text":"two ways include comments regular expression. first (?#...): second use regex(comments = TRUE). form ignores spaces newlines, anything everything #. match literal space, ’ll need escape : \"\\\\ \". useful way describing complex regular expressions:","code":"str_detect(\"xyz\", \"x(?#this is a comment)\") #> [1] TRUE phone <- regex(\" \\\\(? # optional opening parens (\\\\d{3}) # area code \\\\)? # optional closing parens (?:-|\\\\ )? 
# optional dash or space (\\\\d{3}) # another three numbers (?:-|\\\\ )? # optional dash or space (\\\\d{3}) # three more numbers \", comments = TRUE) str_match(c(\"514-791-8141\", \"(514) 791 8141\"), phone) #> [,1] [,2] [,3] [,4] #> [1,] \"514-791-814\" \"514\" \"791\" \"814\" #> [2,] \"(514) 791 814\" \"514\" \"791\" \"814\""},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"getting-and-setting-individual-characters","dir":"Articles","previous_headings":"","what":"Getting and setting individual characters","title":"Introduction to stringr","text":"can get length string str_length(): now equivalent base R function nchar(). Previously needed work around issues nchar() fact returned 2 nchar(NA). fixed R 3.3.0, longer important. can access individual character using str_sub(). takes three arguments: character vector, start position end position. Either position can either positive integer, counts left, negative integer counts right. positions inclusive, longer string, silently truncated. can also use str_sub() modify strings: duplicate individual strings, can use str_dup():","code":"str_length(\"abc\") #> [1] 3 x <- c(\"abcdef\", \"ghifjk\") # The 3rd letter str_sub(x, 3, 3) #> [1] \"c\" \"i\" # The 2nd to 2nd-to-last character str_sub(x, 2, -2) #> [1] \"bcde\" \"hifj\" str_sub(x, 3, 3) <- \"X\" x #> [1] \"abXdef\" \"ghXfjk\" str_dup(x, c(2, 3)) #> [1] \"abXdefabXdef\" \"ghXfjkghXfjkghXfjk\""},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"whitespace","dir":"Articles","previous_headings":"","what":"Whitespace","title":"Introduction to stringr","text":"Three functions add, remove, modify whitespace: str_pad() pads string fixed length adding extra whitespace left, right, sides. (can pad characters using pad argument.) 
str_pad() never make string shorter: want ensure strings length (often useful print methods), combine str_pad() str_trunc(): opposite str_pad() str_trim(), removes leading trailing whitespace: can use str_wrap() modify existing whitespace order wrap paragraph text, length line similar possible.","code":"x <- c(\"abc\", \"defghi\") str_pad(x, 10) # default pads on left #> [1] \" abc\" \" defghi\" str_pad(x, 10, \"both\") #> [1] \" abc \" \" defghi \" str_pad(x, 4) #> [1] \" abc\" \"defghi\" x <- c(\"Short\", \"This is a long string\") x %>% str_trunc(10) %>% str_pad(10, \"right\") #> [1] \"Short \" \"This is...\" x <- c(\" a \", \"b \", \" c\") str_trim(x) #> [1] \"a\" \"b\" \"c\" str_trim(x, \"left\") #> [1] \"a \" \"b \" \"c\" jabberwocky <- str_c( \"`Twas brillig, and the slithy toves \", \"did gyre and gimble in the wabe: \", \"All mimsy were the borogoves, \", \"and the mome raths outgrabe. \" ) cat(str_wrap(jabberwocky, width = 40)) #> `Twas brillig, and the slithy toves did #> gyre and gimble in the wabe: All mimsy #> were the borogoves, and the mome raths #> outgrabe."},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"locale-sensitive","dir":"Articles","previous_headings":"","what":"Locale sensitive","title":"Introduction to stringr","text":"handful stringr functions locale-sensitive: perform differently different regions world. functions case transformation functions: String ordering sorting: locale always defaults English ensure default behaviour identical across systems. Locales always include two letter ISO-639-1 language code (like “en” English “zh” Chinese), optionally ISO-3166 country code (like “en_UK” vs “en_US”). 
can see complete list available locales running stringi::stri_locale_list().","code":"x <- \"I like horses.\" str_to_upper(x) #> [1] \"I LIKE HORSES.\" str_to_title(x) #> [1] \"I Like Horses.\" str_to_lower(x) #> [1] \"i like horses.\" # Turkish has two sorts of i: with and without the dot str_to_lower(x, \"tr\") #> [1] \"ı like horses.\" x <- c(\"y\", \"i\", \"k\") str_order(x) #> [1] 2 3 1 str_sort(x) #> [1] \"i\" \"k\" \"y\" # In Lithuanian, y comes between i and k str_sort(x, locale = \"lt\") #> [1] \"i\" \"y\" \"k\""},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"pattern-matching","dir":"Articles","previous_headings":"","what":"Pattern matching","title":"Introduction to stringr","text":"vast majority stringr functions work patterns. parameterised task perform types patterns match.","code":""},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"tasks","dir":"Articles","previous_headings":"Pattern matching","what":"Tasks","title":"Introduction to stringr","text":"pattern matching function first two arguments, character vector strings process single pattern match. stringr provides pattern matching functions detect, locate, extract, match, replace, split strings. ’ll illustrate work strings regular expression designed match (US) phone numbers: str_detect() detects presence absence pattern returns logical vector (similar grepl()). str_subset() returns elements character vector match regular expression (similar grep() value = TRUE)`. str_count() counts number matches: str_locate() locates first position pattern returns numeric matrix columns start end. str_locate_all() locates matches, returning list numeric matrices. Similar regexpr() gregexpr(). str_extract() extracts text corresponding first match, returning character vector. str_extract_all() extracts matches returns list character vectors. str_match() extracts capture groups formed () first match. returns character matrix one column complete match one column group. 
str_match_all() extracts capture groups matches returns list character matrices. Similar regmatches(). str_replace() replaces first matched pattern returns character vector. str_replace_all() replaces matches. Similar sub() gsub(). str_split_fixed() splits string fixed number pieces based pattern returns character matrix. str_split() splits string variable number pieces returns list character vectors.","code":"strings <- c( \"apple\", \"219 733 8965\", \"329-293-8753\", \"Work: 579-499-7527; Home: 543.355.3679\" ) phone <- \"([2-9][0-9]{2})[- .]([0-9]{3})[- .]([0-9]{4})\" # Which strings contain phone numbers? str_detect(strings, phone) #> [1] FALSE TRUE TRUE TRUE str_subset(strings, phone) #> [1] \"219 733 8965\" #> [2] \"329-293-8753\" #> [3] \"Work: 579-499-7527; Home: 543.355.3679\" # How many phone numbers in each string? str_count(strings, phone) #> [1] 0 1 1 2 # Where in the string is the phone number located? (loc <- str_locate(strings, phone)) #> start end #> [1,] NA NA #> [2,] 1 12 #> [3,] 1 12 #> [4,] 7 18 str_locate_all(strings, phone) #> [[1]] #> start end #> #> [[2]] #> start end #> [1,] 1 12 #> #> [[3]] #> start end #> [1,] 1 12 #> #> [[4]] #> start end #> [1,] 7 18 #> [2,] 27 38 # What are the phone numbers? 
str_extract(strings, phone) #> [1] NA \"219 733 8965\" \"329-293-8753\" \"579-499-7527\" str_extract_all(strings, phone) #> [[1]] #> character(0) #> #> [[2]] #> [1] \"219 733 8965\" #> #> [[3]] #> [1] \"329-293-8753\" #> #> [[4]] #> [1] \"579-499-7527\" \"543.355.3679\" str_extract_all(strings, phone, simplify = TRUE) #> [,1] [,2] #> [1,] \"\" \"\" #> [2,] \"219 733 8965\" \"\" #> [3,] \"329-293-8753\" \"\" #> [4,] \"579-499-7527\" \"543.355.3679\" # Pull out the three components of the match str_match(strings, phone) #> [,1] [,2] [,3] [,4] #> [1,] NA NA NA NA #> [2,] \"219 733 8965\" \"219\" \"733\" \"8965\" #> [3,] \"329-293-8753\" \"329\" \"293\" \"8753\" #> [4,] \"579-499-7527\" \"579\" \"499\" \"7527\" str_match_all(strings, phone) #> [[1]] #> [,1] [,2] [,3] [,4] #> #> [[2]] #> [,1] [,2] [,3] [,4] #> [1,] \"219 733 8965\" \"219\" \"733\" \"8965\" #> #> [[3]] #> [,1] [,2] [,3] [,4] #> [1,] \"329-293-8753\" \"329\" \"293\" \"8753\" #> #> [[4]] #> [,1] [,2] [,3] [,4] #> [1,] \"579-499-7527\" \"579\" \"499\" \"7527\" #> [2,] \"543.355.3679\" \"543\" \"355\" \"3679\" str_replace(strings, phone, \"XXX-XXX-XXXX\") #> [1] \"apple\" #> [2] \"XXX-XXX-XXXX\" #> [3] \"XXX-XXX-XXXX\" #> [4] \"Work: XXX-XXX-XXXX; Home: 543.355.3679\" str_replace_all(strings, phone, \"XXX-XXX-XXXX\") #> [1] \"apple\" #> [2] \"XXX-XXX-XXXX\" #> [3] \"XXX-XXX-XXXX\" #> [4] \"Work: XXX-XXX-XXXX; Home: XXX-XXX-XXXX\" str_split(\"a-b-c\", \"-\") #> [[1]] #> [1] \"a\" \"b\" \"c\" str_split_fixed(\"a-b-c\", \"-\", n = 2) #> [,1] [,2] #> [1,] \"a\" \"b-c\""},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"engines","dir":"Articles","previous_headings":"Pattern matching","what":"Engines","title":"Introduction to stringr","text":"four main engines stringr can use describe patterns: Regular expressions, default, shown , described vignette(\"regular-expressions\"). Fixed bytewise matching, fixed(). 
Locale-sensitive character matching, coll() Text boundary analysis boundary().","code":""},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"fixed-matches","dir":"Articles","previous_headings":"Pattern matching > Engines","what":"Fixed matches","title":"Introduction to stringr","text":"fixed(x) matches exact sequence bytes specified x. limited “pattern”, restriction can make matching much faster. Beware using fixed() non-English data. problematic often multiple ways representing character. example, two ways define “á”: either single character “” plus accent: render identically, ’re defined differently, fixed() doesn’t find match. Instead, can use coll(), explained , respect human character comparison rules:","code":"a1 <- \"\\u00e1\" a2 <- \"a\\u0301\" c(a1, a2) #> [1] \"á\" \"á\" a1 == a2 #> [1] FALSE str_detect(a1, fixed(a2)) #> [1] FALSE str_detect(a1, coll(a2)) #> [1] TRUE"},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"collation-search","dir":"Articles","previous_headings":"Pattern matching > Engines","what":"Collation search","title":"Introduction to stringr","text":"coll(x) looks match x using human-language collation rules, particularly important want case insensitive matching. Collation rules differ around world, ’ll also need supply locale parameter. downside coll() speed. rules recognising characters complicated, coll() relatively slow compared regex() fixed(). 
Note fixed() regex() ignore_case arguments, perform much simpler comparison coll().","code":"i <- c(\"I\", \"İ\", \"i\", \"ı\") i #> [1] \"I\" \"İ\" \"i\" \"ı\" str_subset(i, coll(\"i\", ignore_case = TRUE)) #> [1] \"I\" \"i\" str_subset(i, coll(\"i\", ignore_case = TRUE, locale = \"tr\")) #> [1] \"İ\" \"i\""},{"path":"https://stringr.tidyverse.org/dev/articles/stringr.html","id":"boundary","dir":"Articles","previous_headings":"Pattern matching > Engines","what":"Boundary","title":"Introduction to stringr","text":"boundary() matches boundaries characters, lines, sentences words. ’s useful str_split(), can used pattern matching functions: convention, \"\" treated boundary(\"character\"):","code":"x <- \"This is a sentence.\" str_split(x, boundary(\"word\")) #> [[1]] #> [1] \"This\" \"is\" \"a\" \"sentence\" str_count(x, boundary(\"word\")) #> [1] 4 str_extract_all(x, boundary(\"word\")) #> [[1]] #> [1] \"This\" \"is\" \"a\" \"sentence\" str_split(x, \"\") #> [[1]] #> [1] \"T\" \"h\" \"i\" \"s\" \" \" \"i\" \"s\" \" \" \"a\" \" \" \"s\" \"e\" \"n\" \"t\" \"e\" \"n\" \"c\" #> [18] \"e\" \".\" str_count(x, \"\") #> [1] 19"},{"path":"https://stringr.tidyverse.org/dev/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Hadley Wickham. Author, maintainer, copyright holder. . Copyright holder, funder.","code":""},{"path":"https://stringr.tidyverse.org/dev/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Wickham H (2023). stringr: Simple, Consistent Wrappers Common String Operations. 
https://stringr.tidyverse.org, https://github.com/tidyverse/stringr.","code":"@Manual{, title = {stringr: Simple, Consistent Wrappers for Common String Operations}, author = {Hadley Wickham}, year = {2023}, note = {https://stringr.tidyverse.org, https://github.com/tidyverse/stringr}, }"},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/index.html","id":"overview","dir":"","previous_headings":"","what":"Overview","title":"Simple, Consistent Wrappers for Common String Operations","text":"Strings glamorous, high-profile components R, play big role many data cleaning preparation tasks. stringr package provides cohesive set functions designed make working strings easy possible. ’re familiar strings, best place start chapter strings R Data Science. stringr built top stringi, uses ICU C library provide fast, correct implementations common string manipulations. stringr focusses important commonly used string manipulation functions whereas stringi provides comprehensive set covering almost anything can imagine. find stringr missing function need, try looking stringi. packages share similar conventions, ’ve mastered stringr, find stringi similarly easy use.","code":""},{"path":"https://stringr.tidyverse.org/dev/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Simple, Consistent Wrappers for Common String Operations","text":"","code":"# The easiest way to get stringr is to install the whole tidyverse: install.packages(\"tidyverse\") # Alternatively, install just stringr: install.packages(\"stringr\")"},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/index.html","id":"usage","dir":"","previous_headings":"","what":"Usage","title":"Simple, Consistent Wrappers for Common String Operations","text":"functions stringr start str_ take vector strings first argument: string functions work regular expressions, concise language describing patterns text. 
example, regular expression \"[aeiou]\" matches single character vowel: seven main verbs work patterns: str_detect(x, pattern) tells ’s match pattern: str_count(x, pattern) counts number patterns: str_subset(x, pattern) extracts matching components: str_locate(x, pattern) gives position match: str_extract(x, pattern) extracts text match: str_match(x, pattern) extracts parts match defined parentheses: str_replace(x, pattern, replacement) replaces matches new text: str_split(x, pattern) splits string multiple pieces: well regular expressions (default), three pattern matching engines: fixed(): match exact bytes coll(): match human letters boundary(): match boundaries","code":"x <- c(\"why\", \"video\", \"cross\", \"extra\", \"deal\", \"authority\") str_length(x) #> [1] 3 5 5 5 4 9 str_c(x, collapse = \", \") #> [1] \"why, video, cross, extra, deal, authority\" str_sub(x, 1, 2) #> [1] \"wh\" \"vi\" \"cr\" \"ex\" \"de\" \"au\" str_subset(x, \"[aeiou]\") #> [1] \"video\" \"cross\" \"extra\" \"deal\" \"authority\" str_count(x, \"[aeiou]\") #> [1] 0 3 1 2 2 4 str_detect(x, \"[aeiou]\") #> [1] FALSE TRUE TRUE TRUE TRUE TRUE str_count(x, \"[aeiou]\") #> [1] 0 3 1 2 2 4 str_subset(x, \"[aeiou]\") #> [1] \"video\" \"cross\" \"extra\" \"deal\" \"authority\" str_locate(x, \"[aeiou]\") #> start end #> [1,] NA NA #> [2,] 2 2 #> [3,] 3 3 #> [4,] 1 1 #> [5,] 2 2 #> [6,] 1 1 str_extract(x, \"[aeiou]\") #> [1] NA \"i\" \"o\" \"e\" \"e\" \"a\" # extract the characters on either side of the vowel str_match(x, \"(.)[aeiou](.)\") #> [,1] [,2] [,3] #> [1,] NA NA NA #> [2,] \"vid\" \"v\" \"d\" #> [3,] \"ros\" \"r\" \"s\" #> [4,] NA NA NA #> [5,] \"dea\" \"d\" \"a\" #> [6,] \"aut\" \"a\" \"t\" str_replace(x, \"[aeiou]\", \"?\") #> [1] \"why\" \"v?deo\" \"cr?ss\" \"?xtra\" \"d?al\" \"?uthority\" str_split(c(\"a,b\", \"c,d,e\"), \",\") #> [[1]] #> [1] \"a\" \"b\" #> #> [[2]] #> [1] \"c\" \"d\" 
\"e\""},{"path":"https://stringr.tidyverse.org/dev/index.html","id":"rstudio-addin","dir":"","previous_headings":"","what":"RStudio Addin","title":"Simple, Consistent Wrappers for Common String Operations","text":"RegExplain RStudio addin provides friendly interface working regular expressions functions stringr. addin allows interactively build regexp, check output common string matching functions, consult interactive help pages, use included resources learn regular expressions. addin can easily installed devtools:","code":"# install.packages(\"devtools\") devtools::install_github(\"gadenbuie/regexplain\")"},{"path":"https://stringr.tidyverse.org/dev/index.html","id":"compared-to-base-r","dir":"","previous_headings":"","what":"Compared to base R","title":"Simple, Consistent Wrappers for Common String Operations","text":"R provides solid set string operations, grown organically time, can inconsistent little hard learn. Additionally, lag behind string operations programming languages, things easy languages like Ruby Python rather hard R. Uses consistent function argument names. first argument always vector strings modify, makes stringr work particularly well conjunction pipe: Simplifies string operations eliminating options don’t need 95% time. Produces outputs can easily used inputs. includes ensuring missing inputs result missing outputs, zero length inputs result zero length outputs. Learn vignette(\"-base\")","code":"letters %>% .[1:10] %>% str_pad(3, \"right\") %>% str_c(letters[2:11]) #> [1] \"a b\" \"b c\" \"c d\" \"d e\" \"e f\" \"f g\" \"g h\" \"h i\" \"i j\" \"j k\""},{"path":"https://stringr.tidyverse.org/dev/reference/case.html","id":null,"dir":"Reference","previous_headings":"","what":"Convert string to upper case, lower case, title case, or sentence case — case","title":"Convert string to upper case, lower case, title case, or sentence case — case","text":"str_to_upper() converts upper case. str_to_lower() converts lower case. 
str_to_title() converts title case, first letter word capitalized. str_to_sentence() convert sentence case, first letter sentence capitalized.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/case.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Convert string to upper case, lower case, title case, or sentence case — case","text":"","code":"str_to_upper(string, locale = \"en\") str_to_lower(string, locale = \"en\") str_to_title(string, locale = \"en\") str_to_sentence(string, locale = \"en\")"},{"path":"https://stringr.tidyverse.org/dev/reference/case.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Convert string to upper case, lower case, title case, or sentence case — case","text":"string Input vector. Either character vector, something coercible one. locale Locale use comparisons. See stringi::stri_locale_list() possible options. Defaults \"en\" (English) ensure default behaviour consistent across platforms.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/case.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Convert string to upper case, lower case, title case, or sentence case — case","text":"character vector length string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/case.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Convert string to upper case, lower case, title case, or sentence case — case","text":"","code":"dog <- \"The quick brown dog\" str_to_upper(dog) #> [1] \"THE QUICK BROWN DOG\" str_to_lower(dog) #> [1] \"the quick brown dog\" str_to_title(dog) #> [1] \"The Quick Brown Dog\" str_to_sentence(\"the quick brown dog\") #> [1] \"The quick brown dog\" # Locale matters! 
str_to_upper(\"i\") # English #> [1] \"I\" str_to_upper(\"i\", \"tr\") # Turkish #> [1] \"İ\""},{"path":"https://stringr.tidyverse.org/dev/reference/invert_match.html","id":null,"dir":"Reference","previous_headings":"","what":"Switch location of matches to location of non-matches — invert_match","title":"Switch location of matches to location of non-matches — invert_match","text":"Invert matrix match locations match opposite previously matched.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/invert_match.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Switch location of matches to location of non-matches — invert_match","text":"","code":"invert_match(loc)"},{"path":"https://stringr.tidyverse.org/dev/reference/invert_match.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Switch location of matches to location of non-matches — invert_match","text":"loc matrix match locations, str_locate_all()","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/invert_match.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Switch location of matches to location of non-matches — invert_match","text":"numeric match giving locations non-matches","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/invert_match.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Switch location of matches to location of non-matches — invert_match","text":"","code":"numbers <- \"1 and 2 and 4 and 456\" num_loc <- str_locate_all(numbers, \"[0-9]+\")[[1]] str_sub(numbers, num_loc[, \"start\"], num_loc[, \"end\"]) #> [1] \"1\" \"2\" \"4\" \"456\" text_loc <- invert_match(num_loc) str_sub(numbers, text_loc[, \"start\"], text_loc[, \"end\"]) #> [1] \"\" \" and \" \" and \" \" and \" 
\"\""},{"path":"https://stringr.tidyverse.org/dev/reference/modifiers.html","id":null,"dir":"Reference","previous_headings":"","what":"Control matching behaviour with modifier functions — modifiers","title":"Control matching behaviour with modifier functions — modifiers","text":"Modifier functions control meaning pattern argument stringr functions: boundary(): Match boundaries things. coll(): Compare strings using standard Unicode collation rules. fixed(): Compare literal bytes. regex() (default): Uses ICU regular expressions.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/modifiers.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Control matching behaviour with modifier functions — modifiers","text":"","code":"fixed(pattern, ignore_case = FALSE) coll(pattern, ignore_case = FALSE, locale = \"en\", ...) regex( pattern, ignore_case = FALSE, multiline = FALSE, comments = FALSE, dotall = FALSE, ... ) boundary( type = c(\"character\", \"line_break\", \"sentence\", \"word\"), skip_word_none = NA, ... )"},{"path":"https://stringr.tidyverse.org/dev/reference/modifiers.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Control matching behaviour with modifier functions — modifiers","text":"pattern Pattern modify behaviour. ignore_case case differences ignored match? fixed(), uses simple algorithm assumes one--one mapping upper lower case letters. locale Locale use comparisons. See stringi::stri_locale_list() possible options. Defaults \"en\" (English) ensure default behaviour consistent across platforms. ... less frequently used arguments passed stringi::stri_opts_collator(), stringi::stri_opts_regex(), stringi::stri_opts_brkiter() multiline TRUE, $ ^ match beginning end line. FALSE, default, match start end input. comments TRUE, white space comments beginning # ignored. Escape literal spaces \\\\ . dotall TRUE, . also match line terminators. type Boundary type detect. 
character Every character boundary. line_break Boundaries places acceptable line break current locale. sentence beginnings ends sentences boundaries, using intelligent rules avoid counting abbreviations (details). word beginnings ends words boundaries. skip_word_none Ignore \"words\" contain characters numbers - .e. punctuation. Default NA skip \"words\" splitting word boundaries.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/modifiers.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Control matching behaviour with modifier functions — modifiers","text":"stringr modifier object, .e. character vector parent S3 class stringr_pattern.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/modifiers.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Control matching behaviour with modifier functions — modifiers","text":"","code":"pattern <- \"a.b\" strings <- c(\"abb\", \"a.b\") str_detect(strings, pattern) #> [1] TRUE TRUE str_detect(strings, fixed(pattern)) #> [1] FALSE TRUE str_detect(strings, coll(pattern)) #> [1] FALSE TRUE # coll() is useful for locale-aware case-insensitive matching i <- c(\"I\", \"\\u0130\", \"i\") i #> [1] \"I\" \"İ\" \"i\" str_detect(i, fixed(\"i\", TRUE)) #> [1] TRUE FALSE TRUE str_detect(i, coll(\"i\", TRUE)) #> [1] TRUE FALSE TRUE str_detect(i, coll(\"i\", TRUE, locale = \"tr\")) #> [1] FALSE TRUE TRUE # Word boundaries words <- c(\"These are some words.\") str_count(words, boundary(\"word\")) #> [1] 4 str_split(words, \" \")[[1]] #> [1] \"These\" \"are\" \"\" \"\" \"some\" \"words.\" str_split(words, boundary(\"word\"))[[1]] #> [1] \"These\" \"are\" \"some\" \"words\" # Regular expression variations str_extract_all(\"The Cat in the Hat\", \"[a-z]+\") #> [[1]] #> [1] \"he\" \"at\" \"in\" \"the\" \"at\" #> str_extract_all(\"The Cat in the Hat\", regex(\"[a-z]+\", TRUE)) #> [[1]] #> [1] \"The\" \"Cat\" \"in\" \"the\" \"Hat\" #> 
str_extract_all(\"a\\nb\\nc\", \"^.\") #> [[1]] #> [1] \"a\" #> str_extract_all(\"a\\nb\\nc\", regex(\"^.\", multiline = TRUE)) #> [[1]] #> [1] \"a\" \"b\" \"c\" #> str_extract_all(\"a\\nb\\nc\", \"a.\") #> [[1]] #> character(0) #> str_extract_all(\"a\\nb\\nc\", regex(\"a.\", dotall = TRUE)) #> [[1]] #> [1] \"a\\n\" #>"},{"path":"https://stringr.tidyverse.org/dev/reference/pipe.html","id":null,"dir":"Reference","previous_headings":"","what":"Pipe operator — %>%","title":"Pipe operator — %>%","text":"Pipe operator","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/pipe.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Pipe operator — %>%","text":"","code":"lhs %>% rhs"},{"path":"https://stringr.tidyverse.org/dev/reference/str_c.html","id":null,"dir":"Reference","previous_headings":"","what":"Join multiple strings into one string — str_c","title":"Join multiple strings into one string — str_c","text":"str_c() combines multiple character vectors single character vector. similar paste0() uses tidyverse recycling NA rules. One way understand str_c() works picture 2d matrix strings, argument forms column. sep inserted column, row combined together single string. collapse set, inserted row, result combined, time single string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_c.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Join multiple strings into one string — str_c","text":"","code":"str_c(..., sep = \"\", collapse = NULL)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_c.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Join multiple strings into one string — str_c","text":"... One character vectors. NULLs removed; scalar inputs (vectors length 1) recycled common length vector inputs. Like R functions, missing values \"infectious\": whenever missing value combined another string result always missing. 
Use dplyr::coalesce() str_replace_na() convert desired value. sep String insert input vectors. collapse Optional string used combine output single string. Generally better use str_flatten() needed behaviour.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_c.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Join multiple strings into one string — str_c","text":"collapse = NULL (default) character vector length equal longest input. collapse string, character vector length 1.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_c.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Join multiple strings into one string — str_c","text":"","code":"str_c(\"Letter: \", letters) #> [1] \"Letter: a\" \"Letter: b\" \"Letter: c\" \"Letter: d\" \"Letter: e\" #> [6] \"Letter: f\" \"Letter: g\" \"Letter: h\" \"Letter: i\" \"Letter: j\" #> [11] \"Letter: k\" \"Letter: l\" \"Letter: m\" \"Letter: n\" \"Letter: o\" #> [16] \"Letter: p\" \"Letter: q\" \"Letter: r\" \"Letter: s\" \"Letter: t\" #> [21] \"Letter: u\" \"Letter: v\" \"Letter: w\" \"Letter: x\" \"Letter: y\" #> [26] \"Letter: z\" str_c(\"Letter\", letters, sep = \": \") #> [1] \"Letter: a\" \"Letter: b\" \"Letter: c\" \"Letter: d\" \"Letter: e\" #> [6] \"Letter: f\" \"Letter: g\" \"Letter: h\" \"Letter: i\" \"Letter: j\" #> [11] \"Letter: k\" \"Letter: l\" \"Letter: m\" \"Letter: n\" \"Letter: o\" #> [16] \"Letter: p\" \"Letter: q\" \"Letter: r\" \"Letter: s\" \"Letter: t\" #> [21] \"Letter: u\" \"Letter: v\" \"Letter: w\" \"Letter: x\" \"Letter: y\" #> [26] \"Letter: z\" str_c(letters, \" is for\", \"...\") #> [1] \"a is for...\" \"b is for...\" \"c is for...\" \"d is for...\" \"e is for...\" #> [6] \"f is for...\" \"g is for...\" \"h is for...\" \"i is for...\" \"j is for...\" #> [11] \"k is for...\" \"l is for...\" \"m is for...\" \"n is for...\" \"o is for...\" #> [16] \"p is for...\" \"q is for...\" \"r is 
for...\" \"s is for...\" \"t is for...\" #> [21] \"u is for...\" \"v is for...\" \"w is for...\" \"x is for...\" \"y is for...\" #> [26] \"z is for...\" str_c(letters[-26], \" comes before \", letters[-1]) #> [1] \"a comes before b\" \"b comes before c\" \"c comes before d\" #> [4] \"d comes before e\" \"e comes before f\" \"f comes before g\" #> [7] \"g comes before h\" \"h comes before i\" \"i comes before j\" #> [10] \"j comes before k\" \"k comes before l\" \"l comes before m\" #> [13] \"m comes before n\" \"n comes before o\" \"o comes before p\" #> [16] \"p comes before q\" \"q comes before r\" \"r comes before s\" #> [19] \"s comes before t\" \"t comes before u\" \"u comes before v\" #> [22] \"v comes before w\" \"w comes before x\" \"x comes before y\" #> [25] \"y comes before z\" str_c(letters, collapse = \"\") #> [1] \"abcdefghijklmnopqrstuvwxyz\" str_c(letters, collapse = \", \") #> [1] \"a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z\" # Differences from paste() ---------------------- # Missing inputs give missing outputs str_c(c(\"a\", NA, \"b\"), \"-d\") #> [1] \"a-d\" NA \"b-d\" paste0(c(\"a\", NA, \"b\"), \"-d\") #> [1] \"a-d\" \"NA-d\" \"b-d\" # Use str_replace_NA to display literal NAs: str_c(str_replace_na(c(\"a\", NA, \"b\")), \"-d\") #> [1] \"a-d\" \"NA-d\" \"b-d\" # Uses tidyverse recycling rules if (FALSE) str_c(1:2, 1:3) # errors paste0(1:2, 1:3) #> [1] \"11\" \"22\" \"13\" str_c(\"x\", character()) #> character(0) paste0(\"x\", character()) #> [1] \"x\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_conv.html","id":null,"dir":"Reference","previous_headings":"","what":"Specify the encoding of a string — str_conv","title":"Specify the encoding of a string — str_conv","text":"convenient way override current encoding string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_conv.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Specify the 
encoding of a string — str_conv","text":"","code":"str_conv(string, encoding)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_conv.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Specify the encoding of a string — str_conv","text":"string Input vector. Either character vector, something coercible one. encoding Name encoding. See stringi::stri_enc_list() complete list.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_conv.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Specify the encoding of a string — str_conv","text":"","code":"# Example from encoding?stringi::stringi x <- rawToChar(as.raw(177)) x #> [1] \"\\xb1\" str_conv(x, \"ISO-8859-2\") # Polish \"a with ogonek\" #> [1] \"ą\" str_conv(x, \"ISO-8859-1\") # Plus-minus #> [1] \"±\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_count.html","id":null,"dir":"Reference","previous_headings":"","what":"Count number of matches — str_count","title":"Count number of matches — str_count","text":"Counts number times pattern found within element string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_count.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Count number of matches — str_count","text":"","code":"str_count(string, pattern = \"\")"},{"path":"https://stringr.tidyverse.org/dev/reference/str_count.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Count number of matches — str_count","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. 
Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\").","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_count.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Count number of matches — str_count","text":"integer vector length string/pattern.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_count.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Count number of matches — str_count","text":"","code":"fruit <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") str_count(fruit, \"a\") #> [1] 1 3 1 1 str_count(fruit, \"p\") #> [1] 2 0 1 3 str_count(fruit, \"e\") #> [1] 1 0 1 2 str_count(fruit, c(\"a\", \"b\", \"p\", \"p\")) #> [1] 1 1 1 3 str_count(c(\"a.\", \"...\", \".a.a\"), \".\") #> [1] 2 3 4 str_count(c(\"a.\", \"...\", \".a.a\"), fixed(\".\")) #> [1] 1 3 2"},{"path":"https://stringr.tidyverse.org/dev/reference/str_detect.html","id":null,"dir":"Reference","previous_headings":"","what":"Detect the presence/absence of a match — str_detect","title":"Detect the presence/absence of a match — str_detect","text":"str_detect() returns logical vector TRUE element string matches pattern FALSE otherwise. equivalent grepl(pattern, string).","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_detect.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Detect the presence/absence of a match — str_detect","text":"","code":"str_detect(string, pattern, negate = FALSE)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_detect.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Detect the presence/absence of a match — str_detect","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . 
default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\"). negate TRUE, inverts resulting boolean vector.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_detect.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Detect the presence/absence of a match — str_detect","text":"logical vector length string/pattern.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_detect.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Detect the presence/absence of a match — str_detect","text":"","code":"fruit <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") str_detect(fruit, \"a\") #> [1] TRUE TRUE TRUE TRUE str_detect(fruit, \"^a\") #> [1] TRUE FALSE FALSE FALSE str_detect(fruit, \"a$\") #> [1] FALSE TRUE FALSE FALSE str_detect(fruit, \"b\") #> [1] FALSE TRUE FALSE FALSE str_detect(fruit, \"[aeiou]\") #> [1] TRUE TRUE TRUE TRUE # Also vectorised over pattern str_detect(\"aecfg\", letters) #> [1] TRUE FALSE TRUE FALSE TRUE TRUE TRUE FALSE FALSE FALSE FALSE #> [12] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE #> [23] FALSE FALSE FALSE FALSE # Returns TRUE if the pattern do NOT match str_detect(fruit, \"^p\", negate = TRUE) #> [1] TRUE TRUE FALSE FALSE"},{"path":"https://stringr.tidyverse.org/dev/reference/str_dup.html","id":null,"dir":"Reference","previous_headings":"","what":"Duplicate a string — str_dup","title":"Duplicate a string — str_dup","text":"str_dup() duplicates characters within string, e.g. 
str_dup(\"xy\", 3) returns \"xyxyxy\".","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_dup.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Duplicate a string — str_dup","text":"","code":"str_dup(string, times)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_dup.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Duplicate a string — str_dup","text":"string Input vector. Either character vector, something coercible one. times Number times duplicate string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_dup.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Duplicate a string — str_dup","text":"character vector length string/times.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_dup.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Duplicate a string — str_dup","text":"","code":"fruit <- c(\"apple\", \"pear\", \"banana\") str_dup(fruit, 2) #> [1] \"appleapple\" \"pearpear\" \"bananabanana\" str_dup(fruit, 1:3) #> [1] \"apple\" \"pearpear\" \"bananabananabanana\" str_c(\"ba\", str_dup(\"na\", 0:5)) #> [1] \"ba\" \"bana\" \"banana\" \"bananana\" #> [5] \"banananana\" \"bananananana\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_equal.html","id":null,"dir":"Reference","previous_headings":"","what":"Determine if two strings are equivalent — str_equal","title":"Determine if two strings are equivalent — str_equal","text":"uses Unicode canonicalisation rules, optionally ignores case.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_equal.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Determine if two strings are equivalent — str_equal","text":"","code":"str_equal(x, y, locale = \"en\", ignore_case = FALSE, 
...)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_equal.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Determine if two strings are equivalent — str_equal","text":"x, y pair character vectors. locale Locale use comparisons. See stringi::stri_locale_list() possible options. Defaults \"en\" (English) ensure default behaviour consistent across platforms. ignore_case Ignore case comparing strings? ... options used control collation. Passed stringi::stri_opts_collator().","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_equal.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Determine if two strings are equivalent — str_equal","text":"logical vector length x/y.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_equal.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Determine if two strings are equivalent — str_equal","text":"","code":"# These two strings encode \"a\" with an accent in two different ways a1 <- \"\\u00e1\" a2 <- \"a\\u0301\" c(a1, a2) #> [1] \"á\" \"á\" a1 == a2 #> [1] FALSE str_equal(a1, a2) #> [1] TRUE # ohm and omega use different code points but should always be treated # as equal ohm <- \"\\u2126\" omega <- \"\\u03A9\" c(ohm, omega) #> [1] \"Ω\" \"Ω\" ohm == omega #> [1] FALSE str_equal(ohm, omega) #> [1] TRUE"},{"path":"https://stringr.tidyverse.org/dev/reference/str_escape.html","id":null,"dir":"Reference","previous_headings":"","what":"Escape regular expression metacharacters — str_escape","title":"Escape regular expression metacharacters — str_escape","text":"function escapes metacharacter, characters special meaning regular expression engine. 
cases better using fixed() since faster, str_escape() useful composing user provided strings pattern.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_escape.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Escape regular expression metacharacters — str_escape","text":"","code":"str_escape(string)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_escape.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Escape regular expression metacharacters — str_escape","text":"string Input vector. Either character vector, something coercible one.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_escape.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Escape regular expression metacharacters — str_escape","text":"character vector length string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_escape.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Escape regular expression metacharacters — str_escape","text":"","code":"str_detect(c(\"a\", \".\"), \".\") #> [1] TRUE TRUE str_detect(c(\"a\", \".\"), str_escape(\".\")) #> [1] FALSE TRUE"},{"path":"https://stringr.tidyverse.org/dev/reference/str_extract.html","id":null,"dir":"Reference","previous_headings":"","what":"Extract the complete match — str_extract","title":"Extract the complete match — str_extract","text":"str_extract() extracts first complete match string, str_extract_all()extracts matches string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_extract.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Extract the complete match — str_extract","text":"","code":"str_extract(string, pattern, group = NULL) str_extract_all(string, pattern, simplify = 
FALSE)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_extract.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Extract the complete match — str_extract","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\"). group supplied, instead returning complete match, return matched text specified capturing group. simplify boolean. FALSE (default): returns list character vectors. TRUE: returns character matrix.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_extract.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Extract the complete match — str_extract","text":"str_extract(): character vector length string/pattern. 
str_extract_all(): list character vectors length string/pattern.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_extract.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Extract the complete match — str_extract","text":"","code":"shopping_list <- c(\"apples x4\", \"bag of flour\", \"bag of sugar\", \"milk x2\") str_extract(shopping_list, \"\\\\d\") #> [1] \"4\" NA NA \"2\" str_extract(shopping_list, \"[a-z]+\") #> [1] \"apples\" \"bag\" \"bag\" \"milk\" str_extract(shopping_list, \"[a-z]{1,4}\") #> [1] \"appl\" \"bag\" \"bag\" \"milk\" str_extract(shopping_list, \"\\\\b[a-z]{1,4}\\\\b\") #> [1] NA \"bag\" \"bag\" \"milk\" str_extract(shopping_list, \"([a-z]+) of ([a-z]+)\") #> [1] NA \"bag of flour\" \"bag of sugar\" NA str_extract(shopping_list, \"([a-z]+) of ([a-z]+)\", group = 1) #> [1] NA \"bag\" \"bag\" NA str_extract(shopping_list, \"([a-z]+) of ([a-z]+)\", group = 2) #> [1] NA \"flour\" \"sugar\" NA # Extract all matches str_extract_all(shopping_list, \"[a-z]+\") #> [[1]] #> [1] \"apples\" \"x\" #> #> [[2]] #> [1] \"bag\" \"of\" \"flour\" #> #> [[3]] #> [1] \"bag\" \"of\" \"sugar\" #> #> [[4]] #> [1] \"milk\" \"x\" #> str_extract_all(shopping_list, \"\\\\b[a-z]+\\\\b\") #> [[1]] #> [1] \"apples\" #> #> [[2]] #> [1] \"bag\" \"of\" \"flour\" #> #> [[3]] #> [1] \"bag\" \"of\" \"sugar\" #> #> [[4]] #> [1] \"milk\" #> str_extract_all(shopping_list, \"\\\\d\") #> [[1]] #> [1] \"4\" #> #> [[2]] #> character(0) #> #> [[3]] #> character(0) #> #> [[4]] #> [1] \"2\" #> # Simplify results into character matrix str_extract_all(shopping_list, \"\\\\b[a-z]+\\\\b\", simplify = TRUE) #> [,1] [,2] [,3] #> [1,] \"apples\" \"\" \"\" #> [2,] \"bag\" \"of\" \"flour\" #> [3,] \"bag\" \"of\" \"sugar\" #> [4,] \"milk\" \"\" \"\" str_extract_all(shopping_list, \"\\\\d\", simplify = TRUE) #> [,1] #> [1,] \"4\" #> [2,] \"\" #> [3,] \"\" #> [4,] \"2\" # Extract all words str_extract_all(\"This is, suprisingly, 
a sentence.\", boundary(\"word\")) #> [[1]] #> [1] \"This\" \"is\" \"suprisingly\" \"a\" \"sentence\" #>"},{"path":"https://stringr.tidyverse.org/dev/reference/str_flatten.html","id":null,"dir":"Reference","previous_headings":"","what":"Flatten a string — str_flatten","title":"Flatten a string — str_flatten","text":"str_flatten() reduces character vector single string. summary function regardless length input x, always returns single string. str_flatten_comma() variation designed specifically flattening commas. automatically recognises last uses Oxford comma handles special case 2 elements.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_flatten.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Flatten a string — str_flatten","text":"","code":"str_flatten(string, collapse = \"\", last = NULL, na.rm = FALSE) str_flatten_comma(string, last = NULL, na.rm = FALSE)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_flatten.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Flatten a string — str_flatten","text":"string Input vector. Either character vector, something coercible one. collapse String insert piece. Defaults \"\". last Optional string use place final separator. na.rm Remove missing values? FALSE (default), result NA element string NA.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_flatten.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Flatten a string — str_flatten","text":"string, .e. 
character vector length 1.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_flatten.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Flatten a string — str_flatten","text":"","code":"str_flatten(letters) #> [1] \"abcdefghijklmnopqrstuvwxyz\" str_flatten(letters, \"-\") #> [1] \"a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-q-r-s-t-u-v-w-x-y-z\" str_flatten(letters[1:3], \", \") #> [1] \"a, b, c\" # Use last to customise the last component str_flatten(letters[1:3], \", \", \" and \") #> [1] \"a, b and c\" # this almost works if you want an Oxford (aka serial) comma str_flatten(letters[1:3], \", \", \", and \") #> [1] \"a, b, and c\" # but it will always add a comma, even when not necessary str_flatten(letters[1:2], \", \", \", and \") #> [1] \"a, and b\" # str_flatten_comma knows how to handle the Oxford comma str_flatten_comma(letters[1:3], \", and \") #> [1] \"a, b, and c\" str_flatten_comma(letters[1:2], \", and \") #> [1] \"a and b\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_glue.html","id":null,"dir":"Reference","previous_headings":"","what":"Interpolation with glue — str_glue","title":"Interpolation with glue — str_glue","text":"functions wrappers around glue::glue() glue::glue_data(), provide powerful elegant syntax interpolating strings {}. wrappers provide small set full options. Use glue() glue_data() directly glue control.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_glue.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Interpolation with glue — str_glue","text":"","code":"str_glue(..., .sep = \"\", .envir = parent.frame()) str_glue_data(.x, ..., .sep = \"\", .envir = parent.frame(), .na = \"NA\")"},{"path":"https://stringr.tidyverse.org/dev/reference/str_glue.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Interpolation with glue — str_glue","text":"... 
[expressions] Unnamed arguments taken expression string(s) format. Multiple inputs concatenated together formatting. Named arguments taken temporary variables available substitution. .sep [character(1): ‘\"\"’] Separator used separate elements. .envir [environment: parent.frame()] Environment evaluate expression . Expressions evaluated left right. .x environment, expressions evaluated environment .envir ignored. NULL passed, equivalent emptyenv(). .x [listish] environment, list, data frame used lookup values. .na [character(1): ‘NA’] Value replace NA values . NULL missing values propagated, NA result cause NA output. Otherwise value replaced value .na.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_glue.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Interpolation with glue — str_glue","text":"character vector length longest input.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_glue.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Interpolation with glue — str_glue","text":"","code":"name <- \"Fred\" age <- 50 anniversary <- as.Date(\"1991-10-12\") str_glue( \"My name is {name}, \", \"my age next year is {age + 1}, \", \"and my anniversary is {format(anniversary, '%A, %B %d, %Y')}.\" ) #> My name is Fred, my age next year is 51, and my anniversary is Saturday, October 12, 1991. # single braces can be inserted by doubling them str_glue(\"My name is {name}, not {{name}}.\") #> My name is Fred, not {name}. # You can also used named arguments str_glue( \"My name is {name}, \", \"and my age next year is {age + 1}.\", name = \"Joe\", age = 40 ) #> My name is Joe, and my age next year is 41. 
# `str_glue_data()` is useful in data pipelines mtcars %>% str_glue_data(\"{rownames(.)} has {hp} hp\") #> Mazda RX4 has 110 hp #> Mazda RX4 Wag has 110 hp #> Datsun 710 has 93 hp #> Hornet 4 Drive has 110 hp #> Hornet Sportabout has 175 hp #> Valiant has 105 hp #> Duster 360 has 245 hp #> Merc 240D has 62 hp #> Merc 230 has 95 hp #> Merc 280 has 123 hp #> Merc 280C has 123 hp #> Merc 450SE has 180 hp #> Merc 450SL has 180 hp #> Merc 450SLC has 180 hp #> Cadillac Fleetwood has 205 hp #> Lincoln Continental has 215 hp #> Chrysler Imperial has 230 hp #> Fiat 128 has 66 hp #> Honda Civic has 52 hp #> Toyota Corolla has 65 hp #> Toyota Corona has 97 hp #> Dodge Challenger has 150 hp #> AMC Javelin has 150 hp #> Camaro Z28 has 245 hp #> Pontiac Firebird has 175 hp #> Fiat X1-9 has 66 hp #> Porsche 914-2 has 91 hp #> Lotus Europa has 113 hp #> Ford Pantera L has 264 hp #> Ferrari Dino has 175 hp #> Maserati Bora has 335 hp #> Volvo 142E has 109 hp"},{"path":"https://stringr.tidyverse.org/dev/reference/str_interp.html","id":null,"dir":"Reference","previous_headings":"","what":"String interpolation — str_interp","title":"String interpolation — str_interp","text":"str_interp() superseded favour str_glue(). String interpolation useful way specifying character string depends values certain environment. allows string creation easier read write compared using e.g. paste() sprintf(). 
(template) string can include expression placeholders form ${expression} $[format]{expression}, expressions valid R expressions can evaluated given environment, format format specification valid use sprintf().","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_interp.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"String interpolation — str_interp","text":"","code":"str_interp(string, env = parent.frame())"},{"path":"https://stringr.tidyverse.org/dev/reference/str_interp.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"String interpolation — str_interp","text":"string template character string. function vectorised: character vector collapsed single string. env environment evaluate expressions.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_interp.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"String interpolation — str_interp","text":"interpolated character string.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_interp.html","id":"author","dir":"Reference","previous_headings":"","what":"Author","title":"String interpolation — str_interp","text":"Stefan Milton Bache","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_interp.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"String interpolation — str_interp","text":"","code":"# Using values from the environment, and some formats user_name <- \"smbache\" amount <- 6.656 account <- 1337 str_interp(\"User ${user_name} (account $[08d]{account}) has $$[.2f]{amount}.\") #> [1] \"User smbache (account 00001337) has $6.66.\" # Nested brace pairs work inside expressions too, and any braces can be # placed outside the expressions. 
str_interp(\"Works with } nested { braces too: $[.2f]{{{2 + 2}*{amount}}}\") #> [1] \"Works with } nested { braces too: 26.62\" # Values can also come from a list str_interp( \"One value, ${value1}, and then another, ${value2*2}.\", list(value1 = 10, value2 = 20) ) #> [1] \"One value, 10, and then another, 40.\" # Or a data frame str_interp( \"Values are $[.2f]{max(Sepal.Width)} and $[.2f]{min(Sepal.Width)}.\", iris ) #> [1] \"Values are 4.40 and 2.00.\" # Use a vector when the string is long: max_char <- 80 str_interp(c( \"This particular line is so long that it is hard to write \", \"without breaking the ${max_char}-char barrier!\" )) #> [1] \"This particular line is so long that it is hard to write without breaking the 80-char barrier!\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_length.html","id":null,"dir":"Reference","previous_headings":"","what":"Compute the length/width — str_length","title":"Compute the length/width — str_length","text":"str_length() returns number codepoints string. individual elements (often, always letters) can extracted str_sub(). str_width() returns much space string occupy printed fixed width font (.e. printed console).","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_length.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Compute the length/width — str_length","text":"","code":"str_length(string) str_width(string)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_length.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Compute the length/width — str_length","text":"string Input vector. 
Either character vector, something coercible one.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_length.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Compute the length/width — str_length","text":"numeric vector length string.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_length.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Compute the length/width — str_length","text":"","code":"str_length(letters) #> [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 str_length(NA) #> [1] NA str_length(factor(\"abc\")) #> [1] 3 str_length(c(\"i\", \"like\", \"programming\", NA)) #> [1] 1 4 11 NA # Some characters, like emoji and Chinese characters (hanzi), are square # which means they take up the width of two Latin characters x <- c(\"\\u6c49\\u5b57\", \"\\U0001f60a\") str_view(x) #> [1] │ 汉字 #> [2] │ 😊 str_width(x) #> [1] 4 2 str_length(x) #> [1] 2 1 # There are two ways of representing a u with an umlaut u <- c(\"\\u00fc\", \"u\\u0308\") # They have the same width str_width(u) #> [1] 1 1 # But a different length str_length(u) #> [1] 1 2 # Because the second element is made up of a u + an accent str_sub(u, 1, 1) #> [1] \"ü\" \"u\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_like.html","id":null,"dir":"Reference","previous_headings":"","what":"Detect a pattern in the same way as SQL's LIKE operator — str_like","title":"Detect a pattern in the same way as SQL's LIKE operator — str_like","text":"str_like() follows conventions SQL LIKE operator: Must match entire string. _ matches single character (like .). % matches number characters (like .*). \\% \\_ match literal % _. 
match case insensitive default.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_like.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Detect a pattern in the same way as SQL's LIKE operator — str_like","text":"","code":"str_like(string, pattern, ignore_case = TRUE)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_like.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Detect a pattern in the same way as SQL's LIKE operator — str_like","text":"string Input vector. Either character vector, something coercible one. pattern character vector containing SQL \"like\" pattern. See details. ignore_case Ignore case matches? Defaults TRUE match SQL LIKE operator.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_like.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Detect a pattern in the same way as SQL's LIKE operator — str_like","text":"logical vector length string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_like.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Detect a pattern in the same way as SQL's LIKE operator — str_like","text":"","code":"fruit <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") str_like(fruit, \"app\") #> [1] FALSE FALSE FALSE FALSE str_like(fruit, \"app%\") #> [1] TRUE FALSE FALSE FALSE str_like(fruit, \"ba_ana\") #> [1] FALSE TRUE FALSE FALSE str_like(fruit, \"%APPLE\") #> [1] TRUE FALSE FALSE TRUE"},{"path":"https://stringr.tidyverse.org/dev/reference/str_locate.html","id":null,"dir":"Reference","previous_headings":"","what":"Find location of match — str_locate","title":"Find location of match — str_locate","text":"str_locate() returns start end position first match; str_locate_all() returns start end position match. start end values inclusive, zero-length matches (e.g. 
$, ^, \\\\b) end smaller start.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_locate.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Find location of match — str_locate","text":"","code":"str_locate(string, pattern) str_locate_all(string, pattern)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_locate.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Find location of match — str_locate","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\").","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_locate.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Find location of match — str_locate","text":"str_locate() returns integer matrix two columns one row element string. first column, start, gives position start match, second column, end, gives position end. str_locate_all() returns list integer matrices length string/pattern. 
matrices columns start end , one row match.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_locate.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Find location of match — str_locate","text":"","code":"fruit <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") str_locate(fruit, \"$\") #> start end #> [1,] 6 5 #> [2,] 7 6 #> [3,] 5 4 #> [4,] 10 9 str_locate(fruit, \"a\") #> start end #> [1,] 1 1 #> [2,] 2 2 #> [3,] 3 3 #> [4,] 5 5 str_locate(fruit, \"e\") #> start end #> [1,] 5 5 #> [2,] NA NA #> [3,] 2 2 #> [4,] 4 4 str_locate(fruit, c(\"a\", \"b\", \"p\", \"p\")) #> start end #> [1,] 1 1 #> [2,] 1 1 #> [3,] 1 1 #> [4,] 1 1 str_locate_all(fruit, \"a\") #> [[1]] #> start end #> [1,] 1 1 #> #> [[2]] #> start end #> [1,] 2 2 #> [2,] 4 4 #> [3,] 6 6 #> #> [[3]] #> start end #> [1,] 3 3 #> #> [[4]] #> start end #> [1,] 5 5 #> str_locate_all(fruit, \"e\") #> [[1]] #> start end #> [1,] 5 5 #> #> [[2]] #> start end #> #> [[3]] #> start end #> [1,] 2 2 #> #> [[4]] #> start end #> [1,] 4 4 #> [2,] 9 9 #> str_locate_all(fruit, c(\"a\", \"b\", \"p\", \"p\")) #> [[1]] #> start end #> [1,] 1 1 #> #> [[2]] #> start end #> [1,] 1 1 #> #> [[3]] #> start end #> [1,] 1 1 #> #> [[4]] #> start end #> [1,] 1 1 #> [2,] 6 6 #> [3,] 7 7 #> # Find location of every character str_locate_all(fruit, \"\") #> [[1]] #> start end #> [1,] 1 1 #> [2,] 2 2 #> [3,] 3 3 #> [4,] 4 4 #> [5,] 5 5 #> #> [[2]] #> start end #> [1,] 1 1 #> [2,] 2 2 #> [3,] 3 3 #> [4,] 4 4 #> [5,] 5 5 #> [6,] 6 6 #> #> [[3]] #> start end #> [1,] 1 1 #> [2,] 2 2 #> [3,] 3 3 #> [4,] 4 4 #> #> [[4]] #> start end #> [1,] 1 1 #> [2,] 2 2 #> [3,] 3 3 #> [4,] 4 4 #> [5,] 5 5 #> [6,] 6 6 #> [7,] 7 7 #> [8,] 8 8 #> [9,] 9 9 #>"},{"path":"https://stringr.tidyverse.org/dev/reference/str_match.html","id":null,"dir":"Reference","previous_headings":"","what":"Extract components (capturing groups) from a match — str_match","title":"Extract components (capturing 
groups) from a match — str_match","text":"Extract number matches defined unnamed, (pattern), named, (?<name>pattern) capture groups. Use non-capturing group, (?:pattern), need override default operator precedence want capture result.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_match.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Extract components (capturing groups) from a match — str_match","text":"","code":"str_match(string, pattern) str_match_all(string, pattern)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_match.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Extract components (capturing groups) from a match — str_match","text":"string Input vector. Either character vector, something coercible one. pattern Unlike stringr functions, str_match() supports regular expressions, described vignette(\"regular-expressions\"). pattern contain least one capturing group.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_match.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Extract components (capturing groups) from a match — str_match","text":"str_match(): character matrix number rows length string/pattern. first column complete match, followed one column capture group. columns named used \"named captured groups\", .e. (?<name>pattern). str_match_all(): list length string/pattern containing character matrices. 
matrix columns descrbed one row match.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_match.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Extract components (capturing groups) from a match — str_match","text":"","code":"strings <- c(\" 219 733 8965\", \"329-293-8753 \", \"banana\", \"595 794 7569\", \"387 287 6718\", \"apple\", \"233.398.9187 \", \"482 952 3315\", \"239 923 8115 and 842 566 4692\", \"Work: 579-499-7527\", \"$1000\", \"Home: 543.355.3679\") phone <- \"([2-9][0-9]{2})[- .]([0-9]{3})[- .]([0-9]{4})\" str_extract(strings, phone) #> [1] \"219 733 8965\" \"329-293-8753\" NA \"595 794 7569\" #> [5] \"387 287 6718\" NA \"233.398.9187\" \"482 952 3315\" #> [9] \"239 923 8115\" \"579-499-7527\" NA \"543.355.3679\" str_match(strings, phone) #> [,1] [,2] [,3] [,4] #> [1,] \"219 733 8965\" \"219\" \"733\" \"8965\" #> [2,] \"329-293-8753\" \"329\" \"293\" \"8753\" #> [3,] NA NA NA NA #> [4,] \"595 794 7569\" \"595\" \"794\" \"7569\" #> [5,] \"387 287 6718\" \"387\" \"287\" \"6718\" #> [6,] NA NA NA NA #> [7,] \"233.398.9187\" \"233\" \"398\" \"9187\" #> [8,] \"482 952 3315\" \"482\" \"952\" \"3315\" #> [9,] \"239 923 8115\" \"239\" \"923\" \"8115\" #> [10,] \"579-499-7527\" \"579\" \"499\" \"7527\" #> [11,] NA NA NA NA #> [12,] \"543.355.3679\" \"543\" \"355\" \"3679\" # Extract/match all str_extract_all(strings, phone) #> [[1]] #> [1] \"219 733 8965\" #> #> [[2]] #> [1] \"329-293-8753\" #> #> [[3]] #> character(0) #> #> [[4]] #> [1] \"595 794 7569\" #> #> [[5]] #> [1] \"387 287 6718\" #> #> [[6]] #> character(0) #> #> [[7]] #> [1] \"233.398.9187\" #> #> [[8]] #> [1] \"482 952 3315\" #> #> [[9]] #> [1] \"239 923 8115\" \"842 566 4692\" #> #> [[10]] #> [1] \"579-499-7527\" #> #> [[11]] #> character(0) #> #> [[12]] #> [1] \"543.355.3679\" #> str_match_all(strings, phone) #> [[1]] #> [,1] [,2] [,3] [,4] #> [1,] \"219 733 8965\" \"219\" \"733\" \"8965\" #> #> [[2]] #> [,1] [,2] [,3] 
[,4] #> [1,] \"329-293-8753\" \"329\" \"293\" \"8753\" #> #> [[3]] #> [,1] [,2] [,3] [,4] #> #> [[4]] #> [,1] [,2] [,3] [,4] #> [1,] \"595 794 7569\" \"595\" \"794\" \"7569\" #> #> [[5]] #> [,1] [,2] [,3] [,4] #> [1,] \"387 287 6718\" \"387\" \"287\" \"6718\" #> #> [[6]] #> [,1] [,2] [,3] [,4] #> #> [[7]] #> [,1] [,2] [,3] [,4] #> [1,] \"233.398.9187\" \"233\" \"398\" \"9187\" #> #> [[8]] #> [,1] [,2] [,3] [,4] #> [1,] \"482 952 3315\" \"482\" \"952\" \"3315\" #> #> [[9]] #> [,1] [,2] [,3] [,4] #> [1,] \"239 923 8115\" \"239\" \"923\" \"8115\" #> [2,] \"842 566 4692\" \"842\" \"566\" \"4692\" #> #> [[10]] #> [,1] [,2] [,3] [,4] #> [1,] \"579-499-7527\" \"579\" \"499\" \"7527\" #> #> [[11]] #> [,1] [,2] [,3] [,4] #> #> [[12]] #> [,1] [,2] [,3] [,4] #> [1,] \"543.355.3679\" \"543\" \"355\" \"3679\" #> # You can also name the groups to make further manipulation easier phone <- \"(?[2-9][0-9]{2})[- .](?[0-9]{3}[- .][0-9]{4})\" str_match(strings, phone) #> area phone #> [1,] \"219 733 8965\" \"219\" \"733 8965\" #> [2,] \"329-293-8753\" \"329\" \"293-8753\" #> [3,] NA NA NA #> [4,] \"595 794 7569\" \"595\" \"794 7569\" #> [5,] \"387 287 6718\" \"387\" \"287 6718\" #> [6,] NA NA NA #> [7,] \"233.398.9187\" \"233\" \"398.9187\" #> [8,] \"482 952 3315\" \"482\" \"952 3315\" #> [9,] \"239 923 8115\" \"239\" \"923 8115\" #> [10,] \"579-499-7527\" \"579\" \"499-7527\" #> [11,] NA NA NA #> [12,] \"543.355.3679\" \"543\" \"355.3679\" x <- c(\" \", \" <>\", \"\", \"\", NA) str_match(x, \"<(.*?)> <(.*?)>\") #> [,1] [,2] [,3] #> [1,] \" \" \"a\" \"b\" #> [2,] \" <>\" \"a\" \"\" #> [3,] NA NA NA #> [4,] NA NA NA #> [5,] NA NA NA str_match_all(x, \"<(.*?)>\") #> [[1]] #> [,1] [,2] #> [1,] \"\" \"a\" #> [2,] \"\" \"b\" #> #> [[2]] #> [,1] [,2] #> [1,] \"\" \"a\" #> [2,] \"<>\" \"\" #> #> [[3]] #> [,1] [,2] #> [1,] \"\" \"a\" #> #> [[4]] #> [,1] [,2] #> #> [[5]] #> [,1] [,2] #> [1,] NA NA #> str_extract(x, \"<.*?>\") #> [1] \"\" \"\" \"\" NA NA str_extract_all(x, \"<.*?>\") #> [[1]] 
#> [1] \"\" \"\" #> #> [[2]] #> [1] \"\" \"<>\" #> #> [[3]] #> [1] \"\" #> #> [[4]] #> character(0) #> #> [[5]] #> [1] NA #>"},{"path":"https://stringr.tidyverse.org/dev/reference/str_order.html","id":null,"dir":"Reference","previous_headings":"","what":"Order, rank, or sort a character vector — str_order","title":"Order, rank, or sort a character vector — str_order","text":"str_sort() returns sorted vector. str_order() returns integer vector returns desired order used subsetting, .e. x[str_order(x)] str_sort() str_rank() returns ranks values, .e. arrange(df, str_rank(x)) str_sort(df$x).","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_order.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Order, rank, or sort a character vector — str_order","text":"","code":"str_order( x, decreasing = FALSE, na_last = TRUE, locale = \"en\", numeric = FALSE, ... ) str_rank(x, locale = \"en\", numeric = FALSE, ...) str_sort( x, decreasing = FALSE, na_last = TRUE, locale = \"en\", numeric = FALSE, ... )"},{"path":"https://stringr.tidyverse.org/dev/reference/str_order.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Order, rank, or sort a character vector — str_order","text":"x character vector sort. decreasing boolean. FALSE, default, sorts lowest highest; TRUE sorts highest lowest. na_last NA go? TRUE end, FALSE beginning, NA dropped. locale Locale use comparisons. See stringi::stri_locale_list() possible options. Defaults \"en\" (English) ensure default behaviour consistent across platforms. numeric TRUE, sort digits numerically, instead strings. ... options used control collation. 
Passed stringi::stri_opts_collator().","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_order.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Order, rank, or sort a character vector — str_order","text":"character vector length string.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_order.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Order, rank, or sort a character vector — str_order","text":"","code":"x <- c(\"apple\", \"car\", \"happy\", \"char\") str_sort(x) #> [1] \"apple\" \"car\" \"char\" \"happy\" str_order(x) #> [1] 1 2 4 3 x[str_order(x)] #> [1] \"apple\" \"car\" \"char\" \"happy\" str_rank(x) #> [1] 1 2 4 3 # In Czech, ch is a digraph that sorts after h str_sort(x, locale = \"cs\") #> [1] \"apple\" \"car\" \"happy\" \"char\" # Use numeric = TRUE to sort numbers in strings x <- c(\"100a10\", \"100a5\", \"2b\", \"2a\") str_sort(x) #> [1] \"100a10\" \"100a5\" \"2a\" \"2b\" str_sort(x, numeric = TRUE) #> [1] \"2a\" \"2b\" \"100a5\" \"100a10\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_pad.html","id":null,"dir":"Reference","previous_headings":"","what":"Pad a string to minimum width — str_pad","title":"Pad a string to minimum width — str_pad","text":"Pad string fixed width, str_length(str_pad(x, n)) always greater equal n.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_pad.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Pad a string to minimum width — str_pad","text":"","code":"str_pad( string, width, side = c(\"left\", \"right\", \"both\"), pad = \" \", use_width = TRUE )"},{"path":"https://stringr.tidyverse.org/dev/reference/str_pad.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Pad a string to minimum width — str_pad","text":"string Input vector. Either character vector, something coercible one. 
width Minimum width padded strings. side Side padding character added (left, right ). pad Single padding character (default space). use_width FALSE, use length string instead width; see str_width()/str_length() difference.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_pad.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Pad a string to minimum width — str_pad","text":"character vector length string/width/pad.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_pad.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Pad a string to minimum width — str_pad","text":"","code":"rbind( str_pad(\"hadley\", 30, \"left\"), str_pad(\"hadley\", 30, \"right\"), str_pad(\"hadley\", 30, \"both\") ) #> [,1] #> [1,] \" hadley\" #> [2,] \"hadley \" #> [3,] \" hadley \" # All arguments are vectorised except side str_pad(c(\"a\", \"abc\", \"abcdef\"), 10) #> [1] \" a\" \" abc\" \" abcdef\" str_pad(\"a\", c(5, 10, 20)) #> [1] \" a\" \" a\" \" a\" str_pad(\"a\", 10, pad = c(\"-\", \"_\", \" \")) #> [1] \"---------a\" \"_________a\" \" a\" # Longer strings are returned unchanged str_pad(\"hadley\", 3) #> [1] \"hadley\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_remove.html","id":null,"dir":"Reference","previous_headings":"","what":"Remove matched patterns — str_remove","title":"Remove matched patterns — str_remove","text":"Remove matches, .e. 
replace \"\".","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_remove.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Remove matched patterns — str_remove","text":"","code":"str_remove(string, pattern) str_remove_all(string, pattern)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_remove.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Remove matched patterns — str_remove","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\").","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_remove.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Remove matched patterns — str_remove","text":"character vector length string/pattern.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_remove.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Remove matched patterns — str_remove","text":"","code":"fruits <- c(\"one apple\", \"two pears\", \"three bananas\") str_remove(fruits, \"[aeiou]\") #> [1] \"ne apple\" \"tw pears\" \"thre bananas\" str_remove_all(fruits, \"[aeiou]\") #> [1] \"n ppl\" \"tw prs\" \"thr bnns\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace.html","id":null,"dir":"Reference","previous_headings":"","what":"Replace matches with new text — str_replace","title":"Replace matches with new text — str_replace","text":"str_replace() replaces first match; 
str_replace_all() replaces matches.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Replace matches with new text — str_replace","text":"","code":"str_replace(string, pattern, replacement) str_replace_all(string, pattern, replacement)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Replace matches with new text — str_replace","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described stringi::about_search_regex. Control options regex(). perform multiple replacements element string, pass supply named vector (c(pattern1 = replacement1)). Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. replacement replacement value, usually single string, can vector length string pattern. References form \\1, \\2, etc replaced contents respective matched group (created ()). 
Alternatively, supply function, called match (right left) return value used replace match.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Replace matches with new text — str_replace","text":"character vector length string/pattern/replacement.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Replace matches with new text — str_replace","text":"","code":"fruits <- c(\"one apple\", \"two pears\", \"three bananas\") str_replace(fruits, \"[aeiou]\", \"-\") #> [1] \"-ne apple\" \"tw- pears\" \"thr-e bananas\" str_replace_all(fruits, \"[aeiou]\", \"-\") #> [1] \"-n- -ppl-\" \"tw- p--rs\" \"thr-- b-n-n-s\" str_replace_all(fruits, \"[aeiou]\", toupper) #> [1] \"OnE ApplE\" \"twO pEArs\" \"thrEE bAnAnAs\" str_replace_all(fruits, \"b\", NA_character_) #> [1] \"one apple\" \"two pears\" NA str_replace(fruits, \"([aeiou])\", \"\") #> [1] \"ne apple\" \"tw pears\" \"thre bananas\" str_replace(fruits, \"([aeiou])\", \"\\\\1\\\\1\") #> [1] \"oone apple\" \"twoo pears\" \"threee bananas\" # Note that str_replace() is vectorised along text, pattern, and replacement str_replace(fruits, \"[aeiou]\", c(\"1\", \"2\", \"3\")) #> [1] \"1ne apple\" \"tw2 pears\" \"thr3e bananas\" str_replace(fruits, c(\"a\", \"e\", \"i\"), \"-\") #> [1] \"one -pple\" \"two p-ars\" \"three bananas\" # If you want to apply multiple patterns and replacements to the same # string, pass a named vector to pattern. fruits %>% str_c(collapse = \"---\") %>% str_replace_all(c(\"one\" = \"1\", \"two\" = \"2\", \"three\" = \"3\")) #> [1] \"1 apple---2 pears---3 bananas\" # Use a function for more sophisticated replacement. This example # replaces colour names with their hex values. 
colours <- str_c(\"\\\\b\", colors(), \"\\\\b\", collapse=\"|\") col2hex <- function(col) { rgb <- col2rgb(col) rgb(rgb[\"red\", ], rgb[\"green\", ], rgb[\"blue\", ], max = 255) } x <- c( \"Roses are red, violets are blue\", \"My favourite colour is green\" ) str_replace_all(x, colours, col2hex) #> [1] \"Roses are #FF0000, violets are #0000FF\" #> [2] \"My favourite colour is #00FF00\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace_na.html","id":null,"dir":"Reference","previous_headings":"","what":"Turn NA into ","title":"Turn NA into ","text":"Turn NA \"NA\"","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace_na.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Turn NA into ","text":"","code":"str_replace_na(string, replacement = \"NA\")"},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace_na.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Turn NA into ","text":"string Input vector. Either character vector, something coercible one. replacement single string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_replace_na.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Turn NA into ","text":"","code":"str_replace_na(c(NA, \"abc\", \"def\")) #> [1] \"NA\" \"abc\" \"def\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_split.html","id":null,"dir":"Reference","previous_headings":"","what":"Split up a string into pieces — str_split","title":"Split up a string into pieces — str_split","text":"functions differ primarily input output types: str_split() takes character vector returns list. str_split_1() takes single string returns character vector. str_split_fixed() takes character vector returns matrix. 
str_split_i() takes character vector returns character vector.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_split.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Split up a string into pieces — str_split","text":"","code":"str_split(string, pattern, n = Inf, simplify = FALSE) str_split_1(string, pattern) str_split_fixed(string, pattern, n) str_split_i(string, pattern, i)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_split.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Split up a string into pieces — str_split","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\"). n Maximum number pieces return. Default (Inf) uses possible split positions. str_split(), determines maximum length element output. str_split_fixed(), determines number columns output; input short, result padded \"\". simplify boolean. FALSE (default): returns list character vectors. TRUE: returns character matrix. Element return. Use negative value count right hand side.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_split.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Split up a string into pieces — str_split","text":"str_split_1(): character vector. str_split(): list length string/pattern containing character vectors. str_split_fixed(): character matrix n columns number rows length string/pattern. 
str_split_i(): character vector length string/pattern.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_split.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Split up a string into pieces — str_split","text":"","code":"fruits <- c( \"apples and oranges and pears and bananas\", \"pineapples and mangos and guavas\" ) str_split(fruits, \" and \") #> [[1]] #> [1] \"apples\" \"oranges\" \"pears\" \"bananas\" #> #> [[2]] #> [1] \"pineapples\" \"mangos\" \"guavas\" #> str_split(fruits, \" and \", simplify = TRUE) #> [,1] [,2] [,3] [,4] #> [1,] \"apples\" \"oranges\" \"pears\" \"bananas\" #> [2,] \"pineapples\" \"mangos\" \"guavas\" \"\" # If you want to split a single string, use `str_split1` str_split_1(fruits[[1]], \" and \") #> [1] \"apples\" \"oranges\" \"pears\" \"bananas\" # Specify n to restrict the number of possible matches str_split(fruits, \" and \", n = 3) #> [[1]] #> [1] \"apples\" \"oranges\" \"pears and bananas\" #> #> [[2]] #> [1] \"pineapples\" \"mangos\" \"guavas\" #> str_split(fruits, \" and \", n = 2) #> [[1]] #> [1] \"apples\" \"oranges and pears and bananas\" #> #> [[2]] #> [1] \"pineapples\" \"mangos and guavas\" #> # If n greater than number of pieces, no padding occurs str_split(fruits, \" and \", n = 5) #> [[1]] #> [1] \"apples\" \"oranges\" \"pears\" \"bananas\" #> #> [[2]] #> [1] \"pineapples\" \"mangos\" \"guavas\" #> # Use fixed to return a character matrix str_split_fixed(fruits, \" and \", 3) #> [,1] [,2] [,3] #> [1,] \"apples\" \"oranges\" \"pears and bananas\" #> [2,] \"pineapples\" \"mangos\" \"guavas\" str_split_fixed(fruits, \" and \", 4) #> [,1] [,2] [,3] [,4] #> [1,] \"apples\" \"oranges\" \"pears\" \"bananas\" #> [2,] \"pineapples\" \"mangos\" \"guavas\" \"\" # str_split_i extracts only a single piece from a string str_split_i(fruits, \" and \", 1) #> [1] \"apples\" \"pineapples\" str_split_i(fruits, \" and \", 4) #> [1] \"bananas\" NA # use a negative 
number to select from the end str_split_i(fruits, \" and \", -1) #> [1] \"bananas\" \"guavas\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_starts.html","id":null,"dir":"Reference","previous_headings":"","what":"Detect the presence/absence of a match at the start/end — str_starts","title":"Detect the presence/absence of a match at the start/end — str_starts","text":"str_starts() str_ends() special cases str_detect() match beginning end string, respectively.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_starts.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Detect the presence/absence of a match at the start/end — str_starts","text":"","code":"str_starts(string, pattern, negate = FALSE) str_ends(string, pattern, negate = FALSE)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_starts.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Detect the presence/absence of a match at the start/end — str_starts","text":"string Input vector. Either character vector, something coercible one. pattern Pattern string starts ends. default interpretation regular expression, described stringi::about_search_regex. Control options regex(). Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. 
negate TRUE, inverts resulting boolean vector.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_starts.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Detect the presence/absence of a match at the start/end — str_starts","text":"logical vector.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_starts.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Detect the presence/absence of a match at the start/end — str_starts","text":"","code":"fruit <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") str_starts(fruit, \"p\") #> [1] FALSE FALSE TRUE TRUE str_starts(fruit, \"p\", negate = TRUE) #> [1] TRUE TRUE FALSE FALSE str_ends(fruit, \"e\") #> [1] TRUE FALSE FALSE TRUE str_ends(fruit, \"e\", negate = TRUE) #> [1] FALSE TRUE TRUE FALSE"},{"path":"https://stringr.tidyverse.org/dev/reference/str_sub.html","id":null,"dir":"Reference","previous_headings":"","what":"Get and set substrings using their positions — str_sub","title":"Get and set substrings using their positions — str_sub","text":"str_sub() extracts replaces elements single position string. str_sub_all() allows extract strings multiple elements every string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_sub.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get and set substrings using their positions — str_sub","text":"","code":"str_sub(string, start = 1L, end = -1L) str_sub(string, start = 1L, end = -1L, omit_na = FALSE) <- value str_sub_all(string, start = 1L, end = -1L)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_sub.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get and set substrings using their positions — str_sub","text":"string Input vector. Either character vector, something coercible one. start, end pair integer vectors defining range characters extract (inclusive). 
Alternatively, instead pair vectors, can pass matrix start. matrix two columns, either labelled start end, start length. omit_na Single logical value. TRUE, missing values arguments provided result unchanged input. value replacement string","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_sub.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get and set substrings using their positions — str_sub","text":"str_sub(): character vector length string/start/end. str_sub_all(): list length string. element character vector length start/end.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_sub.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get and set substrings using their positions — str_sub","text":"","code":"hw <- \"Hadley Wickham\" str_sub(hw, 1, 6) #> [1] \"Hadley\" str_sub(hw, end = 6) #> [1] \"Hadley\" str_sub(hw, 8, 14) #> [1] \"Wickham\" str_sub(hw, 8) #> [1] \"Wickham\" # Negative indices index from end of string str_sub(hw, -1) #> [1] \"m\" str_sub(hw, -7) #> [1] \"Wickham\" str_sub(hw, end = -7) #> [1] \"Hadley W\" # str_sub() is vectorised by both string and position str_sub(hw, c(1, 8), c(6, 14)) #> [1] \"Hadley\" \"Wickham\" # if you want to extract multiple positions from multiple strings, # use str_sub_all() x <- c(\"abcde\", \"ghifgh\") str_sub(x, c(1, 2), c(2, 4)) #> [1] \"ab\" \"hif\" str_sub_all(x, start = c(1, 2), end = c(2, 4)) #> [[1]] #> [1] \"ab\" \"bcd\" #> #> [[2]] #> [1] \"gh\" \"hif\" #> # Alternatively, you can pass in a two column matrix, as in the # output from str_locate_all pos <- str_locate_all(hw, \"[aeio]\")[[1]] pos #> start end #> [1,] 2 2 #> [2,] 5 5 #> [3,] 9 9 #> [4,] 13 13 str_sub(hw, pos) #> [1] \"a\" \"e\" \"i\" \"a\" # You can also use `str_sub()` to modify strings: x <- \"BBCDEF\" str_sub(x, 1, 1) <- \"A\"; x #> [1] \"ABCDEF\" str_sub(x, -1, -1) <- \"K\"; x #> [1] \"ABCDEK\" str_sub(x, -2, -2) <- 
\"GHIJ\"; x #> [1] \"ABCDGHIJK\" str_sub(x, 2, -2) <- \"\"; x #> [1] \"AK\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_subset.html","id":null,"dir":"Reference","previous_headings":"","what":"Find matching elements — str_subset","title":"Find matching elements — str_subset","text":"str_subset() returns elements string least one match pattern. wrapper around x[str_detect(x, pattern)], equivalent grep(pattern, x, value = TRUE). Use str_extract() find location match within string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_subset.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Find matching elements — str_subset","text":"","code":"str_subset(string, pattern, negate = FALSE)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_subset.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Find matching elements — str_subset","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\"). 
negate TRUE, inverts resulting boolean vector.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_subset.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Find matching elements — str_subset","text":"character vector, usually smaller string.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_subset.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Find matching elements — str_subset","text":"","code":"fruit <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") str_subset(fruit, \"a\") #> [1] \"apple\" \"banana\" \"pear\" \"pineapple\" str_subset(fruit, \"^a\") #> [1] \"apple\" str_subset(fruit, \"a$\") #> [1] \"banana\" str_subset(fruit, \"b\") #> [1] \"banana\" str_subset(fruit, \"[aeiou]\") #> [1] \"apple\" \"banana\" \"pear\" \"pineapple\" # Elements that don't match str_subset(fruit, \"^p\", negate = TRUE) #> [1] \"apple\" \"banana\" # Missings never match str_subset(c(\"a\", NA, \"b\"), \".\") #> [1] \"a\" \"b\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_trim.html","id":null,"dir":"Reference","previous_headings":"","what":"Remove whitespace — str_trim","title":"Remove whitespace — str_trim","text":"str_trim() removes whitespace start end string; str_squish() removes whitespace start end, replaces internal whitespace single space.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_trim.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Remove whitespace — str_trim","text":"","code":"str_trim(string, side = c(\"both\", \"left\", \"right\")) str_squish(string)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_trim.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Remove whitespace — str_trim","text":"string Input vector. Either character vector, something coercible one. 
side Side remove whitespace: \"left\", \"right\", \"\", default.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_trim.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Remove whitespace — str_trim","text":"character vector length string.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_trim.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Remove whitespace — str_trim","text":"","code":"str_trim(\" String with trailing and leading white space\\t\") #> [1] \"String with trailing and leading white space\" str_trim(\"\\n\\nString with trailing and leading white space\\n\\n\") #> [1] \"String with trailing and leading white space\" str_squish(\" String with trailing, middle, and leading white space\\t\") #> [1] \"String with trailing, middle, and leading white space\" str_squish(\"\\n\\nString with excess, trailing and leading white space\\n\\n\") #> [1] \"String with excess, trailing and leading white space\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_trunc.html","id":null,"dir":"Reference","previous_headings":"","what":"Truncate a string to maximum width — str_trunc","title":"Truncate a string to maximum width — str_trunc","text":"Truncate string fixed characters, str_length(str_trunc(x, n)) always less equal n.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_trunc.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Truncate a string to maximum width — str_trunc","text":"","code":"str_trunc(string, width, side = c(\"right\", \"left\", \"center\"), ellipsis = \"...\")"},{"path":"https://stringr.tidyverse.org/dev/reference/str_trunc.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Truncate a string to maximum width — str_trunc","text":"string Input vector. Either character vector, something coercible one. 
width Maximum width string. side, ellipsis Location content ellipsis indicates content removed.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_trunc.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Truncate a string to maximum width — str_trunc","text":"character vector length string.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_trunc.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Truncate a string to maximum width — str_trunc","text":"","code":"x <- \"This string is moderately long\" rbind( str_trunc(x, 20, \"right\"), str_trunc(x, 20, \"left\"), str_trunc(x, 20, \"center\") ) #> [,1] #> [1,] \"This string is mo...\" #> [2,] \"...s moderately long\" #> [3,] \"This stri...ely long\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_unique.html","id":null,"dir":"Reference","previous_headings":"","what":"Remove duplicated strings — str_unique","title":"Remove duplicated strings — str_unique","text":"str_unique() removes duplicated values, optional control duplication measured.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_unique.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Remove duplicated strings — str_unique","text":"","code":"str_unique(string, locale = \"en\", ignore_case = FALSE, ...)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_unique.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Remove duplicated strings — str_unique","text":"string Input vector. Either character vector, something coercible one. locale Locale use comparisons. See stringi::stri_locale_list() possible options. Defaults \"en\" (English) ensure default behaviour consistent across platforms. ignore_case Ignore case comparing strings? ... options used control collation. 
Passed stringi::stri_opts_collator().","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_unique.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Remove duplicated strings — str_unique","text":"character vector, usually shorter string.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_unique.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Remove duplicated strings — str_unique","text":"","code":"str_unique(c(\"a\", \"b\", \"c\", \"b\", \"a\")) #> [1] \"a\" \"b\" \"c\" str_unique(c(\"a\", \"b\", \"c\", \"B\", \"A\")) #> [1] \"a\" \"b\" \"c\" \"B\" \"A\" str_unique(c(\"a\", \"b\", \"c\", \"B\", \"A\"), ignore_case = TRUE) #> [1] \"a\" \"b\" \"c\" # Use ... to pass additional arguments to stri_unique() str_unique(c(\"motley\", \"mötley\", \"pinguino\", \"pingüino\")) #> [1] \"motley\" \"mötley\" \"pinguino\" \"pingüino\" str_unique(c(\"motley\", \"mötley\", \"pinguino\", \"pingüino\"), strength = 1) #> [1] \"motley\" \"pinguino\""},{"path":"https://stringr.tidyverse.org/dev/reference/str_view.html","id":null,"dir":"Reference","previous_headings":"","what":"View strings and matches — str_view","title":"View strings and matches — str_view","text":"str_view() used print underlying representation string see pattern matches. Matches surrounded <> unusual whitespace (.e. whitespace apart \" \" \"\\n\") surrounded {} escaped. 
possible, matches unusual whitespace coloured blue NAs red.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_view.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"View strings and matches — str_view","text":"","code":"str_view( string, pattern = NULL, match = TRUE, html = FALSE, use_escapes = FALSE )"},{"path":"https://stringr.tidyverse.org/dev/reference/str_view.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"View strings and matches — str_view","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\"). match pattern supplied, elements shown? TRUE, default, shows elements match pattern. NA shows elements. FALSE shows elements match pattern. pattern supplied, elements always shown. html Use HTML output? TRUE create HTML widget; FALSE style using ANSI escapes. default prefers ANSI escapes available current terminal; can override setting options(stringr.html = TRUE). use_escapes TRUE, non-ASCII characters rendered unicode escapes. 
useful see exactly underlying values stored string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_view.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"View strings and matches — str_view","text":"","code":"# Show special characters str_view(c(\"\\\"\\\\\", \"\\\\\\\\\\\\\", \"fgh\", NA, \"NA\")) #> [1] │ \"\\ #> [2] │ \\\\\\ #> [3] │ fgh #> [4] │ NA #> [5] │ NA # A non-breaking space looks like a regular space: nbsp <- \"Hi\\u00A0you\" nbsp #> [1] \"Hi you\" # But it doesn't behave like one: str_detect(nbsp, \" \") #> [1] FALSE # So str_view() brings it to your attention with a blue background str_view(nbsp) #> [1] │ Hi{\\u00a0}you # You can also use escapes to see all non-ASCII characters str_view(nbsp, use_escapes = TRUE) #> [1] │ Hi\\u00a0you # Supply a pattern to see where it matches str_view(c(\"abc\", \"def\", \"fghi\"), \"[aeiou]\") #> [1] │ bc #> [2] │ df #> [3] │ fgh str_view(c(\"abc\", \"def\", \"fghi\"), \"^\") #> [1] │ <>abc #> [2] │ <>def #> [3] │ <>fghi str_view(c(\"abc\", \"def\", \"fghi\"), \"..\") #> [1] │ c #> [2] │ f #> [3] │ # By default, only matching strings will be shown str_view(c(\"abc\", \"def\", \"fghi\"), \"e\") #> [2] │ df # but you can show all: str_view(c(\"abc\", \"def\", \"fghi\"), \"e\", match = NA) #> [1] │ abc #> [2] │ df #> [3] │ fghi # or just those that don't match: str_view(c(\"abc\", \"def\", \"fghi\"), \"e\", match = FALSE) #> [1] │ abc #> [3] │ fghi"},{"path":"https://stringr.tidyverse.org/dev/reference/str_which.html","id":null,"dir":"Reference","previous_headings":"","what":"Find matching indices — str_which","title":"Find matching indices — str_which","text":"str_subset() returns indices ofstring least one match pattern. 
wrapper around which(str_detect(x, pattern)), equivalent grep(pattern, x).","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_which.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Find matching indices — str_which","text":"","code":"str_which(string, pattern, negate = FALSE)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_which.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Find matching indices — str_which","text":"string Input vector. Either character vector, something coercible one. pattern Pattern look . default interpretation regular expression, described vignette(\"regular-expressions\"). Use regex() finer control matching behaviour. Match fixed string (.e. comparing bytes), using fixed(). fast, approximate. Generally, matching human text, want coll() respects character matching rules specified locale. Match character, word, line sentence boundaries boundary(). empty pattern, \"\", equivalent boundary(\"character\"). 
negate TRUE, inverts resulting boolean vector.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_which.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Find matching indices — str_which","text":"integer vector, usually smaller string.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_which.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Find matching indices — str_which","text":"","code":"fruit <- c(\"apple\", \"banana\", \"pear\", \"pineapple\") str_which(fruit, \"a\") #> [1] 1 2 3 4 # Elements that don't match str_which(fruit, \"^p\", negate = TRUE) #> [1] 1 2 # Missings never match str_which(c(\"a\", NA, \"b\"), \".\") #> [1] 1 3"},{"path":"https://stringr.tidyverse.org/dev/reference/str_wrap.html","id":null,"dir":"Reference","previous_headings":"","what":"Wrap words into nicely formatted paragraphs — str_wrap","title":"Wrap words into nicely formatted paragraphs — str_wrap","text":"Wrap words paragraphs, minimizing \"raggedness\" lines (.e. variation length line) using Knuth-Plass algorithm.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_wrap.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Wrap words into nicely formatted paragraphs — str_wrap","text":"","code":"str_wrap(string, width = 80, indent = 0, exdent = 0, whitespace_only = TRUE)"},{"path":"https://stringr.tidyverse.org/dev/reference/str_wrap.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Wrap words into nicely formatted paragraphs — str_wrap","text":"string Input vector. Either character vector, something coercible one. width Positive integer giving target line width (number characters). width less equal 1 put word line. indent, exdent non-negative integer giving indent first line (indent) subsequent lines (exdent). whitespace_only boolean. 
TRUE (default) wrapping occur whitespace. FALSE, can break non-word character (e.g. /, -).","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/str_wrap.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Wrap words into nicely formatted paragraphs — str_wrap","text":"character vector length string.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/str_wrap.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Wrap words into nicely formatted paragraphs — str_wrap","text":"","code":"thanks_path <- file.path(R.home(\"doc\"), \"THANKS\") thanks <- str_c(readLines(thanks_path), collapse = \"\\n\") thanks <- word(thanks, 1, 3, fixed(\"\\n\\n\")) cat(str_wrap(thanks), \"\\n\") #> R would not be what it is today without the invaluable help of these people #> outside of the (former and current) R Core team, who contributed by donating #> code, bug fixes and documentation: Valerio Aimale, Suharto Anggono, Thomas #> Baier, Gabe Becker, Henrik Bengtsson, Roger Bivand, Ben Bolker, David Brahm, #> G\"oran Brostr\"om, Patrick Burns, Vince Carey, Saikat DebRoy, Matt Dowle, Brian #> D'Urso, Lyndon Drake, Dirk Eddelbuettel, Claus Ekstrom, Sebastian Fischmeister, #> John Fox, Paul Gilbert, Yu Gong, Gabor Grothendieck, Frank E Harrell Jr, Peter #> M. Haverty, Torsten Hothorn, Robert King, Kjetil Kjernsmo, Roger Koenker, #> Philippe Lambert, Jan de Leeuw, Jim Lindsey, Patrick Lindsey, Catherine Loader, #> Gordon Maclean, Arni Magnusson, John Maindonald, David Meyer, Ei-ji Nakama, #> Jens Oehlschl\"agel, Steve Oncley, Richard O'Keefe, Hubert Palme, Roger D. Peng, #> Jose' C. Pinheiro, Tony Plate, Anthony Rossini, Jonathan Rougier, Petr Savicky, #> Guenther Sawitzki, Marc Schwartz, Arun Srinivasan, Detlef Steuer, Bill Simpson, #> Gordon Smyth, Adrian Trapletti, Terry Therneau, Rolf Turner, Bill Venables, #> Gregory R. 
Warnes, Andreas Weingessel, Morten Welinder, James Wettenhall, Simon #> Wood, and Achim Zeileis. Others have written code that has been adopted by R and #> is acknowledged in the code files, including cat(str_wrap(thanks, width = 40), \"\\n\") #> R would not be what it is today without #> the invaluable help of these people #> outside of the (former and current) R #> Core team, who contributed by donating #> code, bug fixes and documentation: #> Valerio Aimale, Suharto Anggono, Thomas #> Baier, Gabe Becker, Henrik Bengtsson, #> Roger Bivand, Ben Bolker, David Brahm, #> G\"oran Brostr\"om, Patrick Burns, Vince #> Carey, Saikat DebRoy, Matt Dowle, #> Brian D'Urso, Lyndon Drake, Dirk #> Eddelbuettel, Claus Ekstrom, Sebastian #> Fischmeister, John Fox, Paul Gilbert, #> Yu Gong, Gabor Grothendieck, Frank E #> Harrell Jr, Peter M. Haverty, Torsten #> Hothorn, Robert King, Kjetil Kjernsmo, #> Roger Koenker, Philippe Lambert, Jan #> de Leeuw, Jim Lindsey, Patrick Lindsey, #> Catherine Loader, Gordon Maclean, #> Arni Magnusson, John Maindonald, #> David Meyer, Ei-ji Nakama, Jens #> Oehlschl\"agel, Steve Oncley, Richard #> O'Keefe, Hubert Palme, Roger D. Peng, #> Jose' C. Pinheiro, Tony Plate, Anthony #> Rossini, Jonathan Rougier, Petr Savicky, #> Guenther Sawitzki, Marc Schwartz, Arun #> Srinivasan, Detlef Steuer, Bill Simpson, #> Gordon Smyth, Adrian Trapletti, Terry #> Therneau, Rolf Turner, Bill Venables, #> Gregory R. Warnes, Andreas Weingessel, #> Morten Welinder, James Wettenhall, Simon #> Wood, and Achim Zeileis. 
Others have #> written code that has been adopted by R #> and is acknowledged in the code files, #> including cat(str_wrap(thanks, width = 60, indent = 2), \"\\n\") #> R would not be what it is today without the invaluable #> help of these people outside of the (former and current) #> R Core team, who contributed by donating code, bug fixes #> and documentation: Valerio Aimale, Suharto Anggono, Thomas #> Baier, Gabe Becker, Henrik Bengtsson, Roger Bivand, Ben #> Bolker, David Brahm, G\"oran Brostr\"om, Patrick Burns, #> Vince Carey, Saikat DebRoy, Matt Dowle, Brian D'Urso, #> Lyndon Drake, Dirk Eddelbuettel, Claus Ekstrom, Sebastian #> Fischmeister, John Fox, Paul Gilbert, Yu Gong, Gabor #> Grothendieck, Frank E Harrell Jr, Peter M. Haverty, #> Torsten Hothorn, Robert King, Kjetil Kjernsmo, Roger #> Koenker, Philippe Lambert, Jan de Leeuw, Jim Lindsey, #> Patrick Lindsey, Catherine Loader, Gordon Maclean, Arni #> Magnusson, John Maindonald, David Meyer, Ei-ji Nakama, #> Jens Oehlschl\"agel, Steve Oncley, Richard O'Keefe, Hubert #> Palme, Roger D. Peng, Jose' C. Pinheiro, Tony Plate, Anthony #> Rossini, Jonathan Rougier, Petr Savicky, Guenther Sawitzki, #> Marc Schwartz, Arun Srinivasan, Detlef Steuer, Bill Simpson, #> Gordon Smyth, Adrian Trapletti, Terry Therneau, Rolf Turner, #> Bill Venables, Gregory R. Warnes, Andreas Weingessel, Morten #> Welinder, James Wettenhall, Simon Wood, and Achim Zeileis. 
#> Others have written code that has been adopted by R and is #> acknowledged in the code files, including cat(str_wrap(thanks, width = 60, exdent = 2), \"\\n\") #> R would not be what it is today without the invaluable help #> of these people outside of the (former and current) R #> Core team, who contributed by donating code, bug fixes #> and documentation: Valerio Aimale, Suharto Anggono, Thomas #> Baier, Gabe Becker, Henrik Bengtsson, Roger Bivand, Ben #> Bolker, David Brahm, G\"oran Brostr\"om, Patrick Burns, #> Vince Carey, Saikat DebRoy, Matt Dowle, Brian D'Urso, #> Lyndon Drake, Dirk Eddelbuettel, Claus Ekstrom, Sebastian #> Fischmeister, John Fox, Paul Gilbert, Yu Gong, Gabor #> Grothendieck, Frank E Harrell Jr, Peter M. Haverty, #> Torsten Hothorn, Robert King, Kjetil Kjernsmo, Roger #> Koenker, Philippe Lambert, Jan de Leeuw, Jim Lindsey, #> Patrick Lindsey, Catherine Loader, Gordon Maclean, Arni #> Magnusson, John Maindonald, David Meyer, Ei-ji Nakama, #> Jens Oehlschl\"agel, Steve Oncley, Richard O'Keefe, Hubert #> Palme, Roger D. Peng, Jose' C. Pinheiro, Tony Plate, #> Anthony Rossini, Jonathan Rougier, Petr Savicky, Guenther #> Sawitzki, Marc Schwartz, Arun Srinivasan, Detlef Steuer, #> Bill Simpson, Gordon Smyth, Adrian Trapletti, Terry #> Therneau, Rolf Turner, Bill Venables, Gregory R. Warnes, #> Andreas Weingessel, Morten Welinder, James Wettenhall, #> Simon Wood, and Achim Zeileis. 
Others have written code #> that has been adopted by R and is acknowledged in the code #> files, including cat(str_wrap(thanks, width = 0, exdent = 2), \"\\n\") #> R #> would #> not #> be #> what #> it #> is #> today #> without #> the #> invaluable #> help #> of #> these #> people #> outside #> of #> the #> (former #> and #> current) #> R #> Core #> team, #> who #> contributed #> by #> donating #> code, #> bug #> fixes #> and #> documentation: #> Valerio #> Aimale, #> Suharto #> Anggono, #> Thomas #> Baier, #> Gabe #> Becker, #> Henrik #> Bengtsson, #> Roger #> Bivand, #> Ben #> Bolker, #> David #> Brahm, #> G\"oran #> Brostr\"om, #> Patrick #> Burns, #> Vince #> Carey, #> Saikat #> DebRoy, #> Matt #> Dowle, #> Brian #> D'Urso, #> Lyndon #> Drake, #> Dirk #> Eddelbuettel, #> Claus #> Ekstrom, #> Sebastian #> Fischmeister, #> John #> Fox, #> Paul #> Gilbert, #> Yu #> Gong, #> Gabor #> Grothendieck, #> Frank #> E #> Harrell #> Jr, #> Peter #> M. #> Haverty, #> Torsten #> Hothorn, #> Robert #> King, #> Kjetil #> Kjernsmo, #> Roger #> Koenker, #> Philippe #> Lambert, #> Jan #> de #> Leeuw, #> Jim #> Lindsey, #> Patrick #> Lindsey, #> Catherine #> Loader, #> Gordon #> Maclean, #> Arni #> Magnusson, #> John #> Maindonald, #> David #> Meyer, #> Ei-ji #> Nakama, #> Jens #> Oehlschl\"agel, #> Steve #> Oncley, #> Richard #> O'Keefe, #> Hubert #> Palme, #> Roger #> D. #> Peng, #> Jose' #> C. #> Pinheiro, #> Tony #> Plate, #> Anthony #> Rossini, #> Jonathan #> Rougier, #> Petr #> Savicky, #> Guenther #> Sawitzki, #> Marc #> Schwartz, #> Arun #> Srinivasan, #> Detlef #> Steuer, #> Bill #> Simpson, #> Gordon #> Smyth, #> Adrian #> Trapletti, #> Terry #> Therneau, #> Rolf #> Turner, #> Bill #> Venables, #> Gregory #> R. #> Warnes, #> Andreas #> Weingessel, #> Morten #> Welinder, #> James #> Wettenhall, #> Simon #> Wood, #> and #> Achim #> Zeileis. 
#> Others #> have #> written #> code #> that #> has #> been #> adopted #> by #> R #> and #> is #> acknowledged #> in #> the #> code #> files, #> including"},{"path":"https://stringr.tidyverse.org/dev/reference/stringr-data.html","id":null,"dir":"Reference","previous_headings":"","what":"Sample character vectors for practicing string manipulations — stringr-data","title":"Sample character vectors for practicing string manipulations — stringr-data","text":"fruit words come rcorpora package written Gabor Csardi; data collected Darius Kazemi made available https://github.com/dariusk/corpora. sentences collection \"Harvard sentences\" used standardised testing voice.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/stringr-data.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Sample character vectors for practicing string manipulations — stringr-data","text":"","code":"sentences fruit words"},{"path":"https://stringr.tidyverse.org/dev/reference/stringr-data.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Sample character vectors for practicing string manipulations — stringr-data","text":"Character vectors.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/stringr-data.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Sample character vectors for practicing string manipulations — stringr-data","text":"","code":"length(sentences) #> [1] 720 sentences[1:5] #> [1] \"The birch canoe slid on the smooth planks.\" #> [2] \"Glue the sheet to the dark blue background.\" #> [3] \"It's easy to tell the depth of a well.\" #> [4] \"These days a chicken leg is a rare dish.\" #> [5] \"Rice is often served in round bowls.\" length(fruit) #> [1] 80 fruit[1:5] #> [1] \"apple\" \"apricot\" \"avocado\" \"banana\" \"bell pepper\" length(words) #> [1] 980 words[1:5] #> [1] \"a\" \"able\" \"about\" \"absolute\" 
\"accept\""},{"path":"https://stringr.tidyverse.org/dev/reference/stringr-package.html","id":null,"dir":"Reference","previous_headings":"","what":"stringr: Simple, Consistent Wrappers for Common String Operations — stringr-package","title":"stringr: Simple, Consistent Wrappers for Common String Operations — stringr-package","text":"consistent, simple easy use set wrappers around fantastic 'stringi' package. function argument names (positions) consistent, functions deal \"NA\"'s zero length vectors way, output one function easy feed input another.","code":""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/reference/stringr-package.html","id":"author","dir":"Reference","previous_headings":"","what":"Author","title":"stringr: Simple, Consistent Wrappers for Common String Operations — stringr-package","text":"Maintainer: Hadley Wickham hadley@rstudio.com [copyright holder] contributors: RStudio [copyright holder, funder]","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/word.html","id":null,"dir":"Reference","previous_headings":"","what":"Extract words from a sentence — word","title":"Extract words from a sentence — word","text":"Extract words sentence","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/word.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Extract words from a sentence — word","text":"","code":"word(string, start = 1L, end = start, sep = fixed(\" \"))"},{"path":"https://stringr.tidyverse.org/dev/reference/word.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Extract words from a sentence — word","text":"string Input vector. Either character vector, something coercible one. start, end Pair integer vectors giving range words (inclusive) extract. negative, counts backwards last word. default value select first word. sep Separator words. 
Defaults single space.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/word.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Extract words from a sentence — word","text":"character vector length string/start/end.","code":""},{"path":"https://stringr.tidyverse.org/dev/reference/word.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Extract words from a sentence — word","text":"","code":"sentences <- c(\"Jane saw a cat\", \"Jane sat down\") word(sentences, 1) #> [1] \"Jane\" \"Jane\" word(sentences, 2) #> [1] \"saw\" \"sat\" word(sentences, -1) #> [1] \"cat\" \"down\" word(sentences, 2, -1) #> [1] \"saw a cat\" \"sat down\" # Also vectorised over start and end word(sentences[1], 1:3, -1) #> [1] \"Jane saw a cat\" \"saw a cat\" \"a cat\" word(sentences[1], 1, 1:4) #> [1] \"Jane\" \"Jane saw\" \"Jane saw a\" \"Jane saw a cat\" # Can define words by other separators str <- 'abc.def..123.4568.999' word(str, 1, sep = fixed('..')) #> [1] \"abc.def\" word(str, 2, sep = fixed('..')) #> [1] \"123.4568.999\""},{"path":[]},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-150","dir":"Changelog","previous_headings":"","what":"stringr 1.5.0","title":"stringr 1.5.0","text":"CRAN release: 2022-12-02","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"breaking-changes-1-5-0","dir":"Changelog","previous_headings":"","what":"Breaking changes","title":"stringr 1.5.0","text":"stringr functions now consistently implement tidyverse recycling rules (#372). two main changes: vectors length 1 recycled. Previously, (e.g.) str_detect(letters, c(\"x\", \"y\")) worked, now errors. str_c() ignores NULLs, rather treating length 0 vectors. Additionally, many arguments now throw errors, rather warnings, supplied wrong type input. regex() friends now generate class names stringr_ prefix (#384). 
str_detect(), str_starts(), str_ends() str_subset() now error used either empty string (\"\") boundary(). operations didn’t really make sense (str_detect(x, \"\") returned TRUE non-empty strings) made easy make mistakes programming.","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"new-features-1-5-0","dir":"Changelog","previous_headings":"","what":"New features","title":"stringr 1.5.0","text":"Many tweaks documentation make useful consistent. New vignette(\"from-base\") @sastoudt provides comprehensive comparison base R functions stringr equivalents. ’s designed help move stringr ’re already familiar base R string functions (#266). New str_escape() escapes regular expression metacharacters, providing alternative fixed() want compose pattern user supplied strings (#408). New str_equal() compares two character vectors using unicode rules, optionally ignoring case (#381). str_extract() can now optionally extract capturing group instead complete match (#420). New str_flatten_comma() special case str_flatten() designed comma separated flattening can correctly apply Oxford commas two elements (#444). New str_split_1() tailored special case splitting single string (#409). New str_split_i() extract single piece string (#278, @bfgray3). New str_like() allows use SQL wildcards (#280, @rjpat). New str_rank() complete set order/rank/sort functions (#353). New str_sub_all() extract multiple substrings string. New str_unique() wrapper around stri_unique() returns unique string values character vector (#249, @seasmith). str_view() uses ANSI colouring rather HTML widget (#370). works places requires fewer dependencies. includes number small improvements: longer requires pattern can use display strings special characters. highlights unusual whitespace characters. ’s vectorised string pattern (#407). defaults displaying matches, making str_view_all() redundant (hence deprecated) (#455). New str_width() returns display width string (#380). 
stringr now licensed MIT (#351).","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"minor-improvements-and-bug-fixes-1-5-0","dir":"Changelog","previous_headings":"","what":"Minor improvements and bug fixes","title":"stringr 1.5.0","text":"Better error message supply non-string pattern (#378). new data source sentences fixed many small errors. str_extract() str_extract_all() now work correctly pattern boundary(). str_flatten() gains last argument optionally override final separator (#377). gains na.rm argument remove missing values (since ’s summary function) (#439). str_pad() gains use_width argument control whether use total code point width number code points “width” string (#190). str_replace() str_replace_all() can use standard tidyverse formula shorthand replacement function (#331). str_starts() str_ends() now correctly respect regex operator precedence (@carlganz). str_wrap() breaks whitespace default; set whitespace_only = FALSE return previous behaviour (#335, @rjpat). word() now returns sentence using negative start parameter greater equal number words. (@pdelboca, #245)","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-141","dir":"Changelog","previous_headings":"","what":"stringr 1.4.1","title":"stringr 1.4.1","text":"CRAN release: 2022-08-20 Hot patch release resolve R CMD check failures.","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-140","dir":"Changelog","previous_headings":"","what":"stringr 1.4.0","title":"stringr 1.4.0","text":"CRAN release: 2019-02-10 str_interp() now renders lists consistently independent presence additional placeholders (@amhrasmussen). New str_starts() str_ends() functions detect patterns beginning end strings (@jonthegeek, #258). str_subset(), str_detect(), str_which() get negate argument, useful want elements match (#259, @yutannihilation). 
New str_to_sentence() function capitalize sentence case (@jonthegeek, #202).","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-131","dir":"Changelog","previous_headings":"","what":"stringr 1.3.1","title":"stringr 1.3.1","text":"CRAN release: 2018-05-10 str_replace_all() named vector now respects modifier functions (#207) str_trunc() vectorised correctly (#203, @austin3dickey). str_view() handles NA values gracefully (#217). ’ve also tweaked sizing policy hopefully work better notebooks, preserving existing behaviour knit documents (#232).","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-130","dir":"Changelog","previous_headings":"","what":"stringr 1.3.0","title":"stringr 1.3.0","text":"CRAN release: 2018-02-19","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"api-changes-1-3-0","dir":"Changelog","previous_headings":"","what":"API changes","title":"stringr 1.3.0","text":"package build, may see Error : object ‘ignore.case’ exported 'namespace:stringr'. long deprecated str_join(), ignore.case() perl() now removed.","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"new-features-1-3-0","dir":"Changelog","previous_headings":"","what":"New features","title":"stringr 1.3.0","text":"str_glue() str_glue_data() provide convenient wrappers around glue glue_data() glue package (#157). str_flatten() wrapper around stri_flatten() clearly conveys flattening character vector single string (#186). str_remove() str_remove_all() functions. wrap str_replace() str_replace_all() remove patterns strings. (@Shians, #178) str_squish() removes spaces left right side strings, also converts multiple space (space-like characters) single space within strings (@stephlocke, #197). str_sub() gains omit_na argument ignoring NA. Accordingly, str_replace() now ignores NAs keeps original strings. 
(@yutannihilation, #164)","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"bug-fixes-and-minor-improvements-1-3-0","dir":"Changelog","previous_headings":"","what":"Bug fixes and minor improvements","title":"stringr 1.3.0","text":"str_trunc() now preserves NAs (@ClaytonJY, #162) str_trunc() now throws error width shorter ellipsis (@ClaytonJY, #163). Long deprecated str_join(), ignore.case() perl() now removed.","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-120","dir":"Changelog","previous_headings":"","what":"stringr 1.2.0","title":"stringr 1.2.0","text":"CRAN release: 2017-02-18","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"api-changes-1-2-0","dir":"Changelog","previous_headings":"","what":"API changes","title":"stringr 1.2.0","text":"str_match_all() now returns NA optional group doesn’t match (previously returned \"\"). consistent str_match() match failures (#134).","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"new-features-1-2-0","dir":"Changelog","previous_headings":"","what":"New features","title":"stringr 1.2.0","text":"str_replace(), replacement can now function called match whose return value used replace match. New str_which() mimics grep() (#129). new vignette (vignette(\"regular-expressions\")) describes details regular expressions supported stringr. main vignette (vignette(\"stringr\")) updated give high-level overview package.","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"minor-improvements-and-bug-fixes-1-2-0","dir":"Changelog","previous_headings":"","what":"Minor improvements and bug fixes","title":"stringr 1.2.0","text":"str_order() str_sort() gain explicit numeric argument sorting mixed numbers strings. str_replace_all() now throws error replacement character vector. replacement NA_character_ replaces complete string replaces NA (#124). functions take locale (e.g. 
str_to_lower() str_sort()) default “en” (English) ensure default consistent across platforms.","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-110","dir":"Changelog","previous_headings":"","what":"stringr 1.1.0","title":"stringr 1.1.0","text":"CRAN release: 2016-08-19 Add sample datasets: fruit, words sentences. fixed(), regex(), coll() now throw error use anything plain string (#60). ’ve clarified replacement perl() regex() regexp() (#61). boundary() improved defaults splitting non-word boundaries (#58, @lmullen). str_detect() now can detect boundaries (checking str_count() > 0) (#120). str_subset() works similarly. str_extract() str_extract_all() now work boundary(). particularly useful want extract logical constructs like words sentences. str_extract_all() respects simplify argument used fixed() matches. str_subset() now respects custom options fixed() patterns (#79, @gagolews). str_replace() str_replace_all() now behave correctly replacement string contains $s, \\\\\\\\1, etc. (#83, #99). str_split() gains simplify argument match str_extract_all() etc. str_view() str_view_all() create HTML widgets display regular expression matches (#96). word() returns NA indexes greater number words (#112).","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-100","dir":"Changelog","previous_headings":"","what":"stringr 1.0.0","title":"stringr 1.0.0","text":"CRAN release: 2015-04-30 stringr now powered stringi instead base R regular expressions. improves unicode support, makes operations considerably faster. find stringr inadequate string processing needs, highly recommend looking stringi detail. stringr gains vignette, currently straight forward update article appeared R Journal. str_c() now returns zero length vector inputs zero length vectors. consistent functions, standard R recycling rules. Similarly, using str_c(\"x\", NA) now yields NA. want \"xNA\", use str_replace_na() inputs. 
str_replace_all() gains convenient syntax applying multiple pairs pattern replacement vector: str_match() now returns NA optional group doesn’t match (previously returned \"\"). consistent str_extract() match failures. New str_subset() keeps values match pattern. ’s convenient wrapper x[str_detect(x)] (#21, @jiho). New str_order() str_sort() allow sort order strings specified locale. New str_conv() convert strings specified encoding UTF-8. New modifier boundary() allows count, locate split character, word, line sentence boundaries. documentation got lot love, similar functions (e.g. first variants) now documented together. hopefully make easier locate function need. ignore.case(x) deprecated favour fixed|regex|coll(x, ignore.case = TRUE), perl(x) deprecated favour regex(x). str_join() deprecated, please use str_c() instead.","code":"input <- c(\"abc\", \"def\") str_replace_all(input, c(\"[ad]\" = \"!\", \"[cf]\" = \"?\"))"},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-062","dir":"Changelog","previous_headings":"","what":"stringr 0.6.2","title":"stringr 0.6.2","text":"CRAN release: 2012-12-06 fixed path str_wrap example works R installations. 
remove dependency plyr","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-061","dir":"Changelog","previous_headings":"","what":"stringr 0.6.1","title":"stringr 0.6.1","text":"CRAN release: 2012-07-25 Zero input str_split_fixed returns 0 row matrix n columns Export str_join","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-06","dir":"Changelog","previous_headings":"","what":"stringr 0.6","title":"stringr 0.6","text":"CRAN release: 2011-12-08 new modifier perl switches Perl regular expressions str_match now uses new base function regmatches extract matches - hopefully faster previous pure R algorithm","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-05","dir":"Changelog","previous_headings":"","what":"stringr 0.5","title":"stringr 0.5","text":"CRAN release: 2011-06-30 new str_wrap function gives strwrap output convenient format new word function extract words string given user defined separator (thanks suggestion David Cooper) str_locate now returns consistent type matching empty string (thanks Stavros Macrakis) new str_count counts number matches string. str_pad str_trim receive performance tweaks - large vectors give least two order magnitude speed str_length returns NA invalid multibyte strings fix small bug internal recyclable function","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-04","dir":"Changelog","previous_headings":"","what":"stringr 0.4","title":"stringr 0.4","text":"CRAN release: 2010-08-24 functions now vectorised respect string, pattern (appropriate) replacement parameters fixed() function now tells stringr functions use fixed matching, rather escaping regular expression. improve performance large vectors. new ignore.case() modifier tells stringr functions ignore case pattern. str_replace renamed str_replace_all new str_replace function added. makes str_replace consistent functions. 
new str_sub<- function (analogous substring<-) substring replacement str_sub now understands negative positions position end string. -1 replaces Inf indicator string end. str_pad side argument can left, right, (instead center) str_trim gains side argument better match str_pad stringr now namespace imports plyr (rather requiring )","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-03","dir":"Changelog","previous_headings":"","what":"stringr 0.3","title":"stringr 0.3","text":"CRAN release: 2010-02-15 fixed() now also escapes | str_join() renamed str_c() functions carefully check input return informative error messages expected. add invert_match() function convert matrix location matches locations non-matches add fixed() function allow matching fixed strings.","code":""},{"path":"https://stringr.tidyverse.org/dev/news/index.html","id":"stringr-02","dir":"Changelog","previous_headings":"","what":"stringr 0.2","title":"stringr 0.2","text":"CRAN release: 2009-11-16 str_length now returns correct results used factors str_sub now correctly replaces Inf end argument length string new function str_split_fixed returns fixed number splits character matrix str_split longer uses strsplit preserve trailing breaks","code":""}]