| grep {base} | R Documentation |
grep searches for matches to pattern (its first
argument) within the vector x of character strings (second
argument). regexpr does too, but returns more detail in a
different format.
sub and gsub perform replacement of matches
determined by regular expression matching.
grep(pattern, x, ignore.case=FALSE, extended=TRUE, value=FALSE)
sub(pattern, replacement, x,
    ignore.case=FALSE, extended=TRUE)
gsub(pattern, replacement, x,
     ignore.case=FALSE, extended=TRUE)
regexpr(pattern, text, extended=TRUE)
pattern |
character string containing a regular expression
to be matched in the character vector x (or text). |
x, text |
a vector of character strings where matches are sought. |
ignore.case |
if FALSE, the pattern matching is
case sensitive and if TRUE, case is ignored during matching. |
extended |
if TRUE, extended regular expression matching
is used, and if FALSE basic regular expressions are used. |
value |
if FALSE, a vector containing the (integer) indices
of the matches determined by grep is returned,
and if TRUE, a vector containing the matching
elements themselves is returned. |
replacement |
a replacement for the matched pattern in
sub and gsub. |
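As a minimal sketch of how ignore.case and value change what grep returns (the vector fruits is hypothetical sample data, not part of this page):

```r
# Hypothetical sample data, used only for illustration.
fruits <- c("Apple", "banana", "cherry", "apricot")

# Default: case-sensitive matching, integer indices returned.
idx_cs <- grep("^a", fruits)

# ignore.case = TRUE also matches the capitalised "Apple".
idx_ci <- grep("^a", fruits, ignore.case = TRUE)

# value = TRUE returns the matching elements instead of their indices.
vals_ci <- grep("^a", fruits, ignore.case = TRUE, value = TRUE)

print(idx_cs)    # 4
print(idx_ci)    # 1 4
print(vals_ci)   # "Apple" "apricot"
```

Indexing with the result, as in fruits[idx_ci], gives the same elements as value = TRUE.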
The two *sub functions differ only in that sub replaces only
the first occurrence of a pattern whereas gsub replaces
all occurrences.
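The difference can be seen on a single string; this is only an illustrative sketch with a made-up input:

```r
s <- "banana"   # made-up input for illustration

first_only <- sub("a", "A", s)    # replaces the first "a" only
all_hits   <- gsub("a", "A", s)   # replaces every "a"

print(first_only)   # "bAnana"
print(all_hits)     # "bAnAnA"
```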
The regular expressions used are those specified by POSIX 1003.2,
either extended or basic, depending on the value of the
extended argument.
For grep a vector giving either the indices of the elements
of x that yielded a match or, if value is TRUE,
the matched elements.
For sub and gsub a character vector of the same
length as the original.
For regexpr an integer vector of the same length as
text giving the starting position of the first match, or -1
if there is none, with attribute "match.length" giving the
length of the matched text (or -1 for no match).
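A short sketch of reading the regexpr result follows. The input words and the use of substring to recover the matched text are illustrative choices, not something this page prescribes:

```r
# Illustrative input; the second word contains no "en".
words <- c("license", "freedom", "intended")

m <- regexpr("en", words)

start <- as.integer(m)             # start of the first match, -1 if none
len   <- attr(m, "match.length")   # length of the match, -1 if none

# Recover the matched text where a match exists, NA otherwise.
matched <- ifelse(start > 0, substring(words, start, start + len - 1), NA)

print(start)     # 4 -1 4
print(len)       # 2 -1 2
print(matched)   # "en" NA "en"
```

Because the no-match marker is -1 in both the positions and the lengths, testing start > 0 before calling substring avoids treating a failed match as an empty string.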
It is possible to compile R without support for regular expressions, and then these functions are not operational.
On the Macintosh port these functions are based on the regex regular expression library written by Henry Spencer of the University of Toronto.
charmatch, pmatch, match.
apropos uses regexps and has nice examples.
grep("[a-z]", letters)
txt <- c("arm","foot","lefroo", "bafoobar")
if(any(i <- grep("foo",txt)))
   cat("`foo' appears at least once in\n\t",txt,"\n")
i # 2 and 4
txt[i]
## Double all 'a' or 'b's; "\" must be escaped, i.e. `doubled'
gsub("([ab])", "\\1_\\1_", "abc and ABC")
txt <- c("The", "licenses", "for", "most", "software", "are",
"designed", "to", "take", "away", "your", "freedom",
"to", "share", "and", "change", "it.",
"", "By", "contrast,", "the", "GNU", "General", "Public", "License",
"is", "intended", "to", "guarantee", "your", "freedom", "to",
"share", "and", "change", "free", "software", "--",
"to", "make", "sure", "the", "software", "is",
"free", "for", "all", "its", "users")
( i <- grep("[gu]", txt) ) # indices
all( txt[i] == grep("[gu]", txt, value = TRUE) )
(ot <- sub("[b-e]",".", txt))
txt[ot != gsub("[b-e]",".", txt)]#- gsub does "global" substitution
txt[gsub("g","#", txt) !=
    gsub("g","#", txt, ignore.case = TRUE)] # the "G" words
regexpr("en", txt)