Intermediate Regular Expressions in R
Angelo Zehr
Data Journalist
Character Class | Example |
---|---|
\\d or [:digit:] | 0, 1, 2, 3,… |
\\w or [:word:] | a, b, c,…, 1, 2, 3,…, _ |
[A-Za-z] or [:alpha:] | A, B, C,…, a, b, c,… |
[aeiou] | either a, e, i, o or u |
\\s or [:space:] | " ", tabs or line breaks |
str_match_all() | Result |
---|---|
"Hi John_35", "\\d" | "3", "5" |
"Hi John_35", "\\w" | "H", "i", "J", "o", "h", "n", "_", "3", "5" |
"Hi John_35", "[A-Za-z]" | "H", "i", "J", "o", "h", "n" |
"Hi John_35", "[aeiou]" | "i", "o" |
"Hi John_35", "\\s" | " " |
Syntax | Meaning |
---|---|
\\w{2} | exactly 2 times |
\\w{2,3} | minimum 2 times, maximum 3 times |
\\w{2,} | minimum 2 times, but no maximum |
\\w+ | 1 or more repetitions |
\\w* | 0, 1 or more repetitions |
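The quantifiers in the table can be tried out with a short script. This is a minimal sketch that assumes the stringr package is installed; the helper `all_matches()` and the input string are hypothetical and chosen only for illustration.

```r
library(stringr)

# Hypothetical helper: full matches of `pattern` in one string, as a character vector.
# str_match_all() returns a list of matrices; column 1 holds the full match.
all_matches <- function(string, pattern) {
  str_match_all(string, pattern)[[1]][, 1]
}

x <- "a bb ccc dddd"  # illustrative input

exactly_two  <- all_matches(x, "\\w{2}")    # "bb" "cc" "dd" "dd"
two_to_three <- all_matches(x, "\\w{2,3}")  # "bb" "ccc" "ddd"
two_or_more  <- all_matches(x, "\\w{2,}")   # "bb" "ccc" "dddd"
one_or_more  <- all_matches(x, "\\w+")      # "a" "bb" "ccc" "dddd"

# * also accepts zero repetitions, + requires at least one
star_ok <- str_detect("ab", "^ab\\w*$")  # TRUE
plus_ok <- str_detect("ab", "^ab\\w+$")  # FALSE
```

Quantifiers are greedy by default, which is why `\\w{2,3}` takes "ddd" from "dddd" and leaves a single "d" that is too short to match again.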
Original | Negation |
---|---|
\\d match digits | \\D match all but digits |
\\w match word characters | \\W match all but word characters |
\\s match spaces | \\S match all but spaces |
[a-zA-Z] match alphabet | [^a-zA-Z] match all but alphabet |
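Each negated class can be checked against the "Hi John_35" string used earlier. The sketch below assumes the stringr package is installed; `all_matches()` is a hypothetical helper, not part of stringr.

```r
library(stringr)

# Hypothetical helper: full matches of `pattern` in one string, as a character vector.
all_matches <- function(string, pattern) {
  str_match_all(string, pattern)[[1]][, 1]
}

x <- "Hi John_35"

non_digits  <- all_matches(x, "\\D")        # everything except "3" and "5"
non_word    <- all_matches(x, "\\W")        # only the space; "_" counts as a word character
non_space   <- all_matches(x, "\\S")        # everything except the space
non_letters <- all_matches(x, "[^a-zA-Z]")  # " " "_" "3" "5"
```

Note that the underscore is a word character, so `\\W` skips it while `[^a-zA-Z]` matches it.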
str_match_all("Toy Story 3", "[\\d\\s]")
Result:
[,1]
[1,] " "
[2,] " "
[3,] "3"
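The following sketch looks at the same call in a bit more detail, assuming the stringr package is installed. It shows that `str_match_all()` returns a list with one matrix per input string, that the bracket expression behaves like an alternation of the two classes, and how a leading `^` negates the whole set. The helper `all_matches()` is hypothetical.

```r
library(stringr)

title <- "Toy Story 3"

# The raw result is a list; its first element is a one-column character matrix
res <- str_match_all(title, "[\\d\\s]")
first <- res[[1]]

# Hypothetical helper: full matches as a plain character vector
all_matches <- function(string, pattern) {
  str_match_all(string, pattern)[[1]][, 1]
}

in_set  <- all_matches(title, "[\\d\\s]")   # " " " " "3"
either  <- all_matches(title, "\\d|\\s")    # same matches, written as an alternation
neither <- all_matches(title, "[^\\d\\s]")  # "T" "o" "y" "S" "t" "o" "r" "y"
```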