Using word boundaries (\\b) and specifying two alternatives for the lookahead:
unlist(strsplit("var==5", "(?=(\\b[^a-zA-Z0-9])|(\\b[a-zA-Z0-9]\\b))", perl = TRUE))
[1] "var" "==" "5"
unlist(strsplit("var<3", "(?=(\\b[^a-zA-Z0-9])|(\\b[a-zA-Z0-9]\\b))", perl = TRUE))
[1] "var" "<" "3"
unlist(strsplit("var>2", "(?=(\\b[^a-zA-Z0-9])|(\\b[a-zA-Z0-9]\\b))", perl = TRUE))
[1] "var" ">" "2"
Explanation:
Split at a word boundary that is followed either by a non-alphanumeric character (\\b[^a-zA-Z0-9]) or by a single alphanumeric character that is itself followed by another word boundary (\\b[a-zA-Z0-9]\\b).
EDIT:
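Actually, the code above gives unexpected results if the number at the end is 10 or more: the second alternative only matches a single alphanumeric character sitting between two word boundaries, so no split happens before a multi-digit number.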
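To make the problem concrete, here is a small sketch that runs the lookahead pattern from the first version on a one-digit and a two-digit operand. The variable names (pat_ahead, one_digit, two_digit) are only for illustration and are not part of the original answer.

```r
# Lookahead pattern from the first version of the answer
pat_ahead <- "(?=(\\b[^a-zA-Z0-9])|(\\b[a-zA-Z0-9]\\b))"

# One-digit operand: the string is split into three pieces
one_digit <- strsplit("var==5", pat_ahead, perl = TRUE)[[1]]
print(one_digit)
# [1] "var" "==" "5"

# Two-digit operand: no position between "==" and "10" satisfies the
# lookahead, so the operator and the number stay glued together
two_digit <- strsplit("var==10", pat_ahead, perl = TRUE)[[1]]
print(two_digit)
# [1] "var"  "==10"
```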
Another option is to use a lookbehind and split wherever the preceding character is either a non-alphanumeric character followed by a word boundary or an alphanumeric character followed by a word boundary:
strsplit("var<20", "(?<=(([^a-zA-Z0-9]\\b)|([a-zA-Z0-9]\\b)))", perl = TRUE)[[1]]
#[1] "var" "<" "20"
strsplit("var==20", "(?<=(([^a-zA-Z0-9]\\b)|([a-zA-Z0-9]\\b)))", perl = TRUE)[[1]]
#[1] "var" "==" "20"
strsplit("var!=5", "(?<=(([^a-zA-Z0-9]\\b)|([a-zA-Z0-9]\\b)))", perl = TRUE)[[1]]
#[1] "var" "!=" "5"
EDIT2:
Totally stealing @Tensibai's way of defining alphanumeric (plus underscore) and non-alphanumeric characters, the regex above can be simplified to: "(?<=((\\W\\b)|(\\w\\b)))"
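As a quick check, the simplified pattern can be wrapped in a small helper and applied to several expressions. This is only a sketch: split_comparison is a hypothetical name, and it assumes the input has no spaces around the operator (with spaces, they would end up inside the operator token).

```r
# Hypothetical helper (not from the original answer): split a comparison
# such as "var==5" into variable, operator and value
split_comparison <- function(x) {
  strsplit(x, "(?<=((\\W\\b)|(\\w\\b)))", perl = TRUE)[[1]]
}

exprs <- c("var==5", "var<20", "var!=5", "var>=100")
result <- lapply(exprs, split_comparison)
print(result[[2]])
# [1] "var" "<"   "20"
print(result[[4]])
# [1] "var" ">="  "100"
```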
>, < and ==? sub("(.*?)([=<>].)(.*)", "\\2", "var==55", perl = TRUE) or something similar. You can also use it for splitting: strsplit(sub("(.*?)([=<>].)(.*)", "\\1 \\2 \\3", "var==55", perl = TRUE), " "), but Wiktor's solution is probably better.
lapply(unlist(lapply(c("var<3", "var==5", "var>2"), function(e) parse(text = e))), sapply, deparse)
[1] "var" ">5" "5" when I use "var>55" as x in sub().