I'm using helm-swoop to search within the current buffer using regular expressions. My goal is to find lines containing a whitespace character followed immediately by the digit 4.

Consider a buffer containing the line:

I've been here 4 years 

When I run M-x helm-swoop and enter the pattern \s4 in the prompt, it fails to find any matches, even though \s is a standard regex shorthand for whitespace.

However, if I use the POSIX bracket expression for whitespace and search for [[:space:]]4 instead, helm-swoop correctly matches the line.

I have confirmed that fuzzy matching is not interfering (e.g., by setting (setq-local helm-swoop-use-fuzzy-match nil) before searching). I have also ensured helm-swoop starts with an empty prompt using (setq helm-swoop-pre-input-function (lambda () nil)).

Why does the standard \s shorthand not work as expected within the helm-swoop regex search, while the more explicit [[:space:]] POSIX character class does work? Is this expected behavior, related to how Helm or Swoop handles regex input, or potentially a configuration issue?

  • I’ve never used helm-swoop and have no idea what it is, but try entering \\s4 instead. Either way, you should consult its documentation for details. Commented Mar 30 at 7:30
  • No, that also doesn't work. Commented Mar 30 at 8:17
  • \s in Emacs regexp syntax is for matching character syntax. For whitespace, you can match using \s-. Commented Mar 31 at 9:05

1 Answer

\s is string syntax for the space character: it's an escape sequence, not a regexp backslash construct (in contrast to Perl regexps). So when you type a search string interactively, you should just hit the space bar when you want a space. Try with C-M-s. OTOH, in a string you can use "This\sis a space." but it's not buying you much: it's the same as typing "This is a space."
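
To see the escape-sequence reading of \s concretely, here is a small sketch you can evaluate with M-: or in the *scratch* buffer. It only uses the built-in functions string= and =; both forms should evaluate to t.

```elisp
;; In a string constant, "\s" is read as a plain space character,
;; so these two strings are identical.
(string= "This\sis a space." "This is a space.")

;; In a character constant, ?\s is the space character (code 32).
(= ?\s 32)
```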

See Basic Char Syntax in the Elisp Reference manual. In particular, this paragraph:

These sequences which start with backslash are also known as “escape sequences”, because backslash plays the role of an escape character; this has nothing to do with the character ESC. ‘\s’ is meant for use in character constants; in string constants, just write the space.


EDIT: As @rpluim points out in a comment, \sC for some character C is Emacs regexp syntax for matching by character syntax: e.g. \s- matches any character with whitespace syntax, \sw matches word constituents, \s( matches open parens, etc. Note that character syntax can be mode-dependent: each major mode has its own syntax table.

See Regexp Backslash in the Emacs Lisp Reference manual and the links therein. The relevant paragraph is:

‘\sCODE’ matches any character whose syntax is CODE. Here CODE is a character that represents a syntax code: thus, ‘w’ for word constituent, ‘-’ for whitespace, ‘(’ for open parenthesis, etc. To represent whitespace syntax, use either ‘-’ or a space character. See Syntax Class Table, for a list of syntax codes and the characters that stand for them.

Note that \s- (or \s followed by a space) would match any character whose syntax code is whitespace, not just spaces.

So you could type \s-4 or \s 4 at the C-M-s prompt (and probably in helm-swoop too, although I have not tried that out).
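
As a quick check outside of helm-swoop, the patterns can be tried from Lisp against the example line from the question. This is only a sketch: it pins the syntax table to the standard one with with-syntax-table (an assumption, since your buffer's major mode may classify characters differently), and it says nothing about how helm-swoop itself preprocesses its input. Note that inside a Lisp string the backslash has to be doubled, whereas at an interactive prompt you type a single backslash.

```elisp
(let ((line "I've been here 4 years"))
  ;; Use the standard syntax table so the result does not depend on
  ;; the major mode of whatever buffer this is evaluated in.
  (with-syntax-table (standard-syntax-table)
    (list (string-match-p "\\s-4" line)           ; whitespace syntax, then 4
          (string-match-p "\\s 4" line)           ; same, written with a space
          (string-match-p "[[:space:]]4" line)))) ; POSIX class from the question
```

Each call should return the same position, 14, the index of the space just before the 4.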
