Basically, I want to split a multiline string at the regular expression that, in Perl, would be specified with \n(?=[^\W\d]\w*=).

Note that the Perl regexp features a "zero-width lookahead match" ((?=...)), meaning that the substrings that match this part of the regexp will not be "consumed"; they will appear in their corresponding entries in the resulting list.

For example, the string whose printed representation is

A=1
B_C_D=
foo
bar
baz
 =whatever=
X_Y_Z=quux

would be split into the following three-element list:

("A=1" "B_C_D=\nfoo\nbar\nbaz\n =whatever=" "X_Y_Z=quux\n") 

Even though Elisp regexps do not support lookahead/lookbehind expressions, this sort of splitting problem (in which part or all of each matching substring is not consumed) is common enough that I hope there are some other standard ways to solve it in Elisp.

Moreover, the ultimate goal here is to split the output of the shell command

/usr/bin/zsh -ilc 'printenv' 2>/dev/null 

so that each element of the result would correspond to one "setting" of an environment variable [1]. So I also figure that Elisp may have functions that facilitate the handling of this special case [2].


[1] The desired result is conceptually similar to the value of the variable process-environment, but in general it will not be identical to it. Also, the side effects associated with setting and updating process-environment do not enter into my problem.

[2] Unfortunately, it appears that process-environment is initialized in C code, so whatever functions do that are not available in Elisp.

Comment: If you are happy to patch the Emacs source, you could try this: github.com/gamesun/emacs-regex-lookaround. Note, however, that it is written against 24.2, so it may need modifying for more recent versions. (Commented Jan 18, 2016 at 18:50)

2 Answers

4

This might not be quite the answer you want, but use printenv -0 instead. This puts a NUL between each "line" of output, which in this case will be between each var=value pair:

TMUX_PANE=%0^@test=foo
bar
baz
^@SHLVL=2

Parsing that will be much easier.

Code:

(split-string (shell-command-to-string "printenv -0") "\0" t) 
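If you then want each variable's name and value separately, one possible follow-up is to split every record at its first "=". The sketch below is only an illustration: `my-parse-env-records` is a hypothetical helper name, not an Emacs function, and it takes the output as a string argument so it can be tried without running a shell.

```elisp
;; Hypothetical helper (not part of Emacs): turn NUL-separated
;; "NAME=value" records into an alist of (NAME . VALUE) pairs.
(defun my-parse-env-records (output)
  "Parse OUTPUT, a string of NUL-separated NAME=value records."
  (let (result)
    (dolist (record (split-string output "\0" t))
      ;; Split at the first "=" only; the value itself may contain
      ;; further "=" characters or newlines.
      (let ((pos (string-match "=" record)))
        (when pos
          (push (cons (substring record 0 pos)
                      (substring record (1+ pos)))
                result))))
    ;; `push' built the list back to front, so restore input order.
    (nreverse result)))

;; Example with literal input instead of a live shell call:
(my-parse-env-records "TMUX_PANE=%0\0test=foo\nbar\nbaz\n\0SHLVL=2\0")
;; => (("TMUX_PANE" . "%0") ("test" . "foo\nbar\nbaz\n") ("SHLVL" . "2"))
```

With real data you would pass it `(shell-command-to-string "printenv -0")`.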
1

I think db48x has the right answer for your underlying problem (without the -0 argument, printenv's output is ambiguous in case one of your env vars contains ...\nfoo=bar).

But as for how to do it in Elisp, I'd do it as follows:

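(defun parse-foo (str)
  ;; Split STR before each line that starts with "NAME=".
  (with-temp-buffer
    (insert str)
    (goto-char (point-min))
    (let ((start (point))
          (strs ()))
      ;; [[:alpha:]_][[:alnum:]_]* mirrors Perl's [^\W\d]\w*; a plain
      ;; \\w+ would not match "_" under the default syntax table.
      (while (re-search-forward "\n[[:alpha:]_][[:alnum:]_]*=" nil 'move)
        (push (buffer-substring start (match-beginning 0)) strs)
        ;; Skip only the newline, so "NAME=" stays in the next piece.
        (setq start (1+ (match-beginning 0))))
      ;; Add the final piece and restore the original order.
      (nreverse (cons (buffer-substring start (point)) strs)))))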
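The same idea (match the whole separator, but resume just past the newline so that the "lookahead" part is kept) can also be applied to the string directly with `string-match` and a start offset, without a temporary buffer. This is only a sketch: `my-split-at-assignments` is a hypothetical name, and the character class is my assumption about what counts as a variable name, chosen to mirror `[^\W\d]\w*` from the question.

```elisp
;; Hypothetical helper: split STR before each line that begins "NAME=".
(defun my-split-at-assignments (str)
  "Split STR at each newline that is followed by an identifier and \"=\"."
  (let ((regexp "\n[[:alpha:]_][[:alnum:]_]*=")
        (start 0)
        (pieces ()))
    (while (string-match regexp str start)
      ;; Everything up to the newline belongs to the current piece.
      (push (substring str start (match-beginning 0)) pieces)
      ;; Consume only the newline; the matched "NAME=" is left in
      ;; place and becomes the start of the next piece.
      (setq start (1+ (match-beginning 0))))
    ;; Add the remainder and restore the original order.
    (nreverse (cons (substring str start) pieces))))

(my-split-at-assignments
 "A=1\nB_C_D=\nfoo\nbar\nbaz\n =whatever=\nX_Y_Z=quux\n")
;; => ("A=1" "B_C_D=\nfoo\nbar\nbaz\n =whatever=" "X_Y_Z=quux\n")
```

The line " =whatever=" is not treated as a boundary because the space after the newline cannot start an identifier.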
