
With os.path.expandvars I can expand environment variables in a string, but with the caveat: "Malformed variable names and references to non-existing variables are left unchanged" (emphasis mine). And besides, os.path.expandvars expands escaped \$ too.

I would like to expand the variables in a bash-like fashion, at least in these two points. Compare:

    import os
    import os.path

    os.environ['MyVar'] = 'my_var'
    if 'unknown' in os.environ:
        del os.environ['unknown']
    print(os.path.expandvars("$MyVar$unknown\$MyVar"))

which gives my_var$unknown\my_var with:

    unset unknown
    MyVar=my_var
    echo $MyVar$unknown\$MyVar

which gives my_var$MyVar, and this is what I want.

8 Answers 8

6

The following implementation maintains full compatibility with os.path.expandvars, yet allows greater flexibility through optional parameters:

    import os
    import re

    def expandvars(path, default=None, skip_escaped=False):
        """Expand environment variables of form $var and ${var}.
           If parameter 'skip_escaped' is True, all escaped variable references
           (i.e. preceded by backslashes) are skipped.
           Unknown variables are set to 'default'. If 'default' is None,
           they are left unchanged.
        """
        def replace_var(m):
            return os.environ.get(m.group(2) or m.group(1), m.group(0) if default is None else default)
        reVar = (r'(?<!\\)' if skip_escaped else '') + r'\$(\w+|\{([^}]*)\})'
        return re.sub(reVar, replace_var, path)

Below are some invocation examples:

    >>> expandvars("$SHELL$unknown\$SHELL")
    '/bin/bash$unknown\\/bin/bash'
    >>> expandvars("$SHELL$unknown\$SHELL", '')
    '/bin/bash\\/bin/bash'
    >>> expandvars("$SHELL$unknown\$SHELL", '', True)
    '/bin/bash\\$SHELL'

1 Comment

I gave this a test, and it's not 100% equivalent to what you would get from a bash expansion. In bash, "\$IGNORE" would consume the backslash and return "$IGNORE". Whereas this python implementation would leave it as "\$IGNORE". Just pointing this out in case someone is looking for 1-to-1 bash expansion
5

Try this:

re.sub('\$[A-Za-z_][A-Za-z0-9_]*', '', os.path.expandvars(path)) 

The regular expression should match any valid variable name, as per this answer, and every match will be substituted with the empty string.

Edit: if you don't want to replace escaped vars (i.e. \$VAR), use a negative lookbehind assertion in the regex:

re.sub(r'(?<!\\)\$[A-Za-z_][A-Za-z0-9_]*', '', os.path.expandvars(path)) 

(which says the match should not be preceded by \).
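To see what the lookbehind does in isolation, here is a minimal sketch. The helper name strip_unescaped_vars is made up for illustration, and it only removes references rather than expanding them:

```python
import re

# Same pattern as above: a dollar sign NOT preceded by a backslash,
# followed by a valid variable name.
UNESCAPED_VAR = re.compile(r'(?<!\\)\$[A-Za-z_][A-Za-z0-9_]*')

def strip_unescaped_vars(text):
    """Remove every unescaped $NAME reference from text (hypothetical helper)."""
    return UNESCAPED_VAR.sub('', text)

print(strip_unescaped_vars(r'a$FOO b\$BAR'))   # a b\$BAR
print(strip_unescaped_vars(r'\\$FOO'))         # \\$FOO (left alone)
```

The second call shows a limitation: the lookbehind only inspects one character, so a dollar sign after an escaped backslash (\\$FOO) is also treated as escaped.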

Edit 2: let's make this a function:

    def expandvars2(path):
        return re.sub(r'(?<!\\)\$[A-Za-z_][A-Za-z0-9_]*', '', os.path.expandvars(path))

check the result:

    >>> print(expandvars2('$TERM$FOO\$BAR'))
    xterm-256color\$BAR

the variable $TERM gets expanded to its value, the nonexisting variable $FOO is expanded to the empty string, and \$BAR is not touched.

9 Comments

I was going to say that this would expand escaped \$ too, which I don't want, but then i realized that os.path.expandvars does the same. Maybe I have to modify the question...
I do not follow. expandvars already replaces $VAR with its value if VAR is an existing env var. the result can only contain more occurrences of $VAR if VAR is not an env variable, which you said you wanted to replace with the empty string, like bash does.
I've modified and expanded the question now, hopefully it's clearer. Your solution would remove $unknown from the result, but leave \my_var where I want $MyVar (unexpanded, because the dollar was escaped).
But now my problem is that os.path.expandvars already expands the \$MyVar I don't want expanded! (And the regexp wouldn't work if it were the backslash that's escaped, as in \\$unknown.)
which python version are you using? on my machine print(os.path.expandvars("$MyVar$unknown\$MyVar")) prints my_var$unknown\$MyVar (check my updated answer)
3

There is a pip package called expandvars which does exactly that.

pip3 install expandvars 
    from expandvars import expandvars

    print(expandvars("$PATH:${HOME:?}/bin:${SOME_UNDEFINED_PATH:-/default/path}"))
    # /bin:/sbin:/usr/bin:/usr/sbin:/home/you/bin:/default/path

It has the benefit of implementing default value syntax (i.e., ${VARNAME:-default}).
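For readers unfamiliar with the ${VARNAME:-default} form, here is a rough sketch of its semantics in plain Python. This is not how the expandvars package is implemented; the function name and regex are my own, and only ${NAME} and ${NAME:-default} are handled:

```python
import re

# ${NAME} or ${NAME:-default}; other bash parameter-expansion forms are ignored.
_BRACED = re.compile(r'\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}')

def expand_with_defaults(text, environ):
    """Expand braced references, using the default when unset or empty."""
    def replace(match):
        name, default = match.groups()
        value = environ.get(name)
        if value:                 # variable is set and non-empty
            return value
        return default or ''      # fall back to the default, else empty string
    return _BRACED.sub(replace, text)

print(expand_with_defaults('${SOME_UNDEFINED_PATH:-/default/path}/bin', {}))
# /default/path/bin
```

As in bash, the :- form falls back to the default both when the variable is unset and when it is set to the empty string.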

2

The alternative solution - as pointed out by @HuStmpHrrr - is that you let bash evaluate your string, so that you don't have to replicate all the wanted bash functionality in python.

Not as efficient as the other solution I gave, but it is very simple, which is also a nice feature :)

    >>> from subprocess import check_output
    >>> s = '$TERM$FOO\$TERM'
    >>> check_output(["bash","-c","echo \"{}\"".format(s)])
    b'xterm-256color$TERM\n'

P.S. beware of escaping of " and \: you may want to replace \ with \\ and " with \" in s before calling check_output

1 Comment

I think this is way better. Regular expressions are hard to maintain and debug. I doubt anybody would really feel the inefficiency of spawning a process calling the shell etc. How many times one would be calling (with different s!!) this? If you're calling with same s over and over, caching is the solution, not regular expressions.
2

Here's a solution that uses the original expandvars logic: Temporarily replace os.environ with a proxy object that makes unknown variables empty strings. Note that a defaultdict wouldn't work because os.environ

For your escape issue, you can replace r'\$' with some value that is guaranteed not to be in the string and will not be expanded, then replace it back.

    import os

    class EnvironProxy(object):
        """Context manager that swaps os.environ for a proxy returning '' for unknown keys."""
        __slots__ = ('_original_environ',)

        def __init__(self):
            self._original_environ = os.environ

        def __enter__(self):
            self._original_environ = os.environ
            os.environ = self
            return self

        def __exit__(self, exc_type, exc_val, exc_tb):
            # Always restore the real environment, even on error.
            os.environ = self._original_environ

        def __getitem__(self, item):
            try:
                return self._original_environ[item]
            except KeyError:
                return ''

    def expandvars(path):
        replacer = '\0'  # NUL shouldn't be in a file path anyways.
        while replacer in path:
            replacer *= 2
        # Hide escaped dollars from os.path.expandvars, then put them back.
        path = path.replace('\\$', replacer)

        with EnvironProxy():
            return os.path.expandvars(path).replace(replacer, '$')

1

I was unhappy with the various answers, needing a little more sophistication to handle more edge cases such as arbitrary numbers of backslashes and ${} style variables, but not wanting to pay the cost of a bash eval. Here is my regex based solution:

    #!/usr/bin/env python3

    import re
    import os

    def expandvars(data, environ=os.environ):
        out = ""
        regex = r'''
                 ( (?:.*?(?<!\\))                   # Match non-variable ending in non-slash
                   (?:\\\\)* )                      # Match 0 or even number of backslash
                 (?:$|\$ (?: (\w+)|\{(\w+)\} ) )    # Match variable or END
            '''

        for m in re.finditer(regex, data, re.VERBOSE|re.DOTALL):
            # Drop one level of backslash escaping from the literal part.
            this = re.sub(r'\\(.)', lambda x: x.group(1), m.group(1))
            v = m.group(2) if m.group(2) else m.group(3)
            if v and v in environ:
                this += environ[v]
            out += this
        return out


    # Replace with os.environ as desired
    envars = { "foo":"bar", "baz":"$Baz" }

    tests = {
        r"foo": r"foo",
        r"$foo": r"bar",
        r"$$": r"$$",                 # This could be considered a bug
        r"$$foo": r"$bar",            # This could be considered a bug
        r"\n$foo\r": r"nbarr",        # This could be considered a bug
        r"$bar": r"",
        r"$baz": r"$Baz",
        r"bar$foo": r"barbar",
        r"$foo$foo": r"barbar",
        r"$foobar": r"",
        r"$foo bar": r"bar bar",
        r"$foo-Bar": r"bar-Bar",
        r"$foo_Bar": r"",
        r"${foo}bar": r"barbar",
        r"baz${foo}bar": r"bazbarbar",
        r"foo\$baz": r"foo$baz",
        r"foo\\$baz": r"foo\$Baz",
        r"\$baz": r"$baz",
        r"\\$foo": r"\bar",
        r"\\\$foo": r"\$foo",
        r"\\\\$foo": r"\\bar",
        r"\\\\\$foo": r"\\$foo",
    }

    # Print only the cases where the result differs from the expectation.
    for t, v in tests.items():
        g = expandvars(t, envars)
        if v != g:
            print("%s -> '%s' != '%s'" % (t, g, v))
            print("\n\n")

2 Comments

Unfortunately, a proper treatment requires not only dealing with escaped dollar signs, but also quoting issues. For instance, the syntax $var1" "$var2' '$var3'$var4' should expand such that the quotes disappear. And $var4 should be left verbatim.
What we want is an interface to the wordexp POSIX C library function, more or less.
0

I have run across the same issue, but I would propose a different and very simple approach.

If we look at the basic meaning of "escape character" (as they started in printer devices), the purpose is to tell the device "do something different with whatever comes next". It is a sort of clutch. In our particular case, the only problem we have is when we have the two characters '\' and '$' in a sequence.

Unfortunately, we do not have control of the standard os.path.expandvars, so that the string is passed lock, stock and barrel. What we can do, however, is to fool the function so that it fails to recognize the '$' in that case! The best way is to replace the $ with some arbitrary "entity" and then to transform it back.

    import os

    def expandvars(value):
        r"""
        Expand the env variables in a string, respecting the escape sequence \$
        """
        DOLLAR = r"\&#36;"
        # Hide each escaped dollar behind the placeholder, expand, then restore it.
        escaped = value.replace(r"\$", r"\%s" % DOLLAR)
        return os.path.expandvars(escaped).replace(DOLLAR, "$")

I used the HTML entity, but any reasonably improbable sequence would do (a random sequence might be even better). We might imagine cases where this method would have an unwanted side effect, but they should be so unlikely as to be negligible.
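The "random sequence" idea could look roughly like the sketch below. The names make_placeholder and expandvars_random are hypothetical. The token is wrapped in NUL characters so it cannot be read as part of a preceding variable name, and, unlike the function above, this version drops the backslash when restoring the dollar sign (as bash would):

```python
import os
import uuid

def make_placeholder(value):
    """Return a random token that does not occur in value (hypothetical helper)."""
    token = '\0' + uuid.uuid4().hex + '\0'
    while token in value:              # regenerate on the (very unlikely) collision
        token = '\0' + uuid.uuid4().hex + '\0'
    return token

def expandvars_random(value):
    """Expand variables but turn every escaped \\$ into a literal $."""
    placeholder = make_placeholder(value)
    hidden = value.replace('\\$', placeholder)   # hide escaped dollars
    return os.path.expandvars(hidden).replace(placeholder, '$')

os.environ['MyVar'] = 'my_var'
print(expandvars_random(r'$MyVar\$MyVar'))   # my_var$MyVar
```

It still shares the weakness of any placeholder approach: a backslash that is itself escaped (\\$MyVar) is treated as escaping the dollar sign.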

3 Comments

To ensure no collision, you could do something like: if DOLLAR in value: DOLLAR = DOLLAR + '\0' * len(value).
Thanks a lot! Now that you point this out, this seems obvious. My only question is the rationale behind len(value). Would it be preferable to another method, e.g. looping the test (if DOLLAR in value) and progressively adding fillers?
A problem with this approach (and probably with some of the others) is it will consider a backslash that is itself escaped; for example, "\\$VARIABLE" should still expand $VARIABLE.
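One way around the escaped-backslash problem is to scan the string once, left to right, so that \\ is consumed as a unit before the following $ is looked at. A minimal sketch, under my own assumptions: only \\ and \$ count as escapes (other backslashes are kept), unknown variables expand to the empty string, and quoting is not handled at all. The name expandvars_bashlike is made up:

```python
import os
import re

# One alternative per token: escaped backslash or dollar, ${NAME}, or $NAME.
_TOKEN = re.compile(
    r'\\([\\$])'
    r'|\$\{([A-Za-z_][A-Za-z0-9_]*)\}'
    r'|\$([A-Za-z_][A-Za-z0-9_]*)'
)

def expandvars_bashlike(text, environ=None):
    """Single-pass expansion; unknown variables become ''."""
    env = os.environ if environ is None else environ

    def replace(match):
        escaped, braced, bare = match.groups()
        if escaped is not None:
            return escaped                 # drop the backslash, keep the character
        return env.get(braced or bare, '')

    return _TOKEN.sub(replace, text)

print(expandvars_bashlike(r'$MyVar$unknown\$MyVar', {'MyVar': 'my_var'}))
# my_var$MyVar
print(expandvars_bashlike(r'\\$MyVar', {'MyVar': 'my_var'}))
# \my_var
```

Because re.sub never revisits consumed characters, an even number of backslashes leaves the variable live and an odd number escapes it.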
0

You can try this, inspired by Seth Robertson's answer:

    import os
    import re
    from typing import Optional

    def expand_vars(input_str: str, env_dict: Optional[dict[str, str]] = None) -> str:
        """
        Method to expand variables in given string data.

        Args:
            input_str (str): String data in which variables need to be expanded.
            env_dict (dict, optional): Environment variables dict. Defaults to os.environ.

        Returns:
            str: Expanded string data.
        """
        if env_dict is None:
            env_dict = dict(os.environ)

        def replacer(match: re.Match) -> str:
            prefix = match.group(1)
            var_name = match.group(2) or match.group(3)
            if var_name in env_dict:
                return prefix + env_dict[var_name]
            # Unknown variable: put the original reference back unchanged.
            return prefix + (f"${{{var_name}}}" if match.group(3) else f"${var_name}")

        pattern = r"(.*?)(?:\$(\w+)|\$\{(\w+)\})"

        # Expand variables until no more changes occur
        previous_str = None
        while previous_str != input_str:
            previous_str = input_str
            input_str = re.sub(pattern, replacer, input_str)

        return input_str
