-1

How can I get a Python list from a text file with the following contents?

'hallo' 'hallo\n' '\x00' * 1 100 '400 + 2' 400 + 2 

For example:

ll = ["hallo", "hallo\n", "\x00", 100, 402, 402] 

with the types:

[string, string, string, int, int, int] 

That is, every item that Python evaluates to an int should end up with type int.

I tried to use eval but it has difficulties with \n and \x00.

The user input (list of strings to convert) is assumed to be safe.

5
  • Do you only want to convert strings and numbers, or do you plan to eval any Python object? Commented Apr 4, 2017 at 20:55
  • How would you possibly decide what string stays a string and what gets evaluated? I.e. why does '400 + 2' become an evaluated number, how do you decide that? – Anyway, you need to write a small parser for this to detect what you want to do with the input. Once you have that, there shouldn’t be a problem evaluating the input according to your decision. The question as it stands is kind of too broad for SO. Commented Apr 4, 2017 at 20:57
  • From the question as stated, I agree with poke: it is a little too broad, and you should define what you want to do with the input in each case. It looks like it's heading toward regular expressions in a loop/iterator at the moment. Is that list exhaustive in terms of inputs? Commented Apr 4, 2017 at 21:01
  • You keep changing your input to the point where it’s no longer clear what you are trying to do. Please actually clarify what you are trying to do; not just what the results from your examples are supposed to do, but what actually happens with the input. Commented Apr 4, 2017 at 21:50
  • Sorry for specializing the question multiple times. It should now be fine. Commented Apr 4, 2017 at 21:56

3 Answers

3

WARNING: Using eval is dangerous. Be very careful with it or, better yet, find an alternative that avoids it.

That being said, you could define a regex to check if the string looks like something you'd like to eval. For example, anything with only numbers, spaces and mathematical operators could be deemed safe:

    import re

    l = ['hallo', 'hallo\n', '\x00' * 1, '100', 100, '400 + 2', '400 + - ', 400 + 2]

    def string_or_expression(something):
        if isinstance(something, str):
            # Raw string, so the regex escapes are not treated as string escapes
            expression = re.compile(r'\A[\d\.\-\+\*\/ ]+\Z')
            if expression.match(something):
                try:
                    return eval(something)
                except Exception:
                    # Looks like math but is not valid, e.g. '400 + - '
                    return something
        return something

    print([string_or_expression(s) for s in l])
    # ['hallo', 'hallo\n', '\x00', 100, 100, 402, '400 + - ', 402]

With Python 3, you might use ast.literal_eval, which is less dangerous than a plain eval:

    import ast
    import re

    l = ['hallo', 'hallo\n', '\x00' * 1, '100', 100, '400 + 2', '400 + - ', 400 + 2]

    def string_or_expression(something):
        if isinstance(something, str):
            # Raw string, so the regex escapes are not treated as string escapes
            expression = re.compile(r'\A[\d\.\-\+\*\/ ]+\Z')
            if expression.match(something):
                try:
                    return ast.literal_eval(something)
                except Exception:
                    return something
        return something

    print([string_or_expression(s) for s in l])
    # ['hallo', 'hallo\n', '\x00', 100, 100, 402, '400 + - ', 402]
    # Note: since Python 3.7, literal_eval rejects '400 + 2' with a ValueError,
    # so on newer versions that item stays the string '400 + 2'.

Yet another alternative would be to use @poke's "expression evaluation algorithm", since literal_eval doesn't understand '2 * 3'.

Finally, even a "safe" expression like '2**2**2**2**2**2**2**2**2**2' could bring your server down.
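
As a rough illustration of the "own expression evaluator" idea (a minimal sketch, not necessarily what @poke's answer does), one can parse the string with the ast module and walk the tree, allowing only numeric literals and a whitelist of operators. The names safe_arithmetic and string_or_number are made up for this example, and it assumes Python 3.8+ (where numeric literals parse to ast.Constant). The ** operator is deliberately left out because of the blow-up mentioned above.

```python
import ast
import operator

# Whitelisted operators; ** is deliberately omitted so that inputs such as
# '2**2**2**2' cannot trigger huge computations.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_arithmetic(text):
    """Evaluate +, -, *, / on numeric literals; raise ValueError otherwise."""
    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        # type(...) check instead of isinstance so that True/False are rejected
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](visit(node.operand))
        raise ValueError("unsupported expression: %r" % text)

    try:
        tree = ast.parse(text, mode="eval")
    except (SyntaxError, ValueError) as exc:
        # Invalid syntax, or a null byte in the source
        raise ValueError("not an expression: %r" % text) from exc
    return visit(tree)


def string_or_number(item):
    """Return the evaluated number if item is simple arithmetic, else item."""
    if not isinstance(item, str):
        return item
    try:
        return safe_arithmetic(item)
    except (ValueError, ZeroDivisionError):
        return item


items = ['hallo', 'hallo\n', '\x00', '100', 100, '400 + 2', '400 + - ', 400 + 2]
print([string_or_number(i) for i in items])
# ['hallo', 'hallo\n', '\x00', 100, 100, 402, '400 + - ', 402]
```

No eval is involved, so anything outside the whitelist (names, calls, attribute access, strings) simply stays a string. Very large multiplications can still be slow, so limits on input length may be worth adding.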


14 Comments

You don't actually need to evaluate the math by the time you've written that regex. You could declare that an expression that "looks like math" is probably a number. This kind of approximation is much safer than using eval, and well worth the tradeoff IMO.
@kojiro: That would be nice, but how would you convert '400 + 2' to 402 then?
Ah, nice, OP changed the question on me. Or perhaps I misunderstood it from the get-go. I thought OP was asking for an interpretation of the types.
Adding a comment “Be very careful here!!!” does not really do anything helpful. Once you use eval on user input, you have already lost and cannot be careful anymore.
@poke: True. Still, I left the comment so that if OP plays with the solution and copy-pastes it somewhere else, that there's a clear reminder of a very dangerous method call. I'd be interested to know what the most dangerous expression could be with just digits and basic operators.
0

How about:

    def try_eval(x):
        try:
            res = eval(x)
        except Exception:
            # Not evaluable (or not a string at all): keep the item unchanged
            res = x
        return res

    [try_eval(x) for x in l]

output:

['hallo', 'hallo\n', '\x00', 100, 402] 

4 Comments

eval will not tolerate \x00 so the result will be: ['hallo', 'hallo\n', '\x00 * 1', 100, 402]
Same comment as for user1753919 ['hallo', 'impOrt shutil; shutil.rmtree("/home/user")', '400 + 2'] There's a typo to make it "safer". You get the idea.
@mr.wolle I did copy paste... the result is actually \x00 and not \x00*1. You can check it...
You shouldn't call eval without some kind of check.
0

Let's get serious about avoiding dangerous eval here >:)

    # Python 2 only: the compiler module was removed in Python 3.
    import compiler

    def is_math(expr):
        """Return True if the expression smells mathematical."""
        try:
            module = compiler.parse(expr)
            stmt, = module.getChildNodes()
            discard, = stmt.getChildNodes()
            code, = discard.getChildNodes()
            return not isinstance(code, compiler.ast.Name)
        except ValueError:
            return False
        except TypeError:
            return False

    t = [eval(s) if is_math(s) else s for s in l]

Yes, I made a couple of assumptions here, but you can tighten them as strictly as you need. The AST is pretty easy to understand. When you do a parse, you get a Module. Inside the Module is a Statement. Inside that is (most likely) discard code (that just means the result of the expression isn't being used anywhere).

If it isn't discard code, we assume it's a string. For one thing, this is likely to prevent any dangerous side effects from eval. (Someone prove me wrong here – wrap a dangerous expression in discard code.)

Inside that is the meat of your expression – from there I assume that anything that is a plain string will appear to be a name in the AST. Anything that isn't a name is probably a number or a math operation.

I think eval should be safe at this point, which is necessary if the expression is really math.
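
For readers on Python 3, where the compiler module no longer exists, the same detection idea might look roughly like the sketch below. This is an assumption-laden port, not the code above: looks_like_math is a made-up name, it targets Python 3.8+, and instead of only rejecting bare names it accepts nothing but numeric literals and +, -, *, / anywhere in the tree, which also covers parenthesised expressions.

```python
import ast

# Node types allowed in something that "smells mathematical".
_ALLOWED = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.UAdd, ast.USub,
)


def looks_like_math(expr):
    """Return True if expr parses to arithmetic on numeric literals only."""
    try:
        tree = ast.parse(expr, mode="eval")
    except (SyntaxError, ValueError, TypeError):
        # Invalid syntax, null bytes, or a non-string input
        return False
    for node in ast.walk(tree):
        if not isinstance(node, _ALLOWED):
            return False
        # Constants also cover strings, None, True, ...; keep numbers only
        if isinstance(node, ast.Constant) and type(node.value) not in (int, float):
            return False
    return True
```

With this check, '400 + 2' and '100' count as math, while 'hallo', a quoted string literal, or a call wrapped in parentheses do not.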

4 Comments

OP needs more information than just number or string. Still, your method is interesting.
Updated, somewhat hesitantly. Given my new understanding of the question I don't think eval is avoidable. (Even if I unparsed the AST and executed it, the result would be essentially eval.)
You could write your own expression evaluator, like I’ve outlined in one of my answers. Then you’re completely safe on the evaluating side of the question (I still consider the detection part rather difficult/unclear).
Seems one can defeat this rather easily using parentheses. I will have to try harder…
