I'm making a basic language (well, not exactly, but you'll see). I've implemented echo and exit commands, but I need help.

If I give it the string 'echo "hello bob"', I want it to split the string up and give me an array like [echo, hello bob]. Right now echo works, but only with ONE word: 'echo bob' outputs 'bob', while 'echo hi bob' outputs 'hi', and I always want it to do that. If I have a command foo, I want to be able to write 'foo "bar face" boo' and get [foo, bar face, boo]. So basically I want myArr.split(' '), except that anything between quotes stays together. How can I do this?

  • I believe the parser module can do this, although I've never used it. Commented May 31, 2012 at 22:06
  • Sorry, not the parser module but the shlex module; it provides a way to parse shell-like languages. Commented May 31, 2012 at 22:11

2 Answers

4

Here is a simple answer:

    >>> import shlex
    >>> shlex.split('echo "hello bob"')
    ['echo', 'hello bob']

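shlex is a standard-library module that helps with parsing shell-like languages. Its split function splits on whitespace but keeps quoted substrings together as a single token.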

The documentation can be found here (thank you, JIStone): http://docs.python.org/library/shlex.html
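To show how this could plug into the command loop from the question, here is a minimal sketch. The names `COMMANDS`, `echo`, and `run` are hypothetical and not from the original post, and `echo` is assumed to print only its first argument, as the question describes.

```python
import shlex

def echo(args):
    # Assumption: echo outputs only its first argument, as described in the question.
    return args[0] if args else ""

# Hypothetical command table mapping a command name to its handler.
COMMANDS = {"echo": echo}

def run(line):
    # shlex.split keeps quoted text together as one argument.
    parts = shlex.split(line)
    if not parts:
        return ""
    name, args = parts[0], parts[1:]
    handler = COMMANDS.get(name)
    if handler is None:
        return "unknown command: " + name
    return handler(args)

print(shlex.split('foo "bar face" boo'))
print(run('echo "hello bob"'))
print(run('echo hi bob'))
```

Quoting is what turns several words into a single argument here, so 'echo "hello bob"' passes one argument while 'echo hi bob' passes two.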


1

Here is a simple tokenizer:

    import re

    # Each callback receives the scanner and the matched text and returns the token.
    def s_ident(scanner, token): return token
    def s_operator(scanner, token): return "op%s" % token
    def s_float(scanner, token): return float(token)
    def s_int(scanner, token): return int(token)

    # Patterns are tried in order, so the float rule must come before the int rule.
    scanner = re.Scanner([
        (r"[a-zA-Z_]\w*", s_ident),
        (r"\d+\.\d*", s_float),
        (r"\d+", s_int),
        (r"=|\+|-|\*|/", s_operator),
        (r"\s+", None),  # whitespace is skipped
    ])

    print(scanner.scan("sum = 3*foo + 312.50 + bar"))

You will need a parser to actually use this lexed content.
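As a rough idea of what such a parser might look like, here is a small recursive-descent sketch for statements of the form `name = expression`. It is only an illustration under some assumptions: the tokens are tagged as `(kind, value)` tuples rather than the bare values used above, and `parse_assignment` and `env` are hypothetical names, not part of any library.

```python
import re

# Tag each token with its kind so the parser can tell identifiers from operators.
scanner = re.Scanner([
    (r"[a-zA-Z_]\w*", lambda s, t: ("ident", t)),
    (r"\d+\.\d*", lambda s, t: ("num", float(t))),
    (r"\d+", lambda s, t: ("num", int(t))),
    (r"=|\+|-|\*|/", lambda s, t: ("op", t)),
    (r"\s+", None),
])

def parse_assignment(text, env):
    """Evaluate 'name = expr', reading and storing variables in env."""
    tokens, rest = scanner.scan(text)
    if rest:
        raise ValueError("unrecognised input: %r" % rest)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        # A factor is a number or a variable looked up in env.
        kind, value = take()
        if kind == "num":
            return value
        if kind == "ident":
            return env[value]
        raise ValueError("unexpected token: %r" % (value,))

    def term():
        # * and / bind tighter than + and -.
        result = factor()
        while peek() in (("op", "*"), ("op", "/")):
            _, op = take()
            rhs = factor()
            result = result * rhs if op == "*" else result / rhs
        return result

    def expr():
        result = term()
        while peek() in (("op", "+"), ("op", "-")):
            _, op = take()
            rhs = term()
            result = result + rhs if op == "+" else result - rhs
        return result

    kind, name = take()
    if kind != "ident" or take() != ("op", "="):
        raise ValueError("expected 'name = expression'")
    value = expr()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    env[name] = value
    return value

env = {"foo": 2, "bar": 1}
print(parse_assignment("sum = 3*foo + 312.50 + bar", env))
```

Tagging tokens avoids the ambiguity of telling an operator string such as "op+" apart from an identifier that happens to start with "op".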
