Auto-meta-code-golf

Question

You are sick of all of the codegolf challenges. Hence you decide to write a program that will automatically golf some Python code for you. There are 3 test cases:

print quickSort([0,7,3,-1,8,10,57,2]) def quickSort(arr): less = [] pivotList = [] more = [] if len(arr) <= 1: return arr else: pivot = arr[0] for i in arr: if i < pivot: less.append(i) elif i > pivot: more.append(i) else: pivotList.append(i) less = quickSort(less) more = quickSort(more) return less + pivotList + more

for i in xrange(1, 101): if i % 15 == 0: print "FizzBuzz" elif i % 3 == 0: print "Fizz" elif i % 5 == 0: print "Buzz" else: print i

from sys import argv def randomGenerator(seed=1): max_int32 = (1 << 31) - 1 seed = seed & max_int32 while True: seed = (seed * 214013 + 2531011) & max_int32 yield seed >> 16 def deal(seed): nc = 52 cards = range(nc - 1, -1, -1) rnd = randomGenerator(seed) for i, r in zip(range(nc), rnd): j = (nc - 1) - r % (nc - i) cards[i], cards[j] = cards[j], cards[i] return cards def show(cards): l = ["A23456789TJQK"[c / 4] + "CDHS"[c % 4] for c in cards] for i in range(0, len(cards), 8): print " ", " ".join(l[i : i+8]) if __name__ == '__main__': seed = int(argv[1]) if len(argv) == 2 else 11982 print "Hand", seed deck = deal(seed) show(deck)

Rules:

Your program must not target the code I posted specifically, and should work with any Python 2 code. I reserve the right to change the source code being codegolfed. You may assume that there are no multi-line strings (so you don't have build a full-blown parser), and that locals() is not called.
The output of your program should run in an identical manner as the original source code. (Namely, it must produce the same output. Variable names and language constructs can be changed, as long as the output remains the same)
You may use STDIO or a File to do your input/output of the source code.

Your score will be the sum of the bytes of your program's output.

(The code listed above has been taken from http://rosettacode.org/ under the GNU Free Documentation License 1.2)

This seems very similar to codegolf.stackexchange.com/questions/3652/write-a-code-golfer/… — KSab
– KSab, Commented Jan 7, 2015 at 16:34
What's your model for determining whether the output "[runs] in an identical manner as the original source code"? E.g. for the second example, I believe that removing if __name__ == '__main__': would affect the behaviour in some contexts but not others. For another example, if the ungolfed input assumes that it reads an int from stdin and throws one type of exception if given something else, could the golfed input throw a different type of exception if given a non-integer? — Peter Taylor
– Peter Taylor, Commented Jan 7, 2015 at 17:00
What about a program like this: random_long_variable=0;print locals()? — Justin
– Justin, Commented Jan 7, 2015 at 20:49
Multiline strings aren't the only way to make empty lines functional — Sp3000
– Sp3000, Commented Jan 8, 2015 at 1:15

Logic Knight · Accepted Answer · 2015-01-09 06:37:14Z

Python 2.7, 794

I have been meaning to build a minifier for Python for a while, so this is a good opportunity to investigate the problem.

The program uses a mix of regular expression analysis, and Python parser operations. White space is minimised. Variable that are defined by the user are replaced by a single letter variable (which is not in use!). Finally the while True statement is put on a diet.

The three test cases all verify as running correctly. I could imagine some pathological examples which could result in errors in the generated code but the algorithm should be robust under most circumstances.

Results

228 t1.py 128 t2.py 438 t3.py 794 total

Output

def c(a): b=[] d=[] f=[] if len(a)<=1: return a else: e=a[0] for i in a: if i<e: b.append(i) elif i>e: f.append(i) else: d.append(i) b=c(b) f=c(f) return b+d+f print c([0,7,3,-1,8,10,57,2]) for i in xrange(1,101): if i%15==0: print"FizzBuzz" elif i%3==0: print"Fizz" elif i%5==0: print"Buzz" else: print i from sys import argv def a(k=1): b=(1<<31)-1 k=k&b while 1: k=(k*214013+2531011)&b yield k>>16 def d(k): f=52 h=range(f-1,-1,-1) g=a(k) for i,r in zip(range(f),g): j=(f-1)-r%(f-i) h[i],h[j]=h[j],h[i] return h def m(h): l=["A23456789TJQK"[c/4]+"CDHS"[c%4]for c in h] for i in range(0,len(h),8): print" "," ".join(l[i:i+8]) if __name__=='__main__': k=int(argv[1])if len(argv)==2 else 11982 print"Hand",k e=d(k) m(e)

Code

import sys import re from tokenize import generate_tokens from token import tok_name from keyword import iskeyword wr = sys.stdout.write def pyparse(text): 'Return [TYPE,TOKEN] pair list' # Use KEYWORD,NAME,NUMBER,OP,STRING,NL,NEWLINE,COMMENT,INDENT,DEDENT rawtokens = generate_tokens(text.readline) tokens = [[tok_name[n], t] for n,t,p1,p2,dx in rawtokens] for tpair in tokens: if tpair[0] == 'NAME' and iskeyword(tpair[1]): tpair[0] = 'KEYWORD' return tokens def finduservars(filename): 'Return a set of user variables that we can replace with a-zA-Z' varset = set() for line in open(filename): line = line.strip() match = re.match(r'def\s+(\w+)\s*\((.*)\)\s*:', line) if match: func, args = match.groups() varset.add(func) arglist = re.findall(r'(\w+|=)', args) for a in arglist: if a == '=': break # keyword args follow - too hard to parse varset.add(a) continue match = re.match(r'(\w+)\s*=.+', line) if match: assigned = match.group(1) varset.add(assigned) continue return set(v for v in list(varset) if len(v) > 1) filename = sys.argv[1] tokenlist = pyparse(open(filename)) # Build map for var->char conversion: varset = finduservars(filename) singles = [text for tok,text in tokenlist if tok=='NAME' and len(text)==1] allvar = [chr(n) for n in range(97,123)+range(65,91)] charvar = [c for c in allvar if c not in singles] varreplaced = list(varset)[:len(charvar)] varmap = dict((v, charvar.pop(0)) for v in varreplaced) prev = 'NONE' indent = [''] output = [] add = output.append for tok, text in tokenlist: if tok == 'NL': continue elif tok == 'INDENT': indent.append( text.replace(' ', ' ') ) output[-1] = indent[-1] elif tok == 'DEDENT': indent.pop(-1) output[-1] = indent[-1] elif tok == 'NEWLINE': add(text) add(indent[-1]) elif tok in 'KEYWORD,NAME,NUMBER': if prev in 'KEYWORD,NAME,NUMBER': add(' ') if tok == 'NAME': if output[-2] == 'while' and text == 'True': add('1') # common verbose idiom else: add(varmap.get(text, text)) else: add(text) else: add(text) prev = tok wr(''.join(output))

Community · Accepted Answer · 2020-06-17 09:04:33Z

sed, 1074 (down from 1390)

Very mild, low-hanging-fruit type of answer, to get the ball rolling:

/^$/d # Remove empty lines /^[ <--TAB-->]*#/d # Remove whole-line comments s/ /<--TAB-->/g # Replace 4 spaces with tabs /^[^'"]*$/s/ *([|&:,<>=*/%+-]) */\1/g # Remove spaces before/after operators

_{Replace <--TAB--> with real TAB characters}

Obvious shortcoming:

Indents assumed to be exactly 4 spaces in input code.

Since we can assume no multi-line strings, then we only strip leading/trailing spaces from operators if there are no ' or " on the given line. This could be improved, but _{<mumbles something about sed regex always being greedy>}.

Test as follows:

$ cat qs.py fizzbuzz.py cards.py | wc -c 1390 $ sed -rf pygolf.sed qs.py fizzbuzz.py cards.py | wc -c 1074 $ sed -rf pygolf.sed qs.py fizzbuzz.py cards.py | python [-1, 0, 2, 3, 7, 8, 10, 57] 1 2 Fizz ... 98 Fizz Buzz Hand 11982 AH AS 4H AC 2D 6S TS JS 3D 3H QS QC 8S 7H AD KS KD 6H 5S 4D 9H JH 9S 3C JC 5D 5C 8C 9D TD KH 7C 6C 2C TH QH 6D TC 4S 7S JD 7D 8H 9C 2H QD 4C 5H KC 8D 2S 3S $

You don't need to check for multi-line strings, but your last two definitely need to be updated. — Nathan Merrill
– Nathan Merrill, Commented Jan 7, 2015 at 20:41
@NathanMerrill yup. The operator leading/trailing space one is a bit better now, but the indent one will be much harder to generalize - and is where I get about 2/3 of the gain. — Digital Trauma
– Digital Trauma, Commented Jan 7, 2015 at 21:01

TheNumberOne · Accepted Answer · 2015-01-08 01:18:16Z

Python 3.4, 1134

This program should work alright for most programs. Strangely, Sp3000 test case is much easier to optimize for my program than your programs. Input is accepted through the file specified in the first argument. The actual file is modified.

import subprocess from sys import argv progamtext = open(argv[1]).read() if 'argv' in progamtext or 'input' in progamtext or 'open' in programtext:#Make sure the program always produces the same results. exit(0) program = subprocess.Popen(['C:\Python27\python', argv[1]], stdout=subprocess.PIPE, stderr=subprocess.PIPE) program.wait() erroroutput1 = str(program.stderr.read()) output1 = str(program.stdout.read()) program = subprocess.Popen(['C:\Python27\python', argv[1]], stdout=subprocess.PIPE, stderr=subprocess.PIPE) program.wait() erroroutput2 = str(program.stderr.read()) output2 = str(program.stdout.read()) if erroroutput1 != erroroutput2 or output1 != output2:#Make sure the program always produces the same results. exit(0) newprog = '' if erroroutput1: newprog += "import sys\n" + "sys.stderr.write("+ erroroutput1 + ')' if output1: newprog += "\n" if output1: newprog += 'print ' + output1 if len(newprog) > len(progamtext): exit(0) open(argv[1],mode='w').write(newprog)

How it works:

First, this program checks to see if your program interacts with the user at all or uses random. If it does, the program is unmodified. Next, the program is ran. The program is then replaced with print "output". Finally, if the program is shorter than its output, it is unmodified.

Sp3000's program, optimized:

import sys sys.stderr.write(b'') print b'0.540377721372\r\n3\r\n1\r\n7\r\n99\r\nf\r\n[5, 5]\r\n53\r\n53\r\n53\r\n'

Sp3000's super bonus program, optimized:

The optimized version is only off .001% of the time.

import sys sys.stderr.write(b'') print b'B\r\n'

I'm sure there are other external effects than argv, input and random, which your code would break. ;) — Martin Ender
– Martin Ender, Commented Jan 8, 2015 at 0:41
Hah. Maybe I should have put in some non-determinism - print id(0) is a good one. — Sp3000
– Sp3000, Commented Jan 8, 2015 at 0:50

Stack Exchange Network

Auto-meta-code-golf

3 Answers 3

Python 2.7, 794

Results

Output

Code

sed, 1074 (down from 1390)

Test as follows:

Python 3.4, 1134

How it works:

Linked

Hot Network Questions

Auto-meta-code-golf

3 Answers 3

Python 2.7, 794

Results

Output

Code

sed, 1074 (down from 1390)

Test as follows:

Python 3.4, 1134

How it works:

Linked

Related

Hot Network Questions