AI In 2048 Game, Calculations so slow

Question

Inspired by this answer on stackoverflow for AI implementation in 2048 game.

I have been trying to make AI calculate best move by playing the game given specific times in memory preceding by one specific move first, then averaging the score and declaring move with highest score as best move.

The workflow is fine, i am getting results good and as described in the answer, the chances of win in this game at 100 runs per move is about 80%.

The problem is: My implementation in python is really slow. If somebody please could suggest me improvements in my program to increase speed, that'd be so helpful.

Here's extract of main items:

from random import choice, random import numpy as np import time def left(grid): #assumption: grid is 4 x 4 numpy matrix l = grid.copy() for j in range(4): res = [] merged = False for i in l[j]: if i==0:continue if res and i == res[-1] and not merged: res[-1] += i merged = True else: if res: merged = False res.append(i) for i in range(4 - len(res)): res.append(0) l[j] = res return l def right(grid): l = grid.copy() for j in range(4): res = [] merged = False for i in range(4): t = l[j][i] if t == 0: continue if res and t == res[-1] and not merged: res[-1]+=t merged = True else: if res: merged = False res.append(t) for i in range(4-len(res)): res = [0]+res l[j] = res return l def down(grid): l = grid.copy() for j in range(4): res = [] merged = False for i in range(4): t = l[i][j] if t == 0: continue if res and t == res[-1] and not merged: res[-1]+=t merged = True else: if res: merged = False res.append(t) for i in range(4-len(res)): res=[0]+res l[:, j] = res return l def up(grid): l = grid.copy() for j in range(4): res = [] merged = False for i in range(4): t = l[-i-1][j] if t == 0: continue if res and t == res[-1] and not merged: res[-1]+=t merged = True else: if res: merged = False res.append(t) for i in range(4-len(res)): res=[0]+res l[:, j] = res[::-1] return l def c(grid, move): if move == 2: return left(grid) if move == 0: return up(grid) if move == 1: return down(grid) if move == 3: return right(grid) def isvalid(grid): if 0 in grid: return True l = grid for i in range(3): for j in range(4): if l[i][j] == l[i+1][j]: return True if l[i][0] == l[i][1] or l[i][1] == l[i][2] or l[i][2] == l[i][3]: return True i = 3 if l[i][0] == l[i][1] or l[i][1] == l[i][2] or l[i][2] == l[i][3]: return True return False ind = np.arange(16).reshape(4,4) def next_play(grid, move): #assumption: grid is 4 x 4 matrix if move not in range(4): return grid #invalid move. moved_grid = c(grid, move) # c moves grid by specific move "move". moved = not (moved_grid == grid).all() if not moved: return grid # return as it was if 0 not in moved_grid: return moved_grid #no spawn needed idx = choice(ind[moved_grid==0]) #randomly picked empty place's index moved_grid[idx//4][idx%4] = 2 if random() < .9 else 4 return moved_grid def rand_moves(data,first_move,times): #data is playing grid, numpy matrix 4 x 4 assert times >0, 'Wrong value of times' score = 0 k = range(4) for _ in range(times): data1 = data.copy() data1 = next_play(data1, first_move) #next_play moves grid & generate tile randomly on an empty place if moved while isvalid(data1): #isvalid checks validity of grid, ie playable or not. data1 = next_play(data1, choice(k)) #choice is random.choice func. score+= data1.max() return score/times def getAvailableMoves(data): data_list= [(c(data,i),i) for i in range(4)] ret = [] for data1,i in data_list: if (data1==data).all():continue else: ret.append(i) return ret def getBestMove(data, times = 10): sc, mv = float('-inf'), None for move in getAvailableMoves(data): score = 0 score += rand_moves(data.copy(),move,times) if score > sc: sc= score mv = move elif score == sc: mv = choice([mv, move]) #randomly choose one of them return mv #if none, case handing occurs at caller side. data = np.asarray([[2,0,0,2], [4,4,0,2], [32,32,2,8], [0,0,0,2]]) #a sample grid t1 = time.time() print(getBestMove(data, 100)) print(time.time() - t1, 's')

You can run this whole program by copying it. You'll notice that its taking 2.5 to 4 seconds or more for each move decision. I need at least 100 runs in memory for deciding best move. This is really slow, since at least thousand moves or above are needed for scoring better, which is quite bad for me.

I don't know C++ or C language, thus can't shift in those language or make cython's use to optimize them.

Any suggestions or improvements would be helpful . Thanks.

For ease, directly TryItOnline

EDIT 1

I tried to store solutions of vectors in dictionary since accessing is faster in dictionary, here's the code:

import pickle with open('ds_small.pickle', 'rb') as var: ds = pickle.load(var) #list of dicts d1 = ds[0] #dictionary containing left shift of all possible tuples of size 4, having elems from 0 to 2048, 2's powers d2 = ds[1] #dictionary containing right shift of all possible tuples of size 4, having elems from 0 to 2048, 2's powers def l(grid): l1=grid.copy() for i in range(4): l1[i] = d1[tuple(l1[i])] return l1 def r(grid): l1 = grid.copy() for i in range(4): l1[i] = d2[tuple(l1[i])] return l1 def u(grid): l1 = grid.copy() for i in range(4): l1[:,i] = d1[tuple(l1[:,i])] return l1 def d(grid): l1 = grid.copy() for i in range(4): l1[:,i] = d2[tuple(l1[:,i])] return l1 def c(grid, move): if move == 2: return l(grid) if move == 0: return u(grid) if move == 1: return d(grid) if move == 3: return r(grid)

Performance incremented. Time got low to 1.8 seconds per move average. But you see, its still quite slow. Suggestions please.

Edit 2 Improved performance by .2 sec by improving isvalid function.

I tried improving next_play function but no success.

Edit 3

I tried cython here, it surely increased performance:

from random import choice, random import numpy as np cimport numpy as np import time import pickle cimport cython @cython.boundscheck(False) def left(np.ndarray grid): #assumption: grid is 4 x 4 numpy matrix cdef np.ndarray l = grid.copy() cdef int j, i, p, merged; cdef long t; cdef list res; for j in range(4): res = []; merged = 0 for i in range(4): t = l[j][-i-1] if t == 0: continue if res and t == res[-1] and merged == 0: res[-1]+=t merged = 1 else: if res: merged = 0 res+=[t] for p in range(4-len(res)): res = [0]+res l[j] = res[::-1] return l @cython.boundscheck(False) def right(np.ndarray grid): cdef np.ndarray l = grid.copy() cdef int j, i, p, merged; cdef long t; cdef list res; for j in range(4): res = [] merged = 0 for i in range(4): t = l[j][i] if t == 0: continue if res and t == res[-1] and merged == 0: res[-1]+=t merged = 1 else: if res: merged = 0 res+=[t] for p in range(4-len(res)): res = [0]+res l[j] = res return l @cython.boundscheck(False) def down(np.ndarray grid): cdef np.ndarray l = grid.copy() cdef int j, i, p, merged; cdef long t; cdef list res; for j in range(4): res = [] merged = 0 for i in range(4): t = l[i][j] if t == 0: continue if res and t == res[-1] and merged == 0: res[-1]+=t merged = 1 else: if res: merged = 0 res+=[t] for p in range(4-len(res)): res=[0]+res l[:, j] = res return l @cython.boundscheck(False) def up(np.ndarray grid): cdef np.ndarray l = grid.copy() cdef int j, i, p, merged; cdef long t; cdef list res; for j in range(4): res = [] merged = 0 for i in range(4): t = l[-i-1][j] if t == 0: continue if res and t == res[-1] and merged == 0: res[-1]+=t merged = 1 else: if res: merged = 0 res+=[t] for p in range(4-len(res)): res=[0]+res l[:, j] = res[::-1] return l @cython.boundscheck(False) @cython.wraparound(False) def c(np.ndarray grid, int move): if move == 2: return left(grid) if move == 0: return up(grid) if move == 1: return down(grid) if move == 3: return right(grid) @cython.boundscheck(False) @cython.wraparound(False) def isvalid(np.ndarray l):#l is grid if 0 in l: return True cdef int i, j; for i in range(3): for j in range(4): if l[i][j] == l[i+1][j]: return True if l[i][0] == l[i][1] or l[i][1] == l[i][2] or l[i][2] == l[i][3]: return True i = 3 if l[i][0] == l[i][1] or l[i][1] == l[i][2] or l[i][2] == l[i][3]: return True return False cdef np.ndarray ind = np.arange(16).reshape(4,4) @cython.boundscheck(False) @cython.wraparound(False) def next_play(np.ndarray grid, int move): #assumption: grid is 4 x 4 matrix if move not in range(4): return grid #invalid move. cdef np.ndarray moved_grid = c(grid, move) # c moves grid by specific move "move". cdef int moved = (moved_grid == grid).all()^1 if moved == 0: return grid # return as it was cdef np.ndarray p = ind[moved_grid==0] if len(p) == 0: return moved_grid #no spawn needed cdef int idx = choice(p) #randomly picked empty place's index moved_grid[idx//4][idx%4] = 2 if random() < .9 else 4 return moved_grid @cython.boundscheck(False) def rand_moves(np.ndarray data,int first_move,int times): #data is playing grid, numpy matrix 4 x 4 assert times >0, 'Wrong value of times' cdef int score = 0; k = range(4) cdef int p,m; cdef np.ndarray data1; for p in range(times): data1 = data.copy() data1 = next_play(data1, first_move) #next_play moves grid & generate tile randomly on an empty place if moved m = data.max() while isvalid(data1): #isvalid checks validity of grid, ie playable or not. data1 = next_play(data1, choice(k)) #choice is random.choice func. m *= 1 if 2*m not in data else 2 score+= m#data1.max() return score/times def getAvailableMoves(np.ndarray data): data_list= [(c(data,i),i) for i in range(4)] ret = [] cdef int move; for data1,move in data_list: if (data1==data).all():continue else: ret.append(move) return ret def getMove(data, int times = 10): cdef float sc = float('-inf') mv = None cdef int move; cdef int score; for move in getAvailableMoves(data): score = 0 score += rand_moves(data.copy(),move,times) if score > sc: sc= score mv = move elif score == sc: mv = choice([mv, move]) #randomly choose one of them return mv #if none, case handing occurs at caller side. #if __name__ == '__main__': def do(): cdef np.ndarray data = np.asarray([[2,2,0,2], [4,4,0,2], [32,32,32,8], [0,0,0,2]]) #a sample grid print(data) t1 = time.time() from sys import argv print(getMove(data, 100))#int(argv[1]))) t_time = time.time() - t1 print(t_time, 's') return t_time

At this edit, the average speed is 1.35763 seconds per move for 100 runs in memory. I immensely need atleast 5 moves per second.

Not the downvoter, but have you done any profiling to determine what precisely is slowing you code down? — Brian61354270
– Brian61354270, Commented Feb 9, 2020 at 18:01
@Brian Its sure that its all because of the game trials in each move. It needs optimization is all i know. The post i linked has js version of that algorithm, this is the link for that js version github.com/ronzil/2048-AI. I did see the code and saw that its roughly doing same stuff in ai.js. I think its python's slowness, but since i can't shift to other language, thus trying to see if optimization is possible. — Vicrobot
– Vicrobot, Commented Feb 9, 2020 at 18:14
The Javascript version looks a lot different than yours, and not just because of the different language. In that version there is a set of "prototype vectors" to deal with the different move directions; in yours you apparently do all kinds of transformations on the entire array to set it up for just a single move. — David K
– David K, Commented Feb 9, 2020 at 21:45
@Vicrobot: Please update your question in that case. Since you have not received any answer yet, you can freely do so. At the moment it is unclear which of your three solutions should be reviewed, or should all of them be reviewed? I would suggest to pick one (maybe the most vanilla one, or maybe the fastest) and ask about that. You can ask about the other approaches in separate questions (maybe a day or so later in order to incorporate any generically applicable advice you might receive). — Graipher
– Graipher, Commented Feb 13, 2020 at 8:44

RootTwo · Accepted Answer · 2020-02-12 08:09:25Z

A couple observations:

Using different names for the playing grid (data, grid, l) makes it more difficult to follow the code.
For the top level call getMove(data, 100), let's assume the 'up' is a possible move. Then c(data, up) gets called 101 times.

getMove() calls getAvailableMoves() which calls c(data, i) once for each direction (but the moves are discarded).

getMove() then calls rand_moves() which calls next_play(data1, first_move) 100 times. And next_play() calls c(data, move).

Move constant calculations outside of the loop. In random_moves():

 for p in range(times): data1 = data.copy() data1 = next_play(data1, first_move) m = data.max()

should be:

 data = next_play(data, first_move) m = data.max() for p in range(times): data1 = data.copy() ...

Extra copying. For example:

getMove() calls rand_move() with a copy of data:
```
rand_moves(data.copy(),move,times) 
```
rand_move() then makes another copy:
```
data1 = data.copy() 
```
is_valid() could use features of numpy array:

def isvalid(np.ndarray grid): return ( 0 in grid or (grid[:3]==grid[1:]).any() or (grid[:,:3]==grid[:,1:]).any() )
It might be possible to cache some of the score calculations. For example, there are many ways to arrive at a grid like:
```
2 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
```
Figuring out a way to cache the score for that grid might save some duplicate calculations.

Thanks so much for taking this deep look in there. I implemented what you said, but performance gain was 0.005 seconds only for each real move per 100 runs in memory. I had to leave my hashing of dict because those trivial left right functions are performing better due to most conversion to c types in cython. Is there any way i could use hashing in cython just like dictionary of python? — Vicrobot
– Vicrobot, Commented Feb 12, 2020 at 16:04

Stack Exchange Network

AI In 2048 Game, Calculations so slow

1 Answer 1

You must log in to answer this question.

Hot Network Questions

AI In 2048 Game, Calculations so slow

1 Answer 1

You must log in to answer this question.

Related

Hot Network Questions