Game overview
2048 is played on a 4x4 board where tiles appear and are moved either horizontally or vertically to combine adjacent tiles of equal value. When two tiles merge, they form a new tile worth twice the value of each original tile. Each merge increases the game score by the value of the new tile.
For example, two adjacent tiles of value 4 could be combined to make a single tile of value 8, while the game score increases by 8 points. In action, it looks like:
Mechanically, all the user has to do is decide whether to swipe up, down, left, or right when given a board layout. The goal of the game is to reach the highest score possible. It is also possible to “win” the game by reaching a tile of value 2048 (the game does not end at this point; it is just a relatively difficult milestone).
I decided to implement the game myself, using only numpy, with the goal of connecting this module to a neural network, which will “learn” to play the game by repeatedly trying different strategies.
import numpy as np
Each game consists, at any given moment, of a board layout (the 4x4 tile values) and a score. As the user (or computer) makes decisions, both the layout and score are updated.
class GameLayout:
    def __init__(self):
        # initialize empty layout and zero points
        # (plain int is used because the np.int alias was removed in NumPy 1.24)
        self.layout = np.zeros((4, 4), dtype=int)
        self.score = 0
        ...
But wait a second! The user never sees an empty board. When a 2048 game is initialized, it always has 2 full tiles.
class GameLayout:
    def __init__(self):
        ...
        # each game starts with 2 full tiles
        self.add_random()
        self.add_random()
        ...
add_random is a method that updates the layout, placing either a 2 or a 4 in a randomly chosen empty cell of the 4x4 grid.
class GameLayout:
    def add_random(self):
        # randomly choose any empty tile and fill it with a 2 or 4 tile
        # (with 90% and 10% probabilities, respectively)
        layout = self.layout.flatten()
        options = (layout == 0)
        if options.sum() != 0:
            # uniform probability over the empty cells, zero for occupied cells
            index = np.random.choice(16, p=options / options.sum())
            layout[index] = np.random.choice([2, 4], p=[.9, .1])
        self.layout = layout.reshape((4, 4))
Generally it is used to add randomness to the board after each move, but we’re also using it at the game’s initialization to give the user tiles to move at the start. So after initializing game = GameLayout(), game.score is zero, and game.layout could look like:
[[0, 0, 0, 2], [0, 0, 0, 0], [0, 0, 0, 2], [0, 0, 0, 0]]
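The empty-cell selection can also be written by collecting the indices of the empty cells first and picking one of them. The following is a minimal sketch of that variant, not the code from the repository; the standalone function add_random_tile and its rng parameter are names assumed here for illustration.

```python
import numpy as np

def add_random_tile(layout, rng=None):
    """Return a copy of layout with one empty cell filled by a 2 (90%) or a 4 (10%).

    Hypothetical standalone helper; the post's add_random is a method on GameLayout.
    """
    rng = np.random.default_rng() if rng is None else rng
    flat = np.asarray(layout).flatten()      # flatten() returns a copy
    empty = np.flatnonzero(flat == 0)        # indices of the empty cells
    if empty.size:                           # a full board is returned unchanged
        flat[rng.choice(empty)] = rng.choice([2, 4], p=[0.9, 0.1])
    return flat.reshape(np.shape(layout))

board = np.zeros((4, 4), dtype=int)
board = add_random_tile(add_random_tile(board))  # two starting tiles
print(board)
```

Passing a seeded generator (for example np.random.default_rng(0)) makes the tile placement reproducible, which can help when debugging.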
Next, we need to implement functionality for the user to input a move, which in the app is done by “swiping” on the screen. In this computer-based version, I’ve chosen to allow text input, using the common w, a, s, d format to represent the directions (up, left, down, right, respectively).
class GameLayout:
    def swipe(self, choice):
        scores = np.zeros(4, dtype=int)
        new_layout = np.zeros((4, 4), dtype=int)
        # dispatch to the direction-specific helper
        if choice == 'w':
            score, new_layout = self.swipe_up(scores, new_layout)
        elif choice == 's':
            score, new_layout = self.swipe_down(scores, new_layout)
        elif choice == 'd':
            score, new_layout = self.swipe_right(scores, new_layout)
        elif choice == 'a':
            score, new_layout = self.swipe_left(scores, new_layout)
        else:
            raise Exception("Invalid input to swipe.")
        ...
The four methods (swipe_up, swipe_down, swipe_right, swipe_left) are very similar, simply iterating through the rows or columns in a different order or direction to condense the layout appropriately. Full code is available in the repo, and is omitted here for brevity.
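Since the direction helpers are omitted, here is a minimal sketch of how condensing could work. It is an assumption for illustration, not the repository's implementation: merge_row_left, swipe_layout, and ROTATIONS are hypothetical names, and the sketch rotates the board with np.rot90 so that every direction reduces to a left swipe.

```python
import numpy as np

# Number of counter-clockwise quarter turns that turn each direction into "left".
ROTATIONS = {'a': 0, 'w': 1, 'd': 2, 's': 3}

def merge_row_left(row):
    """Slide one row to the left, merging equal neighbours once; return (row, points)."""
    tiles = [int(v) for v in row if v != 0]   # drop the gaps
    merged, points, i = [], 0, 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(2 * tiles[i])       # two equal tiles become one
            points += 2 * tiles[i]            # score grows by the new tile's value
            i += 2
        else:
            merged.append(tiles[i])
            i += 1
    new_row = np.zeros(len(row), dtype=int)
    new_row[:len(merged)] = merged
    return new_row, points

def swipe_layout(layout, choice):
    """Return (new_layout, points) for a swipe in direction 'w', 'a', 's' or 'd'."""
    k = ROTATIONS[choice]
    rotated = np.rot90(np.asarray(layout), k)
    result = np.zeros_like(rotated)
    points = 0
    for i, row in enumerate(rotated):
        result[i], gained = merge_row_left(row)
        points += gained
    return np.rot90(result, -k), points       # rotate back to the original orientation

layout = np.array([[0, 0, 0, 2], [0, 0, 0, 0], [0, 0, 0, 2], [0, 0, 0, 0]])
new_layout, points = swipe_layout(layout, 's')
print(new_layout, points)
```

For the example layout, swiping down merges the two 2 tiles into a single 4 in the bottom-right corner and earns 4 points. No random tile is added here; in the post that step belongs to swipe.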
Some error checking is necessary: a move is only “valid” if some tiles can be moved. If not, the user must choose another direction to swipe or, if all options are exhausted, the game is over! I’ve chosen to log failed moves in a set, and when the set contains all 4 possible moves (w, a, s, d) the game must end.
class GameLayout:
    def swipe(self, choice):
        ...
        if (new_layout != self.layout).sum() > 0:
            # some tiles were moved so the move is valid
            self.failed_moves = set()  # reset failed move counter
            # update the game's score and layout
            self.score += scores.sum()
            self.num_moves += 1
            self.log_data(choice)  # log data w/ old layout
            self.layout = new_layout  # update to new layout
            # include the random next tile
            self.add_random()
        else:
            self.failed_moves.add(choice)
            # once all 4 moves have been tried, the game should end
            self.active = len(self.failed_moves) < 4
            # assertion error means game ends
            assert self.active
            # exception means move didn't change the layout, and another input is required
            raise Exception('Not a valid move.')
So given our previous game.layout:
[[0, 0, 0, 2], [0, 0, 0, 0], [0, 0, 0, 2], [0, 0, 0, 0]]
If we call game.swipe('s'), we’re “swiping down”, and could find game.layout has been updated to:
[[0, 0, 0, 0], [0, 2, 0, 0], [0, 0, 0, 0], [0, 0, 0, 4]]
Note the random new tile that has been added!
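The failed-move set only detects the end of the game after all four directions have actually been tried. As a point of comparison, the same condition can be checked directly from the layout: a move exists if any cell is empty or any two neighbouring tiles are equal. This is a sketch of that alternative, not what the repository does, and has_valid_move is a hypothetical name.

```python
import numpy as np

def has_valid_move(layout):
    """Return True if at least one swipe would change a non-empty layout."""
    layout = np.asarray(layout)
    if (layout == 0).any():
        # an empty cell means some tile can slide
        # (assumes at least one tile is on the board, as is always the case in a game)
        return True
    horizontal = (layout[:, :-1] == layout[:, 1:]).any()  # equal left/right neighbours
    vertical = (layout[:-1, :] == layout[1:, :]).any()    # equal up/down neighbours
    return bool(horizontal or vertical)

stuck = np.array([[2, 4, 2, 4], [4, 2, 4, 2], [2, 4, 2, 4], [4, 2, 4, 2]])
print(has_valid_move(stuck))
```

The checkerboard layout above is full and has no equal neighbours, so no swipe can change it.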
The most important part of this project, to me, is the data generated as a game is played, as we’ll need that data to learn from each game later on, when we’re training a neural network. You may have noticed the call to log_data(...) in swipe(...) earlier. It runs after a valid move is performed, to log the layout, the valid move chosen, and the score after the move happens.
class GameLayout:
    def log_data(self, move):
        # one-hot encode the move in w, a, s, d order
        formatted_move = np.zeros(4)
        formatted_move[['w', 'a', 's', 'd'].index(move)] = 1
        try:
            # append one row (or one value) to each existing log
            self.layouts = np.concatenate((self.layouts, self.layout.reshape(1, -1)))
            self.moves = np.concatenate((self.moves, formatted_move.reshape(1, -1)))
            self.scores = np.append(self.scores, self.score)
        except AttributeError:
            # first logged move: the log arrays do not exist yet
            self.layouts = self.layout.reshape(1, -1)
            self.moves = formatted_move.reshape(1, -1)
            self.scores = np.array([self.score])  # 1-D so the length is n from the start
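After n logged moves, this leaves three arrays on the game object:
- layouts is an n x 16 matrix, where each row is a flattened layout during the game
- moves is an n x 4 matrix, where each row is the one-hot encoding of the move chosen
- scores is a flat array of length n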
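Note we’ve performed a single move (“swiping down”) so far in our demonstration game, so n has a value of 1. Checking back in on our example:
- Our current layouts is array([[0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0]])
- Our current moves is array([[0., 0., 1., 0.]]), the one-hot encoding of 's'
- Our current scores holds a single value, 4, from merging the two 2 tiles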
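To make the shapes concrete, here is a small sketch of the same bookkeeping outside the game class. It is an illustration under assumptions: GameLog, record, and one_hot are hypothetical names, and the logs start as empty arrays so no try/except is needed for the first move.

```python
import numpy as np

MOVES = ['w', 'a', 's', 'd']

def one_hot(move):
    """Encode a move as a length-4 vector in w, a, s, d order."""
    encoded = np.zeros(len(MOVES))
    encoded[MOVES.index(move)] = 1
    return encoded

class GameLog:
    """Hypothetical container mirroring the layouts/moves/scores arrays."""
    def __init__(self):
        self.layouts = np.empty((0, 16), dtype=int)  # one flattened layout per row
        self.moves = np.empty((0, 4))                # one one-hot move per row
        self.scores = np.empty(0, dtype=int)         # one score per logged move

    def record(self, layout, move, score):
        row = np.asarray(layout).reshape(1, -1)
        self.layouts = np.concatenate((self.layouts, row))
        self.moves = np.concatenate((self.moves, one_hot(move).reshape(1, -1)))
        self.scores = np.append(self.scores, score)

log = GameLog()
old_layout = np.array([[0, 0, 0, 2], [0, 0, 0, 0], [0, 0, 0, 2], [0, 0, 0, 0]])
log.record(old_layout, 's', 4)  # layout before the move, move chosen, score after it
print(log.layouts.shape, log.moves.shape, log.scores.shape)
```

Each call to record adds one row to layouts and moves and one entry to scores, so all three stay aligned by move index.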
Playing the game
We have all the functionality we need! Out of personal pride I decided to also cobble together a decent user interface, which takes that numpy array (game.layout) and formats it to look similar to the app interface.
The full game implementation is available in the GitHub repository, and can be played (basic UI and all) either locally in a Jupyter notebook or via this binder:
If you haven’t heard of it before, Binder allows you to open notebooks in an executable environment, without worrying about requirements or installing/downloading anything.
We can’t just play a single game and expect to learn anything - the learning comes from comparing different strategies and outcomes, so we need to run and log data from several games in a row.
The obvious method we need is one that programmatically runs multiple games and logs that data. This is what we’ll need to let the neural network play the game. The GameDriver class does just that.
class GameDriver():
    def run_games(self, n, method=lambda layout: np.array([.25, .25, .25, .25])):
        from game import GameLayout
        for i in range(n):
            game = GameLayout()
            moves = ['w', 'a', 's', 'd']
            while game.active:
                # repeatedly attempt to make a move
                for move in self.weighted_shuffle(moves, method(game.layout)):
                    try:
                        game.swipe(move)
                        # move worked: stop so method sees the new layout next time
                        break
                    except Exception:
                        # move didn't work (or the game just ended), try next move
                        continue
            # game is over
            self.log_game(game)
run_games(...) will programmatically play n games, using the method function to map a layout to a probability array of size 4, whose elements represent the probability of choosing w, a, s, and d, respectively.
This function logs all the same data as an individual game (layouts, moves, scores), and also keeps track of the final_scores across the games and num_moves across the games. Both of these should have length n after a single call, and grow further if run_games(...) is called multiple times.
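The weighted_shuffle helper used by run_games is not shown in this post. One way it could work, sketched here under assumptions, is to draw all four moves without replacement using the policy's probabilities, so that likelier moves tend to be tried first. The function below and the prefer_down policy are hypothetical examples, not the repository's code.

```python
import numpy as np

def weighted_shuffle(options, weights, rng=None):
    """Return options in a random order, biased so higher-weight items tend to come first."""
    rng = np.random.default_rng() if rng is None else rng
    # a tiny offset keeps zero-weight moves drawable, so all options are always returned
    weights = np.asarray(weights, dtype=float) + 1e-12
    probs = weights / weights.sum()
    order = rng.choice(len(options), size=len(options), replace=False, p=probs)
    return [options[i] for i in order]

def prefer_down(layout):
    """Example policy: ignores the layout and favours 's' (order is w, a, s, d)."""
    return np.array([0.1, 0.2, 0.5, 0.2])

moves = ['w', 'a', 's', 'd']
layout = np.zeros((4, 4), dtype=int)
print(weighted_shuffle(moves, prefer_down(layout)))
```

Any function with the same shape as prefer_down (layout in, four probabilities out) could be passed as method, which is the hook a trained network would later use.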
Initially I thought the automated method was all I needed, but I later decided to include logging functionality for a game played manually, in case I wanted to teach the neural network to play like I do. In this case, we can call log_game(...) to store the user-created game’s data in the same way that we’re creating and storing the automated games’ data.
Complete code can be found in my 2048 git repository. This post details code contained in helper.py. Further explanation will follow in another post.
This section of the project relies solely on numpy. Additional packages (such as widgets and matplotlib) may be required to run any notebooks in the repository, and the deep learning will be built using PyTorch.