Typically, when developing an AI for a game, you check whether a certain condition is true (e.g. the block has been moved into the goal) and, if it is, give the agent a +1 reward. Traditional ML algorithms need to be trained with an input and a "correct answer" called the target. In our case, the state is an array containing 11 boolean variables. Different architectures and different hyper-parameters affect how quickly training converges to an optimum, as well as the highest score that can be reached. The network receives the state as input and returns three output values, one for each of the three actions: move left, move right, move straight. Training a virtual agent to outperform human players and to optimize its score can teach us how to optimize processes in a variety of other exciting subfields.

To train the network, I needed a high-quality dataset of board configurations, given as images with corresponding labels. Well, there is a YouTube channel with a lot of videos showing Tetris World Championship matches. Looking back, I can say that I spent most of my time not on programming but on collecting and pre-processing data. Before processing, though, I changed every board by converting all gaps into occupied fields. A lot of people ask me the same question: why didn't I use RL?

The brain of the artificial intelligence uses deep learning. A deep understanding of game theory isn't important for most work on machine learning for games, but some knowledge can only help.
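The state-to-action mapping mentioned earlier (an array of 11 booleans in, three action values out) can be sketched roughly as follows. This is only an illustration in plain Python, not the actual network. The hidden layer size, the random untrained weights, the order of the actions in the output, and the example state are all assumptions, since the text does not specify them.

```python
import random

ACTIONS = ["left", "right", "straight"]  # the three actions; their order is an assumption
STATE_SIZE = 11    # 11 boolean state variables, as stated in the text
HIDDEN_SIZE = 16   # assumption: the text does not give the hidden layer size


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, layer, relu=False):
    """Apply one fully connected layer, optionally followed by ReLU."""
    weights, biases = layer
    outputs = []
    for row, bias in zip(weights, biases):
        value = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(max(0.0, value) if relu else value)
    return outputs


def forward(state, hidden_layer, output_layer):
    """Map an 11-element boolean state to three action scores."""
    if len(state) != STATE_SIZE:
        raise ValueError(f"expected {STATE_SIZE} state variables, got {len(state)}")
    x = [1.0 if flag else 0.0 for flag in state]  # booleans become 0.0 / 1.0
    return dense(dense(x, hidden_layer, relu=True), output_layer)


def choose_action(scores):
    """Pick the action with the highest score (greedy choice)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return ACTIONS[best]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
hidden_layer = init_layer(STATE_SIZE, HIDDEN_SIZE, rng)
output_layer = init_layer(HIDDEN_SIZE, len(ACTIONS), rng)

# Hypothetical state: the meaning of each flag is not specified in the text.
state = [False, True, False, True, False, False, False, True, False, False, True]
scores = forward(state, hidden_layer, output_layer)
print(scores, choose_action(scores))
```

With untrained weights the chosen action is of course meaningless. The point is only the shape of the interface: one score per action, and the agent takes the action with the largest score.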