Rewards

Rewards are an essential part of reinforcement learning frameworks. They are typically float values (often between 0.0 and 1.0, though they can also be negative) that steer the system toward better decisions or predictions. A reward indicates how well the system performed on a given task, and the goal of training is to maximize the reward it receives.

The purpose of a reward function is to compute the quantity that training should maximize: given the system's output for a task, it returns a score for that output.
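
As an illustration only, a reward function can be sketched as a plain Python function that maps an output to a float. The name `overlap_reward`, its two string arguments, the token-overlap scoring rule, and the -1.0 penalty for empty output are all assumptions made for this example; they are not part of any particular framework.

```python
def overlap_reward(prediction: str, target: str) -> float:
    """Hypothetical reward: token overlap between prediction and target.

    Returns a value in [0.0, 1.0] for non-empty predictions and a
    negative penalty (-1.0) when the prediction is empty.
    """
    pred_tokens = set(prediction.split())
    target_tokens = set(target.split())

    # Assumed penalty: an empty answer is worse than a wrong one.
    if not pred_tokens:
        return -1.0

    # Jaccard similarity: shared tokens divided by all distinct tokens.
    shared = pred_tokens & target_tokens
    combined = pred_tokens | target_tokens
    return len(shared) / len(combined)


if __name__ == "__main__":
    print(overlap_reward("the cat sat", "the cat sat"))  # identical token sets
    print(overlap_reward("the dog sat", "the cat sat"))  # partial overlap
    print(overlap_reward("", "the cat sat"))             # empty prediction
```

A training loop would call a function like this on each output and adjust the system so that higher-scoring outputs become more likely. How the score is actually consumed depends on the framework in use.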