SELAB3 Python Code
This class provides logging functionality to the reinforcement part of the code.

Public Member Functions:
    def __init__(self)
    def log_episode(self, int episode, np.ndarray final_state, np.ndarray goal, int time_steps, int total_finished, List[bool] episodes_finished, float reward, float eps)
        Log one episode.
def rl.logger.Logger.log_episode(self,
                                 int episode,
                                 np.ndarray final_state,
                                 np.ndarray goal,
                                 int time_steps,
                                 int total_finished,
                                 List[bool] episodes_finished,
                                 float reward,
                                 float eps)

Log one episode.
Parameters:
    episode            Episode number.
    final_state        State at the end of the episode.
    goal               Goal for this episode.
    time_steps         Number of steps taken in the episode.
    total_finished     Total number of times the arm reached the goal during the reinforcement run.
    episodes_finished  Boolean list of length 50 describing the last 50 episodes: True = the episode finished, False = it did not.
    reward             Reward of the last step in the episode.
    eps                Epsilon parameter of the DQN at this moment.
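The parameters above can be illustrated with a minimal sketch of how a training loop might call log_episode after each episode. The Logger body here is hypothetical (the real class may print or write to a file; the documentation only specifies the signature); storing records in a list and deriving a success rate from episodes_finished are assumptions for illustration.

```python
from typing import List

import numpy as np


class Logger:
    """Hypothetical stand-in matching the documented log_episode signature."""

    def __init__(self):
        # Assumption for this sketch: keep records in memory.
        self.records = []

    def log_episode(self, episode: int, final_state: np.ndarray,
                    goal: np.ndarray, time_steps: int, total_finished: int,
                    episodes_finished: List[bool], reward: float, eps: float):
        # The fraction of the last 50 episodes that finished gives a
        # rolling success rate over the window described above.
        success_rate = sum(episodes_finished) / len(episodes_finished)
        self.records.append({
            "episode": episode,
            "time_steps": time_steps,
            "total_finished": total_finished,
            "success_rate": success_rate,
            "reward": reward,
            "eps": eps,
        })


logger = Logger()
# Rolling window of the last 50 episodes; only the most recent one finished.
episodes_finished = [False] * 49 + [True]
logger.log_episode(
    episode=1,
    final_state=np.array([0.1, 0.2]),
    goal=np.array([0.0, 0.0]),
    time_steps=200,
    total_finished=1,
    episodes_finished=episodes_finished,
    reward=-0.5,
    eps=0.95,
)
```

Passing episodes_finished as a fixed-length window makes it easy to report a recent success rate rather than an all-time average, which is what the 50-episode description suggests.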