Reinforcement Learning with Gymnasium in Python
Fouad Trad
Machine Learning Engineer






env = create_environment()
state = env.get_initial_state()
for i in range(n_iterations):
    action = choose_action(state)
    state, reward = env.execute(action)
    update_knowledge(state, action, reward)
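The interaction loop above is pseudocode, so here is a minimal runnable sketch of the same pattern. Everything in it is an assumption made for illustration: `CorridorEnv` is a hypothetical toy environment (not part of Gymnasium), the random choice stands in for `choose_action`, and appending to a list stands in for `update_knowledge`.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: the agent walks along a short corridor
    and earns a reward of 1 on every step that ends in the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def get_initial_state(self):
        # Reset the agent to the left end of the corridor.
        self.position = 0
        return self.position

    def execute(self, action):
        # action is -1 (left) or +1 (right); the position is clamped
        # so the agent cannot leave the corridor.
        self.position = min(max(self.position + action, 0), self.length - 1)
        reward = 1 if self.position == self.length - 1 else 0
        return self.position, reward


def run_interaction_loop(n_iterations=20, seed=0):
    rng = random.Random(seed)  # seeded so repeated runs behave the same
    env = CorridorEnv()
    state = env.get_initial_state()
    total_reward = 0
    history = []  # stand-in for the agent's "knowledge"
    for i in range(n_iterations):
        action = rng.choice([-1, 1])              # stand-in for choose_action
        state, reward = env.execute(action)
        history.append((state, action, reward))   # stand-in for update_knowledge
        total_reward += reward
    return total_reward, history


total_reward, history = run_interaction_loop()
print(f"Steps taken: {len(history)}, total reward: {total_reward}")
```

A learning agent would replace the random choice with a policy that depends on `state` and would use the recorded experience to improve that policy over time.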






import numpy as np
expected_rewards = np.array([1, 6, 3])
discount_factor = 0.9
discounts = np.array([discount_factor ** i for i in range(len(expected_rewards))])
print(f"Discounts: {discounts}")
Discounts: [1. 0.9 0.81]
discounted_return = np.sum(expected_rewards * discounts)
print(f"The discounted return is {discounted_return}")
The discounted return is 8.83
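The same calculation can be wrapped in a reusable function. The sketch below is plain Python rather than NumPy, and the function names are assumptions introduced here for illustration. It computes the discounted return two ways: as the direct sum of reward times discount_factor ** t, and by working backward from the last reward with G = r + discount_factor * G.

```python
def discounted_return(rewards, discount_factor):
    """Direct sum: r_0 + g * r_1 + g**2 * r_2 + ..."""
    return sum(r * discount_factor ** t for t, r in enumerate(rewards))


def discounted_return_backward(rewards, discount_factor):
    """Same quantity, accumulated from the last reward to the first."""
    g = 0.0
    for r in reversed(rewards):
        g = r + discount_factor * g
    return g


rewards = [1, 6, 3]
direct = discounted_return(rewards, 0.9)
backward = discounted_return_backward(rewards, 0.9)
print(f"Direct: {direct:.2f}, backward: {backward:.2f}")
```

Both functions agree with the 8.83 computed above for the rewards [1, 6, 3] and a discount factor of 0.9. With a discount factor of 1 the result is the plain sum of the rewards, and with a discount factor of 0 only the first reward counts.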