Reinforcement Learning with Gymnasium in Python
Fouad Trad
Machine Learning Engineer
env = create_environment() state = env.get_initial_state()
for i in range(n_iterations): action = choose_action(state)
state, reward = env.execute(action)
update_knowledge(state, action, reward)
import numpy as np expected_rewards = np.array([1, 6, 3])
discount_factor = 0.9
discounts = np.array([discount_factor ** i for i in range(len(expected_rewards))])
print(f"Discounts: {discounts}")
Discounts: [1. 0.9 0.81]
discounted_return = np.sum(expected_rewards * discounts)
print(f"The discounted return is {discounted_return}")
The discounted return is 8.83
Reinforcement Learning with Gymnasium in Python