Alpha and Gamma parameters in QLearning

I often use Ctrl-6 for this.

It's handy because it lets me quickly switch back and forth between the two files.

6
asked by devoured elysium, 6 December 2009 at 07:37

1 answer

I haven't worked with systems exactly like this before, so I don't know how useful I can be, but...

Gamma is a measure of the agent's tendency to look forward to future rewards. The smaller it is, the more the agent will tend to take the action with the greatest immediate reward, regardless of the resulting state. Agents with a larger gamma will learn long paths to big rewards. As for all Q values approaching zero, have you tried a very simple state map (say, one state and two actions) with gamma=0? Each Q value should then quickly approach the reward of its action.
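To make that sanity check concrete, here is a minimal sketch of it. The function name `run_single_state`, the reward values, and alpha=0.5 are all made up for illustration, and rewards are assumed to be deterministic; this is not taken from your code.

```python
def run_single_state(rewards, alpha=0.5, gamma=0.0, sweeps=50):
    """Tabular Q-learning on one state whose actions all lead back to it.

    rewards[a] is the (assumed deterministic) reward for action a.
    """
    q = [0.0] * len(rewards)
    for _ in range(sweeps):
        for action, reward in enumerate(rewards):
            # Standard Q-learning target: r + gamma * max_a' Q(s', a').
            # With gamma = 0 the target is just the immediate reward.
            target = reward + gamma * max(q)
            q[action] += alpha * (target - q[action])
    return q


q_values = run_single_state([1.0, 5.0])
print(q_values)
```

If the Q values in a test like this still drift toward zero, the problem is more likely in the update rule or the reward signal than in the choice of alpha and gamma.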

The idea of reducing alpha is to damp down oscillations in the Q values, so that the agent can settle into a stable pattern after a wild youth.
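A rough illustration of the damping effect, assuming a single action with a noisy reward and gamma=0. The 1/n schedule is just one common choice that I picked for the sketch, and `estimate` and `alpha_schedule` are hypothetical names.

```python
import random


def estimate(rewards, alpha_schedule):
    """Apply Q += alpha * (r - Q) for each reward and return the Q history."""
    q = 0.0
    history = []
    for n, reward in enumerate(rewards, start=1):
        q += alpha_schedule(n) * (reward - q)
        history.append(q)
    return history


random.seed(0)
# Noisy reward centered on 1.0 (made-up numbers for the demo).
rewards = [1.0 + random.uniform(-1.0, 1.0) for _ in range(1000)]

constant = estimate(rewards, lambda n: 0.5)     # keeps reacting to every sample
decayed = estimate(rewards, lambda n: 1.0 / n)  # alpha shrinks over time

print("constant alpha, last value:", constant[-1])
print("decayed alpha, last value: ", decayed[-1])
```

With alpha = 1/n the estimate is exactly the running average of the rewards seen so far, so it settles down, while a constant alpha keeps giving the latest sample the same weight and never stops jittering.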

Exploring the state space? Why not just iterate over it, have the agent try everything? There's no reason to have the agent actually follow a course of action in its learning-- unless that's the point of your simulation. If the idea is just to find the optimal behavior pattern, adjust all Q's, not just the highest ones along a path.
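Something like the following is what I have in mind, as a sketch only. It assumes you can query the outcome of any (state, action) pair directly and that transitions are deterministic, which may not hold in your setup. The tiny three-state map, the `transitions` table, and `sweep_all_pairs` are invented for the example.

```python
# Hypothetical deterministic map: (state, action) -> (next_state, reward).
# State 2 is terminal and has no outgoing actions.
transitions = {
    (0, "left"): (0, 0.0),
    (0, "right"): (1, 0.0),
    (1, "left"): (0, 0.0),
    (1, "right"): (2, 10.0),
}
actions = ("left", "right")


def sweep_all_pairs(transitions, gamma=0.9, alpha=1.0, sweeps=20):
    """Update every (state, action) pair on each sweep, not just a followed path."""
    q = {pair: 0.0 for pair in transitions}
    for _ in range(sweeps):
        for (state, action), (next_state, reward) in transitions.items():
            # Value of the best action in the next state; 0.0 if it is terminal.
            next_value = max(
                (q[(next_state, a)] for a in actions if (next_state, a) in q),
                default=0.0,
            )
            target = reward + gamma * next_value
            q[(state, action)] += alpha * (target - q[(state, action)])
    return q


q_table = sweep_all_pairs(transitions)
for pair, value in sorted(q_table.items()):
    print(pair, round(value, 3))
```

alpha=1.0 is only safe here because nothing is random; with noisy rewards or transitions you would want a smaller or decaying alpha.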

-4
answered 10 December 2019 at 02:48