VALUE-ITERATION

AIMA3e

function VALUE-ITERATION(mdp, ε) returns a utility function
  inputs: mdp, an MDP with states S, actions A(s), transition model P(s′ | s, a),
    rewards R(s), discount γ
    ε, the maximum error allowed in the utility of any state
  local variables: U, U′, vectors of utilities for states in S, initially zero
    δ, the maximum change in the utility of any state in an iteration

  repeat
    U ← U′; δ ← 0
    for each state s in S do
      U′[s] ← R(s) + γ max_{a ∈ A(s)} Σ_{s′} P(s′ | s, a) U[s′]
      if | U′[s] − U[s] | > δ then δ ← | U′[s] − U[s] |
  until δ < ε(1 − γ)/γ
  return U

Figure ?? The value iteration algorithm for calculating utilities of states. The termination condition is from Equation (??).
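The AIMA3e pseudocode can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the function name `value_iteration`, the dictionary representation of U and U′, and the callables `actions`, `P` and `R` are assumptions made for this example. It also assumes 0 < γ < 1 (with γ = 1 the threshold is zero and the loop would not terminate) and treats a state with no actions as having a future value of zero.

```python
def value_iteration(states, actions, P, R, gamma, epsilon):
    """Sketch of VALUE-ITERATION with state-based rewards R(s).

    Assumed (hypothetical) interfaces:
      actions(s) -> list of actions available in s
      P(s, a)    -> list of (probability, next_state) pairs
      R(s)       -> reward for being in state s
    """
    U1 = {s: 0.0 for s in states}  # U', initially zero
    while True:
        U = U1.copy()              # U <- U'
        delta = 0.0
        for s in states:
            # Expected utility of the best action; 0.0 if s has no actions
            # (an assumption for terminal states, not stated in the figure).
            best = max(
                (sum(p * U[s2] for p, s2 in P(s, a)) for a in actions(s)),
                default=0.0,
            )
            U1[s] = R(s) + gamma * best
            delta = max(delta, abs(U1[s] - U[s]))
        if delta < epsilon * (1 - gamma) / gamma:
            return U               # as in the figure, the previous vector is returned


# Tiny made-up MDP: "a" loops on itself with reward 1, "b" moves to "a".
states = ["a", "b"]
actions = lambda s: ["stay"] if s == "a" else ["go"]
P = lambda s, a: [(1.0, "a")]
R = lambda s: 1.0 if s == "a" else 0.0

U = value_iteration(states, actions, P, R, gamma=0.5, epsilon=1e-6)
```

For this toy MDP the fixed point is U(a) = 1/(1 − γ) = 2 and U(b) = γ · 2 = 1, so the returned values should be close to those.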

AIMA4e

function VALUE-ITERATION(mdp, ε) returns a utility function
  inputs: mdp, an MDP with states S, actions A(s), transition model P(s′ | s, a),
    rewards R(s, a, s′), discount γ
    ε, the maximum error allowed in the utility of any state
  local variables: U, U′, vectors of utilities for states in S, initially zero
    δ, the maximum change in the utility of any state in an iteration

  repeat
    U ← U′; δ ← 0
    for each state s in S do
      U′[s] ← max_{a ∈ A(s)} Q-VALUE(mdp, s, a, U)
      if | U′[s] − U[s] | > δ then δ ← | U′[s] − U[s] |
  until δ < ε(1 − γ)/γ
  return U

Figure ?? The value iteration algorithm for calculating utilities of states. The termination condition is from Equation (??).
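The AIMA4e variant delegates the one-step lookahead to Q-VALUE, which this page does not define. The sketch below assumes the usual definition, Q-VALUE(mdp, s, a, U) = Σ_{s′} P(s′ | s, a) [R(s, a, s′) + γ U[s′]]. The `MDP` dataclass and its field names are hypothetical containers introduced only for this example, and the code again assumes 0 < γ < 1.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Sequence


@dataclass
class MDP:
    """Hypothetical container for the pieces named in the pseudocode."""
    states: Sequence[Hashable]
    actions: Callable  # actions(s) -> list of actions
    P: Callable        # P(s, a) -> list of (probability, next_state) pairs
    R: Callable        # R(s, a, s2) -> reward for the transition
    gamma: float       # discount, assumed to satisfy 0 < gamma < 1


def q_value(mdp, s, a, U):
    # Assumed definition: expected transition reward plus discounted utility.
    return sum(p * (mdp.R(s, a, s2) + mdp.gamma * U[s2]) for p, s2 in mdp.P(s, a))


def value_iteration(mdp, epsilon):
    U1 = {s: 0.0 for s in mdp.states}  # U', initially zero
    while True:
        U = U1.copy()                  # U <- U'
        delta = 0.0
        for s in mdp.states:
            # default=0.0 covers states with no actions (an assumption).
            U1[s] = max((q_value(mdp, s, a, U) for a in mdp.actions(s)), default=0.0)
            delta = max(delta, abs(U1[s] - U[s]))
        if delta < epsilon * (1 - mdp.gamma) / mdp.gamma:
            return U


# Made-up two-state example: every transition lands in "a" and pays reward 1.
toy = MDP(
    states=["a", "b"],
    actions=lambda s: ["go"],
    P=lambda s, a: [(1.0, "a")],
    R=lambda s, a, s2: 1.0,
    gamma=0.5,
)
U = value_iteration(toy, epsilon=1e-6)
```

Here both states satisfy U = 1 + γ · U(a), so both utilities should approach 2. Keeping `q_value` as a separate function mirrors the pseudocode and lets the same helper be reused, for example when extracting a greedy policy from U.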
