Markov Decision Process
In this project, I solved the problem explained in Figure 14.2 of the book “Probabilistic Robotics” using Dynamic Programming.
The robot can move in 8 directions (4 straight and 4 diagonal). It has two motion models: a) a deterministic model, which always executes the commanded move perfectly; b) a stochastic model, which has a 20% probability of moving +/-45 degrees from the commanded move. In the map, 1 means occupied and 0 means free. The reward for hitting an obstacle is -50.0. The reward for any other move that does not end at the goal is -1.0. The reward for reaching the goal is 100.0. The goal location is W(8,11), and the discount factor is gamma = 0.95.
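The setup above can be sketched in Python as a pair of small helpers. This is a minimal illustration rather than the project's actual code: the names `ACTIONS`, `outcomes`, and `step` are hypothetical, and it assumes that the 20% slip probability is split evenly between +45 and -45 degrees, that the map is a list of rows of 0/1 values, and that leaving the map or entering an occupied cell counts as hitting an obstacle and leaves the robot where it was.

```python
# Eight moves as (row, col) offsets, ordered clockwise so that neighbouring
# entries differ by 45 degrees.
ACTIONS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

GAMMA = 0.95
R_OBSTACLE, R_STEP, R_GOAL = -50.0, -1.0, 100.0


def outcomes(action, stochastic=True, p_slip=0.2):
    """Return [(probability, executed_action_index), ...] for a commanded action.

    Assumption: the slip probability is shared equally by the +45 and
    -45 degree neighbours of the commanded direction.
    """
    if not stochastic:
        return [(1.0, action)]
    n = len(ACTIONS)
    return [
        (1.0 - p_slip, action),
        (p_slip / 2, (action + 1) % n),
        (p_slip / 2, (action - 1) % n),
    ]


def step(grid, goal, state, action):
    """Apply one executed move and return (next_state, reward).

    Assumption: a move that leaves the map or enters an occupied cell is
    treated as hitting an obstacle, and the robot stays in place.
    """
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
    if not inside or grid[nr][nc] == 1:
        return state, R_OBSTACLE
    if (nr, nc) == goal:
        return (nr, nc), R_GOAL
    return (nr, nc), R_STEP
```

Calling `outcomes(a, stochastic=False)` gives the deterministic model, so the same planning code can be run against both models.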
Results
The optimal policy for the robot was computed using the following algorithms:
1. Policy Iteration
2. Value Iteration
3. Generalized Policy Iteration
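
As an illustration of the second algorithm, here is a minimal, self-contained value iteration sketch in Python. It is not the project's implementation: the function names `backup` and `value_iteration` are hypothetical, and it assumes an even split of the 20% slip between +45 and -45 degrees, that bumping into an obstacle or the map border leaves the robot in place, and that the goal is a terminal state. Setting `p_slip=0.0` gives the deterministic model.

```python
ACTIONS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def backup(grid, goal, V, state, a, gamma=0.95, p_slip=0.2):
    """Expected return of commanding action index `a` in `state` under V."""
    total = 0.0
    # Commanded move plus its two 45-degree neighbours (assumed even split).
    for prob, k in ((1 - p_slip, a), (p_slip / 2, a + 1), (p_slip / 2, a - 1)):
        dr, dc = ACTIONS[k % 8]
        nr, nc = state[0] + dr, state[1] + dc
        inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
        if not inside or grid[nr][nc] == 1:
            total += prob * (-50.0 + gamma * V[state])    # bump: stay in place
        elif (nr, nc) == goal:
            total += prob * 100.0                          # goal is terminal
        else:
            total += prob * (-1.0 + gamma * V[(nr, nc)])
    return total


def value_iteration(grid, goal, gamma=0.95, p_slip=0.2, tol=1e-6):
    """Return (V, policy) as dicts keyed by free, non-goal (row, col) cells."""
    free = [(r, c) for r, row in enumerate(grid)
            for c, v in enumerate(row) if v == 0 and (r, c) != goal]
    V = {s: 0.0 for s in free}
    while True:
        delta = 0.0
        for s in free:  # in-place sweep over all states
            best = max(backup(grid, goal, V, s, a, gamma, p_slip) for a in range(8))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Greedy policy: the action index with the highest backed-up value.
    policy = {s: max(range(8), key=lambda a: backup(grid, goal, V, s, a, gamma, p_slip))
              for s in free}
    return V, policy
```

Policy iteration can reuse the same `backup` helper: it alternates evaluating a fixed policy until the values settle with making the policy greedy with respect to those values. Generalized policy iteration interleaves the two steps more loosely, for example by running only a few evaluation sweeps between improvements.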