Abstract: This project systematically demonstrates the hierarchical application of Markov processes to robot state analysis and decision optimization through a mobile robot navigation example. Starting from Markov chain modeling, it exposes the limitations of a fixed policy in a stochastic system, then upgrades the model to a Markov Decision Process (MDP) to solve for an optimal policy, aiming to provide theoretical and practical foundations for intelligent robot control under uncertainty.
This section first establishes a simplified mobile robot model navigating within a 3x3 grid environment. Its purpose is to analyze the robot's state transition patterns under the influence of both a fixed control policy and random movement noise.
Key Findings:
- Problem Modeling: Defined the 8 legal grid cells as system states and constructed an $8 \times 8$ probability transition matrix based on a fixed "preferential right-turn" policy and 80%/10%/10% movement noise (intended direction / slip left / slip right).
- Short-Term State Evolution: Simulated the robot's positional probability distribution at time steps k = 0, 5, 10, and 20 from a given initial position, visualizing the movement trend with heatmaps.
- Long-Term Stationary Distribution: Solved for the stationary distribution of the Markov chain, revealing that under the fixed policy the robot spends approximately 77% of its time "stuck" in the bottom-right corner ($S_8$) and cannot effectively explore the environment, highlighting the severe limitations of a fixed policy.
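The chain analysis above can be sketched in numpy. The grid layout (3x3 with the center cell blocked), the "always move right" fixed policy, and the stay-in-place rule at walls are illustrative assumptions standing in for the report's actual "preferential right-turn" policy; the method — building a row-stochastic matrix, propagating $\pi_0 P^k$, and extracting the stationary distribution — is the same.

```python
import numpy as np

# Assumed layout: 3x3 grid with the center cell blocked, giving 8 legal
# states (row-major, skipping the center). The exact grid and policy from
# the report are not reproduced; this illustrates the analysis method.
cells = [(r, c) for r in range(3) for c in range(3) if (r, c) != (1, 1)]
idx = {cell: i for i, cell in enumerate(cells)}
n = len(cells)  # 8 states

moves = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
# Hypothetical fixed policy: always intend "right"; 80% success,
# 10% slip to each perpendicular direction.
noise = {"right": 0.8, "up": 0.1, "down": 0.1}

def step(cell, move):
    """Apply a move; stay in place if it leaves the grid or hits the block."""
    r, c = cell[0] + moves[move][0], cell[1] + moves[move][1]
    return (r, c) if (r, c) in idx else cell

# Build the 8x8 row-stochastic transition matrix P.
P = np.zeros((n, n))
for cell in cells:
    for move, p in noise.items():
        P[idx[cell], idx[step(cell, move)]] += p
assert np.allclose(P.sum(axis=1), 1.0)

# Short-term evolution: the distribution after k steps is pi0 @ P^k.
pi0 = np.zeros(n)
pi0[0] = 1.0  # start in the top-left cell
for k in (0, 5, 10, 20):
    print(f"k={k}:", np.round(pi0 @ np.linalg.matrix_power(P, k), 3))

# Stationary distribution: left eigenvector of P for eigenvalue 1.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi = pi / pi.sum()
print("stationary:", np.round(pi, 3))
```

Under these assumed dynamics the probability mass drains into the right-hand column of the grid, mirroring (though not numerically reproducing) the report's finding that a fixed policy traps the robot in one region.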
To overcome the fixed policy deficiencies identified in the Markov chain analysis, this project further elevated the problem to a Markov Decision Process (MDP) model. By introducing an action space and reward function, the problem was transformed from passive system analysis into an active task of solving for an optimal policy.
Key Content & Discoveries:
- MDP Modeling: Defined the action space (Up, Down, Left, Right) and the reward function (high reward for reaching the target $S_3$, a small cost per movement, a penalty for collisions), with discount factor $\gamma = 0.9$.
- Goal-Oriented Policy: Solving the MDP with value iteration or policy iteration yields an optimal policy that guides the robot efficiently to the target position ($S_3$).
- Breaking Limitations: The MDP framework enables the learning of intelligent, goal-oriented behavior, resolving the "stuck" behavior observed in the Markov chain analysis and allowing far more effective exploration of the environment.
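A minimal value-iteration sketch for an MDP of this shape follows. The grid layout, the placement of the target $S_3$ (assumed here to be the top-right cell), and the specific reward numbers are illustrative assumptions, not the report's exact values; only the discount factor $\gamma = 0.9$ and the 80%/10%/10% slip model come from the text above.

```python
import numpy as np

# Assumed 3x3 grid with blocked center (8 states); rewards are illustrative.
cells = [(r, c) for r in range(3) for c in range(3) if (r, c) != (1, 1)]
idx = {cell: i for i, cell in enumerate(cells)}
n = len(cells)
actions = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
# Slip model: 0.8 intended direction, 0.1 to each perpendicular direction.
perp = {"up": ("left", "right"), "down": ("left", "right"),
        "left": ("up", "down"), "right": ("up", "down")}

goal = idx[(0, 2)]  # hypothetical: S3 = top-right cell
gamma = 0.9         # discount factor from the report

def step(cell, a):
    """Move; a wall/block hit leaves the robot in place with a bump penalty."""
    r, c = cell[0] + actions[a][0], cell[1] + actions[a][1]
    return ((r, c), 0.0) if (r, c) in idx else (cell, -1.0)

def reward(s_next, bump):
    # High goal reward, small per-move cost, collision penalty via `bump`.
    return 10.0 if s_next == goal else -0.1 + bump

# Per-action transition matrices T[a] and expected rewards R[a].
T = {a: np.zeros((n, n)) for a in actions}
R = {a: np.zeros(n) for a in actions}
for cell in cells:
    s = idx[cell]
    for a in actions:
        for move, p in [(a, 0.8), (perp[a][0], 0.1), (perp[a][1], 0.1)]:
            nxt, bump = step(cell, move)
            T[a][s, idx[nxt]] += p
            R[a][s] += p * reward(idx[nxt], bump)

# Value iteration: V <- max_a (R_a + gamma * T_a V) until convergence.
V = np.zeros(n)
for _ in range(500):
    Q = np.array([R[a] + gamma * T[a] @ V for a in actions])
    V_new = Q.max(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new
policy = [list(actions)[i] for i in Q.argmax(axis=0)]
print("V:", np.round(V, 2))
print("policy:", policy)
```

Unlike the fixed policy of the Markov chain analysis, the greedy policy extracted here assigns each state its own action, steering the robot toward the goal from anywhere in the grid.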
- Programming Language: Python
- Core Libraries: numpy (numerical computation and matrix operations), matplotlib (plotting), seaborn (heatmap visualization).
Through this project, I systematically constructed and simulated methods for modeling, analyzing, and optimizing mobile robot systems in uncertain environments. Markov chains provided a quantitative tool for understanding stochastic system behavior, while Markov Decision Processes offered a powerful theoretical framework for designing intelligent, goal-oriented behaviors.