A reinforcement learning approach for shortest pathfinding in dynamic environments, using Q-Learning, Double Q-Learning, Reward Shaping, and exploration-exploitation strategies.
Publication URL: Reinforcement Learning Model for Finding Optimal Path
This project implements and compares Q-Learning, Double Q-Learning, Dijkstra's Algorithm, and Random Selection for finding optimal paths in large grid environments. On 17×17 and 27×27 grids, Q-Learning produced the best path quality and learning behavior, though it was slower than Dijkstra in raw execution time.
✅ Reinforcement Learning-based pathfinding
✅ Q-Learning & Double Q-Learning algorithms
✅ Reward Shaping for improved learning
✅ Epsilon-greedy with decaying epsilon for exploration-exploitation balance (see the sketch after this list)
✅ Comparative analysis with Dijkstra's Algorithm & Random Selection
✅ Performance metrics: path length, completion time, cumulative rewards
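The decaying epsilon-greedy strategy listed above can be illustrated with a minimal tabular Q-Learning loop. This is only a sketch: the hyperparameters (`ALPHA`, `GAMMA`, the epsilon schedule), the stand-in reward matrix, and the step count are assumptions for illustration and are not taken from this repository or the paper.

```python
import numpy as np

# Minimal sketch: tabular Q-Learning with decaying epsilon-greedy exploration.
# All hyperparameters and the random reward matrix below are illustrative only.
N_STATES = 27 * 27                      # a 27x27 grid flattened to 729 states
ALPHA, GAMMA = 0.1, 0.9                 # assumed learning rate and discount factor
EPS_START, EPS_MIN, EPS_DECAY = 1.0, 0.01, 0.9999

rng = np.random.default_rng(0)
R = rng.integers(-1, 2, size=(N_STATES, N_STATES)).astype(float)  # stand-in rewards
# (a reward-shaping term would typically be folded into R here, e.g. a
#  potential-based bonus for moving closer to the goal)
Q = np.zeros((N_STATES, N_STATES))

epsilon, state = EPS_START, 0
for step in range(10_000):
    # explore with probability epsilon, otherwise exploit current Q-values
    if rng.random() < epsilon:
        action = int(rng.integers(N_STATES))
    else:
        action = int(np.argmax(Q[state]))

    next_state = action                  # here an "action" means moving to that state
    target = R[state, action] + GAMMA * np.max(Q[next_state])
    Q[state, action] += ALPHA * (target - Q[state, action])

    state = next_state
    epsilon = max(EPS_MIN, epsilon * EPS_DECAY)   # decay exploration over time
```

The decaying schedule makes early steps mostly exploratory and later steps mostly greedy, which is the exploration-exploitation balance referred to in the feature list.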
| Algorithm | Avg Path Length | Completion Time (s) | Cumulative Rewards |
|---|---|---|---|
| Q-Learning | Optimal | 0.00093 – 0.0097 | 2,197,570 → 205,823 |
| Dijkstra | Optimal | 0.000023 – 0.000034 | 6 → 1 |
| Random Select | Suboptimal | Variable | 132,449 → 11,901 |
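For context, the Dijkstra baseline in the table is the classic shortest-path algorithm. The following is a generic textbook implementation on a 2D cost grid with 4-connected moves; it is provided only to illustrate the kind of baseline being compared against and is not taken from this repository.

```python
import heapq

def dijkstra_grid(grid, start, goal):
    """Generic Dijkstra shortest path on a 2D cost grid (4-connected moves)."""
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + grid[nr][nc]          # cost of stepping onto the neighbour
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = node
                    heapq.heappush(heap, (nd, (nr, nc)))
    # reconstruct the path by walking back from the goal
    path, node = [goal], goal
    while node != start:
        node = prev[node]
        path.append(node)
    return path[::-1]

# Example on a small uniform-cost grid
print(dijkstra_grid([[1] * 5 for _ in range(5)], (0, 0), (4, 4)))
```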
```bash
git clone https://github.com/Vikhorz/optimal-path-qlearning.git
cd optimal-path-qlearning
```
```bash
pip install numpy pandas openpyxl
```
Note: openpyxl is required for reading Excel .xlsx files.
Make sure your reward-27x27.xlsx file is in the root directory (or adjust the path in the script).
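If you want to inspect the reward matrix yourself, it can be loaded with pandas (which reads .xlsx files via openpyxl). Whether the sheet has a header row, and whether it stores a 27×27 grid of cell rewards or a larger state-by-state matrix, is an assumption here; the snippet simply reads whatever numeric table the file contains.

```python
import pandas as pd  # pandas reads .xlsx files via openpyxl

# header=None assumes the sheet is a bare grid of numbers with no header row
rewards = pd.read_excel("reward-27x27.xlsx", header=None).to_numpy(dtype=float)
print(rewards.shape)   # e.g. (27, 27) if the sheet is a plain 27x27 grid
```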
```bash
python src/dl-ql.py
```
Replace src/dl-ql.py with the actual script filename if it differs in your copy of the repository.
- Loads the reward matrix from Excel (reward-27x27.xlsx)
- Implements Double Q-Learning on a 27×27 grid (a minimal illustrative sketch follows this list)
- Runs 3.2 million training iterations with epsilon-greedy exploration
- Prints the trained Q-matrix and finds the optimal path to a goal state
- Outputs cumulative rewards and the selected path
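The sketch below shows the core Double Q-Learning update over a square state-to-state reward matrix. It mirrors the steps above in spirit only: the hyperparameters, iteration count, goal handling, and greedy path extraction are illustrative assumptions, not code from src/dl-ql.py.

```python
import numpy as np

def double_q_learning(R, iterations=100_000, alpha=0.1, gamma=0.9,
                      eps_start=1.0, eps_min=0.01, eps_decay=0.9999, seed=0):
    """Illustrative Double Q-Learning over a square state-to-state reward matrix R."""
    n = R.shape[0]
    rng = np.random.default_rng(seed)
    QA = np.zeros((n, n))               # two independent value tables
    QB = np.zeros((n, n))
    eps = eps_start
    for _ in range(iterations):
        s = int(rng.integers(n))        # sample a starting state
        # epsilon-greedy selection on the combined estimate
        if rng.random() < eps:
            a = int(rng.integers(n))
        else:
            a = int(np.argmax(QA[s] + QB[s]))
        s2 = a                          # an "action" is a move to the chosen state
        # core Double Q-Learning step: one table chooses the best next action,
        # the other evaluates it
        if rng.random() < 0.5:
            best = int(np.argmax(QA[s2]))
            QA[s, a] += alpha * (R[s, a] + gamma * QB[s2, best] - QA[s, a])
        else:
            best = int(np.argmax(QB[s2]))
            QB[s, a] += alpha * (R[s, a] + gamma * QA[s2, best] - QB[s, a])
        eps = max(eps_min, eps * eps_decay)
    return QA + QB

def greedy_path(Q, start, goal, max_steps=200):
    """Follow the learned values greedily from start toward goal."""
    path, s = [start], start
    for _ in range(max_steps):
        s = int(np.argmax(Q[s]))
        path.append(s)
        if s == goal:
            break
    return path
```

Double Q-Learning decouples action selection from action evaluation across the two tables QA and QB, which reduces the overestimation bias of standard Q-Learning.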
If you use this work in your research, please cite:
Ismail, A. S., Mohammed, Z. A., Hussain, K. M., & Hassan, H. O. (2023). Finding optimal path using Q-learning and Double Q-learning: A comparative study. International Journal of Applied Mathematics and Computer Science.
```bibtex
@article{Ismail2023Qlearning,
  title={Finding Optimal Path Using Q-learning and Double Q-learning: A Comparative Study},
  author={Ismail, Aran Sirwan and Mohammed, Zhiar Ahmed and Hussain, Kozhir Mustafa and Hassan, Hiwa Omer},
  journal={International Journal of Applied Mathematics and Computer Science},
  year={2023}
}
```
This project is licensed under the MIT License. Feel free to use and modify it.
- Author: Aran Sirwan Ismail
- Email: [email protected]
- ResearchGate: Aran-Sirwan