Skip to content
karim.semaan(open to work)
WorkExperienceAboutSkillsContactResume ↓
← All work
Pac-Man AI preview
ML / Data ScienceCompleted2023

Pac-Man AI

search · minimax · Q-learning agents

Value iteration · γ 0.9 · noise 0.2 · sweep 0
0.000.000.001.00goal0.000.00-1.00trap0.00start0.000.000.00

Bellman backups propagate reward outward from the +1 goal; the arrows are the greedy policy under the current values, the same MDP solve the project runs.

Interactive Gridworld: watch value iteration and Q-learning converge in your browser.

Implements the agent logic for the UC Berkeley Pac-Man projects across three modules. Search: depth-first, breadth-first, uniform-cost and A* graph search, plus a CornersProblem state representation with admissible corners/food heuristics to route Pac-Man through mazes optimally. Adversarial: a reflex evaluation function, a recursive multi-agent minimax agent, an expectimax agent (ghosts modeled as uniform-random), and a hand-tuned evaluation weighing food/ghost distance, scared timers and remaining pellets. Reinforcement learning: a value-iteration agent that solves known MDPs by Bellman sweeps over Gridworld, a tabular Q-learning agent (ε-greedy with α/γ temporal-difference updates), and an approximate Q-learning agent using linear function approximation over feature vectors, plus discount/noise/living-reward analysis for target policies. The in-page demo brings the Gridworld value-iteration / Q-learning loop into the browser.

  • Python
  • Search (DFS/BFS/UCS/A*)
  • Minimax
  • Expectimax
  • MDPs
  • Value Iteration
  • Q-Learning
Search algorithms
4 (DFS/BFS/UCS/A*)
Agent code
1,549 LOC (5 core files)
Q-learning
ε .05 · γ .8 · α .2
Value iteration
γ .9 · 100 sweeps
Request access
Want something like this? Get in touch →
© 2026 Karim SemaanBuilt with Next.js, Tailwind & Supabase.LinkedIn ↗GitHub ↗