I have read the paper on POMDP twice. Most of it is a very comprehensive paper for the lay person but hard to find other such reads on the subject online. I think in some way we are all kind of like the Mars Rover as the example thats given. We are all agents living in an environment that is kind of like a maze, has a lot of noise, and we have limited sensors to navigate our tasks to find the optimal reward. I think the model-based approach is probably the way we make decisions. But what do we do when there are millions of variables/observation noise in our environment? How are all of the variables gathered and how do you know what ones to use and what ones to discard? I would like to know how to make better decisions. Maybe POMDP’s are a way of helping make policy that helps real world scenarios in health care, or computer science.