Necessary and Sufficient Condition for Control Limit Policy for Partially Observable Markov Decision Process

  • JIN Lu
    Graduate School of Informatics and Engineering, The University of Electro-Communications
  • SUZUKI Kazuyuki
    Graduate School of Informatics and Engineering, The University of Electro-Communications
  • KUMAGAI Kazuhiro
    Graduate School of Informatics and Engineering, The University of Electro-Communications

Bibliographic Information

Other Title
  • 部分的に観測可能なマルコフ決定過程における Control Limit Policy の最適性への必要十分条件

Abstract

Recent advances in information technology have made it possible to monitor the condition of products online anywhere in the world. This enables manufacturers to provide appropriate and effective maintenance service to each customer by using information specific to each product, such as degradation and usage. Investigation of the optimal maintenance policy has thus become important. This research investigated the optimal maintenance policy for a multi-state deteriorating system under the assumption that the system is monitored incompletely by a monitor that gives information related to the true state of the system. The problem was formulated as a partially observable Markov decision process (POMDP). Derman investigated the optimal maintenance problem for a system with complete observations and showed that the stochastic increasing (SI) property of the state transition probability matrix is a sufficient condition for the optimal policy to be a control limit policy. Complete observation means that the monitor shows the true internal state of the system; that is, the conditional probability matrix, which describes the relationship between the monitor's output and the internal state of the system, is an identity matrix. This is a special case of the POMDP model. For the optimal maintenance problem with incomplete observations and a transition probability matrix that has the SI property, the investigation showed that the optimal procedure is given by a control limit policy if and only if the conditional probability matrix is (i) an identity matrix or (ii) a matrix in which the probability of each monitoring output is the same whatever the true state. This research brings to a close over half a century of research on ways to extend the optimal control limit policy on the basis of SI ordering of the state probability vectors.
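The two matrix conditions named in the abstract can be checked numerically. The following is a minimal Python sketch, not the authors' method: it assumes the usual definition of the SI property (for every threshold k, the tail sum of row i over states j >= k is nondecreasing in i), and it assumes the conditional probability matrix Q is laid out with Q[i, m] = Pr(monitor output m | true state i). The function names and the example matrices are hypothetical and chosen only for illustration.

```python
import numpy as np


def is_stochastically_increasing(P, tol=1e-9):
    """Check the SI property under the assumed tail-sum definition.

    P[i, j] is the probability of moving from state i to state j.
    The matrix is SI when, for every k, sum_{j >= k} P[i, j] does not
    decrease as the starting state i increases.
    """
    P = np.asarray(P, dtype=float)
    # tail[i, k] = sum of row i over columns k, k+1, ..., n-1
    tail = np.cumsum(P[:, ::-1], axis=1)[:, ::-1]
    # Differences between consecutive rows must be nonnegative.
    return bool(np.all(np.diff(tail, axis=0) >= -tol))


def observation_case(Q, tol=1e-9):
    """Classify the conditional probability matrix Q.

    Returns "identity" for case (i) (complete observation),
    "uninformative" for case (ii) (every row is the same, so each
    output is equally likely whatever the true state), else "other".
    """
    Q = np.asarray(Q, dtype=float)
    n_rows, n_cols = Q.shape
    if n_rows == n_cols and np.allclose(Q, np.eye(n_rows), atol=tol):
        return "identity"
    if np.allclose(Q, Q[0], atol=tol):
        return "uninformative"
    return "other"


# Hypothetical three-state deterioration matrix (state 2 is absorbing).
P_example = [[0.7, 0.2, 0.1],
             [0.0, 0.6, 0.4],
             [0.0, 0.0, 1.0]]

print(is_stochastically_increasing(P_example))
print(observation_case(np.eye(3)))
print(observation_case([[0.5, 0.5], [0.5, 0.5]]))
print(observation_case([[0.9, 0.1], [0.2, 0.8]]))
```

Under the result stated in the abstract, a model whose transition matrix passes the first check and whose conditional probability matrix falls in the "identity" or "uninformative" case is one for which the optimal policy is a control limit policy; the "other" case is where that structure is not guaranteed. The sketch only tests the matrix conditions and does not solve the POMDP.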
