This paper studies a job search problem on a partially observable Markov chain, which can be considered as an extension of the job search in a dynamic economy in [1]. In this problem, the state changes according to a partially observable Markov chain; that is, the current state cannot be observed, but there exists some information regarding what the present state is. All information about the unobservable state is summarized by probability distributions on the state space, and we employ the Bayes' theorem as a learning procedure. The total positivity of order two, or simply TP2, is a fundamental property for investigating sequential decision problems, and it also plays an important role in the Bayesian learning procedure for a partially observable Markov process. By using this property, we consider some relationships among the prior and posterior information and the optimal policy. We also examine the probabilities of making a transition into each state after some additional transitions under the optimal policy. In the stock market, suppose that the states correspond to the business situation of a company; if there is a state designating default, then the problem is when to sell the stocks before bankruptcy, and the probability of becoming bankrupt is also obtained.
This paper studies a job search problem on a partially observable Markov chain and considers
the probability of making a transition into each state after some additional transitions under
the optimal policy. This is an optimal stopping problem and can be regarded as an extension
of the job search in a dynamic economy discussed by Lippman and McCall [1]. For instance,
in economics, suppose that the conditions of the economy are divided into several classes and
that these conditions deteriorate over time. Assume that these conditions are not directly
observable; that is, the decision maker cannot know which class the economy currently belongs
to, but has some information regarding the present class. When each state of the Markov
chain corresponds to a class of the economy, we suppose that the wage of a job offer is a
random variable depending on these classes. Differing from the case in [1], the state changes
according to a partially observable Markov chain. In the stock market, on the other hand,
these classes may correspond to the business situation of a company, whose conditions can be
estimated through the company's stock price. Since the stock price is observable, the problem
is when the stocks should be sold off. For a job search problem in which the current state is
observable, it is known that the maximization is achieved by classifying all possible job offers
into two mutually exclusive classes, and the wage that separates these two classes is called the
reservation wage. This is, however, not always true for the present problem, since the state of
the chain is unobservable to the decision maker.
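For the observable-state case, the reservation-wage characterization mentioned above can be illustrated by a small numerical sketch. The wage values, offer probabilities, search cost c, and discount factor beta below are hypothetical and chosen only for illustration; they are not taken from this paper.

```python
import numpy as np

# Hypothetical data (not from the paper): discrete wage offers and their
# probabilities, a per-period search payoff c, and a discount factor beta.
wages = np.array([1.0, 2.0, 3.0, 4.0])
probs = np.array([0.25, 0.25, 0.25, 0.25])
c = 0.5
beta = 0.9

# Accepting wage w yields w/(1-beta) (the wage forever, discounted);
# rejecting yields c + beta * E[V(w')]. Iterate the Bellman operator.
V = np.zeros_like(wages)
for _ in range(10000):
    V_new = np.maximum(wages / (1 - beta), c + beta * (probs @ V))
    if np.max(np.abs(V_new - V)) < 1e-12:
        V = V_new
        break
    V = V_new

# The offers split into two mutually exclusive classes: accept iff the
# value of accepting meets the value of rejecting. The smallest accepted
# wage is the reservation wage.
reject_value = c + beta * (probs @ V)
reservation_wage = wages[wages / (1 - beta) >= reject_value].min()
print(reservation_wage)
```

With these illustrative numbers, offers of 3.0 and 4.0 are accepted and the lower offers are rejected, so the two classes are indeed separated by a single wage.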
All information about the unobservable state is summarized by probability distributions on the
state space, and we employ the Bayes' theorem as a learning procedure. The total positivity of
order two, or simply TP2, is a fundamental property to investigate sequential decision problems,
and it also plays an important role in the Bayesian learning procedure for a partially observable
Markov process. By using this property, we consider some relationships among the prior and posterior
information, the optimal policy, and the probabilities of making a transition into each state.
Properties of TP2 for stochastic processes have also been investigated by Karlin and
McGregor [2], Karlin [3], Karlin and Rinott [4], and others.
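The Bayesian learning step can be sketched as follows. The two-state transition matrix P, the observation likelihoods L, and the prior mu below are hypothetical illustrative values: the posterior over the unobservable state is obtained by first predicting the next state with P and then reweighting by the likelihood of the observation.

```python
import numpy as np

# Hypothetical two-state example (not the paper's data): a TP2 transition
# matrix P, observation likelihoods L[s, o] = P(observe o | state s),
# and a prior mu over the unobservable state.
P = np.array([[0.8, 0.2],
              [0.3, 0.7]])
L = np.array([[0.7, 0.3],
              [0.2, 0.8]])
mu = np.array([0.5, 0.5])

def is_tp2(M):
    """A nonnegative matrix is TP2 iff every 2x2 minor is nonnegative."""
    m, n = M.shape
    return all(M[i, j] * M[i2, j2] - M[i, j2] * M[i2, j] >= 0
               for i in range(m) for i2 in range(i + 1, m)
               for j in range(n) for j2 in range(j + 1, n))

def bayes_update(mu, P, L, obs):
    """Posterior over the next state: predict with P, then apply Bayes'
    theorem with the likelihood of the observed signal."""
    predicted = mu @ P
    unnorm = predicted * L[:, obs]
    return unnorm / unnorm.sum()

post = bayes_update(mu, P, L, obs=0)
```

Since observation 0 is more likely in state 0, the posterior shifts belief toward state 0, which is the kind of monotone prior-to-posterior relationship the TP2 property supports.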
In order to study these probabilities when the Bayes' theorem is employed as a learning
procedure, we first reconsider the job search in a dynamic economy so as to compare its
properties with those of the problem on a partially observable Markov chain. In Section 2, we
summarize the properties of the job search problem when the state of the chain is directly
observable. It will be shown that the matrix of probabilities of making a transition into each
state after some additional transitions is TP2. In Section 3, we investigate these transition
probabilities when the state changes according to a partially observable Markov chain, and we
also examine the corresponding probabilities under the optimal policy of the job search
problem. Suppose that State i represents a class of the business situation of a company
(i ∈ {1, 2, ..., K}), and suppose in particular that State K designates default. Then the
problem is when to sell the stocks before bankruptcy, and the probability of becoming
bankrupt is also obtained from these considerations.
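The Section 2 claim that the n-step transition probabilities remain TP2 can be checked numerically: every 2x2 minor of a matrix product is, by the Cauchy-Binet formula, a nonnegative sum of products of 2x2 minors, so powers of a TP2 matrix stay TP2. The three-state matrix below is a hypothetical TP2 example, not data from the paper.

```python
import numpy as np

# Hypothetical TP2 one-step transition matrix (illustrative only).
P = np.array([[0.7, 0.2, 0.1],
              [0.2, 0.5, 0.3],
              [0.1, 0.3, 0.6]])

def is_tp2(M, tol=1e-12):
    """Every 2x2 minor is nonnegative (up to floating-point tolerance)."""
    m, n = M.shape
    return all(M[i, j] * M[i2, j2] - M[i, j2] * M[i2, j] >= -tol
               for i in range(m) for i2 in range(i + 1, m)
               for j in range(n) for j2 in range(j + 1, n))

# Check that P, P^2, ..., P^5 are all TP2.
Pn = np.eye(3)
results = []
for n in range(1, 6):
    Pn = Pn @ P
    results.append(is_tp2(Pn))
print(results)
```

The same check fails in general for the belief-state dynamics of the partially observable problem, which is the distinction the paper develops.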
In this paper, a job search problem on a partially observable Markov chain is considered, where
the current state is not known directly. This problem can be regarded as that of selling the
stocks of a company while watching the movement of the stock market. The business situation
of the company is usually not directly observable, and its conditions are estimated through the
company's stock price in the stock market. If there is a state designating default, then the
problem is when to sell the stocks before bankruptcy, and the probability of becoming
bankrupt is also obtained. When the current state is observable, this probability has the TP2
property, but this is no longer true when the current state is not known directly. As shown in
this paper, the difficulty of these problems comes from the incompleteness of the knowledge of
the current state. These problems can also be considered when the state changes according to a
partially observable Markov process or a diffusion process characterized by total positivity.
It is also possible to apply this model to sequential decision problems in which decisions must
be made before entering certain particular states.