دانلود مقاله ISI انگلیسی شماره 24787
ترجمه فارسی عنوان مقاله

برنامه ریزی پویا تصادفی با نمایندگی عامل

عنوان انگلیسی
Stochastic dynamic programming with factored representations
کد مقاله سال انتشار تعداد صفحات مقاله انگلیسی
24787 2014 59 صفحه PDF
منبع

Publisher : Elsevier - Science Direct (الزویر - ساینس دایرکت)

Journal : Artificial Intelligence, Volume 121, Issues 1–2, August 2000, Pages 49–107

ترجمه کلمات کلیدی
برنامه ریزی نظری تصمیم - پردازش های تصمیم گیری مارکوف - شبکه های بیزی - رگرسیون - درخت تصمیم گیری - انتزاع -
کلمات کلیدی انگلیسی
Decision-theoretic planning, Markov decision processes, Bayesian networks, Regression, Decision trees, Abstraction,
پیش نمایش مقاله
پیش نمایش مقاله  برنامه ریزی پویا تصادفی با نمایندگی عامل

چکیده انگلیسی

Markov decision processes (MDPs) have proven to be popular models for decision-theoretic planning, but standard dynamic programming algorithms for solving MDPs rely on explicit, state-based specifications and computations. To alleviate the combinatorial problems associated with such methods, we propose new representational and computational techniques for MDPs that exploit certain types of problem structure. We use dynamic Bayesian networks (with decision trees representing the local families of conditional probability distributions) to represent stochastic actions in an MDP, together with a decision-tree representation of rewards. Based on this representation, we develop versions of standard dynamic programming algorithms that directly manipulate decision-tree representations of policies and value functions. This generally obviates the need for state-by-state computation, aggregating states at the leaves of these trees and requiring computations only for each aggregate state. The key to these algorithms is a decision-theoretic generalization of classic regression analysis, in which we determine the features relevant to predicting expected value. We demonstrate the method empirically on several planning problems, showing significant savings for certain types of domains. We also identify certain classes of problems for which this technique fails to perform well and suggest extensions and related ideas that may prove useful in such circumstances. We also briefly describe an approximation scheme based on this approach.