Downloads & Free Reading Options - Results
Markov Decision Processes by Martin L. Puterman
Read "Markov Decision Processes" by Martin L. Puterman through these free online access and download options.
Books Results
Source: The Internet Archive
The Internet Archive Search Results
Books available for download and borrowing from The Internet Archive
1. A Conjugate Approach To Partially Observable Markov Decision Processes On Borel Spaces
By Yun Shen, Wilhelm Stannat and Klaus Obermayer
This paper presents a conjugate approach to solve the optimization problem of partially observable Markov decision processes (POMDPs) on Borel spaces with unbounded reward functions. We equip the belief state space of probability distributions with the Wasserstein metric and thereby construct a weighted norm for functions on belief states. Under the weighted norm, we prove the existence of an optimal solution to POMDPs with possibly unbounded reward functions on both sides by reducing them to canonical Markov decision processes (MDPs). Furthermore, we present a conjugate duality theorem of the Fenchel-Moreau type on the Wasserstein space of belief states. Applying the conjugate duality to POMDPs, we derive an iterative conjugate solution in the form of a convex and lower semicontinuous function by iterating the level sets, which is shown to be arbitrarily close to the optimal solution to POMDPs.
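For background on the reduction mentioned above, the classical belief-state Bellman equation, written here in its generic discounted, bounded-reward textbook form (not the paper's weighted-norm statement; $r$, $Y$, $\tau$, and $\beta$ are generic placeholders), reads
$$ V^*(b) \;=\; \sup_{a \in A} \Big[ \int_K r(x,a)\, b(dx) \;+\; \beta \sum_{y \in Y} \Pr(y \mid b,a)\, V^*\big(\tau(b,a,y)\big) \Big], $$
where $b \in \Delta(K)$ is a belief state and $\tau(b,a,y)$ is the Bayesian belief update after taking action $a$ and observing $y$.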
“A Conjugate Approach To Partially Observable Markov Decision Processes On Borel Spaces” Metadata:
- Title: ➤ A Conjugate Approach To Partially Observable Markov Decision Processes On Borel Spaces
- Authors: Yun Shen, Wilhelm Stannat, Klaus Obermayer
“A Conjugate Approach To Partially Observable Markov Decision Processes On Borel Spaces” Subjects and Themes:
- Subjects: Probability - Mathematics
Edition Identifiers:
- Internet Archive ID: arxiv-1603.02882
Downloads Information:
The book is available for download in "texts" format. The file size is 0.35 MB; it has been downloaded 15 times and was made publicly available on Fri Jun 29 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find A Conjugate Approach To Partially Observable Markov Decision Processes On Borel Spaces at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
2. Computing Monotone Policies For Markov Decision Processes: A Nearly-isotonic Penalty Approach
By Robert Mattila, Cristian R. Rojas, Vikram Krishnamurthy and Bo Wahlberg
This paper discusses algorithms for solving Markov decision processes (MDPs) that have monotone optimal policies. We propose a two-stage alternating convex optimization scheme that can accelerate the search for an optimal policy by exploiting the monotone property. The first stage is a linear program formulated in terms of the joint state-action probabilities. The second stage is a regularized problem formulated in terms of the conditional probabilities of actions given states. The regularization uses techniques from nearly-isotonic regression. While a variety of iterative methods can be used in the first formulation of the problem, we show in numerical simulations that, in particular, the alternating direction method of multipliers (ADMM) can be significantly accelerated using the regularization step.
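As an illustration of the regularization idea only (a hedged sketch, not the authors' two-stage scheme; all names and numbers below are made up), the nearly-isotonic penalty charges a cost for every decrease along the state ordering, so monotone action probabilities incur zero penalty:

import numpy as np

def nearly_isotonic_penalty(theta, lam=1.0):
    # theta is a hypothetical vector over ordered states, e.g. P(a = 1 | state);
    # the penalty is lam times the total positive *decrease* along the states,
    # so non-decreasing (monotone) sequences cost nothing.
    diffs = theta[:-1] - theta[1:]
    return lam * np.sum(np.maximum(diffs, 0.0))

monotone = np.array([0.1, 0.3, 0.8, 0.9])
non_monotone = np.array([0.1, 0.8, 0.3, 0.9])
print(nearly_isotonic_penalty(monotone))      # 0.0
print(nearly_isotonic_penalty(non_monotone))  # 0.5

Adding such a term to the second-stage objective nudges the conditional action probabilities toward a monotone (threshold-like) structure without enforcing it as a hard constraint.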
“Computing Monotone Policies For Markov Decision Processes: A Nearly-isotonic Penalty Approach” Metadata:
- Title: ➤ Computing Monotone Policies For Markov Decision Processes: A Nearly-isotonic Penalty Approach
- Authors: Robert Mattila, Cristian R. Rojas, Vikram Krishnamurthy, Bo Wahlberg
“Computing Monotone Policies For Markov Decision Processes: A Nearly-isotonic Penalty Approach” Subjects and Themes:
- Subjects: Systems and Control - Computing Research Repository
Edition Identifiers:
- Internet Archive ID: arxiv-1704.00621
Downloads Information:
The book is available for download in "texts" format. The file size is 0.21 MB; it has been downloaded 18 times and was made publicly available on Sat Jun 30 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Computing Monotone Policies For Markov Decision Processes: A Nearly-isotonic Penalty Approach at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
3. Sleeping Experts And Bandits Approach To Constrained Markov Decision Processes
By Hyeong Soo Chang
This brief paper presents simple simulation-based algorithms for obtaining an approximately optimal policy from a given finite policy set in large finite constrained Markov decision processes. The algorithms are adapted from playing strategies for the "sleeping experts and bandits" problem, and their computational complexities are independent of state and action space sizes if the given policy set is relatively small. We establish convergence of their expected performances to the value of an optimal policy, together with convergence rates, and also almost-sure convergence to an optimal policy at an exponential rate for the algorithm adapted within the context of sleeping experts.
“Sleeping Experts And Bandits Approach To Constrained Markov Decision Processes” Metadata:
- Title: ➤ Sleeping Experts And Bandits Approach To Constrained Markov Decision Processes
- Author: Hyeong Soo Chang
“Sleeping Experts And Bandits Approach To Constrained Markov Decision Processes” Subjects and Themes:
- Subjects: Mathematics - Optimization and Control
Edition Identifiers:
- Internet Archive ID: arxiv-1412.4898
Downloads Information:
The book is available for download in "texts" format. The file size is 0.12 MB; it has been downloaded 15 times and was made publicly available on Sat Jun 30 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Sleeping Experts And Bandits Approach To Constrained Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
4. A New Condition For The Existence Of Optimal Stationary Policies In Denumerable State Average Cost Continuous Time Markov Decision Processes With Unbounded Cost And Transition Rates
By Cao Ping and Xie Jingui
This paper presents a new condition for the existence of optimal stationary policies in average-cost continuous-time Markov decision processes with unbounded cost and transition rates, arising from controlled queueing systems. This condition is closely related to the stability of queueing systems. It suggests that the proof of stability can be exploited to verify the existence of an optimal stationary policy. This new condition is easier to verify than existing conditions. Moreover, several conditions are provided which suffice for the average-cost optimality equality to hold.
“A New Condition For The Existence Of Optimal Stationary Policies In Denumerable State Average Cost Continuous Time Markov Decision Processes With Unbounded Cost And Transition Rates” Metadata:
- Title: ➤ A New Condition For The Existence Of Optimal Stationary Policies In Denumerable State Average Cost Continuous Time Markov Decision Processes With Unbounded Cost And Transition Rates
- Authors: Cao Ping, Xie Jingui
- Language: English
“A New Condition For The Existence Of Optimal Stationary Policies In Denumerable State Average Cost Continuous Time Markov Decision Processes With Unbounded Cost And Transition Rates” Subjects and Themes:
- Subjects: Optimization and Control - Mathematics
Edition Identifiers:
- Internet Archive ID: arxiv-1504.05674
Downloads Information:
The book is available for download in "texts" format. The file size is 5.85 MB; it has been downloaded 22 times and was made publicly available on Wed Jun 27 2018.
Available formats:
Abbyy GZ - Archive BitTorrent - DjVuTXT - Djvu XML - JPEG Thumb - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find A New Condition For The Existence Of Optimal Stationary Policies In Denumerable State Average Cost Continuous Time Markov Decision Processes With Unbounded Cost And Transition Rates at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
5. Central-limit Approach To Risk-aware Markov Decision Processes
By Pengqian Yu, Jia Yuan Yu and Huan Xu
Whereas classical Markov decision processes maximize the expected reward, we consider minimizing the risk. We propose to evaluate the risk associated with a given policy over a long-enough time horizon with the help of a central limit theorem. The proposed approach works whether or not the transition probabilities are known. We also provide a gradient-based policy improvement algorithm that converges to a local optimum of the risk objective.
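A minimal sketch of the central-limit idea (illustrative only; the rollout callback and the Gaussian tail-risk measure below are assumptions, not the paper's algorithm):

import numpy as np
from math import erf, sqrt

def clt_risk_estimate(simulate_return, n_runs=2000, threshold=0.0, seed=0):
    # Approximate P(cumulative reward < threshold) for a fixed policy by fitting
    # a Gaussian to sampled long-horizon returns, as a central limit theorem suggests.
    rng = np.random.default_rng(seed)
    returns = np.array([simulate_return(rng) for _ in range(n_runs)])
    mu, sigma = returns.mean(), returns.std(ddof=1)
    z = (threshold - mu) / (sigma + 1e-12)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))   # standard normal CDF at the threshold

# Toy usage: a made-up 50-step rollout standing in for a policy simulation.
risk = clt_risk_estimate(lambda rng: rng.normal(10.0, 3.0, size=50).sum(), threshold=450.0)
print(round(risk, 3))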
“Central-limit Approach To Risk-aware Markov Decision Processes” Metadata:
- Title: ➤ Central-limit Approach To Risk-aware Markov Decision Processes
- Authors: Pengqian Yu, Jia Yuan Yu, Huan Xu
“Central-limit Approach To Risk-aware Markov Decision Processes” Subjects and Themes:
- Subjects: Systems and Control - Optimization and Control - Computing Research Repository - Mathematics
Edition Identifiers:
- Internet Archive ID: arxiv-1512.00583
Downloads Information:
The book is available for download in "texts" format. The file size is 0.35 MB; it has been downloaded 19 times and was made publicly available on Thu Jun 28 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Central-limit Approach To Risk-aware Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
6. Contextual Markov Decision Processes
By Assaf Hallak, Dotan Di Castro and Shie Mannor
We consider a planning problem where the dynamics and rewards of the environment depend on a hidden static parameter referred to as the context. The objective is to learn a strategy that maximizes the accumulated reward across all contexts. The new model, called the Contextual Markov Decision Process (CMDP), can model a customer's behavior when interacting with a website (the learner). The customer's behavior depends on gender, age, location, device, etc. Based on that behavior, the website's objective is to infer the customer's characteristics and to optimize the interaction accordingly. Our work focuses on one basic scenario: finite horizon with a small known number of possible contexts. We suggest a family of algorithms with provable guarantees that learn the underlying models and the latent contexts, and optimize the CMDPs. Bounds are obtained for specific naive implementations, and extensions of the framework are discussed, laying the ground for future research.
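A minimal data-structure sketch of this setting (hypothetical names and toy numbers; the paper's algorithms for learning the latent context are not reproduced here):

from dataclasses import dataclass
import numpy as np

@dataclass
class MDP:
    P: np.ndarray   # transitions, shape (num_actions, num_states, num_states)
    R: np.ndarray   # rewards, shape (num_states, num_actions)

num_s, num_a = 3, 2
rng = np.random.default_rng(0)

def random_mdp():
    P = rng.random((num_a, num_s, num_s))
    P /= P.sum(axis=2, keepdims=True)          # normalise each row into a distribution
    return MDP(P=P, R=rng.random((num_s, num_a)))

# A CMDP maps each hidden context (e.g. a customer type) to its own dynamics and rewards.
cmdp = {"context_A": random_mdp(), "context_B": random_mdp()}

# One environment step under a fixed but unobserved context:
context, s, a = "context_A", 0, 1
s_next = rng.choice(num_s, p=cmdp[context].P[a, s])
reward = cmdp[context].R[s, a]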
“Contextual Markov Decision Processes” Metadata:
- Title: ➤ Contextual Markov Decision Processes
- Authors: Assaf Hallak, Dotan Di Castro, Shie Mannor
- Language: English
“Contextual Markov Decision Processes” Subjects and Themes:
- Subjects: Machine Learning - Learning - Statistics - Computing Research Repository
Edition Identifiers:
- Internet Archive ID: arxiv-1502.02259
Downloads Information:
The book is available for download in "texts" format. The file size is 9.02 MB; it has been downloaded 82 times and was made publicly available on Tue Jun 26 2018.
Available formats:
Abbyy GZ - Archive BitTorrent - DjVuTXT - Djvu XML - JPEG Thumb - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Contextual Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
7. A Distance For Probability Spaces, And Long-term Values In Markov Decision Processes And Repeated Games
By Jérôme Renault and Xavier Venel
Given a finite set $K$, we denote by $X=\Delta(K)$ the set of probabilities on $K$ and by $Z=\Delta_f(X)$ the set of Borel probabilities on $X$ with finite support. Studying a Markov Decision Process with partial information on $K$ naturally leads to a Markov Decision Process with full information on $X$. We introduce a new metric $d_*$ on $Z$ such that the transitions become 1-Lipschitz from $(X, \|.\|_1)$ to $(Z,d_*)$. In the first part of the article, we define and prove several properties of the metric $d_*$. In particular, $d_*$ satisfies a Kantorovich-Rubinstein type duality formula and can be characterized by using disintegrations. In the second part, we characterize the limit values in several classes of "compact non-expansive" Markov Decision Processes. In particular, we use the metric $d_*$ to characterize the limit value in Partial Observation MDPs with finitely many states and in Repeated Games with an informed controller with finite sets of states and actions. Moreover, in each case we can prove the existence of a generalized notion of uniform value where we consider not only the Cesàro mean when the number of stages is large enough but any evaluation function $\theta \in \Delta(\mathbb{N}^*)$ when the impatience $I(\theta)=\sum_{t\geq 1} |\theta_{t+1}-\theta_t|$ is small enough.
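For orientation, the classical Kantorovich-Rubinstein duality for the Wasserstein-1 distance, of which the paper's duality formula for $d_*$ is an analogue (the formula below is the textbook version, not the paper's result for $d_*$), is
$$ W_1(\mu,\nu) \;=\; \sup_{f:\ \mathrm{Lip}(f)\le 1} \Big| \int f\, d\mu - \int f\, d\nu \Big|. $$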
“A Distance For Probability Spaces, And Long-term Values In Markov Decision Processes And Repeated Games” Metadata:
- Title: ➤ A Distance For Probability Spaces, And Long-term Values In Markov Decision Processes And Repeated Games
- Authors: Jérôme Renault, Xavier Venel
- Language: English
Edition Identifiers:
- Internet Archive ID: arxiv-1202.6259
Downloads Information:
The book is available for download in "texts" format. The file size is 21.84 MB; it has been downloaded 73 times and was made publicly available on Mon Sep 23 2013.
Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find A Distance For Probability Spaces, And Long-term Values In Markov Decision Processes And Repeated Games at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
8. Simple Regret Optimization In Online Planning For Markov Decision Processes
“Simple Regret Optimization In Online Planning For Markov Decision Processes” Metadata:
- Title: ➤ Simple Regret Optimization In Online Planning For Markov Decision Processes
Edition Identifiers:
- Internet Archive ID: arxiv-1206.3382
Downloads Information:
The book is available for download in "texts" format. The file size is 17.55 MB; it has been downloaded 63 times and was made publicly available on Fri Sep 20 2013.
Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Simple Regret Optimization In Online Planning For Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
9. A Unified Bellman Equation For Causal Information And Value In Markov Decision Processes
By Stas Tiomkin and Naftali Tishby
The interaction between an artificial agent and its environment is bi-directional. The agent extracts relevant information from the environment and, in return, affects the environment through its actions in order to accumulate high expected reward. Standard reinforcement learning (RL) deals with expected reward maximization. However, there are always information-theoretic limitations that restrict the expected reward, which are not properly considered by standard RL. In this work we consider RL objectives with information-theoretic limitations. For the first time we derive a Bellman-type recursive equation for the causal information between the environment and the agent, which is combined plausibly with the Bellman recursion for the value function. The unified equation serves to explore the typical behavior of artificial agents over an infinite time horizon.
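For reference, the value-function half of the recursion mentioned above is the standard finite-horizon Bellman equation (the paper's contribution is the analogous recursion for causal information, which is not reproduced here):
$$ V_t(x) \;=\; \max_{a} \Big[ r(x,a) + \sum_{x'} p(x' \mid x, a)\, V_{t+1}(x') \Big]. $$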
“A Unified Bellman Equation For Causal Information And Value In Markov Decision Processes” Metadata:
- Title: ➤ A Unified Bellman Equation For Causal Information And Value In Markov Decision Processes
- Authors: Stas Tiomkin, Naftali Tishby
“A Unified Bellman Equation For Causal Information And Value In Markov Decision Processes” Subjects and Themes:
- Subjects: Systems and Control - Computing Research Repository
Edition Identifiers:
- Internet Archive ID: arxiv-1703.01585
Downloads Information:
The book is available for download in "texts" format. The file size is 0.46 MB; it has been downloaded 16 times and was made publicly available on Sat Jun 30 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find A Unified Bellman Equation For Causal Information And Value In Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
10. DTIC ADA438506: Multi-time Scale Markov Decision Processes
By Defense Technical Information Center
This paper proposes a simple analytical model called the M time-scale Markov Decision Process (MMDP) for hierarchically structured sequential decision-making processes, where decisions at each level in the M-level hierarchy are made on M different time-scales. In this model, the state space and the control space of each level in the hierarchy are non-overlapping with those of the other levels, and the hierarchy is structured in a pyramid sense such that a decision made at a level m (slower time-scale) state affects the evolutionary decision-making process of the lower level m + 1 (faster time-scale) until a new decision is made at the higher level, but the lower-level decisions themselves do not affect the higher level's transition dynamics. The performance produced by the lower level's decisions will affect the higher level's decisions. A hierarchical objective function is defined such that the finite-horizon value of following a (nonstationary) policy at level m + 1 over a decision epoch of level m, plus an immediate reward at level m, is the single-step reward for the level-m decision-making process. From this, the authors define a multi-level optimal value function and derive a multi-level optimality equation. They then discuss how to solve MMDPs exactly or approximately and also examine heuristic online methods to solve MMDPs. Finally, they give some example control problems that can be modeled as MMDPs.
“DTIC ADA438506: Multi-time Scale Markov Decision Processes” Metadata:
- Title: ➤ DTIC ADA438506: Multi-time Scale Markov Decision Processes
- Author: ➤ Defense Technical Information Center
- Language: English
“DTIC ADA438506: Multi-time Scale Markov Decision Processes” Subjects and Themes:
- Subjects: ➤ DTIC Archive - Chang, Hyeong S - MARYLAND UNIV COLLEGE PARK INST FOR SYSTEMS RESEARCH - *TIME INTERVALS - *DECISION MAKING - *COMMUNICATIONS TRAFFIC - *MARKOV PROCESSES - *STOCHASTIC CONTROL - *HIERARCHIES - MATHEMATICAL MODELS - APPROXIMATION(MATHEMATICS) - COMPUTER NETWORKS - HEURISTIC METHODS - COMPUTER COMMUNICATIONS - OPTIMIZATION - QUEUEING THEORY
Edition Identifiers:
- Internet Archive ID: DTIC_ADA438506
Downloads Information:
The book is available for download in "texts" format. The file size is 24.99 MB; it has been downloaded 63 times and was made publicly available on Mon May 28 2018.
Available formats:
Abbyy GZ - Archive BitTorrent - DjVuTXT - Djvu XML - Item Tile - Metadata - OCR Page Index - OCR Search Text - Page Numbers JSON - Scandata - Single Page Processed JP2 ZIP - Text PDF - chOCR - hOCR -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find DTIC ADA438506: Multi-time Scale Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
11. A MOVING TARGET DEFENSE SCHEME WITH OVERHEAD OPTIMIZATION USING PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES WITH ABSORBING STATES
By McAbee, Ashley S.
Moving target defense (MTD) is a promising strategy for gaining advantage over cyber attackers, but these dynamic reconfigurations can impose significant overhead. We propose implementing MTD within an optimization framework so that we seize defensive advantage while minimizing overhead. This dissertation presents an MTD scheme that leverages partially observable Markov decision processes (POMDPs) with absorbing states to select the optimal defense based on partial observations of the cyber attack phase. In this way, overhead is minimized as reconfigurations are triggered only when the potential benefit outweighs the cost. We formulate and implement a POMDP within a system with Monte-Carlo planning-based decision making configured to reflect defender-defined priorities for the cost-benefit tradeoff. The proposed system also includes a performance-monitoring scheme for continuous validation of the model, critical given attackers' ever-changing techniques. We present simulation results that confirm the system fulfills the design goals, thwarting 99% of inbound attacks while sustaining system availability at greater than 94% even as the probability of attack phase detection dropped to 0.74. A comparable system that triggered MTD techniques pseudorandomly maintained just 43% availability when providing equivalent attack suppression, which illustrates the utility of our proposed scheme.
“A MOVING TARGET DEFENSE SCHEME WITH OVERHEAD OPTIMIZATION USING PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES WITH ABSORBING STATES” Metadata:
- Title: ➤ A MOVING TARGET DEFENSE SCHEME WITH OVERHEAD OPTIMIZATION USING PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES WITH ABSORBING STATES
- Author: McAbee, Ashley S.
- Language: English
“A MOVING TARGET DEFENSE SCHEME WITH OVERHEAD OPTIMIZATION USING PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES WITH ABSORBING STATES” Subjects and Themes:
- Subjects: Markov processes - cyber defense - moving target defense - decision making under uncertainty
Edition Identifiers:
- Internet Archive ID: amovingtargetdef1094566107
Downloads Information:
The book is available for download in "texts" format. The file size is 0.01 MB; it has been downloaded 7 times and was made publicly available on Sat Jan 30 2021.
Available formats:
Archive BitTorrent - Metadata -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find A MOVING TARGET DEFENSE SCHEME WITH OVERHEAD OPTIMIZATION USING PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES WITH ABSORBING STATES at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
12. Approximate Dynamic Programming With $(\min,+)$ Linear Function Approximation For Markov Decision Processes
By Chandrashekar Lakshminarayanan and Shalabh Bhatnagar
Markov decision processes (MDPs) are a useful framework for casting optimal sequential decision-making problems. Given any MDP, the aim is to find the optimal action selection mechanism, i.e., the optimal policy. Typically, the optimal policy ($u^*$) is obtained by substituting the optimal value-function ($J^*$) in the Bellman equation. Alternately, $u^*$ is also obtained by learning the optimal state-action value function $Q^*$, known as the $Q$ value-function. However, it is difficult to compute the exact values of $J^*$ or $Q^*$ for MDPs with a large number of states. Approximate Dynamic Programming (ADP) methods address this difficulty by computing lower-dimensional approximations of $J^*$/$Q^*$. Most ADP methods employ linear function approximation (LFA), i.e., the approximate solution lies in a subspace spanned by a family of pre-selected basis functions. The approximation is obtained via a linear least squares projection of higher-dimensional quantities, and the $L_2$ norm plays an important role in convergence and error analysis. In this paper, we discuss ADP methods for MDPs based on LFAs in the $(\min,+)$ algebra. Here the approximate solution is a $(\min,+)$ linear combination of a set of basis functions whose span constitutes a subsemimodule. The approximation is obtained via a projection operator onto the subsemimodule, which is different from the linear least squares projection used in ADP methods based on conventional LFAs. MDPs are not $(\min,+)$ linear systems; nevertheless, we show that the monotonicity property of the projection operator helps us to establish the convergence of our ADP schemes. We also discuss future directions in ADP methods for MDPs based on the $(\min,+)$ LFAs.
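A minimal sketch of the $(\min,+)$ linear combination that replaces the usual weighted sum of conventional LFA (toy numbers only, not the paper's ADP scheme):

import numpy as np

def min_plus_combination(Phi, w):
    # (min,+)-linear combination of basis functions: for each state s,
    # value(s) = min_j ( Phi[s, j] + w[j] ), instead of sum_j Phi[s, j] * w[j].
    return np.min(Phi + w[None, :], axis=1)

Phi = np.array([[0.0, 2.0],
                [1.0, 1.0],
                [3.0, 0.5],
                [2.0, 4.0]])   # 4 states, 2 basis functions (illustrative values)
w = np.array([1.0, 0.0])
print(min_plus_combination(Phi, w))   # [1.  1.  0.5 3. ]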
“Approximate Dynamic Programming With $(\min,+)$ Linear Function Approximation For Markov Decision Processes” Metadata:
- Title: ➤ Approximate Dynamic Programming With $(\min,+)$ Linear Function Approximation For Markov Decision Processes
- Authors: Chandrashekar Lakshminarayanan, Shalabh Bhatnagar
“Approximate Dynamic Programming With $(\min,+)$ Linear Function Approximation For Markov Decision Processes” Subjects and Themes:
- Subjects: Mathematics - Systems and Control - Computing Research Repository - Optimization and Control
Edition Identifiers:
- Internet Archive ID: arxiv-1403.4179
Downloads Information:
The book is available for download in "texts" format. The file size is 0.21 MB; it has been downloaded 19 times and was made publicly available on Sat Jun 30 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Approximate Dynamic Programming With $(\min,+)$ Linear Function Approximation For Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
13. Markov Decision Processes With Applications In Wireless Sensor Networks: A Survey
By Mohammad Abu Alsheikh, Dinh Thai Hoang, Dusit Niyato, Hwee-Pink Tan and Shaowei Lin
Wireless sensor networks (WSNs) consist of autonomous and resource-limited devices. The devices cooperate to monitor one or more physical phenomena within an area of interest. WSNs operate as stochastic systems because of randomness in the monitored environments. For long service time and low maintenance cost, WSNs require adaptive and robust methods to address data exchange, topology formulation, resource and power optimization, sensing coverage and object detection, and security challenges. In these problems, sensor nodes must make optimized decisions from a set of accessible strategies to achieve design goals. This survey reviews numerous applications of the Markov decision process (MDP) framework, a powerful decision-making tool for developing adaptive algorithms and protocols for WSNs. Furthermore, various solution methods are discussed and compared to serve as a guide for using MDPs in WSNs.
“Markov Decision Processes With Applications In Wireless Sensor Networks: A Survey” Metadata:
- Title: ➤ Markov Decision Processes With Applications In Wireless Sensor Networks: A Survey
- Authors: Mohammad Abu Alsheikh, Dinh Thai Hoang, Dusit Niyato, Hwee-Pink Tan, Shaowei Lin
- Language: English
Edition Identifiers:
- Internet Archive ID: arxiv-1501.00644
Downloads Information:
The book is available for download in "texts" format. The file size is 68.37 MB; it has been downloaded 33 times and was made publicly available on Mon Jun 25 2018.
Available formats:
Abbyy GZ - Archive BitTorrent - DjVuTXT - Djvu XML - JPEG Thumb - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Markov Decision Processes With Applications In Wireless Sensor Networks: A Survey at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
14. Decentralized Control Of Partially Observable Markov Decision Processes Using Belief Space Macro-actions
By Shayegan Omidshafiei, Ali-akbar Agha-mohammadi, Christopher Amato and Jonathan P. How
The focus of this paper is on solving multi-robot planning problems in continuous spaces with partial observability. Decentralized partially observable Markov decision processes (Dec-POMDPs) are general models for multi-robot coordination problems, but representing and solving Dec-POMDPs is often intractable for large problems. To allow for a high-level representation that is natural for multi-robot problems and scalable to large discrete and continuous problems, this paper extends the Dec-POMDP model to the decentralized partially observable semi-Markov decision process (Dec-POSMDP). The Dec-POSMDP formulation allows asynchronous decision-making by the robots, which is crucial in multi-robot domains. We also present an algorithm for solving this Dec-POSMDP which is much more scalable than previous methods since it can incorporate closed-loop belief space macro-actions in planning. These macro-actions are automatically constructed to produce robust solutions. The proposed method's performance is evaluated on a complex multi-robot package delivery problem under uncertainty, showing that our approach can naturally represent multi-robot problems and provide high-quality solutions for large-scale problems.
“Decentralized Control Of Partially Observable Markov Decision Processes Using Belief Space Macro-actions” Metadata:
- Title: ➤ Decentralized Control Of Partially Observable Markov Decision Processes Using Belief Space Macro-actions
- Authors: Shayegan Omidshafiei, Ali-akbar Agha-mohammadi, Christopher Amato, Jonathan P. How
- Language: English
Edition Identifiers:
- Internet Archive ID: arxiv-1502.06030
Downloads Information:
The book is available for download in "texts" format. The file size is 8.95 MB; it has been downloaded 42 times and was made publicly available on Tue Jun 26 2018.
Available formats:
Abbyy GZ - Archive BitTorrent - DjVuTXT - Djvu XML - JPEG Thumb - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Decentralized Control Of Partially Observable Markov Decision Processes Using Belief Space Macro-actions at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
15. Continuous-time Markov Decision Processes With Finite-horizon Expected Total Cost Criteria
By Qingda Wei and Xian Chen
This paper deals with the unconstrained and constrained cases for continuous-time Markov decision processes under the finite-horizon expected total cost criterion. The state space is denumerable and the transition and cost rates are allowed to be unbounded from above and from below. We give conditions for the existence of optimal policies in the class of all randomized history-dependent policies. For the unconstrained case, using the analogue of the forward Kolmogorov equation in the form of conditional expectation, we show that the finite-horizon optimal value function is the unique solution to the optimality equation and obtain the existence of an optimal deterministic Markov policy. For the constrained case, employing the technique of occupation measures, we first give an equivalent characterization of the occupation measures, and derive that for each occupation measure generated by a randomized history-dependent policy, there exists an occupation measure generated by a randomized Markov policy equal to it. Then using the compactness and convexity of the set of all occupation measures, we obtain the existence of a constrained-optimal randomized Markov policy. Moreover, the constrained optimization problem is reformulated as a linear program, and the strong duality between the linear program and its dual program is established. Finally, a controlled birth and death system is used to illustrate our main results.
“Continuous-time Markov Decision Processes With Finite-horizon Expected Total Cost Criteria” Metadata:
- Title: ➤ Continuous-time Markov Decision Processes With Finite-horizon Expected Total Cost Criteria
- Authors: Qingda Wei, Xian Chen
“Continuous-time Markov Decision Processes With Finite-horizon Expected Total Cost Criteria” Subjects and Themes:
- Subjects: Mathematics - Optimization and Control
Edition Identifiers:
- Internet Archive ID: arxiv-1408.5497
Downloads Information:
The book is available for download in "texts" format. The file size is 0.30 MB; it has been downloaded 18 times and was made publicly available on Sat Jun 30 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Continuous-time Markov Decision Processes With Finite-horizon Expected Total Cost Criteria at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
16. Smart Sampling For Lightweight Verification Of Markov Decision Processes
By Pedro D'Argenio, Axel Legay, Sean Sedwards and Louis-Marie Traonouez
Markov decision processes (MDP) are useful to model optimisation problems in concurrent systems. To verify MDPs with efficient Monte Carlo techniques requires that their nondeterminism be resolved by a scheduler. Recent work has introduced the elements of lightweight techniques to sample directly from scheduler space, but finding optimal schedulers by simple sampling may be inefficient. Here we describe "smart" sampling algorithms that can make substantial improvements in performance.
“Smart Sampling For Lightweight Verification Of Markov Decision Processes” Metadata:
- Title: ➤ Smart Sampling For Lightweight Verification Of Markov Decision Processes
- Authors: Pedro D'Argenio, Axel Legay, Sean Sedwards, Louis-Marie Traonouez
Edition Identifiers:
- Internet Archive ID: arxiv-1409.2116
Downloads Information:
The book is available for download in "texts" format. The file size is 0.46 MB; it has been downloaded 14 times and was made publicly available on Sat Jun 30 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Smart Sampling For Lightweight Verification Of Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
17. A Bayesian Approach For Learning And Planning In Partially Observable Markov Decision Processes
By Stéphane Ross, Joelle Pineau, Brahim Chaib-draa and Pierre Kreitmann
“A Bayesian Approach For Learning And Planning In Partially Observable Markov Decision Processes” Metadata:
- Title: ➤ A Bayesian Approach For Learning And Planning In Partially Observable Markov Decision Processes
- Authors: Stéphane Ross, Joelle Pineau, Brahim Chaib-draa, Pierre Kreitmann
Edition Identifiers:
- Internet Archive ID: ➤ academictorrents_55f4ffc91509ab0f716cb86c642585a25bfb93cd
Downloads Information:
The book is available for download in "data" format. The file size is 0.02 MB; it has been downloaded 29 times and was made publicly available on Tue Aug 11 2020.
Available formats:
Archive BitTorrent - BitTorrent - Metadata - Unknown -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find A Bayesian Approach For Learning And Planning In Partially Observable Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
18. An Analysis Of Primal-Dual Algorithms For Discounted Markov Decision Processes
By Randy Cogill
Several well-known algorithms in the field of combinatorial optimization can be interpreted in terms of the primal-dual method for solving linear programs. For example, Dijkstra's algorithm, the Ford-Fulkerson algorithm, and the Hungarian algorithm can all be viewed as the primal-dual method applied to the linear programming formulations of their respective optimization problems. Roughly speaking, successfully applying the primal-dual method to an optimization problem that can be posed as a linear program relies on the ability to find a simple characterization of the optimal solutions to a related linear program, called the "dual of the restricted primal" (DRP). This paper is motivated by the following question: What is the algorithm we obtain if we apply the primal-dual method to a linear programming formulation of a discounted cost Markov decision process? We will first show that several widely-used algorithms for Markov decision processes can be interpreted in terms of the primal-dual method, where the value function is updated with suboptimal solutions to the DRP in each iteration. We then provide the optimal solution to the DRP in closed-form, and present the algorithm that results when using this solution to update the value function in each iteration. Unlike the algorithms obtained from suboptimal DRP updates, this algorithm is guaranteed to yield the optimal value function in a finite number of iterations. Finally, we show that the iterations of the primal-dual algorithm can be interpreted as repeated application of the policy iteration algorithm to a special class of Markov decision processes. When considered alongside recent results characterizing the computational complexity of the policy iteration algorithm, this observation could provide new insights into the computational complexity of solving discounted-cost Markov decision processes.
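As background for the linear programming formulation referenced above, here is a hedged sketch of the standard primal LP for a discounted MDP solved with an off-the-shelf LP solver (this is the textbook LP, not the paper's primal-dual iteration; names and numbers are illustrative):

import numpy as np
from scipy.optimize import linprog

def solve_discounted_mdp_lp(P, R, gamma=0.9):
    # Minimise sum_s v(s) subject to v(s) >= R[s,a] + gamma * P[a,s,:] @ v for all (s,a);
    # the optimal solution of this LP is the optimal value function of the MDP.
    A, S, _ = P.shape
    rows, rhs = [], []
    for s in range(S):
        for a in range(A):
            row = gamma * P[a, s, :].copy()
            row[s] -= 1.0                     # (gamma * P - I) v <= -R
            rows.append(row)
            rhs.append(-R[s, a])
    res = linprog(c=np.ones(S), A_ub=np.array(rows), b_ub=np.array(rhs),
                  bounds=[(None, None)] * S, method="highs")
    return res.x

P = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.5, 0.5], [0.1, 0.9]]])      # shape (actions, states, states)
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])                    # shape (states, actions)
print(solve_discounted_mdp_lp(P, R))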
“An Analysis Of Primal-Dual Algorithms For Discounted Markov Decision Processes” Metadata:
- Title: ➤ An Analysis Of Primal-Dual Algorithms For Discounted Markov Decision Processes
- Author: Randy Cogill
“An Analysis Of Primal-Dual Algorithms For Discounted Markov Decision Processes” Subjects and Themes:
- Subjects: Optimization and Control - Mathematics
Edition Identifiers:
- Internet Archive ID: arxiv-1601.04175
Downloads Information:
The book is available for download in "texts" format. The file size is 0.19 MB; it has been downloaded 23 times and was made publicly available on Fri Jun 29 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find An Analysis Of Primal-Dual Algorithms For Discounted Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
19. Average-Cost Markov Decision Processes With Weakly Continuous Transition Probabilities
By Eugene A. Feinberg, Pavlo O. Kasyanov and Nina V. Zadoianchuk
This paper presents sufficient conditions for the existence of stationary optimal policies for average-cost Markov Decision Processes with Borel state and action sets and with weakly continuous transition probabilities. The one-step cost functions may be unbounded, and action sets may be noncompact. The main contributions of this paper are: (i) general sufficient conditions for the existence of stationary discount-optimal and average-cost optimal policies and descriptions of properties of value functions and sets of optimal actions, (ii) a sufficient condition for the average-cost optimality of a stationary policy in the form of optimality inequalities, and (iii) approximations of average-cost optimal actions by discount-optimal actions.
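The optimality inequality mentioned in contribution (ii) typically takes the following generic textbook form, with $\rho$ the optimal average cost, $h$ a relative value function, and $c$, $p$ the one-step cost and transition kernel (stated here only as an illustration, not as a verbatim restatement of the paper's conditions):
$$ \rho + h(x) \;\ge\; \inf_{a \in A(x)} \Big[ c(x,a) + \int h(y)\, p(dy \mid x, a) \Big] \qquad \text{for all } x, $$
and a stationary policy attaining the infimum on the right-hand side is then average-cost optimal.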
“Average-Cost Markov Decision Processes With Weakly Continuous Transition Probabilities” Metadata:
- Title: ➤ Average-Cost Markov Decision Processes With Weakly Continuous Transition Probabilities
- Authors: Eugene A. Feinberg, Pavlo O. Kasyanov, Nina V. Zadoianchuk
- Language: English
Edition Identifiers:
- Internet Archive ID: arxiv-1202.4122
Downloads Information:
The book is available for download in "texts" format. The file size is 12.92 MB; it has been downloaded 69 times and was made publicly available on Mon Sep 23 2013.
Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Average-Cost Markov Decision Processes With Weakly Continuous Transition Probabilities at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
20. One-Counter Markov Decision Processes
By Tomáš Brázdil, Václav Brožek, Kousha Etessami, Antonín Kučera and Dominik Wojtczak
We study the computational complexity of central analysis problems for One-Counter Markov Decision Processes (OC-MDPs), a class of finitely-presented, countable-state MDPs. OC-MDPs are equivalent to a controlled extension of (discrete-time) Quasi-Birth-Death processes (QBDs), a stochastic model studied heavily in queueing theory and applied probability. They can thus be viewed as a natural "adversarial" version of a classic stochastic model. Alternatively, they can also be viewed as a natural probabilistic/controlled extension of classic one-counter automata. OC-MDPs also subsume (as a very restricted special case) a recently studied MDP model called "solvency games" that model a risk-averse gambling scenario. Basic computational questions about these models include "termination" questions and "limit" questions, such as the following: does the controller have a "strategy" (or "policy") to ensure that the counter (which may for example count the number of jobs in the queue) will hit value 0 (the empty queue) almost surely (a.s.)? Or that it will have infinite limsup value, a.s.? Or, that it will hit value 0 in selected terminal states, a.s.? Or, in case these are not satisfied a.s., compute the maximum (supremum) such probability over all strategies. We provide new upper and lower bounds on the complexity of such problems. For some of them we present a polynomial-time algorithm, whereas for others we show PSPACE- or BH-hardness and give an EXPTIME upper bound. Our upper bounds combine techniques from the theory of MDP reward models, the theory of random walks, and a variety of automata-theoretic methods.
“One-Counter Markov Decision Processes” Metadata:
- Title: ➤ One-Counter Markov Decision Processes
- Authors: Tomáš Brázdil, Václav Brožek, Kousha Etessami, Antonín Kučera, Dominik Wojtczak
- Language: English
Edition Identifiers:
- Internet Archive ID: arxiv-0904.2511
Downloads Information:
The book is available for download in "texts" format. The file size is 28.34 MB; it has been downloaded 72 times and was made publicly available on Mon Sep 23 2013.
Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find One-Counter Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
21. Discounted Continuous-time Markov Decision Processes With Unbounded Rates: The Dynamic Programming Approach
By Alexey Piunovskiy and Yi Zhang
This paper deals with unconstrained discounted continuous-time Markov decision processes in Borel state and action spaces. Under some conditions imposed on the primitives, allowing unbounded transition rates and unbounded (from both above and below) cost rates, we show the regularity of the controlled process, which ensures the underlying models to be well defined. Then we develop the dynamic programming approach by showing that the Bellman equation is satisfied (by the optimal value). Finally, under some compactness-continuity conditions, we obtain the existence of a deterministic stationary optimal policy out of the class of randomized history-dependent policies.
“Discounted Continuous-time Markov Decision Processes With Unbounded Rates: The Dynamic Programming Approach” Metadata:
- Title: ➤ Discounted Continuous-time Markov Decision Processes With Unbounded Rates: The Dynamic Programming Approach
- Authors: Alexey Piunovskiy, Yi Zhang
- Language: English
Edition Identifiers:
- Internet Archive ID: arxiv-1103.0134
Downloads Information:
The book is available for download in "texts" format. The file size is 10.95 MB; it has been downloaded 89 times and was made publicly available on Sun Sep 22 2013.
Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Discounted Continuous-time Markov Decision Processes With Unbounded Rates: The Dynamic Programming Approach at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
22. Safe Exploration In Finite Markov Decision Processes With Gaussian Processes
By Matteo Turchetta, Felix Berkenkamp and Andreas Krause
In classical reinforcement learning, when exploring an environment, agents accept arbitrary short-term loss for long-term gain. This is infeasible for safety-critical applications, such as robotics, where even a single unsafe action may cause system failure. In this paper, we address the problem of safely exploring finite Markov decision processes (MDPs). We define safety in terms of an a priori unknown safety constraint that depends on states and actions. We aim to explore the MDP under this constraint, assuming that the unknown function satisfies regularity conditions expressed via a Gaussian process prior. We develop a novel algorithm for this task and prove that it is able to completely explore the safely reachable part of the MDP without violating the safety constraint. To achieve this, it cautiously explores safe states and actions in order to gain statistical confidence about the safety of unvisited state-action pairs from noisy observations collected while navigating the environment. Moreover, the algorithm explicitly considers reachability when exploring the MDP, ensuring that it does not get stuck in any state with no safe way out. We demonstrate our method on digital terrain models for the task of exploring an unknown map with a rover.
“Safe Exploration In Finite Markov Decision Processes With Gaussian Processes” Metadata:
- Title: ➤ Safe Exploration In Finite Markov Decision Processes With Gaussian Processes
- Authors: Matteo Turchetta, Felix Berkenkamp, Andreas Krause
“Safe Exploration In Finite Markov Decision Processes With Gaussian Processes” Subjects and Themes:
- Subjects: ➤ Machine Learning - Artificial Intelligence - Statistics - Learning - Computing Research Repository - Robotics
Edition Identifiers:
- Internet Archive ID: arxiv-1606.04753
Downloads Information:
The book is available for download in "texts" format. The file size is 0.61 MB; it has been downloaded 133 times and was made publicly available on Fri Jun 29 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Safe Exploration In Finite Markov Decision Processes With Gaussian Processes at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
23. Probabilistic Opacity For Markov Decision Processes
By Béatrice Bérard, Krishnendu Chatterjee and Nathalie Sznajder
Opacity is a generic security property that has been defined on (non-probabilistic) transition systems and later on Markov chains with labels. For a secret predicate, given as a subset of runs, and a function describing the view of an external observer, the value of interest for opacity is a measure of the set of runs disclosing the secret. We extend this definition to the richer framework of Markov decision processes, where nondeterministic choice is combined with probabilistic transitions, and we study related decidability problems under partial or complete observation hypotheses for the schedulers. We prove that all questions are decidable with complete observation and $\omega$-regular secrets. With partial observation, we prove that all quantitative questions are undecidable, but the question whether a system is almost surely non-opaque becomes decidable for a restricted class of $\omega$-regular secrets, as well as for all $\omega$-regular secrets under finite-memory schedulers.
“Probabilistic Opacity For Markov Decision Processes” Metadata:
- Title: ➤ Probabilistic Opacity For Markov Decision Processes
- Authors: Béatrice Bérard, Krishnendu Chatterjee, Nathalie Sznajder
Edition Identifiers:
- Internet Archive ID: arxiv-1407.4225
Downloads Information:
The book is available for download in "texts" format. The file size is 0.21 MB; it has been downloaded 18 times and was made publicly available on Sat Jun 30 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Probabilistic Opacity For Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
24. Mean-Variance Optimization In Markov Decision Processes
By Shie Mannor and John Tsitsiklis
We consider finite horizon Markov decision processes under performance measures that involve both the mean and the variance of the cumulative reward. We show that either randomized or history-based policies can improve performance. We prove that the complexity of computing a policy that maximizes the mean reward under a variance constraint is NP-hard for some cases, and strongly NP-hard for others. We finally offer pseudopolynomial exact and approximation algorithms.
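One representative member of the family of performance measures studied is the variance-constrained mean maximization problem, written here generically (the paper also treats other mean-variance combinations):
$$ \max_{\pi} \; \mathbb{E}^{\pi}\Big[\textstyle\sum_{t=1}^{T} r_t\Big] \quad \text{subject to} \quad \mathrm{Var}^{\pi}\Big[\textstyle\sum_{t=1}^{T} r_t\Big] \le c. $$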
“Mean-Variance Optimization In Markov Decision Processes” Metadata:
- Title: ➤ Mean-Variance Optimization In Markov Decision Processes
- Authors: Shie Mannor, John Tsitsiklis
- Language: English
Edition Identifiers:
- Internet Archive ID: arxiv-1104.5601
Downloads Information:
The book is available for download in "texts" format. The file size is 12.03 MB; it has been downloaded 65 times and was made publicly available on Sat Sep 21 2013.
Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Mean-Variance Optimization In Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
25. Speeding Up The Convergence Of Value Iteration In Partially Observable Markov Decision Processes
By N. L. Zhang and W. Zhang
Partially observable Markov decision processes (POMDPs) have recently become popular among many AI researchers because they serve as a natural model for planning under uncertainty. Value iteration is a well-known algorithm for finding optimal policies for POMDPs. It typically takes a large number of iterations to converge. This paper proposes a method for accelerating the convergence of value iteration. The method has been evaluated on an array of benchmark problems and was found to be very effective: It enabled value iteration to converge after only a few iterations on all the test problems.
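For contrast with the POMDP setting, standard value iteration on a fully observable finite MDP looks as follows (a hedged sketch meant only to show the iterative scheme whose convergence the paper accelerates; POMDP value iteration instead operates on the continuous belief space and is not reproduced here):

import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-6, max_iters=10_000):
    # P: (A, S, S) transition tensor, R: (S, A) reward matrix.
    A, S, _ = P.shape
    V = np.zeros(S)
    for _ in range(max_iters):
        Q = R + gamma * np.einsum("ast,t->sa", P, V)   # Q[s, a] = R[s, a] + gamma * E[V]
        V_new = Q.max(axis=1)                          # greedy Bellman backup
        if np.max(np.abs(V_new - V)) < tol:            # sup-norm stopping rule
            return V_new
        V = V_new
    return V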
“Speeding Up The Convergence Of Value Iteration In Partially Observable Markov Decision Processes” Metadata:
- Title: ➤ Speeding Up The Convergence Of Value Iteration In Partially Observable Markov Decision Processes
- Authors: N. L. Zhang, W. Zhang
- Language: English
Edition Identifiers:
- Internet Archive ID: arxiv-1106.0251
Downloads Information:
The book is available for download in "texts" format. The file size is 15.13 MB; it has been downloaded 59 times and was made publicly available on Sat Sep 21 2013.
Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Speeding Up The Convergence Of Value Iteration In Partially Observable Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle, and printed editions.
- eBay: New & used books.
26. Examples In Markov Decision Processes
By Piunovskiy, A. B
“Examples In Markov Decision Processes” Metadata:
- Title: ➤ Examples In Markov Decision Processes
- Author: Piunovskiy, A. B
- Language: English
Edition Identifiers:
- Internet Archive ID: examplesinmarkov0000piun
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 608.56 Mbs, the file-s for this book were downloaded 25 times, the file-s went public at Thu Jun 01 2023.
Available formats:
ACS Encrypted PDF - Cloth Cover Detection Log - DjVuTXT - Djvu XML - Dublin Core - Extra Metadata JSON - Item Tile - JPEG Thumb - JSON - LCP Encrypted EPUB - LCP Encrypted PDF - Log - MARC - MARC Binary - Metadata - Metadata Log - OCR Page Index - OCR Search Text - PNG - Page Numbers JSON - RePublisher Final Processing Log - RePublisher Initial Processing Log - Scandata - Single Page Original JP2 Tar - Single Page Processed JP2 ZIP - Text PDF - Title Page Detection Log - chOCR - hOCR -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Examples In Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
27Simulation-based Algorithms For Markov Decision Processes
By Chang, Hyeong Soo, author
“Simulation-based Algorithms For Markov Decision Processes” Metadata:
- Title: ➤ Simulation-based Algorithms For Markov Decision Processes
- Author: Chang, Hyeong Soo, author
- Language: English
“Simulation-based Algorithms For Markov Decision Processes” Subjects and Themes:
- Subjects: ➤ Decision making -- Mathematical models - Markov processes
Edition Identifiers:
- Internet Archive ID: simulationbaseda0000chan
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 687.19 Mbs, the file-s for this book were downloaded 14 times, the file-s went public at Tue May 30 2023.
Available formats:
ACS Encrypted PDF - Cloth Cover Detection Log - DjVuTXT - Djvu XML - Dublin Core - Extra Metadata JSON - Item Tile - JPEG Thumb - JSON - LCP Encrypted EPUB - LCP Encrypted PDF - Log - MARC - MARC Binary - Metadata - Metadata Log - OCR Page Index - OCR Search Text - PNG - Page Numbers JSON - RePublisher Final Processing Log - RePublisher Initial Processing Log - Scandata - Single Page Original JP2 Tar - Single Page Processed JP2 ZIP - Text PDF - Title Page Detection Log - chOCR - hOCR -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Simulation-based Algorithms For Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
28Multi-Objective Model Checking Of Markov Decision Processes
By Kousha Etessami, Marta Kwiatkowska, Moshe Y. Vardi and Mihalis Yannakakis
We study and provide efficient algorithms for multi-objective model checking problems for Markov Decision Processes (MDPs). Given an MDP, M, and given multiple linear-time ($\omega$-regular or LTL) properties $\varphi_i$, and probabilities $r_i \in [0,1]$, $i=1,\ldots,k$, we ask whether there exists a strategy $\sigma$ for the controller such that, for all $i$, the probability that a trajectory of M controlled by $\sigma$ satisfies $\varphi_i$ is at least $r_i$. We provide an algorithm that decides whether there exists such a strategy and if so produces it, and which runs in time polynomial in the size of the MDP. Such a strategy may require the use of both randomization and memory. We also consider more general multi-objective $\omega$-regular queries, which we motivate with an application to assume-guarantee compositional reasoning for probabilistic systems. Note that there can be trade-offs between different properties: satisfying property $\varphi_1$ with high probability may necessitate satisfying $\varphi_2$ with low probability. Viewing this as a multi-objective optimization problem, we want information about the "trade-off curve" or Pareto curve for maximizing the probabilities of different properties. We show that one can compute an approximate Pareto curve with respect to a set of $\omega$-regular properties in time polynomial in the size of the MDP. Our quantitative upper bounds use LP methods. We also study qualitative multi-objective model checking problems, and we show that these can be analysed by purely graph-theoretic methods, even though the strategies may still require both randomization and memory.
“Multi-Objective Model Checking Of Markov Decision Processes” Metadata:
- Title: ➤ Multi-Objective Model Checking Of Markov Decision Processes
- Authors: Kousha Etessami - Marta Kwiatkowska - Moshe Y. Vardi - Mihalis Yannakakis
- Language: English
Edition Identifiers:
- Internet Archive ID: arxiv-0810.5728
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 13.95 Mbs, the file-s for this book were downloaded 78 times, the file-s went public at Mon Sep 23 2013.
Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Multi-Objective Model Checking Of Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
29A Counterexample Guided Abstraction-Refinement Framework For Markov Decision Processes
By Rohit Chadha and Mahesh Viswanathan
The main challenge in using abstractions effectively is to construct a suitable abstraction for the system being verified. One approach that tries to address this problem is counterexample guided abstraction-refinement (CEGAR), wherein one starts with a coarse abstraction of the system, and progressively refines it, based on invalid counterexamples seen in prior model checking runs, until either an abstraction proves the correctness of the system or a valid counterexample is generated. While CEGAR has been successfully used in verifying non-probabilistic systems automatically, it has not been applied in the context of probabilistic systems. The main issues that need to be tackled in order to extend the approach to probabilistic systems are a suitable notion of "counterexample", algorithms to generate counterexamples, check their validity, and then automatically refine an abstraction based on an invalid counterexample. In this paper, we address these issues, and present a CEGAR framework for Markov Decision Processes.
“A Counterexample Guided Abstraction-Refinement Framework For Markov Decision Processes” Metadata:
- Title: ➤ A Counterexample Guided Abstraction-Refinement Framework For Markov Decision Processes
- Authors: Rohit Chadha - Mahesh Viswanathan
- Language: English
Edition Identifiers:
- Internet Archive ID: arxiv-0807.1173
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 25.31 Mbs, the file-s for this book were downloaded 75 times, the file-s went public at Tue Sep 17 2013.
Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find A Counterexample Guided Abstraction-Refinement Framework For Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
30Energy And Mean-Payoff Parity Markov Decision Processes
By Krishnendu Chatterjee and Laurent Doyen
We consider Markov Decision Processes (MDPs) with mean-payoff parity and energy parity objectives. In system design, the parity objective is used to encode $\omega$-regular specifications, and the mean-payoff and energy objectives can be used to model quantitative resource constraints. The energy condition requires that the resource level never drops below 0, and the mean-payoff condition requires that the limit-average value of the resource consumption is within a threshold. While these two (energy and mean-payoff) classical conditions are equivalent for two-player games, we show that they differ for MDPs. We show that the problem of deciding whether a state is almost-sure winning (i.e., winning with probability 1) in energy parity MDPs is in NP $\cap$ coNP, while for mean-payoff parity MDPs, the problem is solvable in polynomial time, improving a recent PSPACE bound.
“Energy And Mean-Payoff Parity Markov Decision Processes” Metadata:
- Title: ➤ Energy And Mean-Payoff Parity Markov Decision Processes
- Authors: Krishnendu Chatterjee - Laurent Doyen
- Language: English
Edition Identifiers:
- Internet Archive ID: arxiv-1104.2909
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 10.95 Mbs, the file-s for this book were downloaded 70 times, the file-s went public at Sat Sep 21 2013.
Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Energy And Mean-Payoff Parity Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
31DTIC AD1005567: An Evolutionary Random Policy Search Algorithm For Solving Markov Decision Processes
By Defense Technical Information Center
This paper presents a new randomized search method called Evolutionary Random Policy Search (ERPS) for solving infinite horizon discounted cost Markov Decision Process (MDP) problems. The algorithm is particularly targeted at problems with large or uncountable action spaces. ERPS approaches a given MDP by iteratively dividing it into a sequence of smaller, random, sub-MDP problems based on information obtained from random sampling of the entire action space and local search. Each sub-MDP is then solved approximately by using a variant of the standard policy improvement technique, where an elite policy is obtained. We show that the sequence of elite policies converges to an optimal policy with probability one. An adaptive version of the algorithm that improves the efficiency of the search process while maintaining the convergence properties of ERPS is also proposed. Some numerical studies are carried out to illustrate the algorithm and compare it with existing procedures.
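The following is a rough sketch in the spirit of the sub-MDP idea described above, and not the report's ERPS algorithm: each iteration samples a few actions per state, performs one policy-improvement step restricted to those sampled actions, and keeps the best ("elite") policy found so far. The MDP and all parameters are invented.
```python
import numpy as np

# Random policy search over sampled sub-MDPs (illustrative sketch only).
rng = np.random.default_rng(2)
n_states, n_actions, gamma = 5, 20, 0.9
k = 4  # number of actions sampled per state in each sub-MDP

P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

def evaluate(policy):
    """Exact policy evaluation: V = (I - gamma * P_pi)^(-1) R_pi."""
    P_pi = P[np.arange(n_states), policy]           # (S, S)
    R_pi = R[np.arange(n_states), policy]           # (S,)
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)

elite = rng.integers(0, n_actions, size=n_states)
for _ in range(50):
    # Sub-MDP: the elite action plus k randomly sampled actions in each state.
    candidates = np.concatenate(
        [elite[:, None], rng.integers(0, n_actions, size=(n_states, k))], axis=1
    )
    V = evaluate(elite)
    # One policy-improvement step restricted to the sampled actions.
    Q = R + gamma * P @ V                            # (S, A)
    improved = np.array(
        [candidates[s][np.argmax(Q[s, candidates[s]])] for s in range(n_states)]
    )
    if evaluate(improved).sum() >= V.sum():
        elite = improved

print("elite policy:", elite, "value:", evaluate(elite))
```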
“DTIC AD1005567: An Evolutionary Random Policy Search Algorithm For Solving Markov Decision Processes” Metadata:
- Title: ➤ DTIC AD1005567: An Evolutionary Random Policy Search Algorithm For Solving Markov Decision Processes
- Author: ➤ Defense Technical Information Center
- Language: English
“DTIC AD1005567: An Evolutionary Random Policy Search Algorithm For Solving Markov Decision Processes” Subjects and Themes:
- Subjects: ➤ DTIC Archive - Hu,Jiaqiao - University of Maryland College Park United States - genetic algorithms - Markov processes - optimization
Edition Identifiers:
- Internet Archive ID: DTIC_AD1005567
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 19.96 Mbs, the file-s for this book were downloaded 49 times, the file-s went public at Sun Jan 19 2020.
Available formats:
Abbyy GZ - Archive BitTorrent - DjVuTXT - Djvu XML - Item Tile - Metadata - OCR Page Index - OCR Search Text - Page Numbers JSON - Scandata - Single Page Processed JP2 ZIP - Text PDF - chOCR - hOCR -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find DTIC AD1005567: An Evolutionary Random Policy Search Algorithm For Solving Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
32Exact Finite Approximations Of Average-cost Countable Markov Decision Processes
By Arie Leizarowitz and Adam Shwartz
For a countable-state Markov decision process we introduce an embedding which produces a finite-state Markov decision process. The finite-state embedded process has the same optimal cost, and moreover, it has the same dynamics as the original process when restricting to the approximating set. The embedded process can be used as an approximation which, being finite, is more convenient for computation and implementation.
“Exact Finite Approximations Of Average-cost Countable Markov Decision Processes” Metadata:
- Title: ➤ Exact Finite Approximations Of Average-cost Countable Markov Decision Processes
- Authors: Arie Leizarowitz - Adam Shwartz
- Language: English
Edition Identifiers:
- Internet Archive ID: arxiv-0711.2185
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 8.21 Mbs, the file-s for this book were downloaded 128 times, the file-s went public at Tue Sep 17 2013.
Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Exact Finite Approximations Of Average-cost Countable Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
33Equilibrium In Misspecified Markov Decision Processes
By Ignacio Esponda and Demian Pouzo
We study Markov decision problems where the agent does not know the transition probability function mapping current states and actions to future states. The agent has a prior belief over a set of possible transition functions and updates beliefs using Bayes' rule. We allow her to be misspecified in the sense that the true transition probability function is not in the support of her prior. This problem is relevant in many economic settings but is usually not amenable to analysis by the researcher. We make the problem tractable by studying asymptotic behavior. We propose an equilibrium notion and provide conditions under which it characterizes steady state behavior. In the special case where the problem is static, equilibrium coincides with the single-agent version of Berk-Nash equilibrium (Esponda and Pouzo (2016)). We also discuss subtle issues that arise exclusively in dynamic settings due to the possibility of a negative value of experimentation.
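A minimal sketch of the belief-updating ingredient described above, not the paper's equilibrium analysis: the agent keeps a posterior over a finite set of candidate transition matrices and updates it with Bayes' rule after each observed transition. The candidate set need not contain the true matrix, matching the misspecified setting; all numbers are invented.
```python
import numpy as np

# Bayesian updating over a finite set of (possibly misspecified) transition models.
rng = np.random.default_rng(3)
n_states, n_actions = 3, 2

def random_model():
    return rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))

true_P = random_model()
models = [random_model() for _ in range(4)]   # candidate transition functions
belief = np.full(len(models), 1.0 / len(models))

s = 0
for t in range(500):
    a = rng.integers(0, n_actions)            # exploratory action choice
    s_next = rng.choice(n_states, p=true_P[s, a])
    # Bayes' rule: weight each model by the likelihood of the observed transition.
    likelihood = np.array([m[s, a, s_next] for m in models])
    belief = belief * likelihood
    belief /= belief.sum()
    s = s_next

print("posterior over candidate models:", np.round(belief, 3))
```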
“Equilibrium In Misspecified Markov Decision Processes” Metadata:
- Title: ➤ Equilibrium In Misspecified Markov Decision Processes
- Authors: Ignacio Esponda - Demian Pouzo
- Language: English
“Equilibrium In Misspecified Markov Decision Processes” Subjects and Themes:
- Subjects: Quantitative Finance - Economics
Edition Identifiers:
- Internet Archive ID: arxiv-1502.06901
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 25.40 Mbs, the file-s for this book were downloaded 38 times, the file-s went public at Tue Jun 26 2018.
Available formats:
Abbyy GZ - Archive BitTorrent - DjVuTXT - Djvu XML - JPEG Thumb - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Equilibrium In Misspecified Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
34Finite Dynamic Programming : An Approach To Finite Markov Decision Processes
By White, D. J. (Douglas John)
“Finite Dynamic Programming : An Approach To Finite Markov Decision Processes” Metadata:
- Title: ➤ Finite Dynamic Programming : An Approach To Finite Markov Decision Processes
- Author: White, D. J. (Douglas John)
- Language: English
“Finite Dynamic Programming : An Approach To Finite Markov Decision Processes” Subjects and Themes:
- Subjects: Dynamic programming - Decision making - Markov processes
Edition Identifiers:
- Internet Archive ID: finitedynamicpro0000whit
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 408.00 Mbs, the file-s for this book were downloaded 34 times, the file-s went public at Sun Oct 25 2020.
Available formats:
ACS Encrypted PDF - Cloth Cover Detection Log - DjVuTXT - Djvu XML - Dublin Core - EPUB - Item Tile - JSON - LCP Encrypted EPUB - LCP Encrypted PDF - Log - MARC - MARC Binary - Metadata - OCR Page Index - OCR Search Text - Page Numbers JSON - Scandata - Single Page Original JP2 Tar - Single Page Processed JP2 ZIP - Text PDF - Title Page Detection Log - chOCR - hOCR -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Finite Dynamic Programming : An Approach To Finite Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
35Proto-value Functions: A Laplacian Framework For Learning Representation And Control In Markov Decision Processes
By Sridhar Mahadevan and Mauro Maggioni
“Proto-value Functions: A Laplacian Framework For Learning Representation And Control In Markov Decision Processes” Metadata:
- Title: ➤ Proto-value Functions: A Laplacian Framework For Learning Representation And Control In Markov Decision Processes
- Authors: Sridhar Mahadevan - Mauro Maggioni
Edition Identifiers:
- Internet Archive ID: ➤ academictorrents_cca65715cb31b53876c27c0f4ddcd2be9ad7036a
Downloads Information:
The book is available for download in "data" format, the size of the file-s is: 0.02 Mbs, the file-s for this book were downloaded 26 times, the file-s went public at Tue Aug 11 2020.
Available formats:
Archive BitTorrent - BitTorrent - Metadata - Unknown -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Proto-value Functions: A Laplacian Framework For Learning Representation And Control In Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
36Thompson Sampling For Learning Parameterized Markov Decision Processes
By Aditya Gopalan and Shie Mannor
We consider reinforcement learning in parameterized Markov Decision Processes (MDPs), where the parameterization may induce correlation across transition probabilities or rewards. Consequently, observing a particular state transition might yield useful information about other, unobserved, parts of the MDP. We present a version of Thompson sampling for parameterized reinforcement learning problems, and derive a frequentist regret bound for priors over general parameter spaces. The result shows that the number of instants where suboptimal actions are chosen scales logarithmically with time, with high probability. It holds for prior distributions that put significant probability near the true model, without any additional, specific closed-form structure such as conjugate or product-form priors. The constant factor in the logarithmic scaling encodes the information complexity of learning the MDP in terms of the Kullback-Leibler geometry of the parameter space.
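A compact sketch of the Thompson-sampling loop described above, specialized to a finite set of candidate parameterizations (the paper treats general parameter spaces and proves regret bounds): each episode samples a model from the posterior, acts greedily with respect to it, and updates the posterior from observed transitions. All models and constants are invented.
```python
import numpy as np

# Thompson sampling with a finite set of candidate MDP models (illustration only).
rng = np.random.default_rng(4)
n_states, n_actions, gamma, episode_len = 4, 2, 0.9, 20

def random_mdp():
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))
    return P, R

true_P, true_R = random_mdp()
candidates = [random_mdp() for _ in range(5)]          # parameter hypotheses
posterior = np.full(len(candidates), 1.0 / len(candidates))

def solve(P, R, iters=200):
    """Greedy policy of a model via value iteration."""
    V = np.zeros(n_states)
    for _ in range(iters):
        V = (R + gamma * P @ V).max(axis=1)
    return (R + gamma * P @ V).argmax(axis=1)

s = 0
for episode in range(30):
    # 1. Sample a model from the posterior and act greedily w.r.t. it.
    P_hat, R_hat = candidates[rng.choice(len(candidates), p=posterior)]
    policy = solve(P_hat, R_hat)
    for _ in range(episode_len):
        a = policy[s]
        s_next = rng.choice(n_states, p=true_P[s, a])
        # 2. Bayesian update of the posterior from the observed transition.
        lik = np.array([P[s, a, s_next] for P, _ in candidates])
        posterior = posterior * lik / np.dot(posterior, lik)
        s = s_next

print("posterior over candidate models:", np.round(posterior, 3))
```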
“Thompson Sampling For Learning Parameterized Markov Decision Processes” Metadata:
- Title: ➤ Thompson Sampling For Learning Parameterized Markov Decision Processes
- Authors: Aditya Gopalan - Shie Mannor
“Thompson Sampling For Learning Parameterized Markov Decision Processes” Subjects and Themes:
- Subjects: Machine Learning - Computing Research Repository - Statistics - Learning
Edition Identifiers:
- Internet Archive ID: arxiv-1406.7498
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 0.47 Mbs, the file-s for this book were downloaded 19 times, the file-s went public at Sat Jun 30 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Thompson Sampling For Learning Parameterized Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
37Sufficient Markov Decision Processes With Alternating Deep Neural Networks
By Longshaokan Wang, Eric B. Laber and Katie Witkiewitz
Advances in mobile computing technologies have made it possible to monitor and apply data-driven interventions across complex systems in real time. Markov decision processes (MDPs) are the primary model for sequential decision problems with a large or indefinite time horizon. Choosing a representation of the underlying decision process that is both Markov and low-dimensional is non-trivial. We propose a method for constructing a low-dimensional representation of the original decision process for which: 1. the MDP model holds; 2. a decision strategy that maximizes mean utility when applied to the low-dimensional representation also maximizes mean utility when applied to the original process. We use a deep neural network to define a class of potential process representations and estimate the process of lowest dimension within this class. The method is illustrated using data from a mobile study on heavy drinking and smoking among college students.
“Sufficient Markov Decision Processes With Alternating Deep Neural Networks” Metadata:
- Title: ➤ Sufficient Markov Decision Processes With Alternating Deep Neural Networks
- Authors: Longshaokan Wang - Eric B. Laber - Katie Witkiewitz
“Sufficient Markov Decision Processes With Alternating Deep Neural Networks” Subjects and Themes:
- Subjects: Statistics Theory - Statistics - Machine Learning - Methodology - Mathematics
Edition Identifiers:
- Internet Archive ID: arxiv-1704.07531
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 0.50 Mbs, the file-s for this book were downloaded 16 times, the file-s went public at Sat Jun 30 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Sufficient Markov Decision Processes With Alternating Deep Neural Networks at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
38Elastic Resource Management With Adaptive State Space Partitioning Of Markov Decision Processes
By Konstantinos Lolos, Ioannis Konstantinou, Verena Kantere and Nectarios Koziris
Modern large-scale computing deployments consist of complex applications running over machine clusters. An important issue in such deployments is elasticity, i.e., the dynamic allocation of resources to applications to meet fluctuating workload demands. Threshold-based approaches are typically employed, yet they are difficult to configure and optimize. Approaches based on reinforcement learning have been proposed, but they require a large number of states in order to model complex application behavior. Methods that adaptively partition the state space have been proposed, but their partitioning criteria and strategies are sub-optimal. In this work we present MDP_DT, a novel full-model based reinforcement learning algorithm for elastic resource management that employs adaptive state space partitioning. We propose two novel statistical criteria and three strategies and we experimentally prove that they correctly decide both where and when to partition, outperforming existing approaches. We experimentally evaluate MDP_DT in a real large-scale cluster over variable, previously unseen workloads and we show that it makes more informed decisions compared to static and model-free approaches, while requiring a minimal amount of training data.
“Elastic Resource Management With Adaptive State Space Partitioning Of Markov Decision Processes” Metadata:
- Title: ➤ Elastic Resource Management With Adaptive State Space Partitioning Of Markov Decision Processes
- Authors: Konstantinos Lolos - Ioannis Konstantinou - Verena Kantere - Nectarios Koziris
Edition Identifiers:
- Internet Archive ID: arxiv-1702.02978
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 0.89 Mbs, the file-s for this book were downloaded 20 times, the file-s went public at Sat Jun 30 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Elastic Resource Management With Adaptive State Space Partitioning Of Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
39Optimizing The Expected Mean Payoff In Energy Markov Decision Processes
By Tomáš Brázdil, Antonín Kučera and Petr Novotný
Energy Markov Decision Processes (EMDPs) are finite-state Markov decision processes where each transition is assigned an integer counter update and a rational payoff. An EMDP configuration is a pair s(n), where s is a control state and n is the current counter value. The configurations are changed by performing transitions in the standard way. We consider the problem of computing a safe strategy (i.e., a strategy that keeps the counter non-negative) which maximizes the expected mean payoff.
“Optimizing The Expected Mean Payoff In Energy Markov Decision Processes” Metadata:
- Title: ➤ Optimizing The Expected Mean Payoff In Energy Markov Decision Processes
- Authors: Tomáš Brázdil - Antonín Kučera - Petr Novotný
Edition Identifiers:
- Internet Archive ID: arxiv-1607.00678
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 0.28 Mbs, the file-s for this book were downloaded 17 times, the file-s went public at Fri Jun 29 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Optimizing The Expected Mean Payoff In Energy Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
40A Learning Based Approach To Control Synthesis Of Markov Decision Processes For Linear Temporal Logic Specifications
By Dorsa Sadigh, Eric S. Kim, Samuel Coogan, S. Shankar Sastry and Sanjit A. Seshia
We propose to synthesize a control policy for a Markov decision process (MDP) such that the resulting traces of the MDP satisfy a linear temporal logic (LTL) property. We construct a product MDP that incorporates a deterministic Rabin automaton generated from the desired LTL property. The reward function of the product MDP is defined from the acceptance condition of the Rabin automaton. This construction allows us to apply techniques from learning theory to the problem of synthesis for LTL specifications even when the transition probabilities are not known a priori. We prove that our method is guaranteed to find a controller that satisfies the LTL property with probability one if such a policy exists, and we suggest empirically with a case study in traffic control that our method produces reasonable control strategies even when the LTL property cannot be satisfied with probability one.
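A toy sketch of the product construction described above: MDP states are paired with the states of a tiny deterministic automaton standing in for the Rabin automaton, a reward is granted when the automaton reaches its accepting state, and Q-learning is run on the product without knowing the transition probabilities. The labelling, the automaton, and all numbers are invented and much simpler than the paper's setting.
```python
import numpy as np

# Q-learning on a (toy) product of an MDP with a two-state automaton.
rng = np.random.default_rng(9)
n_states, n_actions, gamma = 4, 2, 0.95

P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
label = np.array([0, 0, 1, 0])            # atomic proposition holding in MDP state 2

def automaton_step(q, lab):
    """Move to (and stay in) the accepting state 1 once the proposition is seen."""
    return 1 if (q == 1 or lab == 1) else 0

Q = np.zeros((n_states, 2, n_actions))    # Q-values over product states (s, q)
s, q = 0, 0
for t in range(20000):
    a = rng.integers(0, n_actions) if rng.random() < 0.1 else int(Q[s, q].argmax())
    s2 = rng.choice(n_states, p=P[s, a])
    q2 = automaton_step(q, label[s2])
    r = 1.0 if (q == 0 and q2 == 1) else 0.0   # reward on entering the accepting state
    Q[s, q, a] += 0.1 * (r + gamma * Q[s2, q2].max() - Q[s, q, a])
    s, q = s2, q2

print("greedy product policy (rows: MDP states, cols: automaton states):")
print(Q.argmax(axis=2))
```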
“A Learning Based Approach To Control Synthesis Of Markov Decision Processes For Linear Temporal Logic Specifications” Metadata:
- Title: ➤ A Learning Based Approach To Control Synthesis Of Markov Decision Processes For Linear Temporal Logic Specifications
- Authors: Dorsa Sadigh - Eric S. Kim - Samuel Coogan - S. Shankar Sastry - Sanjit A. Seshia
“A Learning Based Approach To Control Synthesis Of Markov Decision Processes For Linear Temporal Logic Specifications” Subjects and Themes:
- Subjects: Systems and Control - Computing Research Repository
Edition Identifiers:
- Internet Archive ID: arxiv-1409.5486
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 0.47 Mbs, the file-s for this book were downloaded 15 times, the file-s went public at Sat Jun 30 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find A Learning Based Approach To Control Synthesis Of Markov Decision Processes For Linear Temporal Logic Specifications at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
41Feature Markov Decision Processes
By Marcus Hutter
General purpose intelligent learning agents cycle through (complex,non-MDP) sequences of observations, actions, and rewards. On the other hand, reinforcement learning is well-developed for small finite state Markov Decision Processes (MDPs). So far it is an art performed by human designers to extract the right state representation out of the bare observations, i.e. to reduce the agent setup to the MDP framework. Before we can think of mechanizing this search for suitable MDPs, we need a formal objective criterion. The main contribution of this article is to develop such a criterion. I also integrate the various parts into one learning algorithm. Extensions to more realistic dynamic Bayesian networks are developed in a companion article.
“Feature Markov Decision Processes” Metadata:
- Title: ➤ Feature Markov Decision Processes
- Author: Marcus Hutter
Edition Identifiers:
- Internet Archive ID: arxiv-0812.4580
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 6.61 Mbs, the file-s for this book were downloaded 84 times, the file-s went public at Sun Sep 22 2013.
Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - JPEG Thumb - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Feature Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
42Counterexample Explanation By Learning Small Strategies In Markov Decision Processes
By Tomáš Brázdil, Krishnendu Chatterjee, Martin Chmelík, Andreas Fellner and Jan Křetínský
While for deterministic systems, a counterexample to a property can simply be an error trace, counterexamples in probabilistic systems are necessarily more complex. For instance, a set of erroneous traces with a sufficient cumulative probability mass can be used. Since these are too large objects to understand and manipulate, compact representations such as subchains have been considered. In the case of probabilistic systems with non-determinism, the situation is even more complex. While a subchain for a given strategy (or scheduler, resolving non-determinism) is a straightforward choice, we take a different approach. Instead, we focus on the strategy - which can be a counterexample to violation of or a witness of satisfaction of a property - itself, and extract the most important decisions it makes, and present its succinct representation. The key tools we employ to achieve this are (1) introducing a concept of importance of a state w.r.t. the strategy, and (2) learning using decision trees. There are three main consequent advantages of our approach. Firstly, it exploits the quantitative information on states, stressing the more important decisions. Secondly, it leads to a greater variability and degree of freedom in representing the strategies. Thirdly, the representation uses a self-explanatory data structure. In summary, our approach produces more succinct and more explainable strategies, as opposed to e.g. binary decision diagrams. Finally, our experimental results show that we can extract several rules describing the strategy even for very large systems that do not fit in memory, and based on the rules explain the erroneous behaviour.
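A small sketch of the representation idea above, not the paper's full procedure: states are described by feature vectors, a strategy assigns an action and an importance weight to each state, and a shallow decision tree is fitted with those weights so that the most important decisions are captured by a few readable rules. Features, actions, and weights below are synthetic, and scikit-learn is assumed to be available.
```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Compress a strategy (state -> action) into a shallow decision tree,
# weighting each state's decision by its importance.
rng = np.random.default_rng(5)
n_states, n_features = 200, 3

X = rng.integers(0, 10, size=(n_states, n_features))     # state features
strategy = (X[:, 0] > 4).astype(int)                      # action chosen in each state
importance = rng.uniform(0.0, 1.0, size=n_states)         # weight of each state's decision

tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, strategy, sample_weight=importance)

# Print the learned rules as a human-readable, succinct strategy description.
print(export_text(tree, feature_names=[f"x{i}" for i in range(n_features)]))
```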
“Counterexample Explanation By Learning Small Strategies In Markov Decision Processes” Metadata:
- Title: ➤ Counterexample Explanation By Learning Small Strategies In Markov Decision Processes
- Authors: Tomáš Brázdil - Krishnendu Chatterjee - Martin Chmelík - Andreas Fellner - Jan Křetínský
- Language: English
Edition Identifiers:
- Internet Archive ID: arxiv-1502.02834
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 10.93 Mbs, the file-s for this book were downloaded 59 times, the file-s went public at Tue Jun 26 2018.
Available formats:
Abbyy GZ - Archive BitTorrent - DjVuTXT - Djvu XML - JPEG Thumb - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Counterexample Explanation By Learning Small Strategies In Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
43Symblicit Algorithms For Optimal Strategy Synthesis In Monotonic Markov Decision Processes
By Aaron Bohy, Véronique Bruyère and Jean-François Raskin
When treating Markov decision processes (MDPs) with large state spaces, using explicit representations quickly becomes unfeasible. Lately, Wimmer et al. have proposed a so-called symblicit algorithm for the synthesis of optimal strategies in MDPs, in the quantitative setting of expected mean-payoff. This algorithm, based on the strategy iteration algorithm of Howard and Veinott, efficiently combines symbolic and explicit data structures, and uses binary decision diagrams as symbolic representation. The aim of this paper is to show that the new data structure of pseudo-antichains (an extension of antichains) provides another interesting alternative, especially for the class of monotonic MDPs. We design efficient pseudo-antichain based symblicit algorithms (with open source implementations) for two quantitative settings: the expected mean-payoff and the stochastic shortest path. For two practical applications coming from automated planning and LTL synthesis, we report promising experimental results w.r.t. both the run time and the memory consumption.
“Symblicit Algorithms For Optimal Strategy Synthesis In Monotonic Markov Decision Processes” Metadata:
- Title: ➤ Symblicit Algorithms For Optimal Strategy Synthesis In Monotonic Markov Decision Processes
- Authors: Aaron Bohy - Véronique Bruyère - Jean-François Raskin
“Symblicit Algorithms For Optimal Strategy Synthesis In Monotonic Markov Decision Processes” Subjects and Themes:
- Subjects: Systems and Control - Logic in Computer Science - Computing Research Repository - Data Structures and Algorithms
Edition Identifiers:
- Internet Archive ID: arxiv-1407.5396
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 0.24 Mbs, the file-s for this book were downloaded 21 times, the file-s went public at Sat Jun 30 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Symblicit Algorithms For Optimal Strategy Synthesis In Monotonic Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
44On The Use Of Non-Stationary Policies For Stationary Infinite-Horizon Markov Decision Processes
By Bruno Scherrer and Boris Lesner
We consider infinite-horizon stationary $\gamma$-discounted Markov Decision Processes, for which it is known that there exists a stationary optimal policy. Using Value and Policy Iteration with some error $\epsilon$ at each iteration, it is well-known that one can compute stationary policies that are $\frac{2\gamma}{(1-\gamma)^2}\epsilon$-optimal. After arguing that this guarantee is tight, we develop variations of Value and Policy Iteration for computing non-stationary policies that can be up to $\frac{2\gamma}{1-\gamma}\epsilon$-optimal, which constitutes a significant improvement in the usual situation when $\gamma$ is close to 1. Surprisingly, this shows that the problem of "computing near-optimal non-stationary policies" is much simpler than that of "computing near-optimal stationary policies".
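A short sketch of the construction discussed above, with invented numbers: run approximate value iteration (with a small artificial error injected at each step), remember the greedy policies of the last m iterations, and act with them cyclically instead of committing to the final stationary policy.
```python
import numpy as np

# Build a periodic non-stationary policy from the last m iterates of
# approximate value iteration (toy MDP, invented error level).
rng = np.random.default_rng(6)
n_states, n_actions, gamma, m = 4, 3, 0.95, 5

P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

V, recent_policies = np.zeros(n_states), []
for k in range(100):
    Q = R + gamma * P @ V
    recent_policies.append(Q.argmax(axis=1))
    # Inject an evaluation error of size ~0.01 at every iteration.
    V = Q.max(axis=1) + rng.uniform(-0.01, 0.01, size=n_states)

def nonstationary_action(state, t):
    """Cycle through the last m greedy policies instead of using only the last one."""
    policy = recent_policies[-1 - (t % m)]
    return policy[state]

print([nonstationary_action(0, t) for t in range(10)])
```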
“On The Use Of Non-Stationary Policies For Stationary Infinite-Horizon Markov Decision Processes” Metadata:
- Title: ➤ On The Use Of Non-Stationary Policies For Stationary Infinite-Horizon Markov Decision Processes
- Authors: Bruno Scherrer - Boris Lesner
- Language: English
Edition Identifiers:
- Internet Archive ID: arxiv-1211.6898
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 5.16 Mbs, the file-s for this book were downloaded 70 times, the file-s went public at Wed Sep 18 2013.
Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find On The Use Of Non-Stationary Policies For Stationary Infinite-Horizon Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
45Permissive Supervisor Synthesis For Markov Decision Processes Through Learning
By Bo Wu, Xiaobin Zhang and Hai Lin
This paper considers permissive supervisor synthesis for probabilistic systems modeled as Markov Decision Processes (MDPs). Such systems are prevalent in power grids, transportation networks, communication networks and robotics. Unlike centralized planning and optimization-based planning, we propose a novel supervisor synthesis framework based on learning and compositional model checking to generate permissive local supervisors in a distributed manner. With recent advances in assume-guarantee reasoning verification for probabilistic systems, building the composed system can be avoided, alleviating the state space explosion, and our framework learns the supervisors iteratively based on the counterexamples from verification. Our approach is guaranteed to terminate in finitely many steps and to be correct.
“Permissive Supervisor Synthesis For Markov Decision Processes Through Learning” Metadata:
- Title: ➤ Permissive Supervisor Synthesis For Markov Decision Processes Through Learning
- Authors: Bo Wu - Xiaobin Zhang - Hai Lin
“Permissive Supervisor Synthesis For Markov Decision Processes Through Learning” Subjects and Themes:
- Subjects: ➤ Logic in Computer Science - Formal Languages and Automata Theory - Systems and Control - Computing Research Repository
Edition Identifiers:
- Internet Archive ID: arxiv-1703.07351
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 0.22 Mbs, the file-s for this book were downloaded 22 times, the file-s went public at Sat Jun 30 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Permissive Supervisor Synthesis For Markov Decision Processes Through Learning at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
46Combinations And Mixtures Of Optimal Policies In Unichain Markov Decision Processes Are Optimal
By Ronald Ortner
We show that combinations of optimal (stationary) policies in unichain Markov decision processes are optimal. That is, let M be a unichain Markov decision process with state space S, action space A and policies $\pi_j^*: S \to A$ ($1 \leq j \leq n$) with optimal average infinite horizon reward. Then any combination $\pi$ of these policies, where for each state $i$ in S there is a $j$ such that $\pi(i)=\pi_j^*(i)$, is optimal as well. Furthermore, we prove that any mixture of optimal policies, where at each visit in a state $i$ an arbitrary action $\pi_j^*(i)$ of an optimal policy is chosen, yields optimal average reward, too.
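A numerical illustration of the statement above, not a proof: in a small ergodic MDP, find the optimal average-reward deterministic policies by brute force, build a state-wise combination of two of them, and check that its average reward matches the optimal gain. In a randomly generated MDP the optimum is typically unique, in which case the check is trivially satisfied.
```python
import itertools
import numpy as np

# Average reward of state-wise combinations of optimal policies (toy MDP).
rng = np.random.default_rng(7)
n_states, n_actions = 3, 2
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

def average_reward(policy):
    P_pi = P[np.arange(n_states), list(policy)]
    R_pi = R[np.arange(n_states), list(policy)]
    # Stationary distribution of the (unichain) policy via power iteration.
    mu = np.full(n_states, 1.0 / n_states)
    for _ in range(2000):
        mu = mu @ P_pi
    return float(mu @ R_pi)

all_policies = list(itertools.product(range(n_actions), repeat=n_states))
gains = {pi: average_reward(pi) for pi in all_policies}
best = max(gains.values())
optimal = [pi for pi, g in gains.items() if np.isclose(g, best)]

# Combine two optimal policies state by state (they may coincide in this toy run).
pi1, pi2 = optimal[0], optimal[-1]
combined = tuple(pi1[s] if s % 2 == 0 else pi2[s] for s in range(n_states))
print("gain of combination:", average_reward(combined), "optimal gain:", best)
```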
“Combinations And Mixtures Of Optimal Policies In Unichain Markov Decision Processes Are Optimal” Metadata:
- Title: ➤ Combinations And Mixtures Of Optimal Policies In Unichain Markov Decision Processes Are Optimal
- Author: Ronald Ortner
- Language: English
Edition Identifiers:
- Internet Archive ID: arxiv-math0508319
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 4.68 Mbs, the file-s for this book were downloaded 90 times, the file-s went public at Sat Sep 21 2013.
Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Combinations And Mixtures Of Optimal Policies In Unichain Markov Decision Processes Are Optimal at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
47NASA Technical Reports Server (NTRS) 20000085879: Cooperation And Coordination Between Fuzzy Reinforcement Learning Agents In Continuous State Partially Observable Markov Decision Processes
By NASA Technical Reports Server (NTRS)
Successful operations of future multi-agent intelligent systems require efficient cooperation schemes between agents sharing learning experiences. We consider a pseudo-realistic world in which one or more opportunities appear and disappear in random locations. Agents use fuzzy reinforcement learning to learn which opportunities are most worthy of pursuing based on their promised rewards, expected lifetimes, path lengths and expected path costs. We show that this world is partially observable because the history of an agent influences the distribution of its future states. We consider a cooperation mechanism in which agents share experience by using and updating one joint behavior policy. We also implement a coordination mechanism for allocating opportunities to different agents in the same world. Our results demonstrate that K cooperative agents each learning in a separate world over N time steps outperform K independent agents each learning in a separate world over K*N time steps, with this result becoming more pronounced as the degree of partial observability in the environment increases. We also show that cooperation between agents learning in the same world decreases performance with respect to independent agents. Since cooperation reduces diversity between agents, we conclude that diversity is a key parameter in the trade-off between maximizing utility from cooperation when diversity is low and maximizing utility from competitive coordination when diversity is high.
“NASA Technical Reports Server (NTRS) 20000085879: Cooperation And Coordination Between Fuzzy Reinforcement Learning Agents In Continuous State Partially Observable Markov Decision Processes” Metadata:
- Title: ➤ NASA Technical Reports Server (NTRS) 20000085879: Cooperation And Coordination Between Fuzzy Reinforcement Learning Agents In Continuous State Partially Observable Markov Decision Processes
- Author: ➤ NASA Technical Reports Server (NTRS)
- Language: English
“NASA Technical Reports Server (NTRS) 20000085879: Cooperation And Coordination Between Fuzzy Reinforcement Learning Agents In Continuous State Partially Observable Markov Decision Processes” Subjects and Themes:
- Subjects: ➤ NASA Technical Reports Server (NTRS) - MARKOV PROCESSES - COORDINATION - SMART MATERIALS - COSTS - OBSERVATION - POLICIES - POSITION (LOCATION) - Berenji, Hamid R. - Vengerov, David
Edition Identifiers:
- Internet Archive ID: NASA_NTRS_Archive_20000085879
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 16.29 Mbs, the file-s for this book were downloaded 49 times, the file-s went public at Sun Oct 16 2016.
Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVuTXT - Djvu XML - Item Tile - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find NASA Technical Reports Server (NTRS) 20000085879: Cooperation And Coordination Between Fuzzy Reinforcement Learning Agents In Continuous State Partially Observable Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
48Hybrid Discrete-Continuous Markov Decision Processes
By Feng, Zhengzhu, Dearden, Richard, Meuleau, Nicholas and Washington, Ric
This paper proposes a Markov decision process (MDP) model that features both discrete and continuous state variables. We extend previous work by Boyan and Littman on the mono-dimensional time-dependent MDP to multiple dimensions. We present the principle of lazy discretization, and piecewise constant and linear approximations of the model. Having to deal with several continuous dimensions raises several new problems that require new solutions. In the (piecewise) linear case, we use techniques from partially observable MDPs (POMDPs) to represent value functions as sets of linear functions attached to different partitions of the state space.
“Hybrid Discrete-Continuous Markov Decision Processes” Metadata:
- Title: ➤ Hybrid Discrete-Continuous Markov Decision Processes
- Authors: Feng, Zhengzhu - Dearden, Richard - Meuleau, Nicholas - Washington, Ric
- Language: English
“Hybrid Discrete-Continuous Markov Decision Processes” Subjects and Themes:
- Subjects: ➤ APPROXIMATION - PARABOLIC FLIGHT - TEMPERATURE PROFILES - BLUNT BODIES - FILM COOLING - GAS INJECTION - STAGNATION POINT - LAMINAR HEAT TRANSFER - COOLANTS - VELOCITY DISTRIBUTION - FLIGHT CONDITIONS
Edition Identifiers:
- Internet Archive ID: nasa_techdoc_20040010791
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 4.02 Mbs, the file-s for this book were downloaded 232 times, the file-s went public at Thu Jun 02 2011.
Available formats:
Abbyy GZ - Animated GIF - Archive BitTorrent - DjVu - DjVuTXT - Djvu XML - JPEG Thumb - Metadata - Scandata - Single Page Processed JP2 ZIP - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Hybrid Discrete-Continuous Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
49Planning With Information-Processing Constraints And Model Uncertainty In Markov Decision Processes
By Jordi Grau-Moya, Felix Leibfried, Tim Genewein and Daniel A. Braun
Information-theoretic principles for learning and acting have been proposed to solve particular classes of Markov Decision Problems. Mathematically, such approaches are governed by a variational free energy principle and allow solving MDP planning problems with information-processing constraints expressed in terms of a Kullback-Leibler divergence with respect to a reference distribution. Here we consider a generalization of such MDP planners by taking model uncertainty into account. As model uncertainty can also be formalized as an information-processing constraint, we can derive a unified solution from a single generalized variational principle. We provide a generalized value iteration scheme together with a convergence proof. As limit cases, this generalized scheme includes standard value iteration with a known model, Bayesian MDP planning, and robust planning. We demonstrate the benefits of this approach in a grid world simulation.
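A brief sketch of one ingredient named above, KL-regularized ("free energy") value iteration against a uniform reference policy with inverse temperature beta. It only illustrates the flavour of the generalized scheme; the model-uncertainty part of the paper is not reproduced, and the MDP below is invented.
```python
import numpy as np

# Soft (KL-regularized) value iteration against a uniform reference policy.
rng = np.random.default_rng(8)
n_states, n_actions, gamma, beta = 4, 3, 0.9, 5.0

P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))
ref = np.full(n_actions, 1.0 / n_actions)          # reference (prior) policy

V = np.zeros(n_states)
for _ in range(500):
    Q = R + gamma * P @ V                          # (S, A)
    # Soft backup: V(s) = (1/beta) * log sum_a ref(a) * exp(beta * Q(s, a))
    V = (1.0 / beta) * np.log((ref * np.exp(beta * Q)).sum(axis=1))

# The optimal KL-constrained policy is a Boltzmann tilt of the reference policy.
policy = ref * np.exp(beta * Q)
policy /= policy.sum(axis=1, keepdims=True)
print("soft value function:", np.round(V, 3))
print("stochastic policy:\n", np.round(policy, 3))
```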
“Planning With Information-Processing Constraints And Model Uncertainty In Markov Decision Processes” Metadata:
- Title: ➤ Planning With Information-Processing Constraints And Model Uncertainty In Markov Decision Processes
- Authors: Jordi Grau-Moya - Felix Leibfried - Tim Genewein - Daniel A. Braun
Edition Identifiers:
- Internet Archive ID: arxiv-1604.02080
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 0.72 Mbs, the file-s for this book were downloaded 20 times, the file-s went public at Fri Jun 29 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Planning With Information-Processing Constraints And Model Uncertainty In Markov Decision Processes at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
50Ordinary Differential Equation Methods For Markov Decision Processes And Application To Kullback-Leibler Control Cost
By Ana Bušić and Sean Meyn
A new approach to computation of optimal policies for MDP (Markov decision process) models is introduced. The main idea is to solve not one, but an entire family of MDPs, parameterized by a weighting factor $\zeta$ that appears in the one-step reward function. For an MDP with $d$ states, the family of value functions $\{ h^*_\zeta : \zeta\in\Re\}$ is the solution to an ODE, $$ \frac{d}{d\zeta} h^*_\zeta = {\cal V}(h^*_\zeta) $$ where the vector field ${\cal V}\colon\Re^d\to\Re^d$ has a simple form, based on a matrix inverse. This general methodology is applied to a family of average-cost optimal control models in which the one-step reward function is defined by Kullback-Leibler divergence. The motivation for this reward function in prior work is computation: The solution to the MDP can be expressed in terms of the Perron-Frobenius eigenvector for an associated positive matrix. The drawback with this approach is that no hard constraints on the control are permitted. It is shown here that it is possible to extend this framework to model randomness from nature that cannot be modified by the controller. Perron-Frobenius theory is no longer applicable -- the resulting dynamic programming equations appear as complex as a completely unstructured MDP model. Despite this apparent complexity, it is shown that this class of MDPs admits a solution via this new ODE technique. This approach is new and practical even for the simpler problem in which randomness from nature is absent.
“Ordinary Differential Equation Methods For Markov Decision Processes And Application To Kullback-Leibler Control Cost” Metadata:
- Title: ➤ Ordinary Differential Equation Methods For Markov Decision Processes And Application To Kullback-Leibler Control Cost
- Authors: Ana Bušić - Sean Meyn
“Ordinary Differential Equation Methods For Markov Decision Processes And Application To Kullback-Leibler Control Cost” Subjects and Themes:
- Subjects: Optimization and Control - Systems and Control - Computing Research Repository - Mathematics
Edition Identifiers:
- Internet Archive ID: arxiv-1605.04591
Downloads Information:
The book is available for download in "texts" format, the size of the file-s is: 0.55 Mbs, the file-s for this book were downloaded 22 times, the file-s went public at Fri Jun 29 2018.
Available formats:
Archive BitTorrent - Metadata - Text PDF -
Related Links:
- Whefi.com: Download
- Whefi.com: Review - Coverage
- Internet Archive: Details
- Internet Archive Link: Downloads
Online Marketplaces
Find Ordinary Differential Equation Methods For Markov Decision Processes And Application To Kullback-Leibler Control Cost at online marketplaces:
- Amazon: Audible, Kindle and printed editions.
- Ebay: New & used books.
Buy “Markov Decision Processes” online:
Shop for “Markov Decision Processes” on popular online marketplaces.
- Ebay: New and used books.