05398nam 2200661Ia 450 991087661860332120200520144314.01-283-37409-997866133740970-470-98004-40-470-98003-6(CKB)3400000000000317(EBL)661847(SSID)ssj0000477769(PQKBManifestationID)11296731(PQKBTitleCode)TC0000477769(PQKBWorkID)10513471(PQKB)11586395(MiAaPQ)EBC661847(OCoLC)705354489(EXLCZ)99340000000000031720101103d2011 uy 0engur|n|---|||||txtccrMulti-armed bandit allocation indices /John Gittins, Kevin Glazebrook, Richard Weber2nd ed.Chichester Wiley20111 online resource (311 p.)Description based upon print version of record.0-470-67002-9 Includes bibliographical references and index.Multi-armed Bandit Allocation Indices; Contents; Foreword; Foreword to the first edition; Preface; Preface to the first edition; 1 Introduction or exploration; Exercises; 2 Main ideas: Gittins index; 2.1 Introduction; 2.2 Decision processes; 2.3 Simple families of alternative bandit processes; 2.4 Dynamic programming; 2.5 Gittins index theorem; 2.6 Gittins index; 2.6.1 Gittins index and the multi-armed bandit; 2.6.2 Coins problem; 2.6.3 Characterization of the optimal stopping time; 2.6.4 The restart-in-state formulation; 2.6.5 Dependence on discount factor2.6.6 Myopic and forwards induction policies2.7 Proof of the index theorem by interchanging bandit portions; 2.8 Continuous-time bandit processes; 2.9 Proof of the index theorem by induction and interchange argument; 2.10 Calculation of Gittins indices; 2.11 Monotonicity conditions; 2.11.1 Monotone indices; 2.11.2 Monotone jobs; 2.12 History of the index theorem; 2.13 Some decision process theory; Exercises; 3 Necessary assumptions for indices; 3.1 Introduction; 3.2 Jobs; 3.3 Continuous-time jobs; 3.3.1 Definition; 3.3.2 Policies for continuous-time jobs3.3.3 The continuous-time index theorem for a SFABP of jobs3.4 Necessary assumptions; 3.4.1 Necessity of an infinite time horizon; 3.4.2 Necessity of constant exponential discounting; 3.4.3 Necessity of a single processor; 3.5 Beyond the necessary assumptions; 3.5.1 Bandit-dependent discount factors; 3.5.2 Stochastic discounting; 3.5.3 Undiscounted rewards; 3.5.4 A discrete search problem; 3.5.5 Multiple processors; Exercises; 4 Superprocesses, precedence constraints and arrivals; 4.1 Introduction; 4.2 Bandit superprocesses; 4.3 The index theorem for superprocesses4.4 Stoppable bandit processes4.5 Proof of the index theorem by freezing and promotion rules; 4.5.1 Freezing rules; 4.5.2 Promotion rules; 4.6 The index theorem for jobs with precedence constraints; 4.7 Precedence constraints forming an out-forest; 4.8 Bandit processes with arrivals; 4.9 Tax problems; 4.9.1 Ongoing bandits and tax problems; 4.9.2 Klimov's model; 4.9.3 Minimum EWFT for the M/G/1 queue; 4.10 Near optimality of nearly index policies; Exercises; 5 The achievable region methodology; 5.1 Introduction; 5.2 A simple example; 5.3 Proof of the index theorem by greedy algorithm5.4 Generalized conservation laws and indexable systems5.5 Performance bounds for policies for branching bandits; 5.6 Job selection and scheduling problems; 5.7 Multi-armed bandits on parallel machines; Exercises; 6 Restless bandits and Lagrangian relaxation; 6.1 Introduction; 6.2 Restless bandits; 6.3 Whittle indices for restless bandits; 6.4 Asymptotic optimality; 6.5 Monotone policies and simple proofs of indexability; 6.6 Applications to multi-class queueing systems; 6.7 Performance bounds for the Whittle index policy; 6.8 Indices for more general resource configurations; Exercises7 Multi-population random sampling (theory)In 1989 the first edition of this book set out Gittins' pioneering index solution to the multi-armed bandit problem and his subsequent investigation of a wide of sequential resource allocation and stochastic scheduling problems. Since then there has been a remarkable flowering of new insights, generalizations and applications, to which Glazebrook and Weber have made major contributions. This second edition brings the story up to date. There are new chapters on the achievable region approach to stochastic optimization problems, the construction of performance bounds for suboptimal policies, WResource allocationMathematical modelsMathematical optimizationProgramming (Mathematics)Resource allocationMathematical models.Mathematical optimization.Programming (Mathematics)519.5Gittins John C.1938-1754207Glazebrook Kevin D.1950-1443322Weber Richard1953-942460MiAaPQMiAaPQMiAaPQBOOK9910876618603321Multi-armed bandit allocation indices4190436UNINA